Open Access System for Information Sharing

GPU-accelerated ADI methods for numerical solutions of incompressible and compressible Navier-Stokes equations

Authors: 하상현

Date Issued: 2021

Publisher: Pohang University of Science and Technology (포항공과대학교)

Abstract: Computational methods for GPU-accelerated solutions of incompressible and compressible Navier-Stokes equations have been developed. In particular, numerical methods based on the Alternating Direction Implicit (ADI) method have been implemented for fast computation in a heterogeneous computing environment. In the first part of the research, a single-GPU implementation of a semi-implicit fractional-step method for incompressible Navier-Stokes equations is presented. The non-iterative, direct solution methods used in the fractional-step method take advantage of tridiagonal systems, whose solution is known to be the main bottleneck in massively parallel computation. Various aspects of the Compute Unified Device Architecture (CUDA) programming model are considered in developing a well-optimized flow solver for Direct Numerical Simulation (DNS) of a spatially developing boundary layer. A speedup of up to 65× on 134 million grid cells is achieved on a single Tesla P100 GPU in comparison with a single core of a Xeon E5-2660 v3 CPU. In the second part of the research, methodologies for domain decomposition are proposed to extend the flow solver to a multi-GPU environment. When data is distributed across multiple GPUs, matrix transposition requires all-to-all communication, which significantly increases computational overhead. Thus, a new strategy for domain decomposition free from all-to-all transposition is proposed. To do so, the computational domain is divided in the wall-normal direction, and decoupled tridiagonal systems are obtained using the Parallel Diagonal Dominant (PDD) and Parallel Partition (PPT) methods. An optimal batch size is determined to maximize the performance of these methods within a given amount of GPU memory. The utility of this method is demonstrated in two cases of DNS: a zero-pressure-gradient turbulent boundary layer on 1.1 billion grid cells and a K-type transitional boundary layer on 1.4 billion grid cells. Both simulations were run on two GPU nodes with a total of 8 P100 GPUs, which shows the promising potential of the present method for large-scale simulations. In the third part of the research, a new algorithm for solutions of compressible Reynolds-averaged Navier-Stokes (RANS) equations is developed to apply the ADI method to complex flow configurations such as turbomachinery flows. The ADI method applied to compressible Navier-Stokes equations on a multi-block grid requires the solution of block-tridiagonal systems consisting of 5 × 5 dense sub-systems. To design an algorithm suited for GPUs, these systems are reordered according to even-odd indices to reduce them in a divide-and-conquer manner. Recursive application of the algorithm results in a number of smaller sub-systems, which are well suited for GPU acceleration. Using a single Titan V GPU, a 26× speedup is achieved in a simulation of the NASA Rotor 67 compressor on 4.6 million grid cells.
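
Illustrative note: the abstract refers to tridiagonal systems as the parallel bottleneck and to an even-odd, divide-and-conquer reduction. The sketch below is not taken from the thesis. It is a minimal scalar example in plain Python, assuming a single tridiagonal system with sub-diagonal a, main diagonal b, super-diagonal c, and right-hand side d (all names are hypothetical). The function thomas is the standard sequential elimination, where each step depends on the previous one. The function cyclic_reduction eliminates the even-indexed unknowns so that the odd-indexed ones form a half-size tridiagonal system, then recurses. The thesis applies a related idea to block-tridiagonal systems with 5 × 5 blocks on a GPU, which this scalar sketch does not attempt to reproduce.

```python
def thomas(a, b, c, d):
    """Sequential Thomas algorithm for a tridiagonal system.

    a: sub-diagonal (a[0] unused), b: main diagonal,
    c: super-diagonal (c[-1] unused), d: right-hand side.
    """
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    # Forward elimination: step i needs the result of step i - 1.
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    # Back substitution, again strictly sequential.
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def cyclic_reduction(a, b, c, d):
    """Recursive even-odd (cyclic) reduction; assumes a[0] == 0 and c[-1] == 0."""
    n = len(d)
    if n == 1:
        return [d[0] / b[0]]
    ra, rb, rc, rd = [], [], [], []
    # Each odd row absorbs its even neighbours; the rows are independent
    # of one another, so this loop could run in parallel.
    for i in range(1, n, 2):
        alpha = -a[i] / b[i - 1]
        new_a = alpha * a[i - 1]
        new_b = b[i] + alpha * c[i - 1]
        new_d = d[i] + alpha * d[i - 1]
        new_c = 0.0
        if i + 1 < n:
            gamma = -c[i] / b[i + 1]
            new_b += gamma * a[i + 1]
            new_d += gamma * d[i + 1]
            new_c = gamma * c[i + 1]
        ra.append(new_a)
        rb.append(new_b)
        rc.append(new_c)
        rd.append(new_d)
    # Solve the half-size system in the odd-indexed unknowns.
    x_odd = cyclic_reduction(ra, rb, rc, rd)
    x = [0.0] * n
    for k, i in enumerate(range(1, n, 2)):
        x[i] = x_odd[k]
    # Recover the even-indexed unknowns from their original equations.
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x


def model_system(n, r=0.5):
    """Hypothetical test case: one implicit 1-D diffusion sweep, (1 + 2r) on the diagonal."""
    a = [0.0] + [-r] * (n - 1)
    b = [1.0 + 2.0 * r] * n
    c = [-r] * (n - 1) + [0.0]
    d = [float(i % 4) + 1.0 for i in range(n)]
    return a, b, c, d
```

For a diagonally dominant system such as the one built by model_system, both functions should return the same solution up to rounding. The reduction does more arithmetic than the Thomas algorithm but exposes independent work at every level, which is the usual reason this kind of reordering is considered for GPUs.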

URI: http://postech.dcollection.net/common/orgView/200000372703
https://oasis.postech.ac.kr/handle/2014.oak/112011

Article Type: Thesis

Files in This Item: There are no files associated with this item.
