AFiD-GPU: A versatile Navier–Stokes solver for wall-bounded turbulent flows on GPU clusters

Xiaojue Zhu (Corresponding Author), Everett Phillips, Vamsi Spandan, John Donners, Gregory Ruetsch, Joshua Romero, Rodolfo Ostilla-Mónico, Yantao Yang, Detlef Lohse, Roberto Verzicco, Massimiliano Fatica, Richard J.A.M. Stevens

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

The AFiD code, an open source solver for the incompressible Navier–Stokes equations (http://www.afid.eu), has been ported to GPU clusters to tackle large-scale wall-bounded turbulent flow simulations. The GPU porting has been carried out in CUDA Fortran with extensive use of kernel loop directives (CUF kernels) to keep the source code as close as possible to the original CPU version; just a few routines have been manually rewritten. A new transpose scheme has been devised to improve the scaling of the Poisson solver, which is the main bottleneck of incompressible solvers. For large meshes the GPU version of the code shows good strong scaling characteristics, and the wall-clock time per step for the GPU version is an order of magnitude smaller than for the CPU version of the code. Due to the increased performance and efficient use of memory, the GPU version of AFiD can perform simulations in parameter ranges that are unprecedented in thermally-driven wall-bounded turbulence. To verify the accuracy of the code, turbulent Rayleigh–Bénard convection and plane Couette flow are simulated, and the results are in excellent agreement with the experimental and computational data that have been published in the literature.

Program summary

  • Program Title: AFiD-GPU
  • Program Files doi: http://dx.doi.org/10.17632/rwjdg7ry66.1
  • Licensing provisions: MIT
  • Programming language: Fortran 90, CUDA Fortran, MPI
  • External routines: PGI, CUDA Toolkit, FFTW3, HDF5
  • Nature of problem: Solving the three-dimensional Navier–Stokes equations coupled with a scalar field in a cubic box bounded between two walls and with periodic boundary conditions in the horizontal directions.
  • Solution method: Second order finite difference method for spatial discretization, third order Runge–Kutta scheme in combination with Crank–Nicolson for the implicit terms for time advancement, two dimensional pencil distributed MPI parallelization, GPU accelerated routines.
  • Additional comments including restrictions and unusual features: The code is available and supported on https://github.com/PhysicsofFluids/AFiD_GPU_opensource.
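
The "two dimensional pencil distributed MPI parallelization" named in the solution method can be illustrated with a small index calculation. The sketch below is not taken from AFiD; the function names `split_range` and `pencil_extents`, the block-partition rule, and the mapping of the process grid onto the two split axes are all assumptions chosen for illustration. It shows how a 3D grid could be divided over a 2D process grid so that each rank holds a "pencil" that is complete along one axis, and why a transpose is needed to change which axis is complete.

```python
def split_range(n, parts, rank):
    """Half-open index range [start, stop) of n points owned by `rank`.

    Assumed rule: a plain block partition in which the first (n % parts)
    ranks receive one extra point.
    """
    base, extra = divmod(n, parts)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def pencil_extents(shape, proc_grid, coords, full_axis):
    """Per-axis index ranges of the pencil held by one rank.

    shape     -- global grid size (n0, n1, n2)
    proc_grid -- 2D process grid (p_row, p_col)
    coords    -- this rank's position (row, col) in the process grid
    full_axis -- the axis along which the pencil is not split
    """
    split_axes = [axis for axis in range(3) if axis != full_axis]
    extents = [None, None, None]
    extents[full_axis] = (0, shape[full_axis])
    for axis, nparts, rank in zip(split_axes, proc_grid, coords):
        extents[axis] = split_range(shape[axis], nparts, rank)
    return extents


def local_size(extents):
    """Number of grid points inside a set of per-axis ranges."""
    size = 1
    for start, stop in extents:
        size *= stop - start
    return size


if __name__ == "__main__":
    shape, proc_grid = (8, 10, 12), (2, 3)
    for full_axis in range(3):
        print(full_axis, pencil_extents(shape, proc_grid, (1, 2), full_axis))
```

Under these assumptions, an operation that needs all points along one direction (for example a one-dimensional transform or a line solve) can run without communication only while the pencil is complete along that direction. Moving to another direction requires an all-to-all transpose among the ranks of one row or column of the process grid, which is the kind of communication step the abstract's "new transpose scheme" targets.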

Original language: English
Pages (from-to): 199-210
Number of pages: 12
Journal: Computer physics communications
Volume: 229
Early online date: 5 Apr 2018
DOI: https://doi.org/10.1016/j.cpc.2018.03.026
Publication status: Published - 1 Aug 2018

Keywords

  • Finite-difference scheme
  • GPU
  • Parallelization
  • Plane Couette flow
  • Rayleigh–Bénard convection
  • Turbulent flow

Cite this

Zhu, Xiaojue; Phillips, Everett; Spandan, Vamsi; Donners, John; Ruetsch, Gregory; Romero, Joshua; Ostilla-Mónico, Rodolfo; Yang, Yantao; Lohse, Detlef; Verzicco, Roberto; Fatica, Massimiliano; Stevens, Richard J.A.M. / AFiD-GPU: A versatile Navier–Stokes solver for wall-bounded turbulent flows on GPU clusters. In: Computer physics communications. 2018; Vol. 229. pp. 199-210.
@article{3d8b89b88d514bc8918948751cd32217,
title = "AFiD-GPU: A versatile Navier–Stokes solver for wall-bounded turbulent flows on GPU clusters",
keywords = "Finite-difference scheme, GPU, Parallelization, Plane Couette flow, Rayleigh–B{\'e}nard convection, Turbulent flow",
author = "Xiaojue Zhu and Everett Phillips and Vamsi Spandan and John Donners and Gregory Ruetsch and Joshua Romero and Rodolfo Ostilla-M{\'o}nico and Yantao Yang and Detlef Lohse and Roberto Verzicco and Massimiliano Fatica and Stevens, {Richard J.A.M.}",
year = "2018",
month = "8",
day = "1",
doi = "10.1016/j.cpc.2018.03.026",
language = "English",
volume = "229",
pages = "199--210",
journal = "Computer physics communications",
issn = "0010-4655",
publisher = "Elsevier",

}

Zhu, X, Phillips, E, Spandan, V, Donners, J, Ruetsch, G, Romero, J, Ostilla-Mónico, R, Yang, Y, Lohse, D, Verzicco, R, Fatica, M & Stevens, RJAM 2018, 'AFiD-GPU: A versatile Navier–Stokes solver for wall-bounded turbulent flows on GPU clusters', Computer physics communications, vol. 229, pp. 199-210. https://doi.org/10.1016/j.cpc.2018.03.026

TY - JOUR

T1 - AFiD-GPU

T2 - A versatile Navier–Stokes solver for wall-bounded turbulent flows on GPU clusters

AU - Zhu, Xiaojue

AU - Phillips, Everett

AU - Spandan, Vamsi

AU - Donners, John

AU - Ruetsch, Gregory

AU - Romero, Joshua

AU - Ostilla-Mónico, Rodolfo

AU - Yang, Yantao

AU - Lohse, Detlef

AU - Verzicco, Roberto

AU - Fatica, Massimiliano

AU - Stevens, Richard J.A.M.

PY - 2018/8/1

Y1 - 2018/8/1

KW - Finite-difference scheme

KW - GPU

KW - Parallelization

KW - Plane Couette flow

KW - Rayleigh–Bénard convection

KW - Turbulent flow

UR - http://www.scopus.com/inward/record.url?scp=85045850157&partnerID=8YFLogxK

U2 - 10.1016/j.cpc.2018.03.026

DO - 10.1016/j.cpc.2018.03.026

M3 - Article

VL - 229

SP - 199

EP - 210

JO - Computer physics communications

JF - Computer physics communications

SN - 0010-4655

ER -