Petascale turbulence simulation using a highly parallel fast multipole method on GPUs

Rio Yokota, L. A. Barba, Tetsu Narumi, Kenji Yasuoka

Research output: Contribution to journal › Article

31 Citations (Scopus)

Abstract

This paper reports large-scale direct numerical simulations of homogeneous-isotropic fluid turbulence, achieving sustained performance of 1.08 petaflop/s on GPU hardware using single precision. The simulations use a vortex particle method to solve the Navier-Stokes equations, with a highly parallel fast multipole method (FMM) as the numerical engine, and match the current record in mesh size for this application, a cube of 4096³ computational points solved with a spectral method. The standard numerical approach in this field is the pseudo-spectral method, which relies on the FFT algorithm as its numerical engine. The particle-based simulations presented in this paper quantitatively match the kinetic energy spectrum obtained with a pseudo-spectral method, using a trusted code. In terms of parallel performance, weak-scaling results show the FMM-based vortex method achieving 74% parallel efficiency on 4096 processes (one GPU per MPI process, 3 GPUs per node of the TSUBAME 2.0 system). The FFT-based spectral method achieves just 14% parallel efficiency on the same number of MPI processes (using only CPU cores), due to the all-to-all communication pattern of the FFT algorithm. Under these conditions, the calculation time for one time step was 108 s for the vortex method and 154 s for the spectral method. Computing with 69 billion particles, this work exceeds by an order of magnitude the largest vortex-method calculations to date.
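
For orientation, the velocity field in a vortex particle method is induced by the particles themselves through a regularized Biot-Savart sum: every particle feels the contribution of every other particle, an N-body problem that the fast multipole method evaluates hierarchically in roughly O(N) operations instead of O(N²). The sketch below shows only the direct all-pairs evaluation that the FMM replaces. It is a minimal NumPy illustration, not the authors' GPU/MPI implementation; the function name, the softening parameter eps (standing in for the smoothing function a real vortex method uses near the singularity), and the array layout are assumptions made for this example.

    import numpy as np

    def biot_savart_direct(x, alpha, eps=1e-6):
        """O(N^2) direct evaluation of a regularized Biot-Savart sum.

        x     : (N, 3) particle positions
        alpha : (N, 3) vortex strengths (vorticity times particle volume)
        eps   : softening length, a stand-in for the smoothing function
                of an actual vortex method (illustrative assumption)

        Returns the (N, 3) induced velocities. The FMM replaces this
        all-pairs loop with a hierarchical approximation.
        """
        u = np.zeros_like(x)
        for i in range(x.shape[0]):
            r = x[i] - x                         # separations x_i - x_j, shape (N, 3)
            r2 = np.sum(r * r, axis=1) + eps**2  # softened squared distances
            # u_i = (1 / 4*pi) * sum_j alpha_j x (x_i - x_j) / |x_i - x_j|^3
            u[i] = np.sum(np.cross(alpha, r) / (4.0 * np.pi * r2[:, None] ** 1.5), axis=0)
        return u

    # Illustrative use on a small random particle set
    rng = np.random.default_rng(0)
    x = rng.random((1000, 3))
    alpha = rng.standard_normal((1000, 3)) / 1000.0
    u = biot_savart_direct(x, alpha)

On the performance figures: weak-scaling parallel efficiency compares the time per step on P processes against a baseline run with the same amount of work per process, so the 74% quoted above corresponds to a per-step time roughly 1/0.74 ≈ 1.35 times the baseline on 4096 GPU processes, while the 14% reported for the FFT-based code corresponds to roughly a 7-fold slowdown, reflecting the cost of its all-to-all communication.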

Original language: English
Pages (from-to): 445-455
Number of pages: 11
Journal: Computer Physics Communications
Volume: 184
Issue number: 3
DOIs: 10.1016/j.cpc.2012.09.011
Publication status: Published - 2013 Mar

Keywords

  • Fast multipole method
  • GPU
  • Integral equations
  • Isotropic turbulence

ASJC Scopus subject areas

  • Hardware and Architecture
  • Physics and Astronomy (all)

Cite this

Petascale turbulence simulation using a highly parallel fast multipole method on GPUs. / Yokota, Rio; Barba, L. A.; Narumi, Tetsu; Yasuoka, Kenji.

In: Computer Physics Communications, Vol. 184, No. 3, 03.2013, p. 445-455.

@article{702494dfe8564f2b8433c6d308693148,
title = "Petascale turbulence simulation using a highly parallel fast multipole method on GPUs",
abstract = "This paper reports large-scale direct numerical simulations of homogeneous-isotropic fluid turbulence, achieving sustained performance of 1.08 petaflop/s on GPU hardware using single precision. The simulations use a vortex particle method to solve the Navier-Stokes equations, with a highly parallel fast multipole method (FMM) as the numerical engine, and match the current record in mesh size for this application, a cube of $4096^3$ computational points solved with a spectral method. The standard numerical approach in this field is the pseudo-spectral method, which relies on the FFT algorithm as its numerical engine. The particle-based simulations presented in this paper quantitatively match the kinetic energy spectrum obtained with a pseudo-spectral method, using a trusted code. In terms of parallel performance, weak-scaling results show the FMM-based vortex method achieving 74{\%} parallel efficiency on 4096 processes (one GPU per MPI process, 3 GPUs per node of the TSUBAME 2.0 system). The FFT-based spectral method achieves just 14{\%} parallel efficiency on the same number of MPI processes (using only CPU cores), due to the all-to-all communication pattern of the FFT algorithm. Under these conditions, the calculation time for one time step was 108 s for the vortex method and 154 s for the spectral method. Computing with 69 billion particles, this work exceeds by an order of magnitude the largest vortex-method calculations to date.",
keywords = "Fast multipole method, GPU, Integral equations, Isotropic turbulence",
author = "Rio Yokota and Barba, {L. A.} and Tetsu Narumi and Kenji Yasuoka",
year = "2013",
month = "3",
doi = "10.1016/j.cpc.2012.09.011",
language = "English",
volume = "184",
pages = "445--455",
journal = "Computer Physics Communications",
issn = "0010-4655",
publisher = "Elsevier",
number = "3",
}
