Tree cutting approach for domain partitioning on forest-of-octrees-based block-structured static adaptive mesh refinement with lattice Boltzmann method

Yuta Hasegawa, Takayuki Aoki, Hiromichi Kobayashi, Yasuhiro Idomura, Naoyuki Onodera

Research output: Article, peer-reviewed

Abstract

An aerodynamics simulation code based on the lattice Boltzmann method (LBM) using forest-of-octrees-based block-structured adaptive mesh refinement (AMR) with temporarily fixed refinement was implemented, and its performance was evaluated on GPU-based supercomputers. Although the space-filling-curve-based (SFC) domain partitioning algorithm for octree-based AMR has been widely used on conventional CPU-based supercomputers, accelerated computation on GPU-based supercomputers revealed a bottleneck due to costly halo data communication. Our new tree cutting approach adopts a hybrid domain partitioning that combines a coarse structured block decomposition with SFC partitioning inside each block. This hybrid approach improved the locality and topology of the partitioned sub-domains and reduced the amount of halo communication to one-third of that of the original SFC approach. In the strong scaling test, the code achieved a maximum speedup of ×1.82, reaching 2207 MLUPS (mega-lattice updates per second) on 128 GPUs (NVIDIA® Tesla® V100). In the weak scaling test, the code achieved 9620 MLUPS on 128 GPUs with 4.473 billion grid points, while maintaining a parallel efficiency of 93.4% from 8 to 128 GPUs.
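
The hybrid partitioning described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single-level uniform grid of equally weighted leaves (the real code partitions a multi-level forest of octrees), uses a Morton (Z-order) curve as the SFC, and all function names (`morton_key`, `split_evenly`, `sfc_partition`, `hybrid_partition`, `halo_faces`) and the 12×8×8 example domain are hypothetical choices made for illustration.

```python
from collections import Counter

def morton_key(x, y, z, bits=10):
    """Interleave the bits of (x, y, z) into a Morton (Z-order) key."""
    key = 0
    for b in range(bits):
        key |= ((x >> b) & 1) << (3 * b)
        key |= ((y >> b) & 1) << (3 * b + 1)
        key |= ((z >> b) & 1) << (3 * b + 2)
    return key

def split_evenly(ordered, n_parts, first_rank=0):
    """Cut an ordered list of leaves into n_parts contiguous chunks."""
    n = len(ordered)
    return {leaf: first_rank + (i * n_parts) // n
            for i, leaf in enumerate(ordered)}

def sfc_partition(leaves, n_ranks):
    """Pure SFC partitioning: one curve over the whole domain."""
    ordered = sorted(leaves, key=lambda c: morton_key(*c))
    return split_evenly(ordered, n_ranks)

def hybrid_partition(leaves, extent, coarse, ranks_per_block):
    """Assumed hybrid scheme: structured coarse blocks, SFC inside each."""
    (nx, ny, nz), (bx, by, bz) = extent, coarse
    groups = {}
    for (x, y, z) in leaves:
        # Coarse structured decomposition: which block holds this leaf?
        block = (x * bx // nx, y * by // ny, z * bz // nz)
        groups.setdefault(block, []).append((x, y, z))
    owner = {}
    for i, block in enumerate(sorted(groups)):
        # SFC ordering restricted to the leaves of one coarse block.
        ordered = sorted(groups[block], key=lambda c: morton_key(*c))
        owner.update(split_evenly(ordered, ranks_per_block,
                                  first_rank=i * ranks_per_block))
    return owner

def halo_faces(owner):
    """Count leaf faces whose two sides belong to different ranks."""
    count = 0
    for (x, y, z), rank in owner.items():
        for nb in ((x + 1, y, z), (x, y + 1, z), (x, y, z + 1)):
            if nb in owner and owner[nb] != rank:
                count += 1
    return count

# Hypothetical example: a 12 x 8 x 8 grid of leaves on 6 ranks.
extent = (12, 8, 8)
leaves = [(x, y, z) for x in range(12) for y in range(8) for z in range(8)]
sfc = sfc_partition(leaves, n_ranks=6)
hyb = hybrid_partition(leaves, extent, coarse=(3, 1, 1), ranks_per_block=2)
print("SFC halo faces:   ", halo_faces(sfc))
print("hybrid halo faces:", halo_faces(hyb))
```

The number of cross-rank faces returned by `halo_faces` is only a rough proxy for halo communication volume. On a toy uniform grid like this one, the sketch is not expected to reproduce the one-third reduction reported above, which was measured on refined meshes on GPU-based supercomputers.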

Original language: English
Article number: 102851
Journal: Parallel Computing
Volume: 108
Publication status: Published - Dec 2021

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computer Networks and Communications
  • Computer Graphics and Computer-Aided Design
  • Artificial Intelligence
