TY - JOUR
T1 - An efficient blocking M2L translation for low-frequency fast multipole method in three dimensions
AU - Takahashi, Toru
AU - Shimba, Yuta
AU - Isakari, Hiroshi
AU - Matsumoto, Toshiro
N1 - Funding Information:
We would like to thank Mr. Horibe for developing and testing a part of the computer programs used in this study. This work was partially supported by JSPS KAKENHI Grant Number 15K06683.
Publisher Copyright:
© 2016 Elsevier B.V.
PY - 2016/5/1
Y1 - 2016/5/1
N2 - We propose an efficient scheme to perform the multipole-to-local (M2L) translation in the three-dimensional low-frequency fast multipole method (LFFMM). Our strategy is to combine a group of matrix-vector products associated with the M2L translation into a matrix-matrix product in order to reduce memory traffic. For this purpose, we first developed a grouping method (termed internal blocking) based on the congruent transformations (rotational and reflectional symmetries) of M2L translators for each target box in the FMM hierarchy (adaptive octree). Next, we considered another grouping method (termed external blocking) that handles M2L translations for multiple target boxes collectively by using the translational invariance of the M2L translation. By combining the internal and external blockings, the M2L translation can be performed efficiently while preserving the numerical accuracy exactly. We assessed the proposed blocking scheme numerically and applied it to the boundary integral equation method to solve electromagnetic scattering problems for perfect electric conductors. The numerical results showed that the proposed M2L scheme achieved a speedup of a few times over the non-blocking scheme.
AB - We propose an efficient scheme to perform the multipole-to-local (M2L) translation in the three-dimensional low-frequency fast multipole method (LFFMM). Our strategy is to combine a group of matrix-vector products associated with the M2L translation into a matrix-matrix product in order to reduce memory traffic. For this purpose, we first developed a grouping method (termed internal blocking) based on the congruent transformations (rotational and reflectional symmetries) of M2L translators for each target box in the FMM hierarchy (adaptive octree). Next, we considered another grouping method (termed external blocking) that handles M2L translations for multiple target boxes collectively by using the translational invariance of the M2L translation. By combining the internal and external blockings, the M2L translation can be performed efficiently while preserving the numerical accuracy exactly. We assessed the proposed blocking scheme numerically and applied it to the boundary integral equation method to solve electromagnetic scattering problems for perfect electric conductors. The numerical results showed that the proposed M2L scheme achieved a speedup of a few times over the non-blocking scheme.
KW - BLAS
KW - Boundary element method
KW - Cache/register blocking
KW - Congruent transformation
KW - Fast multipole method
KW - Matrix-matrix product
KW - Multipole-to-local translation
KW - Numerical linear algebra
UR - http://www.scopus.com/inward/record.url?scp=84969331970&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84969331970&partnerID=8YFLogxK
U2 - 10.1016/j.cpc.2016.01.008
DO - 10.1016/j.cpc.2016.01.008
M3 - Article
AN - SCOPUS:84969331970
SN - 0010-4655
VL - 202
SP - 151
EP - 164
JO - Computer Physics Communications
JF - Computer Physics Communications
ER -