Acceleration of the aggregation process in a Hall-thruster simulation using Intel FPGA SDK for OpenCL

Hiroyuki Noda, Ryotaro Sakai, Takaaki Miyajima, Naoyuki Fujita, Hideharu Amano

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

The Full Particle-In-Cell (Full-PIC) method is a numerical simulation technique used in the research and development of Hall thrusters, a type of electric propulsion engine. It treats ions, neutrons, and electrons as particles and is highly accurate compared with other methods that treat them as a fluid. However, it is computationally expensive. The Japan Aerospace Exploration Agency (JAXA) is developing a software package called NSRU-Full-PIC that implements this method. One of the important computing tasks in NSRU-Full-PIC is the aggregation process, which causes Read-After-Write (RAW) hazards and hence makes parallel computation difficult. In this paper, we tackle this problem by introducing a reduction operation on an FPGA accelerator. We use Intel’s mid-range SoC, Arria 10, which embeds floating-point DSPs for high-performance numerical computation. Intel FPGA SDK for OpenCL is available for this platform and eases the offloading of complex tasks. We implemented four types of reduction kernels and compared their performance. In our fastest implementation, Read-16-Vect, the aggregation process is 76.4 times faster than the single-thread version on a 1.5 GHz ARM Cortex-A9 and 14.1 times faster than that on a 2.9 GHz Xeon E5-2660. In this implementation, we achieved 93.5% of theoretical performance with optimized FPGA resources.
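
To make the RAW hazard in the aggregation step concrete, the following is a minimal sketch in plain Python. It is not the NSRU-Full-PIC code and not one of the paper's OpenCL kernels. The function names, the `lanes` parameter, and the lockstep "read all, then write all" model of parallel hardware are assumptions made purely for illustration. It shows a serial scatter-add, a naive parallel version that loses updates when two particles in the same batch hit the same cell, and a reduction-style version in which each lane accumulates into a private partial grid that is summed at the end.

```python
from typing import List, Sequence


def aggregate_serial(cells: Sequence[int], weights: Sequence[float],
                     n_cells: int) -> List[float]:
    """Reference scatter-add: each particle adds its weight to its grid cell."""
    grid = [0.0] * n_cells
    for c, w in zip(cells, weights):
        grid[c] += w  # read grid[c], add, write back
    return grid


def aggregate_lockstep(cells: Sequence[int], weights: Sequence[float],
                       n_cells: int, lanes: int) -> List[float]:
    """Simplified model of naive parallelism (an assumption, not real hardware):
    `lanes` particles read the grid together, then all write back.
    Particles in one batch that share a cell all read the stale value,
    so all but the last write are lost (the RAW hazard)."""
    grid = [0.0] * n_cells
    for start in range(0, len(cells), lanes):
        batch = range(start, min(start + lanes, len(cells)))
        # All reads happen before any write in this batch.
        updated = [(cells[i], grid[cells[i]] + weights[i]) for i in batch]
        for c, value in updated:
            grid[c] = value
    return grid


def aggregate_reduction(cells: Sequence[int], weights: Sequence[float],
                        n_cells: int, lanes: int) -> List[float]:
    """Hazard-free variant: each lane owns a private partial grid,
    and the partial grids are reduced (summed) at the end."""
    partial = [[0.0] * n_cells for _ in range(lanes)]
    for i, (c, w) in enumerate(zip(cells, weights)):
        partial[i % lanes][c] += w  # lane i % lanes touches only its own grid
    return [sum(p[c] for p in partial) for c in range(n_cells)]


if __name__ == "__main__":
    cells = [0, 0, 1, 0]            # hypothetical particle-to-cell mapping
    weights = [1.0, 2.0, 4.0, 8.0]  # hypothetical per-particle contributions
    print(aggregate_serial(cells, weights, 2))
    print(aggregate_lockstep(cells, weights, 2, lanes=2))
    print(aggregate_reduction(cells, weights, 2, lanes=2))
```

With this input, the first two particles fall into cell 0 within the same batch, so the lockstep version drops one of the two contributions while the reduction version matches the serial result. In general a reduction reorders floating-point additions, so results can differ from the serial sum in the last bits.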

Original language: English
Title of host publication: Proceedings of the 8th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, HEART 2017
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450353168
DOIs: https://doi.org/10.1145/3120895.3120915
Publication status: Published - 2017 Jun 7
Event: 8th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, HEART 2017 - Bochum, Germany
Duration: 2017 Jun 7 to 2017 Jun 9

Other

Other: 8th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, HEART 2017
Country: Germany
City: Bochum
Period: 17/6/7 to 17/6/9

Fingerprint

Hall thrusters
Field programmable gate arrays (FPGA)
Agglomeration
Electric propulsion
Software packages
Particle accelerators
Hazards
Neutrons
Engines
Fluids
Electrons
Computer simulation
Ions
Costs

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Noda, H., Sakai, R., Miyajima, T., Fujita, N., & Amano, H. (2017). Acceleration of the aggregation process in a Hall-thruster simulation using Intel FPGA SDK for OpenCL. In Proceedings of the 8th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, HEART 2017 [20] Association for Computing Machinery. https://doi.org/10.1145/3120895.3120915

@inproceedings{8468fb4b6aa9430fb1085baaa5b6da0e,
title = "Acceleration of the aggregation process in a Hall-thruster simulation using Intel FPGA SDK for OpenCL",
abstract = "The Full Particle-In-Cell (Full-PIC) method is a numerical simulation technique used in the research and development of Hall thrusters, a type of electric propulsion engine. It treats ions, neutrons, and electrons as particles and is highly accurate compared with other methods that treat them as a fluid. However, it is computationally expensive. The Japan Aerospace Exploration Agency (JAXA) is developing a software package called NSRU-Full-PIC that implements this method. One of the important computing tasks in NSRU-Full-PIC is the aggregation process, which causes Read-After-Write (RAW) hazards and hence makes parallel computation difficult. In this paper, we tackle this problem by introducing a reduction operation on an FPGA accelerator. We use Intel’s mid-range SoC, Arria 10, which embeds floating-point DSPs for high-performance numerical computation. Intel FPGA SDK for OpenCL is available for this platform and eases the offloading of complex tasks. We implemented four types of reduction kernels and compared their performance. In our fastest implementation, Read-16-Vect, the aggregation process is 76.4 times faster than the single-thread version on a 1.5 GHz ARM Cortex-A9 and 14.1 times faster than that on a 2.9 GHz Xeon E5-2660. In this implementation, we achieved 93.5{\%} of theoretical performance with optimized FPGA resources.",
author = "Hiroyuki Noda and Ryotaro Sakai and Takaaki Miyajima and Naoyuki Fujita and Hideharu Amano",
year = "2017",
month = "6",
day = "7",
doi = "10.1145/3120895.3120915",
language = "English",
booktitle = "Proceedings of the 8th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, HEART 2017",
publisher = "Association for Computing Machinery",

}

TY - GEN

T1 - Acceleration of the aggregation process in a Hall-thruster simulation using Intel FPGA SDK for OpenCL

AU - Noda, Hiroyuki

AU - Sakai, Ryotaro

AU - Miyajima, Takaaki

AU - Fujita, Naoyuki

AU - Amano, Hideharu

PY - 2017/6/7

Y1 - 2017/6/7

AB - The Full Particle-In-Cell (Full-PIC) method is a numerical simulation technique used in the research and development of Hall thrusters, a type of electric propulsion engine. It treats ions, neutrons, and electrons as particles and is highly accurate compared with other methods that treat them as a fluid. However, it is computationally expensive. The Japan Aerospace Exploration Agency (JAXA) is developing a software package called NSRU-Full-PIC that implements this method. One of the important computing tasks in NSRU-Full-PIC is the aggregation process, which causes Read-After-Write (RAW) hazards and hence makes parallel computation difficult. In this paper, we tackle this problem by introducing a reduction operation on an FPGA accelerator. We use Intel’s mid-range SoC, Arria 10, which embeds floating-point DSPs for high-performance numerical computation. Intel FPGA SDK for OpenCL is available for this platform and eases the offloading of complex tasks. We implemented four types of reduction kernels and compared their performance. In our fastest implementation, Read-16-Vect, the aggregation process is 76.4 times faster than the single-thread version on a 1.5 GHz ARM Cortex-A9 and 14.1 times faster than that on a 2.9 GHz Xeon E5-2660. In this implementation, we achieved 93.5% of theoretical performance with optimized FPGA resources.

UR - http://www.scopus.com/inward/record.url?scp=85040673526&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85040673526&partnerID=8YFLogxK

U2 - 10.1145/3120895.3120915

DO - 10.1145/3120895.3120915

M3 - Conference contribution

AN - SCOPUS:85040673526

BT - Proceedings of the 8th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, HEART 2017

PB - Association for Computing Machinery

ER -