An ARM-based heterogeneous FPGA accelerator for hall thruster simulation

Hiroyuki Noda, Manfred Orsztynowicz, Kensuke Iizuka, Takaaki Miyajima, Naoyuki Fujita, Hideharu Amano

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The Full Particle-In-Cell (Full-PIC) method is a numerical simulation technique used in the research and development of Hall thrusters, a propulsion mechanism for satellites. The Japan Aerospace Exploration Agency (JAXA) has been developing a software package called NSRU-Full-PIC for the design of Hall thrusters. Because NSRU-Full-PIC simulations require substantial computing power, energy-efficient accelerators are essential. However, frequent random memory accesses and Read-After-Write (RAW) hazards make acceleration with GPUs difficult. In this paper, we tackle these problems by having a CPU and an FPGA cooperate in an ARM-based heterogeneous FPGA accelerator. We use Intel’s mid-range SoC, the Arria 10, which embeds floating-point DSPs for high-performance yet low-power numerical computation. The Intel FPGA SDK for OpenCL is available on this platform for easy offloading of complex tasks. The heavy-load processes in NSRU-Full-PIC are implemented through hardware/software co-design on the Arria 10 SoC. Our implementation improves power consumption by a factor of 5.66 compared with the original code on a 2.8 GHz Xeon E5-2680 v2, and reduces total energy consumption to 88.44% of that of the Xeon implementation. The target tasks run 3.48 times faster than the original code running only on the 1.5 GHz ARM Cortex-A9 in the Arria 10 SoC, and 2.50 times faster than an implementation using atomic instructions on an NVIDIA K20c GPU.
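
To illustrate why random memory access combined with RAW hazards is awkward for wide parallel hardware, the following is a minimal Python sketch of a particle-to-grid accumulation step. It is a toy model written for this record, not code from NSRU-Full-PIC. The function names, the `lanes` parameter, and the sample data are all hypothetical. The "lockstep" variant assumes that every lane in a batch reads the grid before any lane writes back, which is one simple way to model parallel updates performed without atomic instructions.

```python
# Hypothetical toy model of PIC charge deposition; not taken from NSRU-Full-PIC.

def deposit_sequential(cells, weights, n_cells):
    """Accumulate one particle at a time: each add sees the previous result."""
    rho = [0.0] * n_cells
    for c, w in zip(cells, weights):
        rho[c] += w
    return rho


def deposit_lockstep(cells, weights, n_cells, lanes):
    """Model `lanes` particles processed together without atomics.

    Every lane in a batch reads rho before any lane writes, so when two lanes
    target the same cell the later write overwrites the earlier one.
    """
    rho = [0.0] * n_cells
    for start in range(0, len(cells), lanes):
        batch = range(start, min(start + lanes, len(cells)))
        loaded = [rho[cells[i]] for i in batch]  # all reads happen first
        for old, i in zip(loaded, batch):        # then all writes
            rho[cells[i]] = old + weights[i]
    return rho


cells = [0, 2, 2, 1, 2, 0]    # grid cell index of each particle (assumed data)
weights = [1.0] * len(cells)  # unit charge per particle

exact = deposit_sequential(cells, weights, n_cells=3)
racy = deposit_lockstep(cells, weights, n_cells=3, lanes=4)
print(exact)  # [2.0, 1.0, 3.0]
print(racy)   # [2.0, 1.0, 2.0]  (one deposit into cell 2 was lost)
```

In this sketch, two particles in the first batch target cell 2, and the second write is computed from a stale read, so one contribution disappears. Serializing conflicting updates (for example with atomic operations, or with `lanes=1` here) restores the correct sum at the cost of parallelism.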

Original language: English
Title of host publication: Proceedings of the 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, HEART 2019
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450372558
DOI: https://doi.org/10.1145/3337801.3337812
Publication status: Published - 2019 Jun 6
Event: 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, HEART 2019 - Nagasaki, Japan
Duration: 2019 Jun 6 → 2019 Jun 7

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, HEART 2019
Country: Japan
City: Nagasaki
Period: 19/6/6 → 19/6/7

Fingerprint

Hall thrusters
Particle accelerators
Field programmable gate arrays (FPGA)
Computer simulation
Software packages
Propulsion
Program processors
Hazards
Electric power utilization
Energy utilization
Satellites
Hardware
Data storage equipment
System-on-chip
Graphics processing unit

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications

Cite this

Noda, H., Orsztynowicz, M., Iizuka, K., Miyajima, T., Fujita, N., & Amano, H. (2019). An ARM-based heterogeneous FPGA accelerator for hall thruster simulation. In Proceedings of the 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, HEART 2019 [9] (ACM International Conference Proceeding Series). Association for Computing Machinery. https://doi.org/10.1145/3337801.3337812

@inproceedings{44971278e5044b02bd05a6ee8a073c61,
title = "An ARM-based heterogeneous FPGA accelerator for hall thruster simulation",
author = "Hiroyuki Noda and Manfred Orsztynowicz and Kensuke Iizuka and Takaaki Miyajima and Naoyuki Fujita and Hideharu Amano",
year = "2019",
month = "6",
day = "6",
doi = "10.1145/3337801.3337812",
language = "English",
series = "ACM International Conference Proceeding Series",
publisher = "Association for Computing Machinery",
booktitle = "Proceedings of the 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, HEART 2019",

}

TY - GEN

T1 - An ARM-based heterogeneous FPGA accelerator for hall thruster simulation

AU - Noda, Hiroyuki

AU - Orsztynowicz, Manfred

AU - Iizuka, Kensuke

AU - Miyajima, Takaaki

AU - Fujita, Naoyuki

AU - Amano, Hideharu

PY - 2019/6/6

Y1 - 2019/6/6

UR - http://www.scopus.com/inward/record.url?scp=85070566440&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85070566440&partnerID=8YFLogxK

U2 - 10.1145/3337801.3337812

DO - 10.1145/3337801.3337812

M3 - Conference contribution

T3 - ACM International Conference Proceeding Series

BT - Proceedings of the 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, HEART 2019

PB - Association for Computing Machinery

ER -