An FPGA-based low-latency network processing for spark streaming

Kohei Nakamura, Ami Hayashi, Hiroki Matsutani

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Low-latency stream data processing is a key enabler for on-line data analysis applications, such as detecting anomalous conditions and change points in stream data continuously generated by sensors and networking services. Existing stream processing frameworks fall into two processing methodologies: micro-batch and one-at-a-time. Apache Spark Streaming employs the micro-batch methodology, in which data analysis is performed repeatedly on the data that arrive during a short time period, called a micro batch. The rich set of data analysis libraries provided by Spark, such as machine learning and graph processing, can be applied to the micro batches. However, a drawback of the micro-batch methodology is its high latency for detecting anomalous conditions and change points, because data are first accumulated into a micro batch (e.g., 1 sec long) and analysis is performed only on the completed batch. In this paper, we propose offloading one-at-a-time analysis functions to an FPGA-based 10Gbit Ethernet network interface card (FPGA NIC) that works in cooperation with the Spark Streaming framework, in order to significantly reduce the processing latency and improve the processing throughput. We implemented word count and change-point detection applications on Spark Streaming with our FPGA NIC, on which the one-at-a-time analysis logic is implemented. Experimental results demonstrate that the word count throughput is improved by 22x and the change-point detection latency is reduced by 94.12% compared to the original Spark Streaming. Our approach can complement the existing micro-batch data analysis framework with ultra-low-latency one-at-a-time logic.
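
The latency gap between the two methodologies described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' FPGA logic or Spark code: the threshold test stands in for a real change-point detector, the event stream is invented, processing time is ignored, and the names `detect_one_at_a_time` and `detect_micro_batch` are hypothetical.

```python
import math

BATCH_INTERVAL = 1.0  # seconds; mirrors the "1 sec" example in the abstract


def detect_one_at_a_time(events, threshold):
    """Inspect each (arrival_time, value) as it arrives.

    Returns the time at which the change is noticed, or None.
    Assumption: per-event processing cost is zero.
    """
    for arrival_time, value in events:
        if value > threshold:
            return arrival_time
    return None


def detect_micro_batch(events, threshold, interval=BATCH_INTERVAL):
    """Buffer events per interval and inspect them only when the batch closes.

    Returns the batch-close time at which the change is noticed, or None.
    """
    for arrival_time, value in events:
        if value > threshold:
            # The event is only seen once its batch ends, i.e. at the
            # next multiple of the batch interval.
            batch_index = math.floor(arrival_time / interval)
            return (batch_index + 1) * interval
    return None


# Hypothetical sensor stream: (arrival time in seconds, reading).
events = [(0.10, 1.0), (0.35, 1.1), (1.20, 0.9), (1.25, 9.5), (1.80, 9.7)]
change_time = 1.25  # the reading jumps at this arrival time

latency_stream = detect_one_at_a_time(events, threshold=5.0) - change_time
latency_batch = detect_micro_batch(events, threshold=5.0) - change_time

print(f"one-at-a-time latency: {latency_stream:.2f} s")
print(f"micro-batch latency:   {latency_batch:.2f} s")
```

In this toy setting the micro-batch detector cannot report the change until its batch closes at 2.0 s, while the one-at-a-time detector reports it on arrival. The figures it prints are artifacts of the invented stream and are unrelated to the measured results reported in the paper.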

Original language: English
Title of host publication: Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016
Editors: Ronay Ak, George Karypis, Yinglong Xia, Xiaohua Tony Hu, Philip S. Yu, James Joshi, Lyle Ungar, Ling Liu, Aki-Hiro Sato, Toyotaro Suzumura, Sudarsan Rachuri, Rama Govindaraju, Weijia Xu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2410-2415
Number of pages: 6
ISBN (Electronic): 9781467390040
DOIs: https://doi.org/10.1109/BigData.2016.7840876
Publication status: Published - 2016 Jan 1
Event: 4th IEEE International Conference on Big Data, Big Data 2016 - Washington, United States
Duration: 2016 Dec 5 → 2016 Dec 8

Publication series

Name: Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016

Other

Other: 4th IEEE International Conference on Big Data, Big Data 2016
Country: United States
City: Washington
Period: 16/12/5 → 16/12/8

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Hardware and Architecture

  • Cite this

    Nakamura, K., Hayashi, A., & Matsutani, H. (2016). An FPGA-based low-latency network processing for spark streaming. In R. Ak, G. Karypis, Y. Xia, X. T. Hu, P. S. Yu, J. Joshi, L. Ungar, L. Liu, A-H. Sato, T. Suzumura, S. Rachuri, R. Govindaraju, & W. Xu (Eds.), Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016 (pp. 2410-2415). [7840876] (Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BigData.2016.7840876