TY - GEN
T1 - FPGA based Power-Efficient Edge Server to Accelerate Speech Interface for Socially Assistive Robotics
AU - Gulzar, Haris
AU - Shakeel, Muhammad
AU - Itoyama, Katsutoshi
AU - Nakadai, Kazuhiro
AU - Nishida, Kenji
AU - Amano, Hideharu
AU - Eda, Takeharu
N1 - Funding Information:
This work is supported by JST, CREST Grant No. JPMJCR19K1, Japan. (1) The authors are with the Department of Systems and Control Engineering, School of Engineering, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo, 152-8552, Japan (e-mail: {gulzar, shakeel, nishida, itoyama, nakadai}@ra.sc.e.titech.ac.jp). (2) Honda Research Institute Japan Co., Ltd., 8-1, Honcho, Wako, Saitama, 351-0188, Japan. (3) Department of Information and Computer Science, Keio University, Yokohama, Saitama, 223-8522, Japan (e-mail: hunga@am.ics.keio.ac.jp). (4) NTT Software Innovation Center, Musashino, Tokyo, Japan (e-mail: takeharu.eda.bx@hco.ntt.co.jp). Fig. 1: The edge server receives the speech signal from connected microphone devices, processes it, and sends commands to the relevant connected devices.
Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Socially Assistive Robotics (SAR) is a sustainable solution for the growing elderly and disabled population requiring proper care and supervision. Internet of Things (IoT) and Edge Computing can support SAR by providing in-house computation for connected devices and offering a secure, autonomous, and power-efficient framework. In this study, we propose using a System-on-Chip (SoC) based device as an edge server, which provides a local speech recognition interface for connected IoT devices in the targeted area. A Convolutional Neural Network (CNN) is used to detect a set of frequently used speech commands that are useful for controlling home appliances and interacting with assistive robots. The proposed CNN achieves state-of-the-art accuracy with a meager computing budget: it delivers 96.14% accuracy with 20X fewer parameters and 137X fewer floating-point operations (FLOPs) than similarly performing CNNs. To address the latency requirements of practical applications, parallelizing the CNN achieved 6.67X faster inference than its base implementation. Lastly, implementing the CNN on an SoC-based edge device achieved at least 5X and 7X reductions in net power consumption compared to GPU and CPU devices, respectively.
KW - Ambient Assisted Living (AAL)
KW - Edge Computing
KW - FPGA
KW - Internet of Things (IoT)
KW - Machine Learning
KW - Socially Assistive Robotics (SAR)
KW - Speech Recognition
UR - http://www.scopus.com/inward/record.url?scp=85149131028&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85149131028&partnerID=8YFLogxK
U2 - 10.1109/SII55687.2023.10039093
DO - 10.1109/SII55687.2023.10039093
M3 - Conference contribution
AN - SCOPUS:85149131028
T3 - 2023 IEEE/SICE International Symposium on System Integration, SII 2023
BT - 2023 IEEE/SICE International Symposium on System Integration, SII 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 IEEE/SICE International Symposium on System Integration, SII 2023
Y2 - 17 January 2023 through 20 January 2023
ER -