A fine-grained multicasting of configuration data for coarse-grained reconfigurable architectures

Takuya Kojima, Hideharu Amano

Research output: Contribution to journalArticle

Abstract

A novel configuration data compression technique for coarse-grained reconfigurable architectures (CGRAs) is proposed. Reducing the size of configuration data of CGRAs shortens the reconfiguration time especially when the communication bandwidth between a CGRA and a host CPU is limited. In addition, it saves energy consumption of configuration cache and controller. The proposed technique is based on a multicast configuration technique called RoMultiC, which reduces the configuration time by multicasting the same data to multiple PEs (Processing Elements) with two bit-maps. Scheduling algorithms for an optimizing the order of multicasting have been proposed. However, the multicasting is possible only if each PE has completely the same configuration. In general, configuration data for CGRAs can be divided into some fields like machine code formats of general perpose CPUs. The proposed scheme confines a part of fields for multicasting so that the possibility of multicasting more PEs can be increased. This paper analyzes algorithms to find a configuration pattern which maximizes the number of multicasted PEs. We implemented the proposed scheme to CMA (Cool Mega Array), a straight forward CGRA as a case study. Experimental results show that the proposed method achieves 40.0% smaller configuration than a previous method for an image processing application at maximum. The exploration of the multicasted grain size reveals the effective grain size for each algorithm. Furthermore, since both a dynamic power consumption of the configuration controller and a configuration time are improved, it achieves 50.1% reduction of the energy consumption for the configuration with a negligible area overhead.

Original languageEnglish
Pages (from-to)1247-1256
Number of pages10
JournalIEICE Transactions on Information and Systems
VolumeE102D
Issue number7
DOIs
Publication statusPublished - 2019 Jan 1

Fingerprint

Reconfigurable architectures
Multicasting
Processing
Program processors
Energy utilization
Controllers
Data compression
Scheduling algorithms
Image processing
Electric power utilization
Bandwidth
Communication

Keywords

  • CGRA
  • Configuration reduction
  • Integer-linear-program
  • Multicasting

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
  • Artificial Intelligence

Cite this

A fine-grained multicasting of configuration data for coarse-grained reconfigurable architectures. / Kojima, Takuya; Amano, Hideharu.

In: IEICE Transactions on Information and Systems, Vol. E102D, No. 7, 01.01.2019, p. 1247-1256.

Research output: Contribution to journalArticle

@article{0dbe46b8e1a84150aa22889aaa73a0b8,
title = "A fine-grained multicasting of configuration data for coarse-grained reconfigurable architectures",
abstract = "A novel configuration data compression technique for coarse-grained reconfigurable architectures (CGRAs) is proposed. Reducing the size of configuration data of CGRAs shortens the reconfiguration time especially when the communication bandwidth between a CGRA and a host CPU is limited. In addition, it saves energy consumption of configuration cache and controller. The proposed technique is based on a multicast configuration technique called RoMultiC, which reduces the configuration time by multicasting the same data to multiple PEs (Processing Elements) with two bit-maps. Scheduling algorithms for an optimizing the order of multicasting have been proposed. However, the multicasting is possible only if each PE has completely the same configuration. In general, configuration data for CGRAs can be divided into some fields like machine code formats of general perpose CPUs. The proposed scheme confines a part of fields for multicasting so that the possibility of multicasting more PEs can be increased. This paper analyzes algorithms to find a configuration pattern which maximizes the number of multicasted PEs. We implemented the proposed scheme to CMA (Cool Mega Array), a straight forward CGRA as a case study. Experimental results show that the proposed method achieves 40.0{\%} smaller configuration than a previous method for an image processing application at maximum. The exploration of the multicasted grain size reveals the effective grain size for each algorithm. Furthermore, since both a dynamic power consumption of the configuration controller and a configuration time are improved, it achieves 50.1{\%} reduction of the energy consumption for the configuration with a negligible area overhead.",
keywords = "CGRA, Configuration reduction, Integer-linear-program, Multicasting",
author = "Takuya Kojima and Hideharu Amano",
year = "2019",
month = "1",
day = "1",
doi = "10.1587/transinf.2018EDP7336",
language = "English",
volume = "E102D",
pages = "1247--1256",
journal = "IEICE Transactions on Information and Systems",
issn = "0916-8532",
publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
number = "7",

}

TY - JOUR

T1 - A fine-grained multicasting of configuration data for coarse-grained reconfigurable architectures

AU - Kojima, Takuya

AU - Amano, Hideharu

PY - 2019/1/1

Y1 - 2019/1/1

N2 - A novel configuration data compression technique for coarse-grained reconfigurable architectures (CGRAs) is proposed. Reducing the size of configuration data of CGRAs shortens the reconfiguration time especially when the communication bandwidth between a CGRA and a host CPU is limited. In addition, it saves energy consumption of configuration cache and controller. The proposed technique is based on a multicast configuration technique called RoMultiC, which reduces the configuration time by multicasting the same data to multiple PEs (Processing Elements) with two bit-maps. Scheduling algorithms for an optimizing the order of multicasting have been proposed. However, the multicasting is possible only if each PE has completely the same configuration. In general, configuration data for CGRAs can be divided into some fields like machine code formats of general perpose CPUs. The proposed scheme confines a part of fields for multicasting so that the possibility of multicasting more PEs can be increased. This paper analyzes algorithms to find a configuration pattern which maximizes the number of multicasted PEs. We implemented the proposed scheme to CMA (Cool Mega Array), a straight forward CGRA as a case study. Experimental results show that the proposed method achieves 40.0% smaller configuration than a previous method for an image processing application at maximum. The exploration of the multicasted grain size reveals the effective grain size for each algorithm. Furthermore, since both a dynamic power consumption of the configuration controller and a configuration time are improved, it achieves 50.1% reduction of the energy consumption for the configuration with a negligible area overhead.

AB - A novel configuration data compression technique for coarse-grained reconfigurable architectures (CGRAs) is proposed. Reducing the size of configuration data of CGRAs shortens the reconfiguration time especially when the communication bandwidth between a CGRA and a host CPU is limited. In addition, it saves energy consumption of configuration cache and controller. The proposed technique is based on a multicast configuration technique called RoMultiC, which reduces the configuration time by multicasting the same data to multiple PEs (Processing Elements) with two bit-maps. Scheduling algorithms for an optimizing the order of multicasting have been proposed. However, the multicasting is possible only if each PE has completely the same configuration. In general, configuration data for CGRAs can be divided into some fields like machine code formats of general perpose CPUs. The proposed scheme confines a part of fields for multicasting so that the possibility of multicasting more PEs can be increased. This paper analyzes algorithms to find a configuration pattern which maximizes the number of multicasted PEs. We implemented the proposed scheme to CMA (Cool Mega Array), a straight forward CGRA as a case study. Experimental results show that the proposed method achieves 40.0% smaller configuration than a previous method for an image processing application at maximum. The exploration of the multicasted grain size reveals the effective grain size for each algorithm. Furthermore, since both a dynamic power consumption of the configuration controller and a configuration time are improved, it achieves 50.1% reduction of the energy consumption for the configuration with a negligible area overhead.

KW - CGRA

KW - Configuration reduction

KW - Integer-linear-program

KW - Multicasting

UR - http://www.scopus.com/inward/record.url?scp=85069745345&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85069745345&partnerID=8YFLogxK

U2 - 10.1587/transinf.2018EDP7336

DO - 10.1587/transinf.2018EDP7336

M3 - Article

AN - SCOPUS:85069745345

VL - E102D

SP - 1247

EP - 1256

JO - IEICE Transactions on Information and Systems

JF - IEICE Transactions on Information and Systems

SN - 0916-8532

IS - 7

ER -