A domain specific language and toolchain for OpenCV Runtime Binary Acceleration using GPU

Takaaki Miyajima, David Thomas, Hideharu Amano

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Computationally intensive applications, such as OpenCV, can be off-loaded to accelerators to reduce execution time. However, developing an accelerated system requires a significant amount of time, requiring the developer to first choose an accelerator and which parts to off-load, then to port the off-loaded kernels to the accelerator using many accelerator-specific tools. In addition to the low-level parallelism of the accelerator, the developer also needs to extract and utilize system-level parallelism found within the application, while making sure that the application still executes correctly. This paper presents Courier, a toolchain and a domain specific language for Runtime Binary Acceleration, designed to simplify many of the steps involved in accelerating an application. The Courier toolchain can extract dataflow from a running software binary file, explore the off-loaded execution time on an accelerator, and then actually accelerate the original binary. By utilizing Courier, both expert and non-expert users can easily extract system-level parallelism and decide which part should be off-loaded to accelerators in a mixed software-hardware environment, without special knowledge of the target application source code and accelerator architecture. In a case study, an OpenCV application is accelerated by 2.06 times using Courier, without requiring the application source code or any re-compilation of the application.

Original language: English
Title of host publication: Proceedings of the 2012 3rd International Conference on Networking and Computing, ICNC 2012
Pages: 175-181
Number of pages: 7
DOI: https://doi.org/10.1109/ICNC.2012.34
Publication status: Published - 2012
Event: 2012 3rd International Conference on Networking and Computing, ICNC 2012 - Naha, Okinawa, Japan
Duration: 2012 Dec 5 to 2012 Dec 7

Other

Other: 2012 3rd International Conference on Networking and Computing, ICNC 2012
Country: Japan
City: Naha, Okinawa
Period: 12/12/5 to 12/12/7

Fingerprint

Particle accelerators
Graphics processing unit
Hardware

ASJC Scopus subject areas

  • Computer Networks and Communications

Cite this

Miyajima, T., Thomas, D., & Amano, H. (2012). A domain specific language and toolchain for OpenCV Runtime Binary Acceleration using GPU. In Proceedings of the 2012 3rd International Conference on Networking and Computing, ICNC 2012 (pp. 175-181). [6424560] https://doi.org/10.1109/ICNC.2012.34

A domain specific language and toolchain for OpenCV Runtime Binary Acceleration using GPU. / Miyajima, Takaaki; Thomas, David; Amano, Hideharu.

Proceedings of the 2012 3rd International Conference on Networking and Computing, ICNC 2012. 2012. p. 175-181 6424560.

Miyajima, T, Thomas, D & Amano, H 2012, A domain specific language and toolchain for OpenCV Runtime Binary Acceleration using GPU. in Proceedings of the 2012 3rd International Conference on Networking and Computing, ICNC 2012., 6424560, pp. 175-181, 2012 3rd International Conference on Networking and Computing, ICNC 2012, Naha, Okinawa, Japan, 12/12/5. https://doi.org/10.1109/ICNC.2012.34
Miyajima T, Thomas D, Amano H. A domain specific language and toolchain for OpenCV Runtime Binary Acceleration using GPU. In Proceedings of the 2012 3rd International Conference on Networking and Computing, ICNC 2012. 2012. p. 175-181. 6424560 https://doi.org/10.1109/ICNC.2012.34
Miyajima, Takaaki ; Thomas, David ; Amano, Hideharu. / A domain specific language and toolchain for OpenCV Runtime Binary Acceleration using GPU. Proceedings of the 2012 3rd International Conference on Networking and Computing, ICNC 2012. 2012. pp. 175-181
@inproceedings{6f5583d5de7c4b74b8660f5b52c16e07,
title = "A domain specific language and toolchain for OpenCV Runtime Binary Acceleration using GPU",
abstract = "Computationally intensive applications, such as OpenCV, can be off-loaded to accelerators to reduce execution time. However, developing an accelerated system requires a significant amount of time, requiring the developer to first choose an accelerator and which parts to off-load, then to port the off-loaded kernels to the accelerator using many accelerator-specific tools. In addition to the low-level parallelism of the accelerator, the developer also needs to extract and utilize system-level parallelism found within the application, while making sure that the application still executes correctly. This paper presents Courier, a toolchain and a domain specific language for Runtime Binary Acceleration, designed to simplify many of the steps involved in accelerating an application. The Courier toolchain can extract dataflow from a running software binary file, explore the off-loaded execution time on an accelerator, and then actually accelerate the original binary. By utilizing Courier, both expert and non-expert users can easily extract system-level parallelism and decide which part should be off-loaded to accelerators in a mixed software-hardware environment, without special knowledge of the target application source code and accelerator architecture. In a case study, an OpenCV application is accelerated by 2.06 times using Courier, without requiring the application source code or any re-compilation of the application.",
author = "Takaaki Miyajima and David Thomas and Hideharu Amano",
year = "2012",
doi = "10.1109/ICNC.2012.34",
language = "English",
isbn = "9780769548937",
pages = "175--181",
booktitle = "Proceedings of the 2012 3rd International Conference on Networking and Computing, ICNC 2012",
}

TY - CONF

T1 - A domain specific language and toolchain for OpenCV Runtime Binary Acceleration using GPU

AU - Miyajima, Takaaki

AU - Thomas, David

AU - Amano, Hideharu

PY - 2012

Y1 - 2012

N2 - Computationally intensive applications, such as OpenCV, can be off-loaded to accelerators to reduce execution time. However, developing an accelerated system requires a significant amount of time, requiring the developer to first choose an accelerator and which parts to off-load, then to port the off-loaded kernels to the accelerator using many accelerator-specific tools. In addition to the low-level parallelism of the accelerator, the developer also needs to extract and utilize system-level parallelism found within the application, while making sure that the application still executes correctly. This paper presents Courier, a toolchain and a domain specific language for Runtime Binary Acceleration, designed to simplify many of the steps involved in accelerating an application. The Courier toolchain can extract dataflow from a running software binary file, explore the off-loaded execution time on an accelerator, and then actually accelerate the original binary. By utilizing Courier, both expert and non-expert users can easily extract system-level parallelism and decide which part should be off-loaded to accelerators in a mixed software-hardware environment, without special knowledge of the target application source code and accelerator architecture. In a case study, an OpenCV application is accelerated by 2.06 times using Courier, without requiring the application source code or any re-compilation of the application.

AB - Computationally intensive applications, such as OpenCV, can be off-loaded to accelerators to reduce execution time. However, developing an accelerated system requires a significant amount of time, requiring the developer to first choose an accelerator and which parts to off-load, then to port the off-loaded kernels to the accelerator using many accelerator-specific tools. In addition to the low-level parallelism of the accelerator, the developer also needs to extract and utilize system-level parallelism found within the application, while making sure that the application still executes correctly. This paper presents Courier, a toolchain and a domain specific language for Runtime Binary Acceleration, designed to simplify many of the steps involved in accelerating an application. The Courier toolchain can extract dataflow from a running software binary file, explore the off-loaded execution time on an accelerator, and then actually accelerate the original binary. By utilizing Courier, both expert and non-expert users can easily extract system-level parallelism and decide which part should be off-loaded to accelerators in a mixed software-hardware environment, without special knowledge of the target application source code and accelerator architecture. In a case study, an OpenCV application is accelerated by 2.06 times using Courier, without requiring the application source code or any re-compilation of the application.

UR - http://www.scopus.com/inward/record.url?scp=84874272435&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84874272435&partnerID=8YFLogxK

U2 - 10.1109/ICNC.2012.34

DO - 10.1109/ICNC.2012.34

M3 - Conference contribution

SN - 9780769548937

SP - 175

EP - 181

BT - Proceedings of the 2012 3rd International Conference on Networking and Computing, ICNC 2012

ER -