QUEST: A 7.49TOPS multi-purpose log-quantized DNN inference engine stacked on 96MB 3D SRAM using inductive-coupling technology in 40nm CMOS

Kodai Ueyoshi, Kota Ando, Kazutoshi Hirose, Shinya Takamaeda-Yamazaki, Junichiro Kadomoto, Tomoki Miyata, Mototsugu Hamada, Tadahiro Kuroda, Masato Motomura

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    42 Citations (Scopus)

    Abstract

    A key consideration for deep neural network (DNN) inference accelerators is the need for large, high-bandwidth external memories. Although an architectural concept for stacking a DNN accelerator on DRAMs has been proposed previously, long DRAM latency remains problematic and limits performance [1]. Recent algorithm-level optimizations, such as network pruning and compression, have succeeded in reducing DNN memory size [2]; however, because the resulting networks are irregular and sparse, these techniques create an additional need for agile random access to the memory system.
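    As context for the "log-quantized" engine named in the title, the sketch below illustrates the general idea of logarithmic quantization: each weight is mapped to a signed power of two, so a multiply-accumulate can be realized with a shift rather than a full multiplier. This is a minimal illustrative example, not the paper's implementation; the function name, bit width, and exponent range are assumptions.

    ```python
    import math

    def log_quantize(w, bits=4):
        """Illustrative sketch (hypothetical helper, not the paper's code):
        map weight w to the nearest signed power of two, with the exponent
        clamped to a `bits`-wide signed range."""
        if w == 0.0:
            return 0.0
        sign = 1.0 if w > 0 else -1.0
        exp = round(math.log2(abs(w)))          # nearest power-of-two exponent
        lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        exp = max(lo, min(hi, exp))             # clamp to representable range
        return sign * (2.0 ** exp)
    ```

    With weights constrained this way, storing only the sign and the small exponent shrinks the memory footprint, and the hardware datapath can replace multipliers with barrel shifters.
    
    
    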

    Original language: English
    Title of host publication: 2018 IEEE International Solid-State Circuits Conference, ISSCC 2018
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 216-218
    Number of pages: 3
    Volume: 61
    ISBN (Electronic): 9781509049394
    DOIs
    Publication status: Published - 2018 Mar 8
    Event: 65th IEEE International Solid-State Circuits Conference, ISSCC 2018 - San Francisco, United States
    Duration: 2018 Feb 11 - 2018 Feb 15

    Other

    Other: 65th IEEE International Solid-State Circuits Conference, ISSCC 2018
    Country: United States
    City: San Francisco
    Period: 18/2/11 - 18/2/15

    ASJC Scopus subject areas

    • Electronic, Optical and Magnetic Materials
    • Electrical and Electronic Engineering
