Near-memory data transformation for efficient sparse matrix multi-vector multiplication

Daichi Fujiki, Niladrish Chatterjee, Donghyuk Lee, Mike O'Connor

Research output: Conference contribution

7 Citations (Scopus)

Abstract

Efficient manipulation of sparse matrices is critical to a wide range of HPC applications. Increasingly, GPUs are used to accelerate these sparse matrix operations. We study one common operation, Sparse Matrix Multi-Vector Multiplication (SpMM), and evaluate the impact of the sparsity, distribution of non-zero elements, and tile-traversal strategies on GPU implementations. Using these insights, we determine that operating on these sparse matrices in a Densified Compressed Sparse Row (DCSR) format is well suited to the parallel warp-synchronous execution model of the GPU processing elements. Preprocessing or storing the sparse matrix in the DCSR format, however, often requires significantly more memory storage than the conventional Compressed Sparse Row (CSR) or Compressed Sparse Column (CSC) formats. Given that SpMM kernels are often bottlenecked on DRAM bandwidth, the increase in DRAM traffic to access the larger DCSR-formatted data structure can result in a slowdown for many matrices. We propose a near-memory transform engine to dynamically create DCSR-formatted tiles for the GPU processing elements from the CSC-formatted matrix in memory. This work enhances a GPU's last-level cache/memory controller unit to act as an efficient translator between the compute-optimized representation of data and its corresponding storage/bandwidth-optimized format to accelerate sparse workloads. Our approach achieves 2.26× better performance on average compared to cuSPARSE, the vendor-supplied optimized library for sparse matrix operations.
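As a rough illustration of the operations the abstract names, the following Python sketch computes SpMM from a CSR matrix, then repeats it by building row-compacted column tiles on the fly from a CSC copy of the same matrix. It is only a software analogy under stated assumptions: the abstract does not define the DCSR layout, so the tile structure here (non-empty rows only, with an explicit row-index list) is a guess, and the function names `spmm_csr`, `csc_to_row_tile`, and `spmm_tiled` as well as the `tile_width` parameter are hypothetical.

```python
def spmm_csr(row_ptr, col_idx, vals, B):
    """C = A @ B, with A in CSR and B a dense matrix stored as a list of rows."""
    k = len(B[0])
    C = [[0.0] * k for _ in range(len(row_ptr) - 1)]
    for r in range(len(row_ptr) - 1):
        for p in range(row_ptr[r], row_ptr[r + 1]):
            a, b_row = vals[p], B[col_idx[p]]
            for j in range(k):
                C[r][j] += a * b_row[j]
    return C


def csc_to_row_tile(col_ptr, row_idx, vals, c0, c1):
    """Build a row-major tile for columns [c0, c1) from CSC.

    Only rows with at least one non-zero in the tile are kept.
    ASSUMPTION: this layout is illustrative, not the paper's exact DCSR format.
    """
    by_row = {}
    for c in range(c0, c1):
        for p in range(col_ptr[c], col_ptr[c + 1]):
            by_row.setdefault(row_idx[p], []).append((c, vals[p]))
    rows = sorted(by_row)  # indices of the non-empty rows in this tile
    tile_ptr, tile_cols, tile_vals = [0], [], []
    for r in rows:
        for c, v in by_row[r]:
            tile_cols.append(c)
            tile_vals.append(v)
        tile_ptr.append(len(tile_vals))
    return rows, tile_ptr, tile_cols, tile_vals


def spmm_tiled(col_ptr, row_idx, vals, n_rows, B, tile_width):
    """C = A @ B, converting one column tile at a time from CSC storage."""
    n_cols, k = len(col_ptr) - 1, len(B[0])
    C = [[0.0] * k for _ in range(n_rows)]
    for c0 in range(0, n_cols, tile_width):
        c1 = min(c0 + tile_width, n_cols)
        rows, tp, tc, tv = csc_to_row_tile(col_ptr, row_idx, vals, c0, c1)
        for i, r in enumerate(rows):  # empty rows are never visited
            for p in range(tp[i], tp[i + 1]):
                for j in range(k):
                    C[r][j] += tv[p] * B[tc[p]][j]
    return C


# Example matrix A (3 x 4), with an empty middle row:
#   [1 0 0 2]
#   [0 0 0 0]
#   [0 3 4 0]
csr = ([0, 2, 2, 4], [0, 3, 1, 2], [1.0, 2.0, 3.0, 4.0])
csc = ([0, 1, 2, 3, 4], [0, 2, 2, 0], [1.0, 3.0, 4.0, 2.0])
B = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]  # two dense vectors

C_csr = spmm_csr(*csr, B)
C_tiled = spmm_tiled(*csc, 3, B, tile_width=2)
```

In this toy version the conversion runs on the host for every tile; the abstract's proposal is to perform the analogous CSC-to-DCSR translation in hardware near memory, so that the compact format is what occupies DRAM and the compute-friendly tiles are what the GPU processing elements receive.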

Original language: English
Title of host publication: Proceedings of SC 2019
Subtitle of host publication: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: IEEE Computer Society
ISBN (Electronic): 9781450362290
DOI
Publication status: Published - Nov 17 2019
Externally published: Yes
Event: 2019 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2019 - Denver, United States
Duration: Nov 17 2019 to Nov 22 2019

Publication series

Name: International Conference for High Performance Computing, Networking, Storage and Analysis, SC
ISSN (Print): 2167-4329
ISSN (Electronic): 2167-4337

Conference

Conference: 2019 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2019
Country/Territory: United States
City: Denver
Period: 19/11/17 to 19/11/22

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Software
