
An Efficient Sparse Matrix Storage Format for Sparse Matrix-Vector Multiplication and Sparse Matrix-Transpose-Vector Multiplication on GPUs

https://ipsj.ixsq.nii.ac.jp/records/231113
30bbc993-8a07-475f-bc7e-3f5180becebd
Name / File: IPSJ-HPC23192035.pdf (2.3 MB)
License: Copyright (c) 2023 by the Institute of Electronics, Information and Communication Engineers. This SIG report is only available to those in membership of the SIG.
HPC: Member: ¥0, DLIB: Member: ¥0
Item type SIG Technical Reports (1)
Publication date 2023-11-28
Title
Language en
Title An Efficient Sparse Matrix Storage Format for Sparse Matrix-Vector Multiplication and Sparse Matrix-Transpose-Vector Multiplication on GPUs
Language
Language eng
Keyword
Subject Scheme Other
Subject Accelerator
Resource type
Resource type identifier http://purl.org/coar/resource_type/c_18gh
Resource type technical report
Author affiliation
Japan Advanced Institute of Science and Technology
Author affiliation
Japan Advanced Institute of Science and Technology
Author affiliation (en)
Japan Advanced Institute of Science and Technology
Author affiliation (en)
Japan Advanced Institute of Science and Technology
Author name Ryohei, Izawa
Yasushi, Inoguchi
Author name (en) Ryohei, Izawa
Yasushi, Inoguchi
Abstract
Description type Other
Description The utilization of sparse matrix storage formats is widespread across various fields, including scientific computing, machine learning, and statistics. Within these domains, there is a need to perform Sparse Matrix-Vector Multiplication (SpMV) and Sparse Matrix-Transpose-Vector Multiplication (SpMVT) iteratively within a single application. However, executing SpMV and SpMVT on GPUs using existing sparse matrix storage formats presents challenges in terms of memory usage, memory access, and load balancing. In our study, we present a novel sparse matrix storage format named GCSB, designed specifically for optimizing SpMV and SpMVT operations on GPUs through the implementation of advanced memory compression techniques. Expanding upon the pre-existing CSB format compatible with CPU-based SpMV and SpMVT, we extend its functionality to the GPU environment. This adaptation enables quicker execution of SpMV and SpMVT in comparison to CSR, achieved by effectively utilizing the L1 cache and ensuring load balancing, while maintaining theoretical memory usage equivalent to that of CSR. Through our experiments, we demonstrate that GCSB achieves comparable theoretical memory usage to CSR while outperforming CSR in terms of speed on various matrices sourced from the University of Florida Sparse Matrix Collection. GCSB achieves a speedup of up to 1.47 on TITAN RTX and up to 2.75 on A100. Furthermore, we show that GCSB reduces L1 cache miss counts by strategically grouping and rearranging non-zero elements. Additionally, we conduct a qualitative assessment, affirming that GCSB exhibits superior performance, particularly when non-zero elements are widely dispersed throughout the matrix and the proportion of non-zero elements within the matrix is relatively high.
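
For context, here is a minimal CUDA sketch of the CSR baseline that the abstract compares GCSB against; it is not the GCSB format itself (the record does not describe its layout), and the array names row_ptr, col_idx, and vals are conventional assumptions rather than names taken from the paper. The transpose kernel illustrates why SpMVT is awkward in plain CSR: every nonzero must scatter into the output vector with an atomic update, which is exactly the kind of memory-access and load-balancing problem the abstract mentions.

// Minimal sketch, assuming conventional CSR arrays (not the paper's GCSB format).
// Compile for sm_60 or newer so that atomicAdd on double is available.
#include <cuda_runtime.h>

// y = A * x: one thread per row, reading that row's nonzeros contiguously.
__global__ void spmv_csr(int n_rows, const int *row_ptr, const int *col_idx,
                         const double *vals, const double *x, double *y) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= n_rows) return;
    double sum = 0.0;
    for (int j = row_ptr[row]; j < row_ptr[row + 1]; ++j)
        sum += vals[j] * x[col_idx[j]];
    y[row] = sum;
}

// y = A^T * x using the same untransposed CSR arrays.
// Each nonzero A(row, col) contributes vals[j] * x[row] to y[col], so threads
// handling different rows may collide on the same output entry and need
// atomicAdd; the scattered writes also hurt caching and coalescing.
__global__ void spmvt_csr(int n_rows, const int *row_ptr, const int *col_idx,
                          const double *vals, const double *x, double *y) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= n_rows) return;
    double xr = x[row];
    for (int j = row_ptr[row]; j < row_ptr[row + 1]; ++j)
        atomicAdd(&y[col_idx[j]], vals[j] * xr);
}

Both kernels would be launched over a one-dimensional grid covering the rows, for example spmv_csr<<<(n_rows + 255) / 256, 256>>>(n_rows, row_ptr, col_idx, vals, x, y), with y zero-initialized before the transpose kernel. A blocked format such as CSB, which the abstract says GCSB extends, avoids keeping a second, transposed copy of the matrix by grouping nonzeros into blocks that can be traversed in either orientation.
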
Bibliographic record ID
Source identifier type NCID
Source identifier AN10463942
Bibliographic information IPSJ SIG Technical Report: High Performance Computing (HPC)

Volume 2023-HPC-192, Number 35, pp. 1-6, Issue date 2023-11-28
ISSN
Source identifier type ISSN
Source identifier 2188-8841
Notice
SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.
Publisher
Language ja
Publisher Information Processing Society of Japan

Versions

Ver.1 2025-01-19 10:52:23.610916