| Item type | SIG Technical Reports (1) |
| Date of publication | 2025-07-28 |
| Title (ja) | Can Tensor Cores Accelerate Non-GEMM Workloads? An Analytical Study |
| Title (en) | Can Tensor Cores Accelerate Non-GEMM Workloads? An Analytical Study |
| Language | eng |
| Keyword | Subject scheme: Other; Subject: Computing methods |
| Resource type | technical report (identifier: http://purl.org/coar/resource_type/c_18gh) |
| Author affiliation | RIKEN Center for Computational Science |
| Author affiliation | University of South Florida |
| Author affiliation | Argonne National Laboratory |
| Author affiliation | RIKEN Center for Computational Science |
| Author affiliation | RIKEN Center for Computational Science |
| Author affiliation (en) | RIKEN Center for Computational Science |
| Author affiliation (en) | University of South Florida |
| Author affiliation (en) | Argonne National Laboratory |
| Author affiliation (en) | RIKEN Center for Computational Science |
| Author affiliation (en) | RIKEN Center for Computational Science |
| Author name(s) | Lingqi Zhang, Jiajun Huang, Sheng Di, Satoshi Matsuoka, Mohamed Wahib |
| Author name(s) (en) | Lingqi Zhang, Jiajun Huang, Sheng Di, Satoshi Matsuoka, Mohamed Wahib |
| Abstract (Description type: Other) | Tensor Cores are specialized units integrated into modern GPUs, designed to accelerate dense matrix operations with remarkable efficiency. They have proven particularly effective in compute-bound workloads, such as those found in deep learning training, where general matrix-matrix multiplication (GEMM) is prevalent. Motivated by this success, recent efforts have explored extending Tensor Core usage to non-GEMM computational patterns. However, effectively utilizing Tensor Cores in these broader contexts requires a thorough understanding of their performance characteristics across diverse workloads. This work investigates the applicability of Tensor Cores to non-GEMM workloads, seeking to answer a fundamental question: Can Tensor Cores accelerate non-GEMM kernels? |
| Abstract (en) (Description type: Other) | Tensor Cores are specialized units integrated into modern GPUs, designed to accelerate dense matrix operations with remarkable efficiency. They have proven particularly effective in compute-bound workloads, such as those found in deep learning training, where general matrix-matrix multiplication (GEMM) is prevalent. Motivated by this success, recent efforts have explored extending Tensor Core usage to non-GEMM computational patterns. However, effectively utilizing Tensor Cores in these broader contexts requires a thorough understanding of their performance characteristics across diverse workloads. This work investigates the applicability of Tensor Cores to non-GEMM workloads, seeking to answer a fundamental question: Can Tensor Cores accelerate non-GEMM kernels? |
| Bibliographic record ID | NCID AN10463942 |
| Bibliographic information | IPSJ SIG Technical Reports: High Performance Computing (HPC), Vol. 2025-HPC-200, No. 3, pp. 1-8, issued 2025-07-28 |
| ISSN | 2188-8841 |
| Notice | SIG Technical Reports are non-refereed and hence may later appear in any journals, conferences, symposia, etc. |
| Publisher (ja) | Information Processing Society of Japan (情報処理学会) |
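
As context for the fundamental question posed in the abstract above, the sketch below shows the canonical CUDA WMMA pattern by which a single GEMM tile is issued to a Tensor Core. This is a minimal illustration assuming a 16x16x16 FP16 tile with FP32 accumulation; the kernel and buffer names are hypothetical and are not drawn from the report itself.

```cuda
// Minimal WMMA sketch (illustrative, not from the report): one warp
// computes D = A * B for a single 16x16x16 tile on a Tensor Core.
// Compile with: nvcc -arch=sm_70 (or newer).
#include <mma.h>
#include <cuda_fp16.h>
using namespace nvcuda;

__global__ void wmma_tile_gemm(const half *a, const half *b, float *d) {
    // Per-warp register fragments for the A, B, and accumulator tiles.
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc_frag;

    wmma::fill_fragment(acc_frag, 0.0f);    // start from C = 0
    wmma::load_matrix_sync(a_frag, a, 16);  // leading dimension = 16
    wmma::load_matrix_sync(b_frag, b, 16);
    wmma::mma_sync(acc_frag, a_frag, b_frag, acc_frag);  // Tensor Core MMA
    wmma::store_matrix_sync(d, acc_frag, 16, wmma::mem_row_major);
}
```

Launched with a single 32-thread warp, e.g. `wmma_tile_gemm<<<1, 32>>>(dA, dB, dD);`, the entire tile multiply-accumulate executes as one cooperative Tensor Core operation. The question the report examines analytically is whether non-GEMM kernels can be recast into such tile operations at a profit.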