| Item type |
SIG Technical Reports(1) |
| Publication date |
2015-02-23 |
| Title |
|
|
Language |
en |
|
Title |
A Communication Avoiding and Reducing Algorithm for Symmetric Eigenproblem for Very Small Matrices |
| Language |
|
|
Language |
eng |
| Keywords |
|
|
Subject scheme |
Other |
|
Subject |
Linear algebra |
| Resource type |
|
|
Resource type identifier |
http://purl.org/coar/resource_type/c_18gh |
|
Resource type |
technical report |
| Author affiliation |
|
|
|
Information Technology Center, The University of Tokyo |
| Author affiliation |
|
|
|
Department of Applied Physics, School of Engineering, The University of Tokyo |
| Author affiliation |
|
|
|
Department of Applied Physics, School of Engineering, The University of Tokyo |
| Author |
Katagiri, Takahiro
Iwata, Jun'ichi
Uchida, Kazuyuki
|
| Abstract |
|
|
Description type |
Other |
|
Description |
This paper considers a parallel symmetric eigensolver for very small matrices in massively parallel processing. We define very small matrices as those that fit in the per-node caches of a supercomputer. We assume that these sizes also match the exa-scale computing requirements of current production runs of an application. To minimize communication time, we added several communication-avoiding and communication-reducing algorithms based on Message Passing Interface (MPI) non-blocking implementations. A performance evaluation using up to all nodes of the FX10 system indicates that (1) the MPI non-blocking implementation is 3x as efficient as the baseline implementation, (2) the hybrid MPI execution is 1.9x faster than the pure MPI execution, and (3) our proposed solver is 2.3x and 22x faster than a ScaLAPACK routine with optimized blocking size and cyclic-cyclic distribution, respectively. |
| Bibliographic record ID |
|
|
Source identifier type |
NCID |
|
Source identifier |
AN10463942 |
| Bibliographic information |
Research Reports: High Performance Computing (HPC) (研究報告ハイパフォーマンスコンピューティング(HPC))
Vol. 2015-HPC-148,
No. 2,
pp. 1-17,
Issue date 2015-02-23
|
| Notice |
|
|
|
SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc. |
| Publisher |
|
|
Language |
ja |
|
Publisher |
Information Processing Society of Japan (情報処理学会) |