@techreport{oai:ipsj.ixsq.nii.ac.jp:00081361,
  author = {Katagiri, Takahiro and Ozaki, Katsuhisa and Ogita, Takeshi and Oishi, Shin'ichi},
  title = {高精度行列‐行列積アルゴリズムのスレッド並列化とABCLibScriptへの機能実装 [Thread Parallelization of a High-Accuracy Matrix-Matrix Multiplication Algorithm and Its Implementation in ABCLibScript]},
  issue = {26},
  month = {Mar},
  year = {2012},
  note = {BLAS (Basic Linear Algebra Subprograms), a library that collects basic linear algebra operations such as matrix-matrix multiplication, is essential to many linear algebra computations. Conventional numerical libraries are optimized for execution speed but give insufficient consideration to computational accuracy, so guaranteeing the accuracy of solutions has become an important issue. In this research, we apply two kinds of thread parallelization to the high-accuracy matrix-matrix multiplication algorithm developed by the Oishi group. A preliminary evaluation showed that the parallelization method must be switched according to the scale of parallel processing. We also implemented auto-tuning (AT) that performs this switching, using the AT language ABCLibScript. In a performance evaluation on the T2K Open Supercomputer (1 node, 16 threads), switching the parallelization method through AT yielded a speedup of up to approximately 5x.}
}