2024-03-29T05:10:31Zhttps://ipsj.ixsq.nii.ac.jp/ej/?action=repository_oaipmhoai:ipsj.ixsq.nii.ac.jp:001982012020-10-27T05:02:43Z00934:00935:09619:09824
Parallelization of Matrix Partitioning in Construction of Hierarchical Matrices Using Task Parallel LanguagesParallelization of Matrix Partitioning in Construction of Hierarchical Matrices Using Task Parallel Languageseng[発表概要,Unrefereed Presentation Abstract] http://id.nii.ac.jp/1001/00198111/Articlehttps://ipsj.ixsq.nii.ac.jp/ej/?action=repository_action_common_download&item_id=198201&item_no=1&attribute_id=1&file_no=1Copyright (c) 2019 by the Information Processing Society of JapanGraduate School of Informatics, Kyoto UniversityAcademic Center for Computing and Media Studies, Kyoto UniversityAcademic Center for Computing and Media Studies, Kyoto UniversityInformation Technology Center, the University of TokyoDepartment of Artificial Intelligence, Kyushu Institute of TechnologyZhengyang, BaiTasuku, HiraishiHiroshi, NakashimaAkihiro, IdaMasahiro, YasugiHierarchical matrix (H-matrix) is an approximated form to represent N × N correlations of N objects. H-matrix construction is done by partitioning a matrix into submatrices, followed by calculating element values of these submatrices. This paper proposes implementations of the matrix partitioning using task parallel languages, Cilk Plus and Tascell. The matrix partitioning is divided into two steps: cluster tree construction by dividing objects into clusters hierarchically, and block cluster tree construction by finding out all cluster pairs that satisfy an admissiblity condition from the cluster tree. Because the cluster tree constructed and traversed in these steps is unpredictably unbalanced, it is expected that we can parallelize these both steps efficiently using task parallel languages. To obtain sufficient parallelism in the cluster tree construction, we not only execute recursive calls in parallel, but also parallelized inside of each recursive step. In the block cluster tree construction, we assigned each workers its own space to store the cluster pairs without using locks. As a result, comparing to a sequential implementation in C, we achieved a 11.8 times speedup using Cilk Plus for the cluster tree construction and a 38.8 times speedup using Tascell for the block cluster tree construction, both on a 36-core Xeon processor.Hierarchical matrix (H-matrix) is an approximated form to represent N × N correlations of N objects. H-matrix construction is done by partitioning a matrix into submatrices, followed by calculating element values of these submatrices. This paper proposes implementations of the matrix partitioning using task parallel languages, Cilk Plus and Tascell. The matrix partitioning is divided into two steps: cluster tree construction by dividing objects into clusters hierarchically, and block cluster tree construction by finding out all cluster pairs that satisfy an admissiblity condition from the cluster tree. Because the cluster tree constructed and traversed in these steps is unpredictably unbalanced, it is expected that we can parallelize these both steps efficiently using task parallel languages. To obtain sufficient parallelism in the cluster tree construction, we not only execute recursive calls in parallel, but also parallelized inside of each recursive step. In the block cluster tree construction, we assigned each workers its own space to store the cluster pairs without using locks. As a result, comparing to a sequential implementation in C, we achieved a 11.8 times speedup using Cilk Plus for the cluster tree construction and a 38.8 times speedup using Tascell for the block cluster tree construction, both on a 36-core Xeon processor.AA11464814情報処理学会論文誌プログラミング(PRO)12310102019-07-171882-78022019-07-10