2024-10-12T12:21:40Zhttps://ipsj.ixsq.nii.ac.jp/ej/?action=repository_oaipmhoai:ipsj.ixsq.nii.ac.jp:001742012020-10-27T05:02:56Z00934:00989:08398:08850
Performance Evaluation of Golub-Kahan-Lanczos Algorithm with Reorthogonalization by Classical Gram-Schmidt Algorithm and OpenMPPerformance Evaluation of Golub-Kahan-Lanczos Algorithm with Reorthogonalization by Classical Gram-Schmidt Algorithm and OpenMPeng[オリジナル論文] subset computation of singular pairs, Golub-Kahan-Lanczos algorithm with reorthogonalization, classical Gram-Schmidt algorithm with reorthogonalization, OpenMP, sharedmemory multi-core processinghttp://id.nii.ac.jp/1001/00174167/Articlehttps://ipsj.ixsq.nii.ac.jp/ej/?action=repository_action_common_download&item_id=174201&item_no=1&attribute_id=1&file_no=1Copyright (c) 2016 by the Information Processing Society of JapanResearch Group of Information and Communication Technology for Life, Nara Women's UniversityGraduate School of Informatics, Kyoto UniversityGraduate School of Informatics, Kyoto UniversityGraduate School of Informatics, Kyoto UniversityGraduate School of Informatics, Kyoto UniversityGraduate School of Informatics, Kyoto UniversityMasami, TakataHiroyuki, IshigamiKinji, KimuraYuki, FujiiHiroki, TanakaYoshimasa, NakamuraThe Golub-Kahan-Lanczos algorithm with reorthogonalization (GKLR algorithm) is an algorithm for computing a subset of singular triplets of large-scale sparse matrices. The reorthogonalization tends to become a bottleneck of the execution time, as the iteration number of the GKLR algorithm increases. In this paper, OpenMP-based parallel implementation of the classical Gram-Schmidt algorithm with reorthogonalization (OMP-CGS2 algorithm) is introduced. The OMP-CGS2 algorithm has the advantage of data reusability and is expected to achieve higher performance of the reorthogonalization computations on shared-memory multi-core processors with large caches than the conventional reorthogonalization algorithms. Numerical experiments on shared-memory multi-core processors show that the OMP-CGS2 algorithm accelerates the GKLR algorithm more effectively for computing a subset of singular triplets of some matrices than the conventional reorthogonalization algorithms. In addition, we discuss the cache utilization in the OMP-CGS2 algorithm and a condition that the OMP-CGS2 algorithm achieves higher performance than the CGS2 algorithm.The Golub-Kahan-Lanczos algorithm with reorthogonalization (GKLR algorithm) is an algorithm for computing a subset of singular triplets of large-scale sparse matrices. The reorthogonalization tends to become a bottleneck of the execution time, as the iteration number of the GKLR algorithm increases. In this paper, OpenMP-based parallel implementation of the classical Gram-Schmidt algorithm with reorthogonalization (OMP-CGS2 algorithm) is introduced. The OMP-CGS2 algorithm has the advantage of data reusability and is expected to achieve higher performance of the reorthogonalization computations on shared-memory multi-core processors with large caches than the conventional reorthogonalization algorithms. Numerical experiments on shared-memory multi-core processors show that the OMP-CGS2 algorithm accelerates the GKLR algorithm more effectively for computing a subset of singular triplets of some matrices than the conventional reorthogonalization algorithms. In addition, we discuss the cache utilization in the OMP-CGS2 algorithm and a condition that the OMP-CGS2 algorithm achieves higher performance than the CGS2 algorithm.AA11464803情報処理学会論文誌数理モデル化と応用（TOM）92182016-08-101882-77802016-08-04