GPU上での高速なブロック化フロイド・ワーシャル法

奥山, 倫弘; 伊野, 文彦; 萩原, 兼一; Tomohiro, Okuyama; Fumihiko, Ino; Kenichi, Hagihara

WEKO3

lat lon distance

[[sub_check.contents]]

[[sub_radio.contents]]

Field does not validate

[[sub_attr.contents]]　

インデックスツリー

アイテム

GPU上での高速なブロック化フロイド・ワーシャル法

https://ipsj.ixsq.nii.ac.jp/records/69734

名前 / ファイル	ライセンス	アクション
IPSJ-TACS0302009.pdf (525.9 kB)	Copyright (c) 2010 by the Information Processing Society of Japan
オープンアクセス

Item type

Trans(1)

公開日

2010-06-21

タイトル

GPU上での高速なブロック化フロイド・ワーシャル法

タイトル

言語

タイトル

Fast Blocked Floyd-Warshall Algorithm on the GPU

言語

jpn

キーワード

主題Scheme

Other

主題

GPU応用

資源タイプ

資源タイプ識別子

http://purl.org/coar/resource_type/c_6501

資源タイプ

journal article

著者所属

大阪大学大学院情報科学研究科コンピュータサイエンス専攻

著者所属

大阪大学大学院情報科学研究科コンピュータサイエンス専攻

著者所属

大阪大学大学院情報科学研究科コンピュータサイエンス専攻

著者所属(英)

Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

著者所属(英)

Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

著者所属(英)

Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

著者名

奥山, 倫弘

著者名(英)

Tomohiro, Okuyama

論文抄録

内容記述タイプ

Other

内容記述

本稿では，GPU （Graphics Processing Unit）を用いて全点対最短経路問題を高速に解くために，反復型ブロック化フロイド・ワーシャル法の高速化手法を提案する．提案手法は 2 種類あり，いずれもオフチップメモリの参照量を削減することで高速化を図る．1 つ目の手法は，共有メモリの代わりにレジスタを優先的に用い，命令数を削減する．もう 1 つの手法は，オフチップメモリの参照量がブロックの大きさに反比例することに着目し，より大きなブロックを用いる．既存の再帰型ブロック化手法と比較した結果，頂点数が 256～1,024 個の場合，両手法ともに 4% 以上高速であった．頂点数が多い場合，性能は 1～10% ほど下回った．GPU のピーク演算性能に対して 70% の効率を達成した．

論文抄録(英)

内容記述タイプ

Other

内容記述

This paper proposes an acceleration method for the iterative blocked Floyd-Warshall (BFW) algorithm, aiming at solving the all-pairs shortest path problem rapidly on the graphics processing unit (GPU). The proposed method contains two variations, both designed to reduce data access to off-chip memory. The first one also reduces the number of instructions by using registers rather than shared memory. The remaining one increases the block size because it is inversely proportional to the amount of off-chip memory access. For graphs with 256-1,024 vertices, both variations demonstrate 4% faster performance than a previous method that employs a recursive BFW algorithm. For larger graphs, on the other hand, our iterative method is 1-10% slower than the recursive method. The proposed method achieves approximately 70% of peak computational performance.

書誌レコードID

収録物識別子タイプ

NCID

収録物識別子

AA11833852

書誌情報

情報処理学会論文誌コンピューティングシステム（ACS）

巻 3, 号 2, p. 57-66, 発行日 2010-06-21

ISSN

収録物識別子タイプ

ISSN

収録物識別子

1882-7829

出版者

言語

出版者

情報処理学会

戻る

views

See details

	Views

Versions

Ver.1

2025-01-21 23:47:55.363232

Show All versions

Cite as

エクスポート

OAI-PMH

JPCOAR
DublinCore
DDI

Other Formats

JSON
BIBTEX

インデックスリンク

インデックスツリー

アイテム

GPU上での高速なブロック化フロイド・ワーシャル法

× 奥山, 倫弘

× Tomohiro, Okuyama

Versions

Share

Cite as

エクスポート