Limits of Thread-Level Parallelism in Non-numerical Programs
https://ipsj.ixsq.nii.ac.jp/records/18326
License | Access |
---|---|
Copyright (c) 2006 by the Information Processing Society of Japan | Open access |

Item type | Trans(1) |
---|---|
Publication date | 2006-05-15 |
Title | Limits of Thread-Level Parallelism in Non-numerical Programs |
Title language | en |
Language | eng |
Keyword | System performance evaluation (subject scheme: Other) |
Resource type | journal article |
Resource type identifier | http://purl.org/coar/resource_type/c_6501 |
Author affiliation | Department of Electrical Engineering and Computer Science, Nagoya University, Presently with Hitachi Ltd. |
Author affiliation | Department of Electrical Engineering and Computer Science, Nagoya University |
Author affiliation | Department of Computational Science and Engineering, Nagoya University |
Author affiliation | Department of Electrical Engineering and Computer Science, Nagoya University |
Author | Akio, Nakajima |
Description type | Other |
Abstract | Chip multiprocessors (CMPs), which recently became available with the advance of LSI technology, can outperform current superscalar processors by exploiting thread-level parallelism (TLP). However, the effectiveness of CMPs unfortunately depends greatly on their applications. In particular, they have so far not brought any significant benefit to non-numerical programs. This study explores what techniques are required to extract large amounts of TLP in non-numerical programs. We focus particularly on three techniques: thread partitioning with various control structure levels, speculative thread execution, and speculative register communication. We evaluate these techniques by examining the upper bound of the TLP, using trace-driven simulations. Our results are as follows. First, little TLP can be extracted without both of the speculations in any of the partitioning levels. Second, with the speculations, available TLP is still limited in conventional function-level and loop-level partitioning. However, it increases considerably with basic block-level partitioning. Finally, in basic block-level partitioning, focusing on control-equivalence instead of post-domination can significantly reduce the compile time, with a modest degradation of TLP. |
Bibliographic record ID (NCID) | AA11833852 |
Bibliographic information | IPSJ Transactions on Advanced Computing Systems (ACS), Vol. 47, No. SIG7 (ACS14), pp. 12-20, issued 2006-05-15 |
ISSN | 1882-7829 |
Publisher | Information Processing Society of Japan |