Informatics Plaza: Information Processing Society of Japan (IPSJ) Electronic Library

Performance of OSCAR Multigrain Parallelizing Compiler on SMP Servers and Embedded Multicore

https://ipsj.ixsq.nii.ac.jp/records/23047

Name / File	License	Action
IPSJ-ARC06170002.pdf (897.6 kB)	Copyright (c) 2006 by the Information Processing Society of Japan
Open access

Item type

SIG Technical Reports(1)

Publication date

2006-11-28

Title

SMPサーバ及び組込み用マルチコア上でのOSCARマルチグレイン自動並列化コンパイラの性能

Title (en)

Performance of OSCAR Multigrain Parallelizing Compiler on SMP Servers and Embedded Multicore

Language

jpn

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Resource type

technical report

Author affiliation (identical for all seven authors)

Department of Computer Science, Waseda University (早稲田大学理工学部コンピュータ・ネットワーク工学科)

Authors

白子 準, 田川 友博, 三浦 剛, 宮本 孝道, 中野 啓史, 木村 啓二, 笠原 博徳

Authors (en)

Jun SHIRAKO, Tomohiro TAGAWA, Tsuyoshi MIURA, Takamichi MIYAMOTO, Hirofumi NAKANO, Keiji KIMURA, Hironori KASAHARA

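Abstract

Description type

Other

Description

Multicore processors are attracting considerable attention as a way to achieve scalable performance gains as semiconductor integration density rises, together with low power consumption and good price-performance. Automatic parallelizing compilers play an important role in extracting the full performance of such multicore processors and in shortening software and hardware development periods. This paper describes a performance evaluation, on the latest SMP servers and an embedded multicore processor, of the OSCAR multigrain automatic parallelizing compiler, which parallelizes the whole program by using coarse grain task parallelization and near fine grain parallelization in addition to loop parallelization. The OSCAR compiler determines an appropriate number of processors and parallelization technique for each part of the program, and implements global cache memory optimization spanning multiple loops and coarse grain tasks. In an evaluation using all 10 SPEC CFP95 benchmarks and 4 CFP2000 benchmarks, the OSCAR compiler achieved on average a 2.74 times speedup over the automatic parallelization of the IBM XL Fortran compiler version 10.1 on an IBM p5 550Q Power5+ 8-processor server, and on average a 4.82 times speedup over the automatic parallelization of the IBM XL Fortran compiler version 8.1 on an IBM pSeries690 Power4 24-processor server. On the NEC/ARM MPCore, an embedded multicore integrating four ARMv6 processors, automatic parallelization by the OSCAR compiler was realized by supporting a subset of the OpenMP API. In an evaluation using SPEC CFP95 with data sets reduced for embedded use, the speedups over sequential execution were 4.08 times for tomcatv, 3.90 times for swim, 2.21 times for su2cor, 3.53 times for hydro2d, 3.85 times for mgrid, 3.62 times for applu, and 3.20 times for turb3d.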

Abstract (en)

Description type

Other

Description

Multiprocessor systems, especially multicore processors, are currently attracting much attention for their performance, low power consumption, and short hardware/software development periods. To take full advantage of multiprocessor systems, parallelizing compilers play an important role. This paper describes the execution performance of the OSCAR multigrain parallelizing compiler, which uses coarse grain task parallelization and near fine grain parallelization in addition to loop parallelization, on the latest SMP servers and an SMP embedded multicore. The OSCAR compiler automatically determines the parallelizing layer, deciding the suitable number of processors and parallelizing technique for each nested part of the program, and performs global cache memory optimization over loops and coarse grain tasks. In the performance evaluation using 10 SPEC CFP95 benchmark programs and 4 SPEC CFP2000 programs, the OSCAR compiler gave on average a 2.74 times speedup compared with the IBM XL Fortran compiler 10.1 on an IBM p5 550Q Power5+ 8-processor server, and on average a 4.82 times speedup compared with the IBM XL Fortran compiler 8.1 on an IBM pSeries690 Power4 24-processor server. The OSCAR compiler can also be applied to the NEC/ARM MPCore ARMv6 4-processor low-power embedded multicore, using a subset of the OpenMP libraries and the g77 compiler. In the evaluation using SPEC CFP95 benchmarks with reduced data sets, the OSCAR compiler achieved speedups over sequential execution of 4.08 times for tomcatv, 3.90 times for swim, 2.21 times for su2cor, 3.53 times for hydro2d, 3.85 times for mgrid, 3.62 times for applu, and 3.20 times for turb3d.

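Bibliographic record ID

Source identifier type

NCID

Source identifier

AN10096105

Bibliographic information

IPSJ SIG Technical Reports: Computer Architecture (ARC)

Vol. 2006, No. 127 (2006-ARC-170), pp. 7-12, issued 2006-11-28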

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher

Language

ja

Publisher

Information Processing Society of Japan

Versions

Ver.1

2025-01-22 20:35:53.288536
