Introducing a Multithread and Multistage Mechanism for the Global Load Balancing Library of X10
https://ipsj.ixsq.nii.ac.jp/records/157623
| License | Access |
|---|---|
| Copyright (c) 2016 by the Information Processing Society of Japan | Open access |
| Item type | Trans(1) |
|---|---|
| Publication date | 2016-02-26 |
| Title | Introducing a Multithread and Multistage Mechanism for the Global Load Balancing Library of X10 |
| Language | eng |
| Subject scheme | Other |
| Keywords | [Regular Paper] dynamic load balancing, X10, GLB |
| Resource type | journal article |
| Resource type identifier | http://purl.org/coar/resource_type/c_6501 |
| Author affiliation | Kobe University |
| Author affiliation | Kobe University / RIKEN Advanced Institute for Computational Science |
| Authors | Kento Yamashita; Tomio Kamada |
| Abstract | Load balancing is a major concern in massively parallel computing. X10 is a partitioned global address space language for scale-out computing and provides a global load balancing (GLB) library that scales to over ten thousand CPU cores. This study proposes a multistage mechanism for GLB that assigns execution stages to tasks, and introduces a multithreaded design into GLB to allow efficient data sharing between CPU cores. The system gives high priority to tasks assigned to earlier stages and then proceeds with tasks of subsequent stages. When a computing node runs out of tasks at the earliest stage, it requests earliest-stage tasks from other nodes and processes subsequent-stage tasks while awaiting responses. When the system detects that the tasks of a given stage have terminated, it executes a reduction operation across nodes. Programmers can define their own reduction operations to gather or exchange the results of completed tasks. This study describes the implementation of the extended library and evaluates its runtime overhead on the K computer using up to 256 nodes. |
| Note | This is a preprint of an article intended for publication in the Journal of Information Processing (JIP). This preprint should not be cited. This article should be cited as: Journal of Information Processing Vol.24 (2016) No.2 (online). |
| Bibliographic record ID (NCID) | AA11464814 |
| Bibliographic information | IPSJ Transactions on Programming (PRO), Vol. 9, No. 1, issued 2016-02-26 |
| ISSN | 1882-7802 |
| Publisher | Information Processing Society of Japan |