| Item type |
Trans(1) |
| Publication date |
2017-07-21 |
| Title |
|
|
Title |
Performance Modeling of Task Parallel Programs |
| Title |
|
|
Language |
en |
|
Title |
Performance Modeling of Task Parallel Programs |
| Language |
|
|
Language |
eng |
| Keywords |
|
|
Subject scheme |
Other |
|
Subject |
[Presentation abstract] |
| Resource type |
|
|
Resource type identifier |
http://purl.org/coar/resource_type/c_6501 |
|
Resource type |
journal article |
| Author affiliation |
|
|
|
Graduate School of Information Science and Technology, The University of Tokyo |
| Author affiliation |
|
|
|
Graduate School of Information Science and Technology, The University of Tokyo |
| Author name |
Byambajav, Namsraijav
Kenjiro, Taura
|
| Abstract |
|
|
Description type |
Other |
|
Description |
To estimate the execution time of an application for given parameters on given hardware, people often use analytical models if they are familiar with the application logic. However, such approaches become increasingly difficult as the application becomes more complicated and the number of input parameters increases. Moreover, for task parallel applications, performance is highly dependent on the characteristics of the underlying dynamic task scheduler runtime. It is therefore extremely challenging to formulate analytical models for task parallel applications, because performance depends not only on the application logic but also on the underlying hardware properties and the performance of the task parallel runtime. Machine learning techniques can be used to build a performance model when formulating a reliable analytical model is infeasible. First, the target application is run multiple times to learn the execution time for differing input parameters and worker counts. Then a machine learning model is built from that training data. However, past models of this kind, mainly developed for well load-balanced applications, perform poorly when applied to task parallel applications with substantial load imbalance. In addition, the accuracy of those models decreases significantly when the number of workers used in the execution being predicted exceeds the maximum number of workers used during training. We investigate whether applying modern machine learning techniques can address these challenges. In this presentation, we build performance models for task parallel programs using Lars-Lasso and deep neural network regression. To guide these models, we also exploit additional information that can be gathered during training, such as hardware performance counter values, the number of tasks created, and the task-stealing overhead. We evaluate the proposed models on BOTS and PARSEC benchmark applications running on multiple task parallel runtimes. |
| Bibliographic record ID |
|
|
Source identifier type |
NCID |
|
Source identifier |
AA11464814 |
| Bibliographic information |
IPSJ Transactions on Programming (PRO) (情報処理学会論文誌プログラミング)
Vol. 10,
No. 4,
p. 28-28,
Date of issue 2017-07-21
|
| ISSN |
|
|
Source identifier type |
ISSN |
|
Source identifier |
1882-7802 |
| Publisher |
|
|
Language |
ja |
|
Publisher |
情報処理学会 (Information Processing Society of Japan) |
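
The training workflow outlined in the abstract (run the application for several input sizes and worker counts, then fit a sparse regression model to the measured times) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the feature set in `features`, the made-up timing function `synthetic_time` that stands in for real measurements, and the solver. The abstract names Lars-Lasso; this sketch instead uses plain cyclic coordinate descent, a different algorithm for the same L1-penalized least-squares objective, so that it runs with the Python standard library alone.

```python
import math

def features(n, p):
    # Hypothetical features for input size n and worker count p:
    # evenly divided work, serial work, and two worker-overhead terms.
    return [n / p, n, p, math.log2(p)]

def synthetic_time(n, p):
    # Invented stand-in for a measured execution time (not real data).
    return 2.0 * n / p + 0.5 * p + 3.0

def soft_threshold(rho, lam):
    # Shrink rho toward zero by lam; this is what makes the fit sparse.
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def fit_lasso(X, y, lam=1e-3, sweeps=3000):
    # Minimize (1/2m)*||y - b0 - Z w||^2 + lam*||w||_1 on standardized
    # columns Z, using cyclic coordinate descent (not LARS).
    m, d = len(X), len(X[0])
    cols = list(zip(*X))
    means = [sum(c) / m for c in cols]
    stds = [math.sqrt(sum((v - mu) ** 2 for v in c) / m) or 1.0
            for c, mu in zip(cols, means)]
    Z = [[(v - mu) / s for v in c] for c, mu, s in zip(cols, means, stds)]
    b0 = sum(y) / m              # optimal intercept for centered columns
    w = [0.0] * d
    resid = [yi - b0 for yi in y]
    for _ in range(sweeps):
        for j in range(d):
            zj = Z[j]
            # Correlation of column j with the residual that excludes w[j].
            rho = sum(z * r for z, r in zip(zj, resid)) / m + w[j]
            new = soft_threshold(rho, lam)
            if new != w[j]:
                delta = new - w[j]
                resid = [r - delta * z for r, z in zip(resid, zj)]
                w[j] = new
    return b0, w, means, stds

def predict(model, x):
    b0, w, means, stds = model
    return b0 + sum(wj * (v - mu) / s
                    for wj, v, mu, s in zip(w, x, means, stds))

def train_demo():
    # "Training runs": every combination of input size and worker count.
    runs = [(n, p) for n in (1000, 2000, 4000, 8000) for p in (1, 2, 4, 8, 16)]
    X = [features(n, p) for n, p in runs]
    y = [synthetic_time(n, p) for n, p in runs]
    return fit_lasso(X, y)

if __name__ == "__main__":
    model = train_demo()
    n, p = 4000, 32  # more workers than any training run used
    print("predicted:", predict(model, features(n, p)))
    print("synthetic:", synthetic_time(n, p))
```

The final prediction deliberately uses 32 workers, more than the 16 seen in training, which is the extrapolation case the abstract identifies as difficult. It works here only because the synthetic timing function is an exact linear combination of the chosen features; real task parallel workloads with load imbalance would not be this forgiving.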