Item type: SIG Technical Reports
Publication date: 2016-09-08
Title: Spark as Data Supplier for MPI Deep Learning Processes
Language: eng
Keyword: Storage (subject scheme: Other)
Resource type: technical report (identifier: http://purl.org/coar/resource_type/c_18gh)
Author affiliations:
ESIEE PARIS / FUJITSU LABORATORIES LTD.
FUJITSU LABORATORIES LTD.
FUJITSU LABORATORIES LTD.
FUJITSU LABORATORIES LTD.
FUJITSU LABORATORIES LTD.
Authors: Amir Haderbache, Masahiro Miwa, Masafumi Yamazaki, Tsuguchika Tabaru, Kohta Nakashima
Abstract:
Recent work in deep learning shows that training large models can improve accuracy. Many distributed deep learning frameworks have so far been developed to scale up machine learning algorithms. For the sake of performance, we believe these intensive computations must be combined with a clever data-parallelism strategy. This paper offers one possible answer to the problem of supplying data to deep learning worker nodes on HPC systems. We design a two-sided system in which independent MPI process executions are matched with Spark tasks whose job is to provide data partitions. We test and evaluate different Spark configurations and show that this system provides a flexible and scalable data supply mechanism that leverages MPI's high performance and Spark's high-level data management.
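The record contains no code; as a rough illustration of the matching the abstract describes, the following hypothetical sketch mimics, in plain Python (no Spark or MPI dependency), how data partitions produced Spark-style could be assigned one-per-task to independent worker ranks. All names and the round-robin assignment are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch (not the paper's code): pair data partitions, as a
# Spark job would produce them, with worker ranks, as MPI would run them.

def make_partitions(records, num_partitions):
    """Split a dataset into num_partitions interleaved chunks,
    loosely mimicking Spark's partitioning of an RDD."""
    return [records[i::num_partitions] for i in range(num_partitions)]

def supply(partitions, num_ranks):
    """Assign each partition to the worker rank that would consume it
    (round-robin), mimicking a one-task-per-process matching."""
    assignment = {rank: [] for rank in range(num_ranks)}
    for idx, part in enumerate(partitions):
        assignment[idx % num_ranks].append(part)
    return assignment

if __name__ == "__main__":
    data = list(range(100))           # toy training samples
    parts = make_partitions(data, 8)  # e.g. 8 Spark tasks
    plan = supply(parts, 4)           # e.g. 4 MPI worker processes
    # every sample is delivered exactly once across all ranks
    delivered = sorted(x for ps in plan.values() for p in ps for x in p)
    assert delivered == data
```

In a real deployment the `supply` step would be replaced by actual data movement (e.g. sockets or files) between the Spark executors and the MPI processes; the sketch only shows the partition-to-rank bookkeeping.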
Bibliographic record ID (NCID): AN10463942
Bibliographic information:
IPSJ SIG Technical Report: High Performance Computing (HPC)
Vol. 2016-HPC-156, No. 11, pp. 1-9
Publication date: 2016-09-08
ISSN: 2188-8841
Notice: SIG Technical Reports are non-refereed and hence may later appear in journals, conferences, symposia, etc.
Publisher: 情報処理学会 (Information Processing Society of Japan)