Item type |
Symposium(1) |
Publication date |
2015-05-12 |
Title |
|
|
Title |
Performance Analysis of MapReduce Implementations for High Performance Homology Search |
Title |
|
|
Language |
en |
|
Title |
Performance Analysis of MapReduce Implementations for High Performance Homology Search |
Language |
|
|
Language |
eng |
Keywords |
|
|
Subject scheme |
Other |
|
Subject |
Applications |
Resource type |
|
|
Resource type identifier |
http://purl.org/coar/resource_type/c_5794 |
|
Resource type |
conference paper |
Author affiliation |
|
|
|
Tokyo Institute of Technology |
Author affiliation |
|
|
|
Tokyo Institute of Technology / JST CREST |
Author affiliation |
|
|
|
Tokyo Institute of Technology |
Author affiliation |
|
|
|
Tokyo Institute of Technology / JST CREST |
Author affiliation |
|
|
|
Tokyo Institute of Technology / JST CREST |
Author name |
Chaojie, Zhang
Koichi, Shirahata
Shuji, Suzuki
Yutaka, Akiyama
Satoshi, Matsuoka
|
Abstract |
|
|
Description type |
Other |
|
Description |
Homology search, as used in emerging bioinformatics problems such as metagenomics, is growing in both importance and difficulty: its application area keeps broadening while its computational complexity increases, so it requires massively parallel data processing. Earlier work by some of the authors devised novel algorithms such as GHOSTX, but the parallelization that enumerates and schedules the data processing was done with GHOST-MP, a privately developed, MPI-based master-worker framework. An alternative is to use the now-popular big data software substrates, such as MapReduce with its abundant associated software tool-chains, but it is not clear whether MapReduce, given its known limitations, can cope with the massive resources required by metagenomic homology search. By converting the GHOST-MP master-worker data processing pipeline to MapReduce and benchmarking it on a variety of high-performance MapReduce incarnations, including Hadoop, Spark, and Hamar, we attempt to characterize the appropriateness of MapReduce as a generic framework for metagenomics, which embodies extremely resource-consuming requirements for both compute and data. Our experimental results show that the MapReduce-based implementations exhibit good scaling at least up to 32 nodes, and that Hamar exhibits performance comparable to GHOST-MP on TSUBAME-KFC. |
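Bibliographic information |
Proceedings of the Symposium on High Performance Computing and Computational Science (ハイパフォーマンスコンピューティングと計算科学シンポジウム論文集)
Vol. 2015,
p. 73-80,
Issue date 2015-05-12
|
Publisher |
|
|
Language |
ja |
|
Publisher |
Information Processing Society of Japan (情報処理学会) |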