| Item type |
SIG Technical Reports(1) |
| Publication date |
2017-01-16 |
| Title |
|
|
Title |
FPGAにおける大規模なNoCのトレース駆動エミュレーションの検討 |
| Title |
|
|
Language |
en |
|
Title |
Trace-Driven Emulation of Large-Scale Networks-on-Chip on FPGAs |
| Language |
|
|
Language |
eng |
| Keywords |
|
|
Subject scheme |
Other |
|
Subject |
Special-purpose systems and accelerators |
| Resource type |
|
|
Resource type identifier |
http://purl.org/coar/resource_type/c_18gh |
|
Resource type |
technical report |
| Author affiliation |
|
|
|
東京工業大学情報理工学院 |
| Author affiliation |
|
|
|
東京工業大学情報理工学院 |
| Author affiliation (English) |
|
|
|
en |
|
|
School of Computing, Tokyo Institute of Technology |
| Author affiliation (English) |
|
|
|
en |
|
|
School of Computing, Tokyo Institute of Technology |
| Author name |
Thiem, Van Chu
吉瀬, 謙二
|
| Author name (English) |
Thiem, Van Chu
Kise, Kenji
|
| Abstract |
|
|
Description type |
Other |
|
Description |
Research and development of large-scale Networks-on-Chip (NoCs) play a key role in designing future many-core systems but are challenging due to the lack of fast and accurate evaluation environments. In recent years, there have been several attempts to build NoC emulation systems using FPGAs. These studies have reported emulation speedups of up to several orders of magnitude over conventional software simulators. However, emulating large-scale NoCs with hundreds to thousands of nodes on FPGAs is difficult because of FPGA capacity constraints. Moreover, supporting trace-driven workloads captured from practical applications may drastically degrade the emulation speed, especially when the emulated NoC is large, because the trace data are typically very large and must therefore be stored in off-chip memory (usually DRAM). In this paper, we first present estimates of the DRAM capacity and bandwidth required for emulating NoCs under trace-driven workloads. We next show that, although the DRAM bandwidth limitation can be alleviated by storing trace data in DRAM in order of emulation cycle and loading them sequentially during emulation, a cyclic dependency between the trace data and the network may occur in certain cases. Finally, we describe our approach to enabling trace-driven emulation of large-scale NoCs on FPGAs. In our previous work, to overcome the FPGA capacity constraints, we proposed an emulation method based on time-division multiplexing, in which a NoC is emulated using a small number of physical nodes. In this work, we show that this emulation method also makes it much easier to meet the DRAM bandwidth requirement when emulating large-scale NoCs under trace-driven workloads. We explain in detail how trace data are stored in DRAM and fed to the emulated NoC efficiently, and we show some preliminary evaluation results. |
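The bandwidth argument in the abstract can be illustrated with a rough back-of-the-envelope model. The sketch below is not taken from the report: the function names, the trace record size, the injection rate, and the assumption that one emulated cycle costs ceil(N / P) FPGA cycles when N emulated nodes share P physical nodes are all hypothetical choices made for illustration.

```python
def fpga_cycles_per_emulated_cycle(num_nodes, physical_nodes):
    """Assumed cost model for time-division multiplexing: each physical node
    is reused for ceil(num_nodes / physical_nodes) emulated nodes per cycle."""
    return -(-num_nodes // physical_nodes)  # integer ceiling division


def required_dram_bandwidth(num_nodes, physical_nodes, injection_rate,
                            record_bytes, fpga_clock_hz):
    """Average DRAM read bandwidth (bytes/s) needed to keep the emulation fed.

    injection_rate is the assumed average number of trace records consumed
    per emulated node per emulated cycle; record_bytes is an assumed size.
    """
    emulated_cycles_per_sec = fpga_clock_hz / fpga_cycles_per_emulated_cycle(
        num_nodes, physical_nodes)
    records_per_emulated_cycle = num_nodes * injection_rate
    return records_per_emulated_cycle * record_bytes * emulated_cycles_per_sec


def required_dram_capacity(num_nodes, injection_rate, record_bytes,
                           emulated_cycles):
    """Total trace size (bytes) for a run of the given emulated length."""
    return num_nodes * injection_rate * record_bytes * emulated_cycles


if __name__ == "__main__":
    # Hypothetical configuration: 1024-node NoC, 100 MHz FPGA clock,
    # 8-byte trace records, 0.1 records per node per emulated cycle.
    direct = required_dram_bandwidth(1024, 1024, 0.1, 8, 100e6)
    tdm = required_dram_bandwidth(1024, 16, 0.1, 8, 100e6)
    print(f"direct mapping: {direct / 1e9:.2f} GB/s")
    print(f"16 physical nodes: {tdm / 1e9:.2f} GB/s")
```

Under these illustrative inputs, mapping all 1024 nodes directly would call for roughly 81.92 GB/s of trace bandwidth, while time-multiplexing onto 16 physical nodes lowers the figure to about 1.28 GB/s, because emulated cycles advance 64 times more slowly. The total trace capacity is unaffected by the multiplexing factor, since it depends only on the length of the emulated run.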
| Bibliographic record ID |
|
|
Source identifier type |
NCID |
|
Source identifier |
AN10096105 |
| Bibliographic information |
研究報告システム・アーキテクチャ(ARC)
Vol. 2017-ARC-224,
No. 27,
p. 1-6,
Issue date 2017-01-16
|
| ISSN |
|
|
Source identifier type |
ISSN |
|
Source identifier |
2188-8574 |
| Notice |
|
|
|
SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc. |
| Publisher |
|
|
Language |
ja |
|
Publisher |
情報処理学会 (Information Processing Society of Japan) |