Item type |
Trans(1) |
Publication date |
2021-01-27 |
Title |
|
|
Title |
Compiling ONNX Neural Network Model Using MLIR |
Title |
|
|
Language |
en |
|
Title |
Compiling ONNX Neural Network Model Using MLIR |
Language |
|
|
Language |
eng |
Keywords |
|
|
Subject scheme |
Other |
|
Subject |
[Presentation Abstract, Unrefereed Presentation Abstract] |
Resource type |
|
|
Resource type identifier |
http://purl.org/coar/resource_type/c_6501 |
|
Resource type |
journal article |
Author affiliation |
|
|
|
IBM Research - Tokyo |
Author affiliation |
|
|
|
IBM T.J. Watson Research Center |
Author affiliation |
|
|
|
IBM T.J. Watson Research Center |
Author affiliation |
|
|
|
IBM T.J. Watson Research Center |
Author affiliation |
|
|
|
IBM Research - Tokyo |
Author affiliation |
|
|
|
IBM T.J. Watson Research Center |
Author affiliation |
|
|
|
IBM Research - Tokyo |
Author affiliation |
|
|
|
IBM Research - Tokyo |
Author affiliation |
|
|
|
IBM T.J. Watson Research Center |
Author affiliation (English) |
|
|
|
en |
|
|
IBM Research - Tokyo |
Author affiliation (English) |
|
|
|
en |
|
|
IBM T.J. Watson Research Center |
Author affiliation (English) |
|
|
|
en |
|
|
IBM T.J. Watson Research Center |
Author affiliation (English) |
|
|
|
en |
|
|
IBM T.J. Watson Research Center |
Author affiliation (English) |
|
|
|
en |
|
|
IBM Research - Tokyo |
Author affiliation (English) |
|
|
|
en |
|
|
IBM T.J. Watson Research Center |
Author affiliation (English) |
|
|
|
en |
|
|
IBM Research - Tokyo |
Author affiliation (English) |
|
|
|
en |
|
|
IBM Research - Tokyo |
Author affiliation (English) |
|
|
|
en |
|
|
IBM T.J. Watson Research Center |
Author name |
Tung D. Le
Gheorghe-Teodor Bercea
Tong Chen
Alexandre E. Eichenberger
Haruki Imai
Tian Jin
Kiyokuni Kawachiya
Yasushi Negishi
Kevin O'Brien |
|
Author name (English) |
Tung D. Le
Gheorghe-Teodor Bercea
Tong Chen
Alexandre E. Eichenberger
Haruki Imai
Tian Jin
Kiyokuni Kawachiya
Yasushi Negishi
Kevin O'Brien |
|
Abstract |
|
|
Description type |
Other |
|
Description |
Neural network models are becoming popular and have been used in various tasks such as computer vision, speech recognition, and natural language processing. The training phase of a model is often done in one environment, while the inference phase is executed in another, because the optimization characteristics of the two phases are largely different. As a result, it is critical to compile a trained model efficiently for inference in different environments. To represent neural network models, users often use ONNX, an open standard format for machine learning interoperability. We are developing a framework for compiling a model in ONNX into a standalone binary that is executable on different target hardware such as x86, P, and Z. The framework is written using MLIR, a modern compiler infrastructure for multi-level intermediate representations. In particular, we introduce two internal representations: ONNX IR for representing ONNX operators, and Kernel IR as an intermediate representation for efficiently lowering ONNX operators into LLVM bitcode. In this presentation, we discuss the overall structure of the framework and show some practical examples of converting ONNX operators and models. We also cover several issues related to endianness. |
Abstract (English) |
|
|
Description type |
Other |
|
Description |
Neural network models are becoming popular and have been used in various tasks such as computer vision, speech recognition, and natural language processing. The training phase of a model is often done in one environment, while the inference phase is executed in another, because the optimization characteristics of the two phases are largely different. As a result, it is critical to compile a trained model efficiently for inference in different environments. To represent neural network models, users often use ONNX, an open standard format for machine learning interoperability. We are developing a framework for compiling a model in ONNX into a standalone binary that is executable on different target hardware such as x86, P, and Z. The framework is written using MLIR, a modern compiler infrastructure for multi-level intermediate representations. In particular, we introduce two internal representations: ONNX IR for representing ONNX operators, and Kernel IR as an intermediate representation for efficiently lowering ONNX operators into LLVM bitcode. In this presentation, we discuss the overall structure of the framework and show some practical examples of converting ONNX operators and models. We also cover several issues related to endianness. |
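The endianness issue mentioned at the end of the abstract reflects a real portability constraint: model weights serialized on a little-endian host (such as x86) must be byte-swapped before they can be read on a big-endian target (such as IBM Z). A minimal stdlib-only sketch of the problem, with an illustrative weight value not taken from the framework itself:

```python
import struct

# A hypothetical float32 weight as it might appear in a serialized model.
weight = 3.14

# The same value encoded with each byte order.
little = struct.pack("<f", weight)  # little-endian (x86-style)
big = struct.pack(">f", weight)     # big-endian (IBM Z-style)

# The two encodings hold identical bytes in reversed order.
assert little == big[::-1]

# Decoding little-endian bytes with the wrong byte order yields a
# wildly different number, which is why a compiler targeting both
# byte orders must normalize constants for the target machine.
(wrong,) = struct.unpack(">f", little)
(right,) = struct.unpack("<f", little)
assert abs(right - weight) < 1e-6
assert abs(wrong - weight) > 1.0
```

Handling this at compile time, rather than at load time, lets the emitted standalone binary embed constants already in the target's native byte order.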
Bibliographic record ID |
|
|
Source identifier type |
NCID |
|
Source identifier |
AA11464814 |
Bibliographic information |
IPSJ Transactions on Programming (PRO)
Vol. 14,
No. 1,
p. 18-18,
Published 2021-01-27
|
ISSN |
|
|
Source identifier type |
ISSN |
|
Source identifier |
1882-7802 |
Publisher |
|
|
Language |
ja |
|
Publisher |
Information Processing Society of Japan |