A Realization of Deep Convolutional Neural Network using the Nested RNS on an FPGA including the Constant Division

Hiroki Nakahara; Tsutomu Sasao; Hisashi Iwamoto

https://ipsj.ixsq.nii.ac.jp/records/147065

Name / File	License	Action
IPSJ-SLDM16174039.pdf (506.6 kB)	Copyright (c) 2016 by the Institute of Electronics, Information and Communication Engineers. This SIG report is available only to members of the SIG.
SLDM: members: ¥0, DLIB: members: ¥0

Item type

SIG Technical Reports(1)

Publication date

2016-01-12

Title

Nested RNSの定数除算を用いた深層畳込みニューラルネットワークのFPGA実現について

Title

A Realization of Deep Convolutional Neural Network using the Nested RNS on an FPGA including the Constant Division

Language

jpn

Keywords

Subject scheme

Other

Subject

Neural Networks and TRAX

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Resource type

technical report

Author affiliations

Department of Electrical and Electronic Engineering and Computer Science, Ehime University
Department of Computer Science, Meiji University
REVSONIC Corp.

Author names

Hiroki Nakahara (中原 啓貴)
Tsutomu Sasao (笹尾 勤)
Hisashi Iwamoto (岩本 久)

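Abstract

Description type

Other

Description

Embedded devices for tasks such as image classification require faster inference with pre-trained deep convolutional neural networks (DCNNs). More than 90% of the computation in a DCNN is 2D convolution, which consists mainly of multiply-accumulate (MAC) operations. Current FPGAs include DSP blocks for MAC operations (DSP48E blocks in Xilinx FPGAs), but realizing a large DCNN requires a large number of them. Because an n-bit multiplication requires O(n·2^(2n)) area, decomposing the n inputs reduces the area. Applying the nested RNS (NRNS), an improvement on the residue number system (RNS), splits the n inputs, so the computation runs in parallel on compact circuits and at a higher operating frequency. To realize a DCNN, both the activation function and the truncation that prevents overflow must be implemented. In this paper, we exploit the properties of the NRNS to realize the ReLU activation function on the NRNS with a multiplexer for each digit. Truncation is performed as a constant division by the dynamic range of a subset of the NRNS moduli. Using this property, the constant division is realized compactly on the NRNS. We implemented the proposed method on a NetFPGA SUME (Xilinx Virtex7 VC7V690T) and compared it with other FPGA realizations; the proposed method achieved the best performance per area.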
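The digit-wise MAC idea in the abstract can be illustrated with a plain (non-nested) RNS. The following is a minimal software sketch, not the authors' circuit: the moduli `(7, 11, 13)` and the function names are assumptions chosen for illustration. Each residue digit is updated independently with small arithmetic, and the integer result is recovered at the end with the Chinese remainder theorem.

```python
from math import prod

# Hypothetical pairwise-coprime moduli; the dynamic range is 7 * 11 * 13 = 1001.
MODULI = (7, 11, 13)

def to_rns(x, moduli=MODULI):
    """Represent a non-negative integer by its residue digits."""
    return tuple(x % m for m in moduli)

def rns_mac(acc, a, b, moduli=MODULI):
    """Digit-wise multiply-accumulate: acc + a * b, with no carries between digits."""
    return tuple((r + p * q) % m for r, p, q, m in zip(acc, a, b, moduli))

def from_rns(residues, moduli=MODULI):
    """Reconstruct the integer with the Chinese remainder theorem."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # pow(x, -1, m) is the modular inverse
    return total % M

def rns_dot(xs, ws, moduli=MODULI):
    """Dot product computed entirely on residue digits."""
    acc = to_rns(0, moduli)
    for x, w in zip(xs, ws):
        acc = rns_mac(acc, to_rns(x, moduli), to_rns(w, moduli), moduli)
    return from_rns(acc, moduli)

print(rns_dot([3, 5, 2, 7], [4, 6, 1, 8]))  # 100, which fits in the dynamic range
```

The result is correct only while the true sum stays below the dynamic range, which is why the abstract pairs the MAC with a truncation step.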

Abstract (English)

Description type

Other

Description

A pre-trained deep convolutional neural network (DCNN) performs feed-forward computation and is widely used in embedded vision systems. In the DCNN, the 2D convolutional operation occupies more than 90% of the computation time. Since the 2D convolutional operation performs massive multiply-accumulation (MAC) operations, conventional realizations could not implement a fully parallel DCNN. We apply the nested RNS (NRNS), which recursively decomposes the RNS and can therefore decompose the MAC unit into small circuits. In the DCNN using the NRNS, a MAC unit is decomposed into 4-bit units realized by look-up tables of the FPGA. To realize the full functionality of the DCNN, we also implement the activation function and the truncation on the FPGA. The ReLU function is realized by a multiplexer, while the truncation is realized by division by the dynamic range of a subset of the moduli. The ImageNet DCNN using the NRNS is implemented on a NetFPGA-SUME evaluation board, which has a Xilinx Inc. Virtex7 VC7V690T FPGA. In terms of performance per area, measured in GOPS (giga operations per second) per slice, the proposed realization is better than the best existing realization.
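The truncation described above divides by the dynamic range of a subset of the moduli. One standard way to do this in an RNS is exact scaling: subtract the remainder modulo the divisor, then multiply each remaining digit by the modular inverse of the divisor. The sketch below assumes this textbook technique and hypothetical moduli `(7, 11, 13, 15)`; the abstract does not confirm that the paper's circuit works this way.

```python
from math import prod

def crt(residues, moduli):
    """Chinese remainder theorem for pairwise-coprime moduli."""
    M = prod(moduli)
    return sum(r * (M // m) * pow(M // m, -1, m) for r, m in zip(residues, moduli)) % M

def scale_by_subset(residues, moduli, subset):
    """Return floor(X / D) as residues on the moduli outside `subset`.

    D is the product (dynamic range) of the moduli indexed by `subset`.
    """
    sub_moduli = [moduli[i] for i in subset]
    D = prod(sub_moduli)
    # X mod D depends only on the digits that belong to the subset.
    rem = crt([residues[i] for i in subset], sub_moduli)
    out = {}
    for j, m in enumerate(moduli):
        if j in subset:
            continue
        # X - rem is an exact multiple of D, and D is invertible modulo m.
        out[m] = ((residues[j] - rem) * pow(D, -1, m)) % m
    return out

moduli = (7, 11, 13, 15)  # hypothetical, pairwise coprime
X = 12345
digits = tuple(X % m for m in moduli)
print(scale_by_subset(digits, moduli, subset=(0,)))  # {11: 3, 13: 8, 15: 8}, i.e. 1763 = 12345 // 7
```

The quotient is known only on the moduli outside the subset; restoring digits for the divided-out moduli would need a separate base-extension step, which this sketch omits.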

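Bibliographic record ID

Source identifier type

NCID

Source identifier

AA11451459

Bibliographic information

IPSJ SIG Technical Reports: System and LSI Design Methodology (SLDM)

Vol. 2016-SLDM-174, No. 39, pp. 1-6, issued 2016-01-12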

ISSN

Source identifier type

ISSN

Source identifier

2188-8639

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher

Information Processing Society of Japan
