Efficient GPU Resource Management for Multiple AI Inference Processes Execution (推論多重実行におけるGPU資源利用効率化技術)

Takahisa Suzuki; Miho Tanaka; Shinya Toyonaga; Ryuichi Matsukura

https://ipsj.ixsq.nii.ac.jp/records/210088

Name / File	License	Action
IPSJ-DPS21186065.pdf (673.9 kB)	Copyright (c) 2021 by the Information Processing Society of Japan
Open access

Item type

SIG Technical Reports(1)

Publication date

2021-03-08

Title

推論多重実行におけるGPU資源利用効率化技術

Title (English)

Efficient GPU Resource Management for Multiple AI Inference Processes Execution

Language

jpn

Keywords

Subject scheme

Other

Subject

Machine learning

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Resource type

technical report

Author affiliation (all four authors)

Fujitsu Laboratories LTD. ((株)富士通研究所)

Authors

Takahisa Suzuki (鈴木 貴久)
Miho Tanaka (田中 美帆)
Shinya Toyonaga (豊永 慎也)
Ryuichi Matsukura (松倉 隆一)

Abstract

Description type

Other

Description

With the spread of AI (deep learning) technologies and advances in hardware, it is becoming common to run multiple inference processes on a single device. A single GPU can then be shared among multiple inference processes, but sharing causes resource contention and duplication, which lowers utilization efficiency. A mechanism that manages and controls GPU resource use across multiple inference processes is therefore needed. Techniques that execute multiple processes concurrently at the granularity of GPU processing units already exist. However, an inference generally consists of loading data, running the inference, and outputting the result, so interference from another process during the inference phase increases its processing time. This paper therefore proposes a technique that improves GPU utilization efficiency by managing and controlling resource use in units of inference. Evaluated on an image processing application that uses inference, the technique achieved a 1.6-fold efficiency improvement.
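The idea of controlling GPU use in units of inference can be illustrated with a minimal sketch. It is not the authors' mechanism, which this record does not describe. It only assumes that each inference (load, infer, output) is admitted to the GPU as one uninterrupted unit. The names `InferenceGate`, `fake_inference`, and `run_demo` are hypothetical, and short sleeps stand in for real GPU work.

```python
import threading
import time
from contextlib import contextmanager


class InferenceGate:
    """Hypothetical gate that admits one whole inference at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.log = []  # (event, app_name) pairs, appended while the lock is held

    @contextmanager
    def inference_unit(self, app_name):
        # The lock is held for the entire inference, so no other
        # inference can be interleaved with this one.
        with self._lock:
            self.log.append(("start", app_name))
            try:
                yield
            finally:
                self.log.append(("end", app_name))


def fake_inference(gate, app_name, repeats):
    """Stand-in for load -> infer -> output; sleeps instead of using a GPU."""
    for _ in range(repeats):
        with gate.inference_unit(app_name):
            time.sleep(0.001)  # load input data
            time.sleep(0.002)  # run the model
            time.sleep(0.001)  # write the result


def run_demo(num_apps=3, repeats=4):
    """Run several simulated applications that share one gate."""
    gate = InferenceGate()
    threads = [
        threading.Thread(target=fake_inference, args=(gate, f"app{i}", repeats))
        for i in range(num_apps)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return gate.log


if __name__ == "__main__":
    for event, app in run_demo():
        print(event, app)
```

In the resulting log every `start` entry is immediately followed by the `end` entry of the same application, which shows that inferences from different applications never overlap. A real scheduler would also have to decide ordering and how model data is placed in GPU memory, which this sketch leaves out.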

Bibliographic record ID

Source identifier type

NCID

Source identifier

AN10116224

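Bibliographic information

研究報告マルチメディア通信と分散処理（DPS） (IPSJ SIG Technical Reports: Multimedia Communications and Distributed Processing)

Vol. 2021-DPS-186, No. 65, pp. 1-7, issued 2021-03-08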

ISSN

Source identifier type

ISSN

Source identifier

2188-8906

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher

Information Processing Society of Japan (情報処理学会)
