Targeted Adversarial Attacks on Image Interpreters under Black-box Condition (ブラックボックス条件下における画像解釈器への標的型敵対的攻撃)

Hirose, Yudai (廣瀬, 雄大); Mukaida, Mashiho (向田, 眞志保); Ono, Satoshi (小野, 智司)

https://ipsj.ixsq.nii.ac.jp/records/240844

Name / File	License	Action
IPSJ-CSS2024098.pdf (3.4 MB) Available for download from October 15, 2026.	Copyright (c) 2024 by the Information Processing Society of Japan
Non-members: ¥660, IPSJ members: ¥330, CSEC members: ¥0, SPT members: ¥0, DLIB members: ¥0

Item type

Symposium(1)

Publication date

2024-10-15

Title

ブラックボックス条件下における画像解釈器への標的型敵対的攻撃

Title (English)

Targeted Adversarial Attacks on Image Interpreters under Black-box Condition

Language

jpn

Keywords

Subject scheme

Other

Subject

Explainable AI, adversarial attacks, evolutionary algorithms

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_5794

Resource type

conference paper

Author affiliation (all three authors)

Kagoshima University (鹿児島大学)

Author names

廣瀬, 雄大
向田, 眞志保
小野, 智司

Author names (English)

Hirose, Yudai
Mukaida, Mashiho
Ono, Satoshi

Abstract

Description type

Other

Description

Deep neural networks (DNNs) are used in fields such as image recognition and medical image diagnosis. However, DNNs are vulnerable to adversarial examples (AEs): inputs carrying specially crafted perturbations that cause incorrect outputs. The same vulnerability affects explainable AI (XAI), which outputs the basis for a prediction on a given input. Image interpreters such as GradCAM and GuidedBackPropagation have been proposed as XAI methods, but their vulnerabilities have not been sufficiently examined. In this study, we propose a targeted adversarial attack method based on Sep-CMA-ES, an evolutionary optimisation method. Under black-box conditions, where the internal structure of the model is not available, the proposed method reveals a vulnerability that steers the interpretation result toward the interpretation of a specific target image while keeping the predicted label unchanged.
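
The abstract describes the attack only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes a toy linear classifier and a toy saliency function (`predict`, `interpret`, `WEIGHTS` and all other names are hypothetical), which the attacker can only query. It also swaps in a simplified evolution strategy with one step size per coordinate (a diagonal covariance) instead of full Sep-CMA-ES, which additionally uses evolution paths and cumulative step-size adaptation. The objective pulls the interpretation of the perturbed input toward that of a target input and penalises any change of the predicted label.

```python
import math
import random

# Hypothetical black-box "model": the attacker may only call predict() and interpret().
WEIGHTS = [0.9, -0.4, 0.7, 0.2, -0.6, 0.8, 0.3, -0.1]

def predict(x):
    """Toy classifier: label 1 if the weighted sum is positive, else 0."""
    return 1 if sum(w * v for w, v in zip(WEIGHTS, x)) > 0 else 0

def interpret(x):
    """Toy saliency map: |weight * input|, normalised to sum to 1."""
    raw = [abs(w * v) for w, v in zip(WEIGHTS, x)]
    total = sum(raw) or 1.0
    return [r / total for r in raw]

def attack_loss(delta, x, target_map, label):
    """L2 distance to the target map, plus a large penalty if the label flips."""
    adv = [v + d for v, d in zip(x, delta)]
    dist = math.sqrt(sum((a - t) ** 2 for a, t in zip(interpret(adv), target_map)))
    return dist + (1000.0 if predict(adv) != label else 0.0)

def diagonal_es_attack(x, target_map, eps=0.3, popsize=16, iters=60, seed=0):
    """Simplified diagonal evolution strategy (assumption: not full Sep-CMA-ES)."""
    rng = random.Random(seed)
    n, mu = len(x), popsize // 4
    label = predict(x)
    mean = [0.0] * n
    sigma = [eps / 3.0] * n  # one step size per coordinate (diagonal covariance)
    best_delta, best_loss = list(mean), attack_loss(mean, x, target_map, label)
    for _ in range(iters):
        scored = []
        for _ in range(popsize):
            # Sample a candidate perturbation and clip it to the budget [-eps, eps].
            cand = [max(-eps, min(eps, m + s * rng.gauss(0.0, 1.0)))
                    for m, s in zip(mean, sigma)]
            scored.append((attack_loss(cand, x, target_map, label), cand))
        scored.sort(key=lambda pair: pair[0])
        elite = [cand for _, cand in scored[:mu]]
        if scored[0][0] < best_loss:
            best_loss, best_delta = scored[0]
        # Move the mean to the elite average; adapt each coordinate's step size
        # from the spread of the elite around the previous mean.
        new_mean = [sum(c[i] for c in elite) / mu for i in range(n)]
        for i in range(n):
            spread = math.sqrt(sum((c[i] - mean[i]) ** 2 for c in elite) / mu)
            sigma[i] = max(1e-6, 0.7 * sigma[i] + 0.3 * spread)
        mean = new_mean
    return best_delta, best_loss

x_orig = [0.8, 0.1, 0.6, 0.5, 0.2, 0.9, 0.4, 0.3]    # input under attack
x_target = [0.1, 0.9, 0.2, 0.3, 0.8, 0.1, 0.7, 0.6]  # input whose interpretation is the target
target_map = interpret(x_target)
loss_before = attack_loss([0.0] * len(x_orig), x_orig, target_map, predict(x_orig))
delta, loss_after = diagonal_es_attack(x_orig, target_map)
print(f"loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

Because the search starts from the zero perturbation and the label-flip penalty exceeds any achievable map distance, the returned perturbation never changes the predicted label and never scores worse than the unperturbed input. A real attack would replace the toy functions with queries to an actual classifier and interpreter such as GradCAM.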

Bibliographic information

Proceedings of the Computer Security Symposium 2024 (コンピュータセキュリティシンポジウム2024論文集)

pp. 719-726, issue date 2024-10-15

Publisher

Information Processing Society of Japan (情報処理学会)
