Proposal of Sequential Updating of Machine Learning Models with Fuzzy Hash Values to Maintain Long-term Malware Detection Accuracy

Fumiya Kurihara (栗原 史弥); Takahiro Matsuki (松木 隆宏); Masato Terada (寺田 真敏)

https://doi.org/10.20729/0002004332

Name / File	License	Action
IPSJ-JNL6609011.pdf (13.3 MB) Available for download from September 15, 2027.	Copyright (c) 2025 by the Information Processing Society of Japan
Non-members: ¥660; IPSJ members: ¥330; Journal members: ¥0; DLIB members: ¥0

Item type

Journal(1)

Publication date

2025-09-15

Title

Language

Title

Fuzzy Hashを用いたマルウェア検知精度を長期的に維持する機械学習モデル逐次更新手法の提案

Title

Language

Title

Proposal of Sequential Updating of Machine Learning Models with Fuzzy Hash Values to Maintain Long-term Malware Detection Accuracy

Language

jpn

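Keywords

Subject scheme

Other

Subject

[Special Issue: Computer Security Technologies for a Secure AI Society (Specially Selected Paper)] malware, machine learning, concept drift, Fuzzy Hash values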

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_6501

Resource type

journal article

ID registration

10.20729/0002004332

ID registration type

JaLC

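Author affiliation

Tokyo Denki University (東京電機大学)

Author affiliation

Tokyo Denki University (東京電機大学)

Author affiliation

Tokyo Denki University (東京電機大学)

Author affiliation (English)

Tokyo Denki University

Author affiliation (English)

Tokyo Denki University

Author affiliation (English)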

Tokyo Denki University

Author name

栗原,史弥
松木,隆宏
寺田,真敏

Author name (English)

Fumiya Kurihara
Takahiro Matsuki
Masato Terada

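Abstract

Description type

Other

Description

Cyberattacks using malware seriously affect our daily lives, for example by targeting medical institutions and forcing them to suspend care. Machine learning-based methods are being studied to detect such malware accurately. However, detection accuracy degrades because of a phenomenon called "drift," in which the relationship between, and the distributions of, training data and prediction data change over time. To address this drift problem, this study proposes a sequential updating method for machine learning models that exploits the scale-free property of malware, aiming to maintain the malware detection rate over the long term. In this paper, (1) using the three types of Fuzzy Hash values and two similarity calculation methods considered for use in the proposed method, we show that malware exhibits the scale-free property when the Fuzzy Hash values are computed from binary data. Next, (2) we apply the sequential updating method that exploits the scale-free property of malware to a malware detector based on PE surface information. Using samples from FFRI Dataset 2021-2023, we then demonstrate the effectiveness of the proposed sequential updating method through the month-by-month trends in the malware detection rate and the false positive rate for benign software.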
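The evaluation in step (2) tracks two quantities per month: the malware detection rate and the false positive rate on benign software. The following is only a minimal sketch of how such monthly figures could be tabulated. The record layout, the function name `monthly_rates`, and the definitions used here (detection rate as detected malware over all malware, false positive rate as flagged benign samples over all benign samples) are assumptions made for illustration, not details taken from the paper.

```python
from collections import defaultdict


def monthly_rates(records):
    """Tabulate per-month detection and false positive rates.

    records: iterable of (month, is_malware, predicted_malware) tuples.
    This layout is hypothetical and chosen only for illustration.
    Returns {month: (detection_rate, false_positive_rate)}; a rate is
    None when the month has no samples of the relevant class.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for month, is_malware, predicted in records:
        if is_malware:
            key = "tp" if predicted else "fn"
        else:
            key = "fp" if predicted else "tn"
        counts[month][key] += 1

    rates = {}
    for month, c in sorted(counts.items()):
        malware = c["tp"] + c["fn"]
        benign = c["fp"] + c["tn"]
        detection = c["tp"] / malware if malware else None
        false_positive = c["fp"] / benign if benign else None
        rates[month] = (detection, false_positive)
    return rates


if __name__ == "__main__":
    # Toy records, not real dataset entries.
    demo = [
        ("2022-01", True, True),
        ("2022-01", True, False),
        ("2022-01", False, False),
        ("2022-02", True, True),
        ("2022-02", False, True),
    ]
    for month, (dr, fpr) in monthly_rates(demo).items():
        print(month, dr, fpr)
```

Plotting these two series against the month is one way to see drift: without model updates, the detection rate would be expected to fall as the test month moves further from the training period.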

Abstract (English)

Description type

Other

Description

Malware-based cyberattacks have a severe impact on our daily lives, for example by targeting medical institutions and forcing them to suspend medical services. Machine learning-based methods have been actively researched to detect such malware accurately. However, these methods suffer from drift, a phenomenon in which the relationship and distribution between training data and prediction data change over time and detection accuracy degrades as a result. This study addresses the drift problem by proposing a sequential updating method for machine learning models that leverages the scale-free property of malware, with the goal of maintaining long-term malware detection rates. In this paper, (1) we examine the applicability of three types of Fuzzy Hash values and two similarity calculation methods, and show that malware exhibits the scale-free property when Fuzzy Hash values are computed from binary data. Next, (2) we apply the sequential updating method that exploits the scale-free property of malware to a malware detector based on PE surface information, and show the effectiveness of the proposed method through the monthly malware detection rate and the false positive rate for benign software, using samples from the FFRI Dataset 2021-2023.
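Step (1) concerns whether malware, linked by Fuzzy Hash similarity, exhibits the scale-free property. A network is commonly called scale-free when its degree distribution follows a power law, so one common first check is to link samples whose similarity meets a threshold and inspect the resulting degree distribution. The sketch below assumes that approach; the abstract does not state how the authors test for the property. The `similarity` function is a stand-in built on Python's `difflib` and is not any of the Fuzzy Hash comparison methods used in the paper. The digest strings, the threshold, and all names are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 100].

    NOT a real Fuzzy Hash comparison; it only plays that role here so
    the example is self-contained.
    """
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def degree_distribution(digests: dict[str, str], threshold: float) -> Counter:
    """Link two samples when their similarity meets the threshold,
    then count how many samples have each degree."""
    degree = {name: 0 for name in digests}
    for (n1, d1), (n2, d2) in combinations(digests.items(), 2):
        if similarity(d1, d2) >= threshold:
            degree[n1] += 1
            degree[n2] += 1
    return Counter(degree.values())


if __name__ == "__main__":
    # Toy digest strings, not real hash values.
    toy = {"a": "AAAA", "b": "AAAA", "c": "AAAB", "d": "ZZZZ"}
    for k, n in sorted(degree_distribution(toy, threshold=70.0).items()):
        print(f"degree {k}: {n} sample(s)")
```

On a real corpus, the counts would typically be plotted on log-log axes; an approximately straight line is consistent with a power law, although a rigorous test needs a proper statistical fit. The pairwise loop is quadratic in the number of samples, so this form is suitable only for small illustrations.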

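Bibliographic record ID

Source identifier type

NCID

Source identifier

AN00116647

Bibliographic information

IPSJ Journal (情報処理学会論文誌)

Vol. 66, No. 9, pp. 1148-1158, issued 2025-09-15

ISSN

Source identifier type

ISSN

Source identifier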

1882-7764

Publisher

Language

Publisher

Information Processing Society of Japan


Versions

Ver.1

2025-09-05 04:46:27.968857
