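Proposal and Evaluation of Malware Species Classification Method by n-gram Extraction and Machine Learning (n-gram抽出と機械学習を用いた亜種マルウェア分類手法の提案と評価)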

Shoki Takiguchi (瀧口 翔貴); Ryuya Uda (宇田 隆哉)

https://doi.org/10.20729/00217609

Name / File	License
IPSJ-JNL6304015.pdf (628.2 kB)	Copyright (c) 2022 by the Information Processing Society of Japan
Open access

Item type

Journal(1)

Publication date

2022-04-15

Title

n-gram抽出と機械学習を用いた亜種マルウェア分類手法の提案と評価

Title (English)

Proposal and Evaluation of Malware Species Classification Method by n-gram Extraction and Machine Learning

Language

jpn

Keywords

Subject scheme

Other

Subject

[Regular Paper] malware subspecies, malware detection, n-gram

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_6501

Resource type

journal article

ID registration

10.20729/00217609

ID registration type

JaLC

Author affiliation

Tokyo University of Technology (東京工科大学)

Author affiliation

Tokyo University of Technology (東京工科大学)

Author name

Shoki Takiguchi (瀧口 翔貴)
Ryuya Uda (宇田 隆哉)

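Abstract (translated from the Japanese)

Description type

Other

Description

Machine learning has been applied to malware detection, but some methods can be evaded once attackers know how they work. Even the approach of computing the information gain of binary n-grams and training only on the n-grams with high values has problems, and it is difficult to devise a countermeasure that is effective against all malware. This paper therefore restricts its targets to malware subspecies (variants) that cannot be caught by simple pattern matching, and proposes a method that extracts code from the variants using n-grams and detects them with machine learning. Methods exist that convert the entire malware binary into an image and detect it by machine learning; in contrast, the proposed method can reduce the size of each sample before it is fed to the learner. In the size evaluation, using 16-grams reduced samples to roughly 1/40 to 1/110 of their original size on average. In the detection evaluation, which used 670 to 1,500 samples from each malware subspecies family together with 1,500 benign software samples, only a few misclassifications occurred.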

Abstract (English)

Description type

Other

Description

Machine learning has been used to detect malware, but some methods become useless once attackers know them. One of the best methods uses binary n-grams of whole files together with the information gain of those n-grams. In particular, selecting the top k n-grams by information gain keeps attackers from evading detection by adding n-grams that are common in benignware. However, that method still has problems, and it is difficult to find a method that is effective against all malware. In this paper, we therefore limit the targets to malware subspecies and propose a method that extracts n-grams from malware and classifies them with machine learning. An existing method feeds whole malware binaries to machine learning. In comparison, our method can downsize both malware and benignware before they are input to the machine learning network. In the file size evaluation, 16-grams reduced samples to between 1/40 and 1/110 of their original size on average. In the malware detection evaluation, only a few misclassifications occurred when 670 to 1,500 samples of each malware subspecies family were compared with 1,500 benignware samples.
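
The abstract refers to an existing approach that ranks binary n-grams by information gain and keeps only the top k. The following is a minimal Python sketch of that general idea, not the authors' implementation and not the extraction method proposed in the paper. It assumes each n-gram is treated as a present/absent feature per file and that labels are 1 for malware and 0 for benignware; the function names `byte_ngrams`, `information_gain`, and `top_k` are hypothetical.

```python
import math
from collections import Counter


def byte_ngrams(data: bytes, n: int) -> set[bytes]:
    """Return the distinct n-byte sliding windows found in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def entropy(pos: int, neg: int) -> float:
    """Binary entropy (in bits) of a set with pos and neg members."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(samples: list[bytes], labels: list[int], n: int) -> dict[bytes, float]:
    """Information gain of each n-gram's presence with respect to the label.

    Assumption: labels are 1 (malware) or 0 (benignware).
    """
    total = len(samples)
    total_pos = sum(labels)
    base = entropy(total_pos, total - total_pos)

    files_with = Counter()      # number of files containing the n-gram
    malware_with = Counter()    # number of malware files containing it
    for sample, label in zip(samples, labels):
        for gram in byte_ngrams(sample, n):
            files_with[gram] += 1
            malware_with[gram] += label

    gains = {}
    for gram, with_gram in files_with.items():
        pos_with = malware_with[gram]
        neg_with = with_gram - pos_with
        without = total - with_gram
        pos_without = total_pos - pos_with
        neg_without = without - pos_without
        conditional = (
            (with_gram / total) * entropy(pos_with, neg_with)
            + (without / total) * entropy(pos_without, neg_without)
        )
        gains[gram] = base - conditional
    return gains


def top_k(gains: dict[bytes, float], k: int) -> list[bytes]:
    """Return the k n-grams with the highest gain (ties broken by byte order)."""
    return sorted(gains, key=lambda gram: (-gains[gram], gram))[:k]
```

Calling `top_k(information_gain(samples, labels, n), k)` on a labeled collection returns the n-grams whose presence best separates the two classes. Real binaries produce far more distinct n-grams than this in-memory sketch is designed for, so it is meant only to illustrate the ranking step.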

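Bibliographic record ID

Source identifier type

NCID

Source identifier

AN00116647

Bibliographic information

情報処理学会論文誌 (IPSJ Journal)

Vol. 63, No. 4, pp. 1052-1071, issued 2022-04-15

ISSN

Source identifier type

ISSN

Source identifier

1882-7764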

Versions

Ver.1

2025-01-19 15:23:57.801713
