Malicious JavaScript Detection in Realistic Environments with SVM and MLP Models
https://ipsj.ixsq.nii.ac.jp/records/239363
Name / File | License | Action |
---|---|---|
Available for download from September 15, 2026.
|
Copyright (c) 2024 by the Information Processing Society of Japan
|
|
Non-members: ¥0, IPSJ members: ¥0, Journal members: ¥0, DLIB members: ¥0 |
Item type | Journal(1) | |||||||||
---|---|---|---|---|---|---|---|---|---|---|
Publication date | 2024-09-15 | |||||||||
Title | ||||||||||
Title | Malicious JavaScript Detection in Realistic Environments with SVM and MLP Models | |||||||||
Language | en | |||||||||
Language | ||||||||||
Language | eng | |||||||||
Keywords | ||||||||||
Subject scheme | Other | |||||||||
Subject | [Special Issue: Cybersecurity Technologies for Securing the Supply Chain] malicious JavaScript, feature re-sampling, imbalance dataset | |||||||||
Resource type | ||||||||||
Resource type identifier | http://purl.org/coar/resource_type/c_6501 | |||||||||
Resource type | journal article | |||||||||
Author affiliation | ||||||||||
National Defense Academy | ||||||||||
Author affiliation | ||||||||||
National Defense Academy | ||||||||||
Author names |
Ngoc, Minh Phung
Mamoru, Mimura
|
|||||||||
Abstract | ||||||||||
Description type | Other | |||||||||
Description | Malicious JavaScript detection using machine learning models has produced strong results over the years. However, only a small fraction of real-world JavaScript is malicious. Many previous techniques ignore most of the benign samples and train a machine learning model on a balanced dataset. This paper continues the previous work (Phung and Mimura, 2023): it uses a support vector machine (SVM) and a multi-layer perceptron (MLP) as classifiers and trains the models with a Doc2Vec-based filter that can quickly classify JavaScript malware using natural language processing (NLP) and feature re-sampling. The total features of the benign samples are reduced using a combination of word vectors and a clustering model. Random seed oversampling generates new malicious training data based on the original training dataset. We evaluate our models with a dataset of over 30,000 samples obtained from top popular websites, PhishTank, and GitHub. The experimental results show that abstract syntax tree (AST) parsing contributes most to improving the detection scores. ------------------------------ This is a preprint of an article intended for publication in the Journal of Information Processing (JIP). This preprint should not be cited. This article should be cited as: Journal of Information Processing Vol.32(2024) (online) DOI http://dx.doi.org/10.2197/ipsjjip.32.748 ------------------------------ |
|||||||||
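Bibliographic record ID | ||||||||||
Source identifier type | NCID | |||||||||
Source identifier | AN00116647 | |||||||||
Bibliographic information |
IPSJ Journal (情報処理学会論文誌), Vol. 65, No. 9, issued 2024-09-15 |
|||||||||
ISSN | ||||||||||
Source identifier type | ISSN | |||||||||
Source identifier | 1882-7764 | |||||||||
Publisher | ||||||||||
Language | ja | |||||||||
Publisher | Information Processing Society of Japan (情報処理学会) |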
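
The abstract mentions oversampling the malicious class to offset the scarcity of malicious JavaScript in real-world data. As a rough illustration only, the sketch below shows plain random oversampling with replacement for a two-class dataset. It is not the authors' implementation, and the paper's "random seed oversampling" may work differently. The function name `random_oversample`, its parameters, and the toy feature vectors are all hypothetical; only the "benign" and "malicious" class names come from the abstract.

```python
import random
from collections import Counter


def random_oversample(samples, labels, minority_label, seed=0):
    """Duplicate randomly chosen minority samples until both classes match in size.

    Hypothetical helper for a two-class dataset; not taken from the paper.
    """
    rng = random.Random(seed)  # fixed seed keeps the resampling reproducible
    minority = [s for s, y in zip(samples, labels) if y == minority_label]
    if not minority:
        raise ValueError("no samples carry the minority label")
    majority_count = len(labels) - len(minority)
    deficit = max(0, majority_count - len(minority))
    # Draw with replacement from the existing minority samples.
    extra = [rng.choice(minority) for _ in range(deficit)]
    return list(samples) + extra, list(labels) + [minority_label] * deficit


# Toy data: four benign feature vectors and one malicious one (made-up values).
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.4, 0.1], [0.9, 0.8]]
y = ["benign", "benign", "benign", "benign", "malicious"]

X_bal, y_bal = random_oversample(X, y, "malicious", seed=42)
print(Counter(y_bal))
```

Oversampling of this kind is normally applied to the training split only, so that duplicated samples do not leak into the evaluation data.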