Identification of Cybersecurity Specific Content Using Different Language Models
https://ipsj.ixsq.nii.ac.jp/records/206914
Name / File | License | Action |
---|---|---|
| Copyright (c) 2020 by the Information Processing Society of Japan | Open access |
Item type | Journal(1) |
---|---|
Publication date | 2020-09-15 |
Title | Identification of Cybersecurity Specific Content Using Different Language Models |
Title language | en |
Language | eng |
Keyword scheme | Other |
Keywords | [Special Issue: “Applications and the Internet” in Conjunction with Main Topics of COMPSAC2019] cyber threat, NLP, Text-Classification |
Resource type identifier | http://purl.org/coar/resource_type/c_6501 |
Resource type | journal article |
Author affiliation | Nagoya University, Graduate School of Informatics |
Author affiliation | Nagoya University, Information Security Office |
Author affiliation | Nagoya University, Information Technology Center |
Author affiliation | Nagoya University, Information Technology Center |
Author affiliation | ExaWizards Inc. |
Authors | Otgonpurev, Mendsaikhan; Hirokazu, Hasegawa; Yukiko, Yamaguchi; Hajime, Shimada; Enkhbold, Bataa |
Abstract | Given the sheer amount of digital texts publicly available on the Internet, it becomes more challenging for security analysts to identify cyber threat related content. In this research, we proposed to build an autonomous system to identify cyber threat information from publicly available information sources. We examined different language models to utilize as a cybersecurity-specific filter for the proposed system. Using the domain-specific training data, we trained Doc2Vec and BERT models and compared their performance. According to our evaluation, the BERT-based Natural Language Filter is able to identify and classify cybersecurity-specific natural language text with 90% accuracy. |
Description type | Other |
Note | This is a preprint of an article intended for publication in the Journal of Information Processing (JIP). This preprint should not be cited. This article should be cited as: Journal of Information Processing Vol.28 (2020) (online), DOI http://dx.doi.org/10.2197/ipsjjip.28.623 |
Bibliographic record ID type | NCID |
Bibliographic record ID | AN00116647 |
Bibliographic information | 情報処理学会論文誌 (IPSJ Journal), Vol. 61, No. 9, issued 2020-09-15 |
ISSN | 1882-7764 |