Hierarchical Clustering of OSS License Statements toward Automatic Generation of License Rules
https://ipsj.ixsq.nii.ac.jp/records/193892
License | Access |
---|---|
Copyright (c) 2019 by the Information Processing Society of Japan | Open access |
Item type | Journal(1) |
---|---|
Publication date | 2019-01-15 |
Title | Hierarchical Clustering of OSS License Statements toward Automatic Generation of License Rules |
Title language | en |
Language | eng |
Keywords | |
Subject scheme | Other |
Subject | [Special Issue: Collaboration Technologies and Network Services toward a Society Where All People and Things Are Connected] OSS license, license identification, license generation rules, clustering |
Resource type | |
Resource type identifier | http://purl.org/coar/resource_type/c_6501 |
Resource type | journal article |
Author affiliation | Graduate School of System Engineering, Wakayama University |
Author affiliation | Graduate School of System Engineering, Wakayama University |
Author affiliation | Graduate School of System Engineering, Wakayama University |
Author affiliation | Graduate School of Science and Technology, Kumamoto University |
Authors | Yunosuke, Higashi; Masao, Ohira; Yutaro, Kashiwa; Yuki, Manabe |
Abstract | |
Description type | Other |
Description | Reusing open source software (OSS) components in one's own software products has become common in modern software development. Automated license identification tools have been proposed to help developers identify OSS licenses, since a large number of licenses sometimes must be checked before reuse. Among the existing tools, Ninka[1] identifies the license of each source file most accurately, using regular expressions. When Ninka has no identification rule for a license, it reports the license as an “unknown license”, which developers must then check manually. Since completely new or derived OSS licenses appear nearly every year, a license identification tool should be maintained by adding regular expressions corresponding to the new licenses. The final goal of our study is to construct a method that automatically creates candidate license rules to be added to a license identification tool such as Ninka. Toward this goal, files identified as having unknown licenses must first be classified by license. In this paper, we propose a hierarchical clustering method that divides files with unknown licenses into clusters of files sharing a single license. We conduct a case study to confirm the usefulness of our clustering method by applying it to 2,801, 1,230 and 2,446 unknown license statement files from Linux Kernel v4.4.6, FreeBSD v10.3.0 and Debian v7.8.0, respectively. The results confirm that our method can create clusters that are suitable as candidates for generating license rules automatically. Note: This is a preprint of an article intended for publication in the Journal of Information Processing (JIP). This preprint should not be cited. This article should be cited as: Journal of Information Processing Vol.27(2019) (online) DOI http://dx.doi.org/10.2197/ipsjjip.27.42 |
Bibliographic record ID | |
Source identifier type | NCID |
Source identifier | AN00116647 |
Bibliographic information | 情報処理学会論文誌, Vol. 60, No. 1, issued 2019-01-15 |
ISSN | |
Source identifier type | ISSN |
Source identifier | 1882-7764 |
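
The abstract describes grouping unknown license statement files with hierarchical clustering so that each cluster holds a single license. The sketch below illustrates that general idea only and is not the authors' method: the word-set tokenization, the Jaccard distance, the average linkage, the `threshold` cutoff, and all function names are assumptions chosen for illustration, since this record does not specify them.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercased word set of a license statement (assumed normalization)."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_statements(statements, threshold=0.5):
    """Average-linkage agglomerative clustering of statement texts.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `threshold` (a hypothetical cutoff).
    Returns a sorted list of clusters, each a sorted list of input indices.
    """
    toks = [tokens(s) for s in statements]
    clusters = [[i] for i in range(len(statements))]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(jaccard_distance(toks[i], toks[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        (x, y), dist = min(
            (
                ((x, y), linkage(clusters[x], clusters[y]))
                for x, y in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if dist > threshold:
            break
        merged = sorted(clusters[x] + clusters[y])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(clusters)


if __name__ == "__main__":
    # Made-up snippets in the style of license headers, for illustration only.
    example = [
        "Permission is hereby granted, free of charge, to any person",
        "Permission is hereby granted free of charge to any person obtaining a copy",
        "This program is free software; you can redistribute it and/or modify it",
    ]
    print(cluster_statements(example))
```

Each resulting cluster could then be inspected as a candidate group for deriving one license rule. In practice the cutoff would need tuning against files whose licenses are already known.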