Scalable Large-Variance Clone Detection

https://ipsj.ixsq.nii.ac.jp/records/209668
File IPSJ-SE21207011.pdf (793.1 kB)
License Copyright (c) 2021 by the Information Processing Society of Japan
Access Open access
Item type SIG Technical Reports(1)
Publication date 2021-02-22
Title (en) Scalable Large-Variance Clone Detection
Language eng
Keyword (subject scheme: Other) code clone
Resource type technical report
Resource type identifier http://purl.org/coar/resource_type/c_18gh
Author affiliations Osaka University; Osaka University; Osaka University
Authors Tasuku, Nakagawa; Yoshiki, Higo; Shinji, Kusumoto
Abstract
A code clone (clone for short) is a code fragment that is identical or similar to other code fragments in source code. Clones generated by a large number of changes to copy-and-pasted code fragments are called large-variance clones. General clone detection techniques have difficulty detecting such clones, so specialized techniques are necessary. In addition, with the rapid growth of software development, scalable clone detectors that can detect clones in large codebases are required. However, no existing technique quickly detects large-variance clones in large codebases. In this paper, we propose a scalable clone detection technique that can detect large-variance clones in large codebases and describe its implementation, called NIL. NIL is a token-based clone detector that efficiently identifies clone candidates using an N-gram representation of token sequences and an inverted index. NIL then verifies the clone candidates by measuring their similarity based on the longest common subsequence between their token sequences. We evaluate NIL in terms of large-variance clone detection accuracy, general Type-1, Type-2, and Type-3 clone detection accuracy, and scalability. Our experimental results show that, compared to existing state-of-the-art tools, NIL has higher accuracy in large-variance clone detection, equivalent accuracy in general clone detection, and the shortest execution time for inputs of various sizes (1-250 MLOC).
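
The two phases named in the abstract (candidate identification with token N-grams and an inverted index, then verification with a longest-common-subsequence similarity) can be illustrated with a minimal sketch. This is not NIL's implementation: the function names, the N-gram size `n=3`, both thresholds, and the choice to normalize by the shorter sequence are assumptions made only for illustration, and the input is assumed to be already tokenized.

```python
from collections import defaultdict


def ngrams(tokens, n):
    """Return the set of token N-grams of a token sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def build_index(blocks, n):
    """Inverted index: N-gram -> ids of the blocks that contain it."""
    index = defaultdict(set)
    for block_id, tokens in enumerate(blocks):
        for gram in ngrams(tokens, n):
            index[gram].add(block_id)
    return index


def lcs_length(a, b):
    """Length of the longest common subsequence (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def detect_clones(blocks, n=3, filter_threshold=0.1, verify_threshold=0.7):
    """Return (i, j) pairs with i < j that pass both phases.

    All parameter defaults are hypothetical, not values from the paper.
    """
    index = build_index(blocks, n)
    pairs = []
    for block_id, tokens in enumerate(blocks):
        grams = ngrams(tokens, n)
        # Phase 1: count N-grams shared with earlier blocks via the index.
        shared = defaultdict(int)
        for gram in grams:
            for other in index[gram]:
                if other < block_id:
                    shared[other] += 1
        for other, count in sorted(shared.items()):
            smaller = min(len(grams), len(ngrams(blocks[other], n)))
            if count / smaller < filter_threshold:
                continue
            # Phase 2: verify the candidate with an LCS-based similarity.
            shorter = min(len(tokens), len(blocks[other]))
            if lcs_length(tokens, blocks[other]) / shorter >= verify_threshold:
                pairs.append((other, block_id))
    return pairs


if __name__ == "__main__":
    original = "int sum = 0 ; for ( int i = 0 ; i < n ; i ++ ) sum += a [ i ] ; return sum ;".split()
    modified = ("int sum = 0 ; for ( int i = 0 ; i < n ; i ++ ) "
                "{ if ( a [ i ] > 0 ) sum += a [ i ] ; } return sum ;").split()
    unrelated = "print ( x ) ;".split()
    print(detect_clones([original, modified, unrelated]))
```

The inverted index keeps the first phase from comparing every pair of blocks: only blocks that share at least one N-gram are ever scored. The LCS measure in the second phase tolerates inserted statements, which is why a fragment with an added guard in the middle can still match its original.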
Bibliographic record ID NCID AN10112981
Bibliographic information 研究報告ソフトウェア工学(SE) [SIG Technical Reports: Software Engineering (SE)], Vol. 2021-SE-207, No. 11, pp. 1-8, issued 2021-02-22
ISSN 2188-8825
Notice
SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.
Publisher 情報処理学会 (Information Processing Society of Japan)