Provenance Tracking of Terms in Collaborative Authoring Systems (共同執筆コンテンツにおける単語の起源追跡)

Akira Nakamura; Yu Suzuki; Yoshiharu Ishikawa

https://ipsj.ixsq.nii.ac.jp/records/142630

Name / File	License
IPSJ-TOD0802005.pdf (2.0 MB)	Copyright (c) 2015 by the Information Processing Society of Japan
Open access

Item type

Trans(1)

Publication date

2015-06-30

Title

共同執筆コンテンツにおける単語の起源追跡

Title (English)

Provenance Tracking of Terms in Collaborative Authoring Systems

Language

jpn

Keywords

Subject scheme

Other

Subject

[Research paper] Wikipedia, provenance, authorship, version control, collaborative authoring

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_6501

Resource type

journal article

Author affiliation

Graduate School of Information Science, Nagoya University / Presently with SCSK Corporation

Author affiliation

Graduate School of Information Science, Nara Institute of Science and Technology

Author affiliation

Graduate School of Information Science, Nagoya University

Author names

Akira Nakamura
Yu Suzuki
Yoshiharu Ishikawa

Abstract

Description type

Other

Description

Many collaboration platforms on the Web, with Wikipedia as a representative example, are used to share and edit documents or source code. This paper proposes a method for tracking the provenance of text in version-controlled, collaboratively authored content. In Wiki systems, where an unspecified large number of editors take part in editing, and in code management systems built for joint software development, identifying the exact provenance of each piece of text is important. Studies and applications that use provenance already exist, such as estimating the quality of editors and articles in Wikipedia. However, attributing accurate provenance is difficult because restoration must be taken into account: when text is restored, its provenance should refer to the provenance of the text before it was deleted. Existing techniques have difficulty detecting restoration at a small granularity such as the term level. The proposed method therefore keeps the positions of deleted terms while managing provenance, so that small-granularity restoration is detected strictly and provenance is estimated accurately. In the evaluation experiment on Japanese Wikipedia, the provenance estimated by each system was compared with the true provenance identified manually for 186 terms (article-term sets chosen at random). The accuracy of the proposed method was 86.0% on this dataset, an improvement of 24.7 points over the existing state-of-the-art method.
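
The restoration idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the algorithm, so the `Slot` record, the `merge` and `track` helpers, the use of `difflib.SequenceMatcher` for diffing, and the rule that a term is revived only when it reappears in the gap where it was deleted are all assumptions made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Slot:
    """One term position; kept in the list even after the term is deleted."""
    token: str
    origin: int        # index of the revision that first introduced the term
    alive: bool = True


def merge(segment, inserted, rev):
    """Place `inserted` tokens into `segment`, whose slots are all deleted.

    A token that matches a deleted slot (searched left to right) revives that
    slot and keeps its old origin; any other token gets a new slot with
    origin `rev`.
    """
    out, p = [], 0
    for tok in inserted:
        hit = next((k for k in range(p, len(segment))
                    if not segment[k].alive and segment[k].token == tok), None)
        if hit is None:
            out.append(Slot(tok, rev))       # genuinely new term
        else:
            out.extend(segment[p:hit])       # deleted slots before the match
            segment[hit].alive = True        # restoration: origin unchanged
            out.append(segment[hit])
            p = hit + 1
    out.extend(segment[p:])
    return out


def track(revisions):
    """Return (token, origin) pairs for the last revision.

    `revisions` is a list of token lists, oldest first (hypothetical input
    format chosen for this sketch).
    """
    slots = []
    for rev, tokens in enumerate(revisions):
        live = [i for i, s in enumerate(slots) if s.alive]
        old = [slots[i].token for i in live]
        ops = SequenceMatcher(a=old, b=tokens, autojunk=False).get_opcodes()
        rebuilt, cursor = [], 0
        for tag, i1, i2, j1, j2 in ops:
            if tag in ("equal", "delete"):
                end = live[i2 - 1] + 1
            else:
                # insert/replace: also take deleted slots up to the next live one
                end = live[i2] if i2 < len(live) else len(slots)
            segment, cursor = slots[cursor:end], end
            if tag == "equal":
                rebuilt.extend(segment)
                continue
            for s in segment:
                s.alive = False              # deleted terms keep their position
            rebuilt.extend(merge(segment, tokens[j1:j2], rev))
        rebuilt.extend(slots[cursor:])
        slots = rebuilt
    return [(s.token, s.origin) for s in slots if s.alive]


history = ["the cat sat".split(), "the sat".split(),
           "the cat sat".split(), "the big cat sat".split()]
print(track(history))
# [('the', 0), ('big', 3), ('cat', 0), ('sat', 0)]
```

In this sketch, "cat" is deleted in revision 1 and restored in revision 2, so it keeps origin 0 rather than being attributed to the restoring revision. A term that reappears somewhere other than the gap where it was deleted is treated as new; that is a design choice of the sketch, not a claim about the paper's method.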

Bibliographic record ID

Source identifier type

NCID

Source identifier

AA11464847

Bibliographic information

IPSJ Transactions on Databases (TOD)

Vol. 8, No. 2, pp. 43-56, issued 2015-06-30

ISSN

Source identifier type

ISSN

Source identifier

1882-7799

Publisher

Information Processing Society of Japan
