Classification of Research Papers into a Patent Classification System Using Two Translation Systems

難波, 英嗣; 竹澤, 寿幸; Hidetsugu, Nanba; Toshiyuki, Takezawa

https://ipsj.ixsq.nii.ac.jp/records/66388

Name / File	License
IPSJ-TOD0203008.pdf (233.6 kB)	Copyright (c) 2009 by the Information Processing Society of Japan
Open access

Item type

Trans(1)

Publication date

2009-09-30

Title

2種類の翻訳システムを用いた学術論文の特許分類体系への自動分類

Title (English)

Classification of Research Papers into a Patent Classification System Using Two Translation Systems

Language

jpn

Keywords

Subject scheme

Other

Subject

Research paper

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_6501

Resource type

journal article

Author affiliation

広島市立大学大学院情報科学研究科

Author affiliation

広島市立大学大学院情報科学研究科

Author affiliation (English)

Graduate School of Information Sciences, Hiroshima City University

Author affiliation (English)

Graduate School of Information Sciences, Hiroshima City University

Author name

難波, 英嗣

Author name (English)

Hidetsugu, Nanba

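Abstract

Description type

Other

Description

Classifying research papers into a patent classification system enables exhaustive and efficient prior art searches, invalidity searches, and technical trend analyses that cover both patents and papers. However, asking authors to assign patent classification codes when they publish a paper, as is done for patents, or manually assigning codes to every paper already published is unrealistic in terms of cost. We therefore propose a method that automatically classifies research papers into a patent classification system. Classifying papers into a patent classification system requires attention to the differences between the terms used in patents and those used in papers. Patents tend to be written in highly general patent terminology in order to secure as broad a scope of claims as possible. As a result, conventional document classification models that rely simply on surface-level word overlap do not necessarily achieve sufficient classification accuracy. Furthermore, to enable more exhaustive searches and analyses, papers written in multiple languages must also be covered. To address these problems, we propose a classification method that uses two translation models, one for patents and one for research papers. Because patents and papers use different terms, a translation system for patents cannot be expected to translate an input paper as accurately as a system for papers. However, if the output of the patent-oriented system contains many patent terms, classification accuracy at the document classification stage can be expected to improve, so using the patent-oriented translation system should be beneficial overall. To verify the effectiveness of the proposed method, we conducted experiments using data from the Patent Mining Task of the 7th NTCIR Workshop. The results showed that combining the patent-oriented translation system with the paper-oriented one improves classification accuracy over using the paper-oriented system alone.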

Abstract (English)

Description type

Other

Description

Classification of research papers into patent classification systems enables exhaustive and efficient invalidity searches, prior art searches, and technical trend analyses. However, it is very costly to ask the authors of research papers or professionals to assign patent classification codes manually. Therefore, we propose a method that automatically classifies research papers into a patent classification system. To classify research papers into the classification system, we should take into account the differences between the terms used in research papers and those used in patents, because patents often use more abstract or creative terms than research papers do in order to widen the scope of their claims. Classifying research papers written in various languages is also required for exhaustive searches and analyses. To solve these problems, we propose classification methods that use two machine translation systems. Generally, when translating research papers, a machine translation system built for patents performs worse than one built for research papers, because the terms used in patents differ from those used in research papers. However, we consider the translation system for patents useful for our task, because its output tends to contain more patent terms than the output of the system for research papers. To confirm the effectiveness of our method, we conducted experiments using the data provided by the Patent Mining Task in the NTCIR-7 Workshop. The experimental results show that our method, which uses translation systems for both research papers and patents, improves on a method that uses a single translation system.
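The intuition in the abstract, that adding the output of a patent-oriented translation system can bring patent vocabulary into the text being classified, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record does not describe the classifier, so a simple cosine similarity against per-class term profiles is swapped in, the two translations are combined by adding their term counts, and the class codes `CODE_A` and `CODE_B`, the profiles, and both translation strings are invented toy data.

```python
from collections import Counter
import math


def bag_of_words(text: str) -> Counter:
    """Lowercased, whitespace-tokenized term counts (a deliberately crude tokenizer)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_codes(paper_mt: str, patent_mt: str, class_profiles: dict) -> list:
    """Rank classification codes using the summed term counts of both translations.

    paper_mt  : output of a hypothetical paper-oriented translation system
    patent_mt : output of a hypothetical patent-oriented translation system
    """
    query = bag_of_words(paper_mt) + bag_of_words(patent_mt)
    scores = {code: cosine(query, bag_of_words(text))
              for code, text in class_profiles.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Invented toy data: class profiles written in patent-style vocabulary.
CLASS_PROFILES = {
    "CODE_A": "image forming apparatus recording medium conveying unit",
    "CODE_B": "information retrieval apparatus document storage unit query",
}

# Two invented translations of the same paper sentence.
paper_mt = "we propose a search engine for documents"
patent_mt = "an information retrieval apparatus for document query is proposed"

ranking = rank_codes(paper_mt, patent_mt, CLASS_PROFILES)   # both systems
paper_only = rank_codes(paper_mt, "", CLASS_PROFILES)       # paper system alone
```

In this toy setup the paper-style translation shares no terms with either class profile, so it cannot separate the two codes on its own, while the combined query ranks `CODE_B` first because the patent-style translation contributes terms such as "apparatus" and "retrieval". This only mirrors the motivation stated in the abstract; it says nothing about the accuracy figures reported in the paper.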

Bibliographic record ID

Source identifier type

NCID

Source identifier

AA11464847

Bibliographic information

IPSJ Transactions on Databases (TOD)

Vol. 2, No. 3, pp. 76-86, issued 2009-09-30

ISSN

Source identifier type

ISSN

Source identifier

1882-7799

Publisher

Information Processing Society of Japan
