Shift-Reduce Chunking for Japanese Named Entity Extraction (Shift-Reduce法に基づく日本語固有表現抽出)

Yamada, Hiroyasu (山田, 寛康)

https://ipsj.ixsq.nii.ac.jp/records/47825

Name / File	License	Action
IPSJ-NL07179003.pdf (574.4 kB)	Copyright (c) 2007 by the Information Processing Society of Japan
Open access

Item type

SIG Technical Reports(1)

Publication date

2007-05-24

Title

Shift-Reduce法に基づく日本語固有表現抽出

Title (English)

Shift-Reduce Chunking for Japanese Named Entity Extraction

Language

jpn

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Resource type

technical report

Author affiliation

(株)ジャストシステム　イノベーティブ・テクノロジー研究開発部

Author affiliation (English)

Innovative Technology R&D Dept., Justsystems

Author name

山田, 寛康

Author name (English)

Yamada, Hiroyasu

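Abstract

Description type

Other

Description

This paper proposes an extraction method for Japanese named entities based on the shift-reduce method and verifies its effectiveness on the IREX Japanese named entity extraction task. Because it is based on the shift-reduce method, the proposed method realizes a natural analysis that proceeds from the beginning of the sentence, first estimating the word boundaries of a named entity and then estimating its type. In morpheme-level analysis of Japanese, extraction errors arise when morpheme boundaries differ from named entity boundaries. The proposed method handles this problem by adding simple extended actions, without analyzing the entire input sentence character by character. In evaluation experiments using five-fold cross-validation on the CRL named entity extraction data, an efficient approach that analyzes from the beginning of the sentence to the end and resorts to character-level analysis only partially achieved an F-value of 0.88.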
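The boundary-then-type order described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the record does not specify the features, the model, or the extended character-level actions, so the sketch omits the extended actions entirely and replaces the learned action decision with a hypothetical toy lexicon (`TOY_LEXICON`). The function names `next_action` and `chunk` are likewise illustrative assumptions.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy lexicon standing in for a trained action classifier.
# Keys are morpheme sequences; values are IREX-style NE types.
TOY_LEXICON: Dict[Tuple[str, ...], str] = {
    ("山田", "寛康"): "PERSON",
    ("情報", "処理", "学会"): "ORGANIZATION",
    ("東京",): "LOCATION",
}


def is_prefix(seq: Tuple[str, ...]) -> bool:
    """True if seq is a prefix of (or equal to) some lexicon entry."""
    return any(entry[:len(seq)] == seq for entry in TOY_LEXICON)


def next_action(stack: List[str], lookahead: Optional[str]) -> str:
    """Toy decision rule (an assumption, not the paper's model)."""
    if not stack:
        return "SHIFT"  # always start a new chunk candidate
    if lookahead is not None and is_prefix(tuple(stack) + (lookahead,)):
        return "SHIFT"  # the candidate chunk can still grow
    return "REDUCE"     # close the chunk, then decide its type


def chunk(morphemes: List[str]) -> List[Tuple[Tuple[str, ...], str]]:
    """Deterministic left-to-right shift-reduce chunking."""
    stack: List[str] = []
    chunks: List[Tuple[Tuple[str, ...], str]] = []
    i = 0
    while i < len(morphemes) or stack:
        lookahead = morphemes[i] if i < len(morphemes) else None
        if next_action(stack, lookahead) == "SHIFT":
            stack.append(morphemes[i])  # SHIFT: extend the boundary
            i += 1
        else:
            # REDUCE: the boundary is fixed; now estimate the type.
            ne_type = TOY_LEXICON.get(tuple(stack), "O")
            chunks.append((tuple(stack), ne_type))
            stack = []
    return chunks


result = chunk(["山田", "寛康", "は", "東京", "の", "情報", "処理", "学会", "で", "発表"])
```

Each morpheme is shifted exactly once and each chunk is reduced exactly once, so the pass is linear in the number of morphemes. A chunk whose morphemes are not a complete lexicon entry is labeled `"O"` (not a named entity).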

Abstract (English)

Description type

Other

Description

We propose a deterministic method for Japanese named entity (NE) extraction based on shift-reduce parsing. Shift actions determine the word boundaries of an NE composed of multiple morphemes, after which a reduce action estimates the NE type. When Japanese NEs are analyzed morpheme by morpheme, incorrect extractions are inevitable because the word boundaries of some NEs differ from morpheme boundaries. While most well-known work analyzes NEs character by character at the expense of efficiency, our method can analyze NEs morpheme by morpheme in most cases by introducing two types of additional shift-reduce actions that adjust to the word boundaries of an NE. Five-fold cross-validation on the CRL NE data set yields an F-value of 0.88, which is comparable with related work, and our left-to-right morpheme-level analysis is more efficient.
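For readers unfamiliar with the reported metric, the following sketch shows the standard F-measure computed from extraction counts. The counts in the example are hypothetical and are not taken from the paper; the record does not state how scores were aggregated over the five folds, so no aggregation is shown.

```python
def f_measure(num_correct: int, num_predicted: int, num_gold: int,
              beta: float = 1.0) -> float:
    """Standard F-measure from counts of extracted named entities.

    num_correct:   predicted NEs whose boundaries and type match the gold data
    num_predicted: all NEs output by the system
    num_gold:      all NEs in the gold annotation
    """
    precision = num_correct / num_predicted if num_predicted else 0.0
    recall = num_correct / num_gold if num_gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for illustration only.
example_score = f_measure(num_correct=8, num_predicted=10, num_gold=16)
```

With `beta = 1.0` this is the harmonic mean of precision and recall, which is the usual meaning of "F-value" in named entity evaluation.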

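Bibliographic record ID

Source identifier type

NCID

Source identifier

AN10115061

Bibliographic information

情報処理学会研究報告自然言語処理（NL） (IPSJ SIG technical reports, Natural Language Processing)

Vol. 2007, No. 47 (2007-NL-179), pp. 13-18, issued 2007-05-24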

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher

Information Processing Society of Japan (情報処理学会)
