Simplification of Japanese Sentence-ending Predicates in Descriptive Text (説明文を対象とした日本語文末述語の平易化)

Taichi Kato (加藤 汰一); Rei Miyata (宮田 玲); Satoshi Sato (佐藤 理史)

https://doi.org/10.20729/00212765

Name / File	License
IPSJ-JNL6209022.pdf (765.0 kB)	Copyright (c) 2021 by the Information Processing Society of Japan
Open access

Item type

Journal(1)

Publication date

2021-09-15

Title

説明文を対象とした日本語文末述語の平易化

Title (English)

Simplification of Japanese Sentence-ending Predicates in Descriptive Text

Language

jpn

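Keywords

Subject scheme

Other

Subject

[Regular Paper (Specially Selected Paper)] lexical simplification, masked language model, paraphrase generation, human evaluation, error analysis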

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_6501

Resource type

journal article

ID registration

10.20729/00212765

ID registration type

JaLC

Author affiliation (all three authors)

Graduate School of Engineering, Nagoya University (名古屋大学大学院工学研究科)

Author names

Taichi Kato (加藤 汰一)
Rei Miyata (宮田 玲)
Satoshi Sato (佐藤 理史)

Abstract

Description type

Other

Description

Japanese sentence-ending predicates are often composed of a complex combination of content words and functional expressions marking aspect, modality, and politeness, which frequently hinders reading comprehension for learners of Japanese. Most conventional lexical simplification methods replace a difficult word with a simpler synonym one word at a time, and are therefore not always suitable for simplifying such predicates. We propose a method that paraphrases an entire sentence-ending predicate into a simpler one while following the basic lexical simplification process: detection of difficult expressions, followed by generation, validation, and ranking of paraphrase candidates. The principal feature of the method is that the candidate generation step applies BERT, a pre-trained masked language model, so that the whole predicate is simplified at once while the core meaning of the sentence is preserved; this also makes it possible to generate a diverse range of candidates. A human evaluation on descriptive text showed that the proposed method consistently produced more candidates that are both fluent and adequate than multiple baseline methods. We further conducted in-depth analyses of (1) the effectiveness of the average token embedding and dropout, (2) the simplicity of the generated candidates, (3) differences in performance across text domains, and (4) the errors made by the proposed method, clarifying the method's behavior and points for improvement.
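
The four-step process named in the abstract (detection, generation, validation, ranking) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The difficulty table, the scores, the candidate list, and the function names `is_difficult`, `toy_generator`, and `simplify_predicate` are all hypothetical, and the generator is a lookup table standing in for a masked language model such as BERT, which would instead fill a masked predicate span in context. The validation rule used here (drop the original predicate and any candidate that is still difficult) is likewise an assumption made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical difficulty scores; higher means harder for learners.
DIFFICULTY: Dict[str, int] = {
    "利用されている": 3,  # "is being utilized"
    "用いられている": 2,  # "is being employed"
    "使われている": 1,    # "is being used"
    "使われます": 1,      # "is used" (polite)
}

Candidate = Tuple[str, float]  # (paraphrase, score from the generator)


def is_difficult(predicate: str, threshold: int = 2) -> bool:
    """Detection: flag predicates at or above the difficulty threshold.

    Predicates missing from the table are treated as difficult.
    """
    return DIFFICULTY.get(predicate, threshold) >= threshold


def toy_generator(context: str, predicate: str) -> List[Candidate]:
    """Generation stand-in: a fixed table instead of a masked language model.

    The context argument is unused here; a real generator would condition on it.
    """
    table: Dict[str, List[Candidate]] = {
        "利用されている": [
            ("利用されている", 0.9),
            ("使われている", 0.6),
            ("使われます", 0.5),
            ("用いられている", 0.3),
        ],
    }
    return table.get(predicate, [])


def simplify_predicate(
    context: str,
    predicate: str,
    generate: Callable[[str, str], List[Candidate]] = toy_generator,
    threshold: int = 2,
) -> List[str]:
    """Run detection, generation, validation, and ranking for one predicate."""
    # 1. Detection: leave easy predicates untouched.
    if not is_difficult(predicate, threshold):
        return [predicate]
    # 2. Generation: propose replacements for the whole predicate.
    candidates = generate(context, predicate)
    # 3. Validation: drop the original and candidates that are still difficult.
    valid = [
        (cand, score)
        for cand, score in candidates
        if cand != predicate and not is_difficult(cand, threshold)
    ]
    # 4. Ranking: highest generator score first.
    ranked = sorted(valid, key=lambda pair: pair[1], reverse=True)
    return [cand for cand, _ in ranked]


if __name__ == "__main__":
    print(simplify_predicate("この技術は広く", "利用されている"))
```

Passing the generator as a parameter keeps the pipeline independent of any particular model, so the lookup table could be swapped for a model-backed function with the same signature.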

Bibliographic record ID

Source identifier type

NCID

Source identifier

AN00116647

Bibliographic information

IPSJ Journal (情報処理学会論文誌)

Vol. 62, No. 9, pp. 1605-1619, issued 2021-09-15

ISSN

Source identifier type

ISSN

Source identifier

1882-7764
