A Simple Syntax-based Preordering Method for Statistical Machine Translation (統計的機械翻訳のための統語に基づく単純な事前並べ替え手法)

星野, 翔; 宮尾, 祐介; 須藤, 克仁; 林, 克彦; 永田, 昌明; Sho, Hoshino; Yusuke, Miyao; Katsuhito, Sudoh; Katsuhiko, Hayashi; Masaaki, Nagata

https://ipsj.ixsq.nii.ac.jp/records/195410

Name / File	License	Action
IPSJ-JNL6003022.pdf (995.9 kB)	Copyright (c) 2019 by the Information Processing Society of Japan
Open access

Item type

Journal(1)

Publication date

2019-03-15

Title

統計的機械翻訳のための統語に基づく単純な事前並べ替え手法

Title (English)

A Simple Syntax-based Preordering Method for Statistical Machine Translation

Language

jpn

Keywords

Subject scheme

Other

Subject

[Special Issue: Young Researchers] statistical machine translation, syntax-based preordering

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_6501

Resource type

journal article

Author affiliation

National Institute of Informatics / Presently with Mirai Translate, Inc.

Author affiliation

National Institute of Informatics / Presently with The University of Tokyo

Author affiliation

NTT Communication Science Laboratories, NTT Corporation / Presently with Nara Institute of Science and Technology

Author affiliation

NTT Communication Science Laboratories, NTT Corporation / Presently with Osaka University

Author affiliation

NTT Communication Science Laboratories, NTT Corporation

Author names

星野, 翔
宮尾, 祐介
須藤, 克仁
林, 克彦
永田, 昌明

Author names (English)

Sho, Hoshino
Yusuke, Miyao
Katsuhito, Sudoh
Katsuhiko, Hayashi
Masaaki, Nagata

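Abstract

Description type

Other

Description

This paper proposes a simple syntax-based preordering method to improve the accuracy of statistical machine translation for language pairs whose word order differs greatly, such as English and Japanese. First, a constituency parser parses and binarizes the input sentence, yielding a binarized constituent tree. Next, a linear support vector machine is used as a binary classifier to assign each node of the binary tree a reordering label, either reversed or not reversed. The input sentence is then reordered according to the labels assigned to the parse tree and translated with a statistical machine translation system. Similar methods have been tried many times before, but the proposed method simultaneously improves both the oracle reordering labels needed to train the binary classifier and the classifier's feature templates. In English-to-Japanese and Japanese-to-English translation experiments on large-scale patent data, we show that the proposed method substantially improves on the translation accuracy of a preordering method from prior work.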
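The reordering step described in this abstract can be pictured as a recursive walk over the binarized tree that swaps the two children wherever the node is labeled as reversed. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Node` class, the `reorder` function, and the `should_reverse` callback are hypothetical names, and a hand-written rule stands in for the trained linear support vector machine.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    """A node of a binarized constituent tree (hypothetical structure)."""
    label: str
    word: Optional[str] = None       # set only on leaves
    left: Optional["Node"] = None    # set only on internal nodes
    right: Optional["Node"] = None


def reorder(node: Node, should_reverse: Callable[[Node], bool]) -> List[str]:
    """Return the words under `node`, swapping children where the label says so."""
    if node.word is not None:
        return [node.word]
    left = reorder(node.left, should_reverse)
    right = reorder(node.right, should_reverse)
    return right + left if should_reverse(node) else left + right


# Toy English input: "John ate apples".
tree = Node("S",
            left=Node("NP", word="John"),
            right=Node("VP",
                       left=Node("V", word="ate"),
                       right=Node("NP", word="apples")))


def reverse_vp(node: Node) -> bool:
    """Assumed stand-in for the classifier: reverse every VP node."""
    return node.label == "VP"


print(" ".join(reorder(tree, reverse_vp)))  # John apples ate
```

In this toy case the verb moves after its object, which is closer to Japanese word order; the reordered string is what would be passed to the translation system.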

Abstract (English)

Description type

Other

Description

We propose a simple syntax-based preordering method that improves the translation accuracy of statistical machine translation for distant language pairs such as English and Japanese. Our method reorders a source-side binary constituent tree by assigning each binary node a reordering label, which indicates whether the order of its child nodes should be reversed, using a linear support vector machine as a binary classifier. While this idea has been implemented repeatedly for preordering, how to obtain the oracle reordering labels used to train the classifier remains a nontrivial open problem. We introduce a procedure to obtain the oracle reordering labels, as well as a set of features that improves binary classification accuracy on the task of predicting reordering labels. The tree reordered according to the predicted labels yields a reordered source sentence, which is fed to a standard statistical machine translation system to generate a translation. Experimental results in English-to-Japanese and Japanese-to-English patent translation show that our proposal substantially improves on a previously proposed method in terms of translation accuracy.
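The abstract does not say how the oracle reordering labels are derived, so the sketch below only illustrates the kind of problem involved. It uses a generic heuristic that is an assumption here, not the procedure the paper introduces: given word alignments to the target sentence, a node is labeled as reversed when more alignment links between its two children cross than stay in order. The names `Tree`, `leaves`, and `oracle_labels` are hypothetical.

```python
from typing import Dict, List, Tuple, Union

# A tree is either a source word index (leaf) or a (left, right) pair.
Tree = Union[int, Tuple["Tree", "Tree"]]


def leaves(tree: Tree) -> List[int]:
    """Source word indices covered by `tree`, in source order."""
    if isinstance(tree, int):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def oracle_labels(tree: Tree, align: Dict[int, List[int]]
                  ) -> List[Tuple[List[int], List[int], bool]]:
    """For each internal node (post-order): (left span, right span, reversed?)."""
    if isinstance(tree, int):
        return []
    left, right = tree
    out = oracle_labels(left, align) + oracle_labels(right, align)
    monotone = crossing = 0
    for i in leaves(left):
        for j in leaves(right):
            for a in align.get(i, []):
                for b in align.get(j, []):
                    if a < b:
                        monotone += 1   # link pair keeps source order
                    elif a > b:
                        crossing += 1   # link pair crosses
    # Assumed heuristic: reverse only when crossings outnumber monotone pairs.
    out.append((leaves(left), leaves(right), crossing > monotone))
    return out


# Source: John(0) ate(1) apples(2); target positions: 0, 2, 1 respectively.
alignment = {0: [0], 1: [2], 2: [1]}
print(oracle_labels((0, (1, 2)), alignment))
# [([1], [2], True), ([0], [1, 2], False)]
```

Labels produced this way could then serve as training targets for a binary classifier; unaligned words and ties are handled here simply by defaulting to not reversed.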

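Bibliographic record ID

Source identifier type

NCID

Source identifier

AN00116647

Bibliographic information

情報処理学会論文誌 (IPSJ Journal)

Vol. 60, No. 3, pp. 890-902, issued 2019-03-15

ISSN

Source identifier type

ISSN

Source identifier

1882-7764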
