Unsupervised Word Alignment by Agreement Under ITG Constraint
https://ipsj.ixsq.nii.ac.jp/records/183032
License: Copyright (c) 2017 by the Information Processing Society of Japan
Access: Open access

Item type | Journal(1) |
---|---|
Publication date | 2017-08-15 |
Title | Unsupervised Word Alignment by Agreement Under ITG Constraint |
Title language | en |
Language | eng |
Subject scheme | Other |
Keywords | [Regular Paper] statistical machine translation, Inversion Transduction Grammar, unsupervised word alignment, posterior regularized EM, constrained EM |
Resource type identifier | http://purl.org/coar/resource_type/c_6501 |
Resource type | journal article |
Author affiliation | NTT Communication Science Laboratories |
Author affiliation | Ehime University |
Author affiliation | Tokyo Institute of Technology |
Author affiliation | Tokyo Institute of Technology |
Author affiliation | National Institute of Information and Communications Technology |
Authors | Hidetaka Kamigaito, Akihiro Tamura, Hiroya Takamura, Manabu Okumura, Eiichiro Sumita |
Description type | Other |
Abstract | We propose a novel unsupervised word alignment method that uses a constraint based on Inversion Transduction Grammar (ITG) parse trees to jointly unify two directional models. Previous agreement methods are not helpful for locating long-distance alignments because they do not use any syntactic structure. In contrast, the proposed method symmetrizes alignments with regard to their structural coherence by applying the ITG constraint softly within the posterior regularization framework. The ITG constraint is also compatible with word alignments that are not covered by ITG parse trees. Hence, the proposed method is more robust to ITG parse errors than alignment methods that use an ITG model directly. In word alignment evaluation against the HMM, IBM Model 4, and the baseline agreement method, IBM Model 4 with the proposed ITG constraint achieves the best performance on the Japanese-English KFTT and BTEC corpora. In translation evaluation, the proposed method shows comparable or statistically significantly better performance on the Japanese-English KFTT, Japanese-English IWSLT 2007, and Czech/German-English WMT 2015 corpora. |
Note | This is a preprint of an article intended for publication in the Journal of Information Processing (JIP). This preprint should not be cited. This article should be cited as: Journal of Information Processing Vol.25 (2017) (online), DOI http://dx.doi.org/10.2197/ipsjjip.25.831 |
Bibliographic record ID (NCID) | AN00116647 |
Bibliographic information | 情報処理学会論文誌 (IPSJ Journal), Vol. 58, No. 8, issue date 2017-08-15 |
ISSN | 1882-7764 |