Extraction of Potentially Useful Phrase Pairs for Statistical Machine Translation
https://ipsj.ixsq.nii.ac.jp/records/141604
License | Access |
---|---|
Copyright (c) 2015 by the Information Processing Society of Japan | Open access |

Item type | Journal(1) |
---|---|
Publication date | 2015-04-15 |
Title | Extraction of Potentially Useful Phrase Pairs for Statistical Machine Translation |
Title language | en |
Language | eng |
Subject scheme | Other |
Keywords | [Regular Paper] statistical machine translation, phrase table, classification model |
Resource type identifier | http://purl.org/coar/resource_type/c_6501 |
Resource type | journal article |
Author affiliation | Graduate School of Information, Production and Systems, Waseda University |
Author affiliation | Graduate School of Information, Production and Systems, Waseda University |
Authors | Juan, Luo; Yves, Lepage |
Abstract | Over the last decade, an increasing amount of work has been done to advance the phrase-based statistical machine translation model, in which the method of extracting phrase pairs consists of word alignment and phrase extraction. In this paper, we show that, for Japanese-English and Chinese-English statistical machine translation systems, this method is indeed missing potentially useful phrase pairs which could lead to better translation scores. These potentially useful phrase pairs can be detected by looking at the segmentation traces after decoding. We choose to see the problem of extracting potentially useful phrase pairs as a two-class classification problem: among all the possible phrase pairs, distinguish the useful ones from the not-useful ones. As for any classification problem, the question is to discover the relevant features which contribute the most. Extracting potentially useful phrase pairs resulted in a statistically significant improvement of 7.65 BLEU points in English-Chinese and 7.61 BLEU points in Chinese-English experiments. A slight increase of 0.94 BLEU points and 0.4 BLEU points is also observed for the English-Japanese and Japanese-English systems, respectively. |
Description type | Other |
Note | This is a preprint of an article intended for publication in the Journal of Information Processing (JIP). This preprint should not be cited. This article should be cited as: Journal of Information Processing Vol.23(2015) No.3 (online) |
Bibliographic record ID (NCID) | AN00116647 |
Bibliographic information | 情報処理学会論文誌 (IPSJ Journal), Vol. 56, No. 4, issued 2015-04-15 |
ISSN | 1882-7764 |