Joint Phrase Alignment and Extraction for Statistical Machine Translation
https://ipsj.ixsq.nii.ac.jp/records/81312
Copyright (c) 2012 by the Information Processing Society of Japan

Open access

| Item type | Journal(1) |
|---|---|
| Date published | 2012-03-15 |
| Title | Joint Phrase Alignment and Extraction for Statistical Machine Translation |
| Title language | en |
| Language | eng |
| Keyword (subject scheme: Other) | Regular paper |
| Resource type | journal article |
| Resource type identifier | http://purl.org/coar/resource_type/c_6501 |
| Author affiliation | Graduate School of Informatics, Kyoto University / National Institute of Information and Communications Technology |
| Author affiliation | National Institute of Information and Communications Technology |
| Author affiliation | National Institute of Information and Communications Technology |
| Author affiliation | Graduate School of Informatics, Kyoto University |
| Author affiliation | Graduate School of Informatics, Kyoto University |
| Author names | Graham Neubig; Taro Watanabe; Eiichiro Sumita; Shinsuke Mori; Tatsuya Kawahara |
| Abstract (description type: Other) | The phrase table, a scored list of bilingual phrases, lies at the center of phrase-based machine translation systems. We present a method to directly learn this phrase table from a parallel corpus of sentences that are not aligned at the word level. The key contribution of this work is that while previous methods have generally only modeled phrases at one level of granularity, in the proposed method phrases of many granularities are included directly in the model. This allows for the direct learning of a phrase table that achieves competitive accuracy without the complicated multi-step process of word alignment and phrase extraction that is used in previous research. The model is achieved through the use of non-parametric Bayesian methods and inversion transduction grammars (ITGs), a variety of synchronous context-free grammars (SCFGs). Experiments on several language pairs demonstrate that the proposed model matches the accuracy of the more traditional two-step word alignment/phrase extraction approach while reducing its phrase table to a fraction of its original size. |
| Note | This is a preprint of an article intended for publication in the Journal of Information Processing (JIP). This preprint should not be cited. This article should be cited as: Journal of Information Processing Vol.20(2012) No.2 (online), DOI http://dx.doi.org/10.2197/ipsjjip.20.512 |
| Bibliographic record ID (NCID) | AN00116647 |
| Bibliographic information | 情報処理学会論文誌, Vol. 53, No. 3, issue date 2012-03-15 |
| ISSN | 1882-7764 |