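Generalized Statistical Modeling of Pronunciation Variations using the Corpus of Spontaneous Japanese (『日本語話し言葉コーパス』を用いた汎用的な発音変動モデルの統計的学習)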

Yuya Akita; Tatsuya Kawahara (秋田 祐哉; 河原 達也)

https://ipsj.ixsq.nii.ac.jp/records/57095

Name / File	License	Action
IPSJ-SLP04053003 (175.9 kB)	Copyright (c) 2004 by the Information Processing Society of Japan
Open access

Item type

SIG Technical Reports(1)

Publication date

2004-10-22

Title

『日本語話し言葉コーパス』を用いた汎用的な発音変動モデルの統計的学習

Title (English)

Generalized Statistical Modeling of Pronunciation Variations using the Corpus of Spontaneous Japanese

Language

jpn

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Resource type

technical report

Author affiliation

School of Informatics, Kyoto University/PRESTO, Japan Science and Technology Agency (JST)

Authors

Yuya Akita; Tatsuya Kawahara (秋田 祐哉; 河原 達也)

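Abstract (translated from the Japanese)

Description type

Other

Description

In the recognition of spontaneous speech, modeling pronunciation variation is an issue that bears heavily on recognition performance. The pronunciation dictionary used for speech recognition is normally generated from the standard readings output by a morphological analyzer, but such a dictionary cannot cover the pronunciation variations that are common in spontaneous speech. In this study, we first used the Corpus of Spontaneous Japanese (CSJ) to statistically learn pronunciation variation patterns at the level of generic phone sequences. A total of 265 phone-sequence variation patterns were acquired automatically from the corpus; in addition to phonologically plausible patterns, we were able to extract, together with their frequency statistics, patterns that would be difficult to formulate as rules by hand. For these patterns, we construct probabilistic phone rewrite rules that handle variable-length phone contexts by means of a back-off scheme. Applying these rules makes it possible, for any vocabulary, to generate pronunciations that contain variations specific to spontaneous speech (surface forms) from the standard readings (baseforms), together with their occurrence probabilities. When the method was applied to a pronunciation dictionary for a domain different from that of the CSJ, the number of entries increased by 21%. Furthermore, speech recognition with this dictionary yielded a significant improvement in word error rate.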

Abstract (English)

Description type

Other

Description

Pronunciation variation modeling is one of the major issues in automatic transcription of spontaneous speech. We present statistical modeling of subword-based mapping between baseforms and surface forms using a large-scale spontaneous speech corpus (CSJ). Variation patterns of phone sequences are automatically extracted together with their contexts of up to two preceding and following phones, which are decided by their occurrence statistics. Then, we derive a set of rewrite rules with their probabilities and variable-length phone contexts. The model effectively predicts pronunciation variations depending on the phone context using a back-off scheme. Since it is based on phone sequences, the model is applicable to any lexicon to generate appropriate surface forms. The proposed method was evaluated on a transcription task whose domain is different from the training corpus (CSJ), and a significant reduction of word error rate was achieved.
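The rule mechanism described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the rule table `RULES`, its probabilities, the boundary symbol `#`, the function names, and the back-off order (longest total context first) are all assumptions made for illustration, and targets are simplified to single phones, whereas the report works with phone-sequence patterns whose contexts are chosen from occurrence statistics.

```python
from itertools import product

# Hypothetical toy rules, NOT taken from the report.
# Key: (left context, target phone, right context), contexts of 0 to 2 phones.
# Value: {replacement phone tuple: probability}; () means the phone is deleted.
RULES = {
    (("e", "s"), "u", ("#",)): {("u",): 0.7, (): 0.3},  # word-final "u" after "e s"
    ((), "u", ()): {("u",): 0.95, (): 0.05},            # context-free fallback
}

PAD = 2  # number of boundary symbols on each side of the word

# Assumed back-off order: try the longest total context first, then shorter ones.
BACKOFF_ORDER = sorted(product(range(PAD + 1), repeat=2),
                       key=lambda lr: (-(lr[0] + lr[1]), -lr[0]))


def lookup(padded, i, rules):
    """Return the replacement distribution for padded[i], backing off
    from long contexts to short ones until a rule matches."""
    for left, right in BACKOFF_ORDER:
        key = (tuple(padded[i - left:i]), padded[i],
               tuple(padded[i + 1:i + 1 + right]))
        if key in rules:
            return rules[key]
    return {(padded[i],): 1.0}  # no rule applies: keep the phone unchanged


def surface_forms(baseform, rules=RULES, min_prob=0.01):
    """Expand a baseform (list of phones) into surface forms with probabilities."""
    padded = ["#"] * PAD + list(baseform) + ["#"] * PAD
    forms = {(): 1.0}
    for i in range(PAD, PAD + len(baseform)):
        options = lookup(padded, i, rules)
        expanded = {}
        for prefix, p in forms.items():
            for replacement, q in options.items():
                form = prefix + replacement
                expanded[form] = expanded.get(form, 0.0) + p * q
        forms = expanded
    # Drop very unlikely variants to limit dictionary growth.
    return {form: p for form, p in forms.items() if p >= min_prob}


if __name__ == "__main__":
    for form, prob in surface_forms(["d", "e", "s", "u"]).items():
        print(" ".join(form), prob)
```

In this sketch, the baseform `d e s u` matches the context-dependent rule and yields two surface forms, while a word such as `k u` has no matching long context and falls back to the context-free rule. Because the rules are keyed on phones rather than words, the same table can be applied to the entries of any lexicon, which is the property the abstract emphasizes.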

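Bibliographic record ID

Source identifier type

NCID

Source identifier

AN10442647

Bibliographic information

IPSJ SIG Technical Reports, Spoken Language Processing (SLP) (情報処理学会研究報告音声言語情報処理（SLP）)

Vol. 2004, No. 103 (2004-SLP-053), pp. 13-18, issued 2004-10-22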

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher

Information Processing Society of Japan (情報処理学会)


Versions

Ver.1

2025-01-21 15:29:42.987618
