A New Language Model by using (n≧4)-gram for Broadcast News Speech Transcription


OAI identifier: oai:ipsj.ixsq.nii.ac.jp:00057577
Record datestamp: 2024-03-29T05:26:34Z
Set spec: 01164:05159:05216:05217

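Title: A New Language Model by using (n≧4)-gram for Broadcast News Speech Transcription
Original title: ニュース音声認識のための（n≧4）-gramを併用する言語モデル
Language: jpn
Type: Technical Report
URI: http://id.nii.ac.jp/1001/00057577/
Download: https://ipsj.ixsq.nii.ac.jp/ej/?action=repository_action_common_download&item_id=57577&item_no=1&attribute_id=1&file_no=1
Rights: Copyright (c) 1999 by the Information Processing Society of Japan
Authors (in record order): 加藤, 直人 (Kato, Naoto); 浦谷則好 (Uratani, Noriyoshi); 江原暉将 (Ehara, Terumasa); 安藤, 彰男 (Ando, Akio)
Affiliations (in record order): NHK放送技術研究所 (NHK Science and Technical Research Laboratories); ATR音声翻訳通信研究所 (ATR Interpreting Telecommunications Research Laboratories); NHK放送技術研究所; NHK放送技術研究所

Abstract (translated from the Japanese):
Improving speech recognition accuracy requires a language model with strong linguistic constraints, and task adaptation is one way to build such a model. Adapting too closely to a task, however, reduces robustness. This paper describes a language model that adapts to the task by using (n≧4)-grams while largely preserving robustness by also using 2,3-grams. Rather than storing (n≧4)-grams as declarative knowledge, as conventional n-grams are, the proposed model introduces the concept of a word position dictionary and stores them as procedural knowledge, so that they can be used without greatly increasing the amount of data. We applied the model to broadcast news and evaluated it by perplexity, obtaining good results.

Abstract (English, as supplied in the record):
Language model adaptation is one of the important methods for constructing a practical speech recognition system. Conventional adaptation methods adjust n-grams estimated from various task corpora toward those estimated from a specific task corpus. However, these methods are not very effective for some tasks such as TV news, because some TV news does not use news scripts. This paper proposes a new language model for broadcast news speech transcription. Our model can not only adapt to a specific task but also handle a wider range of tasks by dynamically using (n≧4)-grams and 2,3-grams. The proposed method reduces the amount of (n≧4)-gram data by registering it as procedural knowledge through WPD (Word Position Data). The WPD represents the position of each word in a task corpus and is constructed automatically from the corpus. We conducted a series of experiments to evaluate our model and obtained good results.

NCID: AN10442647
Source: 情報処理学会研究報告音声言語情報処理（SLP） (IPSJ SIG Technical Reports, Spoken Language Processing (SLP))
Volume: 1999
Issue: 108 (1999-SLP-029)
Pages: 187-192
Date issued: 1999-12-20
Additional record date (role not stated): 2009-06-30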
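
The abstract says that (n≧4)-grams are not stored as a table but recovered from a record of where each word occurs in the task corpus. The record gives no further detail, so the following is only a minimal sketch of that general idea, not the authors' method. It assumes a plain positional index, and the names `build_position_index`, `count_ngram`, and `next_word_estimate` are hypothetical. Long n-gram counts are computed on demand by walking the corpus from each recorded position of the first word, and the estimate falls back to shorter contexts when a long one is unseen.

```python
from collections import defaultdict


def build_position_index(corpus):
    """Map each word to the list of positions at which it occurs in the corpus."""
    index = defaultdict(list)
    for pos, word in enumerate(corpus):
        index[word].append(pos)
    return index


def count_ngram(corpus, index, ngram):
    """Count an n-gram of any length without storing an n-gram table.

    Only the positions of the first word are inspected (assumed scheme).
    """
    ngram = list(ngram)
    if not ngram:
        return 0
    n = len(ngram)
    return sum(
        1 for start in index.get(ngram[0], [])
        if corpus[start:start + n] == ngram
    )


def next_word_estimate(corpus, index, history, word, max_n=5):
    """Relative frequency of `word` after the longest history suffix seen in the corpus.

    Returns (n-gram order used, estimate); (0, 0.0) if no suffix was seen.
    No smoothing is applied; this is an illustration only.
    """
    for k in range(min(max_n - 1, len(history)), 0, -1):
        context = list(history[-k:])
        followers = [
            corpus[s + k]
            for s in index.get(context[0], [])
            if s + k < len(corpus) and corpus[s:s + k] == context
        ]
        if followers:
            return k + 1, followers.count(word) / len(followers)
    return 0, 0.0
```

For example, after `corpus = "the prime minister said today that the prime minister will visit".split()` and `index = build_position_index(corpus)`, calling `next_word_estimate(corpus, index, ["that", "the", "prime", "minister"], "will")` uses a 5-gram context, while an unseen history returns order 0. The index grows linearly with the corpus regardless of the n-gram length, which is the storage trade-off the abstract alludes to.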