Evaluation of Large-Vocabulary Continuous Speech Recognition Using Acoustic Models for Elderly Speakers

馬場, 朗; 芳澤伸一; 山田, 実一; 李晃伸; 鹿野, 清宏; Akira, Baba; Shinichi, Yoshizawa; Miichi, Yamada; Akinobu, Lee; Kiyohiro, Shikano

https://ipsj.ixsq.nii.ac.jp/records/57437

Name / File	License	Action
IPSJ-SLP00035003.pdf (588.7 kB)	Copyright (c) 2001 by the Information Processing Society of Japan
Open access

Item type

SIG Technical Reports(1)

Publication date

2001-02-02

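Title

Evaluation of Large-Vocabulary Continuous Speech Recognition Using Acoustic Models for Elderly Speakers (高齢者向け音響モデルによる大語彙連続音声認識の評価)

Title

Acoustic Model Evaluation for Elder Speaker Speech Database

Language

jpn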

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Resource type

technical report

Author affiliation

イメージ情報科学研究所

Author affiliation

イメージ情報科学研究所

Author affiliation

奈良先端科学技術大学院大学情報科学研究科

Author affiliation

奈良先端科学技術大学院大学情報科学研究科

Author affiliation

奈良先端科学技術大学院大学情報科学研究科

Author affiliation (English)

Laboratories of Image Information Science and Technology

Author affiliation (English)

Laboratories of Image Information Science and Technology

Author affiliation (English)

Graduate School of Information Science, Nara Institute of Science and Technology

Author affiliation (English)

Graduate School of Information Science, Nara Institute of Science and Technology

Author affiliation (English)

Graduate School of Information Science, Nara Institute of Science and Technology

Author name

馬場, 朗

Author name (English)

Akira, Baba

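Abstract

Description type

Other

Description

In recent years, as systems built on large-vocabulary continuous speech recognition have spread, speech recognition has come to be used for a growing range of purposes. One factor that degrades the performance of a speech recognition system is a mismatch between the acoustic characteristics of the system's users and the acoustic model. Acoustic models are generally trained on adult speech, so a mismatch with the acoustic characteristics of elderly speakers can arise and may lower the recognition rate. In this paper, we trained an acoustic model on a large-scale elderly speech database (200 sentences × 301 speakers) and evaluated this elderly-speaker acoustic model in a large-vocabulary continuous speech recognition system. In the experiments, the word recognition rate improved by 3 to 5% compared with the results of a model trained on an adult database (150 sentences × 260 speakers).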

Abstract (English)

Description type

Other

Description

Speech recognition technologies have come into wide use in various areas thanks to recent developments in large vocabulary continuous speech recognition (LVCSR) algorithms. Acoustic differences among speakers are considered one of the main causes of degraded speech recognition rates. In particular, the acoustic difference between an elder-speaker speech database and a typical adult speech database should be evaluated and studied so that elder speakers can use speech recognition systems. In this paper we evaluate elder-speaker acoustic models in LVCSR, trained on a 301-elder-speaker utterance database in which each speaker utters 200 sentences. The elder-speaker PTM acoustic model attains an 88.9% word recognition rate, which is better than the 85.4% word recognition rate of the typical adult PTM acoustic model.
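To put the two reported figures side by side, the short sketch below computes the absolute gain and the relative error reduction between the adult and elder-speaker PTM models. Only the two rates (85.4% and 88.9%) come from the abstract. The function names are hypothetical, and treating the error rate as 100 minus the word recognition rate is an assumption made for illustration; the report may define its metric differently.

```python
def error_rate(recognition_rate_pct: float) -> float:
    """Assumption: everything not counted as recognized is counted as error."""
    return 100.0 - recognition_rate_pct


def compare(baseline_pct: float, candidate_pct: float) -> tuple[float, float]:
    """Return (absolute gain in points, relative error reduction as a fraction)."""
    absolute_gain = candidate_pct - baseline_pct
    relative_error_reduction = absolute_gain / error_rate(baseline_pct)
    return absolute_gain, relative_error_reduction


# Word recognition rates quoted in the English abstract.
ADULT_PTM = 85.4
ELDER_PTM = 88.9

gain, rer = compare(ADULT_PTM, ELDER_PTM)
print(f"absolute gain: {gain:.1f} points")
print(f"relative error reduction: {rer:.1%}")
```

Under that assumption, the gain is about 3.5 points, which corresponds to roughly a quarter of the baseline model's errors being removed.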

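Bibliographic record ID

Source identifier type

NCID

Source identifier

AN10442647

Bibliographic information

IPSJ SIG Technical Reports, Spoken Language Processing (SLP) (情報処理学会研究報告音声言語情報処理)

Vol. 2001, No. 11 (2000-SLP-035), pp. 13-18, issue date 2001-02-02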

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher

Information Processing Society of Japan

