Real-time Frequency Characteristic Normalization for Distributed Speech Recognition (分散音声認識における実時間周波数特性正規化手法)

柘植 覚 (Satoru Tsuge); 黒岩 眞吾 (Shingo Kuroiwa); 獅々堀 正幹 (Masami Shishibori); 任 福継 (Fuji Ren); 北 研二 (Kenji Kita)

https://ipsj.ixsq.nii.ac.jp/records/10072

Name / File	License	Action
IPSJ-JNL4802047.pdf (228.1 kB)	Copyright (c) 2007 by the Information Processing Society of Japan
Open access

Item type

Journal(1)

Publication date

2007-02-15

Title

分散音声認識における実時間周波数特性正規化手法

Title (English)

Real-time Frequency Characteristic Normalization for Distributed Speech Recognition

Language

jpn

Keywords

Subject scheme

Other

Subject

Paper

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_6501

Resource type

journal article

Other title

Spoken Language

Author affiliations

Faculty of Engineering, The University of Tokushima (徳島大学工学部)

Faculty of Engineering, The University of Tokushima (徳島大学工学部)

Faculty of Engineering, The University of Tokushima (徳島大学工学部)

Faculty of Engineering, The University of Tokushima (徳島大学工学部)

Center for Advanced Information Technology, The University of Tokushima (徳島大学高度情報化基盤センター)

Author names

柘植 覚; 黒岩 眞吾; 獅々堀 正幹; 任 福継; 北 研二

Author names (English)

Satoru Tsuge; Shingo Kuroiwa; Masami Shishibori; Fuji Ren; Kenji Kita

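Abstract

Description type

Other

Description

This paper proposes a real-time frequency characteristic normalization method that uses multiple reference cepstra. The method is intended to suppress the degradation in recognition performance caused by differences in the frequency characteristics of input systems in Distributed Speech Recognition (DSR). It uses multiple reference cepstra to compute, frame-synchronously, a bias that normalizes the frequency characteristic, and so normalizes the frequency characteristic of the input system in real time. Because DSR clients generally have limited memory and computational resources, the proposed method keeps the added cost low by representing the reference cepstra as combinations of the VQ codebooks used in the feature-parameter compression stage of the DSR front-end. Speech recognition experiments on the Acoustical Society of Japan newspaper article read-speech corpus, using the ETSI Advanced DSR front-end, confirmed that the proposed method is more effective than the Blind Equalization of the ETSI Advanced DSR front-end at suppressing the loss of recognition accuracy caused by differences in frequency characteristics. In particular, under the MIRS filter condition, the proposed method reduced the word error rate of the ETSI Advanced DSR front-end (Blind Equalization) by 10.8%.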

Abstract (English)

Description type

Other

Description

In this paper, we propose a real-time blind equalization method with multiple references for the ETSI standard Distributed Speech Recognition (DSR) front-end. The proposed method compensates for acoustic mismatch caused by input devices. The ETSI advanced DSR front-end already includes a blind equalization method for this purpose, which estimates the compensating bias using a single reference vector. If the input speech is short or contains many similar phonemes, that method may not estimate the bias accurately. The proposed method instead estimates the bias frame by frame using multiple references, which yields a more accurate estimate. In addition, we represent the references by combining the VQ centroids used in the data compression process of the ETSI standard DSR front-end, which limits the increase in memory size and computation cost on the front-end. Experimental results on a Japanese newspaper dictation task indicate that the proposed method performs better under acoustically mismatched conditions than the conventional blind equalization method. In particular, we observed a 10.8% reduction in the word error rate under the MIRS filter condition.
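The frame-by-frame bias estimation described in the abstract can be illustrated with a short sketch. This is not the authors' algorithm: it assumes an LMS-style update in which each equalized frame is pulled toward its nearest reference cepstrum, and the names `nearest_reference`, `normalize_stream`, and the `step_size` value are hypothetical. The references are passed in as plain vectors; the paper builds them from VQ codebook centroids, which this sketch does not model.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def nearest_reference(frame: Sequence[float],
                      references: Sequence[Sequence[float]]) -> Sequence[float]:
    """Return the reference cepstrum closest to `frame` (squared Euclidean distance)."""
    return min(references,
               key=lambda ref: sum((x - r) ** 2 for x, r in zip(frame, ref)))


def normalize_stream(frames: Sequence[Sequence[float]],
                     references: Sequence[Sequence[float]],
                     step_size: float = 0.05) -> Tuple[List[Vector], Vector]:
    """Equalize cepstral frames one at a time (assumed LMS-style update).

    For each frame: subtract the current bias, pick the nearest reference,
    then move the bias a small step toward the remaining error.
    """
    bias: Vector = [0.0] * len(references[0])
    output: List[Vector] = []
    for frame in frames:
        equalized = [c - b for c, b in zip(frame, bias)]
        target = nearest_reference(equalized, references)
        # Hypothetical update rule: bias tracks the offset between input and reference.
        bias = [b + step_size * (e - t) for b, e, t in zip(bias, equalized, target)]
        output.append(equalized)
    return output, bias


# Toy data: two made-up reference cepstra and a constant channel offset.
references = [[0.0, 0.0], [4.0, 0.0]]
offset = [0.5, -0.5]
clean = [references[i % 2] for i in range(400)]
frames = [[c + o for c, o in zip(vec, offset)] for vec in clean]

normalized, bias = normalize_stream(frames, references)
print("estimated bias:", bias)
```

In this toy setting the estimated bias approaches the simulated channel offset because every frame lies close to one of the references. With a single reference, frames belonging to other sounds would pull the bias away from the true offset, which is the concern the abstract raises for short or phonetically uniform input.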

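Bibliographic record ID

Source identifier type

NCID

Source identifier

AN00116647

Bibliographic information

情報処理学会論文誌 (IPSJ Journal)

Vol. 48, No. 2, pp. 900-908, issued 2007-02-15

ISSN

Source identifier type

ISSN

Source identifier

1882-7764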

Versions

Ver.1

2025-01-23 03:03:58.756650
