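Speech Stream Segregation and Preliminary Results on Listening to Several Speeches Simultaneously (音声ストリーム分離法の提案と複数音声の同時認識の予備実験)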

Hiroshi G. Okuno; Tomohiro Nakatani; Takeshi Kawabata (奥乃 博; 中谷 智広; 川端 豪)

https://ipsj.ixsq.nii.ac.jp/records/13455

Name / File	License	Action
IPSJ-JNL3803014.pdf (1.9 MB)	Copyright (c) 1997 by the Information Processing Society of Japan
Open access

Item type

Journal(1)

Publication date

1997-03-15

Title

音声ストリーム分離法の提案と複数音声の同時認識の予備実験

Title (English)

Speech Stream Segregation and Preliminary Results on Listening to Several Speeches Simultaneously

Language

jpn

Keyword

Subject scheme

Other

Subject

Paper (論文)

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_6501

Resource type

journal article

Other title

Spoken Language Information Processing (音声言語情報処理)

Author affiliation (listed identically for all three authors)

NTT Basic Research Laboratories, Nippon Telegraph and Telephone Corporation (日本電信電話株式会社基礎研究所)

Author names

奥乃 博; 中谷 智広; 川端 豪

Author names (English)

Hiroshi G. Okuno; Tomohiro Nakatani; Takeshi Kawabata

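Abstract

Description type

Other

Description

This paper examines the problems that arise when sound stream segregation is used as a preprocessing step for speech recognition in general environments. The first half proposes a method for speech stream segregation. The method consists of extracting harmonic stream fragments, grouping those fragments, and filling in the non-harmonic structure from the residue that remains after all harmonic structures are removed from the input sound. The second half identifies the problems in recognizing the segregated speech streams with a discrete, single-codebook HMM-LR recognizer and presents solutions. The binaural input that the proposed method uses to extract direction information was found to cause spectral distortion, which affects speech recognition. As a countermeasure, we propose re-training the HMM-LR parameters with training data to which head-related transfer functions for four directions have been applied. Five speech recognition experiments were run on 500 pairs of utterances containing consonants, spoken by two speakers (SNR from 0 dB to -3 dB). Speech stream segregation reduced the recognition errors caused by the mixed sound, measured on the top-10 cumulative recognition rate, by up to 77%.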
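The abstract reports its result as a reduction in errors on the top-10 cumulative recognition rate. The following is a minimal sketch of how such a figure could be computed. It assumes the reduction is defined relative to the error rate on the unsegregated mixture, which the abstract does not spell out. The function names, the toy word lists, and `k=2` are illustrative inventions and are not data from the paper.

```python
def top_k_cumulative_rate(ranked_candidates, references, k=10):
    """Fraction of utterances whose reference word is among the top-k candidates."""
    if len(ranked_candidates) != len(references):
        raise ValueError("need exactly one candidate list per reference word")
    hits = sum(
        1 for candidates, ref in zip(ranked_candidates, references)
        if ref in candidates[:k]
    )
    return hits / len(references)


def relative_error_reduction(rate_mixed, rate_segregated):
    """Share of the mixed-input errors removed by segregation (assumed definition)."""
    err_mixed = 1.0 - rate_mixed
    err_segregated = 1.0 - rate_segregated
    if err_mixed <= 0.0:
        raise ValueError("the mixed input has no errors to reduce")
    return (err_mixed - err_segregated) / err_mixed


# Hypothetical toy data: four reference words and ranked recognizer outputs.
references = ["aka", "umi", "yama", "kawa"]
mixed = [["aka", "aki"], ["uma", "ume"], ["yami", "yume"], ["kawa", "kasa"]]
segregated = [["aka", "aki"], ["umi", "uma"], ["yami", "yume"], ["kawa", "kasa"]]

rate_mixed = top_k_cumulative_rate(mixed, references, k=2)
rate_seg = top_k_cumulative_rate(segregated, references, k=2)
reduction = relative_error_reduction(rate_mixed, rate_seg)
print(rate_mixed, rate_seg, reduction)
```

In this toy case the mixture gets two of four words into the top 2 and the segregated input gets three, so half of the mixture's errors are removed. If the paper instead measures errors relative to clean single-speaker recognition, the denominator would change accordingly.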

Abstract (English)

Description type

Other

Description

This paper reports preliminary results of experiments on listening to several sounds at once. Two issues are addressed: segregating speech streams from a mixture of sounds, and interfacing speech stream segregation with automatic speech recognition (ASR). Speech stream segregation (SSS) is designed as three processes: extracting harmonic fragments; grouping the extracted harmonic fragments according to their directions; and substituting the non-harmonic residue of harmonic fragments for the non-harmonic parts of each group. The main problem in interfacing SSS with HMM-based ASR is how to reduce the recognition errors caused by spectral distortion of the segregated sounds, which is mainly due to binaural input. Our solution is to re-train the HMM parameters with training data binauralized for four directions. Experiments with five sets of 500 mixtures of two women's/men's utterances of a word (SNR from 0 dB to -3 dB) showed that speech stream segregation reduced the word recognition error, counted over the top 10 candidates, by up to 77%.
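The experiments use mixtures of two utterances at an SNR between 0 dB and -3 dB. The sketch below shows one common way to build such a mixture: scale the interfering signal so that the power ratio between the two signals hits the requested value, then add them. It assumes that SNR means the ratio of the target utterance's mean power to the interfering utterance's mean power. The function names and the sine-wave stand-ins for speech are hypothetical and are not taken from the paper.

```python
import math


def mean_power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def measured_snr_db(target, interferer):
    """Power ratio of target to interferer, in decibels."""
    return 10.0 * math.log10(mean_power(target) / mean_power(interferer))


def mix_at_snr(target, interferer, snr_db):
    """Scale the interferer to the requested SNR and add it to the target.

    Returns the mixture and the scaled interferer.
    """
    if len(target) != len(interferer):
        raise ValueError("signals must have the same length")
    p_target = mean_power(target)
    p_interferer = mean_power(interferer)
    if p_target == 0.0 or p_interferer == 0.0:
        raise ValueError("signals must have non-zero power")
    # Choose gain so that p_target / (gain**2 * p_interferer) == 10**(snr_db / 10).
    gain = math.sqrt(p_target / (p_interferer * 10.0 ** (snr_db / 10.0)))
    scaled = [gain * x for x in interferer]
    mixture = [a + b for a, b in zip(target, scaled)]
    return mixture, scaled


# Toy stand-ins for two utterances: sine waves sampled at 8 kHz (illustrative only).
rate = 8000
target = [math.sin(2 * math.pi * 200 * n / rate) for n in range(800)]
interferer = [0.3 * math.sin(2 * math.pi * 300 * n / rate) for n in range(800)]

mixture, scaled = mix_at_snr(target, interferer, snr_db=-3.0)
achieved = measured_snr_db(target, scaled)
print(round(achieved, 6))
```

At -3 dB the interfering utterance carries roughly twice the power of the target, which is why recognition on the raw mixture is difficult without segregation.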

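Bibliographic record ID

Source identifier type

NCID

Source identifier

AN00116647

Bibliographic information

情報処理学会論文誌 (IPSJ Journal)

Vol. 38, No. 3, pp. 510-523, issue date 1997-03-15

ISSN

Source identifier type

ISSN

Source identifier

1882-7764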

Versions

Ver.1

2025-01-23 01:14:05.308338
