Spoken Term Detection Using Phoneme Transition Network from Multiple Speech Recognizers' Outputs
https://ipsj.ixsq.nii.ac.jp/records/90270
Name / File | License | Action |
---|---|---|
 | Copyright (c) 2013 by the Information Processing Society of Japan | Open access |
Item type | Journal(1) |
---|---|
Publication date | 2013-02-15 |
Title | Spoken Term Detection Using Phoneme Transition Network from Multiple Speech Recognizers' Outputs |
Title language | en |
Language | eng |
Subject scheme | Other |
Keywords | [Special issue: Spoken Document Processing] majority voting, multiple speech recognizers, network-based indexing, spoken term detection |
Resource type identifier | http://purl.org/coar/resource_type/c_6501 |
Resource type | journal article |
Author affiliation | Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi |
Author affiliation | Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi |
Author affiliation | Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi |
Author affiliation | Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi |
Author | Satoshi, Natori |
Author | Yuto, Furuya |
Author | Hiromitsu, Nishizaki |
Author | Yoshihiro, Sekiguchi |
Description type | Other |
Abstract | Spoken Term Detection (STD) that considers the out-of-vocabulary (OOV) problem has generated significant interest in the field of spoken document processing. This study describes STD with false detection control using phoneme transition networks (PTNs) derived from the outputs of multiple speech recognizers. PTNs are similar to subword-based confusion networks (CNs), which are originally derived from a single speech recognizer. Since the PTN-formed index is based on the outputs of multiple speech recognizers, it is robust to recognition errors. Therefore, in an STD task, the PTN-formed index should also be more robust to recognition errors than a CN-formed index built from a single speech recognition system. Our PTN-formed index was evaluated on a test collection. The experiment showed that the PTN-based approach effectively detected OOV terms and improved the F-measure value from 0.370 to 0.639 compared with a baseline approach. Furthermore, we applied two false detection control parameters to the calculation of the detection score: one is based on a majority voting scheme, and the other is a measure of the ambiguity of the CN. With these parameters, STD performance was better (F-measure of 0.736) than without any parameters (0.639). |
Note | This is a preprint of an article intended for publication in the Journal of Information Processing (JIP). This preprint should not be cited. This article should be cited as: Journal of Information Processing Vol.21(2013) No.2 (online) DOI http://dx.doi.org/10.2197/ipsjjip.21.176 |
Bibliographic record ID (NCID) | AN00116647 |
Bibliographic information | 情報処理学会論文誌, Vol. 54, No. 2, issue date 2013-02-15 |
ISSN | 1882-7764 |