| Item type |
SIG Technical Reports(1) |
| Publication date |
2023-02-21 |
| Title |
|
|
Title |
Increasing Speech Intelligibility for Evacuation Guidance by Mimicking Professional Announcers’ Voice |
| Language |
|
|
Language |
eng |
| Keywords |
|
|
Subject scheme |
Other |
|
Subject |
SP4: Speech processing and evaluation |
| Resource type |
|
|
Resource type identifier |
http://purl.org/coar/resource_type/c_18gh |
|
Resource type |
technical report |
| Author affiliation |
|
|
|
School of Information Science, Japan Advanced Institute of Science and Technology |
| Author affiliation |
|
|
|
School of Information Science, Japan Advanced Institute of Science and Technology |
| Author affiliation |
|
|
|
School of Information Science, Japan Advanced Institute of Science and Technology |
| Author name |
KimDung, Tran
Masato, Akagi
Masashi, Unoki
|
| Abstract |
|
|
Description type |
Other |
|
Description |
A recent study found that, in noisy environments, speech uttered by professional announcers is more intelligible than speech uttered by non-experts. Based on this finding, StarGAN, a voice conversion system, was applied to mimic professional announcers' speech by modifying the speaker embedding of non-expert speech. Experimental results showed that this method increases intelligibility significantly. This report discusses whether speech intelligibility can be changed gradually by shifting even a single PCA component of the speaker embedding, and which features change when that component is shifted. Candidate physical correlates are the tilt and plateau of the spectrum and prominent cepstral peaks. |
| Bibliographic record ID |
|
|
Source identifier type |
NCID |
|
Source identifier |
AN10442647 |
| Bibliographic information |
SIG Technical Reports: Spoken Language Processing (SLP)
Vol. 2023-SLP-146,
No. 73,
p. 1-6,
Issue date 2023-02-21
|
| ISSN |
|
|
Source identifier type |
ISSN |
|
Source identifier |
2188-8663 |
| Notice |
|
|
|
SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc. |
| Publisher |
|
|
Language |
ja |
|
Publisher |
Information Processing Society of Japan (情報処理学会) |