Deep Learning-Based Voice Conversion
https://ipsj.ixsq.nii.ac.jp/records/194519
| Name / File | License | Action |
|---|---|---|
|  | Copyright (c) 2019 by the Information Processing Society of Japan |  |

Open Access
| Item type | SIG Technical Reports(1) |
|---|---|
| Release date | 2019-02-20 |
| Title | Deep Learning-Based Voice Conversion |
| Language | eng |
| Resource type identifier | http://purl.org/coar/resource_type/c_18gh |
| Resource type | technical report |
| Author affiliation | University of Science and Technology of China |
| Author | Zhenhua, Ling |
| Abstract | In this talk, I will introduce our recent work on applying deep learning techniques to voice conversion. Several methods have been proposed to improve different components in the pipeline of a statistical parametric voice conversion system, including deep neural networks with layer-wise generative training for acoustic modeling, deep autoencoders with binary distributed hidden units for feature representation, and a WaveNet vocoder with limited training data for waveform reconstruction. Then, I will introduce our system designed for the Voice Conversion Challenge 2018, which achieved the best performance under both parallel and non-parallel conditions in this evaluation. After this, I will present our recent progress on sequence-to-sequence acoustic modeling for voice conversion, which converts the acoustic features and durations of source utterances simultaneously using a unified acoustic model. Finally, some discussions on the future development of voice conversion techniques will be given. |
| Bibliographic record ID (NCID) | AN10442647 |
| Bibliographic information | IPSJ SIG Technical Report on Spoken Language Processing (SLP), Vol. 2019-SLP-126, No. 4, pp. 1-1, issued 2019-02-20 |
| ISSN | 2188-8663 |
| Notice | SIG Technical Reports are nonrefereed and hence may later appear in journals, conference proceedings, symposia, etc. |
| Publisher | Information Processing Society of Japan |
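The abstract's sequence-to-sequence acoustic modeling idea — one unified model that maps a source feature sequence to a target sequence of a *different* length, so spectral features and durations are converted together — can be illustrated with a minimal NumPy sketch. This is not the talk's actual model: all dimensions, weights, and the simple attention mechanism here are illustrative assumptions, using random untrained parameters only to show the shapes involved.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Illustrative dimensions (not from the talk): e.g. 13-dim cepstral
# features in and out, a 16-dim hidden space.
D_IN, D_HID, D_OUT = 13, 16, 13

W_enc = rng.standard_normal((D_IN, D_HID)) * 0.1   # encoder projection
W_att = rng.standard_normal((D_HID, D_HID)) * 0.1  # attention bilinear form
W_out = rng.standard_normal((D_HID, D_OUT)) * 0.1  # output projection

def encode(src):
    """Map source frames (T_src, D_IN) to hidden states (T_src, D_HID)."""
    return np.tanh(src @ W_enc)

def decode(enc, t_out):
    """Emit t_out target frames; each step attends over all encoder
    states, so the output length is decoupled from the input length —
    the sense in which features and durations are converted jointly."""
    frames = []
    query = enc.mean(axis=0)                 # crude initial decoder state
    for _ in range(t_out):
        scores = enc @ (W_att @ query)       # (T_src,) attention energies
        ctx = softmax(scores) @ enc          # context: weighted encoder sum
        frames.append(ctx @ W_out)           # predicted target frame
        query = np.tanh(ctx)                 # next decoder state
    return np.stack(frames)                  # (t_out, D_OUT)

src = rng.standard_normal((40, D_IN))        # 40 source frames
out = decode(encode(src), t_out=55)          # 55 target frames
print(out.shape)                             # (55, 13)
```

A trained model would additionally predict when to stop emitting frames (learning the target duration itself); here the target length is passed in explicitly to keep the sketch self-contained.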