High-quality nonparallel voice conversion using CycleGAN (CycleGANを用いた高品質なノンパラレル声質変換)

Fuming Fang (房 福明); Junichi Yamagishi (山岸 順一); Isao Echizen (越前 功)

https://ipsj.ixsq.nii.ac.jp/records/184865

Name / File	License	Action
IPSJ-SLP17119009.pdf (973.9 kB)	Copyright (c) 2017 by the Information Processing Society of Japan
Open access

Item type

SIG Technical Reports(1)

Publication date

2017-12-14

Title

CycleGANを用いた高品質なノンパラレル声質変換

Title (English)

High-quality nonparallel voice conversion using CycleGAN

Language

jpn

Keywords

Subject scheme

Other

Subject

Poster session

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Resource type

technical report

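Author affiliation

National Institute of Informatics

Author affiliation

National Institute of Informatics / University of Edinburgh

Author affiliation

National Institute of Informatics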

Author names

房, 福明
山岸, 順一
越前, 功

Author names (English)

Fuming, Fang
Junichi, Yamagishi
Isao, Echizen

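Abstract (translated from the Japanese)

Description type

Other

Description

In recent years, advances in machine learning have greatly improved the performance of voice conversion. However, in the nonparallel case, where the training data are not paired, it is difficult to match the features of the source and target speakers precisely. Training a nonparallel voice conversion model therefore remains difficult, and conversion performance remains low. Meanwhile, in the field of image translation, CycleGAN has attracted attention as a method for learning a translation model from an unpaired image database. CycleGAN is a kind of GAN that has multiple generators and discriminators. Its generators can convert input data to the target domain through adversarial training against the discriminators while preserving part of the information in the input. Building on this idea, this study proposes a method for applying CycleGAN to nonparallel voice conversion. Rather than directly matching similar features of the source and target speakers, the proposed method trains the conversion model so that the speaker characteristics are brought as close as possible to those of the target speaker while part of the source speaker's linguistic information is preserved. A subjective listening evaluation shows that the proposed method outperformed parallel voice conversion based on a standard GAN.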
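For orientation, the objective used by CycleGAN in its original unpaired image-translation setting can be written as below. This is a sketch of that general formulation, not a statement of the exact losses used in this report. The symbols are assumptions introduced here: G maps source features x to the target domain, F maps target features y back to the source domain, D_X and D_Y are the two discriminators, and λ is a weighting constant.

```latex
\begin{align}
\mathcal{L}_{\mathrm{GAN}}(G, D_Y) &= \mathbb{E}_{y}\bigl[\log D_Y(y)\bigr]
  + \mathbb{E}_{x}\bigl[\log\bigl(1 - D_Y(G(x))\bigr)\bigr] \\
\mathcal{L}_{\mathrm{cyc}}(G, F) &= \mathbb{E}_{x}\bigl[\lVert F(G(x)) - x \rVert_1\bigr]
  + \mathbb{E}_{y}\bigl[\lVert G(F(y)) - y \rVert_1\bigr] \\
\mathcal{L}(G, F, D_X, D_Y) &= \mathcal{L}_{\mathrm{GAN}}(G, D_Y)
  + \mathcal{L}_{\mathrm{GAN}}(F, D_X)
  + \lambda\, \mathcal{L}_{\mathrm{cyc}}(G, F)
\end{align}
```

The adversarial terms push converted data toward the target distribution, while the cycle term is what allows part of the input information to be preserved without paired samples.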

Abstract (English)

Description type

Other

Description

Recently, voice conversion (VC) based on deep learning has achieved remarkable performance. However, it is still difficult to train a mapping model using nonparallel training samples. In this work, we propose a high-quality nonparallel VC training method based on CycleGAN. CycleGAN is a kind of generative adversarial network (GAN) originally developed for unpaired image-to-image translation. It can be trained without paired samples: part of the input information is preserved while the distribution of the input data is converted into the target distribution. Experimental results show that the proposed method outperforms a standard GAN-based parallel VC system.
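The idea of preserving part of the input while converting its distribution can be illustrated with a minimal NumPy sketch. It only evaluates a CycleGAN-style generator loss on toy data and performs no training. Everything in it is an assumption made for illustration: the affine "generators", the logistic "discriminators", the Gaussian stand-ins for acoustic features, the feature dimension, and the weight `lam`. It is not the architecture or configuration used in the report.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 4  # hypothetical feature dimension

# Unpaired toy "features": the two sets differ in size and have no frame alignment.
x_src = rng.normal(0.0, 1.0, size=(8, DIM))  # stand-in for source-speaker frames
y_tgt = rng.normal(2.0, 1.0, size=(6, DIM))  # stand-in for target-speaker frames


def make_generator(rng):
    """Affine map standing in for a generator network (an assumption)."""
    return {"W": np.eye(DIM) + 0.1 * rng.normal(size=(DIM, DIM)), "b": np.zeros(DIM)}


def make_discriminator(rng):
    """Logistic unit standing in for a discriminator network (an assumption)."""
    return {"w": 0.1 * rng.normal(size=DIM), "c": 0.0}


def generate(g, feats):
    """Apply the affine generator to every frame."""
    return feats @ g["W"] + g["b"]


def discriminate(d, feats):
    """Probability, per frame, that the frame is real."""
    return 1.0 / (1.0 + np.exp(-(feats @ d["w"] + d["c"])))


def generator_adv_loss(d, fake):
    """Non-saturating generator loss: low when the discriminator is fooled."""
    return float(-np.mean(np.log(discriminate(d, fake) + 1e-12)))


def cycle_loss(g_fwd, g_bwd, feats):
    """L1 distance between the input and its round trip through both generators."""
    return float(np.mean(np.abs(generate(g_bwd, generate(g_fwd, feats)) - feats)))


def total_generator_loss(g_xy, g_yx, d_x, d_y, x, y, lam=10.0):
    """Adversarial terms for both directions plus the weighted cycle terms."""
    adv = generator_adv_loss(d_y, generate(g_xy, x)) + generator_adv_loss(d_x, generate(g_yx, y))
    cyc = cycle_loss(g_xy, g_yx, x) + cycle_loss(g_yx, g_xy, y)
    return adv + lam * cyc


g_xy, g_yx = make_generator(rng), make_generator(rng)
d_x, d_y = make_discriminator(rng), make_discriminator(rng)
loss = total_generator_loss(g_xy, g_yx, d_x, d_y, x_src, y_tgt)
print(f"generator loss before any training: {loss:.3f}")
```

Note that neither loss term needs a source frame aligned with a target frame: the adversarial terms compare distributions, and the cycle terms compare each input with its own round trip.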

Bibliographic record ID

Source identifier type

NCID

Source identifier

AN10442647

Bibliographic information

IPSJ SIG Technical Reports: Spoken Language Processing (SLP)

Vol. 2017-SLP-119, No. 9, pp. 1-6, issue date 2017-12-14

ISSN

Source identifier type

ISSN

Source identifier

2188-8663

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher

Information Processing Society of Japan (情報処理学会)


Cite as

Fang, Fuming, Yamagishi, Junichi, Echizen, Isao, 2017: Information Processing Society of Japan, pp. 1–6.
