情報学広場 (IPSJ Electronic Library)

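Adaptive Approach to Varying Recording Conditions in Body Transmitted Voice Conversion Based on Acoustic Compensation (音響特性補正の導入による肉伝導音声変換の収録環境適応)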

https://ipsj.ixsq.nii.ac.jp/records/56563

Name / File	License	Action
IPSJ-SLP09075002.pdf (1.2 MB)	Copyright (c) 2009 by the Information Processing Society of Japan
Open access

Item type

SIG Technical Reports(1)

Publication date

2009-01-30

Title

音響特性補正の導入による肉伝導音声変換の収録環境適応

Title (en)

Adaptive Approach to Varying Recording Conditions in Body Transmitted Voice Conversion Based on Acoustic Compensation

Language

jpn

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Resource type

technical report

Author affiliation (same for all five authors)

奈良先端科学技術大学院大学 (Nara Institute of Science and Technology)

Author affiliation (English; same for all five authors)

Graduate School of Information Science, Nara Institute of Science and Technology

Authors

宮本 大輔, 中村 圭吾, 戸田 智基, 猿渡 洋, 鹿野 清宏

Authors (English)

Daisuke Miyamoto, Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano

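Abstract (translated from the Japanese)

Description type

Other

Description

Body transmitted voice conversion is effective for improving the quality of body transmitted speech recorded with a Non-Audible Murmur (NAM) microphone. In this approach, a probabilistic model that converts body transmitted speech into air-conducted speech is trained in advance. Because the acoustic characteristics of body transmitted speech are sensitive to recording conditions such as the attachment position of the NAM microphone, a mismatch in acoustic characteristics between training and conversion often causes a large degradation in converted speech quality in practical use. To address this problem, we propose unsupervised acoustic compensation methods for body transmitted voice conversion based on the combination of Cepstrum Mean Subtraction (CMS) and Constrained Structural Maximum A Posteriori Linear Regression (CSMAPLR), or the combination of Signal Bias Removal (SBR) and CSMAPLR. Experimental results show that the proposed methods substantially reduce the degradation in converted speech quality caused by the acoustic mismatch.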

Abstract (English)

Description type

Other

Description

Body transmitted voice conversion is very effective for enhancing body transmitted speech recorded with a Non-Audible Murmur (NAM) microphone. In this method, a probabilistic model that converts body transmitted speech into natural speech is trained in advance. Because the acoustic characteristics of body transmitted speech are sensitive to recording conditions such as the location of the NAM microphone, acoustic mismatches between the training and conversion processes often cause significant degradation of the conversion performance in practical situations. To alleviate this problem, we propose unsupervised acoustic compensation methods for body transmitted voice conversion based on the combination of Cepstrum Mean Subtraction (CMS) and Constrained Structural Maximum A Posteriori Linear Regression (CSMAPLR), or the combination of Signal Bias Removal (SBR) and CSMAPLR. Experimental results demonstrate that the proposed methods significantly reduce the quality degradation of the converted speech caused by the acoustic mismatches.
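
As an illustration of the first compensation step named in the abstract, the following is a minimal sketch of Cepstrum Mean Subtraction on its own. It is not the authors' implementation, and it does not cover SBR or CSMAPLR. The function name `cepstral_mean_subtraction` and the toy data are assumptions made for this example. The idea it shows: a fixed channel effect appears as a roughly constant additive offset in the cepstral domain, so subtracting the per-utterance mean of each cepstral dimension removes that offset.

```python
from typing import List


def cepstral_mean_subtraction(frames: List[List[float]]) -> List[List[float]]:
    """Subtract the per-dimension mean, taken over all frames of one utterance.

    `frames` is a list of cepstral vectors (one per frame), all of equal length.
    This is a hypothetical helper written for illustration only.
    """
    if not frames:
        return []
    n_frames = len(frames)
    n_dims = len(frames[0])
    # Mean of each cepstral dimension across the utterance.
    means = [sum(f[d] for f in frames) / n_frames for d in range(n_dims)]
    # Remove the mean from every frame.
    return [[f[d] - means[d] for d in range(n_dims)] for f in frames]


# Toy data (made up): the same sequence with and without a constant offset,
# standing in for a change in recording conditions.
clean = [[1.0, 2.0], [3.0, 0.0], [2.0, 4.0]]
offset = [0.5, -1.0]
shifted = [[c + o for c, o in zip(frame, offset)] for frame in clean]

normalized_clean = cepstral_mean_subtraction(clean)
normalized_shifted = cepstral_mean_subtraction(shifted)
```

After normalization the two sequences coincide, which is why a mean-subtraction step can reduce a constant mismatch between training and conversion data. Mismatches that are not a constant cepstral offset are outside what this sketch handles.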

Bibliographic record ID

Source identifier type

NCID

Source identifier

AN10442647

Bibliographic information

IPSJ SIG Technical Reports: Spoken Language Processing (SLP) (情報処理学会研究報告音声言語情報処理（SLP）)

Vol. 2009, No. 10 (2009-SLP-075), pp. 7-12, issued 2009-01-30

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher (ja)

情報処理学会 (Information Processing Society of Japan)
