IPSJ Digital Library (情報学広場：情報処理学会電子図書館)

Sound Source separation based on binary masking using bayesian network (ベイジアンネットワークを用いたバイナリマスキングに基づく音源分離)

https://ipsj.ixsq.nii.ac.jp/records/56643

Name / File	License	Action
IPSJ-SLP08072010.pdf (1.9 MB)	Copyright (c) 2008 by the Information Processing Society of Japan
Open access

Item type

SIG Technical Reports(1)

Publication date

2008-07-11

Title

ベイジアンネットワークを用いたバイナリマスキングに基づく音源分離

Title (en)

Sound Source separation based on binary masking using bayesian network

Language

jpn

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Resource type

technical report

Author affiliation

Graduate School of Information Science, Nagoya University (名古屋大学大学院情報科学研究科), listed for each of the five authors

Authors

Hiroaki Itou (伊藤 弘章), Yasunori Ohishi (大石 康智), Chiyomi Miyajima (宮島 千代美), Norihide Kitaoka (北岡 教英), Kazuya Takeda (武田 一哉)

Abstract

Description type

Other

Description

In this study, we propose a single-channel sound source separation method for speech mixed with music. Under the principle of binary masking, each time-frequency bin of the mixture can be assumed to be dominated by the one source whose power is highest among all original sources at that bin. Therefore, if we determine a mask pattern that selectively retains the components dominated by the target signal and masks out the others, we can separate the target signal from the mixture. However, since the original sources are unknown, this mask pattern must be estimated optimally for each source. We therefore assume dependencies among neighboring time-frequency components and propose a probabilistic mask estimation method using Bayesian networks. To confirm the effectiveness of the proposed method, we conducted source separation experiments on mixtures of speech and nonstationary music at six different SNR levels, and evaluated both the accuracy of the estimated mask pattern in selecting the target components and the resulting sound quality. Both mask accuracy and sound quality were better than those of a conventional method that uses a Bayesian classifier.
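The binary masking principle and the mask accuracy measure described in the abstract can be illustrated with a minimal sketch. It does not implement the proposed Bayesian network estimator. It only builds an oracle ("ideal") mask from source powers that are assumed to be known, which is exactly the information the abstract says is unavailable in practice. All function names and the toy power values are hypothetical, and the mixture is assumed, for illustration only, to be the bin-wise sum of the source powers.

```python
from typing import List

Spectrogram = List[List[float]]  # [frame][frequency bin] power values
Mask = List[List[int]]           # 1 = keep the bin, 0 = mask it out


def ideal_binary_mask(target: Spectrogram, interference: Spectrogram) -> Mask:
    """Oracle mask: 1 where the target has the higher power in a bin, else 0."""
    return [
        [1 if t > i else 0 for t, i in zip(t_frame, i_frame)]
        for t_frame, i_frame in zip(target, interference)
    ]


def apply_mask(mixture: Spectrogram, mask: Mask) -> Spectrogram:
    """Retain the mixture bins selected by the mask and zero the others."""
    return [
        [x * m for x, m in zip(x_frame, m_frame)]
        for x_frame, m_frame in zip(mixture, mask)
    ]


def mask_accuracy(estimated: Mask, reference: Mask) -> float:
    """Fraction of time-frequency bins where the two masks agree."""
    total = sum(len(frame) for frame in reference)
    correct = sum(
        e == r
        for e_frame, r_frame in zip(estimated, reference)
        for e, r in zip(e_frame, r_frame)
    )
    return correct / total


# Toy power spectrograms: 2 frames x 3 frequency bins (made-up numbers).
speech = [[4.0, 0.5, 3.0], [0.2, 2.5, 0.1]]
music = [[1.0, 2.0, 0.5], [1.5, 0.5, 0.4]]
# Assumption for illustration: source powers add bin by bin in the mixture.
mixture = [[s + m for s, m in zip(sf, mf)] for sf, mf in zip(speech, music)]

ibm = ideal_binary_mask(speech, music)   # oracle mask for the speech source
separated = apply_mask(mixture, ibm)     # speech-dominated bins only

# A hypothetical estimated mask that gets one of the six bins wrong.
estimated = [[1, 0, 0], [0, 1, 0]]
accuracy = mask_accuracy(estimated, ibm)
```

In this toy case the oracle mask keeps three of the six bins, and the hypothetical estimated mask agrees with it on five of six bins. An estimator such as the one the abstract proposes would have to produce `estimated` from the mixture alone.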

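Bibliographic record ID

Source identifier type

NCID

Source identifier

AN10442647

Bibliographic information

IPSJ SIG Technical Reports: Spoken Language Processing (SLP) (情報処理学会研究報告音声言語情報処理（SLP）)

Vol. 2008, No. 68 (2008-SLP-072), pp. 51-56, issued 2008-07-11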

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher

Information Processing Society of Japan (情報処理学会)

Versions

Ver.1

2025-01-22 04:52:43.654311
