Speech source separation based on deep learning with spatial model

Togami, Masahito (戸上, 真人)

https://ipsj.ixsq.nii.ac.jp/records/209746

Name / File	License	Action
IPSJ-SLP21136008.pdf (1.8 MB)	Copyright (c) 2021 by the Institute of Electronics, Information and Communication Engineers. This SIG report is available only to members of the SIG.	SLP: members ¥0, DLIB: members ¥0

Item type

SIG Technical Reports(1)

Publication date

2021-02-24

Title

空間モデルを考慮した深層学習ベースの音源分離

Title (English)

Speech source separation based on deep learning with spatial model

Language

jpn

Keywords

Subject scheme

Other

Subject

Invited talk

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Resource type

technical report

Author name

戸上, 真人

Author name (English)

Togami, Masahito

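Abstract

Description type

Other

Description

Deep-learning-based source separation has advanced remarkably, but the neural network (NN) is often trained independently of the spatial model. This leaves open the question of whether an NN trained in such a configuration is truly optimal when source separation is performed using the spatial model. This talk outlines the line of research on conventional statistical-model-based source separation and on deep-learning-based source separation. It then introduces four directions that the authors have recently pursued for incorporating a spatial model into deep-learning-based source separation and training the NN with the spatial model taken into account: 1) NN loss functions that account for the effect of the spatial model, 2) embedding spatial-model-based source separation within the NN structure, 3) a framework that uses direction-of-arrival information of the desired source as an attractor to estimate the parameters needed for source separation, and 4) unsupervised NN training that uses a statistical-model-based source separation method as a pseudo-supervision signal generator.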

Abstract (English)

Description type

Other

Description

Recently, deep-learning-based speech source separation has evolved rapidly. A neural network (NN) is usually trained independently of a spatial model. However, it remains an open research question whether an NN trained in such a configuration is really optimal when speech source separation is performed with the spatial model. In this talk, I will introduce conventional statistical-model-based speech source separation and deep-learning-based speech source separation. After that, I will introduce four research directions that incorporate a spatial model into the NN structure, i.e., 1) a loss function of the NN that considers the spatial model, 2) insertion of speech source separation with the spatial model into the NN structure, 3) an NN framework that estimates parameters for speech source separation with a direction-of-arrival attractor, and 4) unsupervised learning of the NN that utilizes statistical-model-based speech source separation as a pseudo clean signal generator.

Bibliographic record ID

Source identifier type

NCID

Source identifier

AN10442647

Bibliographic information

研究報告音声言語情報処理（SLP） (Research Reports: Spoken Language Processing)

Vol. 2021-SLP-136, No. 8, pp. 1-6, issued 2021-02-24

ISSN

Source identifier type

ISSN

Source identifier

2188-8663

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher

Information Processing Society of Japan (情報処理学会)
