音声認識を用いた空耳自動生成の検討 (A Study of Automatic Soramimi Generation by Speech Recognition)

半田, 尚暉; 植村, あい子; 吉田, 典正; Naoki, Handa; Aiko, Uemura; Norimasa, Yoshida

https://ipsj.ixsq.nii.ac.jp/records/232825

Name / File	License	Action
IPSJ-MUS24139008.pdf (587.0 kB)	Copyright (c) 2024 by the Information Processing Society of Japan
Open access

Item type

SIG Technical Reports(1)

Publication date

2024-03-02

Title

音声認識を用いた空耳自動生成の検討

Title (English)

A Study of Automatic Soramimi Generation by Speech Recognition

Language

jpn

Keywords

Subject scheme

Other

Subject

Applied systems (応用システム)

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Resource type

technical report

Author affiliation

日本大学大学院生産工学研究科マネジメント工学専攻

Author affiliation

日本大学生産工学部マネジメント工学科

Author affiliation

日本大学生産工学部マネジメント工学科

Author affiliation (English)

Department of Industrial Engineering and Management, Graduate School of Industrial Technology, Nihon University

Author affiliation (English)

Department of Industrial Engineering and Management, College of Industrial Technology, Nihon University

Author affiliation (English)

Department of Industrial Engineering and Management, College of Industrial Technology, Nihon University

Author names

半田, 尚暉
植村, あい子
吉田, 典正

Author names (English)

Naoki, Handa
Aiko, Uemura
Norimasa, Yoshida

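Abstract

Description type

Other

Description

This study focuses on the phenomenon called "soramimi," in which speech is heard as words whose meaning differs from that of the original words. Assuming that soramimi arises when a foreign language is misheard as one's native language, we address the automatic generation of soramimi that takes into account the nonverbal information contained in the audio, using speech recognition tools trained on a language (Japanese) different from the input language (for example, English). Specifically, we apply several speech recognition tools with Japanese models to songs reported on the TV program "空耳アワー" (Soramimi Hour) and verify whether their output can be used as soramimi. Focusing on phonetic symbols, we defined evaluation criteria from the reported songs and evaluated closeness in pronunciation. We then selected songs that sound similar to the reported cases and ran speech recognition on them. The experimental results show that the higher the performance of a speech recognition method, the less suited it is to automatic soramimi generation. In some cases new soramimi could be created through speech recognition, which suggests that soramimi can be generated by speech recognition.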

Abstract (English)

Description type

Other

Description

This study focuses on a phenomenon known as "Soramimi," in which speech is perceived as words with a different meaning from the original words. We assume that Soramimi is caused by mishearing a foreign language as one's native language. Accordingly, we use speech recognition tools trained on a language (e.g., Japanese) different from the input language (e.g., English) to generate Soramimi while taking into account the nonverbal information contained in vocals. Specifically, we apply several speech recognition tools with Japanese models to songs reported on the TV program "Soramimi Hour" to verify whether their output is usable as Soramimi. Our analysis focuses on phonetic symbols and derives the evaluation criteria from the songs reported on Soramimi Hour. We evaluated similarity in terms of pronunciation. We then selected songs that sound similar to the reported cases and performed speech recognition on them. Experimental results show that the more robust a speech recognition method is, the less suitable it is for automatic Soramimi generation. In some cases, new Soramimi were generated through speech recognition, suggesting that Soramimi can be generated by speech recognition.
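The abstract says that closeness in pronunciation was evaluated on phonetic symbols, but it does not state the metric. As a minimal sketch only, the following assumes a normalized Levenshtein (edit) distance over sequences of phonetic symbols. The function names `edit_distance` and `phonetic_similarity` and the toy symbol sequences are hypothetical and are not taken from the report.

```python
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two sequences of phonetic symbols."""
    # prev[j] holds the distance between the first i-1 symbols of a
    # and the first j symbols of b.
    prev = list(range(len(b) + 1))
    for i, sym_a in enumerate(a, start=1):
        curr = [i]
        for j, sym_b in enumerate(b, start=1):
            cost = 0 if sym_a == sym_b else 1
            curr.append(min(prev[j] + 1,          # delete sym_a
                            curr[j - 1] + 1,      # insert sym_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def phonetic_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Similarity in [0, 1]; 1.0 means the two sequences are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty sequences are trivially identical
    return 1.0 - edit_distance(a, b) / longest


if __name__ == "__main__":
    # Toy symbol sequences for illustration only (an assumption,
    # not data from the report).
    reference = ["s", "o", "r", "a"]
    recognized = ["s", "a", "r", "a"]
    print(phonetic_similarity(reference, recognized))
```

Under this assumed metric, a recognizer output that differs from the reference in one of four symbols scores 0.75. A real evaluation would first need to convert both the original lyrics and the recognizer output to a common phonetic symbol set, which this sketch does not do.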

Bibliographic record ID

Source identifier type

NCID

Source identifier

AN10438388

Bibliographic information

研究報告音楽情報科学（MUS）

Vol. 2024-MUS-139, No. 8, pp. 1-6, issue date 2024-03-02

ISSN

Source identifier type

ISSN

Source identifier

2188-8752

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher

Information Processing Society of Japan (情報処理学会)


Versions

Ver.1

2025-01-19 10:18:32.518136
