Perceptual Hashing for Audio Information through Machine Learning

郭,承迅; 細野,海人; 宮田,高道; 宮田,純子; 木下,宏揚; Chenghsun Kuo; Kaito Hosono; Takamichi Miyata; Sumiko Miyata; Hirotsugu Kinoshita

https://ipsj.ixsq.nii.ac.jp/records/2000725

Name / File	License	Action
IPSJ-IOT25068039.pdf (2.1 MB) Available for download from December 31, 2999.	Copyright (c) 2025 by the Institute of Electronics, Information and Communication Engineers This SIG report is only available to those in membership of the SIG.
IOT: members: ¥0, DLIB: members: ¥0

Item type

SIG Technical Reports(1)

Publication date

2025-02-24

Title

機械学習による音声情報のための知覚ハッシュ

Title (English)

Perceptual Hashing for Audio Information through Machine Learning

Language

jpn

Keywords

Subject scheme

Other

Subject

SITE

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Resource type

technical report

Author affiliation

神奈川大学工学部電気電子情報工学科

Author affiliation

神奈川大学工学部電気電子情報工学科

Author affiliation

千葉工業大学先進工学部知能メディア工学科

Author affiliation

東京科学大学工学院

Author affiliation

神奈川大学工学部電気電子情報工学科

Author affiliation (English)

Department of Electrical, Electronic and Information Engineering, Faculty of Engineering, Kanagawa University

Author affiliation (English)

Department of Electrical, Electronic and Information Engineering, Faculty of Engineering, Kanagawa University

Author affiliation (English)

Department of Advanced Media, Faculty of Advanced Engineering, Chiba Institute of Technology

Author affiliation (English)

School of Engineering, Institute of Science Tokyo

Author affiliation (English)

Department of Electrical, Electronic and Information Engineering, Faculty of Engineering, Kanagawa University

Author names

郭,承迅
細野,海人
宮田,高道
宮田,純子
木下,宏揚

Author names (English)

Chenghsun Kuo
Kaito Hosono
Takamichi Miyata
Sumiko Miyata
Hirotsugu Kinoshita

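Abstract

Description type

Other

Description

Copyright management systems need a message digest, which can be regarded as a fingerprint of the information, so that audio can be checked for modification. However, cryptographic hash functions alone are insufficient for handling processed or edited content, and perceptual hashing becomes a key enabling technology. A message digest based on a perceptual hash outputs equal hash values whenever humans can still recognize the audio as the same original, even after it has been processed or edited, which makes it well suited to copyright management. Furthermore, hash values based on content identity can be generated even when the same speaker utters different words or when different speakers utter the same word. In addition, content such as papers, books, and audio collections contains multiple different audio recordings, and generating the same hash value collectively for every recording within a piece of content makes that content easier to manage. In this study, we propose a perceptual hash generation method that uses the Wav2Vec2 model. By exploiting the model's ability to process audio waveforms directly, the method achieves both robustness to editing and speaker/content identity better than conventional CNN-based methods.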
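The contrast the abstract draws between cryptographic and perceptual hashes can be illustrated with a minimal Python sketch. This is not the Wav2Vec2-based method proposed in the report: the perceptual hash here is a toy frame-energy scheme chosen only for illustration, and the names `synth`, `crypto_digest`, `toy_perceptual_hash`, and `FRAME` are hypothetical.

```python
import hashlib
import math
import struct

FRAME = 400  # samples per frame (an arbitrary, hypothetical choice)

def synth(amplitudes, freq=440.0, rate=8000):
    """Build a test signal: one FRAME-long sine burst per amplitude value."""
    return [a * math.sin(2 * math.pi * freq * t / rate)
            for a in amplitudes for t in range(FRAME)]

def crypto_digest(samples):
    """SHA-256 over the raw sample bytes; any changed bit alters the digest."""
    raw = struct.pack(f"<{len(samples)}d", *samples)
    return hashlib.sha256(raw).hexdigest()

def toy_perceptual_hash(samples):
    """One bit per adjacent frame pair: 1 if the frame energy rises, else 0."""
    energies = [sum(x * x for x in samples[i:i + FRAME])
                for i in range(0, len(samples), FRAME)]
    return "".join("1" if b > a else "0"
                   for a, b in zip(energies, energies[1:]))

original = synth([0.2, 0.8, 0.5, 0.9, 0.1, 0.6, 0.3, 0.7, 0.4])
quieter = [0.5 * x for x in original]  # a simple edit: halve the volume

print(crypto_digest(original) == crypto_digest(quieter))              # False
print(toy_perceptual_hash(original) == toy_perceptual_hash(quieter))  # True
```

Halving the volume changes every nonzero sample, so the SHA-256 digests differ, while the rise/fall pattern of frame energies is unchanged and the toy hash stays the same. A learned model would replace the hand-made energy feature in a realistic system.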

Abstract (English)

Description type

Other

Description

For copyright management systems, message digests, which can be considered fingerprints of information, are necessary to inspect whether audio has been modified. Message digests based on perceptual hashes are suitable for copyright management because they produce equal hash values for audio that humans can recognize as the same original sound, even if the audio is processed or edited. Furthermore, contents such as papers, books, and audio collections contain several different audio recordings. If the same hash value is generated for each recording in the content, it becomes easier to manage the content. In this study, we propose a method for generating perceptual hashes using CNN weight coefficients.

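Bibliographic record ID

Source identifier type

NCID

Source identifier

AA12326962

Bibliographic information

研究報告インターネットと運用技術（IOT） (IPSJ SIG Technical Reports: Internet and Operation Technology)

Vol. 2025-IOT-68, No. 39, pp. 1-7, issued 2025-02-24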

ISSN

Source identifier type

ISSN

Source identifier

2188-8787

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher

情報処理学会 (Information Processing Society of Japan)

Versions

Ver.1

2025-02-19 10:32:01.683324
