A Study on Transcription and Summarization Methods for Digital Terrestrial Television Content Using Audio and Subtitle Data

阿達,藍留; 塚越,柚季; 大向,一輝

https://ipsj.ixsq.nii.ac.jp/records/2006801

Name / File	License	Action
IPSJ-CH26140028.pdf (1.0 MB) Available for download from January 25, 2028.	Copyright (c) 2026 by the Information Processing Society of Japan
Non-members: ¥660, IPSJ members: ¥330, CH members: ¥0, DLIB members: ¥0

Item type

SIG Technical Reports(1)

Publication date

2026-01-25

Title (original)

地上デジタル放送における音声と字幕データを活用した放送内容のテキスト化と要約手法の検討

Title (English)

A Study on Transcription and Summarization Methods for Digital Terrestrial Television Content Using Audio and Subtitle Data

Language

jpn

Resource type

technical report

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Author affiliation

The University of Tokyo (all three authors)

Authors

阿達,藍留
塚越,柚季
大向,一輝

Abstract

Description type

Other

Description

This study proposes a method that converts broadcast content into text using the audio and subtitle data of Japanese digital terrestrial television, then applies large language models (LLMs) to extract keywords and generate summaries. AI-based speech transcription is easy to synchronize with video through timestamps, but its recognition accuracy remains a problem for proper nouns (such as personal and place names) and for homophones. Subtitle data, by contrast, is entered mainly by professional operators and is therefore highly accurate as text; in live broadcasts, however, it can lag because of the input work, and sentences can be cut off or dropped because of commercials and airtime constraints. The study therefore uses LLMs to integrate the two sources so that each compensates for the other's weaknesses, with the aim of improving transcription accuracy. Key terms that aid understanding of the broadcast content are then extracted from the integrated text, and summaries are generated. The method is expected to enable efficient analysis of broadcast content and to contribute to richer metadata and better searchability for video materials in digital archives.
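The abstract does not say how the two sources are lined up before the LLM sees them. As a minimal sketch only, assuming both sources arrive as timestamped segments and that live captions lag the audio by a roughly constant delay, the pairing step could look like the following. The names `Segment`, `pair_segments`, `build_prompt`, and `caption_delay` are hypothetical and do not come from the report, the example sentences are invented, and the prompt wording is a placeholder, not the authors' prompt.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the programme
    end: float
    text: str


def overlap(a: Segment, b: Segment, shift: float = 0.0) -> float:
    """Overlap in seconds between a and b after moving b earlier by `shift`."""
    return max(0.0, min(a.end, b.end - shift) - max(a.start, b.start - shift))


def pair_segments(asr, captions, caption_delay=0.0):
    """Attach to each ASR segment the caption segments that overlap it in time.

    `caption_delay` is an assumed constant lag of the captions behind the
    audio; a real broadcast would likely need a per-programme estimate.
    """
    pairs = []
    for a in asr:
        matched = [c for c in captions if overlap(a, c, caption_delay) > 0.0]
        pairs.append((a, matched))
    return pairs


def build_prompt(pairs) -> str:
    """Format the paired text as a correction request for an LLM."""
    lines = [
        "Correct each ASR line using the caption text where available.",
        "Keep the ASR timestamps. If no caption is given, keep the ASR text.",
        "",
    ]
    for a, matched in pairs:
        caption_text = " ".join(c.text for c in matched) or "(no caption)"
        lines.append(
            f"[{a.start:.1f}-{a.end:.1f}] ASR: {a.text} | CAPTION: {caption_text}"
        )
    return "\n".join(lines)


# Invented example: the caption for the first sentence arrives 3 s late,
# and the second sentence has no caption at all (e.g. cut off by a commercial).
asr = [
    Segment(0.0, 4.0, "the mayor of kyoto spoke today"),
    Segment(4.0, 8.0, "rain is expected tomorrow"),
]
captions = [Segment(3.0, 7.0, "The mayor of Kyoto spoke today.")]

pairs = pair_segments(asr, captions, caption_delay=3.0)
prompt = build_prompt(pairs)
print(prompt)
```

The timestamps stay with the ASR side because, per the abstract, that is the source that synchronizes easily with video; the captions contribute only text. Segments with no matching caption are passed through marked as such, so the model is not asked to invent a correction.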

Bibliographic record ID

Source identifier type

NCID

Source identifier

AN1010060X

Bibliographic information

研究報告人文科学とコンピュータ（CH） (IPSJ SIG Technical Reports, Computers and the Humanities)

Vol. 2026-CH-140, No. 28, pp. 1-6, issued 2026-01-25

ISSN

Source identifier type

ISSN

Source identifier

2188-8957

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher

Information Processing Society of Japan (情報処理学会)

Versions

Ver.1

2026-01-15 04:25:02.579747
