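Automatic Extraction of Important Sentences from Lecture Transcription using Statistical Measure based on Discourse Markers and Topic Words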

Tasuku Kitade; Hiroaki Nanjo; Tatsuya Kawahara; Hiroshi G. Okuno


https://ipsj.ixsq.nii.ac.jp/records/57234

Name / File	License
IPSJ-SLP03046002.pdf (114.4 kB)	Copyright (c) 2003 by the Information Processing Society of Japan
Open access

Item type

SIG Technical Reports(1)

Publication date

2003-05-27

Title

談話標識と話題語に基づく統計的尺度による講演からの重要文抽出

Title (English)

Automatic Extraction of Important Sentences from Lecture Transcription using Statistical Measure based on Discourse Markers and Topic Words

Language

jpn

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Resource type

technical report

Author affiliation (all four authors)

School of Informatics, Kyoto University (京都大学情報学研究科知能情報学専攻)

Authors

Tasuku Kitade (北出 祐)
Hiroaki Nanjo (南條 浩輝)
Tatsuya Kawahara (河原 達也)
Hiroshi G. Okuno (奥乃 博)

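Abstract (translated from Japanese)

Description type

Other

Description

With the aim of building digital archives of lectures (academic conference presentations), we propose a discourse-marker-based method that exploits the topic structure characteristic of conference presentations to automatically extract important sentences from transcripts (speech recognition results). Section boundary candidates in spoken language are detected from pause information and linguistic information, the discourse markers that occur frequently in section-initial sentences are identified, and a statistical importance measure is defined on that basis. We also examine combining this measure with an importance measure based on the statistics of topic words (keywords). Evaluating the accuracy of important sentence extraction with these measures on 14 conference presentations from the CSJ, we confirmed that (1) the discourse-marker-based method is effective and (2) combining it with the topic-word-based method yields a synergistic effect.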

Abstract (English)

Description type

Other

Description

For efficient access to speech media, secondary information is required. We explore automatic extraction of important sentences from lecture presentations. We segment a lecture into units and extract key sentences based on the discourse structure. To detect the boundaries of the units, we make use of pause information and linguistic information. We also incorporate another extraction method based on topic-dependent keywords. We evaluate the proposed methods and their combination on 14 lecture transcriptions. It is confirmed that the use of section boundary information and its combination with the keyword-based method are effective.
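
As a rough illustration of the pipeline the abstract outlines (pause-based section boundaries, discourse-marker weights learned from section-initial sentences, and a keyword measure combined with them), the following Python sketch may help. It is not the authors' implementation: the pause threshold `min_pause`, the smoothed log-ratio used for marker weights, the tf-idf keyword weights, the interpolation weight `lam`, and all function names are assumptions made for illustration, and the linguistic cues the report uses for boundary detection are omitted.

```python
import math
from collections import Counter


def section_starts(pauses, min_pause=1.0):
    """Indices of sentences preceded by a long pause (sentence 0 always counts).

    pauses[i] is the pause in seconds before sentence i; the threshold is an
    assumed value, not one taken from the report.
    """
    return {i for i, p in enumerate(pauses) if i == 0 or p >= min_pause}


def marker_weights(sentences, starts):
    """Assumed measure: smoothed log-ratio of how much likelier a word is to
    occur in a section-initial sentence than in any sentence (floored at 0)."""
    in_start, in_any = Counter(), Counter()
    for i, words in enumerate(sentences):
        for w in set(words):
            in_any[w] += 1
            if i in starts:
                in_start[w] += 1
    weights = {}
    for w, c in in_start.items():
        p_start = (c + 1) / (len(starts) + 2)
        p_any = (in_any[w] + 1) / (len(sentences) + 2)
        weights[w] = max(0.0, math.log(p_start / p_any))
    return weights


def keyword_weights(sentences, background_docs):
    """Assumed measure: tf-idf of each word in this lecture against a
    background collection given as a list of word sets."""
    tf = Counter(w for words in sentences for w in words)
    n_docs = len(background_docs) + 1  # +1 for the lecture itself
    weights = {}
    for w, f in tf.items():
        df = 1 + sum(1 for doc in background_docs if w in doc)
        weights[w] = f * math.log(n_docs / df)
    return weights


def normalise(scores):
    """Scale scores so the largest becomes 1 (all zeros stay zero)."""
    top = max(scores) if scores else 0.0
    return [s / top if top > 0 else 0.0 for s in scores]


def rank_sentences(sentences, mw, kw, lam=0.5):
    """Return sentence indices ordered by the interpolated importance score."""
    m = normalise([sum(mw.get(w, 0.0) for w in set(s)) for s in sentences])
    k = normalise([sum(kw.get(w, 0.0) for w in set(s)) for s in sentences])
    combined = [lam * a + (1 - lam) * b for a, b in zip(m, k)]
    return sorted(range(len(sentences)), key=lambda i: combined[i], reverse=True)
```

In practice the marker weights would presumably be estimated on separate training lectures and then applied to unseen ones; setting `lam` to 1 or 0 gives the marker-only or keyword-only ranking for comparison.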

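Bibliographic record ID

Source identifier type

NCID

Source identifier

AN10442647

Bibliographic information

IPSJ SIG Technical Reports: Spoken Language Processing (SLP) (情報処理学会研究報告音声言語情報処理)

Vol. 2003, No. 58 (2003-SLP-046), pp. 7-12, issued 2003-05-27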

Notice

SIG Technical Reports are non-refereed and hence may later appear in journals, conferences, symposia, etc.

Publisher

Information Processing Society of Japan (情報処理学会)
