Structuring and analyzing the results of large-scale OCR text data obtained using a lightweight layout recognition model

https://ipsj.ixsq.nii.ac.jp/records/2006234
affd86b9-9011-4cf8-b450-6626bdbcdbad
File IPSJ-CH2025059.pdf (1.6 MB)
Available for download from 2026-12-13.
Copyright (c) 2025 by the Information Processing Society of Japan
Price Non-members: ¥660, IPSJ members: ¥330, CH members: ¥0, DLIB members: ¥0
Item type Symposium(1)
Publication date 2025-12-06
Title (ja) 軽量なレイアウト認識モデルを活用した 大規模なOCRテキストデータの構造化及び成果物の分析
Title (en) Structuring and analyzing the results of large-scale OCR text data obtained using a lightweight layout recognition model
Language jpn
Keywords (subject scheme: Other) OCR; structuring; text data; large-scale data resources; layout recognition
Resource type conference paper
Resource type identifier http://purl.org/coar/resource_type/c_5794
Authors
青池,亨 (Toru Aoike), National Diet Library
木下,貴文 (Takafumi Kinoshita), National Diet Library
Abstract
The National Diet Library (NDL) has made a considerable effort both to use optical character recognition (OCR) in converting its collection into digital text and to develop OCR technology. Given the sheer volume of material involved, large-scale reprocessing of items that had already undergone OCR remained a significant challenge, even as more advanced methods yielding better results became available. This study, conducted on materials whose copyright protection had expired, shows that the usability of large volumes of existing OCR text data can be improved in a fast and resource-efficient manner by applying post-processing with a lightweight layout recognition model. The results were also applied in developing an experimental feature added to the Next Digital Library: a text mode that displays only the structured text data.
Bibliographic information じんもんこん2025論文集 (Jinmonkon 2025 Proceedings), vol. 2025, pp. 431-436 (6 pages), issued 2025-12-06
Publisher 情報処理学会 (Information Processing Society of Japan)