Fast Search Algorithm for Sequences of Hierarchically Structured Data (階層構造データ列の簡易な高速検索アルゴリズム)

村田, 真樹; 内山, 将夫; 金丸, 敏幸; 井佐原, 均; Masaki, MURATA; Masao, UTIYAMA; Toshiyuki, KANAMARU; Hitoshi, ISAHARA

https://ipsj.ixsq.nii.ac.jp/records/48011

Name / File	License	Action
IPSJ-NL05168003.pdf (1.2 MB)	Copyright (c) 2005 by the Information Processing Society of Japan
Open access

Item type

SIG Technical Reports(1)

Publication date

2005-07-22

Title

階層構造データ列の簡易な高速検索アルゴリズム

Title (English)

Fast Search Algorithm for Sequences of Hierarchically Structured Data

Language

jpn

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Resource type

technical report

Author affiliation

独立行政法人情報通信研究機構

Author affiliation

独立行政法人情報通信研究機構

Author affiliation

京都大学大学院人間・環境学研究科; 独立行政法人情報通信研究機構

Author affiliation

独立行政法人情報通信研究機構

Author affiliation (English)

National Institute of Information and Communications Technology

Author affiliation (English)

National Institute of Information and Communications Technology

Author affiliation (English)

Graduate School of Human and Environmental Studies, Kyoto University; National Institute of Information and Communications Technology

Author affiliation (English)

National Institute of Information and Communications Technology

Author name

村田, 真樹

Author name (English)

Masaki, MURATA

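Abstract

Description type

Other

Description

This report describes a simple algorithm for fast search over sequences of hierarchically structured data, such as the morphological information in the Kyoto Text Corpus, where each item carries a part of speech (POS), a POS subcategory, and the word itself. For each data item, the algorithm builds a unit in which the data of a lower level of abstraction is sandwiched between two copies of the same data of a higher level of abstraction. These units are concatenated into a single text, which is stored in a database. Retrieval from the database uses a suffix array. In experiments, the proposed method was up to 194 times faster than the comparison method and 24 times faster on average. The algorithm also applies to other forms of hierarchical structure. One possible application is Web text: each word is annotated with hierarchical semantic information, such as low-level, mid-level, and high-level semantic labels, and the method is applied to the annotated data. A search key that chains "low-level semantic label: administrative organ", "word: の (of)", and "mid-level semantic label: technique" then becomes searchable in Web text, which gives a more general and convenient search than conventional Web search. Demand for Web search is generally high, and the algorithm can be used for such high-demand tasks.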

Abstract (English)

Description type

Other

Description

We developed an algorithm for quickly searching sequences of hierarchically structured data, such as the Kyoto Text Corpus, where each word includes information on its part of speech (POS), minor POS, and the word itself. Using our method, we first make a data item where each data item in a lower level is surrounded by two data items in a higher level. We then connect these data items to make a long string and store the string in a database. We use suffix arrays to answer queries on the database. Our experiments showed that our method was 450 times faster than a conventional method at fastest and 64 times faster on average. Our method can be used for other kinds of hierarchically structured data, such as Web applications. Methods that can be used on such data are in high demand. For example, our method can be used to retrieve Web text that includes hierarchical information of low, middle, and high semantic levels. If we use our method for such Web text, we can query using the terms "Middle semantic level: technique", "Word: of", and "Low semantic level: administrative organ"; in other words, our retrieval method is more useful and convenient than conventional Web retrieval.
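
The encoding and lookup described in the abstract can be sketched roughly as follows, in Python. This is an illustration under stated assumptions, not the authors' implementation: the symbol format (`P:`, `S:`, `W:` prefixes for POS, minor POS, and word), the naive suffix-array construction, all function names, the toy tokens, and the restriction that interior query elements must be fully specified are hypothetical choices made only for this sketch.

```python
from typing import List, Sequence, Tuple

# Levels from high to low abstraction: POS, minor POS, word (assumed labels).
LEVELS = ("P", "S", "W")
Token = Tuple[str, str, str]  # (pos, minor_pos, word)


def symbols(path: Sequence[str]) -> List[str]:
    """Turn a (possibly partial) path such as ("noun", "common") into symbols."""
    return [f"{lv}:{v}" for lv, v in zip(LEVELS, path)]


def encode_token(token: Sequence[str]) -> List[str]:
    """Sandwich the lowest level between the higher ones: P S W S P."""
    down = symbols(token)
    return down + down[-2::-1]


def encode(tokens: Sequence[Token]) -> List[str]:
    """Concatenate all sandwiched units into one symbol sequence."""
    text: List[str] = []
    for tok in tokens:
        text.extend(encode_token(tok))
    return text


def build_suffix_array(text: List[str]) -> List[int]:
    """Naive construction by sorting suffixes; fine for a small demo only."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find(text: List[str], sa: List[int], pattern: List[str]) -> List[int]:
    """Binary-search the suffix array for every occurrence of pattern."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-symbol prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start, hi = lo, len(sa)
    while lo < hi:  # first suffix whose m-symbol prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


def build_pattern(query: Sequence[Sequence[str]]) -> List[str]:
    """First element -> end of its unit, last element -> start of its unit.

    Assumption of this sketch: interior elements are fully specified.
    """
    if len(query) < 2:
        raise ValueError("this sketch expects at least two query elements")
    first, *middle, last = query
    pattern = symbols(first)[::-1]
    for path in middle:
        if len(path) != len(LEVELS):
            raise ValueError("interior elements must specify every level")
        pattern += encode_token(path)
    return pattern + symbols(last)


def search(text: List[str], sa: List[int], query) -> List[int]:
    """Return the token index at which each match of the query starts."""
    unit = 2 * len(LEVELS) - 1  # symbols per token
    return [pos // unit for pos in find(text, sa, build_pattern(query))]


tokens = [
    ("noun", "common", "ministry"), ("particle", "case", "of"),
    ("noun", "common", "technique"), ("verb", "base", "improve"),
    ("noun", "proper", "Kyoto"), ("particle", "case", "of"),
    ("noun", "common", "history"),
]
text = encode(tokens)
sa = build_suffix_array(text)

# Any noun, then the word "of", then any common noun.
print(search(text, sa, [("noun",), ("particle", "case", "of"), ("noun", "common")]))  # [0, 4]
# A proper noun followed by any particle.
print(search(text, sa, [("noun", "proper"), ("particle",)]))  # [4]
```

Because every unit ends with its highest-level symbol and the next unit begins with one, a query that mixes levels still maps to one contiguous symbol string, which is what lets a single suffix-array lookup answer it in this sketch. Queries whose interior elements are only partially specified would need additional handling that the abstract does not describe.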

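Bibliographic record ID

Source identifier type

NCID

Source identifier

AN10115061

Bibliographic information

情報処理学会研究報告自然言語処理（NL） (IPSJ SIG Technical Reports, Natural Language Processing)

Vol. 2005, No. 73 (2005-NL-168), pp. 13-20, issued 2005-07-22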

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher

情報処理学会 (Information Processing Society of Japan)

Versions

Ver.1

2025-01-22 08:42:39.286967
