@techreport{oai:ipsj.ixsq.nii.ac.jp:00019128, author = {市川, 宙 and 橋本, 泰一 and 徳永, 健伸 and 田中, 穂積 and Ichikawa, Hiroshi and Hashimoto, Taiichi and Tokunaga, Takenobu and Tanaka, Hozumi}, issue = {42(2005-DBS-136)}, month = {May}, note = {This paper proposes a method for retrieving, from a corpus annotated with syntax trees, sentences whose syntactic structure is similar to that of a query sentence. Tree Kernel, proposed by Collins, is an established measure of syntactic similarity. However, computing Tree Kernel similarity is time-consuming, so retrieval speed becomes a problem when it is applied to similar-sentence retrieval. The usual way to shorten retrieval time is to index the search target in advance, but the nature of Tree Kernel makes such indexing difficult. We therefore propose two new, fast algorithms that approximate Tree Kernel: Tree Overlapping and Subpath Set. Unlike Tree Kernel, these algorithms allow the search target to be indexed, which enables fast retrieval. This paper describes the three algorithms (Tree Kernel, Tree Overlapping, and Subpath Set), presents experimental results, and compares them.}, title = {テキスト構文構造類似度を用いた類似文検索手法 (A Similar-Sentence Retrieval Method Using Text Syntactic Structure Similarity)}, year = {2005} }