2024-03-29T08:16:17Zhttps://ipsj.ixsq.nii.ac.jp/ej/?action=repository_oaipmhoai:ipsj.ixsq.nii.ac.jp:000729142023-04-27T10:00:04Z01164:02592:06240:06332
圧縮テキスト上での<i>q</i>-gram非重複頻度の効率的な計算とその応用enghttp://id.nii.ac.jp/1001/00072914/Technical Reporthttps://ipsj.ixsq.nii.ac.jp/ej/?action=repository_action_common_download&item_id=72914&item_no=1&attribute_id=1&file_no=1Copyright (c) 2011 by the Information Processing Society of JapanDepartment of InformaticsDepartment of Electrical Engineering and Computer ScienceDepartment of InformaticsGraduate School of Information Science and Electrical Engineering Kyushu UniversityDepartment of InformaticsKeisuke, GotoNami, FukuiHideo, BannaiShunsuke, IkenagaMasayuki, TakedaIn many problems concerning text data, length-q substrings, or q-grams, can represent important characteristics of the data. Determining the frequencies of all q-grams contained in the data is an important problem with many applications. In this paper, we consider the problem of calculating the non-overlapping frequencies of all q-grams in a text represented as a straight line program (SLP). We show that the problem can be solved in O(q2n) time, where n is the size of the SLP. We also show an interesting application of the algorithm, which converts an arbitrary SLP to an SLP that is constructed based on frequency information.In many problems concerning text data, length-q substrings, or q-grams, can represent important characteristics of the data. Determining the frequencies of all q-grams contained in the data is an important problem with many applications. In this paper, we consider the problem of calculating the non-overlapping frequencies of all q-grams in a text represented as a straight line program (SLP). We show that the problem can be solved in O(q2n) time, where n is the size of the SLP. We also show an interesting application of the algorithm, which converts an arbitrary SLP to an SLP that is constructed based on frequency information.AN1009593X研究報告アルゴリズム(AL)2011-AL-13411172011-02-282011-02-22