R. A. Baeza-Yates and G. Navarro. A practical index for text retrieval allowing errors. In CLEI, volume 1, pages 273--282, November 1997.

This paper is cited in the following contexts:
Accelerating Substring Searching: Breaking the I/O Barrier - Kahveci, Singh (2002)

....large. In-memory algorithms can become impractical for string databases because the database size grows faster than the available memory capacity, and extensive memory requirements make the search techniques impractical. The number of CPU operations can be reduced by using index-based techniques [1, 3, 11, 13, 27, 28, 31, 40, 41, 45]. However, the typical size of these indexes is even larger than the size of the database. For example, the size of the SST tree [15] is 200 times larger than the database size for the parameters specified in [15]. The distance between two strings is generally defined as the minimum number of ....

....space complexity is O(n), the index size can be 7-9 times larger than the data size. This may cause a drop in performance if the index does not fit in memory. Second, the worst-case running time complexity of this technique is very high. A similar technique is proposed by Baeza-Yates and Navarro [3] for the whole substring matching problem. The authors propose to keep a dictionary that stores all possible substrings of length l that occur in s. The dictionary also keeps an array for each substring to store the starting locations of that substring in s. For a given query q with range r, the ....
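
The dictionary described in the excerpt above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `build_substring_index` and the sample string are hypothetical, and the query step is left out because the excerpt breaks off before describing it.

```python
from collections import defaultdict


def build_substring_index(s: str, l: int) -> dict[str, list[int]]:
    """Map every length-l substring of s to its starting positions in s.

    Hypothetical helper: the name and signature are illustrative only.
    """
    index: defaultdict[str, list[int]] = defaultdict(list)
    # Slide a window of width l over s; an empty range results if l > len(s).
    for start in range(len(s) - l + 1):
        index[s[start:start + l]].append(start)
    return dict(index)


# Sample data chosen for illustration only.
index = build_substring_index("abracadabra", 3)
print(index["abr"])  # [0, 7]
```

The index holds one position per window of s, which is consistent with the excerpt's remark that such indexes can grow larger than the data they cover.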


An Efficient Index Structure for String Databases - Kahveci, Singh (2001)   (10 citations)

....is too large. In-memory algorithms can become impractical for string databases because the database size grows faster than the available memory capacity, and extensive memory requirements make the search techniques impractical. The size of the index structure for the index-based techniques [2, 4, 15, 18] is even larger than the size of the database, and their performance deteriorates for long query patterns. Therefore, efficient external-memory algorithms are needed for most string comparison applications of the future. A string can be transformed into another string by using three ....

.... used to refine this distance vector (Step z2). When all rows have been searched, the disk pages corresponding to the last result set are read (Step 4). Finally, postprocessing is carried out to eliminate false retrievals (Step 5). Note that any of the distance computation techniques available [2, 3, 4, 5, 6, 11, 15, 17, 18, 20, 21, 22] can be used in the postprocessing step. As a consequence of Theorem 2 and Lemma 3 we have the following theorem. Theorem 4: The MRS index structure does not incur any false drops. We note the following about the search algorithm. 1) For each MBR, the refinement of radius is carried out ....
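
The postprocessing step mentioned in the excerpt above can be illustrated with a short sketch. It assumes the standard edit distance (insert, delete, replace, each at unit cost) and whole-string comparison against the query; the excerpt names neither, and the helper names `edit_distance` and `drop_false_retrievals` are hypothetical. It is not the MRS search algorithm itself.

```python
def edit_distance(a: str, b: str) -> int:
    """Standard dynamic-programming edit distance with unit costs (assumed)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # replace or match
        prev = curr
    return prev[-1]


def drop_false_retrievals(candidates: list[str], query: str, r: int) -> list[str]:
    """Keep only the candidates whose distance to the query is within range r."""
    return [c for c in candidates if edit_distance(c, query) <= r]


# Sample candidates chosen for illustration only.
print(drop_false_retrievals(["kitten", "mitten", "sitting"], "kitten", 1))
```

Any exact distance computation could take the place of `edit_distance` here, since this step only has to discard candidates that the index returned but that fall outside the range.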


An Efficient Index Structure for String Databases - Kahveci, Singh (2001)   (10 citations)

....[Figure 1. The growth of the number of sequences and the size of the database over years. Axes: Year; Base Pairs (millions).] String search algorithms proposed so far are all in-memory algorithms [2, 4, 5, 6, 10, 16, 17, 18, 20, 21, 22]. That is, these techniques assume that the database fits into memory. Therefore, these techniques suffer from disk I/Os when the database is too large. In-memory algorithms are impractical for this kind of application because the database size grows faster than the available memory capacity, ....

....a range refinement (Step 2b(ii)) are carried out. When all rows have been searched, the disk pages corresponding to the last result set are read (Step 2c). Finally, postprocessing is carried out to eliminate false retrievals (Step 2d). Note that any of the distance computation techniques available [2, 3, 4, 5, 6, 10, 16, 17, 18, 20, 21, 22] can be used in the postprocessing step. [Figure 8. Partitioning for query q, |q| = 1152.] As a consequence of Theorem 2 and Lemma 3 we have the following theorem. Theorem 4: The MRS index structure does not incur any false drops. We note the following about the ....


An Efficient Index Structure for String Databases - Kahveci, Singh (2001)   (10 citations)

....is too large. In-memory algorithms can become impractical for string databases because the database size grows faster than the available memory capacity, and extensive memory requirements make the search techniques impractical. The size of the index structure for the index-based techniques [2, 4, 15, 18] is even larger than the size of the database, and their performance deteriorates for long query patterns. Therefore, efficient external-memory algorithms are needed for most string comparison applications of the future. A string s1 can be transformed into another string s2 by using three edit ....

.... then used to refine this distance vector (Step 3b). When all rows have been searched, the disk pages corresponding to the last result set are read (Step 4). Finally, postprocessing is carried out to eliminate false retrievals (Step 5). Note that any of the distance computation techniques available [2, 3, 4, 5, 6, 11, 15, 17, 18, 20, 21, 22] can be used in the postprocessing step. As a consequence of Theorem 2 and Lemma 3 we have the following theorem. Theorem 4: The MRS index structure does not incur any false drops. We note the following about the search algorithm. 1) For each MBR, the refinement of radius is carried out ....
