Fast and Flexible Word Searching on Compressed Text (2000)
Edleno Silva de Moura, Gonzalo Navarro, Nivio Ziviani, Ricardo Baeza-Yates
ACM Transactions on Information Systems



View or download:
dcc.uchile.cl/~gnavarro/...tois00.ps.gz
Abstract: ... text. When searching complex or approximate patterns, our algorithms are up to 8 times faster than the search on uncompressed text. We also discuss the impact of our technique in inverted files pointing to logical blocks and argue for the possibility of keeping the text compressed all the time, decompressing only for displaying purposes.

Categories and Subject Descriptors: E.4 [Coding and Information Theory]: Data Compaction

This work has been partially supported by SIAM Project, grant...

Cited by:
Improving Web Search Efficiency via a Locality.. - de Moura, Santos, .. (2005)
Index Structures for Distributed Text Databases - Marin
Compressing Distributed Text in Parallel with (s.. - Bonacic, Farina..

Similar documents (at the sentence level):
14.1%:   Fast Searching on Compressed Text Allowing Errors - de Moura, Navarro, et al. (1998)

Active bibliography (related documents):
0.8:   Adding Compression to Block Addressing Inverted Indexes - Navarro, de Moura.. (2000)
0.5:   Compressed Pattern Matching Approximate Compressed .. - de Moura..
0.3:   Approximate Text Searching - Badino (1998)

Similar documents based on text:
2.6:   Direct Pattern Matching on Compressed Text - de Moura, Navarro, Ziviani (1998)
0.6:   An Efficient Compression Code for Text Databases - Brisaboa, Iglesias, Navarro, ..
0.4:   Syntactic Similarity of Web Documents - Jr, Ziviani (2003)

Related documents from co-citation:
4:   Filtered document retrieval with frequency-sorted indexes - Persin, Zobel et al. - 1996
4:   Compression: A key for next-generation text retrieval systems - Ziviani, de Moura et al. - 2000
4:   Adding compression to block addressing inverted indexes - Navarro, Moura et al. - 2000

BibTeX entry:

Edleno Silva de Moura, Gonzalo Navarro, Nivio Ziviani, and Ricardo Baeza-Yates. Fast and flexible word searching on compressed text. ACM Transactions on Information Systems, 18(2):113--139, April 2000. http://citeseer.ist.psu.edu/silvademoura00fast.html

@article{ demoura00fast,
    author = "Edleno {Silva de Moura} and Gonzalo Navarro and Nivio Ziviani and Ricardo Baeza-Yates",
    title = "Fast and flexible word searching on compressed text",
    journal = "ACM Transactions on Information Systems",
    volume = "18",
    number = "2",
    pages = "113--139",
    year = "2000",
    url = "citeseer.ist.psu.edu/silvademoura00fast.html" }
Citations (may not include all citations):
458   A universal algorithm for sequential data compression - Ziv, Lempel - 1977
293   Compression of individual sequences via variable-rate coding - Ziv, Lempel - 1978
196   Fast text searching allowing errors - Wu, Manber - 1992
137   Efficient string matching: an aid to bibliographic search - Aho, Corasick - 1975
121   Handbook of Algorithms and Data Structures - Gonnet, Baeza-Yates - 1991
118   Glimpse: a tool to search through entire file systems - Manber, Wu - 1993
92   Human Behaviour and the Principle of Least Effort - Zipf - 1949
85   Overview of the third text retrieval conference - Harman - 1995
81   A new approach to text searching - Baeza-Yates, Gonnet - 1992
72   A locally adaptive data compression scheme - Bentley, Sleator et al. - 1986
64   Managing Gigabytes - Witten, Moffat et al. - 1999
61   Let sleeping files lie: pattern matching in z-compressed fil.. - Amir, Benson et al. - 1996
56   A very fast substring search algorithm - Sunday - 1990
49   On the complexity of finite sequences - Ziv, Lempel - 1976
45   String matching in Lempel-Ziv compressed strings - Farach, Thorup - 1995
38   Information Retrieval - Computational and Theoretical Aspect.. - Heaps - 1978
36   A text compression scheme that allows fast searching directl.. - Manber - 1997
34   Data compression in full-text retrieval systems - Bell, Moffat et al. - 1993
27   Multiple pattern matching in LZW compressed text - Kida, Takeda et al. - 1998
26   Adding compression to a full-text retrieval system - Zobel, Moffat - 1995
25   Large text searching allowing errors - Araújo, Navarro et al. - 1997
25   A general practical approach to pattern matching over Ziv-Le.. - Navarro, Raffinot - 1999
24   Faster approximate string matching - Baeza-Yates, Navarro - 1999
24   Adding compression to block addressing inverted indices - Navarro, Moura et al. - 2000
19   Shift-And approach to pattern matching in LZW compressed tex.. - Kida, Takeda et al. - 1999
19   Generating a canonical prefix encoding - Schwartz, Kallick - 1964
17   Efficient decoding of prefix codes - Hirschberg, Lelewer - 1990
14   Word-based text compression - Moffat - 1989
14   Direct pattern matching on compressed text - Moura, Navarro et al. - 1998
14   Indexing compressed text - Moura, Navarro et al. - 1997
14   Fast searching on compressed text allowing errors - Moura, Navarro et al. - 1998
13   Constructing word-based text compression algorithms - Horspool, Cormack - 1992
9   Block addressing indices for approximate text retrieval - Baeza-Yates, Navarro - 1997
2   Fast file search using text compression - Turpin, Moffat - 1997
1   Efficient two-dimensional compressed matching - Amir, Benson - 1992
1   ão de Dados a Sistemas de Recuperação de Informação - Moura - 1999
1   and Laber - Milidiu, Pessoa - 1998
1   and Katajainen - Moffat - 1995

Documents on the same site (http://www.dcc.uchile.cl/~gnavarro/publ.html):
A More Precise Solution to Two Problems on Tries - Navarro, Poblete
Fast Approximate String Matching in a Dictionary - Baeza-Yates, Navarro (1998)
An Optimal Index for PAT Arrays - Navarro (1996)
