  Index compression vs. Retrieval time of inverted files for XML documents (2002)

by Norbert Fuhr, Norbert Gövert
In Proceedings of the 11th ACM International Conference on Information and Knowledge Management
http://ls6-www.informatik.uni-dortmund.de/bib/fulltext/ir/./Fuhr_Goevert:02.pdf

Abstract:

Query languages for retrieval of XML documents allow for conditions referring both to the content and the structure of documents. In this paper, we investigate two different approaches for reducing index space of inverted files for XML documents. First, we consider methods for compressing index entries. Second, we develop the new XS tree data structure which contains the structural description of a document in a rather compact form, such that these descriptions can be kept in main memory. Experimental results on two large XML document collections show that very high compression rates for indexes can be achieved, but any compression increases retrieval time. On the other hand, highly compressed indexes may be feasible for applications where storage is limited, such as in PDAs or E-book devices.
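To make "compressing index entries" concrete, the following is a minimal sketch of one standard technique: storing a posting list as gaps between successive document numbers and coding each gap with the Elias gamma code. The abstract does not say which codes the paper evaluates, so the choice of gamma coding, the function names, and the use of a bit string instead of packed bytes are all assumptions made for illustration.

```python
from typing import Iterable, List


def to_gaps(doc_ids: Iterable[int]) -> List[int]:
    """Turn strictly increasing document numbers (>= 1) into gaps."""
    gaps = []
    prev = 0
    for doc_id in doc_ids:
        if doc_id <= prev:
            raise ValueError("document numbers must be positive and strictly increasing")
        gaps.append(doc_id - prev)
        prev = doc_id
    return gaps


def gamma_encode(n: int) -> str:
    """Elias gamma code of n >= 1: (len - 1) zeros, then n in binary."""
    if n < 1:
        raise ValueError("gamma code is defined for integers >= 1")
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary


def encode_postings(doc_ids: Iterable[int]) -> str:
    """Encode a posting list as a concatenation of gamma-coded gaps."""
    return "".join(gamma_encode(gap) for gap in to_gaps(doc_ids))


def decode_postings(bits: str) -> List[int]:
    """Invert encode_postings: read gamma codes and sum the gaps."""
    doc_ids = []
    pos = 0
    current = 0
    while pos < len(bits):
        zeros = 0
        while bits[pos] == "0":  # unary length prefix
            zeros += 1
            pos += 1
        gap = int(bits[pos:pos + zeros + 1], 2)
        pos += zeros + 1
        current += gap
        doc_ids.append(current)
    return doc_ids


postings = [3, 7, 8, 20]
encoded = encode_postings(postings)
print(encoded, len(encoded), "bits")
print(decode_postings(encoded))
```

Small gaps get short codes, so dense posting lists shrink the most. Decoding has to walk the bit stream sequentially, which illustrates the trade-off the abstract reports: a smaller index costs additional work at retrieval time.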
