Efficient Strategies for Partitioning and Querying a Hierarchical Document Space

by Bruno Codenotti, Gianluca De Marco, Manuela Montangero
http://people.cs.uchicago.edu/~codenott/www2003last.pdf

Abstract:

We consider a problem arising in the efficient management of a Hierarchical Document Space: partitioning the leaves of a tree among a set of servers in such a way that it is possible to take full advantage of the hierarchical system to answer user queries efficiently. After proving that the problem is NP-hard, we devise efficient approximate solutions and report a number of experiments showing that allowing a very small amount of space inefficiency can be instrumental in achieving a significant improvement in query efficiency.

Citations

7715 Computers and Intractability: A Guide to the Theory of NP-Completeness – Garey, Johnson - 1979
148 Hierarchical classification of Web content – Dumais, Chen - 2000
93 An efficient approximation scheme for the one-dimensional bin-packing problem – Karmarkar, Karp - 1982
87 HyPursuit: A hierarchical network search engine that exploits content-link hypertext clustering – Weiss - 1996
80 Worst-Case Performance Bounds for Simple One-Dimensional Packing Algorithms – Johnson, Demers, et al. - 1974
65 Scalable feature selection, classification and signature generation for organizing large text databases into hierarchical topic taxonomies – Chakrabarti, Dom, et al. - 1998
63 Efficient Clustering of Very Large Document Collections, Data Mining for Scientific and Eng – Dhillon, Fan, et al. - 2001
52 A compendium of NP optimization problems. http://www.nada.kth.se/theory/problemlist.html – Crescenzi, Kann - 2000
29 Web Search Using Automatic Classification – Chekuri, Goldwasser, et al. - 1997
16 A Data Clustering Algorithm on Distributed Memory Multiprocessors – Dhillon, Modha - 1999
7 Efficiency considerations for scalable information retrieval servers – Frieder, Grossman, et al. - 2000
2 New worst case results for the bin packing problem – Simchi-Levi - 1994