  The Role of Semantic Locality in Hierarchical Distributed Dynamic Indexing (1999) [15 citations — 5 self]

by Fabien David Bouskila, Ecole Nationale Superieure des Telecommunications, Paris
In Proceedings of the 2000 International Conference on Artificial Intelligence (IC-AI 2000), Las Vegas
http://www.cse.lehigh.edu/~billp/pubs/BouskilaMSThesis.pdf

Abstract:

The global growth in popularity of the World Wide Web has been enabled in part by the availability of browser-based search tools, which in turn have led to an increased demand for indexing techniques and technologies. This explosive growth is evidenced by the rapid expansion in the number and size of digital collections of documents. Simultaneously, fully automatic content-based techniques of indexing have been under development at a variety of institutions. The time is thus ripe for the development of scalable knowledge management systems capable of handling extremely large textual collections distributed across multiple repositories.

Hierarchical Distributed Dynamic Indexing (HDDI) dynamically creates a hierarchical index from distributed document collections. At each node of the hierarchy, a knowledge base is created and subtopic regions of semantic locality are identified. This thesis presents an overview of HDDI with a focus on the algorithm that identifies regions of semantic locality within knowledge bases at each level of the hierarchy.

To my parents, my brother Gautier, my sister Elise, Bertrand and Nathalie.

ACKNOWLEDGMENTS

I would like to thank Professor William Morton Pottenger for his proactive managing of the HDDI project, and for his open-door policy towards students. I gratefully acknowledge the assistance and contributions of the staff in the Automated Learning Group directed by Dr. Michael Welge at the National Center for Supercomputing Applications (NCSA) as well as the funding and technical oversight provided by Dr. Tilt Thompkins in the Emerging Technologies Group at NCSA. I also want to thank my the-

Citations

2217 Introduction to Modern Information Retrieval – Salton, McGill - 1983
2005 The Design and Analysis of Computer Algorithms – Aho, Hopcroft, et al. - 1974
377 Using linear algebra for intelligent information retrieval – Berry, Dumais, et al. - 1995
160 The Chaco user’s guide, version 2.0 – Hendrickson, Leland - 1994
153 Automatic Word Sense Discrimination – Schütze - 1998
148 Information storage and retrieval – Korfhage - 1997
140 Parallel programming with Polaris – Blume, Doallo, et al. - 1996
129 Information Retrieval – van Rijsbergen - 1979
124 On relevance, probabilistic indexing and information retrieval – Maron, Kuhns - 1960
96 Automatic construction of networks of concepts characterizing document databases – Chen, Lynch - 1992
44 Mathematical Taxonomy – Jardine, Sibson - 1971
32 Depth-first search and linear graph algorithms – Tarjan - 1972
29 Automatic structuring of knowledge bases by conceptual clustering – Mineau, Godin - 1995
23 Document retrieval systems — optimization and evaluation – Rocchio - 1966
21 Algorithm AS 136: A K-means clustering algorithm – Hartigan, Wong - 1979
17 Experiments in Solving Recurrences in Computer Programs – Theory - 1997
15 Report on the testing and analysis of an investigation into the comparative efficiency of indexing systems – Cleverdon - 1962
14 On the inverse relationship of recall and precision – Cleverdon - 1972
13 Clustering in a high-dimensional space using hypergraph models – Han, Karypis, et al. - 1997
12 Automatic text analysis – Salton - 1970
11 The association factor in information retrieval – Stiles - 1961
10 Interoperability, scaling, and the digital libraries research agenda. http://www.hpcc.gov/reports/reports-nco/iita-dlw/main.html – Lynch, Garcia-Molina - 1995
4 Bayesian classification – Cheeseman, Self, et al. - 1988
2 arXiv.org ePrint archive. http://xxx.lanl.gov – Los Alamos National Laboratory - 1999
2 Automatic Keyword Classification for Information Retrieval – Jones - 1971
2 Statistical Association Methods for Mechanized Documentation. Washington DC: National Bureau of Standards – Stevens, Guiliano, et al. - 1964
2 Speculations concerning information retrieval – Good - 1958
1 A basis for time and cost evaluation in information systems. The Information Bazar – Korfhage, DeLutis - 1969
1 Dynamic Information and Library Processing. Englewood Cliffs – Salton - 1975
1 A statistical approach to mechanised encoding and searching of library information – Luhn - 1957
1 The Mathematics of Classification – Fairthorne - 1961
1 Is automatic classification a reasonable application of statistical analysis of text – Doyle - 1965
1 Retrieval and relevance: On the evaluation of IR systems. The ISI Lazerow Lecture – Robertson - 1993
1 Multilevel k-way hypergraph partitioning – Karypis, Kumar - 1998
1 Texture segmentation of SAR images – Wang - 1997
1 Automatic graph clustering (system demonstration) – Sablowski, Frick - 1996