  The SphereSearch engine for unified ranked retrieval of heterogeneous XML and web documents (2005) [19 citations — 6 self]

by Jens Graupmann, Ralf Schenkel, Gerhard Weikum
In VLDB
http://www.vldb2005.org/program/paper/thu/p529-graupmann.pdf

Abstract:

This paper presents the novel SphereSearch Engine that provides unified ranked retrieval on heterogeneous XML and Web data. Its search capabilities include vague structure conditions, text content conditions, and relevance ranking based on IR statistics and statistically quantified ontological relationships. Web pages in HTML or PDF are automatically converted into XML format, with the option of generating semantic tags by means of linguistic annotation tools. For Web data the XML-oriented query engine is leveraged to provide very rich search options that cannot be expressed in traditional Web search engines: concept-aware and link-aware querying that takes into account the implicit structure and context of Web pages. The benefits of the SphereSearch engine are demonstrated by experiments with a large and richly tagged but non-schematic open encyclopedia extended with external documents.

Citations

1199 WordNet: an Electronic Lexical Database – Fellbaum - 1998
233 Optimal Aggregation Algorithms for Middleware – Fagin, Lotem, et al. - 2001
209 Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval – Robertson, Walker - 1994
141 WebOQL: Restructuring Documents, Databases and Webs – Arocena, Mendelzon - 1998
128 To weave the web – Atzeni, Mecca, et al. - 1997
117 XIRQL: a query language for information retrieval in XML documents – Fuhr, Großjohann - 2001
103 Extracting structured data from web pages – Arasu, Garcia-Molina - 2003
97 Optimizing multi-feature queries for image databases – Güntzer, Balke, et al. - 2000
82 W3QS: A Query System for the World-Wide Web – Konopnicki, Shmueli - 1995
78 The index-based XXL search engine for querying XML data with relevance ranking – Theobald, Weikum - 2002
58 Efficient IR-style keyword search over relational databases – Hristidis, Gravano, et al. - 2003
56 Query processing issues in image (multimedia) databases – Nepal, Ramakrishna - 1999
54 Building light-weight wrappers for legacy web data-sources using W4F – Sahuguet, Azavant - 1999
53 unknown title – Wikipedia
48 Top-k Query Evaluation with Probabilistic Guarantees – Theobald, Weikum, et al. - 2004
42 GATE: a general architecture for text engineering – unknown authors - 2002
37 Concept-based query expansion – Qiu, Frei - 1993
36 FleXPath: Flexible structure and full-text querying for XML – Amer-Yahia, Lakshmanan, et al. - 2004
34 Exploiting dictionaries in named entity extraction: Combining semi-markov extraction processes and data integration methods – Cohen, Sarawagi - 2004
31 Querying and Ranking XML Documents – Schlieder, Meuss - 2002
26 XMach-1: A Benchmark for XML Data Management – Böhme, Rahm - 2001
15 RoadRunner: Automatic data extraction from data-intensive Web sites – Crescenzi, Mecca, et al. - 2002
15 XRANK: ranked keyword search over XML documents – Guo, et al. - 2003
13 An Expressive and Efficient Language for XML Information Retrieval – Chinenyanga, Kushmerick - 2001
11 The Lorel Query Language for Semistructured Data – Abiteboul, et al. - 1997
11 Breaking through the syntax barrier: Searching with entities and relations – Chakrabarti - 2004
11 XSEarch: A semantic search engine for XML – Cohen, et al. - 2003
8 Web-scale information extraction in KnowItAll – Etzioni, et al. - 2004
8 Efficient creation and incremental maintenance of the HOPI index for complex XML document collections – Schenkel, Theobald, et al. - 2005
7 A semantic taxonomy-based personalizable meta-search agent – Kerschberg, Kim, et al. - 2001
7 XMark: A Benchmark for XML Data Management – Schmidt, et al. - 2002
6 The INEX evaluation initiative – Kazai, et al. - 2003
5 Merging XML indices – Amati, Carpineto, et al. - 2004
5 Keyword searching and browsing in databases using BANKS – Bhalotia, et al. - 2002
5 Ontology-Enabled XML Search – Schenkel, Theobald, et al.
5 An algebra for structured queries in Bayesian networks – Vittaut, Piwowarski, et al. - 2004
4 Information extraction and automatic markup for XML documents – Abolhassani, Fuhr, et al. - 2003
4 The Lixto data extraction project: back and forth between theory and practice – Gottlob, et al. - 2004
3 Applying the divergence from randomness approach for content-only search in XML documents – unknown authors - 2004
3 Computer science bibliography. http://www.informatik.uni-trier.de/~ley/db/index.html – Ley - 2007
3 BINGO!: Bookmark-induced gathering of information – Sizov, Theobald, et al. - 2002
2 Automatic query refinement using mined semantic relations – Graupmann, Cai, et al. - 2005
2 Querying XML using structures and keywords – Yu, Jagadish, et al. - 2003
1 An effective approach to document retrieval via utilizing WordNet and recognizing phrases – Liu, et al. - 2004