Using Attribute Grammars to Uniformly Represent Structured Documents - Application to Information Retrieval
Abstract:
This paper presents ongoing work on uniformly representing structured documents by means of Attribute Grammars (AG). Each document corresponds to a syntactic tree whose nodes are decorated with sets of attributes. The values of these attributes are characteristics that specify the semantics of both the textual content and the structural elements. We show how to use this representation for Information Retrieval (IR) over collections of structured documents. We give a brief overview of the proposed DASTIR system, describing how the syntactic and semantic parts of the AG are specified, this AG being generated to produce the desired response to a structural query.
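As a rough illustration of the representation described above, the following minimal sketch decorates a document tree with two attributes and answers a simple structural query. It is not the DASTIR system: the `Node` class, the `path` and `terms` attributes, and the `decorate` and `query` functions are all hypothetical names chosen for this example, and the query form (element label plus one term) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a document's syntactic tree (hypothetical structure)."""
    label: str                                   # structural element, e.g. "section"
    text: str = ""                               # textual content, if any
    children: list["Node"] = field(default_factory=list)
    attrs: dict = field(default_factory=dict)    # attribute decorations


def decorate(node, path=""):
    """Evaluate two illustrative attributes on every node.

    "path" is inherited (passed down from the parent);
    "terms" is synthesized (collected up from the children).
    """
    node.attrs["path"] = f"{path}/{node.label}"
    terms = set(node.text.lower().split())
    for child in node.children:
        terms |= decorate(child, node.attrs["path"])
    node.attrs["terms"] = terms
    return terms


def query(node, label, term):
    """Return the paths of elements named `label` whose content contains `term`."""
    hits = []
    if node.label == label and term in node.attrs["terms"]:
        hits.append(node.attrs["path"])
    for child in node.children:
        hits.extend(query(child, label, term))
    return hits


# A tiny made-up document used only for this example.
doc = Node("article", children=[
    Node("title", "Attribute grammars"),
    Node("section", children=[Node("para", "structured retrieval of documents")]),
    Node("section", children=[Node("para", "syntactic trees")]),
])
decorate(doc)
print(query(doc, "section", "retrieval"))  # ['/article/section']
```

Here the attribute values are computed once by a single tree traversal, and the structural query then only inspects the decorations, which mirrors the split between a syntactic part (the tree) and a semantic part (the attributes).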

