Results 1 - 10 of 34

TEI P5 as an XML Standard for Treebank Encoding

by Adam Przepiórkowski
"... The aim of the paper is to show that a subset of Text Encoding Initiative Guidelines is a reasonable choice as a standard for stand-off XML encoding of syntactically annotated corpora. The proposed TEI schema — actually employed in the National Corpus of Polish — is compared to other such candidate ..."
Cited by 6 (2 self)

Merging Layered Annotations

by Nancy Ide, Keith Suderman
"... The American National Corpus and its annotations are represented in a stand-off XML format compliant with the specifications of ISO TC37 SC4 WG1’s Linguistic Annotation Framework. Because few systems that enable search and access of the corpus currently support stand-off markup, the project has deve ..."
Cited by 3 (0 self)

Layering and merging linguistic annotations

by Keith Suderman, Nancy Ide - In NLPXML-2006, 2006
"... The American National Corpus and its annotations are represented in a stand-off XML format compliant with the specifications of ISO TC37 SC4 WG1’s Linguistic Annotation Framework. Because few systems that enable search and access of the corpus currently support stand-off markup, the project has deve ..."
Cited by 3 (0 self)

Efficient XQuery Support for Stand-Off Annotation

by Wouter Alink, Raoul Bhoedjang, et al., 2006
"... XML annotations are a widely occurring phenomenon in many application fields, and XML databases should be used to store and query such data. To provide intuitive and fast querying of annotations, we make a case for extending XPath with four new axis steps that correspond with so-called StandOff join ..."
Cited by 5 (0 self)

MAE and MAI: Lightweight Annotation and Adjudication Tools

by Amber Stubbs
"... MAE and MAI are lightweight annotation and adjudication tools for corpus creation. DTDs are used to define the annotation tags and attributes, including extent tags, link tags, and non-consuming tags. Both programs are written in Java and use a stand-alone SQLite database for storage and retrieval of annotation data. Output is in stand-off XML."
Cited by 1 (0 self)

XML-based Stand-off Representation and Exploitation of Multi-Level Linguistic Annotation

by unknown authors
"... This paper deals with the representation of multi-level linguistic annotations. It proposes an XML-based, generic stand-off architecture and presents an example instantiation. Application scenarios that profit from this architecture are sketched out."

XIRAF: An XML-IR Approach to . . .

by W. Alink, 2005
"... This Master’s thesis addresses problems in current digital forensic investigations. It proposes the XIRAF system as a novel approach towards the integration of existing forensic analysis tools using XML technology. The concept of integrating these tools can be compared to the concept of concurrent ..."
"... answering, and multimedia retrieval. This thesis introduces Burkowski axis steps in XPath as a viable solution for the digital forensics application area. The steps can be used in stand-off XML annotation in which the content is separated from the annotations. This approach has many advantages over inline ..."

Representing and querying multi-dimensional markup for question answering

by Wouter Alink, Valentin Jijkoun, David Ahn, Maarten de Rijke, et al. - In Proceedings of the 5th Workshop on NLP and XML (NLPXML-2006): Multi-Dimensional Markup in Natural Language Processing, 2006
"... This paper describes our approach to representing and querying multi-dimensional, possibly overlapping text annotations, as used in our question answering (QA) system. We use a system extending XQuery, the W3C-standard XML query language, with new axes that allow one to jump easily between different annotations of the same data. The new axes are formulated in terms of (partial) overlap and containment. All annotations are made using stand-off XML in a single document, which can be efficiently queried using the XQuery extension. The system is scalable to gigabytes of XML annotations. We show ..."
Cited by 5 (4 self)

A framework for annotating information structure in discourse

by Sasha Calhoun, Malvina Nissim, Mark Steedman, Jason Brenier - Pie in the Sky: Proceedings of the workshop, ACL, 2005
"... We present a framework for the integrated analysis of the textual and prosodic characteristics of information structure in the Switchboard corpus of conversational English. Information structure describes the availability, organisation and salience of entities in a discourse model. We present standa ..."
"... This annotation, using stand-off XML in NXT, can help establish standards for the annotation of information structure in discourse."
Cited by 16 (1 self)

A Flexible Stand-Off Data Model with Query Language for Multi-Level Annotation

by Christoph Müller
"... We present an implemented XML data model and a new, simplified query language for multi-level annotated corpora. The new query language involves automatic conversion of queries into the underlying, more complicated MMAXQL query language. It supports queries for sequential and hierarchical, but also ..."
Cited by 1 (0 self)