Automatic Pipeline Construction for Real-Time Annotation
In Proc. of the 14th CICLing, 2013
Cited by 2 (2 self)
Abstract. Many annotation tasks in computational linguistics are tackled with manually constructed pipelines of algorithms. In real-time tasks where information needs are stated and addressed ad-hoc, however, manual construction is infeasible. This paper presents an artificial intelligence approach to automatically construct annotation pipelines for given information needs and quality prioritizations. Based on an abstract ontological model, we use partial order planning to select a pipeline’s algorithms and informed search to obtain an efficient pipeline schedule. We realized the approach as an expert system on top of Apache UIMA, which offers evidence that pipelines can be constructed ad-hoc in near-zero time.
Information Extraction as a Filtering Task
Information extraction is usually approached as an annotation task: Input texts run through several analysis steps of an extraction process in which different semantic concepts are annotated and matched against the slots of templates. We argue that such an approach lacks an efficient control of the input of the analysis steps. In this paper, we hence propose and evaluate a model and a formal approach that consistently put the filtering view in the focus: Before spending annotation effort, filter those portions of the input texts that may contain relevant information for filling a template and discard the others. We model all dependencies between the semantic concepts sought for with a truth maintenance system, which then efficiently infers the portions of text to be annotated in each analysis step. The filtering view enables an information extraction system (1) to annotate only relevant portions of input texts and (2) to easily trade its run-time efficiency for its recall. We provide our approach as an open-source extension of Apache UIMA and we show the potential of our approach in a number of experiments. © Wachsmuth et al. (2013). This is the author’s version of the work. It is posted here for your personal use. Not for redistribution.
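The filtering view described in this abstract can be illustrated with a minimal sketch. It is not the authors' Apache UIMA extension and it does not use a truth maintenance system. It assumes a template whose slots must all be filled from the same sentence, so a plain conjunctive filter stands in for the dependency model. The names `has_money`, `has_year`, and `filter_pipeline` are hypothetical, and the regular expressions are toy stand-ins for real analysis steps.

```python
import re
from typing import Callable, List

# Hypothetical analysis step: reports whether a sentence contains a concept.
Step = Callable[[str], bool]


def has_money(sentence: str) -> bool:
    # Toy detector for a monetary expression such as "$5".
    return re.search(r"[$€]\s?\d", sentence) is not None


def has_year(sentence: str) -> bool:
    # Toy detector for a four-digit year in the 1900s or 2000s.
    return re.search(r"\b(19|20)\d{2}\b", sentence) is not None


def filter_pipeline(sentences: List[str], steps: List[Step]) -> List[str]:
    """Run the steps in order and pass on only sentences that remain relevant.

    Assumption: every concept must occur in the same sentence, so a sentence
    that fails one step can be discarded before later steps spend effort on it.
    """
    relevant = list(sentences)
    for step in steps:
        relevant = [s for s in relevant if step(s)]
    return relevant


sentences = [
    "Revenue rose to $5 million in 2012.",
    "The company is based in Berlin.",
    "Analysts expect $7 million next year.",
]
result = filter_pipeline(sentences, [has_money, has_year])
print(result)
```

In this sketch the second step only inspects the two sentences that passed the first one, which is the efficiency gain the filtering view aims at. Loosening the filter (for example, keeping neighbouring sentences as well) would raise recall at the cost of run time.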
PRINCIPLES OF USABLE QUERY INTERFACES
People generate and share a huge variety of information that can be used to derive useful insights and create value for society. Keyword queries are by far the most popular method for users to search, explore, and understand such information. Because search is more effective when information is well organized, people often choose to add structure to their information, producing semi-structured or even structured data. However, the new structure is not always added in the right places, i.e., where it will maximize the improvement in search results. Further, seemingly unimportant changes in the structure of the information tend to affect search results in undesirable ways. In addition, current keyword search and exploration systems are not very effective for certain types of information and queries. This thesis addresses these three issues by showing how to determine where to add structure to information for maximal benefit, and how to use the resulting structure to provide effective and robust answers to keyword queries. Our first contribution is to determine what structure should be imposed on a given collection of information. Since the resources for converting information to a more structured format are always limited, our goal is to impose structure in a manner that will maximize the improvement for subsequent queries. In more technical terms, this means discovering and maintaining exactly the parts of a database schema that will most improve users’ search and exploration experience. We show that adding structure corresponding to the most “popular” parts of a schema does