The Penn Discourse TreeBank 2.0
In Proceedings of LREC, 2008
Cited by 42 (14 self)
We present the second version of the Penn Discourse Treebank, PDTB-2.0, describing its lexically-grounded annotations of discourse relations and their two abstract object arguments over the 1 million word Wall Street Journal corpus. We describe all aspects of the annotation, including (a) the argument structure of discourse relations, (b) the sense annotation of the relations, and (c) the attribution of discourse relations and each of their arguments. We list the differences between PDTB-1.0 and PDTB-2.0. We present representative statistics for several aspects of the annotation in the corpus.
Finding the Sources and Targets of Subjective Expressions
Cited by 6 (0 self)
As many popular text genres such as blogs or news contain opinions by multiple sources and about multiple targets, finding the sources and targets of subjective expressions becomes an important sub-task for automatic opinion analysis systems. We argue that while automatic semantic role labeling systems (ASRL) have an important contribution to make, they cannot solve the problem for all cases. Based on the experience of manually annotating opinions, sources, and targets in various genres, we present linguistic phenomena that require knowledge beyond that of ASRL systems. In particular, we address issues relating to the attribution of opinions to sources; sources and targets that are realized as zero-forms; and inferred opinions. We also discuss in some depth that for arguing attitudes we need to be able to recover propositions and not only argued-about entities. A recurrent theme of the discussion is that close attention to specific discourse contexts is needed to identify sources and targets correctly.
Complexity of Dependencies in Discourse: Are Dependencies in Discourse More Complex than in Syntax?
Cited by 3 (1 self)
This paper investigates the complexity of dependencies at the discourse level, in particular the dependencies between discourse connectives and their arguments. Our study is based on data from the Penn Discourse Treebank (PDTB) and is therefore an exploration into the ways treebanks can inform linguistic issues. We observe that, unlike in syntax, there is more uncertainty and flexibility with regard to the location and extent of discourse arguments. This leads to a variety of possible patterns of dependencies between pairs of discourse relations, including nested, crossed and a range of other non-tree-like configurations. Nevertheless, our main conclusion is that the types of discourse dependencies are highly restricted since the more complex cases can be factored out by appealing to discourse notions like anaphora and attribution. We conjecture that the complexity of dependencies is far more restricted at the discourse level as compared to the syntactic level.
TEXT2TABLE: Medical Text Summarization System based on Named Entity Recognition and Modality Identification
With the rapidly growing use of electronic health records, the possibility of large-scale clinical information extraction has drawn much attention. It is not, however, easy to extract information because these reports are written in natural language. To address this problem, this paper presents a system that converts a medical text into a table structure. This system’s core technologies are (1) medical event recognition modules and (2) a negative event identification module that judges whether an event actually occurred or not. Regarding the latter module, this paper also proposes an SVM-based classifier using syntactic information. Experimental results demonstrate empirically that syntactic information can contribute to the method’s accuracy.
Annotating Event Mentions in Text with Modality, Focus, and Source Information
Many natural language processing tasks, including information extraction, question answering and recognizing textual entailment, require analysis of the polarity, focus of polarity, tense, aspect, mood and source of the event mentions in a text in addition to its predicate-argument structure analysis. We refer to modality, polarity and other associated information as extended modality. In this paper, we propose a new annotation scheme for representing the extended modality of event mentions in a sentence. Our extended modality

