Easily identifiable discourse relations
2008
Cited by 11 (3 self)
We present a corpus study of local discourse relations based on the Penn Discourse Tree Bank, a large manually annotated corpus of explicitly or implicitly realized relations. We show that while there is a large degree of ambiguity in temporal explicit discourse connectives, overall connectives are mostly unambiguous and allow high-accuracy prediction of discourse relation type. We achieve 93.09% accuracy in classifying the explicit relations and 74.74% accuracy overall. In addition, we show that some pairs of relations occur together in text more often than expected by chance. This finding suggests that global sequence classification of the relations in text can lead to better results, especially for implicit relations.
Revisiting Readability: A Unified Framework for Predicting Text Quality
Cited by 8 (1 self)
We combine lexical, syntactic, and discourse features to produce a highly predictive model of human readers' judgments of text readability. This is the first study to take into account such a variety of linguistic factors and the first to empirically demonstrate that discourse relations are strongly associated with the perceived quality of text. We show that various surface metrics generally expected to be related to readability are not very good predictors of readability judgments in our Wall Street Journal corpus. We also establish that readability predictors behave differently depending on the task: predicting text readability or ranking the readability. Our experiments indicate that discourse relations are the one class of features that exhibits robustness across these two tasks.
Recognizing Implicit Discourse Relations in the Penn Discourse Treebank
Cited by 5 (0 self)
We present an implicit discourse relation classifier in the Penn Discourse Treebank (PDTB). Our classifier considers the context of the two arguments, word pair information, as well as the arguments' internal constituent and dependency parses. Our results on the PDTB yield a significant 14.1% improvement over the baseline. In our error analysis, we discuss four challenges in recognizing implicit relations in the PDTB.
Departures from Tree Structures in Discourse: Shared Arguments in the Penn Discourse Treebank
Cited by 5 (2 self)
The term discourse structure is used to denote any structure of a text above that of the sentence. Trees have often been posited as a good abstraction when discourse is taken to have a hierarchical structure (Mann and Thompson 1987; Webber et al. 2003; Marcu 2000; Egg and Redeker 2008). Nevertheless, periodically researchers have commented on the need to depart from the strict single-parent hierarchy of trees to structures which have shared daughters, a move which incorporates multiple inheritance and is therefore an issue for tree representations. This study follows up on the observation in (Lee et al. 2006) about the relative ubiquity of shared structures in the Penn Discourse Treebank or PDTB (Prasad et al. 2008; PDTB-Group 2008), a recently released corpus which annotates discourse relations and their arguments. We limit our investigation here to cases where the shared discourse structure is a syntactically subordinate clause introduced by a subordinating conjunction (e.g. because, although, when). We examine annotations in the PDTB where the subordinate clause has been taken to be an argument of both the relation associated with the subordinating conjunction and another relation expressed in the immediately subsequent discourse. We ask what such annotations imply about the link between syntactic subordination and discourse subordination. Our argument is that while syntactic subordination may often correlate with discourse subordination, there are interesting exceptions that might better be captured as discourse coordination. We provide some systematic characterization of these exceptions by appealing to well-motivated discourse factors, and discuss their implications for tree structures.
Using Syntax to Disambiguate Explicit Discourse Connectives in Text
Cited by 4 (0 self)
Discourse connectives are words or phrases such as once, since, and on the contrary that explicitly signal the presence of a discourse relation. There are two types of ambiguity that need to be resolved during discourse processing. First, a word can be ambiguous between discourse and non-discourse usage. For example, once can be either a temporal discourse connective or simply a word meaning “formerly”. Secondly, some connectives are ambiguous in terms of the relation they mark. For example, since can serve as either a temporal or causal connective. We demonstrate that syntactic features improve performance in both disambiguation tasks. We report state-of-the-art results for identifying discourse vs. non-discourse usage and human-level performance on sense disambiguation.
Question Generation from Paragraphs at UPenn: QGSTEC System Description
Cited by 4 (0 self)
This paper describes the question generation system developed at UPenn for QGSTEC 2010. The system uses predicate argument structures of sentences along with semantic roles for the task of generating questions from paragraphs. The semantic role labels are used to identify relevant parts of the text before forming questions over them. The generated questions are then ranked to pick the final six best questions.
Predicting Discourse Connectives for Implicit Discourse Relation Recognition
Cited by 2 (0 self)
Existing work indicates that the absence of explicit discourse connectives makes it difficult to recognize implicit discourse relations. In this paper we attempt to overcome this difficulty for implicit relation recognition by automatically inserting discourse connectives between arguments with the use of a language model. We then propose two algorithms that use these predicted connectives. One uses the predicted implicit connectives as additional features in a supervised model. The other performs implicit relation recognition based only on the predicted connectives. Results on Penn Discourse Treebank 2.0 show that predicted discourse connectives help implicit relation recognition and that the first algorithm can achieve an absolute average F-score improvement of 3% over a state-of-the-art baseline system.
Genre distinctions for Discourse in the Penn TreeBank
Cited by 2 (0 self)
Articles in the Penn TreeBank were identified as being reviews, summaries, letters to the editor, news reportage, corrections, wit and short verse, or quarterly profit reports. All but the last three were then characterised in terms of features manually annotated in the Penn Discourse TreeBank: discourse connectives and their senses. Summaries turned out to display very different discourse features from the other three genres. Letters also appeared to have some different features. The two main findings involve (1) differences between genres in the senses associated with intra-sentential discourse connectives, inter-sentential discourse connectives and inter-sentential discourse relations that are not lexically marked; and (2) differences within all four genres between the senses of discourse relations not lexically marked and those that are marked. The first finding means that genre should be made a factor in automated sense labelling of non-lexically marked discourse relations. The second means that lexically marked relations provide a poor model for automated sense labelling of relations that are not lexically marked.
Automatic Factual Question Generation from Text
Cited by 2 (0 self)
Texts with potential educational value are becoming available through the Internet (e.g., Wikipedia, news services). However, using these new texts in classrooms introduces many challenges, one of which is that they usually lack practice exercises and assessments. Here, we address part of this challenge by automating the creation of a specific type of assessment item. Specifically, we focus on automatically generating factual WH questions. Our goal is to create an automated system that can take as input a text and produce as output questions for assessing a reader’s knowledge of the information in the text. The questions could then be presented to a teacher, who could select and revise the ones that he or she judges to be useful. After introducing the problem, we describe some of the computational and linguistic challenges presented by factual question generation. We then present an implemented system that leverages existing natural language processing techniques to address some of these challenges. The system uses a combination of manually encoded transformation rules and a statistical question ranker trained on a tailored dataset of labeled system output. We present experiments that evaluate individual components of the system as well as the system as a whole. We found, among other things, that the question ranker roughly doubled the acceptability
The Hindi Discourse Relation Bank
Cited by 1 (1 self)
We describe the Hindi Discourse Relation Bank project, aimed at developing a large corpus annotated with discourse relations. We adopt the lexically grounded approach of the Penn Discourse Treebank, and describe our classification of Hindi discourse connectives, our modifications to the sense classification of discourse relations, and some cross-linguistic comparisons based on the initial annotations carried out so far.

