Results 1 - 10
of
117
TextTiling: Segmenting text into multi-paragraph subtopic passages
- Computational Linguistics
, 1997
"... TextTiling is a technique for subdividing texts into multi-paragraph units that represent passages, or subtopics. The discourse cues for identifying major subtopic shifts are patterns of lexical co-occurrence and distribution. The algorithm is fully implemented and is shown to produce segmentation t ..."
Abstract
-
Cited by 275 (1 self)
- Add to MetaCart
TextTiling is a technique for subdividing texts into multi-paragraph units that represent passages, or subtopics. The discourse cues for identifying major subtopic shifts are patterns of lexical co-occurrence and distribution. The algorithm is fully implemented and is shown to produce segmentation that corresponds well to human judgments of the subtopic boundaries of 12 texts. Multi-paragraph subtopic segmentation should be useful for many text analysis tasks, including information retrieval and summarization. 1.
Planning Text for Advisory Dialogues: Capturing Intentional and Rhetorical Information
- COMPUTATIONAL LINGUISTICS
, 1993
"... ... this paper, we argue that, to handle explanation dialogues successfully, a discourse model must include information about the intended effect of individual parts of the text on the hearer, as well as how the parts relate to one another rhetorically. We present a text planner that records this in ..."
Abstract
-
Cited by 201 (27 self)
- Add to MetaCart
... this paper, we argue that, to handle explanation dialogues successfully, a discourse model must include information about the intended effect of individual parts of the text on the hearer, as well as how the parts relate to one another rhetorically. We present a text planner that records this information and show how the resulting structure is used to respond appropriately to a follow-up question.
A Data-Driven Methodology for Motivating a Set of Coherence Relations
, 1996
"... The notion that a text is coherent in virtue of the `relations' that hold between its component spans currently forms the basis for an active research programme in discourse linguistics. Coherence relations feature prominently in many theories of discourse structure, and have recently been used with ..."
Abstract
-
Cited by 110 (16 self)
- Add to MetaCart
The notion that a text is coherent in virtue of the `relations' that hold between its component spans currently forms the basis for an active research programme in discourse linguistics. Coherence relations feature prominently in many theories of discourse structure, and have recently been used with considerable success in text generation systems. However, while the concept of coherence relations is now common currency for discourse theorists, there remains much confusion about them, and no standard set of relations has yet emerged. The aim of this thesis is to contribute towards the development of a standard set of relations. We begin from an explicitly empirical conception of relations: they are taken to model a collection of psychological mechanisms operative during the tasks of reading and writing. This conception is fleshed out with reference to psychological theories of skilled task performance, and to Rosch's notion of the basic level of categorisation. A methodology for investi...
Empirical studies on the disambiguation of cue phrases
- Computational Linguistics
, 1993
"... Cue phrases are linguistic expressions such as now and well that function as explicit indicators of the structure of a discourse. For example, now may signal the beginning of a subtopic or a return to a previous topic, while well may mark subsequent material as a response to prior material, or as an ..."
Abstract
-
Cited by 102 (9 self)
- Add to MetaCart
Cue phrases are linguistic expressions such as now and well that function as explicit indicators of the structure of a discourse. For example, now may signal the beginning of a subtopic or a return to a previous topic, while well may mark subsequent material as a response to prior material, or as an explanatory comment. However, while cue phrases may convey discourse structure, each also has one or more alternate uses. While incidentally may be used sententially as an adverbial, for example, the discourse use initiates a digression. Although distinguishing discourse and sentential uses of cue phrases is critical to the interpretation and generation of discourse, the question of how speakers and hearers accomplish this disambiguation is rarely addressed. This paper reports results of empirical studies on discourse and sentential uses of cue phrases, in which both text-based and prosodic features were examined for disambiguating power. Based on these studies, it is proposed that discourse versus sentential usage may be distinguished by intonational features, specifically, pitch accent and prosodic phrasing. A prosodic model that characterizes these distinctions is identified. This model is associated with features identifiable from text analysis, including orthography and part of speech, to permit the application of the results of the prosodic analysis to the generation of appropriate intonational features for discourse and sentential uses of cue phrases in synthetic speech. 1.
Preliminaries to a Theory of Speech Disfluencies
, 1994
"... This thesis examines disfluencies (e.g., "um", repeated words, and a variety of forms of self-repair) in the spontaneous speech of adult normal speakers of American English. Despite their prevalence, disfluencies have traditionally been viewed as irregular events and have received little attention. ..."
Abstract
-
Cited by 97 (7 self)
- Add to MetaCart
This thesis examines disfluencies (e.g., "um", repeated words, and a variety of forms of self-repair) in the spontaneous speech of adult normal speakers of American English. Despite their prevalence, disfluencies have traditionally been viewed as irregular events and have received little attention. The goal of the thesis is to provide evidence that, on the contrary, disfluencies show remarkably regular trends in a number of dimensions. These regularities have consequences for models of human language production; they can also be exploited to improve performance in speech applications. The method includes analysis of over 5000 hand-annotated disfluencies from a database (250,000 words) containing three different styles of spontaneous speech: task-oriented human-computer dialog, task-oriented human-human dialog, and human-human conversation on a prescribed topic. The approach is theory-neutral and strongly data-driven. The annotations correspond to observable characteristics ("features") ...
Relational Agents: Effecting Change through Human-Computer Relationships
, 2003
"... What kinds of social relationships can people have with computers? Are there activities that computers can engage in that actively draw people into relationships with them? What are the potential benefits to the people who participate in these human-computer relationships? To address these question ..."
Abstract
-
Cited by 79 (5 self)
- Add to MetaCart
What kinds of social relationships can people have with computers? Are there activities that computers can engage in that actively draw people into relationships with them? What are the potential benefits to the people who participate in these human-computer relationships? To address these questions this work introduces a theory of Relational Agents, which are computational artifacts designed to build and maintain long-term, social-emotional relationships with their users. These can be purely software humanoid animated agents--as developed in this work--but they can also be non-humanoid or embodied in various physical forms, from robots, to pets, to jewelry, clothing, hand-helds, and other interactive devices. Central to the notion of relationship is that it is a persistent construct, spanning multiple interactions; thus, Relational Agents are explicitly designed to remember past history and manage future expectations in their interactions with users. Finally, relationships are fundamentally social and emotional, and detailed knowledge of human social psychology--with a particular emphasis on the role of affect--must be incorporated into these agents if they are to effectively leverage the mechanisms of human social cognition in order to build relationships in the most natural manner possible. People build
An Unsupervised Approach to Recognizing Discourse Relations
, 2002
"... We present an unsupervised approach to recognizing discourse relations of CONTRAST, EXPLANATION-EVIDENCE, CONDITION and ELABORATION that hold between arbitrary spans of texts. We show that discourse relation classifiers trained on examples that are automatically extracted from massive amoun ..."
Abstract
-
Cited by 64 (0 self)
- Add to MetaCart
We present an unsupervised approach to recognizing discourse relations of CONTRAST, EXPLANATION-EVIDENCE, CONDITION and ELABORATION that hold between arbitrary spans of texts. We show that discourse relation classifiers trained on examples that are automatically extracted from massive amounts of text can be used to distinguish between some of these relations with accuracies as high as 93%, even when the relations are not explicitly marked by cue phrases.
The Rhetorical Parsing of Natural Language Texts
, 1997
"... We derive the rhetorical structures of texts by means of two new, surface-form-based algorithms: one that identifies discourse usages of cue phrases and breaks sen- tences into clauses, and one that produces valid rhetorical structure trees for unre- stricted natural language texts. The algo- ..."
Abstract
-
Cited by 63 (9 self)
- Add to MetaCart
We derive the rhetorical structures of texts by means of two new, surface-form-based algorithms: one that identifies discourse usages of cue phrases and breaks sen- tences into clauses, and one that produces valid rhetorical structure trees for unre- stricted natural language texts. The algo- rithms use information that was derived from a corpus analysis of cue phrases.
Speech repairs, intonational phrases and discourse markers: modeling speakers’ utterances in spoken dialogue
- Computational Linguistics
, 1999
"... Interactive spoken dialogue provides many new challenges for natural language understanding systems. One of the most critical challenges is simply determining the speaker’s intended utterances: both segmenting a speaker’s turn into utterances and determining the intended words in each utterance. Eve ..."
Abstract
-
Cited by 61 (9 self)
- Add to MetaCart
Interactive spoken dialogue provides many new challenges for natural language understanding systems. One of the most critical challenges is simply determining the speaker’s intended utterances: both segmenting a speaker’s turn into utterances and determining the intended words in each utterance. Even assuming perfect word recognition, the latter problem is complicated by the occurrence of speech repairs, which occur where speakers go back and change (or repeat) something they just said. The words that are replaced or repeated are no longer part of the intended utterance, and so need to be identified. Segmenting turns and resolving repairs are strongly intertwined with a third task: identifying discourse markers. Because of the interactions, and interactions with POS tagging and speech recognition, we need to address these tasks together and early on in the processing stream. This paper presents a statistical language model in which we redefine the speech recognition problem so that it includes the identification of POS tags, discourse markers, speech repairs and intonational phrases. By solving these simultaneously, we obtain better results on each task than addressing them separately. Our model is able to identify 72 % of turn-internal intonational boundaries with a precision of 71%, 97 % of discourse markers with 96 % precision, and detect and correct 66 % of repairs with 74 % precision.

