Results 1 - 10 of 82
Training Tree Transducers
In HLT-NAACL, 2004
"... Many probabilistic models for natural language are now written in terms of hierarchical tree structure. Tree-based modeling still lacks many of the standard tools taken for granted in (finite-state) string-based modeling. The theory of tree transducer automata provides a possible framework to ..."
Abstract
-
Cited by 132 (12 self)
Many probabilistic models for natural language are now written in terms of hierarchical tree structure. Tree-based modeling still lacks many of the standard tools taken for granted in (finite-state) string-based modeling. The theory of tree transducer automata provides a possible framework to draw on, as it has been worked out in an extensive literature. We motivate the use of tree transducers for natural language and address the training problem for probabilistic tree-to-tree and tree-to-string transducers.
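For a rough sense of the kind of object involved, the sketch below applies hand-written probabilistic tree-to-string rules to a toy parse tree and returns the most probable output string. The rule format, toy grammar, and probabilities are invented for illustration and are far simpler than the extended tree transducers the paper actually trains.

    # Minimal sketch of a probabilistic tree-to-string transducer
    # (illustrative rule format and probabilities only; not the paper's formalism).

    # A tree is (label, child, child, ...); a node with no children is a leaf.
    TREE = ("S", ("NP", ("he",)), ("VP", ("V", ("saw",)), ("NP", ("her",))))

    # Each rule rewrites a node label into a sequence of child indices to
    # transduce recursively, with an associated probability.
    RULES = {
        "S":  [((0, 1), 1.0)],
        "VP": [((0, 1), 0.6), ((1, 0), 0.4)],  # the second rule reorders V and NP
        "NP": [((0,), 1.0)],
        "V":  [((0,), 1.0)],
    }

    def best_string(tree):
        """Return the highest-probability output tokens and their probability."""
        label, *children = tree
        if not children:                       # leaf: emit the label itself
            return [label], 1.0
        best, best_p = None, 0.0
        for rhs, p in RULES[label]:
            out, total = [], p
            for child_index in rhs:
                sub, sub_p = best_string(children[child_index])
                out.extend(sub)
                total *= sub_p
            if total > best_p:
                best, best_p = out, total
        return best, best_p

    tokens, prob = best_string(TREE)
    print(" ".join(tokens), prob)              # "he saw her" 0.6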
An Overview of Probabilistic Tree Transducers for Natural Language Processing
2005
"... Probabilistic finite-state string transducers (FSTs) are extremely popular in natural language processing, due to powerful generic methods for applying, composing, and learning them. Unfortunately, FSTs are not a good fit for much of the current work on probabilistic modeling for machine translati ..."
Abstract
-
Cited by 76 (7 self)
Probabilistic finite-state string transducers (FSTs) are extremely popular in natural language processing, due to powerful generic methods for applying, composing, and learning them. Unfortunately, FSTs are not a good fit for much of the current work on probabilistic modeling for machine translation, summarization, paraphrasing, and language modeling. These methods operate directly on trees, rather than strings. We show that tree acceptors and tree transducers subsume most of this work, and we discuss algorithms for realizing the same benefits found in probabilistic string transduction.
Evaluation Metrics for Generation
In Proceedings of the First International Natural Language Generation Conference (INLG2000), Mitzpe
"... Certain generation applications may profit from the use of stochastic methods. In developing stochastic methods, it is crucial to be able to quickly assess the relative merits of different approaches or models. In this paper, we present several types of intrinsic (system internal) metrics which we h ..."
Abstract
-
Cited by 57 (5 self)
Certain generation applications may profit from the use of stochastic methods. In developing stochastic methods, it is crucial to be able to quickly assess the relative merits of different approaches or models. In this paper, we present several types of intrinsic (system-internal) metrics which we have used for baseline quantitative assessment. This quantitative assessment should then be augmented by a fuller evaluation that examines qualitative aspects. To this end, we describe an experiment that tests correlation between the quantitative metrics and human qualitative judgment. The experiment confirms that intrinsic metrics cannot replace human evaluation, but some correlate significantly with human judgments of quality and understandability and can be used for evaluation during development.
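For a concrete sense of what such an intrinsic metric and its comparison against human judgment can look like, here is a small sketch: a normalized token-overlap score between generated and reference strings, correlated against invented human quality ratings. The specific metric and the toy data are illustrative assumptions, not the metrics reported in the paper.

    # Illustrative intrinsic metric plus correlation with human ratings.
    from difflib import SequenceMatcher
    from statistics import correlation          # Pearson r (Python 3.10+)

    def string_accuracy(candidate: str, reference: str) -> float:
        """Token-level similarity in [0, 1] between output and reference."""
        return SequenceMatcher(None, candidate.split(), reference.split()).ratio()

    # Hypothetical system outputs, references, and human quality ratings (1-5).
    outputs    = ["the flight leaves at noon", "flight noon leaves", "no flights were found"]
    references = ["the flight departs at noon", "the flight leaves at noon", "there are no flights"]
    human      = [4.5, 2.0, 3.0]

    scores = [string_accuracy(o, r) for o, r in zip(outputs, references)]
    print("intrinsic scores:", [round(s, 2) for s in scores])
    print("correlation with human ratings:", round(correlation(scores, human), 2))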
Instance-Based Natural Language Generation
Carnegie Mellon University, 2001
"... This paper presents a bottom-up generator that makes use of Information Retrieval techniques to rank potential generation candidates by comparing them to a data base of stored instances. We introduce two general techniques to address the search problem, expectation-driven search and dynamic grammar ..."
Abstract
-
Cited by 54 (7 self)
This paper presents a bottom-up generator that makes use of Information Retrieval techniques to rank potential generation candidates by comparing them to a database of stored instances. We introduce two general techniques to address the search problem, expectation-driven search and dynamic grammar rule selection, and present the architecture of an implemented generation system called IGEN. Our approach uses a domain-specific generation grammar that is automatically derived from a semantically tagged treebank. We then evaluate the efficiency of our system.
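As a sketch of the IR-style ranking step described above, the following toy code scores candidate realizations by bag-of-words cosine similarity against a small base of stored instances; the scoring function and the instances are illustrative assumptions, not IGEN's actual retrieval model.

    # Rank generation candidates against stored instances (illustrative only).
    from collections import Counter
    from math import sqrt

    def cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[t] * b[t] for t in a)
        norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    # Stored instances, e.g. sentences drawn from a semantically tagged treebank.
    instance_base = [
        "the stock rose three percent",
        "shares fell sharply in early trading",
    ]
    base_vectors = [Counter(s.split()) for s in instance_base]

    def rank_candidates(candidates):
        """Score each candidate by its best match against the instance base."""
        scored = [(max(cosine(Counter(c.split()), b) for b in base_vectors), c)
                  for c in candidates]
        return sorted(scored, reverse=True)

    print(rank_candidates(["the stock rose sharply",
                           "rose stock the sharply percent three"]))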
Talking To Machines (Statistically Speaking)
"... Statistical methods have long been the dominant approach in speech recognition and probabilistic modelling in ASR is now a mature technology. The use of statistical methods in other areas of spoken dialogue is however more recent and rather less mature. This paper reviews spoken dialogue systems fro ..."
Abstract
-
Cited by 53 (19 self)
Statistical methods have long been the dominant approach in speech recognition, and probabilistic modelling in ASR is now a mature technology. The use of statistical methods in other areas of spoken dialogue is, however, more recent and rather less mature. This paper reviews spoken dialogue systems from a statistical modelling perspective. The complete system is first presented as a partially observable Markov decision process. The various sub-components are then exposed by introducing appropriate intermediate variables. Samples of existing work are reviewed within this framework, including dialogue control and optimisation, semantic interpretation, goal detection, natural language generation and synthesis.
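The POMDP framing mentioned in the abstract centres on maintaining a belief over hidden user goals and updating it from noisy observations. The sketch below shows a standard POMDP belief update on a two-goal toy domain; the goal set, transition matrix, and observation model are invented purely for illustration.

    # Toy POMDP belief tracking for a dialogue system (illustrative models only).
    import numpy as np

    goals = ["want_flight", "want_hotel"]

    T = np.array([[0.9, 0.1],        # P(goal' | goal): users rarely switch goals
                  [0.1, 0.9]])
    O = np.array([[0.8, 0.2],        # rows: observation ("flight", "hotel"),
                  [0.3, 0.7]])       # columns: P(observation | goal')

    def belief_update(belief, obs_index):
        """Predict with the transition model, weight by the observation model, renormalize."""
        predicted = T.T @ belief
        updated = O[obs_index] * predicted
        return updated / updated.sum()

    b = np.array([0.5, 0.5])         # start maximally uncertain about the goal
    for obs in [0, 0, 1]:            # hear "flight", "flight", then "hotel"
        b = belief_update(b, obs)
        print(dict(zip(goals, b.round(3))))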
Speaking with Hands: Creating Animated Conversational Characters from Recordings of Human Performance
2004
"... We describe a method for using a database of recorded speech and captured motion to create an animated conversational character. People's utterances are composed of short, clearly-delimited phrases; in each phrase, gesture and speech go together meaningfully and synchronize at a common point of ..."
Abstract
-
Cited by 53 (2 self)
We describe a method for using a database of recorded speech and captured motion to create an animated conversational character. People's utterances are composed of short, clearly-delimited phrases; in each phrase, gesture and speech go together meaningfully and synchronize at a common point of maximum emphasis. We develop tools for collecting and managing performance data that exploit this structure. The tools help create scripts for performers, help annotate and segment performance data, and structure specific messages for characters to use within application contexts. Our animations then reproduce this structure. They recombine motion samples with new speech samples to recreate coherent phrases, and blend segments of speech and motion together phrase-by-phrase into extended utterances. By framing problems for utterance generation and synthesis so that they can draw closely on a talented performance, our techniques support the rapid construction of animated characters with rich and appropriate expression.
Bootstrapping Lexical Choice via Multiple-Sequence Alignment
In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2002
"... An important component of any generation system is the mapping dictionary, a lexicon of elementary semantic expressions and corresponding natural language realizations. Typically, ..."
Abstract
-
Cited by 52 (4 self)
An important component of any generation system is the mapping dictionary, a lexicon of elementary semantic expressions and corresponding natural language realizations. Typically,
Learning Content Selection Rules for Generating Object Descriptions in Dialogue
Journal of Artificial Intelligence Research, 2005
"... A fundamental requirement of any task-oriented dialogue system is the ability to generate object descriptions that refer to objects in the task domain. The subproblem of content selection for object descriptions in task-oriented dialogue has been the focus of much previous work and a large number of ..."
Abstract
-
Cited by 50 (1 self)
A fundamental requirement of any task-oriented dialogue system is the ability to generate object descriptions that refer to objects in the task domain. The subproblem of content selection for object descriptions in task-oriented dialogue has been the focus of much previous work and a large number of models have been proposed. In this paper, we use the annotated coconut corpus of task-oriented design dialogues to develop feature sets based on Dale & Reiter’s incremental model, Brennan & Clark’s conceptual pact model, and Jordan’s intentional influences model, and use these feature sets in a machine learning experiment to automatically learn a model of content selection for object descriptions. Since Dale & Reiter’s model requires a representation of discourse structure, the corpus annotations are used to derive a representation based on Grosz & Sidner’s theory of the intentional structure of discourse, as well as two very simple representations of discourse structure based purely on recency. We then apply the rule-induction program ripper to train and test the content selection component of an object description generator on a set of 393 object descriptions from the corpus. To our knowledge, this is the first reported
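To make the learning setup concrete, the sketch below trains a small classifier to decide, per candidate attribute, whether to include it in an object description; each attribute is encoded with features loosely inspired by the models named above (recency of mention, discriminatory power, prior conceptual pact). scikit-learn's DecisionTreeClassifier stands in for the ripper rule inducer, and the feature encoding and toy labels are invented for illustration.

    # Toy content-selection learner (decision tree standing in for ripper).
    from sklearn.tree import DecisionTreeClassifier, export_text

    # One row per candidate attribute: [mentioned_recently, discriminates_referent,
    # part_of_prior_conceptual_pact], all encoded as 0/1.
    X = [
        [1, 1, 1], [0, 1, 0], [1, 0, 1], [0, 0, 0],
        [1, 1, 0], [0, 1, 1], [1, 0, 0], [0, 0, 1],
    ]
    # 1 = the human speaker included the attribute, 0 = omitted it (toy labels).
    y = [1, 1, 1, 0, 1, 1, 0, 0]

    clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
    print(export_text(clf, feature_names=["recent", "discriminates", "pact"]))
    print("include attribute?", bool(clf.predict([[0, 1, 0]])[0]))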
Explorations in sentence fusion
In Proceedings of the 10th European Workshop on Natural Language Generation, 2005
"... Sentence fusion is a text-to-text (revision-like) generation task which takes related sentences as input and merges these into a single output sentence. In this paper we describe our ongoing work on developing a sentence fusion module for Dutch. We propose a generalized version of alignment which no ..."
Abstract
-
Cited by 36 (4 self)
Sentence fusion is a text-to-text (revision-like) generation task which takes related sentences as input and merges these into a single output sentence. In this paper we describe our ongoing work on developing a sentence fusion module for Dutch. We propose a generalized version of alignment which not only indicates which words and phrases should be aligned but also labels these in terms of a small set of primitive semantic relations, indicating how words and phrases from the two input sentences relate to each other. It is shown that human labelers can perform this task with high agreement (F-score of .95). We then describe and evaluate our adaptation of an existing automatic alignment algorithm, and use the resulting alignments, plus the semantic labels, in a generalized fusion and generation algorithm. A small-scale evaluation study reveals that most of the resulting sentences are adequate to good.
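As a much-simplified sketch of the fusion step, the code below aligns two input sentences at the surface level, emits aligned ("equals") spans once, and keeps unaligned material from both inputs. The paper's alignment is semantic and labeled with a set of primitive relations; using word identity via difflib here is an illustrative stand-in only.

    # Toy surface-level sentence fusion (illustrative stand-in for the paper's
    # semantic, labeled alignment).
    from difflib import SequenceMatcher

    def fuse(sent_a: str, sent_b: str) -> str:
        a, b = sent_a.split(), sent_b.split()
        fused = []
        for op, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
            if op == "equal":
                fused.extend(a[i1:i2])             # aligned span: emit once
            else:
                fused.extend(a[i1:i2] + b[j1:j2])  # unaligned material from both
        return " ".join(fused)

    print(fuse("the minister resigned yesterday after the scandal",
               "the minister resigned and apologised yesterday"))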