SRILM—An extensible language modeling toolkit
- In Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP 2002)
, 2002
Cited by 449 (13 self)
SRILM is a collection of C++ libraries, executable programs, and helper scripts designed to allow both production of and experimentation with statistical language models for speech recognition and other applications. SRILM is freely available for noncommercial purposes. The toolkit supports creation and evaluation of a variety of language model types based on N-gram statistics, as well as several related tasks, such as statistical tagging and manipulation of N-best lists and word lattices. This paper summarizes the functionality of the toolkit and discusses its design and implementation, highlighting ease of rapid prototyping, reusability, and combinability of tools.
Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
Cited by 393 (4 self)
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and bio-sequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
Matching words and pictures
- JOURNAL OF MACHINE LEARNING RESEARCH
, 2003
Cited by 391 (33 self)
We present a new approach for modeling multi-modal data sets, focusing on the specific case of segmented images with associated text. Learning the joint distribution of image regions and words has many applications. We consider in detail predicting words associated with whole images (auto-annotation) and corresponding to particular image regions (region naming). Auto-annotation might help organize and access large collections of images. Region naming is a model of object recognition as a process of translating image regions to words, much as one might translate from one language to another. Learning the relationships between image regions and semantic correlates (words) is an interesting example of multi-modal data mining, particularly because it is typically hard to apply data mining techniques to collections of images. We develop a number of models for the joint distribution of image regions and words, including several which explicitly learn the correspondence between regions and words. We study multi-modal and correspondence extensions to Hofmann’s hierarchical clustering/aspect model, a translation model adapted from statistical machine translation (Brown et al.), and a multi-modal extension to mixture of latent Dirichlet allocation
Intelligent tutoring systems with conversational dialogue
- AI Magazine
, 2001
Cited by 51 (12 self)
This article presents the tutoring systems that we have been developing. AUTOTUTOR is a conversational agent, with a talking head, that helps college students learn about computer literacy. ANDES, ATLAS, and WHY2 help adults learn about physics. Instead of being mere information-delivery systems, our systems help students actively construct knowledge through conversations
Words with attitude
- In 1st International WordNet Conference
, 2002
Cited by 40 (3 self)
The traditional notion of word meaning used in natural language processing is literal or lexical meaning as used in dictionaries and lexicons. This relatively objective notion of lexical meaning is different from more subjective notions of emotive or affective meaning. Our aim is to come to grips with subjective aspects of meaning expressed in written texts, such as the attitude or value expressed in them. This paper explores how the structure of the WordNet lexical database might be used to assess affective or emotive meaning. In particular, we construct measures based on Osgood’s semantic differential technique. Suppose we can evaluate the subjective meaning expressed in a text. This would allow us to classify documents on subjective criteria, rather than on their factual content. There are several potential applications for such classifications, for example, providing summary statistics for search engines. Given the query “Crete travel review”, a search engine could report, “There are 1000 hits of which 3/4 is a positive review”. Another potential application is filtering “flames” for newsgroups.
Probabilistic Modeling in Psycholinguistics: Linguistic Comprehension and Production
- PROBABILISTIC LINGUISTICS
, 2003
A survey of statistical machine translation
, 2007
Cited by 30 (3 self)
Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and many popular techniques have only emerged within the last few years. This survey presents a tutorial overview of state-of-the-art SMT at the beginning of 2007. We begin with the context of the current research, and then move to a formal problem description and an overview of the four main subproblems: translational equivalence modeling, mathematical modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and notes on future directions.
Effects of disfluencies, predictability, and utterance position on word form variation in English conversation
, 2003
Cited by 29 (4 self)
Function words, especially frequently occurring ones such as (the, that, and, and of), vary widely in pronunciation. Understanding this variation is essential both for cognitive modeling of lexical production and for computer speech recognition and synthesis. This study investigates which factors affect the forms of function words, especially whether they have a fuller pronunciation or a more reduced or lenited pronunciation. It is based on over 8000 occurrences of the ten most frequent English function words in a four-hour sample from conversations from the Switchboard corpus. Ordinary linear and logistic regression models were used to examine variation in the length of the words, in the form of their vowel (basic, full, or reduced), and whether final obstruents were present or not. For all these measures, after controlling for segmental context, rate of speech, and other important factors, there are strong independent effects that made high-frequency monosyllabic function words more likely to be longer or have a fuller form (1) when neighboring disfluencies (such as filled pauses uh and um) indicate that the speaker was encountering problems in planning the utterance; (2) when the word is unexpected, i.e., less predictable in context; (3) when the word is either utterance-initial or utterance-final. Looking at the phenomenon in a different way, frequent function words are more likely to be shorter and to have less full forms in fluent speech, in predictable positions or multi-word collocations, and utterance-internally. Also considered are other factors such as sex (women are more likely to use fuller forms, even after controlling for rate of speech, for example), and some of the dif...
Learning User Interest Dynamics with a Three-Descriptor Representation
, 2000
Cited by 23 (2 self)
Learning users' interest categories is challenging in a dynamic environment like the Web because they change over time. This paper describes a novel scheme to represent a user's interest categories, and an adaptive algorithm to learn the dynamics of the user's interests through positive and negative relevance feedback. We propose a three-descriptor model to represent a user's interests. The proposed model maintains a long-term interest descriptor to capture the user's general interests and a short-term interest descriptor to keep track of the user's more recent, faster-changing interests. An algorithm based on the three-descriptor representation is developed to acquire high accuracy of recognition for long-term interests, and to adapt quickly to changing interests in the short-term. The model is also extended to multiple three-descriptor representations to capture a broader range of interests. Empirical studies confirm the effectiveness of this scheme to accurately model a user's inter...
THE POWER OF EXTENDED TOP-DOWN TREE TRANSDUCERS
Cited by 21 (13 self)
Extended top-down tree transducers (transducteurs généralisés descendants [Arnold, Dauchet: Bi-transductions de forêts. ICALP'76. Edinburgh University Press. 1976]) received renewed interest in the field of Natural Language Processing. Here those transducers are extensively and systematically studied. Their main properties are identified and their relation to classical top-down tree transducers is exactly characterized. The obtained properties completely explain the Hasse diagram of the induced classes of tree transformations. In addition, it is shown that most interesting classes of transformations computed by extended top-down tree transducers are not closed under composition.

