Results 1 - 10 of 270
Word Learning as Bayesian Inference
- In Proceedings of the 22nd Annual Conference of the Cognitive Science Society
, 2000
"... The authors present a Bayesian framework for understanding how adults and children learn the meanings of words. The theory explains how learners can generalize meaningfully from just one or a few positive examples of a novel word’s referents, by making rational inductive inferences that integrate pr ..."
Abstract
-
Cited by 175 (33 self)
- Add to MetaCart
(Show Context)
The authors present a Bayesian framework for understanding how adults and children learn the meanings of words. The theory explains how learners can generalize meaningfully from just one or a few positive examples of a novel word’s referents, by making rational inductive inferences that integrate prior knowledge about plausible word meanings with the statistical structure of the observed examples. The theory addresses shortcomings of the two best known approaches to modeling word learning, based on deductive hypothesis elimination and associative learning. Three experiments with adults and children test the Bayesian account’s predictions in the context of learning words for object categories at multiple levels of a taxonomic hierarchy. Results provide strong support for the Bayesian account over competing accounts, in terms of both quantitative model fits and the ability to explain important qualitative phenomena. Several extensions of the basic theory are discussed, illustrating the broader potential for Bayesian models of word learning.
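A minimal sketch of the kind of computation this abstract describes, assuming a toy taxonomic hypothesis space with invented extensions and prior values (none of the names or numbers below come from the paper): Bayesian scoring with a size principle lets a word generalize broadly after one example and narrow to the subordinate category after several consistent examples.

```python
# Illustrative sketch (not the paper's code) of Bayesian word learning with the
# "size principle": smaller hypotheses earn more credit per consistent example.
# Hypothesis names, extensions, and prior values are invented for illustration.

hypotheses = {
    "dalmatians": {"dal1", "dal2", "dal3"},                      # subordinate
    "dogs":       {"dal1", "dal2", "dal3", "poodle", "terrier"}, # basic level
    "animals":    {"dal1", "dal2", "dal3", "poodle", "terrier",
                   "cat", "pig", "toucan"},                      # superordinate
}
prior = {"dalmatians": 0.2, "dogs": 0.5, "animals": 0.3}         # assumed prior

def posterior(examples):
    """P(h | examples) with likelihood 1/|h| per example (size principle)."""
    scores = {}
    for h, extension in hypotheses.items():
        if all(x in extension for x in examples):
            scores[h] = prior[h] * (1.0 / len(extension)) ** len(examples)
        else:
            scores[h] = 0.0
    z = sum(scores.values())
    return {h: s / z for h, s in scores.items()}

def p_generalize(item, examples):
    """Probability that `item` falls under the new word, averaging over hypotheses."""
    post = posterior(examples)
    return sum(p for h, p in post.items() if item in hypotheses[h])

# One example leaves broad uncertainty; three subordinate examples sharpen it.
print(p_generalize("poodle", ["dal1"]))                  # ~0.67
print(p_generalize("poodle", ["dal1", "dal2", "dal3"]))  # ~0.38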
Grounded semantic composition for visual scenes
- Journal of Artificial Intelligence Research
, 2004
"... We present a visually-grounded language understanding model based on a study of how people verbally describe objects in scenes. The emphasis of the model is on the combination of individual word meanings to produce meanings for complex referring expressions. The model has been implemented, and it is ..."
Abstract
-
Cited by 105 (24 self)
- Add to MetaCart
(Show Context)
We present a visually-grounded language understanding model based on a study of how people verbally describe objects in scenes. The emphasis of the model is on the combination of individual word meanings to produce meanings for complex referring expressions. The model has been implemented, and it is able to understand a broad range of spatial referring expressions. We describe our implementation of word-level visually-grounded semantics and their embedding in a compositional parsing framework. The implemented system selects the correct referents in response to natural language expressions for a large percentage of test cases. In an analysis of the system’s successes and failures we reveal how visual context influences the semantics of utterances and propose future extensions to the model that take such context into account.
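One way to picture the composition described here, as a hedged sketch rather than the authors' implementation: treat each word's grounded meaning as a filter over scene objects and compose filters to resolve a referring expression. The toy scene, property names, and the left_of relation below are assumptions made only for illustration.

```python
# Minimal sketch, not the authors' system: word meanings as grounded filters
# over scene objects, composed to resolve a referring expression.
scene = [
    {"id": 1, "color": "green", "shape": "cone", "x": 0.2},
    {"id": 2, "color": "red",   "shape": "cone", "x": 0.6},
    {"id": 3, "color": "green", "shape": "cube", "x": 0.8},
]

def word(prop, value):
    """A word's grounded meaning: keep objects whose property matches."""
    return lambda objs: [o for o in objs if o[prop] == value]

def left_of(landmark_filter):
    """A relational meaning: keep objects left of any landmark candidate."""
    def apply(objs):
        landmarks = landmark_filter(scene)
        return [o for o in objs for lm in landmarks if o["x"] < lm["x"]]
    return apply

def compose(*filters):
    """Intersective composition: apply each filter in turn to the candidates."""
    def apply(objs):
        for f in filters:
            objs = f(objs)
        return objs
    return apply

# "the green cone to the left of the red cone"
referent = compose(word("color", "green"), word("shape", "cone"),
                   left_of(compose(word("color", "red"), word("shape", "cone"))))(scene)
print(referent)  # -> the object with id 1
```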
From the lexicon to expectations about kinds: a role for associative learning
- Psychological Review
, 2005
"... In the novel noun generalization task, 2 1/2-year-old children display generalized expectations about how solid and nonsolid things are named, extending names for never-before-encountered solids by shape and for never-before-encountered nonsolids by material.This distinction between solids and nonso ..."
Abstract
-
Cited by 101 (32 self)
- Add to MetaCart
(Show Context)
In the novel noun generalization task, 2 1/2-year-old children display generalized expectations about how solid and nonsolid things are named, extending names for never-before-encountered solids by shape and for never-before-encountered nonsolids by material. This distinction between solids and nonsolids has been interpreted in terms of an ontological distinction between objects and substances. Nine simulations and behavioral experiments tested the hypothesis that these expectations arise from the correlations characterizing early learned noun categories. In the simulation studies, connectionist networks were trained on noun vocabularies modeled after those of children. These networks formed generalized expectations about solids and nonsolids that match children’s performances in the novel noun generalization task in the very different languages of English and Japanese. The simulations also generate new predictions supported by new experiments with children. Implications are discussed in terms of children’s development of distinctions between kinds of categories and in terms of the nature of this knowledge. Concepts are hypothetical constructs, theoretical devices hypothesized to explain data, what people do, and what people say. The question of whether a particular theory can explain children’s concepts is therefore semantically strange because strictly speaking this question asks about an explanation of an explanation. We begin with this reminder because the goal of the research reported here is to understand the role of associative processes in children’s systematic attention to the shape of solid things and to the material of nonsolid things in the task of forming new lexical categories. These attentional biases have been interpreted in terms of children’s concepts about the ontological kinds of object and substance
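A much-simplified stand-in for the paper's connectionist simulations, illustrating only the correlational premise: in a synthetic vocabulary built so that solids share names by shape and nonsolids by material, the relevant correlations are sitting there for any associative learner to pick up. The data generator and noise level below are assumptions, not the paper's training vocabularies.

```python
# Simplified stand-in for the connectionist simulations: measure, in a synthetic
# "early noun vocabulary", how same-name pairs co-vary with shape for solids and
# with material for nonsolids. The generator below is an assumption for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 5000
solid = rng.integers(0, 2, n)            # 1 = a pair of solid exemplars
shape_match = rng.integers(0, 2, n)
material_match = rng.integers(0, 2, n)
# regularity of early nouns: solids share names by shape, nonsolids by material
same_name = np.where(solid == 1, shape_match, material_match)
noise = rng.random(n) < 0.1              # 10% exceptions
same_name = np.where(noise, 1 - same_name, same_name)

for is_solid in (1, 0):
    mask = solid == is_solid
    r_shape = np.corrcoef(shape_match[mask], same_name[mask])[0, 1]
    r_material = np.corrcoef(material_match[mask], same_name[mask])[0, 1]
    print("solid" if is_solid else "nonsolid",
          "shape r=%.2f  material r=%.2f" % (r_shape, r_material))
# A learner sensitive to these correlations would extend names for novel solids
# by shape and for novel nonsolids by material.
```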
Semiotic Schemas: A Framework for Grounding Language in Action and Perception
, 2005
"... A theoretical framework for grounding language is introduced that provides a computational path from sensing and motor action to words and speech acts. The approach combines concepts from semiotics and schema theory to develop a holistic approach to linguistic meaning. Schemas serve as structured be ..."
Abstract
-
Cited by 100 (11 self)
- Add to MetaCart
A theoretical framework for grounding language is introduced that provides a computational path from sensing and motor action to words and speech acts. The approach combines concepts from semiotics and schema theory to develop a holistic approach to linguistic meaning. Schemas serve as structured beliefs that are grounded in an agent’s physical environment through a causal-predictive cycle of action and perception. Words and basic speech acts are interpreted in terms of grounded schemas. The framework reflects lessons learned from implementations of several language processing robots. It provides a basis for the analysis and design of situated, multimodal communication systems that straddle symbolic and non-symbolic realms.
Towards unsupervised pattern discovery in speech
, 2008
"... We present a novel approach to speech processing based on the principle of pattern discovery. Our work represents a departure from traditional models of speech recognition, where the end goal is to classify speech into categories defined by a prespecified inventory of lexical units (i.e., phones or ..."
Abstract
-
Cited by 78 (10 self)
- Add to MetaCart
We present a novel approach to speech processing based on the principle of pattern discovery. Our work represents a departure from traditional models of speech recognition, where the end goal is to classify speech into categories defined by a prespecified inventory of lexical units (i.e., phones or words). Instead, we attempt to discover such an inventory in an unsupervised manner by exploiting the structure of repeating patterns within the speech signal. We show how pattern discovery can be used to automatically acquire lexical entities directly from an untranscribed audio stream. Our approach to unsupervised word acquisition utilizes a segmental variant of a widely used dynamic programming technique, which allows us to find matching acoustic patterns between spoken utterances. By aggregating information about these matching patterns across audio streams, we demonstrate how to group similar acoustic sequences together to form clusters corresponding to lexical entities such as words and short multiword phrases. On a corpus of academic lecture material, we demonstrate that clusters found using this technique exhibit high purity and that many of the corresponding lexical identities are relevant to the underlying audio stream.
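A toy sketch of the matching step this abstract relies on, assuming generic frame-level feature vectors (e.g., MFCC-like): classic dynamic time warping scores candidate fragment pairs, and low-cost pairs are greedily grouped into clusters. The paper's system uses a segmental DTW variant and operates at much larger scale; the fragments and threshold below are illustrative only.

```python
# Toy version of pattern matching and clustering; the paper uses a segmental
# DTW variant, which this sketch does not reproduce.
import numpy as np

def dtw_cost(a, b):
    """Average per-step DTW alignment cost between two feature sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)

def cluster_matches(fragments, threshold=0.5):
    """Greedy grouping: fragments close (in DTW cost) to a cluster's exemplar
    are treated as realizations of the same lexical entity."""
    clusters = []
    for frag in fragments:
        for cluster in clusters:
            if dtw_cost(frag, cluster[0]) < threshold:
                cluster.append(frag)
                break
        else:
            clusters.append([frag])
    return clusters

# Toy "utterance fragments": two noisy repetitions of one pattern, one unrelated.
rng = np.random.default_rng(0)
base = rng.normal(size=(20, 13))
frags = [base + 0.05 * rng.normal(size=base.shape),
         base + 0.05 * rng.normal(size=base.shape),
         rng.normal(size=(18, 13))]
print([len(c) for c in cluster_matches(frags)])  # -> [2, 1]
```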
Validating Human–Robot Interaction Schemes in Multitasking Environments
, 2005
"... The ability of robots to autonomously perform tasks is increasing. More autonomy in robots means that the human managing the robot may have available free time. It is desirable to use this free time productively, and a current trend is to use this available free time to manage multiple robots. We pr ..."
Abstract
-
Cited by 76 (11 self)
- Add to MetaCart
The ability of robots to autonomously perform tasks is increasing. More autonomy in robots means that the human managing the robot may have available free time. It is desirable to use this free time productively, and a current trend is to use this available free time to manage multiple robots. We present the notion of neglect tolerance as a means for determining how robot autonomy and interface design determine how free time can be used to support multitasking, in general, and multirobot teams, in particular. We use neglect tolerance to 1) identify the maximum number of robots that can be managed; 2) identify feasible configurations of multirobot teams; and 3) predict performance of multirobot teams under certain independence assumptions. We present a measurement methodology, based on a secondary task paradigm, for obtaining neglect tolerance values that allow a human to balance workload with robot performance.
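A back-of-the-envelope reading of "how many robots can be managed", assuming the round-robin fan-out relation often quoted in this literature (team size ≈ neglect time / interaction time + 1); the paper's measurement methodology is considerably more careful, and the numbers below are invented.

```python
# Back-of-the-envelope sketch, not the paper's model: if a robot can be safely
# neglected for NT seconds after each interaction, and each interaction takes
# IT seconds, a rough upper bound on team size is NT / IT + 1.

def fan_out(neglect_time_s: float, interaction_time_s: float) -> float:
    """Approximate number of robots one operator can service in round-robin."""
    return neglect_time_s / interaction_time_s + 1.0

print(fan_out(neglect_time_s=40.0, interaction_time_s=10.0))  # -> 5.0 robots
```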
Integrating experiential and distributional data to learn semantic representations
- Psychological Review
, 2009
"... The authors identify 2 major types of statistical data from which semantic representations can be learned. These are denoted as experiential data and distributional data. Experiential data are derived by way of experience with the physical world and comprise the sensory-motor data obtained through s ..."
Abstract
-
Cited by 68 (4 self)
- Add to MetaCart
The authors identify 2 major types of statistical data from which semantic representations can be learned. These are denoted as experiential data and distributional data. Experiential data are derived by way of experience with the physical world and comprise the sensory-motor data obtained through sense receptors. Distributional data, by contrast, describe the statistical distribution of words across spoken and written language. The authors claim that experiential and distributional data represent distinct data types and that each is a nontrivial source of semantic information. Their theoretical proposal is that human semantic representations are derived from an optimal statistical combination of these 2 data types. Using a Bayesian probabilistic model, they demonstrate how word meanings can be learned by treating experiential and distributional data as a single joint distribution and learning the statistical structure that underlies it. The semantic representations that are learned in this manner are measurably more realistic—as verified by comparison to a set of human-based measures of semantic representation—than those available from either data type individually or from both sources independently. This is not a result of merely using quantitatively more data, but rather it is because experiential and distributional data are qualitatively distinct, yet intercorrelated, types of data. The semantic representations that are learned are based on statistical structures that exist both within and between the experiential and distributional data types.
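A toy illustration of the central claim, not the authors' Bayesian model: when two noisy "views" of word meaning reflect the same underlying classes, combining them recovers that structure better than either view alone. The class structure, noise levels, and nearest-neighbour test below are assumptions chosen only to make the point concrete.

```python
# Toy illustration of combining experiential and distributional evidence;
# all data here are synthetic and the test is a simple nearest-neighbour check.
import numpy as np

rng = np.random.default_rng(0)
n_classes, per_class, dim = 5, 20, 40
labels = np.repeat(np.arange(n_classes), per_class)
prototypes = rng.normal(size=(n_classes, dim))

def noisy_view(noise):
    """A noisy observation of each word's latent class prototype."""
    return prototypes[labels] + noise * rng.normal(size=(len(labels), dim))

experiential = noisy_view(2.0)      # noisy sensory-motor evidence (toy)
distributional = noisy_view(2.0)    # noisy corpus co-occurrence evidence (toy)

def neighbour_accuracy(X):
    """Fraction of words whose nearest neighbour shares their latent class."""
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    sims = X @ X.T
    np.fill_diagonal(sims, -np.inf)
    return float(np.mean(labels[sims.argmax(1)] == labels))

print("experiential only  :", neighbour_accuracy(experiential))
print("distributional only:", neighbour_accuracy(distributional))
print("combined           :", neighbour_accuracy(np.hstack([experiential, distributional])))
```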
Developmental robotics: Theory and experiments
- International Journal of Humanoid Robotics
, 2004
"... A hand-designed internal representation of the world cannot deal with unknown or uncontrolled environments. Motivated by human cognitive and behavioral development, this paper presents a theory, an architecture, and some experimental results for developmental robotics. By a developmental robot, we m ..."
Abstract
-
Cited by 62 (11 self)
- Add to MetaCart
A hand-designed internal representation of the world cannot deal with unknown or uncontrolled environments. Motivated by human cognitive and behavioral development, this paper presents a theory, an architecture, and some experimental results for developmental robotics. By a developmental robot, we mean that the robot generates its “brain” (or “central nervous system,” including the information processor and controller) through online, real-time interactions with its environment (including humans). A new Self-Aware Self-Effecting (SASE) agent concept is proposed, based on our SAIL and Dav developmental robots. The manual and autonomous development paradigms are formulated along with a theory of representation suited for autonomous development. Unlike traditional robot learning, the tasks that a developmental robot ends up learning are unknown during the programming time so that the task-specific representation must be generated and updated through real-time “living” experiences. Experimental results with SAIL and Dav developmental robots are presented, including visual attention selection, autonomous navigation, developmental speech learning, range-based obstacle avoidance, and scaffolding through transfer and chaining.
The Challenges of Joint Attention
- Interaction Studies
, 2004
"... This paper discusses the concept of joint attention and the di#erent skills underlying its development. We argue that joint attention is much more than gaze following or simultaneous looking because it implies a shared intentional relation to the world. The current state-of-the-art in robotic ..."
Abstract
-
Cited by 62 (7 self)
- Add to MetaCart
(Show Context)
This paper discusses the concept of joint attention and the different skills underlying its development. We argue that joint attention is much more than gaze following or simultaneous looking because it implies a shared intentional relation to the world. The current state-of-the-art in robotic and computational models of the different prerequisites of joint attention is discussed in relation to a developmental timeline drawn from results in child studies.
Mental Imagery for a Conversational Robot
, 2004
"... To build robots that engage in fluid face-to-face spoken conversations with people, robots must have ways to connect what they say to what they see. A critical aspect of how language connects to vision is that language encodes points of view. The meaning of my left and your left differs due to an im ..."
Abstract
-
Cited by 59 (21 self)
- Add to MetaCart
To build robots that engage in fluid face-to-face spoken conversations with people, robots must have ways to connect what they say to what they see. A critical aspect of how language connects to vision is that language encodes points of view. The meaning of “my left” and “your left” differs due to an implied shift of visual perspective. The connection of language to vision also relies on object permanence. We can talk about things that are not in view. For a robot to participate in situated spoken dialog, it must have the capacity to imagine shifts of perspective, and it must maintain object permanence. We present a set of representations and procedures that enable a robotic manipulator to maintain a “mental model” of its physical environment by coupling active vision to physical simulation. Within this model, “imagined” views can be generated from arbitrary perspectives, providing the basis for situated language comprehension and production. An initial application of mental imagery for spatial language understanding for an interactive robot is described.
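A minimal sketch of the perspective-shift part of this story, assuming 2-D poses and an "x forward, y left" frame convention (neither of which is specified here): re-expressing an object's world position in the human partner's frame makes "my left" and "your left" come apart.

```python
# Minimal sketch, assuming 2-D poses: the same object is on the robot's left
# but on the human partner's right once coordinates are re-expressed in the
# partner's frame. Frame conventions (x forward, y left) are assumptions.
import numpy as np

def to_frame(point_world, frame_pos, frame_heading):
    """World coordinates -> coordinates of a frame at frame_pos facing frame_heading."""
    c, s = np.cos(frame_heading), np.sin(frame_heading)
    rot = np.array([[c, s], [-s, c]])            # world -> frame rotation
    return rot @ (np.asarray(point_world) - np.asarray(frame_pos))

def side(point_local):
    return "left" if point_local[1] > 0 else "right"

obj = [1.0, 0.5]                                  # object in world coordinates
robot_pos, robot_heading = [0.0, 0.0], 0.0        # robot at origin, facing +x
human_pos, human_heading = [2.0, 0.0], np.pi      # human opposite, facing the robot

print("my (robot's) side  :", side(to_frame(obj, robot_pos, robot_heading)))   # left
print("your (human's) side:", side(to_frame(obj, human_pos, human_heading)))   # right
```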