Results 1 - 10 of 162
Grounded semantic composition for visual scenes
- Journal of Artificial Intelligence Research, 2004
"... We present a visually-grounded language understanding model based on a study of how people verbally describe objects in scenes. The emphasis of the model is on the combination of individual word meanings to produce meanings for complex referring expressions. The model has been implemented, and it is ..."
Cited by 105 (24 self)
We present a visually-grounded language understanding model based on a study of how people verbally describe objects in scenes. The emphasis of the model is on the combination of individual word meanings to produce meanings for complex referring expressions. The model has been implemented, and it is able to understand a broad range of spatial referring expressions. We describe our implementation of word-level visually-grounded semantics and their embedding in a compositional parsing framework. The implemented system selects the correct referents in response to natural language expressions for a large percentage of test cases. In an analysis of the system’s successes and failures we reveal how visual context influences the semantics of utterances and propose future extensions to the model that take such context into account.
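As a rough illustration of the compositional idea, the sketch below treats each word meaning as a function that scores the objects in a visual scene and composes the scores multiplicatively to pick a referent. This is a minimal sketch, not the paper’s model: the object features, grounding functions, and composition rule are all invented for illustration.

```python
# Toy grounded composition: word meanings score scene objects; a phrase's
# meaning is the elementwise product of its words' scores. All names and
# scoring rules here are invented for the sketch.
from dataclasses import dataclass

@dataclass
class Obj:
    color: str
    x: float  # horizontal position, 0 = far left, 1 = far right

def green(obj):
    """Grounded word meaning: how well 'green' fits this object."""
    return 1.0 if obj.color == "green" else 0.0

def leftmost(objs):
    """Grounded 'leftmost': graded by position relative to the whole scene."""
    min_x = min(o.x for o in objs)
    return lambda obj: 1.0 if obj.x == min_x else 0.2

scene = [Obj("green", 0.1), Obj("red", 0.3), Obj("green", 0.9)]
word_scores = [
    [green(o) for o in scene],              # "green"
    [leftmost(scene)(o) for o in scene],    # "leftmost"
]
phrase = [a * b for a, b in zip(*word_scores)]  # compose word meanings
best = max(range(len(scene)), key=lambda i: phrase[i])
print(scene[best])  # -> Obj(color='green', x=0.1), "the leftmost green one"
```

Even this toy shows where context matters: "leftmost" is evaluated relative to the whole scene rather than to the green objects only, exactly the kind of context sensitivity the paper’s analysis turns up.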
Becoming Syntactic
- Psychological Review, 2006
"... Psycholinguistic research has shown that the influence of abstract syntactic knowledge on performance is shaped by particular sentences that have been experienced. To explore this idea, the authors applied a connectionist model of sentence production to the development and use of abstract syntax. Th ..."
Cited by 96 (6 self)
Psycholinguistic research has shown that the influence of abstract syntactic knowledge on performance is shaped by particular sentences that have been experienced. To explore this idea, the authors applied a connectionist model of sentence production to the development and use of abstract syntax. The model makes use of (a) error-based learning to acquire and adapt sequencing mechanisms and (b) meaning–form mappings to derive syntactic representations. The model is able to account for most of what is known about structural priming in adult speakers, as well as key findings in preferential looking and elicited production studies of language acquisition. The model suggests how abstract knowledge and concrete experience are balanced in the development and use of syntax.
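A minimal sketch of the error-based learning ingredient, assuming a toy next-word prediction task in an Elman-style simple recurrent network. The corpus, layer sizes, and learning rate are invented, and only the output weights are trained; the paper’s model is far richer (meaning-form mappings, a sequencing system, developmental data).

```python
# Error-based sequence learning in a tiny Elman-style recurrent network:
# predict the next word, compare with what actually occurs, and adjust
# weights in proportion to the prediction error. Toy setup, not the
# paper's architecture.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "dog", "chases", "cat", "."]
V, H = len(vocab), 16
Wxh = rng.normal(0, 0.1, (H, V))  # input -> hidden
Whh = rng.normal(0, 0.1, (H, H))  # previous hidden -> hidden (context)
Who = rng.normal(0, 0.1, (V, H))  # hidden -> next-word prediction
lr = 0.1

def one_hot(i):
    v = np.zeros(V)
    v[i] = 1.0
    return v

sent = [vocab.index(w) for w in ["the", "dog", "chases", "the", "cat", "."]]

for _ in range(200):
    h = np.zeros(H)
    for t in range(len(sent) - 1):
        h = np.tanh(Wxh @ one_hot(sent[t]) + Whh @ h)
        logits = Who @ h
        p = np.exp(logits - logits.max())
        p /= p.sum()
        err = p - one_hot(sent[t + 1])  # prediction error drives learning
        Who -= lr * np.outer(err, h)    # delta-rule update of output weights
        # (A full model would also propagate error into Wxh and Whh.)

h = np.zeros(H)
for t in range(3):                      # feed "the dog chases"
    h = np.tanh(Wxh @ one_hot(sent[t]) + Whh @ h)
print(vocab[int(np.argmax(Who @ h))])   # likely "the": sequencing learned
```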
Looking to understand: The coupling between speakers’ and listeners’ eye movements and its relationship to discourse comprehension
- Cognitive Science, 2005
"... While their eye movements were being recorded, participants spoke extemporaneously about a TV show whose cast members they were viewing. Later, other participants listened to these speeches while their eyes were tracked. Within this naturalistic paradigm using spontaneous speech, a number of results ..."
Cited by 83 (14 self)
While their eye movements were being recorded, participants spoke extemporaneously about a TV show whose cast members they were viewing. Later, other participants listened to these speeches while their eyes were tracked. Within this naturalistic paradigm using spontaneous speech, a number of results linking eye movements to speech comprehension, speech production and memory were replicated. More importantly, a cross-recurrence analysis demonstrated that speaker and listener eye movements were coupled, and that the strength of this relationship positively correlated with listeners’ comprehension. Just as the mental state of a single person can be reflected in patterns of eye movements, the commonality of mental states that is brought about by successful communication is mirrored in a similarity between speakers’ and listeners’ eye movements.
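A minimal sketch of the categorical cross-recurrence measure such an analysis relies on: for each lag, the proportion of samples at which speaker and listener fixate the same region once one stream is shifted by that lag. The toy gaze streams and the six-region coding are invented.

```python
# Cross-recurrence between two categorical gaze streams (region codes
# sampled at a fixed rate). Invented data: the listener roughly follows
# the speaker with a 2-sample delay plus noise.
import numpy as np

def cross_recurrence(speaker, listener, max_lag):
    """Proportion of same-region samples when the listener stream is
    shifted by each lag in [-max_lag, max_lag]."""
    n = min(len(speaker), len(listener))
    rates = {}
    for lag in range(-max_lag, max_lag + 1):
        pairs = [(t, t + lag) for t in range(n) if 0 <= t + lag < n]
        rates[lag] = sum(speaker[t] == listener[u] for t, u in pairs) / len(pairs)
    return rates

rng = np.random.default_rng(1)
speaker = rng.integers(0, 6, size=500)        # 6 cast-member regions
listener = np.roll(speaker, 2)                # follow with a 2-sample lag
noisy = rng.random(500) < 0.3                 # ...except 30% random looks
listener[noisy] = rng.integers(0, 6, size=noisy.sum())

rates = cross_recurrence(speaker, listener, max_lag=5)
print(max(rates, key=rates.get))  # -> 2: recurrence peaks where listener lags
```

A recurrence profile that peaks at a positive lag is the signature reported above: listeners’ eye movements follow speakers’ with a delay, and per the abstract the strength of that coupling tracks comprehension.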
Eye Movements and Spoken Language Comprehension: Effects of Visual Context on Syntactic Ambiguity Resolution
- Cognitive Psychology, 2002
"... ..."
(Show Context)
Eyetracking and selective attention in category learning
- Cognitive Psychology, 2005
"... An eyetracking version of the classic Shepard, Hovland and Jenkins (1961) experiment was conducted. Forty years of research has assumed that category learning includes learning how to selectively attend to only those stimulus dimensions useful for classification. We confirmed that par-ticipants lear ..."
Cited by 59 (10 self)
An eyetracking version of the classic Shepard, Hovland, and Jenkins (1961) experiment was conducted. Forty years of research has assumed that category learning includes learning how to selectively attend to only those stimulus dimensions useful for classification. We confirmed that participants learned to allocate their attention optimally. However, we also found that neither associationist accounts of gradual learning nor hypothesis-testing accounts accurately predicted the pattern of eye movements leading up to successful learning. The implications of these results, and of the use of eyetracking technology more generally, for categorization theory are discussed.
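One simple way to operationalize "optimal allocation of attention" from fixation data is the share of fixation time spent on the task-relevant stimulus dimensions, tracked across learning blocks. The sketch below illustrates the measure on invented numbers (in the Shepard et al. Type I problem, a single dimension suffices for classification).

```python
# Share of fixation time on task-relevant dimensions, per learning block.
# Fixation log format and all durations are invented for illustration.
relevant = {"size"}  # Type I: one dimension suffices
blocks = [           # total fixation ms per dimension in each block
    {"size": 400, "color": 380, "shape": 370},  # early: attention spread out
    {"size": 620, "color": 250, "shape": 180},
    {"size": 900, "color": 90, "shape": 60},    # late: near-optimal selectivity
]
for i, fix in enumerate(blocks, 1):
    share = sum(fix[d] for d in relevant) / sum(fix.values())
    print(f"block {i}: {share:.0%} of fixation time on relevant dimensions")
```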
Conversing with the user based on eye-gaze patterns
- In Proceedings of CHI ’05, 2005
"... Motivated by and grounded in observations of eye-gaze patterns in human-human dialogue, this study explores using eye-gaze patterns in managing human-computer dialogue. We developed an interactive system, iTourist, for city trip planning, which encapsulated knowledge of eye-gaze patterns gained from ..."
Cited by 52 (0 self)
Motivated by and grounded in observations of eye-gaze patterns in human-human dialogue, this study explores using eye-gaze patterns in managing human-computer dialogue. We developed an interactive system, iTourist, for city trip planning, which encapsulated knowledge of eye-gaze patterns gained from studies of human-human collaboration systems. User study results show that it was possible to sense users’ interest based on eye-gaze patterns and manage computer information output accordingly. Study participants could successfully plan their trip with iTourist and positively rated their experience of using it. We demonstrate that eye gaze could play an important role in managing future multimodal human-computer dialogues.
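A minimal sketch of the general mechanism such a system needs, not iTourist’s actual algorithm: accumulate a per-object interest score from fixations, let all scores decay over time, and volunteer information once a score crosses a threshold. The scoring rule and every constant below are invented.

```python
# Gaze-driven output management: fixated objects gain interest, everything
# decays, and crossing a threshold triggers spoken output. Toy constants.
DECAY, GAIN, THRESHOLD = 0.95, 10.0, 8.0

def update(scores, fixated, dt):
    """Advance all interest scores by one gaze sample of duration dt (s)."""
    for obj in scores:
        scores[obj] *= DECAY ** dt          # interest fades without gaze
    if fixated is not None:
        scores[fixated] += GAIN * dt        # fixated object gains interest
    return [o for o, s in scores.items() if s > THRESHOLD]

scores = {"Old Town": 0.0, "Harbor": 0.0, "Museum": 0.0}
stream = ["Harbor"] * 60 + ["Museum"] * 5 + ["Harbor"] * 40  # 0.1 s samples
for sample in stream:
    for place in update(scores, sample, 0.1):
        print("talk about:", place)  # system volunteers information
        scores[place] = 0.0          # reset so it does not repeat immediately
```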
Structural Priming as Implicit Learning: A Comparison of Models of Sentence Production
"... Structural priming reflects a tendency to generalize recently spoken or heard syntactic structures to different utterances. We propose that it is a form of implicit learning. To explore this hypothesis, we developed and tested a connectionist model of language production that incorporated mechanisms ..."
Cited by 51 (13 self)
Structural priming reflects a tendency to generalize recently spoken or heard syntactic structures to different utterances. We propose that it is a form of implicit learning. To explore this hypothesis, we developed and tested a connectionist model of language production that incorporated mechanisms previously used to simulate implicit learning. In the model, the mechanism that learned to produce structured sequences of phrases from messages also exhibited structural priming. The ability of the model to account for structural priming depended on representational assumptions about the nature of messages and the relationship between comprehension and production. Modeling experiments showed that comprehension-based representations were important for the model's generalizations in production, and that non-atomic message representations allowed a better fit to existing data on structural priming than traditional thematic-role representations.
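The implicit-learning claim reduces to one mechanism: comprehending a prime applies a small, error-driven weight change that persists into later production, rather than a transient activation boost. The sketch below makes that concrete with an invented two-alternative choice between double-object (DO) and prepositional (PO) datives.

```python
# Structural priming as a lasting error-driven weight update on a toy
# two-way structural choice. All numbers are invented.
import math

LR = 0.5

def p_do(w):
    """Probability of producing a double-object dative given weight w."""
    return 1 / (1 + math.exp(-w))

def comprehend(w, heard):
    """Error-based update from hearing a prime: shift toward what occurred."""
    target = 1.0 if heard == "DO" else 0.0
    return w + LR * (target - p_do(w))

w = 0.0
print(f"baseline        P(DO) = {p_do(w):.2f}")   # 0.50
w = comprehend(w, "PO")                            # hear one PO prime
print(f"after PO prime  P(DO) = {p_do(w):.2f}")   # persistently below 0.50
```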
Symbolically speaking: a connectionist model of sentence production
- Cognitive Science, 2002
"... The ability to combine words into novel sentences has been used to argue that humans have symbolic language production abilities. Critiques of connectionist models of language often center on the inability of these models to generalize symbolically (Fodor & Pylyshyn, 1988; Marcus, 1998). To addr ..."
Cited by 48 (6 self)
The ability to combine words into novel sentences has been used to argue that humans have symbolic language production abilities. Critiques of connectionist models of language often center on the inability of these models to generalize symbolically (Fodor & Pylyshyn, 1988; Marcus, 1998). To address these issues, a connectionist model of sentence production was developed. The model had variables (role-concept bindings) that were inspired by spatial representations (Landau & Jackendoff, 1993). In order to take advantage of these variables, a novel dual-pathway architecture with event semantics is proposed and shown to be better at symbolic generalization than several variants. This architecture has one pathway for mapping message content to words and a separate pathway that enforces sequencing constraints. Analysis of the model’s hidden units demonstrated that the model learned different types of information in each pathway, and that the model’s compositional behavior arose from the combination of these two pathways. The model’s ability to balance symbolic and statistical behavior in syntax acquisition and to model aphasic double dissociations provided independent support for the dual-pathway architecture.
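A minimal sketch of the dual-pathway separation described above, reduced to lookups: one pathway maps message roles to words regardless of order, while a separate pathway supplies the ordering frame. The message format and the fixed SVO frame are invented.

```python
# Two pathways: meaning (role -> word, order-blind) and sequencing
# (step -> role, content-blind). Production combines them. Toy setup.
def produce(message, frame=("AGENT", "ACTION", "PATIENT")):
    # The sequencing pathway picks the role for each step; the meaning
    # pathway fills it with the currently bound concept.
    return " ".join(message[role] for role in frame)

print(produce({"AGENT": "dog", "ACTION": "chases", "PATIENT": "cat"}))
# Novel role-concept bindings reuse the same sequencing knowledge:
print(produce({"AGENT": "cat", "ACTION": "sees", "PATIENT": "dog"}))
```

Because the sequencing pathway never sees specific concepts, novel role-concept bindings generalize without retraining, which is the sense in which the architecture behaves symbolically.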
Does language guide event perception? Evidence from eye movements
- Cognition, 2008
"... Languages differ in how they encode motion. When describing bounded motion, English speakers typically use verbs that convey information about manner (e.g., slide, skip, walk) rather than path (e.g., approach, ascend), whereas Greek speakers do the opposite. We investigated whether this strong cross ..."
Cited by 36 (9 self)
Languages differ in how they encode motion. When describing bounded motion, English speakers typically use verbs that convey information about manner (e.g., slide, skip, walk) rather than path (e.g., approach, ascend), whereas Greek speakers do the opposite. We investigated whether this strong cross-language difference influences how people allocate attention during motion perception. We compared eye movements from Greek and English speakers as they viewed motion events while (a) preparing verbal descriptions, or (b) memorizing the events. We found that in the verbal description task, speakers’ eyes rapidly focus on the event components typically encoded in their native language, generating significant cross-language differences even during the first second of motion onset. However, when freely inspecting ongoing events, as in the memorization task, people allocate attention similarly regardless of the language they speak. Differences between language groups arose only after the motion stopped, such that participants spontaneously studied those aspects of the scene that their language does not routinely encode. The findings offer a novel perspective on the relation between language and perceptual/cognitive processes. Specifically, they indicate that attention allocation during event perception depends on the goals of the perceiver; effects of one’s native language arise only when linguistic forms are recruited to achieve the task.
Word length effects in object naming: The role of a response criterion
- Journal of Memory & Language, 2003
"... According to Levelt, Roelofs, and Meyer (1999) speakers generate the phonological and phonetic representations of successive syllables of a word in sequence and only begin to speak after having fully planned at least one complete phonological word. Therefore, speech onset latencies should be longer ..."
Cited by 32 (13 self)
According to Levelt, Roelofs, and Meyer (1999), speakers generate the phonological and phonetic representations of successive syllables of a word in sequence and only begin to speak after having fully planned at least one complete phonological word. Therefore, speech onset latencies should be longer for long than for short words. We tested this prediction in four experiments in which Dutch participants named or categorized objects with monosyllabic or disyllabic names. Experiment 1 yielded a length effect on production latencies when objects with long and short names were tested in separate blocks, but not when they were mixed. Experiment 2 showed that the length effect was not due to a difference in the ease of object recognition. Experiment 3 replicated the results of Experiment 1 using a within-participants design. In Experiment 4, the long and short target words appeared in a phrasal context. In addition to the speech onset latencies, we obtained the viewing times for the target objects, which have been shown to depend on the time necessary to plan the form of the target names. We found word length effects for both dependent variables, but only when objects with short and long names were presented in separate blocks. We argue that in pure and mixed blocks speakers used different response deadlines, which they tried to meet by either generating the motor programs for one syllable or for all syllables of the word before speech onset. Computer simulations using WEAVER++ support this view.
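A minimal sketch of the response-deadline account in the closing sentences, not the WEAVER++ simulations themselves: onset latency depends on how many syllables are planned before speaking, and that criterion differs between pure and mixed blocks. All timing constants are invented.

```python
# Speech onset under two response criteria: plan all syllables before
# speaking (pure blocks) vs. plan only the first (mixed blocks).
BASE = 600              # ms: recognition + planning the first syllable
PER_SYLLABLE = 80       # ms: planning each additional syllable (invented)

def onset_latency(n_syllables, plan_all):
    extra = (n_syllables - 1) * PER_SYLLABLE if plan_all else 0
    return BASE + extra

for block, plan_all in [("pure", True), ("mixed", False)]:
    short = onset_latency(1, plan_all)   # monosyllabic name
    long_ = onset_latency(2, plan_all)   # disyllabic name
    print(f"{block} blocks: length effect = {long_ - short} ms")
```

The toy reproduces the qualitative pattern described above: a length effect in pure blocks (80 ms here) and none in mixed blocks.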