One of the key ideas in both robotics and neuroscience is that complex behaviour can arise from the interaction of many cooperating simple agents or modules. In this paper we suggest that this idea can be extended: just as combining simple agents may be important for complex behaviour, combining tasks is important for learning the parts themselves. In particular, we show that combining classifications across different modalities can help solve the teaching signal dilemma and allow the development of task-relevant classifications without external supervision. We review psychophysical and neurobiological data supporting the idea that information from one modality can assist (or interfere with) classification in another, and describe a neural network algorithm that exploits the structure between the pattern distributions of different sensory modalities to eliminate the need for a teaching signal when training each network. The algorithm is demonstrated on the problem of learning to recognize speech both acoustically and visually. Simultaneous presentation of moving mouth images and the accompanying sound waves allows the development of lip-reading and acoustic speech classifiers. The resulting classifiers approach the performance of supervised classifiers without requiring hand-labeling of the training patterns.
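The idea that one modality can stand in for the missing teaching signal of another can be illustrated with a small toy sketch. This is not the algorithm of the paper: it is a simplified cross-modal self-labelling scheme with nearest-prototype classifiers, and everything in it is an assumption made for illustration (the function names `make_pairs`, `train`, `agreement`, the one-dimensional features, the two well-separated hidden classes, the farthest-pair initialization, and the learning rate). Each classifier is updated using the label produced by the other modality for the co-occurring input, so no hand labels are used during training.

```python
import random


def nearest(prototypes, x):
    """Index of the prototype closest to the scalar feature x."""
    return min(range(len(prototypes)), key=lambda k: abs(prototypes[k] - x))


def make_pairs(n, seed=0):
    """Toy paired data (hypothetical): each pair shares a hidden class.

    The hidden class is returned only for evaluation; training never sees it.
    """
    rng = random.Random(seed)
    pairs, hidden = [], []
    for i in range(n):
        c = i % 2  # alternate classes so both are present
        xa = (-2.0 if c == 0 else 2.0) + rng.uniform(-0.5, 0.5)   # "modality A"
        xb = (10.0 if c == 0 else 20.0) + rng.uniform(-1.0, 1.0)  # "modality B"
        pairs.append((xa, xb))
        hidden.append(c)
    return pairs, hidden


def train(pairs, lr=0.1, epochs=5):
    """Cross-modal self-labelling: each modality is taught by the other's output."""
    # Assumed initialization: seed both classifiers from the same two
    # co-occurring pairs (the first pair and the pair farthest from it),
    # so prototype k means the same thing in both modalities.
    a0, b0 = pairs[0]
    far = max(pairs, key=lambda p: abs(p[0] - a0) + abs(p[1] - b0))
    protos_a = [a0, far[0]]
    protos_b = [b0, far[1]]
    for _ in range(epochs):
        for xa, xb in pairs:
            label_a = nearest(protos_a, xa)  # output of modality A
            label_b = nearest(protos_b, xb)  # output of modality B
            # Move the prototype named by the OTHER modality toward the input.
            protos_a[label_b] += lr * (xa - protos_a[label_b])
            protos_b[label_a] += lr * (xb - protos_b[label_a])
    return protos_a, protos_b


def agreement(pairs, protos_a, protos_b):
    """Fraction of pairs on which the two modalities output the same label."""
    same = sum(nearest(protos_a, xa) == nearest(protos_b, xb) for xa, xb in pairs)
    return same / len(pairs)


pairs, hidden = make_pairs(200)
protos_a, protos_b = train(pairs)
print("cross-modal agreement:", agreement(pairs, protos_a, protos_b))
```

On this deliberately easy toy data each modality is separable on its own, so the sketch only shows the mechanics of using the other modality's output as the teacher. Real speech data would need richer features and a more careful update rule.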