Person identification using multiple cues
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1995
Cited by 142 (1 self)
Abstract: This paper presents a person identification system based on acoustic and visual features. The system is organized as a set of non-homogeneous classifiers whose outputs are integrated after a normalization step. In particular, two classifiers based on acoustic features and three based on visual ones provide data for an integration module whose performance is evaluated. A novel technique for the integration of multiple classifiers at a hybrid rank/measurement level is introduced using HyperBF networks. Two different methods for the rejection of an unknown person are introduced. The performance of the integrated system is shown to be superior to that of the acoustic and visual subsystems. The resulting identification system can be used to log personal access and, with minor modifications, as an identity verification system. Index Terms: template matching, robust statistics, correlation, face recognition, speaker recognition, learning, classification.
Automatic Person Recognition by Using Acoustic and Geometric Features
, 1993
Cited by 19 (1 self)
The paper describes a multisensorial person identification system: visual and acoustic cues are used jointly for person identification. A simple approach, based on the fusion of the lists of scores produced independently by a speaker recognition system and a face recognition system, is presented. Experiments are reported which show that integration of visual and acoustic information enhances both performance and reliability of the separate systems. Finally, two network architectures, based on radial basis function theory, are proposed to describe integration at different levels of abstraction. Keywords: face recognition, speaker identification, classification. 1. Introduction This paper describes an automatic person recognition system which uses both acoustic features, derived from the analysis of a given speech signal, and visual ones, related to distinctive parameters of the face of the person who uttered that speech signal. Visual and acoustic cues are used jointly for person id...
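The fusion of independently produced score lists described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the scores are normalized or combined, so everything here is an assumption: the z-score normalization, the plain sum, the function names `z_normalize` and `fuse`, and the example scores are all hypothetical and are not the authors' method.

```python
from statistics import mean, pstdev

def z_normalize(scores):
    """Rescale one classifier's scores to zero mean and unit variance (assumed normalization)."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # fall back to 1.0 when all scores are equal
    return {person: (s - mu) / sigma for person, s in scores.items()}

def fuse(score_lists):
    """Sum the normalized scores per person and return the names ranked best first."""
    totals = {}
    for scores in score_lists:
        for person, z in z_normalize(scores).items():
            totals[person] = totals.get(person, 0.0) + z
    return sorted(totals, key=totals.get, reverse=True)

# Hypothetical scores (higher means a better match) on very different scales.
speaker_scores = {"anna": 0.62, "bruno": 0.58, "carla": 0.20}
face_scores = {"anna": 41.0, "bruno": 77.0, "carla": 35.0}

ranking = fuse([speaker_scores, face_scores])
print(ranking)
```

In this made-up example the speaker scores alone slightly favour "anna", while the fused ranking puts "bruno" first because the face classifier separates him clearly from the others once both lists are on a common scale.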
Caricatural Effects in Automated Face Perception
, 1993
Cited by 17 (2 self)
This paper analyzes properties of a certain class of approximation techniques -- HyperBF networks -- in face perception tasks. The problem of gender classification and identification is addressed using a geometrical description of faces, extracted automatically from digitized pictures of frontal views of people without facial hair. The HyperBF networks perform satisfactorily on the classification tasks and exhibit the phenomenon of caricaturing, previously reported in psychophysical experiments. 1. Introduction Faces allow people to establish, among other things, the gender of a person, his (her) age, his (her) identity and, to a certain extent, emotions. In the current paper we address the tasks of gender classification and recognition. The work was done within MAIA, the integrated AI project under development at IRST, which aims to develop a face recognition system as one of its components ([1]). We will show that limited geometrical information may account for correct sex attributio...
Robust Estimation of Correlation with Applications to Computer Vision
- Pattern Recognition
, 1995
Cited by 9 (3 self)
In this paper we compare three estimators of similarity for visual patterns, based on the L2 and L1 norms, with the standard correlation coefficient. The emphasis of the comparison is on the stability of the resulting estimates. Bias, efficiency, normality and robustness are investigated through Monte Carlo simulations in a statistical task, the estimation of the correlation parameter of a binormal distribution. The four estimators are then compared on two pattern recognition tasks: people identification through face recognition and book identification from the cover image. The similarity measures based on the L1 norm prove to be less sensitive to noise and provide better performance than those based on the L2 norm. Keywords: template matching, robust statistics, correlation, face recognition, book recognition. 1. Introduction The estimation of similarity of patterns is a common low-level vision task which must be routinely performed by many computer vision systems. The Pear...
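To make the contrast between L2-based and L1-based similarity concrete, here is a small sketch that places the standard (Pearson) correlation coefficient next to one possible L1-norm similarity. The abstract does not give the formulas of the three estimators studied, so `l1_similarity` below is an illustrative assumption and should not be read as one of the paper's estimators.

```python
from math import sqrt

def pearson(x, y):
    """Standard correlation coefficient, built from squared (L2) deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def standardize(x):
    """Shift to zero mean and scale to unit standard deviation."""
    n = len(x)
    m = sum(x) / n
    sd = sqrt(sum((a - m) ** 2 for a in x) / n)
    return [(a - m) / sd for a in x]

def l1_similarity(x, y):
    """Assumed L1-based similarity: 1 minus a normalized sum of absolute differences."""
    xs, ys = standardize(x), standardize(y)
    num = sum(abs(a - b) for a, b in zip(xs, ys))
    den = sum(abs(a) + abs(b) for a, b in zip(xs, ys))
    return 1.0 - num / den

pattern = [1.0, 2.0, 3.0, 4.0, 5.0]
with_outlier = [1.0, 2.0, 3.0, 4.0, 50.0]  # hypothetical noisy copy of the pattern
print(pearson(pattern, with_outlier), l1_similarity(pattern, with_outlier))
```

Both functions return 1 for identical patterns; this particular L1 measure ranges over [0, 1], whereas the Pearson coefficient ranges over [-1, 1], so the two values are not directly comparable without rescaling.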
On Training Neural Nets through Stochastic Minimization
- Neural Networks
, 1994
Cited by 8 (3 self)
The revival of multilayer neural networks in the mid-1980s originated from the discovery of the back-propagation technique as a feasible training procedure. In spite of its shortcomings, it is probably the most widespread technique for training feedforward nets. In recent years, several deterministic methods more efficient than back-propagation have been proposed. In this paper a stochastic minimization algorithm, Iterated Adaptive Memory Stochastic Search, is described which does not use gradient information and is found to perform better than back-propagation on the encoder and parity problems. Keywords: stochastic optimization, learning algorithms, back-propagation. 1. Introduction Learning from examples, the problem which neural networks were created to solve, is one of the most important research topics in the AI community. A possible way to formalize learning from examples is to hypothesize the existence of a function that captures the underlying mapping, thereby enabling gen...
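As a rough illustration of minimization without gradient information, the sketch below uses plain random local search: perturb the current point and keep the candidate only if the loss decreases. This is deliberately a much simpler technique than the Iterated Adaptive Memory Stochastic Search named in the abstract, whose details are not given here. The function `random_search`, the toy `quadratic` loss, and all parameter values are hypothetical.

```python
import random

def random_search(loss, start, steps=2000, step_size=0.5, seed=0):
    """Gradient-free minimization: accept a random perturbation only if it lowers the loss."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    best, best_loss = list(start), loss(start)
    for _ in range(steps):
        candidate = [w + rng.gauss(0.0, step_size) for w in best]
        candidate_loss = loss(candidate)
        if candidate_loss < best_loss:
            best, best_loss = candidate, candidate_loss
    return best, best_loss

def quadratic(w):
    """Toy loss with its minimum at w = (1, 1); it stands in for a network's training error."""
    return sum((wi - 1.0) ** 2 for wi in w)

start = [5.0, -3.0]
solution, final_loss = random_search(quadratic, start)
print(solution, final_loss)
```

Because only improving candidates are accepted, the loss can never increase from one step to the next; the price is that the search needs many loss evaluations and has no mechanism for adapting its step size.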
Robust Speech Understanding for Robot Telecontrol
- In Proceedings of the 6th International Conference on Advanced Robotics
, 1993
Cited by 5 (3 self)
This paper describes an Automatic Speech Understanding (ASU) system used in a human-robot interface for the remote control of a mobile robot. The intended application is that of an operator issuing telecontrol commands to one or more robots from a remote workstation. ASU is supposed to be performed with continuous speech, speaker independence, and quasi-real-time conditions. Training and testing of the system were based on speech data collected by means of Wizard of Oz simulations. Two kinds of robustness factors are introduced: the first is a recognition-error-tolerant approach to semantic interpretation, the second is based on a technique for evaluating the reliability of the ASU system output with respect to the input utterance. Preliminary results are 93% correct semantic interpretations, and 96.5% correct detection of out-of-domain sentences at the cost of rejecting 6.7% of correct in-domain sentences. I. Introduction This paper describes an Automatic Speech Understanding (ASU) syste...
A Fast Straight Line Extractor for Vision-Guided Robot Navigation
, 1993
Cited by 4 (2 self)
A method to extract straight segments from gray-level images is presented, which is based on a novel approach to adaptive thresholding of local discontinuities. In particular, the method takes advantage of modeling the statistics of the low-intensity edges generated by acquisition noise. Though very simple, the algorithm is general enough to cope successfully with images taken in a wide range of lighting conditions, contrast and noise; fast and accurate, it is therefore very well suited to support vision-based modules for autonomous indoor navigation. Examples from this application field are reported which may better emphasize the qualities of the method as compared to others based on different thresholding techniques.
Design and Acquisition of a Task-Oriented Spontaneous-Speech Database
- Intelligent Perceptual Systems, volume 745 of Lecture Notes in Artificial Intelligence, 196–210
, 1993
Cited by 4 (4 self)
The need for large databases both for training and testing automatic speech recognition and understanding systems is a well-known issue. This paper presents the result of a first collection of task-oriented spontaneous speech corpora performed in the MAIA project, under development at IRST. About 2000 sentences were acquired from 50 subjects concerning two scenarios of human-machine spoken interactions: a telecontrol station for a mobile robot and an information query system. Both systems were simulated by means of the well-known "Wizard of Oz" technique. This paper focuses on the methodological issues of this approach, highlighting some important points which must be considered in the design of simulations, together with the adopted solutions. A first evaluation of the collected data concludes the exposition. 1 Introduction Automatic speech recognition and understanding are based on models both at the acoustic level and at the linguistic level. For the acoustic mo...
Experiencing Real-Life Interactions with the Experimental Platform of MAIA
- In Proceedings of the 1st European Workshop on Human Comfort and Security
, 1994
Cited by 4 (0 self)
Over the years, automated vehicles have evolved from reliable yet rigidly constrained AGVs to the fairly flexible ones of the HelpMate generation. Although many problems related to the autonomous navigation of such vehicles have found solutions that appear to be adequate to many practical situations, their capacity to interact with the users and the environment is still limited. In the present paper, the research work done at IRST in the framework of the Experimental Platform of MAIA is presented. In particular, a transport mission scenario (MAIA '94) is considered, in which autonomous navigation, speech recognition and planning skills represent three aspects of one and the same ability that the system can exhibit: establishing interactions with the external world in a reliable and autonomous fashion. IRST Technical Report 9406-27, June 1994. 1 Introduction Design and realization of artificial systems able to autonomously accomplish transport missions in relatively unstructured ...
Making sense: autonomy and adaptation in visual robotics
, 2000
Cited by 4 (3 self)
This is a practical and theoretical thesis in visual robotics. It describes the development and actual implementation (in a real-world, real-time visual robot) of new approaches to three non-linear problems: how to ensure robustness under the harshest conditions of unforeseen reconfiguration, how to provide specialised space-variant sampling regimes according to which task is currently at hand, and how to automatically direct attention using any number of adaptive response layers in concert. Additionally, the descriptions of this practical work are preceded by the exposition of a new theoretical framework for intelligent system evaluation, which offers performance silhouettes as a schematic method. The three practical methods stem from, and are embedded in, the theoretical framework, and all have been suggested, to varying degrees, by knowledge of biological processes or capabilities: self-organisation, visual periphery sensitivity, and adaptive reduction in sensitivity, for example. The overall goal of the research is to develop extremely simple algorithms capable of operating in real time and endowing a robot with robust essential perceptual capabilities that can operate in all environments. The outcome of the work is a visual robotic system that exhibits seemingly intelligent behaviour in complex, changing, and noisy natural environments. That it does so with minimal help from other sophisticated agents (such as computer science researchers) is a credit to its autonomy and the adaptation of its design to arbitrary environments.

