by Juergen Luettin, Neil A. Thacker, Steve W. Beet
Proc. Int. Conf. on Pattern Recognition
Download: ftp://ftp.idiap.ch/pub/papers/vision/luettin_icpr96learning.ps.gz
Abstract:
An approach for person identification is described, based on spatio-temporal analysis of the talking face. A person is represented by a parametric model of the visible speech articulators and their temporal characteristics during speech production. The model consists of shape parameters, representing the lip contour, and intensity parameters, representing the grey-level distribution in the mouth region. The model is used to track lips in image sequences, and the model parameters are recovered from the tracking results. While some of these parameters relate to speech information, others are intuitively related to different persons, and we show that models based on these features enable successful person identification. We model the shape and intensity parameters as mixtures of Gaussians and their temporal dependencies by Hidden Markov Models. A talking person is identified by estimating the likelihood that each model generated the observed sequence of features; the model with the highest likelihood is chosen as the identified person.
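The decision rule in the last sentence of the abstract (score the observed feature sequence under each person's model and pick the highest likelihood) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one Gaussian with diagonal covariance per HMM state instead of a mixture, and two-state models with hand-set parameters instead of trained ones. The person names, the feature values, and the helper names `make_model`, `log_likelihood`, and `identify` are hypothetical.

```python
import numpy as np

def log_gauss(x, means, variances):
    """Log density of x under one diagonal-covariance Gaussian per state."""
    return -0.5 * np.sum(np.log(2 * np.pi * variances) + (x - means) ** 2 / variances, axis=-1)

def log_likelihood(obs, model):
    """Forward algorithm in the log domain: returns log p(obs | model)."""
    log_pi, log_A, means, variances = model
    alpha = log_pi + log_gauss(obs[0], means, variances)
    for x in obs[1:]:
        # Sum over previous states (rows of log_A), then add the emission term.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0)
        alpha = alpha + log_gauss(x, means, variances)
    return np.logaddexp.reduce(alpha)

def make_model(state_means):
    """Hypothetical two-state model: uniform start, sticky transitions, unit variance."""
    means = np.asarray(state_means, dtype=float)
    log_pi = np.log(np.array([0.5, 0.5]))
    log_A = np.log(np.array([[0.8, 0.2], [0.2, 0.8]]))
    return log_pi, log_A, means, np.ones_like(means)

def identify(obs, models):
    """Return the name of the model with the highest log-likelihood, plus all scores."""
    scores = {name: log_likelihood(obs, m) for name, m in models.items()}
    return max(scores, key=scores.get), scores

# Toy data: 2-D vectors standing in for the shape and intensity parameters.
models = {
    "alice": make_model([[0.0, 0.0], [1.0, 1.0]]),
    "bob": make_model([[4.0, 4.0], [5.0, 5.0]]),
}
obs = np.array([[0.1, -0.2], [0.9, 1.1], [1.0, 0.8]])
who, scores = identify(obs, models)
print(who, scores)
```

Working in the log domain avoids numerical underflow on long sequences. Extending the sketch to mixture emissions would only change `log_gauss`, which would then combine several weighted Gaussians per state.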