MULTIPOSE AUDIO-VISUAL SPEECH RECOGNITION
Abstract
In this paper we study the adaptation of visual and audio-visual speech recognition systems to non-ideal visual conditions. We focus on the effects of a changing pose of the speaker relative to the camera, a problem encountered in natural situations. To that purpose, we introduce a pose normalization technique and perform speech recognition from multiple views by generating virtual frontal views from non-frontal images. The proposed method is inspired by pose-invariant face recognition studies and relies on linear regression to find an approximate mapping between images from different poses. Lipreading experiments quantify the loss of performance related to pose changes and the proposed pose normalization techniques, while audio-visual results analyse how an audio-visual system should account for non-frontal poses in terms of the weight assigned to the visual modality in the audio-visual classifier.
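As an illustration of the regression idea mentioned in the abstract, the following is a minimal sketch of how a linear mapping from non-frontal to frontal feature vectors could be estimated. It is not the authors' implementation: the function names `fit_pose_mapping` and `to_virtual_frontal`, the ridge term, the feature dimension, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def _with_bias(X):
    """Append a constant column so the mapping can include an offset."""
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_pose_mapping(X_side, X_front, ridge=1e-3):
    """Estimate W such that [X_side, 1] @ W approximates X_front.

    X_side, X_front: paired feature vectors (one row per frame) of the same
    mouth region seen from a non-frontal and a frontal pose.
    ridge: small regularizer (an assumption of this sketch) that keeps the
    normal equations well conditioned; it also shrinks the bias row.
    """
    X = _with_bias(X_side)
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ X_front)


def to_virtual_frontal(X_side, W):
    """Map non-frontal features to approximate ("virtual") frontal features."""
    return _with_bias(X_side) @ W


# Synthetic paired data standing in for real image features (hypothetical).
rng = np.random.default_rng(0)
dim = 16
M_true = rng.normal(size=(dim, dim))
b_true = rng.normal(size=dim)

X_side_train = rng.normal(size=(200, dim))
X_front_train = X_side_train @ M_true + b_true + 0.01 * rng.normal(size=(200, dim))

W = fit_pose_mapping(X_side_train, X_front_train)

X_side_test = rng.normal(size=(50, dim))
X_front_test = X_side_test @ M_true + b_true
X_virtual = to_virtual_frontal(X_side_test, W)
mean_abs_error = float(np.mean(np.abs(X_virtual - X_front_test)))
```

In a recognition pipeline of this kind, the virtual frontal features would then be passed to a recognizer trained on frontal data; because real pose changes are only approximately linear, the residual error would be larger than on this synthetic example.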