Results 1 - 10 of 195
A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions
2009
"... Automated analysis of human affective behavior has attracted increasing attention from researchers in psychology, computer science, linguistics, neuroscience, and related disciplines. However, the existing methods typically handle only deliberately displayed and exaggerated expressions of prototypi ..."
Abstract
-
Cited by 374 (51 self)
- Add to MetaCart
Automated analysis of human affective behavior has attracted increasing attention from researchers in psychology, computer science, linguistics, neuroscience, and related disciplines. However, the existing methods typically handle only deliberately displayed and exaggerated expressions of prototypical emotions, despite the fact that deliberate behavior differs in visual appearance, audio profile, and timing from spontaneously occurring behavior. To address this problem, efforts to develop algorithms that can process naturally occurring human affective behavior have recently emerged. Moreover, an increasing number of efforts are reported toward multimodal fusion for human affect analysis, including audiovisual fusion, linguistic and paralinguistic fusion, and multicue visual fusion based on facial expressions, head movements, and body gestures. This paper introduces and surveys these recent advances. We first discuss human emotion perception from a psychological perspective. Next, we examine available approaches for solving the problem of machine understanding of human affective behavior and discuss important issues like the collection and availability of training and test data. We finally outline some of the scientific and engineering challenges to advancing human affect sensing technology.
Dynamic Texture Recognition Using Local Binary Patterns with an Application to Facial Expressions
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007
"... Dynamic texture is an extension of texture to the temporal domain. Description and recognition of dynamic textures have attracted growing attention. In this paper, a novel approach for recognizing dynamic textures is proposed and its simplifications and extensions to facial image analysis are also ..."
Abstract
-
Cited by 291 (35 self)
- Add to MetaCart
(Show Context)
Dynamic texture is an extension of texture to the temporal domain. Description and recognition of dynamic textures have attracted growing attention. In this paper, a novel approach for recognizing dynamic textures is proposed and its simplifications and extensions to facial image analysis are also considered. First, the textures are modeled with volume local binary patterns (VLBP), which are an extension of the LBP operator widely used in ordinary texture analysis, combining motion and appearance. To make the approach computationally simple and easy to extend, only the co-occurrences on three orthogonal planes (LBP-TOP) are then considered. A block-based method is also proposed to deal with specific dynamic events, such as facial expressions, in which local information and its spatial locations should also be taken into account. In experiments with two dynamic texture databases, DynTex and MIT, both the VLBP and LBP-TOP clearly outperformed the earlier approaches. The proposed block-based method was evaluated with the Cohn-Kanade facial expression database with excellent results. Advantages of our approach include local processing, robustness to monotonic gray-scale changes and simple computation.
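As described in the abstract above, the LBP-TOP idea reduces to computing ordinary LBP histograms on the three orthogonal slice families of a video volume and concatenating them. A minimal NumPy sketch of that reduction follows (assumptions: a grayscale (T, H, W) array, 8 neighbors at radius 1, borders skipped, per-plane histograms averaged rather than block-weighted); it illustrates the idea and is not the authors' implementation.

    import numpy as np

    # Neighbor offsets at radius 1, in circular order (8 bits per code).
    OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]

    def lbp_histogram(plane):
        """Normalized 256-bin LBP histogram of one 2-D plane (borders skipped)."""
        h, w = plane.shape
        center = plane[1:h-1, 1:w-1]
        codes = np.zeros((h - 2, w - 2), dtype=np.int32)
        for bit, (dy, dx) in enumerate(OFFSETS):
            neighbor = plane[1+dy:h-1+dy, 1+dx:w-1+dx]
            codes += (neighbor >= center).astype(np.int32) << bit
        hist = np.bincount(codes.ravel(), minlength=256).astype(float)
        return hist / hist.sum()

    def lbp_top(video):
        """Concatenated mean LBP histograms over the XY, XT, and YT slices."""
        t, h, w = video.shape
        xy = np.mean([lbp_histogram(video[i]) for i in range(t)], axis=0)
        xt = np.mean([lbp_histogram(video[:, i, :]) for i in range(h)], axis=0)
        yt = np.mean([lbp_histogram(video[:, :, i]) for i in range(w)], axis=0)
        return np.concatenate([xy, xt, yt])  # 768-dimensional descriptor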
Action MACH: a spatio-temporal maximum average correlation height filter for action recognition
In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, 2008
"... In this paper we introduce a template-based method for recognizing human actions called Action MACH. Our approach is based on a Maximum Average Correlation Height (MACH) filter. A common limitation of template-based methods is their inability to generate a single template using a collection of examp ..."
Abstract
-
Cited by 237 (10 self)
- Add to MetaCart
(Show Context)
In this paper we introduce a template-based method for recognizing human actions called Action MACH. Our approach is based on a Maximum Average Correlation Height (MACH) filter. A common limitation of template-based methods is their inability to generate a single template from a collection of examples. MACH is capable of capturing intra-class variability by synthesizing a single Action MACH filter for a given action class. We generalize the traditional MACH filter to video (3D spatiotemporal volume) and vector-valued data. By analyzing the response of the filter in the frequency domain, we avoid the high computational cost commonly incurred in template-based approaches. Vector-valued data is analyzed using the Clifford Fourier transform, a generalization of the Fourier transform intended for both scalar- and vector-valued data. Finally, we perform an extensive set of experiments and compare our method with some of the most recent approaches in the field using publicly available datasets and two new annotated human action datasets, which include actions performed in classic feature films and sports broadcast television.
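As a rough illustration of the frequency-domain filtering the abstract describes, here is a minimal NumPy sketch of a simplified MACH synthesis, H = D^-1 m (average spectrum normalized by average power spectrum), for scalar-valued volumes. The plain 3D FFT stands in for the Clifford Fourier transform used in the paper, and eps is an invented regularizer; this is a sketch under those assumptions, not the paper's filter.

    import numpy as np

    def mach_filter(volumes, eps=1e-8):
        """Synthesize one MACH filter from example volumes of a single action class."""
        spectra = [np.fft.fftn(v) for v in volumes]
        mean_spec = np.mean(spectra, axis=0)                        # average spectrum m
        power = np.mean([np.abs(s) ** 2 for s in spectra], axis=0) # average power spectrum D
        return mean_spec / (power + eps)                            # simplified MACH: H = D^-1 m

    def correlation_response(query, H):
        """Frequency-domain correlation; the peak height scores the action."""
        Q = np.fft.fftn(query)
        return np.real(np.fft.ifftn(Q * np.conj(H)))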
Toward an Affect-Sensitive Multimodal Human-Computer Interaction
Proceedings of the IEEE, 2003
"... The ability to recognize affective states of a person... This paper argues that next-generation human-computer interaction (HCI) designs need to include the essence of emotional intelligence -- the ability to recognize a user's affective states -- in order to become more human-like, more effect ..."
Abstract
-
Cited by 189 (30 self)
- Add to MetaCart
(Show Context)
The ability to recognize affective states of a person... This paper argues that next-generation human-computer interaction (HCI) designs need to include the essence of emotional intelligence -- the ability to recognize a user's affective states -- in order to become more human-like, more effective, and more efficient. Affective arousal modulates all nonverbal communicative cues (facial expressions, body movements, and vocal and physiological reactions). In a face-to-face interaction, humans detect and interpret those interactive signals of their communicator with little or no effort. Yet design and development of an automated system that accomplishes these tasks is rather difficult. This paper surveys the past work in solving these problems by a computer and provides a set of recommendations for developing the first part of an intelligent multimodal HCI -- an automatic personalized analyzer of a user's nonverbal affective feedback.
Facial expression recognition based on Local Binary Patterns: A comprehensive study
2009
"... ..."
Multimodal human computer interaction: A survey
2005
"... In this paper we review the major approaches to Multimodal Human Computer Interaction, giving an overview of the field from a computer vision perspective. In particular, we focus on body, gesture, gaze, and affective interaction (facial expression recognition and emotion in audio). We discuss user ..."
Abstract
-
Cited by 119 (3 self)
- Add to MetaCart
(Show Context)
In this paper we review the major approaches to Multimodal Human Computer Interaction, giving an overview of the field from a computer vision perspective. In particular, we focus on body, gesture, gaze, and affective interaction (facial expression recognition and emotion in audio). We discuss user and task modeling, and multimodal fusion, highlighting challenges, open issues, and emerging applications for Multimodal Human Computer Interaction (MMHCI) research.
A 3d facial expression database for facial behavior research
Proc. IEEE Int'l Conf. Face and Gesture Recognition, 2006
"... Traditionally, human facial expressions have been studied using either 2D static images or 2D video sequences. The 2D-based analysis is incapable of handing large pose variations. Although 3D modeling techniques have been extensively used for 3D face recognition and 3D face animation, barely any res ..."
Abstract
-
Cited by 113 (6 self)
- Add to MetaCart
(Show Context)
Traditionally, human facial expressions have been studied using either 2D static images or 2D video sequences. The 2D-based analysis is incapable of handling large pose variations. Although 3D modeling techniques have been extensively used for 3D face recognition and 3D face animation, barely any research on 3D facial expression recognition using 3D range data has been reported. A primary factor preventing such research is the lack of a publicly available 3D facial expression database. In this paper, we present a newly developed 3D facial expression database, which includes both prototypical 3D facial expression shapes and 2D facial textures of 2,500 models from 100 subjects. This is the first attempt at making a 3D facial expression database available to the research community, with the ultimate goal of fostering research on affective computing and increasing the general understanding of facial behavior and the fine 3D structure inherent in human facial expressions. The new database can be a valuable resource for algorithm assessment, comparison, and evaluation.
Dynamics of Facial Expression: Recognition of Facial Actions and Their Temporal Segments from Face Profile Image Sequences
IEEE Trans. Systems, Man, and Cybernetics, Part B, 2006
"... Abstract—Automatic analysis of human facial expression is a challenging problem with many applications. Most of the existing automated systems for facial expression analysis attempt to recognize a few prototypic emotional expressions, such as anger and happiness. Instead of representing another appr ..."
Abstract
-
Cited by 111 (19 self)
- Add to MetaCart
(Show Context)
Automatic analysis of human facial expression is a challenging problem with many applications. Most existing automated systems for facial expression analysis attempt to recognize a few prototypic emotional expressions, such as anger and happiness. Rather than representing another approach to machine analysis of prototypic facial expressions of emotion, the method presented in this paper attempts to handle a large range of human facial behavior by recognizing the facial muscle actions that produce expressions. Virtually all existing vision systems for facial muscle action detection deal only with frontal-view face images and cannot handle the temporal dynamics of facial actions. In this paper, we present a system for automatic recognition of facial action units (AUs) and their temporal models from long, profile-view face image sequences. We exploit particle filtering to track 15 facial points in an input face-profile sequence, and we introduce facial-action-dynamics recognition from continuous video input using temporal rules. The algorithm performs both automatic segmentation of an input video into the facial expressions pictured and recognition of the temporal segments (i.e., onset, apex, offset) of 27 AUs occurring alone or in combination in the input face-profile video. A recognition rate of 87% is achieved. Index Terms: computer vision, facial action units, facial expression analysis, facial expression dynamics analysis, particle filtering, rule-based reasoning, spatial reasoning, temporal reasoning.
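The paper's temporal rules are not spelled out in the abstract, but a toy sketch conveys the onset/apex/offset idea: threshold the rate of change of a per-frame activation signal. The signal is assumed to be already derived from the tracked facial points, and the thresholds are invented for illustration; this is not the paper's rule set.

    import numpy as np

    def temporal_segments(activation, rise=0.02, flat=0.01):
        """Label each frame 'neutral', 'onset', 'apex', or 'offset'."""
        velocity = np.gradient(activation)   # frame-to-frame rate of change
        labels = []
        for a, v in zip(activation, velocity):
            if a < flat:
                labels.append("neutral")
            elif v > rise:
                labels.append("onset")       # intensity rising
            elif v < -rise:
                labels.append("offset")      # intensity falling
            else:
                labels.append("apex")        # high, roughly constant intensity
        return labels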
Facial Action Unit Recognition by Exploiting Their Dynamic and Semantic Relationships
"... Abstract—A system that could automatically analyze the facial actions in real time has applications in a wide range of different fields. However, developing such a system is always challenging due to the richness, ambiguity, and dynamic nature of facial actions. Although a number of research groups ..."
Abstract
-
Cited by 104 (10 self)
- Add to MetaCart
A system that could automatically analyze facial actions in real time would have applications in a wide range of fields. However, developing such a system is challenging due to the richness, ambiguity, and dynamic nature of facial actions. Although a number of research groups attempt to recognize facial action units (AUs) by improving either facial feature extraction techniques or AU classification techniques, these methods often recognize AUs or certain AU combinations individually and statically, ignoring the semantic relationships among AUs and the dynamics of AUs. Hence, these approaches cannot always recognize AUs reliably, robustly, and consistently. In this paper, we propose a novel approach that systematically accounts for the relationships among AUs and their temporal evolution for AU recognition. Specifically, we use a dynamic Bayesian network (DBN) to model the relationships among different AUs. The DBN provides a coherent and unified hierarchical probabilistic framework to represent probabilistic relationships among various AUs and to account for temporal changes in facial action development. Within our system, robust computer vision techniques are used to obtain AU measurements. These AU measurements are then applied as evidence to the DBN for inferring various AUs. The experiments show that integrating AU relationships and AU dynamics with AU measurements yields significant improvement in AU recognition, especially for spontaneous facial expressions and in more realistic environments that include illumination variation, face pose variation, and occlusion. Index Terms: facial action unit recognition, facial expression analysis, Facial Action Coding System, Bayesian networks.
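A toy sketch of the DBN idea follows, assuming just two related AUs (AU6 "cheek raiser" and AU12 "lip corner puller"), binary states, a hand-written joint transition matrix encoding their co-occurrence, and independent per-frame detector scores as evidence. All numbers are illustrative; the paper's network covers many AUs with learned parameters.

    import numpy as np

    STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]   # joint (AU6, AU12) state

    # Transition matrix: states tend to persist, and the two AUs tend to co-occur.
    T = np.array([[0.90, 0.04, 0.04, 0.02],
                  [0.10, 0.70, 0.02, 0.18],
                  [0.10, 0.02, 0.70, 0.18],
                  [0.02, 0.08, 0.08, 0.82]])

    def evidence(s6, s12):
        """P(detector scores | joint state), treating each score as P(AU active)."""
        return np.array([(1 - s6) * (1 - s12), (1 - s6) * s12,
                         s6 * (1 - s12), s6 * s12])

    def filter_aus(scores):
        """Forward filtering: most likely joint (AU6, AU12) state per frame."""
        belief = np.full(4, 0.25)               # uniform prior over joint states
        decoded = []
        for s6, s12 in scores:
            belief = T.T @ belief               # predict (temporal dynamics)
            belief *= evidence(s6, s12)         # update (per-frame measurements)
            belief /= belief.sum()
            decoded.append(STATES[int(np.argmax(belief))])
        return decoded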
Semi-supervised Learning of Classifiers: Theory, Algorithms and Their Application to Human-Computer Interaction
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004
"... Automatic classification is one of the basic tasks required in any pattern recognition and human computer interaction application. In this paper we discuss training probabilistic classifiers with labeled and unlabeled data. We provide a new analysis that shows under what conditions unlabeled data ..."
Abstract
-
Cited by 74 (17 self)
- Add to MetaCart
Automatic classification is one of the basic tasks required in any pattern recognition and human-computer interaction application. In this paper we discuss training probabilistic classifiers with labeled and unlabeled data. We provide a new analysis that shows under what conditions unlabeled data can be used in learning to improve classification performance. We also show that if these conditions are violated, using unlabeled data can be detrimental to classification performance. We discuss the implications of this analysis for a specific type of probabilistic classifier, Bayesian networks, and propose a new structure learning algorithm that can utilize unlabeled data to improve classification. Finally, we show how the resulting algorithms are successfully employed in two applications related to human-computer interaction and pattern recognition: facial expression recognition and face detection.
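One common instantiation of labeled-plus-unlabeled training is EM over a generative classifier; the sketch below uses Gaussian naive Bayes, with unlabeled points entering the M-step weighted by their class posteriors. It is a generic illustration of the setting (binary labels, continuous features assumed), not the paper's Bayesian-network structure-learning algorithm.

    import numpy as np

    def fit_ssl_naive_bayes(Xl, yl, Xu, n_iter=20):
        """EM for Gaussian naive Bayes with labeled (Xl, yl) and unlabeled (Xu) data."""
        X = np.vstack([Xl, Xu])
        # Responsibilities: labeled rows are fixed one-hot; unlabeled start uniform.
        R = np.vstack([np.eye(2)[yl], np.full((len(Xu), 2), 0.5)])
        for _ in range(n_iter):
            # M-step: class priors and per-class feature means/variances.
            pi = R.mean(axis=0)
            n_c = R.sum(axis=0)[:, None]
            mu = (R.T @ X) / n_c
            var = (R.T @ X**2) / n_c - mu**2 + 1e-6
            # E-step: recompute class posteriors for the unlabeled points only.
            logp = np.stack([np.log(pi[c]) - 0.5 * np.sum(
                np.log(2 * np.pi * var[c]) + (Xu - mu[c])**2 / var[c], axis=1)
                for c in range(2)], axis=1)
            post = np.exp(logp - logp.max(axis=1, keepdims=True))
            R[len(Xl):] = post / post.sum(axis=1, keepdims=True)
        return pi, mu, var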