Results 1 - 10 of 83
Social Signal Processing: Survey of an Emerging Domain
2008
Cited by 153 (32 self)
The ability to understand and manage the social signals of a person we are communicating with is the core of social intelligence. Social intelligence is a facet of human intelligence that has been argued to be indispensable, and perhaps the most important, for success in life. This paper argues that next-generation computing needs to include the essence of social intelligence – the ability to recognize human social signals and social behaviours like turn-taking, politeness, and disagreement – in order to become more effective and more efficient. Although each one of us understands the importance of social signals in everyday life, and in spite of recent advances in machine analysis of relevant behavioural cues like blinks, smiles, crossed arms, and laughter, the design and development of automated systems for Social Signal Processing (SSP) remain difficult. This paper surveys past efforts to solve these problems computationally, summarizes the relevant findings in social psychology, and proposes a set of recommendations for enabling the development of the next generation of socially-aware computing.
Human Computing and Machine Understanding of Human Behavior: A Survey
Survey, Proc. ACM Int'l Conf. Multimodal Interfaces, 2006
Cited by 132 (33 self)
A widely accepted prediction is that computing will move to the background, weaving itself into the fabric of our everyday living spaces and projecting the human user into the foreground. If this prediction is to come true, then next-generation computing, which we call human computing, should be about anticipatory user interfaces that are human-centered, built for humans based on human models. They should transcend the traditional keyboard and mouse to include natural, human-like interactive functions, including understanding and emulating certain human behaviors such as affective and social signaling. This article discusses a number of components of human behavior, how they might be integrated into computers, and how far we are from realizing the front end of human computing, that is, from enabling computers to understand human behavior.
The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression
Cited by 122 (7 self)
In 2000, the Cohn-Kanade (CK) database was released for the purpose of promoting research into automatically detecting individual facial expressions. Since then, the CK database has become one of the most widely used test-beds for algorithm development and evaluation. During this period, three limitations have become apparent: 1) while AU codes are well validated, emotion labels are not, as they refer to what was requested rather than what was actually performed; 2) there is no common performance metric against which to evaluate new algorithms; and 3) standard protocols for common databases have not emerged. As a consequence, the CK database has been used for both AU and emotion detection (even though labels for the latter have not been validated), comparison with benchmark algorithms is missing, and the use of random subsets of the original database makes meta-analyses difficult. To address these and other concerns, we present the Extended Cohn-Kanade (CK+) database. The number of sequences is increased by 22% and the number of subjects by 27%. The target expression for each sequence is fully FACS coded and emotion labels have been revised and validated. In addition, non-posed sequences for several types of smiles and their associated metadata have been added. We present baseline results using Active Appearance Models (AAMs) and a linear support vector machine (SVM) classifier with leave-one-subject-out cross-validation for both AU and emotion detection on the posed data. The emotion and AU labels, along with the extended image data and tracked landmarks, will be made available in July 2010.
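A minimal sketch of the leave-one-subject-out protocol described in this abstract, assuming scikit-learn is available; the feature matrix, labels, and subject IDs below are synthetic placeholders, not CK+ data or the authors' AAM features.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_sequences, n_features, n_subjects = 120, 40, 20

X = rng.normal(size=(n_sequences, n_features))             # stand-in AAM features
y = rng.integers(0, 2, size=n_sequences)                   # stand-in AU present/absent labels
subjects = rng.integers(0, n_subjects, size=n_sequences)   # subject ID per sequence

# Leave-one-subject-out: all sequences from one subject form the test fold,
# so no subject appears in both training and test data.
logo = LeaveOneGroupOut()
clf = LinearSVC()  # linear SVM, as in the baseline described above
scores = cross_val_score(clf, X, y, groups=subjects, cv=logo)
print(f"mean accuracy over {logo.get_n_splits(groups=subjects)} folds: {scores.mean():.3f}")
```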
Fully Automatic Facial Action Unit Detection and Temporal Analysis
2006
Cited by 66 (15 self)
In this work we report on the progress of building a system that enables fully automated, fast, and robust facial expression recognition from face video. We analyse subtle changes in facial expression by recognizing facial muscle action units (AUs) and analysing their temporal behavior. By detecting AUs from face video we enable the analysis of various facial communicative signals, including facial expressions of emotion, attitude, and mood. For an input video picturing a facial expression, we detect per frame whether any of 15 different AUs is activated, whether that facial action is in the onset, apex, or offset phase, and what the total duration of the activation is. We base this process on a set of spatio-temporal features calculated from tracking data for 20 facial fiducial points. To detect these 20 points of interest in the first frame of an input face video, we utilize a fully automatic facial point localization method that uses individual feature GentleBoost templates built from Gabor wavelet features. Then, we exploit a particle filtering scheme that uses factorized likelihoods and a novel observation model combining a rigid and a morphological model to track the facial points. Finally, the AUs displayed in the input video and their temporal segments are recognized by Support Vector Machines trained on a subset of the most informative spatio-temporal features selected by AdaBoost. For the Cohn-Kanade and MMI databases, the proposed system classifies 15 AUs occurring alone or in combination with other AUs with a mean agreement rate of 90.2% with human FACS coders.
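The abstract describes SVMs applied to spatio-temporal features computed from tracked fiducial points. The sketch below illustrates that general idea with simple hand-picked descriptors (pairwise point distances and their frame-to-frame changes); the feature set, point tracks, and labels are assumptions for illustration, not the authors' pipeline.

```python
import numpy as np
from sklearn.svm import SVC

def spatio_temporal_features(tracks):
    """tracks: (n_frames, n_points, 2) array of tracked (x, y) coordinates."""
    n_frames, n_points, _ = tracks.shape
    # pairwise distances between points within each frame
    diffs = tracks[:, :, None, :] - tracks[:, None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)          # (n_frames, n_points, n_points)
    iu = np.triu_indices(n_points, k=1)
    d = dists[:, iu[0], iu[1]]                      # (n_frames, n_pairs)
    velocity = np.diff(d, axis=0)                   # frame-to-frame change
    # summarize each pair's trajectory by its mean distance and mean velocity
    return np.concatenate([d.mean(axis=0), velocity.mean(axis=0)])

rng = np.random.default_rng(1)
videos = [rng.normal(size=(30, 20, 2)) for _ in range(50)]   # 20 points, 30 frames
X = np.stack([spatio_temporal_features(v) for v in videos])
y = rng.integers(0, 2, size=len(videos))                     # stand-in AU active/inactive
clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict(X[:5]))
```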
Face Verification across Age Progression
Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2005
Cited by 62 (6 self)
Human faces undergo considerable variation with aging. While face recognition systems have been shown to be sensitive to factors such as illumination and pose, their sensitivity to facial aging effects is yet to be studied. How does age progression affect the similarity between a pair of face images of an individual? What is the confidence associated with establishing the identity between a pair of age-separated face images? In this paper, we develop a Bayesian age-difference classifier that classifies face images of individuals based on age differences and performs face verification across age progression. Further, we study the similarity of faces across age progression. Since age-separated face images invariably differ in illumination and pose, we propose preprocessing methods for minimizing such variations. Experimental results are presented using a database comprising pairs of face images retrieved from the passports of 465 individuals. The verification system attains an equal error rate of 8.5% for faces separated by as many as nine years. Index Terms: age progression, face recognition, face verification, probabilistic eigenspaces, similarity measure.
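A minimal sketch of computing the equal error rate (EER) reported above from verification scores, assuming scikit-learn; the genuine and impostor score distributions are synthetic placeholders, not passport-image results.

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(2)
# higher score = more likely a genuine (same-person) pair
genuine = rng.normal(1.0, 1.0, size=500)
impostor = rng.normal(-1.0, 1.0, size=500)
scores = np.concatenate([genuine, impostor])
labels = np.concatenate([np.ones(500), np.zeros(500)])

fpr, tpr, _ = roc_curve(labels, scores)
fnr = 1 - tpr
# the EER is the operating point where false accepts equal false rejects
idx = np.nanargmin(np.abs(fpr - fnr))
print(f"EER ~ {(fpr[idx] + fnr[idx]) / 2:.3f}")
```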
The painful face - pain expression recognition using active appearance models
Proc. ICMI, 2007
Cited by 55 (18 self)
Pain is typically assessed by patient self-report. Self-reported pain, however, is difficult to interpret and may be impaired or not even possible, as in young children or the severely ill. Behavioral scientists have identified reliable and valid facial indicators of pain, but until now measuring them has required manual coding by highly skilled observers. We developed an approach that automatically recognizes acute pain. Adult patients with rotator cuff injury were video-recorded while a physiotherapist manipulated their affected and unaffected shoulders. Skilled observers rated pain expression from the video on a 5-point Likert-type scale. From these ratings, sequences were categorized as no-pain (rating of 0), pain (rating of 3, 4, or 5), and indeterminate (rating of 1 or 2). We explored machine learning approaches for pain versus no-pain classification. Active Appearance Models (AAMs) were used to decouple shape and appearance parameters from the digitized face images. Support vector machines (SVMs) were used with several representations from the AAM. Using a leave-one-out procedure, we achieved an equal error rate of 19% (hit rate = 81%) using canonical appearance and shape features. These findings suggest the feasibility of automatic pain detection from video.
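A minimal sketch of the labeling scheme described above: observer ratings on the 0-5 Likert-type scale are binned into no-pain, indeterminate, and pain classes, with indeterminate sequences excluded from the binary classifier. The ratings here are synthetic placeholders.

```python
import numpy as np

def categorize(rating: int) -> str:
    # bins taken from the abstract: 0 = no-pain, 1-2 = indeterminate, 3-5 = pain
    if rating == 0:
        return "no-pain"
    if rating in (1, 2):
        return "indeterminate"   # excluded from pain vs. no-pain training
    return "pain"

ratings = np.array([0, 1, 3, 5, 2, 4, 0])          # stand-in observer ratings
labels = [categorize(r) for r in ratings]
# keep only the unambiguous sequences for the binary classifier
kept = [l for l in labels if l != "indeterminate"]
print(kept)
```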
Machine Analysis of Facial Expressions
Cited by 52 (19 self)
The human face is the site for major sensory inputs and major communicative outputs. It houses the majority of our sensory apparatus as well as our speech production apparatus. It is used to identify other members of our species, to gather information about age, gender, attractiveness, and personality, and to regulate conversation by gazing or nodding. Moreover, the human face is our preeminent means of communicating and understanding somebody’s affective state and intentions on the basis of the shown facial expression (Keltner & Ekman, 2000). Thus, the human face is a multi-signal input-output communicative system capable of tremendous flexibility and specificity (Ekman & Friesen, 1975). In general, the human face conveys information via four kinds of signals. (a) Static facial signals represent relatively permanent features of the face, such as the bony structure, the soft tissue, and the overall proportions of the face. These signals contribute to an individual’s appearance and are usually exploited for person identification. (b) Slow facial signals represent changes in the appearance of the face that occur gradually over time, such as development of permanent wrinkles and changes in skin texture.
Investigating spontaneous facial action recognition through AAM representations of the face
Face Recognition Book, Pro Literatur Verlag, 2007
Cited by 43 (10 self)
The Facial Action Coding System (FACS) [Ekman et al., 2002] is the leading method for measuring facial movement in behavioral science. FACS has been successfully applied to, among other things, identifying the differences between simulated and genuine pain, differences between when people are telling the truth versus lying, and differences between suicidal and ...
Observer-based measurement of facial expression with the Facial Action Coding System
Oxford University, 2007
Cited by 41 (11 self)
Facial expression has been a focus of emotion research for over a hundred years (Darwin, 1872/1998). It is central to several leading theories of emotion (Ekman, 1992; Izard, 1977; Tomkins, 1962) and has been the focus of at times heated debate about issues in emotion science (Ekman, 1973, 1993;
Action Unit Detection with Segment-based SVMs
Cited by 36 (7 self)
Automatic facial action unit (AU) detection from video is a long-standing problem in computer vision. Two main approaches have been pursued: (1) static modeling, typically posed as a discriminative classification problem in which each video frame is evaluated independently; and (2) temporal modeling, in which frames are segmented into sequences and typically modeled with a variant of dynamic Bayesian networks. We propose a segment-based approach, kSeg-SVM, that incorporates the benefits of both approaches and avoids their limitations. kSeg-SVM is a temporal extension of the spatial bag-of-words (BoW) representation. kSeg-SVM is trained within a structured-output SVM framework that formulates AU detection as a problem of detecting temporal events in a time series of visual features. Each segment is modeled by a variant of the BoW representation with soft assignment of the words based on similarity. Our framework has several benefits for AU detection: (1) both dependencies between features and the length of action units are modeled; (2) all possible segments of the video may be used for training; and (3) no assumptions are required about the underlying structure of the action unit events (e.g., i.i.d.). Our algorithm finds the best k-or-fewer segments that maximize the SVM score. Experimental results suggest that the proposed method outperforms state-of-the-art static methods for AU detection.
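A minimal sketch of one ingredient named above: encoding a temporal segment as a bag-of-words histogram with soft assignment of frames to codewords based on similarity. The Gaussian similarity kernel, codebook, and frame features are assumptions for illustration; this is not the authors' kSeg-SVM training procedure.

```python
import numpy as np

def soft_bow(frames, codebook, sigma=1.0):
    """frames: (n_frames, d) features; codebook: (k, d) codewords.
    Each frame contributes to every word in proportion to a Gaussian
    similarity, instead of voting only for its single nearest word."""
    d2 = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True)   # each frame's votes sum to 1
    return w.sum(axis=0)                # histogram over the k codewords

rng = np.random.default_rng(3)
codebook = rng.normal(size=(8, 16))     # k = 8 words, 16-dim frame features
segment = rng.normal(size=(25, 16))     # 25 frames in a candidate segment
h = soft_bow(segment, codebook)
print(h.round(2), h.sum())              # histogram mass equals n_frames
```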