
Dynamic Bayesian networks for information fusion with applications to human-computer interfaces (1999)

by V Pavlovic
Results 1 - 10 of 37

Active and Dynamic Information Fusion for Facial Expression Understanding from Image Sequences

by Yongmian Zhang, Qiang Ji - IEEE TRANS. PATTERN ANALYSIS & MACHINE INTELLIGENCE, 2005
Abstract - Cited by 111 (8 self)
This paper explores the use of a multisensory information fusion technique with Dynamic Bayesian networks (DBNs) for modeling and understanding the temporal behaviors of facial expressions in image sequences. Our facial feature detection and tracking based on active IR illumination provides reliable visual information under variable lighting and head motion. Our approach to facial expression recognition lies in the proposed dynamic and probabilistic framework based on combining DBNs with Ekman's Facial Action Coding System (FACS) for systematically modeling the dynamic and stochastic behaviors of spontaneous facial expressions. The framework not only provides a coherent and unified hierarchical probabilistic framework to represent spatial and temporal information related to facial expressions, but also allows us to actively select the most informative visual cues from the available information sources to minimize the ambiguity in recognition. The recognition of facial expressions is accomplished by fusing not only the current visual observations, but also the previous visual evidence. Consequently, the recognition becomes more robust and accurate through explicitly modeling the temporal behavior of facial expressions. In this paper, we present the theoretical foundation underlying the proposed probabilistic and dynamic framework for facial expression modeling and understanding. Experimental results demonstrate that our approach can accurately and robustly recognize spontaneous facial expressions from an image sequence under different conditions.

Context-aware visual tracking

by Ming Yang, Ying Wu, Gang Hua - IEEE TRANS. ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2009
Abstract - Cited by 55 (8 self)
Enormous uncertainties in unconstrained environments lead to a fundamental dilemma that many tracking algorithms have to face in practice: Tracking has to be computationally efficient, but verifying whether or not the tracker is following the true target tends to be demanding, especially when the background is cluttered and/or when occlusion occurs. Due to the lack of a good solution to this problem, many existing methods tend to be either effective but computationally intensive by using sophisticated image observation models or efficient but vulnerable to false alarms. This greatly challenges long-duration robust tracking. This paper presents a novel solution to this dilemma by considering the context of the tracking scene. Specifically, we integrate into the tracking process a set of auxiliary objects that are automatically discovered in the video on the fly by data mining. Auxiliary objects have three properties, at least in a short time interval: 1) persistent co-occurrence with the target, 2) consistent motion correlation to the target, and 3) easy to track. Regarding these auxiliary objects as the context of the target, the collaborative tracking of these auxiliary objects leads to efficient computation as well as strong verification. Our extensive experiments have exhibited exciting performance in very challenging real-world testing cases.

Citation Context

...onsensus since ψ(x1,x2) is approaching to an impulse delta function, and vice versa. The Bayesian MAP inference of x1 and the ML estimate of σ12 can be obtained by the following Bayesian EM algorithm [48], i.e., $x_1 = \left(\Sigma_1^{-1} + \frac{1}{\sigma_{12}^2} I\right)^{-1} \left(\Sigma_1^{-1} z_1 + \frac{1}{\sigma_{12}^2}(A_{12} x_2 + \mu_{12})\right)$ (20), $\sigma_{12}^2 = \frac{1}{n}(x_1 - A_{12} x_2 - \mu_{12})^\top (x_1 - A_{12} x_2 - \mu_{12})$ (21). Fixing σ12, the E-Step in Eq. 20 obtains the MAP estimate of...
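
As a rough illustration of the alternation between Eq. 20 and Eq. 21 in the excerpt above, the following minimal sketch performs one pass of both updates. It assumes, purely for brevity, that Σ1 is diagonal (passed as the list `var1`), so the matrix inverse in Eq. 20 reduces to an elementwise division. The function name `em_step` and its argument names are hypothetical and do not come from the cited paper.

```python
def em_step(z1, var1, A12, x2, mu12, sigma2):
    """One pass of Eq. 20 followed by Eq. 21.

    Assumption (not from the paper): Sigma_1 = diag(var1), so the
    inverse in Eq. 20 can be applied one coordinate at a time.
    """
    n = len(z1)
    # Position of x1 predicted from its neighbor: A12 x2 + mu12.
    pred = [
        sum(A12[r][c] * x2[c] for c in range(len(x2))) + mu12[r]
        for r in range(n)
    ]
    # Eq. 20: precision-weighted average of the measurement z1
    # and the neighbor's prediction.
    x1 = [
        (z1[k] / var1[k] + pred[k] / sigma2) / (1.0 / var1[k] + 1.0 / sigma2)
        for k in range(n)
    ]
    # Eq. 21: mean squared residual between x1 and the prediction.
    new_sigma2 = sum((x1[k] - pred[k]) ** 2 for k in range(n)) / n
    return x1, new_sigma2


identity = [[1.0, 0.0], [0.0, 1.0]]
x1, sigma2 = em_step([0.0, 0.0], [1.0, 1.0], identity, [2.0, 2.0], [0.0, 0.0], 1.0)
print(x1, sigma2)
```

In this toy setting the measurement and the prediction are weighted equally, so x1 lands halfway between them. Repeating the step with the returned `sigma2` gives the fixed-point iteration the excerpt describes.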

Measurement Integration Under Inconsistency for Robust Tracking

by Gang Hua, Ying Wu - IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2006
Abstract - Cited by 15 (3 self)
The solutions to many vision problems involve integrating measurements from multiple sources. Most existing methods rely on a hidden assumption, i.e., these measurements are consistent. In reality, unfortunately, this may not hold. The fact that naively fusing inconsistent measurements amounts to failing these methods indicates that this is not a trivial problem. This paper presents a novel approach to handling it. A new theorem is proven that gives two algebraic criteria to examine the consistency and inconsistency. In addition, a more general criterion is presented. Based on the theoretical analysis, a new information integration method is proposed and leads to encouraging results when applied to the task of visual tracking.

Citation Context

... to a delta function, and vice versa. Denote $\Theta = \{\sigma_{ij}^2 : \{i, j\} \in E\}$, Eq. 1 is indeed p(X|Θ,Z). The MAP estimate of xi and the ML estimate of Θ can be obtained by the following Bayesian EM algorithm [8], i.e., $x_i = \left(\Sigma_i^{-1} + \sum_{j \in N(i)} \frac{1}{\sigma_{ij}^2} I\right)^{-1} \left(\Sigma_i^{-1} z_i + \sum_{j \in N(i)} \frac{1}{\sigma_{ij}^2}(A_{ij} x_j + \mu_{ij})\right)$ (6), $\sigma_{ij}^2 = \frac{1}{n}(x_i - A_{ij} x_j - \mu_{ij})^T (x_i - A_{ij} x_j - \mu_{ij})$ (7). Fixing Θ, the E-Step in Eq. 6 obtains the MA...

Variational Learning in Mixed-State Dynamic Graphical Models

by Vladimir Pavlovic, Brendan J. Frey, Thomas S. Huang - IN PROCEEDINGS OF THE FIFTEENTH ANNUAL CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI-99), 1999
Abstract - Cited by 10 (2 self)
Many real-valued stochastic time-series are locally linear (Gaussian), but globally nonlinear. For example, the trajectory of a human hand gesture can be viewed as a linear dynamic system driven by a nonlinear dynamic system that represents muscle actions. We present a mixed-state dynamic graphical model in which a hidden Markov model drives a linear dynamic system. This combination allows us to model both the discrete and continuous causes of trajectories such as human gestures. The number of computations needed for exact inference is exponential in the sequence length, so we derive an approximate variational inference technique that can also be used to learn the parameters of the discrete and continuous models. We show how the mixed-state model and the variational technique can be used to classify human hand gestures made with a computer mouse.
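
To make the idea of a hidden Markov model driving a linear dynamic system more concrete, here is a minimal generative sketch of a scalar mixed-state model. It only samples trajectories. It does not implement the variational inference the abstract refers to. All names (`sample_switching_lds`, `gains`, `offsets`, `noise_std`) and the scalar form of the dynamics are illustrative assumptions, not taken from the paper.

```python
import random


def sample_switching_lds(steps, transition, gains, offsets, noise_std, seed=0):
    """Sample (states, trajectory) from a scalar switching linear system.

    Hypothetical parameterization: in regime s the continuous state
    evolves as x_t = gains[s] * x_{t-1} + offsets[s] + Gaussian noise.
    """
    rng = random.Random(seed)
    state, x = 0, 0.0
    states, trajectory = [], []
    for _ in range(steps):
        # Discrete layer: the hidden Markov chain picks the next regime.
        state = rng.choices(range(len(transition)), weights=transition[state])[0]
        # Continuous layer: linear-Gaussian dynamics chosen by that regime.
        x = gains[state] * x + offsets[state] + rng.gauss(0.0, noise_std)
        states.append(state)
        trajectory.append(x)
    return states, trajectory


states, trajectory = sample_switching_lds(
    steps=20,
    transition=[[0.9, 0.1], [0.2, 0.8]],
    gains=[0.9, 0.5],
    offsets=[0.0, 1.0],
    noise_std=0.1,
)
print(states)
```

Each regime is locally linear, while the switching between regimes makes the overall trajectory nonlinear, which is the structure the abstract describes.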

Applying Dynamic Bayesian Networks in Transliteration Identification

by Peter Nabende
Abstract - Cited by 8 (1 self)
This report presents work associated with the application of Dynamic Bayesian Networks (DBNs) in transliteration identification. Transliteration identification is mainly needed to help deal with Out Of Vocabulary words in Cross Language Information Retrieval (CLIR) and Machine Translation (MT). In related transliteration studies, transliteration identification refers

Narrative Spaces: bridging architecture and entertainment via interactive technology

by Flavia Sparacino - 6th International Conference on Generative Art, 2002
Abstract - Cited by 8 (0 self)
Our society's modalities of communication are rapidly changing. Large panel displays and screens are being installed in many public spaces, ranging from open plazas, to shopping malls, to private houses, to theater stages, classrooms, and museums. In parallel, wearable computers are transforming our technological landscape by reshaping the heavy, bulky desktop computer into a lightweight, portable device that is accessible to people at any time. Computation and sensing are moving from computers and devices into the environment itself. The space around us is instrumented with sensors and displays, and it tends to reflect a diffused need to combine the information space with our physical space. This combination of large public and miniature personal digital displays together with distributed computing and sensing intelligence offers unprecedented opportunities to merge the virtual and the real, the information landscape of the Internet with the urban landscape of the city, to transform digital animated media into storytellers, in public installations and through personal wearable technology. This paper describes technological platforms built at the MIT Media Lab, through 1994-2002, that contribute to defining new trends in architecture that merge virtual and real spaces, and are reshaping the way we live and experience the museum, the house, the theater, and the modern city.

Citation Context

..., when pointing direction, and accurate depth information are needed. Hidden Markov Models and more recently Bayesian networks, have been successfully used to classify human movements and gestures [8][9]. 2.2. Robustness of multimodal perception. Robust sensing is the premise for the correct interpretation of the user's intention. Monosensor applications which rely on one unique sensor modality to acq...

Variational Maximum A Posteriori by Annealed Mean Field Analysis

by Gang Hua, Ying Wu
Abstract - Cited by 7 (2 self)
This paper proposes a novel probabilistic variational method with deterministic annealing for the maximum a posteriori (MAP) estimation of complex stochastic systems. Since the MAP estimation involves global optimization, in general, it is very difficult to achieve. Therefore, most probabilistic inference algorithms are only able to achieve either the exact or the approximate posterior distributions. Our method constrains the mean field variational distribution to be multivariate Gaussian. Then, a deterministic annealing scheme is incorporated into the mean field fixed-point iterations to obtain the optimal MAP estimate. This is based on the observation that when the covariance of the variational Gaussian distribution approaches zero, the infimum point of the Kullback-Leibler (KL) divergence between the variational Gaussian and the real posterior will be the same as the supremum point of the real posterior. Although global optimality may not be guaranteed, our extensive synthetic and real experiments demonstrate the effectiveness and efficiency of the proposed method. Index Terms: Mean field variational analysis, deterministic annealing, maximum a posteriori estimation, graphical model, Markov network.

Citation Context

...ory involves the Bayesian inference algorithms on graphical models, while the third category is related to the global optimization methods. Bayesian network (BN), dynamic Bayesian network (DBN) [20], [21], Markov network [4], [5], and dynamic Markov network [7], [10], [22] are all typical graphical models [23]. They are widely used for modeling and solving computer vision problems. To mention some, a BN is pr...

Human-robot interface with anticipatory characteristics based on Laban Movement Analysis and Bayesian models

by Jörg Rett, Jorge Dias - Proceedings of the IEEE 10th International Conference on Rehabilitation Robotics (ICORR), 2007
Abstract - Cited by 6 (6 self)
In this work we contribute to the field of human-machine interaction with a system that anticipates human movements using the concept of Laban Movement Analysis (LMA). The implementation uses a Bayesian model for learning and classification and results are presented for the application to online gesture recognition. The merging of assistive robotics and socially interactive robotics has recently led to the definition of socially assistive robotics. What is necessary and we found still missing are socially interactive robots with a higher level cognitive system which analyzes deeply the observed human movement. In this article we provide a framework for cognitive processes to be implemented in human-machine-interfaces based on today's technologies. We present LMA as a concept that helps to identify useful low-level features, defines a framework of mid-level descriptors for movement-properties and helps to develop a classifier of expressive actions. Our interface anticipates a performed action observed from a stream of monocular camera images by using a Bayesian framework. With this work we define the required qualities and characteristics of future embodied agents in terms of social interaction with humans. This article searches for human qualities like anticipation and empathy and presents possible ways towards implementation in the cognitive system of a social robot. We present results through its embodiment in the social robot 'Nicole' in the context of a person performing gestures and 'Nicole' reacting by means of audio output and robot movement. ∗This work is partially supported by FCT-Fundação para a Ciência

Citation Context

... of: 1. Feature Extraction, 2. Feature Correspondence and 3. High Level Processing. Meanwhile there is a large amount of work on gesture recognition mainly applied to control some sort of devices. In [17] DBNs were used to recognize a set of eleven hand gestures to manipulate a virtual display shown on a projection screen. Surveys specialized on gesture interfaces along the last ten years reflect the...

Gesture recognition using a marionette model and dynamic Bayesian networks (DBNs)

by Jörg Rett, Jorge Dias - ICIAR 2006. LNCS, 2006
Abstract - Cited by 5 (1 self)
This paper presents a framework for gesture recognition by modeling a system based on Dynamic Bayesian Networks (DBNs) from a Marionette point of view. To incorporate human qualities like anticipation and empathy inside the perception system of a social robot remains, so far, an open issue. It is our goal to search for ways of implementation and test the feasibility. Towards this end we started the development of the guide robot 'Nicole' equipped with a monocular camera and an inertial sensor to observe its environment. The context of interaction is a person performing gestures and 'Nicole' reacting by means of audio output and motion. In this paper we present a solution to the gesture recognition task based on a Dynamic Bayesian Network (DBN). We show that using a DBN is a human-like concept of recognizing gestures that encompasses the quality of anticipation through the concept of prediction and update. A novel approach is used by incorporating a marionette model in the DBN as a trade-off between simple constant acceleration models and complex articulated models.

Citation Context

...The process of prediction and update represents an intrinsic implementation of the mental concept of anticipation. Furthermore these methods have already proven their usability in gesture recognition [3,4]. To enhance the quality of inference and learning we introduce the marionette concept as a physical model of human motion to support the probabilistic model. The concept which was inspired by researc...
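
The prediction-and-update cycle mentioned in the excerpt above can be sketched with a discrete-state forward filter, the simplest special case of filtering in a DBN. This is a generic illustration under assumed inputs (a two-state belief, a hypothetical transition table, and per-state observation likelihoods), not the marionette model of the cited paper.

```python
def predict(belief, transition):
    """Prediction: push the current belief through the transition model.

    transition[i][j] is the assumed probability of moving from state i
    to state j between two time slices.
    """
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]


def update(predicted, likelihood):
    """Update: reweight by the observation likelihood and renormalize."""
    unnormalized = [p * l for p, l in zip(predicted, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


belief = [0.5, 0.5]
transition = [[0.9, 0.1], [0.2, 0.8]]
predicted = predict(belief, transition)    # belief before seeing the new frame
posterior = update(predicted, [0.2, 0.8])  # belief after the observation
print(predicted, posterior)
```

The predicted belief is available before the next observation arrives, which is the sense in which such a filter can be said to anticipate.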

Vision And Learning For Intelligent Human-Computer Interaction

by Ying Wu, 2001
Abstract - Cited by 4 (0 self)
It was a dream to make computers see. The research in computer vision provides promising technologies to capture, analyze, transmit, retrieve and interpret visual information. However, due to the richness and large variations in the visual inputs, the practice of many statistical learning techniques for visual motion capturing and recognition are confronted by some similar problems, such that making intelligent and visually capable machines is still a challenging task. This dissertation concentrates on two important problems: capturing and recognizing human motion in video sequences, which are crucial for the research and applications of intelligent human computer interaction, multimedia communication, and smart environments.

Citation Context

...arities between sign languages and spoken languages, the hidden Markov model (HMM) and its variants are also used to model the hand dynamics [27, 28]. As a generalization of HMM, dynamic Bayesian net [29] is another promising approach to model the hand dynamics. These methods are essentially learning methods that learn the intrinsic dynamics from a set of training data. The knowledge of dynamics and s...
