Results 1 - 10 of 68
Automatic sign language analysis: A survey and the future beyond lexical meaning
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005
Cited by 122 (1 self)
Research in automatic analysis of sign language has largely focused on recognizing the lexical (or citation) form of sign gestures as they appear in continuous signing, and on developing algorithms that scale well to large vocabularies. However, successful recognition of lexical signs is not sufficient for a full understanding of sign language communication. Nonmanual signals and grammatical processes which result in systematic variations in sign appearance are integral aspects of this communication but have received comparatively little attention in the literature. In this survey, we examine data acquisition, feature extraction, and classification methods employed for the analysis of sign language gestures. These are discussed with respect to issues such as modeling transitions between signs in continuous signing, modeling inflectional processes, signer independence, and adaptation. We further examine works that attempt to analyze nonmanual signals and discuss issues related to integrating these with (hand) sign gestures. We also discuss the overall progress toward a true test of sign recognition systems: dealing with natural signing by native signers. We suggest some future directions for this research and also point to contributions it can make to other fields of research. Web-based supplemental materials (appendices), which contain several illustrative examples and videos of signing, can be found at www.computer.org/publications/dlib.
A system for learning statistical motion patterns
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006
Cited by 119 (1 self)
Recognizing human actions in videos acquired by uncalibrated moving cameras
- In: IEEE International Conference on Computer Vision, 2005
Cited by 77 (3 self)
Most work in action recognition deals with sequences acquired by stationary cameras with fixed viewpoints. When the camera itself moves, however, the trajectories of the body parts contain not only the motion of the performing actor but also the motion of the camera. In addition to the camera motion, different viewpoints of the same action in different environments result in different trajectories, which cannot be matched using standard approaches. In order to handle these problems, we propose to use the multi-view geometry between two actions. However, the well-known epipolar geometry of static scenes, where the cameras are stationary, is not suitable for our task. Thus, we propose to extend the standard epipolar geometry to the geometry of dynamic scenes where the cameras are moving. We demonstrate the versatility of the proposed geometric approach for recognition of actions in a number of challenging sequences.
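The dynamic-scene geometry proposed in this paper builds on the standard epipolar constraint. As a point of reference, the classical static-scene building block can be sketched as follows: the normalized 8-point estimate of a fundamental matrix from point correspondences, with the residual epipolar error usable as a trajectory dissimilarity score. This is a minimal sketch of the classical tool, not the paper's dynamic-scene extension; all data and function names are illustrative.

```python
import numpy as np

def fundamental_8point(x1, x2):
    """Normalized 8-point estimate of the fundamental matrix from N >= 8
    point correspondences (x1, x2: Nx2 arrays)."""
    def normalize(pts):
        # Hartley normalization: center points, scale mean norm to sqrt(2).
        c = pts.mean(axis=0)
        s = np.sqrt(2) / np.mean(np.linalg.norm(pts - c, axis=1))
        T = np.array([[s, 0.0, -s * c[0]], [0.0, s, -s * c[1]], [0.0, 0.0, 1.0]])
        h = np.column_stack([pts, np.ones(len(pts))])
        return (T @ h.T).T, T

    p1, T1 = normalize(x1)
    p2, T2 = normalize(x2)
    # Each correspondence contributes one row of the linear system A f = 0,
    # where f holds the 9 entries of F in row-major order.
    A = np.column_stack([p2[:, :1] * p1, p2[:, 1:2] * p1, p1])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce the rank-2 constraint, then undo the normalization.
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    return T2.T @ F @ T1

def epipolar_residual(F, x1, x2):
    """Mean algebraic epipolar error |x2^T F x1|, usable as a
    dissimilarity score between two sets of corresponding points."""
    h1 = np.column_stack([x1, np.ones(len(x1))])
    h2 = np.column_stack([x2, np.ones(len(x2))])
    return float(np.mean(np.abs(np.sum(h2 * (F @ h1.T).T, axis=1))))
```

For exact correspondences between two views of a rigid scene the residual is near zero; mismatched or geometrically inconsistent trajectories yield a large residual.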
A Review of Vision Based Hand Gestures Recognition
- International Journal of Information Technology and Knowledge
Cited by 29 (0 self)
With the ever-increasing diffusion of computers into society, it is widely believed that the present popular modes of interaction with computers (mouse and keyboard) will become a bottleneck to the effective flow of information between computers and humans. Vision-based gesture recognition has the potential to be a natural and powerful tool supporting efficient and intuitive interaction between the human and the computer. Visual interpretation of hand gestures can help in achieving the ease and naturalness desired for human-computer interaction (HCI). This has made computer vision-based analysis and interpretation of hand gestures a very active research area. We survey the literature on visual interpretation of hand gestures in the context of its role in HCI, emphasizing the seminal works of various researchers. The purpose of this review is to introduce the field of gesture recognition as a mechanism for interaction with computers.
Sign language spotting with a threshold model based on conditional random fields
- IEEE TPAMI
Cited by 22 (3 self)
Sign language spotting is the task of detecting and recognizing signs in a signed utterance, in a set vocabulary. The difficulty of sign language spotting is that instances of signs vary in both motion and appearance. Moreover, signs appear within a continuous gesture stream, interspersed with transitional movements between signs in a vocabulary and nonsign patterns (which include out-of-vocabulary signs, epentheses, and other movements that do not correspond to signs). In this paper, a novel method for designing threshold models in a conditional random field (CRF) model is proposed, which provides an adaptive threshold for distinguishing between signs in a vocabulary and nonsign patterns. A short-sign detector, a hand appearance-based sign verification method, and a subsign reasoning method are included to further improve sign language spotting accuracy. Experiments demonstrate that our system can spot signs from continuous data with an 87.0 percent spotting rate and can recognize signs from isolated data with a 93.5 percent recognition rate, versus 73.5 percent and 85.4 percent, respectively, for CRFs without a threshold model, short-sign detection, subsign reasoning, and hand appearance-based sign verification. Our system can also achieve a 15.0 percent sign error rate (SER) on continuous data and a 6.4 percent SER on isolated data, versus 76.2 percent and 14.5 percent, respectively, for conventional CRFs. Index Terms: Sign language recognition, sign language spotting, conditional random field, threshold model.
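The core rejection idea behind a threshold model can be illustrated with a much-simplified sketch: a segment is labeled with an in-vocabulary sign only when the best vocabulary score beats the score of a background ("non-sign") model, which acts as an adaptive, data-dependent threshold. This is not the paper's CRF formulation; the scores, labels, and function name below are illustrative.

```python
import numpy as np

def spot_signs(frame_scores, nonsign_scores, labels):
    """Adaptive-threshold spotting sketch. frame_scores is a T x K array of
    log-scores for K vocabulary signs; nonsign_scores (length T) are the
    log-scores of a background model. A frame is rejected (None) whenever
    the background model outscores every vocabulary sign."""
    best = frame_scores.argmax(axis=1)
    accept = frame_scores.max(axis=1) > nonsign_scores
    return [labels[b] if ok else None for b, ok in zip(best, accept)]

# Toy example: three frames, two vocabulary signs.
scores = np.array([[-1.0, -5.0], [-6.0, -2.0], [-4.0, -3.0]])
background = np.array([-2.0, -3.0, -1.0])
result = spot_signs(scores, background, ["HELLO", "THANKS"])
```

In the toy example the third frame is rejected because the background model scores higher than either sign, mimicking the rejection of transitional movements and out-of-vocabulary signs.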
Real-time gesture recognition by learning and selective control of visual interest points
- IEEE Trans. on PAMI, 2005
Cited by 19 (2 self)
For the real-time recognition of unspecified gestures by an arbitrary person, a comprehensive framework is presented that addresses two important problems in gesture recognition systems: selective attention and processing frame rate. To address the first problem, we propose the Quadruple Visual Interest Point Strategy. No assumptions are made with regard to scale or rotation of visual features, which are computed from dynamically changing regions of interest in a given image sequence. In this paper, each of the visual features is referred to as a visual interest point, to which a probability density function is assigned and over which the selection is carried out. To address the second problem, we developed a selective control method to equip the recognition system with self-load monitoring and controlling functionality. Through evaluation experiments, we show that our approach provides robust recognition with respect to such factors as type of clothing, type of gesture, extent of motion trajectories, and individual differences in motion characteristics. To demonstrate the real-time performance and utility of our approach, a gesture video system was developed that supports full video-rate interaction with displayed image objects. Index Terms: Gesture recognition, selective control, visual interest points, Gaussian density feature, real-time interaction.
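The self-load monitoring idea can be sketched generically: measure the time each frame took, and shrink or grow the per-frame workload (here, the number of interest points processed) to stay within a frame-time budget. This is an illustrative control loop under assumed parameters, not the paper's actual selective control law.

```python
class LoadController:
    """Self-load monitoring sketch: adapt the number of visual interest
    points processed per frame to a frame-time budget (e.g. 1/30 s)."""

    def __init__(self, budget_s, n_points=64, n_min=8, n_max=256):
        self.budget_s = budget_s
        self.n_points = n_points
        self.n_min, self.n_max = n_min, n_max

    def update(self, elapsed_s):
        if elapsed_s > self.budget_s:
            # Over budget: halve the workload to recover video rate.
            self.n_points = max(self.n_min, self.n_points // 2)
        elif elapsed_s < 0.5 * self.budget_s:
            # Comfortable headroom: process a few more points for accuracy.
            self.n_points = min(self.n_max, self.n_points + 8)
        return self.n_points
```

The asymmetry (multiplicative decrease, additive increase) is a common choice for load control: it backs off quickly when the frame rate drops and ramps up conservatively.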
Tsotsos. Hand gesture recognition within a linguistics-based framework
- In Proc. ECCV, 2004
Cited by 16 (1 self)
An approach to recognizing hand gestures from a monocular temporal sequence of images is presented. Of particular concern is the representation and recognition of hand movements that are used in single-handed American Sign Language (ASL). The approach exploits previous linguistic analysis of manual languages that decomposes dynamic gestures into their static and dynamic components. The first level of decomposition is in terms of three sets of primitives: hand shape, location, and movement. Further levels of decomposition involve the lexical and sentence levels and are part of our plan for future work. We propose and demonstrate that, given a monocular gesture sequence, kinematic features can be recovered from the apparent motion that provide distinctive signatures for 14 primitive movements of ASL. The approach has been implemented in software and evaluated on a database of 592 gesture sequences, with an overall recognition rate of 86.00% for fully automated processing and 97.13% for manually initialized processing.
Robust person-independent visual sign language recognition
- In IbPRIA, 2005
Cited by 16 (2 self)
Sign language recognition constitutes a challenging field of research in computer vision. Common problems like overlap, ambiguities, and minimal pairs occur frequently and require robust algorithms for feature extraction and processing. We present a system that performs person-dependent recognition of 232 isolated signs with an accuracy of 99.3% in a controlled environment. Person-independent recognition rates reach 44.1% for 221 signs. An average performance of 87.8% is achieved for six signers in various uncontrolled indoor and outdoor environments, using a reduced vocabulary of 18 signs. The system uses a background model to remove static areas from the input video at the pixel level. In the tracking stage, multiple hypotheses are pursued in parallel to handle ambiguities and facilitate retrospective correction of errors. A winner hypothesis is found by applying high-level knowledge of the human body, hand motion, and the signing process. Overlaps are resolved by template matching, exploiting temporally adjacent frames with little or no overlap. The extracted features are normalized for person independence and robustness, and classified by hidden Markov models.
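The final classification step, scoring a feature sequence against per-sign hidden Markov models and picking the most likely sign, can be sketched with the standard scaled forward algorithm over discrete observations. The toy models and sign names below are hypothetical; the paper's system uses continuous features, not this discrete stand-in.

```python
import numpy as np

def forward_log_likelihood(pi, A, B, obs):
    """Log-likelihood of a discrete observation sequence under an HMM with
    initial distribution pi, transition matrix A, and emission matrix B,
    computed with the scaled forward algorithm."""
    alpha = pi * B[:, obs[0]]
    ll = 0.0
    for o in obs[1:]:
        c = alpha.sum()           # scaling factor, accumulated in log space
        ll += np.log(c)
        alpha = (alpha / c) @ A * B[:, o]
    return ll + np.log(alpha.sum())

def classify(models, obs):
    """Pick the sign whose HMM assigns the sequence the highest likelihood."""
    return max(models, key=lambda sign: forward_log_likelihood(*models[sign], obs))

# Toy 2-state models: one sign prefers symbol 0, the other symbol 1.
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B_hi = np.array([[0.9, 0.1], [0.8, 0.2]])
B_lo = np.array([[0.1, 0.9], [0.2, 0.8]])
models = {"HELLO": (pi, A, B_hi), "WORLD": (pi, A, B_lo)}
```

Scaling the forward variables at each step keeps the recursion numerically stable for long sequences, where unscaled probabilities would underflow.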
Region-Based Hierarchical Image Matching
- International Journal of Computer Vision, 2007
Cited by 14 (6 self)
This paper presents an approach to region-based hierarchical image matching, where, given two images, the goal is to identify the largest part in image 1 and its match in image 2 having the maximum similarity measure defined in terms of geometric and photometric properties of regions (e.g., area, boundary shape, and color), as well as region topology (e.g., recursive embedding of regions). To this end, each image is represented by a tree of recursively embedded regions, obtained by a multiscale segmentation algorithm. This allows us to pose image matching as the tree matching problem. To overcome imaging noise, one-to-one, many-to-one, and many-to-many node correspondences are allowed. The trees are first augmented with new nodes generated by merging adjacent sibling nodes, which produces directed acyclic graphs (DAGs). Then, transitive closures of the DAGs are constructed, and the tree matching problem is reformulated as finding a bijection between the two transitive closures on DAGs, while preserving the connectivity and ancestor-descendant relationships of the original trees. The proposed approach is validated on real images showing similar objects, captured under different types of noise, including differences in lighting conditions, scales, or viewpoints, amidst limited occlusion and clutter.
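The region-tree representation and the matching recursion can be sketched in a much simpler one-to-one form: each node carries geometric and photometric properties, and the similarity of two trees is the node similarity plus the best assignment between their subtrees. The paper's method additionally allows many-to-one and many-to-many correspondences via merged sibling nodes and DAG transitive closures; the `Region` properties and similarity weights below are illustrative.

```python
from dataclasses import dataclass, field
from itertools import permutations

@dataclass
class Region:
    area: float
    color: float                 # hypothetical 1-D photometric property (mean intensity)
    children: list = field(default_factory=list)

def node_sim(a, b):
    # Similarity in [0, 1] mixing a geometric and a photometric cue.
    geometric = min(a.area, b.area) / max(a.area, b.area)
    photometric = 1.0 - abs(a.color - b.color) / 255.0
    return 0.5 * (geometric + photometric)

def tree_sim(a, b):
    # Node similarity plus the best one-to-one assignment of subtrees
    # (brute force over permutations; fine for small branching factors).
    small, large = sorted((a.children, b.children), key=len)
    best = 0.0
    for perm in permutations(large, len(small)):
        best = max(best, sum(tree_sim(x, y) for x, y in zip(small, perm)))
    return node_sim(a, b) + best
```

Matching identical trees yields one unit of similarity per matched node, so the score also reflects how much of the hierarchy was put in correspondence, echoing the paper's goal of finding the largest matching part.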