
Markerless motion capture of multiple characters using multi-view image segmentation (2013)

by Y Liu, J Gall, C Stoll, Q Dai, H-P Seidel, C Theobalt

Results 1 - 4 of 4

Body Parts Dependent Joint Regressors for Human Pose Estimation in Still Images

by Matthias Dantone, Juergen Gall, Christian Leistner, Luc Van Gool
Cited by 3 (1 self)
Abstract—In this work, we address the problem of estimating 2D human pose from still images. Articulated body pose estimation is challenging due to the large variation in body poses and in the appearance of the different body parts. Recent methods that rely on the pictorial structure framework have been shown to be very successful in solving this task. They model the body part appearances using discriminatively trained, independent part templates and the spatial relations of the body parts using a tree model. Within such a framework, we address the problem of obtaining better part templates that can handle a very high variation in appearance. To this end, we introduce parts dependent body joint regressors, which are random forests that operate over two layers. While the first layer acts as an independent body part classifier, the second layer takes the estimated class distributions of the first one into account and is thereby able to predict joint locations by modeling the interdependence and co-occurrence of the parts. This helps to overcome typical ambiguities of tree structures, such as self-similarities of legs and arms. In addition, we introduce a novel dataset termed FashionPose that contains over 7,000 images with a challenging variation of body part appearances due to a large variation of dressing styles. In the experiments, we demonstrate that the proposed parts dependent joint regressors outperform independent classifiers or regressors. The method also performs better than or similar to the state of the art in terms of accuracy, while running at a couple of frames per second. Index Terms—Human pose estimation, fashion, random forest, regression, classification

Citation Context

...timation, fashion, random forest, regression, classification 1 INTRODUCTION While current systems for human pose estimation achieve impressive results on depth data [1] or multicamera video footage [2], human pose estimation from still images is still an unsolved task. In particular, images from the web impose many difficulties due to large variation of poses and dressing styles. In order to addres...

Free-viewpoint Video of Human Actors using Multiple Handheld Kinects

by Genzhi Ye, Yebin Liu, Yue Deng, Nils Hasler, Xiangyang Ji, Qionghai Dai, Christian Theobalt - IEEE T-SMC:B SPECIAL ISSUE ON COMPUTER VISION FOR RGB-D SENSORS: KINECT AND ITS APPLICATIONS
Cited by 2 (0 self)
We present an algorithm for creating free-viewpoint video of interacting humans using three hand-held Kinect cameras. Our method reconstructs deforming surface geometry and temporally varying texture of humans through estimation of human poses and camera poses for every time step of the RGBZ video. Skeletal configurations and camera poses are found by solving a joint energy minimization problem which optimizes the alignment of RGBZ data from all cameras, as well as the alignment of human shape templates to the Kinect data. The energy function is based on a combination of geometric correspondence finding, implicit scene segmentation, and correspondence finding using image features. Finally, texture recovery is achieved through joint optimization on spatio-temporal RGB data using matrix completion. As opposed to previous methods, our algorithm succeeds on free-viewpoint video of human actors in general uncontrolled indoor scenes with potentially dynamic backgrounds, and it succeeds even if the cameras are moving.

Place Date Signature

by Matthias Straka, 2014
declare that I have authored this thesis independently, that I have not used other than the declared sources / resources, and that I have explicitly marked all material which has been quoted either literally or by content from the used sources.

Citation Context

...mesh vertices rigged to the skeleton to estimate the pose of the model given the image data. This approach has been extended to work with multiple people if each person can be segmented independently [99]. All above methods use the skeleton for initialization of the mesh but not during shape adaptation itself. Recent developments in the mesh editing community are motivated by the desire to perform mod...

Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments

by Catalin Ionescu, Dragos Papava, Vlad Olaru, Cristian Sminchisescu - IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2014
Abstract—We introduce a new dataset, Human3.6M, of 3.6 million accurate 3D human poses, acquired by recording the performance of 5 female and 6 male subjects, under 4 different viewpoints, for training realistic human sensing systems and for evaluating the next generation of human pose estimation models and algorithms. Besides increasing the size of the datasets in the current state of the art by several orders of magnitude, we also aim to complement such datasets with a diverse set of motions and poses encountered as part of typical human activities (taking photos, talking on the phone, posing, greeting, eating, etc.), with additional synchronized image, human motion capture and time of flight (depth) data, and with accurate 3D body scans of all the subject actors involved. We also provide controlled mixed reality evaluation scenarios where 3D human models are animated using motion capture and inserted using correct 3D geometry, in complex real environments, viewed with moving cameras, and under occlusion. Finally, we provide a set of large scale statistical models and detailed evaluation baselines for the dataset illustrating its diversity and the scope for improvement by future work in the research community. Our experiments show that our best large scale model can leverage our full training set to obtain a 20% improvement in performance compared to a training set of the scale of the largest existing public dataset for this problem. Yet the potential for improvement by leveraging higher capacity, more complex models with our large dataset, is substantially vaster and should stimulate future research. The dataset together with code for the associated large-scale learning models, features, visualization tools, as well as the evaluation server, is available online at

Citation Context

...utomatic discriminative prediction [10], [11], [12], [13], [14], whereas others aim at model-image alignment [15], [16], [17], [18], [19], [20], [21], [22], [23] or accurate modeling of 3D shape [24], [25] or clothing [26]. This process is ongoing and was made possible by the availability of 3D human motion capture [8], [4], as well as human body scan datasets like the commercially available CAESAR, or smal...
