Results 1 - 10
of
139
Image analogies
, 2001
"... Figure 1 An image analogy. Our problem is to compute a new “analogous ” image B ′ that relates to B in “the same way ” as A ′ relates to A. Here, A, A ′ , and B are inputs to our algorithm, and B ′ is the output. The full-size images are shown in Figures 10 and 11. This paper describes a new framewo ..."
Abstract
-
Cited by 282 (8 self)
- Add to MetaCart
Figure 1 An image analogy. Our problem is to compute a new “analogous ” image B ′ that relates to B in “the same way ” as A ′ relates to A. Here, A, A ′ , and B are inputs to our algorithm, and B ′ is the output. The full-size images are shown in Figures 10 and 11. This paper describes a new framework for processing images by example, called “image analogies. ” The framework involves two stages: a design phase, in which a pair of images, with one image purported to be a “filtered ” version of the other, is presented as “training data”; and an application phase, in which the learned filter is applied to some new target image in order to create an “analogous” filtered result. Image analogies are based on a simple multiscale autoregression, inspired primarily by recent results in texture synthesis. By choosing different types of source image pairs as input, the framework supports a wide variety of “image filter ” effects, including traditional image filters, such as blurring or embossing; improved texture synthesis, in which some textures are synthesized with higher quality than by previous approaches; super-resolution, in which a higher-resolution image is inferred from a low-resolution source; texture transfer, in which images are “texturized ” with some arbitrary source texture; artistic filters, in which various drawing and painting styles are synthesized based on scanned real-world examples; and texture-by-numbers, in which realistic scenes, composed of a variety of textures, are created using a simple painting interface.
Interactive Control of Avatars Animated with Human Motion Data
, 2002
"... Real-time control of three-dimensional avatars is an important problem in the context of computer games and virtual environments. Avatar animation and control is difficult, however, because a large repertoire of avatar behaviors must be made available, and the user must be able to select from this s ..."
Abstract
-
Cited by 215 (26 self)
- Add to MetaCart
Real-time control of three-dimensional avatars is an important problem in the context of computer games and virtual environments. Avatar animation and control is difficult, however, because a large repertoire of avatar behaviors must be made available, and the user must be able to select from this set of behaviors, possibly with a low-dimensional input device. One appealing approach to obtaining a rich set of avatar behaviors is to collect an extended, unlabeled sequence of motion data appropriate to the application. In this paper, we show that such a motion database can be preprocessed for flexibility in behavior and efficient search and exploited for real-time avatar control. Flexibility is created by identifying plausible transitions between motion segments, and efficient search through the resulting graph structure is obtained through clustering. Three interface techniques are demonstrated for controlling avatar motion using this data structure: the user selects from a set of available choices, sketches a path through an environment, or acts out a desired motion in front of a video camera. We demonstrate the flexibility of the approach through four different applications and compare the avatar motion to directly recorded human motion.
Style-Based Inverse Kinematics
, 2004
"... This paper presents an inverse kinematics system based on a learned model of human poses. Given a set of constraints, our system can produce the most likely pose satisfying those constraints, in realtime. Training the model on different input data leads to different styles of IK. The model is repres ..."
Abstract
-
Cited by 132 (8 self)
- Add to MetaCart
This paper presents an inverse kinematics system based on a learned model of human poses. Given a set of constraints, our system can produce the most likely pose satisfying those constraints, in realtime. Training the model on different input data leads to different styles of IK. The model is represented as a probability distribution over the space of all possible poses. This means that our IK system can generate any pose, but prefers poses that are most similar to the space of poses in the training data. We represent the probability with a novel model called a Scaled Gaussian Process Latent Variable Model. The parameters of the model are all learned automatically; no manual tuning is required for the learning component of the system. We additionally describe a novel procedure for interpolating between styles. Our style-based
Implicit Probabilistic Models of Human Motion for Synthesis and Tracking Hedvig Sidenblen
- In European Conference on Computer Vision
, 2002
"... This paper addresses the problem of probabilistically modeling 3D human motion for synthesis and tracking. Given the high dimensional nature of human motion, learning an explicit probabilistic model from available training data is currently impractical. Instead we exploit methods from texture synthe ..."
Abstract
-
Cited by 131 (3 self)
- Add to MetaCart
This paper addresses the problem of probabilistically modeling 3D human motion for synthesis and tracking. Given the high dimensional nature of human motion, learning an explicit probabilistic model from available training data is currently impractical. Instead we exploit methods from texture synthesis that treat images as representing an implicit empirical distribution . These methods replace the problem of representing the probability of a texture pattern with that of searching the training data for similar instances of that pattern. We extend this idea to temporal data representing 3D human motion with a large database of example motions. To make the method useful in practice, we must address the problem of efficient search in a large training set
Probabilistic Tracking in a Metric Space
- in ICCV
, 2001
"... A new, exemplar-based, probabilistic paradigm for visual tracking is presented. Probabilistic mechanisms are attractive because they handle fusion of information, especially temporal fusion, in a principled manner. Exemplars are selected representatives of raw training data, used here to represent p ..."
Abstract
-
Cited by 111 (2 self)
- Add to MetaCart
A new, exemplar-based, probabilistic paradigm for visual tracking is presented. Probabilistic mechanisms are attractive because they handle fusion of information, especially temporal fusion, in a principled manner. Exemplars are selected representatives of raw training data, used here to represent probabilistic mixture distributions of object configurations. Their use avoids tedious hand-construction of object models and problems with changes of topology. Using exemplars in place of a parameterized model poses several challenges, addressed here with what we call the "Metric Mixture" (M # ) approach. The M # model has several valuable properties. Principally, it provides alternatives to standard learning algorithms by allowing the use of metrics that are not embedded in a vector space. Secondly, it uses a noise model that is learned from training data. Lastly, it eliminates any need for an assumption of probabilistic pixelwise independence. Experiments demonstrate the effectiveness of the M # model in two domains: tracking walking people using chamfer distances on binary edge images and tracking mouth movements by means of a shuffle distance. 1
Trainable Videorealistic Speech Animation
- PROCEEDINGS OF SIGGRAPH 2002, SAN ANTONIO TEXAS
, 2002
"... We describe how to create with machine learning techniques a generative, videorealistic, speech animation module. A human subject is first recorded using a videocamera as he/she utters a predetermined speech corpus. After processing the corpus automatically, a visual speech module is learned from th ..."
Abstract
-
Cited by 110 (5 self)
- Add to MetaCart
We describe how to create with machine learning techniques a generative, videorealistic, speech animation module. A human subject is first recorded using a videocamera as he/she utters a predetermined speech corpus. After processing the corpus automatically, a visual speech module is learned from the data that is capable of synthesizing the human subject's mouth uttering entirely novel utterances that were not recorded in the original video. The synthesized utterance is re-composited onto a background sequence which contains natural head and eye movement. The final output is videorealistic in the sense that it looks like a video camera recording of the subject. At run time, the input to the system can be either real audio sequences or synthetic audio produced by a text-to-speech system, as long as they have been phonetically aligned. The two key
Inferring 3d body pose from silhouettes using activity manifold learning
- In CVPR
, 2004
"... We aim to infer 3D body pose directly from human silhouettes. Given a visual input (silhouette), the objective is to recover the intrinsic body configuration, recover the view point, reconstruct the input and detect any spatial or temporal outliers. In order to recover intrinsic body configuration ( ..."
Abstract
-
Cited by 108 (11 self)
- Add to MetaCart
We aim to infer 3D body pose directly from human silhouettes. Given a visual input (silhouette), the objective is to recover the intrinsic body configuration, recover the view point, reconstruct the input and detect any spatial or temporal outliers. In order to recover intrinsic body configuration (pose) from the visual input (silhouette), we explicitly learn view-based representations of activity manifolds as well as learn mapping functions between such central representations and both the visual input space and the 3D body pose space. The body pose can be recovered in a closed form in two steps by projecting the visual input to the learned representations of the activity manifold, i.e., finding the point on the learned manifold representation corresponding to the visual input, followed by interpolating 3D pose. 1.
Motion Capture Assisted Animation: Texturing and Synthesis
, 2002
"... We discuss a method for creating animations that allows the animator to sketch an animation by setting a small number of keyframes on a fraction of the possible degrees of freedom. Motion capture data is then used to enhance the animation. Detail is added to degrees of freedom that were keyframed, a ..."
Abstract
-
Cited by 105 (3 self)
- Add to MetaCart
We discuss a method for creating animations that allows the animator to sketch an animation by setting a small number of keyframes on a fraction of the possible degrees of freedom. Motion capture data is then used to enhance the animation. Detail is added to degrees of freedom that were keyframed, a process we call texturing. Degrees of freedom that were not keyframed are synthesized. The method takes advantage of the fact that joint motions of an articulated figure are often correlated, so that given an incomplete data set, the missing degrees of freedom can be predicted from those that are present.
Spacetime faces: High resolution capture for modeling and animation
- IN ACM TRANSACTIONS ON GRAPHICS (PROC. OF ACM SIGGRAPH)
, 2004
"... We present an end-to-end system that goes from video sequences to high resolution, editable, dynamically controllable face models. The capture system employs synchronized video cameras and structured light projectors to record videos of a moving face from multiple viewpoints. A novel spacetime stere ..."
Abstract
-
Cited by 95 (7 self)
- Add to MetaCart
We present an end-to-end system that goes from video sequences to high resolution, editable, dynamically controllable face models. The capture system employs synchronized video cameras and structured light projectors to record videos of a moving face from multiple viewpoints. A novel spacetime stereo algorithm is introduced to compute depth maps accurately and overcome over-fitting deficiencies in prior work. A new template fitting and tracking procedure fills in missing data and yields point correspondence across the entire sequence without using markers. We demonstrate a datadriven, interactive method for inverse kinematics that draws on the large set of fitted templates and allows for posing new expressions by dragging surface points directly. Finally, we describe new tools that model the dynamics in the input sequence to enable new animations, created via key-framing or texture-synthesis techniques.
Inferring Body Pose without Tracking Body Parts
- IN CVPR
, 1999
"... A novel approach for estimating articulated body posture and motion from monocular video sequences is proposed. Human pose is defined as the instantaneous two dimensional configuration (i.e.,the projection onto the image plane) of a single articulated body in terms of the position of a predetermined ..."
Abstract
-
Cited by 87 (3 self)
- Add to MetaCart
A novel approach for estimating articulated body posture and motion from monocular video sequences is proposed. Human pose is defined as the instantaneous two dimensional configuration (i.e.,the projection onto the image plane) of a single articulated body in terms of the position of a predetermined set of joints. First, statistical segmentation of the human bodies from the background is performed and low-level visual features are found given the segmented body shape. The goal is to be able to map these, generally low level, visual features to body configurations. The system estimates different mappings, each one with a specific cluster in the visual feature space. Given a set of body motion sequences for training, unsupervised clustering is obtained via the Expectation Maximization algorithm. For each of the clusters, a function is estimated to build the mapping between low-level features to 2D pose. Given new visual features, a mapping from each cluster is performed to yield a set of possible poses. From this set, the system selects the most likely pose given the learned probability distribution and the visual feature similarity between hypothesis and input. Performance of the proposed approach is characterized using real and artificially generated body postures, showing promising results.

