Recognizing Realistic Actions from Videos “in the Wild”
Cited by 220 (13 self)
In this paper, we present a systematic framework for recognizing realistic actions from videos “in the wild.” Such unconstrained videos are abundant in personal collections as well as on the web. Recognizing actions from such videos has not been addressed extensively, primarily due to the tremendous variations that result from camera motion, background clutter, and changes in object appearance and scale. The main challenge is how to extract reliable and informative features from the unconstrained videos. We extract both motion and static features from the videos. Since the raw features of both types are dense yet noisy, we propose strategies to prune these features. We use motion statistics to acquire stable motion features and clean static features. Furthermore, PageRank is used to mine the most informative static features. In order to further construct compact yet discriminative visual vocabularies, a divisive information-theoretic algorithm is employed to group semantically related features. Finally, AdaBoost is chosen to integrate all the heterogeneous yet complementary features for recognition. We have tested the framework on the KTH dataset and our own dataset consisting of 11 categories of actions collected from YouTube and personal videos, and have obtained impressive results for action recognition and action localization.
Feature Correspondence via Graph Matching: Models and Global Optimization
Cited by 121 (1 self)
In this paper we present a new approach for establishing correspondences between sparse image features related by an unknown non-rigid mapping and corrupted by clutter and occlusion, such as points extracted from a pair of images containing a human figure in distinct poses. We formulate this matching task as an energy minimization problem by defining a complex objective function of the appearance and the spatial arrangement of the features. Optimization of this energy is an instance of graph matching, which is in general an NP-hard problem. We describe a novel graph matching optimization technique, which we refer to as dual decomposition (DD), and demonstrate on a variety of examples that this method outperforms existing graph matching algorithms. In the majority of our examples DD is able to find the global minimum within a minute. The ability to globally optimize the objective allows us to accurately learn the parameters of our matching model from training examples. We show on several matching tasks that our learned model yields results superior to those of state-of-the-art methods.
Balanced graph matching
In NIPS, 2006
Cited by 90 (4 self)
Graph matching is a fundamental problem in Computer Vision and Machine Learning. We present two contributions. First, we give a new spectral relaxation technique for approximate solutions to matching problems that naturally incorporates one-to-one or one-to-many constraints within the relaxation scheme. The second is a normalization procedure for existing graph matching scoring functions that can dramatically improve the matching accuracy. It is based on a reinterpretation of the graph matching compatibility matrix as a bipartite graph on edges for which we seek a bistochastic normalization. We evaluate our two contributions on a comprehensive test set of random graph matching problems, as well as on an image correspondence problem. Our normalization procedure can be used to improve the performance of many existing graph matching algorithms, including spectral matching, graduated assignment and semidefinite programming.
A tensor-based algorithm for high-order graph matching
In CVPR, 2009
Cited by 83 (3 self)
This paper addresses the problem of establishing correspondences between two sets of visual features using higher-order constraints instead of the unary or pairwise ones used in classical methods. Concretely, the corresponding hypergraph matching problem is formulated as the maximization of a multilinear objective function over all permutations of the features. This function is defined by a tensor representing the affinity between feature tuples. It is maximized using a generalization of spectral techniques where a relaxed problem is first solved by a multi-dimensional power method, and the solution is then projected onto the closest assignment matrix. The proposed approach has been implemented, and it is compared to state-of-the-art algorithms on both synthetic and real data.
Learning Graph Matching
Cited by 81 (9 self)
As a fundamental problem in pattern recognition, graph matching has found a variety of applications in the field of computer vision. In graph matching, patterns are modeled as graphs and pattern recognition amounts to finding a correspondence between the nodes of different graphs. There are many ways in which the problem has been formulated, but most can be cast in general as a quadratic assignment problem, where a linear term in the objective function encodes node compatibility functions and a quadratic term encodes edge compatibility functions. The main research focus in this theme is on designing efficient algorithms for approximately solving the quadratic assignment problem, since it is NP-hard. In this paper, we turn our attention to the complementary problem: how to estimate compatibility functions such that the solution of the resulting graph matching problem best matches the expected solution that a human would manually provide. We present a method for learning graph matching: the training examples are pairs of graphs and the “labels” are matchings between pairs of graphs. We present experimental results with real image data which give evidence that learning can improve the performance of standard graph matching algorithms. In particular, it turns out that linear assignment with such a learning scheme may improve over state-of-the-art quadratic assignment relaxations. This finding suggests that for a range of problems where quadratic assignment was thought to be essential for securing good results, linear assignment, which is far more efficient, could be sufficient if learning is performed. This enables speed-ups of graph matching by up to 4 orders of magnitude while retaining state-of-the-art accuracy.
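As a rough illustration of the quadratic assignment form mentioned in the abstract above, the objective could be written as follows. The symbols are assumptions introduced here for illustration and are not taken from the paper: y_{ii'} is a binary indicator that node i of the first graph is matched to node i' of the second, c_{ii'} is a node compatibility score, and d_{ii'jj'} is an edge compatibility score.

```latex
\begin{aligned}
\max_{y} \quad & \sum_{i,i'} c_{ii'}\, y_{ii'}
  \;+\; \sum_{i,i',j,j'} d_{ii'jj'}\, y_{ii'}\, y_{jj'} \\
\text{s.t.} \quad & \sum_{i} y_{ii'} \le 1 \ \ \forall i', \qquad
  \sum_{i'} y_{ii'} \le 1 \ \ \forall i, \qquad
  y_{ii'} \in \{0, 1\}.
\end{aligned}
```

Dropping the quadratic term (setting every d_{ii'jj'} to zero) leaves a linear assignment problem, which is the cheaper special case the abstract compares against.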
A Survey on Shape Correspondence
2010
Cited by 78 (10 self)
We present a review of the correspondence problem and its solution methods, targeting the computer graphics audience. With this goal in mind, we focus on the correspondence of geometric shapes represented by point sets, contours or triangle meshes. This survey is motivated by recent developments in the field such as those requiring the correspondence of non-rigid or time-varying surfaces and a recent trend towards semantic shape analysis, of which shape correspondence is one of the central tasks. Establishing a meaningful shape correspondence is a difficult problem since it typically relies on an understanding of the structure of the shapes in question at both a local and global level, and sometimes also the shapes’ functionality. However, despite its inherent complexity, shape correspondence is a recurrent problem and an essential component in numerous geometry processing applications. In this report, we discuss the different forms of the correspondence problem and review the main solution methods, aided by several classification criteria which can be used by the reader to objectively compare the methods. We finalize the report by discussing open problems and future perspectives.
Beyond local appearance: Category recognition from pairwise interactions of simple features
In CVPR, 2007
Cited by 77 (9 self)
We present a discriminative shape-based algorithm for object category localization and recognition. Our method learns object models in a weakly-supervised fashion, without requiring the specification of object locations or pixel masks in the training data. We represent object models as cliques of fully-interconnected parts, exploiting only the pairwise geometric relationships between them. The use of pairwise relationships enables our algorithm to successfully overcome several problems that are common to previously-published methods. Even though our algorithm can easily incorporate local appearance information from richer features, we purposefully do not use them in order to demonstrate that simple geometric relationships can match (or exceed) the performance of state-of-the-art object recognition algorithms.
Probabilistic Graph and Hypergraph Matching
Cited by 67 (0 self)
We consider the problem of finding a matching between two sets of features, given complex relations among them, going beyond pairwise. Each feature set is modeled by a hypergraph where the complex relations are represented by hyper-edges. A match between the feature sets is then modeled as a hypergraph matching problem. We derive the hypergraph matching problem in a probabilistic setting represented by a convex optimization. First, we formalize a soft matching criterion that emerges from a probabilistic interpretation of the problem input and output, as opposed to previous methods that treat soft matching as a mere relaxation of the hard matching problem. Second, the model induces an algebraic relation between the hyper-edge weight matrix and the desired vertex-to-vertex probabilistic matching. Third, the model explains some of the graph matching normalization proposed in the past on a heuristic basis, such as doubly stochastic normalizations of the edge weights. A key benefit of the model is that the global optimum of the matching criteria can be found via an iterative successive projection algorithm. The algorithm reduces to the well-known Sinkhorn [15] row/column matrix normalization procedure in the special case when the two graphs have the same number of vertices and a complete matching is desired. Another benefit of our model is the straightforward scalability from graphs to hypergraphs.
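The Sinkhorn row/column normalization named in the abstract above can be sketched in a few lines. This is a minimal illustration of the classical procedure for a strictly positive square matrix, not the paper's successive projection algorithm; the function name `sinkhorn_normalize`, its parameters, and the example score matrix are all hypothetical.

```python
def sinkhorn_normalize(matrix, iterations=200, tol=1e-9):
    """Alternately rescale rows and columns of a strictly positive
    square matrix until it is (approximately) doubly stochastic."""
    m = [[float(v) for v in row] for row in matrix]
    n = len(m)
    for _ in range(iterations):
        # Row step: make every row sum to 1.
        for i in range(n):
            row_sum = sum(m[i])
            m[i] = [v / row_sum for v in m[i]]
        # Column step: make every column sum to 1.
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        for i in range(n):
            for j in range(n):
                m[i][j] /= col_sums[j]
        # Columns are now exact; stop once the rows are also close to 1.
        if max(abs(sum(row) - 1.0) for row in m) < tol:
            break
    return m


# Hypothetical pairwise match scores between 3 features in each set.
scores = [
    [4.0, 1.0, 1.0],
    [1.0, 3.0, 1.0],
    [1.0, 1.0, 5.0],
]
soft_matching = sinkhorn_normalize(scores)
```

Each entry of `soft_matching` can then be read as a soft assignment weight between a feature in one set and a feature in the other, with every row and every column summing to approximately 1.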
Discovering texture regularity as a higher-order correspondence problem
In European Conference on Computer Vision (ECCV)
Cited by 67 (7 self)
Understanding texture regularity in real images is a challenging computer vision task. We propose a higher-order feature matching algorithm to discover the lattices of near-regular textures in real images. The underlying lattice of a near-regular texture identifies all of the texels as well as the global topology among the texels. A key contribution of this paper is to formulate lattice-finding as a correspondence problem. The algorithm finds a plausible lattice by iteratively proposing texels and assigning neighbors between the texels. Our matching algorithm seeks assignments that maximize both pair-wise visual similarity and higher-order geometric consistency. We approximate the optimal assignment using a recently developed spectral method. We successfully discover the lattices of a diverse set of unsegmented, real-world textures with significant geometric warping and large appearance variation among texels.
Discovering discriminative action parts from mid-level video representations
In IEEE CVPR
Cited by 60 (7 self)
We describe a mid-level approach for action recognition. From an input video, we extract salient spatio-temporal structures by forming clusters of trajectories that serve as candidates for the parts of an action. The assembly of these clusters into an action class is governed by a graphical model that incorporates appearance and motion constraints for the individual parts and pairwise constraints for the spatio-temporal dependencies among them. During training, we estimate the model parameters discriminatively. During classification, we efficiently match the model to a video using discrete optimization. We validate the model’s classification ability in standard benchmark datasets and illustrate its potential to support a fine-grained analysis that not only gives a label to a video, but also identifies and localizes its constituent parts.