Visual tracking via incremental log-euclidean riemannian subspace learning
- In Proceedings IEEE Conference Computer Vision and Pattern Recognition, 2008
Cited by 31 (3 self)
Recently, a novel Log-Euclidean Riemannian metric [28] was proposed for statistics on symmetric positive definite (SPD) matrices. Under this metric, distances and Riemannian means take a much simpler form than under the widely used affine-invariant Riemannian metric. Based on the Log-Euclidean Riemannian metric, we develop a tracking framework in this paper. In the framework, the covariance matrices of image features in the five modes are used to represent object appearance. Since a nonsingular covariance matrix is an SPD matrix lying on a connected Riemannian manifold, the Log-Euclidean Riemannian metric is used for statistics on the covariance matrices of image features. Further, we present an effective online Log-Euclidean Riemannian subspace learning algorithm which models the appearance changes of an object by incrementally learning a low-order Log-Euclidean eigenspace representation through adaptively updating the sample mean and eigenbasis. Tracking is then led by the Bayesian state inference framework in which a particle filter is used for propagating sample distributions over time. Theoretical analysis and experimental evaluations demonstrate the promise and effectiveness of the proposed framework.
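To make the "simpler form" of distances and means concrete, here is a minimal sketch of the standard Log-Euclidean definitions: the distance is the Frobenius norm of the difference of matrix logarithms, and the mean is the matrix exponential of the averaged logarithms. It is restricted to 2x2 SPD matrices so it can stay in pure Python; the function names (`sym_fun`, `log_euclidean_distance`, `log_euclidean_mean`) are illustrative assumptions and are not taken from the paper, and the incremental subspace learning step is not shown.

```python
import math

def sym_fun(S, f):
    """Apply a scalar function f to the eigenvalues of a symmetric 2x2 matrix S."""
    (a, b), (_, d) = S
    half_trace = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    if radius < 1e-12:
        # S is (numerically) a multiple of the identity.
        return [[f(half_trace), 0.0], [0.0, f(half_trace)]]
    l1, l2 = half_trace + radius, half_trace - radius
    # For distinct eigenvalues, f(S) = c0 * I + c1 * S interpolates f at l1 and l2.
    c1 = (f(l1) - f(l2)) / (l1 - l2)
    c0 = f(l1) - c1 * l1
    return [[c0 + c1 * a, c1 * b], [c1 * b, c0 + c1 * d]]

def log_euclidean_distance(A, B):
    """Frobenius norm of log(A) - log(B) for 2x2 SPD matrices A and B."""
    LA, LB = sym_fun(A, math.log), sym_fun(B, math.log)
    return math.sqrt(sum((LA[i][j] - LB[i][j]) ** 2
                         for i in range(2) for j in range(2)))

def log_euclidean_mean(mats):
    """Log-Euclidean mean: exp of the arithmetic mean of the matrix logarithms."""
    logs = [sym_fun(M, math.log) for M in mats]
    n = len(logs)
    avg = [[sum(L[i][j] for L in logs) / n for j in range(2)] for i in range(2)]
    return sym_fun(avg, math.exp)
```

Because the mean is an ordinary average in the log domain, it has a closed form and needs no iterative optimization, unlike the mean under the affine-invariant metric.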
Intrinsic Mean Shift for Clustering on Stiefel and Grassmann Manifolds
Cited by 25 (0 self)
The mean shift algorithm, which is a nonparametric density estimator for detecting the modes of a distribution on a Euclidean space, was recently extended to operate on analytic manifolds. The extension is extrinsic in the sense that the inherent optimization is performed on the tangent spaces of these manifolds. This approach specifically requires the use of the exponential map at each iteration. This paper presents an alternative mean shift formulation, which performs the iterative optimization “on” the manifold of interest and intrinsically locates the modes via consecutive evaluations of a mapping. In particular, these evaluations constitute a modified gradient ascent scheme that avoids the computation of the exponential maps for Stiefel and Grassmann manifolds. The performance of our algorithm is evaluated by conducting extensive comparative studies on synthetic data as well as experiments on object categorization and segmentation of multiple motions.
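For orientation, the classical Euclidean mean shift that this abstract takes as its starting point can be sketched as below. This is only the one-dimensional baseline with a Gaussian kernel, not the intrinsic formulation for Stiefel and Grassmann manifolds proposed in the paper; the function name `mean_shift_mode` and its parameters are assumptions made for illustration.

```python
import math

def mean_shift_mode(points, start, bandwidth=1.0, tol=1e-8, max_iter=500):
    """Climb from `start` to a nearby mode of the kernel density estimate.

    Each iteration moves the estimate to the Gaussian-weighted mean of the
    samples; the fixed points of this update are stationary points of the
    kernel density estimate built from `points`.
    """
    x = float(start)
    for _ in range(max_iter):
        weights = [math.exp(-0.5 * ((p - x) / bandwidth) ** 2) for p in points]
        new_x = sum(w * p for w, p in zip(weights, points)) / sum(weights)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x
```

Starting the iteration from different points and grouping the starts that converge to the same mode yields a clustering without fixing the number of clusters in advance.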
A Survey of Appearance Models in Visual Object Tracking
Cited by 16 (1 self)
Visual object tracking is a significant computer vision task which can be applied to many domains such as visual surveillance, human computer interaction, and video compression. Despite extensive research on this topic, it still suffers from difficulties in handling complex object appearance changes caused by factors such as illumination variation, partial occlusion, shape deformation, and camera motion. Therefore, effective modeling of the 2D appearance of tracked objects is a key issue for the success of a visual tracker. In the literature, researchers have proposed a variety of 2D appearance models. To help readers swiftly learn the recent advances in 2D appearance models for visual object tracking, we contribute this survey, which provides a detailed review of the existing 2D appearance models. In particular, this survey takes a module-based architecture that enables readers to easily grasp the key points of visual object tracking. In this survey, we first decompose the problem of appearance modeling into two different processing stages: visual representation and statistical modeling. Then, different 2D appearance models are categorized and discussed with respect to their composition modules. Finally, we address several issues of interest as well as the remaining challenges for future research on this topic. The contributions of this survey are four-fold. First, we review the literature of visual representations according to their feature-construction mechanisms (i.e., local and global). Second, the existing statistical modeling schemes for tracking-by-detection are reviewed according to their model-construction mechanisms: generative, discriminative, and hybrid generative-discriminative. Third, each type of visual representations or statistical modeling techniques is analyzed and discussed from
Online Empirical Evaluation of Tracking Algorithms
2009
Cited by 13 (0 self)
Evaluation of tracking algorithms in the absence of ground truth is a challenging problem. There exist a variety of approaches for this problem, ranging from formal model validation techniques to heuristics that look for mismatches between track properties and the observed data. However, few of these methods scale up to the task of visual tracking where the models are usually non-linear and complex, and typically lie in a high dimensional space. Further, scenarios that cause track failures and/or poor tracking performance are also quite diverse for the visual tracking problem. In this paper, we propose an online performance evaluation strategy for tracking systems based on particle filters using a time-reversed Markov chain. The key intuition of our proposed methodology relies on the time-reversible nature of physical motion exhibited by most objects, which in turn should be possessed by a good tracker. In the presence of tracking failures due to occlusion, low SNR or modeling errors, this reversible nature of the tracker is violated. We use this property for detection of track failures. To evaluate the performance of the tracker at time instant t, we use the posterior of the tracking algorithm to initialize a time-reversed Markov chain. We compute the posterior density of track parameters at the starting time t = 0 by filtering back in time to the initial time instant. The distance between the
Tracking as segmentation of spatial-temporal volumes by anisotropic weighted TV
- In Energy Minimization Methods for Computer Vision and Pattern Recognition, 2009
Cited by 12 (4 self)
Tracking is usually interpreted as finding an object in single consecutive frames. Regularization is done by enforcing temporal smoothness of appearance, shape and motion. We propose a tracker that interprets the task of tracking as segmentation of a volume in 3D. Temporal and spatial regularization are inherently unified in a single regularization term. Segmentation is done by a variational approach using anisotropic weighted Total Variation (TV) regularization. The proposed convex energy is solved to global optimality by a fast primal-dual algorithm. Any image feature can be used in the segmentation cue of the proposed Mumford-Shah-like data term. As a proof of concept we show experiments using a simple color-based appearance model. As demonstrated in the experiments, our tracking approach is able to handle large variations in shape and size, as well as partial and complete occlusions.
Detecting Abnormal Events via Hierarchical Dirichlet Processes
Cited by 9 (3 self)
Detecting abnormal events from video sequences is an important problem in computer vision and pattern recognition, and a large number of algorithms have been devised to tackle this problem. Previous state-based approaches all suffer from the problem of deciding the appropriate number of states, and it is often difficult to do so except by trial and error, which may be infeasible in real-world applications. In this paper, we propose a more accurate and flexible algorithm for abnormal event detection from video sequences. Our three-phase approach first builds a set of weak classifiers using the Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM), and then proposes an ensemble learning algorithm to filter out abnormal events. In the final phase, we derive abnormal activity models from the normal activity model to reduce the FP (False Positive) rate in an unsupervised manner. The main advantage of our algorithm over previous ones is that it naturally captures the underlying features in abnormal event detection via HDP-HMM. Experimental results on a real-world video sequence dataset have shown the effectiveness of our algorithm.
Nonparametric density estimation on a graph: Learning framework, fast approximation and application in image segmentation
- In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011
Cited by 7 (4 self)
We present a novel framework for tree-structure embedded density estimation and its fast approximation for mode seeking. The proposed method could find diverse applications in computer vision and feature space analysis. Given any undirected, connected and weighted graph, the density function is defined as a joint representation of the feature space and the distance domain on the graph’s spanning tree. Since the distance domain of a tree is a constrained one, mode seeking cannot be directly achieved by traditional mean shift in both domains. We address this problem by introducing node shifting with force competition and its fast approximation. Our work is closely related to the previous literature of nonparametric methods. One shall see, however, that the new formulation of this problem can lead to many advantages and new characteristics in its application, as will be illustrated later in this paper.
Bag of textons for image segmentation via soft clustering and convex shift
- In Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012
Cited by 4 (0 self)
We propose an unsupervised image segmentation method based on texton similarity and mode seeking. The input image is first convolved with a filter-bank, followed by soft clustering on its filter response to generate textons. The input image is then superpixelized, where each belonging pixel is regarded as a voter and a soft voting histogram is constructed for each superpixel by averaging its voters' posterior texton probabilities. We further propose a modified mode seeking method, called convex shift, to group superpixels and generate segments. The distribution of superpixel histograms is modeled nonparametrically in the histogram space, using Kullback-Leibler divergence (K-L divergence) and kernel density estimation. We show that each kernel shift step can be formulated as a convex optimization problem with linear constraints. Experiments on image segmentation show that convex shift performs mode seeking effectively on an enforced histogram structure, grouping visually similar superpixels. With the incorporation of texton and soft voting, our method generates reasonably good segmentation results on natural images with relatively complex contents, showing significant superiority over traditional mode seeking based segmentation methods, while outperforming or being comparable to state of the art methods.
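The K-L divergence used above to compare superpixel histograms can be sketched as follows. This shows only the standard discrete definition, D(p || q) = sum_i p_i log(p_i / q_i), applied to normalized histograms; the function name `kl_divergence` and the small smoothing constant `eps` are assumptions for illustration, and the convex shift step itself is not reproduced here.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) between two histograms.

    Both inputs are smoothed by `eps` (an assumed constant that avoids
    log(0) on empty bins) and then normalized to sum to one.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    p_s = [v + eps for v in p]
    q_s = [v + eps for v in q]
    p_sum, q_sum = sum(p_s), sum(q_s)
    return sum((a / p_sum) * math.log((a / p_sum) / (b / q_sum))
               for a, b in zip(p_s, q_s))
```

The divergence is asymmetric, so D(p || q) and D(q || p) generally differ; which direction a method uses matters when it is plugged into a kernel density estimate.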
Biologically Inspired Object Tracking Using Center-Surround Saliency Mechanisms
2013
Cited by 3 (1 self)
A biologically inspired discriminant object tracker is proposed. It is argued that discriminant tracking is a consequence of top-down tuning of the saliency mechanisms that guide the deployment of visual attention. The principle of discriminant saliency is then used to derive a tracker that implements a combination of center-surround saliency, a spatial spotlight of attention, and feature-based attention. In this framework, the tracking problem is formulated as one of continuous target-background classification, implemented in two stages. The first, or learning stage, combines a focus of attention (FoA) mechanism, and bottom-up saliency to identify a maximally discriminant set of features for target detection. The second, or detection stage, uses a feature-based attention mechanism and a target-tuned top-down discriminant saliency detector to detect the target. Overall, the tracker iterates between learning discriminant features from the target location in a video frame and detecting the location of the target in the next. The statistics of natural images are exploited to derive an implementation which is conceptually simple and computationally efficient. The saliency formulation is also shown to establish a unified framework for classifier design, target detection, automatic tracker initialization, and scale adaptation. Experimental results show that the proposed discriminant saliency tracker outperforms a number of state-of-the-art trackers in the literature.
SCALE AND SHAPE ADAPTIVE MEAN SHIFT OBJECT TRACKING IN VIDEO SEQUENCES
Cited by 3 (0 self)
A new technique for object tracking based on the mean shift method is presented. Instead of using a symmetric kernel as in traditional mean shift tracking, the proposed tracking algorithm uses an asymmetric kernel which is retrieved from an object mask. During the mean shift iterations, not only is the new object position located, but the kernel scale is also altered according to the object scale, providing an initial adaptation of the object shape. The final shape of the kernel is then obtained by segmenting the area inside and around the adapted kernel and distinguishing the object segments from the non-object segments. Thus, the object shape is tracked very well even if the object is performing out-of-plane rotations.