Results 1 - 10
of
247
Kernel-Based Object Tracking
, 2003
"... A new approach toward target representation and localization, the central component in visual tracking of non-rigid objects, is proposed. The feature histogram based target representations are regularized by spatial masking with an isotropic kernel. The masking induces spatially-smooth similarity fu ..."
Abstract
-
Cited by 900 (4 self)
- Add to MetaCart
(Show Context)
A new approach toward target representation and localization, the central component in visual tracking of non-rigid objects, is proposed. The feature histogram based target representations are regularized by spatial masking with an isotropic kernel. The masking induces spatially-smooth similarity functions suitable for gradient-based optimization, hence, the target localization problem can be formulated using the basin of attraction of the local maxima. We employ a metric derived from the Bhattacharyya coefficient as similarity measure, and use the mean shift procedure to perform the optimization. In the presented tracking examples the new method successfully coped with camera motion, partial occlusions, clutter, and target scale variations. Integration with motion filters and data association techniques is also discussed. We describe only few of the potential applications: exploitation of background information, Kalman tracking using motion models, and face tracking. Keywords: non-rigid object tracking; target localization and representation; spatially-smooth similarity function; Bhattacharyya coefficient; face tracking. 1
Learning to detect natural image boundaries using local brightness, color, and texture cues
- PAMI
, 2004
"... The goal of this work is to accurately detect and localize boundaries in natural scenes using local image measurements. We formulate features that respond to characteristic changes in brightness, color, and texture associated with natural boundaries. In order to combine the information from these fe ..."
Abstract
-
Cited by 625 (18 self)
- Add to MetaCart
The goal of this work is to accurately detect and localize boundaries in natural scenes using local image measurements. We formulate features that respond to characteristic changes in brightness, color, and texture associated with natural boundaries. In order to combine the information from these features in an optimal way, we train a classifier using human labeled images as ground truth. The output of this classifier provides the posterior probability of a boundary at each image location and orientation. We present precision-recall curves showing that the resulting detector significantly outperforms existing approaches. Our two main results are 1) that cue combination can be performed adequately with a simple linear model and 2) that a proper, explicit treatment of texture is required to detect boundaries in natural images.
Shape Distributions
- ACM Transactions on Graphics
, 2002
"... this paper, we propose and analyze a method for computing shape signatures for arbitrary (possibly degenerate) 3D polygonal models. The key idea is to represent the signature of an object as a shape distribution sampled from a shape function measuring global geometric properties of an object. The pr ..."
Abstract
-
Cited by 295 (2 self)
- Add to MetaCart
(Show Context)
this paper, we propose and analyze a method for computing shape signatures for arbitrary (possibly degenerate) 3D polygonal models. The key idea is to represent the signature of an object as a shape distribution sampled from a shape function measuring global geometric properties of an object. The primary motivation for this approach is to reduce the shape matching problem to the comparison of probability distributions, which is simpler than traditional shape matching methods that require pose registration, feature correspondence, or model fitting
Efficient Additive Kernels via Explicit Feature Maps
"... Maji and Berg [13] have recently introduced an explicit feature map approximating the intersection kernel. This enables efficient learning methods for linear kernels to be applied to the non-linear intersection kernel, expanding the applicability of this model to much larger problems. In this paper ..."
Abstract
-
Cited by 245 (9 self)
- Add to MetaCart
(Show Context)
Maji and Berg [13] have recently introduced an explicit feature map approximating the intersection kernel. This enables efficient learning methods for linear kernels to be applied to the non-linear intersection kernel, expanding the applicability of this model to much larger problems. In this paper we generalize this idea, and analyse a large family of additive kernels, called homogeneous, in a unified framework. The family includes the intersection, Hellinger’s, and χ2 kernels commonly employed in computer vision. Using the framework we are able to: (i) provide explicit feature maps for all homogeneous additive kernels along with closed form expression for all common kernels; (ii) derive corresponding approximate finitedimensional feature maps based on the Fourier sampling theorem; and (iii) quantify the extent of the approximation. We demonstrate that the approximations have indistinguishable performance from the full kernel on a number of standard datasets, yet greatly reduce the train/test times of SVM implementations. We show that the χ2 kernel, which has been found to yield the best performance in most applications, also has the most compact feature representation. Given these train/test advantages we are able to obtain a significant performance improvement over current state of the art results based on the intersection kernel. 1.
A sparse texture representation using local affine regions
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2005
"... This article introduces a texture representation suitable for recognizing images of textured surfaces under a wide range of transformations, including viewpoint changes and non-rigid deformations. At the feature extraction stage, a sparse set of affine Harris and Laplacian regions is found in the im ..."
Abstract
-
Cited by 210 (15 self)
- Add to MetaCart
(Show Context)
This article introduces a texture representation suitable for recognizing images of textured surfaces under a wide range of transformations, including viewpoint changes and non-rigid deformations. At the feature extraction stage, a sparse set of affine Harris and Laplacian regions is found in the image. Each of these regions can be thought of as a texture element having a characteristic elliptic shape and a distinctive appearance pattern. This pattern is captured in an affine-invariant fashion via a process of shape normalization followed by the computation of two novel descriptors, the spin image and the RIFT descriptor. When affine invariance is not required, the original elliptical shape serves as an additional discriminative feature for texture recognition. The proposed approach is evaluated in retrieval and classi-fication tasks using the entire Brodatz database and a publicly available collection of 1000 photographs of textured surfaces taken from different viewpoints.
Local Gabor binary pattern histogram sequence (LGBPHS): A novel non-statistical model for face representation and recognition
- in Proc. IEEE Int. Conf. Computer Vision
"... For years, researchers in face recognition area have been representing and recognizing faces based on subspace discriminant analysis or statistical learning. Nevertheless, these approaches are always suffering from the generalizability problem. This paper proposes a novel non-statistics based face r ..."
Abstract
-
Cited by 150 (11 self)
- Add to MetaCart
(Show Context)
For years, researchers in face recognition area have been representing and recognizing faces based on subspace discriminant analysis or statistical learning. Nevertheless, these approaches are always suffering from the generalizability problem. This paper proposes a novel non-statistics based face representation approach, Local Gabor Binary Pattern Histogram Sequence (LGBPHS), in which training procedure is unnecessary to construct the face model, so that the generalizability problem is naturally avoided. In this approach, a face image is modeled as a “histogram sequence ” by concatenating the histograms of all the local regions of all the local Gabor magnitude binary pattern maps. For recognition, histogram intersection is used to measure the similarity of different LGBPHSes and the nearest neighborhood is exploited for final classification. Additionally, we have further proposed to assign different weights for each histogram piece when measuring two LGBPHSes. Our experimental results on AR and FERET face database show the validity of the proposed approach especially for partially occluded face images, and more impressively, we have achieved the best result on FERET face database. 1.
Classification with Non-Metric Distances: Image Retrieval and Class Representation
, 2000
"... One of the key problems in appearance-based vision is understanding how to use a set of labeled images to classify new images. Classification systems that can model human performance, or that use robust image matching methods, often make use of similarity judgments that are non-metric; but when the ..."
Abstract
-
Cited by 105 (1 self)
- Add to MetaCart
(Show Context)
One of the key problems in appearance-based vision is understanding how to use a set of labeled images to classify new images. Classification systems that can model human performance, or that use robust image matching methods, often make use of similarity judgments that are non-metric; but when the triangle inequality is not obeyed, most existing pattern recognition techniques are not applicable. We note that exemplar-based (or nearest-neighbor) methods can be applied naturally when using a wide class of non-metric similarity functions. The key issue, however, is to find methods for choosing good representatives of a class that accurately characterize it. We show that existing condensing techniques for finding class representatives are ill-suited to deal with non-metric dataspaces. We then focus on developing techniques for solving this problem, emphasizing two points: First, we show that the distance between two images is not a good measure of how well one image can represent ...
Feature-based similarity search in 3D object databases
- ACM Computing Surveys
, 2005
"... The development of effective content-based multimedia search systems is an important research issue due to the growing amount of digital audio-visual information. In the case of images and video, the growth of digital data has been observed since the introduction of 2D capture devices. A similar dev ..."
Abstract
-
Cited by 84 (10 self)
- Add to MetaCart
The development of effective content-based multimedia search systems is an important research issue due to the growing amount of digital audio-visual information. In the case of images and video, the growth of digital data has been observed since the introduction of 2D capture devices. A similar development is expected for 3D data as
Learning Similarity Measure for Natural Image Retrieval With Relevance Feedback
- IEEE TRANSACTIONS ON NEURAL NETWORKS
, 2002
"... A new scheme of learning similarity measure is proposed for content-based image retrieval (CBIR). It learns a boundary that separates the images in the database into two clusters. Images inside the boundary are ranked by their Euclidean distances to the query. The scheme is called constrained simila ..."
Abstract
-
Cited by 83 (5 self)
- Add to MetaCart
A new scheme of learning similarity measure is proposed for content-based image retrieval (CBIR). It learns a boundary that separates the images in the database into two clusters. Images inside the boundary are ranked by their Euclidean distances to the query. The scheme is called constrained similarity measure (CSM), which not only takes into consideration the perceptual similarity between images, but also significantly improves the retrieval performance of the Euclidean distance measure. Two techniques, support vector machine (SVM) and AdaBoost from machine learning, are utilized to learn the boundary. They are compared to see their differences in boundary learning. The positive and negative examples used to learn the boundary are provided by the user with relevance feedback. The CSM metric is evaluated in a large database of 10 009 natural images with an accurate ground truth. Experimental results demonstrate the usefulness and effectiveness of the proposed similarity measure for image retrieval.
Automated image annotation using global features and robust nonparametric density estimation
- In International ACM Conference on Image and Video Retrieval (CIVR
, 2005
"... Abstract. This paper describes a simple framework for automatically annotating images using non-parametric models of distributions of image features. We show that under this framework quite simple image properties such as global colour and texture distributions provide a strong basis for reliably an ..."
Abstract
-
Cited by 75 (21 self)
- Add to MetaCart
(Show Context)
Abstract. This paper describes a simple framework for automatically annotating images using non-parametric models of distributions of image features. We show that under this framework quite simple image properties such as global colour and texture distributions provide a strong basis for reliably annotating images. We report results on subsets of two photographic libraries, the Corel Photo Archive and the Getty Image Archive. We also show how the popular Earth Mover’s Distance measure can be effectively incorporated within this framework. 1