
CiteSeerX


Learning globally-consistent local distance functions for shape-based image retrieval and classification (2007)

by A Frome, Y Singer, F Sha, J Malik

Results 1 - 10 of 149

In Defense of Nearest-Neighbor Based Image Classification

by Oren Boiman
Abstract - Cited by 266 (2 self)
State-of-the-art image classification methods require an intensive learning/training stage (using SVM, Boosting, etc.). In contrast, non-parametric Nearest-Neighbor (NN) based image classifiers require no training time and have other favorable properties. However, the large performance gap between these two families of approaches rendered NN-based image classifiers useless. We claim that the effectiveness of non-parametric NN-based image classification has been considerably undervalued. We argue that two practices commonly used in image classification methods have led to the inferior performance of NN-based image classifiers: (i) quantization of local image descriptors (used to generate “bags-of-words”, codebooks); (ii) computation of ‘Image-to-Image’ distance, instead of ‘Image-to-Class’ distance. We propose a trivial NN-based classifier, NBNN (Naive-Bayes Nearest-Neighbor), which employs NN distances in the space of the local image descriptors (and not in the space of images). NBNN computes direct ‘Image-to-Class’ distances without descriptor quantization. We further show that under the Naive-Bayes assumption, the theoretically optimal image classifier can be accurately approximated by NBNN. Although NBNN is extremely simple, efficient, and requires no learning/training phase, its performance ranks among the top leading learning-based image classifiers. Empirical comparisons are shown on several challenging databases (Caltech-101, Caltech-256 and Graz-01).
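The image-to-class distance at the heart of NBNN can be sketched in a few lines. This is a toy illustration, not the authors' code: the function name and the dictionary-of-pools input format are mine, and a real system would use fast approximate NN search over the per-class descriptor pools rather than brute force.

```python
import numpy as np

def nbnn_classify(query_descriptors, class_descriptor_pools):
    """Naive-Bayes Nearest-Neighbor sketch: assign the class that
    minimizes the summed 'image-to-class' distance, i.e. for each
    local descriptor of the query image, the squared distance to its
    nearest neighbor in the pool of ALL training descriptors of that
    class (no quantization, no image-to-image comparison).
    `class_descriptor_pools` maps class name -> (n_c, d) array."""
    best_class, best_dist = None, np.inf
    for label, pool in class_descriptor_pools.items():
        # (q, n_c) matrix of squared Euclidean distances
        d2 = ((query_descriptors[:, None, :] - pool[None, :, :]) ** 2).sum(-1)
        total = d2.min(axis=1).sum()  # NN distance per descriptor, summed
        if total < best_dist:
            best_class, best_dist = label, total
    return best_class
```
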

Citation Context

...metric Blur) [27], ‘SPM NN’ (NN-Image with Spatial Pyramids Match) [27]. (b) Multiple descriptor type methods: ‘NBNN (5 Desc)’, ‘Bosch Trees’ (with ROI Optimization) [5], ‘Bosch SVM’ [6], ‘LearnDist’ [11], ‘SKM’ [15], ‘Varma’ [27], ‘KTA’ [18]. Our multi-descriptor NBNN algorithm performs even better (72.8% on 15 labelled images). ‘GB Vote NN’ [3] uses an image-to-class NN-based voting scheme (without ...

Computer Vision: Algorithms and Applications

by Richard Szeliski , 2010
Abstract - Cited by 252 (2 self)
Abstract not found

Recognizing Indoor Scenes

by Ariadna Quattoni, Antonio Torralba
Abstract - Cited by 167 (3 self)
Indoor scene recognition is a challenging open problem in high level vision. Most scene recognition models that work well for outdoor scenes perform poorly in the indoor domain. The main difficulty is that while some indoor scenes (e.g. corridors) can be well characterized by global spatial properties, others (e.g. bookstores) are better characterized by the objects they contain. More generally, to address the indoor scene recognition problem we need a model that can exploit local and global discriminative information. In this paper we propose a prototype based model that can successfully combine both sources of information. To test our approach we created a dataset of 67 indoor scene categories (the largest available) covering a wide range of domains. The results show that our approach can significantly outperform a state-of-the-art classifier for the task.

Citation Context

...ntaining similar objects must have similar scene labels and that some objects are more important than others in defining a scene’s identity. Our work is related to work on learning distance functions [4, 6, 8] for visual recognition. Both methods learn to combine local or elementary distance functions. There are two main differences between their approach and ours. First, their method learns a weighted combin...

Is that you? Metric learning approaches for face identification

by Matthieu Guillaumin, Jakob Verbeek, Cordelia Schmid - In ICCV , 2009
Abstract - Cited by 159 (8 self)
Face identification is the problem of determining whether two face images depict the same person or not. This is difficult due to variations in scale, pose, lighting, background, expression, hairstyle, and glasses. In this paper we present two methods for learning robust distance measures: (a) a logistic discriminant approach which learns the metric from a set of labelled image pairs (LDML) and (b) a nearest neighbour approach which computes the probability for two images to belong to the same class (MkNN). We evaluate our approaches on the Labeled Faces in the Wild data set, a large and very challenging data set of faces from Yahoo! News. The evaluation protocol for this data set defines a restricted setting, where a fixed set of positive and negative image pairs is given, as well as an unrestricted one, where faces are labelled by their identity. We are the first to present results for the unrestricted setting, and show that our methods benefit from this richer training data, much more so than the current state-of-the-art method. Our results of 79.3% and 87.5% correct for the restricted and unrestricted setting respectively, significantly improve over the current state-of-the-art result of 78.5%. Confidence scores obtained for face identification can be used for many applications, e.g. clustering or recognition from a single training example. We show that our learned metrics also improve performance for these tasks.
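The LDML idea in this abstract, a logistic model on pair distances, can be sketched as follows. This is a simplified illustration, not the authors' implementation: the Mahalanobis matrix is restricted to a diagonal here for brevity, it is fit by plain gradient ascent on the pair log-likelihood, and all names are mine.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ldml_fit(pairs, labels, dim, lr=0.1, steps=2000):
    """LDML-style sketch: model p(same person) = sigmoid(b - d_M(x, y))
    with d_M a Mahalanobis distance, and fit M (kept diagonal `w` here,
    a simplification; the paper does not restrict M this way) plus the
    bias b by gradient ascent on the labelled-pair log-likelihood."""
    w = np.ones(dim)   # diagonal of M
    b = 1.0
    for _ in range(steps):
        gw, gb = np.zeros(dim), 0.0
        for (x, y), t in zip(pairs, labels):
            diff2 = (x - y) ** 2
            p = sigmoid(b - diff2 @ w)
            gw += (t - p) * (-diff2)   # d(log-likelihood)/dw
            gb += (t - p)              # d(log-likelihood)/db
        w = np.maximum(w + lr * gw / len(pairs), 0.0)  # keep weights nonnegative
        b += lr * gb / len(pairs)
    return w, b

def ldml_same_prob(w, b, x, y):
    return sigmoid(b - ((x - y) ** 2) @ w)
```

After training, the returned probability can serve directly as the confidence score the abstract mentions for clustering or single-example recognition.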

Citation Context

...For this second method we also use a learned metric, albeit one that is optimised for kNN classification [23]. Metric learning has received a lot of attention; for recent work in this area see e.g. [1, 6, 9, 10, 23, 25]. Most methods learn a Mahalanobis metric based on an objective function defined by means of a labelled training set, or from sets of positive (same class) and negative (different class) pairs. The di...

Relative attributes

by Devi Parikh, Kristen Grauman - In Proceedings of ICCV ’11 , 2011
Abstract - Cited by 151 (20 self)
Human-nameable visual “attributes” can benefit various recognition tasks. However, existing techniques restrict these properties to categorical labels (for example, a person is ‘smiling’ or not, a scene is ‘dry’ or not), and thus fail to capture more general semantic relationships. We propose to model relative attributes. Given training data stating how object/scene categories relate according to different attributes, we learn a ranking function per attribute. The learned ranking functions predict the relative strength of each property in novel images. We then build a generative model over the joint space of attribute ranking outputs, and propose a novel form of zero-shot learning in which the supervisor relates the unseen object category to previously seen objects via attributes (for example, ‘bears are furrier than giraffes’). We further show how the proposed relative attributes enable richer textual descriptions for new images, which in practice are more precise for human interpretation. We demonstrate the approach on datasets of faces and natural scenes, and show its clear advantages over traditional binary attribute prediction for these new tasks.
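A minimal sketch of learning one such ranking function from ordered pairs, in the spirit of the RankSVM-style objective the abstract describes. The subgradient-perceptron training loop and all names are my simplification, not the paper's solver.

```python
import numpy as np

def fit_attribute_ranker(X, ordered_pairs, lr=0.1, epochs=100, margin=1.0):
    """Learn one relative-attribute ranking function: a linear score
    w @ x trained so that for every supervised pair (i, j) meaning
    'image i shows the attribute more strongly than image j', the score
    of i exceeds the score of j by a margin. Violated pairs get a
    simple subgradient update."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i, j in ordered_pairs:
            if w @ X[i] - w @ X[j] < margin:   # pair violates the margin
                w += lr * (X[i] - X[j])
    return w
```

The learned w then scores novel images by relative attribute strength, which is the input to the generative zero-shot model the abstract goes on to describe.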

Citation Context

...er preferences (often captured via click-data) are incorporated to learn a ranking function with the goal of retrieving more relevant images in the top search results. Learned distance metrics (e.g., [27, 28]) can induce a ranking on images; however, this ranking is also specific to a query image, and typically intended for nearest-neighbor-based classifiers. Our work learns a ranking function on images b...

A New Baseline for Image Annotation

by Ameesh Makadia, Vladimir Pavlovic, Sanjiv Kumar
Abstract - Cited by 138 (0 self)
Abstract. Automatically assigning keywords to images is of great interest as it allows one to index, retrieve, and understand large collections of image data. Many techniques have been proposed for image annotation in the last decade that give reasonable performance on standard datasets. However, most of these works fail to compare their methods with simple baseline techniques to justify the need for complex models and subsequent training. In this work, we introduce a new baseline technique for image annotation that treats annotation as a retrieval problem. The proposed technique utilizes low-level image features and a simple combination of basic distances to find nearest neighbors of a given image. The keywords are then assigned using a greedy label transfer mechanism. The proposed baseline outperforms the current state-of-the-art methods on two standard and one large Web dataset. We believe that such a baseline measure will provide a strong platform to compare and better understand future annotation techniques.
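The two ingredients of the baseline, a simple combination of rescaled per-feature distances followed by label transfer from nearest neighbors, can be sketched like this. It is a toy illustration with invented names; the paper's actual greedy transfer scheme is more refined than the plain frequency vote used here.

```python
import numpy as np

def combined_distance(query_feats, db_feats):
    """Average of per-feature distances, each rescaled to a comparable
    range (here divided by its max over the database), in the spirit of
    the average-of-scaled-distances baseline.
    `db_feats` maps feature name -> (n_images, d) array."""
    n = len(next(iter(db_feats.values())))
    total = np.zeros(n)
    for name, q in query_feats.items():
        d = np.linalg.norm(db_feats[name] - q, axis=1)
        total += d / (d.max() + 1e-12)
    return total / len(query_feats)

def greedy_label_transfer(dist, db_keywords, k=3, n_keywords=2):
    """Transfer the most frequent keywords among the k nearest
    neighbors to the query (simplified stand-in for the greedy scheme)."""
    order = np.argsort(dist)[:k]
    counts = {}
    for i in order:
        for kw in db_keywords[i]:
            counts[kw] = counts.get(kw, 0) + 1
    return sorted(counts, key=lambda kw: -counts[kw])[:n_keywords]
```
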

Citation Context

...orate multiple distance measures, possibly defined over distinct feature spaces. Recently, combining different distances or kernels has been shown to yield good performance in object recognition tasks [13]. In this work, we explore two different ways of linearly combining different distances to create the baseline measures. The first one simply computes the average of different distances after scaling ...

Recognition using Regions

by Chunhui Gu, Joseph J. Lim, Pablo Arbeláez, Jitendra Malik
Abstract - Cited by 106 (5 self)
This paper presents a unified framework for object detection, segmentation, and classification using regions. Region features are appealing in this context because: (1) they encode shape and scale information of objects naturally; (2) they are only mildly affected by background clutter. Regions have not been popular as features due to their sensitivity to segmentation errors. In this paper, we start by producing a robust bag of overlaid regions for each image using Arbeláez et al., CVPR 2009. Each region is represented by a rich set of image cues (shape, color and texture). We then learn region weights using a max-margin framework. In detection and segmentation, we apply a generalized Hough voting scheme to generate hypotheses of object locations, scales and support, followed by a verification classifier and a constrained segmenter on each hypothesis. The proposed approach significantly outperforms the state of the art on the ETHZ shape database (87.1% average detection rate compared to Ferrari et al.’s 67.2%), and achieves competitive performance on the Caltech 101 database.

Citation Context

...y published approaches in Figure 9. Figure 9. Mean recognition rate (%) over number of training images per category in Caltech 101. With 15 and 30 training images per category, our method outperforms [14, 15, 33, 13] and [19] but [5]. 6. Conclusion In this paper, we have presented a unified framework for object detection, segmentation, and classification using regions. Building on a novel region segmentation algo...

Fast Image Search for Learned Metrics

by Prateek Jain, et al.
Abstract - Cited by 103 (11 self)
We introduce a method that enables scalable image search for learned metrics. Given pairwise similarity and dissimilarity constraints between some images, we learn a Mahalanobis distance function that captures the images’ underlying relationships well. To allow sub-linear time similarity search under the learned metric, we show how to encode the learned metric parameterization into randomized locality-sensitive hash functions. We further formulate an indirect solution that enables metric learning and hashing for vector spaces whose high dimensionality makes it infeasible to learn an explicit weighting over the feature dimensions. We demonstrate the approach applied to a variety of image datasets. Our learned metrics improve accuracy relative to commonly-used metric baselines, while our hashing construction enables efficient indexing with learned distances and very large databases.
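The hashing construction mentioned in the abstract can be illustrated as follows: if the learned metric factors as M = GᵀG, standard random-hyperplane hashes applied to Gx respect inner products under M, since (Gx)·(Gy) = xᵀMy. This is a sketch under that assumption with names of my own; it does not show the paper's implicit formulation for very high-dimensional spaces.

```python
import numpy as np

def make_metric_hash(G, n_bits, rng):
    """Random-hyperplane LSH adapted to a learned Mahalanobis metric
    M = G^T G: transform each point by G once, then apply ordinary
    sign-random-projection hashing. Similar points under M get keys
    with small Hamming distance, enabling sub-linear search."""
    R = rng.standard_normal((n_bits, G.shape[0]))  # hyperplane normals
    def hash_fn(x):
        return tuple((R @ (G @ x)) > 0)  # one sign bit per hyperplane
    return hash_fn
```
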

Citation Context

...d example-based approaches to pose estimation [23, 2] seek to leverage extremely large image collections, while nearest neighbor classifiers are frequently employed for recognition and shape matching [31, 10]. For most such tasks, the quality of the results relies heavily on the chosen image representation and the distance metric used to compare examples. Unfortunately, preferred representations tend to b...

VisualRank: applying PageRank to large-scale image search

by Yushi Jing, Shumeet Baluja - IEEE Trans. Pattern Analysis and Machine Intelligence , 2008
Abstract - Cited by 96 (4 self)
Abstract—Because of the relative ease in understanding and processing text, commercial image-search systems often rely on techniques that are largely indistinguishable from text search. Recently, academic studies have demonstrated the effectiveness of employing image-based features to provide either alternative or additional signals to use in this process. However, it remains uncertain whether such techniques will generalize to a large number of popular Web queries and whether the potential improvement to search quality warrants the additional computational cost. In this work, we cast the image-ranking problem into the task of identifying “authority” nodes on an inferred visual similarity graph and propose VisualRank to analyze the visual link structures among images. The images found to be “authorities” are chosen as those that answer the image queries well. To understand the performance of such an approach in a real system, we conducted a series of large-scale experiments based on the task of retrieving images for 2,000 of the most popular product queries. Our experimental results show significant improvement, in terms of user satisfaction and relevancy, in comparison to the most recent Google Image Search results. Maintaining modest computational cost is vital to ensuring that this procedure can be used in practice; we describe the techniques required to make this system practical for large-scale deployment in commercial search engines. Index Terms—Image ranking, content-based image retrieval, eigenvector centrality, graph theory.
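The core ranking step, a PageRank-style power iteration over a visual-similarity graph, can be sketched as follows. This is a toy illustration with my own names; it omits the similarity-graph construction and the large-scale engineering the paper describes.

```python
import numpy as np

def visualrank(S, damping=0.85, iters=100):
    """Power iteration on a column-normalized visual-similarity
    matrix S, PageRank-style: images similar to many highly ranked
    images accumulate high scores, i.e. become the 'authority' nodes
    of the similarity graph."""
    n = S.shape[0]
    cols = S.sum(axis=0)
    P = S / np.where(cols > 0, cols, 1)      # column-stochastic transitions
    r = np.full(n, 1.0 / n)                  # uniform initial rank
    for _ in range(iters):
        r = (1 - damping) / n + damping * (P @ r)
    return r / r.sum()
```
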

Citation Context

...ze image similarities through domain engineering. For example, similarity computations that capture higher order feature dependencies and learning techniques can be efficiently employed [11], [12], [13], [14]. Further, even nonvisual information, such as user-generated covisitation [15], [16] statistics, can be easily combined with visual features to make similarity scores more semantically relevant...

Fast solvers and efficient implementations for distance metric learning

by Kilian Q. Weinberger, Lawrence K. Saul - In ICML , 2008
Abstract - Cited by 85 (7 self)
In this paper we study how to improve nearest neighbor classification by learning a Mahalanobis distance metric. We build on a recently proposed framework for distance metric learning known as large margin nearest neighbor (LMNN) classification. Our paper makes three contributions. First, we describe a highly efficient solver for the particular instance of semidefinite programming that arises in LMNN classification; our solver can handle problems with billions of large margin constraints in a few hours. Second, we show how to reduce both training and testing times using metric ball trees; the speedups from ball trees are further magnified by learning low dimensional representations of the input space. Third, we show how to learn different Mahalanobis distance metrics in different parts of the input space. For large data sets, the use of locally adaptive distance metrics leads to even lower error rates.
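The distance family involved can be illustrated with a small sketch: once a Mahalanobis metric d(x, y) = ||L(x - y)|| has been learned (LMNN learns such an L), kNN reduces to plain Euclidean search after a single linear projection, which is also why structures like ball trees remain applicable. A toy example with names of my own, not the paper's solver:

```python
import numpy as np

def mahalanobis_knn_predict(L, X_train, y_train, x, k=3):
    """k-NN under a learned Mahalanobis metric d(x, y) = ||L(x - y)||.
    Projecting all points by L once turns the learned metric into an
    ordinary Euclidean distance in the projected space."""
    Z = X_train @ L.T                      # project training set once
    z = L @ x                              # project the query
    d = np.linalg.norm(Z - z, axis=1)      # Euclidean in projected space
    nearest = np.argsort(d)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]       # majority vote
```

With L = diag([1, 0]), the noisy second feature is ignored, so the query below is assigned to class 0 even though plain Euclidean kNN on the raw features would favor class 1.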

Citation Context

...plistic, many researchers have begun to ask how to learn or adapt the distance metric itself in order to achieve better results (Xing et al., 2002; Chopra et al., 2005; Frome et al., 2007). Distance metric learning is an emerging area of statistical learning in which the goal is to induce a more powerful distance metric from labeled examples. The simplest instance of this problem aris...


Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University