Object Recognition with Hierarchical Kernel Descriptors
- In Proc. of CVPR, 2011
Cited by 6 (3 self)
Kernel descriptors [1] provide a unified way to generate rich visual feature sets by turning pixel attributes into patch-level features, and yield impressive results on many object recognition tasks. However, best results with kernel descriptors are achieved using efficient match kernels in conjunction with nonlinear SVMs, which makes it impractical for large-scale problems. In this paper, we propose hierarchical kernel descriptors that apply kernel descriptors recursively to form image-level features and thus provide a conceptually simple and consistent way to generate image-level features from pixel attributes. More importantly, hierarchical kernel descriptors allow linear SVMs to yield state-of-the-art accuracy while being scalable to large datasets. They can also be naturally extended to extract features over depth images. We evaluate hierarchical kernel descriptors both on the CIFAR10 dataset and the new RGB-D Object Dataset consisting of segmented RGB and depth images of 300 everyday objects.
Exploiting descriptor distances for precise image search
- Research report, 2011
Optimizing Visual Vocabularies Using Soft Assignment Entropies
Cited by 1 (1 self)
The state of the art for large database object retrieval in images is based on quantizing descriptors of interest points into visual words. High similarity between matching image representations (as bags of words) is based upon the assumption that matched points in the two images end up in similar words in hard assignment or in similar representations in soft assignment techniques. In this paper we study how ground truth correspondences can be used to generate better visual vocabularies. Matching of image patches can be done e.g. using deformable models or from estimating 3D geometry. For optimization of the vocabulary, we propose minimizing the entropies of soft assignment of points. We base our clustering on hierarchical k-splits. The results from our entropy based clustering are compared with hierarchical k-means. The vocabularies have been tested on real data with decreased entropy and increased true positive rate, as well as better retrieval performance.
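As a rough illustration of the quantity this abstract proposes to minimize, the sketch below computes the entropy of a soft assignment of one descriptor to a set of visual words. The Gaussian weighting, the `sigma` parameter, and the function names are assumptions made for illustration; the abstract does not specify the weighting or the exact objective used in the paper.

```python
import math

def soft_assignment(descriptor, centers, sigma=1.0):
    """Soft-assign one descriptor to all centers with Gaussian weights (assumed form)."""
    sq_dists = [sum((x - c) ** 2 for x, c in zip(descriptor, center))
                for center in centers]
    # Shift by the smallest distance before exponentiating, for numerical stability.
    d_min = min(sq_dists)
    weights = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]

def entropy(probs):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical two-word vocabulary in a 2-D descriptor space.
centers = [(0.0, 0.0), (10.0, 0.0)]
near_a_word = entropy(soft_assignment((0.1, 0.0), centers))    # close to one word
between_words = entropy(soft_assignment((5.0, 0.0), centers))  # equidistant
```

A descriptor close to a single word has near-zero entropy, while one that sits halfway between two words has the maximum entropy for two words, log 2. A vocabulary that lowers this entropy therefore places descriptors more decisively.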
Content-Based Retrieval in Endomicroscopy: Toward an Efficient Smart Atlas for Clinical Diagnosis
In this paper we present the first Content-Based Image Retrieval (CBIR) framework in the field of in vivo endomicroscopy, with applications ranging from training support to diagnosis support. We propose to adjust the standard Bag-of-Visual-Words method for the retrieval of endomicroscopic videos. Retrieval performance is evaluated both indirectly from a classification point-of-view, and directly with respect to a perceived similarity ground truth. The proposed method significantly outperforms, on two different endomicroscopy databases, several state-of-the-art methods in CBIR. With the aim of building a self-training simulator, we use retrieval results to estimate the interpretation difficulty experienced by the endoscopists. Finally, by incorporating clinical knowledge about perceived similarity and endomicroscopy semantics, we are able: 1) to learn an adequate visual similarity distance and 2) to build visual-word-based semantic signatures that extract, from low-level visual features, a higher-level clinical knowledge expressed in the endoscopist's own language.
Learning Semantic and Visual Similarity for Endomicroscopy Video Retrieval
Content-Based Image Retrieval (CBIR) is a valuable computer vision technique which is increasingly being applied in the medical community for diagnosis support. However, traditional CBIR systems only deliver visual outputs, i.e. images having a similar appearance to the query, which is not directly interpretable by the physicians. Our objective is to provide a system for endomicroscopy video retrieval which delivers both visual and semantic outputs that are consistent with each other. In a previous study, we developed an adapted bag-of-visual-words method for endomicroscopy retrieval, called “Dense-Sift”, that computes a visual signature for each video. In this study, we present a novel approach to complement visual similarity learning with semantic knowledge extraction, in the field of in vivo endomicroscopy. We first leverage a semantic ground truth based on 8 binary concepts, in order to transform these visual signatures
Supervised Feature Quantization with Entropy Optimization
Feature quantization is a crucial component for efficient large scale image retrieval and object recognition. By quantizing local features into visual words, one hopes that features that match each other obtain the same word ID. Then, similarities between images can be measured with respect to the corresponding histograms of visual words. Given the appearance variations of local features, traditional quantization methods do not take into account the distribution of matched features. In this paper, we investigate how to encode additional prior information on the feature distribution via entropy optimization by leveraging ground truth correspondence data. We propose a computationally efficient optimization scheme for large scale vocabulary training. The results from our experiments suggest that entropy-optimized vocabulary performs better than unsupervised quantization methods in terms of recall and precision for feature matching. We also demonstrate the advantage of the optimized vocabulary for image retrieval.
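The baseline pipeline this abstract starts from (quantize local features to word IDs, then compare images through their visual-word histograms) can be sketched as follows. This is a minimal illustration of plain nearest-word (hard) quantization, not the entropy-optimized scheme the paper proposes; the toy vocabulary, the feature values, and the choice of histogram intersection as the similarity measure are assumptions.

```python
from collections import Counter

def nearest_word(feature, vocabulary):
    """Return the ID of the vocabulary entry closest to the feature (squared L2)."""
    return min(
        range(len(vocabulary)),
        key=lambda i: sum((f - v) ** 2 for f, v in zip(feature, vocabulary[i])),
    )

def bow_histogram(features, vocabulary):
    """Normalized histogram of visual-word IDs for one image."""
    if not features:
        return [0.0] * len(vocabulary)
    counts = Counter(nearest_word(f, vocabulary) for f in features)
    return [counts[i] / len(features) for i in range(len(vocabulary))]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Hypothetical 3-word vocabulary and two images with 2-D local features.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
image_a = [(0.1, 0.0), (0.9, 1.1), (5.0, 4.8), (0.0, 0.2)]
image_b = [(0.2, 0.1), (1.0, 0.9), (4.9, 5.1), (5.2, 5.0)]

hist_a = bow_histogram(image_a, vocabulary)
hist_b = bow_histogram(image_b, vocabulary)
similarity = histogram_intersection(hist_a, hist_b)
```

Two features that depict the same physical point only contribute to the similarity if they land in the same word, which is why the placement of word boundaries matters for matching recall.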
Contextual Synonym Dictionary for Visual Object Retrieval
In this paper, we study the problem of visual object retrieval by introducing a dictionary of contextual synonyms to narrow down the semantic gap in visual word quantization. The basic idea is to expand a visual word in the query image with its synonyms to boost the retrieval recall. Unlike existing work such as soft-quantization, which only focuses on the Euclidean (l2) distance in descriptor space, we utilize the visual words which are more likely to describe visual objects with the same semantic meaning by identifying the words with similar contextual distributions (i.e. contextual synonyms). We describe the contextual distribution of a visual word using the statistics of both co-occurrence and spatial information averaged over all the image patches having this visual word, and propose an efficient system implementation to construct the contextual synonym dictionary for a large visual vocabulary. The whole construction process is unsupervised and the synonym dictionary can be naturally integrated into a standard bag-of-feature image retrieval system. Experimental results on several benchmark datasets are quite promising. The contextual synonym dictionary-based expansion consistently outperforms the l2 distance-based soft-quantization, and advances the state-of-the-art performance remarkably.
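The query-side expansion step described in this abstract might look like the sketch below, which assumes the synonym dictionary has already been built. The dictionary contents, the `synonym_weight` down-weighting factor, and the function name are hypothetical; the abstract does not say how synonyms are weighted or how the dictionary is stored.

```python
from collections import defaultdict

def expand_query(query_counts, synonyms, synonym_weight=0.5):
    """Add each query word's synonyms to the query with a reduced weight (assumed scheme)."""
    expanded = defaultdict(float)
    for word, count in query_counts.items():
        expanded[word] += count
        for synonym in synonyms.get(word, ()):
            expanded[synonym] += synonym_weight * count
    return dict(expanded)

# Hypothetical query: word 3 occurs twice, word 7 once.
query_counts = {3: 2, 7: 1}
# Hypothetical dictionary: word 3 has contextual synonyms 8 and 7.
synonyms = {3: [8, 7]}

expanded = expand_query(query_counts, synonyms)
```

The expanded query can then be matched against an inverted index in the same way as an ordinary bag-of-features query, so database images quantized to word 8 become reachable from a query that only contained word 3.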
Unsupervised Semantic Feature Discovery for Image Object Retrieval and Tag Refinement
We have witnessed the exponential growth of images and videos with the prevalence of capture devices and the ease of social services such as Flickr and Facebook. Meanwhile, enormous media collections come with rich contextual cues such as tags, geo-locations, descriptions, and time. To obtain desired images, users usually issue a query to a search engine using either

