A Bayesian hierarchical model for learning natural scene categories

by L Fei-Fei, P Perona
Venue: In CVPR, 2005. 6
Results 1 - 10 of 948

Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories

by Cordelia Schmid - In CVPR
Cited by 1923 (47 self)
This paper presents a method for recognizing scene categories based on approximate global geometric correspondence. This technique works by partitioning the image into increasingly fine sub-regions and computing histograms of local features found inside each sub-region. The resulting “spatial pyramid” is a simple and computationally efficient extension of an orderless bag-of-features image representation, and it shows significantly improved performance on challenging scene categorization tasks. Specifically, our proposed method exceeds the state of the art on the Caltech-101 database and achieves high accuracy on a large database of fifteen natural scene categories. The spatial pyramid framework also offers insights into the success of several recently proposed image descriptions, including Torralba’s “gist” and Lowe’s SIFT descriptors.

Citation Context

... 16 × 16 pixel patches computed over a grid with spacing of 8 pixels. Our decision to use a dense regular grid instead of interest points was based on the comparative evaluation of Fei-Fei and Perona [4], who have shown that dense features work better for scene classification. Intuitively, a dense image description is necessary to capture uniform regions such as sky, calm water, or road surface (to d...
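
As a rough illustration of the partitioning described in the abstract above, the following sketch builds per-cell visual-word histograms at successively finer grid resolutions and concatenates them. The function name, the normalized coordinates, and the precomputed visual-word indices are assumptions made for illustration; the per-level weighting and the matching kernel of the published method are not reproduced here.

```python
def spatial_pyramid_histogram(points, vocab_size, levels=2):
    """Concatenate per-cell visual-word histograms over a grid pyramid.

    points: iterable of (x, y, word), with x and y normalized to [0, 1]
            and word an integer in range(vocab_size) (hypothetical input
            format; feature extraction and quantization happen elsewhere).
    levels: finest level L; level l splits the image into 2**l x 2**l cells.
    """
    points = list(points)
    features = []
    for level in range(levels + 1):
        cells = 2 ** level  # number of cells per side at this level
        hist = [[0] * vocab_size for _ in range(cells * cells)]
        for x, y, word in points:
            # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
            cx = min(int(x * cells), cells - 1)
            cy = min(int(y * cells), cells - 1)
            hist[cy * cells + cx][word] += 1
        for cell in hist:
            features.extend(cell)
    return features


if __name__ == "__main__":
    sample = [(0.1, 0.1, 0), (0.9, 0.9, 1), (0.6, 0.2, 0)]
    print(spatial_pyramid_histogram(sample, vocab_size=2, levels=1))
```

With a vocabulary of size M and finest level L, the resulting vector has M × (1 + 4 + ... + 4^L) entries, so the level-0 block alone is the ordinary orderless bag-of-features histogram.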

Dynamic topic models

by David M. Blei, John D. Lafferty - In ICML, 2006
Cited by 681 (29 self)
Scientists need new tools to explore and browse large collections of scholarly literature. Thanks to organizations such as JSTOR, which scan and index the original bound archives of many journals, modern scientists can search digital libraries spanning hundreds of years. A scientist, suddenly

Citation Context

...derlying topics which combined to form the documents. Such hierarchical probabilistic models are easily generalized to other kinds of data; for example, topic models have been used to analyze images (Fei-Fei and Perona, 2005; Sivic et al., 2005), biological data (Pritchard et al., 2000), and survey data (Erosheva, 2002). In an exchangeable topic model, the words of each docu...

LabelMe: A Database and Web-Based Tool for Image Annotation

by B. C. Russell, A. Torralba, K. P. Murphy, W. T. Freeman, 2008
Cited by 679 (46 self)
We seek to build a large collection of images with ground truth labels to be used for object detection and recognition research. Such data is useful for supervised learning and quantitative evaluation. To achieve this, we developed a web-based tool that allows easy image annotation and instant sharing of such annotations. Using this annotation tool, we have collected a large dataset that spans many object categories, often containing multiple instances over a wide variety of images. We quantify the contents of the dataset and compare against existing state of the art datasets used for object recognition and detection. Also, we show how to extend the dataset to automatically enhance object labels with WordNet, discover object parts, recover a depth ordering of objects in a scene, and increase the number of labels using minimal user supervision and images from the web.

Citation Context

...ttempt to solve a hard data association problem. More recently, it has become popular to apply “bag of word” techniques to discover object categories from images which are known to contain the object [10, 5, 12, 11, 9]. The relative success of such methods raises the question of whether we need images with more detailed annotation (which is more labor intensive to acquire than just captions). We argue that detailed...

Local features and kernels for classification of texture and object categories: a comprehensive study

by J. Zhang, S. Lazebnik, C. Schmid - International Journal of Computer Vision, 2007
Cited by 653 (34 self)
Recently, methods based on local image features have shown promise for texture and object recognition tasks. This paper presents a large-scale evaluation of an approach that represents images as distributions (signatures or histograms) of features extracted from a sparse set of keypoint locations and learns a Support Vector Machine classifier with kernels based on two effective measures for comparing distributions, the Earth Mover’s Distance and the χ² distance. We first evaluate the performance of our approach with different keypoint detectors and descriptors, as well as different kernels and classifiers. We then conduct a comparative evaluation with several state-of-the-art recognition methods on four texture and five object databases. On most of these databases, our implementation exceeds the best reported results and achieves comparable performance on the rest. Finally, we investigate the influence of background correlations on recognition performance via extensive tests on the PASCAL database, for which ground-truth object localization information is available. Our experiments demonstrate that image representations based on distributions of local features are surprisingly effective for classification of texture and object images under challenging real-world conditions, including significant intra-class variations and substantial background clutter.

Citation Context

...e “visual texture” of images containing objects using orderless bag-of-features models. Such models have proven to be effective for object classification [7, 61], unsupervised discovery of categories [16, 51, 55], and video retrieval [56]. The success of orderless models for these object recognition tasks may be explained with the help of an analogy to bag-of-words models for text document classification [40,...
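
The χ² distance named in the abstract above can be sketched for two histograms as follows. This is a minimal illustration, not the paper's implementation: the 1/2 normalization, the exponential kernel form, and the `scale` parameter are common conventions assumed here, and the function names are hypothetical.

```python
import math


def chi2_distance(h1, h2):
    """Chi-squared distance between two equal-length, non-negative histograms.

    Uses the convention D = 1/2 * sum((a - b)^2 / (a + b)); bins that are
    empty in both histograms contribute nothing (assumed convention).
    """
    total = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


def chi2_kernel(h1, h2, scale=1.0):
    """Turn the distance into a similarity, exp(-D / scale).

    `scale` is a free parameter in this sketch; how it should be chosen
    is not specified by the text above.
    """
    return math.exp(-chi2_distance(h1, h2) / scale)


if __name__ == "__main__":
    print(chi2_distance([0.2, 0.8], [0.5, 0.5]))
    print(chi2_kernel([0.2, 0.8], [0.5, 0.5]))
```

Identical histograms give a distance of 0 and a kernel value of 1, while histograms with disjoint support and unit mass give a distance of 1 under this normalization.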

Linear spatial pyramid matching using sparse coding for image classification

by Jianchao Yang, Kai Yu, Yihong Gong, Thomas Huang - In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009
Cited by 497 (21 self)
Recently SVMs using spatial pyramid matching (SPM) kernel have been highly successful in image classification. Despite its popularity, these nonlinear SVMs have a complexity O(n² ∼ n³) in training and O(n) in testing, where n is the training size, implying that it is nontrivial to scale up the algorithms to handle more than thousands of training images. In this paper we develop an extension of the SPM method, by generalizing vector quantization to sparse coding followed by multi-scale spatial max pooling, and propose a linear SPM kernel based on SIFT sparse codes. This new approach remarkably reduces the complexity of SVMs to O(n) in training and a constant in testing. In a number of image categorization experiments, we find that, in terms of classification accuracy, the suggested linear SPM based on sparse coding of SIFT descriptors always significantly outperforms the linear SPM kernel on histograms, and is even better than the nonlinear SPM kernels, leading to state-of-the-art performance on several benchmarks by using a single type of descriptors.

Citation Context

...owed by experiment results in Sec. 5. Finally, Sec. 6 concludes our paper. 2. Related Work Over the years many works have been done to improve the traditional BoF model, such as generative methods in [7, 21, 3, 1] for modeling the co-occurrence of the codewords or descriptors, discriminative codebook learning in [10, 5, 19, 27] instead of standard unsupervised K-means clustering, and spatial pyramid matching k...

Unsupervised learning of human action categories using spatial-temporal words

by Juan Carlos Niebles, Hongcheng Wang, Li Fei-Fei - In Proc. BMVC, 2006
Cited by 494 (8 self)
Imagine a video taken on a sunny beach, can a computer automatically tell what is happening in the scene? Can it identify different human activities in the video, such as water surfing, people walking and lying on the beach? To automatically classify or localize different actions in video sequences is very useful for a variety of tasks, such as video surveillance, object-level video summarization, video indexing, digital library organization, etc. However, it remains a challenging task for computers to achieve robust action recognition due to cluttered background, camera motion, occlusion, and geometric and photometric variances of objects. For example, in a live video of a skating competition, the skater moves rapidly across the rink, and the camera also moves to follow the skater. With moving camera, non-stationary background, and moving target, few vision algorithms could identify, categorize and

Citation Context

...representation of spatial temporal words and an unsupervised approach during learning. Our method is motivated by the recent success of object detection/classification [18, 6] or scene categorization [10] from unlabeled static images. Two related models are generally used, i.e., probabilistic Latent Semantic Analysis (pLSA) by Hofmann [12] and Latent Dirichlet Allocation (LDA) by Blei et al. [3]. In t...

80 million tiny images: a large dataset for non-parametric object and scene recognition

by Antonio Torralba, Rob Fergus, William T. Freeman - IEEE Transactions on Pattern Analysis and Machine Intelligence
Cited by 421 (18 self)
Abstract not found

Citation Context

...also found in a scene recognition task [31]. We evaluate the scene recognition performance of humans as the image resolution is decreased. We used a data set of 15 scenes that was taken from those in [14], [24], [32]. Each image was shown at one of five possible resolutions (8², 16², 32², 64², and 256² pixels) and the participant task was to assign the low-resolution picture to one of the 15 differ...

Mixed membership stochastic block models for relational data with application to protein-protein interactions

by Edoardo M. Airoldi, David M. Blei, Stephen E. Fienberg, Eric P. Xing, Tommi Jaakkola - In Proceedings of the International Biometrics Society Annual Meeting, 2006
Cited by 378 (52 self)
We develop a model for examining data that consists of pairwise measurements, for example, presence or absence of links between pairs of objects. Examples include protein interactions and gene regulatory networks, collections of author-recipient email, and social networks. Analyzing such data with probabilistic models requires special assumptions, since the usual independence or exchangeability assumptions no longer hold. We introduce a class of latent variable models for pairwise measurements: mixed membership stochastic blockmodels. Models in this class combine a global model of dense patches of connectivity (blockmodel) and a local model to instantiate node-specific variability in the connections (mixed membership). We develop a general variational inference algorithm for fast approximate posterior inference. We demonstrate the advantages of mixed membership stochastic blockmodels with applications to social networks and protein interaction networks.

Citation Context

...is violated by the heterogeneity within a unit of analysis, e.g., a document, or a node in a graph. They have been successfully applied in many domains, such as document analysis [1], image processing [7], and population genetics [9]. Mixed membership models associate each unit of analysis with multiple groups rather than a single group, via a membership ...

Supervised topic models

by David M. Blei, Jon D. McAuliffe - In preparation, 2008
Cited by 336 (8 self)
Abstract not found

Citation Context

...nsupervised LDA has previously been used to construct features for classification. The hope was that LDA topics would turn out to be useful for categorization, since they act to reduce data dimension [3, 6]. However, when the goal is prediction, fitting unsupervised topics may not be a good choice. Consider predicting a movie rating from the words in its review. Intuitively, good predictive topics will ...

Learning object categories from Google’s image search

by R. Fergus, L. Fei-Fei, P. Perona, A. Zisserman - In Proceedings of the International Conference on Computer Vision, 2005
Cited by 316 (18 self)
Current approaches to object category recognition require datasets of training images to be manually prepared, with varying degrees of supervision. We present an approach that can learn an object category from just its name, by utilizing the raw output of image search engines available on the Internet. We develop a new model, TSI-pLSA, which extends pLSA (as applied to visual words) to include spatial information in a translation and scale invariant manner. Our approach can handle the high intra-class variability and large proportion of unrelated images returned by search engines. We evaluate the models on standard test sets, showing performance competitive with existing methods trained on hand prepared datasets.

Citation Context

...nt Semantic Analysis (pLSA) [12] and its hierarchical Bayesian form, Latent Dirichlet Allocation (LDA) [4]. Recently, these two approaches have been applied to the computer vision: Fei-Fei and Perona [8] applied LDA to scene classification and Sivic et al. applied pLSA to unsupervised object categorisation. In the latter work, the Caltech datasets used by Fergus et al. [10] were combined into one lar...
