Results 1 - 8 of 8
A.: Blocks that shout: Distinctive parts for scene classification
, 2013
Cited by 52 (1 self)
The automatic discovery of distinctive parts for an object or scene class is challenging since it requires simultaneously learning the part appearance and identifying the part occurrences in images. In this paper, we propose a simple, efficient, and effective method to do so. We address this problem by learning parts incrementally, starting from a single part occurrence with an Exemplar SVM. In this manner, additional part instances are discovered and aligned reliably before being considered as training examples. We also propose entropy-rank curves as a means of evaluating the distinctiveness of parts shareable between categories and use them to select useful parts out of a set of candidates. We apply the new representation to the task of scene categorisation on the MIT Scene 67 benchmark. We show that our method can learn parts which are significantly more informative and for a fraction of the cost, compared to previous part-learning methods such as Singh et al. [28]. We also show that a well constructed bag of words or Fisher vector model can substantially outperform the previous state-of-the-art classification performance on this data.
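The abstract does not define entropy-rank curves, so the following is only a minimal sketch of one plausible reading: rank all detections of a part by detector score and, for each rank r, compute the entropy of the class labels among the top r detections. The function name, the toy scores, and the labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def entropy_rank_curve(scores, labels):
    """Entropy (in bits) of the class labels among the top-r detections, r = 1..n.

    Assumed reading of an entropy-rank curve; the paper's definition may differ.
    """
    # Sort detection indices from highest to lowest detector score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    counts = Counter()
    curve = []
    for r, idx in enumerate(order, start=1):
        counts[labels[idx]] += 1
        # Shannon entropy of the empirical class distribution in the top r.
        h = -sum((c / r) * math.log2(c / r) for c in counts.values())
        curve.append(h)
    return curve

# Hypothetical toy detections: score of each detection and its image's class.
scores = [0.9, 0.8, 0.7, 0.6]
labels = ["kitchen", "kitchen", "office", "office"]
curve = entropy_rank_curve(scores, labels)
```

Under this reading, a distinctive part would keep the curve low over many ranks, because its highest-scoring detections come from few classes.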
Learning Discriminative Part Detectors for Image Classification and
Cited by 13 (1 self)
This is a preliminary version accepted for publication to ICCV 2013. In this paper, we address the problem of learning discriminative part detectors from image sets with category labels. We propose a novel latent SVM model regularized by group sparsity to learn these part detectors. Starting from a large set of initial parts, the group sparsity regularizer forces the model to jointly select and optimize a set of discriminative part detectors in a max-margin framework. We propose a stochastic version of a proximal algorithm to solve the corresponding optimization problem. We apply the proposed method to image classification and cosegmentation, and quantitative experiments with standard benchmarks show that it matches or improves upon the state of the art.
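To illustrate how a group sparsity regularizer can select parts, here is a minimal sketch of the standard proximal operator of the group l2 penalty (block soft-thresholding). It is not the paper's stochastic algorithm: the grouping of one weight vector per part, the function name, and the threshold value are assumptions made for illustration.

```python
import math

def prox_group_l2(groups, threshold):
    """Block soft-thresholding: shrink each group's l2 norm by `threshold`.

    Groups whose norm is at most `threshold` are set to zero, which is how a
    group-sparsity penalty discards whole parts (assumed: one group per part).
    """
    out = []
    for w in groups:
        norm = math.sqrt(sum(v * v for v in w))
        # Scale factor max(0, 1 - threshold / ||w||); zero groups stay zero.
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        out.append([scale * v for v in w])
    return out

# Hypothetical part weights: one strong part and one weak part.
parts = [[3.0, 4.0], [0.3, 0.4]]
shrunk = prox_group_l2(parts, threshold=1.0)
```

In this toy case the strong part is shrunk but kept, while the weak part is zeroed out entirely, so it would drop out of the selected set.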
Max-Margin Multiple-Instance Dictionary Learning
Cited by 9 (3 self)
Dictionary learning has become an increasingly important task in machine learning, as it is fundamental to the representation problem. A number of emerging techniques specifically include a codebook learning step, in which a critical knowledge abstraction process is carried out. Existing approaches in dictionary (codebook) learning are either generative (unsupervised, e.g. k-means) or discriminative (supervised, e.g. extremely randomized forests). In this paper, we propose a multiple instance learning (MIL) strategy (along the line of weakly supervised learning) for dictionary learning. Each code is represented by a classifier, such as a linear SVM, which naturally performs metric fusion for multi-channel features. We design a formulation to simultaneously learn mixtures of codes by maximizing classification margins in MIL. State-of-the-art results are observed in image classification benchmarks based on the learned codebooks, which exhibit both compactness and effectiveness.
Incorporating Scene Context and Object Layout into Appearance Modeling
Cited by 1 (1 self)
A scene category imposes tight distributions over the kinds of objects that might appear in the scene, the appearance of those objects, and their layout. In this paper, we propose a method to learn scene structures that can encode three main interlacing components of a scene: the scene category, the context-specific appearance of objects, and their layout. Our experimental evaluations show that our learned scene structures outperform the state-of-the-art Deformable Part Models method in detecting objects in a scene. Our scene structure provides a level of scene understanding that is amenable to deep visual inferences. The scene structures can also generate features that can later be used for scene categorization. Using these features, we also show promising results on scene categorization.
MANTRA: Minimum Maximum Latent Structural SVM for Image Classification and Ranking
In this work, we propose a novel Weakly Supervised Learning (WSL) framework dedicated to learning discriminative part detectors from images annotated with a global label. Our WSL method encompasses three main contributions. Firstly, we introduce a new structured output latent variable model, Minimum mAximum lateNt sTRucturAl SVM (MANTRA), whose prediction relies on a pair of latent variables: h+ (resp. h−) provides positive (resp. negative) evidence for a given output y. Secondly, we instantiate MANTRA for two different visual recognition tasks: multi-class classification and ranking. For ranking, we propose efficient solutions to exactly solve the inference and the loss-augmented problems. Finally, extensive experiments highlight the relevance of the proposed method: MANTRA outperforms state-of-the-art results on five different datasets.
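The abstract says only that prediction relies on a pair of latent variables, h+ and h−. The sketch below assumes that the score of an output y is the sum of the highest and the lowest latent (region) score for that output; this combination rule, the function names, and the toy scores are assumptions for illustration, not details confirmed by the abstract.

```python
def mantra_score(region_scores):
    """Assumed score for one output y: best region (h+) plus worst region (h-)."""
    return max(region_scores) + min(region_scores)

def predict(scores_by_class):
    """Return the class whose max-plus-min score is highest."""
    return max(scores_by_class, key=lambda y: mantra_score(scores_by_class[y]))

# Hypothetical per-region classifier scores for two candidate classes.
scores_by_class = {
    "kitchen": [2.0, 0.5, -0.1],  # strong positive evidence, little negative
    "office": [2.5, 0.0, -3.0],   # stronger positive, but strong negative evidence
}
predicted = predict(scores_by_class)
```

In this toy case a max-only rule would pick "office" (2.5 against 2.0), while the max-plus-min rule picks "kitchen" because one region counts strongly against "office".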
Do men ride elephant? OpenIE
[Figure 1 panel labels: dog eating ice cream, lion carrying cub, snake laying eggs, man riding elephant.]
Figure 1. Do dogs eat ice cream? While we humans have no trouble answering this question, existing text-based methods have a tough time. In this paper, we present a novel approach that can visually verify arbitrary relation phrases.
How can we know whether a statement about our world is valid? For example, given a relationship between a pair of entities, e.g., ‘eat(horse, hay)’, how can we know whether this relationship is true or false in general? Gathering such knowledge about entities and their relationships is one of the fundamental challenges in knowledge extraction. Most previous works on knowledge extraction have focused purely on text-driven reasoning for verifying relation phrases. In this work, we introduce the problem of visual verification of relation phrases and developed a
Learning Important Spatial Pooling Regions for Scene Classification
We address the false response influence problem when learning and applying discriminative parts to construct the mid-level representation in scene classification. It is often caused by the complexity of latent image structure when convolving part filters with input images. This problem makes the mid-level representation, even after pooling, not distinct enough to classify input data correctly into categories. Our solution is to learn important spatial pooling regions along with their appearance. The experiments show that this new framework suppresses false responses and produces improved results on several datasets, including MIT-Indoor, 15-Scene, and UIUC 8-Sport. When combined with global image features, our method achieves state-of-the-art performance on these datasets.
Multi-level Adaptive Active Learning for Scene Classification
Semantic scene classification is a challenging problem in computer vision. In this paper, we present a novel multi-level active learning approach to reduce the human annotation effort for training robust scene classification models. Different from most existing active learning methods that can only query labels for selected instances at the target categorization level, i.e., the scene class level, our approach establishes a semantic framework that predicts scene labels based on a latent object-based semantic representation of images, and is capable of querying labels at two different levels, the target scene class level (abstractive high level) and the latent object class level (semantic middle level). Specifically, we develop an adaptive active learning strategy to perform multi-level label query, which maintains the default label query at the target scene class level, but switches to the latent object class level whenever an “unexpected” target class label is returned by the labeler. We conduct experiments on two standard scene classification datasets to investigate the efficacy of the proposed approach. Our empirical results show the proposed adaptive multi-level active learning approach can outperform both baseline active learning methods and a state-of-the-art multi-level active learning method.