Results 11 - 20 of 175
SVMs for Histogram-Based Image Classification, 1999
Abstract - Cited by 96 (0 self)
Traditional classification approaches generalize poorly on image classification tasks because of the high dimensionality of the feature space. This paper shows that Support Vector Machines (SVMs) can generalize well on difficult image classification problems where the only features are high-dimensional histograms. Heavy-tailed RBF kernels of the form K(x, y) = exp(−ρ Σ_i |x_i^a − y_i^a|^b), with a ≤ 1 and b ≤ 2, are evaluated on the classification of images extracted from the Corel Stock Photo Collection and shown to far outperform traditional polynomial or Gaussian RBF kernels. Moreover, we observed that a simple remapping of the input x_i → x_i^a improves the performance of linear SVMs to such an extent that it makes them, for this problem, a valid alternative to RBF kernels.
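The heavy-tailed kernel family described in the abstract can be sketched as follows. This is a minimal illustration only; the parameter values ρ, a, b below are arbitrary choices for demonstration, not the ones tuned in the paper, and histogram entries are assumed non-negative so that the power remapping is well defined.

```python
import math

def heavy_tailed_rbf(x, y, rho=1.0, a=0.5, b=1.0):
    """Heavy-tailed RBF kernel K(x, y) = exp(-rho * sum_i |x_i^a - y_i^a|^b).

    With a <= 1 the per-bin remapping x_i -> x_i^a compresses large
    histogram counts, and b <= 2 gives the kernel heavier tails than a
    Gaussian; a = 1, b = 2 recovers the ordinary Gaussian RBF kernel.
    """
    s = sum(abs(xi ** a - yi ** a) ** b for xi, yi in zip(x, y))
    return math.exp(-rho * s)

# Identical histograms have kernel value 1; similarity decays with distance.
h1 = [0.2, 0.5, 0.3]
h2 = [0.3, 0.4, 0.3]
print(heavy_tailed_rbf(h1, h1))       # 1.0
print(heavy_tailed_rbf(h1, h2) < 1.0)  # True
```

The same function applied with a = 1 doubles as the linear-SVM remapping x_i → x_i^a mentioned at the end of the abstract, applied as a preprocessing step rather than inside the kernel.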
Performance Evaluation in Content-Based Image Retrieval: Overview and Proposals, 2000
Abstract - Cited by 88 (12 self)
Evaluation of retrieval performance is a crucial problem in content-based image retrieval (CBIR). Many different methods for measuring the performance of a system have been created and used by researchers. This article discusses the advantages and shortcomings of the performance measures currently used. Problems such as defining a common image database for performance comparisons and a means of getting relevance judgments (or ground truth) for queries are explained. The relationship between CBIR and information retrieval (IR) is made clear, since IR researchers have decades of experience with the evaluation problem. Many of their solutions can be used for CBIR, despite the differences between the fields. Several methods used in text retrieval are explained. Proposals for performance measures and means of developing a standard test suite for CBIR, similar to that used in IR at the annual Text REtrieval Conference (TREC), are presented.
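Among the text-retrieval measures carried over to CBIR, precision and recall at a rank cutoff are the most basic. A minimal sketch, with made-up image identifiers:

```python
def precision_recall_at_k(ranked_ids, relevant_ids, k):
    """Precision and recall at cutoff k for one query.

    ranked_ids: retrieval results ordered by decreasing similarity.
    relevant_ids: the ground-truth relevant images for the query.
    """
    retrieved = ranked_ids[:k]
    hits = sum(1 for r in retrieved if r in relevant_ids)
    return hits / k, hits / len(relevant_ids)

# Hypothetical ranking for one query; img1 and img2 are relevant, img5 was missed.
ranked = ["img3", "img7", "img1", "img9", "img2"]
relevant = {"img1", "img2", "img5"}
p, r = precision_recall_at_k(ranked, relevant, 5)
print(p)  # 0.4  (2 of the top 5 are relevant)
```

Averaging such numbers over a fixed query set with agreed ground truth is exactly what a TREC-style test suite standardizes.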
Narrowing the Semantic Gap - Improved Text-Based Web Document Retrieval Using Visual Features - IEEE Transactions on Multimedia, 2002
Abstract - Cited by 75 (2 self)
In this paper, we present the results of our work that seeks to negotiate the gap between low-level features and high-level concepts in the domain of web document retrieval. This work concerns a technique, Latent Semantic Indexing (LSI), which has been used for textual information retrieval for many years. In this environment, LSI is used to determine clusters of co-occurring keywords, sometimes called concepts, so that a query which uses a particular keyword can then retrieve documents perhaps not containing this keyword, but containing other keywords from the same cluster. In this paper, we examine the use of this technique for content-based web document retrieval, using both keywords and image features to represent the documents. Two different approaches to image feature representation, namely, color histograms and color anglograms, are adopted and evaluated. Experimental results show that LSI, together with both textual and visual features, is able to extract the underlying semantic structure of web documents, thus helping to improve the retrieval performance significantly.
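The core of LSI is a truncated SVD of a feature-by-document matrix. The toy sketch below uses an invented four-keyword, four-document matrix to show the mechanism; in the paper's setting the rows would mix keywords with visual features such as histogram or anglogram bins.

```python
import numpy as np

# Toy term-by-document matrix: rows = keywords, columns = documents.
# "cat"/"feline" co-occur in the first documents, "car"/"engine" in the rest.
A = np.array([
    [1, 1, 0, 0],   # cat
    [1, 0, 0, 0],   # feline
    [0, 0, 1, 1],   # car
    [0, 0, 1, 1],   # engine
], dtype=float)

# LSI: a rank-k truncated SVD projects documents into a latent concept space.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
docs = (np.diag(s[:k]) @ Vt[:k]).T   # one k-dimensional concept vector per document

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Document 1 contains "cat" but not "feline"; it still lands close to
# document 0 in concept space, far from the car documents.
print(cos(docs[0], docs[1]) > cos(docs[0], docs[2]))  # True
```

This is how a query on one keyword can retrieve documents that only contain co-occurring keywords from the same cluster.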
Color Image Segmentation: A State-of-the-Art Survey
Abstract - Cited by 68 (0 self)
Segmentation is the low-level operation concerned with partitioning images by determining disjoint and homogeneous regions or, equivalently, by finding edges or boundaries. The homogeneous regions, or the edges, are supposed to correspond to actual objects, or parts of them, within the images. Thus, in a large number of applications in image processing and computer vision, segmentation plays a fundamental role as the first step before applying to images higher-level operations such as recognition, semantic interpretation, and representation. Until very recently, attention has been focused on segmentation of gray-level images, since these were the only kind of visual information that acquisition devices could capture and computer resources could handle. Nowadays, color imagery has definitely supplanted monochromatic information, and computation power is no longer a limitation in processing large volumes of data. Attention has accordingly been focused in recent years on algorithms for segmentation of color images, and various techniques, often borrowed from the background of gray-level image segmentation, have been proposed. This paper provides a review of methods advanced in the past few years for segmentation of color images.
Modeling, clustering, and segmenting video with mixtures of dynamic textures - PAMI, 2008
Abstract - Cited by 67 (14 self)
A dynamic texture is a spatio-temporal generative model for video, which represents video sequences as observations from a linear dynamical system. This work studies the mixture of dynamic textures, a statistical model for an ensemble of video sequences that is sampled from a finite collection of visual processes, each of which is a dynamic texture. An expectation-maximization (EM) algorithm is derived for learning the parameters of the model, and the model is related to previous work in linear systems, machine learning, time-series clustering, control theory, and computer vision. Through experimentation, it is shown that the mixture of dynamic textures is a suitable representation for both the appearance and dynamics of a variety of visual processes that have traditionally been challenging for computer vision (for example, fire, steam, water, vehicle and pedestrian traffic, and so forth). When compared with state-of-the-art methods in motion segmentation, including both temporal texture methods and traditional representations (for example, optical flow or other localized motion representations), the mixture of dynamic textures achieves superior performance in the problems of clustering and segmenting video of such processes.
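The generative model behind a single dynamic texture is a linear dynamical system; the mixture model simply draws each sequence from one of K such systems. The sketch below samples one synthetic "video" from one component, with invented, deliberately tiny dimensions (2-D hidden state, 16-pixel frames):

```python
import numpy as np

rng = np.random.default_rng(0)

# A dynamic texture as a linear dynamical system:
#   x_{t+1} = A x_t + v_t,  v_t ~ N(0, Q)   (hidden state, dim n)
#   y_t     = C x_t + w_t,  w_t ~ N(0, R)   (observed frame, dim m pixels)
n, m, T = 2, 16, 50
A = 0.9 * np.eye(n)               # stable state transition (|eigenvalues| < 1)
C = rng.standard_normal((m, n))   # maps hidden state to pixel intensities
q_std, r_std = 0.1, 0.05          # isotropic noise, for simplicity

x = rng.standard_normal(n)
frames = []
for _ in range(T):
    x = A @ x + q_std * rng.standard_normal(n)
    frames.append(C @ x + r_std * rng.standard_normal(m))
video = np.stack(frames)          # shape (T, m): one synthetic sequence

print(video.shape)  # (50, 16)
```

Fitting the mixture reverses this process: the EM algorithm alternates between soft-assigning each observed sequence to a component and re-estimating each component's (A, C, Q, R) from its assigned sequences.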
Feature normalization and likelihood-based similarity measures for image retrieval - Pattern Recognition Letters, 2001
Abstract - Cited by 50 (4 self)
Distance measures like the Euclidean distance are used to measure similarity between images in content-based image retrieval. Such geometric measures implicitly assign more weighting to features with large ranges than those with small ranges. This paper discusses the effects of five feature normalization methods on retrieval performance. We also describe two likelihood ratio-based similarity measures that perform significantly better than the commonly used geometric approaches like the Lp metrics.
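The implicit-weighting problem the abstract describes is easy to reproduce. The sketch below uses one of the simplest normalization schemes, min-max scaling, with invented two-feature data where one feature's range is a thousand times the other's:

```python
def minmax_normalize(rows):
    """Scale each feature (column) to [0, 1], so features with large numeric
    ranges no longer dominate Euclidean distances."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]

def euclidean(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

# Feature 0 spans [0, 1000], feature 1 spans [0, 1]: before normalization the
# distance between any two images is dictated almost entirely by feature 0.
data = [[0.0, 0.0], [1000.0, 1.0], [500.0, 0.5]]
norm = minmax_normalize(data)
print(norm[1])  # [1.0, 1.0]
```

This is only one of the five normalization methods the paper compares, and it says nothing about the likelihood-ratio similarity measures, which model feature distributions rather than rescaling ranges.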
A design tool for camera-based interaction - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, 2003
Abstract - Cited by 49 (1 self)
Cameras provide an appealing new input medium for interaction. The creation of camera-based interfaces is outside the skill-set of most programmers and completely beyond the skills of most interface designers. Image Processing with Crayons is a tool for creating new camera-based interfaces using a simple painting metaphor. A transparent layers model is used to present the designer with all of the necessary information. Traditional machine learning algorithms have been modified to accommodate the rapid response time required of an interactive design tool.
A Continuous Probabilistic Framework for Image Matching, 2001
Abstract - Cited by 43 (19 self)
In this paper we describe a probabilistic image matching scheme in which the image representation is continuous and the similarity measure and distance computation are also defined in the continuous domain. Each image is first represented as a Gaussian mixture distribution and images are compared and matched via a probabilistic measure of similarity between distributions. A common probabilistic and continuous framework is applied to the representation as well as the matching process, ensuring an overall system that is theoretically appealing. Matching results are investigated and the application to an image retrieval system is demonstrated.
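A natural similarity between continuous distributions is the Kullback-Leibler divergence. It has no closed form between full Gaussian mixtures (systems approximate it, e.g. by sampling or component matching), but the single-Gaussian 1-D case below illustrates the idea of comparing images via distributions rather than via bin-by-bin histogram distances. The formula is the standard closed form, not one taken from this paper:

```python
import math

def kl_gauss_1d(mu1, var1, mu2, var2):
    """Closed-form KL divergence KL(N(mu1, var1) || N(mu2, var2))
    between two one-dimensional Gaussians."""
    return (math.log(math.sqrt(var2) / math.sqrt(var1))
            + (var1 + (mu1 - mu2) ** 2) / (2 * var2) - 0.5)

print(kl_gauss_1d(0.0, 1.0, 0.0, 1.0))  # 0.0 (identical distributions)
# Divergence grows as the distributions move apart:
print(kl_gauss_1d(0.0, 1.0, 3.0, 1.0) > kl_gauss_1d(0.0, 1.0, 1.0, 1.0))  # True
```

In the full system each image is a mixture of multivariate Gaussians over pixel features, and the matching score is a probabilistic similarity between those mixtures.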
The effects of segmentation and feature choice in a translation model of object recognition - IEEE Conf. on Computer Vision and Pattern Recognition, 2003
Abstract - Cited by 40 (6 self)
We work with a model of object recognition where words must be placed on image regions. This approach means that large-scale experiments are relatively easy, so we can evaluate the effects of various early and mid-level vision algorithms on recognition performance. We evaluate various image segmentation algorithms by determining word prediction accuracy for images segmented in various ways and represented by various features. We take the view that good segmentations respect object boundaries, and so word prediction should be better for a better segmentation. However, it is usually very difficult in practice to obtain segmentations that do not break up objects, so most practitioners attempt to merge segments to get better putative object representations. We demonstrate that our paradigm of word prediction easily allows us to predict potentially useful segment merges, even for segments that do not look similar (for example, merging the black and white parts of the same object).
[Figure 1. Illustration of labeling: each region is labeled with the maximally probable word, but a probability distribution over all words is available for each region.]
Still Image Segmentation Tools for Object-based Multimedia Applications - International Journal of Pattern Recognition and Artificial Intelligence, 2004
Abstract - Cited by 38 (20 self)
In this paper, a color image segmentation algorithm and an approach to large-format image segmentation are presented, both focused on breaking down images into semantic objects for object-based multimedia applications. The proposed color image segmentation algorithm performs the segmentation in the combined intensity–texture–position feature space in order to produce connected regions that correspond to the real-life objects shown in the image. A preprocessing stage of conditional image filtering and a modified K-Means-with-connectivity-constraint pixel classification algorithm are used to allow for seamless integration of the different pixel features. Unsupervised operation of the segmentation algorithm is enabled by means of an initial clustering procedure. The large-format image segmentation scheme employs the aforementioned segmentation algorithm, providing an elegant framework for the fast segmentation of relatively large images. In this framework, the segmentation algorithm is applied to reduced versions of the original images in order to speed up the completion of the segmentation, resulting in a coarse-grained segmentation mask. The final fine-grained segmentation mask is produced with partial reclassification of the pixels of the original image to the already formed regions, using a Bayes classifier. As shown by experimental evaluation, this novel scheme provides fast segmentation with high perceptual segmentation quality.
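The idea of classifying pixels in a combined feature space can be sketched with plain K-means over per-pixel [intensity, x, y] vectors. This is a rough, unconstrained stand-in for the paper's K-Means-with-connectivity-constraint algorithm (no texture features, no explicit connectivity enforcement): weighting pixel coordinates into the feature vector already biases clusters toward spatially coherent regions.

```python
import numpy as np

def kmeans_pixels(img, k=2, iters=10, pos_weight=0.5):
    """Plain K-means over per-pixel [intensity, x, y] feature vectors."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    feats = np.stack([img.ravel().astype(float),
                      pos_weight * xs.ravel() / w,   # normalized, weighted
                      pos_weight * ys.ravel() / h],  # pixel coordinates
                     axis=1)
    # Deterministic init: spread the k initial centers over the pixel list.
    centers = feats[np.linspace(0, len(feats) - 1, k).astype(int)].copy()
    for _ in range(iters):
        # Assign every pixel to its nearest center, then recompute centers.
        d = ((feats[:, None, :] - centers[None]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = feats[labels == j].mean(axis=0)
    return labels.reshape(h, w)

# A 4x4 image with a dark left half and a bright right half: the two halves
# come out as two connected regions.
img = np.zeros((4, 4)); img[:, 2:] = 1.0
seg = kmeans_pixels(img)
print((seg[:, :2] == seg[0, 0]).all() and (seg[:, 2:] == seg[0, 2]).all())  # True
```

The paper's coarse-to-fine scheme would run such a clustering on a reduced image, then reclassify only the full-resolution pixels near region boundaries.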