Results 1 - 10
of
1,618
Image retrieval: ideas, influences, and trends of the new age
- ACM COMPUTING SURVEYS
, 2008
"... We have witnessed great interest and a wealth of promise in content-based image retrieval as an emerging technology. While the last decade laid foundation to such promise, it also paved the way for a large number of new techniques and systems, got many new people involved, and triggered stronger ass ..."
Abstract
-
Cited by 485 (13 self)
- Add to MetaCart
(Show Context)
We have witnessed great interest and a wealth of promise in content-based image retrieval as an emerging technology. While the last decade laid foundation to such promise, it also paved the way for a large number of new techniques and systems, got many new people involved, and triggered stronger association of weakly related fields. In this article, we survey almost 300 key theoretical and empirical contributions in the current decade related to image retrieval and automatic image annotation, and in the process discuss the spawning of related subfields. We also discuss significant challenges involved in the adaptation of existing image retrieval techniques to build systems that can be useful in the real world. In retrospect of what has been achieved so far, we also conjecture what the future may hold for image retrieval research.
Face Detection In Color Images
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2002
"... Human face detection is often the first step in applications such as video surveillance, human computer interface, face recognition, and image database management. We propose a face detection algorithm for color images in the presence of varying lighting conditions as well as complex backgrounds. Ou ..."
Abstract
-
Cited by 338 (8 self)
- Add to MetaCart
Human face detection is often the first step in applications such as video surveillance, human computer interface, face recognition, and image database management. We propose a face detection algorithm for color images in the presence of varying lighting conditions as well as complex backgrounds. Our method detects skin regions over the entire image, and then generates face candidates based on the spatial arrangement of these skin patches. The algorithm constructs eye, mouth, and boundary maps for verifying each face candidate. Experimental results demonstrate successful detection over a wide variety of facial variations in color, position, scale, rotation, pose, and expression from several photo collections.
Automatic linguistic indexing of pictures by a statistical modeling approach
- PAMI
"... Automatic linguistic indexing of pictures is an important but highly challenging problem for researchers in computer vision and content-based image retrieval. In this paper, we introduce a statistical modeling approach to this problem. Categorized images are used to train a dictionary of hundreds of ..."
Abstract
-
Cited by 300 (25 self)
- Add to MetaCart
(Show Context)
Automatic linguistic indexing of pictures is an important but highly challenging problem for researchers in computer vision and content-based image retrieval. In this paper, we introduce a statistical modeling approach to this problem. Categorized images are used to train a dictionary of hundreds of statistical models each representing a concept. Images of any given concept are regarded as instances of a stochastic process that characterizes the concept. To measure the extent of association between an image and the textual description of a concept, the likelihood of the occurrence of the image based on the characterizing stochastic process is computed. A high likelihood indicates a strong association. In our experimental implementation, we focus on a particular group of stochastic processes, that is, the two-dimensional multiresolution hidden Markov models (2-D MHMMs). We implemented and tested our ALIP (Automatic Linguistic Indexing of Pictures) system on a photographic image database of 600 dierent concepts, each with about 40 training images. The system is evaluated quantitatively using more than 4,600 images outside the training database and compared with a random annotation scheme. Exper-iments have demonstrated the good accuracy of the system and its high potential in linguistic indexing of photographic images. Index Terms { Content-based image retrieval, image classication, hidden Markov model, computer vision, statistical learning, wavelets. 1
Wavelet-Based Texture Retrieval Using Generalized Gaussian Density and Kullback-Leibler Distance
- IEEE Trans. Image Processing
, 2002
"... We present a statistical view of the texture retrieval problem by combining the two related tasks, namely feature extraction (FE) and similarity measurement (SM), into a joint modeling and classification scheme. We show that using a consistent estimator of texture model parameters for the FE step fo ..."
Abstract
-
Cited by 241 (4 self)
- Add to MetaCart
(Show Context)
We present a statistical view of the texture retrieval problem by combining the two related tasks, namely feature extraction (FE) and similarity measurement (SM), into a joint modeling and classification scheme. We show that using a consistent estimator of texture model parameters for the FE step followed by computing the Kullback--Leibler distance (KLD) between estimated models for the SM step is asymptotically optimal in term of retrieval error probability. The statistical scheme leads to a new wavelet-based texture retrieval method that is based on the accurate modeling of the marginal distribution of wavelet coefficients using generalized Gaussian density (GGD) and on the existence a closed form for the KLD between GGDs. The proposed method provides greater accuracy and flexibility in capturing texture information, while its simplified form has a close resemblance with the existing methods which uses energy distribution in the frequency domain to identify textures. Experimental results on a database of 640 texture images indicate that the new method significantly improves retrieval rates, e.g., from 65% to 77%, compared with traditional approaches, while it retains comparable levels of computational complexity.
Flickr tag recommendation based on collective knowledge
- IN WWW ’08: PROC. OF THE 17TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB
, 2008
"... Online photo services such as Flickr and Zooomr allow users to share their photos with family, friends, and the online community at large. An important facet of these services is that users manually annotate their photos using so called tags, which describe the contents of the photo or provide addit ..."
Abstract
-
Cited by 224 (1 self)
- Add to MetaCart
(Show Context)
Online photo services such as Flickr and Zooomr allow users to share their photos with family, friends, and the online community at large. An important facet of these services is that users manually annotate their photos using so called tags, which describe the contents of the photo or provide additional contextual and semantical information. In this paper we investigate how we can assist users in the tagging phase. The contribution of our research is twofold. We analyse a representative snapshot of Flickr and present the results by means of a tag characterisation focussing on how users tags photos and what information is contained in the tagging. Based on this analysis, we present and evaluate tag recommendation strategies to support the user in the photo annotation task by recommending a set of tags that can be added to the photo. The results of the empirical evaluation show that we can effectively recommend relevant tags for a variety of photos with different levels of exhaustiveness of original tagging.
Supervised learning of semantic classes for image annotation and retrieval
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2007
"... Abstract—A probabilistic formulation for semantic image annotation and retrieval is proposed. Annotation and retrieval are posed as classification problems where each class is defined as the group of database images labeled with a common semantic label. It is shown that, by establishing this one-to- ..."
Abstract
-
Cited by 223 (18 self)
- Add to MetaCart
(Show Context)
Abstract—A probabilistic formulation for semantic image annotation and retrieval is proposed. Annotation and retrieval are posed as classification problems where each class is defined as the group of database images labeled with a common semantic label. It is shown that, by establishing this one-to-one correspondence between semantic labels and semantic classes, a minimum probability of error annotation and retrieval are feasible with algorithms that are 1) conceptually simple, 2) computationally efficient, and 3) do not require prior semantic segmentation of training images. In particular, images are represented as bags of localized feature vectors, a mixture density estimated for each image, and the mixtures associated with all images annotated with a common semantic label pooled into a density estimate for the corresponding semantic class. This pooling is justified by a multiple instance learning argument and performed efficiently with a hierarchical extension of expectation-maximization. The benefits of the supervised formulation over the more complex, and currently popular, joint modeling of semantic label and visual feature distributions are illustrated through theoretical arguments and extensive experiments. The supervised formulation is shown to achieve higher accuracy than various previously published methods at a fraction of their computational cost. Finally, the proposed method is shown to be fairly robust to parameter tuning. Index Terms—Content-based image retrieval, semantic image annotation and retrieval, weakly supervised learning, multiple instance learning, Gaussian mixtures, expectation-maximization, image segmentation, object recognition. 1
A survey of content-based image retrieval with high-level semantics
, 2007
"... In order to improve the retrieval accuracy of content-based image retrieval systems, research focus has been shifted from designing sophisticated low-level feature extraction algorithms to reducing the ‘semantic gap ’ between the visual features and the richness of human semantics. This paper attemp ..."
Abstract
-
Cited by 150 (5 self)
- Add to MetaCart
In order to improve the retrieval accuracy of content-based image retrieval systems, research focus has been shifted from designing sophisticated low-level feature extraction algorithms to reducing the ‘semantic gap ’ between the visual features and the richness of human semantics. This paper attempts to provide a comprehensive survey of the recent technical achievements in high-level semantic-based image retrieval. Major recent publications are included in this survey covering different aspects of the research in this area, including low-level image feature extraction, similarity measurement, and deriving high-level semantic features. We identify five major categories of the state-of-the-art techniques in narrowing down the ‘semantic gap’: (1) using object ontology to define high-level concepts; (2) using machine learning methods to associate low-level features with query concepts; (3) using relevance feedback to learn users’ intention; (4) generating semantic template to support high-level image retrieval; (5) fusing the evidences from HTML text and the visual content of images for WWW image retrieval. In addition, some other related issues such as image test bed and retrieval performance evaluation are also discussed. Finally, based on existing technology and the demand from real-world applications, a few promising future research directions are suggested.
Studying aesthetics in photographic images using a computational approach
- In Proc. ECCV
, 2006
"... Abstract. Aesthetics, in the world of art and photography, refers to the principles of the nature and appreciation of beauty. Judging beauty and other aesthetic qualities of photographs is a highly subjective task. Hence, there is no unanimously agreed standard for measuring aesthetic value. In spit ..."
Abstract
-
Cited by 131 (11 self)
- Add to MetaCart
(Show Context)
Abstract. Aesthetics, in the world of art and photography, refers to the principles of the nature and appreciation of beauty. Judging beauty and other aesthetic qualities of photographs is a highly subjective task. Hence, there is no unanimously agreed standard for measuring aesthetic value. In spite of the lack of firm rules, certain features in photographic images are believed, by many, to please humans more than certain others. In this paper, we treat the challenge of automatically inferring aesthetic quality of pictures using their visual content as a machine learning problem, with a peer-rated online photo sharing Website as data source. We extract certain visual features based on the intuition that they can discriminate between aesthetically pleasing and displeasing images. Automated classifiers are built using support vector machines and classification trees. Linear regression on polynomial terms of the features is also applied to infer numerical aesthetics ratings. The work attempts to explore the relationship between emotions which pictures arouse in people, and their low-level content. Potential applications include content-based image retrieval and digital photography. 1