Outdoors augmented reality on mobile phone using loxel-based visual feature organization
- In Proceeding of ACM international conference on Multimedia Information Retrieval
, 2008
Cited by 37 (19 self)
We have built an outdoors augmented reality system for mobile phones that matches camera-phone images against a large database of location-tagged images using a robust image retrieval algorithm. We avoid network latency by implementing the algorithm on the phone and deliver excellent performance by adapting a state-of-the-art image retrieval algorithm based on robust local descriptors. Matching is performed against a database of highly relevant features, which is continuously updated to reflect changes in the environment. We achieve fast updates and scalability by pruning irrelevant features based on proximity to the user. By compressing and incrementally updating the features stored on the phone, we make the system amenable to low-bandwidth wireless connections. We demonstrate system robustness on a dataset of location-tagged images and show a smart-phone implementation that achieves a high image matching rate while operating in near real-time.
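The proximity-based pruning mentioned in this abstract can be illustrated with a minimal sketch. It assumes features carry planar coordinates and are bucketed into square location cells; the function names, the grid layout, and the `radius_cells` parameter are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def cell_of(x, y, cell_size):
    # Map a planar position to the integer grid cell that contains it.
    return (int(x // cell_size), int(y // cell_size))

def build_index(features, cell_size):
    # features: iterable of (feature_id, x, y) tuples (assumed layout).
    index = defaultdict(list)
    for feat_id, x, y in features:
        index[cell_of(x, y, cell_size)].append(feat_id)
    return index

def nearby_features(index, user_x, user_y, cell_size, radius_cells=1):
    # Keep only features stored in cells around the user's cell;
    # everything farther away is pruned from the matching set.
    cx, cy = cell_of(user_x, user_y, cell_size)
    kept = []
    for dx in range(-radius_cells, radius_cells + 1):
        for dy in range(-radius_cells, radius_cells + 1):
            kept.extend(index.get((cx + dx, cy + dy), []))
    return kept
```

Under these assumptions, only the cells entering or leaving the neighbourhood need to be transferred when the user moves, which is one way incremental updates could be kept small.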
Features for Image Retrieval: An Experimental Comparison
, 2007
Cited by 23 (2 self)
An experimental comparison of a large number of different image descriptors for content-based image retrieval is presented. Many of the papers describing new techniques and descriptors for content-based image retrieval describe their newly proposed methods as most appropriate without giving an in-depth comparison with all methods that were proposed earlier. In this paper, we first give an overview of a large variety of features for content-based image retrieval and compare them quantitatively on four different tasks: stock photo retrieval, personal photo collection retrieval, building retrieval, and medical image retrieval. For the experiments, five different, publicly available image databases are used and the retrieval performance of the features is analysed in detail. This allows for a direct comparison of all features considered in this work and furthermore will allow a comparison of newly proposed features to these in the future. Additionally, the correlation of the features is analysed, which opens the way for a simple and intuitive method to find an initial set of suitable features for a new task. The article concludes with recommendations on which features perform well for which type of data. Interestingly, the often used, but very simple, colour histogram performs well in the comparison and thus can be recommended as a simple baseline for many applications.
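As a rough illustration of the colour histogram baseline named in this abstract, the sketch below quantises RGB pixels into a joint histogram and compares two images by histogram intersection. The bin count, the pixel representation (a list of 8-bit RGB tuples), and the choice of histogram intersection as the similarity are assumptions made for the example, not details from the paper.

```python
def colour_histogram(pixels, bins_per_channel=4):
    # pixels: iterable of (r, g, b) tuples with values in 0..255 (assumed).
    n = bins_per_channel
    hist = [0.0] * (n ** 3)
    for r, g, b in pixels:
        # Quantise each channel to a bin in 0..n-1, then flatten to one index.
        rb, gb, bb = r * n // 256, g * n // 256, b * n // 256
        hist[(rb * n + gb) * n + bb] += 1.0
    total = sum(hist)
    # Normalise so that images of different sizes are comparable.
    return [h / total for h in hist] if total else hist

def histogram_intersection(h1, h2):
    # Similarity in [0, 1] for normalised histograms; 1 means identical.
    return sum(min(a, b) for a, b in zip(h1, h2))
```

A retrieval baseline along these lines would rank database images by decreasing intersection with the query histogram.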
FaceTracer: A Search Engine for Large Collections of Images with Faces
Cited by 15 (3 self)
We have created the first image search engine based entirely on faces. Using simple text queries such as “smiling men with blond hair and mustaches,” users can search through over 3.1 million faces which have been automatically labeled on the basis of several facial attributes. Faces in our database have been extracted and aligned from images downloaded from the internet using a commercial face detector, and the number of images and attributes continues to grow daily. Our classification approach uses a novel combination of Support Vector Machines and Adaboost which exploits the strong structure of faces to select and train on the optimal set of features for each attribute. We show state-of-the-art classification results compared to previous works, and demonstrate the power of our architecture through a functional, large-scale face search engine. Our framework is fully automatic, easy to scale, and computes all labels off-line, leading to fast on-line search performance. In addition, we describe how our system can be used for a number of applications, including law enforcement, social networks, and personal photo management. Our search engine will soon be made publicly available.
Automatic Image Annotation Using Auxiliary Text Information
- in ACL HLT
, 2008
Cited by 7 (2 self)
The availability of databases of images labeled with keywords is necessary for developing and evaluating image annotation models. Dataset collection is, however, a costly and time-consuming task. In this paper we exploit the vast resource of images available on the web. We create a database of pictures that are naturally embedded into news articles and propose to use their captions as a proxy for annotation
Automatic categorization of figures in scientific documents
- In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
, 2006
Cited by 6 (4 self)
Figures are very important non-textual information contained in scientific documents. Current digital libraries do not provide users with tools to retrieve documents based on the information available within the figures. We propose an architecture for retrieving documents by integrating figures and other information. The initial step in enabling integrated document search is to categorize figures into a set of pre-defined types. We propose several categories of figures based on their functionalities in scholarly articles. We have developed a machine-learning-based approach for automatic categorization of figures. Both global features, such as texture, and part features, such as lines, are utilized in the architecture for discriminating among figure categories. The proposed approach has been evaluated on a testbed document set collected from the CiteSeer scientific literature digital library. Experimental evaluation has demonstrated that our algorithms can produce acceptable results for real-world use. Our tools will be integrated into a scientific-document digital library.
Automatic Extraction of Data Points and Text Blocks from 2-dimensional plots in digital documents
- ASSOCIATION FOR THE ADVANCEMENT OF ARTIFICIAL INTELLIGENCE
, 2008
Cited by 5 (3 self)
Two-dimensional (2-D) plots in digital documents on the web are an important source of information that is largely under-utilized. In this paper, we outline how data and text can be extracted automatically from these 2-D plots, thus eliminating a time-consuming manual process. Our information extraction algorithm identifies the axes of the figures, extracts text blocks such as axis labels and legends, and identifies data points in the figure. It also extracts the units appearing in the axis labels and segments the legends to identify the different lines in the legend, the different symbols, and their associated text explanations. Our algorithm also performs the challenging task of separating out overlapping text and data points effectively. Our experiments indicate that these techniques are computationally efficient and provide acceptable accuracy.
Describable Visual Attributes for Face Verification and Image Search
- IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
Cited by 5 (3 self)
We introduce the use of describable visual attributes for face verification and image search. Describable visual attributes are labels that can be given to an image to describe its appearance. This paper focuses on images of faces and the attributes used to describe them, although the concepts also apply to other domains. Examples of face attributes include gender, age, jaw shape, nose size, etc. The advantages of an attribute-based representation for vision tasks are manifold: they can be composed to create descriptions at various levels of specificity; they are generalizable, as they can be learned once and then applied to recognize new objects or categories without any further training; and they are efficient, possibly requiring exponentially fewer attributes (and training data) than explicitly naming each category. We show how one can create and label large datasets of real-world images to train classifiers which measure the presence, absence, or degree to which an attribute is expressed in images. These classifiers can then automatically label new images. We demonstrate the current effectiveness, and explore the future potential, of using attributes for face verification and image search via human and computational experiments. Finally, we introduce two new face datasets, named FaceTracer and PubFig, with labeled attributes and identities, respectively.
Application Potential of Multimedia Information Retrieval
, 2007
Cited by 4 (0 self)
This paper will first briefly survey the existing impact of MIR in applications. It will then analyze the current trends of MIR research which can have an influence on future applications. It will then detail the future possibilities and bottlenecks in applying the MIR research results in the main target application areas, such as consumer (e.g. personal video recorders, web information retrieval), public safety (e.g. automated smart surveillance systems) and professional world (e.g. automated meeting capture and summarization). In particular, recommendations will be made to the research community regarding the challenges that need to be met to make the knowledge transfer towards the applications more efficient and effective. It will also attempt to study the trends in the applications which can inform the MIR community on directing intellectual resources towards MIR problems which can have a maximal real-world impact.
Names and faces
Cited by 3 (2 self)
We show that a large and realistic face dataset can be built from news photographs and their associated captions. Our dataset consists of 44,773 face images, obtained by applying a face finder to approximately half a million captioned news images. This dataset is more realistic than usual face recognition datasets, because it contains faces captured “in the wild” in a variety of configurations with respect to the camera, taking a variety of expressions, and under illumination of widely varying color. Faces are extracted from the images and names from the associated caption. Our system uses a clustering procedure to find the correspondence between faces and associated names in news picture-caption pairs. The context in which a name appears in a caption provides powerful cues as to whether it is depicted in the associated image. By incorporating simple natural language techniques, we are able to improve our name assignment significantly. Once the procedure is complete, we have an accurately labeled set of faces, an appearance model for each individual depicted, and a natural language model that can produce accurate results on captions in isolation.
Improving automatic image annotation based on word co-occurrence
- In Proceedings of the 5th International Adaptive Multimedia Retrieval workshop
, 2007
Cited by 2 (1 self)
The accuracy of current automatic image labeling methods falls short of the requirements of annotation-based image retrieval systems. The performance of most of these labeling methods is poor if we just consider the most relevant label for a given region. However, if we look within the set of the top-k candidate labels for a given region, the accuracy of most of these systems is improved. In this paper we take advantage of this fact and propose a method (NBIC) based on word co-occurrence that uses the naïve Bayes formulation for improving automatic image annotation methods. Our approach utilizes co-occurrence information of the candidate labels for a region with those candidate labels for the other surrounding regions, within the same image, for selecting the correct label. Co-occurrence information is obtained from an external collection of manually annotated images: the IAPR-TC12 benchmark. Experimental results using a k-nearest neighbors method as our annotation system give evidence of significant improvements after applying the NBIC method. NBIC is efficient since the co-occurrence information was obtained off-line. Furthermore, our method can be applied to any other annotation system that ranks labels by their relevance.
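The general idea of re-ranking candidate labels by co-occurrence can be sketched as follows. This is a simplified illustration and not the NBIC formulation itself: it assumes a uniform prior over each region's candidates, add-alpha smoothing, and hypothetical function names, and it scores a candidate by how often it co-occurs in the annotated collection with the candidate labels of the other regions.

```python
import math
from collections import Counter
from itertools import combinations

def cooccurrence_counts(annotated_images):
    # annotated_images: list of label lists, one per manually annotated image.
    label_counts, pair_counts = Counter(), Counter()
    for labels in annotated_images:
        unique = set(labels)
        label_counts.update(unique)
        for a, b in combinations(sorted(unique), 2):
            pair_counts[(a, b)] += 1
    return label_counts, pair_counts

def rerank(candidates_per_region, label_counts, pair_counts, alpha=1.0):
    # For each region, pick the candidate whose smoothed co-occurrence
    # likelihood with the other regions' candidates is highest.
    vocab = max(len(label_counts), 1)
    chosen = []
    for i, candidates in enumerate(candidates_per_region):
        context = [l for j, other in enumerate(candidates_per_region)
                   if j != i for l in other]

        def score(c):
            # Sum of log P(l | c) under the naive independence assumption.
            return sum(
                math.log((pair_counts[tuple(sorted((c, l)))] + alpha)
                         / (label_counts[c] + alpha * vocab))
                for l in context)

        chosen.append(max(candidates, key=score))
    return chosen
```

Because the counts are computed once from the annotated collection, the per-image work in this sketch is limited to table lookups, which is consistent with the off-line efficiency argument in the abstract.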

