Results 1 - 10 of 76
Affective Computing, 1995
Cited by 1012 (37 self)
Recent neurological studies indicate that the role of emotion in human cognition is essential; emotions are not a luxury. Instead, emotions play a critical role in rational decision-making, in perception, in human interaction, and in human intelligence. These facts, combined with abilities computers are acquiring in expressing and recognizing affect, open new areas for research. This paper defines key issues in "affective computing," computing that relates to, arises from, or deliberately influences emotions. New models are suggested for computer recognition of human emotion, and both theoretical and practical applications are described for learning, human-computer interaction, perceptual information retrieval, creative arts and entertainment, human health, and machine intelligence. Significant potential advances in emotion and cognition theory hinge on the development of affective computing, especially in the form of wearable computers. This paper establishes challenges and future directions for this emerging field.
Image retrieval: Current techniques, promising directions and open issues - Journal of Visual Communication and Image Representation, 1999
Cited by 290 (7 self)
This paper provides a comprehensive survey of the technical achievements in the research area of image retrieval, especially content-based image retrieval, an area that has been so active and prosperous in the past few years. The survey includes 100+ papers covering the research aspects of image feature representation and extraction, multidimensional indexing, and system design, three of the fundamental bases of content-based image retrieval. Furthermore, based on the state-of-the-art technology available now and the demand from real-world applications, open research issues are identified and future promising research directions are suggested. © 1999 Academic Press
The Bayesian image retrieval system, PicHunter: Theory, implementation, and psychophysical experiments - IEEE Transactions on Image Processing, 2000
Cited by 150 (2 self)
This paper presents the theory, design principles, implementation, and performance results of PicHunter, a prototype content-based image retrieval (CBIR) system that has been developed over the past three years. In addition, this document presents the rationale, design, and results of psychophysical experiments that were conducted to address some key issues that arose during PicHunter’s development. The PicHunter project makes four primary contributions to research on content-based image retrieval. First, PicHunter represents a simple instance of a general Bayesian framework we describe for using relevance feedback to direct a search. With an explicit model of what users would do, given what target image they want, PicHunter uses Bayes’s rule to predict what is the target they want, given their actions. This is done via a probability distribution over possible image targets, rather than by refining a query. Second, an entropy-minimizing display algorithm is described that attempts to maximize the information obtained from a user at each iteration of the search. Third, PicHunter makes use of hidden annotation rather than a possibly inaccurate/inconsistent annotation structure that the user must learn and make queries in. Finally, PicHunter introduces two experimental paradigms to quantitatively evaluate the performance of the system, and psychophysical experiments are presented that support the theoretical claims.
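The Bayesian step described in this abstract can be illustrated with a minimal sketch. It is not PicHunter's actual user model: the softmax choice rule, the `sharpness` parameter, the `score` function, and the one-dimensional toy features are all assumptions made for illustration. What it does show is the idea of keeping a probability distribution over candidate targets and updating it with Bayes's rule after each user action, rather than refining a query.

```python
import math


def update_posterior(prior, displayed, chosen, score, sharpness=1.0):
    """One round of Bayesian relevance feedback over candidate targets.

    prior     -- dict mapping image id to P(image is the target)
    displayed -- ids of the images shown to the user this round
    chosen    -- id of the displayed image the user picked
    score     -- hypothetical similarity function score(image, target);
                 higher means more similar
    """
    posterior = {}
    for target, p_target in prior.items():
        # Assumed user model: the user picks a displayed image with
        # probability proportional to exp(sharpness * similarity to target).
        weights = {img: math.exp(sharpness * score(img, target)) for img in displayed}
        likelihood = weights[chosen] / sum(weights.values())
        # Bayes's rule: posterior is proportional to prior times likelihood.
        posterior[target] = p_target * likelihood
    total = sum(posterior.values())
    return {target: p / total for target, p in posterior.items()}


# Toy database: each image is described by a single made-up feature value.
features = {"a": 0.0, "b": 1.0, "c": 5.0}


def score(img, target):
    return -abs(features[img] - features[target])


prior = {img: 1.0 / len(features) for img in features}
posterior = update_posterior(prior, displayed=["a", "c"], chosen="a", score=score)
```

In this toy run the user prefers "a" over "c", so probability mass moves toward the targets that resemble "a" and away from "c". Repeating the update over several rounds concentrates the distribution on the likely target.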
Interactive learning using a "society of models" - Submitted to special issue of Pattern Recognition on Image Database: Classification and Retrieval
Cited by 132 (10 self)
Digital library access is driven by features, but features are often context-dependent and noisy, and their relevance for a query is not always obvious. This paper describes an approach for utilizing many data-dependent, user-dependent, and task-dependent features in a semi-automated tool. Instead of requiring universal similarity measures or manual selection of relevant features, the approach provides a learning algorithm for selecting and combining groupings of the data, where groupings can be induced by highly specialized and context-dependent features. The selection process is guided by a rich example-based interaction with the user. The inherent combinatorics
Finding Naked People, 1996
Cited by 122 (7 self)
This paper demonstrates a content-based retrieval strategy that can tell whether there are naked people present in an image. No manual intervention is required. The approach combines color and texture properties to obtain an effective mask for skin regions. The skin mask is shown to be effective for a wide range of shades and colors of skin. These skin regions are then fed to a specialized grouper, which attempts to group a human figure using geometric constraints on human structure. This approach introduces a new view of object recognition, where an object model is an organized collection of grouping hints obtained from a combination of constraints on geometric properties such as the structure of individual parts, and the relationships between parts, and constraints on color and texture. The system is demonstrated to have 60% precision and 52% recall on a test set of 138 uncontrolled images of naked people, mostly obtained from the internet, and 1401 assorted control images, drawn f...
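As a back-of-the-envelope reading of the figures quoted above, the short calculation below derives the approximate counts they imply. It assumes the standard definitions (precision = true positives / images flagged, recall = true positives / positive images) and assumes both figures are measured over the combined test set. The function name is hypothetical and the derived counts are estimates, not numbers reported in the abstract.

```python
def implied_counts(precision, recall, n_positive):
    """Derive approximate counts from precision, recall, and the number of positives."""
    true_pos = recall * n_positive   # positives the system retrieved
    flagged = true_pos / precision   # everything the system marked as positive
    false_pos = flagged - true_pos   # control images marked by mistake
    return true_pos, flagged, false_pos


true_pos, flagged, false_pos = implied_counts(precision=0.60, recall=0.52, n_positive=138)
false_alarm_rate = false_pos / 1401  # fraction of the 1401 control images flagged
```

Under these assumptions, roughly 72 of the 138 target images are found, roughly 120 images are flagged overall, and about 48 of the 1401 control images (a little over 3%) are false alarms.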
Image classification for content-based indexing - IEEE Transactions on Image Processing, 2001
Cited by 118 (2 self)
Grouping images into (semantically) meaningful categories using low-level visual features is a challenging and important problem in content-based image retrieval. Using binary Bayesian classifiers, we attempt to capture high-level concepts from low-level image features under the constraint that the test image does belong to one of the classes. Specifically, we consider the hierarchical classification of vacation images; at the highest level, images are classified as indoor or outdoor; outdoor images are further classified as city or landscape; finally, a subset of landscape images is classified into sunset, forest, and mountain classes. We demonstrate that a small vector quantizer (whose optimal size is selected using a modified MDL criterion) can be used to model the class-conditional densities of the features, required by the Bayesian methodology. The classifiers have been designed and evaluated on a database of 6931 vacation photographs. Our system achieved a classification accuracy of 90.5% for indoor/outdoor, 95.3% for city/landscape, 96.6% for sunset/forest & mountain, and 96% for forest/mountain classification problems. We further develop a learning method to incrementally train the classifiers as additional data become available. We also show preliminary results for feature reduction using clustering techniques. Our goal is to combine multiple two-class classifiers into a single hierarchical classifier.
Index Terms: Bayesian methods, content-based retrieval, digital libraries, image content analysis, minimum description length, semantic
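The hierarchy described in this abstract can be sketched as a chain of two-class decisions. This is a minimal illustration, not the authors' implementation: the four classifier arguments are hypothetical callables standing in for the trained binary Bayesian classifiers, the feature names in the usage example are invented, and the sketch assumes every landscape image falls into one of the three leaf classes, whereas the abstract applies that last split only to a subset.

```python
from typing import Callable, Dict, List

Features = Dict[str, float]
BinaryClassifier = Callable[[Features], bool]


def classify_vacation_image(
    x: Features,
    is_indoor: BinaryClassifier,
    is_city: BinaryClassifier,
    is_sunset: BinaryClassifier,
    is_forest: BinaryClassifier,
) -> List[str]:
    """Walk the indoor/outdoor -> city/landscape -> sunset/forest/mountain tree."""
    if is_indoor(x):
        return ["indoor"]
    if is_city(x):
        return ["outdoor", "city"]
    if is_sunset(x):
        return ["outdoor", "landscape", "sunset"]
    # Final two-class decision among the remaining landscape classes.
    leaf = "forest" if is_forest(x) else "mountain"
    return ["outdoor", "landscape", leaf]


# Toy stand-ins for the trained classifiers (thresholds are made up).
label_path = classify_vacation_image(
    {"edge_density": 0.2, "warm_hue": 0.1, "green": 0.8, "ceiling": 0.0},
    is_indoor=lambda f: f["ceiling"] > 0.5,
    is_city=lambda f: f["edge_density"] > 0.6,
    is_sunset=lambda f: f["warm_hue"] > 0.5,
    is_forest=lambda f: f["green"] > 0.5,
)
```

Each classifier is consulted only if the decisions above it route the image to that branch, so an indoor image never reaches the landscape classifiers.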
ImageRover: A Content-Based Image Browser for the World Wide Web - In Proc. IEEE Workshop on Content-based Access of Image and Video Libraries, 1997
Cited by 117 (3 self)
ImageRover is a search-by-image-content navigation tool for the World Wide Web. To gather images expediently, the image collection subsystem utilizes a distributed fleet of WWW robots running on different computers. The image robots gather information about the images they find, computing the appropriate image decompositions and indices, and store this extracted information in vector form for searches based on image content. At search time, users can iteratively guide the search through the selection of relevant examples. Search performance is made efficient through the use of an approximate, optimized k-d tree algorithm. The system employs a novel relevance feedback algorithm that selects the distance metrics appropriate for a particular query.
Keywords: image databases, query by image content, content-based retrieval, World Wide Web search engines.
1 Introduction
For a while now there have been software "robots" roving the World Wide Web (WWW) collecting index information about th...
Content-based representation and retrieval of visual media: A state-of-the-art review - Multimedia Tools and Applications, 1996
Cited by 117 (2 self)
This paper reviews a number of recently available techniques in content analysis of visual media and their application to the indexing, retrieval, abstracting, relevance assessment, interactive perception, annotation and re-use of visual documents.
1. Background
A few years ago, the problems of representation and retrieval of visual media were confined to specialized image databases (geographical, medical, pilot experiments in computerized slide libraries), in the professional applications of the audiovisual industries (production, broadcasting and archives), and in computerized training or education. The present development of multimedia technology and information highways has put content processing of visual media at the core of key application domains: digital and interactive video, large distributed digital libraries, multimedia publishing. Though the most important investments have been targeted at the information infrastructure (networks, servers, coding and compression, delivery models, multimedia systems architecture), a growing number of researchers have realized that content processing will be a key asset in putting together successful applications. The need for content processing techniques has been made evident from a variety of angles, ranging from achieving better quality in compression, allowing user choice of programs in video-on-demand, achieving better productivity in video production, providing access to large still image databases or integrating still images and video in multimedia publishing and cooperative work. Content-based retrieval of visual media and representation of visual documents in human-computer interfaces are based on the availability of content representation data (time-structure for
Object categorization by learned universal visual dictionary - In ICCV, 2005
Cited by 114 (8 self)
Figure 1: Exemplar snapshots of our interactive object categorization demo application. A user selects (sloppily) a region of interest and our algorithm associates an object class label with it. Despite large differences in pose, size, illumination and visual appearance the correct class label (e.g. cow, building, car...) is automatically associated with each selected object instance. Some of these test images were downloaded from the web and none were part of the training set. A video of the interactive demo may be found at the above web site.
This paper presents a new algorithm for the automatic recognition of object classes from images (categorization). Compact and yet discriminative appearance-based object class models are automatically learned from a set of training images. The method is simple and extremely fast, making it suitable for many applications such as semantic image retrieval, web search, and interactive image editing. It classifies a region according to the proportions of different visual words (clusters in feature space). The specific visual words and the typical proportions in each object are learned from a segmented training set. The main contribution of this paper is twofold: i) an optimally compact visual dictionary is learned by pair-wise merging of visual words from an initially large dictionary. The final visual words are described by GMMs. ii) A novel statistical measure of discrimination is proposed which is optimized by each merge operation. High classification accuracy is demonstrated for nine object classes on photographs of real objects viewed under general lighting conditions, poses and viewpoints. The set of test images used for validation comprises: i) photographs acquired by us, ii) images from the web and iii) images from the recently released Pascal dataset. The proposed algorithm performs well on both texture-rich objects (e.g. grass, sky, trees) and structure-rich ones (e.g. cars, bikes, planes).
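The idea of classifying a region by its proportions of visual words can be sketched in a few lines. This is a deliberately simplified stand-in, not the paper's method: it assigns each feature to the nearest word centre by Euclidean distance and picks the class whose typical histogram is closest, whereas the abstract describes GMM-based words and a learned discrimination measure. The word centres, class histograms, and feature vectors below are all made-up illustration values.

```python
import math
from collections import Counter


def nearest_word(feature, words):
    """Index of the visual word (cluster centre) closest to a feature vector."""
    return min(range(len(words)), key=lambda k: math.dist(feature, words[k]))


def word_histogram(features, words):
    """Proportion of a region's features assigned to each visual word."""
    counts = Counter(nearest_word(f, words) for f in features)
    return [counts[k] / len(features) for k in range(len(words))]


def classify_region(features, words, class_histograms):
    """Label a region with the class whose typical word proportions are closest."""
    h = word_histogram(features, words)
    return min(class_histograms, key=lambda c: math.dist(h, class_histograms[c]))


# Hypothetical two-word dictionary and per-class typical proportions.
words = [(0.0, 0.0), (10.0, 10.0)]
class_histograms = {"grass": [0.9, 0.1], "car": [0.2, 0.8]}
region = [(0.5, 0.2), (9.0, 11.0), (10.0, 9.0), (11.0, 10.0)]
label = classify_region(region, words, class_histograms)
```

Here one of the four features falls on the first word and three on the second, giving proportions of 0.25 and 0.75, which lie much nearer the "car" profile than the "grass" one.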
Temporal Texture Modeling - In IEEE International Conference on Image Processing, 1996
Cited by 93 (1 self)
Temporal textures are textures with motion. Examples include wavy water, rising steam and fire. We model image sequences of temporal textures using the spatio-temporal autoregressive model (STAR). This model expresses each pixel as a linear combination of surrounding pixels lagged both in space and in time. The model provides a base for both recognition and synthesis. We show how the least squares method can accurately estimate model parameters for large, causal neighborhoods with more than 1000 parameters. Synthesis results show that the model can adequately capture the spatial and temporal characteristics of many temporal textures. A 95% recognition rate is achieved for a 135-element database with 15 texture classes.
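The linear model described in this abstract can be written out as follows. The notation is an assumption chosen for illustration, since the abstract gives no symbols: $s(x, y, t)$ is the pixel intensity, $(\Delta x_i, \Delta y_i, \Delta t_i)$ are the offsets of the $p$ neighbors in a causal neighborhood, $\phi_i$ are the model parameters, and $a(x, y, t)$ is a noise term.

```latex
% STAR: each pixel is a linear combination of neighbors lagged in space and time
s(x, y, t) = \sum_{i=1}^{p} \phi_i \, s(x + \Delta x_i,\; y + \Delta y_i,\; t + \Delta t_i) + a(x, y, t)

% Least-squares estimate of the parameters over all pixels with a full neighborhood
\hat{\phi} = \arg\min_{\phi} \sum_{x, y, t}
  \Bigl( s(x, y, t) - \sum_{i=1}^{p} \phi_i \, s(x + \Delta x_i,\; y + \Delta y_i,\; t + \Delta t_i) \Bigr)^{2}
```

Causality here means that every neighbor precedes the current pixel in the chosen ordering, so synthesis can generate a sequence pixel by pixel from values that already exist.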

