R. W. Picard. A Society of Models for Video and Image Libraries. IBM Systems Journal, MIT Media Lab Special Issue, Vol. 35, Nos. 3 & 4, pp. 292-312, 1996.
....may be able to segment properly the dark part of the motion picture, but such a feature loses performance in the segmentation of the light parts of the movie. The conclusion that one feature cannot be used to describe a general domain was also drawn in the field of image retrieval by Picard [10]. She suggests the use of a society of models rather than one model, where a model is a feature in our terminology. Following this research direction for LSU segmentation in videos, we propose to select a dedicated feature from the feature set for each part of a video. Each part is then ....
R. W. Picard. A society of models for video and image libraries. IBM Systems Journal, 35(3+4):292--312, 1996.
....images of the objects; these images are compared with observed views at recognition time, e.g., eigen images techniques. A similar problem, although in a different context, is encountered in image indexing, where the main problem is to store and organize images to facilitate their retrieval [1][26]. The emphasis in this case is on the kind of features used and the type of requests that can be made by the user. As an example of a problem in which such an approach can be used, we consider here the problem of recognizing landmarks in sequences of images taken from a moving vehicle. Even with ....
R.W. Picard. A society of models for video and image libraries. IBM Systems Journal, 35(3-4):292-312. 1996.
....available within the document itself. Using text specificity such as geometrical shape and contrast, captions or credits may be extracted and processed through OCR for completing the document indexing (see e.g. [2]). Finally, combining models for understanding may permit high level interpretation [32]. 4 Temporal analysis The temporal dimension of a video document contains information that is specific to this type of document. The temporal analysis of that document typically requires its partitioning into basic elements. It is now recognised that this partitioning can operate at four ....
R. W. Picard. A society of models for video and image libraries. IBM Systems Journal (MIT Media Lab Special Issue), 35(3/4):292-312, 1996.
....retrieval (CBIR) has become an active research area. A variety of techniques have been developed. In particular, content based image retrieval using low-level features such as color [41, 36, 26], texture [21, 35, 34, 40, 20], shape [42, 22, 14, 23, 24, 15, 10, 17, 46, 47, 33, 43, 32], and others [30, 38, 2, 16, 7] extracted from the images has been well studied. Various image querying systems including QBIC [11], VisualSeek [36], PhotoBook [27], and Virage [5] have been built based on the low level features for general or specific image retrieval tasks. However, retrieving images based on low level ....
Rosalind Picard. A society of models for video and image libraries. Technical Report 360, MIT Media Laboratory Perceptual Computing, 1996.
....a promising technology to address this need, and a variety of CBIR techniques have been developed. In particular, content based image retrieval using low level features such as color [43, 38, 27], texture [22, 37, 36, 42, 21], shape [44, 23, 13, 24, 25, 14, 10, 16, 48, 49, 35, 45, 34], and others [29, 39, 2, 15, 7] has been well studied. Various image querying systems, including QBIC [11], VisualSeek [38], PhotoBook [28], Netra [19], and Virage [5] have been built, using low level features for general or specific image retrieval tasks. However, retrieving images via low level features has proven to be ....
Rosalind Picard. A society of models for video and image libraries. Technical Report 360, MIT Media Laboratory Perceptual Computing, 1996.
....have now reached their upper limits, the new technologies related to archiving, retrieving, and editing images have to integrate high level techniques of computer vision. Consider for instance the problem of image retrieval in a large database, a central problem in the field of computer vision [18, 20]. The user selects an image and asks the computer to find more images that are in some sense similar to the probe image. This is a very hard problem that has multiple (although partial) solutions, depending on the type and the content of the image. A matching is usually performed to align the ....
R.W. Picard. A society of models for video and image libraries. Technical Report TR-360, MIT Media Lab, 1996.
....the flavour of the early mutual information work [28] where the error metric for image registration is based on deviation from causality of the scattergram of transformed intensities. 1.2. Time series analysis Computer vision has recently seen increased use of the tools of time series analysis [2, 3, 13, 17, 18, 24], which model the temporal evolution of physical systems. A time series is a sequence of vector valued observations $\{x_t\}_{t=1}^{n}$. One task of time series analysis is to forecast the value of $x_{t+1}$, given the previously observed values $x_{1:t}$. If $x$ is produced by an ....
....is also discrete, meaning that dynamics need to be added post hoc. In contrast, the current method models the temporal behaviour as samples of an underlying continuous process, with longer than 1 frame of memory. Therefore the correct dynamics emerge automatically. Earlier, Szummer and Picard [24, 17] used a spatiotemporal autoregressive (STAR) model to describe textures such as water and steam. Their work finds an AR description of how pixels in the sequence evolve as a function of a window of pixels neighbouring in time and space, and is fitted directly to the raw image data. This model is ....
R. Picard. A society of models for video and image libraries. Technical Report 360, MIT Media Lab, 1996.
....the flavour of the early mutual information work [28] where the error metric for image registration is based on deviation from causality of the scattergram of transformed intensities. 1.2. Time series analysis Computer vision has recently seen increased use of the tools of time series analysis [2, 3, 13, 17, 18, 24], which model the temporal evolution of physical systems. A time series is a sequence of vector valued observations $\{x_t\}_{t=1}^{n}$. One task of time series analysis is to forecast the value of $x_{t+1}$, given the previously observed values $x_{1:t}$. If $x$ is produced by an autoregressive process, ....
....is also discrete, meaning that dynamics need to be added post hoc. In contrast, the current method models the temporal behaviour as samples of an underlying continuous process, with longer than 1 frame of memory. Therefore the correct dynamics emerge automatically. Earlier, Szummer and Picard [24, 17] used a spatiotemporal autoregressive (STAR) model to describe textures such as water and steam. Their work finds an AR description of how pixels in the sequence evolve as a function of a window of pixels neighbouring in time and space, and is fitted directly to the raw image data. This model is ....
R. Picard. A society of models for video and image libraries. Technical Report 360, MIT Media Lab, 1996.
....Moreover, MARS [5] requires the user to provide preference weights of the relevant images, which sometimes is difficult for the user to give a clear choice. An example of using both the positive and negative examples, which are chosen by the user, for image retrieval can be found in FourEyes [8]. The system looks at all the local models and determines which model or combination of models best covers the positive examples, while satisfying the constraints implied by the negative examples. In this paper, we propose to apply Support Vector Machine to two classes (positive and negative ....
R. Picard, "A Society of Models for Video and Image Libraries", IBM Systems Journal, vol. 35, no. 3&4, pp. 292-312, 1996.
....known as query by image content for images and query by audio content (QBAC) for sound, is the process of accessing a multimedia database by automatically analyzing and categorizing the digital content of the entries. There have been many systems built for this purpose in the visual domain [30][75], with varying degrees of success, but relatively few attempts in the audio domain; see [123] for one suggested framework. There are many compelling applications for QBAC systems. There have been several companies started recently to provide Web based intelligent agent systems, which try to ....
R. Picard, "A society of models for video and image libraries," IBM Syst. J., vol. 35, pp. 292--312, 1996.
....lose the information implied by the negative examples. It is our expectation that by utilizing both positive and negative feedback, the user's perception subjectivity can be captured more accurately than relevance feedback with positive examples only. Negative examples are also used in FourEyes [5], where the user chooses both the positive and negative examples. FourEyes looks at all of the models and determines which model or combination of models best describes the positive examples chosen by the user, while satisfying the constraints of the negative examples. In MARS [2] manually ....
R. Picard, "A Society of Models for Video and Image Libraries", TR No.360, Media Lab, MIT, 1996.
....keys [40, 34, 32], using eigenfeatures [29, 36], and image retrieval from a compressed database [33, 40]. Locational information between objects in a scene has been incorporated into an image representation using 2D strings [31] and 2D Markov Models [22]. A review on image retrieval can be found in [24, 30]. 3 Object Process Methodology and Diagrams Any system has two major aspects: structure and behavior. Structure pertains to relationships among things (objects or processes) in the system that hold in the long run, while behavior has to do with the dynamics of the system, i.e. the way its state ....
R.W. Picard. A society of models for video and image libraries. IBM Systems Journal, 35(3-4):292--312, 1996.
....from which shape, texture, and face features are extracted respectively. Users can then query based on corresponding features in each of the three sub books. In its more recent version of Photobook, FourEyes, Picard et al. proposed to include the human in the image annotation and retrieval loop [111, 93, 110, 112, 109, 108, 113, 78]. The motivation for this was the observation that there is no single feature which can best model images from each and every domain. Furthermore, human perception is subjective. They proposed a society of models approach to incorporate the human factor. Experimental results show that ....
R. W. Picard. A society of models for video and image libraries. Tech. Rep., MIT Media Lab, April, 1994.
....can be answered by simply comparing text strings which can be implemented efficiently. To accomplish the same for video sequences, we must be able to compare these sequences efficiently to see if they have similar content. The Photobook system describes a set of similarity measures for images [2, 3] and techniques for searching through images. Likewise, the QBIC system [4] uses image based similarity measures for querying image sequences. However, such similarity measures are expensive for video sequences. When we are looking for video sequences of waterfalls, we are not particularly ....
Rosalind W. Picard, "A Society of Models for Video and Image Libraries," MIT Media Laboratory Perceptual Computing TR 360, to appear in IBM Systems Journal, August 1996, MIT Media Laboratory, Cambridge MA, 1996.
....encourage the empirical evaluation of algorithms in their application domains (see comments in [19] and [15] among others). This further motivates the approach presented in this paper. 1.1. Previous work The problem of selecting the right features is well known in content based retrieval [24][25]. Most automatic classification indexing approaches, however, have relied on using either domain specific features [6][31] or general features that have worked well in different domains (e.g. color histogram [29] is widely used). Additionally, in most cases, the algorithms themselves have been ....
....
[0, 1, 2, 4, 11, 12] CV: 0.43 T: 22.75 R: 116
[1, 2, 4, 7, 12, 15, 17, 37] CV: 0.43 T: 23.63 R: 118
[1, 2, 7, 12, 20, 37] CV: 0.58 T: 23.14 R: 115
[1, 2, 7, 9, 11, 12, 19, 20, 37] CV: 1.30 T: 22.36 R: 115
[1, 2, 3, 6, 12, 18, 21, 30, 37] CV: 1.29 T: 17.60 R: 112
[1, 2, 10, 11, 15, 16, 17, 19, 20, 22, 23, 24, 25, 26, 27, 28, 30, 35, 38] CV: 0.45 T: 27.03 R: 95
Top Grass
[0, 2, 7, 8, 9, 21] CV: 0.14 T: 14.86 R: 142
[2, 5, 8, 9, 10, 11, 19, 35, 38] CV: 0.43 T: 16.25 R: 145
[0, 2, 7, 8, 9, 21, 22] CV: 0.14 T: 15.06 R: 145
[1, 2, 6, 8, 11, 12, 38] CV: 1.01 T: 16.23 R: 128
[0, 2, 3, 4, 7, 9, 10, ....
[Article contains additional citation context not shown here]
R.W. Picard, "A Society of Models for Video and Image Libraries", MIT Media Laboratory Perceptual Computing Section Technical Report, No. 360, Cambridge, Massachusetts, 1996.
....to video data. In the last years there have been many different approaches to content based video indexing. A rough categorization of the approaches yields two main classes. The first class of approaches mainly addresses the problem of reusing video sequences in TV studios. In these approaches [1, 2] the video sequence is segmented into shots and a key frame is extracted for each shot. The key frame is stored in a library with a reference to the video tape containing the sequence. To get a vision based access to the scenes the user presents a query image and the retrieval system searches ....
R. W. Picard. A society of models for video and image libraries. Perceptual Computing Section Technical Report 360, MIT Media Lab, Cambridge, MA, 1996.
....have now reached their upper limits, the new technologies related to archiving, retrieving, and editing images have to integrate high level techniques of computer vision. Consider for instance the problem of image retrieval in a large database, a central problem in the field of computer vision [18, 20]. The user selects an image and asks the computer to find more images that are in some sense similar to the probe image. This is a very hard problem that has multiple (although partial) solutions, depending on the type and the content of the image. A matching is usually performed to align the ....
R.W. Picard. A society of models for video and image libraries. Technical Report TR-360, MIT Media Lab, 1996.
....from which shape, texture, and face features are extracted respectively. Users can then query based on corresponding features in each of the three sub books. In its more recent version of Photobook, FourEyes, Picard et al. proposed to include the human in the image annotation and retrieval loop [112, 94, 110, 113, 109, 111, 114, 79]. The motivation for this was the observation that there is no single feature which can best model images from each and every domain. Furthermore, human perception is subjective. They proposed a society of models approach to incorporate the human factor. Experimental results show that ....
R. W. Picard. A society of models for video and image libraries. Technical report, MIT, 1996.
....staining intensity, and occurrence of particular nuclei classes. INTRODUCTION Content based access to image and video databases is already a reality [1][3][10]. Recent research work shows that depending on the image data and the desired search mode some approaches are more adequate than others [5]. In particular, in the medical domain global properties of an image, while at times simple and efficient to compute [9], are often less important compared to structural and semantic descriptions of image content [3]. However, reasoning over structural and semantic details or objects in an image ....
R.W. Picard, A Society of Models for Video and Image Libraries, MIT Media Laboratory, Perceptual Computing Section, Technical Report No. 360, 1996
....in other scenes [4]. One of the most significant contributors is MIT's Media Laboratory, whose FourEyes system, which builds on the work surrounding the well known PhotoBook project [13], tries to overcome the difficulties of dimensional explosion in feature space by using a society of models [14]. In an initial off-line phase, a number of different filtering techniques are applied to the data before any queries are made in order to hierarchically cluster the data in as many ways as possible. Groups of these clusters are identified which best represent classes of scenes which are employed ....
....algorithm [10]. This process results in irregularly shaped regions which should represent the actual outlines of objects in the images. It is on these areas that all queries are based as opposed to either the entire image or rectangular minimum bounding boxes as employed in some approaches [11][4][14]. This is an important distinction since the feature extraction will not draw any contaminating biases from outside the required object. With a segmentation available, a feature extraction process is used to obtain information unique to each region. The components extracted are as follows: 0 Area ....
R. Picard. A society of models for video and image libraries. Technical report, MIT Media Laboratory, 1996.
....[25, 15, 14, 30] or shape [11, 16, 1, 29, 31, 17, 2]. However, a number of recent studies attempt to integrate several image attributes, since a single attribute may simply not be present or lack sufficient discriminatory power for a number of real world applications [24, 9, 26, 12]. In this framework, we have previously [21, 19, 20] modeled an image as a greylevel intensity surface (XYI representation) (see also [3]) and we have used that for matching and retrieval in a face database. On the other hand, Dorai and Jain [5, 6] use Koenderink's definition of the shape index ....
R.W. Picard. A society of models for video and image libraries. Technical Report TR-360, MIT Media Lab, 1996.
....Piction system [3] uses the captions of newspaper photographs containing human faces to help locate the faces. IBM's QBIC system [4] relies on the user specifying specific visual cues or providing an example image (e.g. a sketch) to begin a query for an image. In Picard and Minka's FourEyes system [5][6], close interaction with a human user supplements information derived from the image content. WebSeer uses the textual information surrounding an image and the image header to supplement the information derived from analyzing the image content. This additional information is used to create a ....
....as a starting point toward a more general image taxonomy. We are working on identifying a taxonomy that fits users' needs and is constructed of image classes that can be reliably identified. Some of these categories may include advertisements, geographic maps, landscapes [9], city country scenes [5][10], night scenes, sunsets, scenes with foliage, and so on. Tomasi and Guibis describe a system which classifies types of images based on image content alone [11]. We believe that we need a close interaction between the image understanding algorithms and the associated text indexing algorithms in ....
R.W. Picard (1996). A Society of Models for Video and Image Libraries. Media Laboratory Perceptual Computing Section Technical Report No. 360.
....of color feature class, 30 of texture feature class, and 20 of shape feature class. But users do not naturally sort images by similarity using this kind of language. In particular, as the number of feature classes increases, intuition about how to pick relative weightings among features is lost [32]. Also, since all the measurements of similarities are usually in the range of zero to one, the common practice is to normalize the distance measurements and convert them into similarity values. But normalization process for each feature class will be different, because of difference in the ....
Rosalind Picard. A society of models for video and image libraries. Technical Report 360, MIT Media Laboratory Perceptual Computing, 1996.
No context found.
R. W. Picard. A Society of Models for Video and Image Libraries. IBM Systems Journal, MIT Media Lab Special Issue, Vol. 35, Nos. 3 & 4, pp. 292-312, 1996.
No context found.
R. Picard. A society of models for video and image libraries. IBM Systems Journal, 35(3/4):292--312, 1996.
No context found.
R. Picard, A society of models for video and image libraries, in MIT Media Lab Perceptual Computing Section Tech. Rep., Cambridge, MA, no. 360, 1995.
No context found.
R.W. Picard, "The Society of Models for Video and Image Libraries," Technical Report 360, MIT Media Lab. Perceptual Computing Section, 1995.
No context found.
R.W. Picard, A society of models for video and image libraries, Technical Report 352, MIT Media Lab Perceptual Computing BM Res. Div., Almaden Res. Center, (February 1993).
No context found.
R. Picard. A society of models for video and image libraries. IBM Systems Journal, 35(3/4):292--312, 1996.
No context found.
Picard, R. W. 1996. "A society of models for video and image libraries", MIT media lab. technical report No.360.
No context found.
R.W. Picard, "A Society of Models for Video and Image Libraries," MIT Media Laboratory Perceptual Computing Section Technical Report, No. 360, Cambridge, Massachusetts, 1996.
No context found.
R. Picard, "A Society of Models for Video and Image Libraries," IBM Systems J., Vol. 35, No. 3, 1996, pp. 292-312.
No context found.
Rosalind Picard (1996), A Society of Models for Video and Image Libraries, MIT Media Laboratory Perceptual Computing Section, TR-360.
No context found.
Picard, R. W. 1996. "A society of models for video and image libraries", MIT media lab. technical report No.360.
No context found.
R.W. Picard. A Society of Models for Video and Image Libraries. IBM Systems Journal, 35(3-4):292-312. 1996.
No context found.
Picard, R. W. 1996. "A society of models for video and image libraries", MIT media lab. technical report No.360.
No context found.
Picard, R. W. 1996. "A society of models for video and image libraries", MIT media lab. technical report No.360.
No context found.
R. Picard. A Society of Models for Video and Image Libraries. Technical Report TR 360, MIT Media Lab, Cambridge, MA, 1995.