Color indexing
- International Journal of Computer Vision
, 1991
Cited by 1124 (23 self)
Computer vision is embracing a new research focus in which the aim is to develop visual skills for robots that allow them to interact with a dynamic, realistic environment. To achieve this aim, new kinds of vision algorithms need to be developed which run in real time and subserve the robot's goals. Two fundamental goals are determining the identity of an object with a known location, and determining the location of a known object. Color can be successfully used for both tasks. This article demonstrates that color histograms of multicolored objects provide a robust, efficient cue for indexing into a large database of models. It shows that color histograms are stable object representations in the presence of occlusion and over change in view, and that they can differentiate among a large number of objects. For solving the identification problem, it introduces a technique called Histogram Intersection, which matches model and image histograms, and a fast incremental version of Histogram Intersection, which allows real-time indexing into a large database of stored models. For solving the location problem, it introduces an algorithm called Histogram Backprojection, which performs this task efficiently in crowded scenes.
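As a rough illustration of the matching idea named in this abstract, the sketch below scores an image histogram against stored model histograms by summing bin-wise minima. It is a minimal sketch, not the article's implementation: the coarse RGB binning, the normalization by the model's pixel count, and the names `color_histogram`, `histogram_intersection`, and `best_match` are all assumptions made for this example.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Count pixels per coarse RGB bin; `pixels` yields (r, g, b) values in 0..255.

    The RGB binning is an assumption of this sketch, not taken from the abstract.
    """
    step = 256 // bins_per_channel
    return Counter((r // step, g // step, b // step) for r, g, b in pixels)


def histogram_intersection(image_hist, model_hist):
    """Sum of bin-wise minima, normalized by the model's pixel count (assumed)."""
    model_total = sum(model_hist.values())
    if model_total == 0:
        raise ValueError("model histogram is empty")
    overlap = sum(
        min(image_hist.get(bin_id, 0), count) for bin_id, count in model_hist.items()
    )
    return overlap / model_total


def best_match(image_hist, models):
    """Return the name of the stored model with the highest intersection score."""
    return max(
        models, key=lambda name: histogram_intersection(image_hist, models[name])
    )
```

With a hypothetical "mug" model of six red and four blue pixels and a scene of five red, four blue, and one green pixel, the score is (5 + 4) / 10 = 0.9. Pixels in the scene that the model lacks do not lower the score, which suggests why this kind of match can tolerate background clutter.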
Control of Selective Perception Using Bayes Nets and Decision Theory
, 1993
Cited by 87 (1 self)
A selective vision system sequentially collects evidence to support a specified hypothesis about a scene, as long as the additional evidence is worth the effort of obtaining it. Efficiency comes from processing the scene only where necessary, to the level of detail necessary, and with only the necessary operators. Knowledge representation and sequential decision-making are central issues for selective vision, which takes advantage of prior knowledge of a domain's abstract and geometrical structure and models for the expected performance and cost of visual operators. The TEA-1 selective vision system uses Bayes nets for representation and benefit-cost analysis for control of visual and non-visual actions. It is the high-level control for an active vision system, enabling purposive behavior, the use of qualitative vision modules and a pointable multiresolution sensor. TEA-1 demonstrates that Bayes nets and decision theoretic techniques provide a general, re-usable framework for constructi...
Looking for Trouble: Using Causal Semantics to Direct Focus of Attention
- In Proc. ICCV-93
, 1993
Cited by 21 (4 self)
Vision should provide an explanation of the scene in terms of a causal semantics---what affects what, and why. An important part of the causal explanation of static scenes is what supports what, or, counterfactually: Why aren't things moving? We use simple naive physical knowledge as the basis of a vertically integrated vision system that explains arbitrarily complex stacked block structures. The semantics provides a basis for controlling the application of visual attention, and forms a framework for the explanation that is generated. We show how the program sequentially explores scenes of complex block structures, identifies functional substructures such as arches and cantilevers, and develops an explanation of why the whole construction stands and the role of each block in its stability.
1 Causal semantics for vision
Much work in vision has taken as a central principle the notion that the task of vision is primarily one of explanation (e.g., [Witkin & Tanenbaum 83]). However, most...
Selective Perception Policies for Guiding Sensing and Computation in Multimodal Systems: A Comparative Analysis
- Comput. Vis. Image Underst
, 2003
Cited by 13 (2 self)
Intensive computations required for sensing and processing perceptual information can impose significant burdens on personal computer systems. We explore several policies for selective perception in SEER, a multimodal system for recognizing office activity that relies on a layered Hidden Markov Model representation. We review our efforts to employ expected-value-of-information (EVI) computations to limit sensing and analysis in a context-sensitive manner. We discuss an implementation of a one-step myopic EVI analysis and compare the results of using the myopic EVI with a heuristic sensing policy that makes observations at different frequencies. Both policies are then compared to a random perception policy, where sensors are selected at random. Finally, we discuss the sensitivity of ideal perceptual actions to preferences encoded in utility models about information value and the cost of sensing.
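The one-step myopic EVI analysis mentioned in this abstract can be illustrated with the textbook decision-theoretic definition: the expected utility of acting after one more observation, minus the expected utility of acting now, minus the cost of observing. The sketch below is a generic illustration under that assumption and is not SEER's implementation. The discrete belief, the utility table, the sensor likelihood tables, and every function name are hypothetical.

```python
def expected_utility(belief, utility):
    """Best achievable expected utility under `belief` (state -> probability).

    `utility` maps each action to a {state: utility} table.
    """
    return max(
        sum(belief[state] * value for state, value in per_state.items())
        for per_state in utility.values()
    )


def posterior(belief, likelihood, reading):
    """Bayes update; returns (posterior belief, probability of the reading)."""
    joint = {s: belief[s] * likelihood[s][reading] for s in belief}
    p_reading = sum(joint.values())
    if p_reading == 0:
        return dict(belief), 0.0
    return {s: p / p_reading for s, p in joint.items()}, p_reading


def myopic_net_evi(belief, utility, likelihood, cost):
    """One-step expected value of observing a sensor, minus its cost."""
    readings = next(iter(likelihood.values())).keys()
    eu_after = 0.0
    for reading in readings:
        post, p_reading = posterior(belief, likelihood, reading)
        eu_after += p_reading * expected_utility(post, utility)
    return eu_after - expected_utility(belief, utility) - cost


def choose_sensor(belief, utility, sensors):
    """Pick the sensor with the highest positive net EVI, or None if none pays off."""
    scores = {
        name: myopic_net_evi(belief, utility, spec["likelihood"], spec["cost"])
        for name, spec in sensors.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

The analysis is myopic because it looks only one observation ahead. A sensor whose readings cannot change the chosen action has zero gross value, so any positive cost rules it out.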
Sensor Planning for 3D Object Search
, 1996
Cited by 12 (2 self)
The task of sensor planning for object search is formulated and a strategy for this task is proposed. The searcher is assumed to be a mobile platform equipped with an active camera and a method of calculating depth, such as stereo or a laser range finder. The formulation casts sensor planning as an optimization problem: the goal is to maximize the probability of detecting the target with minimum cost. The search region is thus characterized by the probability distribution of the presence of the target. The control of the sensing parameters depends on the current state of the search region and the detecting abilities of the recognition algorithm. In order to efficiently determine the sensing actions over time, the huge space of possible actions is reduced to a finite set of actions that must be considered. The result of each sensing operation is used to update the status of the search space.
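The loop this abstract describes (keep a probability distribution over where the target is, pick a sensing action, update the distribution with the result) can be sketched as follows. This is a minimal sketch under strong assumptions, not the paper's strategy: it divides the search region into discrete cells, assumes the target lies in exactly one of them, selects actions greedily by detection probability per unit cost, and uses hypothetical names throughout.

```python
def detection_gain(cell_prob, detect_prob, action):
    """Probability that `action` detects the target, given per-cell beliefs.

    `detect_prob[action][cell]` is the assumed chance the recognizer finds the
    target when it is in `cell` and `action` is applied; cells the action does
    not cover default to 0.
    """
    return sum(
        p * detect_prob[action].get(cell, 0.0) for cell, p in cell_prob.items()
    )


def choose_action(cell_prob, detect_prob, cost):
    """Greedy choice (an assumption): highest detection probability per unit cost."""
    return max(
        detect_prob,
        key=lambda a: detection_gain(cell_prob, detect_prob, a) / cost[a],
    )


def update_after_miss(cell_prob, detect_prob, action):
    """Bayes update of the target distribution after `action` found nothing."""
    unnormalized = {
        cell: p * (1.0 - detect_prob[action].get(cell, 0.0))
        for cell, p in cell_prob.items()
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("a miss is impossible under this model")
    return {cell: p / total for cell, p in unnormalized.items()}
```

After a miss, probability mass shifts away from the cells just examined and toward the rest, so a repeated call to `choose_action` tends to direct the sensor elsewhere.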
Physics-Based Visual Understanding
- Computer Vision and Image Understanding
, 1996
Cited by 12 (0 self)
An understanding of a scene's causal physics---how scene elements interact and respond to forces---is a precondition to reasoning about how the scene came to be, how it may evolve in time, and how it will respond to manipulation. We propose a computationally inexpensive method for recovering causal structure from images, in which a scene model is built incrementally through interleaved sensing and analysis. Reasoning uses generic qualitative knowledge about rigid-body interactions, reusable between domains and similar to concepts thought to be acquired or activated during child development. Causal constraint propagation reveals anomalous degrees-of-freedom in the scene model; prediction yields sensory plans to resolve them. Sensing operations are highly directed and local in scope, e.g., visual routines and proprioception. Inference-depth and the number of pixels "touched" are bounded by the complexity of the scene. We present algorithms and semantics that have been successfully...
Where to Look Next in 3D Object Search
- in 1995 IEEE International Symposium for Computer Vision
, 1995
Cited by 11 (4 self)
The task of sensor planning for object search is formulated and a mechanism for "where to look next" for this task is presented. The searcher is assumed to be a mobile platform equipped with an active camera and a method of calculating depth, like stereo or a laser range finder. The formulation casts sensor planning as an optimization problem: the goal is to maximize the probability of detecting the target object with minimal cost. The search space is thus characterized by the probability distribution of the presence of the target. The control of the sensing parameters depends on the current state of the search space and the detecting ability of the recognition algorithm. In order to represent the environment and to efficiently determine the sensing parameters over time, a concept called the sensed sphere is proposed and its construction, using a laser range finder, is derived. The result of each sensing operation is used to update the status of the search space.
1 Introduction
Objec...
Employing Contextual Information in Computer Vision
- In Proceedings of ARPA Image Understanding Workshop
, 1993
Cited by 10 (0 self)
Contextual information is often essential for visual recognition, but the design of image-understanding systems that effectively use context has remained elusive. We describe some of our experiences in attempting to employ contextual information in computer vision systems. By making explicit the built-in assumptions inherent in all computer vision algorithms, an architecture can be designed in which context can influence the recognition process. This paper describes such an architecture for context-based vision (CBV).
1 Introduction
It is generally accepted that the surroundings of an object may have a profound influence on, and in some cases, may be necessary for, visual recognition of the object. What is not so well established is how to design computer vision systems that can exploit such contextual information. When a human observes a scene, or even studies a photograph, he normally has at his disposal a wealth of information that is not captured by the image alone. For example, i...
Control Structures for Incorporating Picture-Specific Context in Image Interpretation
- Proc. IJCAI '95
, 1995
Cited by 10 (3 self)
This paper describes an efficient control mechanism for incorporating picture-specific context in the task of image interpretation. Although other knowledge-based vision systems use general domain context in reducing the computational burden of image interpretation, to our knowledge, this is the first effort in exploring picture-specific collateral information. We assume that constraints on the picture are generated from a natural language understanding module which processes descriptive text accompanying the pictures. We have developed a unified framework for exploiting these constraints in both the object location and identification (labeling) stages. In particular, we describe a technique for incorporating constrained search in context-based vision. Finally, we demonstrate the effectiveness of this approach in PICTION, a system that uses captions to label human faces in newspaper photographs.
1 Introduction
To solve the inherently under-constrained task of image ...
Sensor Planning in 3D Object Search: its Formulation and Complexity
- In the 4th International Symposium on Artificial Intelligence and Mathematics
, 1996
Cited by 10 (3 self)
Object search is the task of searching for a given 3D object in a given 3D environment by a robot equipped with a camera. Sensor planning for object search refers to the task of selecting the sensing parameters of the camera so as to bring the target into the camera's field of view and to make the image of the target easily recognizable by the available recognition algorithms. In this paper, we study the task of sensor planning for object search from a theoretical point of view. We formulate the task and point out many of its important properties. We then analyze the complexity of this task and prove that it is NP-Complete.

