Color indexing
International Journal of Computer Vision, 1991
Cited by 1124 (23 self)
Computer vision is embracing a new research focus in which the aim is to develop visual skills for robots that allow them to interact with a dynamic, realistic environment. To achieve this aim, new kinds of vision algorithms need to be developed which run in real time and subserve the robot's goals. Two fundamental goals are determining the identity of an object with a known location, and determining the location of a known object. Color can be successfully used for both tasks. This article demonstrates that color histograms of multicolored objects provide a robust, efficient cue for indexing into a large database of models. It shows that color histograms are stable object representations in the presence of occlusion and over change in view, and that they can differentiate among a large number of objects. For solving the identification problem, it introduces a technique called Histogram Intersection, which matches model and image histograms, and a fast incremental version of Histogram Intersection, which allows real-time indexing into a large database of stored models. For solving the location problem, it introduces an algorithm called Histogram Backprojection, which performs this task efficiently in crowded scenes.
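The matching step named in this abstract can be sketched as follows. This is a minimal illustration, assuming the commonly described formulation in which Histogram Intersection is the sum of bin-wise minima normalized by the model's total count, and in which Histogram Backprojection starts from a ratio histogram min(M_j / I_j, 1). The function names, the toy four-bin histograms, and the treatment of empty image bins are assumptions made for the example, not details given in the abstract.

```python
def histogram_intersection(image_hist, model_hist):
    """Fraction of the model's histogram mass that is also present in the image."""
    if len(image_hist) != len(model_hist):
        raise ValueError("histograms must have the same number of bins")
    model_total = sum(model_hist)
    if model_total == 0:
        raise ValueError("model histogram is empty")
    overlap = sum(min(i, m) for i, m in zip(image_hist, model_hist))
    return overlap / model_total


def best_match(image_hist, models):
    """Name of the stored model whose histogram intersects the image best."""
    return max(models, key=lambda name: histogram_intersection(image_hist, models[name]))


def ratio_histogram(model_hist, image_hist):
    """Per-bin weight min(M_j / I_j, 1) used as the first step of backprojection.

    Bins that are empty in the image get weight 0.0 (an assumption; no image
    pixel maps to such a bin, so the value is never looked up).
    """
    return [min(m / i, 1.0) if i > 0 else 0.0 for m, i in zip(model_hist, image_hist)]


if __name__ == "__main__":
    models = {"mug": [2, 2, 2, 2], "ball": [8, 0, 0, 0]}  # hypothetical database
    image = [4, 0, 2, 2]
    print(best_match(image, models))
    print(ratio_histogram(models["mug"], image))
```

Normalizing by the model rather than the image means that extra background pixels in the image do not lower the score, which is one reason the measure tolerates clutter.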
Specialization of Perceptual Processes
1994
Cited by 81 (6 self)
In this report, I discuss the use of vision to support concrete, everyday activity. I will argue that a variety of interesting tasks can be solved using simple and inexpensive vision systems. I will provide a number of working examples in the form of a state-of-the-art mobile robot, Polly, which uses vision to give primitive tours of the seventh floor of the MIT AI Laboratory. By current standards, the robot has a broad behavioral repertoire and is both simple and inexpensive (the complete robot was built for less than $20,000 using commercial board-level components). The approach I will use will be to treat the structure of the agent's activity (its task and environment) as positive resources for the vision system designer. By performing a careful analysis of task and environment, the designer can determine a broad space of mechanisms which can perform the desired activity. My principal thesis is that for a broad range of activities, the space of applicable mechanisms will be broad...
Recovering shape by purposive viewpoint adjustment
International Journal of Computer Vision, 1994
Cited by 52 (8 self)
We present an approach for recovering surface shape from the occluding contour using an active (i.e., moving) observer. It is based on a relation between the geometries of a surface in a scene and its occluding contour: If the viewing direction of the observer is along a principal direction for a surface point whose projection is on the contour, surface shape (i.e., curvature) at the surface point can be recovered from the contour. Unlike previous approaches for recovering shape from the occluding contour, we use an observer that purposefully changes viewpoint in order to achieve a well-defined geometric relationship with respect to a 3D shape prior to its recognition. We show that there is a simple and efficient viewing strategy that allows the observer to align the viewing direction with one of the two principal directions for a point on the surface. Experimental results demonstrate that our method can be easily implemented and can provide reliable shape information.
A Tensor Framework for Multidimensional Signal Processing
Linköping University, Sweden, 1994
Cited by 50 (6 self)
About the cover: The figure on the cover shows a visualization of a symmetric tensor in three dimensions, $G = \lambda_1 \hat{e}_1 \hat{e}_1^T + \lambda_2 \hat{e}_2 \hat{e}_2^T + \lambda_3 \hat{e}_3 \hat{e}_3^T$. The object in the figure is the sum of a spear, a plate and a sphere. The spear describes the principal direction of the tensor, $\lambda_1 \hat{e}_1 \hat{e}_1^T$, where the length is proportional to the largest eigenvalue, $\lambda_1$. The plate describes the plane spanned by the eigenvectors corresponding to the two largest eigenvalues, $\lambda_2 (\hat{e}_1 \hat{e}_1^T + \hat{e}_2 \hat{e}_2^T)$. The sphere, with a radius proportional to the smallest eigenvalue, shows how isotropic the tensor is, $\lambda_3 (\hat{e}_1 \hat{e}_1^T + \hat{e}_2 \hat{e}_2^T + \hat{e}_3 \hat{e}_3^T)$. The visualization is done using AVS [WWW94]. I am very grateful to Johan Wiklund for implementing the tensor viewer module used. This thesis deals with filtering of multidimensional signals. A large part of the thesis is devoted to a novel filtering method termed “Normalized convolution”. The method performs local expansion of a signal in a chosen filter basis which
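The spear, plate and sphere picture can be checked numerically with a small sketch. The cover text sizes the three shapes by the three eigenvalues; for the parts to add up exactly to the tensor, the sketch below uses the weights (λ1 − λ2), (λ2 − λ3) and λ3, which is an assumption made here so that the identity holds, not a statement of the thesis's definitions. All function names are hypothetical, and the eigenvectors are assumed orthonormal with eigenvalues sorted in decreasing order.

```python
def outer(u, v):
    """Outer product u v^T of two 3-vectors as a 3x3 nested list."""
    return [[a * b for b in v] for a in u]


def add(*mats):
    """Element-wise sum of any number of 3x3 matrices."""
    return [[sum(m[r][c] for m in mats) for c in range(3)] for r in range(3)]


def scale(k, m):
    """Multiply every element of a matrix by the scalar k."""
    return [[k * x for x in row] for row in m]


def tensor_from_eigen(lams, vecs):
    """Build G = sum_k lambda_k e_k e_k^T from eigenvalues and eigenvectors."""
    return add(*(scale(lam, outer(e, e)) for lam, e in zip(lams, vecs)))


def spear_plate_sphere(lams, vecs):
    """Split G into a rank-1 (spear), rank-2 (plate) and isotropic (sphere) part.

    Assumes lams is sorted so that lams[0] >= lams[1] >= lams[2] >= 0.
    """
    l1, l2, l3 = lams
    p1, p2, p3 = (outer(e, e) for e in vecs)
    spear = scale(l1 - l2, p1)
    plate = scale(l2 - l3, add(p1, p2))
    sphere = scale(l3, add(p1, p2, p3))
    return spear, plate, sphere
```

With orthonormal eigenvectors the third term is λ3 times the identity, so the sphere part carries the isotropic component and the other two parts carry what remains along the dominant line and plane.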
Polly: A Vision-Based Artificial Agent
Proceedings of the Eleventh National Conference on Artificial Intelligence (AAAI-93), 1993
Cited by 42 (1 self)
In this paper I will describe Polly, a low cost vision-based robot that gives primitive tours. The system is very simple, robust and efficient, and runs on a hardware platform which could be duplicated for less than $10K US. The system was built to explore how knowledge about the structure of the environment can be used in a principled way to simplify both visual and motor processing. I will argue that very simple and efficient visual mechanisms can often be used to solve real problems in real (unmodified) environments in a principled manner. I will give an overview of the robot, discuss the properties of its environment, show how they can be used to simplify the design of the system, and discuss what lessons can be drawn for the design of other systems. 1 Introduction. In this paper, I will describe Polly, a simple artificial agent that uses vision to give primitive tours of the 7th floor of the MIT AI lab (see figure 1). Polly is built from minimalist machinery that is matched to its task ...
Divergent Stereo in Autonomous Navigation: From Bees to Robots
1994
Cited by 41 (15 self)
This report presents some experiments with a real-time navigation system driven by two cameras pointing laterally to the navigation direction (Divergent Stereo). Similarly to what has been proposed in [11; 5], our approach [17; 19] assumes that, for navigation purposes, the driving information is not distance (as obtainable from a stereo setup) but motion, and more precisely qualitative optical flow information computed over nonoverlapping areas of the visual field of the two cameras. Following this idea, a mobile vehicle has been equipped with a pair of cameras looking laterally (much like honeybees) and a controller based on fast, real-time computation of optical flow has been implemented. The control of the mobile robot (Robee) is based on the comparison between the apparent image velocity of the left and the right cameras. The solution adopted is derived from recent studies [21] describing the behavior of freely flying honeybees and the mechanisms they use to perceive ...
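The comparison between left and right apparent image velocity can be illustrated with a short sketch. The abstract does not give the control law, so the normalized difference below is a generic flow-balancing rule chosen for illustration. The function names, the gain parameter and the sign convention (positive means steer right, away from the left side) are all assumptions.

```python
def mean_flow_magnitude(samples):
    """Average absolute horizontal image velocity over one camera's samples."""
    if not samples:
        return 0.0
    return sum(abs(s) for s in samples) / len(samples)


def balance_steering(left_flow, right_flow, gain=1.0):
    """Steering command from the two lateral flow magnitudes.

    Inputs are non-negative flow magnitudes. A larger flow on one side
    suggests a closer surface on that side, so the command steers toward
    the side with the smaller flow. Returns 0.0 when no motion is
    measured on either side.
    """
    total = left_flow + right_flow
    if total <= 0.0:
        return 0.0
    return gain * (left_flow - right_flow) / total


if __name__ == "__main__":
    left = mean_flow_magnitude([2.1, 1.9, 2.0])   # hypothetical pixels/frame
    right = mean_flow_magnitude([1.0, 1.1, 0.9])
    print(balance_steering(left, right))
```

Dividing by the total flow keeps the command bounded between minus and plus the gain regardless of the vehicle's forward speed, which is one reason a normalized difference is a convenient choice here.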
An Architecture for Vision and Action
Fourteenth International Joint Conference on Artificial Intelligence, 1995
Cited by 37 (8 self)
Vision systems that have successfully supported nontrivial tasks have invariably taken advantage of constraints derived from the task and environment to increase reliability and lower the complexity of perception. We propose that it is possible to build a general purpose vision system, that is, one that can support a wide variety of tasks, and take advantage of such constraints. The central idea within our proposed architecture is the reactive skill. Skills are concurrent control routines assembled at run time using instructions from a symbolic execution system. Visual modules are used as resources in the construction of these skills. Skills control the agent as continuous feedback loops but are constructed using discrete, symbolic instructions. The key to general-purpose vision is the ability to parametrize the primitive elements of the vision system and to compose visual and control routines in a variety of ways. We demonstrate the architecture in the context of an implemented examp...
Qualitative Egomotion
International Journal of Computer Vision, 1993
Cited by 29 (12 self)
Due to the aperture problem, the only general unambiguous motion measurement in images is normal flow, the projection of image motion on the gradient direction. In this paper we show how a monocular observer can estimate its 3D motion relative to the scene by using normal flow measurements in a global and mostly qualitative way. The problem is addressed through a search technique. By checking constraints imposed by 3D motion parameters on the normal flow field the possible space of solutions is gradually reduced. In the four modules that comprise the solution, constraints of increasing restriction are considered, culminating in testing every single normal flow value for its consistency with a set of motion parameters. The fact that motion is rigid defines geometric relations between certain values of the normal flow field. The selected values form patterns in the image plane that are dependent on only some of the motion parameters. These patterns, which are determined by the signs of the normal flow values, are searched for in order to find the axes of translation and rotation. The third rotational component is computed from normal flow vectors that are only due to rotational motion. Finally, by looking at the complete data set, all solutions that cannot give rise to the given normal flow field are discarded from the solution space.
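Normal flow, as defined in the first sentence of this abstract, can be computed directly from image derivatives. The sketch below assumes the standard brightness-constancy constraint Ix·u + Iy·v + It = 0, from which the component of motion along the gradient is −It·∇I / |∇I|². The function names and the small-gradient threshold are assumptions for illustration; the search over motion parameters described in the abstract is not implemented.

```python
def normal_flow(ix, iy, it, eps=1e-9):
    """Normal flow vector at one pixel from spatial (ix, iy) and temporal (it) derivatives.

    Returns None where the gradient is too weak to define a direction.
    """
    grad_sq = ix * ix + iy * iy
    if grad_sq < eps:
        return None
    s = -it / grad_sq
    return (s * ix, s * iy)


def project_on_gradient(u, v, ix, iy):
    """Project a full image-motion vector (u, v) onto the gradient direction."""
    grad_sq = ix * ix + iy * iy
    dot = u * ix + v * iy
    return (dot * ix / grad_sq, dot * iy / grad_sq)


if __name__ == "__main__":
    ix, iy, it = 3.0, 4.0, -10.0
    print(normal_flow(ix, iy, it))
    # Two different full flows that both satisfy the constraint project to
    # the same normal flow, which is the aperture problem in miniature.
    print(project_on_gradient(2.0, 1.0, ix, iy))
    print(project_on_gradient(-2.0, 4.0, ix, iy))
```

Only the sign and size of this one component are measurable at each pixel, which is why the approach in the abstract works with patterns of normal flow signs rather than full flow vectors.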
Provable strategies for vision-guided exploration in three dimensions
1993
Cited by 29 (0 self)
An approach is presented for exploring an unknown, arbitrary surface in three-dimensional (3D) space by a mobile robot. The main contributions are (1) an analysis of the capabilities a robot must possess and the trade-offs involved in the design of an exploration strategy, and (2) two provably correct exploration strategies that exploit these trade-offs and use visual sensors (e.g., cameras and range sensors) to plan the robot’s motion. No such analysis existed previously for the case of a robot moving freely in 3D space. The approach exploits the notion of the occlusion boundary, i.e., the points separating the visible from the occluded parts of an object. The occlusion boundary is a collection of curves that “slide” over the surface when the robot’s position is continuously controlled, inducing the visibility of surface points over which they slide. The paths generated by our strategies force the occlusion boundary to slide over the entire surface. The strategies provide a basis for integrating motion planning and visual sensing under a common computational framework.
Map-based navigation in mobile robots: I. A review of localization strategies
2003
Cited by 26 (9 self)
For a robot, an animal, and even for man, to be able to use an internal representation of the spatial layout of its environment to position itself is a very complex task, which raises numerous issues of perception, categorization and motor control that must all be solved in an integrated manner to promote survival. This point is illustrated here, within the framework of a review of localization strategies in mobile robots. The allothetic and idiothetic sensors that may be used by these robots to build internal representations of their environment, and the maps in which these representations may be instantiated, are first described. Then map-based navigation systems are categorized according to a 3-level hierarchy of localization strategies, which respectively call upon direct position inference, single-hypothesis tracking, and multiple-hypothesis tracking. The advantages and drawbacks of these strategies, notably with respect to the limitations of the sensors on which they rely, are discussed throughout the text.
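The difference between tracking a single position hypothesis and tracking several can be illustrated with a small discrete Bayes (histogram) filter. This is a generic textbook-style sketch, not a method taken from the review: the cyclic five-cell corridor, the door and wall labels, the sensor probabilities and all function names are assumptions. The belief list holds one probability per cell (multiple hypotheses), and taking its maximum collapses it to a single hypothesis.

```python
def normalize(belief):
    """Rescale a list of non-negative weights so that it sums to 1."""
    total = sum(belief)
    if total == 0:
        raise ValueError("belief has no probability mass")
    return [b / total for b in belief]


def predict(belief, step):
    """Shift the belief by an exact odometry step on a cyclic corridor."""
    n = len(belief)
    return [belief[(i - step) % n] for i in range(n)]


def update(belief, world, observation, p_hit=0.8, p_miss=0.2):
    """Reweight each cell by how well it explains the observation."""
    weighted = [
        b * (p_hit if cell == observation else p_miss)
        for b, cell in zip(belief, world)
    ]
    return normalize(weighted)


def most_likely_cell(belief):
    """Collapse the belief to one hypothesis: the index of its maximum."""
    return max(range(len(belief)), key=lambda i: belief[i])


if __name__ == "__main__":
    world = ["door", "wall", "wall", "door", "wall"]  # hypothetical map
    belief = [1.0 / len(world)] * len(world)
    belief = update(belief, world, "door")
    for seen in ("wall", "door"):
        belief = update(predict(belief, 1), world, seen)
    print(most_likely_cell(belief), belief)
```

After the first door sighting two cells remain equally likely, which a single-hypothesis tracker would have to resolve arbitrarily; keeping both until a later observation separates them is the practical appeal of the multiple-hypothesis level.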

