Improved Rooftop Detection in Aerial Images with Machine Learning
- Machine Learning, 2002
Cited by 15 (2 self)
In this paper, we examine the use of machine learning to improve a rooftop detection process, one step in a vision system that recognizes buildings in overhead imagery. We review the problem of analyzing aerial images and describe an existing system that detects buildings in such images. We briefly detail four algorithms that we selected to improve rooftop detection. The data sets were highly skewed and the cost of mistakes differed between the classes, so we used ROC analysis to evaluate the methods under varying error costs. We report three experiments designed to illuminate facets of applying machine learning to the image analysis task. One investigated learning with all available images to determine the best performing method. Another focused on within-image learning, in which we derived training and testing data from the same image. A final experiment addressed between-image learning, in which training and testing sets came from different images. Results suggest that useful generalization occurred when training and testing on data derived from images differing in location and in aspect. They demonstrate that under most conditions, naive Bayes exceeded the accuracy of other methods and a handcrafted classifier, the solution currently used in the building detection system.
Learning Patterns in Images
- in Machine Learning and Data, 1998
Cited by 6 (0 self)
This chapter concerns problems of learning patterns in images and image sequences, and using them for interpreting new images. The chapter concentrates on three problem areas: (i) semantic interpretation of color images of outdoor scenes, (ii) detection of blasting caps in x-ray images of luggage, and (iii) recognizing actions in video image sequences. It discusses the image formation processes in these problem areas, and the choices of representation spaces used in our approaches to solving these problems. The results presented indicate the advantages of applying machine learning to vision.
10.1 INTRODUCTION
The underlying motivation of this research is that vision systems need learning capabilities for handling problems for which algorithmic solutions are unknown or difficult to obtain. Learning capabilities can also make vision systems more easily adaptable to different vision problems, and more flexible and robust in handling variable perceptual conditions [MRA94]. Much of the cur...
FOCUS: A Generalized Method for Object Discovery for Robots that Observe and Interact with Humans
- In Proceedings of the 2006 Conference on Human-Robot Interaction, 2006
Cited by 3 (3 self)
The essence of the signal-to-symbol problem consists of associating a symbolic description of an object (e.g., a chair) with a signal (e.g., an image) that captures the real object. Robots that interact with humans in natural environments must be able to solve this problem correctly and robustly. However, the problem of providing complete object models a priori to a robot so that it can understand its environment from any viewpoint is extremely difficult to solve. Additionally, many objects have different uses, which in turn can cause ambiguities when a robot attempts to reason about the activities of a human and their interactions with those objects. In this paper, we build upon the fact that robots that co-exist with humans should be able to observe humans using the different objects and learn the corresponding object definitions. We contribute an object recognition algorithm, FOCUS, that is robust to the variations of signals, combines structure and function of an object, and generalizes to multiple similar objects. FOCUS, which stands for Finding Object Classification through Use and Structure, combines an activity recognizer capable of capturing how an object is used with a traditional visual structure processor. FOCUS learns structural properties (visual features) of objects by first knowing the object's affordance properties and then observing humans interacting with that object through known activities. The strength of the method relies on the fact that we can define multiple aspects of an object model, i.e., structure and use, that are individually robust but insufficient to define the object, yet suffice when combined.
Function-Based Classification from 3D Data and Audio
- Proceedings of the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, 2006
Cited by 2 (0 self)
We propose a novel scheme for fusion between two types of modalities to support function-based classification. While the first modality targets functional classification from sounds registered at impact, the second one aims at classification of objects in 3D images. Using audio, one can answer functional questions such as what material the analyzed objects are built of, whether the objects are full or hollow, whether they are heavy, and whether they are rigidly linked to their supports. Audio-based signatures are used to label parts of the object under analysis. Different parts of any object can be partitioned into generic multi-level hierarchical descriptions of functional components. Functionality, in the visual modality reasoning scheme, is derived from a large set of geometric attributes and relationships between object parts. These geometric properties represent labeling signatures for the primitive and functional parts of the analyzed classes. The fusion of the two modalities relies on a shared cooperation among audio and visual signatures of the functional and primitive parts. The scheme does not require a priori knowledge about any class. We tested the proposed scheme on a database of about one thousand different 3D objects. The results show high accuracy in classification.
Automatic Object Recognition within an Office Scene
- In Canadian Conference on Computer and Robot Vision (CRV2004), 2004
Cited by 1 (1 self)
The visionary goal of an easy-to-use service robot implies some key features like spatial cognition, speech understanding and object recognition. Therefore, such a system needs techniques to identify objects in scenes, i.e., to assign the natural category (e.g., "door", "chair", "table") to new objects based on their prototypical geometry.

