Recognition-by-components: A theory of human image understanding
- Psychological Review
, 1987
- Cited by 550 (8 self)
The perceptual recognition of objects is conceptualized to be a process in which the image of the input is segmented at regions of deep concavity into an arrangement of simple geometric components, such as blocks, cylinders, wedges, and cones. The fundamental assumption of the proposed theory, recognition-by-components (RBC), is that a modest set of generalized-cone components, called geons (N ≤ 36), can be derived from contrasts of five readily detectable properties of edges in a two-dimensional image: curvature, collinearity, symmetry, parallelism, and cotermination. The detection of these properties is generally invariant over viewing position and image quality and consequently allows robust object perception when the image is projected from a novel viewpoint or is degraded. RBC thus provides a principled account of the heretofore undecided relation between the classic principles of perceptual organization and pattern recognition: The constraints toward regularization (Prägnanz) characterize not the complete object but the object's components. Representational power derives from an allowance of free combinations of the geons. A Principle of Componential Recovery can account for the major phenomena of object recognition: If an arrangement of two or three geons can be recovered from the input, objects can be quickly recognized even when they are occluded, novel, rotated in depth, or extensively degraded. The results from experiments on the perception of briefly presented pictures by human observers provide empirical support for the theory.
Any single object can project an infinity of image configurations to the retina. The orientation of the object to the viewer can vary continuously, each giving rise to a different two-dimensional projection. The object can be occluded by other objects or texture fields, as when viewed behind foliage. The object need not be presented as a full-colored textured image but instead can be a simplified line drawing.
Moreover, the object can even be missing some of its parts or be a novel exemplar of its
Preattentive recovery of three-dimensional orientation from line drawings
- Psychological Review
, 1991
- Cited by 40 (11 self)
It has generally been assumed that rapid visual search is based on simple features and that spatial relations between features are irrelevant for this task. Seven experiments involving search for line drawings contradict this assumption; a major determinant of search is the presence of line junctions. Arrow- and Y-junctions were detected rapidly in isolation and when they were embedded in drawings of rectangular polyhedra. Search for T-junctions was considerably slower. Drawings containing T-junctions often gave rise to very slow search even when distinguishing arrow- or Y-junctions were present. This sensitivity to line relations suggests that preattentive processes can extract 3-dimensional orientation from line drawings. A computational model is outlined for how this may be accomplished in early human vision.
Although we are still a long way from a complete understanding of visual perception, considerable progress has been made in our understanding of its earliest stages (see Zucker, 1987). These stages are concerned with the extraction of information from the retinal image, and as such are generally assumed to be carried out by processes operating in parallel across the visual field. They are also generally assumed to be
Learning to Segment Images Using Dynamic Feature Binding
- Neural Computation
, 1991
- Cited by 36 (9 self)
Despite the fact that complex visual scenes contain multiple, overlapping objects, people perform object recognition with ease and accuracy. One operation that facilitates recognition is an early segmentation process in which features of objects are grouped and labeled according to which object they belong. Current computational systems that perform this operation are based on predefined grouping heuristics. We describe a system called MAGIC that learns how to group features based on a set of presegmented examples. In many cases, MAGIC discovers grouping heuristics similar to those previously proposed, but it also has the capability of finding nonintuitive structural regularities in images. Grouping is performed by a relaxation network that attempts to dynamically bind related features. Features transmit a complex-valued signal (amplitude and phase) to one another; binding can thus be represented by phase locking related features. MAGIC's training procedure is a generalizatio...
Neural Dynamics Of 3-D Surface Perception: Figure-Ground Separation And Lightness Perception
- Perception and Psychophysics
, 2000
- Cited by 28 (19 self)
This article develops the FACADE theory of three-dimensional (3-D) vision to simulate data concerning how two-dimensional (2-D) pictures give rise to 3-D percepts of occluded and occluding surfaces. The theory suggests how geometrical and contrastive properties of an image can either cooperate or compete when forming the boundary and surface representations that subserve conscious visual percepts. Spatially long-range cooperation and short-range competition work together to separate boundaries of occluding figures from their occluded neighbors, thereby providing sensitivity to T-junctions without the need to assume that T-junction "detectors" exist. Both boundary and surface representations of occluded objects may be amodally completed, while the surface representations of unoccluded objects become visible through modal processes. Computer simulations include Bregman-Kanizsa figure-ground separation, Kanizsa stratification, and various lightness percepts, including the Munker-White, Be...
Steerable Filters and Local Analysis of Image Structure
, 1992
- Cited by 25 (0 self)
Two paradigms for visual analysis are top-down, starting from high-level models or information about the image, and bottom-up, where little is assumed about the image or objects in it. We explore a local, bottom-up approach to image analysis. We develop operators to identify and classify image junctions, which contain important visual cues for identifying occlusion, transparency, and surface bends. Like the human visual system, we begin with the application of linear filters which are oriented in all possible directions. We develop an efficient way to create an oriented filter of arbitrary orientation by describing it as a linear combination of basis filters. This approach to oriented filtering, which we call steerable filters, offers advantages for analysis as well as computation. We design a variety of steerable filters, including steerable quadrature pairs, which measure local energy. We show applications of these filters in orientation and texture analysis, and image representation and enhanc...
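The steering idea in the abstract above can be illustrated with the simplest steerable filter, the first derivative of a Gaussian. What follows is a minimal sketch under stated assumptions, not the authors' implementation: it assumes an unnormalized Gaussian sampled on a small square grid, uses only the Python standard library, and all function names (`gaussian_derivative_basis`, `steer`, `direct_kernel`, `max_abs_diff`) are hypothetical. It checks that the kernel obtained as cos(theta) * G_x + sin(theta) * G_y matches the directional derivative computed directly at angle theta.

```python
import math


def gaussian_derivative_basis(radius=3, sigma=1.0):
    """Return the two basis kernels G_x and G_y: the x- and y-derivatives
    of an unnormalized 2-D Gaussian sampled on a (2*radius+1)^2 grid."""
    coords = range(-radius, radius + 1)

    def gauss(x, y):
        return math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))

    gx = [[-x / sigma ** 2 * gauss(x, y) for x in coords] for y in coords]
    gy = [[-y / sigma ** 2 * gauss(x, y) for x in coords] for y in coords]
    return gx, gy


def steer(gx, gy, theta):
    """Synthesize the kernel oriented at angle theta (radians) as a
    linear combination of the two basis kernels."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * a + s * b for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]


def direct_kernel(theta, radius=3, sigma=1.0):
    """Directional derivative of the same Gaussian along angle theta,
    computed from scratch for comparison."""
    c, s = math.cos(theta), math.sin(theta)
    coords = range(-radius, radius + 1)
    return [[-(x * c + y * s) / sigma ** 2
             * math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
             for x in coords] for y in coords]


def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two kernels."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


if __name__ == "__main__":
    gx, gy = gaussian_derivative_basis()
    theta = math.radians(30.0)
    print(max_abs_diff(steer(gx, gy, theta), direct_kernel(theta)))
```

Because the interpolation weights depend only on the angle, an image would need to be filtered once with each basis kernel; responses at any other orientation then follow from the same weighted sum of the two filtered images.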
Junctions: Detection, Classification and Reconstruction
- Cited by 24 (1 self)
Junctions are important features for image analysis and form a critical aspect of image understanding tasks such as object recognition. We present a unified approach to detecting (location of the center of the junction), classifying (by the number of wedges -- lines, corners, 3-junctions such as T or Y junctions, or 4-junctions such as X-junctions) and reconstructing junctions (in terms of radius size, the angles of each wedge and the intensity in each of the wedges) in images. Our main contribution is a modeling of the junction which is complex enough to handle all these issues and yet simple enough to admit an effective dynamic programming solution. Broadly, we use a template deformation framework along with a gradient criterium to detect radial partitions of the template. We use the minimum description length principle to obtain the optimal number of partitions that best describes the junction. Kona [27] is an implementation of this model. We (quantitatively) demonstrate the stabili...
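The classification step named in the abstract above (lines, corners, 3-junctions such as T or Y, 4-junctions) can be made concrete with a small sketch. This is not the paper's method, which relies on template deformation, dynamic programming, and the minimum description length principle. It is an illustrative assumption: a junction is taken to be already detected and described by the directions of its rays in degrees, the function names and the 10-degree tolerance are hypothetical, and 3-junctions are labeled with the common geometric convention that a T has one gap of about 180 degrees, an arrow has one gap above 180 degrees, and a Y has all gaps below 180 degrees.

```python
def angular_gaps(ray_angles_deg):
    """Angles (degrees) between consecutive rays, going once around
    the junction center. Assumes at least two distinct ray directions."""
    angles = sorted(a % 360.0 for a in ray_angles_deg)
    n = len(angles)
    return [(angles[(i + 1) % n] - angles[i]) % 360.0 for i in range(n)]


def classify_junction(ray_angles_deg, tol_deg=10.0):
    """Label a junction from its ray directions (hypothetical helper)."""
    n = len(ray_angles_deg)
    if n < 2:
        return "unknown"
    largest = max(angular_gaps(ray_angles_deg))
    if n == 2:
        # Two rays: opposite directions form a line, anything else a corner.
        return "line" if abs(largest - 180.0) <= tol_deg else "corner"
    if n == 3:
        if abs(largest - 180.0) <= tol_deg:
            return "T"
        return "arrow" if largest > 180.0 else "Y"
    return f"{n}-junction"


if __name__ == "__main__":
    for rays in ([0, 180], [0, 90], [0, 90, 180], [90, 210, 330],
                 [0, 40, 80], [0, 90, 180, 270]):
        print(rays, classify_junction(rays))
```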
Geometric reasoning for single image structure recovery
- In proc. CVPR
, 2009
- Cited by 23 (4 self)
We study the problem of generating plausible interpretations of a scene from a collection of line segments automatically extracted from a single indoor image. We show that we can recognize the three-dimensional structure of the interior of a building, even in the presence of occluding objects. Several physically valid structure hypotheses are proposed by geometric reasoning and verified to find the best fitting model to line segments, which is then converted to a full 3D model. Our experiments demonstrate that our structure recovery from line segments is comparable with methods using full image appearance. Our approach shows how a set of rules describing geometric constraints between groups of segments can be used to prune scene interpretation hypotheses and to generate the most plausible interpretation.
Figure 1. Line segments. Can you recognize the building structure? Can you find doors?
Rigidity in cinema seen from the front row, side aisle
- Journal of Experimental Psychology: Human Perception and Performance
, 1987
- Cited by 13 (3 self)
Pictures and cinema seen at a slant present the optics of virtual objects that are distorted and inconsistent with their real counterparts. In particular, it should not be possible for moving objects on slanted film and television screens to be seen as rigid, at least according to rules of linear perspective. Previous approaches to this problem have suggested that some process (perhaps cognitive) rectifies the optics of objects in slanted pictures to derive true shape and preserve shape constancy. The means for this rectification is usually thought to be based on recovery of true screen slant. In three experiments I show that this account is unnecessary and insufficient to explain the perception of rotating, rectangular objects in slanted cinema. I present data in favor of an alternate view, one in which the information is sufficient for perceivers to determine rigidity in an object on slanted screens, at least for parallel projections. In the human visual system, local measurements of objects are apparently made according to projective geometry; in those measurements, small amounts of certain distortions in projection are tolerated. Stimuli that appear nonrigid are ones that violate certain local principles, known as Perkins's laws, of projections of rectangular solids.
Eye position is not fixed when one looks at a photograph or painting. A puzzle arises from this fact: Linear perspective, the
Automatic Creation of Boundary-Representation Models from Single Line Drawings
, 2002
- Cited by 13 (10 self)
This thesis presents methods for the automatic creation of boundary-representation models of polyhedral objects from single line drawings depicting the objects. This topic is important in that automated interpretation of freehand sketches would remove a bottleneck in current engineering design methods. The thesis does not consider conversion of freehand sketches to line drawings or methods which require manual intervention or multiple drawings. The thesis contains a number of...
Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces
- Cited by 13 (3 self)
There has been a recent push in extraction of 3D spatial layout of scenes. However, none of these approaches model the 3D interaction between objects and the spatial layout. In this paper, we argue for a parametric representation of objects in 3D, which allows us to incorporate volumetric constraints of the physical world. We show that augmenting current structured prediction techniques with volumetric reasoning significantly improves the performance of the state-of-the-art.

