A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics
- in Proc. 8th Int’l Conf. Computer Vision, 2001
Cited by 365 (14 self)
This paper presents a database containing ‘ground truth’ segmentations produced by humans for images of a wide variety of natural scenes. We define an error measure which quantifies the consistency between segmentations of differing granularities and find that different human segmentations of the same image are highly consistent. Use of this dataset is demonstrated in two applications: (1) evaluating the performance of segmentation algorithms and (2) measuring probability distributions associated with Gestalt grouping factors as well as statistics of image region properties.
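The abstract does not spell out its error measure, so the following is only a sketch of one refinement-tolerant consistency measure of the kind it describes. It assumes each segmentation is a flat list of region labels, one per pixel; the function names and the global/local variants are illustrative assumptions, not taken from the listing.

```python
from collections import Counter


def local_refinement_errors(seg_a, seg_b):
    """Per-pixel error: the fraction of a pixel's region in seg_a that lies
    outside the region containing the same pixel in seg_b."""
    size_a = Counter(seg_a)                 # region sizes in seg_a
    overlap = Counter(zip(seg_a, seg_b))    # sizes of region intersections
    return [1.0 - overlap[(a, b)] / size_a[a] for a, b in zip(seg_a, seg_b)]


def consistency_errors(seg_a, seg_b):
    """Return (global_error, local_error), both in [0, 1].

    Both are zero when one segmentation is a refinement of the other, so
    differences in granularity alone are not penalised (assumed design).
    """
    if len(seg_a) != len(seg_b) or not seg_a:
        raise ValueError("segmentations must be non-empty and equally long")
    e_ab = local_refinement_errors(seg_a, seg_b)
    e_ba = local_refinement_errors(seg_b, seg_a)
    n = len(seg_a)
    # Global variant: refinement must run in one direction for all pixels.
    global_error = min(sum(e_ab), sum(e_ba)) / n
    # Local variant: the direction of refinement may differ per pixel.
    local_error = sum(min(x, y) for x, y in zip(e_ab, e_ba)) / n
    return global_error, local_error


# A coarse two-region labelling and a finer four-region refinement of it.
coarse = [0, 0, 0, 0, 1, 1, 1, 1]
fine = [0, 0, 1, 1, 2, 2, 3, 3]
print(consistency_errors(coarse, fine))
```

Because the finer labelling only subdivides the coarse regions, both errors are zero for this pair; labellings whose regions overlap without nesting give positive values.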
Learning Image Statistics for Bayesian Tracking
- In IEEE International Conference on Computer Vision, 2001
Cited by 58 (6 self)
This paper describes a framework for learning probabilistic models of objects and scenes and for exploiting these models for tracking complex, deformable, or articulated objects in image sequences. We focus on the probabilistic tracking of people and learn models of how they appear and move in images. In particular, we learn the likelihood of observing various spatial and temporal filter responses corresponding to edges, ridges, and motion differences given a model of the person. Similarly, we learn probability distributions over filter responses for general scenes that define a likelihood of observing the filter responses for arbitrary backgrounds. We then derive a probabilistic model for tracking that exploits the ratio between the likelihood that image pixels corresponding to the foreground (person) were generated by an actual person or by some unknown background. The paper extends previous work on learning image statistics and combines it with Bayesian tracking using particle filtering. By combining multiple image cues, and by using learned likelihood models, we demonstrate improved robustness and accuracy when tracking complex objects such as people in monocular image sequences with cluttered scenes and a moving camera.
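As a rough illustration of particle filtering with a foreground/background likelihood ratio, here is a one-dimensional toy sketch. Everything specific in it is an assumption made for the example: the Gaussian stand-ins for the learned response distributions, the random-walk motion model, and the helper `synthetic_responses`, which fabricates a row of filter responses purely for demonstration.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def likelihood_ratio(response):
    """p(response | foreground) / p(response | background).

    The two Gaussians are hypothetical stand-ins for learned distributions.
    """
    return gaussian_pdf(response, 1.0, 0.3) / gaussian_pdf(response, 0.0, 0.3)


def synthetic_responses(length, target, half_width=2):
    """Hypothetical 1D 'image': response 1.0 near the target, 0.0 elsewhere."""
    return [1.0 if abs(i - target) <= half_width else 0.0 for i in range(length)]


def particle_filter_step(particles, responses, rng, motion_std=1.0):
    """One predict / weight / resample cycle over scalar positions."""
    # Predict: diffuse each particle with a random-walk motion model.
    moved = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: likelihood ratio of the response at the particle's pixel.
    weights = []
    for p in moved:
        pixel = min(len(responses) - 1, max(0, int(round(p))))
        weights.append(likelihood_ratio(responses[pixel]))
    # Resample in proportion to the weights.
    return rng.choices(moved, weights=weights, k=len(moved))


def estimate(particles):
    """Posterior mean of the position."""
    return sum(particles) / len(particles)


rng = random.Random(0)
particles = [rng.uniform(0, 49) for _ in range(500)]
for target in range(20, 26):  # the target drifts one pixel per frame
    particles = particle_filter_step(particles, synthetic_responses(50, target), rng)
print(round(estimate(particles), 1))
```

Weighting by the ratio rather than by the foreground likelihood alone means a particle is rewarded only where the response is better explained by the object than by the background.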
Statistics of range images
- CVPR, 2000
Cited by 48 (5 self)
The statistics of range images from natural environments is a largely unexplored field of research. It closely relates to the statistical modeling of the scene geometry in natural environments, and the modeling of optical natural images. We have used a 3D laser range-finder to collect range images from mixed forest scenes. The images are here analyzed with respect to different statistics.
Statistical edge detection: learning and evaluating edge cues
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003
Cited by 44 (4 self)
We formulate edge detection as statistical inference. This statistical edge detection is data driven, unlike standard methods for edge detection which are model based. For any set of edge detection filters (implementing local edge cues) we use pre-segmented images to learn the probability distributions of filter responses conditioned on whether they are evaluated on or off an edge. Edge detection is formulated as a discrimination task specified by a likelihood ratio test on the filter responses. This approach emphasizes the necessity of modeling the image background (the off-edges). We represent the conditional probability distributions non-parametrically and learn them on two different datasets of 100 (Sowerby) and 50 (South Florida) images. Multiple edge cues, including chrominance and multiple-scale, are combined by using their joint distributions. Hence this cue combination is optimal in the statistical sense. We evaluate the effectiveness of different visual cues using the Chernoff information and Receiver Operator Characteristic (ROC) curves. This shows that our approach gives quantitatively better results than the Canny edge detector when the image background contains significant clutter. In addition, it enables us to determine the effectiveness of different edge cues and gives quantitative measures for the advantages of multi-level processing, for the use of chrominance, and for the relative effectiveness of different detectors. Furthermore, we show that we can learn these conditional distributions on one dataset and adapt them to the other with only slight degradation of performance without knowing the ground truth on the second dataset. This shows that our results are not purely domain specific. We apply the same approach to the spatial grouping of edge cues and obtain analogies to non-maximal suppression and hysteresis.
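The likelihood ratio test on histogrammed filter responses can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: responses are scalars in a known range, the bin count and smoothing constant are arbitrary, the toy training values are made up, and the Chernoff information is approximated by a simple grid search.

```python
import math


def bin_index(response, n_bins, lo, hi):
    """Map a response in [lo, hi] to a histogram bin, clamping outliers."""
    t = (response - lo) / (hi - lo)
    return min(n_bins - 1, max(0, int(t * n_bins)))


def learn_histogram(responses, n_bins, lo, hi, eps=1e-6):
    """Non-parametric density estimate: a smoothed, normalised histogram."""
    counts = [0] * n_bins
    for r in responses:
        counts[bin_index(r, n_bins, lo, hi)] += 1
    total = len(responses) + eps * n_bins
    return [(c + eps) / total for c in counts]


def log_likelihood_ratio(response, p_on, p_off, lo, hi):
    """log P(response | on edge) - log P(response | off edge)."""
    i = bin_index(response, len(p_on), lo, hi)
    return math.log(p_on[i] / p_off[i])


def is_edge(response, p_on, p_off, lo, hi, threshold=0.0):
    """Likelihood ratio test; the threshold trades misses for false alarms."""
    return log_likelihood_ratio(response, p_on, p_off, lo, hi) > threshold


def chernoff_information(p, q, steps=100):
    """Grid-search approximation of -min over lam of log sum_i p_i^lam q_i^(1-lam)."""
    best = 0.0  # the value at lam = 0, since q sums to one
    for k in range(steps + 1):
        lam = k / steps
        s = sum(pi ** lam * qi ** (1.0 - lam) for pi, qi in zip(p, q))
        best = min(best, math.log(s))
    return -best


# Made-up training responses standing in for pre-segmented image data.
on_edge = [0.8, 0.9, 0.7, 0.85, 0.95, 0.6]
off_edge = [0.1, 0.05, 0.2, 0.15, 0.3, 0.1]
p_on = learn_histogram(on_edge, 4, 0.0, 1.0)
p_off = learn_histogram(off_edge, 4, 0.0, 1.0)
print(is_edge(0.9, p_on, p_off, 0.0, 1.0), is_edge(0.1, p_on, p_off, 0.0, 1.0))
```

A larger Chernoff information between the on-edge and off-edge histograms indicates a cue that separates the two classes more easily; identical histograms give zero.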
What are textons
- International Journal of Computer Vision, 2002
Cited by 42 (15 self)
Textons refer to fundamental micro-structures in generic natural images and thus constitute the basic elements in early (preattentive) visual perception. However, the word “texton” remains a vague concept in the literature of computer vision and visual perception, and a precise mathematical definition has yet to be found. In this article, we argue that the definition of texton should be governed by a sound mathematical model of images, and the set of textons must be learned from, or best tuned to, an image ensemble. We adopt a generative image model in which an image is a superposition of bases from an over-complete dictionary; a texton is then defined as a mini-template that consists of a varying number of image bases with some geometric and photometric configurations. By analogy to physics, if image bases are like protons, neutrons and electrons, then textons are like atoms. A small number of textons can then be learned from training images as repeating micro-structures. We report four experiments for comparison. The first experiment computes clusters in feature space of filter responses. The second uses transformed component analysis in both feature space and image patches. The third adopts a two-layer generative model where an image is generated by image bases and image bases are generated by textons. The fourth experiment shows textons from motion image sequences, which we call movetons.
Statistical Modeling and Conceptualization of Visual Patterns
- 2003
Cited by 27 (3 self)
Natural images contain an overwhelming number of visual patterns generated by diverse stochastic processes. Defining and modeling these patterns is of fundamental importance for generic vision tasks, such as perceptual organization, segmentation, and recognition. The objective of this epistemological paper is to summarize various threads of research in the literature and to pursue a unified framework for conceptualization, modeling, learning, and computing visual patterns. This paper starts with reviewing four research streams: 1) the study of image statistics, 2) the analysis of image components, 3) the grouping of image elements, and 4) the modeling of visual patterns. The models from these research streams are then divided into four categories according to their semantic structures: 1) descriptive models, i.e., Markov random fields (MRF) or Gibbs, 2) variants of descriptive models (causal MRF and "pseudodescriptive" models), 3) generative models, and 4) discriminative models. The objectives, principles, theories, and typical models are reviewed in each category and the relationships between the four types of models are studied. Two central themes emerge from the relationship studies.
Human and ideal observers for detecting image curves
- Advances in Neural Information Processing Systems, 2004
Cited by 3 (0 self)
This paper compares the ability of human observers to detect target image curves with that of an ideal observer. The target curves are sampled from a generative model which specifies (probabilistically) the geometry and local intensity properties of the curve. The ideal observer performs Bayesian inference on the generative model using MAP estimation. Varying the probability model for the curve geometry enables us to investigate whether human performance is best for target curves that obey specific shape statistics, in particular those observed on natural shapes. Experiments are performed with data on both rectangular and hexagonal lattices. Our results show that human observers’ performance approaches that of the ideal observer and is, in general, closest to the ideal for conditions where the target curve tends to be straight or similar to natural statistics on curves. This suggests a bias of human observers towards straight curves and natural statistics.
A Model for Recognition of 3D Non-Dense Objects in Range Images
- 2001
This paper discusses a deformable template approach to the problem of recognising three dimensional, non-dense objects in high-resolution laser range images. To model the infinite variability in object appearance we develop an imaging model based on a Poisson object process, assuming objects to consist of primitives distributed according to a non-homogeneous Poisson point process. We discuss some computational aspects of the model, and show how we can use the Metropolis-adjusted Langevin Algorithm (MALA) to generate samples from the posterior distribution. We show results applying the model to real laser range images of forest.
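For readers unfamiliar with the Metropolis-adjusted Langevin Algorithm, the following is a minimal one-dimensional sketch. The Gaussian target is only a toy stand-in for the posterior over template parameters, and the step size, starting point, and function names are assumptions chosen for the example.

```python
import math
import random


def mala(log_density, grad_log_density, x0, step, n_samples, rng):
    """Draw samples with the Metropolis-adjusted Langevin Algorithm (1D).

    Proposal: x' = x + (step / 2) * grad log pi(x) + sqrt(step) * noise,
    followed by a Metropolis-Hastings accept/reject correction.
    """
    x = x0
    samples = []
    for _ in range(n_samples):
        mean_fwd = x + 0.5 * step * grad_log_density(x)
        proposal = mean_fwd + math.sqrt(step) * rng.gauss(0.0, 1.0)
        mean_rev = proposal + 0.5 * step * grad_log_density(proposal)
        # Log proposal densities; the shared normalising constant cancels.
        log_q_fwd = -((proposal - mean_fwd) ** 2) / (2.0 * step)
        log_q_rev = -((x - mean_rev) ** 2) / (2.0 * step)
        log_alpha = (log_density(proposal) - log_density(x)
                     + log_q_rev - log_q_fwd)
        if log_alpha >= 0.0 or rng.random() < math.exp(log_alpha):
            x = proposal
        samples.append(x)
    return samples


# Toy target (an assumption): a Gaussian with mean 2 and unit variance.
def toy_log_density(x):
    return -0.5 * (x - 2.0) ** 2


def toy_grad_log_density(x):
    return -(x - 2.0)


draws = mala(toy_log_density, toy_grad_log_density, 0.0, 0.5, 20000,
             random.Random(0))
kept = draws[1000:]  # discard burn-in
print(round(sum(kept) / len(kept), 2))
```

The gradient term pushes proposals towards regions of higher posterior density, while the accept/reject step corrects the discretisation error so that the chain still targets the intended distribution.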

