Results 1 -
1 of
1
Describing visual scenes using transformed dirichlet processes
- Advances in Neural Information Processing Systems 18
, 2005
"... Motivated by the problem of learning to detect and recognize objects with minimal supervision, we develop a hierarchical probabilistic model for the spatial structure of visual scenes. In contrast with most existing models, our approach captures the intrinsic uncertainty in the number and identity o ..."
Abstract
-
Cited by 47 (6 self)
- Add to MetaCart
Motivated by the problem of learning to detect and recognize objects with minimal supervision, we develop a hierarchical probabilistic model for the spatial structure of visual scenes. In contrast with most existing models, our approach captures the intrinsic uncertainty in the number and identity of objects depicted in a given image. Our scene model is based on the transformed Dirichlet process (TDP), a novel extension of the hierarchical DP in which a set of stochastically transformed mixture components are shared between multiple groups of data. For visual scenes, mixture components describe the spatial structure of visual features in an object–centered coordinate frame, while transformations model the object positions in a particular image. Learning and inference in the TDP, which has many potential applications beyond computer vision, is based on an empirically effective Gibbs sampler. Applied to a dataset of partially labeled street scenes, we show that the TDP’s inclusion of spatial structure improves detection performance, and allows unsupervised discovery of object categories. 1

