Image Parsing: Unifying Segmentation, Detection, and Recognition
2005
Abstract

In this paper we present a Bayesian framework for parsing images into their constituent visual patterns. The parsing algorithm optimizes the posterior probability and outputs a scene representation in a "parsing graph", in a spirit similar to parsing sentences in speech and natural language. The algorithm constructs the parsing graph and reconfigures it dynamically using a set of reversible Markov chain jumps. This computational framework integrates two popular inference approaches: generative (top-down) methods and discriminative (bottom-up) methods. The former formulates the posterior probability in terms of generative models for images defined by likelihood functions and priors. The latter computes discriminative probabilities based on a sequence (cascade) of bottom-up tests/filters.
Filters, Random Fields and Maximum Entropy . . .
International Journal of Computer Vision, 1998
Abstract

This article presents a statistical theory for texture modeling. This theory combines filtering theory and Markov random field modeling through the maximum entropy principle, and interprets and clarifies many previous concepts and methods for texture analysis and synthesis from a unified point of view. Our theory characterizes the ensemble of images I with the same texture appearance by a probability distribution f(I) on a random field, and the objective of texture modeling is to make inference about f(I), given a set of observed texture examples. In our theory, texture modeling consists of two steps. (1) A set of filters is selected from a general filter bank to capture features of the texture; these filters are applied to observed texture images, and the histograms of the filtered images are extracted. These histograms are estimates of the marginal distributions of f(I). This step is called feature extraction. (2) The maximum entropy principle is employed to derive a distribution p(I), which is restricted to have the same marginal distributions as those in (1). This p(I) is considered as an estimate of f(I). This step is called feature fusion. A stepwise algorithm is proposed to choose filters from a general filter bank. The resulting model, called FRAME (Filters, Random fields And Maximum Entropy), is a Markov random field (MRF) model, but with a much enriched vocabulary and hence much stronger descriptive ability than the previous MRF models used for texture modeling. The Gibbs sampler is adopted to synthesize texture images by drawing typical samples from p(I); thus the model is verified by seeing whether the synthesized texture images have similar visual appearances
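As a rough illustration of step (1), the sketch below applies one small filter to an image and returns the normalized histogram of its responses. The function name filter_response_histogram, the clamped equal-width binning, and the horizontal difference kernel are assumptions made for this example; they are not details taken from the article.

```python
def filter_response_histogram(image, kernel, lo, hi, n_bins):
    """Normalized histogram of 2-D filter responses (valid region, correlation form)."""
    kh, kw = len(kernel), len(kernel[0])
    rows, cols = len(image), len(image[0])
    counts = [0] * n_bins
    for r in range(rows - kh + 1):
        for c in range(cols - kw + 1):
            # Response of the filter at position (r, c).
            response = sum(
                kernel[i][j] * image[r + i][c + j]
                for i in range(kh)
                for j in range(kw)
            )
            # Map the response to an equal-width bin, clamping out-of-range values.
            position = int((response - lo) / (hi - lo) * n_bins)
            counts[min(max(position, 0), n_bins - 1)] += 1
    total = sum(counts)
    return [count / total for count in counts]


# Hypothetical test image: vertical stripes, probed with a horizontal difference filter.
stripes = [[0, 1, 0, 1, 0, 1] for _ in range(4)]
flat = [[1, 1, 1, 1, 1, 1] for _ in range(4)]
difference_kernel = [[-1, 1]]

stripe_hist = filter_response_histogram(stripes, difference_kernel, -1.5, 1.5, 3)
flat_hist = filter_response_histogram(flat, difference_kernel, -1.5, 1.5, 3)
```

The striped image puts all of its mass in the two outer bins, while the flat image puts everything in the central bin; in this sense the histogram acts as an estimate of a marginal distribution that distinguishes the two textures.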
Robust Analysis of Feature Spaces: Color Image Segmentation
1997
Abstract

A general technique for the recovery of significant image features is presented. The technique is based on the mean shift algorithm, a simple nonparametric procedure for estimating density gradients. Drawbacks of the current methods (including robust clustering) are avoided. Feature space of any nature can be processed, and as an example, color image segmentation is discussed. The segmentation is completely autonomous; only its class is chosen by the user. Thus, the same program can produce a high quality edge image, or provide, by extracting all the significant colors, a preprocessor for content-based query systems. A 512 x 512 color image is analyzed in less than 10 seconds on a standard workstation. Gray level images are handled as color images having only the lightness coordinate.
Mean Shift Analysis and Applications
1999
Abstract

A nonparametric estimator of density gradient, the mean shift, is employed in the joint, spatial-range (value) domain of gray level and color images for discontinuity preserving filtering and image segmentation. Properties of the mean shift are reviewed and its convergence on lattices is proven. The proposed filtering method associates with each pixel in the image the closest local mode in the density distribution of the joint domain. Segmentation into a piecewise constant structure requires only one more step, fusion of the regions associated with nearby modes. The proposed technique has two parameters controlling the resolution in the spatial and range domains. Since convergence is guaranteed, the technique does not require the intervention of the user to stop the filtering at the desired image quality. Several examples, for gray and color images, show the versatility of the method and compare favorably with results described in the literature for the same images.
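The mode-seeking idea can be sketched in one dimension: each iteration moves a point to the kernel-weighted mean of the samples around it until it stops moving. This is a simplified sketch, not the joint spatial-range procedure of the paper; the Gaussian kernel, the function name mean_shift_mode, and the sample values are assumptions chosen for illustration.

```python
import math


def mean_shift_mode(points, start, bandwidth, tol=1e-6, max_iter=500):
    """Follow mean shift steps from `start` to a local density mode (1-D)."""
    x = float(start)
    for _ in range(max_iter):
        # Gaussian kernel weights of all samples relative to the current position.
        weights = [math.exp(-0.5 * ((p - x) / bandwidth) ** 2) for p in points]
        total = sum(weights)
        # The weighted mean is the current position plus the mean shift vector.
        new_x = sum(w * p for w, p in zip(weights, points)) / total
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Hypothetical samples with two clusters, near 1 and near 5.
samples = [1.0, 1.1, 0.9, 1.05, 5.0, 5.2, 4.9, 5.1]
low_mode = mean_shift_mode(samples, start=0.5, bandwidth=0.5)
high_mode = mean_shift_mode(samples, start=6.0, bandwidth=0.5)
```

Starting points on either side converge to different modes, which is the property the filtering step relies on when it associates each pixel with its closest local mode. The bandwidth plays the role of a resolution parameter.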
Prior Learning and Gibbs Reaction-Diffusion
1997
Abstract

This article addresses two important themes in early visual computation: first, it presents a novel theory for learning the universal statistics of natural images (a prior model for typical cluttered scenes of the world) from a set of natural images; second, it proposes a general framework of designing reaction-diffusion equations for image processing. We start by studying the statistics of natural images including the scale invariant properties, then generic prior models were learned to duplicate the observed statistics, based on the minimax entropy theory studied in two previous papers. The resulting Gibbs distributions have potentials of the form $U(I; \Lambda, S) = \sum_{\alpha=1}^{K} \sum_{(x,y)} \lambda^{(\alpha)}\big((F^{(\alpha)} * I)(x, y)\big)$ with $S = \{F^{(1)}, \ldots, F^{(K)}\}$ being a set of filters and $\Lambda = \{\lambda^{(1)}(\cdot), \ldots, \lambda^{(K)}(\cdot)\}$ the potential functions. The learned Gibbs distributions confirm and improve the form of existing prior models such as line-process, but in contrast to all previous models, inverted potentials (i.e. $\lambda(x)$ decreasing as a function of $|x|$) were found to be necessary. We find that the partial differential equations given by gradient descent on $U(I; \Lambda, S)$ are essentially reaction-diffusion equations, where the usual energy terms produce anisotropic diffusion while the inverted energy terms produce reaction associated with pattern formation, enhancing preferred image features. We illustrate how these models can be used for texture pattern rendering, denoising, image enhancement and clutter removal by careful choice of both prior and data models of this type, incorporating the appropriate features. Song Chun Zhu is now with the Computer Science Department, Stanford University, Stanford, CA 94305, and David Mumford is with the Division of Applied Mathematics, Brown University, Providence, RI 02912. This work started when the authors were at ...
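To make the link between gradient descent on an energy U and diffusion concrete, here is a minimal 1-D sketch. It assumes a single forward-difference filter and a quadratic potential, which yields plain linear diffusion only; the learned potentials of the paper, and the inverted potentials that produce reaction terms, are not modeled. The names energy and gradient_step and the step size are hypothetical.

```python
def energy(signal):
    """U(I) = sum_i 0.5 * (I[i+1] - I[i])**2 (assumed quadratic potential)."""
    return sum(
        0.5 * (signal[i + 1] - signal[i]) ** 2 for i in range(len(signal) - 1)
    )


def gradient_step(signal, step=0.2):
    """One explicit gradient descent step on U; equivalent to discrete diffusion."""
    grad = [0.0] * len(signal)
    for i in range(len(signal) - 1):
        diff = signal[i + 1] - signal[i]  # filter response at position i
        grad[i] -= diff                   # contribution to dU/dI[i]
        grad[i + 1] += diff               # contribution to dU/dI[i+1]
    return [value - step * g for value, g in zip(signal, grad)]


# Hypothetical noisy signal: an alternating pattern that diffusion smooths out.
noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
smoothed = gradient_step(noisy)
```

Each interior sample moves toward the average of its neighbors, so the energy drops while the total intensity is preserved, as expected of a diffusion process.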
User-guided 3D active contour segmentation of anatomical structures: Significantly improved efficiency and reliability
NeuroImage, 2006
Abstract

Active contour segmentation and its robust implementation using level set methods are well-established theoretical approaches that have been studied thoroughly in the image analysis literature. Despite the existence of these powerful segmentation methods, the needs of clinical research continue to be fulfilled, to a large extent, using slice-by-slice manual tracing. To bridge the gap between methodological advances and clinical routine, we developed an open source application called ITK-SNAP, which is intended to make level set segmentation easily accessible to a wide range of users, including those with little or no mathematical expertise. This paper describes the methods and software engineering philosophy behind this new tool and provides the results of validation experiments performed in the context of an ongoing child autism neuroimaging study. The validation establishes SNAP intrarater and interrater reliability and overlap error statistics for the caudate nucleus and finds that SNAP is a highly reliable and efficient alternative to manual tracing. Analogous results for lateral ventricle segmentation are provided.
Interactive learning using a "society of models"
Submitted to Special Issue of Pattern Recognition on Image Database: Classification and Retrieval
Abstract

Digital library access is driven by features, but features are often context-dependent and noisy, and their relevance for a query is not always obvious. This paper describes an approach for utilizing many data-dependent, user-dependent, and task-dependent features in a semi-automated tool. Instead of requiring universal similarity measures or manual selection of relevant features, the approach provides a learning algorithm for selecting and combining groupings of the data, where groupings can be induced by highly specialized and context-dependent features. The selection process is guided by a rich example-based interaction with the user. The inherent combinatorics
A review of statistical approaches to level set segmentation: Integrating color, texture, motion and shape
International Journal of Computer Vision, 2007
Abstract

Since their introduction as a means of front propagation and their first application to edge-based segmentation in the early 90's, level set methods have become increasingly popular as a general framework for image segmentation. In this paper, we present a survey of a specific class of region-based level set segmentation methods and clarify how they can all be derived from a common statistical framework. Region-based segmentation schemes aim at partitioning the image domain by progressively fitting statistical models to the intensity, color, texture or motion in each of a set of regions. In contrast to edge-based schemes such as the classical Snakes, region-based methods tend to be less sensitive to noise. For typical images, the respective cost functionals tend to have fewer local minima, which makes them particularly well-suited for local optimization methods such as the level set method. We detail a general statistical formulation for level set segmentation. Subsequently, we clarify how the integration of various low level criteria leads to a set of cost functionals and point out relations between the different segmentation schemes. In experimental results, we demonstrate how the level set function is driven to partition the image plane into domains of coherent color, texture, dynamic texture or motion. Moreover, the Bayesian formulation allows prior shape knowledge to be introduced into the level set method. We briefly review a number of advances in this domain.
Comparison of texture features based on Gabor filters
IEEE Trans. on Image Processing
Abstract

Texture features that are based on the local power spectrum obtained by a bank of Gabor filters are compared. The features differ in the type of nonlinear post-processing which is applied to the local power spectrum. The following features are considered: Gabor energy, complex moments, and grating cell operator features. The capability of the corresponding operators to produce distinct feature vector clusters for different textures is compared using two methods: the Fisher criterion and the classification result comparison. Both methods give consistent results. The grating cell operator gives the best discrimination and segmentation results. The texture detection capabilities of the operators and their robustness to non-texture features are also compared. The grating cell operator is the only one that selectively responds only to texture and does not give false response to non-texture features such as object contours. Index Terms: Classification, complex moments, discrimination,
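Of the features compared, Gabor energy is the simplest to sketch: the responses of an even (cosine) and an odd (sine) Gabor filter with the same envelope are combined as the square root of the sum of their squares. The 1-D version below is an illustrative assumption, with a hypothetical function name and parameter values; the paper itself works with 2-D filter banks.

```python
import math


def gabor_energy_1d(signal, center, wavelength, sigma):
    """Gabor energy at `center`: magnitude of a quadrature pair of filter responses."""
    radius = int(3 * sigma)  # truncate the Gaussian envelope at three sigma
    even = 0.0
    odd = 0.0
    for k in range(-radius, radius + 1):
        idx = center + k
        if 0 <= idx < len(signal):
            envelope = math.exp(-0.5 * (k / sigma) ** 2)
            phase = 2.0 * math.pi * k / wavelength
            even += signal[idx] * envelope * math.cos(phase)  # symmetric filter
            odd += signal[idx] * envelope * math.sin(phase)   # antisymmetric filter
    return math.sqrt(even * even + odd * odd)


# Hypothetical 1-D "texture": a sinusoid with a period of 8 samples.
texture = [math.sin(2.0 * math.pi * i / 8.0) for i in range(200)]
matched = gabor_energy_1d(texture, center=100, wavelength=8.0, sigma=6.0)
mismatched = gabor_energy_1d(texture, center=100, wavelength=20.0, sigma=6.0)
shifted = gabor_energy_1d(texture, center=101, wavelength=8.0, sigma=6.0)
```

The energy is large when the filter wavelength matches the texture period and small otherwise, and it changes little when the window is shifted by one sample, which is why it is used as a phase-insensitive estimate of the local power spectrum.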
A statistical approach to snakes for bimodal and trimodal imagery
in Proc. Int. Conf. Computer Vision, 1999
Abstract

In this paper, we describe a new region-based approach to active contours for segmenting images composed of two or three types of regions characterizable by a given statistic. The essential idea is to derive curve evolutions which separate two or more values of a predetermined set of statistics computed over geometrically determined subsets of the image. Both global and local image information is used to evolve the active contour. Image derivatives, however, are avoided, thereby giving rise to a further degree of noise robustness compared to most edge-based snake algorithms.
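The idea of separating region statistics without image derivatives can be illustrated with a discrete 1-D analog. The sketch assumes that the statistic is the region mean and that separation is measured by the squared difference of the two means; it searches all split points exhaustively instead of evolving a curve, so it is a stand-in for the idea and not the paper's curve evolution. The function name best_bimodal_split and the data are hypothetical.

```python
def best_bimodal_split(signal):
    """Return the split index that maximizes the squared difference of region means."""
    best_index = None
    best_score = -1.0
    for k in range(1, len(signal)):
        inside = signal[:k]
        outside = signal[k:]
        mean_inside = sum(inside) / len(inside)
        mean_outside = sum(outside) / len(outside)
        # Separation of the two region statistics; no derivatives of the signal are used.
        score = (mean_inside - mean_outside) ** 2
        if score > best_score:
            best_index, best_score = k, score
    return best_index, best_score


# Hypothetical bimodal signal: a dark region followed by a bright region.
dark = [0.1, 0.0, 0.2, 0.1, 0.0, 0.1]
bright = [0.9, 1.0, 0.8, 1.0, 0.9, 1.1]
split_index, separation = best_bimodal_split(dark + bright)
```

Because the score depends only on averages over each region, isolated noisy samples have limited influence on where the boundary is placed, which reflects the noise robustness the abstract attributes to avoiding image derivatives.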