Grounded semantic composition for visual scenes
Journal of Artificial Intelligence Research, 2004
Cited by 105 (24 self)
We present a visually-grounded language understanding model based on a study of how people verbally describe objects in scenes. The emphasis of the model is on the combination of individual word meanings to produce meanings for complex referring expressions. The model has been implemented, and it is able to understand a broad range of spatial referring expressions. We describe our implementation of word-level visually-grounded semantics and their embedding in a compositional parsing framework. The implemented system selects the correct referents in response to natural language expressions for a large percentage of test cases. In an analysis of the system’s successes and failures, we reveal how visual context influences the semantics of utterances and propose future extensions to the model that take such context into account.
Extracting Meaningful Curves From Images
Journal of Mathematical Imaging and Vision, 2003
Cited by 40 (8 self)
Since the beginning, Mathematical Morphology has proposed to extract shapes from images as connected components of level sets. These methods have proved very efficient in shape recognition and shape analysis. In this paper, we present an improved method to select the most meaningful level lines (boundaries of level sets) from an image. This extraction can be based on statistical arguments, leading to a parameter-free algorithm. It makes it possible to roughly extract all pieces of level lines of an image that coincide with pieces of edges. With this method, the number of encoded level lines is reduced by a factor of 100, without any loss of shape content. In contrast to edge detection algorithms or snake methods, such a level line selection method delivers accurate shape elements without user parameters: no smoothing is involved, and the selection parameters can be computed by the Helmholtz principle.
Adaptive Multiscale Detection of Filamentary Structures Embedded in a Background of Uniform Random Points
2003
Cited by 22 (6 self)
We are given a set of $n$ points that appears uniformly distributed in the unit square $[0, 1]^2$. We wish to test whether the set actually is generated from a non-uniform distribution having a small fraction of points concentrated on some (a priori unknown) curve with $C^\alpha$-norm bounded by $\beta$. An asymptotic detection threshold exists in this problem; for a constant $T_-(\alpha, \beta) > 0$, if the number of points on the curve is smaller than $T_-(\alpha, \beta)\, n^{1/(1+\alpha)}$, reliable detection is not possible for large $n$. We describe a Multiscale Significant-Runs Algorithm; it can reliably detect concentration of data near a smooth curve, without knowing the smoothness information $\alpha$ or $\beta$ in advance, provided that the number of points on the curve exceeds $T_*(\alpha, \beta)\, n^{1/(1+\alpha)}$. This algorithm therefore has an optimal detection threshold, up to a factor $T_*/T_-$. At the heart of our approach is an analysis of the data by counting membership in multiscale multianisotropic strips. The strips have an area of $C/n$ and exhibit a variety of lengths, orientations and anisotropies. The strips are partitioned into anisotropy classes; each class is organized as a directed graph whose vertices are strips all of the same anisotropy and whose edges link such strips to their ‘good continuations’. The point cloud data are reduced to counts measuring membership in strips. Each anisotropy graph is reduced to a subgraph consisting of strips with ‘significant’ counts. The algorithm rejects $H_0$ whenever some such subgraph contains a path connecting many consecutive ‘significant’ counts.
A statistical approach to the matching of local features
SIAM Journal on Imaging Sciences, 2009
Cited by 17 (6 self)
This paper focuses on the matching of local features between images. Given a set of query descriptors and a database of candidate descriptors, the goal is to decide which ones should be matched. This is a crucial issue, since the matching procedure is often a preliminary step for object detection or image matching. In practice, this matching step is often reduced to a specific threshold on the Euclidean distance to the nearest neighbor. Our first contribution is a robust distance between descriptors, relying on the adaptation of the Earth Mover’s Distance to circular histograms. It is shown that this distance outperforms classical distances for comparing SIFT-like descriptors, while its time complexity remains reasonable. Our second and main contribution is a statistical framework for the matching procedure, which yields validation thresholds automatically adapted to the complexity of each query descriptor and to the diversity and size of the database. The method makes it possible to detect multiple occurrences, as well as to deal with situations where the target is not present. Its performance is tested through various experiments on a large image database.
Key words. Statistical analysis of matching processes, local feature matching, dissimilarity measure, Earth Mover’s Distance, a contrario.
AMS subject classifications. 62H35, 68T45, 68T10
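The abstract above does not give the formula for its circular Earth Mover's Distance, so the following is only a minimal sketch under stated assumptions: both histograms are normalized to unit mass, and the ground distance between two bins is the number of bin steps along the shorter arc of the circle. Under those assumptions, the distance equals the sum of absolute deviations of the cumulative difference from its median. The function name `circular_emd` is hypothetical and is not taken from the paper.

```python
from itertools import accumulate


def circular_emd(hist_a, hist_b):
    """Earth Mover's Distance between two circular histograms.

    Assumptions (not from the abstract): both histograms are normalized
    to unit mass, and moving mass by one bin along the circle costs 1.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("histograms must have positive total mass")

    # Per-bin difference of the normalized histograms.
    diffs = [a / total_a - b / total_b for a, b in zip(hist_a, hist_b)]

    # Cumulative difference: the flow crossing each bin boundary when
    # the circle is cut open after the last bin.
    cumulative = list(accumulate(diffs))

    # On a circle a constant flow can be added around the loop for free;
    # a median of the cumulative differences minimizes the total cost.
    offset = sorted(cumulative)[len(cumulative) // 2]
    return sum(abs(c - offset) for c in cumulative)
```

For example, with four bins, `circular_emd([1, 0, 0, 0], [0, 0, 0, 1])` treats the first and last bins as neighbors, whereas a non-circular EMD would charge three bin steps for the same pair.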
Probabilistic Parameter-Free Motion Detection
IEEE Conf. Computer Vision and Pattern Recognition, CVPR’04, Washington DC, 2004
Cited by 14 (3 self)
We propose an original probabilistic parameter-free method for the detection of independently moving objects in an image sequence. We apply a probabilistic perceptual principle, the Helmholtz principle, whose main advantage is that it automates the detection decision by providing tight control of the number of false alarms. Not only does this method localize the moving objects, but it also answers the preliminary question of whether motion is present. In particular, the method works even when no assumption on motion presence is made. The algorithm is composed of three independent steps: estimation of the dominant image motion, spatial segmentation of object boundaries, and the independent motion detection itself. We emphasize that none of these steps needs any parameter tuning. Results on real image sequences are reported and validate the proposed approach.
Appearance-guided Synthesis of Element Arrangements by Example
2009
Cited by 12 (2 self)
In our case, we aim at producing new arrangements of richer ele-
Document Image Analysis by Probabilistic Network and Circuit Diagram Extraction
Informatica, An International Journal of Computing and Informatics, 2005
Cited by 6 (2 self)
The paper presents a hierarchical object recognition system for document processing. It is based on a spatial tree structure representation and a Bayesian framework. The image components are built up from lower-level image components stored in a library. The tree representations of the objects are assembled from these components. A probabilistic framework is used in order to obtain robust behaviour. The method is able to convert general circuit diagrams to their components and store them in a hierarchical data structure. The paper presents simulations of extracting the components of sample circuit diagrams.
Image segmentation by a contrario simulation
2008
Cited by 6 (0 self)
Segmenting an image into homogeneous regions generally involves a decision criterion to establish whether two adjacent regions are similar. Decisions should be adaptive in order to yield robust and accurate segmentation algorithms, avoid hazardous a priori assumptions, and have a clear interpretation. We propose a decision process based on a contrario reasoning: two regions are meaningfully different if the probability of observing such a difference in pure noise is very low. Since the existing analytical methods are intractable in our case, we extend them to allow a mixed use of analytical computations and Monte-Carlo simulations. The resulting decision criterion is tested experimentally through a simple merging algorithm, which can be used as a post-filtering and validation step for existing segmentation methods.
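The a contrario decision described above can be illustrated with a small Monte-Carlo sketch. This is not the paper's criterion: it assumes that the "difference" is the absolute difference of region means and that "pure noise" is modeled by randomly reassigning the pooled pixel values to the two regions. The names `monte_carlo_p_value`, `meaningfully_different`, `n_tests`, and `epsilon` are hypothetical.

```python
import random
from statistics import mean


def monte_carlo_p_value(region_a, region_b, n_trials=2000, seed=0):
    """Estimate the probability that noise alone yields a mean
    difference at least as large as the observed one.

    Assumed noise model (not from the abstract): pixel values are
    exchangeable between the two regions, simulated by shuffling.
    """
    rng = random.Random(seed)
    observed = abs(mean(region_a) - mean(region_b))
    pooled = list(region_a) + list(region_b)
    n_a = len(region_a)

    hits = 0
    for _ in range(n_trials):
        rng.shuffle(pooled)
        simulated = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if simulated >= observed:
            hits += 1

    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_trials + 1)


def meaningfully_different(region_a, region_b, n_tests, epsilon=1.0):
    """Declare the regions different when the expected number of false
    alarms over n_tests region pairs stays below epsilon."""
    return n_tests * monte_carlo_p_value(region_a, region_b) <= epsilon
```

Multiplying the estimated probability by the number of tested region pairs bounds the expected number of false detections, which is the usual way an a contrario threshold is set; the choice of `epsilon` here is illustrative only.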
A contrario matching of SIFT-like descriptors
Cited by 6 (1 self)
This paper studies the matching of SIFT-like features [5] between images. The goal is to decide which matches between descriptors of two datasets should be selected. This matching procedure is often a preliminary step in computer vision applications such as object detection and image registration. Once the distances between the query descriptors and the database candidates have been computed, the classical approach is to select the nearest neighbor of each query, subject to a global threshold on the dissimilarity measure. In this contribution, an a contrario framework for the matching procedure is introduced, based on a threshold on a probability of false detections. This approach yields dissimilarity thresholds automatically adapted to each query descriptor and to the diversity and size of the database. Experiments on a large image database show that the method can decide whether a query and its candidates should be matched.