Statistical Models for Images: Compression, Restoration and Synthesis
- In 31st Asilomar Conf on Signals, Systems and Computers
, 1997
Cited by 116 (31 self)
In this paper, we examine the problem of decomposing digitized images, through linear and/or nonlinear transformations, into statistically independent components. The classical approach to such a problem is Principal Components Analysis (PCA), also known as the Karhunen-Loeve (KL) or Hotelling transform. This is a linear transform that removes second-order dependencies between input pixels. The most well-known description of image statistics is that their power spectra take the form of a power law [e.g., 20, 11, 24]. Coupled with a constraint of translation invariance, this suggests that the Fourier transform is an appropriate PCA representation. Fourier and related representations are widely used in image processing applications.
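The link between translation invariance and the Fourier basis can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes a 1-D signal on a periodic grid, so that the covariance matrix is circulant, and it uses a made-up autocovariance sequence (`AUTOCOV`). All function names are hypothetical. Under those assumptions, each discrete Fourier vector is an eigenvector of the covariance, and the eigenvalues (the PCA variances) are the power spectrum.

```python
import cmath
import math


def circulant_covariance(autocov):
    """Covariance of a translation-invariant signal on a periodic grid:
    entry (i, j) depends only on the offset (i - j) mod N."""
    n = len(autocov)
    return [[autocov[(i - j) % n] for j in range(n)] for i in range(n)]


def fourier_vector(k, n):
    """Unit-norm discrete Fourier basis vector with frequency index k."""
    return [cmath.exp(2j * math.pi * k * x / n) / math.sqrt(n) for x in range(n)]


def power_spectrum(autocov):
    """DFT of the autocovariance; real-valued when autocov is symmetric."""
    n = len(autocov)
    return [
        sum(autocov[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n)).real
        for k in range(n)
    ]


def eigen_residual(cov, vec, eigenvalue):
    """Largest entry of |C v - lambda v|; near zero when v is an eigenvector."""
    return max(
        abs(sum(c * v for c, v in zip(row, vec)) - eigenvalue * vi)
        for row, vi in zip(cov, vec)
    )


N = 8
# Hypothetical symmetric autocovariance (illustrative values only).
AUTOCOV = [0.5 ** min(m, N - m) for m in range(N)]
COV = circulant_covariance(AUTOCOV)
SPECTRUM = power_spectrum(AUTOCOV)
# One residual per frequency: each should be at floating-point noise level.
RESIDUALS = [eigen_residual(COV, fourier_vector(k, N), SPECTRUM[k]) for k in range(N)]
```

Real images are neither periodic nor exactly stationary, so this only illustrates why the Fourier basis is the natural PCA candidate under the stated idealization.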
Independent Component Analysis Of Natural Image Sequences Yields Spatiotemporal Filters Similar To Simple Cells In Primary Visual Cortex
- PROC. R. SOC. LOND. B
, 1998
Bayesian Multi-Scale Differential Optical Flow
, 1999
Cited by 31 (0 self)
Introduction
14.2 Differential formulation
14.3 Uncertainty Model
14.4 Coarse-to-fine estimation
14.5 Implementation issues
14.5.1 Derivative filter kernels
14.5.2 Averaging filter kernels
14.5.3 Multi-scale warping
14.5.4 Boundary handling
14.6 Examples
14.6.1 Performance measures
14.6.2 Synthetic sequences
14.7 Conclusion
14.8 References
Temporal dynamics of motion integration for the initiation of tracking eye movements at ultra-short latencies
- Visual Neuroscience
, 2000
Cited by 9 (1 self)
The perceived direction of a grating moving behind an elongated aperture is biased towards the aperture’s long axis. This “barber pole” illusion is a consequence of integrating one-dimensional (1D) or grating and two-dimensional (2D) or terminator motion signals. In humans, we recorded the ocular following responses to this stimulus. Tracking was always initiated at ultra-short latencies (~85 ms) in the direction of grating motion. With elongated apertures, a later component was initiated 15–20 ms later in the direction of the terminator motion signals along the aperture’s long axis. Amplitude of the later component was dependent upon the aperture’s aspect ratio. Mean tracking direction at the end of the trial (135–175 ms after stimulus onset) was between the directions of the vector sum computed by integrating either terminator motion signals only or both grating and terminator motion signals. Introducing an elongated mask at the center of the “barber pole” did not affect the latency difference between early and later components, indicating that this latency shift was not due to foveal versus peripheral locations of 1D and 2D motion signals. Increasing the size of the foveal mask up to 90% of the stimulus area selectively reduced the strength of the grating motion signals and, consequently, the amplitude of the early component. Conversely, reducing the contrast of, or indenting, the aperture’s edges selectively reduced the strength of terminator motion signals and, consequently, the amplitude of the later component. Latencies were never affected by these manipulations. These results tease
Probabilistic Multichannel Optical Flow Analysis, Based on a Multipurpose Visual Representation of Image Sequences
- in IS&T/SPIE 11th International Symposium on Electronic Imaging’99
, 1999
Cited by 7 (3 self)
Directional filters are not normally used as pre-filters for optical flow estimation because orientation selectivity tends to increase the aperture problem. Despite this fact, here we apply a subband decomposition using directional spatio-temporal filters to discriminate multiple motions at the same location. We first obtain multiple estimates of the velocity by applying the classic gradient constraint to the output of each filter (a bank of 6 directional second-order Gaussian derivatives, GD2, at 3 spatial scales). Spatio-temporal gradients of GD2 channel responses are easily obtained as linear combinations of the set of 10 separable GD3 channel responses, which constitutes a multipurpose scheme for visual representation of image sequences. Then, we obtain an overdetermined linear system by imposing local constant velocity. This system is solved by least squares, yielding an estimate of the velocity and its covariance matrix (a 2D confidence measure). After segmenting the resulting 6×3 velocity estimates (grouping together those estimates whose Mahalanobis distance is below a given threshold), we combine them using Bayesian probability rules. Segmentation maintains the ability to represent multiple motions while combination reduces the aperture problem. Results for synthetic and real sequences are highly satisfactory. Mean errors in complex standard sequences are below those provided by most published methods.
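The estimation steps named in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the filter bank is omitted, the input is a plain list of hypothetical (Ix, Iy, It) gradient samples from one neighbourhood, the `noise_floor` parameter is an added assumption that keeps the covariance invertible, the Mahalanobis distance uses the sum of the two covariances (the abstract does not say which covariance is used), and "Bayesian combination" is assumed here to mean the product of Gaussian estimates.

```python
def inv2(m):
    """Invert a 2x2 matrix, rejecting (near-)singular input."""
    (a, b), (c, d) = m
    det = a * d - b * c
    scale = max(abs(a * d), abs(b * c))
    if scale == 0 or abs(det) <= 1e-12 * scale:
        raise ValueError("singular system: gradients share a single orientation")
    return [[d / det, -b / det], [-c / det, a / det]]


def solve_gradient_constraint(gradients, noise_floor=1e-6):
    """Least-squares fit of Ix*u + Iy*v + It = 0 over one neighbourhood,
    assuming locally constant velocity. Returns ((u, v), covariance)."""
    sxx = sum(ix * ix for ix, iy, it in gradients)
    sxy = sum(ix * iy for ix, iy, it in gradients)
    syy = sum(iy * iy for ix, iy, it in gradients)
    sxt = sum(ix * it for ix, iy, it in gradients)
    syt = sum(iy * it for ix, iy, it in gradients)
    ninv = inv2([[sxx, sxy], [sxy, syy]])  # inverse of the normal matrix
    u = -(ninv[0][0] * sxt + ninv[0][1] * syt)
    v = -(ninv[1][0] * sxt + ninv[1][1] * syt)
    residual = sum((ix * u + iy * v + it) ** 2 for ix, iy, it in gradients)
    # Residual variance plus an assumed floor, scaled by the inverse normal matrix.
    sigma2 = residual / max(len(gradients) - 2, 1) + noise_floor
    return (u, v), [[sigma2 * x for x in row] for row in ninv]


def mahalanobis_sq(est_a, est_b):
    """Squared Mahalanobis distance between two estimates, using the sum
    of their covariances (an assumption of this sketch)."""
    (ua, va), ca = est_a
    (ub, vb), cb = est_b
    w = inv2([[ca[r][c] + cb[r][c] for c in range(2)] for r in range(2)])
    d = (ua - ub, va - vb)
    return sum(d[r] * w[r][c] * d[c] for r in range(2) for c in range(2))


def combine(estimates):
    """Fuse Gaussian estimates by precision weighting (product of Gaussians)."""
    prec = [[0.0, 0.0], [0.0, 0.0]]
    info = [0.0, 0.0]
    for (u, v), cov in estimates:
        w = inv2(cov)
        for r in range(2):
            info[r] += w[r][0] * u + w[r][1] * v
            for c in range(2):
                prec[r][c] += w[r][c]
    cov = inv2(prec)
    mean = tuple(cov[r][0] * info[0] + cov[r][1] * info[1] for r in range(2))
    return mean, cov
```

As a usage example, noise-free gradients generated from a velocity of (1.0, -0.5), such as `[(1, 0, -1.0), (0, 1, 0.5), (1, 1, -0.5), (2, -1, -2.5)]`, should return that velocity from `solve_gradient_constraint`; gradients that all share one orientation raise `ValueError`, which is the aperture problem in its simplest form.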
End-stopping and the aperture problem: Two-dimensional motion signals in macaque V1
- Neuron
, 2003
Cited by 7 (0 self)
ble first step toward measuring velocity would be to filter out long contours, and to respond selectively to endpoints. Such selectivity for endpoints was first observed by Hubel and Wiesel (1965), who identified it
Time-Recursive Velocity-Adapted Spatio-Temporal Scale-Space Filters
- In Proc. ECCV, volume 2350 of LNCS
, 2002
Cited by 6 (6 self)
This paper presents a theory for constructing and computing velocity-adapted scale-space filters for spatio-temporal image data.
Statistical models of images and early vision
- Proceedings of the Int. Symposium on Adaptive Knowledge Representation and Reasoning (AKRR2005)
, 2005
Cited by 3 (0 self)
A fundamental question in visual neuroscience is: Why are the receptive fields and response properties of visual neurons as they are? A modern approach to this problem emphasizes the importance of adaptation to ecologically valid input. In this paper, we will review work on modelling statistical regularities in ecologically valid visual input (“natural images”) and the obtained functional explanation of the properties of visual neurons. A seminal statistical model for natural images was linear sparse coding which is equivalent to the model called independent component analysis (ICA). Linear features estimated by ICA resemble wavelets or Gabor functions, and provide a very good description of the properties of simple cells in the primary visual cortex. We have introduced extensions of ICA that are based on modelling dependencies of the “independent” components estimated by basic ICA. The dependencies of the components are used to define either a grouping or a topographic order between the components. With natural image data, these models lead to emergence of further properties of visual neurons: the topographic organization and complex cell receptive fields. We have also modelled the temporal structure of natural image sequences, which provides an alternative approach to the sparseness used in most models. These models can be combined in a unifying framework that we call bubble coding. Finally, we will discuss a promising new direction of research: predictive visual neuroscience. There, the goal is to try to predict response properties of neurons in areas that are poorly understood, still based on statistical modelling of natural input.
Action Recognition with a Bio–Inspired Feedforward Motion Processing Model: The Richness of Center-Surround Interactions
- in "Proceedings of the 10th European Conference on Computer Vision", LNCS
Cited by 3 (1 self)
Here we show that reproducing the functional properties of MT cells with various center–surround interactions enriches motion representation and improves the action recognition performance. To do so, we propose a simplified bio–inspired model of the motion pathway in primates: It is a feedforward model restricted to V1-MT cortical layers, cortical cells cover the visual space with a foveated structure and, more importantly, we reproduce some of the richness of center-surround interactions of MT cells. Interestingly, as observed in neurophysiology, our MT cells not only behave like simple velocity detectors, but also respond to several kinds of motion contrasts. Results show that this diversity of motion representation at the MT level is a major advantage for an action recognition task. Defining motion maps as our feature vectors, we used a standard classification method on the Weizmann database: We obtained an average recognition rate of 98.9%, which is superior to the recent results by Jhuang et al. (2007). These promising results encourage us to further develop bio–inspired models incorporating other brain mechanisms and cortical layers in order to deal with more complex videos.

