Results 1 - 10 of 21
Image Quality Assessment: From Error Visibility to Structural Similarity
- IEEE TRANSACTIONS ON IMAGE PROCESSING
, 2004
Cited by 301 (26 self)
Objective methods for assessing perceptual image quality have traditionally attempted to quantify the visibility of errors between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a Structural Similarity Index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.
Image Quality Assessment: From Error Measurement to Structural Similarity
- IEEE Trans. Image Processing
, 2004
Cited by 68 (10 self)
Objective methods for assessing perceptual image quality traditionally attempt to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a Structural Similarity Index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MatLab implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.
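As a rough illustration of the Structural Similarity Index named in this abstract, the Python sketch below computes a single SSIM value from the means, variances, and covariance of two equal-length lists of pixel values. It is not the MatLab implementation linked above, and it simplifies in ways the abstract does not specify: the whole signal is treated as one window (the usual formulation averages the index over local windows), population (divide-by-N) statistics are used, and the function name `global_ssim` and its parameters are hypothetical. The stabilizing constants follow the commonly cited defaults K1 = 0.01 and K2 = 0.03 with an 8-bit dynamic range, which is an assumption here.

```python
def global_ssim(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equal-length sequences of pixel values.

    Simplified sketch: one global window, population statistics.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)

    # Luminance terms: sample means of each signal.
    mu_x = sum(x) / n
    mu_y = sum(y) / n

    # Contrast and structure terms: variances and covariance.
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n

    # Small constants that keep the ratio stable when the
    # means or variances are close to zero (assumed defaults).
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    reference = [10.0, 50.0, 90.0, 130.0]
    distorted = [12.0, 47.0, 95.0, 126.0]
    print(global_ssim(reference, reference))
    print(global_ssim(reference, distorted))
```

An identical pair scores 1, and the score falls as the structure of the distorted signal departs from the reference, even when the mean intensity is unchanged.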
Foveation Scalable Video Coding with Automatic Fixation Selection
, 2003
Cited by 26 (7 self)
Image and video coding is an optimization problem. A successful image and video coding algorithm delivers a good tradeoff between visual quality and other coding performance measures, such as compression, complexity, scalability, robustness, and security. In this paper, we follow two recent trends in image and video coding research. One is to incorporate human visual system (HVS) models to improve the current state-of-the-art of image and video coding algorithms by better exploiting the properties of the intended receiver. The other is to design rate scalable image and video codecs, which allow the extraction of coded visual information at continuously varying bit rates from a single compressed bitstream.
Fast Algorithms For Foveated Video Processing
, 2003
Cited by 9 (3 self)
In this paper, we implement foveation filtering having low-pass filters with continuously varying cutoff frequencies. For discrete images, we are forced to use a fixed set of cutoff frequencies, yet, unlike the WT or the STFT, we allow for an arbitrary cutoff frequency. Using such low-pass filters, the performance of the algorithm does not depend on the cutoff frequency (as in the WT). As long as the local bandwidth transition is monotonically changed, position-varying low-pass filtering can be utilized to compute foveated images that better approximate the human visual system. Another merit of this approach is computational simplicity (easy implementation) and adaptive interface with standard video
Real-Time Foveation Techniques for Low Bit Rate Video Coding
- Real-time Imaging
, 2002
Cited by 7 (2 self)
Lossy video compression methods often rely on modeling the abilities and limitations of the intended receiver, the Human Visual System (HVS), to achieve the highest possible compression with as little effect on perceived quality as possible. Foveation, which is nonuniform resolution perception of the visual stimulus by the HVS due to the nonuniform density of photoreceptor cells in the eye, has been demonstrated to be useful for reducing bit rates beyond the abilities of uniform resolution video coders. In this work, we present real-time foveation techniques for low bit rate video coding. First, we develop an approximate model for foveation. Then, we demonstrate that foveation, as described by this model, can be incorporated into standard motion compensation and Discrete Cosine Transform (DCT) based video coding techniques for low bit rate video coding, such as the H.263 or MPEG-4 video coding standards, without incurring prohibitive complexity overhead. We demonstrate that foveation in the DCT domain can actually result in computational speedups. The techniques presented can be implemented using the baseline modes in the video coding standards and do not require any modification to, or post-processing at, the decoder.
Individual predictions of eye-movements with dynamic scenes
- Electronic Imaging 2003, volume 5007. SPIE
, 2003
Cited by 5 (4 self)
We present a model that predicts saccadic eye-movements and can be tuned to a particular human observer who is viewing a dynamic sequence of images. Our work is motivated by applications that involve gaze-contingent interactive displays on which information is displayed as a function of gaze direction. The approach therefore differs from standard approaches in two ways: (i) we deal with dynamic scenes, and (ii) we provide means of adapting the model to a particular observer. As an indicator for the degree of saliency we evaluate the intrinsic dimension of the image sequence within a geometric approach implemented by using the structure tensor. Out of these candidate saliency-based locations, the currently attended location is selected according to a strategy found by supervised learning. The data are obtained with an eye-tracker and subjects who view video sequences. The selection algorithm receives candidate locations of current and past frames and a limited history of locations attended in the past. We use a linear mapping that is obtained by minimizing the quadratic difference between the predicted and the actually attended location by gradient descent. Being linear, the learned mapping can be quickly adapted to the individual observer. Keywords: Eye-movements, saccades, saliency map, intrinsic dimension, machine learning, gaze-contingent display
Bayesian Integration of Face and Low-level Cues for Foveated Video Coding
- IEEE TRANS. CIRCUITS SYST. VIDEO TECHNOL.
, 2008
Cited by 4 (0 self)
We present a Bayesian model that automatically generates fixations/foveations and that can be suitably exploited for compression purposes. The twofold aim of this work is to investigate how the exploitation of high-level perceptual cues provided by human faces occurring in the video can enhance the compression process without reducing the perceived quality of the video, and to validate this assumption with an extensive and principled experimental protocol. To this end, the model integrates top-down and bottom-up cues to choose the fixation point on a video frame: at the highest level, a fixation is driven by prior information and by relevant objects, namely human faces, within the scene; at the same time, local saliency together with novel and abrupt visual events contribute by triggering lower-level control. The performance of the resulting video compression system has been evaluated with respect to both the perceived quality of foveated video clips and the compression gain in an extensive evaluation campaign involving 200 subjects.
Bovik, “Visual Importance Pooling for Image Quality Assessment”
- IEEE Journal of Selected Topics in Signal Processing
, 2009
Cited by 4 (1 self)
Recent image quality assessment (IQA) metrics achieve high correlation with human perception of image quality. Naturally, it is of interest to produce even better results. One promising method is to weight image quality measurements by visual importance. To this end, we describe two strategies: visual fixation-based weighting and quality-based weighting. By contrast with some prior studies, we find that these strategies can improve the correlations with subjective judgment significantly. We demonstrate improvements on the SSIM index in both its multiscale and single-scale versions, using the LIVE database as a test-bed. Index Terms: Image quality assessment (IQA), quality-based weighting, structural similarity, subjective quality assessment, visual fixations.
Dynamic predictions of tracked gaze
- In Seventh International Symposium on Signal Processing and its Applications, Special Session on Foveated Vision in Image and Video Processing
, 2003
Cited by 3 (3 self)
We present a model for predicting the eye-movements of observers who are viewing dynamic sequences of images. As an indicator for the degree of saliency we evaluate an invariant of the spatio-temporal structure tensor that indicates an intrinsic dimension of at least two. The saliency is used to derive a list of candidate locations. Out of this list, the currently attended location is selected according to a mapping found by supervised learning. The true locations used for learning are obtained with an eye-tracker. In addition to the saliency-based candidates, the selection algorithm uses a limited history of locations attended in the past. The mapping is linear and can thus be quickly adapted to the individual observer. The mapping is optimal in the sense that it is obtained by minimizing, by gradient descent, the overall quadratic difference between the predicted and the actually attended location.
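The learning step described in this abstract, a linear mapping fitted by gradient descent on the quadratic difference between predicted and attended locations, can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the feature layout (a flat vector built from candidate and previously attended locations), the function names `fit_linear_mapping` and `predict`, the use of full-batch updates, and the learning rate and epoch count are all hypothetical choices.

```python
def predict(weights, feature):
    """Apply the linear mapping: one output coordinate per weight row."""
    return [sum(w * v for w, v in zip(row, feature)) for row in weights]


def fit_linear_mapping(features, targets, learning_rate=0.01, epochs=500):
    """Fit weights minimizing the mean squared prediction error.

    features: list of feature vectors (hypothetical layout, e.g. candidate
              locations and recently attended locations, flattened).
    targets:  list of attended locations, e.g. [x, y] per sample.
    """
    n = len(features)
    n_in = len(features[0])
    n_out = len(targets[0])
    weights = [[0.0] * n_in for _ in range(n_out)]

    for _ in range(epochs):
        # Accumulate the gradient of the mean squared error over all samples.
        grads = [[0.0] * n_in for _ in range(n_out)]
        for feature, target in zip(features, targets):
            predicted = predict(weights, feature)
            for i in range(n_out):
                error = predicted[i] - target[i]
                for j in range(n_in):
                    grads[i][j] += 2.0 * error * feature[j] / n
        # Step against the gradient.
        for i in range(n_out):
            for j in range(n_in):
                weights[i][j] -= learning_rate * grads[i][j]
    return weights
```

Because the model is linear, adapting it to a new observer amounts to running further update steps on that observer's recorded fixations, starting from the existing weights.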
Finding corners in images by foveated search
- PROC. SPIE INT. SOC. OPT. ENG. 6077 (60770Y)
, 2006
Cited by 2 (1 self)
We develop a new approach to finding corners in images that combines foveated edge detection and curvature calculation with saccadic placement of foveal fixations. Each saccade moves the fovea to a location of high curvature combined with high edge gradient. Edges are located using a foveated Canny edge detector with a spatial constant that increases with eccentricity. Next, we calculate a measure of local corner strength, based on a product of curvature and gradient. An inhibition factor based on previous visits to a region of the image prevents the system from repeatedly returning to the same locale. A long saccade moves the fovea to previously unexplored areas of the image. Subsequent short saccades improve the accuracy of the location of the corner approximated by the long saccade. The system is tested on two natural scenes and the results are compared against subjects observing the same test images through an eye-tracker. Results show that the algorithm is a good locator of corners.

