Deep gaze I: Boosting saliency prediction with feature maps trained on imagenet (2015)
Venue: ICLR Workshop
Citations: 2 (0 self)
Citations
1742 | A model of saliency-based visual attention for rapid scene analysis
- Itti, Koch, et al.
- 1998
Citation Context: …explain only one third of the explainable information in the spatial fixation structure (Kümmerer et al., 2014). Most of the existing models use low-level cues like edge-detectors and color filters (Itti et al., 1998) or local image statistics (Zhang et al., 2008; Bruce and Tsotsos, 2009) …
1007 | Imagenet classification with deep convolutional neural networks
- Krizhevsky, Sutskever, et al.
Citation Context: …cy prediction. We present a novel way of reusing existing neural networks that have been pretrained on the task of object recognition in models of fixation prediction. Using the well-known network of Krizhevsky et al., 2012, we come up with a new saliency model that significantly outperforms all state-of-the-art models on the MIT Saliency Benchmark. We show that the structure of this network allows new insights in the p…
837 | Imagenet: A large-scale hierarchical image database - Deng, Dong, et al. - 2009 |
210 | Learning to predict where humans look - Judd, Ehinger, et al. - 2009 |
202 | Decaf: A deep convolutional activation feature for generic visual recognition. Retrieved from arXiv:1310.1531 - Donahue, Jia, et al. - 2013 |
192 | Caffe: Convolutional architecture for fast feature embedding. - Jia, Shelhamer, et al. - 2014 |
175 | Theano: a CPU and GPU math expression compiler - Bergstra, Breuleux, et al. - 2010 |
154 | Very Deep Convolutional Networks for Large-Scale Image Recognition.
- Simonyan, Zisserman
- 2015
Citation Context: …liency models with increasingly higher predictive power. Thus, the obvious next step suggested by this approach is to replace the Krizhevsky network by the ImageNet 2014 winning networks such as VGG (Simonyan and Zisserman, 2014) and GoogLeNet (Szegedy et al., 2014). A second conceptual contribution of this work is to optimize saliency models by maximizing the log-likelihood of a point process (see Barthelmé et al., 2013; K…
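The log-likelihood objective mentioned in this excerpt can be sketched in the discretized setting: the model assigns a probability to each pixel, and the objective is the mean log-probability of the observed fixation pixels. A minimal illustration (not the authors' implementation; the density array and fixation coordinate format are assumptions):

```python
import numpy as np

def fixation_log_likelihood(density, fix_xy):
    """Mean log-probability (in bits per fixation) that a predicted
    fixation density (a 2-D array summing to 1) assigns to the
    observed fixation locations, given as (x, y) integer coordinates."""
    p = density[fix_xy[:, 1], fix_xy[:, 0]]  # density at each fixation
    return np.mean(np.log2(p))
```

Maximizing this quantity over model parameters is what "maximizing the log-likelihood of a point process" amounts to once the image is discretized into pixels.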
146 | Visual correlates of fixation selection: Effects of scale and time
- Tatler, Baddeley, et al.
- 2005
Citation Context: …neralization performance of these models on the remaining 540 images from MIT1003 that have not been used in training. As performance measure we use shuffled area under the curve (shuffled AUC) here (Tatler et al., 2005). In AUC, the saliency map is treated as a classifier score to separate fixations from “nonfixations”: presented with two locations in the image, the classifier chooses the location with the higher s…
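The AUC computation described in this excerpt can be sketched directly: the fraction of fixation/nonfixation pairs in which the fixated location gets the higher saliency score. A minimal illustration, not the benchmark's reference implementation; the coordinate format is an assumption:

```python
import numpy as np

def auc_saliency(saliency, fix_xy, nonfix_xy):
    """Treat the saliency map as a classifier score separating fixated
    from non-fixated locations; returns the area under the ROC curve."""
    pos = saliency[fix_xy[:, 1], fix_xy[:, 0]]        # scores at fixations
    neg = saliency[nonfix_xy[:, 1], nonfix_xy[:, 0]]  # scores at "nonfixations"
    # P(random fixation outscores random nonfixation), ties counting half,
    # equals the AUC.
    wins = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return wins + 0.5 * ties
```

For *shuffled* AUC, the nonfixations are drawn from fixations on other images rather than uniformly, which discounts the center bias shared by all images.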
118 | The central fixation bias in scene viewing: Selecting an optimal viewing position independently of motor biases and image feature distributions
- Tatler
- 2007
Citation Context: …a Gaussian kernel whose width is controlled by σ, yielding the saliency map s(x, y) = Σ_k w_k r_k(x, y) ∗ G_σ. It is well known that fixation locations are strongly biased towards the center of an image (Tatler, 2007). To account for this center bias, the saliency prediction is linearly combined with a fixed center bias prediction c(x, y): o(x, y) = α c(x, y) + s(x, y). To predict fixation probabilities, this outpu…
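The readout described in this excerpt, a blurred linear combination of feature maps plus a weighted center bias, followed by a softmax over pixels, can be sketched as follows. This is a schematic with assumed inputs (feature maps `r_k` stacked in an array, a precomputed `center_bias` map); the Gaussian blur via `scipy.ndimage` is my choice of implementation, not necessarily the authors':

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_density(feature_maps, w, sigma, alpha, center_bias):
    """s(x,y) = blur(sum_k w_k r_k(x,y)); o = alpha*c + s;
    softmax over all pixels turns o into a fixation probability map."""
    s = gaussian_filter(np.tensordot(w, feature_maps, axes=1), sigma)
    o = alpha * center_bias + s
    e = np.exp(o - o.max())   # numerically stable softmax
    return e / e.sum()        # nonnegative, sums to 1
```

The softmax is what makes the output interpretable as a probability distribution over fixation locations, which in turn is what the log-likelihood objective requires.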
93 | How people look at pictures.
- Buswell
- 1935
Citation Context: …ns choose eye fixations, we can hope to understand and explain human behaviour in a number of vision-related tasks. For this reason human eye movements have been studied for more than 80 years (e.g., Buswell, 1935). During the last 20 years, many models have been developed trying to explain fixations in terms of so called “saliency maps”. Recently, it has been suggested to model saliency maps probabilistically…
76 | Predicting human gaze using low-level saliency combined with face detection - Cerf, Harel, et al. |
66 | Quantitative science and the definition of measurement in psychology.
- Michell
- 1997
Citation Context: …ation gain, we can assess how far we have come in describing the fixations. It is important to note that this interpretation is only possible due to the fact that information gain is on a ratio scale (Michell, 1997): differences and ratios of information gains are meaningful, as opposed to other measures like AUC. In Figure 3, the percentage of information gain explained is plotted for our model in comparison to…
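The ratio-scale property invoked in this excerpt is what makes a "percentage of information gain explained" well defined: with per-fixation log-likelihoods for a baseline, a model, and a gold standard, the fraction of explainable information captured is a simple ratio. A sketch with illustrative numbers, not values taken from the paper:

```python
def info_gain_explained(ll_model, ll_baseline, ll_gold):
    """Fraction of the explainable information (gold - baseline, in bits
    per fixation) that a model captures. Differences and ratios of
    information gains are meaningful because the scale has a true zero."""
    return (ll_model - ll_baseline) / (ll_gold - ll_baseline)
```

For example, with an illustrative baseline at 0.0 bits/fix, a gold standard at 1.0, and a model at 0.56, the model would explain 56 % of the explainable information; no analogous statement can be made with AUC values, which are only on an ordinal scale.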
65 | Going Deeper with Convolutions - Szegedy, Liu, et al. - 2014 |
39 | Visualizing and understanding convolutional neural networks. arXiv:1311.2901
- Zeiler, Fergus
- 2013
Citation Context: …e, but also to improve our understanding of the internal implementation of fixation selection in the brain by formulating new hypotheses that lead to new experimental paradigms. Finally, results from Zeiler and Fergus, 2013 show ways to interpret… [Figure 2: The model structure — the Krizhevsky/caffe network (conv1, relu1, pool1, norm1, conv2, …, norm5, relu5, full1, …) yields feature maps r_k(x, y), which are blurred, combined as Σ_k w_k r_k(x, y), and passed through a softmax.]
27 | A benchmark of computational models of saliency to predict human fixations
- Judd, Durand, et al.
- 2012
Citation Context: …model (eDN) is able to explain only 34 %. Deep Gaze I is able to increase this information gain to 56 %. [2.2 Results on MIT Saliency Benchmark] We submitted our model to the MIT Saliency Benchmark (Judd et al., 2012). The benchmark evaluates saliency models on a dataset of 300 images and 40 subjects. The fixations are withheld, making training on them impossible. The MIT Saliency Benchmark evalu…
20 | Saliency detection: A boolean map approach
- Zhang, Sclaroff
Citation Context: …that performed better than the center bias. The x-axis is at the level of the center bias model. The three top performing models after Deep Gaze I are, in order of decreasing performance: BMS (82.57 %, Zhang and Sclaroff, 2013), Mixture of Saliency Models (82.09 %, Han and Satoh, 2014), and eDN (81.92 %, Vig et al., 2014). Notice that AUC and shuffled AUC use different definitions of saliency map: while AUC expects the salie…
14 | On the relationship between optical variability visual saliency and eye fixations: a computational approach - Garcia-Diaz, Leborán, et al. - 2012 |
11 | RARE2012: A multi-scale rarity-based saliency detection with its comparative statistical analysis - Riche - 2013 |
9 | Performance-optimized hierarchical models predict neural responses in higher visual cortex. - Yamins, Hong, et al. - 2014 |
7 | Large-Scale Optimization of Hierarchical Features for Saliency Prediction
- Vig, Dorr, et al.
- 2014
Citation Context: …er layers in a way that would allow to formulate predictions that can be tested psychophysically. A first attempt at modelling saliency with deep convolutional networks has been performed recently by Vig et al., 2014 (eDN), yielding state-of-the-art performance. However, training deep neural networks on fixations suffers from the usually small training sets compared to the training data used in other tasks. To re…
6 | Modeling fixation locations using spatial point processes - Barthelmé, Trukenbrod, et al. |
2 | How close are we to understanding image-based saliency?” In: arXiv:1409.7686 [cs.CV - Kümmerer, Wallis, et al. - 2014 |
2 | An adaptive low dimensional quasi-Newton sum of functions optimizer
- Sohl-Dickstein, Poole, et al.
- 2014
Citation Context: …log-likelihoods, cost functions and gradients were done in theano (Bergstra et al., 2010). To minimize the cost function on the training set of fixations, the batch-based BFGS method described in Sohl-Dickstein et al., 2014 was used. It combines the benefits of batch-based methods with the advantages of second-order methods, yielding high convergence rates with next to no hyperparameter tuning. To avoid overfitting to…
1 | Saliency, attention, and visual search: An information theoretic approach. Journal of Vision 9.3 - Bruce, Tsotsos - 2009 |