Learning spatiotemporal features with 3D convolutional networks (2015)
Venue: | In ICCV |
Citations: | 5 - 0 self |
Citations
716 | Behavior recognition via sparse spatio-temporal features
- Dollár, Rabaud, et al.
- 2005
Citation Context ...s) by extending Harris corner detectors to 3D. SIFT and HOG are also extended into SIFT-3D [34] and HOG3D [19] for action recognition. Dollar et al. proposed Cuboids features for behavior recognition =-=[5]-=-. Sadanand and Corso built ActionBank for action recognition [33]. Recently, Wang et al. proposed improved Dense Trajectories (iDT) [44] which is currently the state-of-the-art hand-crafted feature. T... |
250 | Rich feature hierarchies for accurate object detection and semantic segmentation
- Girshick
- 2014
Citation Context ...parallel machines (GPUs, CPU clusters), together with large amounts of training data, convolutional neural networks (ConvNets) [28] have made a comeback, providing breakthroughs on visual recognition =-=[10, 24]-=-. ConvNets have also been applied to the problem of human pose estimation in both images [12] and videos [13]. More interestingly these deep networks are used for image feature learning [7]. Similarly... |
202 | Decaf: A deep convolutional activation feature for generic visual recognition. Retrieved from arXiv:1310.1531
- Donahue, Jia, et al.
- 2013
Citation Context ...ition [10, 24]. ConvNets have also been applied to the problem of human pose estimation in both images [12] and videos [13]. More interestingly these deep networks are used for image feature learning =-=[7]-=-. Similarly, Zhou et al. and perform well on transferred learning tasks. Deep learning has also been applied to video feature learning in an unsupervised setting [27]. In Le et al. [27], the authors u... |
168 | Large displacement optical flow: descriptor matching in variational motion estimation
- Brox, Malik
- 2010
Citation Context ...he Temporal stream network [36]. For iDT, we use the code kindly provided by the authors [44]. For [36], there is no public model available to evaluate. However, this method uses Brox’s optical flows =-=[3]-=- as inputs. We manage to evaluate runtime of Brox’s method using two different versions: CPU implementation provided by the authors [3] and the GPU implementation provided in OpenCV. We report runtime... |
45 | Detecting irregularities in images and
- Boiman, Irani
- 2005
Citation Context ...rch, recommendation, ranking etc. The computer vision community has been working on video analysis for decades and tackled different problems such as action recognition [26], abnormal event detection =-=[2]-=-, and activity understanding [23]. Considerable progress has been made in these individual problems by employing different specific solutions. However, there is still a growing need for a generic vide... |
39 | Long-term recurrent convolutional networks for visual recognition and description.
- Donahue, Hendricks, et al.
- 2015
Citation Context ...nly attends to salient motion. Best viewed on a color screen.
Method | Accuracy (%)
Imagenet + linear SVM | 68.8
iDT w/ BoW + linear SVM | 76.2
Deep networks [18] | 65.4
Spatial stream network [36] | 72.6
LRCN =-=[6]-=- | 71.1
LSTM composite model [39] | 75.8
C3D (1 net) + linear SVM | 82.3
C3D (3 nets) + linear SVM | 85.2
iDT w/ Fisher vector [31] | 87.9
Temporal stream network [36] | 83.7
Two-stream networks [36] | 88.0
LRCN [6... |
14 | Dynamic scene understanding: The role of orientation features in space and time in scene classification.
- Derpanis, Lecce, et al.
- 2012
Citation Context ...re 7 plots the ROC curves of C3D compared with current methods and human performance. C3D has clearly made a significant improvement which is a halfway from current state-of-the-art method to
Dataset | =-=[4]-=- | [41] | [8] | [9] | Imagenet | C3D
Maryland | 43.1 | 74.6 | 67.7 | 77.7 | 87.7 | 87.7
YUPENN | 80.7 | 85.0 | 86.0 | 96.2 | 96.7 | 98.1
Table 5. Scene recognition accuracy. C3D using a simple linear SVM outperforms current methods on... |
13 | Learning human pose estimation features with convolutional networks
- Jain, Tompson, et al.
- 2014
Citation Context ...al neural networks (ConvNets) [28] have made a comeback, providing breakthroughs on visual recognition [10, 24]. ConvNets have also been applied to the problem of human pose estimation in both images =-=[12]-=- and videos [13]. More interestingly these deep networks are used for image feature learning [7]. Similarly, Zhou et al. and perform well on transferred learning tasks. Deep learning has also been app... |
2 | Spacetime forests with complementary features for dynamic scene recognition
- Feichtenhofer, Pinz, et al.
Citation Context ...s the ROC curves of C3D compared with current methods and human performance. C3D has clearly made a significant improvement which is a halfway from current state-of-the-art method to
Dataset | [4] | [41] | =-=[8]-=- | [9] | Imagenet | C3D
Maryland | 43.1 | 74.6 | 67.7 | 77.7 | 87.7 | 87.7
YUPENN | 80.7 | 85.0 | 86.0 | 96.2 | 96.7 | 98.1
Table 5. Scene recognition accuracy. C3D using a simple linear SVM outperforms current methods on Maryland... |
2 | Bags of spacetime energies for dynamic scene recognition
- Feichtenhofer, Pinz, et al.
- 2014
Citation Context ...t | Sport1M | UCF101 | ASLAN | YUPENN | UMD | Object
Task | action recognition | action recognition | action similarity labeling | scene classification | scene classification | object recognition
Method | [29] | [39]([25]) | [31] | =-=[9]-=- | [9] | [32]
Result | 90.8 | 75.8 (89.1) | 68.7 | 96.2 | 77.7 | 12.0
C3D | 85.2 | 85.2 (90.4) | 78.3 | 98.1 | 87.7 | 22.3
Table 1. C3D compared to best published results. C3D outperforms all previous best reported methods on a ... |
2 | Evaluating new variants of motion interchange patterns
- Hanani, Levy, et al.
- 2013
Citation Context ...2 different [Figure 7 plot. Axes: false positive rate, true positive rate. Curves: C3D, Human Performance, STIP [21], OSSML [22], MIP [20], MIP+STIP+MBH =-=[11]-=-, iDT+FV [45], Imagenet, Random Chance.] Figure 7. Action similarity labeling result. ROC curve of C3D evaluated on ASLAN. C3D achieves 86.5% on AUC and outperforms current state-of-the-art by 11.1%. Metho... |
1 | Up next: retrieval methods for large scale related video suggestion
- Bendersky, Garcia-Pueyo, et al.
- 2014
Citation Context ...rajectory). 2. Related Work Videos have been studied by the computer vision community for decades. Over the years various problems like action recognition [26], anomaly detection [2], video retrieval =-=[1]-=-, event and action detection [30, 17], and many more have been proposed. A considerable portion of these works is about video representations. Laptev and Lindeberg [26] proposed spatio-temporal interes... |