## Unsupervised Temporal Commonality Discovery

Citations: 3 (2 self)

### Citations

1892 | Robust real-time face detection
- Viola, Jones
- 2004
Citation Context ...mporal words for the subsequence in the interval $[b_1, e_1]$. Another notable benefit of the histogram representation is that it allows for fast recursive computation using the concept of the integral image [29]. That is, for frame $t$, we accumulate the sum $\varphi_{A[1,t]}$ of the histograms up to $t$. Using this structure, we can efficiently compute the histogram for any subsequence $A[t_1, t_2]$ as $\varphi_{A[t_1,t_2]}$ = ϕ_A[1,...

1637 | Video google: A text retrieval approach to object matching in videos.
- Sivic, Zisserman
- 2003
Citation Context ...tate R from Q; end; assign the optimal rectangle $r^* \leftarrow R$. 3.3 Construction of a Bounding Function. Representation of signals: Throughout the paper we will use the Bag of Temporal Words (BoTW) model [26,32] to represent video segments. Observe that any features that can be discretized into histograms can fit into our framework. In BoTW the codebook is built using a clustering method (e.g., k-means) to ...

739 | Learning Realistic Human Actions from Movies
- Laptev, Marszałek, et al.
- 2008
Citation Context ...ences in a longer video. Let $Q$ be the query sequence we want to find in the target video $T$. We can modify (1) by fixing one of the pairwise sequences: $\min_{b_t,e_t} d(\varphi_{T[b_t,e_t]}, \varphi_Q)$ s.t. $e_t - b_t \ge \ell$. (14) The problem now becomes simpler, but it is still an integer program. Nevertheless, Algorithm 1 can be applied again to find the optimal match efficiently. Searching for multiple segments can also ...

649 | The PASCAL Visual Object Classes Challenge 2012 (VOC2012) Results
- Everingham, Gool, et al.
- 2012
Citation Context ...nds, we show in the following exemplar constructions of bounds between histograms, i.e., $\ell_1$, intersection, and $\chi^2$ distance, which have been widely applied to many tasks such as object recognition [9,13] and action recognition [7,11,14,16,22]. 1) Bounding $\ell_1$ distance: Applying the operators min/max on (2), we get $\min(h_b^-, k_b^-) \le \min(h_b, k_b) \le \min(h_b^+, k_b^+)$, max(h_b^−, k_b^−) ≤ max(h_b, k_b) ≤ m...

561 | Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology
- Gusfield
- 1997
Citation Context ...these works detect motifs within only one sequence, but TCD considers two (or more) sequences. Moreover, it is unclear how these techniques can be robust to noise. The longest common subsequence (LCS) [10,17,21] is also related to TCD. The LCS problem consists of finding the longest subsequence that is common within a set of sequences (often just two) [21,31]. Closer to our work is the algorithm for longest ...

170 | Detecting irregularities in images and in video.
- Boiman, Irani
- 2007
Citation Context ...ed discovery of visual patterns in images has been a long-standing computer vision problem driven by applications to cosegmentation [8,15,20], learning grammars of images [34], detecting irregularity [6] and automatic tagging [23]. Although recently there have been several works on unsupervised discovery of visual patterns in images, a relatively unexplored problem in computer vision is to discover com...

170 | Action bank: A high-level representation of activity in video.
- Sadanand, Corso
- 2012
Citation Context ...exemplar constructions of bounds between histograms, i.e., $\ell_1$, intersection, and $\chi^2$ distance, which have been widely applied to many tasks such as object recognition [9,13] and action recognition [7,11,14,16,22]. 1) Bounding $\ell_1$ distance: Applying the operators min/max on (2), we get $\min(h_b^-, k_b^-) \le \min(h_b, k_b) \le \min(h_b^+, k_b^+)$ and $\max(h_b^-, k_b^-) \le \max(h_b, k_b) \le \max(h_b^+, k_b^+)$. (4) Reordering both th...

169 | The complexity of some problems on subsequences and supersequences.
- Maier
- 1978
Citation Context ...these works detect motifs within only one sequence, but TCD considers two (or more) sequences. Moreover, it is unclear how these techniques can be robust to noise. The longest common subsequence (LCS) [10,17,21] is also related to TCD. The LCS problem consists of finding the longest subsequence that is common within a set of sequences (often just two) [21,31]. Closer to our work is the algorithm for longest ...

147 | Automatic Recognition of Facial Actions in Spontaneous Expressions.
- Bartlett, Littlewort, et al.
- 2006
Citation Context ...art. We just want to illustrate the versatility of our approach. 5 Experimental Results. We evaluated our approach on two experiments. First, we discovered common facial events in the RU-FACS database [5]. Second, we found multiple common human actions in the CMU-Mocap dataset [1]. The code is available at http://www.humansensing.cs.cmu.edu/software/tcd.html. 5.1 Common Facial Events Discovery. This exper...

137 | Segmenting Motion Capture Data into Distinct Behaviors. Graphics Interface
- Barbič, Safonova, et al.
- 2004
Citation Context ...greater than $\ell$. To show an example, consider two 1-D sequences $A = [1, 2, 2, 1]$ and $B = [1, 1, 3]$. Suppose we use $\ell_1$ distance, set the minimal length $\ell = 3$, and represent their 3-bin histograms as $\varphi_{A[1,4]} = [2, 2, 0]$, $\varphi_{A[1,3]} = [1, 2, 0]$ and $\varphi_B = [2, 0, 1]$. Hereby we can conclude by showing the distances: $d_{\ell_1}(\varphi_{A[1,4]}, \varphi_B) = 3 < 4 = d_{\ell_1}(\varphi_{A[1,3]}, \varphi_B)$. Differences from ESS [13] and STBB [32]: Althoug...

121 | A Stochastic Grammar of Images.
- Zhu, Mumford
- 2006
Citation Context ...ry. 1 Introduction. Unsupervised discovery of visual patterns in images has been a long-standing computer vision problem driven by applications to cosegmentation [8,15,20], learning grammars of images [34], detecting irregularity [6] and automatic tagging [23]. Although recently there have been several works on unsupervised discovery of visual patterns in images, a relatively unexplored problem in comput...

120 | Efficient subwindow search: A branch and bound framework for object localization.
- Lampert, Blaschko, et al.
- 2009
Citation Context ...three billion possible matchings that need to be computed at different lengths and locations. Therefore, the naive approach is computationally prohibitive for sequences of reasonable length. Inspired by [13,32], which used the branch and bound (B&B) algorithm to efficiently search for optimal image patches or video volumes, we propose to adopt B&B for searching simultaneously over all possible segments in eac...

114 | The kernel trick for distances.
- Schölkopf
- 2000
Citation Context ...| 2) Bounding intersection distance: Given two normalized histograms $\varphi_A = \{h_1, \ldots, h_K\}$ and $\varphi_B = \{k_1, \ldots, k_K\}$, we define their intersection distance by the Hilbert space representation [24]: $d_\cap(\varphi_A, \varphi_B) = -\sum_{b=1}^{K} \min(h_b, k_b)$. (10) By (3) and (4), we can find its lower bound and upper bound: $l_\cap(R) = -\sum_{b=1}^{K} \min\left(\frac{h_b^+}{|A^-|}, \frac{k_b^+}{|B^-|}\right)$ and u∩(R) = −Σ_{b=1}^{K} min(h_b^−/|A^+|, k_b^−/|B...

75 | Learning latent temporal structure for complex event detection.
- Tang, Li, et al.
- 2012
Citation Context ...es unsupervised search of commonalities in video sequences. Also, note that there are several studies that address the problem of event detection or sequence labeling of human actions in video (e.g., [12,27,32]). However, unlike TCD, those studies require learning a set of classifiers from training data. (2) Formulate the TCD as an integer optimization problem and propose an efficient B&B algorithm that fin...

69 | Unsupervised discovery of action classes
- Wang, Jiang, et al.
- 2006
Citation Context ...ch video sequence (see Fig. 1). This study makes two main contributions: (1) Introduce the new problem of unsupervised TCD. While there exist studies that address commonality discovery in images [8,15,20,30], to the best of our knowledge there is little work that tackles unsupervised search of commonalities in video sequences. Also, note that there are several studies that address the problem of event de...

64 | Learning spatiotemporal graphs of human activities
- Brendel, Todorovic
- 2011
Citation Context ...exemplar constructions of bounds between histograms, i.e., $\ell_1$, intersection, and $\chi^2$ distance, which have been widely applied to many tasks such as object recognition [9,13] and action recognition [7,11,14,16,22]. 1) Bounding $\ell_1$ distance: Applying the operators min/max on (2), we get $\min(h_b^-, k_b^-) \le \min(h_b, k_b) \le \min(h_b^+, k_b^+)$ and $\max(h_b^-, k_b^-) \le \max(h_b, k_b) \le \max(h_b^+, k_b^+)$. (4) Reordering both th...

48 | Cross-view action recognition via view knowledge transfer.
- Liu, Shah, et al.
- 2011
Citation Context ...exemplar constructions of bounds between histograms, i.e., $\ell_1$, intersection, and $\chi^2$ distance, which have been widely applied to many tasks such as object recognition [9,13] and action recognition [7,11,14,16,22]. 1) Bounding $\ell_1$ distance: Applying the operators min/max on (2), we get $\min(h_b^-, k_b^-) \le \min(h_b, k_b) \le \min(h_b^+, k_b^+)$ and $\max(h_b^-, k_b^-) \le \max(h_b, k_b) \le \max(h_b^+, k_b^+)$. (4) Reordering both th...

42 | Branch and Bound Algorithm for Computing the Minimum Stability Degree of Parameter-dependent Linear Systems.
- Balakrishnan, Boyd, et al.
- 1991
Citation Context ...l B&B is much more efficient than the naive search. Note that the optimal discovered sequences can be of length greater than $\ell$. To show an example, consider two 1-D sequences $A = [1, 2, 2, 1]$ and $B = [1, 1, 3]$. Suppose we use $\ell_1$ distance, set the minimal length $\ell = 3$, and represent their 3-bin histograms as $\varphi_{A[1,4]} = [2, 2, 0]$, $\varphi_{A[1,3]} = [1, 2, 0]$ and $\varphi_B = [2, 0, 1]$. Hereby we can conclude by showing the...

40 | Selection and context for action recognition
- Han, Bo, et al.
- 2009

38 | Common visual pattern discovery via spatially coherent correspondences
- Liu, Yan
- 2010
Citation Context ...and bound, temporal commonality discovery. 1 Introduction. Unsupervised discovery of visual patterns in images has been a long-standing computer vision problem driven by applications to cosegmentation [8,15,20], learning grammars of images [34], detecting irregularity [6] and automatic tagging [23]. Although recently there have been several works on unsupervised discovery of visual patterns in images, a relat...

38 | Longest Common Subsequences
- Paterson, Dančík
- 1994
Citation Context ...these works detect motifs within only one sequence, but TCD considers two (or more) sequences. Moreover, it is unclear how these techniques can be robust to noise. The longest common subsequence (LCS) [10,17,21] is also related to TCD. The LCS problem consists of finding the longest subsequence that is common within a set of sequences (often just two) [21,31]. Closer to our work is the algorithm for longest ...

36 | Joint Segmentation and Classification of Human Actions
- Hoai, Lan, et al.
Citation Context ...es unsupervised search of commonalities in video sequences. Also, note that there are several studies that address the problem of event detection or sequence labeling of human actions in video (e.g., [12,27,32]). However, unlike TCD, those studies require learning a set of classifiers from training data. (2) Formulate the TCD as an integer optimization problem and propose an efficient B&B algorithm that fin...

33 | Detecting and matching repeated patterns for automatic geo-tagging in urban environments. In: CVPR
- Schindler, Krishnamurthy, et al.
- 2008
Citation Context ...terns in images has been a long-standing computer vision problem driven by applications to cosegmentation [8,15,20], learning grammars of images [34], detecting irregularity [6] and automatic tagging [23]. Although recently there have been several works on unsupervised discovery of visual patterns in images, a relatively unexplored problem in computer vision is to discover common temporal patterns among...

32 | Discriminative Video Pattern Search for Efficient Action Detection
- Yuan, Liu, et al.
- 2011
Citation Context ...three billion possible matchings that need to be computed at different lengths and locations. Therefore, the naive approach is computationally prohibitive for sequences of reasonable length. Inspired by [13,32], which used the branch and bound (B&B) algorithm to efficiently search for optimal image patches or video volumes, we propose to adopt B&B for searching simultaneously over all possible segments in eac...

32 | Unsupervised discovery of facial events.
- Zhou, De la Torre, et al.
- 2010
Citation Context ...ifferent subjects. We represented features as the distances between the height of lips and teeth, angles for mouth corners and SIFT descriptors in the points tracked by Active Appearance Models (AAM) [33] (see Fig. 4(a) for an illustration). We built a 1,000-entry codebook on a random subset of 50,000 feature vectors (see Sec. 3.3). [Page header: W.-S. Chu, F. Zhou, and F. De la Torre] [Figure residue: SW1 SW2 SW3 (a) SR = √2, SS...]

30 | Scale invariant cosegmentation for image groups
- Mukherjee, Singh
- 2011
Citation Context ...and bound, temporal commonality discovery. 1 Introduction. Unsupervised discovery of visual patterns in images has been a long-standing computer vision problem driven by applications to cosegmentation [8,15,20], learning grammars of images [34], detecting irregularity [6] and automatic tagging [23]. Although recently there have been several works on unsupervised discovery of visual patterns in images, a relat...

25 | Unsupervised View and Rate Invariant Clustering of Video Sequences
- Turaga, Veeraraghavan, et al.
- 2009
Citation Context ... a dictionary of atomic actions. Zhou et al. [33] combined spectral clustering and dynamic time warping to cluster time series, and applied it to learn taxonomies of facial expressions. Turaga et al. [28] used extensions of switching linear dynamical systems for clustering human actions in video sequences. However, if we cluster two sequences that only have one segment in common, previous methods for ...

24 | Discovering Multivariate Motifs Using Subsequence Density Estimation and Greedy
- Minnen, Isbell, et al.
- 2007
Citation Context ...rs only similar segments and avoids the need to cluster the entire video, which is computationally expensive and prone to local minima. Another unsupervised technique related to TCD is motif detection [18,19]. Time series motif algorithms find repeated patterns within a single sequence. Minnen et al. [18] discovered motifs as high-density regions in the space of all subsequences. Mueen and Keogh [19] furt...

23 | Online discovery and maintenance of time series motifs. KDD
- Mueen, Keogh
- 2010
Citation Context ...rs only similar segments and avoids the need to cluster the entire video, which is computationally expensive and prone to local minima. Another unsupervised technique related to TCD is motif detection [18,19]. Time series motif algorithms find repeated patterns within a single sequence. Minnen et al. [18] discovered motifs as high-density regions in the space of all subsequences. Mueen and Keogh [19] furt...

23 | Unsupervised learning of event and-or grammar and semantics from video, in Computer Vision,
- Si, Pei, et al.
- 2011
Citation Context ... clustering algorithms for unsupervised discovery of human actions. Wang et al. [30] exploited deformable template matching of shape and context in static images to discover action classes. Si et al. [25] learned an event grammar by clustering event co-occurrence into a dictionary of atomic actions. Zhou et al. [33] combined spectral clustering and dynamic time warping to cluster time series, and appl...

11 | Momi-cosegmentation: Simultaneous segmentation of multiple objects among multiple images
- Chu, Chen, et al.
- 2010
Citation Context ...and bound, temporal commonality discovery. 1 Introduction. Unsupervised discovery of visual patterns in images has been a long-standing computer vision problem driven by applications to cosegmentation [8,15,20], learning grammars of images [34], detecting irregularity [6] and automatic tagging [23]. Although recently there have been several works on unsupervised discovery of visual patterns in images, a relat...

2 | Efficient subwindow search with submodular score functions
- An, Peursum, et al.
- 2011
Citation Context ...y show that in general B&B is much more efficient than the naive search. Note that the optimal discovered sequences can be of length greater than $\ell$. To show an example, consider two 1-D sequences $A = [1, 2, 2, 1]$ and $B = [1, 1, 3]$. Suppose we use $\ell_1$ distance, set the minimal length $\ell = 3$, and represent their 3-bin histograms as $\varphi_{A[1,4]} = [2, 2, 0]$, $\varphi_{A[1,3]} = [1, 2, 0]$ and $\varphi_B = [2, 0, 1]$. Hereby we can concl...

1 | Frame-level temporal calibration of unsynchronized cameras by using Longest Consecutive Common Subsequence
- Wang, Velipasalar
Citation Context ...to noise. The longest common subsequence (LCS) [10,17,21] is also related to TCD. The LCS problem consists of finding the longest subsequence that is common within a set of sequences (often just two) [21,31]. Closer to our work is the algorithm for longest consecutive common subsequence (LCCS) [31], which finds the longest contiguous part of the original sequences (e.g., videos). However, different from TCD, t...
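
Two of the contexts quoted in this list describe concrete computations: the integral-image style accumulation of histograms (context for Viola and Jones) and the small $\ell_1$ example with $A = [1, 2, 2, 1]$ and $B = [1, 1, 3]$. The following is a minimal Python sketch of both, not the authors' code. The quoted snippet is truncated before the subtraction formula, so the prefix-difference form used here (histogram of $[t_1, t_2]$ equals the cumulative histogram at $t_2$ minus the one at $t_1 - 1$) is an assumption based on the standard integral-image identity. The function names `cumulative_histograms`, `segment_histogram`, and `l1_distance` are hypothetical.

```python
from typing import List


def cumulative_histograms(words: List[int], num_bins: int) -> List[List[int]]:
    """Return cum, where cum[t] is the histogram of the first t temporal words.

    cum[0] is all zeros, so segment histograms reduce to one subtraction.
    Words are 0-based bin indices.
    """
    cum = [[0] * num_bins]
    for word in words:
        nxt = cum[-1].copy()
        nxt[word] += 1
        cum.append(nxt)
    return cum


def segment_histogram(cum: List[List[int]], begin: int, end: int) -> List[int]:
    """Histogram of the 1-based inclusive segment [begin, end].

    Assumed identity: hist[begin, end] = cum[end] - cum[begin - 1].
    """
    return [hi - lo for hi, lo in zip(cum[end], cum[begin - 1])]


def l1_distance(p: List[int], q: List[int]) -> int:
    """Sum of absolute bin-wise differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Sequences from the quoted example, shifted from words 1..3 to bins 0..2.
A = [w - 1 for w in [1, 2, 2, 1]]
B = [w - 1 for w in [1, 1, 3]]

cum_A = cumulative_histograms(A, num_bins=3)
cum_B = cumulative_histograms(B, num_bins=3)

hist_A_1_4 = segment_histogram(cum_A, 1, 4)
hist_A_1_3 = segment_histogram(cum_A, 1, 3)
hist_B = segment_histogram(cum_B, 1, 3)

dist_full = l1_distance(hist_A_1_4, hist_B)
dist_short = l1_distance(hist_A_1_3, hist_B)
```

Under these assumptions the sketch reproduces the histograms and distances quoted above: the longer segment $A[1,4]$ is closer to $B$ than the minimal-length segment $A[1,3]$, which is the point of the quoted example.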