Results 1 - 10 of 52
Streaming Hierarchical Video Segmentation
"... Abstract. The use of video segmentation as an early processing step in video analysis lags behind the use of image segmentation for image analysis, despite many available video segmentation methods. A major reason for this lag is simply that videos are an order of magnitude bigger than images; yet m ..."
Abstract
-
Cited by 45 (6 self)
- Add to MetaCart
(Show Context)
Abstract. The use of video segmentation as an early processing step in video analysis lags behind the use of image segmentation for image analysis, despite many available video segmentation methods. A major reason for this lag is simply that videos are an order of magnitude bigger than images; yet most methods require all voxels in the video to be loaded into memory, which is clearly prohibitive for even medium-length videos. We address this limitation by proposing an approximation framework for streaming hierarchical video segmentation motivated by data stream algorithms: each video frame is processed only once and does not change the segmentation of previous frames. We implement the graph-based hierarchical segmentation method within our streaming framework; our method is the first streaming hierarchical video segmentation method proposed. We perform thorough experimental analysis on a benchmark video data set and longer videos. Our results indicate the graph-based streaming hierarchical method outperforms other streaming video segmentation methods and performs nearly as well as the full-video hierarchical graph-based method.
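The streaming constraint this abstract states (each frame is processed exactly once, and labels already committed for earlier frames are never revisited) can be sketched as a single-pass loop. This is a hypothetical illustration, not the paper's graph-based method; `segment_frame` is an assumed placeholder for any per-frame segmenter.

```python
# Hypothetical sketch of a streaming video-segmentation loop:
# each frame is processed exactly once, and labels committed
# for earlier frames are never changed afterwards.

def stream_segment(frames, segment_frame):
    """Assign region labels frame by frame.

    `segment_frame(frame, prev_labels)` is a placeholder for any
    per-frame segmenter that may look at the previous frame's
    labels for temporal consistency but never rewrites them.
    """
    committed = []           # labels for already-processed frames (frozen)
    prev_labels = None
    for frame in frames:     # a single pass over the video
        labels = segment_frame(frame, prev_labels)
        committed.append(labels)   # commit: never modified again
        prev_labels = labels
    return committed

# Toy usage: "segment" 1-D frames by thresholding, carrying the
# previous frame's labels along only as context.
video = [[0, 0, 9, 9], [0, 1, 9, 8]]
result = stream_segment(video, lambda f, prev: [int(v > 4) for v in f])
# → [[0, 0, 1, 1], [0, 0, 1, 1]]
```

Because `committed` is append-only, memory for the segmenter's working state stays bounded by one frame, which is the point of the streaming approximation.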
Object Segmentation in Video: A Hierarchical Variational Approach for Turning Point Trajectories into Dense Regions
"... Point trajectories have emerged as a powerful means to obtain high quality and fully unsupervised segmentation of objects in video shots. They can exploit the long term motion difference between objects, but they tend to be sparse due to computational reasons and the difficulty in estimating motion ..."
Abstract
-
Cited by 34 (5 self)
- Add to MetaCart
(Show Context)
Point trajectories have emerged as a powerful means to obtain high quality and fully unsupervised segmentation of objects in video shots. They can exploit the long term motion difference between objects, but they tend to be sparse due to computational reasons and the difficulty in estimating motion in homogeneous areas. In this paper we introduce a variational method to obtain dense segmentations from such sparse trajectory clusters. Information is propagated with a hierarchical, nonlinear diffusion process that runs in the continuous domain but takes superpixels into account. We show that this process raises the density from 3% to 100% and even increases the average precision of labels.
Segmentation of moving objects by long term video analysis
- IEEE Transactions on Pattern Analysis and Machine Intelligence
"... ..."
Video segmentation with superpixels
- In ACCV, 2012
"... Abstract. Due to its importance, video segmentation has regained in-terest recently. However, there is no common agreement about the neces-sary ingredients for best performance. This work contributes a thorough analysis of various within- and between-frame affinities suitable for video segmentation. ..."
Abstract
-
Cited by 14 (4 self)
- Add to MetaCart
Abstract. Due to its importance, video segmentation has regained interest recently. However, there is no common agreement about the necessary ingredients for best performance. This work contributes a thorough analysis of various within- and between-frame affinities suitable for video segmentation. Our results show that a frame-based superpixel segmentation combined with a few motion- and appearance-based affinities is sufficient to obtain good video segmentation performance. A second contribution of the paper is the extension of [1] to include motion cues, which makes the algorithm globally aware of motion, thus improving its performance for video sequences. Finally, we contribute an extension of an established image segmentation benchmark [1] to videos, allowing coarse-to-fine video segmentations and multiple human annotations. Our results are tested on BMDS [2] and compared to existing methods.
A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis
"... Video segmentation research is currently limited by the lack of a benchmark dataset that covers the large variety of subproblems appearing in video segmentation and that is large enough to avoid overfitting. Consequently, there is little analysis of video segmentation which generalizes across subtas ..."
Abstract
-
Cited by 13 (6 self)
- Add to MetaCart
(Show Context)
Video segmentation research is currently limited by the lack of a benchmark dataset that covers the large variety of subproblems appearing in video segmentation and that is large enough to avoid overfitting. Consequently, there is little analysis of video segmentation that generalizes across subtasks, and it is not yet clear which information video segmentation should leverage from still frames, as previously studied in image segmentation, alongside video-specific information such as temporal volume, motion and occlusion. In this work we provide such an analysis based on annotations of a large video dataset, where each video is manually segmented by multiple persons. Moreover, we introduce a new volume-based metric that includes the important aspect of temporal consistency, can deal with segmentation hierarchies, and reflects the tradeoff between over-segmentation and segmentation accuracy.
Multi-scale clustering of frame-to-frame correspondences for motion segmentation
- In: ECCV (Oct
"... Abstract. We present an approach for motion segmentation using inde-pendently detected keypoints instead of commonly used tracklets or tra-jectories. This allows us to establish correspondences over non-consecutive frames, thus we are able to handle multiple object occlusions consistently. On a fram ..."
Abstract
-
Cited by 9 (4 self)
- Add to MetaCart
(Show Context)
Abstract. We present an approach for motion segmentation using independently detected keypoints instead of the commonly used tracklets or trajectories. This allows us to establish correspondences over non-consecutive frames; thus we are able to handle multiple object occlusions consistently. On a frame-to-frame level, we extend the classical split-and-merge algorithm for fast and precise motion segmentation. Globally, we cluster multiple of these segmentations at different time scales with an accurate estimation of the number of motions. On the standard benchmarks, our approach performs best in comparison to all algorithms that are able to handle unconstrained missing data. We further show that it works on benchmark data with more than 98% of the input data missing. Finally, the performance is evaluated on a mobile-phone-recorded sequence with multiple objects occluded at the same time.
Two-Granularity Tracking: Mediating Trajectory and Detection Graphs for Tracking under Occlusions
"... Abstract. We propose a tracking framework that mediates grouping cues from two levels of tracking granularities, detection tracklets and point trajectories, for segmenting objects in crowded scenes. Detection tracklets capture objects when they are mostly visible. They may be sparse in time, may mis ..."
Abstract
-
Cited by 8 (1 self)
- Add to MetaCart
(Show Context)
Abstract. We propose a tracking framework that mediates grouping cues from two levels of tracking granularity, detection tracklets and point trajectories, for segmenting objects in crowded scenes. Detection tracklets capture objects when they are mostly visible. They may be sparse in time, may miss partially occluded or deformed objects, or contain false positives. Point trajectories are dense in space and time. Their affinities integrate long-range motion and 3D disparity information, useful for segmentation. Affinities may, though, leak across similarly moving objects, since they lack model knowledge. We establish one trajectory graph and one detection tracklet graph, encoding grouping affinities in each space and associations across them. Two-granularity tracking is cast as simultaneous detection tracklet classification and clustering (cl²) in the joint space of tracklets and trajectories. We solve cl² by explicitly mediating contradictory affinities in the two graphs: detection tracklet classification modifies trajectory affinities to reflect object-specific dis-associations, and non-accidental grouping alignment between detection tracklets and trajectory clusters boosts or rejects the corresponding detection tracklets, changing their classification accordingly. We show our model can track objects through sparse, inaccurate detections and persistent partial occlusions. It adapts to the changing visibility masks of the targets, in contrast to detection-based bounding-box trackers, by effectively switching between the two granularities according to object occlusions, deformations and background clutter.
Flattening Supervoxel Hierarchies by the Uniform Entropy Slice
"... Supervoxel hierarchies provide a rich multiscale decomposition of a given video suitable for subsequent processing in video analysis. The hierarchies are typically computed by an unsupervised process that is susceptible to undersegmentation at coarse levels and over-segmentation at fine levels, whic ..."
Abstract
-
Cited by 7 (2 self)
- Add to MetaCart
(Show Context)
Supervoxel hierarchies provide a rich multiscale decomposition of a given video suitable for subsequent processing in video analysis. The hierarchies are typically computed by an unsupervised process that is susceptible to under-segmentation at coarse levels and over-segmentation at fine levels, which makes it a challenge to adopt the hierarchies for later use. In this paper, we propose the first method to overcome this limitation and flatten the hierarchy into a single segmentation. Our method, called the uniform entropy slice, seeks a selection of supervoxels that balances the relative level of information in the selected supervoxels based on some post hoc feature criterion such as objectness. For example, with this criterion, in regions near objects our method prefers finer supervoxels to capture local detail, whereas in regions away from any objects it prefers coarser supervoxels. We formulate the uniform entropy slice as a binary quadratic program and implement four different feature criteria, both unsupervised and supervised, to drive the flattening. Although we apply it only to supervoxel hierarchies in this paper, our method is generally applicable to segmentation tree hierarchies. Our experiments demonstrate both strong qualitative performance and superior quantitative performance to state-of-the-art baselines on benchmark internet videos.
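The selection idea the abstract describes (finer supervoxels near objects, coarser ones elsewhere) can be caricatured with a greedy sketch over a two-level hierarchy. Note this is not the paper's binary quadratic program: the `objectness` scores and the threshold rule below are assumptions for illustration only.

```python
# Hypothetical greedy stand-in for hierarchy flattening: for each
# region of a two-level supervoxel tree, keep either the coarse
# parent or its fine children, preferring fine nodes where an
# assumed "objectness" score is high. The paper itself solves a
# binary quadratic program; this only illustrates the selection idea.

def flatten(tree, objectness, thresh=0.5):
    """tree: {parent_id: [child_ids]}; objectness: score per parent.
    Returns the chosen flat segmentation as a list of node ids."""
    chosen = []
    for parent, children in tree.items():
        if objectness[parent] >= thresh:
            chosen.extend(children)   # near objects: keep fine detail
        else:
            chosen.append(parent)     # background: keep coarse region
    return chosen

# Toy usage with hypothetical region ids and scores.
tree = {"sky": ["sky_a", "sky_b"], "car": ["car_a", "car_b"]}
scores = {"sky": 0.1, "car": 0.9}
print(flatten(tree, scores))  # → ['sky', 'car_a', 'car_b']
```

The actual method optimizes this fine-vs-coarse trade-off jointly over the whole hierarchy rather than per parent, which is what makes the binary quadratic program necessary.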
Hierarchical video representation with trajectory binary partition tree
- In CVPR, 2013
"... As early stage of video processing, we introduce an iter-ative trajectory merging algorithm that produces a region-based and hierarchical representation of the video se-quence, called the Trajectory Binary Partition Tree (BPT). From this representation, many analysis and graph cut tech-niques can be ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
(Show Context)
As an early stage of video processing, we introduce an iterative trajectory merging algorithm that produces a region-based and hierarchical representation of the video sequence, called the Trajectory Binary Partition Tree (BPT). From this representation, many analysis and graph cut techniques can be used to extract partitions or objects that are useful in the context of specific applications. In order to define trajectories and to create a precise merging algorithm, color and motion cues have to be used. Both types of information are very useful for characterizing objects but show strong differences of behavior in the spatial and temporal dimensions. On the one hand, scenes and objects are rich in their spatial color distributions, but these distributions are rather stable over time. Object motion, on the other hand, presents simple structures and low spatial variability but may change from frame to frame. The proposed algorithm takes this key difference into account and relies on different models and associated metrics to deal with color and motion information. We show that the proposed algorithm outperforms existing hierarchical video segmentation algorithms and provides more stable and precise regions.
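The iterative merging that produces a binary partition tree can be sketched generically: repeatedly merge the most similar pair of regions until a single root remains, recording each merge as an internal node. The `similarity` function below is a stand-in; the paper's separate color and motion models and their metrics are not reproduced here.

```python
# Hypothetical sketch of building a binary partition tree (BPT) by
# iteratively merging the most similar pair of regions. Leaves are
# numbers standing in for region descriptors; internal nodes are
# 2-tuples recording the merge order.

def build_bpt(regions, similarity):
    """regions: list of leaf descriptors. Returns a nested-tuple tree."""
    nodes = list(regions)
    while len(nodes) > 1:
        # find the most similar pair among all current nodes
        i, j = max(
            ((a, b) for a in range(len(nodes)) for b in range(a + 1, len(nodes))),
            key=lambda ab: similarity(nodes[ab[0]], nodes[ab[1]]),
        )
        merged = (nodes[i], nodes[j])              # new internal node
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)]
        nodes.append(merged)
    return nodes[0]

def leaf_mean(node):
    """Average of all leaf values under a (possibly nested) node."""
    if isinstance(node, tuple):
        return (leaf_mean(node[0]) + leaf_mean(node[1])) / 2
    return node

# Toy usage: regions 1 and 2 are closest, so they merge first.
sim = lambda a, b: -abs(leaf_mean(a) - leaf_mean(b))
print(build_bpt([1, 2, 10], sim))  # → (10, (1, 2))
```

The nested-tuple result is the hierarchy itself: cutting it at different depths yields the coarse-to-fine partitions that downstream analysis or graph-cut techniques would consume.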