45 citations found.
H. J. Zhang, C. Y. Low, and S. W. Smoliar, "Video parsing and browsing using compressed data," Multimedia Tools and Applic., vol. 1, no. 1, pp. 89--111, 1995.


This paper is cited in the following contexts:
Dissolve Detection in MPEG Compressed Video - Lifang Gu Ken (1997)   (1 citation)  (Correct)

....data, which must be restored from the compressed data by the time consuming decompression operation. Since most digital videos are stored in a compressed format such as MPEG, several algorithms for detecting abrupt shot boundaries directly in the compressed domain have recently emerged [1,4,7,11]. These methods use the information directly available from an MPEG stream, such as DCT coefficients, motion vectors and bit rates, as dissimilarity measures. Since decompression is not necessary, these compressed domain methods are computationally more efficient. In addition, some of these ....

H. Zhang, C.Y. Low and S.W. Smoliar, "Video parsing and browsing using compressed data", Multimedia Tools and Applications, vol. 1, no. 1, 1995.


Video Analysis in MPEG Compressed Domain - Gu (2002)   (Correct)

....difference based algorithms is that they are sensitive to busy scenes, in which intensities change substantially from frame to frame due to camera object motion. Since the availability of MPEG video, several algorithms for detecting cuts directly in the MPEG domain have emerged [AHC93, YL95, SD95, ZLS95, LZ95, FLM96, KDR96, IP97, GHP98, JHEJ98, KKC99, MIP99]. These methods use the information directly available from an MPEG stream, such as DCT coefficients, motion vectors and bit rates, to calculate the frame dissimilarity. A full review of cut detection algorithms operating directly on MPEG video ....

H.J. Zhang, C.Y. Low, and S.W. Smoliar. Video parsing and browsing using compressed data. Multimedia Tools and Applications, 1(1):89--111, 1995.


Rapid Estimation of Camera Motion from Compressed.. - Tan, Saur, Kulkarni.. (2000)   (13 citations)  (Correct)

....small intensity changes between successive images. In addition to reducing the computation time, use of the DC sequence can make the scene change detection more robust to small object and camera motions. The scene change detection algorithm used is similar to those proposed by Zhang et al. [3] and Yeo et al. [5]. However, instead of using the intensity histogram of the entire DC image, we divide each DC image into blocks (12 image blocks in this paper) and the intensity histograms of corresponding image blocks of successive frames are compared to each other. This allows us to capture ....

H. J. Zhang, C. Y. Low, and S. W. Smoliar, "Video parsing and browsing using compressed data," Multimedia Tools and Applicat., vol. 1, no. 1, pp. 89--111, Mar. 1995.
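
The block-wise histogram comparison described in the excerpt above can be sketched roughly as follows. This is only an illustrative sketch, not the cited authors' implementation: the 3 x 4 block layout (one way to obtain 12 blocks), the bin count, the L1 distance, the threshold, and all function names are assumptions made for the example.

```python
# Hypothetical sketch: compare intensity histograms of corresponding blocks
# in two successive DC images. Layout, bins, and threshold are assumptions.

def block_histograms(image, rows=3, cols=4, bins=8, max_val=256):
    """Split a grayscale image (list of pixel rows) into rows*cols blocks
    and return one intensity histogram per block."""
    h, w = len(image), len(image[0])
    hists = []
    for br in range(rows):
        for bc in range(cols):
            hist = [0] * bins
            for y in range(br * h // rows, (br + 1) * h // rows):
                for x in range(bc * w // cols, (bc + 1) * w // cols):
                    hist[image[y][x] * bins // max_val] += 1
            hists.append(hist)
    return hists

def block_histogram_difference(prev, curr, **kw):
    """Sum of per-block L1 histogram distances, scaled to the range [0, 1]."""
    total = 0
    for hp, hc in zip(block_histograms(prev, **kw), block_histograms(curr, **kw)):
        total += sum(abs(a - b) for a, b in zip(hp, hc))
    n_pixels = len(prev) * len(prev[0])
    return total / (2 * n_pixels)

def is_scene_change(prev, curr, threshold=0.5, **kw):
    """Declare a scene change when the block-wise difference exceeds a threshold."""
    return block_histogram_difference(prev, curr, **kw) > threshold
```

Comparing blocks rather than one whole-image histogram keeps some spatial information, so two frames with the same overall intensity distribution but a different layout can still be told apart.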


Survey of Compressed-Domain Features used in.. - Wang, Divakaran..   (Correct)

....of different frames are used to measure their similarity. They use this method to detect shot boundaries and show its effectiveness for cut detection. The computational complexity is relatively high, because inner product is involved in the calculation. b. DCT block difference Zhang et al. [68] compare the relative difference of all coefficients in a DCT block to measure the similarity between two DCT blocks. A cut is detected if a large amount of blocks have changed significantly in terms of DCT block difference. This method involves less computation than the above one. c. Variance of ....

....operation estimation based on dominant motion These methods estimate camera operations by looking at the motion vectors directly. They do not adopt explicit physical camera models like those mentioned above. Usually different operations (e.g. pan and zoom) are estimated separately. Zhang et al. [68] use sum of difference between motion vectors and the modal vector to determine pan and tilt. Zooming is detected by the change of signs of motion vectors across the center of zoom, as zooming often results in a pattern of many motion vectors pointing to or away from the zooming center. The ....

[Article contains additional citation context not shown here]

H. J. Zhang, C. Y. Low, and S. W. Smoliar, "Video parsing and browsing using compressed data," Multimedia Tools and Applications, Vol.1, No. 1, pp. 89-111, 1995.
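
As a rough illustration of the DCT block difference idea summarized in the first excerpt above, the sketch below marks a block as changed when the mean relative difference of its coefficients is large, and flags a cut when a large fraction of blocks have changed. The exact difference formula, both thresholds, and the function names are assumptions for this example and are not taken from the cited paper.

```python
# Hypothetical sketch of cut detection from per-block DCT coefficient differences.
# A frame is modeled as a list of blocks; a block is a list of DCT coefficients.

def block_difference(block_a, block_b):
    """Mean relative difference between corresponding DCT coefficients."""
    total = 0.0
    for a, b in zip(block_a, block_b):
        denom = max(abs(a), abs(b))
        if denom > 0:  # two zero coefficients contribute no difference
            total += abs(a - b) / denom
    return total / len(block_a)

def is_cut(frame_a, frame_b, block_threshold=0.5, changed_fraction=0.6):
    """Flag a cut when enough blocks differ by more than block_threshold."""
    changed = sum(
        1 for ba, bb in zip(frame_a, frame_b)
        if block_difference(ba, bb) > block_threshold
    )
    return changed / len(frame_a) > changed_fraction
```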


ShotWeave: A Shot Clustering Technique for Story Browsing.. - Zhou, Tavanapong (2001)   (3 citations)  (Correct)

....segmentation techniques. A typical automatic video segmentation consists of three important steps. The first step is shot boundary detection (SBD). A shot boundary is declared if a dissimilarity measure between consecutive frames exceeds a threshold value. Some of the recent SBD techniques are [3, 7, 1, 2, 8 13]. The second step is keyframe selection. For each shot, one or more frames that best represent the shot, termed key frame(s) are extracted to reduce the processing overhead in the next step. For instance, key frames can be selected from one or more pre determined location(s) in a shot. More ....

Zhang, H.J., Low, C.Y., Smoliar, S.W.: Video parsing and browsing using compressed data. Multimedia Tools and Applications 1 (1995) 89-111


A Noise-Reduction Approach to Scene Segmentation for Large.. - Tavanapong, Zhou   (Correct)

....important steps. In the first step, given an input video, a shot boundary is declared if a similarity measure between the boundary frames exceeds a threshold value. Recent years have seen several notable shot boundary detection techniques (SBDs) that either process compressed video data directly [14, 13, 10] or uncompress videos first before processing them [15, 4, 5, 1, 12, 7] In the subsequent step, one or more frames that best represent the shot, termed keyframe (s) are extracted for each shot to reduce the amount of processing overhead in the next step. Finally, shots are grouped into scenes ....

H. J. Zhang, C. Y. Low, and S. W. Smoliar. Video parsing and browsing using compressed data. Multimedia Tools and Applications, 1:89--111, 1995.


Representation Of Motion Activity In Hierarchical Levels .. - Sun, Manjunath..   (Correct)

....hierarchical levels, the P frame macroblock information of an MPEG video is used. There are two reasons for using P frame information. First, digital video possesses redundant information; therefore, P frames are good temporal samples of the original video, and have been used in many applications [9]. Second, P frames are encoded with macroblock information that can be processed quickly, as discussed in the next section. We adaptively segment video into levels with fixed percentages (1 to 20%) of original video length using the method proposed in [6]. This is based on the observation that ....

H. Zhang, C. Y. Low, and S. W. Smoliar, "Video parsing and browsing using compressed data," Multimedia Tools and Applications, 1(1): pp.89-111, 1995.


Brief Summary - One Of The   (Correct)

....differences. In the next step, each shot is represented by a number of key frames. In our tests we used only two key frames per shot, although we believe that the performance of the algorithm can be improved using more sophisticated key frame extraction methods, as proposed in [4], [5] or [6]. For each shot k, the GSU boundary likelihood L k is computed, by measuring its interrelationship to nextfollowing shots. L k represents the likelihood that the boundary between two consecutive GSUs can be assumed to lie around the shot k. In order to obtain a realistic picture about this ....

Zhang H., Low C.Y., Smoliar S.W., "Video Parsing and Browsing using Compressed Data", Multimedia Tools and Applications, vol. 1, pp. 89-111, Kluwer Academic Publishers, 1995.


Automated Segmentation of Movies into Logical Story Units - Hanjalic, Lagendijk, Biemond (1999)   (1 citation)  (Correct)

....[15] It is generally accepted that content analysis of video sequences requires a preprocessing procedure that first breaks up the sequences into temporally homogeneous segments called video shots [1, 2, 3, 8, 16], then condenses these segments into one or a few representative frames (key frames) [4, 7, 21, 24], and finally determines the relationship among shots using their key frame based representation. This last step we call video content organization. Since video streams to be dealt with in modern digital storage systems are mainly in MPEG compressed form, no content related operations are possible ....

....we define our own dissimilarity measure. We assume that the video sequence is segmented into shots, using any of the methods found in literature [1, 2, 3, 8, 16]. Each detected video shot is represented by one or multiple key frames such that its visual information is captured as good as possible [4, 7, 21, 24]. For dynamic shots more key frames are needed than for stationary shots. Since we are using MPEG compressed video sequences, the key frames are DC images which are typically 90 x 72 pixels, i.e. 64 times smaller than the original frames. For each shot, all key frames are merged together in one ....

Zhang H., Low C.Y., Smoliar S.W., "Video Parsing and Browsing using Compressed Data", Multimedia Tools and Applications, vol. 1, pp. 89-111, Kluwer Academic Publishers, 1995.


A General Framework For Video Segmentation Based On Temporal.. - Mohan (2000)   (1 citation)  (Correct)

....motion vectors in video frame to distinguish camera motions like panning and zooming from that of GTs. The algorithm is simple and intuitive, and has been found to be effective on a wide variety of video. Variants or extensions of such methods have been implemented in compressed DCT domain [11, 13, 14, 18]. The main problem of such methods is that they require careful threshold selection and are affected by noise [5] In particular, the thresholds for GT can only be chosen by experience; and the thresholds that work for one video may not work for other videos. The other problem with using color ....

Zhang HJ, Low CW, Gong Y & Smoliar SW [1995]. Video Parsing and Browsing Using Compressed Data, Multimedia Tools and Applications, 1(1), 91-111.


A Survey on: Content-based Access to Image and Video Databases - Aas, Eikvil (1997)   (Correct)

....over 200 gigabytes when uncompressed. If one was able to analyse video directly in compressed format, one would save both the auxiliary storage for decompressed data and the computational costs for decompression. Several methods have been proposed for scene change detection on compressed MPEG [22, 32, 20, 47, 41] and Motion JPEG [4] data. These methods have proved to be sufficiently accurate for segmentation of the majority of shots in a video sequence. Before describing the approaches for MPEG data, a brief description on the MPEG standard can be useful. MPEG defines three different types of frames. ....

....P frames and motion vector information to characterize scene changes. Sethi et al. [32] use only the DC coefficients of I frames to perform hypothesis testing using the luminance histogram, while Liu et al. [20] make use of only information in P and B frames to detect scene changes. Zhang et al. [47] detect abrupt transitions between shots by counting the number of valid motion vectors in P or B frames, while a full frame approach is taken to detect gradual transitions. Yeo et al. [41] have developed algorithms to identify both abrupt and gradual scene transitions using the DC coefficients ....

H. J. Zhang, C. Y. Low, S. W. Smoliar, Video parsing and browsing using compressed data. Multimedia Tools and Applications, Vol. 1, pp. 89--111, March, 1995.
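
The motion-vector counting idea mentioned in the second excerpt above might be sketched as below. This is a simplified illustration under stated assumptions, not the cited method: a frame is modeled as a list of macroblocks, each carrying either a motion vector tuple or None when no valid vector is available, and the `min_valid` threshold is hypothetical. The handling of forward and backward vectors in B frames is not modeled.

```python
# Hypothetical sketch: a predicted frame with very few valid motion vectors
# is poorly predicted from its reference, which suggests a possible cut.

def count_valid_motion_vectors(macroblocks):
    """Count macroblocks that carry a motion vector (None means no valid vector)."""
    return sum(1 for mv in macroblocks if mv is not None)

def find_candidate_cuts(frames, min_valid=2):
    """Return indices of frames whose valid motion vector count falls below min_valid."""
    return [
        i for i, macroblocks in enumerate(frames)
        if count_valid_motion_vectors(macroblocks) < min_valid
    ]
```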


Survey on Compressed-Domain Features used in.. - Wang, Divakaran..   (Correct)

....of different frames are used to measure their similarity. They use this method to detect shot boundaries and show its effectiveness for cut detection. The computational complexity is relatively high, because inner product is involved in the calculation. b. DCT block difference Zhang et al. [68] compare the relative difference of all coefficients in a DCT block to measure the similarity between two DCT blocks. A cut is detected if a large amount of blocks have changed significantly in terms of DCT block difference. This method involves less computation than the above one. c. Variance of ....

....operation estimation based on dominant motion These methods estimate camera operations by looking at the motion vectors directly. They do not adopt explicit physical camera models like those mentioned above. Usually different operations (e.g. pan and zoom) are estimated separately. Zhang et al. [68] use sum of difference between motion vectors and the modal vector to determine pan and tilt. Zooming is detected by the change of signs of motion vectors across the center of zoom, as zooming often results in a pattern of many motion vectors pointing to or away from the zooming center. The ....

[Article contains additional citation context not shown here]

H. J. Zhang, C. Y. Low, and S. W. Smoliar, "Video parsing and browsing using compressed data," Multimedia Tools and Applications, Vol.1, No. 1, pp. 89-111, 1995.
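
The pan and tilt test summarized in the second excerpt above (sum of differences between motion vectors and the modal vector) can be illustrated with the short sketch below. The L1 distance, the `max_deviation` threshold, the requirement that the modal vector be non-zero, and the function names are all assumptions for the example; zoom detection is not covered.

```python
# Hypothetical sketch: a frame looks like a pan or tilt when nearly all motion
# vectors agree with the most frequent (modal) vector and that vector is non-zero.
from collections import Counter

def modal_vector(vectors):
    """Return the most frequent motion vector in the frame."""
    return Counter(vectors).most_common(1)[0][0]

def deviation_from_mode(vectors):
    """Sum of L1 differences between each motion vector and the modal vector."""
    mx, my = modal_vector(vectors)
    return sum(abs(vx - mx) + abs(vy - my) for vx, vy in vectors)

def looks_like_pan_or_tilt(vectors, max_deviation=4):
    """True when the field is nearly uniform and the dominant motion is non-zero."""
    return modal_vector(vectors) != (0, 0) and deviation_from_mode(vectors) <= max_deviation
```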


Panoramic Video Capturing and Compressed Domain.. - Sun, Foote, Kimber.. (2001)   (2 citations)  (Correct)

....The ROI output can be used to display the video (see section 5) It can also be used to extract the ROI video from the original panoramic video for storage purposes. 4.2 Detection of the ROI in the Compressed Domain Compressed domain video processing can achieve fast speeds. While Zhang et al. [21] and many others use compression domain information as features for video data segmentation and indexing, very few efforts have been made to use them for detection purposes. An example of compressed domain face detection is given in [19] The method proposed here is based on our previous work on ....

Zhang, H. J., Low,C. Y., and Smoliar,S. W. "Video Parsing and Browsing Using Compressed Data," Multimedia Tools and Applications, 1(1): pp.89-111, 1995.


Accessing News Video Libraries through Dynamic Information.. - Christel (2001)   (1 citation)  (Correct)

....of Figure 2 contributed the most toward the document's ranking for the query air crash. Hence that shot's thumbnail is used to represent the whole document in Figure 1. The automatic breakdown of video into component shots has received a great deal of attention by the image processing community [13, 20, 24, 28, 29]. The thumbnail images for each shot can be arranged into a single chronological display, a storyboard surrogate, which captures the visual flow of a video document along with the locations of matches to a query. From Figure 1's interface, clicking on the filmstrip icon for a document displays a ....

....of digital video. A number of video libraries have converged on the idea of including a thumbnail image in the storyboard for each shot in the video, including CAETI [28] Pictorial Transcripts [13] and the Baltimore Learning Community [15] backed by early research by Zhang, Aoki, and others [2, 29]. Others suggest that more than one image should be added per shot depending on the composition of the shot determined through motion analysis [27] Other researchers have implemented subsampling; in which a storyboard image is extracted at evenly distributed intervals across a video [25] An ....

Zhang, H.J., Low, C.Y., and Smoliar, S.W. Video parsing and browsing using compressed data. Multimedia Tools and Applications 1(1) (1995), 89-111.


Image Browsing using Hierarchical Clustering - Krishnamachari, Abdel-Mottaleb (1999)   (1 citation)  (Correct)

....the user navigate through the database in a structured manner. In [9] a content management systems for home video libraries, which automatically segments the video clips and extracts a visual table of content, was presented. The user can browse the video material through the table of content. In [10], a system was presented for parsing and browsing the video through the extracted key frames. In this paper we present a browsing approach where users can navigate non linearly through the images in the database. The approach is to automatically create a hierarchical clustering of the images in ....

H.J. Zhang, Y.L. Chien, and S.W. Smoliar, "Video Parsing and Browsing Using Compressed Data," Proc. Multimedia Tools and Applications, Vol. 1, No. 1, pp. 89-111, 1995.


Temporal Video Segmentation: A Survey - Koprinska, Carrato (2001)   (2 citations)  (Correct)

....are more difficult to detect than cuts. They must be distinguished from camera operations (Figure 4) and object movement that exhibit temporal variances of the same order and cause false positives. It is particularly difficult to detect dissolves between sequences involving intensive motion [14,44,47]. Figure 4 Camera operation recognition is an important issue also for another reason. As camera operations usually explicitly reflect how the attention of the viewer should be directed, the clues obtained are useful for key frame selection. For example, when a camera pans over a ....

....simple, requires minimum encoding and produces good accuracy. The total number of parameters needed to implement this algorithm is 7. 2.2. 5 Temporal Video Segmentation Based on DCT Coefficients, MB Coding Mode and MVs A very interesting two pass approach is taken by Zhang, Low and Smoliar [47]. They first locate the regions of potential transitions, camera operations and object motion, applying the pair wise DCT coefficients comparison of I frames (Eq.15) as in their previous approach (see section 2.2.2) The goal of the second pass is to refine and confirm the break points detected by ....

H.J. Zhang, C.Y. Low, S.W. Smoliar, Video parsing and browsing using compressed data, Multimedia Tools and Applications 1 (1995) 89-111.


Hybrid Rule-Based/Neural Approach For Segmentation Of Mpeg.. - Koprinska, Carrato   (Correct)

....are more difficult to detect than cuts. They must be distinguished from camera operations (Figure 3) and object movement that exhibit temporal variances of the same order and cause false positives. It is particularly difficult to detect dissolves between sequences involving intensive motion [9,12,35,37]. Figure 1. Dissolve, cut. Figure 2. Fade out followed by fade in. Figure 3. Basic camera operations: fixed, zooming (focal length change of a stationary camera) panning tilting (camera rotation around its horizontal vertical axis) tracking booming (horizontal vertical transverse ....

.... transitions are detected by cumulative difference measures [36] Other categories of video segmentation techniques are feature based [35] and model based [1,8,12,34] Camera operations are recognized by computing the motion vectors between successive frames and analyzing their characteristics [3,37] or by examination of spatiotemporal images [4,30] For good seminal papers on video segmentation in uncompressed domain see [2,7,9] For video segmentation in MPEG compressed domain a natural solution is to use the DC terms as they are directly related to the pixel domain, possibly ....

[Article contains additional citation context not shown here]

H.J. Zhang, C.Y. Low, and S.W. Smoliar, "Video parsing and browsing using compressed data," Multimedia Tools & Applications, Vol. 1, pp. 89-111, 1995.


VCB: Video Clip Browser and more - Messelodi, Modena (1995)   (Correct)

....In the particular case of video streams, the use of browsers relieves the operator of the slow and fatiguing process of scanning the sequence frame by frame. In the recent literature, many works can be found dealing with the development of browsing tools for video streams, see [7], [8], [9], [10], [11], [12], [13]. In this paper we present the Video Clip Browser and more, a set of tools designed to navigate through the frames of a video clip. The distinguishing feature of our system is that the browsing tools are placed in a common environment and are closely connected each other, i.e. ....

H. J. Zhang, C. Y. Low, and S. W. Smoliar. Video Parsing and Browsing Using Compressed Data. Multimedia Tools and Applications, 1:91--113, 1995.


Visual Information Retrieval from Large Distributed.. - Chang, Smith, Beigi.. (1997)   (24 citations)  (Correct)

....and transcripts of broadcast video [6] for news video retrieval. Complementary to visual search is visual summarization. By decomposing the video, e.g. by using automated scene detection, a more spatially and or temporally compact presentation of the video can be generated. For example, [7] has developed news video summarization systems, with efficient browsing interfaces using video event detection and clustering. Others have also developed automated video analysis techniques of continuous video sequences to generate mosaic images for improved browsing and indexing. Other research ....

H. Zhang, C.Y. Low, and S. Smoliar, "Video Parsing and Browsing Using Compressed Data," J. of Multimedia Tools and Applications, Vol. 1, No.1, Kluwer Academic Publishers, March 1995, pp.89-111.


Columbia's VoD and Multimedia Research Testbed.. - Chang.. (1996)   (3 citations)  (Correct)

.... For example, dedicated storage architectures for real time multi access have been studied in [9, 10, 11, 12, 13]. Systematic approaches to the design of video servers (VS) are reported in [14, 15, 16, 17]. Innovative methods for indexing searching images by image contents were addressed in [4, 5, 6, 7, 8]. In addition, many field trials of VOD services using proprietary high performance VS technologies have made news headlines. Lastly, a major international forum, DAVIC, has been active in specifying standards for critical protocols and interfaces for achieving interoperability between various ....

H. Zhang, C.Y. Low, and S. Smoliar, "Video Parsing and Browsing Using Compressed Data," J. of Multimedia Tools and Applications, Vol. 1, No.1, Kluwer Academic Publishers, March 1995, pp.89-111.


Columbia's VoD and Multimedia Research Testbed.. - Chang.. (1997)   (3 citations)  (Correct)

....fields as well. For example, dedicated storage architectures for real time multi access have been studied in [8, 9, 10]. Systematic approaches to the design of video servers (VS) are reported in [11, 12, 13]. Innovative methods for indexing searching images by image contents were addressed in [4, 5, 6, 7]. In addition, many field trials of VoD services using proprietary high performance VS technologies have made news headlines. Lastly, a major international forum, DAVIC, has been active ....

H. Zhang, C.Y. Low, and S. Smoliar, "Video Parsing and Browsing Using Compressed Data," J. of Multimedia Tools and Applications, Vol. 1, No.1, Kluwer Academic Publishers, March 1995, pp.89-111.


Illumination Invariant Image Indexing Using Moments and.. - Mandal, Panchanathan.. (1998)   (Correct)

....regular or central moments. With the advent of various image compression standards [10,11] the current and future databases are likely to employ compression techniques for efficient storage. This has led to the proliferation of a number of compressed domain indexing techniques in the literature [12]. Here, indexing is performed directly on the compressed data (see Fig. 2). Various indexing techniques have been proposed employing Karhunen-Loeve transform (KLT) [13], discrete cosine transform (DCT) [14] and discrete wavelet transform (DWT) [15]. A technique exploiting the directional property ....

H. Zhang, C. Y. Low and S. W. Smoliar, "Video parsing and browsing using compressed data," Multimedia Tools and Applications 1(1), 89-111, (1995).


Zodiac: A History-Based Interactive Video Authoring System - Chiueh, Mitra, Neogi, Yang (1998)   (4 citations)  (Correct)

....for typical video clips of 360x240 resolution. For thumbnail frames of resolution 64x64, compressed frame size is 67 to 87 of the original. 5 Shot Scene Boundary Detection Much research effort has been invested in the detection of shot scene boundaries of digital video sequences. Zhang et al. [21] has a nice summary on the recent results using image processing techniques. Zodiac takes a completely different approach in that it attempts to uncover high level structures of digital video sequences by analyzing their associated edit history. The key observation underlying this approach is that ....

H. Zhang, Chien Yong Low, S. W. Smoliar, "Video parsing and browsing using compressed data", Multimedia Tools and Applications, vol.1, no.1, p. 89-111, March 1995.


Image Indexing Using Moments and Wavelets - Mandal, Aboulnasr, Panchanathan (1996)   (13 citations)  (Correct)

....this technique fails to provide a good coding performance. Moreover, the performance of the technique is sensitive to the weights which should be chosen carefully. An indexing scheme might be more useful when it is associated with the coding scheme used for storing the images in the database [6]. Lee et al. [7] have proposed an indexing technique in the subband domain [8] where the histograms of the lowpass subband coefficients are compared hierarchically from the highest level to the lowest level (i.e. the image) of the pyramid. Although the hierarchical nature of this technique reduces ....

H. Zhang, C. Y. Low and S. W. Smoliar, "Video parsing and browsing using compressed data," Multimedia Tools and Applications, Vol. 1, No. 1, pp. 89-111, 1995.


Tools for Browsing a TV Situation Comedy Based on Content.. - Joshua Wachman (2001)   (3 citations)  (Correct)

....tools must be available to examine the underlying video content in order to provide the means for intelligent responses to database queries. The last is among the most provocative issues related to video browsers, and the one motivating this work. There are several browsing methods [14], [19], [21], [9], [7], [2] which are useful in analyzing, labeling, correlating, integrating or reducing the search space associated with data in a video sequence. None of these is directed at identifying specific plot elements, such as which characters are in a particular shot. What distinguishes the ....

H. Zhang, C. Y. Low, and S. Smoliar. Video parsing and browsing using compressed data. Journal of Multimedia Tools and Applications, 1(1):89--111, March 1995.


Compressed Domain Video Segmentation - Kobla, Doermann, Rosenfeld (1996)   (5 citations)  (Correct)

....shot changes. Cherfaoui and Bertin [2] describe a technique based on camera movements. Zabih et al. [17] describe a technique using intensity edges that detects various edits such as cuts, dissolves, fades, etc. A lot of work has also been done on detecting shot changes in the compressed domain [1, 7, 9, 11, 13, 16, 19]. The earliest approaches were by Arman et al. [1] who used only the DCT information of I frames to detect disparity. This technique works well with clips containing only I frames (XING format or Motion JPEG) but has problems if the clip contains P and B frames and the I frames are spaced widely ....

....from component DCT blocks [15]. These techniques would allow the P and B frames in an MPEG clip to be treated just like any other I frame and there would be no loss of temporal resolution, as noted above. This technique was employed by Yeo and Liu [16] and by Shen and Delp [13]. Zhang et al. [19] used a hybrid approach involving a DCT comparison technique for the I frames and a motion vector validity scheme for the P and B frames. Meng et al. [9] also used the validity of motion vectors to detect shot changes. They described an adaptive technique for determining the thresholds used in ....

[Article contains additional citation context not shown here]

H.J. Zhang, C.Y. Low, and S.W. Smoliar. Video parsing and browsing using compressed data. Multimedia Tools and Applications, 1:89--111, 1995.


Video Scene Change Detection Using The Generalized Sequence Trace - Taskiran, Delp (1998)   (4 citations)  (Correct)

....by a grant from the Rockwell Foundation. Address all correspondence to E. J. Delp, ace ecn.purdue.edu, http: www.ece.purdue.edu #ace or 1 765 494 1740. domain edge detection algorithm. This method uses the number of entering and exiting edge pixels to find scene changes. Zhang, et al. [4] used the normalized inner product of vectors consisting of predetermined collections of DCT coefficients from a number of preset regions in a frame. They then use a global threshold to detect scene changes. Yeo and Liu [5] detect scene changes by using both pixel differences and luminance ....

....[Figure 2: The generalized trace for tv2; horizontal axis: frame number] 3. SCENE CHANGE DETECTION Generally, after a dissimilarity measure is derived from the video sequence, most work use some type of global thresholding technique to detect the scene changes [4], [1]. Simple as this approach may be, a priori selection of the threshold is a problem since scene change is a local activity. Considering this fact, others have used sliding windows to process the data and detect shots [5]. We approach the problem differently. Noting that the edges of the ....

H. Zhang, C. Low, and S. Smoliar, "Video parsing and browsing using compressed data," SPIE Conference on Multimedia Tools and Applications Vol. 1, No. 1, pp. 89-111, 1995.
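
The normalized inner product with a global threshold, as the first excerpt above describes it, can be sketched as follows. This is an assumption-laden illustration rather than the cited implementation: each frame is reduced to a hypothetical feature vector of selected DCT coefficients, the dissimilarity is taken as one minus the absolute normalized inner product, and the threshold value is arbitrary.

```python
# Hypothetical sketch: scene change detection from the normalized inner product
# of per-frame DCT coefficient vectors, using one global threshold.
import math

def normalized_inner_product(u, v):
    """Cosine of the angle between two coefficient vectors (0.0 if either is all zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def detect_scene_changes(vectors, threshold=0.2):
    """Return indices i where the dissimilarity between frame i-1 and frame i
    exceeds the global threshold."""
    changes = []
    for i in range(1, len(vectors)):
        dissimilarity = 1.0 - abs(normalized_inner_product(vectors[i - 1], vectors[i]))
        if dissimilarity > threshold:
            changes.append(i)
    return changes
```

A single global threshold is simple to apply, but as the second excerpt notes, choosing it a priori is difficult because scene changes are local events.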


Visual Search In A Smash System - Lagendijk, Hanjalic, Ceccarelli.. (1997)   (6 citations)  (Correct)

....form. Most existing approaches to locate shot changes (video parsing) and key frames avoid the decoding of the compressed data by operating on information that is directly available in the bitstream, namely intra coded pictures, DC values (avoiding inverse DCTs) and motion vector information [3,4,6 8]. The second issue is that of determining where shot changes occur and what frames qualify as key frames. Most proposals determine where shot changes occur if a certain frame-to-frame action measure A i (n) surpasses a threshold [9]. Within shot i usually the first and or last frame are selected as ....

H. Zhang, C.Y. Low and S.W. Smoliar, "Video Parsing and Browsing using Compressed Data", Multimedia Tools and Applic., vol. 1, pp. 89-111, Kluwer Acad. Publ., 1995.


Visual Information Retrieval from Large Distributed.. - Chang, Smith, Beigi.. (1997)   (24 citations)  (Correct)

....sunsets) in [7]. Textual indexes are generated from the captions and transcripts of broadcast video [8,9,10] for news video retrieval. A complementary function with visual search is summarization. Scene based techniques are used in efficient browsing interfaces and event detection and clustering [11,12]. Video analysis techniques are used to construct mosaic images for efficient browsing and indexing from continuous video sequences [13]. Work has also begun in a critical area which aims at automatic decoding of semantic meanings of visual content. Learning through iterative user interaction ....

H. Zhang, C.Y. Low, and S. Smoliar, "Video Parsing and Browsing Using Compressed Data," J. of Multimedia Tools and Applications, Vol. 1, No.1, Kluwer Academic Publishers, March 1995, pp.89-111.


Multimedia Applications and Their Implications on Database.. - Klas, Aberer (1995)   (5 citations)  (Correct)

....access. Another reason for introducing such abstractions is to allow the user to refer to the data in terms of abstractions which make up his model of the application domain. These abstractions may be provided by the user or by the system based on the contents of the multimedia data (see e.g. [JLS95]). It can be very reasonable to store these derived abstractions since their computations may be very expensive. For the retrieval and organization of the multimedia data it should be possible to provide several layers of abstractions. Assume for example that we have a database of videos. A ....

H. Jiang, C.Y. Low, and S.W. Smoliar. Video Parsing and Browsing Using Compressed Data. Multimedia Tools and Applications, 1(1):89--111, March 1995.


Browsing operations on MPEG video - Radhika Nagpal   (Correct)

....different layout schemes can be employed to optimize disk seeks between GOPs. The perceptual effect however is that of highlights of the video clip rather than fast motion. The picture quality is preserved but the perception of motion is lost. A similar kind of browsing is that using keyframes [3]. Keyframes display representative frame(s) for every shot or scene change. Unlike the IBM scheme the speedup here is fixed and every shot and scene boundary is displayed. This requires segmentation and parsing of the video stream which is a complex operation for compressed video. 4.1 Play I ....

Zhang, Low, Smoliar. "Video Parsing and Browsing Using Compressed Data", Multimedia Tools and Applications, Vol.1, No.1, 1994.


Light-years from Lena: Video and Image Libraries of the Future - Picard (1995)   (19 citations)  (Correct)

....the same point of interest, e.g. a person browsing through a store. A segment is a sequence of scenes that forms a story unit, e.g. a flashback. Video parsing research has been directed so far toward the problem of detecting shots, e.g. Araman et al. [4], Tonomura et al. [5], and Zhang et al. [6]. Most of the methods have relied either on differencing (high pass filtering) all the pixels or a subset of them in two frames, or on differences of gray level or color statistics. In both cases a close analogy exists to early work in spatial edge detection. Matched filter methods are also ....

....in complexity for the goal of compression alone. However, if the goal is both compression and content access, then the extra complexity is justified, for it saves enormous work during the searching stage. Shot change detection has already been run successfully on compressed MPEG and JPEG data [4] [6]. Smith and Chang have also run retrieval algorithms directly on compressed data [22]. 3 Summary: Hard questions In this short paper I have tried to overview key image processing research problems which must be solved to give people access to the content of digital video and image libraries. ....

H.-J. Zhang, C. Y. Low, and S. W. Smoliar, "Video parsing and browsing using compressed data," Multimedia Tools and Applications, vol. 1, pp. 80--111, March 1995.


A Multi-paradigm Querying Approach for a Generic Multimedia .. - Wen, Li, Ma, Zhang (2002)   Self-citation (Zhang)   (Correct)

No context found.

Zhang, H. J. et al, "Video parsing and browsing using compressed data" Multimedia Tools and Applications 1(1), 89-111, 1995


A Multithreaded Client-Server Architecture for Distributed.. - Gollapudi (1996)   (1 citation)  Self-citation (Zhang)   (Correct)

....t 3 Gamma t 4 Average Delay 0 0 0 Speed Ratio 1 1 1 Utilization 1 1 1 Skew 0 3 0 3 0 3 Table 6.2: Parameter values for presentation with delay recovery. In the second set of experiments, we presented a video stream along with an audio stream. The video clips are MPEG encoded streams [SZ94, ZTSY95, ZLS95]. The nominal presentation schedule is shown in Figure 6.5(a). Unlike the slide presentation, the presentation of both the streams is continuous. The allowable skip for both the video and audio stream is set to 0.50 seconds. Similar to the previous case, QoS parameters were measured with and ....

H.J. Zhang, C.Y. Low, and S.W. Smoliar. Video Parsing and Browsing Using Compressed Data. Multimedia Tools and Applications, 1(1):89--111, March 1995.


Semantic Multicast: Intelligently Sharing.. - Dao, Perry, Shek..   (9 citations)  Self-citation (Zhang)   (Correct)

....algorithms have been developed for video data files which are static, these techniques have to be adapted to a real time video transmission scenario, where the video has been compressed and packetized. Of particular importance are methods which operate on the bits in the compressed domain [22, 16]. Apart from such scene change tags, other types of annotations which can be derived from the processing of video data are low level image features such as spatial color representations, texture measures etc. which can be important features to search for other scenes which are visually similar. ....

H.J. Zhang, C.Y. Low, and S.W. Smoliar. Video parsing and browsing using compressed data. Multimedia Tools and Applications, 1995.


Video Clustering - Vailaya, Jain, Zhang   Self-citation (Zhang)   (Correct)

....clips, it may be possible to cluster shots into classes based on the various topics covered in the lectures making future indexing efficient. 2. Previous Work Work reported in the literature on video parsing and representation can be broadly classified into shot detection and keyframe extraction [18, 17, 19, 14], clustering of shots [1, 16, 15] and clustering of still images (keyframes) [3, 10, 4, 9, 21, 22]. 2.1. Shot Detection and Keyframe Extraction Shots have been identified as the fundamental unit of video and their detection is the foremost task of scene segmentation. Shot boundaries can be ....

....wipes, etc.). Techniques described in [18, 17] use a priori models for video parsing and segmentation and may not be effective for general purpose video segmentation. Recent schemes based on video content encoded in DCT (Discrete Cosine Transform) coefficients and motion vector information [19], and reduced image sequences [14] have been reported in the literature. These schemes are sufficiently accurate in segmenting the video into shots and save both the additional storage and computation costs required in the decompression of video. Representation of shots is another key issue in ....

[Article contains additional citation context not shown here]

H. J. Zhang, C. Y. Low, and S. W. Smoliar. Video parsing and browsing using compressed data. Multimedia Tools and Applications, (1):89--111, 1995.


QoS Management in Educational Digital Library Environments - Zhang, Gollapudi (1996)   (2 citations)  Self-citation (Zhang)   (Correct)

....Delay 1 1 1.71 11 Speed Ratio 0.89 0.78 0.9 Utilization 1 1 1 1 Skew 0 3 0 3 0 3 0 3 Table 2: Parameter values for presentation without delay recovery. In the second set of experiments, we presented a video stream along with an audio stream. The video clips are MPEG encoded streams [SZ94, ZTSY95, ZLS95]. The nominal presentation schedule is shown in Figure 6(a). Unlike the slide presentation, the presentation of both the streams is continuous. The allowable skip for both the video and audio stream is set to 0.50 seconds. Similar to the previous case, QoS parameters were measured with and ....

H.J. Zhang, C.Y. Low, and S.W. Smoliar. Video Parsing and Browsing Using Compressed Data. Multimedia Tools and Applications, 1(1):89--111, March 1995.


A Unified Framework for Semantic Shot Classification in.. - Duan, Xu, Tian, Xu, Jin (2005)   (Correct)

No context found.

H. J. Zhang, C. Y. Low, and S. W. Smoliar, "Video parsing and browsing using compressed data," Multimedia Tools and Applic., vol. 1, no. 1, pp. 89--111, 1995.


Video Transition: Modelling and Prediction - Ren, Singh   (Correct)

No context found.

H.J. Zhang, C.Y. Low and S.W. Smoliar, "Video parsing and browsing using compressed data", Multimedia Tools and Applications, vol. 1, pp. 89-111, 1995.


A Motion Activity Descriptor and Its Extraction in.. - Sun, Divakaran, Manjunath   (3 citations)  (Correct)

No context found.

Hongjiang Zhang, Chien Yong Low, and Stephen W. Smoliar, "Video parsing and browsing using compressed data," Multimedia Tools and Applications, 1(1): pp. 89-111, 1995.


Video Segmentation using Combined Cues - Toller, Lewis, Nixon (1998)   (Correct)

No context found.

H.J. Zhang, C. Y. Low, and S.W. Smoliar. Video parsing and browsing using compressed data. In Multimedia Tools and Applications, volume 1, pages 89-111. Kluwer Academic Publishers, Boston, 1995.


Compact Color Descriptor for Fast Image and Video.. - Krishnamachari.. (2000)   (Correct)

No context found.

H.J. Zhang, Y.L. Chien, and S.W. Smoliar, "Video Parsing and Browsing Using Compressed Data," Proc. Multimedia Tools and Applications, Vol. 1, No. 1, pp. 89-111, 1995.


A Sequence Analysis System for Video Databases - Ceccarelli Hanjalic Lagendijk   (Correct)

No context found.

H.J. Zhang, C.Y. Low, S.W. Smoliar: "Video parsing and browsing using compressed data", Multimedia tools and applications, Mar. 95, pp 89-112


Indexing, Browsing and Searching of Digital Video and Digital.. - Smeaton (2000)   (Correct)

No context found.

Zhang, H., Low, C. and Smoliar, S.. Video Parsing and Browsing Using Compressed Data. Multimedia Tools and Applications. 1:89-111, (1995).


Information Visualization within a Digital Video Library - Christel, Martin (1998)   (6 citations)  (Correct)

No context found.

Zhang, H.J., Low, C.Y., and Smoliar, S.W. (1995b). Video parsing and browsing using compressed data. Multimedia Tools and Applications, 1, 89-111.


CiteSeer.IST - Copyright Penn State and NEC