4 citations found.
M. R. Naphade and T. S. Huang. Stochastic modeling of soundtrack for efficient segmentation and indexing of video. To be presented at SPIE Storage and Retrieval, Jan 2000.

This paper is cited in the following contexts:
Semantic Video Indexing using a probabilistic framework - Naphade (6 citations)

....a number of pixels more than a threshold are processed. This is done to avoid processing small non-dominant regions and noisy spurious regions that may have formed due to imperfect segmentation. Each tracked region thus gives a time series of feature vectors. Audio features are extracted as in [8]. 4 Constructing Multijects for semantic concepts We use a unified approach to model concepts in video and audio (independently and jointly). In case of sites the feature vector is modeled as a Gaussian mixture model (GMM). The temporal flow is not taken into consideration. In case of objects and ....

....matrices, mixing proportions (GMM and HMMs) and transition matrix (HMM). The following site multijects are used in our experiments: sky, water, forest, rocks and snow. In audio we have developed various multijects including human speech, aircraft flying, and music, with an approach similar to [8] (we do not report the results for multijects involving audio in this work). We would like to stress that our framework remains conceptually the same in cases of concepts using either video or audio or both. Denoting the feature vector for the region $j$ as $\vec{X}_j$, and using the parlance of hypothesis ....

A Probabilistic Framework for Semantic Indexing and Retrieval .. - Naphade, Huang (2000) (6 citations) Self-citation (Naphade Huang)

....components for sites and 98 components for objects and events. Each segmented region with a bounding box with a sufficient number of pixels is processed to avoid processing small non-dominant regions. The tracked region thus gives a time series of feature vectors. Audio features are extracted as in [8]. 4 Constructing Multijects We have a unified approach to model video and audio (independently or together). Let $\vec{X}_j$ be the feature vector for region $j$. For each concept we define hypotheses $H_0$, $H_1$ as $H_0: \vec{X}_j \sim P_0(\vec{X}_j)$, $H_1: \vec{X}_j \sim P_1(\vec{X}_j)$ (1), where $P_0$( ....

....[10] to estimate the means, covariance matrices, mixing proportions (GMM and HMMs) and transition matrix (HMM). The following site multijects have been developed: sky, water, forest, rocks and snow. In audio we have developed various multijects including human speech, aircraft flying, music, etc. [8]. For models involving both audio and video we follow the hierarchical HMM approach as in [6]. 4.1 Frame level multiject based features It is futile to expect perfect segmentation. We address the problem of multiple concepts existing in the same region. This is done by checking each segment for ....

MARS (Multimedia Analysis and Retrieval System): A test-bed.. - Huang, Naphade Self-citation (Naphade Huang)

....15 mel frequency cepstral coefficients (MFCC), 15 delta and 2 energy coefficients. A window width of 25 ms and overlap of 10 ms is used. This gives a 32-coefficient feature vector per window. The segmentation of the audio track and the detection of audio multijects are performed simultaneously [9]. 4 Estimating the multiject models We now describe how we construct the multijects for various concepts. We use a unified approach to model concepts in video and audio (independently and jointly). 4.1 Multijects based on Video In case of sites the feature vector is modeled as a Gaussian ....
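The windowing arithmetic in the excerpt above can be illustrated with a minimal sketch. It assumes that "overlap of 10 ms" means consecutive 25 ms windows share 10 ms, giving a 15 ms hop; the excerpt does not confirm this reading, and the function name `frame_starts_ms` is hypothetical rather than taken from the cited paper.

```python
# Sketch only: window layout and feature-vector size from the quoted numbers.
# ASSUMPTION: hop = window width - overlap (25 ms - 10 ms = 15 ms).

def frame_starts_ms(duration_ms, window_ms=25, overlap_ms=10):
    """Return the start time (ms) of every full window that fits in the signal."""
    hop_ms = window_ms - overlap_ms
    starts = []
    t = 0
    while t + window_ms <= duration_ms:
        starts.append(t)
        t += hop_ms
    return starts

# 15 MFCC + 15 delta + 2 energy coefficients per window, as stated in the excerpt.
FEATURE_DIM = 15 + 15 + 2

if __name__ == "__main__":
    starts = frame_starts_ms(100)
    print(len(starts), "windows of", FEATURE_DIM, "coefficients each")
```

If the intended meaning is instead a 10 ms shift between windows, only the `hop_ms` line changes.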

....replace the Gaussian mixture models and the feature vectors for all the frames within a shot constitute the time series modeled by the HMM. 4.2 Multijects based on Audio In audio we have developed various multijects including human speech, aircraft flying, and music, with an approach similar to [9]. We use hidden Markov models to develop audio multijects. For each audio multiject a prototype HMM with 3 states and a mixture of 15 Gaussian components in each state is used to model the temporal characteristics and the emitting densities of the class. For each mixture a diagonal covariance ....

[Article contains additional citation context not shown here]

A Factor Graph Framework for Semantic Indexing and .. - Naphade.. (2000) (1 citation) Self-citation (Naphade Huang)

....affine motion parameters for each region tracked by the spatio-temporal segmentation algorithm are used as motion features. The feature vector has 84 components for sites (only color, texture and edge features) and 98 components for objects and events. 1 Audio features are extracted as in [9]. 4 Modeling semantic concepts using Multijects We use an identical approach to model concepts in video and audio (independently and jointly). The following site multijects are used in our experiments: sky, water, forest, rocks and snow. Results on audio-based multijects like human speech, ....

....concepts using Multijects We use an identical approach to model concepts in video and audio (independently and jointly). The following site multijects are used in our experiments: sky, water, forest, rocks and snow. Results on audio-based multijects like human speech, music are presented in [9] and those on audio-visual multijects like explosion are presented in [7]. Denoting the feature vector for the region $j$ as $\vec{X}_j$, we model the concept as a binary random variable and define the two hypotheses $H_0$ and $H_1$ as $H_0: \vec{X}_j \sim P_0(\vec{X}_j)$, $H_1: \vec{X}_j \sim P_1(\vec{X}_j)$ (1) ....
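The two-hypothesis setup quoted above can be sketched as a comparison of two class-conditional densities. The excerpt is truncated before any decision rule is given, so the log-likelihood ratio used here is an assumption, as are the diagonal-covariance Gaussian mixtures for both densities, the toy parameters, and every function name; none of it is taken from the cited papers.

```python
# Sketch only: compare a region's feature vector under two GMM densities.
# ASSUMPTION: P0 and P1 are diagonal-covariance Gaussian mixtures and the
# comparison is a log-likelihood ratio; the source does not state either.
import math

def diag_gaussian_logpdf(x, mean, var):
    """Log density of a Gaussian with diagonal covariance at point x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def gmm_logpdf(x, weights, means, variances):
    """Log density of a mixture, computed with the log-sum-exp trick."""
    terms = [math.log(w) + diag_gaussian_logpdf(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def log_likelihood_ratio(x, p1, p0):
    """log P1(x) - log P0(x); positive values favour hypothesis H1."""
    return gmm_logpdf(x, *p1) - gmm_logpdf(x, *p0)

# Hypothetical toy models: (weights, means, variances) in two dimensions.
P1 = ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
P0 = ([1.0], [[5.0, 5.0]], [[1.0, 1.0]])

if __name__ == "__main__":
    for x in ([0.5, 0.5], [5.0, 5.0]):
        print(x, "favours H1" if log_likelihood_ratio(x, P1, P0) > 0 else "favours H0")
```

A real system would fit the mixture parameters from labeled training regions; the fixed toy values above only exercise the comparison.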
