Perrine Delacourt and Christian J. Wellekens. Audio data indexing: use of second order statistics for speaker-based segmentation. In International Conference on Multimedia Computing and Systems, Florence, Italy, volume 2, pages 959--963, 1999.
....the audio feature vector. The aim now is to locate speaker change instants, which are used later on to enhance scene boundary detection. To do that, feature vectors of K successive speech segments $S_{K_0}, \ldots, S_{K_0+K}$ are first grouped together to form sequences of feature vectors of the form [11]: $X = \{\underbrace{c_1, \ldots, c_{L_{S_{K_0}}}}_{S_{K_0}}, \underbrace{c_1, \ldots, c_{L_{S_{K_0+1}}}}_{S_{K_0+1}}, \ldots, \underbrace{c_1, \ldots, c_{L_{S_{K_0+K}}}}_{S_{K_0+K}}\}$ (13) Grouping is performed on the basis of the total duration of the grouped speech segments. This is expected to be equal to or greater than 2 sec, when ....
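The duration-based grouping described in the excerpt above can be illustrated with a minimal sketch. The excerpt does not say how segments are accumulated, so the greedy left-to-right strategy, the function name `group_segments`, and the handling of leftover short segments are all assumptions made here for illustration, not the citing authors' method. Only the 2-second threshold comes from the text.

```python
from typing import List, Sequence


def group_segments(durations: Sequence[float], min_total: float = 2.0) -> List[List[int]]:
    """Group consecutive speech-segment indices so each group lasts >= min_total seconds.

    Assumption: groups are built greedily from left to right.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    total = 0.0
    for idx, dur in enumerate(durations):
        current.append(idx)
        total += dur
        if total >= min_total:
            # The running group is long enough; close it and start a new one.
            groups.append(current)
            current, total = [], 0.0
    if current:
        # Assumption: trailing segments that never reach the threshold are
        # merged into the previous group, or kept alone if there is none.
        if groups:
            groups[-1].extend(current)
        else:
            groups.append(current)
    return groups


if __name__ == "__main__":
    # Hypothetical segment durations in seconds.
    print(group_segments([0.5, 0.7, 1.0, 2.5, 0.4, 0.3]))
```

Each returned group lists the indices of the segments whose feature vectors would be concatenated into one sequence X.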
....in many cases. Speaker change instants are evaluated with a detection accuracy of 62.8%. We have 30.23% false detections, while missed detections amount to 34.89%. Enhancement of this method may be achieved by simultaneously considering other similarity measures as well, as shown in [11]. Despite the suboptimal performance of speaker change instant detection, however, their use during audio-visual interaction for scene boundary detection leads to a satisfactory outcome, in combination with the other segmentation results. In order to evaluate the performance of the visual ....
P. Delacourt and C. Wellekens, "Audio data indexing: Use of second-order statistics for speaker-based segmentation", in Proc. of 1999 IEEE Int. Conf. on Multimedia Computing and Systems, Florence, Italy, 1999, vol. II, pp. 959--963.
....feature vector. The aim now is to locate speaker change instants, which are used later on to enhance scene boundary detection. To do that, feature vectors of K successive speech segments $S_{K_0}, \ldots, S_{K_0+K-1}$ are first grouped together to form sequences of feature vectors of the form [11]: $X = \{\underbrace{c_1, \ldots, c_{L_{S_{K_0}}}}_{S_{K_0}}, \underbrace{c_1, \ldots, c_{L_{S_{K_0+1}}}}_{S_{K_0+1}}, \ldots, \underbrace{c_1, \ldots, c_{L_{S_{K_0+K-1}}}}_{S_{K_0+K-1}}\}$ (13) Grouping is performed on the basis of the total duration of the grouped speech segments. This is expected to be greater than 2 sec, when assuming ....
....in many cases. Speaker change instants are evaluated with a detection accuracy of 62.8%; we have 30.23% false detections, while missed detections amount to 34.89%. Enhancement of this method may be achieved by simultaneously considering other similarity measures as well, as shown in [11]. Despite the suboptimal performance of speaker change instant detection, however, their use during audio-visual interaction for scene boundary detection leads to a satisfactory outcome in combination with the other segmentation results. In order to evaluate the performance of the visual segmentation ....
P. Delacourt and C. Wellekens, "Audio data indexing: Use of second-order statistics for speaker-based segmentation", in Proc. of 1999 IEEE Int. Conf. on Multimedia Computing and Systems, Florence, Italy, 1999, vol. II, pp. 959--963.
No context found.
P. Delacourt, C.J. Wellekens, Audio Data Indexing: Use of Second Order Statistics for Speaker Based Segmentation, Proc. ICMCS 99, Florence, June 1999.
....lab to test our approach more accurately. The speech signal is parameterized with 12 mel-cepstral coefficients. Adding the Delta coefficients (first derivatives) does not improve the results and increases the computation time. For this reason, the Delta coefficients are not used (see [15]). 3.2. Assessment methods. A good segmentation should provide the correct speaker changes and therefore segments containing a single speaker. We distinguish two types of errors related to speaker change detection. A false alarm (FA) occurs when a speaker change is detected although it does not ....
P. Delacourt and C. J. Wellekens, "Audio data indexing: use of second-order statistics for speaker-based segmentation," in ICMCS, 1999.
No context found.
Perrine Delacourt and Christian J. Wellekens. Audio data indexing: use of second order statistics for speaker-based segmentation. In International Conference on Multimedia Computing and Systems, Florence, Italy, volume 2, pages 959--963, 1999.