H. Glotin and F. Berthommier. Test of several external posterior weighting functions for multiband Full Combination ASR. In Proc. ICSLP '00, Beijing, China, October 2000.
....of the harmonicity mask from the autocorrelogram was
1. Sum over the frequency channels to compute a summary autocorrelogram.
2. Find the lag of the largest peak in the summary autocorrelogram.
3. Take a slice through the autocorrelogram at this lag.
4. Rescale the slice using a sigmoid function to obtain a frame of the soft ....
Footnote 2: Our use of the autocorrelogram has parallels with that of Glotin and Berthommier, who employ a similar measure of harmonicity to derive an estimate of subband SNR [9].
[Page header: Linking ASA and robust ASR by missing data techniques. Barker, Green and Cooke]
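The four steps quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the citing authors' implementation: the function name `harmonicity_mask_frame`, the sigmoid parameters `slope` and `midpoint`, the normalization of the slice by each channel's lag-0 value, and the use of a plain maximum over lags (skipping lag 0) in place of true peak picking are all hypothetical choices, since the excerpt does not specify them.

```python
import math

def harmonicity_mask_frame(acg, slope=10.0, midpoint=0.5, min_lag=1):
    """Return (best_lag, mask) for one frame of an autocorrelogram.

    acg: list of channels; each channel is a list of autocorrelation
         values indexed by lag, with lag 0 first.
    slope, midpoint, min_lag: hypothetical parameters, not from the source.
    """
    n_lags = len(acg[0])

    # Step 1: sum over the frequency channels (summary autocorrelogram).
    summary = [sum(channel[lag] for channel in acg) for lag in range(n_lags)]

    # Step 2: lag of the largest value, skipping lags below min_lag
    # (lag 0 is trivially the largest). Simplification of peak picking.
    best_lag = max(range(min_lag, n_lags), key=lambda lag: summary[lag])

    # Step 3: slice through the autocorrelogram at that lag.
    # Assumption: normalize each channel by its lag-0 value.
    slice_at_lag = [
        channel[best_lag] / channel[0] if channel[0] > 0 else 0.0
        for channel in acg
    ]

    # Step 4: rescale the slice with a sigmoid to get soft values in (0, 1).
    mask = [1.0 / (1.0 + math.exp(-slope * (v - midpoint))) for v in slice_at_lag]
    return best_lag, mask
```

Channels whose autocorrelation is strong at the common lag get mask values near 1, and channels that do not share that periodicity get values near 0.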
H. Glotin and F. Berthommier. Test of several external posterior weighting functions for multiband Full Combination ASR. In Proc. ICSLP '00, Beijing, China, October 2000.
....filtering in the pitch domain ([90, 350] Hz). For each cell, we calculate the ratio R = R1/R0, where R1 is the local maximum in the time delay segment corresponding to the fundamental frequency, and R0 is the cell energy. This measure is strongly correlated with SNR in the 5-20 dB range [15]. Figure 1 demonstrates explicitly the correlation of this measure in clean and noisy speech with the noisy ....

               Clean audio         Noisy audio
               WER    (relative)   WER    (relative)
Audio only     14.44               48.10
AV MS          14.62  (1.2)        36.61  (23.9)
AV MS UTTER    13.47  (6.7)        35.27  (26.7)
AV PROD        14.19  (1.7)        35.21  ....
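As a rough illustration of the ratio R = R1/R0 described in this excerpt, the sketch below computes it for a single cell (one subband, one time frame). It assumes the cell has already been preprocessed as in the citing paper (that step is elided in the excerpt and is not reproduced here), takes R0 as the autocorrelation at lag 0, and takes R1 as the largest autocorrelation value over the lags that correspond to 90-350 Hz. The function names and the plain maximum used for "local maximum" are assumptions.

```python
import math

def autocorr(x, lag):
    """Unnormalized autocorrelation of sequence x at a non-negative lag."""
    return sum(x[i] * x[i + lag] for i in range(len(x) - lag))

def voicing_ratio(frame, sample_rate, f_min=90.0, f_max=350.0):
    """Return R = R1 / R0 for one cell.

    R0: cell energy (autocorrelation at lag 0).
    R1: largest autocorrelation value over lags matching [f_min, f_max] Hz.
    """
    r0 = autocorr(frame, 0)
    if r0 <= 0.0:
        return 0.0  # silent cell: no evidence of voicing

    # Convert the pitch range to a lag range in samples.
    lag_lo = max(1, math.ceil(sample_rate / f_max))
    lag_hi = min(len(frame) - 1, math.floor(sample_rate / f_min))

    r1 = max(autocorr(frame, lag) for lag in range(lag_lo, lag_hi + 1))
    return r1 / r0
```

A strongly periodic cell with a period inside the pitch range gives R close to 1, while a cell with no periodicity in that range gives R near 0.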
....This may be due to the fact that our estimate of the audio stream reliability (voicing) is more accurate in clean speech. Further investigation is due to examine more appropriate variable weights over the noisy product model. A different mapping function for R can be explored for that purpose [15].

5. DISCRIMINATIVE COMBINATION OF AUDIO AND VISUAL MODELS

The Discriminative Model Combination (DMC) approach [4] aims at an optimal integration of independent sources of information in a log-linear model that computes the probability for a hypothesis. The parameters of this new model are the ....
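A generic log-linear combination of several information sources can be sketched as follows. The excerpt is cut off before it names the model parameters, so this sketch assumes they are one weight per stream, which is the usual form of a log-linear model. The function name `log_linear_combine` and the dictionary layout are hypothetical, and the sketch does not show how DMC trains the weights.

```python
import math

def log_linear_combine(stream_log_probs, weights):
    """Combine per-stream log-probabilities into one posterior.

    stream_log_probs: list of dicts mapping hypothesis -> log-probability,
                      one dict per stream (e.g. audio, visual).
    weights: one weight per stream (assumed parameters of the model).
    Returns a dict mapping hypothesis -> normalized probability.
    """
    hypotheses = list(stream_log_probs[0].keys())

    # Weighted sum of log-probabilities for each hypothesis.
    scores = {
        h: sum(w * lp[h] for w, lp in zip(weights, stream_log_probs))
        for h in hypotheses
    }

    # Normalize with the log-sum-exp trick for numerical stability.
    top = max(scores.values())
    log_z = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {h: math.exp(s - log_z) for h, s in scores.items()}
```

With all weights equal to 1 this reduces to a normalized product of the stream probabilities. Setting a stream's weight to 0 removes its influence entirely.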
H. Glotin and F. Berthommier, "Test of several external posterior weighting functions for multiband full combination ASR," in Proc. ICSLP, 2000, vol. 1, pp. 333--336.
....by band pass filtering in the pitch domain ([90, 350] Hz). For each cell, we calculate the ratio R = R1/R0, where R1 is the local maximum in the time delay segment corresponding to the fundamental frequency, and R0 is the cell energy. This measure is strongly correlated with SNR in the 5-20 dB range [15]. Figure 1 demonstrates explicitly the correlation ....

               Clean audio         Noisy audio
               WER    (relative)   WER    (relative)
Audio only     14.44               48.10
AV MS          14.62  1.2          36.61  23.9
AV MS UTTER    13.47  6.7          35.27  26.7
AV PROD        14.19  1.7          35.21  26.8
AV PROD UTTER                      35.43  26.3
AV PROD LOCAL                      37.15  22.8

Table 1. ....