E. Klabbers and R. Veldhuis. Reducing audible spectral discontinuities. IEEE Trans. Speech and Audio Processing, Vol. 9, No. 1, pp. 39--51, 2001. 106
....Much of the earlier work in the literature has concentrated on instance level costs that directly compare speech segments, or instantiations of the speech units. Numerical metrics such as Euclidean, Kullback-Leibler, and Mahalanobis distances calculated over spectral features have been considered [21, 22, 39, 61, 69, 160, 137, 70, 161]. The concatenation cost defined here bears similarity to disconcatibility as proposed by Iwahashi [63] and to splicing cost as proposed by Bulyko [22]. The use of mutual information to find boundaries across which information is blocked is related to rifts [6] as studied in statistical machine ....
....factor for practical applications (what is the input vocabulary size, does it read free-form text?). As described by Pols in [32], speech synthesis can be diagnosed on both modular and global levels. Recent evaluations of synthesizers have been performed both independently of an application [160, 111, 28, 70, 106] and as part of a spoken dialogue system [156, 157]. Chu et al. were able to correlate Mean Opinion Scores (MOS) taken from human subjects with objective concatenation cost functions [28, 106]. When the cost function is used as the metric in unit selection, the MOS of the resulting synthesis can be ....
E. Klabbers and R. Veldhuis, "Reducing audible spectral discontinuities," IEEE Trans. Acoustics, Speech and Signal Processing, vol. 9, no. 1, pp. 39--51, Jan. 2001.
....Much of the earlier work in the literature has concentrated on instance level costs that directly compare speech segments, or instantiations of the speech units. Numerical metrics such as Euclidean, Kullback-Leibler, and Mahalanobis distances calculated over spectral features have been considered [3, 4, 10 12, 16]. Because we define costs at the class level for scalability, we apply the metrics not to pairs of individual examples but to pairs of distributions of multiple examples. 2. REVIEW OF UNIT SELECTION Unit selection involves finding an appropriate sequence of units, u, from a speech corpus given ....
E. Klabbers and R. Veldhuis, "Reducing audible spectral discontinuities," IEEE Transactions on Speech and Audio Processing, 39--51, January 2001.
....e.g. 63, 45] because they tend to give high quality speech in conjunction with duration and fundamental frequency modifications needed for prosody control. Further, mel cepstral distances are shown in one study to be among the worst predictors of audible discontinuities in perceptual experiments [38], though other work shows that the performance difference relative to the best case (a Kullback-Leibler spectral distance) is not that large [64]. Another area where speech recognition technology is limited, again because of the poor representation of source characteristics, is in voice conversion ....
E. Klabbers and R. Veldhuis, "Reducing Audible Spectral Discontinuities," IEEE Trans. Speech and Audio Processing, 9(1):39-51, 2001.
....less likely to have perceived discontinuities at joins at the phone boundaries. Vowels, on the other hand, are found to have smoother concatenations in the middle of the phone. Furthermore, different vowels have different degrees of perceived discontinuity when spliced in the middle of the phone [16]. This evidence motivates an implementation of a more flexible unit selection framework, that would provide separate controls for quantifying the potential perceptual discontinuity at a given boundary, separately from the spectral mismatch between the candidate units. In this work, we introduce a ....
Klabbers, E., and Veldhuis, R., "Reducing audible spectral discontinuities", IEEE Transactions on Speech and Audio Processing, 9(1):39--51, 2001.
No context found.
E. Klabbers and R. Veldhuis. Reducing audible spectral discontinuities. IEEE Trans. Speech and Audio Processing, Vol. 9, No. 1, pp. 39--51, 2001. 106
No context found.
Esther Klabbers and Raymond Veldhuis, "Reducing audible spectral discontinuities," IEEE Transactions on Speech and Audio Processing, vol. 9, no. 1, January 2001.
No context found.
E. Klabbers and R. Veldhuis, "Reducing audible spectral discontinuities," IEEE Trans. Speech Audio Processing, vol. 9, pp. 39--51, Jan. 2001.
No context found.
E. Klabbers and R. Veldhuis, "Reducing Audible Spectral Discontinuities," IEEE Trans. Speech Audio Proc., Special Issue Speech Synth., N. Campbell, M. Macon, and J. Schroeter, Eds., Vol. 9, No. 1, pp. 39--51, January 2001.
No context found.
E. Klabbers and R. Veldhuis, "Reducing audible spectral discontinuities," IEEE Trans. on Speech and Audio Proc., vol. SAP-09, no. 01, pp. 39--51, Jan 2001.