9 citations found.
E. Klabbers and R. Veldhuis. Reducing audible spectral discontinuities. IEEE Trans. Speech and Audio Processing, Vol. 9, No. 1, pp. 39--51, 2001.

This paper is cited in the following contexts:
Corpus-Based Unit Selection for Natural-Sounding Speech Synthesis - Yi (2003)   (Correct)

....Much of the earlier work in the literature has concentrated on instance level costs that directly compare speech segments, or instantiations of the speech units. Numerical metrics such as Euclidean, Kullback-Leibler, and Mahalanobis distances calculated over spectral features have been considered [21, 22, 39, 61, 69, 160, 137, 70, 161]. The concatenation cost defined here bears similarity to disconcatibility as proposed by Iwahashi [63] and to splicing cost as proposed by Bulyko [22]. The use of mutual information to find boundaries across which information is blocked is related to rifts [6] as studied in statistical machine ....

....factor for practical applications (what is the input vocabulary size? does it read free-form text?). As described by Pols in [32], speech synthesis can be diagnosed on both modular and global levels. Recent evaluations of synthesizers have been performed both independently of an application [160, 111, 28, 70, 106] and as part of a spoken dialogue system [156, 157]. Chu et al. were able to correlate Mean Opinion Scores (MOS) taken from human subjects with objective concatenation cost functions [28, 106]. When the cost function is used as the metric in unit selection, the MOS of the resulting synthesis can be ....

E. Klabbers and R. Veldhuis, "Reducing audible spectral discontinuities," IEEE Trans. Acoustics, Speech and Signal Processing, vol. 9, no. 1, pp. 39--51, Jan. 2001.


Information-Theoretic Criteria for Unit Selection Synthesis - Yi, Glass (2002)   (6 citations)  (Correct)

....Much of the earlier work in the literature has concentrated on instance level costs that directly compare speech segments, or instantiations of the speech units. Numerical metrics such as Euclidean, Kullback-Leibler, and Mahalanobis distances calculated over spectral features have been considered [3, 4, 10 12, 16]. Because we define costs at the class level for scalability, we apply the metrics not to pairs of individual examples but to pairs of distributions of multiple examples. 2. REVIEW OF UNIT SELECTION Unit selection involves finding an appropriate sequence of units, u, from a speech corpus given ....

E. Klabbers and R. Veldhuis, "Reducing audible spectral discontinuities," IEEE Transactions on Speech and Audio Processing, 39--51, January 2001.


The Impact Of Speech Recognition On Speech Synthesis - Ostendorf, Bulyko (2002)   (1 citation)  (Correct)

....[e.g. 63, 45] because they tend to give high quality speech in conjunction with duration and fundamental frequency modifications needed for prosody control. Further, mel cepstral distances are shown in one study to be among the worst predictors of audible discontinuities in perceptual experiments [38], though other work shows that the performance difference relative to the best case (a Kullback-Leibler spectral distance) is not that large [64]. Another area where speech recognition technology is limited, again because of the poor representation of source characteristics, is in voice conversion ....

E. Klabbers and R. Veldhuis, "Reducing Audible Spectral Discontinuities," IEEE Trans. Speech and Audio Processing, 9(1):39-51, 2001.


Unit Selection for Speech Synthesis Using Splicing Costs.. - Bulyko, Ostendorf (2001)   (2 citations)  (Correct)

....less likely to have perceived discontinuities at joins at the phone boundaries. Vowels, on the other hand, are found to have smoother concatenations in the middle of the phone. Furthermore, different vowels have different degrees of perceived discontinuity when spliced in the middle of the phone [16]. This evidence motivates an implementation of a more flexible unit selection framework, that would provide separate controls for quantifying the potential perceptual discontinuity at a given boundary, separately from the spectral mismatch between the candidate units. In this work, we introduce a ....

Klabbers, E., and Veldhuis, R., "Reducing audible spectral discontinuities", IEEE Transactions on Speech and Audio Processing, 9(1):39--51, 2001.


High-Quality and Flexible Speech Synthesis with Segment Selection.. - Toda (2003)   (Correct)

No context found.

E. Klabbers and R. Veldhuis. Reducing audible spectral discontinuities. IEEE Trans. Speech and Audio Processing, Vol. 9, No. 1, pp. 39--51, 2001.


Mosievius: Feature Driven Interactive Audio Mosaicing - Lazier, Cook (2003)   (3 citations)  (Correct)

No context found.

Esther Klabbers and Raymond Veldhuis, "Reducing audible spectral discontinuities," IEEE Transaction on Speech and Audio Processing, vol. 9, no. 1, January 2001.


Audio Textures: Theory and Applications - Lu, Wenyin, Zhang (2004)   (Correct)

No context found.

E. Klabbers and R. Veldhuis, "Reducing audible spectral discontinuities," IEEE Trans. Speech Audio Processing, vol. 9, pp. 39--51, Jan. 2001.


A Novel Discontinuity Metric for Unit Selection Text-to-Speech.. - Bellegarda   (Correct)

No context found.

E. Klabbers and R. Veldhuis, "Reducing Audible Spectral Discontinuities," IEEE Trans. Speech Audio Proc., Special Issue Speech Synth., N. Campbell, M. Macon, and J. Schroeter, Eds., Vol. 9, No. 1, pp. 39--51, January 2001.


Data-Driven Perceptually Based Join Costs - Syrdal, Conkie   (Correct)

No context found.

E. Klabbers and R. Veldhuis, "Reducing audible spectral discontinuities," IEEE Trans. on Speech and Audio Proc., vol. SAP-09, no. 01, pp. 39--51, Jan 2001.

CiteSeer.IST - Copyright Penn State and NEC