J. Wouters, M. Macon, Control of spectral dynamics in concatenative speech synthesis, IEEE Transactions on Speech and Audio Processing 9 (1) (2001) 30-38.
....Much of the earlier work in the literature has concentrated on instance-level costs that directly compare speech segments, or instantiations of the speech units. Numerical metrics such as Euclidean, Kullback-Leibler, and Mahalanobis distances calculated over spectral features have been considered [21, 22, 39, 61, 69, 160, 137, 70, 161]. The concatenation cost defined here bears similarity to disconcatibility as proposed by Iwahashi [63] and to splicing cost as proposed by Bulyko [22]. The use of mutual information to find boundaries across which information is blocked is related to rifts [6] as studied in statistical machine ....
J. Wouters and M. Macon, "Control of spectral dynamics in concatenative speech synthesis," IEEE Trans. Acoustics, Speech and Signal Processing, vol. 9, no. 1, pp. 30--38, Jan. 2001.
....Much of the earlier work in the literature has concentrated on instance-level costs that directly compare speech segments, or instantiations of the speech units. Numerical metrics such as Euclidean, Kullback-Leibler, and Mahalanobis distances calculated over spectral features have been considered [3, 4, 10 12, 16]. Because we define costs at the class level for scalability, we apply the metrics not to pairs of individual examples but to pairs of distributions of multiple examples. 2. REVIEW OF UNIT SELECTION Unit selection involves finding an appropriate sequence of units, u, from a speech corpus given ....
J. Wouters and M. Macon, "Control of spectral dynamics in concatenative speech synthesis," IEEE Transactions on Speech and Audio Processing, 30--38, January 2001.
.... Fig. 2. Log magnitude spectrum of the phoneme template # before (top) and after modification (bottom). The dashed lines represent vocal tract feature parameters ## ## and # #, respectively. .... proved upon by using parts of the spectral modification method suggested by Wouters et al. [4]. 2.4. Estimating transition weights To guarantee smooth concatenation at the diphone boundaries, we impose conditions ## # ## # # and ## # ## # #. It then follows that # . Thus the spectrum of two joinable diphones will be identical at their left and right boundaries, respectively, even ....
J. Wouters and M. Macon, "Control of spectral dynamics in concatenative speech synthesis," IEEE Trans. Speech and Audio Proc., vol. 9, no. 1, pp. 30--38, January 2001.
.... concatenation point (the extent of the smoothing corresponds to one third of the units involved). Since a very strong analogy exists between F0 smoothing and spectral smoothing, other methods normally used for smoothing spectral coefficients could therefore be applied (for example the method of [Wou01], which elegantly controls the dynamics of formant movements). 5. PERCEPTUAL TESTS We carried out preliminary perceptual tests on short synthetic sentences whose intonation is produced by two methods: 1) simple concatenation of the selected units, ....
Wouters, J. and Macon, M.W. (2001), "Control of spectral dynamics in concatenative speech synthesis", IEEE Trans. on Speech and Audio Proc., Vol. 9(1), p 30-38.
....effects of coarticulation are strong and the boundary is a poor choice for making a splice. Our unit database contains half-phone segments. We used the Festival TTS system to cluster the units according to the decision tree clustering procedure described in [8]. Motivated by work described in [17], we used line spectral frequencies (LSF) for the parametric representation of units. Two BMM models were trained for each cluster: one with dependencies on the preceding frames (i.e. left-to-right feature dependency links) and another with the dependencies on the following frames ....
Wouters, J., and Macon, M., "Control of spectral dynamics in concatenative speech synthesis", IEEE Trans. Speech and Audio Processing, 9(1):30--38, 2001.
.... we use a linear distribution of the pitch discontinuity to the left and right units (the region of interpolation is set as one third of each unit). The problem is similar to that of spectral smoothing, therefore other smoothing algorithms may be successfully applied (such as the control of formant dynamics in [9]). 5. LISTENING TESTS Informal listening tests (preference tests) are performed with small phrases of synthetic speech produced with target pitch curves obtained by i) concatenating actual pitch curves of the selected units, ii) concatenating shifted pitch curves of the selected units. The ....
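The linear distribution of a pitch discontinuity described in the excerpt above can be sketched as follows. This is only an illustration under stated assumptions, not the cited authors' implementation: the function name `smooth_pitch_join`, the equal split of the jump between the two units, and the frame-wise list representation of the F0 contours are all hypothetical. Only the one-third interpolation region comes from the excerpt.

```python
def smooth_pitch_join(left, right, fraction=1 / 3):
    """Spread the F0 jump at a join linearly over the edges of both units.

    left, right: non-empty lists of F0 values (Hz), one per frame.
    fraction:    share of each unit used for interpolation (one third,
                 as in the excerpt).
    Assumption:  half of the jump is absorbed by each unit.
    """
    jump = right[0] - left[-1]
    n_left = max(1, round(len(left) * fraction))
    n_right = max(1, round(len(right) * fraction))

    new_left = list(left)
    new_right = list(right)

    # Left unit: the correction grows linearly and reaches +jump/2
    # at the last frame before the join.
    for i in range(n_left):
        ramp = (i + 1) / n_left
        new_left[len(left) - n_left + i] += ramp * jump / 2

    # Right unit: the correction is -jump/2 at the first frame after
    # the join and decays linearly towards zero.
    for i in range(n_right):
        ramp = 1 - i / n_right
        new_right[i] -= ramp * jump / 2

    return new_left, new_right


left_f0 = [100.0] * 6
right_f0 = [120.0] * 6
smoothed_left, smoothed_right = smooth_pitch_join(left_f0, right_f0)
```

With these inputs the two contours meet at the join, and frames outside the outer third of each unit are left unchanged.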
J. Wouters and M.W. Macon, "Control of spectral dynamics in concatenative speech synthesis", IEEE Trans. on Speech and Audio Proc., Vol. 9(1), Jan. 2001, p 30-38.
....between the first frame in unit U_j and the first frame in unit U_{i+1} which follows unit U_i, as illustrated in Fig. 4.4. This approach is more robust than computing a distance between two consecutive frames, because it does not imply continuity at join points. Motivated by work described in [107], we used a line spectral frequencies (LSF) representation of frames at unit boundaries. Squared differences between individual LSF vector components were weighted by the inverse of the distance between the adjacent spectral lines. We also included F0 and energy for computing the concatenation ....
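A weighted LSF distance of the kind described in the excerpt above might look like the sketch below. The excerpt does not say exactly how "the distance between the adjacent spectral lines" is measured, so this sketch makes two labeled assumptions: the weight of each component is the inverse of the gap to its nearest neighbouring line, and the weights are computed from the first frame only. The function names `lsf_weights` and `weighted_lsf_distance` are hypothetical.

```python
def lsf_weights(lsf):
    """Inverse-spacing weights for one LSF vector.

    lsf: strictly increasing line spectral frequencies (at least two).
    Assumption: weight_k = 1 / (gap from line k to its nearest neighbour),
    so closely spaced lines (typically near spectral peaks) count more.
    """
    n = len(lsf)
    if n < 2:
        raise ValueError("need at least two LSF components")
    weights = []
    for k in range(n):
        gaps = []
        if k > 0:
            gaps.append(lsf[k] - lsf[k - 1])
        if k < n - 1:
            gaps.append(lsf[k + 1] - lsf[k])
        nearest = min(gaps)
        if nearest <= 0:
            raise ValueError("LSF vector must be strictly increasing")
        weights.append(1.0 / nearest)
    return weights


def weighted_lsf_distance(lsf_a, lsf_b):
    """Weighted sum of squared component differences between two frames.

    Assumption: the weights are taken from lsf_a only.
    """
    weights = lsf_weights(lsf_a)
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, lsf_a, lsf_b))


frame_a = [0.5, 1.0, 2.0]
frame_b = [0.5, 1.5, 2.0]
join_cost = weighted_lsf_distance(frame_a, frame_b)
```

A full concatenation cost along the lines of the excerpt would add F0 and energy terms; those are left out here because the excerpt does not give their form.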
J. Wouters and M. Macon. Control of spectral dynamics in concatenative speech synthesis. IEEE Transactions on Speech and Audio Processing, 9(1):30-38,
....Within each cluster, units were assigned a target cost based on their distance to the cluster mean. Concatenation costs are typically computed as Euclidean or Mahalanobis distance between spectral features representing boundary frames of the corresponding units. Motivated by work described in [9], we used a line spectral frequencies (LSF) representation of frames at unit boundaries for computing the concatenation costs. Many areas of language and speech processing have adopted the weighted finite state transducer (WFST) formalism [10, 11] because it supports a complete representation of ....
Wouters, J., and Macon, M., "Control of spectral dynamics in concatenative speech synthesis", IEEE Transactions on Speech and Audio Processing, 9(1):30--38, 2001.
No context found.
J. Wouters, M. Macon, Control of spectral dynamics in concatenative speech synthesis, IEEE Transactions on Speech and Audio Processing 9 (1) (2001) 30-38.
No context found.
J. Wouters and M.W. Macon. Control of spectral dynamics in concatenative speech synthesis. IEEE Trans. Speech and Audio Processing, Vol. 9, No. 1, pp. 30--38, 2001.
No context found.
J. Wouters and M. Macon, "Control of Spectral Dynamics in Concatenative Speech Synthesis," IEEE Trans. Speech Audio Proc., Special Issue Speech Synth., N. Campbell, M. Macon, and J. Schroeter, Eds., Vol. 9, No. 1, pp. 30--38, January 2001.
No context found.
J. Wouters and M. Macon, "Control of spectral dynamics in concatenative speech synthesis," IEEE Trans. Speech and Audio Processing, 9(1):30-38, 2001.
No context found.
J. Wouters, M. Macon, Control of spectral dynamics in concatenative speech synthesis, IEEE Transactions on Speech and Audio Processing, 9(1), 30-38. (2001).