41 citations found.
T. Dutoit, An Introduction to Text-to-Speech Synthesis, Kluwer Academic Publishers, Dordrecht, 1997.

This paper is cited in the following contexts:
Corpus-Based Unit Selection for Natural-Sounding Speech Synthesis - Yi (2003)   (Correct)

....based on the query and formulate a reply via language generation. The symbolic response is then transformed back to the speech domain by the process of speech synthesis described above. 1.1 Background Although speech synthesis has had a long history [71, 87], progress is still being made [43, 152] and recent attention in the field has been primarily focused on concatenating real human speech selected from a large corpus or inventory. The method of concatenation [107] is not a new idea and has been applied to other parametric speech representations including, for example, linear prediction ....

T. Dutoit, Ed., An introduction to text-to-speech synthesis, Kluwer Academic Publishers, 1996.


Applying Talking Head Technology To A Web Based Weather Service - Dam, de Souza (2002)   (Correct)

....and silence are used for the purpose of contrast and emphasis. They serve somewhat the same purpose the white of this paper does for the print, but they do it in a more dramatic way [15]. Pitch variation can serve to emphasise certain words or syllables in order to highlight their importance [8]. Researchers and people involved in speech have described the speech correlates of emotion. This is a complex task given that there is no consensus on what the emotions are, but most seem to agree on the nature of basic emotions [16]. The same remarks can be made concerning how the speech ....

T. Dutoit. An Introduction to Text-to-Speech Synthesis. Kluwer Academic Publishers, Dordrecht, The Netherlands, 1997.


CORA: An Anthropomorphic Robot Assistant for Human.. - Iossifidis.. (2002)   (Correct)

....lower resolution of the fovea cameras with respect to the human fovea. 2.2 Acoustic and Phonetic Abilities Cora's acoustic system, consisting of microphones on the hardware side and the speech recognizer ears [7] [8] on the software side, forms together with the speech synthesizer called mbrola [3] [1] a dialog system. 2.3 Head Configuration For grasping, the relative configuration of body, arm and head is crucial: Humans typically use visually controlled arm and hand movements with the position of the head above and behind the grasping position. This and the fact that most visually ....

Thierry Dutoit. An Introduction to Text-to-Speech Synthesis. Dordrecht: Kluwer, 1997.


Speech Synthesis Using HMM-Based Acoustic Unit Inventory - Matousek   (Correct)

....voiced sounds) or by Gaussian noise source. Prosody matching is straightforward, since pitch, duration and intensity are explicit parameters of the model. Spectral discontinuities encountered in the concatenation points are simple to smooth by linear interpolating the difference between PARCORs [7]. The resulting synthetic signal is intelligible, more natural than in previous approach. However it is accompanied by a characteristic buzziness caused by pulse like nature of the excitation signal. An alternative solution to this problem is the usage of the LP residual excitation signal (so ....

Dutoit T. (1997): "An Introduction to Text-to-Speech Synthesis"; Kluwer Academic Publishers, Dordrecht.


Generating Multilingual Personalized Descriptions.. - Androutsopoulos.. (2001)   (5 citations)  (Correct)

....exhibits are selected by approaching them. Once an exhibit has been selected, the system retrieves from the database all the information that is relevant to the exhibit, and using natural language generation [McDonald 2000; McKeown 1995; Reiter Dale 1997, 2000] and speech synthesis techniques [Dutoit 1997] it produces an appropriate textual or spoken description. Figure 3 shows an English description generated by the current demonstrator, and Figure 4 shows the same description in Greek. The structure of the database and the generation process will be discussed further in Section 3. visitor ....

T. Dutoit. An Introduction to Text-To-Speech Synthesis. Kluwer Academic Press, 1997.


AIBO's first words. The social learning of language and meaning - Steels, Kaplan (2001)   (2 citations)  (Correct)

....might occur in the dialog. Although the recognition rate is high, it is not perfect and so provisions must be made in the language game script to overcome the problem of recognition error. The speech synthesis system is a state of the art text to speech synthesiser, similar to the one described in [Dutoit, 1997]. An associative memory stores relations between object views and words. The different views of an object form an implicit category [Kaplan, 1998] based on the fact that they are named the same way. Word learning takes place by reinforcement learning [Sutton and Barto, 1998] When the ....

Dutoit, T. (1997). An introduction to Text-To-Speech Synthesis. Kluwer Academic Publishers, Dordrecht.


Lifelike Gesture Synthesis and Timing for Conversational Agents - Wachsmuth, Kopp (2001)   (Correct)

....marked with prominence values which support the subsequent generation of prosodic parameters (phoneme length; intonation) to produce a linguistic representation of text input. From this representation, speech output is generated by the use of MBROLA and the German diphone database provided for it [3]. MBROLA is a real time concatenative speech synthesizer which is based on the multi band resynthesis, pitch-synchronous overlap add procedure (MBR PSOLA). To achieve a variety of alterations in intonation and speech timing, pitch can be varied by a factor within the range of 0.5 to 2.0, and phone ....

Dutoit, T. (1997) An Introduction to Text-to-Speech Synthesis. Dordrecht: Kluwer Academic Publishers.


General-Purpose Architectures for Media Processing.. - Parthasarathy..   (Correct)

....based on cepstral coefficients attributable to the shape of the vocal tract. We run the benchmark with the RASTA front end filtering technique that allows for removal of additive noise and spectral distortion. We use the ex5 c1.wav input file from the UCLA MediaBench suite. Speech synthesis [33] is the process of converting text to speech and consists of two parts: (1) the natural language processing, and (2) the digital signal processing. Our speech synthesis benchmark focuses on the former. Specifically, our benchmark is based on the alpha version of the FreeSpeech text to speech ....

Thierry Dutoit. An Introduction to Text-to-Speech Synthesis. Kluwer Academic Publishers, 1996.


A Multi-Strategy Approach to Improving Pronunciation by Analogy - Marchand, Damper   (5 citations)  (Correct)

....can actually outperform traditional rules. However, this possibility is not usually given much credence. For instance, Divay and Vitale (1997) recently wrote: To our knowledge, learning algorithms, although promising, have not (yet) reached the level of rule sets developed by humans (p. 520) Dutoit (1997) takes this further, stating such training based strategies are often assumed to exhibit much more intelligence than they do in practice, as revealed by their poor transcription scores (footnote 14, p. 115) Pronunciation by analogy (PbA) is a data driven technique for the automatic ....

Dutoit, Thierry (1997). Introduction to Text-to-Speech Synthesis. Dordrecht, The Netherlands: Kluwer.


The Need for Increased Speech Synthesis Research: Report .. - Sproat, Ostendorf..   (Correct)

....abbreviation can be expanded as doctor or drive (and in other ways too) Thus, it must be able to enumerate the possible expansions of this abbreviation and to disambiguate among them given the context. Text analysis methods for TTS have been reviewed in a number of places, including [Klatt, 1987, Dutoit, 1997, Sproat, 1998] Here, we will only briefly describe the various approaches that have been taken to the more prominent problems, namely word pronunciation and homograph disambiguation. The architecturally simplest approach to word pronunciation involves letter to sound rules. These are rules that ....

Dutoit, T. (1997). An Introduction to Text-to-Speech Synthesis. Kluwer, Dordrecht.


Domain Specific Text Processing for Speech Synthesis - Heyman (2001)   (Correct)

....speech has had flaws demanding a certain tolerance of the listener. Improved synthesis techniques have made the quality of the generated speech increasingly higher and it has lately become possible to make use of it in commercial applications, for example telecommunication services [Allen, 1992] [Dutoit, 1997]. Now, it is not necessarily the acoustic quality of what is being said that is likely to disturb the listener, but rather the failure of the Text to Speech (TTS) system to correctly read out words and expressions that constitute some linguistic problem. When Text to Speech synthesis is used to ....

....task can be divided into two broad processes, linguistic analysis and synthesis. The goal of the analysis part is to make a narrow phonetic transcription of the text, with additional information about prosody. The synthesizing part should use the transcription to produce natural sounding speech [Dutoit, 1997]. This introduction will describe the analysis; the synthesis will be briefly outlined at the end. The part of the system that performs the analysis comprises several components that work on different levels; there is some initial pre processing, morphological, contextual and syntactic analyses, ....

[Article contains additional citation context not shown here]

Dutoit, T., (1997), An Introduction to Text-to-Speech Synthesis, Kluwer Academic Publishers, Dordrecht.


On The Reduction Of Concatenation Artefacts In Diphone.. - Klabbers, Veldhuis (1998)   (8 citations)  (Correct)

....for the vowel u in the synthesized Dutch word zuk. It reveals a considerable jump in F2 of around 500 Hz at the diphone boundary (between 180 and 185 ms). This, together with other informal observations, suggests that the problem is of a spectral nature. Other causes are discussed in [2]. Several approaches have been proposed to solve this problem: • The number of audible discontinuities can be reduced by using larger units such as triphones. This does not solve the problem as discontinuities continue to occur albeit less frequently. Moreover, the inventory size increases ....

....of the u in zu remains at around 1500 Hz while the F2 of the u in uk decreases very slowly from approximately 1000 Hz to 800 Hz. • Spectral mismatch can be reduced by wave form interpolation, spectral envelope interpolation or formant trajectory smoothing, the latter of which is preferred [2]. It requires a signal representation that allows this type of operation. The disadvantage of formants as a representation is that they are very difficult to estimate reliably. Wave form and spectral envelope interpolation have the disadvantage that smooth transitions are often achieved at the ....

T. Dutoit. An introduction to text-to-speech synthesis. Kluwer Academic Press, Dordrecht, 1997.


Using Synchronous Speech To Minimize Variability - Cummins (2001)   (Correct)

....for concatenative synthesis. A large part of the art of concatenative synthesis consists of establishing a database of basic speech units from which elements are selected for concatenation. Where conventional approaches use diphones, demisyllables or phonemes as their basic concatenative units [1, 2], we take the whole word as our basic unit. Word based concatenative synthesis has not been actively pursued for many reasons: Contextual variation across words in different syntagmatic positions is, of course, very large indeed. Also, the number of words required for even basic TTS is very large ....

Thierry Dutoit. An Introduction to Text-to-Speech Synthesis, volume 3 of Text, Speech and Language Technology. Kluwer Academic, 1997.


ProZed: A Multilingual Prosody Editor for Speech Synthesis - Hirst (2000)   (Correct)

....and dialects as well as from different speech styles will increase exponentially over the next two or three decades. In this paper I present an overview of ProZed, an aid for developing prosody rules for speech synthesis using the MOMEL and INTSINT [19] algorithms and interfaced with the MBROLA [12], MBROLIGN [23] and Praat [4] programs. It allows the interactive editing of a symbolic representation of an utterance in any of the twenty languages and dialects for which an MBROLA diphone database is currently available. ProZed defines a number of different levels of representation of varying ....

....generally uses hidden Markov modelling, requires a large hand labelled training corpus. Recent experiments, however, [11] [23] have shown that a reasonably accurate alignment of phonemic labels can be obtained without prior training by using a diphone synthesis system (such as that described in [12]). Once the corpus to be labelled has been transcribed phonemically, a synthetic version is generated with a fixed duration for each phoneme and with a constant F0 value. A dynamic time warping algorithm is then used to transfer the phoneme labels from the synthetic speech to the original signal. ....

Dutoit, T. 1997. An introduction to Text-to-Speech synthesis. Kluwer Academic Press, Dordrecht.


Reducing Audible Spectral Discontinuities - Klabbers, Veldhuis (2001)   (9 citations)  (Correct)

....database of one female speaker, this is most prominent in vowels and semi vowels. It is due to variability in the pronunciation of these sounds which is caused by the phonetic prosodic context. Discontinuities are caused by mismatches in , phase or spectral envelopes across concatenation points [8]. In Calipso, IPO s diphone synthesis system [29] mismatches are avoided by monotonizing the diphones before storing them in the database. Phase mismatches are avoided by using a method called phase synthesis for re synthesis of the nonsense words [9] Phase synthesis is based on accurate ....

....representations that allow these types of operations. The disadvantage of formants as a representation is that they are very difficult to estimate reliably. Waveform and spectral envelope interpolation have the disadvantage that smooth transitions are often achieved at the expense of naturalness [8]. Examples of signal representations that allow waveform interpolation are multi band resyn ....

[Article contains additional citation context not shown here]

T. Dutoit, An Introduction to Text-To-Speech Synthesis. Norwell, MA: Kluwer, 1997.


Joint Evaluation Of Text-To-Speech Synthesis In French Within.. - d'Alessandro   (Correct)

....marks. Using the same type of (enriched) phonemic input and the same concatenation modification system it will be possible to assess prosodic quality independently of the other modules. It was decided to take advantage of a freely available high quality diphone based synthesis system for French [1, 6], that was already used by several partners. Then, the perceived differences in prosodic quality will not be influenced by the segmental level, which will be the same for all the tested systems. As the aim is to test prosody and not e.g. morphosyntactic analysis or GP conversion, an enriched ....

Dutoit, T. (1997). An introduction to text-to-speech synthesis. Kluwer Academic Publishing, Dordrecht, 1997.


Automatic Romanization for Thai - Charoenporn, Chotimongkol.. (1999)   (Correct)

....the characters in the syllable. However, the pronunciations of some syllables are still ambiguous, that is the homographic syllable. Linguistic rules are introduced for selecting the best syllable sequence, and converting it into the roman script. Following is an example of the romanization rules [1]. 1) Syllable segmentation candidates: ### ## Romanization rules: U# ## # ### #### U# ## #### B Results: Syllable segmentation: ## Romanization: worraphan 2) Syllable segmentation candidates: # #a# Romanization rules: U# ## #oe####B# 9 Results: Syllable segmentation: ....

Dutoit, T. An Introduction to Text-to-Speech Synthesis. Kluwer Academic Publisher, 1997, The Netherlands.


Speech Synthesis By Phonological Structure Matching - Taylor, Black (1999)   (7 citations)  (Correct)

....implicitly. A technique for using signal processing only when it is needed most is also described. The technique produces better quality speech than previous approaches and is also significantly faster. 1. INTRODUCTION It is common in any overview of a speech synthesis system (e.g. [13] [8]) to see the system broken down into a number of components, which nearly always include things such as text normalisation, lexical lookup, intonation, duration, diphone concatenation and signal processing. A standard model of waveform generation over the last years has been for the higher level ....

Thierry Dutoit. An Introduction to Text to Speech Synthesis. Kluwer Academic Publishers, 1997.


A System of Stylized Intonation Contours in German - Pirker, Alter, Rank.. (1998)   (Correct)

....durations. 8. IMPLEMENTATION At the moment the technical setup in which our study is performed consists of freely available software only. F0 analysis, labelling of speech data and TD PSOLA resynthesis are undertaken with the powerful SFS system from UCL. Also the MBROLA speech synthesizer [5] is used for producing speech samples. A number of perl scripts are used in order to perform prosody transplantation from natural speech to synthesized samples. A testing framework has been implemented that comprises the production of sets of test tokens with systematically varying prosodic ....

Dutoit T.: An Introduction to Text-to-Speech Synthesis, Kluwer Academic Publishers, Boston/Dordrecht/London, Text, Speech and Language Technology, Vol. 3, 1997.


The Symbolic Coding Of Segmental Duration And Tonal Alignment: An.. - Hirst (1999)   (Correct)

....a shorter than average i: segment. Using a linear transcription the same utterance would be coded: M lAI T kDI D[ s B (3) A linear INTSINT representation of example (2) To provide auditory assessment, the system has been interfaced with the MBROLA diphone speech synthesiser [6] [7] by a MacPerl script int2pho which takes as input a tiered INTSINT file ( int) like that of (2) above) and provides as output an input file for MBROLA ( pho) with appropriate durations and pitch values. With the MBROLA synthesiser, the appropriate set of diphones and a table of mean durations for ....

Dutoit, T. 1997. An Introduction to Text-to-Speech Synthesis, Dordrecht: Kluwer.


Syllable Reconstruction in Concatenated Waveform Speech.. - Tatham, Morton, Lewis (1999)   (Correct)

....carried out to determine whether we could take one of the word based limited domain versions of the system, and make it more general by excising syllables from existing polysyllabic words and recombining them into new words. Initially the study treats temporal rather than spectral considerations. 1. PRELIMINARIES Concatenated waveform synthesis [1] uses an inventory of stored waveforms. This paper reports experiments in enlarging MeteoSPRUCE a weather forecasting application of our general purpose high level tts engine SPRUCE [2] to widen its usability without the need for re recording ....

....of the word based limited domain versions of the system, and make it more general by excising syllables from existing polysyllabic words and recombining them into new words. Initially the study treats temporal rather than spectral considerations. 1. PRELIMINARIES Concatenated waveform synthesis [1] uses an inventory of stored waveforms. This paper reports experiments in enlarging MeteoSPRUCE, a weather forecasting application of our general purpose high level tts engine SPRUCE [2], to widen its usability without the need for re recording [3] [4] [5] [6] [7]. Before embarking on the task ....

[Article contains additional citation context not shown here]

Dutoit, T. 1997. An Introduction to Text-to-Speech Synthesis. Dordrecht: Kluwer Academic Publishers


A Multi-Strategy Approach to Improving Pronunciation by Analogy - Marchand, Damper (1999)   (5 citations)  (Correct)

....can actually outperform traditional rules. However, this possibility is not usually given much credence. For instance, Divay and Vitale (1997) recently wrote: To our knowledge, learning algorithms, although promising, have not (yet) reached the level of rule sets developed by humans (p. 520) Dutoit (1997) takes this further, stating such training based strategies are often assumed to exhibit much more intelligence than they do in practice, as revealed by their poor transcription scores (footnote 14, p. 115) Pronunciation by analogy (PbA) is a data driven technique for the automatic ....

Dutoit, T. (1997). Introduction to Text-to-Speech Synthesis. Dordrecht, The Netherlands: Kluwer.


Pronunciation Modeling In Speech Synthesis - Miller (1998)   (1 citation)  (Correct)

....This dissertation investigates the area of pronunciation modeling in speech synthesis. By pronunciation modeling, we mean architectures and principles for generating high quality human like pronunciations. The term pronunciation modeling has previously been applied in the context of speech recognition (e.g. Byrne et al. 1997). In that context, it describes theories and procedures for handling the pronunciation variation that naturally occurs across speakers. In contrast, our work is in the domain of text to speech synthesis, which, as we will show, requires modeling the pronunciation variation of an individual whose ....

....future, in text to speech systems, some segments and even syllables will disappear entirely and certain functors will be greatly attenuated. We will describe methods for achieving this kind of speech synthesis. The term pronunciation modeling has previously been applied in the context of speech recognition (e.g. Byrne et al. 1997). In that context, it describes theories and procedures for handling the pronunciation variation that naturally occurs across speakers. In contrast, our work is in the domain of text to speech synthesis, which, as we will show, requires modeling the pronunciation variation of an individual whose ....

Dutoit, Thierry. 1997. An introduction to text-to-speech synthesis. Dordrecht: Kluwer.


Survey of Data-Driven Approaches to Speech Synthesis - Ng (1998)   (Correct)

....a result, despite much progress in the field, completely natural speech synthesis is still an elusive goal. Research in speech synthesis has had a long history. Comprehensive reviews can be found in Klatt s 1987 JASA article [15] which covers work up to the late 1980s, and in Dutoit s 1997 book [6], which includes more recent work done in the last decade. Many of the earlier approaches to speech synthesis were based on knowledge engineered rules derived from linguistic theories and acoustic analyses. These systems were developed and improved by iteratively analyzing the characteristics of ....

....out the segment transitions and to match the specified prosodic characteristics. Desirable properties for the set of speech segments include accounting for as many coarticulatory effects as possible, having minimal discontinuities at the concatenation points, and being as few in number as possible [6, 27]. Longer segments are able to capture more coarticulation and have fewer concatenation points than shorter ones; however, the number of different segments grows exponentially with the length of the segment. To minimize concatenation discontinuities, it is advantageous to store multiple instances ....

[Article contains additional citation context not shown here]

T. Dutoit, An Introduction to Text-To-Speech Synthesis. Kluwer Academic Publishers, 1997.


On The Reduction Of Concatenation Artefacts In Diphone.. - Klabbers, Veldhuis (1998)   (8 citations)  (Correct)

....for the vowel u in the synthesized Dutch word zuk. It reveals a considerable jump in F2 of around 500 Hz at the diphone boundary (between 180 and 185 ms). This, together with other informal observations, suggests that the problem is of a spectral nature. Other causes are discussed in [2]. Several approaches have been proposed to solve this problem: • The number of audible discontinuities can be reduced by using larger units such as triphones. This does not solve the problem as discontinuities continue to occur albeit less frequently. Moreover, the inventory size increases ....

....the u in zu remains at around 1500 Hz while the F2 of the u in uk decreases very slowly from approximately 1000 Hz to 800 Hz. • Spectral mismatch can be reduced by wave form interpolation, spectral envelope interpolation or formant trajectory smoothing, the latter of which is preferred [2]. It requires a signal representation that allows this type of operation. The disadvantage of formants as a representation is that they are very difficult to estimate reliably. Wave form and spectral envelope interpolation have the disadvantage that smooth transitions are often achieved at the ....

T. Dutoit. An introduction to text-to-speech synthesis. Kluwer Academic Press, Dordrecht, 1997.


Heterogeneous Relation Graphs as a Mechanism for.. - Taylor, Black, Caley (2001)   (8 citations)  (Correct)

....level in the tree for each node. 5.2 Other Formalisms Some systems (e.g. Traber, 1995; Bailly and Tran, 1989) use feature structure mechanisms as their core data structure. Given the inherent context free nature of such formalisms, representing tree structures is not a problem. But as Dutoit (Dutoit, 1997) explains, these systems are nearly always used for high level semantic and syntactic analysis only, and in nearly all cases, some sort of multi level data structure mechanism is used for phonetic and phonological information. 5.3 The Strengths of HRG The main strength of the HRG formalism is ....

Dutoit, T. (1997). An Introduction to Text to Speech Synthesis. Kluwer Academic Publishers.


The Mbrola Project: Towards A Set Of High Quality.. - Dutoit, Pagel..   (32 citations)  Self-citation (Dutoit)   (Correct)

....2. MBROLA ALGORITHM The MBROLA 2.00 program uses a technique known as Multi Band Resynthesis OverLap Add which produces speech by diphone (triphone or polyphone will be available in future versions) concatenation (for an introduction to concatenative approaches to TTS synthesis, refer to [2]). Like the well known PSOLA methods (TD PSOLA for Time Domain Pitch Synchronous OverLap Add [3] or PIOLA [4] standing for Pitch Inflected Overlap Add, or MBR PSOLA [5] standing for Multi Band Resynthesis Pitch Synchronous OverLap Add) it adds overlapping frames directly in the time domain. ....

T. Dutoit. An Introduction to Text-to-Speech Synthesis. Kluwer Academic Publishers, Boston, 1996. Forthcoming textbook.


Diphone Concatenation using a Harmonic plus Noise Model.. - Stylianou, Dutoit.. (1997)   (5 citations)  Self-citation (Dutoit)   (Correct)

....to the desired prosody. Thanks to the pitch synchronous scheme of HNM, a simple and flexible technique can be used for that purpose. A mapping between the synthesis and the analysis instants is determined, specifying which analysis instant should be selected for any given synthesis instant [11] [2] (p. 255). The amplitudes of the new harmonics are then obtained by sampling the spectral envelope defined by the original harmonic amplitudes. If the original phase is used, the phase of the new harmonics are obtained by sampling the phase envelope at the modified pitch harmonics. Before that, the ....

T. Dutoit. An introduction to text-to-speech synthesis. Kluwer Academic Publishers, The Netherlands, 1997.


Modeling Improved Prosody Generation from High-Level.. - Xydas, al. (2005)   (Correct)

No context found.

T. Dutoit, An Introduction to Text-to-Speech Synthesis, Kluwer Academic Publishers, Dordrecht, 1997.


Tone-Group F 0 selection for modeling focus prominence - In Small-Footprint Speech   (Correct)

No context found.

Dutoit, T., 1997. An Introduction to Text-to-Speech Synthesis. Kluwer Academic Publishers, Dordrecht.


Text-To-Speech Technologies for Mobile Telephony Services - Paulseph-John Farrugia..   (Correct)

No context found.

Thierry Dutoit. An Introduction to Text-To-Speech Synthesis, volume 3 of Text, Speech and Language Technology. Kluwer Academic Publishers, P.O. Box 322, 3300 AH Dordrecht, The Netherlands, 1997.


An Architecture for Voice-Enabled Interfaces over Local.. - Bagein Pietquin Ris (2003)   (Correct)

No context found.

T. Dutoit, "An Introduction to Text-To-Speech Synthesis", Kluwer Academic Publishers, Dordrecht, 1997.


Enabling Speech Based Access to Information - Management Systems Over (2003)   (Correct)

No context found.

T. Dutoit, "An Introduction to Text-To-Speech Synthesis", Kluwer Academic Publishers, Dordrecht, 1997.


XML Representation Languages as a Way of Interconnecting TTS.. - Schröder, Breuer (2004)   (Correct)

No context found.

T. Dutoit, An Introduction to Text-to-Speech Synthesis. Dordrecht: Kluwer Academic, 1997.


Emotional Speech Synthesis for Emotionally-Rich Virtual Worlds - Schröder (2003)   (Correct)

No context found.

Thierry Dutoit. An Introduction to Text-to-Speech Synthesis. Kluwer Academic Publishers, Dordrecht, 1997.


"May I talk to you? :-)" - Facial Animation from Text - Albrecht, Haber, Kähler.. (2002)   (Correct)

No context found.

T. Dutoit. An Introduction to Text-to-Speech Synthesis. Kluwer Academic Publishers, Dordrecht, 1997.


First Implementation of VHML on the Java Text-to-Speech Synthesiser - De Souza   (Correct)

No context found.

T. Dutoit. An Introduction to Text-to-Speech Synthesis. Kluwer Academic Publishers, Dordrecht, The Netherlands, 1997.


A Novel Discontinuity Metric for Unit Selection Text-to-Speech.. - Bellegarda   (Correct)

No context found.

T. Dutoit, An Introduction to Text-to-Speech Synthesis, Norwell, MA: Kluwer, 1997.


Subjective Evaluation Of Join Cost Smoothing Methods - Vepa..   (Correct)

No context found.

T. Dutoit, An Introduction to Text-to-Speech Synthesis, Kluwer Academic Publishers, The Netherlands, 1997.


Prosody Modelling for Syllable-Based Speech Synthesis - Kopecek, Pala   (Correct)

No context found.

T. Dutoit, An Introduction to Text-to-Speech Synthesis, Kluwer Academic Publishers, 1997.

CiteSeer.IST - Copyright Penn State and NEC