“TTS From Zero - Building Synthetic Voices for New Languages,” Doctoral Thesis (2009)

by J Kominek

Results 1 - 4 of 4

Text-To-Speech for Languages without an Orthography

by Sukhada Palkar, Alan W Black, Alok Parlikar
Abstract - Cited by 1 (1 self)
Speech synthesis models are typically built from a corpus of speech that has accurate transcriptions. However, many of the languages of the world do not have a standardized writing system. This paper is an initial attempt at building synthetic voices for such languages. It may seem useless to develop a text-to-speech system when there is no text available. But we will discuss some well defined use cases where we need these models. We will present our method to build synthetic voices from only speech data. We will present experimental results and oracle studies that show that we can automatically devise an artificial writing system for these languages, and build synthetic voices that are understandable and usable.

systems in multiple languages from ‘found’ data

by O. Watts, A. Stan, R. Clark, Y. Mamiya, M. Giurgiu, J. Yamagishi, S. King , 2013
Abstract
and lightly-supervised learning for rapid construction of TTS

Citation Context

...port [3]. Prior work has also presented unsupervised methods for building systems based on letters rather than phonemes [4, 5], induction of phone-sets [6, 7], syllable-like units [8, 9], or lexicons [10]. However, this work has not been presented as an integrated framework for producing end-to-end TTS systems. Furthermore, despite the significant work on unsupervised learning in Natural Language Proc...

Language Technologies Institute,

by Gopala Krishna Anumanchipalli, Ying-chang Cheng, Joseph Fern, Xiaohan Huang, Qi Mao, Alan W Black
Abstract
This paper is an initial investigation into using knowledge-based parameters in the field of statistical parametric speech synthesis (SPSS). Utilizing the types of speech parameters used in the Klatt Formant Synthesizer we present automatic techniques for deriving such parameters from a speech database and building a statistical parametric speech synthesizer from these derived parameters. Although the work is exploratory, it shows promise in using more speech production inspired parameterizations for statistical speech synthesis. Index Terms: statistical speech synthesis, Klatt formant synthesizer.

GRAPHEME-TO-PHONEME MODEL GENERATION FOR INDO-EUROPEAN LANGUAGES

by Sebastian Ochs, Tanja Schultz
Abstract
In this paper, we evaluate grapheme-to-phoneme (g2p) models among languages and of different quality. We created g2p models for Indo-European languages with word-pronunciation pairs from the GlobalPhone project and from Wiktionary [1]. Then we checked their quality in terms of consistency and complexity as well as their impact on Czech, English, French, Spanish, Polish, and German ASR. While the GlobalPhone dictionaries were manually cross-checked and have been used successfully in LVCSR, Wiktionary pronunciations have been provided by the Internet community and can be used to rapidly and economically create pronunciation dictionaries for new languages and domains. Index Terms: web-derived pronunciations, multilingual speech recognition, pronunciation modeling

Citation Context

... data-driven approach with heuristical and statistical methods. We use Sequitur G2P, a data-driven g2p converter developed at RWTH Aachen University which works with joint-sequence models [12]. As in [13], we evaluate the quality and complexity of the g2p models over increasing amount of data. 3. PRONUNCIATION EXTRACTION FROM WIKTIONARY To accumulate training data for g2p models, we downloaded dumps o...
