Results 1 - 3 of 3
TTS from zero: Building synthetic voices for new languages, 2009
Abstract
Cited by 4 (0 self)
A developer wanting to create a speech synthesizer in a new voice for an under-resourced language faces hard problems. These include difficult decisions in defining a phoneme set and a laborious process of accumulating a pronunciation lexicon. Previously this has been handled through the involvement of a language technologies expert. By definition, experts are in short supply. The goal of this thesis is to lower the barriers facing a non-technical user in building “TTS from Zero.” Our approach focuses on simplifying the lexicon-building task by having the user listen to and select from a list of pronunciation alternatives. The candidate pronunciations are predicted by grapheme-to-phoneme (G2P) rules that are learned incrementally as the user works through the vocabulary. Studies demonstrate success for Iraqi, Hindi, German, and Bulgarian, among others. We compare various word selection strategies that the active learner uses to acquire maximally predictive rules. Incremental G2P learning enables iterative voice building. Beginning with 20 minutes of recordings, a bootstrapped synthesizer provides pronunciation examples for lexical review, which is fed into the next round of training with more recordings to create a larger, better voice... and so
Pmtools: A Pronunciation Modeling Toolkit
Abstract
Cited by 2 (0 self)
This paper reports on a pronunciation modeling toolkit, pmtools, that allows one to train a weighted finite-state transducer using a Classification and Regression Tree (CART) training paradigm. Tools are provided to automatically align a pronunciation dictionary consisting of a set of words and their pronunciations, train a set of CART trees on the aligned dictionary, and compile those trees out into a special class of weighted finite-state transducer. Most of the complexity (aligning the data, labeling the data with features, and training the trees) is hidden from the user. While
ISCA Archive Pmtools: A Pronunciation Modeling Toolkit
Abstract
This paper reports on a pronunciation modeling toolkit, pmtools, that allows one to train a weighted finite-state transducer using a Classification and Regression Tree (CART) training paradigm. Tools are provided to automatically align a pronunciation dictionary consisting of a set of words and their pronunciations, train a set of CART trees on the aligned dictionary, and compile those trees out into a special class of weighted finite-state transducer. Most of the complexity (aligning the data, labeling the data with features, and training the trees) is hidden from the user. While some new techniques, e.g. in automatic alignment, are introduced here, the main focus of this work is to provide a toolkit to ease the development of pronunciation models using fairly standard techniques. By the time of the workshop, pmtools will be available free for non-commercial use.
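The "labeling the data with features" step mentioned in the abstract can be pictured with a small sketch. This is not pmtools' own interface, which the abstract does not describe; the function name `label_with_context`, the padding symbol `#`, the `_` symbol for a letter that maps to no phone, and the one-letter-to-one-phone alignment are all assumptions made for illustration. Each aligned letter becomes one training row whose features are the surrounding letters, which is the kind of input a decision-tree learner could then consume.

```python
from typing import List, Tuple

PAD = "#"  # hypothetical word-boundary symbol, not taken from pmtools

Row = Tuple[Tuple[str, ...], str]


def label_with_context(letters: List[str], phones: List[str], width: int = 2) -> List[Row]:
    """Turn one aligned dictionary entry into (letter window, phone) rows.

    Assumes the alignment already pairs each letter with exactly one
    phone symbol, using a placeholder such as "_" for silent letters.
    """
    if len(letters) != len(phones):
        raise ValueError("alignment must pair each letter with one phone symbol")
    # Pad both ends so that every letter has a full context window.
    padded = [PAD] * width + letters + [PAD] * width
    rows: List[Row] = []
    for i, phone in enumerate(phones):
        # Window of `width` letters on each side of the current letter.
        window = tuple(padded[i : i + 2 * width + 1])
        rows.append((window, phone))
    return rows


# Hypothetical aligned entry for "phone"; the phone symbols are illustrative.
rows = label_with_context(list("phone"), ["f", "_", "ow", "n", "_"], width=1)
for window, phone in rows:
    print(window, "->", phone)
```

A wider window gives the tree more context to split on at the cost of sparser data; the width used by pmtools is not stated in the abstract.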