24 citations found.
W. Daelemans and A. Van den Bosch, "Language-independent data-oriented grapheme-to-phoneme conversion," In J. P. H. Van Santen, R. W. Sproat, J. P. Olive, and J. Hirschberg, (Eds.), pp. 77-89. Springer-Verlag, Berlin, 1996.


This paper is cited in the following contexts:
Sentence 782 of The New C Standard - Jones (2003)   (Correct)

....to phoneme mapping for a language involved a great deal of manual processing. For instance, Berndt, Reggia, and Mitchum [17] manually analysed 17,310 words to derive probabilities for grapheme to phoneme mapping of English. Automating this process has obvious advantages. Daelemans and van den Bosch [69] created a language independent conversion process that takes a set of examples (words and their corresponding phonetic representation) and automatically creates the grapheme to phoneme mapping. More recent research, for instance Pagel, Lenzo, and Black [209], has attempted to handle out of ....

Walter Daelemans and Antal van den Bosch. Language-independent data-oriented grapheme-to-phoneme conversion. In Jan P. H. van Santen, Richard W. Sproat, Joseph P. Olive, and Julia Hirschberg, editors, Progress in Speech Synthesis, pages 77-89. Springer, New York, 1997.


Grapheme-To-Phone Using Finite-State Transducers - Caseiro, Trancoso, Oliveira.. (2002)   (Correct)

....as a transducer #. The alignment between a grapheme sequence # and a phone sequence # was obtained as ###### ## # #. When creating # we opted to capitalize on the knowledge obtained from the rule system, although automatic techniques exist that can learn such a transducer automatically [15]. Besides the usual matching of 1 grapheme to 1 phone, we also allowed the direct matching of some sequences. The cost of matching a grapheme sequence with a phone sequence was set to zero if there is a rule that assigns the phone sequence to the grapheme sequence (completely ignoring the context ....

W. Daelemans and A. van den Bosch, "Language-Independent Data-Oriented Grapheme-to-Phoneme Conversion," in Progress in Speech Synthesis, J. van Santen, R. Sproat, J. Olive, and J. Hirschberg, Eds. Springer, New York, USA, 1997.


TreeTalk-D: a Machine Learning Approach to Dutch Word.. - Bertjan Busser   (Correct)

....is also possible. For example, x generally maps to ks. In this case we apply compound phonemes, by defining a new phoneme symbol that maps to two (or more) phonemes. For example, ks becomes X in our coding scheme. We aligned the CELEX II data using a two-step probabilistic procedure (Daelemans & Van den Bosch, 1997). In the first step possible co-occurrences between letters and phonemes are counted, taking into account all possible permutations of null phonemes. In the second step, for each word the most frequent co-occurrences are gathered and, on the basis of these most frequent co-occurrences, a most ....
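
The two-step procedure described in this excerpt can be sketched roughly as follows. This is a minimal illustration, not the cited authors' implementation: the null symbol "-", the function names, the toy lexicon, and the scoring rule (sum of pair counts) are all assumptions, and it presumes compound phonemes have already been applied so that no word has more phonemes than letters.

```python
from collections import Counter
from itertools import combinations

NULL = "-"  # hypothetical null-phoneme symbol, not taken from the excerpt


def padded_variants(phonemes, length):
    """Yield every way of padding `phonemes` with NULL up to `length` symbols."""
    gaps = length - len(phonemes)
    if gaps < 0:
        return  # assumption: compound phonemes already rule this case out
    for null_positions in combinations(range(length), gaps):
        remaining = iter(phonemes)
        yield [NULL if i in null_positions else next(remaining)
               for i in range(length)]


def count_cooccurrences(lexicon):
    """Step 1: count letter/phoneme pairs over all null placements."""
    counts = Counter()
    for letters, phonemes in lexicon:
        for variant in padded_variants(phonemes, len(letters)):
            counts.update(zip(letters, variant))
    return counts


def best_alignment(letters, phonemes, counts):
    """Step 2: keep the padding whose pairs were counted most often overall."""
    return max(padded_variants(phonemes, len(letters)),
               key=lambda variant: sum(counts[pair]
                                       for pair in zip(letters, variant)))


# Toy lexicon (invented for illustration only).
lexicon = [("book", ["b", "u", "k"]), ("took", ["t", "u", "k"]),
           ("bee", ["b", "i"]), ("kit", ["k", "I", "t"])]
counts = count_cooccurrences(lexicon)
alignment = best_alignment("book", ["b", "u", "k"], counts)
print(list(zip("book", alignment)))
```

On this toy data the null phoneme ends up on one of the two o's; which one is a tie that a larger lexicon or a finer scoring rule would have to break.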

Daelemans, W. & Van den Bosch, A. (1997) Language-independent Data-oriented Grapheme-to-Phoneme Conversion. Van Santen, J., R.


Issues in Building General Letter to Sound Rules - Black, Lenzo, Pagel (1998)   (17 citations)  (Correct)

.... initialize prob(L,P)
  1 foreach word in training set
      count with DTW all possible L P association for all possible epsilon positions in the phonetic transcription
  EM loop
  2 foreach word in training set
      compute new p(L,P) on alignment path
  3 if (prob != new p) goto 2
This differs from [6] in that the probabilities are distributed equally ("scattered") among each of the possible alternatives, rather than assigning an arbitrary weight to each shift. When we build models from the results of alignment using each of the above algorithms on the OALD we get the following results Method ....

W. Daelemans and A. van den Bosch. Language-independent data-oriented grapheme-to-phoneme conversion. In J. van Santen, R. Sproat, J. Olive, and J. Hirschberg, editors, Progress in speech synthesis, pages 77--90. Springer Verlag, 1996.


Predicting Phrase Breaks with Memory-Based Learning - Busser, Daelemans, van den.. (2001)   (3 citations)  Self-citation (Daelemans Van den bosch)   (Correct)

.... have used TiMBL, an MBL software package developed in our group [Daelemans et al. 2000]. In addition, we have obtained especially good results in the past with MBL applied to the tasks of word level phonemisation (grapheme to phoneme conversion) and stress assignment for different languages [Daelemans and Van den Bosch, 1996, Van den Bosch, 1997, Busser et al. 1999]. In sum, MBL is a natural candidate for use in predicting phonological properties at sentence level. The TiMBL software emulates the following variants of MBL: IB1: The distance between a test item and each memory item is defined as the number of ....
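
As a rough illustration of the IB1 idea mentioned in this excerpt, the sketch below classifies an item by majority vote among its nearest stored instances. The excerpt is cut off before it finishes defining the distance, so the plain overlap metric used here (the count of feature positions whose values differ) is an assumption, as are the function names and the toy memory. This is not the TiMBL API.

```python
from collections import Counter


def overlap_distance(a, b):
    """Assumed metric: number of feature positions where two instances differ."""
    return sum(x != y for x, y in zip(a, b))


def ib1_classify(memory, item):
    """Return the majority class among the stored instances nearest to `item`.

    `memory` is a list of (feature_tuple, class_label) pairs kept verbatim,
    which is what makes the learner memory-based (lazy).
    """
    distances = [(overlap_distance(features, item), label)
                 for features, label in memory]
    nearest = min(d for d, _ in distances)
    votes = Counter(label for d, label in distances if d == nearest)
    return votes.most_common(1)[0][0]


# Toy memory: (left letter, focus letter, right letter) -> phoneme label.
# The instances and labels are invented for illustration.
memory = [(("_", "b", "o"), "b"),
          (("b", "o", "o"), "u"),
          (("o", "o", "k"), "-"),
          (("o", "k", "_"), "k")]
prediction = ib1_classify(memory, ("l", "o", "o"))
print(prediction)
```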

Daelemans, W. and Van den Bosch, A. (1996). Language-independent data-oriented grapheme-to-phoneme conversion. In Van Santen, J. P. H., Sproat, R. W., Olive, J. P., and Hirschberg, J., editors, Progress in Speech Processing, pages 77--89. Springer-Verlag, Berlin.


Transcription Of Out-Of-Vocabulary Words In Large.. - Decadt.. (2002)   (4 citations)  Self-citation (Daelemans)   (Correct)

.... on all examples at classification time) is superior to an eager learning approach (extracting rules or other abstractions from the examples and using these to handle new cases, see [4] for some evidence). Furthermore, the results of research on a similar task (grapheme to phoneme conversion, see [5]) suggest that MBL may be very well suited for our task, P2G conversion. TIMBL is a software package for MBL implementing a wide range of algorithms, weighting metrics, and other parameters. It can take as input patterns (or instances) of feature values with a corresponding class symbol ....

W. Daelemans and A. Van den Bosch, "Language-independent data-oriented grapheme-to-phoneme conversion," in Progress in Speech Processing, J. P. H. Van Santen, R. W. Sproat, J. P. Olive, and J. Hirschberg, Eds., pp. 77--89. Springer-Verlag, Berlin, 1996.


Phoneme-To-Grapheme Conversion For.. - Decadt.. (2001)   Self-citation (Daelemans)   (Correct)

.... on all examples at classification time) is superior to an eager learning approach (extracting rules or other abstractions from the examples and using these to handle new cases, see [4] for some evidence). Furthermore, the results of research on a similar task (grapheme to phoneme conversion, see [5]) suggest that MBL may be very well suited for our task, P2G conversion. TIMBL is a software package for MBL implementing a wide range of algorithms, weighting metrics, and other parameters. It can take as input patterns (or instances) of feature values with a corresponding class symbol ....

W. Daelemans and A. Van den Bosch, "Language-independent data-oriented grapheme-to-phoneme conversion," in Progress in Speech Processing, J. P. H. Van Santen, R. W. Sproat, J. P. Olive, and J. Hirschberg, Eds., pp. 77--89. Springer-Verlag, Berlin, 1996.


TiMBL: Tilburg Memory-Based Learner - version 4.0.. - Daelemans, Zavrel.. (2001)   Self-citation (Daelemans Van den bosch)   (Correct)

.... and speech areas are hyphenation and syllabification (Daelemans and van den Bosch, 1992), classifying phonemes in speech (Kocsor et al. 2000), assignment of word stress (Daelemans, Gillis, and Durieux, 1994), grapheme to phoneme conversion (van den Bosch and Daelemans, 1993; Daelemans and van den Bosch, 1996), predicting linking morphemes in Dutch compounds (Krott, Baayen, and Schreuder, 2001), diminutive formation (Daelemans et al. 1998), and morphological analysis (van den Bosch, Daelemans, and Weijters, 1996; van den Bosch and Daelemans, 1999). Work on syntacto semantic tasks at the sentence ....

.... compounds (Krott, Baayen, and Schreuder, 2001), diminutive formation (Daelemans et al. 1998), and morphological analysis (van den Bosch, Daelemans, and Weijters, 1996; van den Bosch and Daelemans, 1999). Work on syntacto semantic tasks at the sentence level has focused on part of speech tagging (Daelemans et al. 1996; Zavrel and Daelemans, 1999; van Halteren, Zavrel, and Daelemans, 2001), PP attachment (Zavrel, Daelemans, and Veenstra, 1997), word sense disambiguation (Veenstra et al. 2000; Stevenson and Wilks, 1999; Kokkinakis, 2000), subcategorization (Buchholz, 1998), phrase chunking (Veenstra, 1998; ....

Daelemans, W. and A. van den Bosch. 1996. Language-independent data-oriented grapheme-to-phoneme conversion. In J. P. H. Van Santen, R. W. Sproat, J. P. Olive, and J. Hirschberg, editors, Progress in Speech Processing. Springer-Verlag, Berlin, pages 77--89.


Machine Learning for Modeling Dutch Pronunciation Variation. - Eronique Hoste Steven (1999)   Self-citation (Daelemans)   (Correct)

....proved to be an extremely powerful method for overcoming the linguistic knowledge acquisition bottleneck. Different approaches are available, such as decision tree learning (Dietterich, 1997), neural network or connectionist approaches (Sejnowski and Rosenberg, 1987), memory based learning (Daelemans and van den Bosch, 1996), etc. Data driven approaches can yield results that are comparable to and often even better than rule based approaches, as described in Daelemans and van den Bosch (1996), in which a comparison is made between Morpa cum Morphon (Nunn and van Heuven, 1993), an example of a linguistic knowledge based ....

....learning (Dietterich, 1997), neural network or connectionist approaches (Sejnowski and Rosenberg, 1987), memory based learning (Daelemans and van den Bosch, 1996), etc. Data driven approaches can yield results that are comparable to and often even better than rule based approaches, as described in Daelemans and van den Bosch (1996), in which a comparison is made between Morpa cum Morphon (Nunn and van Heuven, 1993), an example of a linguistic knowledge based approach to grapheme to phoneme conversion, and IG Tree, an example of a memory based approach (Daelemans et al. 1997). In this study, we will look for the patterns and ....

[Article contains additional citation context not shown here]

W. Daelemans and A. van den Bosch. 1996. Language-independent data-oriented grapheme-to-phoneme conversion. In J. Van Santen, R. Sproat, J. Olive, and J. Hirschberg, editors, Progress in Speech Synthesis, pages 77-90. New York: Springer Verlag.


A Rule Induction Approach to Modeling Regional.. - Hoste, Gillis, Daelemans (2000)   Self-citation (Daelemans)   (Correct)

....proved to be an extremely powerful method for overcoming the linguistic knowledge acquisition bottleneck. Different approaches are available, such as decision tree learning (Dietterich, 1997), neural network or connectionist approaches (Sejnowski and Rosenberg, 1987), memory based learning (Daelemans and van den Bosch, 1996), etc. Data driven approaches can yield comparable (and often even better) results than the rule based approach, as described in the work of Daelemans and van den Bosch (1996), in which a comparison is made between Morpa cum Morphon (Heemskerk and van Heuven, 1993), an example of a linguistic ....

....(Dietterich, 1997), neural network or connectionist approaches (Sejnowski and Rosenberg, 1987), memory based learning (Daelemans and van den Bosch, 1996), etc. Data driven approaches can yield comparable (and often even better) results than the rule based approach, as described in the work of Daelemans and van den Bosch (1996), in which a comparison is made between Morpa cum Morphon (Heemskerk and van Heuven, 1993), an example of a linguistic knowledge based approach to grapheme to phoneme conversion, and IG Tree, an example of a memory based approach (Daelemans et al. 1997). In this study, we will look for the patterns ....

[Article contains additional citation context not shown here]

W. Daelemans and A. van den Bosch. 1996. Language-independent data-oriented grapheme-to-phoneme conversion. In J. Van Santen, R. Sproat, J. Olive, and J. Hirschberg, editors, Progress in Speech Synthesis, pages 77-90. New York: Springer Verlag.


Inductive Lexica - Daelemans, Durieux (2000)   Self-citation (Daelemans)   (Correct)

....of linguistic description and knowledge (phonotactics, phonology, morphology, syntax). MITalk (Allen et al. 1987) is a classical example of a rule based solution to the problem. It is, however, possible to achieve excellent grapheme to phoneme conversion accuracy using machine learning techniques (Daelemans and van den Bosch, 1996). To make this problem suitable for machine learning algorithms, the following steps have to be taken: Automatic alignment. In order to make full use of the generalization possibilities implicit in splitting up the task into subtasks, the task is recast as the transcription of each letter ....

....made based on similar cases in memory. The decisions for each letter are then combined to produce the final pronunciation representation. The learning method which was used is a combination of decision tree induction and memory based learning; for details see Daelemans and van den Bosch (1993), Daelemans and van den Bosch (1996), and van den Bosch and Daelemans (1993). The method is applicable in the context of our inductive lexica approach because (i) it is corpus based (it takes as training material the pairs of spellings and associated pronunciations already present in the lexicon), (ii) it is language independent and ....

Daelemans, W. and A. van den Bosch: 1996, `Language-Independent Data-Oriented Grapheme-to-Phoneme Conversion'. In: J. P. H. van Santen, R. W. Sproat, J. P. Olive, and J. Hirschberg (eds.): Progress in Speech Synthesis. New York, NY: Springer Verlag, pp. 77--90.


TiMBL: Tilburg Memory-Based Learner - version 3.0.. - Daelemans, Zavrel.. (2000)   Self-citation (Daelemans Van den bosch)   (Correct)

.... TiMBL package is tribl [15]. The memory based algorithms implemented in the TiMBL package have been successfully applied to a large range of Natural Language Processing tasks in our group: hyphenation and syllabification ([17]), assignment of word stress ([11]), grapheme to phoneme conversion ([13]), diminutive formation ([16]), morphological analysis ([29, 28]), part of speech tagging ([18, 35]), PP attachment ([36]), word sense disambiguation ([30]), subcategorization ([4]), chunking (partial parsing) ([31]), and shallow parsing ([10, 5]). Relations to statistical language processing are ....

W. Daelemans and A. Van den Bosch. Language-independent data-oriented grapheme-to-phoneme conversion. In J. P. H. Van Santen, R. W. Sproat, J. P. Olive, and J. Hirschberg, editors, Progress in Speech Processing, pages 77--89. Springer-Verlag, Berlin, 1996.


Forgetting Exceptions is Harmful in Language Learning - Daelemans, van den Bosch.. (1999)   (24 citations)  Self-citation (Daelemans Van den bosch)   (Correct)

....1988). For example, in the sentence "they can can a can", the word "can" is tagged as modal verb, main verb and noun respectively. We assume a tagger architecture that processes a sentence from the left to the right by classifying instances representing words in their contexts (as described in Daelemans et al. 1996). The word's already tagged left context is represented by the disambiguated categories of the two words to the left, the word itself and its ambiguous right context are represented by categories which denote ambiguity classes (e.g. verb or noun). The data set for the part of speech tagging task, ....

....tasks Language processing tasks are usually described as complex mappings between representations: from spelling to sound, from strings of words to parse trees, from parse trees to semantic formulas, etc. These mappings can be approximated by (cascades of) classification tasks (Ratnaparkhi, 1997; Daelemans, 1996; Cardie, 1996; Magerman, 1994) which makes them amenable to machine learning approaches. One of the most salient characteristics of natural language processing mappings is that they are noisy and complex. Apart from some regularities, they contain also many sub regularities and (pockets of) ....

[Article contains additional citation context not shown here]

Daelemans, W. and A. Van den Bosch. 1996. Language-independent data-oriented grapheme-to-phoneme conversion. In J. P. H. Van Santen, R. W. Sproat, J. P. Olive, and J. Hirschberg, editors, Progress in Speech Processing. Springer-Verlag, Berlin, pages 77--89.


Rapid Development of NLP Modules with Memory-Based.. - Daelemans, van den.. (1998)   (7 citations)  Self-citation (Daelemans Van den bosch)   (Correct)

....classification subtasks. In this section we illustrate some examples of recent work on MBLE on three light NLP tasks: i) text to speech conversion in TREETALK, ii) part of speech tagging in MBT, and (iii) phrase chunking in MBC. 3.1. TREETALK: Text to speech conversion The TREETALK system [26, 25, 11, 24] has originally been designed for isolated word pronunciation, i.e. converting a written word to its phonemic representation as found in a pronunciation dictionary, and efforts are underway to extend it to modeling speech phenomena in texts, such as sentence accents and prosody. In this ....

W. Daelemans and A. Van den Bosch. Language-independent data-oriented grapheme-to-phoneme conversion. In J. P. H. Van Santen, R. W. Sproat, J. P. Olive, and J. Hirschberg, editors, Progress in Speech Processing, pages 77--89. Springer-Verlag, Berlin, 1996.


TiMBL: Tilburg Memory Based Learner - version 2.0 -.. - Daelemans, Zavrel.. (1999)   Self-citation (Daelemans Van den bosch)   (Correct)

.... in the TiMBL package is tribl [13]. The memory based algorithms implemented in the TiMBL package have been successfully applied to a large range of Natural Language Processing tasks: hyphenation and syllabification ([15]), assignment of word stress ([9]), grapheme to phoneme conversion ([11]), diminutive formation ([14]), morphological analysis ([26]), part of speech tagging ([16]), PP attachment ([31]), word sense disambiguation ([27]), subcategorization ([4]), and chunking (partial parsing) ([28]). Relations to statistical language processing are discussed in [30]. A partial ....

W. Daelemans and A. Van den Bosch. Language-independent data-oriented grapheme-to-phoneme conversion. In J. P. H. Van Santen, R. W. Sproat, J. P. Olive, and J. Hirschberg, editors, Progress in Speech Processing, pages 77--89. Springer-Verlag, Berlin, 1996.


Automatic Phonetic Transcription of Words Based On Sparse Data - Wolters, van den Bosch (1997)   Self-citation (Van den bosch)   (Correct)

.... is needed in high quality text to speech synthesis (Yvon, 1996). However, Bakiri and Dietterich (1993) have shown that their approach based on ID3 (Quinlan, 1986) decision trees outperforms the sophisticated DECTalk rule set for English (Allen et al. 1987). (Van den Bosch and Daelemans, 1993; Daelemans and Van den Bosch, 1997) report similar results for Dutch. In both cases, the training corpora contained around 18000, and the test corpora around 2000 words. With the exception of (Dietterich and Bakiri, 1995), most researchers have relied on large machine readable pronunciation dictionaries for training and test data. ....

....vectors rather than on more complex expressions such as those in first order logic (Kolodner, 1993; Lavrac and Dzeroski, 1994) 2 feedforward: the output of the units in layer i is only fed to units in layer j i. We examine two ibl algorithms, viz. ib1 and ib1 ig. ib1 (Aha et al. 1991; Daelemans et al. 1997) constructs a database of instances during learning. An instance consists of a fixed length vector of n feature value pairs, and an information field containing its class(es) When the classification of a feature value vector is ambiguous, the frequencies of the relevant classes in the training ....

[Article contains additional citation context not shown here]

Daelemans, W. and Van den Bosch, A. (1997). Language-independent data-oriented grapheme-to-phoneme conversion. In Van Santen, J. P. H., Sproat, R. W., Olive, J. P., and Hirschberg, J., editors, Progress in Speech Processing, pages 77--89.


TiMBL: Tilburg Memory-Based Learner - version 1.0 - .. - Daelemans, Zavrel, .. (1998)   Self-citation (Daelemans Van den bosch)   (Correct)

....between the ib1 ig and igtree algorithms. The memory based algorithms implemented in the TiMBL package have been successfully applied to a large range of Natural Language Processing tasks: hyphenation and syllabification ([8]), assignment of word stress ([9]), grapheme to phoneme conversion ([11]), diminutive formation ([15]), morphological analysis ([25]), part of speech tagging ([12]), PP attachment ([28]). Not yet published experimental results exist for word sense disambiguation, subcategorisation, and chunking (partial parsing). Relations to statistical language processing are ....

W. Daelemans and A. Van den Bosch. Language-independent data-oriented grapheme-to-phoneme conversion. In J. P. H. Van Santen, R. W. Sproat, J. P. Olive, and J. Hirschberg, editors, Progress in Speech Processing, pages 77--89. Springer-Verlag, Berlin, 1996.


Modularity in Inductively-Learned Word Pronunciation.. - van den Bosch.. (1998)   (2 citations)  Self-citation (Daelemans Van den bosch)   (Correct)

....of new, unseen instances of the same (sub)task, i.e. we measure their generalisation accuracy. (Weiss and Kulikowski, 1991) describe n fold cross validation (n fold cv) as a procedure for mea 2 Graphemic parsing is not represented in the celex data. We used an automatic alignment algorithm (Daelemans and Van den Bosch, 1997) to determine which letters are the first or only letters of a grapheme. letter window instances phoneme window instances instance left right classifications left right classif. number context focus context m a g s gs context focus context y s 1 b o o k 1 1 b 1 b 1 b u k 1 1 2 b o ....
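
The letter window instances mentioned in this excerpt can be pictured with a small sketch: each letter becomes one instance made of its left context, the focus letter, its right context, and a class label. The function name, the window width, the padding character, and the example labels (1 for a letter that is the first or only letter of a grapheme, 0 otherwise) are assumptions for illustration; the excerpt's own table is too damaged to recover its exact layout.

```python
def letter_windows(word, labels, width=3, pad="_"):
    """Return one (left context, focus, right context, label) tuple per letter.

    `pad` must be a single character; it fills the context beyond word edges.
    """
    if len(word) != len(labels):
        raise ValueError("need exactly one label per letter")
    padded = pad * width + word + pad * width
    instances = []
    for i, (focus, label) in enumerate(zip(word, labels)):
        left = padded[i:i + width]                       # letters before focus
        right = padded[i + width + 1:i + 2 * width + 1]  # letters after focus
        instances.append((left, focus, right, label))
    return instances


# Hypothetical graphemic-parsing labels for "book": the second "o"
# continues the grapheme "oo", so it is labelled 0.
instances = letter_windows("book", [1, 1, 0, 1], width=1)
for instance in instances:
    print(instance)
```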

Daelemans, W. and A. Van den Bosch. 1997. Language-independent data-oriented grapheme-to-phoneme conversion.


Towards Automatic Word Segmentation of Dialect Speech - Eric Sanders Andrea   (Correct)

No context found.

W. Daelemans and A. Van den Bosch, "Language-independent data-oriented grapheme-to-phoneme conversion," In J. P. H. Van Santen, R. W. Sproat, J. P. Olive, and J. Hirschberg, (Eds.), pp. 77-89. Springer-Verlag, Berlin, 1996.


Text Preprocessing for Speech Synthesis - Uwe Reichel Hartmut   (Correct)

No context found.

W. Daelemans and A. van den Bosch. 1997. Language-Independent Data-Oriented Grapheme-to-Phoneme Conversion. In J.P.H. van Santen, R.W. Sproat, J.P. Olive, and J. Hirschberg, editors, Progress in Speech Synthesis, pages 77--89. Springer, New York.


Using Morphology and Phoneme History - To Improve Grapheme-To-Phoneme   (Correct)

No context found.

W. Daelemans and A. van den Bosch, "Language-Independent Data-Oriented Grapheme-to-Phoneme Conversion," in Progress in Speech Synthesis, J. P. H. e. a. van Santen, Ed. New York: Springer, 1997.


Pmtools: A Pronunciation Modeling Toolkit - Richard Sproat Att   (Correct)

No context found.

Daelemans, W., and van den Bosch, A. Language-independent data-oriented grapheme-to-phoneme conversion. In Progress in Speech Synthesis, J. van Santen, R. Sproat, J. Olive, and J. Hirschberg, Eds. Springer, New York, NY, 1997, pp. 77--89.


Hybrid Grapheme to Phoneme Conversion for Unlimited Vocabulary - Kim, Lee, Lee (1998)   (Correct)

No context found.

Walter M. P. Daelemans and Antal P. J. van den Bosch. Language-independent data-oriented grapheme-to-phoneme conversion. In Jan P.H. van Santen, Richard W. Sproat, Joseph P. Olive, and Julia Hirschberg, editors, Progress in Speech Synthesis. Springer-Verlag, 1997.


Pronunciation Modeling In Speech Synthesis - Miller (1998)   (1 citation)  (Correct)

No context found.

Daelemans, Walter M. P., and Antal P. J. Van den Bosch. 1997. Language independent data-oriented grapheme-to-phoneme conversion. In Progress in speech synthesis, ed. Jan P. H. van Santen, Richard W. Sproat, Joseph P. Olive, and Julia Hirschberg, 77-89. New York: Springer.


CiteSeer.IST - Copyright Penn State and NEC