Results 1 - 10 of 28,794

Table 2: Correct and incorrect morphemes generated by the simple frequency analysis

in Stochastic approaches to morphology acquisition
by Justin M. Aronoff, Nuria Giralt, Toben H. Mintz
"... In PAGE 5: ... Additionally, it yielded seven correct morpheme types and one incorrect morpheme type. These are shown in Table 2, transcribed here and in all further examples back into orthographic form when possible for ease of reading. These results indicate that the proposed morpheme extraction procedure can result in high type accuracy and moderate token accuracy, although it does not result in a large set of morphemes.... ..."

Table 1: Comparison of terms found by statistical analysis and domain-specific morphemes

in Terminology extraction and automatic indexing - comparison and qualitative evaluation of methods
by Hans Friedrich Witschel 2005
"... In PAGE 9: ... Table 1: Comparison of terms found by statistical analysis and domain-specific morphemes. Table 1 shows terms that have been extracted from four documents that cover different topics. Terms in the middle column were found by using statistical methods, the ones in the rightmost column originate in morphological analysis.... ..."
Cited by 1

Table 3: Statistics and example morpheme analyses in Finnish. #anal is the average number of analyses per word (separated by a comma), #morph the average number of morphemes per analysis (separated by a space), and lexicon the total number of morpheme types.

in Unsupervised morpheme analysis evaluation by a comparison to a linguistic Gold Standard – Morpho Challenge 2007
by Mikko Kurimo, Mathias Creutz, Matti Varjokallio 2007
Cited by 2

Table 5: Statistics and example morpheme analyses in German. #anal is the average number of analyses per word (separated by a comma), #morph the average number of morphemes per analysis (separated by a space), and lexicon the total number of morpheme types.

in Unsupervised morpheme analysis evaluation by a comparison to a linguistic Gold Standard – Morpho Challenge 2007
by Mikko Kurimo, Mathias Creutz, Matti Varjokallio 2007
Cited by 2

Table 6: Statistics and example morpheme analyses in English. #anal is the average number of analyses per word (separated by a comma), #morph the average number of morphemes per analysis (separated by a space), and lexicon the total number of morpheme types.

in Unsupervised morpheme analysis evaluation by a comparison to a linguistic Gold Standard – Morpho Challenge 2007
by Mikko Kurimo, Mathias Creutz, Matti Varjokallio 2007
Cited by 2

Table 1 shows the main features of the three textual samples relating to size, number of words and pseudo-morphemes, and vocabulary size, both in words and pseudo-morphemes, for each database [6]. Figure 1 shows some of the interesting conclusions derived from this analysis. The first important outcome of our analysis is that the vocabulary size of pseudo-morphemes is reduced by about 60% (Fig. 1, a) in all cases relative to the vocabulary size of words. Regarding the unit size, Fig. 1 (b) shows the plot of Relative Frequency of Occurrence (RFO) of the pseudo-morphemes and words versus their length in characters over the textual sample named STDBASQUE. Although only 10% of the pseudo-morphemes in the vocabulary have fewer than four characters, such small morphemes have an Accumulated Frequency of about 40% in the databases (the Accumulated Frequency is calculated as the sum of the individual pseudo-morphemes' RFO) [7].

in Language Resources for a Bilingual Automatic Index System of Broadcast News in Basque and Spanish
by Bordel G, Ezeiza A, Lopez De Ipina K, López J. M, Peñagarikano M, Zulueta E, Sistemen Ingeniaritza Eta Automatika Saila, Josemanuel. Lopez
"... In PAGE 2: ... This approach has been evaluated over three textual samples analysing both the coverage and the Out of Vocabulary rate, when we use words and pseudo-morphemes obtained by the automatic morphological segmentation tool AHOZATI [6].

Table 1. Main characteristics of the textual databases for morphologic analysis

                                       STDBASQUE   NEWSPAPER   BCNEWS
Text amount                            1,6M        1,3M        2,5M
Number of words                        197,589     166,972     210,221
Number of pseudo-morphemes             346,232     304,767     372,126
Number of sentences                    15,384      13,572      19,230
Vocabulary size in words               50,121      38,696      58,085
Vocabulary size in pseudo-morphemes    20,117      15,302      23,983
... ..."
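
The "about 60%" vocabulary reduction quoted above can be checked against the Table 1 figures. The following is a minimal sketch, not the authors' code: the function names `vocabulary_reduction` and `accumulated_frequency` are hypothetical, and the only data used are the vocabulary sizes copied from the snippet.

```python
# Vocabulary sizes copied from Table 1 of the quoted snippet:
# (vocabulary size in words, vocabulary size in pseudo-morphemes).
VOCAB = {
    "STDBASQUE": (50_121, 20_117),
    "NEWSPAPER": (38_696, 15_302),
    "BCNEWS": (58_085, 23_983),
}


def vocabulary_reduction(words: int, pseudo_morphemes: int) -> float:
    """Fractional reduction in vocabulary size relative to the word vocabulary."""
    return 1.0 - pseudo_morphemes / words


def accumulated_frequency(rfo_values) -> float:
    """Accumulated Frequency as described in the text: the sum of individual RFOs."""
    return sum(rfo_values)


# Reduction per textual sample (hypothetical helper dictionary).
REDUCTIONS = {
    name: vocabulary_reduction(words, morphs)
    for name, (words, morphs) in VOCAB.items()
}

if __name__ == "__main__":
    for name, reduction in REDUCTIONS.items():
        print(f"{name}: {reduction:.1%}")
```

Under these figures each sample's reduction falls between 58% and 61%, which is consistent with the "about 60%" stated in the excerpt.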

Table 8: Grammatical morphemes for nouns and present tense verbs

in Stochastic approaches to morphology acquisition
by Justin M. Aronoff, Nuria Giralt, Toben H. Mintz
"... In PAGE 9: ... Homogenous environments An analysis of the noun and verb categories reveals a striking difference in terms of the number of available grammatical morphemes. Although there are a number of exceptions, nouns generally have at least one of three frequent grammatical morphemes; verbs, on the other hand, have a much larger set (see Table 8). The result of the small set of grammatical morphemes for nouns is that there is less competition among the morphemes resulting in a higher likelihood of a small number of morphemes occurring with a much higher frequency than anything else, rather than a cascade of frequencies running across a large set of actual morphemes and intermixing with the background noise.... ..."

Table 3: HMM tagging, unknown-word handling, and error-correction experiment results. From the left: corpus domain, number of morphemes in the corpus (corpus size), average number of possible POS tags for each morpheme (amount of ambiguity), HMM-alone tagging performance (%), HMM tagging performance with unknown-word guessing (%), and final hybrid tagging (HMM + unknown-word + error correction) performance (%). The average number of tags for each morpheme includes the tags from the morpheme dictionary and also from the pattern dictionary resulting from the morphological analysis. The HMM-alone tagging performance counted all the unknown words as tagging failures.

in Hybrid POS tagging with generalized unknown-word handling
by Geunbae Lee, Jeongwon Cha, Jong-hyeok Lee

Table 1. Results of web-validation for a selection of 7 suffix morphemes: ographer=writer, ography=writing, ology=study, omania=obsession, onaut=traveller, ophobia=fear of, oscope=viewer

in Exploring Linguistic Creativity via Predictive Lexicology
by unknown authors
"... In PAGE 3: ... This secondary web validation reveals that 654 (20%) of the 3304 word-forms are actually H-Creative (that is, novel and useful). Table 1 provides a breakdown of the analysis on a morpheme by morpheme basis. 5.... ..."

Table 3. Comparison of speaker independent morpheme error rates with different language models (25k vocabulary).

in Modeling Morphosyntax With Finite-State Transducers
by And Its Application, Máté Szarvas
"... In PAGE 2: ...ems. The evaluation domain, however, was quite limited. Therefore we repeated the experiments on a 25k vocabulary task to see how these results generalize for larger tasks. We can see from Table 3 that the use of morphological information proved useful on this larger task as well, though the improvement was much smaller than on the 1350 word task. We have not yet finished the analysis of the errors, but preliminary investigations suggest that the reduced improvement is related to the OOV words.... ..."

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University