Results 1 - 10 of 28,794
Table 2: Correct and incorrect morphemes generated by the simple frequency analysis
"... In PAGE 5: ... Additionally, it yielded seven correct morpheme types and one incorrect morpheme type. These are shown in Table2 , transcribed here and in all further examples back into orthographic form when possible for ease of reading. These results indicate that the proposed morpheme extraction procedure can result in high type accuracy and moderate token accuracy, although it does not result in a large set of morphemes.... ..."
Table 1 Comparison of terms found by statistical analysis and domain- specific morphemes
2005
"... In PAGE 9: ...Table 1 Comparison of terms found by statistical analysis and domain- specific morphemes Table1 shows terms that have been extracted from four documents that cover different topics. Terms in the middle column were found by using sta- tistical methods, the ones in the rightmost column originate in morphologi- cal analysis.... ..."
Cited by 1
Table 3: Statistics and example morpheme analyses in Finnish. #anal is the average amount of analysis per word (separated by a comma), #morph the average amount of morphemes per analysis (separated by a space), and lexicon the total amount of morpheme types.
2007
Cited by 2
Table 5: Statistics and example morpheme analyses in German. #anal is the average amount of analysis per word (separated by a comma), #morph the average amount of morphemes per analysis (separated by a space), and lexicon the total amount of morpheme types.
2007
Cited by 2
Table 6: Statistics and example morpheme analyses in English. #anal is the average amount of analysis per word (separated by a comma), #morph the average amount of morphemes per analysis (separated by a space), and lexicon the total amount of morpheme types.
2007
Cited by 2
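The three statistics named in the captions above (#anal, #morph and lexicon) can be illustrated with a minimal sketch. It assumes only what the captions state: alternative analyses of a word are separated by commas, and morphemes within one analysis are separated by spaces. The function name `morph_stats` and the toy analyses are hypothetical and are not taken from any of the cited papers.

```python
def morph_stats(analyses):
    """Compute #anal, #morph and lexicon size for a non-empty mapping.

    analyses maps each word to a string in which alternative analyses are
    separated by ',' and the morphemes of one analysis by ' ' (the format
    described in the captions; the data passed in below is invented).
    """
    total_analyses = 0
    total_morphs = 0
    lexicon = set()
    for line in analyses.values():
        # Split the alternatives, dropping empty pieces and stray spaces.
        alternatives = [a.strip() for a in line.split(",") if a.strip()]
        total_analyses += len(alternatives)
        for alt in alternatives:
            morphs = alt.split()
            total_morphs += len(morphs)
            lexicon.update(morphs)
    return {
        "anal": total_analyses / len(analyses),   # analyses per word
        "morph": total_morphs / total_analyses,   # morphemes per analysis
        "lexicon": len(lexicon),                  # distinct morpheme types
    }


# Hypothetical toy input, for illustration only.
toy = {
    "dogs": "dog +PL",
    "walked": "walk +PAST, walk +PTCP",
    "cat": "cat",
}
stats = morph_stats(toy)
print(stats)
```

With real data, the mapping would be read from an analysis file rather than written inline; the sketch does not handle an empty mapping.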
Table 1 shows the main features of the three textual samples relating to size, number of words and pseudo-morphemes, and vocabulary size, both in words and pseudo-morphemes, for each database [6]. Figure 1 shows some of the interesting conclusions derived from this analysis. The first important outcome of our analysis is that the vocabulary size of pseudo-morphemes is reduced by about 60% (Fig. 1 (a)) in all cases relative to the vocabulary size of words. Regarding the unit size, Fig. 1 (b) shows the plot of Relative Frequency of Occurrence (RFO) of the pseudo-morphemes and words versus their length in characters over the textual sample named STDBASQUE. Although only 10% of the pseudo-morphemes in the vocabulary have fewer than four characters, such small morphemes have an Accumulated Frequency of about 40% in the databases (the Accumulated Frequency is calculated as the sum of the individual pseudo-morphemes' RFO) [7].
in Language Resources for a Bilingual Automatic Index System of Broadcast News in Basque and Spanish
"... In PAGE 2: ... This approach has been evaluated over three textual samples analysing both the coverage and the Out of Vocabulary rate, when we use words and pseudo-morphemes obtained by the automatic morphological segmentation tool AHOZATI [6]. Table1 . Main characteristics of the textual databases for morphologic analysis STDBASQUE NEWSPAP ER BCNEW S Text amount 1,6M 1,3M 2,5M Number of words 197,589 166,972 210,221 Number of pseudo-morphemes 346,232 304,767 372,126 Number of sentences 15,384 13,572 19,230 Vocabulary size in words 50,121 38,696 58,085 Vocabulary size in pseudo-morphemes 20,117 15,302 23,983 ... ..."
Table 8: Grammatical morphemes for nouns and present tense verbs
"... In PAGE 9: ... Homogenous environments An analysis of the noun and verb categories reveals a striking difference in terms of the number of available grammatical morphemes. Although there are a number of exceptions, nouns generally have at least one of three frequent grammatical morphemes; verbs, on the other hand, have a much larger set (see Table8 ). The result of the small set of grammatical morphemes for nouns is that there is less competition among the morphemes resulting in a higher likelihood of a small number of morphemes occurring with a much higher frequency than anything else, rather than a cascade of frequencies running across a large set of actual morphemes and intermixing with the background noise.... ..."
Table 3: HMM Tagging, unknown-word handling, and error-correction experiment results. From the left: corpus domain, number of morphemes in the corpus (corpus size), average number of possible POS tags for each morpheme (amounts of ambiguity), HMM alone tagging performance (%), HMM tagging performance with unknown-word guessing (%), and nal hybrid tagging (HMM + unknown-word + error correction) performance (%). The average number of tags for each morpheme includes the tags from the morpheme dictionary and also from the pattern dictionary resulting from the morphological analysis. The HMM alone tagging performance counted all the unknown- words as tagging failures.
Table 1. Results of web-validation for a selection of 7 suffix morphemes: ographer=writer, ography=writing, ology=study, omania=obsession, onaut=traveller, ophobia=fear of , oscope=viewer
"... In PAGE 3: ... This secondary web validation reveals that 654 (20%) of the 3304 word-forms are actually H-Creative (that is, novel and useful). Table1 provides a breakdown of the analysis on a morpheme by morpheme basis. 5.... ..."
Table 3. Comparision of speaker independent morpheme error rates with different language models (25k vocabu- lary).
"... In PAGE 2: ...ems. The evaluation domain, however, was quite limited. Therefore we repeated the experiments on a 25k vocab- ulary task to see how these results generalize for larger tasks. We can see from Table3 that the use of morpho- logical information proved useful on this larger task as well, though the improvement was much smaller than on the 1350 word task. We did not finish yet the analysis of the errors but preliminary investigations suggest that the reduced improvement is related to the OOV words.... ..."