Results 1 - 10 of 14,859
Table 1 - The Penn TreeBank project tagset
"... In PAGE 24: ... The tags are taken from a tagset, which is a predefined tags list. Table1 shows the well-known Penn TreeBank tagset [29]. ... In PAGE 27: ... For instance, take the sentence: The house of the rising sun. By part-of-speech tagging the sentence (using the tagset given in Table1 ), we get: The/DT house/NN of/IN the/DT rising/VBG sun/NN. Shallow parsing the sentence will result in: [ The/DT house/NN ] of/IN [ the/DT rising/VBG sun/NN ] which means that The/DT house/NN and the/DT rising/VBG sun/NN are both base noun-phrases.... ..."
Table C.3 Listing of Syntactic Tags of Penn Tree-Bank
2001
Table 3.4 Changes of Penn Tree-Bank Syntactic Tags to ASPIN Tags
2001
Table 1: Penn Chinese Treebank phrase tags.
2007
"... In PAGE 3: ... We examined all phrase types in the Tree- bank; potentially phrases of any type could be can- didates for reordering rules. Table1 provides a list of Treebank phrase tags for easy reference. We ruled out several phrase types as not requiring reordering... ..."
Cited by 2
Table 3.3 Changes of Penn Tree-Bank POS Tags to 20 ASPIN POS-Tags
2001
Table 2: The distribution of the 10 most frequent types of empty node and their antecedents in the Penn Treebank (adapted from Johnson 2002). Bracketed lines designate long-distance dependencies that are local in DG
2005
"... In PAGE 6: ... The ten most frequent types of empty nodes cover more than 60,000 of the approximately 64,000 empty nodes of sections 2-21 of the Penn Treebank. Table2 , reproduced from Johnson (2002) [line numbers and counts from the whole Treebank added], gives an overview. Empty units, empty complementizers and empty relative pronouns [lines 4,5,9,10] pose no problem for DG as they are optional, non-head material.... ..."
Table 4. Test set perplexity for the language models for the Penn Treebank corpus.
"... In PAGE 7: ... Note that, in the tuning set, more than 5% of the words are unknown. Finally, we have applied the hybrid language model to the test set of the Penn Tree- bank corpus (with the best values of epsilon1 for the tuning set) and the test set perplexity is shown in Table4 . We have also estimated a trigram model which uses words directly, using linear discounting as a smoothing technique [18].... In PAGE 7: ...able 4. Test set perplexity for the language models for the Penn Treebank corpus. 6 Discussion and conclusions In this paper, we have combined a POStag connectionist N-gram model with a unigram model of words. The results for this hybrid language model (see Table4 ) show that this new approach is feasible. As we can see in Table 2, the behaviour of the connectionist model for the POStagged corpus, LMc, offers better performance than the conventional class N-gram models.... ..."
Table 2.1: Summary of corpus-based parsing approaches and the publication. The first column indicates if the training corpus is annotated or unannotated. 'WSJ' means the Penn TreeBank, Wall Street Journal section. The second column indicates if it uses context in the parsing. The use of context differs system by system. 'LR' means probabilistic LR parser, which is different from the
Table 6. The tree-bank in numbers
"... In PAGE 17: ... sentences, i.e. 0.8%). Table6 also shows some other figures pertaining to the syntactic annotation in the tree-bank. The total number of constituent nodes (beyond POS tags) in parse- trees expresses that the parse-trees contain, on average, just over one constituent-node per word.... In PAGE 18: ... Once occurring phenomena provide good examples of this sparseness problem. As Table6 shows, once-occurring mor- phemes (67% of all morphemes) are dramatically less ambiguously represented at the POS tag level than other morphemes, and the many once-occurring grammar rules (66%) signify, most probably, non-representative distributions over syntactic struc- tures, but possibly also annotation errors. As a result, any probabilistic parser that can be generated using this corpus is expected to suffer from coverage problems.... ..."