Combining Multiple Classifiers to Improve Part of Speech Tagging: A Case Study for Brazilian Portuguese (2000)

by Rachel V. Xavier Aires, Ra M. Aluísio, Denise C. S. Kuhn, Marcio L. B, Osvaldo N. Oliveira
In the Proceedings of the Brazilian AI Symposium (SBIA’2000)
http://www.nilc.icmc.usp.br/nilc/download/AiresSBIA2000.ps

Abstract:

Four taggers have been trained on a 100,000-word corpus of Brazilian Portuguese, namely Unigram (Treetagger), N-gram (Treetagger), transformation-based (TBL) and Maximum-Entropy tagging (MXPOST). The last of these displayed the best accuracy (88.73%), which is still much lower than the state-of-the-art accuracy for English. The low accuracy is attributed to the small size of the training corpus. Twelve methods of combination were used, four of which led to an improvement over the MXPOST accuracy. The best result (89.42%) was obtained with a majority-wins voting strategy.
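
The majority-wins voting mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_vote`, the tag labels, and the example predictions are hypothetical, and the tie-breaking rule (defer to one designated tagger, here the one labelled `mxpost`) is an assumption, since the abstract does not say how ties were resolved.

```python
from collections import Counter
from typing import Dict, List


def majority_vote(predictions: Dict[str, List[str]], fallback: str) -> List[str]:
    """Combine per-token tags from several taggers by plurality vote.

    predictions maps a tagger name to its tag sequence for one sentence.
    fallback names the tagger whose tag is used when the top vote is tied
    (an assumed tie-breaking rule, not taken from the paper).
    """
    lengths = {len(tags) for tags in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all taggers must tag the same number of tokens")
    n_tokens = lengths.pop()

    combined = []
    for i in range(n_tokens):
        # Count how many taggers proposed each tag for token i.
        votes = Counter(tags[i] for tags in predictions.values())
        ranked = votes.most_common()
        top_tag, top_count = ranked[0]
        # A tie exists when the runner-up has as many votes as the leader.
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        combined.append(predictions[fallback][i] if tied else top_tag)
    return combined


# Hypothetical outputs of four taggers for a three-token sentence.
example = {
    "unigram": ["ART", "N", "V"],
    "ngram": ["ART", "N", "ADJ"],
    "tbl": ["ART", "V", "ADJ"],
    "mxpost": ["ART", "V", "ADJ"],
}
print(majority_vote(example, fallback="mxpost"))
```

In this example the second token receives two votes for each of two tags, so the sketch falls back to the tag proposed by the designated tagger; the first and third tokens are decided by a clear majority.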
