Walter Daelemans, Antal van den Bosch, Jakub Zavrel, Jorn Veenstra, Sabine Buchholz, and Bertjan Busser. 1998. Rapid development of NLP modules with Memory-Based Learning.
....analogical, case-based, instance-based, and lazy learning. We selected MBL as our learning algorithm because it is well suited to domains with a large number of features from heterogeneous sources and can remember exceptional and low-frequency cases that are useful to extrapolate from [5]. In addition, we can customize the learner using different weighting functions according to linguistic bias. The main steps of the learning process are: Step 1: Prepare training data in which all noun phrases, entity names and their relations are manually annotated. Step 2: Segmenting, ....
Walter Daelemans, Antal van den Bosch, Jakub Zavrel, Jorn Veenstra, Sabine Buchholz, and Bertjan Busser. Rapid development of NLP modules with memory-based learning. In Proceedings of ELSNET in Wonderland, pp. 105-113. Utrecht: ELSNET, 1998.
....of data outperform an excellent model trained with a small amount of data. This need for large training sets is a lesson that recurs in a number of domains, from acoustic speech recognition [19] and speechreading [33] to face recognition [23], gesture recognition [24], natural language processing [10], speech production [35] and others. Moreover, for areas where we may not yet have adequate models, we know how to broaden and improve classes of models to include more degrees of freedom to account for sources of variation, to set parameters, and so on given enough data. Again, in broad ....
Walter Daelemans, Antal van den Bosch, Jakub Zavrel, Jorn Veenstra, Sabine Buchholz, and Bertjan Busser. Rapid development of NLP modules with memory-based learning. In Roberto Basili and Maria Theresa Pazienza, editors, ECML98 TANLPS Workshop Notes, pages 1--17, Technische Universität Chemnitz, 1998.
....words and it generated removable rules for ambiguous tags: whenever such a rule applies to a word, the rule's tag is removed from the set of candidates. Using simple linguistic background knowledge, the generated tagger achieved 96.4% per-word accuracy on the test data. Daelemans et al. [7] presented the MBT memory-based tagger generator system for different languages. They trained their system on a subset of the Wall Street Journal corpus (2 million words) for English and achieved 96.4% accuracy. For other languages they reached the following results: Dutch (95.7%), Spanish (97.8%) ....
W. Daelemans, A. v. d. Bosch, and J. Zavrel. Rapid development of NLP modules with memory-based learning. In Proc. of ELSNET in Wonderland, Utrecht, pages 105--113, 1998.
....task as such a classification problem allows one to use many existing machine learning algorithms. In our own work, we have advocated the use of Memory-Based Learning (MBL) techniques for POS tagging (Daelemans et al. 1996) and for classification tasks in Natural Language Processing in general (Daelemans et al. 1998). MBL provides a solution to both the sparse data problem, via an implicit similarity-based smoothing scheme, and the challenges of a rich feature set, via automatic feature weighting. In this paper we will first review the basic techniques of Memory-Based Learning (Section 2). Next, in Section 3, ....
....experiments, comparing a number of alternative tagging methods (R: rule-based (Brill, 1994), T: trigram (Steetskamp, 1995), and E: maximum entropy (Ratnaparkhi, 1996)) on the same corpus, the tagged LOB corpus (Johansson, 1986). This work is described in more detail in (van Halteren, Zavrel, and Daelemans, 1998). Each of these taggers uses different .... [Footnote 3: The F in the unknown-words pattern only indicates the position of the focus; it is not included as a feature in the actual pattern.] [Footnote 4: Thanks go to Prof. Jiří Kraus of the Czech Academy of Sciences for permission to use this corpus.] ....
[Article contains additional citation context not shown here]
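The excerpt above credits MBL with similarity-based classification and automatic feature weighting. The following is a minimal sketch of that idea, not the cited authors' implementation: it stores training instances verbatim and classifies a query by its nearest stored neighbours under a weighted overlap distance. Using information gain as the feature weight is an assumption made here for illustration, and the function names (`information_gain_weights`, `classify`) and the toy tagging data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain_weights(instances, labels):
    """One weight per feature: the reduction in class entropy from knowing its value.

    Assumption: information gain stands in for "automatic feature weighting".
    """
    base = entropy(labels)
    weights = []
    for i in range(len(instances[0])):
        by_value = defaultdict(list)
        for features, label in zip(instances, labels):
            by_value[features[i]].append(label)
        remainder = sum(
            len(subset) / len(labels) * entropy(subset) for subset in by_value.values()
        )
        weights.append(base - remainder)
    return weights


def classify(query, instances, labels, weights, k=1):
    """Return the majority class among the k stored instances closest to the query.

    Distance is a weighted overlap metric: the sum of the weights of the
    features whose symbolic values differ.
    """
    def distance(stored):
        return sum(w for w, a, b in zip(weights, stored, query) if a != b)

    ranked = sorted(range(len(instances)), key=lambda idx: distance(instances[idx]))
    votes = Counter(labels[idx] for idx in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (previous tag, focus word, next tag) -> tag of focus word.
train_x = [
    ("DT", "dog", "VB"),
    ("DT", "run", "IN"),
    ("PRP", "run", "RB"),
    ("PRP", "walk", "IN"),
    ("DT", "walk", "VB"),
]
train_y = ["NN", "NN", "VB", "VB", "NN"]

weights = information_gain_weights(train_x, train_y)
query = ("PRP", "dog", "VB")
weighted_guess = classify(query, train_x, train_y, weights)
unweighted_guess = classify(query, train_x, train_y, [1.0, 1.0, 1.0])
```

On this toy data the previous-tag feature fully determines the class, so it receives the largest weight. The weighted metric therefore assigns the query `VB`, whereas the unweighted overlap metric follows the closest surface match and assigns `NN`.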
Daelemans, W., A. Van den Bosch, J. Zavrel, J. Veenstra, S. Buchholz, and G. J. Busser. 1998. Rapid development of NLP modules with Memory-Based Learning. In Proceedings of ELSNET in Wonderland, March, 1998, pages 105--113. ELSNET.
No context found.
Walter Daelemans, Antal van den Bosch, Jakub Zavrel, Jorn Veenstra, Sabine Buchholz, and Bertjan Busser. 1998. Rapid development of NLP modules with Memory-Based Learning.
No context found.
W. Daelemans, A. van den Bosch, J. Zavrel, J. Veenstra, S. Buchholz and B. Busser, Rapid development of NLP modules with memory-based learning. Proceedings of ELSNET in Wonderland (ELSNET98). Utrecht. 1998, pp. 105-113
No context found.
Daelemans W., van den Bosch A., Zavrel J., Veenstra J., Buchholz S., Busser B. (1998) Rapid development of NLP modules with memory-based learning. Proceedings of ELSNET in Wonderland (ELSNET98). Utrecht. Pp. 105-113
No context found.
Daelemans, W., van den Bosch, A., Zavrel, J., Veenstra, J., Buchholz, S., Busser, B.: Rapid development of NLP modules with memory-based learning, Proc. ELSNET in Wonderland (ELSNET98), Utrecht, 1998, pp. 105--113.