Cardie, C. 1996. Automating feature set selection for case-based learning of linguistic knowledge. In Proc. of the Conference on Empirical Methods in Natural Language Processing, May 17-18 1996. University of Pennsylvania.
....by analogy to earlier examples. Variations on this theme are known under names such as Analogy based, Example based, Instance based, Case based, Memory based, Experience based, Data Oriented, Usage Based and Exposure based models (see e.g. Scha (1992), Skousen (1989), R. 1991), Mitchell (1994), Cardie (1996), Daelemans (1995) and others). The basic idea is that language processing and learning are fundamentally interwoven. Each language experience leaves a memory trace which can be used to guide later processing. When a new instance of a task is input to the processor, a set of relevant instances ....
....used to describe patterns. If we have no information about the importance of features, this is a reasonable choice. But if we do have some information about feature relevance, we could probably do better. One possibility is to add linguistic bias to weight or select different features (see e.g. Cardie (1996)). An alternative approach, and one that is more in line with the spirit of the empiricist approach to NLP, is to look at the behavior of features in the set of examples used for training. We can compute statistics about the relevance of features by looking at which features are good predictors of ....
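The excerpt above is cut off before it names a specific statistic. As a minimal sketch, assuming information gain as the relevance measure (an assumption, not stated in the excerpt), a per-feature weight could be computed from the training examples as follows. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(examples, labels, feature_index):
    """Reduction in label entropy obtained by knowing one feature's value.

    Assumption: information gain stands in for the relevance statistic,
    which the excerpt does not name.
    """
    # Group the class labels by the value the chosen feature takes.
    by_value = {}
    for features, label in zip(examples, labels):
        by_value.setdefault(features[feature_index], []).append(label)
    # Expected entropy that remains after splitting on the feature.
    remainder = sum(len(subset) / len(labels) * entropy(subset)
                    for subset in by_value.values())
    return entropy(labels) - remainder


# Hypothetical toy data: feature 0 predicts the class, feature 1 does not.
examples = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = ["pos", "pos", "neg", "neg"]
weights = [information_gain(examples, labels, i) for i in range(2)]
```

In this toy set the first feature receives a high weight and the second a weight of zero, so a weighted distance metric would effectively ignore the irrelevant feature.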
.... has very little benefit and sometimes degrades performance [12]. Wettschereck et al. provide a good review and an empirical evaluation of feature weighting methods for a class of lazy learning algorithms [16]. Some researchers have developed algorithms just for the selection of relevant features [3, 13, 15]. In this paper we present a classification learning algorithm that achieves high accuracy, comparable to the nearest neighbor classifier, and is not adversely affected by the presence of irrelevant features. The VFI5 (Voting Feature Intervals) algorithm described here is quite robust with respect to ....
Cardie, C.: Automating Feature Set Selection for Case-Based Learning of Linguistic Knowledge. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, University of Pennsylvania (1996) 113--126
....which may or may not involve the direct application of contrastive linguistic knowledge. A number of researchers have already applied machine learning techniques to various NLP tasks, such as accent restoration by Yarowsky (1994), who achieves 99% accuracy, relative pronoun disambiguation (Cardie, 1992, 1993, 1996), (Japanese) anaphora resolution (Aone & Bennett, 1995), part of speech tagging (Weischedel et al., 1993; Brill, 1995; Daelemans, Zavrel, Berck, & Gillis, 1996), where tagging accuracies between 96% and 97% are achieved, cue phrase classification (Siegel & McKeown, 1994; Litman, 1996) and word ....
Cardie, C. (1996). Automating feature set selection for case-based learning of linguistic knowledge.