Scott Cost and Steven Salzberg. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10:57 -- 78, 1993.


This paper is cited in the following contexts:
Inductive Bias in Case-Based Reasoning Systems - Griffiths, Bridge (1995)   (1 citation)

....three possibilities to improve a case based system: • store new cases in the case base CB • change the measure of similarity σ • change CB and σ [WG94, p. 79]. Many case based learning algorithms have been defined illustrating these options; IB2 [AKA91], VS CBR [WG94] and PEBLS [CS93] [YJL94] show a number of options for adjusting the represented hypothesis. The current section will study the situation where concepts are learnt using a single fixed similarity measure, and the hypothesis is updated by alterations to the case base alone. Specifically, having defined a simple ....

S Cost and S Salzberg. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10(1):37--66, March 1993.


Predicting Phrase Breaks with Memory-Based Learning - Busser, Daelemans, van den.. (2001)   (3 citations)

....IB1 MVDM: For typical symbolic (nominal) features, values are not ordered. In the previous variants, mismatches between values are all interpreted as equally important, regardless of how similar (in terms of classification behaviour) the values are. We adopted the modified value difference metric [Cost and Salzberg, 1993] to assign a different distance between each pair of values of the same feature. MVDM IG: MVDM with IG weighting. IGTREE: In this variant, an oblivious decision tree is created with features as tests, and ordered according to information gain of features, as a heuristic approximation of the ....

Cost, S. and Salzberg, S. (1993). A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10:57--78.


Skousen's Analogical Modeling Algorithm: A comparison with .. - Walter Daelemans Itk (1997)   (1 citation)

....by retrieving the most similar memory items according to some distance metric, and extrapolating the category of these items to the new input pattern. Instances of this form of nearest neighbour method include instance based learning (Aha et al. 1991), exemplar based learning (Salzberg, 1991; Cost and Salzberg, 1993), and memory based reasoning (Stanfill and Waltz, 1986). The approach has been applied to a wide range of problems using not only numeric and binary values (for which nearest neighbour methods are traditionally used) but also using symbolic, unordered features. Advantages of the approach include ....

....nearest neighbour methods are traditionally used) but also using symbolic, unordered features. Advantages of the approach include an often surprisingly high classification accuracy, the capacity to learn polymorphous concepts, high speed of learning, and perspicuity of algorithm and classification (Cost and Salzberg, 1993). Aha et al. (Aha et al. 1991) have shown that the basic instance based learning algorithm can PAC-learn any concept whose boundary is a union of a finite number of closed hypercurves of finite size (a class of concepts similar to that which ID3 and backpropagation can learn). Training speed is ....

[Article contains additional citation context not shown here]

Cost, S. and Salzberg, S. (1993). A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10:57-78.


TiMBL: Tilburg Memory-Based Learner - version 4.0.. - Daelemans, Zavrel.. (2001)

.... earlier experiences (as in rule induction and rule based processing). The approach has surfaced in different contexts using a variety of alternative names such as similarity based, example based, exemplar based, analogical, case based, instance based, and lazy learning (Stanfill and Waltz, 1986; Cost and Salzberg, 1993; Kolodner, 1993; Aha, Kibler, and Albert, 1991; Aha, 1997). Historically, memory based learning algorithms are descendants of the k nearest neighbor (henceforth k NN) algorithm (Cover and Hart, 1967; Devijver and Kittler, 1982; Aha, Kibler, and Albert, 1991). An MBL system, visualized ....

....of a feature are seen as equally dissimilar. However, if we think of an imaginary task in e.g. the phonetic domain, we might want to use the information that b and p are more similar than b and a. For this purpose a metric was defined by Stanfill and Waltz (1986) and further refined by Cost and Salzberg (1993). It is called the (Modified) Value Difference Metric (MVDM; equation 5.10) and it is a method to determine the similarity of the values of a feature by looking at co-occurrence of values with target classes. For the distance between two values V1, V2 of a feature, we compute the difference of ....

Cost, S. and S. Salzberg. 1993. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10:57--78.
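
The MVDM excerpt above describes comparing two values of a feature through their co-occurrence with target classes. The excerpt is truncated before its equation, so the following is only a minimal sketch under an assumption: the distance is taken to be the sum over classes of the absolute difference between the class-conditional probabilities of the two values. The function names and the toy data are hypothetical and are not taken from TiMBL or PEBLS.

```python
from collections import Counter, defaultdict


def class_distributions(values, labels):
    """Estimate P(class | value) for one symbolic feature from co-occurrence counts."""
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        counts[value][label] += 1
    dists = {}
    for value, counter in counts.items():
        total = sum(counter.values())
        dists[value] = {label: n / total for label, n in counter.items()}
    return dists


def mvdm(dists, v1, v2):
    """Assumed MVDM form: sum over classes of |P(c | v1) - P(c | v2)|."""
    classes = set(dists[v1]) | set(dists[v2])
    return sum(abs(dists[v1].get(c, 0.0) - dists[v2].get(c, 0.0)) for c in classes)


# Hypothetical toy data: one feature (a phoneme) and a class label per instance.
values = ["b", "b", "p", "p", "a", "a"]
labels = ["stop", "stop", "stop", "stop", "vowel", "vowel"]

dists = class_distributions(values, labels)
d_bp = mvdm(dists, "b", "p")  # identical class distributions, so distance 0.0
d_ba = mvdm(dists, "b", "a")  # disjoint class distributions, so distance 2.0
```

With these toy counts, b and p come out as closer than b and a, which matches the intuition given in the excerpt. As a later excerpt on this page notes, estimates of this kind can be unreliable for values that occur only a few times.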


Unpacking Multi-Valued Symbolic Features and Classes in.. - van den Bosch, Zavrel (2000)

....of a classification problem to stored instances of the same problem. The approach has surfaced in different contexts using a variety of alternative names such as similarity based, example based, exemplar based, analogical, case based, instance based, and lazy learning (Stanfill & Waltz, 1986; Cost & Salzberg, 1993; Kolodner, 1993; Aha et al. 1991). Historically, memory based learning algorithms are descendants of the k nearest neighbour (henceforth k NN) algorithm (Cover & Hart, 1967; Devijver & Kittler, 1982). Memory based learning is lazy as it involves adding training examples (feature value vectors ....

.... be an option to enhance performance, as suggested in (Ricci & Aha, 1998). Finally, as (Howe & Cardie, 1997) also remark, there is a relation between using specific (local) weights for features for each class separately, and the (modified) value difference metric (M)VDM (Stanfill & Waltz, 1986; Cost & Salzberg, 1993), which estimates the distance between two values (when matching a feature value of a memory instance and a test instance) on the basis of the similarities between these values' class distributions. VDM distances are summed over all classes (and normalized) and are strictly speaking not feature ....

Cost, S., & Salzberg, S. (1993). A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10, 57--78.


Stress Assignment In Spanish Proper Names - San-Segundo, Montero..

....metric (equation 4) all the features have the same importance. We can add more knowledge analysing the behaviour of each feature in the training set. The features with higher discrimination power should contribute with a larger weight to the distance. So, we considered a weighted distance, as in [8]: To calculate the weights W, we use the Information Gain (IG) [9] [10]. The IG for a specific feature is the difference between the Information Entropy (IE) of the training set, and the average Information Entropy along the different training subsets obtained dividing the training set according ....

S. Cost, and S. Salzberg. "A weighted nearest neighbour algorithm for learning with symbolic features". Machine Learning 10: pp. 57-58, 1993.
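
The excerpt above defines the Information Gain of a feature as the entropy of the training set minus the average entropy of the subsets obtained by splitting on that feature, and uses the result as a feature weight in the distance. A minimal sketch follows. Two points are assumptions, since the excerpt's equations are not shown: the average is taken to be weighted by subset size, and the weighted distance is taken to be a weighted overlap (mismatch) count. All names and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the whole set minus the size-weighted entropy of the per-value subsets."""
    subsets = defaultdict(list)
    for value, label in zip(feature_values, labels):
        subsets[value].append(label)
    total = len(labels)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


def weighted_overlap_distance(x, y, weights):
    """Assumed distance: sum of the weights of the features on which x and y differ."""
    return sum(w for xi, yi, w in zip(x, y, weights) if xi != yi)


# Hypothetical toy training set: feature 0 determines the class, feature 1 does not.
instances = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = ["pos", "pos", "neg", "neg"]

weights = [
    information_gain([inst[i] for inst in instances], labels)
    for i in range(len(instances[0]))
]
d = weighted_overlap_distance(("a", "x"), ("b", "y"), weights)
```

In this toy set the first feature receives all of the weight, so a mismatch on the second feature alone contributes nothing to the distance.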


Single-Classifier Memory-Based Phrase Chunking - Veenstra, van den Bosch (2000)   (5 citations)

....classifier architecture that could be expected to improve accuracy. Given this restriction we have explored the following: 1. The generalization accuracy of TiMBL with default settings (multi valued features, overlap metric, feature weighting) 2. The usage of MVDM (Stanfill and Waltz, 1986; Cost and Salzberg, 1993) (Section 2) which should work well on word value pairs with a medium or high frequency, but may work badly on word value pairs with low frequency. 3. The straightforward unpacking of feature values into binary features. On some tasks we have found that splitting multi valued features into ....

....f is measured by computing the difference in uncertainty (i.e. entropy) between the situations without and with knowledge of the value of that feature. The resulting IG values can then be used as weights in equation 1. Modified Value Difference Metric The Modified Value Difference Metric (mvdm) (Cost and Salzberg, 1993) estimates the distance between two values of a feature by comparing the class distribution of both features. Mvdm can give good estimates if there are enough occurrences of the two values, but for low frequent values unreliable values of mvdm can occur. For this data we can expect that this ....

S. Cost and S. Salzberg. 1993. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10:57--78.


Optical Music Recognition System within a.. - Choudhury..   (1 citation)

.... be easily modified, by continually adding new samples that it encounters into the database, to become an adaptive system (Aha 1997). In fact, the nearest neighbor algorithm is one of the simplest learning methods known, and yet no other algorithm has been shown to outperform it consistently (Cost and Salzberg 1993). Furthermore, the performance of the classifier can be dramatically increased by using weighted feature vectors. Finding a good set of weights, however, is extremely time consuming, thus a genetic algorithm (Holland 1975) is used to find a solution (Wettschereck, Aha, and Mohri 1997). Note that ....

Cost, S. and S. Salzberg (1993). A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning 10, 57--78.


Inductive Lexica - Daelemans, Durieux (2000)

.... features on the basis of their relevance for solving the task (see Wettschereck et al. (1996) for an overview). Another addition to the basic algorithm that has proved relevant for many natural language processing tasks is the introduction of a value difference metric (Stanfill and Waltz, 1986; Cost and Salzberg, 1993). Such a metric assigns different distances to pairs of values for the same attribute. In tagging, for example, such a metric would assign a smaller distance between proper nouns and common nouns than between proper nouns and adjectives. These biases can of course also be manually added to ....

Cost, S. and S. Salzberg: 1993, `A weighted nearest neighbour algorithm for learning with symbolic features'. Machine Learning 10, 57--78.


TiMBL: Tilburg Memory-Based Learner - version 3.0.. - Daelemans, Zavrel.. (2000)

....experiences (as in rule induction and rule based processing). The approach has surfaced in different contexts using a variety of alternative names such as similarity based, example based, exemplar based, analogical, case based, instance based, and lazy learning [25, 7, 22, 2, 1]. Historically, memory based learning algorithms are descendants of the k nearest neighbor (henceforth k nn) algorithm [8, 19, 2]. An mbl system, visualized schematically in Figure 4.1, contains two components: a learning component which is memory based (from which mbl borrows its name) and a ....

....are seen as equally dissimilar. However, if we think of an imaginary task in e.g. the phonetic domain, we might want to use the information that b and p are more similar than b and a. For this purpose a metric was defined by Stanfill & Waltz [25] and further refined by Cost & Salzberg [7]. It is called the (Modified) Value Difference Metric (mvdm; equation 4.10) and it is a method to determine the similarity of the values of a feature by looking at co-occurrence of values with target classes. For the distance between two values V1, V2 of a feature, we compute the difference of ....

S. Cost and S. Salzberg. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10:57--78, 1993.


Instance-Family Abstraction in Memory-Based Language Learning - van den Bosch

.... data, as measured in fambl's probing phase (averaged over 10-fold cross validation experiments). Table 8 lists the average generalization accuracies produced by ib1 gr, fambl gr, and fambl gr mvdm, a variation of fambl that combines gr feature weighting and the mvdm value difference metric (Cost and Salzberg, 1993), on the selected UCI benchmark data sets. We have added the generalization accuracies obtained by c4.5 and c4.5rules (Quinlan, 1993) with default parameter settings, on the same data sets. Both c4.5 and c4.5rules employ gr feature weighting, as a means to decide on feature test ordering, and ....

Cost, S. and S. Salzberg. 1993. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10:57--78.


Careful Abstraction from Instance Families in Memory-Based.. - van den Bosch (1999)

....with pure memory based learning. We use the term abstraction here as denoting the forgetting of learning material during learning. This material notion of abstraction is not to be confused with the informational abstraction from learning material exhibited by weighting metrics (Salzberg, 1991; Cost and Salzberg, 1993; Wettschereck, Aha, and Mohri, 1997) and, more generally, by the abstraction bias in all memory based learning approaches that high similarity between instances is to be preferred over lower similarity, and that only the most similar items are to be used as information source for extrapolating ....

....single instances) is unlikely to happen. ib3 may not be careful enough for these types of data. 2.1.2 Oblivious (partial) decision tree abstraction. Many studies have demonstrated the positive effects of using informational abstraction, such as feature weighting, in the distance function (Cost and Salzberg, 1993; Wettschereck, Aha, and Mohri, 1997) on classification accuracy in memory based learning of many types of tasks, including real world learning tasks. This appears to hold for language learning tasks in general, as witnessed by several empirical studies (Weijters, 1991; Daelemans and Van den ....

[Article contains additional citation context not shown here]

Cost, S. and S. Salzberg. 1993. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10:57--78.


Rapid Development of NLP Modules with Memory-Based.. - Daelemans, van den.. (1998)   (7 citations)

.... rules abstracted from earlier experiences (as in rule induction and rule based processing). The approach has surfaced in different contexts using a variety of alternative names such as similarity based, example based, exemplar based, analogical, case based, instance based, and lazy learning [23, 7, 17, 2, 3]. Historically, memory based learning algorithms are descendants of the k nearest neighbor (henceforth k NN) algorithm [8, 16, 2]. An MBL system, visualized schematically in Figure 1, contains two components: a learning component which is memory based (from which MBL borrows its name) and a ....

....[Figure 1: General architecture of an MBL system.] TIMBL is the name of a software package developed by the ILK group at Tilburg University, containing variations of the memory based learning algorithms IB1, IB1 IG, and MVDM [2, 13, 12, 7], and IGTREE [12], a decision tree optimization of memory based learning. Below, we outline the functioning of IB1 IG and IGTREE in Subsections 2.1 and 2.2, respectively. (Footnote 1: The TIMBL software package is freely available for research purposes from the ILK web pages; consult URL http://ilk.kub.nl/.) ....

S. Cost and S. Salzberg. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10:57--78, 1993.


TiMBL: Tilburg Memory Based Learner - version 2.0 -.. - Daelemans, Zavrel.. (1999)

....experiences (as in rule induction and rule based processing). The approach has surfaced in different contexts using a variety of alternative names such as similarity based, example based, exemplar based, analogical, case based, instance based, and lazy learning [23, 6, 20, 2, 1]. Historically, memory based learning algorithms are descendants of the k nearest neighbor (henceforth k nn) algorithm [7, 17, 2]. An mbl system, visualized schematically in Figure 4.1, contains two components: a learning component which is memory based (from which mbl borrows its name) and a ....

....are seen as equally dissimilar. However, if we think of an imaginary task in e.g. the phonetic domain, we might want to use the information that b and p are more similar than b and a. For this purpose a metric was defined by Stanfill & Waltz [23] and further refined by Cost & Salzberg [6]. It is called the (Modified) Value Difference Metric (mvdm; equation 4.7) and it is a method to determine the similarity of the values of a feature by looking at co-occurrence of values with target classes. For the distance between two values V1, V2 of a feature, we compute the difference of ....

S. Cost and S. Salzberg. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10:57--78, 1993.


Memory-Based Lexical Acquisition and Processing - Daelemans (1994)   (10 citations)

.... to some distance metric, and extrapolating the category of this item to the new input pattern (applying the consistency heuristic). Instances of this form of nearest neighbour method include instance based learning (Aha et al. [1991]), exemplar based learning (Salzberg [1990], Cost and Salzberg [1993]), memory based reasoning (Stanfill and Waltz [1986]) and case based reasoning (Kolodner [1993]). Advantages of the approach include an often surprisingly high classification accuracy, the capacity to learn polymorphous concepts, high speed of learning, and perspicuity of algorithm and ....

.... (applying the consistency heuristic). Instances of this form of nearest neighbour method include instance based learning (Aha et al. [1991]), exemplar based learning (Salzberg [1990], Cost and Salzberg [1993]), memory based reasoning (Stanfill and Waltz [1986]) and case based reasoning (Kolodner [1993]). Advantages of the approach include an often surprisingly high classification accuracy, the capacity to learn polymorphous concepts, high speed of learning, and perspicuity of algorithm and classification (see e.g. Cost and Salzberg [1993]). Learning speed is extremely fast (it consists ....

[Article contains additional citation context not shown here]

Cost, S. and Salzberg, S.: A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning 10, (1993) 57--78.


Fast NP Chunking Using Memory-Based Learning Techniques - Veenstra, Buchholz (1998)   (15 citations)

....experience. In AI, the concept of MBL has appeared in several disciplines (from computer vision to robotics) using names such as: example based, case based, similarity based, exemplar based, instance based, lazy, analogical or nearest neighbour (Stanfill and Waltz, 1986; Aha, 1991; Kolodner, 1993; Cost and Salzberg, 1993). Similar accounts about this type of analogical reasoning can be found in non mainstream linguistics and psycholinguistics as well (Skousen, 1989; Derwing and Skousen, 1989; Chandler, 1992; Scha, 1992). In computational linguistics the idea of MBL has recently gained some popularity (Daelemans ....

S. Cost and S. Salzberg. 1993. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10:57--78.


Memory-Based Lexical Acquisition and Processing - Daelemans (1994)   (10 citations)

.... the most similar memory item according to some distance metric, and extrapolating the category of this item to the new input pattern (applying the consistency heuristic). Instances of this form of nearest neighbour method include instance based learning ([2]), exemplar based learning ([21], [5]), memory based reasoning ([26]) and case based reasoning ([17]). Advantages of the approach include an often surprisingly high classification accuracy, the capacity to learn polymorphous concepts, high speed of learning, and perspicuity of algorithm and classification (see e.g. [5]). Learning ....

.... ([21], [5]), memory based reasoning ([26]) and case based reasoning ([17]). Advantages of the approach include an often surprisingly high classification accuracy, the capacity to learn polymorphous concepts, high speed of learning, and perspicuity of algorithm and classification (see e.g. [5]). Learning speed is extremely fast (it consists basically of storing patterns) and performance speed, while relatively slow on serial machines, can be considerably reduced by using k d trees on serial machines ([13]). 1 To have reliable results, this process is repeated 10 times with different ....

[Article contains additional citation context not shown here]

Cost, S. and Salzberg, S.: A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning 10, (1993) 57--78.


Protein Folding: Symbolic Refinement Competes with Neural Networks - Susan Craw (1995)   (2 citations)

....uses the Viterbi algorithm, a Markov model approach, for propagation. Delcher et al. claim that explicit modelling of causal links allows the user to experiment with the model and gain understanding of the application, and thus the approach is superior to neural networks. Cost and Salzberg's PEBLS [2] uses a nearest neighbour algorithm both to build a case based library from training examples and to retrieve future predictions from the library. Testing has experimented with various distance metrics and window sizes; the best results for the Chou Fasman domain being achieved with Manhattan ....

....of KRUST's refined theories and other systems described in Section 3. We have tended to quote other results from Maclin & Shavlik [9] and Delcher et al. [6] since we all used the same 128 example proteins, and adopted the same train and test strategy. We note however that although Cost & Salzberg [2] use the same set of proteins, they assign 100 proteins for training and only the remaining 28 are used for testing. Since biochemists are particularly interested in accurately identifying helix and strand sites, the overall predictive accuracy has also been broken down into the accuracies for the ....

[Article contains additional citation context not shown here]

S. Cost and S. Salzberg. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10(1):57--78, 1993.


Machine Learning in Molecular Biology Sequence Analysis - Chan (1991)   (1 citation)

....to the test instance. The classification of the test instance is then based on this selected example(s). Each exemplar in memory is associated with a weight that reflects its effectiveness in correct prediction and is incorporated into the distance metric. This weight is adjusted during training (Cost and Salzberg, 1990b). For example, using the cup and non cup descriptions in the previous section, we can transform Table 1 to Table 2. The binary substitutions are: no = 0 and yes = 1. COLOR and MATERIAL, which are not binary descriptors, are omitted from Table 2 for simplicity. Section 4.1.2 describes a method ....

....King's and Seshu et al.'s approaches besides the representation of rules. King uses a greedy approach to generate rules to cover the examples while Seshu et al. keep generating a new set of rules based on an enhanced set of features until accuracy cannot be improved. Exemplar based Learning. Cost and Salzberg (1990a) used an exemplar based learning approach. Based on the examples, they use the value difference metric (Stanfill and Waltz, 1986) to generate distance tables for each symbolic feature. This metric provides a numeric distance measure between two values of a symbolic feature and is defined as: ....

[Article contains additional citation context not shown here]

Cost, S. and Salzberg, S. (1990b). A weighted nearest neighbour algorithm for learning with symbolic features. Technical Report JHU-90/11, Department of Computer Science, Johns Hopkins University, Baltimore, MD.


TiMBL: Tilburg Memory-Based Learner - version 1.0 - .. - Daelemans, Zavrel, .. (1998)

.... mental rules abstracted from earlier experiences (as in rule induction and rule based processing). The approach has surfaced in different contexts using a variety of alternative names such as similarity based, example based, exemplar based, analogical, case based, instance based, and lazy learning [22, 5, 19, 2, 1]. Historically, memory based learning algorithms are descendants of the k nearest neighbor (henceforth k nn) algorithm [6, 16, 2]. An mbl system, visualized schematically in Figure 3.1, contains two components: a learning component which is memory based (from ....

....are seen as equally dissimilar. However, if we think of an imaginary task in e.g. the phonetic domain, we might want to use the information that b and p are more similar than b and a. For this purpose a metric was defined by Stanfill & Waltz [22] and further refined by Cost & Salzberg [5]. It is called the (Modified) Value Difference Metric (mvdm; equation 3.7) and it is a method to determine the similarity of the values of a feature by looking at co-occurrence of values with target classes. For the distance between two values V1, V2 of a feature, we compute the difference of ....

S. Cost and S. Salzberg. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10:57--78, 1993.


Towards a Theory of Optimal Similarity Measures - Griffiths, Bridge (1997)   (1 citation)

....other form of similarity measure than the weighted similarity measure σ_w. It would be interesting to instantiate equation (9) with a prior distribution corresponding to the bias toward naturally occurring classification problems assumed in general purpose learners such as IB4 [1] or PEBLS [3], and to compare the learner derived that way with existing instance based learners. 2. However, if weighted, dimensional similarity measures are abandoned in order to achieve better generalisation accuracy and more efficient learning, then VS CBR2 and VS CBR3 demonstrate that some other ....

S Cost and S Salzberg. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10(1):37--66, March 1993.


Resolving PP attachment Ambiguities with Memory-Based.. - Zavrel, Daelemans, Veenstra (1997)   (15 citations)

....hand, e.g. by replacing words with semantic labels, but again we prefer a more empiricist approach in which distances between values of the same feature are computed differentially on the basis of properties of the training set. To this end, we use the Modified Value Difference Metric (MVDM), see Cost and Salzberg (1993); a variant of a metric first defined in Stanfill and Waltz (1986). This metric (Equation 5) computes the frequency distribution of each value of a feature over the categories. Depending on the similarity of their distributions, pairs of values are assigned a distance. δ(V1, V2) = Σ^n ....

Cost, S. and S. Salzberg. 1993. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10:57--78.
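
The excerpt above breaks off inside its Equation 5. As a hedged illustration only, a commonly stated form of MVDM sums, over the n categories, the absolute difference between the conditional category probabilities of the two values. The symbols C_i and the numbers in the worked example are assumptions made for illustration and are not taken from the cited paper.

```latex
% Assumed MVDM form (not copied from the truncated Equation 5):
\delta(V_1, V_2) = \sum_{i=1}^{n} \left| P(C_i \mid V_1) - P(C_i \mid V_2) \right|

% Worked example with n = 2 hypothetical categories:
% P(C_1 | V_1) = 0.8, P(C_2 | V_1) = 0.2
% P(C_1 | V_2) = 0.5, P(C_2 | V_2) = 0.5
\delta(V_1, V_2) = |0.8 - 0.5| + |0.2 - 0.5| = 0.3 + 0.3 = 0.6
```

Under this form, two values with identical category distributions are at distance 0, and with normalized distributions the distance cannot exceed 2.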


On Concept Space and Hypothesis Space in Case-Based Learning.. - Griffiths (1995)   (4 citations)

....representation ⟨CB, σ⟩ depends on the interaction between the available cases and the similarity measure, a case based or instance based learning algorithm may alter its hypothesis by manipulating either of the two components [15, p. 79]. The algorithms IB2 [1], VS CBR [15] and PEBLS [5], for example, each show different ways of adjusting the represented hypothesis via changes to the case base and/or the similarity measure. In the current paper, we restrict our study to the following family of very simple case based learning algorithms. Definition 1. CB1(σ) Learning Algorithm ....

S Cost and S Salzberg. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10(1):37--66, Mar 1993.


Skousen's Analogical Modeling Algorithm: A comparison.. - Daelemans, Gillis.. (1994)   (1 citation)

....by retrieving the most similar memory items according to some distance metric, and extrapolating the category of these items to the new input pattern. Instances of this form of nearest neighbour method include instance based learning (Aha et al. 1991), exemplar based learning (Salzberg, 1991; Cost and Salzberg, 1993), and memory based reasoning (Stanfill and Waltz, 1986). The approach has been applied to a wide range of problems using not only numeric and binary values (for which nearest neighbour methods are traditionally used) but also using symbolic, unordered features. Advantages of the approach ....

....methods are traditionally used) but also using symbolic, unordered features. Advantages of the approach include an often surprisingly high classification accuracy, the capacity to learn polymorphous concepts, high speed of learning, and perspicuity of algorithm and classification (Cost and Salzberg, 1993). Aha et al. (Aha et al. 1991) have shown that the basic instance based learning algorithm can PAC-learn any concept whose boundary is a union of a finite number of closed hypercurves of finite size (a class of concepts similar to that which ID3 and backpropagation can learn). Training speed is ....

[Article contains additional citation context not shown here]

Cost, S. and Salzberg, S. (1993). A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10:57--78.


Protein Folding: Symbolic Refinement Competes with Neural.. - Craw, Hutton (1995)   (2 citations)

....(to be assigned patterns of structures) and evidence nodes (containing the amino acids) are connected. Firstly, belief propagation provides a probabilistic model of the protein structure and secondly, belief is propagated by the Viterbi algorithm, a Markov model approach. Cost and Salzberg's PEBLS (Cost and Salzberg, 1993) uses a nearest neighbour algorithm both to build a case based library from training examples and to retrieve future predictions from the library. Experiments have varied the distance metrics and window sizes; Manhattan distance, weighted library cases, a window size of 19 and post processing to ....

.... refined theories give comparable accuracy to FSkbann 3 for α helices and increased accuracy for β. 2 The Chou Fasman, ANN and FSkbann figures are extracted from (Maclin and Shavlik, 1993). The Belief Propagation and Viterbi figures appear in (Delcher et al. 1993). PEBLS results are from (Cost and Salzberg, 1993). 3 All statements are statistically significant at the 5% level unless otherwise stated. In the absence of detailed results for the other systems we have assumed a similar variance to our testing, since in general they used the Table 5: Accuracy. Columns: Method, All, α, β, γ. Chou Fasman 57.3 31.7 ....

[Article contains additional citation context not shown here]

Cost, S. and Salzberg, S. (1993). A weighted nearest neighbour algorithm for learning with symbolic features.


Instance-Based Learning: Nearest Neighbour with Generalisation - Martin (1995)

....problem, although it reduces its effect by reducing the proportion of cases in which the distance function determines the class. One solution is to construct a value difference metric (Stanfill and Waltz, 1986) which records a computed difference between each value and each other value. PEBLS (Cost and Salzberg, 1994) achieves this by making two passes over the data. During the first pass, it uses Bayesian probability to compute the value distance metric for all possible pairs of attribute values. During the second, it adopts these metrics to aid classification and learning. This approach is very ....

....will therefore favour the new example over all previous exemplars until it has accumulated an appreciable number of positive and negative predictions to weight it more fairly. A better method might be to give each new example an average weighting, say the average of all current exemplar weights. Cost and Salzberg (1994) suggest that the new exemplar be given the weighting of its nearest neighbour of the same class. The new exemplar therefore begins with a weighting from a point nearby in the problem space. This assumes that if a point in space has a particular prototypicality, the area close to it will be ....

Cost, C. and Salzberg, S. (1994) A weighted nearest neighbour algorithm for learning with symbolic features. Unpublished paper.
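
The two-pass scheme described in the Martin (1995) excerpt above can be illustrated with a minimal sketch. It is not the PEBLS implementation: the function names, the exponent of 1 on the probability differences, the unweighted exemplars, and the fallback distance of 2.0 for value pairs never seen in training are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def value_difference_tables(rows, labels):
    """First pass: for each feature, compute a distance for every pair of
    observed values from their class-conditional frequencies."""
    classes = sorted(set(labels))
    tables = []
    for f in range(len(rows[0])):
        counts = defaultdict(Counter)  # value -> class -> count
        for row, label in zip(rows, labels):
            counts[row[f]][label] += 1
        table = {}
        for v1, c1 in counts.items():
            n1 = sum(c1.values())
            for v2, c2 in counts.items():
                n2 = sum(c2.values())
                # Sum over classes of |P(class | v1) - P(class | v2)|
                table[(v1, v2)] = sum(
                    abs(c1[c] / n1 - c2[c] / n2) for c in classes
                )
        tables.append(table)
    return tables


def pair_distance(table, a, b):
    """Look up a value pair; unseen pairs are treated as identical if equal,
    otherwise as maximally different (assumption of this sketch)."""
    if (a, b) in table:
        return table[(a, b)]
    return 0.0 if a == b else 2.0


def classify(query, rows, labels, tables):
    """Second pass: nearest neighbour under the summed per-feature distances."""
    def dist(row):
        return sum(
            pair_distance(tables[f], query[f], row[f]) for f in range(len(row))
        )
    best = min(range(len(rows)), key=lambda i: dist(rows[i]))
    return labels[best]


# Toy data (hypothetical): feature 0 predicts the class, feature 1 does not.
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = ["pos", "pos", "neg", "neg"]
tables = value_difference_tables(rows, labels)
```

In this toy data set, values of the first feature that always co-occur with different classes end up far apart, while values of the second feature, which carry no class information, end up at distance zero.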


K*: An Instance-based Learner Using an Entropic Distance Measure - Cleary, Trigg (1995)   (5 citations)  (Correct)

....a further extension to improve tolerance to noisy data; instances that have a sufficiently bad classification history are forgotten, only instances that have a good classification history are used for classification. Aha (1992) described IB4 and IB5, which handle irrelevant and novel attributes. Cost & Salzberg (1993) use a modification of Stanfill & Waltz's (1986) value difference metric in conjunction with an instance weighting scheme in their system PEBLS. This scheme was designed for classifying objects where feature values are symbolic. Numeric distances between symbolic values are calculated based on the ....

Cost, S. & Salzberg, S. (1993) "A Weighted Nearest Neighbour Algorithm for Learning with Symbolic Features." Machine Learning 10, pp. 57-78.


A Weighted Polynomial Information Gain Kernel for.. - Phrase Attachment.. (2003)   (Correct)

No context found.

Scott Cost and Steven Salzberg. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10:57 -- 78, 1993.


Inductive Generalisation in Case-Based Reasoning Systems - Griffiths (1996)   (1 citation)  (Correct)

No context found.

S Cost and S Salzberg. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning, 10(1):37--66, March 1993.


An Experimental Comparison of Genetic and Classical Concept .. - Kokai, Toth, Zvada (2002)   (Correct)

No context found.

Cost, S., Salzberg, S.: A weighted Nearest Neighbour Algorithm for Learning with Symbolic Features, Machine Learning, 10, pp. 57-78


Metrics for Classifying Heterogeneous Objects - Bezem, Blok, Keijzer (1998)   (Correct)

No context found.

S. Cost and S. Salzberg. A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning 10(1): 57--78, 1993.


MBT: A Memory-Based Part of Speech Tagger-Generator - Daelemans, Zavrel, Berck.. (1996)   (48 citations)  (Correct)

No context found.

Cost, S. and Salzberg, S. (1993). `A weighted nearest neighbour algorithm for learning with symbolic features.' Machine Learning, 10, 57--78.


Resolving PP attachment Ambiguities with Memory-Based.. - Zavrel, Daelemans, Veenstra (1997)   (15 citations)  (Correct)

No context found.

Cost, S. and S. Salzberg (1993). A weighted nearest neighbour algorithm for learning with symbolic features. Machine Learning 10, 57--78.

CiteSeer.IST - Copyright Penn State and NEC