111 citations found.
Kira, K. and Rendell, L. (1992a). The feature selection problem: traditional methods and new algorithm. In Proc. AAAI'92, San Jose, CA.

This paper is cited in the following contexts:

Documents 51 to 100

Machine Learning in Prognosis of the Femoral Neck.. - Kukar, Kononenko.. (1996)   (4 citations)

....labeled with the calculated class probability distribution and are used for classification in the same manner as ordinary leaves. The main difference between Assistant and its reimplementation Assistant R is that RELIEFF is used for attribute selection. RELIEFF [16] is an extension of RELIEF [10, 11]. The key idea of RELIEF is to estimate attributes according to how well their values distinguish among the instances that are near to each other. For that purpose, given an instance, RELIEF searches for its two nearest neighbors: one from the same class (called nearest hit) and the other from ....

K. Kira and L. Rendell. The feature selection problem: traditional methods and new algorithm. In Proc. AAAI'92, San Jose, CA, 1992.


Feature Selection via Discretization - Liu, Setiono (1997)   (3 citations)

....can in general improve their predictive accuracy, shorten the learning period, and form simpler concepts. There are abundant feature selection algorithms. Some use methods like principal component to compose a smaller number of new features [11,12]; some select a subset of the original attributes [1,5]. This paper considers the latter since it not only has the above virtues, but also serves as an indicator on what kind of data (along those selected features) should be collected. In the latter category of feature selection, the algorithms can be further divided in terms of data types. The two ....

....data types. The two basic types of data are nominal (e.g. attribute color may have values of red, green, yellow) and ordinal (e.g. attribute winning position can have values of 1, 2, and 3, or attribute salary can have 22345.00, 46543.89, etc. as its values). Many feature selection algorithms [1,3,5,10] are shown to work effectively on discrete data or even more strictly, on binary data (and/or binary class value). In order to deal with numeric attributes, a common practice for those algorithms is to discretize the data before conducting feature selection. This paper provides a way to select ....

[Article contains additional citation context not shown here]

K. Kira and L.A. Rendell. The feature selection problem: Traditional methods and a new algorithm. In AAAI-92, Proceedings Ninth National Conference on Artificial Intelligence, pages 129--134. AAAI Press/The MIT Press, 1992.


Hybrid Search of Feature Subsets - Dash, Liu (1998)   (2 citations)

....does make feature selection simpler and the class separability of the original data can be retained. This aspect of feature selection is related to the study of search strategies. Extensive research effort has been devoted to this study [19, 18, 10]. Examples are Branch & Bound [15, 18], Relief [6, 9], Wrapper methods [7], Approximate Markov Blanket [8], and LVF [13]. The search process starts with either an empty set or a full set. For the former, it expands the search space by adding one feature at a time (Stepwise Forward Selection); an example is Focus [1]. For the latter, it expands the ....

K. Kira and L.A. Rendell. The feature selection problem: Traditional methods and a new algorithm. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 129--134. Menlo Park: AAAI Press/The MIT Press, 1992.
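The stepwise forward selection mentioned in the context above (start from an empty set, add one feature at a time) can be illustrated with a minimal sketch. This is not the procedure from the cited paper: the function name `forward_selection`, the user-supplied `score` callback, and the stopping rule (stop when no single addition improves the score) are all assumptions made for illustration.

```python
def forward_selection(n_features, score):
    """Greedy forward search over feature indices 0..n_features-1.

    `score` is a hypothetical callback that maps a list of feature
    indices to a number (higher is better).
    """
    selected = []
    best = score(selected)
    remaining = set(range(n_features))
    while remaining:
        # Try adding each unused feature and keep the best single addition.
        candidate, candidate_score = max(
            ((f, score(selected + [f])) for f in sorted(remaining)),
            key=lambda pair: pair[1],
        )
        if candidate_score <= best:
            break  # assumed stopping rule: no addition improves the score
        selected.append(candidate)
        remaining.remove(candidate)
        best = candidate_score
    return selected


# Toy score: reward features 1 and 3, penalize subset size.
toy_score = lambda subset: len(set(subset) & {1, 3}) - 0.1 * len(subset)
chosen = forward_selection(5, toy_score)
```

Any evaluation measure (a filter criterion or a classifier's accuracy) could be plugged in as `score`; the greedy loop itself does not depend on which one is used.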


GD: A Measure based on Information Theory for.. - Lorenzo.. (1998)

....approaches have been proposed to select the more relevant attributes that define a class. Some works on attribute selection were the WINNOW algorithm proposed by Littlestone [15], the FOCUS algorithm proposed by Almuallim and Dietterich [3], and the Relief algorithm proposed by Kira and Rendell [11]. All these algorithms share as a common characteristic that they do not include the performance of the classifier as a measure to guide the selection of the attributes. John et al. [10] propose the wrapper approach that utilizes the performance of the classifier to carry out the selection of the ....

Kenji Kira and Larry A. Rendell. The feature selection problem: Traditional methods and a new algorithm. In Proc. of the 10th National Conf. on Artificial Intelligence, pages 129-134, 1992.


Some Issues on Scalable Feature Selection - Huan Liu Rudy (1998)   (1 citation)

....interested in improving the performance of their algorithms and in cleaning their data. In handling large databases, feature selection is even more important since many learning algorithms may falter or take too long to run before the data is reduced. Most feature selection methods (refer to (Kira & Rendell, 1992; John et al., 1994; Blum & Langley, 1997)) can be grouped into two categories: exhaustive or heuristic search of an optimal set of M attributes. For example, Almuallim and Dietterich's FOCUS algorithm (Almuallim & Dietterich, 1994) starts with an empty feature set and carries out exhaustive ....

....to construct a hypothesis consistent with a given set of examples. It works on binary, noise-free data. Its time complexity is $O(\min(N^M, 2^N))$. They proposed three heuristic algorithms to speed up the search. There are many heuristic feature selection algorithms. The Relief algorithm (Kira & Rendell, 1992) assigns a relevance weight to each feature, which is meant to denote the relevance of the feature to the target concept. Relief samples instances randomly from the training set and updates the relevance values based on the difference between the selected instance and the two nearest instances of ....

Kira, K., & Rendell, L. (1992). The feature selection problem: Traditional methods and a new algorithm. In Proceedings of the Tenth National Conference on Artificial Intelligence (pp. 129--134). Menlo Park: AAAI Press/The MIT Press.
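The Relief weight update described in the context above (sample an instance, find its nearest hit and nearest miss, adjust each feature's weight by the differences) can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it assumes numeric features already scaled to [0, 1], at least two instances per class, and a squared per-feature difference; the function name `relief_weights` and its parameters are hypothetical.

```python
import random


def relief_weights(X, y, n_samples, seed=0):
    """Return one relevance weight per feature (higher = more relevant)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    weights = [0.0] * n_features

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    for _ in range(n_samples):
        i = rng.randrange(len(X))
        # Nearest hit: closest *other* instance with the same class.
        hit = min(
            (j for j in range(len(X)) if j != i and y[j] == y[i]),
            key=lambda j: sq_dist(X[i], X[j]),
        )
        # Nearest miss: closest instance with a different class.
        miss = min(
            (j for j in range(len(X)) if y[j] != y[i]),
            key=lambda j: sq_dist(X[i], X[j]),
        )
        for f in range(n_features):
            # Reward features that separate the classes, penalize
            # features that differ between same-class neighbours.
            weights[f] += (
                (X[i][f] - X[miss][f]) ** 2 - (X[i][f] - X[hit][f]) ** 2
            ) / n_samples
    return weights


# Toy data: feature 0 determines the class, feature 1 is noise.
X = [[0.0, 0.1], [0.0, 0.9], [1.0, 0.2], [1.0, 0.8]]
y = [0, 0, 1, 1]
w = relief_weights(X, y, n_samples=20)
```

On this toy data the first weight comes out near 1 and the second is negative, which matches the intuition that a relevant feature differs across classes but not within a class.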


Feature Subset Selection by Bayesian networks based.. - Inza.. (1999)   (5 citations)

....is known as filter approach. Mainly inspired on these statistical measures, in the 90's, more complex filter measures which do not use the final induction algorithm in the evaluation function generate new FSS algorithms, such as FOCUS (Almuallim and Dietterich [4]), RELIEF (Kira and Rendell [40]), Cardie's algorithm [16], Koller and Sahami's work with probabilistic concepts [48], or the named Incremental Feature Selection (Liu and Setiono [57]). Nowadays, the filter approach is receiving considerable attention from the Data Mining community to deal with huge databases when the wrapper ....

K. Kira, L.A. Rendell, The feature selection problem: Traditional methods and a new algorithm, in: Proceedings of the Tenth National Conference on Artificial Intelligence, San Jose, CA, 1992, pp. 129-134.


Efficient and Scalable Pareto Optimization by.. - Menczer, Degeratu.. (2000)   (2 citations)

....to determine, for a given problem, the relative importance of these two competing objectives. The combinatorial feature selection problem has been studied extensively in the machine learning literature, with approximate solutions being found using both filter and wrapper models (John et al. 1994; Kira and Rendell, 1992), and exact solutions found using integer programming (Narendra and Fukunaga, 1977). Further, iterative selection of feature subsets is fundamental to traditional predictive techniques such as regression and decision tree construction. The problem of optimally (in a least norm error sense) ....

Kira, K. and Rendell, L. (1992). The feature selection problem: Traditional methods and a new algorithm. In Proc. 10th National Conference on Artificial Intelligence, pages 129--134, San Mateo, CA. Morgan Kaufmann.


Implicit Feature Selection with the Value Difference Metric - Payne, Edwards

....in the test instance is irrelevant) then the resulting attribute distance will also be small and have little impact on the choice of nearest neighbour. Many Nearest Neighbour learning algorithms employ weights to modify the effect a specific component has in the resulting classification process [1, 8, 15, 18]. For example, PEBLS [5] and EACH [15] assign a weight to each of the instances (or hyper rectangles in the case of EACH) and modify this weight according to whether the instances result in correct or incorrect class predictions. The weight is used to measure the reliability of an instance, and ....

....for incorrect ones. Thus, the contribution of irrelevant attributes to the classification task falls as the contribution of other attributes rises. The resulting weights can be used to determine which attributes should be retained in the attribute subset, and which attributes should be discarded [8]. An alternative approach is to use the weights to control the influence that each attribute has on the distance between two instances. Those attributes which are awarded low weights will have a diminished effect on the resulting class predictions. 4 EVALUATION OF THE VDM FOR IMPLICIT FEATURE ....

K. Kira and L.A. Rendell, `The Feature Selection Problem: Traditional Methods and a New Algorithm', in Proceedings of the 10th National Conference on Artificial Intelligence (AAAI-92), pp. 129--134. MIT Press, (1992).


Feature Selection for Classification - Dash, Liu (1997)   (36 citations)

....it from various angles. But as expected, many of those are similar in intuition and/or content. The following lists those that are conceptually different and cover a range of definitions. 1. Idealized: find the minimally sized feature subset that is necessary and sufficient to the target concept [22]. 2. Classical: select a subset of M features from a set of N features, M < N, such that the value of a criterion function is optimized over all subsets of size M [34]. 3. Improving Prediction accuracy: the aim of feature selection is to choose a subset of features for improving prediction ....

....present, and future categories. Their main focus was the branch and bound methods [34] and its variants [16]. No experimental study was conducted in this paper. Their survey was published in the year 1987, and since then many new and efficient methods have been introduced (e.g. Focus [2], Relief [22], LVF [28]). Doak followed a similar approach to Siedlecki and Sklansky's survey and grouped the different search algorithms and evaluation functions used in feature selection methods independently, and ran experiments using some combinations of evaluation functions and search procedures. In this ....

[Article contains additional citation context not shown here]

Kira, K. and Rendell, L.A., The feature selection problem: Traditional methods and a new algorithm. In: Proceedings of Ninth National Conference on Artificial Intelligence, 129--134, 1992.


Feature Selection vs Theory Reformulation: a Study of Genetic.. - Burns, Danyluk

....performs an exhaustive search of all feature subsets and finds the smallest subset of features which has the property that if any two data points in the feature subset agree, their classification agrees. Focus, however, is computationally expensive and unable to handle noisy data. The Relief (Kira & Rendell, 1992) algorithm selects the set of relevant features and filters data for C4.5. Relief computes relevance by examining an arbitrary number of data points chosen at random and producing a weight which represents their summed square difference from the nearest positively and nearest negatively ....

Kira, K. and Rendell, L. (1992). The feature selection problem: Traditional methods and a new algorithm. The Proceedings of the Tenth National Conference on Artificial Intelligence.
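The consistency criterion attributed to Focus in the context above (find the smallest feature subset such that any two data points agreeing on it also agree in class) can be sketched as a brute-force search. This is an illustrative reading of that one-sentence description rather than the published algorithm; the names `is_consistent` and `smallest_consistent_subset` are hypothetical, and the search is exponential in the number of features.

```python
from itertools import combinations


def is_consistent(X, y, subset):
    """True if no two instances agree on `subset` but differ in class."""
    seen = {}
    for row, label in zip(X, y):
        key = tuple(row[f] for f in subset)
        # setdefault stores the first label seen for this key.
        if seen.setdefault(key, label) != label:
            return False
    return True


def smallest_consistent_subset(X, y):
    """Exhaustively try subsets in order of increasing size."""
    n_features = len(X[0])
    for size in range(n_features + 1):
        for subset in combinations(range(n_features), size):
            if is_consistent(X, y, subset):
                return subset
    return None  # contradictory (noisy) data: no subset is consistent


# Toy data: class is the XOR of features 0 and 1; feature 2 copies feature 0.
X = [[0, 0, 0], [0, 1, 0], [1, 0, 1], [1, 1, 1]]
y = [0, 1, 1, 0]
subset = smallest_consistent_subset(X, y)
```

The `None` branch shows why such a search struggles with noise: two identical instances with different labels make every subset inconsistent.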


Evolving Heterogeneous Neural Agents by Local Selection - Menczer, Street, Degeratu (2000)   (1 citation)

....out-of-sample error. John [26] makes the distinction between wrapper models, which choose feature subsets in the context of the classification algorithm, and filter models, such as Relief [27], which choose a subset before applying the classification method. Bradley et al. [4] build feature minimization directly into the classification objective, and solve using parametric optimization. The method described here uses the evolutionary algorithm to search through the space of possible ....

K. Kira and L. Rendell. The feature selection problem: Traditional methods and a new algorithm. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 129--134, San Mateo, CA, 1992. Morgan Kaufmann.


Modeling Languages and Condor: Metacomputing for Optimization - Ferris, Munson (1998)

....selection problem chooses a small number of the data characteristics with the best predictive capability. This problem is applicable in numerous situations and is becoming increasingly important, especially in data mining. Many approaches to solving the problem have been postulated and used [5, 6, 7, 20, 21, 22, 23, 29]. The method presented in this paper generates a large number of independent mixed integer programs (MIP). To make this technique practical, we need to perform the individual optimizations in parallel. Rather than require a large parallel computer, we utilize a metacomputer, a confederation of ....

K. Kira and L. Rendell. The feature selection problem: Traditional methods and a new algorithm. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 129--134, San Mateo, CA, 1992. Morgan Kaufmann.


INDIGENT: Genetically Refining Expert Neural Networks - Burns (1998)

....performs an exhaustive search of all feature subsets and finds the smallest subset of features which has the property that if any two data points in the feature subset agree, their classification agrees. Focus, however, is computationally expensive and unable to handle noisy data. The Relief [KR92] algorithm selects the set of relevant features and filters data for C4.5. Relief computes relevance by examining an arbitrary number of data points chosen at random and producing a weight which represents their summed square difference from the nearest positively and nearest negatively ....

K. Kira and L. Rendell. The feature selection problem: Traditional methods and a new algorithm. In The Proceedings of the Tenth National Conference on Artificial Intelligence, 1992.


Decomposition of Heterogeneous Classification Problems - Apte, Hong, Hosking.. (1998)   (6 citations)

....class, disregarding other features. To overcome this problem, new merit measures for features have recently been developed. They take into account the presence of other features that may interact in imparting information about the class. They include the RELIEF measure developed by Kira and Rendell [4] and its follow-on RELIEFF developed by Kononenko et al. [3] and the contextual merit (CM) developed by Hong [5]. These new measures require more computation than the two myopic varieties, but are more robust in general. We now describe CM in more detail. CM assigns merit to a feature taking ....

Kira, K., and Rendell, L., The feature selection problem: traditional methods and a new algorithm, in Proceedings of AAAI--92, 129--134, 1992.


Use of Contextual Information for Feature Ranking and Discretization - Hong (1997)   (12 citations)

....The techniques we describe here have been implemented in the RAMP (Rule Abstraction for Modeling and Prediction) system [7]. We summarize some benchmarking experiments and real problem experience using the RAMP system in a later section. The closest approaches to what is presented here are RELIEF [8] and its more recent follow-on RELIEFF [9]. We will briefly contrast our approach to these in the concluding section. 2 Common characteristics of traditional decision tree generation Determining the relative importance of features is at the heart of all practical decision tree generation ....

....refining technique will greatly improve the run time of NFD. 8) We have used the contextual merits of features only in a relative sense. The absolute magnitude of the merits should be understood better and made use of. We mentioned earlier that the RELIEF algorithm proposed by Kira and Rendell [8] was found to be the closest approach to ours. Briefly, this is how RELIEF works. For a randomly chosen example, find one nearest example in the same class and one in the counter class. Distance function is similar to (1 3) except without a ramp threshold) Their Relevance variable, ....

K. Kira and L. Rendell. The Feature Selection Problem: Traditional Methods and a New Algorithm. In Proceedings of AAAI--92, pages 129--134, 1992.


Data Mining in Medicine: Selected Techniques and Applications - Lavrac (1998)

.... and GCS (evaluation of coma according to the Glasgow Coma Scale). Recent implementations of the ASSISTANT algorithm include ASSISTANT R and ASSISTANT R2 [27]. Instead of the standardly used informativity search heuristic, ASSISTANT R employs ReliefF as a heuristic for attribute selection [24, 21]. This heuristic is an extension of RELIEF [20, 21], which is a non-myopic heuristic measure that is able to estimate the quality of attributes even if there are strong conditional dependencies between attributes. In addition, wherever appropriate, instead of the relative frequency, ASSISTANT R ....

.... the Glasgow Coma Scale). Recent implementations of the ASSISTANT algorithm include ASSISTANT R and ASSISTANT R2 [27]. Instead of the standardly used informativity search heuristic, ASSISTANT R employs ReliefF as a heuristic for attribute selection [24, 21]. This heuristic is an extension of RELIEF [20, 21], which is a non-myopic heuristic measure that is able to estimate the quality of attributes even if there are strong conditional dependencies between attributes. In addition, wherever appropriate, instead of the relative frequency, ASSISTANT R uses the m-estimate of probabilities, which typically ....

Kira, K., Rendell, L. (1992) The feature selection problem: traditional methods and new algorithm. Proc. AAAI'92, San Jose, CA, July 1992.


Towards an Evolutionary Algorithm: A Comparison of Two Feature.. - Chen, Liu

....that every feature has a 0.5 probability to be on/off even when we have some knowledge about features after generating many subsets of features. The crux of the matter is how we modify this probability for each feature without significantly increasing the complexity of LVF. The work of Relief [14] and the work on the SpinGlass problem [4] have inspired us to approach the degradation problem with LVF via another perspective. Relief samples the data to assign/update a weight to each feature relying on concepts of near miss and near hit which are two nearest neighbors of I (one from each ....

K. Kira and L.A. Rendell. The feature selection problem: Traditional methods and a new algorithm. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 129--134. Menlo Park: AAAI Press/The MIT Press, 1992.


Induction of decision trees using RELIEFF - Kononenko, Simec (1995)   (5 citations)

....weight of evidence (Michie & Al Attar, 1992) and j measure (Smyth & Goodman, 1990). However, all these measures assume that attributes are independent and therefore in domains with strong dependencies between attributes the greedy search has poor chances of revealing a good hypothesis. Recently, Kira and Rendell (1992a,b) developed an algorithm called RELIEF, which was shown to be very efficient in estimating the quality of attributes. For example, in the parity problems of various degrees with a significant number of irrelevant (random) additional attributes RELIEF is able to correctly estimate the relevance ....

.... information gain instead of RELIEFF and to the results of the naive Bayesian classifier that uses the m-estimate of probabilities (Cestnik, 1990). In addition we discuss also two other approaches:
- the use of RELIEFF as a preprocessor for eliminating irrelevant attributes, as suggested by Kira and Rendell (1992a,b);
- dealing with multiple class problems by generating a series of decision trees, each for one class; this is one possible approach to the extension of RELIEF to deal with decision problems with more than two classes, as suggested by Kira and Rendell (1992a,b).
In the next section we ....

[Article contains additional citation context not shown here]

Kira K. & Rendell L. (1992b) The feature selection problem: traditional methods and new algorithm. Proc. AAAI'92, San Jose, CA, July 1992.


An Efficient Two-Step Method for Classification of Spatial.. - Koperski, Han, Stefanovic   (8 citations)

....of objects. For example, one can use Minimum Bounding Rectangles (MBRs) to find coarse g_close_to predicates which imply that MBRs of two objects are within specific distance threshold. Then, some machine learning methods may be used for the extraction of the relevant predicates or functions [12, 23]. In our experiments we used RELIEF algorithm [12] whose algorithmic description is presented in Table 4. The RELIEF algorithm uses nearest neighbor approach to find relevant predicates. For every object s in the sample two nearest neighbors are found where one neighbor belongs to the same class ....

....Rectangles (MBRs) to find coarse g_close_to predicates which imply that MBRs of two objects are within specific distance threshold. Then, some machine learning methods may be used for the extraction of the relevant predicates or functions [12, 23]. In our experiments we used RELIEF algorithm [12] whose algorithmic description is presented in Table 4. The RELIEF algorithm uses nearest neighbor approach to find relevant predicates. For every object s in the sample two nearest neighbors are found where one neighbor belongs to the same class as object s (nearest hit) and the other neighbor ....

[Article contains additional citation context not shown here]

K. Kira, and L. A. Rendell. The Feature Selection Problem: Traditional Methods and a New Algorithm. Proc. of the Tenth National Conference on Artificial Intelligence AAAI-92 pp. 129--134, Cambridge, MA, MIT Press.


Learning Bayesian Networks Using Feature Selection - Provan, Singh (1995)   (17 citations)

....model filters out less relevant features using an algorithm different from the induction algorithm used for the learning, and a wrapper model uses induction algorithm itself for feature selection. Three filter model approaches that have been taken are: the FOCUS algorithm [2], the Relief algorithm [14, 15] (which Kononenko has extended in [16]), and an extended nearest neighbor algorithm [5]. Wrapper based approaches have been studied in [13, 6, 18], among others. A growing consensus in this research is that the success of feature selection is strongly correlated to the data itself, as well as ....

K. Kira and L. Rendell. The Feature Selection Problem: Traditional Methods and a New Algorithm. In Proc. AAAI, pages 129--134, Minneapolis, MN, 1992. AAAI Press.


On Growing Better Decision Trees from Data - Murthy (1997)   (17 citations)

....feature subset selection methods in the machine learning community, resulting in several empirical evaluations. Some of these studies produced interesting insights on how to increase the efficiency and effectiveness of the heuristic search for good feature subsets. For examples of this work, see [251, 276, 69, 113, 336, 3]. Composite features Sometimes the aim is not to choose a good subset of features, but instead to find a few good composite features, which are arithmetic or logical combinations of the atomic features. In the decision tree literature, Henrichon and Fu [206] were probably the first to discuss ....

Kenji Kira and Larry A. Rendell. The feature selection problem: Traditional methods and a new algorithm. In AAAI-92 [7], pages 129--134.


Feature Subset Selection as Search with Probabilistic Estimates - Kohavi (1994)   (17 citations)

....of the relevant features that optimizes some performance function, usually prediction accuracy. The pattern recognition literature (Devijver & Kittler 1982), statistics literature (Miller 1990; Neter, Wasserman, & Kutner 1990), and recent machine learning papers (Almuallim & Dietterich 1991; Kira & Rendell 1992; Kononenko 1994) consist of many such measures that are all based on the data alone. Most measures in the pattern recognition and statistics literature are monotonic, i.e. for a sequence of nested feature subsets $F_1 \supseteq F_2 \supseteq \cdots \supseteq F_k$, the measure $f$ obeys $f(F_1) \geq f(F_2)$ ....

Kira, K., and Rendell, L. A. 1992. The feature selection problem: Traditional methods and a new algorithm.


Intelligent Data Analysis in Medicine and Pharmacology - Lavrac, Keravnou, (eds.) (1997)   (4 citations)

No context found.

Kira, K. and Rendell, L. (1992a). The feature selection problem: traditional methods and new algorithm. In Proc. AAAI'92, San Jose, CA.


Self-Monitoring in VINO - Saleeb (1999)

No context found.

Kira, K. and Rendell, L. The feature selection problem: Traditional methods and a new algorithm. In Proceedings of the Tenth National Conference on Artificial Intelligence, pp. 129--134. MIT Press, Cambridge, MA, 1992.
