| Setiono, R., and Liu, H. A probabilistic approach to feature selection-a filter solution. In Proceedings of International Conference on Machine Learning, 319-327 (1996). |
....[Table III: Performance statistics of feature selection by the ANNIGMA wrapper] ....fitness function is designed to combine the classification accuracy and the cost of using a set of features. Dash and Liu [12] proposed a hybrid algorithm of probabilistic search [13] and complete search to take advantage of both algorithms. It begins with the Las Vegas filter (LVF) [4], a probabilistic feature selection algorithm, to reduce the number of features and then runs Automatic Branch and Bound (ABB), a complete search algorithm. B. Comparisons The algorithms that are ....
....[4], a probabilistic feature selection algorithm, to reduce the number of features and then runs Automatic Branch and Bound (ABB), a complete search algorithm. B. Comparisons The algorithms surveyed for the comparison include the original wrapper [1], Las Vegas filter (LVF) [13], Las Vegas wrapper (LVW) [4], neural net feature selector (NNFS) [9], hybrid approach (hybrid) [12], information-theoretic filter (INFO) [14], and AHOC genetic algorithm (AHOC) [8]. The last column of Table III shows these performance data. The comparison is made on the basis of both the number ....
H. Liu and R. Setiono, "A probabilistic approach to feature selection - a filter solution," in Proceedings of the 13th International Conference on Machine Learning (ICML'96), Bari, Italy, 1996, pp. 319--327.
....and Freeman [BBF00]. As its first order of business, EDA eliminates inappropriate attributes and reduces the cardinality of the retained categorical attributes. Next it provides attribute selection. Different attribute selection methods exist. Inconsistency rates are utilized by Liu & Setiono [LS96]. The concept of a Markov blanket is used in Koller & Sahami [KS96]. While there are other methods (for example, see Jebara & Jaakkola [JJ00]), most are used primarily for predictive and not descriptive mining and thus do not address general purpose attribute selection for clustering. Attributes ....
Liu, H. and Setiono, R. A probabilistic approach to feature selection - a filter solution. Machine Learning: ICML'96, 319-327, Bari, Italy, July 1996.
....it comes from. Lanzi [51] develops a binary coded GA in which each binary digit stands for the presence (1) or the absence (0) of a given feature. The algorithm uses standard genetic operators, crossover and mutation without modification, and a fitness function based on the inconsistency rate [54] introduced by the feature elimination. Other authors ([50] and [55]) have proposed some approximate measures to evaluate feature subsets without applying an inductive learning algorithm that can be used in the fitness function in a filter GA for feature selection. 3.2 Genetic Feature ....
Liu, H. and Setiono, R., (1996) "A probabilistic approach to feature selection: a filter solution," Proceedings of the 13th International Conference on Machine Learning (ICML'96).
....continuous features into discrete ranges during the construction of a decision tree. Many of the feature selection algorithms described in the next chapter require continuous features to be discretized, or give superior results if discretization is performed at the outset [AD91, HNM95, KS96b, LS96]. Discretization is used as a preprocessing step for the correlation based approach to feature selection presented in this thesis, which requires all features to be of the same type. This section describes some discretization approaches from the machine learning literature. 1 CART [BFOS84] M5 ....
....feature in the subset. The third algorithm is like the second except that each positive negative example pair contributes a weighted increment to the score of each feature that discriminates it. The increment depends on the total number of features that discriminate the pair. Liu and Setiono [LS96] describe an algorithm similar to FOCUS called LVF. Like FOCUS, LVF is consistency driven and, unlike FOCUS, can handle noisy domains if the approximate noise level is known a priori. LVF generates a random subset S from the feature subset space during each round of execution. If S contains fewer ....
H. Liu and R. Setiono. A probabilistic approach to feature selection: A filter solution. In Machine Learning: Proceedings of the Thirteenth International Conference on Machine Learning. Morgan Kaufmann, 1996.
....all datasets, while BSE is effective for datasets with a large number of original attributes. • The ANNIGMA wrapper approach is computationally feasible. Recently, many clever feature selection techniques for neural nets were proposed, ranging from filter based approaches to genetic algorithms [8, 9, 10, 11, 12, 13]. Comparing our experimental results with the performance data reported in their papers, we found surprisingly that our simple approach outperforms their sophisticated approaches in almost all test datasets. This suggests that feature selection for neural nets might not be as difficult as ....
....the UCI Machine Learning Repository [7]. These datasets were chosen to include datasets with various characteristics and to maximize the comparability of the ANNIGMA wrapper approach with other published approaches to feature subset selection, especially those concerning neural network induction [6, 8, 10, 11]. 3.1 Dataset Description and Preparation Table 1 gives the summary of our experimental datasets. The second column lists the number of samples for training and the third column lists the data sizes of the hold out sets for validation. The ratio of training and hold out set is two to one ....
H. Liu and R. Setiono, "A probabilistic approach to feature selection - a filter solution," in Proceedings of the 13th International Conference on Machine Learning (ICML'96), (Bari, Italy), pp. 319--327, 1996.
....exponentially according to the size of the features used. Therefore such systems are mainly concerned with the feature selection problem. Recent studies have shown many directions for this topic including, the redefinition of relevance in ML context [JKP94] and the combination of existing methods [LS96]. However the theoretical complexity of LEGAL F remains exponential, and ongoing research is dealing with this problem in order to demonstrate the potential of Galois Lattice in Concept Learning. Experimentations will be conducted on other datasets in the UCI repository database [MM96] ....
H. Liu and R. Setiono. A Probabilistic Approach to Feature Selection - A Filter Solution. In Proceedings of the Thirteenth International Conference (ICML' 96), Bari, Italy, July 3-6 1996.
....Testing Alg.
Aha & Bankert (1994) (BEAM) | Beam variants of forward and backward selection | Calinski-Harabasz separability index | IB1
Almuallim & Dietterich (1991) (FOCUS) | Breadth first | Consistency | ID3
Cardie (1993) | C4.5 decision tree | kNN CBL
Kubat et al. (1993) | ID3 decision tree | Naive Bayes
Liu & Setiono (1996) (LVF) | Las Vegas (i.e. Monte Carlo random sampling) | Consistency | ID3
Singh & Provan (1996) (Info AS) | Forward selection | Maximise 1 of 3 information metrics | Bayesian Network
Table 1: Comparison of different attribute selection studies (filter model). In the wrapper model (Figure 9) the attribute ....
Liu, H. and Setiono, R. (1996). A Probabilistic Approach to Feature Selection - A Filter Solution.
....this exhaustive approach is out of the question. Therefore, various feature selection methods have been designed to avoid exhaustive search while still aiming at the optimal subset. Examples are Branch and Bound [15, 17], Relief [7, 10], Wrapper methods [8], Approximate Markov Blanket [9], and LVF [12]. We will review some of these methods briefly in the next section. The feature selection problem can be viewed as a search problem [18, 17, 11]. The search process starts with either an empty set or a full set. For the former, it expands the search space by adding one feature at a time ....
....by instance-based learning algorithms [1]. Relief assigns a weight to each feature that reflects its ability to distinguish among the classes, and then selects those features with weights that exceed a user specified threshold. Another algorithm which does not explicitly search exhaustively is LVF [12], which randomly searches the feature space. For each candidate subset, it calculates an inconsistency count based on the intuition that the class label associated with the maximum number of patterns is most probably the correct class, considering only the features in the subset. The two methods in ....
H. Liu and R. Setiono. A probabilistic approach to feature selection - a filter solution. In L. Saitta, editor, Machine Learning: Proceedings of the 13th International Conference. Morgan Kaufmann Publishers, 1996.
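The inconsistency count described in the context above (patterns that match on the selected features but do not carry the majority class label of their group) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `inconsistency_rate`, the row/label layout, and the division by the total number of patterns are illustrative choices.

```python
from collections import Counter, defaultdict


def inconsistency_rate(rows, labels, subset):
    """Return the share of patterns that disagree with their group's majority class.

    rows   : list of tuples of discrete feature values (assumed layout)
    labels : list of class labels, one per row
    subset : indices of the features under evaluation
    """
    # Group patterns that are identical on the selected features.
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        groups[key][label] += 1

    # Within each group, every pattern outside the majority class is inconsistent.
    inconsistent = sum(
        sum(counts.values()) - max(counts.values()) for counts in groups.values()
    )
    return inconsistent / len(rows)


# XOR-style data: both features are needed to determine the class.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]
print(inconsistency_rate(rows, labels, [0, 1]))  # both features kept
print(inconsistency_rate(rows, labels, [0]))     # second feature dropped
```

The grouping step is a single pass over the patterns, which is in line with the linear cost in the number of patterns mentioned elsewhere in these contexts.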
....2^N subsets. When N is large, this exhaustive approach is out of the question. Therefore, various feature selection methods have been designed to avoid exhaustive search while still aiming at the optimal subset. Examples are Branch and Bound [7], Focus [1], Relief [4], Wrapper methods [3], and LVF [5]. The feature selection problem can be viewed as a search problem [9]. The search process starts with either an empty set or a full set. For the former, it expands the search space by adding one feature at a time (Sequential Forward Selection) [1]; for the latter, it expands the search space by ....
H. Liu and R. Setiono. A probabilistic approach to feature selection - a filter solution. In Proceedings of ICML, pages 319--327. Morgan Kaufmann, 1996.
....[7]. To sum up, the exhaustive search approach is infeasible in practice; the heuristic search approach can reduce the search time significantly, but will fail on hard problems (e.g. the parity problem) or cannot remove redundant features. A probabilistic approach is proposed as an alternative [15] in selecting the optimal or suboptimal subset(s) of features. In the context of large databases, however, it would still take a considerably long time to check whether a subset is valid or not. We had first-hand experience of this problem when our probabilistic system was dispatched to a local ....
H. Liu and R. Setiono. A probabilistic approach to feature selection - a filter solution. In L. Saitta, editor, Proceedings of International Conference on Machine Learning (ICML-96), pages 319--327. Morgan Kaufmann Publishers, 1996.
....data types. The two basic types of data are nominal (e.g. attribute color may have values of red, green, yellow) and ordinal (e.g. attribute winning position can have values of 1, 2, and 3, or attribute salary can have 22345.00, 46543.89, etc. as its values). Many feature selection algorithms [1,3,5,10] are shown to work effectively on discrete data or even more strictly, on binary data (and/or binary class value). In order to deal with numeric attributes, a common practice for those algorithms is to discretize the data before conducting feature selection. This paper provides a way to select ....
H. Liu and R. Setiono. A probabilistic approach to feature selection - A filter solution. In Proceedings of the 13th International Conference on Machine Learning, pages 319--327, 1996.
....original data can be retained. This aspect of feature selection is related to the study of search strategies. Extensive research effort has been devoted to this study [19, 18, 10]. Examples are Branch and Bound [15, 18], Relief [6, 9], Wrapper methods [7], Approximate Markov Blanket [8], and LVF [13]. The search process starts with either an empty set or a full set. For the former, it expands the search space by adding one feature at a time (Stepwise Forward Selection); an example is Focus [1]. For the latter, it expands the search space by deleting one feature at a time (Stepwise Backward ....
....moderately large, Focus will take a long time. Similarly, if there are a moderate number of irrelevant features, Branch and Bound will take a long time. A third, more robust, option is to start with a random feature set and expand the search space by randomly generating new subsets (e.g. LVF [13]). In the rest of the paper P is the number of patterns, N is the number of features, and M is the minimal size of features. The above three representative methods are designed for different situations. When the minimal size (M) is small, it is sensible to apply Focus since 2^M is not large; ....
H. Liu and R. Setiono. A probabilistic approach to feature selection - a filter solution. In L. Saitta, editor, Proceedings of International Conference on Machine Learning (ICML-96), July 3-6, 1996, pages 319--327, Bari, Italy, 1996. San Francisco: Morgan Kaufmann Publishers, CA.
....class. The inconsistency criterion aims to keep the discriminating power of the data for multiple classes after feature selection. In other words, the criterion is supported by information theoretic considerations. The affirmative experimental results for the effectiveness of LVF can be found in (Liu & Setiono, 1996). It is shown that LVF can select relevant features where Focus or Branch and Bound are impractical. In other words, LVF complements many existing algorithms nicely (Dash & Liu, 1997). A limitation of LVF is that it works only on discrete data. Other versions extended from LVF inherit this ....
....rule induction, as well as data collection in future. 4. LVS can easily scale up. The time complexity of a feature selection algorithm can be described along two dimensions: number of attributes (N) and number of instances (P). By approximating MAX_TRIES of LVF with c × N (reduced from 2^N) (Liu & Setiono, 1996), the time complexity of LVF is mainly determined by P since N is relatively small. The scalable version, LVS, makes it possible to start with a fixed small number of instances (e.g. a few thousand for D_0) no matter how large the original data set is. The experimental results show that the ....
Liu, H., & Setiono, R. (1996). A probabilistic approach to feature selection - a filter solution. In L. Saitta (Ed.), Proceedings of International Conference on Machine Learning (ICML-96), July 3-6, 1996 (pp. 319--327). Bari, Italy: San Francisco: Morgan Kaufmann Publishers, CA.
....future categories. Their main focus was the branch and bound method [34] and its variants [16]. No experimental study was conducted in this paper. Their survey was published in the year 1987, and since then many new and efficient methods have been introduced (e.g. Focus [2], Relief [22], LVF [28]). Doak followed a similar approach to Siedlecki and Sklansky's survey and grouped the different search algorithms and evaluation functions used in feature selection methods independently, and ran experiments using some combinations of evaluation functions and search procedures. In this article, ....
....and evaluation functions used in feature selection methods independently, and ran experiments using some combinations of evaluation functions and search procedures. In this article, a survey is conducted for feature selection methods starting from the early 1970s [33] to the most recent methods [28]. In the next section, the two major steps of feature selection (generation procedure and evaluation function) are divided into different groups, and 32 different feature selection methods are categorized based on the type of generation procedure and evaluation function that is used. ....
Liu, H. and Setiono, R., A probabilistic approach to feature selection---a filter solution. In: Proceedings of International Conference on Machine Learning, 319--327, 1996.
....For simplicity, we use K, the predefined number of runs, as our sentinel for the loop. As LVFS keeps S_best, we may interrupt LVFS anytime and request S_best. Depending on the goodness measure and/or stopping criterion we choose, many versions of LV algorithms can be designed. LVF [20] is an algorithm that uses the inconsistency measure to evaluate the goodness of a feature subset. A subset is good if its inconsistency rate is less than or equal to a pre-specified value (by default, it is 0). LVF generates a random subset of features in each run and the subset is evaluated ....
....good choices due to the monotonicity of the inconsistency measure, LVF should find a good solution quickly. However, the optimal case is that LVF can efficiently return a good subset with the minimum number of features. As the calculation of the inconsistency rate can be done efficiently (O(P) [20]), if we insist that a new subset be smaller than a found one, we can expect that the longer LVF runs, the better the result it returns. Some analysis shows that LVF should get slower and slower in approaching the smallest subset, a problem of degradation. Assuming that there are 10 features and ....
H. Liu and R. Setiono. A probabilistic approach to feature selection - a filter solution. In L. Saitta, editor, Proceedings of International Conference on Machine Learning (ICML-96), July 3-6, 1996, pages 319--327, Bari, Italy, 1996. San Francisco: Morgan Kaufmann Publishers, CA.
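The LVF loop described in the contexts above (draw a random feature subset in each run, keep it if it is smaller than the best one found so far and its inconsistency rate does not exceed a pre-specified threshold) can be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm: the names `lvf`, `max_tries`, `threshold`, and `seed` are hypothetical, the subset size is drawn uniformly, and only strictly smaller subsets are accepted.

```python
import random
from collections import Counter, defaultdict


def inconsistency_rate(rows, labels, subset):
    """Share of patterns outside the majority class of their matching group."""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        groups[tuple(row[i] for i in subset)][label] += 1
    bad = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return bad / len(rows)


def lvf(rows, labels, max_tries, threshold=0.0, seed=0):
    """Las Vegas style filter: random search for a small consistent feature subset."""
    rng = random.Random(seed)        # seeded only to make the sketch repeatable
    n = len(rows[0])
    best = list(range(n))            # start from the full feature set
    for _ in range(max_tries):
        size = rng.randint(1, n)     # assumed sampling scheme
        candidate = sorted(rng.sample(range(n), size))
        # Evaluate only candidates that would improve on the current best size.
        if len(candidate) < len(best) and \
                inconsistency_rate(rows, labels, candidate) <= threshold:
            best = candidate
    return best


# Toy data: the class equals feature 0; features 1 and 2 are irrelevant.
rows = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
labels = [row[0] for row in rows]
print(lvf(rows, labels, max_tries=200))
```

Because the best subset is kept across runs, the loop can be interrupted at any point and still return a usable answer, which matches the anytime behaviour noted in the context above.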
No context found.
Setiono, R., and Liu, H. A probabilistic approach to feature selection-a filter solution. In Proceedings of International Conference on Machine Learning, 319-327 (1996).
No context found.
Setiono, R., and Liu, H. A probabilistic approach to feature selection-a filter solution. In Proceedings of International Conference on Machine Learning, 319-327 (1996). This is a rough measure. Obtaining true cpu time from within a Java program is quite difficult.
No context found.
H. Liu, R. Setiono, "A Probabilistic Approach to Feature Selection-A Filter Solution," Proceedings of the 13th International Conference on Machine Learning, pp319-327, 1996.
No context found.
H. Liu and R. Setiono. A Probabilistic Approach to Feature Selection -- a Filter Solution. In 13th Int'l Conf. on Machine Learning, pages 319--327. Morgan Kaufmann, 1996.
No context found.
H. Liu and R. Setiono. A probabilistic approach to feature selection - a filter solution. In Proc. of 13th International Conference on Machine Learning, pages 319--327, Bari, Italy, 1996. Morgan Kaufmann.
No context found.
H. Liu and R. Setiono, "A probabilistic approach to feature selection --- a filter approach," Proc. 13th Int. Conf. Machine Learning, Morgan Kaufmann, 1996, pp. 319--327.
No context found.
H. Liu and R. Setiono, "A Probabilistic Approach to Feature Selection -- A Filter Solution". Machine Learning, Proc. of the 13th International Conference, Bari, Italy, 319-327, 1996.
No context found.
Liu, H. and Setiono, R. (1996a). A Probabilistic Approach to Feature Selection - A Filter Solution. In Proceedings of the 13th International Conference on Machine Learning, pp. 319-327. San Francisco, CA:Morgan Kaufmann.
No context found.
, July 1996, pp. 319-327. Bari, Italy.
No context found.
Liu, H., Setiono, R.: A probabilistic approach to feature selection - A filter solution. In: Saitta, L. (Ed.): Proceedings of the Thirteenth International Conference on Machine Learning (ICML'96) Italy (1996) 319-327