Results 1–10 of 26
Learning from imbalanced data
 IEEE Trans. on Knowledge and Data Engineering
, 2009
"... Abstract—With the continuous expansion of data availability in many largescale, complex, and networked systems, such as surveillance, security, Internet, and finance, it becomes critical to advance the fundamental understanding of knowledge discovery and analysis from raw data to support decisionm ..."
Abstract

Cited by 260 (6 self)
Abstract—With the continuous expansion of data availability in many large-scale, complex, and networked systems, such as surveillance, security, Internet, and finance, it becomes critical to advance the fundamental understanding of knowledge discovery and analysis from raw data to support decision-making processes. Although existing knowledge discovery and data engineering techniques have shown great success in many real-world applications, the problem of learning from imbalanced data (the imbalanced learning problem) is a relatively new challenge that has attracted growing attention from both academia and industry. The imbalanced learning problem is concerned with the performance of learning algorithms in the presence of underrepresented data and severe class distribution skews. Due to the inherent complex characteristics of imbalanced data sets, learning from such data requires new understandings, principles, algorithms, and tools to transform vast amounts of raw data efficiently into information and knowledge representation. In this paper, we provide a comprehensive review of the development of research in learning from imbalanced data. Our focus is to provide a critical review of the nature of the problem, the state-of-the-art technologies, and the current assessment metrics used to evaluate learning performance under the imbalanced learning scenario. Furthermore, in order to stimulate future research in this field, we also highlight the major opportunities and challenges, as well as potential important research directions for learning from imbalanced data. Index Terms—Imbalanced learning, classification, sampling methods, cost-sensitive learning, kernel-based learning, active learning, assessment metrics.
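Among the sampling methods this survey reviews, the simplest is random oversampling of minority classes. A minimal illustrative sketch, not from the paper (the helper name `random_oversample` is hypothetical):

```python
import random

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class examples until every
    class matches the majority class size. The crudest of the sampling
    methods for imbalanced learning; more refined variants (e.g.
    synthetic sampling) interpolate new examples instead."""
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    target = max(len(rows) for rows in by_class.values())
    Xo, yo = [], []
    for c, rows in by_class.items():
        Xo.extend(rows)                 # keep every original example
        yo.extend([c] * len(rows))
        extra = target - len(rows)      # top up with random duplicates
        Xo.extend(rng.choice(rows) for _ in range(extra))
        yo.extend([c] * extra)
    return Xo, yo
```

After oversampling, every class contributes equally many examples, at the cost of exact duplicates that can encourage overfitting — one of the trade-offs the survey discusses.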
Tackling the Poor Assumptions of Naive Bayes Text Classifiers
 In Proceedings of the Twentieth International Conference on Machine Learning
, 2003
"... Naive Bayes is often used as a baseline in text classification because it is fast and easy to implement. Its severe assumptions make such efficiency possible but also adversely affect the quality of its results. In this paper we propose simple, heuristic solutions to some of the problems with Naive ..."
Abstract

Cited by 157 (5 self)
Naive Bayes is often used as a baseline in text classification because it is fast and easy to implement. Its severe assumptions make such efficiency possible but also adversely affect the quality of its results. In this paper we propose simple, heuristic solutions to some of the problems with Naive Bayes classifiers, addressing both systemic issues as well as problems that arise because text is not actually generated according to a multinomial model.
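For reference, the baseline this paper improves on — a multinomial naive Bayes text classifier with Laplace smoothing — can be sketched as follows (an illustrative toy, not the paper's code; `train_mnb` and `predict_mnb` are hypothetical names):

```python
import math
from collections import Counter, defaultdict

def train_mnb(docs, labels, alpha=1.0):
    """Train multinomial naive Bayes: log-priors plus Laplace-smoothed
    per-class word log-likelihoods. Each doc is a list of tokens."""
    vocab = {w for d in docs for w in d}
    counts = defaultdict(Counter)            # class -> word counts
    class_counts = Counter(labels)
    for d, y in zip(docs, labels):
        counts[y].update(d)
    model, n = {}, len(docs)
    for y in class_counts:
        total = sum(counts[y].values())
        denom = total + alpha * len(vocab)
        model[y] = (
            math.log(class_counts[y] / n),
            {w: math.log((counts[y][w] + alpha) / denom) for w in vocab},
            math.log(alpha / denom),         # mass for unseen words
        )
    return model

def predict_mnb(model, doc):
    """Pick the class maximizing log P(y) + sum of word log-likelihoods."""
    def score(y):
        prior, likes, unseen = model[y]
        return prior + sum(likes.get(w, unseen) for w in doc)
    return max(model, key=score)
```

The paper's point is that the multinomial assumption above is a poor model of real text; its heuristic corrections (e.g. transforming counts and weights) leave this basic decision rule intact.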
Not so naive Bayes: Aggregating one-dependence estimators
 Machine Learning
, 2005
"... Of numerous proposals to improve the accuracy of naive Bayes by weakening its attribute independence assumption, both LBR and superparent TAN have demonstrated remarkable error performance. However, both techniques obtain this outcome at a considerable computational cost. We present a new approach ..."
Abstract

Cited by 94 (11 self)
Of numerous proposals to improve the accuracy of naive Bayes by weakening its attribute independence assumption, both LBR and super-parent TAN have demonstrated remarkable error performance. However, both techniques obtain this outcome at a considerable computational cost. We present a new approach to weakening the attribute independence assumption by averaging all of a constrained class of classifiers. In extensive experiments this technique delivers comparable prediction accuracy to LBR and super-parent TAN with substantially improved computational efficiency at test time relative to the former and at training time relative to the latter. The new algorithm is shown to have low variance and is suited to incremental learning.
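The averaging scheme this abstract describes (published as AODE) lets each attribute take a turn as "super-parent" and averages the one-dependence estimates P(y, x_i) · ∏_j P(x_j | y, x_i). A rough sketch for discrete attributes with Laplace smoothing — an illustrative class, not the authors' implementation:

```python
from collections import defaultdict

class AODE:
    """Averaged One-Dependence Estimators over discrete attributes."""

    def fit(self, X, y):
        self.n, self.m = len(X), len(X[0])
        self.pair = defaultdict(int)   # (i, x_i, c) counts
        self.trip = defaultdict(int)   # (i, x_i, j, x_j, c) counts
        self.vals = [set() for _ in range(self.m)]
        self.classes = sorted(set(y))
        for row, c in zip(X, y):
            for i, xi in enumerate(row):
                self.vals[i].add(xi)
                self.pair[(i, xi, c)] += 1
                for j, xj in enumerate(row):
                    self.trip[(i, xi, j, xj, c)] += 1
        return self

    def predict(self, row):
        def joint(c):
            # Average P(y, x_i) * prod_j P(x_j | y, x_i) over parents i.
            total = 0.0
            for i, xi in enumerate(row):
                p = (self.pair[(i, xi, c)] + 1) / (
                    self.n + len(self.vals[i]) * len(self.classes))
                for j, xj in enumerate(row):
                    if j != i:
                        p *= (self.trip[(i, xi, j, xj, c)] + 1) / (
                            self.pair[(i, xi, c)] + len(self.vals[j]))
                total += p
            return total / self.m
        return max(self.classes, key=joint)
```

Because every P(x_j | y, x_i) conditions on a second attribute, AODE captures pairwise interactions (e.g. XOR of two attributes) that naive Bayes cannot, while training in a single counting pass — which is also why it suits incremental learning.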
Lazy Learning of Bayesian Rules
 Machine Learning
, 2000
"... The naive Bayesian classifier provides a simple and e#ective approach to classifier learning, but its attribute independence assumption is often violated in the real world. A number of approaches have sought to alleviate this problem. A Bayesian tree learning algorithm builds a decision tree, and ge ..."
Abstract

Cited by 55 (11 self)
The naive Bayesian classifier provides a simple and effective approach to classifier learning, but its attribute independence assumption is often violated in the real world. A number of approaches have sought to alleviate this problem. A Bayesian tree learning algorithm builds a decision tree, and generates a local naive Bayesian classifier at each leaf. The tests leading to a leaf can alleviate attribute interdependencies for the local naive Bayesian classifier. However, Bayesian tree learning still suffers from the small disjunct problem of tree learning. While inferred Bayesian trees demonstrate low average prediction error rates, there is reason to believe that error rates will be higher for those leaves with few training examples. This paper proposes the application of lazy learning techniques to Bayesian tree induction and presents the resulting lazy Bayesian rule learning algorithm, called Lbr. This algorithm can be justified by a variant of Bayes theorem which supports a weaker conditional attribute independence assumption than is required by naive Bayes. For each test example, it builds a most appropriate rule with a local naive Bayesian classifier as its consequent. It is demonstrated that the computational requirements of Lbr are reasonable in a wide cross-section of natural domains. Experiments with these domains show that, on average, this new algorithm obtains lower error rates significantly more often than the reverse in comparison to a naive Bayesian classifier, C4.5, a Bayesian tree learning algorithm, a constructive Bayesian classifier that eliminates attributes and constructs new attributes using Cartesian products of existing nominal attributes, and a lazy decision tree learning algorithm. It also outperforms, although the result is not statisticall...
Lazy Bayesian Rules: A Lazy Semi-Naive Bayesian Learning Technique Competitive to Boosting Decision Trees
 In Proc. 16th International Conference on Machine Learning
, 1999
"... Lbr is a lazy seminaive Bayesian classifier learning technique, designed to alleviate the attribute interdependence problem of naive Bayesian classification. To classify a test example, it creates a conjunctive rule that selects a most appropriate subset of training examples and induces a local nai ..."
Abstract

Cited by 20 (6 self)
Lbr is a lazy semi-naive Bayesian classifier learning technique, designed to alleviate the attribute interdependence problem of naive Bayesian classification. To classify a test example, it creates a conjunctive rule that selects a most appropriate subset of training examples and induces a local naive Bayesian classifier using this subset. Lbr can significantly improve the performance of the naive Bayesian classifier. A bias and variance analysis of Lbr reveals that it significantly reduces the bias of naive Bayesian classification at a cost of a slight increase in variance. It is interesting to compare this lazy technique with boosting and bagging, two well-known state-of-the-art non-lazy learning techniques. Empirical comparison of Lbr with boosting decision trees on discrete-valued data shows that Lbr has, on average, significantly lower variance and higher bias. As a result of the interaction of these effects, the average prediction error of Lbr over a range of learning tasks is at...
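The lazy procedure both Lbr abstracts describe can be sketched roughly: for a given test example, greedily move attributes into a rule antecedent (restricting the training set to examples that share the test example's value) whenever that reduces the local naive Bayes error. The original uses leave-one-out error; this illustration substitutes plain resubstitution error for brevity and is not the authors' code:

```python
from collections import Counter

def nb_predict(rows, labels, attrs, test):
    """Laplace-smoothed naive Bayes restricted to attribute indices `attrs`."""
    classes = Counter(labels)
    def score(c):
        s = (classes[c] + 1) / (len(labels) + len(classes))
        for a in attrs:
            match = sum(1 for r, y in zip(rows, labels)
                        if y == c and r[a] == test[a])
            vals = len({r[a] for r in rows})
            s *= (match + 1) / (classes[c] + vals)
        return s
    return max(classes, key=score)

def lbr_classify(rows, labels, test):
    """Lazy Bayesian Rules, simplified: grow the antecedent while the
    conditioned local NB beats the current NB on the matching examples."""
    attrs = set(range(len(test)))
    improved = True
    while improved and len(attrs) > 1:
        improved = False
        for a in sorted(attrs):
            sub = [(r, y) for r, y in zip(rows, labels) if r[a] == test[a]]
            if len(sub) < 2:
                continue
            rs, ls = (list(t) for t in zip(*sub))
            e_old = sum(nb_predict(rows, labels, attrs, r) != y for r, y in sub)
            e_new = sum(nb_predict(rs, ls, attrs - {a}, r) != y for r, y in sub)
            if e_new < e_old:               # condition on x_a = test[a]
                rows, labels, attrs = rs, ls, attrs - {a}
                improved = True
                break
    return nb_predict(rows, labels, attrs, test)
```

Conditioning on an attribute removes it from the independence assumption, which is how Lbr reduces bias; the shrinking training subset is where the slight variance increase comes from.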
A comparative study of semi-naive Bayes methods in classification learning
 In Proc. 4th Australasian Data Mining Conference (AusDM05)
, 2005
"... Abstract. Numerous techniques have sought to improve the accuracy of Naive Bayes (NB) by alleviating the attribute interdependence problem. This paper summarizes these seminaive Bayesian methods into two groups: those that apply conventional NB with a new attribute set, and those that alter NB by a ..."
Abstract

Cited by 14 (6 self)
Numerous techniques have sought to improve the accuracy of Naive Bayes (NB) by alleviating the attribute interdependence problem. This paper summarizes these semi-naive Bayesian methods into two groups: those that apply conventional NB with a new attribute set, and those that alter NB by allowing interdependencies between attributes. We review eight typical semi-naive Bayesian learning algorithms and perform error analysis using the bias-variance decomposition on thirty-six natural domains from the UCI Machine Learning Repository. In analysing the results of these experiments we provide general recommendations for selection between methods.
To Select or To Weigh: A Comparative Study of Linear Combination Schemes for SuperParent-One-Dependence Estimators
"... We conduct a largescale comparative study on linearly combining superparentonedependence estimators (SPODEs), a popular family of seminaive Bayesian classifiers. Altogether 16 model selection and weighing schemes, 58 benchmark data sets, as well as various statistical tests are employed. This p ..."
Abstract

Cited by 13 (4 self)
We conduct a large-scale comparative study on linearly combining superparent-one-dependence estimators (SPODEs), a popular family of semi-naive Bayesian classifiers. Altogether 16 model selection and weighing schemes, 58 benchmark data sets, as well as various statistical tests are employed. This paper’s main contributions are threefold. First, it formally presents each scheme’s definition, rationale and time complexity, and hence can serve as a comprehensive reference for researchers interested in ensemble learning. Second, it offers bias-variance analysis for each scheme’s classification error performance. Third, it identifies effective schemes that meet various needs in practice. This leads to accurate and fast classification algorithms with immediate and significant impact on real-world applications. Another important feature of our study is using a variety of statistical tests to evaluate multiple learning methods across multiple data sets.
Efficiently mining interesting emerging patterns
 in Proc. 4th Int’l. Conf. on Web-Age Information Management (WAIM 2003)
, 2003
"... Knowledge Discovery in Databases (KDD), or Data Mining is used to discover interesting or useful patterns and relationships in data, with an emphasis on large volume of observational databases. Among many other types of information (knowledge) that can be discovered in data, patterns that are expres ..."
Abstract

Cited by 9 (3 self)
Knowledge Discovery in Databases (KDD), or Data Mining, is used to discover interesting or useful patterns and relationships in data, with an emphasis on large volumes of observational databases. Among the many other types of information (knowledge) that can be discovered in data, patterns that are expressed in terms of features are popular because they can be understood and used directly by people. The recently proposed Emerging Pattern (EP) is one type of such knowledge patterns. Emerging Patterns are sets of items (conjunctions of attribute values) whose frequency changes significantly from one dataset to another. They are useful as a means of discovering distinctions inherently present amongst a collection of datasets and have been shown to be a powerful method for constructing accurate classifiers. In this doctoral dissertation, we study the following three major problems involved in the discovery of Emerging Patterns and the construction of classification systems based on Emerging Patterns: 1. How to efficiently discover the complete set of Emerging Patterns between two classes
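The core quantity behind Emerging Patterns is the growth rate of an itemset's support between two datasets. A minimal sketch of that definition (illustrative only; the naive candidate filter below is not the dissertation's efficient mining algorithm):

```python
def growth_rate(pattern, d1, d2):
    """Growth rate of itemset `pattern` from dataset d1 to d2: support
    in d2 divided by support in d1. Each dataset is a list of item
    sets; a pattern 'occurs' in a row when it is a subset of it."""
    def support(data):
        return sum(1 for row in data if pattern <= row) / len(data)
    s1, s2 = support(d1), support(d2)
    if s1 == 0:
        return float('inf') if s2 > 0 else 0.0   # jumping EP case
    return s2 / s1

def emerging_patterns(candidates, d1, d2, rho=2.0):
    """Keep candidate itemsets whose growth rate from d1 to d2 is >= rho."""
    return [p for p in candidates if growth_rate(p, d1, d2) >= rho]
```

Checking every candidate subset like this is exponential in the number of items; the efficiency problem the dissertation studies is exactly how to avoid this brute-force enumeration.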
Efficient lazy elimination for averaged one-dependence estimators
 In Proceedings of the 23rd ICML
, 2006
"... Seminaive Bayesian classifiers seek to retain the numerous strengths of naive Bayes while reducing error by relaxing the attribute independence assumption. Backwards Sequential Elimination (BSE) is a wrapper technique for attribute elimination that has proved effective at this task. We explore a ne ..."
Abstract

Cited by 7 (2 self)
Semi-naive Bayesian classifiers seek to retain the numerous strengths of naive Bayes while reducing error by relaxing the attribute independence assumption. Backwards Sequential Elimination (BSE) is a wrapper technique for attribute elimination that has proved effective at this task. We explore a new technique, Lazy Elimination (LE), which eliminates highly related attribute-values at classification time without the computational overheads inherent in wrapper techniques. We analyze the effect of LE and BSE on a state-of-the-art semi-naive Bayesian algorithm, Averaged One-Dependence Estimators (AODE). Our experiments show that LE significantly reduces bias and error without undue computation, while BSE significantly reduces bias but not error, with high training time complexity. In the context of AODE, LE has a significant advantage over BSE in both computational efficiency and error.
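The idea of eliminating highly related attribute-values at classification time can be sketched as follows: if one value in the test instance almost always co-occurs with another in the training data, the implied (more general) value carries almost no extra information and is dropped. An illustrative sketch, not the paper's implementation; the threshold and minimum-count parameters are hypothetical choices:

```python
def lazy_eliminate(instance, rows, threshold=0.95, min_count=5):
    """Lazy Elimination sketch: drop attribute index j from the test
    instance when some other value x_i entails x_j in training, i.e.
    P(x_j | x_i) >= threshold over at least min_count occurrences.
    Returns the sorted attribute indices to keep for classification."""
    m = len(instance)
    keep = set(range(m))
    for i in range(m):
        ni = sum(1 for r in rows if r[i] == instance[i])
        if ni < min_count:
            continue                      # too little evidence for x_i
        for j in range(m):
            if i == j or j not in keep or i not in keep:
                continue
            nij = sum(1 for r in rows
                      if r[i] == instance[i] and r[j] == instance[j])
            if nij / ni >= threshold:
                keep.discard(j)           # x_i entails x_j; drop x_j
    return sorted(keep)
```

Because the check runs per test instance against simple co-occurrence counts, it avoids the repeated model re-evaluation that makes wrapper methods like BSE expensive at training time.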
Sawtooth: Learning from huge amounts of data
, 2004
"... Data scarcity has been a problem in data mining up until recent times. Now, in the era of the Internet and the tremendous advances in both, data storage devices and highspeed computing, databases are filling up at rates never imagined before. The machine learning problems of the past have been augm ..."
Abstract

Cited by 7 (0 self)
Data scarcity was a problem in data mining until recent times. Now, in the era of the Internet and tremendous advances in both data storage devices and high-speed computing, databases are filling up at rates never imagined before. The machine learning problems of the past have been augmented by an increasingly important one: scalability. Extracting useful information from arbitrarily large data collections or data streams is now of special interest within the data mining community. In this research we find that mining from such large datasets may actually be quite simple. We address the scalability issues of previous widely used batch learning algorithms and the discretization techniques used to handle continuous values within the data. Then, we describe an incremental algorithm that addresses the scalability problem of Bayesian classifiers, and propose a Bayesian-compatible online discretization technique that handles continuous values, both with a “simplicity first” approach and very low memory (RAM) requirements.
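An online discretization technique in this low-memory spirit can be sketched in its simplest form: track only the running range of each numeric stream and bin values by equal width over the range seen so far. This is an illustrative stand-in, not the thesis's actual technique; the class name and bin count are hypothetical:

```python
class StreamingDiscretizer:
    """Incremental equal-width discretizer: O(1) memory per attribute.
    Bin boundaries drift as the observed range grows, a trade-off that
    batch discretizers avoid by seeing all data before binning."""

    def __init__(self, bins=10):
        self.bins = bins
        self.lo = self.hi = None

    def update(self, x):
        """Fold one streamed value into the running min/max."""
        self.lo = x if self.lo is None else min(self.lo, x)
        self.hi = x if self.hi is None else max(self.hi, x)

    def bin(self, x):
        """Map a value to a bin index over the range seen so far."""
        if self.lo is None or self.hi == self.lo:
            return 0
        width = (self.hi - self.lo) / self.bins
        return min(self.bins - 1, int((x - self.lo) / width))
```

Paired with an incrementally counted Bayesian classifier (whose counts are just updated per example, as in the AODE entry above), this keeps the whole pipeline single-pass over arbitrarily large data.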