
## Learning from imbalanced data in relational domains: A soft margin approach

Venue: ICDM (2014)

Citations: 1 (1 self)

### Citations

3461 | The Elements of Statistical Learning
- Hastie, Tibshirani, et al.
- 2003
Citation Context ...ifiers by reweighting the example after every iteration and has been shown to outperform a single complex model. But boosting often suffers from overfitting after a few iterations [23]. Hastie et al. [12] proposed ε-Boost where they regularize by shrinking the contribution of each weak classifier. Jin et al., [24] proposed Weight-Boost, that combines weak classifier with an instance-dependent weight f...

2209 | Experiments with a New Boosting Algorithm
- Freund, Schapire
- 1996
Citation Context ...s learned). But this approach uses a constant cost for type I (false positive) and type II (false negative) errors and can still overfit as more trees are included. C. Other ensemble methods AdaBoost [22], the most popular ensemble method, learns a sequence of weak classifiers by reweighting the example after every iteration and has been shown to outperform a single complex model. But boosting often s...

993 | Greedy Function Approximation: A Gradient Boosting Machine
- Friedman
- 2001
Citation Context ...n the models and complex search procedures inside the methods. A more recent algorithm called Relational Functional Gradient Boosting (RFGB) [4], [5], based on Friedman’s functional gradient boosting [6] addressed this problem by learning structure and parameters simultaneously. This was achieved by learning a set of relational trees for modelling the distribution of each (first-order) variable given...

609 | Learning probabilistic relational models
- Friedman, Getoor, et al.
- 1999
Citation Context ...in the relations increases. This problem is a critical one in all SRL domains involve relations between people, such as co-worker, advised-by, co-authors, etc. The way Probabilistic Relational Models [7] bypasses this problem is by creating a binary existence variable for every possible relation, but this introduces a similar class imbalance problem while learning the existence variables because this...

337 | Introduction to Statistical Relational Learning
- Getoor, Taskar, et al.
- 2007
Citation Context ...lly. I. INTRODUCTION Recently, a great deal of progress has been made in combining statistical methods with relational or logical models, in what is now known as Statistical relational Learning (SRL) [1]. SRL addresses the challenge of applying statistical learning and inference approaches to problems which involve rich collections of objects linked together in a complex, stochastic and relational wo...

117 | Linkage and autocorrelation cause feature selection bias in relational learning
- Jensen, Neville
Citation Context ...m is typically addressed by operating and sampling in the feature space [8]. However, in relational domains, the feature vector is not of a fixed length and the feature space can possibly be infinite [9]. Previous work has shown that random sampling of features will not suffice in relational domains [10]. One solution to address the issue of imbalance leading to overfitting in the propositional world...

111 | Top-down induction of first-order logical decision trees
- Blockeel, Raedt
- 1998
Citation Context ...ted for each example. This gradient becomes a weight for that example (∆(xi)). In SRL case, this corresponds to computing the gradient (weight) for every grounding. Then, a relational regression tree [21] is learned to fit to these weighted groundings, which is then added to the model. Similar to parametric descent, the sum of these m gradients is the current value of the regression function ψ. Note t...

83 | Large margin hidden Markov models for automatic speech recognition
- Sha, Saul
- 2007
Citation Context ...log-likelihood by considering the underlying model to be a (structured) log-linear model, we introduce a soft-margin objective function that is inspired by earlier research in log-linear models [14], [15]. The objective is very simple: high-cost examples should be penalized differently from low-cost examples. In our setting, false negatives constitute high-cost examples. This is especially motivated b...

57 | Data mining for imbalanced datasets: an overview.
- Chawla
- 2005
Citation Context ...training. While effective, this method can lead to a large variance in the resulting probabilistic model. An alternative method typically is to oversample the minority class but as observed by Chawla [8], this can lead to overfitting on the minority class. This is particularly true for incremental model-building algorithms such as RFGB, which iteratively attempt to fix the mistakes made in earlier le...

47 | Learning Markov logic network structure via hypergraph lifting
- Kok, Domingos
- 2009
Citation Context ...oblem in SRL is also significantly difficult. Consequently, structure learning has received increased attention lately, particularly in the case of one type of SRL models called Markov Logic Networks [2], [3]. These approaches provide solutions that are theoretically interesting; however, applicability to real, large applications is nominal due to restricting assumptions in the models and complex sea...

40 | Boosting: Foundations and Algorithms
- Freund, Schapire
- 2012
Citation Context ...itting in the propositional world is to perform some form of margin maximization. A common approach to margin maximization is via regularization, typically achieved via a regularization function [11]–[13]. In propositional and relational functional-gradient boosting methods, common regularization approaches restrict number of iterations, tree size or number of trees learned. While reasonably successfu...

39 | Gradient-based boosting for statistical relational learning: The relational dependency network case
- Natarajan, Kersting, et al.
Citation Context ...ge applications is nominal due to restricting assumptions in the models and complex search procedures inside the methods. A more recent algorithm called Relational Functional Gradient Boosting (RFGB) [4], [5], based on Friedman’s functional gradient boosting [6] addressed this problem by learning structure and parameters simultaneously. This was achieved by learning a set of relational trees for mode...

38 | Boosted classification trees and class probability/quantile estimation
- Mease, Wyner, et al.
- 2007
Citation Context ...quence of weak classifiers by reweighting the example after every iteration and has been shown to outperform a single complex model. But boosting often suffers from overfitting after a few iterations [23]. Hastie et al. [12] proposed ε-Boost where they regularize by shrinking the contribution of each weak classifier. Jin et al., [24] proposed Weight-Boost, that combines weak classifier with an instanc...

30 | Softmax-margin CRFs: Training log-linear models with cost functions.
- Gimpel, Smith
- 2010
Citation Context ...s the log-likelihood by considering the underlying model to be a (structured) log-linear model, we introduce a soft-margin objective function that is inspired by earlier research in log-linear models [14], [15]. The objective is very simple: high-cost examples should be penalized differently from low-cost examples. In our setting, false negatives constitute high-cost examples. This is especially motiv...

29 | Learning Markov logic networks via functional gradient boosting
- Khot, Natarajan, et al.

22 | A new boosting algorithm using input-dependent regularizer
- Liu, Si, et al.
- 2003
Citation Context .... But boosting often suffers from overfitting after a few iterations [23]. Hastie et al. [12] proposed ε-Boost where they regularize by shrinking the contribution of each weak classifier. Jin et al., [24] proposed Weight-Boost, that combines weak classifier with an instance-dependent weight factor. They trade-off between the weak classifier in the current iteration and the classifier based on the prev...

16 | Imitation learning in relational domains: A functional-gradient boosting approach
- Natarajan, Joshi, et al.
- 2011
Citation Context ...e also show the relation between the soft margin and the current RFGB algorithm. Since the original RFGB algorithm has been extended to learn several types of directed and undirected models [4], [5], [16], our algorithm is broadly applicable and not restricted to a particular class of SRL model. Finally, we evaluate the algorithm on several standard data sets and demonstrate empirically that the propo...

15 | An improvement of AdaBoost to avoid overfitting
- Rätsch, Onoda, et al.
- 1998
Citation Context ...ifier in the current iteration and the classifier based on the previous iterations. Xi et al., [25] minimize a L1-regularized exponential loss for sparse solutions and early stopping. Rätsch et al., [26] proposed a weight-decay method, where they soften the margin by introducing a slack variable in the exponential loss function. In contrast to our approach, these approaches were applied to propositio...

9 | A new evaluation measure for imbalanced datasets
- Weng, Poon
- 2008
Citation Context ... as the precision stays within a reasonable range. To better serve such a goal, we employ evaluation metrics that assign higher weights to high recall regions, that is, the top region in an ROC curve [27]. Specifically this paper uses three such metrics: 1) false negative rate, 2) F5 measure, and 3) weighted AUC-ROC. By comparing the metrics it becomes possible to better understand an algorithm’s perf...

8 | Speed and sparsity of regularized boosting.
- Xi, Xiang, et al.
- 2008
Citation Context ... combines weak classifier with an instance-dependent weight factor. They trade-off between the weak classifier in the current iteration and the classifier based on the previous iterations. Xi et al., [25] minimize a L1-regularized exponential loss for sparse solutions and early stopping. Rätsch et al., [26] proposed a weight-decay method, where they soften the margin by introducing a slack variable i...

5 | Cost-sensitive learning with conditional Markov networks
- Sen, Getoor
Citation Context ...s false negatives). However, they have not yet been applied to SRL models, which require non-trivial modifications. Cost-sensitive learning for relational models has been considered by Sen and Getoor [19] where they learn the parameters of a conditional Markov network using two cost-sensitive approaches: one based on the expected cost of misclassification and the second based on introducing a cost in ...

2 | Large margin cost-sensitive learning of conditional random fields
- Kim
- 2010
Citation Context ... addressing the problem of class imbalance. II. BACKGROUND AND RELATED WORK A. Soft Margin Learning for Unbalanced Datasets Soft margin approaches are a popular approach for unbalanced datasets [17], [18]. Conceptually, our approach is closest to the softmax-margin approach for log-linear models [14] where the objective function is a modified log-likelihood function with a cost function in the normali...

1 | Relational one-class classification: A non-parametric approach
- Khot, Natarajan, et al.
- 2014
Citation Context ...omains, the feature vector is not of a fixed length and the feature space can possibly be infinite [9]. Previous work has shown that random sampling of features will not suffice in relational domains [10]. One solution to address the issue of imbalance leading to overfitting in the propositional world is to perform some form of margin maximization. A common approach to margin maximization is via regul...

1 | Ar-boost: Reducing overfitting by a robust data-driven regularization strategy
- Saha, Kunapuli, et al.
Citation Context ...overfitting in the propositional world is to perform some form of margin maximization. A common approach to margin maximization is via regularization, typically achieved via a regularization function [11]–[13]. In propositional and relational functional-gradient boosting methods, common regularization approaches restrict number of iterations, tree size or number of trees learned. While reasonably succ...

1 | Maximal-margin approach for cost-sensitive learning based on scaled convex hull
- Liu
- 2011
Citation Context ...FGB in addressing the problem of class imbalance. II. BACKGROUND AND RELATED WORK A. Soft Margin Learning for Unbalanced Datasets Soft margin approaches are a popular approach for unbalanced datasets [17], [18]. Conceptually, our approach is closest to the softmax-margin approach for log-linear models [14] where the objective function is a modified log-likelihood function with a cost function in the n...