Results 1–10 of 17
An efficient approach for assessing hyperparameter importance
In Proc. of ICML'14, 2014
Abstract

Cited by 7 (3 self)
The performance of many machine learning methods depends critically on hyperparameter settings. Sophisticated Bayesian optimization methods have recently achieved considerable successes in optimizing these hyperparameters, in several cases surpassing the performance of human experts. However, blind reliance on such methods can leave end users without insight into the relative importance of different hyperparameters and their interactions. This paper describes efficient methods that can be used to gain such insight, leveraging random forest models fit on the data already gathered by Bayesian optimization. We first introduce a novel, linear-time algorithm for computing marginals of random forest predictions and then show how to leverage these predictions within a functional ANOVA framework, to quantify the importance of both single hyperparameters and of interactions between hyperparameters. We conducted experiments with prominent machine learning frameworks and state-of-the-art solvers for combinatorial problems. We show that our methods provide insight into the relationship between hyperparameter settings and performance, and demonstrate that, even in very high-dimensional cases, most performance variation is attributable to just a few hyperparameters.
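At its core, the paper's importance measure is the variance of a hyperparameter's marginal prediction relative to total variance. The sketch below illustrates that fANOVA decomposition on a hand-written, purely additive performance surface over two invented hyperparameters; it averages over a small grid rather than computing marginals from a random forest with the paper's linear-time algorithm.

```python
import itertools
import statistics

# Hypothetical performance surface (illustrative only): most variation
# comes from the learning rate, very little from the batch-size knob.
def perf(lr, bs):
    return (lr - 0.3) ** 2 + 0.01 * bs

lrs = [0.1 * i for i in range(1, 10)]
bss = [0.0, 0.5, 1.0]

# Total variance of performance over the full configuration grid.
grid = [perf(lr, bs) for lr, bs in itertools.product(lrs, bss)]
total_var = statistics.pvariance(grid)

# Marginal of each hyperparameter: average performance over all values of
# the other one (the quantity the paper computes from a forest in linear time).
marg_lr = [statistics.mean(perf(lr, bs) for bs in bss) for lr in lrs]
marg_bs = [statistics.mean(perf(lr, bs) for lr in lrs) for bs in bss]

# fANOVA main-effect importance: variance of the marginal / total variance.
imp_lr = statistics.pvariance(marg_lr) / total_var
imp_bs = statistics.pvariance(marg_bs) / total_var
```

Because the toy surface is additive (no interaction term), the two main effects account for essentially all of the variance, with the learning rate dominating.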
Predicting the Hardness of Learning Bayesian Networks
, 2014
Abstract

Cited by 5 (2 self)
There are various algorithms for finding a Bayesian network structure (BNS) that is optimal with respect to a given scoring function. No single algorithm dominates the others in speed, and, given a problem instance, it is a priori unclear which algorithm will perform best and how fast it will solve the problem. Estimating the runtimes directly is extremely difficult as they are complicated functions of the instance. The main contribution of this paper is a characterization of the empirical hardness of an instance for a given algorithm, based on a novel collection of non-trivial yet efficiently computable features. Our empirical results, based on the largest evaluation of state-of-the-art BNS learning algorithms to date, demonstrate that we can predict the runtimes to a reasonable degree of accuracy, and effectively select algorithms that perform well on a particular instance. Moreover, we also show how the results can be utilized in building a portfolio algorithm that combines several individual algorithms in an almost optimal manner.
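As a minimal illustration of such an empirical hardness model, the sketch below fits a one-feature least-squares line to log-transformed runtimes. The instance feature, the data points, and the log10 target are assumptions made for illustration, not the paper's (much richer) feature set or model family.

```python
# Toy empirical-hardness model: predict log10(runtime) of a solver from a
# single cheap instance feature (here, instance size). Numbers are invented.
train = [(10, 0.5), (20, 1.1), (40, 2.0), (80, 4.3)]  # (size, log10 seconds)

# Ordinary least squares for a single feature.
n = len(train)
mean_x = sum(x for x, _ in train) / n
mean_y = sum(y for _, y in train) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in train)
         / sum((x - mean_x) ** 2 for x, _ in train))
intercept = mean_y - slope * mean_x

def predict_runtime(size):
    """Predicted runtime in seconds (note the log-transformed target)."""
    return 10 ** (intercept + slope * size)
```

Predicting the log of the runtime, rather than the runtime itself, is a standard trick when runtimes span several orders of magnitude; the algorithm-selection step then just picks the algorithm with the smallest prediction.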
Bayesian Optimization With Censored Response Data
Abstract

Cited by 2 (1 self)
Bayesian optimization (BO) aims to minimize a given black-box function using a model that is updated whenever new evidence about the function becomes available. Here, we address the problem of BO under partially right-censored response data, where in some evaluations we only obtain a lower bound on the function value. The ability to handle such response data allows us to adaptively censor costly function evaluations in minimization problems where the cost of a function evaluation corresponds to the function value. One important application giving rise to such censored data is the runtime-minimizing variant of the algorithm configuration problem: finding settings of a given parametric algorithm that minimize the runtime required for solving problem instances from a given distribution. We demonstrate that terminating slow algorithm runs prematurely and handling the resulting right-censored observations can substantially improve the state of the art in model-based algorithm configuration.
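The adaptive-censoring idea can be sketched without any model: cap each run at the best runtime observed so far, and record a right-censored lower bound whenever the cap is hit. The configuration names and runtimes below are invented for illustration.

```python
# Hypothetical true runtimes of four configurations (unknown to the optimizer).
true_runtime = {"A": 12.0, "B": 3.0, "C": 30.0, "D": 2.5}

best = float("inf")
observations = []  # (config, observed value, censored?)
for cfg in ["A", "B", "C", "D"]:
    cap = best                 # never run longer than the incumbent's runtime
    t = true_runtime[cfg]
    if t >= cap:
        observations.append((cfg, cap, True))   # right-censored: only t >= cap
    else:
        observations.append((cfg, t, False))    # exact measurement
        best = t                                # new incumbent
```

Here the slow configuration C is cut off after 3 seconds instead of running for 30, so the total evaluation cost drops from 47.5 to 20.5 seconds while the best configuration (D) is still found; the censored observation (C, 3.0) is exactly the kind of lower-bound datum the paper's models must handle.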
From sequential algorithm selection to parallel portfolio selection
In Dhaenens, 2015
Abstract

Cited by 1 (1 self)
Abstract. In view of the increasing importance of hardware parallelism, a natural extension of per-instance algorithm selection is to select a set of algorithms to be run in parallel on a given problem instance, based on features of that instance. Here, we explore how existing algorithm selection techniques can be effectively parallelized. To this end, we leverage the machine learning models used by existing sequential algorithm selectors, such as 3S, ISAC, SATzilla and ME-ASP, and modify their selection procedures to produce a ranking of the given candidate algorithms; we then select the top n algorithms under this ranking to be run in parallel on n processing units. Furthermore, we adapt the presolving schedules obtained by aspeed to be effective in a parallel setting with different time budgets for each processing unit. Our empirical results demonstrate that, using 4 processing units, the best of our methods achieves a 12-fold average speedup over the best single solver on a broad set of challenging scenarios from the algorithm selection library.
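The selection procedure described above is straightforward to sketch: score every candidate with a per-instance performance model, rank, and take the top n. The model below is a stub returning made-up predicted runtimes, and the solver names are placeholders rather than the paper's actual portfolio.

```python
def predicted_runtimes(instance_features):
    # Stand-in for a learned per-instance model (as in SATzilla, ISAC, etc.);
    # in reality the predictions would depend on instance_features.
    return {"lingeling": 40.0, "glucose": 12.0,
            "minisat": 95.0, "cryptominisat": 18.0}

def parallel_portfolio(instance_features, n_units):
    """Rank candidates by predicted runtime; run the top n_units in parallel."""
    preds = predicted_runtimes(instance_features)
    ranking = sorted(preds, key=preds.get)   # fastest predicted first
    return ranking[:n_units]

portfolio = parallel_portfolio({}, 2)
```

A ranking tolerates model error better than a single pick: the instance is solved as soon as any of the n selected algorithms finishes, so a mis-ranked best algorithm still makes it into the portfolio as long as it is among the top n.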
Predicting Performance of OWL Reasoners: Locally or Globally?
Abstract

Cited by 1 (1 self)
We propose a novel approach to performance prediction of OWL reasoners. The existing strategies take a view of an entire ontology corpus: they extract multiple features from the ontologies in the corpus and use them for training machine learning models. We call these global approaches. In contrast, our approach is a local one: it examines a single ontology (independent of any corpus), selects suitable, small ontology subsets, and extrapolates their performance measurements to the whole ontology. Our results show that this simple idea leads to accurate performance predictions, comparable or superior to global approaches. Our second contribution concerns ontology features: we are the first to investigate intercorrelation of ontology features using Principal Component Analysis (PCA). We report that extracting multiple features as global approaches do makes surprisingly little difference for performance prediction. In fact, it turns out that the ontologies in two major corpora basically only differ in one or two features.
Improved Features for Runtime Prediction of Domain-Independent Planners
Abstract

Cited by 1 (0 self)
State-of-the-art planners often exhibit substantial runtime variation, making it useful to be able to efficiently predict how long a given planner will take to run on a given instance. In other areas of AI, such needs are met by building so-called empirical performance models (EPMs), statistical models derived from sets of problem instances and performance observations. Historically, such models have been less accurate for predicting the running times of planners. A key hurdle has been a relative weakness in instance features for characterizing instances: mappings from problem instances to real numbers that serve as the starting point for learning an EPM. We propose a new, extensive set of instance features for planning, and investigate its effectiveness across a range of model families. We built EPMs for various prominent planning systems on several thousand benchmark problems from the planning literature and from IPC benchmark sets, and conclude that our models predict runtime much more accurately than the previous state of the art. We also study the relative importance of these features.
Reinforcement Learning for Automatic Online Algorithm Selection: an Empirical Study
Abstract
Abstract: In this paper a reinforcement learning methodology for automatic online algorithm selection is introduced and empirically tested. It is applicable to automatic algorithm selection methods that predict the performance of each available algorithm and then pick the best one. The experiments confirm the usefulness of the methodology: using online data results in better performance. As in many online learning settings, an exploration vs. exploitation tradeoff (synonymously, a learning vs. earning tradeoff) is incurred. Empirically investigating the quality of classic solution strategies for handling this tradeoff in the automatic online algorithm selection setting is the secondary goal of this paper. The automatic online algorithm selection problem can be modelled as a contextual multi-armed bandit problem. Two classic strategies for solving this problem are tested in the context of automatic online algorithm selection: ε-greedy and lower confidence bound. The experiments show that a simple, purely exploitative greedy strategy outperforms strategies explicitly performing exploration.
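A minimal, non-contextual version of the ε-greedy selector described above can be sketched as follows. The three algorithms and their reward distributions are invented, and the sketch drops the instance context for brevity; in a real selector the reward would be a measured runtime or solution quality conditioned on instance features.

```python
import random

random.seed(0)  # deterministic for this sketch

# Hypothetical per-algorithm performance: a fixed mean plus noise.
def reward(algo):
    means = {"A": 0.4, "B": 0.7, "C": 0.5}
    return means[algo] + random.uniform(-0.1, 0.1)

eps = 0.1
totals, counts = {}, {}
for a in "ABC":            # initialization: try each algorithm once
    counts[a] = 1
    totals[a] = reward(a)

for _ in range(500):
    if random.random() < eps:
        algo = random.choice("ABC")   # explore: uniform random pick
    else:                             # exploit: best running mean so far
        algo = max(counts, key=lambda a: totals[a] / counts[a])
    counts[algo] += 1
    totals[algo] += reward(algo)
```

Under this reward model the selector quickly concentrates on algorithm B, and with ε = 0.1 roughly 90% of the steps are exploitative, which is consistent with the paper's finding that little explicit exploration is needed in this setting.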
Executive Summary (DOI 10.4230/DagRep.1.5.61)
Abstract
This report documents the programme and the outcomes of Dagstuhl Seminar 11201 "Constraint Programming meets Machine Learning and Data Mining". Our starting point in this seminar was that machine learning and data mining have so far developed largely independently from constraint programming, but that it is increasingly becoming clear that there are many opportunities for interactions between these areas: on the one hand, data mining and machine learning can be used to improve constraint solving; on the other hand, constraint solving can be used in data mining and machine learning. This seminar brought together prominent researchers from both communities to discuss these opportunities.
Embedding Decision Trees and Random Forests in Constraint Programming
Abstract
Abstract. In past papers, we have introduced Empirical Model Learning (EML) as a method to enable Combinatorial Optimization on real-world systems that are impervious to classical modeling approaches. The core idea in EML consists in embedding a Machine Learning model in a traditional combinatorial model. So far, the method has been demonstrated by using Neural Networks and Constraint Programming (CP). In this paper we add one more technique to the EML arsenal, by devising methods to embed Decision Trees (DTs) in CP. In particular, we propose three approaches: 1) a simple encoding based on meta-constraints; 2) a method using attribute discretization and a global table constraint; 3) an approach based on converting a DT into a Multi-valued Decision Diagram, which is then fed to an mdd constraint. We finally show how to embed in CP a Random Forest, a powerful type of ensemble classifier based on DTs. The proposed methods are compared in an experimental evaluation, highlighting their strengths and their weaknesses.
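The first (meta-constraint) encoding can be sketched outside any CP system: enumerate the tree's root-to-leaf paths and turn each into an implication "conjunction of branching tests => leaf class", which is what would then be posted to the solver as reified constraints. The tiny tree below is hand-built and purely illustrative; a real encoding would start from a learned tree.

```python
# A decision tree as nested tuples (test, left_subtree, right_subtree);
# a bare string is a leaf carrying the predicted class.
tree = ("x1 <= 3",
        ("x2 <= 5", "classA", "classB"),   # left child of the root test
        "classC")                          # right child: a leaf

def paths(node, conds=()):
    """Enumerate (branching-conditions, leaf-class) pairs, one per leaf."""
    if isinstance(node, str):              # leaf: emit one implication
        return [(conds, node)]
    test, left, right = node
    return (paths(left, conds + (test,)) +
            paths(right, conds + ("not(" + test + ")",)))

constraints = paths(tree)
# e.g. the first entry, (('x1 <= 3', 'x2 <= 5'), 'classA'), reads as
#   x1 <= 3 AND x2 <= 5  =>  class = classA
```

Since exactly one path is satisfied by any assignment of the input variables, posting all of these implications fixes the class variable, which is precisely what embedding the tree in a combinatorial model requires.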