Results 1–10 of 106
Segmentation of the mean of heteroscedastic data via cross-validation
, 2010
Abstract

Cited by 9 (4 self)
This paper tackles the problem of detecting abrupt changes in the mean of a heteroscedastic signal by model selection, without prior knowledge of how the noise varies. A new family of change-point detection procedures is proposed, showing that cross-validation methods can be successful in the heteroscedastic framework, whereas most existing procedures are not robust to heteroscedasticity. The robustness to heteroscedasticity of the proposed procedures is supported by an extensive simulation study, together with recent theoretical results. An application to Comparative Genomic Hybridization (CGH) data is provided, showing that robustness to heteroscedasticity can indeed be required for the analysis of such data.
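The core idea — scoring candidate segmentations on held-out data instead of with a variance-dependent penalty — can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: it uses a simple even/odd split, a dynamic-programming segmenter, and squared held-out error, and the helper names `segment` and `cv_choose_k` are invented here.

```python
import numpy as np

def segment(y, k):
    """Best piecewise-constant fit with k segments, by dynamic programming
    over within-segment sums of squared errors. Returns the k segment-end
    indices (the last one is len(y))."""
    n = len(y)
    c1 = np.concatenate(([0.0], np.cumsum(y)))
    c2 = np.concatenate(([0.0], np.cumsum(y * y)))
    sse = lambda i, j: c2[j] - c2[i] - (c1[j] - c1[i]) ** 2 / (j - i)
    D = np.full((k + 1, n + 1), np.inf)  # D[s, j]: best cost of y[:j] in s parts
    D[0, 0] = 0.0
    arg = np.zeros((k + 1, n + 1), dtype=int)
    for s in range(1, k + 1):
        for j in range(s, n + 1):
            costs = [D[s - 1, i] + sse(i, j) for i in range(s - 1, j)]
            best = int(np.argmin(costs))
            D[s, j], arg[s, j] = costs[best], best + s - 1
    ends, j = [], n
    for s in range(k, 0, -1):
        ends.append(j)
        j = arg[s, j]
    return sorted(ends)

def cv_choose_k(y, kmax=8):
    """Choose the number of segments by a 2-fold even/odd split: fit on
    even-indexed points, score squared error on odd-indexed points, and
    keep the k with the smallest held-out error."""
    train, test = y[::2], y[1::2]
    best_k, best_err = 1, np.inf
    for k in range(1, kmax + 1):
        prev, err = 0, 0.0
        for end in segment(train, k):
            mu = train[prev:end].mean()
            err += ((test[prev:end] - mu) ** 2).sum()
            prev = end
        if err < best_err:
            best_k, best_err = k, err
    return best_k
```

The held-out score never requires a variance estimate, which is, loosely, why cross-validation can remain usable when the noise level changes along the signal.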
Lateen EM: Unsupervised training with multiple objectives, applied to dependency grammar induction
 In Proceedings of EMNLP
, 2011
Abstract

Cited by 8 (3 self)
We present new training methods that aim to mitigate local optima and slow convergence in unsupervised training by using additional imperfect objectives. In its simplest form, lateen EM alternates between the two objectives of the ordinary “soft” and “hard” expectation maximization (EM) algorithms. Switching objectives when stuck can help escape local optima. We find that applying a single such alternation already yields state-of-the-art results for English dependency grammar induction. More elaborate lateen strategies track both objectives, with each validating the moves proposed by the other. Disagreements can signal earlier opportunities to switch or terminate, saving iterations. De-emphasizing fixed points in these ways eliminates some guesswork from tuning EM. An evaluation against a suite of unsupervised dependency parsing tasks, for a variety of languages, showed that lateen strategies significantly speed up training of both EM algorithms and improve accuracy for hard EM.
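The "simple lateen EM" alternation is easy to state concretely. As a hedged toy illustration — a two-component 1-D Gaussian mixture rather than grammar induction, with the names `em_step` and `lateen_em` invented here — soft EM uses posterior responsibilities, hard EM rounds them, and the driver alternates objectives, running each to (near) convergence:

```python
import numpy as np

def em_step(x, mu, hard=False):
    """One EM step for a two-component 1-D Gaussian mixture with unit
    variances and equal weights. `hard=True` rounds the responsibilities,
    giving the 'hard' (Viterbi-style) objective instead of the 'soft' one."""
    d0, d1 = (x - mu[0]) ** 2, (x - mu[1]) ** 2
    z = np.clip((d1 - d0) / 2.0, -500.0, 500.0)  # clip to avoid exp overflow
    r = 1.0 / (1.0 + np.exp(z))                  # P(component 1 | x)
    if hard:
        r = (r > 0.5).astype(float)
    # M-step: responsibility-weighted means
    return np.array([((1 - r) * x).sum() / max((1 - r).sum(), 1e-12),
                     (r * x).sum() / max(r.sum(), 1e-12)])

def lateen_em(x, mu0, sweeps=4, iters=100, tol=1e-9):
    """Simple lateen EM: run one objective until it stops moving, then
    switch to the other, alternating soft, hard, soft, ..."""
    mu = np.array(mu0, dtype=float)
    for sweep in range(sweeps):
        hard = bool(sweep % 2)
        for _ in range(iters):
            new = em_step(x, mu, hard=hard)
            if np.abs(new - mu).max() < tol:
                break  # stuck on this objective; switch at the next sweep
            mu = new
    return np.sort(mu)
```

The switch itself is the point: when one objective reaches a fixed point, the other objective generally still proposes a move, which is what lets the alternation escape some local optima.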
On correlation and budget constraints in model-based bandit optimization
with application to automatic machine learning
Stochastic Modeling of Soft Magnetic Properties of Electrical Steels: Application to Stators of Electrical Machines
 IEEE Trans. Magn.
Abstract

Cited by 7 (2 self)
To account for the uncertainties introduced into the soft magnetic material properties (magnetic behavior law, iron losses) during the manufacturing process, the present work deals with the stochastic modeling of the magnetic behavior law B-H and the iron losses of a claw-pole stator generator. Twenty-eight (28) samples of slinky stator (SS) coming from the same production chain have been investigated. The approaches used are similar to those used in mechanics. The accuracy of existing anhysteretic models was first tested using cross-validation techniques. The well-known iron-loss separation model was implemented to take into account the variability of the losses. Then, a multivariate Gaussian distribution was chosen to model the variability of, and dependencies between, the identified parameters, for both the behavior-law and iron-loss models. The developed stochastic models allow predicting a 98 % confidence interval for the considered samples. Index Terms — slinky stator, magnetic behavior law, iron losses, variability, stochastic model
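The final statistical step — fitting a multivariate Gaussian to the parameters identified across the samples and reading off a 98 % interval — can be sketched generically. This is a hedged illustration of the recipe only, with no magnetics; `fit_gaussian` and `interval_98` are names chosen here, not the paper's.

```python
import numpy as np

def fit_gaussian(samples):
    """Fit a multivariate Gaussian to identified parameters
    (rows = samples, columns = parameters)."""
    mu = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)
    return mu, cov

def interval_98(samples, n_draws=100_000, seed=0):
    """Per-parameter 98% interval implied by the fitted Gaussian,
    taken as the 1st and 99th percentiles of Monte Carlo draws."""
    mu, cov = fit_gaussian(samples)
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(mu, cov, size=n_draws)
    return np.percentile(draws, [1, 99], axis=0)
```

The Monte Carlo route is deliberately generic: it also covers intervals on model outputs, where the Gaussian parameters must first be pushed through the behavior-law or loss model.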
An empirical evaluation of portfolio approaches for solving CSPs
 In Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems
, 2013
Abstract

Cited by 6 (3 self)
Recent research in areas such as SAT solving and Integer Linear Programming has shown that a single, arbitrarily efficient solver can be significantly outperformed by a portfolio of possibly slower, on-average solvers. We report an empirical evaluation and comparison of portfolio approaches applied to Constraint Satisfaction Problems (CSPs). We compared models built on top of off-the-shelf machine learning algorithms against approaches used in the SAT field and adapted for CSPs, considering different portfolio sizes and using as evaluation metrics the number of solved problems and the time taken to solve them. Results indicate that the best SAT approaches also have top performance in the CSP field and are slightly more competitive than simple models built on top of classification algorithms.
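The portfolio effect the paper measures — complementary solvers collectively beating the single best solver on instances solved — can be shown on a toy runtime matrix. A hedged sketch with invented names; the paper's actual models additionally predict which solver to run per instance from features:

```python
import numpy as np

def single_best_solved(runtimes, timeout):
    """Instances solved within `timeout` by the best single solver.
    `runtimes[i, s]` is solver s's runtime on instance i."""
    return int((runtimes <= timeout).sum(axis=0).max())

def virtual_best_solved(runtimes, timeout):
    """Upper bound for any per-instance selection portfolio: an instance
    counts as solved if at least one solver finishes in time."""
    return int((runtimes.min(axis=1) <= timeout).sum())
```

Any learned per-instance selector lands between these two numbers, which is why the gap between them is the usual motivation for building a portfolio at all.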
Penalized Sieve Estimation and Inference of Semi-Nonparametric Dynamic Models: A Selective Review
, 2011
Abstract

Cited by 5 (2 self)
In this selective review, we first provide some empirical examples that motivate the usefulness of semi-nonparametric techniques in modelling economic and financial time series. We describe popular classes of semi-nonparametric dynamic models and some temporal dependence properties. We then present penalized sieve extremum (PSE) estimation as a general method for semi-nonparametric models with cross-sectional, panel, time series, or spatial data. The method is especially powerful in estimating difficult ill-posed inverse problems such as semi-nonparametric mixtures or conditional moment restrictions. We review recent advances on inference and large sample properties of the PSE estimators, which include (1) consistency and convergence rates of the PSE estimator of the nonparametric part; (2) limiting distributions of plug-in PSE estimators of functionals that are either smooth (i.e., root-n estimable) or non-smooth (i.e., slower than root-n estimable); (3) simple criterion-based inference for plug-in PSE estimation of smooth or non-smooth functionals; and (4) root-n asymptotic normality of semiparametric two-step estimators and their consistent variance estimators. Examples from dynamic asset pricing, nonlinear spatial VAR, semiparametric GARCH,
The Impact of Unlabeled Patterns in Rademacher Complexity Theory for Kernel Classifiers
Abstract

Cited by 2 (1 self)
We derive new generalization bounds, based on Rademacher complexity theory, for the model selection and error estimation of linear (kernel) classifiers, which exploit the availability of unlabeled samples. In particular, two results are obtained: the first shows that, using the unlabeled samples, the confidence term of the conventional bound can be reduced by a factor of three; the second shows that the unlabeled samples can be used to obtain much tighter bounds, by building localized versions of the hypothesis class containing the optimal classifier.
Compressive sampling of polynomial chaos expansions: convergence analysis and sampling strategies
 Journal of Computational Physics, in press, available online
Generalised Density Forecast Combinations
, 2013
Abstract

Cited by 2 (0 self)
Density forecast combinations are becoming increasingly popular as a means of improving forecast ‘accuracy’, as measured by a scoring rule. In this paper we generalise this literature by letting the combination weights follow more general schemes. Sieve estimation is used to optimise the score of the generalised density combination where the combination weights depend on the variable one is trying to forecast. Specific attention is paid to the use of piecewise linear weight functions that let the weights vary by region of the density. We analyse these schemes theoretically, in Monte Carlo experiments and in an empirical study. Our results show that the generalised combinations outperform their linear counterparts.
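A minimal sketch of an outcome-dependent, piecewise-linear weight scheme, as opposed to a fixed linear pool. This is hedged: the paper estimates the weight function by sieve methods and handles the normalisation issues that outcome-dependent weights create, both of which this toy evaluation omits; `generalized_pool` is a name invented here.

```python
import numpy as np

def generalized_pool(y, p1, p2, knots, w_at_knots):
    """Evaluate a two-forecast density combination whose weight on p1 is a
    piecewise-linear function of the outcome y, linearly interpolated
    between `knots`. Note: with outcome-dependent weights the combination
    need not integrate to one; proper estimation renormalises (omitted)."""
    w = np.clip(np.interp(y, knots, w_at_knots), 0.0, 1.0)
    return w * p1(y) + (1.0 - w) * p2(y)
```

With constant knot weights of 0.5 this collapses to the equal-weight linear pool; letting the knot weights differ is exactly what allows one forecast to dominate in, say, the left tail and the other in the right.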
Optimal data split methodology for model validation
 Proceedings of the World Congress on Engineering and Computer Science 2011, Vol. II, WCECS 2011
, 2011
Abstract

Cited by 2 (1 self)
The decision to incorporate cross-validation into the validation process of a mathematical model raises an immediate question: how should one partition the data into calibration and validation sets? We answer this question systematically: we present an algorithm to find the optimal partition of the data subject to certain constraints. While doing this, we address two critical issues: 1) that the model be evaluated with respect to predictions of a given quantity of interest and its ability to reproduce the data, and 2) that the model be highly challenged by the validation set, assuming it is properly informed by the calibration set. This framework also relies on the interaction between the experimentalist and/or modeler, who understand the physical system and the limitations of the model; the decision-maker, who understands and can quantify the cost of model failure; and the computational scientists, who strive to determine whether the model satisfies both the modeler’s and the decision-maker’s requirements. We also note that our framework is quite general and may be applied to a wide range of problems. Here, we illustrate it through a specific example involving a data reduction model for an ICCD camera from a shock-tube experiment located at the NASA Ames Research Center (ARC). Index Terms—Model validation, quantity of interest, Bayesian inference
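The question posed here — which partition into calibration and validation sets? — can be caricatured with a tiny search. This sketch is not the authors' algorithm: the name `choose_split` and the "challenge" metric (distance from each validation point to its nearest calibration point, an extrapolation proxy) are assumptions, and the constraint is simply that the calibration set span the observed input range.

```python
import numpy as np

def choose_split(X, frac_val=0.3, n_candidates=200, seed=0):
    """Among random partitions of the 1-D inputs X, return the
    (calibration, validation) index pair whose validation set most
    'challenges' the model, scored as the mean distance from each
    validation point to its nearest calibration point, subject to the
    calibration set covering the observed input range."""
    rng = np.random.default_rng(seed)
    n = len(X)
    n_val = int(frac_val * n)
    best, best_score = None, -np.inf
    for _ in range(n_candidates):
        idx = rng.permutation(n)
        val, cal = idx[:n_val], idx[n_val:]
        if X[cal].min() > X.min() or X[cal].max() < X.max():
            continue  # calibration must be informative over the full range
        d = np.abs(X[val][:, None] - X[cal][None, :]).min(axis=1)
        score = d.mean()
        if score > best_score:
            best, best_score = (cal, val), score
    return best
```

Replacing the score with a quantity-of-interest prediction error, and the range constraint with a calibration-adequacy check, recovers the spirit of the framework described above while keeping the same search skeleton.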