Results 1-10 of 35
Selection of relevant features and examples in machine learning
 ARTIFICIAL INTELLIGENCE
, 1997
Cited by 590 (2 self)
In this survey, we review work in machine learning on methods for handling data sets containing large amounts of irrelevant information. We focus on two key issues: the problem of selecting relevant features, and the problem of selecting relevant examples. We describe the advances that have been made on these topics in both empirical and theoretical work in machine learning, and we present a general framework that we use to compare different methods. We close with some challenges for future work in this area.
Exact Sampling with Coupled Markov Chains and Applications to Statistical Mechanics
, 1996
Cited by 548 (13 self)
For many applications it is useful to sample from a finite set of objects in accordance with some particular distribution. One approach is to run an ergodic (i.e., irreducible aperiodic) Markov chain whose stationary distribution is the desired distribution on this set; after the Markov chain has run for M steps, with M sufficiently large, the distribution governing the state of the chain approximates the desired distribution. Unfortunately it can be difficult to determine how large M needs to be. We describe a simple variant of this method that determines on its own when to stop, and that outputs samples in exact accordance with the desired distribution. The method uses couplings, which have also played a role in other sampling schemes; however, rather than running the coupled chains from the present into the future, one runs from a distant point in the past up until the present, where the distance into the past that one needs to go is determined during the running of the al...
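The idea in the abstract above (run coupled chains from further and further in the past until they agree at the present) can be illustrated with a small sketch. This is not the authors' code: the toy chain (a lazy reflecting walk on {0, ..., n-1}, whose stationary distribution is uniform), the monotone update rule, and the names `update` and `cftp_sample` are all assumptions chosen for illustration.

```python
import random


def update(state, u, n):
    """Monotone update rule: step down if u < 0.5, else step up, clamped to [0, n-1]."""
    if u < 0.5:
        return max(state - 1, 0)
    return min(state + 1, n - 1)


def cftp_sample(n, rng):
    """Draw one state of the toy chain by coupling from the past.

    randomness[k] drives the step from time -(k+1) to time -k, so draws
    already made for times near the present are reused when we restart
    from further back in the past.
    """
    randomness = []
    horizon = 1
    while True:
        # Generate fresh randomness only for the newly added, earlier times.
        while len(randomness) < horizon:
            randomness.append(rng.random())
        low, high = 0, n - 1  # bottom and top states sandwich every other start
        for k in range(horizon - 1, -1, -1):  # from time -horizon up to time 0
            u = randomness[k]
            low = update(low, u, n)
            high = update(high, u, n)
        if low == high:  # all starting states have coalesced by time 0
            return low
        horizon *= 2  # otherwise start twice as far back


rng = random.Random(0)
draws = [cftp_sample(4, rng) for _ in range(2000)]
```

Because the update rule preserves order, tracking only the lowest and highest starting states is enough to detect coalescence of all states in this toy example.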
Generating Random Spanning Trees More Quickly than the Cover Time
 PROCEEDINGS OF THE TWENTY-EIGHTH ANNUAL ACM SYMPOSIUM ON THE THEORY OF COMPUTING
, 1996
Isoperimetric Problems for Convex Bodies and a Localization Lemma
, 1995
Cited by 129 (7 self)
We study the smallest number $\psi(K)$ such that a given convex body $K$ in $\mathbb{R}^n$ can be cut into two parts $K_1$ and $K_2$ by a surface with an $(n-1)$-dimensional measure $\psi(K)\,\mathrm{vol}(K_1) \cdot \mathrm{vol}(K_2)/\mathrm{vol}(K)$. Let $M_1(K)$ be the average distance of a point of $K$ from its center of gravity. We prove for the "isoperimetric coefficient" that $\psi(K) \ge \ln 2 / M_1(K)$, and give other upper and lower bounds. We conjecture that our upper bound is best possible up to a constant. Our main tool is a general "Localization Lemma" that reduces integral inequalities over the $n$-dimensional space to integral inequalities in a single variable. This lemma was first proved by two of the authors in an earlier paper, but here we give various extensions and variants that make its application smoother. We illustrate the usefulness of the lemma by showing how a number of well-known results can be proved using it.
Hit-and-Run Mixes Fast
 Math. Prog
, 1998
Cited by 66 (6 self)
It is shown that the "hit-and-run" algorithm for sampling from a convex body $K$ (introduced by R.L. Smith) mixes in time $O^*(n^2 R^2/r^2)$, where $R$ and $r$ are the radii of the circumscribed and inscribed balls of $K$, respectively. Thus after appropriate preprocessing, hit-and-run produces an approximately uniformly distributed sample point in time $O^*(n^3)$, which matches the best known bound for other sampling algorithms. We show that the bound is best possible in terms of $R$, $r$ and $n$. 1. Introduction. There are many computational tasks that require sampling from a convex body $K$ in a high-dimensional space $\mathbb{R}^n$ (i.e., generating an approximately uniformly distributed random point in $K$). The generic method to do so is to define an ergodic random walk on the points of $K$ whose stationary distribution is uniform, and follow this random walk for an appropriately large number of steps; the point obtained this way will be approximately stationary, i.e., approximately uniform. The crucial issue is t...
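For readers unfamiliar with the walk itself, here is a minimal sketch of one common form of hit-and-run: pick a uniformly random direction, find the chord of the body through the current point in that direction, and move to a uniform point on that chord. The sketch assumes the body is a bounded polytope written as A x <= b so the chord can be computed in closed form; that representation, the helper names, and the square used in the example are illustrative assumptions, not taken from the paper.

```python
import math
import random


def random_direction(dim, rng):
    """Uniform random unit vector, obtained by normalising a Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def chord(A, b, x, d):
    """Return (t_min, t_max) such that x + t*d satisfies A y <= b for t in that range."""
    t_min, t_max = -math.inf, math.inf
    for row, bound in zip(A, b):
        slope = sum(a * di for a, di in zip(row, d))
        slack = bound - sum(a * xi for a, xi in zip(row, x))  # >= 0 inside the body
        if slope > 1e-12:
            t_max = min(t_max, slack / slope)
        elif slope < -1e-12:
            t_min = max(t_min, slack / slope)
    return t_min, t_max


def hit_and_run(A, b, start, steps, rng):
    """Run `steps` hit-and-run moves from `start` (assumed to lie inside the body)."""
    x = list(start)
    for _ in range(steps):
        d = random_direction(len(x), rng)
        t_min, t_max = chord(A, b, x, d)
        t = rng.uniform(t_min, t_max)  # uniform point on the chord
        x = [xi + t * di for xi, di in zip(x, d)]
    return x


# Example body: the square [-1, 1]^2 written as four linear inequalities.
A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b = [1.0, 1.0, 1.0, 1.0]
point = hit_and_run(A, b, [0.0, 0.0], 100, random.Random(0))
```

How many steps are needed for the point to be close to uniform is exactly the mixing-time question the abstract addresses; the step count of 100 here is arbitrary.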
Fast algorithms for logconcave functions: sampling, rounding, integration and optimization
 Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science
, 2006
Cited by 44 (12 self)
We prove that the hit-and-run random walk is rapidly mixing for an arbitrary logconcave distribution starting from any point in the support. This extends the work of [26], where this was shown for an important special case, and settles the main conjecture formulated there. From this result, we derive asymptotically faster algorithms in the general oracle model for sampling, rounding, integration and maximization of logconcave functions, improving or generalizing the main results of [24, 25, 1] and [16] respectively. The algorithms for integration and optimization both use sampling and are surprisingly similar.
Geometric random walks: a survey
 Combinatorial and Computational Geometry
, 2005
Cited by 43 (5 self)
The developing theory of geometric random walks is outlined here. Three aspects are discussed: general methods for estimating convergence (the "mixing" rate), isoperimetric inequalities in $\mathbb{R}^n$ and their intimate connection to random walks, and algorithms for fundamental problems (volume computation and convex optimization) that are based on sampling by random walks.
Efficient algorithms for universal portfolios
 Proceedings of the 41st Annual Symposium on the Foundations of Computer Science
, 2000
Cited by 41 (8 self)
A constant rebalanced portfolio is an investment strategy that keeps the same distribution of wealth among a set of stocks from day to day. There has been much work on Cover's Universal algorithm, which is competitive with the best constant rebalanced portfolio determined in hindsight (3, 9, 2, 8, 16, 4, 5, 6). While this algorithm has good performance guarantees, all known implementations are exponential in the number of stocks, restricting the number of stocks used in experiments (9, 4, 2, 5, 6). We present an efficient implementation of the Universal algorithm that is based on nonuniform random walks that are rapidly mixing (1, 14, 7). This same implementation also works for nonfinancial applications of the Universal algorithm, such as data compression (6) and language modeling (11).
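To make the notion of a constant rebalanced portfolio concrete, the sketch below computes the wealth it achieves on a sequence of daily price relatives (closing price divided by the previous close). It does not implement the Universal algorithm or the random-walk sampler from the paper; the function name `crp_wealth` and the two-asset toy market are assumptions made for illustration.

```python
def crp_wealth(portfolio, price_relatives):
    """Wealth multiplier of a constant rebalanced portfolio.

    portfolio: fractions of wealth held in each asset (non-negative, summing to 1).
    price_relatives: one tuple per day giving each asset's price ratio for that day.
    Rebalancing back to `portfolio` every day makes the daily growth factor
    the dot product of the portfolio with that day's price relatives.
    """
    wealth = 1.0
    for relatives in price_relatives:
        wealth *= sum(w * r for w, r in zip(portfolio, relatives))
    return wealth


# Toy market: asset 0 is flat, asset 1 alternately doubles and halves.
market = [(1.0, 2.0), (1.0, 0.5)] * 10

hold_flat = crp_wealth((1.0, 0.0), market)      # all wealth in the flat asset
hold_volatile = crp_wealth((0.0, 1.0), market)  # all wealth in the volatile asset
rebalanced = crp_wealth((0.5, 0.5), market)     # rebalance to 50/50 every day
```

In this toy market neither asset gains anything on its own, while the 50/50 rebalanced portfolio grows by a factor of 1.5 * 0.75 = 1.125 every two days. That gap is why competing with the best constant rebalanced portfolio in hindsight is a meaningful benchmark.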
On Numerical Solution of the Maximum Volume Ellipsoid Problem
 SIAM JOURNAL ON OPTIMIZATION
, 2001
Cited by 34 (1 self)
In this paper we study practical solution methods for finding the maximum-volume ellipsoid inscribing a given full-dimensional polytope in $\mathbb{R}^n$ defined by a finite set of linear inequalities. Our goal is to design a general-purpose algorithmic framework that is reliable and efficient in practice. To evaluate the merit of a practical algorithm, we consider two key factors: the computational cost per iteration and the typical number of iterations required for convergence. In addition, numerical stability is also an important factor. We investigate some new formulations upon which we build primal-dual type, interior-point algorithms, and we provide theoretical justifications for the proposed formulations and algorithmic framework. Extensive numerical experiments have shown that one of the new algorithms should be the method of choice among the tested algorithms.