Results

### Simple Complexity Analysis of Simplified Direct Search

2014

Abstract

We consider the problem of unconstrained minimization of a smooth function in the derivative-free setting. In particular, we propose and study a simplified variant of the direct search method (of directional type), which we call simplified direct search (SDS). Unlike standard direct search methods, which depend on a large number of parameters that need to be tuned, SDS depends on a single scalar parameter only. Despite relevant research activity in direct search methods spanning several decades, complexity guarantees (bounds on the number of function evaluations needed to find an approximate solution) were not established until very recently. In this paper we give a surprisingly brief and unified analysis of SDS for nonconvex, convex and strongly convex functions. We match the existing complexity results for direct search in their dependence on the problem dimension (n) and error tolerance (ε), but the overall bounds are simpler, easier to interpret, and have better dependence on other problem parameters. In particular, we show that for the set of directions formed by the standard coordinate vectors and their negatives, the number of function evaluations needed to find an ε-solution is O(n²/ε) (resp. O(n² log(1/ε))) for the problem of minimizing a convex (resp. strongly convex) smooth function. In the nonconvex smooth case, the bound is O(n²/ε²), with the goal being the reduction of the norm of the gradient below ε.
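A direct search of this flavor can be sketched as follows. This is a minimal illustration only: the simple-decrease acceptance test and the stepsize-halving schedule are assumptions here, not the paper's exact SDS acceptance rule or constants.

```python
import numpy as np

def simplified_direct_search(f, x0, alpha0=1.0, tol=1e-6, max_evals=10000):
    """Coordinate direct search sketch: probe +/- e_i steps of size
    alpha, accept any decrease, and halve alpha when no direction
    improves. A single scalar parameter (alpha0) controls the run."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    alpha = alpha0
    fx = f(x)
    evals = 1
    while alpha > tol and evals < max_evals:
        improved = False
        for i in range(n):
            for sign in (+1.0, -1.0):
                trial = x.copy()
                trial[i] += sign * alpha
                ft = f(trial)
                evals += 1
                if ft < fx:            # accept simple decrease
                    x, fx = trial, ft
                    improved = True
                    break
            if improved:
                break
        if not improved:
            alpha *= 0.5               # shrink when a full sweep fails
    return x, fx
```

Each failed sweep over the 2n coordinate directions costs 2n evaluations before the stepsize halves, which is the source of the n² factor in the bounds above.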

### Collaborative 20 Questions for Target Localization

Abstract

We consider the problem of 20 questions with noise for multiple players under the minimum entropy criterion [1] in the setting of stochastic search, with application to target localization. Each player yields a noisy response to a binary query governed by a certain error probability. First, we propose a sequential policy for constructing questions that queries each player in sequence and refines the posterior of the target location. Second, we consider a joint policy that asks all players questions in parallel at each time instant, and we characterize the structure of the optimal policy for constructing the sequence of questions. This generalizes the single-player probabilistic bisection method [1], [2] for stochastic search problems. Third, we prove an equivalence between the two schemes, showing that, despite the fact that the sequential scheme has access to a more refined filtration, the joint scheme performs just as well on average. Fourth, we establish convergence rates of the mean-square error (MSE) and derive error exponents. Lastly, we obtain an extension to the case of unknown error probabilities. This framework provides a mathematical model for incorporating a human in the loop for active machine learning systems.

Index Terms: optimal query selection, human-machine interaction, target localization, convergence rate, minimum entropy, human-aided decision making.
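The single-player probabilistic bisection method that the joint policy generalizes can be sketched on a discretized interval. The `respond` callback and the grid discretization are assumptions for illustration; the method queries the posterior median and tilts the posterior toward the side the (possibly noisy) answer indicates.

```python
import numpy as np

def probabilistic_bisection(respond, p=0.8, grid_size=1000, n_queries=60):
    """Probabilistic bisection sketch on [0, 1]. respond(q) answers the
    binary query "is the target above q?" correctly with probability p
    (p is assumed known here)."""
    grid = np.linspace(0.0, 1.0, grid_size)
    post = np.full(grid_size, 1.0 / grid_size)           # uniform prior
    for _ in range(n_queries):
        q = grid[np.searchsorted(np.cumsum(post), 0.5)]  # posterior median
        ans = respond(q)
        agree = (grid > q) == ans      # points consistent with the answer
        post = post * np.where(agree, p, 1.0 - p)
        post /= post.sum()             # renormalize the posterior
    return grid[np.argmax(post)]
```

Querying the median guarantees each answer carries maximal information about the target, which underlies the convergence-rate results cited above.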

### Parallel Distributed Block Coordinate Descent Methods based on Pairwise Comparison Oracle

Abstract

This paper provides a block coordinate descent algorithm to solve unconstrained optimization problems. In our algorithm, computation of function values or gradients is not required. Instead, pairwise comparison of function values is used. Our algorithm consists of two steps: the direction estimate step and the search step. Both steps require only pairwise comparisons of function values, which tell us only the order of the function values at two points. In the direction estimate step, a Newton-type search direction is estimated using a computation scheme like that of block coordinate descent methods, based on the pairwise comparison. In the search step, the numerical solution is updated along the estimated direction. The computation in the direction estimate step can be easily parallelized, and thus the algorithm works efficiently to find the minimizer of the objective function. We also prove an upper bound on the convergence rate. In numerical experiments, we show that our method finds the optimal solution more efficiently than some existing methods based on pairwise comparison.
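One possible reading of a comparison-oracle coordinate scheme is sketched below. This is illustrative only and simpler than the paper's method: it uses plain sign probes per coordinate rather than a Newton-type direction estimate, and assumes a noiseless comparison oracle.

```python
import numpy as np

def comparison_coordinate_descent(better, x0, delta=1e-3, step=0.5,
                                  n_rounds=100, min_step=1e-8):
    """Comparison-oracle coordinate descent sketch. better(a, b) is the
    pairwise comparison oracle: True iff f(a) < f(b). Probing +/- delta
    picks a descent sign per coordinate; backtracking over halved steps
    then moves along it, using comparisons only."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    for _ in range(n_rounds):
        for i in range(n):
            e = np.zeros(n)
            e[i] = 1.0
            # descent sign from a single comparison of two probes
            sign = 1.0 if better(x + delta * e, x - delta * e) else -1.0
            t = step
            while t > min_step:
                cand = x + sign * t * e
                if better(cand, x):    # accept the first improving step
                    x = cand
                    break
                t *= 0.5
    return x
```

The per-coordinate probes are independent of one another, which is what makes the direction estimate step easy to parallelize.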

### Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem

Abstract

This paper proposes a new method for the K-armed dueling bandit problem, a variation on the regular K-armed bandit problem that offers only relative feedback about pairs of arms. Our approach extends the Upper Confidence Bound algorithm to the relative setting by using estimates of the pairwise probabilities to select a promising arm and applying Upper Confidence Bound with the winner as a benchmark. We prove a sharp finite-time regret bound of order O(K log T) on a very general class of dueling bandit problems that matches a lower bound proven in (Yue et al., 2012). In addition, our empirical results using real data from an information retrieval application show that it greatly outperforms the state of the art.
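A simplified RUCB-flavored loop is sketched below. This is not the paper's exact algorithm: the candidate-selection tie-breaking and the final recommendation rule here are simplifications, and `alpha` is just an assumed exploration constant.

```python
import numpy as np

def rucb_duel(duel, K, T, alpha=0.51, rng=None):
    """Dueling-bandit sketch: keep pairwise win counts, form optimistic
    (UCB) estimates of pairwise win probabilities, pick a candidate arm
    whose optimistic row dominates 1/2, and duel it against its
    strongest optimistic challenger."""
    if rng is None:
        rng = np.random.default_rng(0)
    wins = np.zeros((K, K))
    for t in range(1, T + 1):
        n = wins + wins.T
        with np.errstate(divide="ignore", invalid="ignore"):
            u = wins / n + np.sqrt(alpha * np.log(t) / n)
        u[np.isnan(u)] = 1.0           # unexplored pairs are optimistic
        np.fill_diagonal(u, 0.5)
        # candidate: an arm optimistically beating every other arm
        cands = np.where((u >= 0.5).all(axis=1))[0]
        c = int(rng.choice(cands)) if len(cands) else int(rng.integers(K))
        d = int(np.argmax(u[:, c]))    # strongest optimistic challenger
        if duel(c, d):                 # True iff arm c beats arm d
            wins[c, d] += 1
        else:
            wins[d, c] += 1
    # recommend the arm with the best average empirical win rate
    n = wins + wins.T
    means = np.where(n > 0, wins / np.maximum(n, 1), 0.5)
    return int(np.argmax(means.mean(axis=1)))
```

Using the optimistic winner as the comparison benchmark is what concentrates duels on informative pairs and yields the O(K log T) regret behavior described above.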

### Randomized Derivative-Free Optimization of Noisy Convex Functions

2015

Abstract

We propose STARS, a randomized derivative-free algorithm for unconstrained optimization when the function evaluations are contaminated with random noise. STARS takes dynamic, noise-adjusted smoothing stepsizes that minimize the least-squares error between the true directional derivative of a noisy function and its finite-difference approximation. We provide a convergence rate analysis of STARS for solving convex problems with additive or multiplicative noise. Experimental results show that (1) STARS exhibits noise-invariant behavior with respect to different levels of stochastic noise; (2) the practical performance of STARS in terms of solution accuracy and convergence rate is significantly better than that indicated by the theoretical results; and (3) STARS outperforms a selection of randomized zeroth-order methods on both additively and multiplicatively noisy functions.
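A single noise-adjusted smoothing iteration can be sketched as follows. The formulas for the smoothing stepsize `mu` and the fixed step `h` follow the Gaussian-smoothing analysis this line of work builds on; treat the exact constants as assumptions rather than the paper's verbatim choices.

```python
import numpy as np

def stars_step(f, x, sigma_noise, L1, rng):
    """One STARS-style iteration (sketch). mu balances finite-difference
    bias (growing with the gradient Lipschitz constant L1) against the
    noise variance sigma_noise**2; h is a conservative fixed step."""
    n = x.size
    # noise-adjusted smoothing stepsize (assumed constants)
    mu = (8.0 * sigma_noise ** 2 * n / (L1 ** 2 * (n + 6) ** 3)) ** 0.25
    u = rng.standard_normal(n)                 # Gaussian probe direction
    g = (f(x + mu * u) - f(x)) / mu * u        # forward-difference estimate
    h = 1.0 / (4.0 * L1 * (n + 4))             # fixed step (assumed)
    return x - h * g
```

On a (near-)noiseless convex quadratic this iteration contracts toward the minimizer in expectation; as the noise level grows, `mu` grows with it so the difference quotient stays informative rather than being swamped by noise.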

### Multiple Optimality Guarantees in Statistical Learning

2014

Abstract

Classically, the performance of estimators in statistical learning problems is measured in terms of their predictive ability or estimation error as the sample size n grows. In modern statistical and machine learning applications, however, computer scientists, statisticians, and analysts have a variety of additional criteria they must balance: estimators must be efficiently computable, data providers may wish to maintain anonymity, and large datasets must be stored and accessed. In this thesis, we consider the fundamental questions that arise when trading between multiple such criteria (computation, communication, privacy) while maintaining statistical performance. Can we develop lower bounds that show there must be tradeoffs? Can we develop new procedures that are both theoretically optimal and practically useful? To answer these questions, we explore examples from optimization, confidentiality-preserving statistical inference, and distributed estimation under communication constraints. Viewing our examples through a general lens of constrained minimax theory, we prove fundamental lower bounds on the statistical performance of any algorithm subject to the specified constraints (computational, confidentiality, or communication). These lower bounds allow us to guarantee the optimality of the new algorithms we develop addressing the addi-