Results

### MDL, Penalized Likelihood, and Statistical Risk

Abstract - We determine, for both countable and uncountable collections of functions, information-theoretic conditions on a penalty pen(f) such that the optimizer f̂ of the penalized log likelihood criterion log 1/likelihood(f) + pen(f) has risk not more than the index of resolvability corresponding to the accuracy of the optimizer of the expected value of the criterion. If F is the linear span of a dictionary of functions, traditional description-length penalties are based on the number of non-zero terms (the ℓ0 norm of the coefficients). We specialize our general conclusions to show the ℓ1 norm of the coefficients times a suitable multiplier λ is also an information-theoretically valid penalty.
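To make the criterion concrete: for a Gaussian regression model with known noise variance, log 1/likelihood(f) reduces (up to constants) to squared error, so the ℓ1-penalized criterion is the lasso objective. The sketch below is illustrative only, assuming that setup; it uses plain coordinate descent (a standard solver, not a method from the paper), with `lam` playing the role of the multiplier λ.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/2)||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    For Gaussian noise, the squared-error term is (up to constants) the
    log 1/likelihood(f) term, so the minimizer is the l1-penalized
    likelihood optimizer discussed in the abstract.
    X: list of rows; y: list of responses.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # partial residual with coordinate j excluded
            r = [y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n))
            norm = sum(X[i][j] ** 2 for i in range(n))
            b[j] = soft_threshold(rho, lam) / norm
    return b
```

With an orthogonal design, each coefficient is simply the soft-thresholded least-squares estimate, which shows how the penalty zeroes out small terms: `lasso_cd([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], 1.0)` gives approximately `[2.0, 0.0]`.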

### MDL Procedures with ℓ1 Penalty and their Statistical Risk

Abstract — We review recently developed theory for the Minimum Description Length principle, penalized likelihood, and its statistical risk. An information-theoretic condition on a penalty pen(f) yields the conclusion that the optimizer of the penalized log likelihood criterion log 1/likelihood(f) + pen(f) has risk not more than the index of resolvability, corresponding to the accuracy of the optimizer of the expected value of the criterion. For the linear span of a dictionary of candidate terms, we develop the validity of description-length penalties based on the ℓ1 norm of the coefficients. New results are presented for the regression case. Other examples involve log-density estimation and Gaussian graphical statistical models.

### 2.2 General Minimax Lower Bound

2011

This thesis deals with lower bounds for the minimax risk in general decision-theoretic problems. Such bounds are useful for assessing the quality of decision rules. After providing a unified treatment of existing techniques, we prove new lower bounds which involve f-divergences, a general class of dissimilarity measures between probability measures. The proofs of our bounds rely on elementary convexity facts and are extremely simple. Special cases and straightforward corollaries of our results include many well-known lower bounds. As applications, we study a covariance matrix estimation problem and the problem of estimation of convex bodies from noisy support function measurements.

### Information Theoretic Validity of Penalized Likelihood

Abstract — Building upon past work, which developed information-theoretic notions of when a penalized likelihood procedure can be interpreted as codelengths arising from a two-stage code and when the statistical risk of the procedure has a redundancy risk bound, we present new results and risk bounds showing that the ℓ1 penalty in Gaussian graphical models fits the above story. We also show how twice the traditional ℓ0 penalty plus lower-order terms which stay bounded on the whole parameter space has a conditional two-stage description length interpretation, and present risk bounds for this penalized likelihood procedure.
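In the Gaussian graphical model setting, the ℓ1-penalized criterion is typically written, for a precision matrix Θ and sample covariance S, as -log det(Θ) + tr(SΘ) + λ Σ|Θij| over off-diagonal entries. The sketch below only evaluates that objective (it does not optimize it), and fixes the dimension at 2×2 to stay dependency-free; the function name and argument layout are illustrative assumptions, not an API from the paper.

```python
import math

def gg_penalized_nll(theta, S, lam):
    """Penalized Gaussian negative log-likelihood for a 2x2 precision
    matrix theta: -log det(theta) + tr(S theta) + lam * sum of
    absolute off-diagonal entries. Illustrative evaluation only.
    """
    det = theta[0][0] * theta[1][1] - theta[0][1] * theta[1][0]
    # trace of the matrix product S @ theta
    tr = sum(S[i][k] * theta[k][i] for i in range(2) for k in range(2))
    pen = lam * (abs(theta[0][1]) + abs(theta[1][0]))
    return -math.log(det) + tr + pen
```

For example, at the identity precision matrix with S also the identity, the log-determinant term vanishes and the trace contributes 2, so the objective is 2.0 for any λ.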