Results 11-20 of 118
The Information Bottleneck Revisited or How to Choose a Good Distortion Measure
Cited by 25 (0 self)
It is well known that the information bottleneck method and rate distortion theory are related. Here it is described how the information bottleneck can be considered as rate distortion theory for a family of probability measures where information divergence is used as distortion measure. It is shown that the information bottleneck method has some properties that are not shared with rate distortion theory based on any other divergence measure. In this sense the information bottleneck method is unique.
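The link between the information bottleneck and rate distortion theory can be illustrated numerically. The sketch below is not taken from the paper: it assumes a finite alphabet, a randomly drawn source p(x), p(y|x), and a hypothetical stochastic encoder p(t|x), and it checks the standard identity that the expected information divergence D(p(y|x) || p(y|t)) equals the information loss I(X;Y) - I(T;Y) under the Markov chain T - X - Y. All variable names are illustrative.

```python
import math
import random


def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def kl(p, q):
    """Information divergence D(p || q) in nats for finite distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mutual_information(joint):
    """Mutual information I(A;B) in nats from a joint table joint[a][b]."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    return sum(
        pab * math.log(pab / (pa[a] * pb[b]))
        for a, row in enumerate(joint)
        for b, pab in enumerate(row)
        if pab > 0
    )


rng = random.Random(0)
nx, ny, nt = 4, 3, 2  # assumed alphabet sizes for X, Y and the bottleneck T

# Hypothetical source p(x), p(y|x) and hypothetical encoder p(t|x).
p_x = normalize([rng.random() for _ in range(nx)])
p_y_given_x = [normalize([rng.random() for _ in range(ny)]) for _ in range(nx)]
p_t_given_x = [normalize([rng.random() for _ in range(nt)]) for _ in range(nx)]

# Joint tables implied by the Markov chain T - X - Y.
p_xy = [[p_x[x] * p_y_given_x[x][y] for y in range(ny)] for x in range(nx)]
p_xt = [[p_x[x] * p_t_given_x[x][t] for t in range(nt)] for x in range(nx)]
p_ty = [
    [sum(p_xt[x][t] * p_y_given_x[x][y] for x in range(nx)) for y in range(ny)]
    for t in range(nt)
]
p_t = [sum(row) for row in p_ty]
p_y_given_t = [[p_ty[t][y] / p_t[t] for y in range(ny)] for t in range(nt)]

# Expected distortion with information divergence as the distortion measure.
expected_distortion = sum(
    p_xt[x][t] * kl(p_y_given_x[x], p_y_given_t[t])
    for x in range(nx)
    for t in range(nt)
)
information_loss = mutual_information(p_xy) - mutual_information(p_ty)

print(expected_distortion, information_loss)
```

The two printed numbers should agree up to floating-point error, which is the sense in which constraining the expected divergence is the same as constraining the relevant information retained by T.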
A contrast between two decision rules for use with (convex) sets of probabilities: Γ-Maximin versus E-admissibility.
2002
Generalised exponential families and associated entropy functions
2008
On Bayesian bounds
In Proceedings of the 23rd International Conference on Machine Learning, 2006
Cited by 21 (2 self)
We show that several important Bayesian bounds studied in machine learning, in both the batch and the online setting, arise by an application of a simple compression lemma. In particular, we derive (i) PAC-Bayesian bounds in the batch setting, (ii) Bayesian log-loss bounds and (iii) Bayesian bounded-loss bounds in the online setting using the compression lemma. Although every setting has different semantics for prior, posterior and loss, we show that the core bound argument is the same. The paper simplifies our understanding of several important and apparently disparate results, and brings to light a powerful tool for developing similar arguments for other methods.
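The abstract does not state the compression lemma itself. The sketch below assumes it is the usual change-of-measure inequality, E_Q[phi] <= KL(Q || P) + log E_P[exp(phi)], for a prior P, a posterior Q and any function phi on a finite hypothesis set; that reading is an assumption, and the function names are illustrative. The code checks the inequality on random inputs and shows that the Gibbs posterior, proportional to P * exp(phi), attains equality.

```python
import math
import random


def kl(q, p):
    """Information divergence KL(q || p) in nats for finite distributions."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def compression_gap(prior, posterior, phi):
    """Return KL(Q||P) + log E_P[exp(phi)] - E_Q[phi].

    Under the assumed form of the lemma this quantity is never negative.
    """
    log_mgf = math.log(sum(p * math.exp(f) for p, f in zip(prior, phi)))
    mean_q = sum(q * f for q, f in zip(posterior, phi))
    return kl(posterior, prior) + log_mgf - mean_q


def gibbs_posterior(prior, phi):
    """Posterior proportional to prior * exp(phi); it makes the gap zero."""
    weights = [p * math.exp(f) for p, f in zip(prior, phi)]
    z = sum(weights)
    return [w / z for w in weights]


def random_distribution(rng, n):
    """Draw a strictly positive distribution on n points (illustrative helper)."""
    weights = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(1)
gaps = []
for _ in range(200):
    n = rng.randint(2, 6)
    prior = random_distribution(rng, n)
    posterior = random_distribution(rng, n)
    phi = [rng.uniform(-3.0, 3.0) for _ in range(n)]
    gaps.append(compression_gap(prior, posterior, phi))

# Equality case: the Gibbs posterior for one fixed prior and phi.
prior = [0.5, 0.3, 0.2]
phi = [1.0, -0.5, 2.0]
gibbs_gap = compression_gap(prior, gibbs_posterior(prior, phi), phi)

print(min(gaps), gibbs_gap)
```

Choosing phi as a scaled difference between true and empirical loss is the step that typically turns such an inequality into a PAC-Bayesian style bound; that choice is not shown here.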
Composite multiclass losses.
In Neural Information Processing Systems, 2011
Cited by 21 (8 self)
We consider loss functions for multiclass prediction problems. We show when a multiclass loss can be expressed as a "proper composite loss", which is the composition of a proper loss and a link function. We extend existing results for binary losses to multiclass losses. We subsume results on "classification calibration" by relating it to properness. We determine the stationarity condition, Bregman representation, order-sensitivity, and quasi-convexity of multiclass proper losses. We then characterise the existence and uniqueness of the composite representation for multiclass losses. We show how the composite representation is related to other core properties of a loss: mixability, admissibility and (strong) convexity of multiclass losses, which we characterise in terms of the Hessian of the Bayes risk. We show that the simple integral representation for binary proper losses cannot be extended to multiclass losses, but offer concrete guidance regarding how to design different loss functions. The conclusion drawn from these results is that the proper composite representation is a natural and convenient tool for the design of multiclass loss functions.
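As a concrete instance of a proper composite loss, the sketch below composes the multiclass log loss (a proper loss) with the softmax function used as an inverse link. This pairing is a familiar textbook example chosen here for illustration, not a construction taken from the paper, and the function names are hypothetical. The code checks properness numerically: the conditional risk under a true class distribution q is smallest when the predicted distribution equals q.

```python
import math


def log_loss(y, p):
    """Proper loss: negative log-probability assigned to the true class y."""
    return -math.log(p[y])


def softmax(v):
    """Inverse link (an illustrative choice): map real scores to probabilities."""
    m = max(v)
    exps = [math.exp(vi - m) for vi in v]
    z = sum(exps)
    return [e / z for e in exps]


def composite_loss(y, v):
    """Composite loss: the proper loss applied after the inverse link."""
    return log_loss(y, softmax(v))


def conditional_risk(q, p):
    """Expected log loss of prediction p when the class is drawn from q."""
    return sum(q[y] * log_loss(y, p) for y in range(len(q)))


q = [0.5, 0.3, 0.2]  # assumed true class distribution
candidates = [
    [0.5, 0.3, 0.2],
    [0.6, 0.2, 0.2],
    [0.4, 0.4, 0.2],
    [1 / 3, 1 / 3, 1 / 3],
]
risks = [conditional_risk(q, p) for p in candidates]

# Scores v = log q are mapped back to q by the inverse link, so the
# composite loss reaches the same minimal risk at those scores.
scores = [math.log(qi) for qi in q]
composite_risk = sum(q[y] * composite_loss(y, scores) for y in range(len(q)))

print(risks, composite_risk)
```

The minimal risk is the Shannon entropy of q, which is the Bayes risk of the log loss at q.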
Extensions of Expected Utility Theory and Some Limitations of Pairwise Comparisons
In Proceedings of the Third ISIPTA (JM, 2003
Cited by 15 (2 self)
We contrast three decision rules that extend Expected Utility to contexts where a convex set of probabilities is used to depict uncertainty: Γ-Maximin, Maximality, and E-admissibility. The rules extend Expected Utility theory as they require that an option is inadmissible if there is another that carries greater expected utility for each probability in a (closed) convex set. If the convex set is a singleton, then each rule agrees with maximizing expected utility. We show that, even when the option set is convex, this pairwise comparison between acts may fail to identify those acts which are Bayes for some probability in a convex set that is not closed. This limitation affects two of the decision rules but not E-admissibility, which is not a pairwise decision rule. E-admissibility can be used to distinguish between two convex sets of probabilities that intersect all the same supporting hyperplanes.
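The three rules can be compared on a small example. The sketch below is illustrative and not taken from the paper: it assumes two states, a closed interval of probabilities p(s1) in [0.25, 0.75] approximated by a grid, and three hypothetical options given by their utilities in each state. With these numbers the rules select different sets of options.

```python
def expected_utility(option, p):
    """Expected utility of option = (u(s1), u(s2)) when p is P(s1)."""
    return p * option[0] + (1 - p) * option[1]


# Hypothetical options: two bets and one constant act.
options = {"a1": (1.0, 0.0), "a2": (0.0, 1.0), "a3": (0.4, 0.4)}

# Grid approximation of the convex set of probabilities [0.25, 0.75].
grid = [0.25 + 0.5 * i / 100 for i in range(101)]


def gamma_maximin(options, grid):
    """Options whose worst-case expected utility over the set is largest."""
    worst = {
        name: min(expected_utility(o, p) for p in grid)
        for name, o in options.items()
    }
    best = max(worst.values())
    return {name for name, w in worst.items() if abs(w - best) < 1e-12}


def maximal(options, grid):
    """Options not beaten by a single rival under every probability."""

    def dominated(name):
        return any(
            all(
                expected_utility(rival, p) > expected_utility(options[name], p)
                for p in grid
            )
            for rival_name, rival in options.items()
            if rival_name != name
        )

    return {name for name in options if not dominated(name)}


def e_admissible(options, grid):
    """Options that maximize expected utility for at least one probability."""
    selected = set()
    for p in grid:
        values = {name: expected_utility(o, p) for name, o in options.items()}
        top = max(values.values())
        selected |= {name for name, v in values.items() if abs(v - top) < 1e-12}
    return selected


print(sorted(gamma_maximin(options, grid)))  # ['a3']
print(sorted(maximal(options, grid)))        # ['a1', 'a2', 'a3']
print(sorted(e_admissible(options, grid)))   # ['a1', 'a2']
```

The constant act a3 is never Bayes for any probability in the interval, yet no single rival beats it everywhere, so it survives the pairwise test of Maximality and is the unique Γ-Maximin choice. A grid only approximates the convex set, so the subtleties about sets that are not closed are outside what this sketch can show.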
Nonmyopic strategies in prediction markets
In Second Workshop on Prediction Markets (held at EC ’07), 2007
Cited by 14 (2 self)
One attractive feature of market scoring rules [Hanson, Information Systems Frontiers, 2003] is that they are myopically strategy-proof: it is optimal for a trader to report her true belief about the likelihood of an event, provided that we ignore the impact of her report on the profit she might garner from future trades. This does not rule out the possibility that traders may profit by first misleading other traders through dishonest trades and then correcting the errors made by other traders. In this paper, we describe a new approach to analyzing non-myopic strategies and the existence of myopic equilibria. We first use a simple model with two partially informed traders in a single information market to gain insight into the conditions under which different equilibrium behavior emerges. We prove that, under generic conditions, the myopically optimal strategy profile is not a weak Perfect Bayesian Equilibrium (PBE) strategy for the logarithmic market scoring rule. We show that our results extend to multiple traders and signals. We propose a simple discounted market scoring rule that reduces the opportunity for bluffing strategies. We show that in any weak PBE, myopic or otherwise, the market price converges to the optimal price, and the rate of convergence can be bounded in terms of the discounting parameter.
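Myopic strategy-proofness of the logarithmic market scoring rule can be checked on a toy market. The sketch below assumes the standard form of the rule, in which a trader who moves the market estimate from `old` to `new` receives b * (log new[w] - log old[w]) when outcome w occurs; the liquidity parameter `b`, the beliefs and the function names are illustrative assumptions. Future trades are ignored, and nothing here models the bluffing strategies or the discounted rule discussed in the paper.

```python
import math


def lmsr_payoff(old, new, outcome, b=1.0):
    """Payoff for moving the market estimate from old to new, given the outcome."""
    return b * (math.log(new[outcome]) - math.log(old[outcome]))


def expected_payoff(belief, old, new, b=1.0):
    """Trader's expected one-shot payoff under her own belief."""
    return sum(
        belief[w] * lmsr_payoff(old, new, w, b) for w in range(len(belief))
    )


belief = [0.7, 0.3]  # assumed true belief of the trader
market = [0.5, 0.5]  # assumed current market estimate

# Scan candidate reports for a binary event and keep the most profitable one.
candidates = [i / 100 for i in range(1, 100)]
best_report = max(
    candidates, key=lambda r: expected_payoff(belief, market, [r, 1 - r])
)

# Expected gain from reporting truthfully; it equals b * KL(belief || market).
truthful_gain = expected_payoff(belief, market, belief)
divergence = sum(bw * math.log(bw / mw) for bw, mw in zip(belief, market))

print(best_report, truthful_gain)
```

The scan should select the report closest to the trader's belief, and the truthful gain matches the information divergence between her belief and the previous estimate.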
Information Geometry of q-Gaussian Densities and Behaviors of Solutions to Related Diffusion Equations
2008
Cited by 14 (2 self)
This paper presents new geometric aspects of the behaviors of solutions to the porous medium equation (PME) and its associated equation. First we discuss thermostatistical structure with information geometry on a manifold of generalized exponential densities. A dualistic relation between the two existing formalisms by Naudts and Eguchi is elucidated. Next, by equipping the manifold of what are called q-Gaussian densities with such a structure, we derive several physically and geometrically interesting properties of the solutions. Since the manifold of the q-Gaussian densities is proved invariant for the equations, it plays a central role in our analysis. We characterize the moment-conserving projection of a solution to the manifold as a geodesic curve. Further, the evolutional velocities of the second moments and the convergence rate to the manifold are evaluated in terms of the Bregman divergence. Finally, we show that the self-similar solution is geometrically special in the sense that it is simultaneously a geodesic curve for the dually flat connections.