22 citations found.
Shun-Ichi Amari. Differential geometry of curved exponential families-curvatures and information loss. Annals of Statistics, 10(2):357--385, 1982.


This paper is cited in the following contexts:
Tree-Based Reparameterization Framework for Analysis.. - Wainwright, Jaakkola, .. (2003)   (4 citations)  (Correct)

....they do not change the distribution on the graph with cycles, but simply yield an alternative factorization. Geometrically, this invariance means that successive iterates are confined to an affine subspace of exponential parameters (i.e., an e-flat manifold in terms of information geometry; e.g., [30], [31]). We then show how each TRP update can be viewed as a projection onto an m-flat manifold formed by the constraints associated with each tree. We prove that a Pythagorean-type result holds for successive TRP iterates under a cost function that is an approximation to the Kullback-Leibler (KL) ....

....illustrating the potential advantages of tree-based updates over synchronous BP. A. Exponential Families of Distributions. Central to our work are exponential representations of distributions, which have been studied extensively in statistics and applied probability theory (e.g., [31], [40], [30], [41], [42]). Given an index set , we consider a collection of potential functions associated with the graph . We let denote a vector of parameters, and then consider the following distribution: (6a) Here we have described an unrelaxed form of the updates; in the sequel, we present and analyze a ....

[Article contains additional citation context not shown here]

S. Amari, "Differential geometry of curved exponential families---Curvatures and information loss," Ann. Statist., vol. 10, no. 2, pp. 357--385, 1982.


System Identification with Information Theoretic Criteria - Stoorvogel, van Schuppen (1996)   (1 citation)  (Correct)

....in general, although this relation does hold for a larger class of distributions than the Gaussian one. S. I. Amari has investigated the geometric structure of exponential families of probability distributions and explored the use of the likelihood function and of divergence for this family, see [4, 6, 5]. The minimum divergence method differs from the maximum likelihood method in that for the first method a measure must be chosen that represents the observations. In this point the maximum likelihood approach requires less. 4.4 Approximation with divergence For system identification an ....

S.I. Amari, "Differential geometry of curved exponential families - Curvatures and information loss", Ann. Statist., 10 (1982), pp. 357--385.


How representative are the known structures of the proteins in a .. - Gerstein (1998)   (2 citations)  (Correct)

....The distance between two genomes A and B is defined in terms of amino acid composition through the following formula for Euclidean distance: (4) where C(i,g) is the composition of the ith amino acid in genome g. Other measures of distance were also tried, in particular the Hellinger distance [97], which is the same as D abs (AB) except for the replacement: (5) This treats small differences differently. However, it is found that the resulting tree topology is insensitive to the choice of distance metric, providing a test of the robustness of the results. ....

Amari, S. (1982). Differential geometry of curved exponential families --- curvatures and information loss. Ann. Stat. 10, 357-385.


Nonlinear Regression: Third Order Significance - Abebe, Fraser, Wong (1996)   (Correct)

....the model. For normally distributed errors, Bates & Watts (1980) proposed two measures of nonlinearity directly related to the maximum relative intrinsic curvature and maximum relative parameter effects curvature, respectively. The 3 measures were extended to a general nonlinear model context in Amari (1982) and Kass (1984) (see also Cook and Witmer, 1985). Second order inference for nonlinear regression models was treated in Hamilton, Watts & Bates (1982), Hamilton (1986), and Fraser & Massam (1987). The analyses by Hamilton, Watts & Bates (1982) and Hamilton (1986) are based on approximating the ....

Amari, S. I. (1982). Differential geometry of curved exponential families, curvature and information loss. Ann. Statist. 10, 357--385.


Regression Analysis, Nonlinear or Nonnormal: Simple and.. - Fraser, Wong, Wu   (Correct)

....nonlinearity, the maximum relative intrinsic curvature and the maximum relative parameter effects curvature; these focus respectively on the actual curvature of the surface and on curvature anomalies arising from the choice of parameterization. The measures were extended to the nonnormal case by Amari (1982) and Kass (1984). By approximating the location surface by a quadratic expression, Hamilton, Watts & Bates (1982) and Hamilton (1986) developed second order inference methods. Two levels of conditioning were proposed in Fraser & Massam (1985), one for the data direction on the tangent plane and ....

Amari, S. I. (1982), Differential geometry of curved exponential families, curvature and information loss, Annals Statistics 10, 357--385.


Bayesian Geometric Theory of Statistical Inference - Huaiyu Zhu, Richard Rohwer (1996)   (Correct)

....used is the family of $\delta$-deviations (Amari's $\alpha$-divergences) extended to finite measures, as studied in §3. The term deviation is preferred to divergence, as explained in (Čencov, 1982, §8, Note (1)). Our parameter $\delta \in [0, 1]$ follows (Kass, 1984) and corresponds to $(1 - \alpha)/2$ in (Amari, 1982; Amari, 1985), $1/u$ in (Čencov, 1982), $\alpha$ in (Rényi, 1961; Hartigan, 1964; Ferguson, 1973), $t$ in (Chernoff, 1952), and $k$ in (Hartigan, 1967). Most concepts in information geometry (Amari, 1985) are indexed by $\delta$, but this detail may be ignored in this introduction. The main result (§4) is ....

....intuitive exposition. Its extension to finite dimension was studied in (Koshevnik and Levit, 1976). The deficiency (information loss) of an estimator (Fisher, 1925) is in addition closely related to the 0-curvature of the model and 1-curvature of the method (Efron, 1975; Dawid, 1975; Reed, 1975; Amari, 1982; Amari, 1985; Amari, 1987). Independent studies on uniform parameterization reveal five different concepts of uniformity (Hougaard, 1982), corresponding to uniformity in the $\delta$-affine structure with $\delta \in \{0, 1/3, 1/2, 2/3, 1\}$ (Kass, 1984). It appears that the result of (Hartigan, 1964; Hartigan, ....

[Article contains additional citation context not shown here]

Amari, S. (1982). Differential geometry of curved exponential families---curvature and information loss. Ann. Statist., 10(2):357--385.


On the Mathematical Foundation of Learning Algorithms - Huaiyu Zhu (1996)   (Correct)

....1985) These geometrical concepts exist independent of the statistical interpretation. It is quite remarkable that for statistical models the Fisher metric and the $\delta$-connection are the only possible intrinsic versions of these concepts. A remarkable synthesis of the above works was achieved in (Amari, 1982). 1 He defined the $\delta$-curvatures using these $\delta$-connections in conjunction with the metric, and showed that Efron's statistical curvature is exactly the 0-curvature, also called exponential curvature, that the curvature associated with p is the 1-curvature, also called the mixture ....

....as that would make many calculations much simpler. These culminated in (Hougaard, 1982), in which five distinctive senses of uniformity were identified and expressed in a family of complicated differential equations parameterised by a scalar $\delta \in \{0, 1/3, 1/2, 2/3, 1\}$. The relation between (Amari, 1982) and (Hougaard, 1982) was soon recognised in (Kass, 1984), in which this differential equation was generalised to multidimensional cases and shown to be exactly $\Gamma^{(\delta)}_{ijk} = 0$. Therefore a model permits a $\delta$-uniform parameterisation if and only if its $\delta$-curvature vanishes. 1 For ....

[Article contains additional citation context not shown here]

Amari, S. (1982). Differential geometry of curved exponential families---curvature and information loss. Ann. Statist., 10(2):357--385.


Mutual Information, Metric Entropy, and Risk in Estimation of .. - Haussler, Opper (1996)   (Correct)

.... are given by Yamanishi [56, 58, 57]. Amari has developed an extensive theory that relates the risk when is the true state of Nature to certain differential geometric properties of the parameter space Θ in the neighborhood of involving Fisher information and related quantities [2, 3] (see also [60, 40]). Some authors have also looked at the value of the relative entropy risk in nonparametric cases as well, e.g. [6, 10, 52, 59, 55]. Also, the issue of consistent estimation of a general probability distribution with respect to relative entropy is addressed in [1, 41]. However, ....

S. Amari. Differential geometry of curved exponential families-curvatures and information loss. Annals of Statistics, 10:357--385, 1982.


Using Ancillary Statistics in On-Line Learning Algorithms - Huaiyu Zhu   (Correct)

....P on X. If X is of infinite size, P forms an infinite-dimensional manifold. A statistical model is a finite-dimensional submanifold $Q \subset P$. We shall also consider the space $\tilde{P}$ of finite measures on X. It is also an infinite-dimensional manifold, containing P as a smooth submanifold. See [4, 1, 2, 6, 14, 12]. It has been shown [14, 12] that in general a statistical inference problem can be specified by a prior P(p) on P, the information divergence $D_\delta(p, q)$, $\delta \in [0, 1]$, and the model Q. For a given sample x, there exists a unique ideal estimate, called the $\delta$-estimate, $\hat{p} \in \tilde{P}$, given ....

....with tangent l 1 and normal l 2 . The auxiliary point r is unchanged and is represented in the new auxiliary coordinates as 1 and 2 , where 1 = 0. Figure 2: Change of coordinates caused by curvature training is the exponential geometry, a special case of information geometry [4, 1, 2, 6]. Roughly speaking, it is defined by the Fisher information metric, which specifies an inner product on the tangent space, and the exponential affine connection, which specifies that exponential families are to be considered flat submanifolds. Note that as we are considering geometry for the ....

S. Amari. Differential geometry of curved exponential families---curvature and information loss. Ann. Statist., 10(2):357--385, 1982.


Information Geometry, Bayesian Inference, Ideal Estimates and.. - Zhu, Rohwer (1998)   (Correct)

.... stabilizing asymptotic variances; the normal likelihood parameterization is 2/3-uniform; the asymptotically non-skewed parameterization is 1/3-uniform (Hougaard, 1982; Kass, 1984). Note that our $\delta \in [0, 1]$ follows (Kass, 1984; Hougaard, 1982) and corresponds to $(1 - \alpha)/2$ in (Amari, 1982, 1985), $1/u$ in (Čencov, 1982), $\alpha$ in (LeCam, 1970; Renyi, 1961; Hartigan, 1965; Ferguson, 1973), $t$ in (Chernoff, 1952), and $k$ in (Hartigan, 1967). Statistics has always been a science of methodology as well as mathematics. Information geometry helps to reduce some choices of principles into choices ....

.... where A is the 1-curvature of the parameterization (parameter-effect curvature, naming curvature, Bhattacharyya curvature), B is the 0-curvature of the model (it is zero for exponential families), and C is the 1-curvature of the estimator (it is zero for MLE) (Efron, 1975; Dawid, 1975; Reed, 1975; Amari, 1982, 1985, 1987). This provides a quantitative expression of information loss, and explains the sufficiency of exponential families and MLEs. See also (Kass, 1987) for an introduction. The $\delta$- and $(1 - \delta)$-connections are dually affine to each other, with respect to the metric. This uniquely ....

[Article contains additional citation context not shown here]

Amari, S. (1982). Differential geometry of curved exponential families---curvature and information loss. Ann. Statist., 10(2):357--385.


The Geometry and Dynamics of Data-Driven Modeling - Vixie   (Correct)

.... [11], Efron [18], [19], Dawid [14] and others, but Amari's introduction of a parameterized class of affine connections permitted the generation of statistical divergences in the differential geometric setting and paved the way for the geometric approach to inference in statistical modeling [1], [2]. See the introduction of [3] for more details of the history. Moving on to the theory, let M represent the infinite-dimensional manifold of probability density functions f on R^n. Let us assume that these are in fact Radon-Nikodym derivatives with respect to the Lebesgue measure so that ....

....make it into a Riemannian manifold. And indeed, when one does so, one obtains that exponential families are flat, i.e. straight lines are defined by affine functions of the parameters. Developments that needed the idea of parallel transport were stimulated by Amari's work of the early 1980s ([1], [2]). What Amari did was discover a whole family of affine connections (ways to define parallel transport of tangent vectors). They are parameterized by $\alpha \in [-1, 1]$. At $\alpha = 0$ we have the connection which preserves the metric. It is the Levi-Civita or Riemannian connection. Preservation of the ....

S. Amari. Differential geometry of curved exponential families - curvature and information loss. Annals of Statistics, 10:357--385, 1982.


Forecasting Financial Time Series with Correlation Matrix.. - Kustrin (1998)   (Correct)

....of probability distributions. The method is based on construction of manifolds of probability distributions, establishing their differential geometric properties and applying them to the analysis of their properties. The method has been applied to various areas including statistical inference (Amari 1982; Amari 1985), control systems theory (Amari 1987), and multi-terminal information theory (Amari and Han 1989; Amari 1989). A.1 Forecasting and Information Geometry. As mentioned in §1.2, the ultimate goal of this DPhil research project was formulation of a better forecasting system. Developing a ....

....the degree of undistinguishability of two probability distributions and the distance of an estimate from the maximum likelihood estimate of a distribution, in the case of flat manifolds. The application of differential geometry in statistical inference has been extensively studied in recent years (Amari 1982; Amari 1985; Barndorff-Nielsen, Cox, and Reid 1986; Amari, Barndorff-Nielsen, Kass, Lauritzen, and Rao 1987). Differential geometry, on the other hand, is a mathematical framework for dealing with geometries that vary from point to point. A special kind of differential geometry called ....

Amari, S. (1982). Differential geometry of curved exponential families -- curvatures and information loss. Annals of Statistics 10 (2), 357--385.


Bayesian Invariant Measurements of Generalisation for.. - Zhu, Rohwer (1995)   (2 citations)  (Correct)

....since we are only concerned with the amount of information captured by the learning algorithm which, unlike the content of information, should not depend on a renaming of the samples. Such information divergences have been studied extensively in the theory of information geometry [Che72, Ama82, Ama85, Ama87]. See also [Egu83, BN86, Lau87, CMS93]. For more background see [Hou82, Kas84, BNCR86, Kas87, Kas89]. The main result of information geometry particularly relevant to our current enquiry is that there is a unique one-parameter family of information divergences satisfying the above ....

....of a conditional distribution $P(\cdot|p)$. This enables us to write the definition of Kullback-Leibler divergence between two distributions p and q as $K(p, q) := \int p \log \frac{p}{q} = \int_{y \in Y} P(y|p) \log \frac{P(y|p)}{P(y|q)}$ (2.2). As in [ZR95a], we use $\delta = (1 - \alpha)/2$ instead of $\alpha$ as in [Ama82, Ama85]. This usage was adopted from [Hou82, Kas84]. Definition 2.1 ($\delta$-divergence). Let $p, q \in P$. The $\delta$-divergence is defined as $D_\delta(p, q) := \frac{1}{\delta(1 - \delta)} \left(1 - \int p^{\delta} q^{1 - \delta}\right)$ (2.3), $D_0(p, q) := \lim_{\delta \to 0} D_\delta(p, q)$, $D_1(p, q) := \lim_{\delta \to 1} D_\delta$ ....
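
The divergence quoted in the context above can be checked numerically. The following is a minimal sketch, not the citing authors' code, assuming discrete distributions given as lists of probabilities (so the integral becomes a sum) and the form D_delta(p, q) = (1 - sum p^delta q^(1-delta)) / (delta (1 - delta)). The function names `kl_divergence` and `delta_divergence` are hypothetical and chosen only for illustration.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence K(p, q) = sum p log(p / q) for discrete p, q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def delta_divergence(p, q, delta):
    """Discrete delta-divergence for 0 < delta < 1 (assumed form, see lead-in).

    D_delta(p, q) = (1 - sum p^delta * q^(1 - delta)) / (delta * (1 - delta))
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    overlap = sum(pi ** delta * qi ** (1.0 - delta) for pi, qi in zip(p, q))
    return (1.0 - overlap) / (delta * (1.0 - delta))


# Two illustrative distributions on three outcomes (made-up numbers).
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

# Near delta = 1 the value approaches K(p, q); near delta = 0 it approaches K(q, p).
near_one = delta_divergence(p, q, 1.0 - 1e-6)
near_zero = delta_divergence(p, q, 1e-6)
```

Under these assumptions, `near_one` should agree with `kl_divergence(p, q)` and `near_zero` with `kl_divergence(q, p)` up to a small error, which matches the limits D_1 and D_0 named in the definition. At delta = 1/2 the expression is symmetric in p and q.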

S. Amari. Differential geometry of curved exponential families---curvature and information loss. Ann. Stat., 10(2):357--385, 1982.


Information Geometric Measurements of Generalisation - Zhu, Rohwer (1995)   (3 citations)  (Correct)

....in Figure 4. What we need is a measure of divergence D(p, q) (1.4) between two distributions, p and q, which should be defined on the space of distributions P, and should be invariant under reparameterisation. This is exactly what has been offered by the theory of information geometry (see [Ama82, Ama85, BNCR86, ABNK+87, Kas89] for introductions, backgrounds and references), in which a parameterised family of distributions, P, is regarded as a differentiable manifold, where the parameters w act as 1 Even those Bayesian methods which do not make a point estimate use an implicit ....

S. Amari. Differential geometry of curved exponential families---curvature and information loss. Ann. Stat., 10(2):357--385, 1982.


Mutual Information, Metric Entropy, and Cumulative Relative.. - Haussler, Opper (1996)   (3 citations)  (Correct)

....certain asymptotic normality assumptions. Amari has developed an extensive theory that relates the risk when is the true state of Nature to certain differential geometric properties of the parameter space Θ in the neighborhood of involving Fisher information and related quantities [2, 3] (see also [57, 39, 55]). Some authors have also looked at the value of the relative entropy risk in nonparametric cases as well, e.g. [6, 10, 51, 56, 54]. Also, the issue of consistent estimation of a general probability distribution with respect to relative entropy is addressed in [1, 40] ....

S. Amari. Differential geometry of curved exponential families-curvatures and information loss. Annals of Statistics, 10:357--385, 1982.


Fourth And Fifth Order Efficiency: Fisher Information - Kano   (Correct)

....successfully proved the third order efficiency of the MLE in a class of the BAN (best asymptotic normal) estimators in a curved one parameter multinomial distribution, that is, the MLE is the best of the class of first order efficient estimators. This result is often called the Fisher Rao theorem. Amari (1982) and Eguchi (1983) took a geometric approach to extend the Fisher Rao theorem to the general curved exponential family with a structural parameter vector. The third order efficiency has been shown to hold under other criteria, quadratic risk and concentration probability as well (e.g. Ghosh and ....

Amari, S. (1982). Differential geometry of curved exponential families---Curvatures and information loss. Ann. Statist., 10, 357-385.


Information Geometry on Hierarchical Decomposition of Stochastic.. - Amari (1999)   (2 citations)  Self-citation (Amari)   (Correct)

....in the set of all the probability distributions. Moreover, it gives a new invariant decomposition of entropy into a sum of the degrees of the k-th order correlations. The present paper studies such a hierarchical structure and the related invariant decomposition by using information geometry ([2], [7], [22], [24]). Information geometry studies the intrinsic geometrical structure to be introduced in the manifold of a family of probability distributions. Its Riemannian structure was introduced by Rao [31]. Csiszár ([16], [17]) studied the geometry of f-divergence in detail and applied it to ....

....introduced by Rao [31]. Csiszár ([16], [17]) studied the geometry of f-divergence in detail and applied it to information theory. It was Chentsov [14] who developed Rao's idea further and introduced new invariant affine connections in the manifolds of probability distributions. Nagaoka and Amari ([2], [25]) developed a theory of dual structures and unified all of these theories in the dual differential geometrical framework. Information geometry has been used so far not only for mathematical foundations of statistical inferences ([2], [9], [22], and many others) but also applied to information ....

[Article contains additional citation context not shown here]

S. Amari, "Differential geometry of curved exponential families--- curvature and information loss", Ann. Statist., 10, pp. 357--385, 1982.


Dualistic Dynamical Systems in the Framework of Information.. - Akio Fujiwara And (1993)   (1 citation)  Self-citation (Amari)   (Correct)

.... geometry played an essential role (Bayer and Lagarias [5], Brockett [6]), which is the same as in classical mechanics [7] or in nonequilibrium statistical physics [8]. On the other hand, information geometry, originated from the geometric study of the manifold of probability distributions (Amari [9][10], Nagaoka and Amari [11], Amari et al. [12], Chentsov [13], etc.), has been successfully applied to many fields, such as statistical inference (Amari [10], Kumon and Amari [14], Amari and Kumon [15], Okamoto et al. [16]), control systems theory (Amari [17], Ohara and Amari [18]), multiterminal ....

S. Amari, "Differential geometry of curved exponential families --- Curvatures and information loss," Ann. Statist. 10 (1982) 357.


Information Geometry of the EM and em Algorithms for Neural Networks - Amari (1995)   (56 citations)  Self-citation (Amari)   (Correct)

....sources of ideas. It originated from the information structure of a manifold of probability distributions and has been developed to be a new mathematical subject with new differential geometrical notions. It has been successfully applied to various information sciences such as statistical sciences (Amari [1982, 1985], Barndorff-Nielsen [1988], Barndorff-Nielsen, Cox and Reid [1986], Murray and Rice [1993], Kass [1989], Amari et al. [1987], etc.), information theory (Amari and Han [1989], Amari [1989]), systems theory (Amari [1987a], Ohara and Amari [1992]), and many others. Applications to neural ....

Amari, S. (1982). Differential geometry of curved exponential families --- curvatures and information loss, Annals of Statistics, 10, pp.357--385.


Margin-constrained Random Projections And Very Sparse Random - Projections Ping Li   (Correct)

No context found.

Shun-Ichi Amari. Differential geometry of curved exponential families-curvatures and information loss. Annals of Statistics, 10(2):357--385, 1982.


Third-Order Efficiency Implies Fourth-Order Efficiency - Kano (1996)   (Correct)

No context found.

Amari, S. (1982). Differential geometry of curved exponential families: Curvatures and information loss. Ann. Statist., 10, 357-385.

