Results 1–10 of 81
Markov Logic Networks
Machine Learning, 2006
"... We propose a simple approach to combining firstorder logic and probabilistic graphical models in a single representation. A Markov logic network (MLN) is a firstorder knowledge base with a weight attached to each formula (or clause). Together with a set of constants representing objects in the ..."
Abstract

Cited by 816 (39 self)
We propose a simple approach to combining first-order logic and probabilistic graphical models in a single representation. A Markov logic network (MLN) is a first-order knowledge base with a weight attached to each formula (or clause). Together with a set of constants representing objects in the domain, it specifies a ground Markov network containing one feature for each possible grounding of a first-order formula in the KB, with the corresponding weight. Inference in MLNs is performed by MCMC over the minimal subset of the ground network required for answering the query. Weights are efficiently learned from relational databases by iteratively optimizing a pseudo-likelihood measure. Optionally, additional clauses are learned using inductive logic programming techniques. Experiments with a real-world database and knowledge base in a university domain illustrate the promise of this approach.
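The weighted-formula semantics in this abstract can be made concrete with a toy example. The sketch below (my own construction, not the paper's code) grounds a single weighted clause over two constants and computes an exact conditional by brute-force enumeration; real MLN inference uses MCMC over the relevant subnetwork instead.

```python
import itertools, math

# Toy MLN: one weighted clause Smokes(x) => Cancer(x) with weight 1.5,
# grounded over two constants. The probability of a world is proportional
# to exp(sum_i w_i * n_i), where n_i counts true groundings of clause i.

constants = ["Anna", "Bob"]
weight = 1.5

def clause_true(world, person):
    # Smokes(p) => Cancer(p) is false only when Smokes(p) holds and Cancer(p) does not.
    return not (world[("Smokes", person)] and not world[("Cancer", person)])

def unnormalized_prob(world):
    n_true = sum(clause_true(world, p) for p in constants)
    return math.exp(weight * n_true)

# Enumerate all 2^4 worlds over the ground atoms to compute an exact conditional.
atoms = [(pred, p) for pred in ("Smokes", "Cancer") for p in constants]
worlds = [dict(zip(atoms, vals))
          for vals in itertools.product([False, True], repeat=len(atoms))]

num = sum(unnormalized_prob(w) for w in worlds
          if w[("Smokes", "Anna")] and w[("Cancer", "Anna")])
den = sum(unnormalized_prob(w) for w in worlds if w[("Smokes", "Anna")])
print(round(num / den, 3))   # P(Cancer(Anna) | Smokes(Anna)) = sigmoid(1.5) ≈ 0.818
```

With a single clause the conditional reduces to a logistic function of the weight, which is a useful sanity check for the enumeration.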
Positive Semidefinite Metric Learning with Boosting
"... The learning of appropriate distance metrics is a critical problem in image classification and retrieval. In this work, we propose a boostingbased technique, termed BOOSTMETRIC, for learning a Mahalanobis distance metric. One of the primary difficulties in learning such a metric is to ensure that t ..."
Abstract

Cited by 26 (1 self)
The learning of appropriate distance metrics is a critical problem in image classification and retrieval. In this work, we propose a boosting-based technique, termed BOOSTMETRIC, for learning a Mahalanobis distance metric. One of the primary difficulties in learning such a metric is to ensure that the Mahalanobis matrix remains positive semidefinite. Semidefinite programming is sometimes used to enforce this constraint, but does not scale well. BOOSTMETRIC is instead based on a key observation that any positive semidefinite matrix can be decomposed into a linear positive combination of trace-one rank-one matrices. BOOSTMETRIC thus uses rank-one positive semidefinite matrices as weak learners within an efficient and scalable boosting-based learning process. The resulting method is easy to implement, does not require tuning, and can accommodate various types of constraints. Experiments on various datasets show that the proposed algorithm compares favorably to state-of-the-art methods in terms of classification accuracy and running time.
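The decomposition the abstract relies on is easy to verify numerically. The snippet below (illustrative only) rebuilds a PSD matrix from nonnegative combinations of trace-one rank-one terms via its eigendecomposition; the boosting procedure in the paper finds such rank-one "weak learners" incrementally rather than all at once.

```python
import numpy as np

# Any PSD matrix M = sum_j w_j * (u_j u_j^T) with w_j >= 0 and each u_j
# unit-norm, so each u_j u_j^T has trace one. Eigendecomposition exhibits
# exactly such a decomposition.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
M = A @ A.T                              # PSD by construction

eigvals, eigvecs = np.linalg.eigh(M)
terms = [lam * np.outer(v, v) for lam, v in zip(eigvals, eigvecs.T)]
M_rebuilt = sum(terms)

print(np.allclose(M, M_rebuilt))                         # exact reconstruction
print(all(lam >= -1e-12 for lam in eigvals))             # nonnegative weights
print(np.isclose(np.linalg.norm(eigvecs[:, 0]), 1.0))    # unit-norm factor => trace one
```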
Accounting for Burstiness in Topic Models
"... Many different topic models have been used successfully for a variety of applications. However, even stateoftheart topic models suffer from the important flaw that they do not capture the tendency of words to appear in bursts; it is a fundamental property of language that if a word is used once i ..."
Abstract

Cited by 21 (0 self)
Many different topic models have been used successfully for a variety of applications. However, even state-of-the-art topic models suffer from the important flaw that they do not capture the tendency of words to appear in bursts; it is a fundamental property of language that if a word is used once in a document, it is more likely to be used again. We introduce a topic model that uses Dirichlet compound multinomial (DCM) distributions to model this burstiness phenomenon. On both text and non-text datasets, the new model achieves better held-out likelihood than standard latent Dirichlet allocation (LDA). It is straightforward to incorporate the DCM extension into topic models that are more complex than LDA.
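The burstiness property can be checked with a small calculation (my own construction, not the paper's experiments): under a DCM with a small symmetric concentration, a count vector with one word repeated is far more probable than a spread-out vector with the same total, while a uniform multinomial prefers the spread.

```python
from math import lgamma, log

def log_multinomial_coeff(counts):
    n = sum(counts)
    return lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)

def dcm_log_prob(counts, alpha):
    # P(x) = multcoeff * Gamma(A)/Gamma(A+n) * prod_k Gamma(alpha+x_k)/Gamma(alpha)
    n, K = sum(counts), len(counts)
    A = alpha * K
    return (log_multinomial_coeff(counts) + lgamma(A) - lgamma(A + n)
            + sum(lgamma(alpha + c) - lgamma(alpha) for c in counts))

def multinomial_log_prob(counts, p):
    return log_multinomial_coeff(counts) + sum(c * log(p) for c in counts)

bursty = [4, 0, 0, 0]   # the same word four times
spread = [1, 1, 1, 1]   # four distinct words

print(dcm_log_prob(bursty, 0.1) > dcm_log_prob(spread, 0.1))                    # True
print(multinomial_log_prob(bursty, 0.25) < multinomial_log_prob(spread, 0.25))  # True
```

Intuitively, once a count for a word is nonzero, the DCM's posterior-like updating makes further occurrences of that same word cheaper, which is exactly the reuse tendency the abstract describes.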
On the Behavior of the Gradient Norm in the Steepest Descent Method
, 2000
"... It is well known that the norm of the gradient may be unreliable as a stopping test in unconstrained optimization, and that it often exhibits oscillations in the course of the optimization. In this paper we present results describing the properties of the gradient norm for the steepest descent me ..."
Abstract

Cited by 18 (0 self)
It is well known that the norm of the gradient may be unreliable as a stopping test in unconstrained optimization, and that it often exhibits oscillations in the course of the optimization. In this paper we present results describing the properties of the gradient norm for the steepest descent method applied to quadratic objective functions. We also make some general observations that apply to nonlinear problems, relating the gradient norm, the objective function value, and the path generated by the iterates.
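The erratic behavior of the gradient norm is easy to reproduce on a toy problem (this example is mine, not taken from the paper). With exact line search on an ill-conditioned quadratic, the function value decreases monotonically, but the per-step reduction in the gradient norm swings between near-stagnation and near-elimination, which is why it makes a poor stopping test.

```python
import numpy as np

# Steepest descent with exact line search on f(x) = 0.5 x^T A x.
A = np.diag([1.0, 100.0])
x = np.array([100.0, 0.01])

grad_norms = []
for _ in range(8):
    g = A @ x
    grad_norms.append(np.linalg.norm(g))
    t = (g @ g) / (g @ A @ g)   # exact minimizing step for a quadratic
    x = x - t * g

ratios = [b / a for a, b in zip(grad_norms, grad_norms[1:])]
print([round(r, 3) for r in ratios])   # reduction factors alternate: tiny, near 1, tiny, ...
```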
Nonsmooth optimization via BFGS
, 2008
"... We investigate the BFGS algorithm with an inexact line search when applied to nonsmooth functions, not necessarily convex. We define a suitable line search and show that it generates a sequence of nested intervals containing points satisfying the Armijo and weak Wolfe conditions, assuming only abs ..."
Abstract

Cited by 16 (2 self)
We investigate the BFGS algorithm with an inexact line search when applied to nonsmooth functions, not necessarily convex. We define a suitable line search and show that it generates a sequence of nested intervals containing points satisfying the Armijo and weak Wolfe conditions, assuming only absolute continuity. We also prove that the line search terminates for all semialgebraic functions. The analysis of the convergence of BFGS using this line search seems very challenging; our theoretical results are limited to the univariate case. However, we systematically investigate the numerical behavior of BFGS with the inexact line search on various classes of examples. The method consistently converges to local minimizers on all but the most difficult class of examples, and even in that case, the method converges to points that are apparently Clarke stationary. Furthermore, the convergence rate is observed to be linear with respect to the number of function evaluations, with a rate of convergence that varies in an unexpectedly consistent way with the problem parameters. When the problem is sufficiently difficult, convergence may not be observed, but this seems to be due to rounding error caused by ill-conditioning. We try to give insight into why BFGS works as well as it does, and we conclude with a bold conjecture.
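The nested-interval line search described here can be sketched as a bracketing search (function names and constants below are my choices, not the paper's): double the step while the interval is unbounded, bisect otherwise, until both the Armijo and weak Wolfe conditions hold.

```python
import numpy as np

def weak_wolfe_search(f, grad, x, d, c1=1e-4, c2=0.9, max_iter=50):
    # Maintain a nested interval [lo, hi] containing acceptable steps:
    # Armijo:     f(x + t*d) <= f(x) + c1 * t * g'd
    # weak Wolfe: grad(x + t*d)'d >= c2 * g'd
    f0, g0d = f(x), grad(x) @ d
    lo, hi, t = 0.0, float("inf"), 1.0
    for _ in range(max_iter):
        if f(x + t * d) > f0 + c1 * t * g0d:      # Armijo fails: shrink from above
            hi = t
        elif grad(x + t * d) @ d < c2 * g0d:      # weak Wolfe fails: grow from below
            lo = t
        else:
            return t
        t = 2.0 * lo if hi == float("inf") else (lo + hi) / 2.0
    return t

# Nonsmooth example: f(x) = |x|, using sign(x) as a (sub)gradient.
f = lambda x: float(np.abs(x).sum())
grad = lambda x: np.sign(x)
t = weak_wolfe_search(f, grad, np.array([1.0]), np.array([-1.0]))
print(t)   # 1.0: the step lands exactly at the kink, where both conditions hold
```

Note how the weak Wolfe condition, unlike the strong one, is satisfiable at the kink of a nonsmooth function; that is what makes this search suitable for the setting the paper studies.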
GloptLab, a configurable framework for the rigorous global solution of quadratic constraint satisfaction problems
"... solution of quadratic constraint satisfaction problems ..."
Abstract

Cited by 12 (8 self)
A dual-weighted trust-region adaptive POD 4D-Var applied to a finite-element shallow-water equations model
 International Journal for Numerical Methods in Fluids 2009; DOI: 10.1002/fld.2198
"... In this paper we study solutions of an inverse problem for a global shallow water model controlling its initial conditions specified from the 40yr ECMWF Reanalysis (ERA40) data sets, in the presence of full or incomplete observations being assimilated in a time interval (window of assimilation) w ..."
Abstract

Cited by 12 (3 self)
In this paper we study solutions of an inverse problem for a global shallow water model controlling its initial conditions, specified from the 40-yr ECMWF Re-Analysis (ERA-40) data sets, in the presence of full or incomplete observations being assimilated in a time interval (window of assimilation), with or without background error covariance terms. As an extension of the work by Chen et al. (Int. J. Numer. Meth. Fluids 2009), we attempt to obtain a reduced-order model of the above inverse problem, based on proper orthogonal decomposition (POD), referred to as POD 4D-Var, for a finite volume global shallow water equation model based on the Lin–Rood flux-form semi-Lagrangian semi-implicit time integration scheme. Different approaches of POD implementation for the reduced inverse problem are compared, including a dual-weighted method for snapshot selection coupled with a trust-region POD adaptivity approach. Numerical results with various observational densities and background error covariance operators are also presented. The POD 4D-Var results combined with trust-region adaptivity are similar, in terms of various error metrics, to the full 4D-Var results, but are obtained using significantly fewer minimization iterations and less CPU time. Based on our previous and current work, we conclude that POD 4D-Var certainly warrants further study, with promising potential for its extension
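The core POD step in this abstract can be sketched in a few lines (illustrative only, not the paper's code): collect state snapshots as columns, take the leading left singular vectors as the reduced basis, and project. Dual weighting would scale the snapshots by sensitivity weights before the SVD; here all weights are one.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, r = 200, 30, 5                  # state dim, snapshot count, retained modes
snapshots = rng.standard_normal((n, 3)) @ rng.standard_normal((3, m))
snapshots += 0.01 * rng.standard_normal((n, m))    # nearly rank-3 "dynamics" + noise

U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
basis = U[:, :r]                       # POD modes (orthonormal columns)
reduced = basis.T @ snapshots          # r-dimensional representation of each state
reconstructed = basis @ reduced

rel_err = np.linalg.norm(snapshots - reconstructed) / np.linalg.norm(snapshots)
print(rel_err < 0.05)   # a handful of modes captures the low-rank dynamics
```

In a POD 4D-Var setting, the minimization is then carried out over the r reduced coefficients instead of the full n-dimensional state, which is where the savings in iterations and CPU time come from.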
On affine-scaling interior-point Newton methods for nonlinear minimization with bound constraints
 Computational Optimization and Applications
"... Abstract. A class of new affinescaling interiorpoint Newtontype methods are considered for the solution of optimization problems with bound constraints. The methods are shown to be locally quadratically convergent under the strong second order sufficiency condition without assuming strict compl ..."
Abstract

Cited by 9 (2 self)
A class of new affine-scaling interior-point Newton-type methods is considered for the solution of optimization problems with bound constraints. The methods are shown to be locally quadratically convergent under the strong second-order sufficiency condition, without assuming strict complementarity of the solution. The new methods differ from previous ones by Coleman and Li [Mathematical Programming, 67 (1994), pp. 189–224] and Heinkenschloss, Ulbrich, and Ulbrich [Mathematical Programming, 86 (1999), pp. 615–635] mainly in the choice of the scaling matrix. The scaling matrices used here have stronger smoothness properties and allow the application of standard results from nonsmooth analysis in order to obtain a relatively short and elegant local convergence result. An important tool for the definition of the new scaling matrices is the correct identification of the degenerate indices. Some illustrative numerical results with a comparison of the different scaling techniques are also included. Key Words: Newton's method, affine scaling, interior-point method, quadratic convergence, identification of active constraints.
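To make the "scaling matrix" concrete, the sketch below shows a Coleman–Li style choice, the classical baseline the abstract compares against (names are mine; the paper's contribution is a smoother alternative to this): the i-th diagonal entry measures the distance to the bound that the negative gradient pushes toward.

```python
import numpy as np

def coleman_li_scaling(x, g, l, u):
    # Distance-to-active-bound scaling for bounds l <= x <= u.
    d = np.ones_like(x)
    for i in range(len(x)):
        if g[i] > 0 and np.isfinite(l[i]):
            d[i] = x[i] - l[i]          # -g pushes x[i] toward the lower bound
        elif g[i] < 0 and np.isfinite(u[i]):
            d[i] = u[i] - x[i]          # -g pushes x[i] toward the upper bound
    return np.diag(d)

x = np.array([0.2, 0.9])
g = np.array([1.0, -1.0])
l = np.array([0.0, 0.0])
u = np.array([1.0, 1.0])
print(np.diag(coleman_li_scaling(x, g, l, u)))   # entries shrink near the active bounds
```

The scaled Newton step then damps movement in coordinates close to their bounds; the non-smoothness of this switch (the `if g[i] > 0` test) is exactly the property the paper's smoother scalings are designed to avoid.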
Construction of nondiagonal background error covariance matrices for global chemical data assimilation
 Geosci. Model Dev. Discuss.,
, 2010
"... Abstract. Chemical data assimilation attempts to optimally use noisy observations along with imperfect model predictions to produce a better estimate of the chemical state of the atmosphere. It is widely accepted that a key ingredient for successful data assimilation is a realistic estimation of th ..."
Abstract

Cited by 8 (6 self)
Chemical data assimilation attempts to optimally use noisy observations along with imperfect model predictions to produce a better estimate of the chemical state of the atmosphere. It is widely accepted that a key ingredient for successful data assimilation is a realistic estimation of the background error distribution. Particularly important is the specification of the background error covariance matrix, which contains information about the magnitude of the background errors and about their correlations. As models evolve toward finer resolutions, the use of diagonal background covariance matrices is increasingly inaccurate, as they capture less of the spatial error correlations. This paper discusses an efficient computational procedure for constructing nondiagonal background error covariance matrices which account for the spatial correlations of errors. The correlation length scales are specified by the user; a correct choice of correlation lengths is important for good performance of the data assimilation system. The benefits of using the nondiagonal covariance matrices for variational data assimilation with chemical transport models are illustrated.
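A minimal sketch of such a nondiagonal covariance on a 1-D grid (my illustration; the paper works on a global 3-D chemical grid): B = S C S, where S is diagonal with the background error standard deviations and C carries correlations that decay with distance over a user-chosen length scale L.

```python
import numpy as np

def background_covariance(coords, sigma, L):
    # Exponential (first-order auto-regressive) correlation model: positive
    # definite, with correlations decaying over the length scale L.
    dist = np.abs(coords[:, None] - coords[None, :])
    C = np.exp(-dist / L)
    return np.diag(sigma) @ C @ np.diag(sigma)

coords = np.linspace(0.0, 100.0, 11)   # grid point locations (e.g. km)
sigma = np.full(11, 2.0)               # background error standard deviations
B = background_covariance(coords, sigma, L=25.0)

# Diagonal entries are the variances; off-diagonal entries are nonzero,
# unlike the diagonal approximation the abstract criticizes.
print(B.shape, round(B[0, 0], 2), round(B[0, 1], 2))
```

In a variational system this B enters the background term of the cost function, so the choice of L directly controls how far an observation's influence spreads spatially.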