Results 1 - 10
of
60
Joint inference in information extraction
- In Proceedings of the 22nd National Conference on Artificial Intelligence (2007
"... The goal of information extraction is to extract database records from text or semi-structured sources. Traditionally, information extraction proceeds by first segmenting each candidate record separately, and then merging records that refer to the same entities. While computationally efficient, this ..."
Abstract
-
Cited by 53 (7 self)
- Add to MetaCart
The goal of information extraction is to extract database records from text or semi-structured sources. Traditionally, information extraction proceeds by first segmenting each candidate record separately, and then merging records that refer to the same entities. While computationally efficient, this approach is suboptimal, because it ignores the fact that segmenting one candidate record can help to segment similar ones. For example, resolving a well-segmented field with a lessclear one can disambiguate the latter’s boundaries. In this paper we propose a joint approach to information extraction, where segmentation of all records and entity resolution are performed together in a single integrated inference process. While a number of previous authors have taken steps in this direction (e.g., Pasula et al. (2003), Wellner et al. (2004)), to our knowledge this is the first fully joint approach. In experiments on the CiteSeer and Cora citation matching datasets, joint inference improved accuracy, and our approach outperformed previous ones. Further, by using Markov logic and the existing algorithms for it, our solution consisted mainly of writing the appropriate logical formulas, and required much less engineering than previous ones.
Automatically Refining the Wikipedia Infobox Ontology
, 2008
"... The combined efforts of human volunteers have recently extracted numerous facts from Wikipedia, storing them as machine-harvestable object-attribute-value triples in Wikipedia infoboxes. Machine learning systems, such as Kylin, use these infoboxes as training data, accurately extracting even more se ..."
Abstract
-
Cited by 43 (7 self)
- Add to MetaCart
The combined efforts of human volunteers have recently extracted numerous facts from Wikipedia, storing them as machine-harvestable object-attribute-value triples in Wikipedia infoboxes. Machine learning systems, such as Kylin, use these infoboxes as training data, accurately extracting even more semantic knowledge from natural language text. But in order to realize the full power of this information, it must be situated in a cleanly-structured ontology. This paper introduces KOG, an autonomous system for refining Wikipedia’s infobox-class ontology towards this end. We cast the problem of ontology refinement as a machine learning problem and solve it using both SVMs and a more powerful joint-inference approach expressed in Markov Logic Networks. We present experiments demonstrating the superiority of the joint-inference approach and evaluating other aspects of our system. Using these techniques, we build a rich ontology, integrating Wikipedia’s infobox-class schemata with WordNet. We demonstrate how the resulting ontology may be used to enhance Wikipedia with improved query processing and other features.
Clp(bn): Constraint logic programming for probabilistic knowledge
- In Proceedings of the 19th Conference on Uncertainty in Artificial Intelligence (UAI03
, 2003
"... Abstract. In Datalog, missing values are represented by Skolem constants. More generally, in logic programming missing values, or existentially quantified variables, are represented by terms built from Skolem functors. The CLP(BN) language represents the joint probability distribution over missing v ..."
Abstract
-
Cited by 37 (6 self)
- Add to MetaCart
Abstract. In Datalog, missing values are represented by Skolem constants. More generally, in logic programming missing values, or existentially quantified variables, are represented by terms built from Skolem functors. The CLP(BN) language represents the joint probability distribution over missing values in a database or logic program by using constraints to represent Skolem functions. Algorithms from inductive logic programming (ILP) can be used with only minor modification to learn CLP(BN) programs. An implementation of CLP(BN) is publicly available as part of YAP Prolog at
Bottom-up learning of Markov logic network structure
- In Proceedings of the Twenty-Fourth International Conference on Machine Learning
, 2007
"... Markov logic networks (MLNs) are a statistical relational model that consists of weighted firstorder clauses and generalizes first-order logic and Markov networks. The current state-of-theart algorithm for learning MLN structure follows a top-down paradigm where many potential candidate structures a ..."
Abstract
-
Cited by 32 (5 self)
- Add to MetaCart
Markov logic networks (MLNs) are a statistical relational model that consists of weighted firstorder clauses and generalizes first-order logic and Markov networks. The current state-of-theart algorithm for learning MLN structure follows a top-down paradigm where many potential candidate structures are systematically generated without considering the data and then evaluated using a statistical measure of their fit to the data. Even though this existing algorithm outperforms an impressive array of benchmarks, its greedy search is susceptible to local maxima or plateaus. We present a novel algorithm for learning MLN structure that follows a more bottom-up approach to address this problem. Our algorithm uses a “propositional ” Markov network learning method to construct “template” networks that guide the construction of candidate clauses. Our algorithm significantly improves accuracy and learning time over the existing topdown approach in three real-world domains. 1.
Efficient weight learning for Markov logic networks
- In Proceedings of the Eleventh European Conference on Principles and Practice of Knowledge Discovery in Databases
, 2007
"... Abstract. Markov logic networks (MLNs) combine Markov networks and first-order logic, and are a powerful and increasingly popular representation for statistical relational learning. The state-of-the-art method for discriminative learning of MLN weights is the voted perceptron algorithm, which is ess ..."
Abstract
-
Cited by 31 (4 self)
- Add to MetaCart
Abstract. Markov logic networks (MLNs) combine Markov networks and first-order logic, and are a powerful and increasingly popular representation for statistical relational learning. The state-of-the-art method for discriminative learning of MLN weights is the voted perceptron algorithm, which is essentially gradient descent with an MPE approximation to the expected sufficient statistics (true clause counts). Unfortunately, these can vary widely between clauses, causing the learning problem to be highly ill-conditioned, and making gradient descent very slow. In this paper, we explore several alternatives, from per-weight learning rates to second-order methods. In particular, we focus on two approaches that avoid computing the partition function: diagonal Newton and scaled conjugate gradient. In experiments on standard SRL datasets, we obtain order-of-magnitude speedups, or more accurate models given comparable learning times. 1
Mapping and revising markov logic networks for transfer learning
- In Proceedings of the 22 nd National Conference on Artificial Intelligence (AAAI
, 2007
"... Transfer learning addresses the problem of how to leverage knowledge acquired in a source domain to improve the accuracy and speed of learning in a related target domain. This paper considers transfer learning with Markov logic networks (MLNs), a powerful formalism for learning in relational domains ..."
Abstract
-
Cited by 27 (4 self)
- Add to MetaCart
Transfer learning addresses the problem of how to leverage knowledge acquired in a source domain to improve the accuracy and speed of learning in a related target domain. This paper considers transfer learning with Markov logic networks (MLNs), a powerful formalism for learning in relational domains. We present a complete MLN transfer system that first autonomously maps the predicates in the source MLN to the target domain and then revises the mapped structure to further improve its accuracy. Our results in several real-world domains demonstrate that our approach successfully reduces the amount of time and training data needed to learn an accurate model of a target domain over learning from scratch.
Joint Unsupervised Coreference Resolution with Markov Logic
"... Machine learning approaches to coreference resolution are typically supervised, and require expensive labeled data. Some unsupervised approaches have been proposed (e.g., Haghighi and Klein (2007)), but they are less accurate. In this paper, we present the first unsupervised approach that is competi ..."
Abstract
-
Cited by 26 (5 self)
- Add to MetaCart
Machine learning approaches to coreference resolution are typically supervised, and require expensive labeled data. Some unsupervised approaches have been proposed (e.g., Haghighi and Klein (2007)), but they are less accurate. In this paper, we present the first unsupervised approach that is competitive with supervised ones. This is made possible by performing joint inference across mentions, in contrast to the pairwise classification typically used in supervised methods, and by using Markov logic as a representation language, which enables us to easily express relations like apposition and predicate nominals. On MUC and ACE datasets, our model outperforms Haghigi and Klein’s one using only a fraction of the training data, and often matches or exceeds the accuracy of state-of-the-art supervised models. 1
Discriminative Structure and Parameter Learning for Markov Logic Networks
"... Markov logic networks (MLNs) are an expressive representation for statistical relational learning that generalizes both first-order logic and graphical models. Existing methods for learning the logical structure of an MLN are not discriminative; however, many relational learning problems involve spe ..."
Abstract
-
Cited by 21 (5 self)
- Add to MetaCart
Markov logic networks (MLNs) are an expressive representation for statistical relational learning that generalizes both first-order logic and graphical models. Existing methods for learning the logical structure of an MLN are not discriminative; however, many relational learning problems involve specific target predicates that must be inferred from given background information. We found that existing MLN methods perform very poorly on several such ILP benchmark problems, and we present improved discriminative methods for learning MLN clauses and weights that outperform existing MLN and traditional ILP methods. 1.
Statistical predicate invention
- In Z. Ghahramani (Ed.), Proceedings of the 24’th annual international conference on machine learning (ICML-2007
, 2007
"... We propose statistical predicate invention as a key problem for statistical relational learning. SPI is the problem of discovering new concepts, properties and relations in structured data, and generalizes hidden variable discovery in statistical models and predicate invention in ILP. We propose an ..."
Abstract
-
Cited by 20 (7 self)
- Add to MetaCart
We propose statistical predicate invention as a key problem for statistical relational learning. SPI is the problem of discovering new concepts, properties and relations in structured data, and generalizes hidden variable discovery in statistical models and predicate invention in ILP. We propose an initial model for SPI based on second-order Markov logic, in which predicates as well as arguments can be variables, and the domain of discourse is not fully known in advance. Our approach iteratively refines clusters of symbols based on the clusters of symbols they appear in atoms with (e.g., it clusters relations by the clusters of the objects they relate). Since different clusterings are better for predicting different subsets of the atoms, we allow multiple cross-cutting clusterings. We show that this approach outperforms Markov logic structure learning and the recently introduced infinite relational model on a number of relational datasets. 1.
A General Method for Reducing the Complexity of Relational Inference And its Application to MCMC
"... Many real-world problems are characterized by complex relational structure, which can be succinctly represented in firstorder logic. However, many relational inference algorithms proceed by first fully instantiating the first-order theory and then working at the propositional level. The applicabilit ..."
Abstract
-
Cited by 18 (3 self)
- Add to MetaCart
Many real-world problems are characterized by complex relational structure, which can be succinctly represented in firstorder logic. However, many relational inference algorithms proceed by first fully instantiating the first-order theory and then working at the propositional level. The applicability of such approaches is severely limited by the exponential time and memory cost of propositionalization. Singla and Domingos (2006) addressed this by developing a “lazy ” version of the WalkSAT algorithm, which grounds atoms and clauses only as needed. In this paper we generalize their ideas to a much broader class of algorithms, including other types of SAT solvers and probabilistic inference methods like MCMC. Lazy inference is potentially applicable whenever variables and functions have default values (i.e., a value that is much more frequent than the others). In relational domains, the default is false for atoms and true for clauses. We illustrate our framework by applying it to MC-SAT, a state-of-the-art MCMC algorithm. Experiments on a number of real-world domains show that lazy inference reduces both space and time by several orders of magnitude, making probabilistic relational inference applicable in previously infeasible domains.

