Learning Graphical Models for Relational Data via Lattice Search
- MACHINE LEARNING
Cited by 8 (7 self)
Many machine learning applications that involve relational databases incorporate first-order logic and probability. Relational extensions of generative graphical models include Parametrized Bayes Net [44] and Markov Logic Networks (MLNs). Many of the current state-of-the-art algorithms for learning MLNs have focused on relatively small datasets with few descriptive attributes, where predicates are mostly binary and the main task is usually prediction of links between entities. This paper addresses what is in a sense a complementary problem: learning the structure of a graphical model that models the distribution of discrete descriptive attributes given the links between entities in a relational database. Descriptive attributes are usually nonbinary and can be very informative, but they increase the search space of possible candidate clauses. We present an efficient new algorithm for learning a Parametrized Bayes Net that performs a level-wise search through the table join lattice for relational dependencies. From the Bayes net we obtain an MLN structure via a standard moralization procedure for converting directed models to undirected models. Learning MLN structure by moralization is 200-1000 times faster and scores substantially higher in predictive accuracy than benchmark MLN algorithms on three relational databases.
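The moralization step mentioned above can be illustrated with a minimal sketch. It assumes the directed model is given as a plain mapping from each node to its parents; the function name `moralize` and the three-node example are hypothetical and are not taken from the paper. Moralization keeps every directed edge as an undirected one and additionally connects every pair of parents that share a child.

```python
from itertools import combinations


def moralize(parents):
    """Return the undirected edge set of the moral graph.

    `parents` maps each node to the list of its parents in the directed graph.
    """
    edges = set()
    for child, pars in parents.items():
        # Keep every directed edge, dropping its direction.
        for p in pars:
            edges.add(frozenset((p, child)))
        # "Marry" the parents: connect every pair that shares this child.
        for a, b in combinations(pars, 2):
            edges.add(frozenset((a, b)))
    return edges


# Hypothetical example: intelligence and difficulty both influence grade.
bn = {"grade": ["intelligence", "difficulty"], "intelligence": [], "difficulty": []}
moral_edges = moralize(bn)
```

In this example the moral graph gains one edge that the directed graph lacks, between `intelligence` and `difficulty`, because both are parents of `grade`.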
Virtual joins with nonexistent links
2009
Cited by 4 (4 self)
Many approaches to multi-relational learning require the computation of database frequencies in the presence of nonexistent links. The corresponding ILP problem is to compute the number of groundings that satisfy a given conjunction of literals in a relational database, where one or more of the literals is negated. We present a fast new dynamic programming algorithm for this problem. The database table joins performed by our algorithm are restricted to joins of tables already existing in the database. Evaluation on three data sets confirms the efficiency of our algorithm; computing frequencies for negated literals added about 15% to the cost of computing frequencies for positive literals only.
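A small sketch may help show why counting with a negated link does not require materializing the nonexistent links. It covers only the simplest case of a single negated literal and uses a subtraction identity: the number of pairs without a link equals the number of all candidate pairs minus the number of pairs with the link. This is not the paper's dynamic programming algorithm, and the table contents and the function name `count_not_registered` are hypothetical.

```python
# Hypothetical toy database: student -> intelligence, a course list,
# and the set of existing Registered(student, course) links.
students = {"anna": "high", "bob": "low", "cho": "high"}
courses = ["cs101", "cs201"]
registered = {("anna", "cs101"), ("bob", "cs101"), ("cho", "cs201"), ("anna", "cs201")}


def count_not_registered(students, courses, registered, intelligence):
    """Count (student, course) pairs with the given intelligence and NO link."""
    course_set = set(courses)
    matching = [s for s, level in students.items() if level == intelligence]
    # Size of the cross product, computed from table sizes without building it.
    all_pairs = len(matching) * len(courses)
    # Pairs that satisfy the positive version of the link literal.
    linked = sum(
        1
        for s, c in registered
        if students.get(s) == intelligence and c in course_set
    )
    return all_pairs - linked


high_unlinked = count_not_registered(students, courses, registered, "high")
```

Only the existing `registered` table is scanned; the complement relation is never stored.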
Simple Decision Forests for Multi-Relational Classification
Cited by 2 (2 self)
An important task in multi-relational data mining is link-based classification which takes advantage of attributes of links and linked entities, to predict the class label. The relational naive Bayes classifier exploits independence assumptions to achieve scalability. We introduce a weaker independence assumption to the effect that information from different data tables is independent given the class label. The independence assumption entails a closed-form formula for combining probabilistic predictions based on decision trees learned on different database tables. Logistic regression learns different weights for information from different tables and prunes irrelevant tables. In experiments, learning was very fast with competitive accuracy.
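One way to read the combination step is as a weighted log-linear product of per-table class posteriors. The sketch below assumes the standard consequence of conditional independence given the class, P(c | x_1, ..., x_m) proportional to P(c) times the product of P(c | x_i) / P(c), with one weight per table. The exact formula in the paper may differ; the function name `combine`, the weights, and the numbers are hypothetical, and all probabilities are assumed to be strictly positive.

```python
import math


def combine(prior, per_table, weights=None):
    """Combine per-table class posteriors into one normalized posterior.

    prior:     dict mapping class -> P(class)
    per_table: list of dicts mapping class -> P(class | evidence in that table)
    weights:   one weight per table; a weight of 0 ignores that table
    """
    if weights is None:
        weights = [1.0] * len(per_table)
    log_scores = {}
    for c, p_c in prior.items():
        score = math.log(p_c)
        for w, posterior in zip(weights, per_table):
            # Each table contributes its weighted log-ratio to the prior.
            score += w * (math.log(posterior[c]) - math.log(p_c))
        log_scores[c] = score
    # Normalize so that the class probabilities sum to one.
    z = sum(math.exp(s) for s in log_scores.values())
    return {c: math.exp(s) / z for c, s in log_scores.items()}


prior = {"pos": 0.5, "neg": 0.5}
tables = [{"pos": 0.8, "neg": 0.2}, {"pos": 0.6, "neg": 0.4}]
combined = combine(prior, tables)
first_table_only = combine(prior, tables, weights=[1.0, 0.0])
```

Setting a table's weight to zero removes its influence, which mirrors the role the abstract assigns to logistic regression in pruning irrelevant tables.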
Class-Level Bayes Nets for Relational Data
2009
Many databases store data in relational format, with different types of entities and information about links between the entities. The field of statistical-relational learning has developed a number of new statistical models for such data. Most of these models aim to support instance-level predictions about the attributes or links of specific entities. In this paper we focus on learning class-level dependencies, which model the database statistics over attributes of links and linked objects. Class-level statistical relationships are of interest in themselves, and they support applications like policy making, strategic planning, and query optimization. While a class-level model does not support instance-level predictions, learning and inference are simpler at the class level. We describe efficient and scalable algorithms for structure learning and parameter estimation in class-level Bayes nets that directly leverage the efficiency of single-table non-relational Bayes net learners. An evaluation of our methods on three data sets shows that our algorithms are computationally feasible for realistic table sizes, and that the learned structures represent the statistical information in the databases well. After learning has compiled the database statistics into a Join Bayes net, querying these statistics via the net is faster than querying them directly with SQL, and does not depend on the size of the database.
Learning Class-Level Bayes Nets for Relational Data
Many databases store data in relational format, with different types of entities and information about links between the entities. The field of statistical-relational learning (SRL) has developed a number of new statistical models for such data. In this paper we focus on learning class-level or first-order dependencies, which model the general database statistics over attributes of linked objects and links (e.g., the percentage of A grades given in computer science classes). Class-level statistical relationships are important in themselves, and they support applications like policy making, strategic planning, and query optimization. Most current SRL methods find class-level dependencies, but their main task is to support instance-level predictions about the attributes or links of specific entities. We focus only on class-level prediction, and describe algorithms for learning class-level models that are orders of magnitude faster for this task. Our algorithms learn Bayes nets with relational structure, leveraging the efficiency of single-table nonrelational Bayes net learners. An evaluation of our methods on three data sets shows that they are computationally feasible for realistic table sizes, and that the learned structures represent the statistical information in the databases well. After learning compiles the database statistics into a Bayes net, querying these statistics via Bayes net inference is faster than with SQL queries, and does not depend on the size of the database.
INDEPENDENCE ASSUMPTIONS FOR MULTI-RELATIONAL CLASSIFICATION
The author, whose copyright is declared on the title page of this work, has granted to Simon Fraser University the right to lend this thesis, project or extended essay to users of the Simon Fraser University Library, and to make partial or single copies only for such users or in response to a request from the library of any other university, or other educational institution, on its own behalf or for one of its users. The author has further granted permission to Simon Fraser University to keep or make a digital copy for use in its circulating collection (currently available to the public at the “Institutional Repository” link of the SFU Library website <www.lib.sfu.ca> at: