Results 1 - 2 of 2
A Hierarchy of Independence Assumptions for Multi-relational Bayes Net Classifiers
Abstract—Many databases store data in relational format, with different types of entities and information about their attributes and links between the entities. Link-based classification (LBC) is the problem of predicting the class attribute of a target entity given the attributes of entities linked to it. In this paper we propose a new relational Bayes net classifier method for LBC, which assumes that different links of an object are independently drawn from the same distribution, given attribute information from the linked tables. We show that this assumption allows very fast multi-relational Bayes net learning. We define three more independence assumptions for LBC to unify proposals from different researchers in a single novel hierarchy. Our proposed model is at the top and the well-known multi-relational Naive Bayes classifier is at the bottom of this hierarchy. The model in each level of the hierarchy uses a new independence assumption in addition to the assumptions used in the higher levels. In experiments on four benchmark datasets, our proposed link independence model has the best predictive accuracy compared to the hierarchy models and a variety of relational classifiers.
Aggregating Predictions vs. Aggregating Features for Relational Classification
Abstract—Relational data classification is the problem of predicting a class label of a target entity given information about features of the entity, of the related entities, or neighbors, and of the links. This paper compares two fundamental approaches to relational classification: aggregating the features of entities related to a target instance, or aggregating the probabilistic predictions based on the features of each entity related to the target instance. Our experiments compare different relational classifiers on sports, financial, and movie data. We examine the strengths and weaknesses of both score and feature aggregation, both conceptually and empirically. The performance of a single aggregate operator (e.g., average) can vary widely across datasets, for both feature and score aggregation. Aggregate features can be adapted to a dataset by learning with a set of aggregate features. Used adaptively, aggregate features outperformed learning with