## Factorizing YAGO: scalable machine learning for linked data (2012)

Venue: | In WWW |

Citations: | 44 - 14 self |

### Citations

799 | Markov logic networks
- Richardson, Domingos
- 2006
Citation Context ...g approach is mandatory. But relational learning algorithms often require a considerable amount of prior knowledge about the domain of discourse, e.g. a knowledge base for Markov Logic Networks (MLN) =-=[24]-=- or the structure of a Bayesian Network. This can become an obstacle when applying Machine Learning to Linked Open Data, since it is difficult and expensive to gather this kind of knowledge manually or... |

698 | Tensor decompositions and applications
- Kolda, Bader
- 2009
Citation Context ... the Bayesian Clustered Tensor Factorization (BCTF) and applied it to various data sets of smaller and medium size. An extensive review of tensor decompositions and their applications can be found in =-=[20]-=-. In the context of the Semantic Web, Inductive Logic Programming (ILP) and kernel learning have been the dominant Machine Learning approaches so far [7, 9, 11]. Furthermore, [17] uses regularized mat... |

507 | OPTICS: Ordering Points To Identify the Clustering Structure
- Ankerst, Breunig, et al.
- 1999
Citation Context ...ter, Anime and Action Movie or Country and City. In total there exist 80 concepts and the maximum subclass-level is 3. A tensor representation of this data is of size 1519×1519×35. We selected OPTICS =-=[1]-=- as the hierarchical clustering algorithm to work in the latent-component space A. OPTICS is a density-based hierarchical clustering algorithm, which also provides an interpretable visual representati... |

467 | YAGO: A core of semantic knowledge
- Suchanek, Kasneci, et al.
- 2007
Citation Context ...this new reasoning paradigm for the Semantic Web. Here, in our approach to this challenge, we focus on the YAGO 2 ontology =-=[27]-=-, a large knowledge base that lies, along with other databases such as DBpedia [2], at the core of the LOD cloud. Applying Machine Learning to Linked Data at this scale however, is not trivial. For in... |

401 | The relationship between precision-recall and ROC curves
- Davis, Goadrich
- 2006
Citation Context ...there is a large skew in the distribution of existing and non-existing triples, we report the area under the precision-recall curve (AUC-PR) to evaluate the results, which is suitable for this setting =-=[10]-=-. Figure 4 shows the results of these experiments. It can be seen that RESCAL learns a reasonable model in both settings. The results for setting a) indicate that given enough support in the data, RES... |
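
The excerpt above does not say how AUC-PR was computed. As a minimal sketch, the area under the precision-recall curve can be estimated by average precision, which is one common estimator and may differ slightly from a trapezoidal or interpolated area. The function name `average_precision` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
def average_precision(labels, scores):
    """Estimate the area under the precision-recall curve as average precision.

    labels: sequence of 0/1 values (1 = the triple exists).
    scores: sequence of model scores (higher = more likely to exist).
    Ties in the scores are broken by input order (assumption of this sketch).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(label for _, label in ranked)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    hits = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            ap += hits / rank  # precision at the rank of this positive
    return ap / total_pos


# Toy example with two existing triples among four candidates.
print(round(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]), 4))  # 0.8333
```

Because the measure only looks at precision among the top-ranked candidates, it is not inflated by the large number of correctly rejected non-existing triples, which is why it suits skewed settings like the one described.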

173 | Collective classification in network data
- Sen, Deshpande, et al.
Citation Context ...ationship correlations across multiple interconnections of entities and relations. It is known that applying a collective learning method to relational data can improve learning results significantly =-=[26]-=-. For instance, consider the task of predicting the party membership of a president of the United States of America. Naturally, the party membership of the president and his vice president are highly ... |

136 | Linked data - the story so far
- Bizer, Heath, et al.
- 2009
Citation Context ...these papers is limited to classroom use, and personal use by others. WWW 2012, April 16–20, 2012, Lyon, France. ACM 978-1-4503-1229-5/12/04. 1. INTRODUCTION The Semantic Web’s Linked Open Data (LOD) =-=[6]-=- cloud is growing rapidly. At the time of this writing, it consists of around 300 interlinked databases, where some of these databases store billions of facts in the form of RDF triples. Thus, for the f... |

132 | Feature hashing for large scale multitask learning
- Weinberger, Dasgupta, et al.
Citation Context ...anding in terms of memory, especially A, a dense n×r matrix, such that if the domain contains a very large number of entities, additional dimensionality reduction techniques such as the “hashing trick” =-=[32]-=- might be required. 3.5 Adding Attributes to the Factorization In its original form, RESCAL doesn’t recognize attributes of entities explicitly. However, much information in the LOD cloud is in the fo... |

117 | PARAFAC: Tutorial and applications
- Bro
- 1997
Citation Context ...ries to support data mining constructs. [5] employs a coevolution-based genetic algorithm to learn kernels for RDF data. Probably most similar to our approach is TripleRank [12], which applies the CP =-=[8]-=- tensor decomposition to RDF graphs for faceted browsing. However, in contrast to the tensor factorization employed in this paper, CP isn’t capable of collective learning, which is an important featur... |

111 | Beyond streams and graphs: dynamic tensor analysis
- Sun, Tao, et al.
Citation Context ...radition in fields like psycho- and chemometrics, tensors and tensor factorizations have only recently been applied to Machine Learning, e.g. to incorporate dynamic aspects in network models. For instance, =-=[28]-=- presents methods for dynamic and streaming tensor analysis and applies them to network traffic and bibliographic data. In [23], a specialized tensor factorization for personalized item recommendation... |

107 | When owl:sameAs isn’t the same: An analysis of identity links on the Semantic Web
- Halpin, Hayes
- 2010
Citation Context ...he LOD cloud, due to its size, inherent noisiness and inconsistencies. Consider, for example, that owl:sameAs is often misused in the LOD cloud, leading to inconsistencies between different data sources =-=[13]-=-. Further examples include malformed datatype literals, undefined classes and properties, misuses of ontological terms [16] or the modeling of a simple fact such as Nancy Pelosi voted in favor of the ... |

81 | DBpedia: A nucleus for a web of open data
- Auer, Bizer, et al.
- 2008
Citation Context ...ew reasoning paradigm for the Semantic Web. Here, in our approach to this challenge, we focus on the YAGO 2 ontology [27], a large knowledge base that lies, along with other databases such as DBpedia =-=[2]-=-, at the core of the LOD cloud. Applying Machine Learning to Linked Data at this scale however, is not trivial. For instance, due to the linked nature of the data, using a relational learning approach... |

62 | A three-way model for collective learning on multi-relational data
- Nickel, Tresp, et al.
- 2011
Citation Context ... Web is based on RESCAL, a tensor factorization that has shown very good results in various canonical relational learning tasks such as link prediction, entity resolution or collective classification =-=[22]-=-. The main advantage of RESCAL, if compared to other tensor factorizations, is that it can exploit a collective learning effect when applied to relational data. Collective learning refers to the autom... |

52 | TripleRank: Ranking Semantic Web data by tensor decomposition
- Franz, Schultz, et al.
- 2009
Citation Context ...-ML [18] extends SPARQL queries to support data mining constructs. [5] employs a coevolution-based genetic algorithm to learn kernels for RDF data. Probably most similar to our approach is TripleRank =-=[12]-=-, which applies the CP [8] tensor decomposition to RDF graphs for faceted browsing. However, in contrast to the tensor factorization employed in this paper, CP isn’t capable of collective learning, wh... |

51 | A reasonable Semantic Web
- Hitzler, Harmelen
- 2010
Citation Context ...type literals, undefined classes and properties, misuses of ontological terms [16] or the modeling of a simple fact such as Nancy Pelosi voted in favor of the Health Care Bill using eight RDF triples =-=[15]-=-. Partial inconsistencies in the data or noise such as duplicate entities or predicates are direct consequences of the open nature of Linked Open Data. For this reason, it has been recently proposed t... |

51 | Weaving the pedantic web
- Hogan, Harth, et al.
- 2010
Citation Context ... in the LOD cloud, leading to inconsistencies between different data sources [13]. Further examples include malformed datatype literals, undefined classes and properties, misuses of ontological terms =-=[16]-=- or the modeling of a simple fact such as Nancy Pelosi voted in favor of the Health Care Bill using eight RDF triples [15]. Partial inconsistencies in the data or noise such as duplicate entities or p... |

50 | Factorizing Personalized Markov Chains for Next-basket Recommendation
- Rendle, Freudenthaler, et al.
- 2010
Citation Context ...ng, e.g. to incorporate dynamic aspects in network models. For instance, [28] presents methods for dynamic and streaming tensor analysis and applies them to network traffic and bibliographic data. In =-=[23]-=-, a specialized tensor factorization for personalized item recommendations is used to include information of the preceding transaction. For relational learning, [29] introduced the Bayesian Clustered ... |

47 | Statistical predicate invention
- Kok, Domingos
- 2007
Citation Context ...ents in the k-th predicate. Expressing data in terms of newly invented latent components is often referred to as predicate invention in statistical relational learning and considered a powerful asset =-=[19]-=-. To solve (1), [22] presents an efficient alternating least squares algorithm, which updates A and Rk iteratively until a convergence criterion is met. In the following, we will refer to this algorit... |

41 | Bayesian learning via stochastic gradient Langevin dynamics
- Welling, Teh
Citation Context ...mining good choices of regularization parameters via cross-validation can be tedious on large-scale data. Efficient methods for finding good parameter values, for instance scalable Bayesian methods such as =-=[33]-=-, could provide a solution. Acknowledgements We acknowledge funding by the German Federal Ministry of Economy and Technology (BMWi) under the THESEUS project and by the EU FP 7 Large-Scale Integrating Pr... |

39 | Modelling relational data using Bayesian clustered tensor factorization
- Sutskever, Salakhutdinov, et al.
Citation Context ...traffic and bibliographic data. In [23], a specialized tensor factorization for personalized item recommendations is used to include information of the preceding transaction. For relational learning, =-=[29]-=- introduced the Bayesian Clustered Tensor Factorization (BCTF) and applied it to various data sets of smaller and medium size. An extensive review of tensor decompositions and their applications can b... |

36 | DL-FOIL: Concept learning in description logics
- Fanizzi, d’Amato, et al.
Citation Context ...itions and their applications can be found in [20]. In the context of the Semantic Web, Inductive Logic Programming (ILP) and kernel learning have been the dominant Machine Learning approaches so far =-=[7, 9, 11]-=-. Furthermore, [17] uses regularized matrix factorization to predict unknown triples in Semantic Web data. Recently, [21] proposed to learn Relational Bayesian Classifiers for RDF data via queries to ... |

34 | Learning annotated hierarchies from relational data
- Roy, Kemp, et al.
- 2006
Citation Context ...his domain and to interpret the resulting clusters according to their members. However, there are only very few approaches that are able to compute a hierarchical clustering for multi-relational data =-=[25]-=-. To compute such a clustering with RESCAL, we use again the property of A to reflect the similarity of entities in the relational domain, and simply compute a clustering in the latent-component space... |

34 | Introduction to data mining
- Tan, Steinbach, et al.
- 2006
Citation Context ...vides an interpretable visual representation of its results. An example of this representation is shown in Figure 5c. To evaluate the quality of our clustering, we followed the procedure suggested in =-=[30]-=- and assign that F-measure score to a particular concept that is the highest for this concept out of all clusters. The idea behind this approach is that there should exist one cluster for each concept... |
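
The evaluation procedure quoted above can be sketched in a few lines: each concept receives the highest F-measure it achieves against any cluster. This is a minimal illustration, assuming concepts and clusters are given as plain sets of entity names; the function names and the toy data are hypothetical and do not come from the paper.

```python
def f_measure(concept, cluster):
    """F1 score of one cluster measured against one concept (both are sets)."""
    overlap = len(concept & cluster)
    if overlap == 0:
        return 0.0
    precision = overlap / len(cluster)
    recall = overlap / len(concept)
    return 2 * precision * recall / (precision + recall)


def best_f_per_concept(concepts, clusters):
    """Assign each concept the highest F-measure over all candidate clusters."""
    return {
        name: max((f_measure(members, cluster) for cluster in clusters), default=0.0)
        for name, members in concepts.items()
    }


# Toy data: two concepts and two clusters found in the latent space.
concepts = {
    "City": {"Lyon", "Paris", "Berlin"},
    "Country": {"France", "Germany"},
}
clusters = [{"Lyon", "Paris"}, {"Berlin", "France", "Germany"}]
print(best_f_per_concept(concepts, clusters))
```

For a hierarchical clustering, `clusters` would contain the clusters from every level of the hierarchy, so that coarse and fine concepts can each find their best match.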

29 | Learning of OWL class descriptions on very large knowledge bases
- Hellmann, Lehmann, et al.
- 2009
Citation Context ...ethods such as association rule mining and knowledge base fragment extraction have been applied to large Semantic Web databases for tasks like schema induction and learning complex class descriptions =-=[31, 14]-=-. To the best of our knowledge, there have yet not been any attempts to apply a general relational learning approach to knowledge bases of the size considered in this paper. 3. THE MODEL Our approach ... |

22 | Temporal analysis of semantic graphs using ASALSAN
- Bader, Harshman, et al.
- 2007
Citation Context ...necker product. However, computing the update steps of $R_k$ in this form would be intractable for large-scale data, since it involves the $r^2 \times n^2$ matrix $Z$. Fortunately, similar to the ASALSAN algorithm =-=[4]-=-, it is possible to use the QR decomposition of $A$ to simplify the update steps for $R_k$ significantly. The basic idea is to minimize for each $R_k$ a function that is equivalent to (2), namely $\min_{R_k} \| \hat{X}_k - \hat{A} R_k \hat{A}^T \|$... |
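
The excerpt above is cut off before the reduced problem is fully defined. As a sketch of why such a reduction works, assume the thin QR decomposition $A = Q\hat{A}$, where $Q$ is $n \times r$ with orthonormal columns and $\hat{A}$ is $r \times r$, and assume that $\hat{X}_k$ denotes $Q^T X_k Q$ (this definition is an assumption here, since the excerpt does not state it). Ignoring the regularization term:

```latex
\begin{align*}
\| X_k - A R_k A^T \|_F^2
  &= \| X_k - Q M Q^T \|_F^2, \qquad M = \hat{A} R_k \hat{A}^T \\
  &= \| X_k \|_F^2 - 2\,\langle Q^T X_k Q,\; M \rangle + \| M \|_F^2 \\
  &= \| Q^T X_k Q - M \|_F^2 + \| X_k \|_F^2 - \| Q^T X_k Q \|_F^2 .
\end{align*}
```

The last two terms do not depend on $R_k$, so both objectives share the same minimizer, while the reduced problem only involves $r \times r$ matrices.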

22 | Kernel methods for mining instance data in ontologies
- Bloehdorn, Sure
- 2007
Citation Context ...itions and their applications can be found in [20]. In the context of the Semantic Web, Inductive Logic Programming (ILP) and kernel learning have been the dominant Machine Learning approaches so far =-=[7, 9, 11]-=-. Furthermore, [17] uses regularized matrix factorization to predict unknown triples in Semantic Web data. Recently, [21] proposed to learn Relational Bayesian Classifiers for RDF data via queries to ... |

22 | Adding data mining support to SPARQL via statistical relational learning methods
- Kiefer, Bernstein, et al.
Citation Context ...torization to predict unknown triples in Semantic Web data. Recently, [21] proposed to learn Relational Bayesian Classifiers for RDF data via queries to a SPARQL endpoint. Also, the work on SPARQL-ML =-=[18]-=- extends SPARQL queries to support data mining constructs. [5] employs a coevolution-based genetic algorithm to learn kernels for RDF data. Probably most similar to our approach is TripleRank [12], wh... |

10 | Multivariate structured prediction for learning on semantic web
- Huang, Tresp, et al.
- 2010
Citation Context ...ions can be found in [20]. In the context of the Semantic Web, Inductive Logic Programming (ILP) and kernel learning have been the dominant Machine Learning approaches so far [7, 9, 11]. Furthermore, =-=[17]-=- uses regularized matrix factorization to predict unknown triples in Semantic Web data. Recently, [21] proposed to learn Relational Bayesian Classifiers for RDF data via queries to a SPARQL endpoint. ... |

9 | Creating knowledge out of interlinked data
- Auer, Lehmann
- 2010
Citation Context ...arning methods should assist knowledge engineers in the creation of ontologies, such that an automated system suggests new axioms for an ontology, which are added under the supervision of an engineer =-=[3]-=-. Here, we focus on the simpler task of learning a taxonomy from instance data. A taxonomy can be interpreted as a hierarchical grouping of instances. Consequently, a natural approach to learning a taxonomy... |

8 | Learning relational bayesian classifiers from RDF data
- Lin, Koul, et al.
- 2011
Citation Context ...ernel learning have been the dominant Machine Learning approaches so far [7, 9, 11]. Furthermore, [17] uses regularized matrix factorization to predict unknown triples in Semantic Web data. Recently, =-=[21]-=- proposed to learn Relational Bayesian Classifiers for RDF data via queries to a SPARQL endpoint. Also, the work on SPARQL-ML [18] extends SPARQL queries to support data mining constructs. [5] employs... |

5 | Relational kernel machines for learning from graph-structured RDF data
- Bicer, Tran, et al.
- 2011
Citation Context ...ently, [21] proposed to learn Relational Bayesian Classifiers for RDF data via queries to a SPARQL endpoint. Also, the work on SPARQL-ML [18] extends SPARQL queries to support data mining constructs. =-=[5]-=- employs a coevolution-based genetic algorithm to learn kernels for RDF data. Probably most similar to our approach is TripleRank [12], which applies the CP [8] tensor decomposition to RDF graphs for ... |

5 | Non-parametric statistical learning methods for inductive classifiers in semantic knowledge bases
- D’Amato, Fanizzi, et al.
- 2008
Citation Context ...itions and their applications can be found in [20]. In the context of the Semantic Web, Inductive Logic Programming (ILP) and kernel learning have been the dominant Machine Learning approaches so far =-=[7, 9, 11]-=-. Furthermore, [17] uses regularized matrix factorization to predict unknown triples in Semantic Web data. Recently, [21] proposed to learn Relational Bayesian Classifiers for RDF data via queries to ... |

2 | Statistical schema induction
- Völker, Niepert
Citation Context ...ethods such as association rule mining and knowledge base fragment extraction have been applied to large Semantic Web databases for tasks like schema induction and learning complex class descriptions =-=[31, 14]-=-. To the best of our knowledge, there have yet not been any attempts to apply a general relational learning approach to knowledge bases of the size considered in this paper. 3. THE MODEL Our approach ... |