YAGO2: A Spatially and Temporally Enhanced Knowledge Base from Wikipedia (2012)

by J Hoffart, F M Suchanek, K Berberich, G Weikum

Results 1 - 10 of 158

YAGO2: Exploring and Querying World Knowledge in Time, Space, Context, and Many Languages

by Johannes Hoffart, Fabian M. Suchanek, Klaus Berberich, Edwin Lewis-Kelham, Gerard de Melo, Gerhard Weikum
"... We present YAGO2, an extension of the YAGO knowledge base with focus on temporal and spatial knowledge. It is automatically built from Wikipedia, GeoNames, and Word-Net, and contains nearly 10 million entities and events, as well as 80 million facts representing general world knowledge. An enhanced ..."
Abstract - Cited by 61 (5 self) - Add to MetaCart
We present YAGO2, an extension of the YAGO knowledge base with focus on temporal and spatial knowledge. It is automatically built from Wikipedia, GeoNames, and WordNet, and contains nearly 10 million entities and events, as well as 80 million facts representing general world knowledge. An enhanced data representation introduces time and location as first-class citizens. The wealth of spatio-temporal information in YAGO can be explored either graphically or through a special time- and space-aware query language.
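
The phrase "time and location as first-class citizens" refers to the paper's extension of subject-predicate-object triples into SPOTL(X) tuples that also carry time, location, and context. As a rough illustration only (the field names and the example fact below are ours, not YAGO2's actual schema), such a fact could be modeled as:

    # Minimal sketch of a spatio-temporally anchored fact, in the spirit of
    # YAGO2's SPOTL(X) tuples. Field names and the example are illustrative.
    from dataclasses import dataclass
    from datetime import date
    from typing import Optional, Tuple

    @dataclass
    class SpotlFact:
        subject: str
        predicate: str
        obj: str
        valid_from: Optional[date] = None               # temporal dimension
        valid_until: Optional[date] = None
        location: Optional[Tuple[float, float]] = None  # (lat, lon) spatial dimension
        context: Optional[str] = None                   # keywords / textual context

    fact = SpotlFact(
        subject="Albert_Einstein",
        predicate="wasBornIn",
        obj="Ulm",
        valid_from=date(1879, 3, 14),
        valid_until=date(1879, 3, 14),
        location=(48.40, 9.99),  # approximate coordinates of Ulm
        context="physicist",
    )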

Citation Context

...han 80 million facts for 9.8 million entities, plus 76 million keywords. Sampling-based manual assessment of over 7000 facts shows that YAGO2 has a precision of 95%. Detailed results are available in [6]. This demo allows querying and visualizing the wealth of data in YAGO2. For this purpose, we have extended the classical subject-predicate-object knowledge representation by three more dimensions, a ...

Knowledge Vault: A Web-scale approach to probabilistic knowledge fusion

by Xin Luna Dong, Kevin Murphy, Thomas Strohmann, Shaohua Sun, Wei Zhang - In submission, 2014
"... Recent years have witnessed a proliferation of large-scale knowledge bases, including Wikipedia, Freebase, YAGO, Mi-crosoft’s Satori, and Google’s Knowledge Graph. To in-crease the scale even further, we need to explore automatic methods for constructing knowledge bases. Previous ap-proaches have pr ..."
Abstract - Cited by 49 (6 self) - Add to MetaCart
Recent years have witnessed a proliferation of large-scale knowledge bases, including Wikipedia, Freebase, YAGO, Microsoft’s Satori, and Google’s Knowledge Graph. To increase the scale even further, we need to explore automatic methods for constructing knowledge bases. Previous approaches have primarily focused on text-based extraction, which can be very noisy. Here we introduce Knowledge Vault, a Web-scale probabilistic knowledge base that combines extractions from Web content (obtained via analysis of text, tabular data, page structure, and human annotations) with prior knowledge derived from existing knowledge repositories. We employ supervised machine learning methods for fusing these distinct information sources. The Knowledge Vault is substantially bigger than any previously published structured knowledge repository, and features a probabilistic inference system that computes calibrated probabilities of fact correctness. We report the results of multiple studies that explore the relative utility of the different information sources and extraction methods.
Keywords: Knowledge bases; information extraction; probabilistic models; machine learning
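
The fusion step described above can be pictured as a supervised classifier over per-extractor confidence scores plus a prior score, trained with labels taken from an existing KB. A minimal sketch, assuming logistic regression and made-up features (the paper evaluates several fusion methods; this is not its exact setup):

    # Hedged sketch: fusing noisy extractor confidences with a prior into a
    # calibrated probability of fact correctness, via logistic regression.
    # Features and toy data are illustrative, not Knowledge Vault's actual ones.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Each row: [text_extractor_conf, table_extractor_conf, prior_from_existing_KB]
    X_train = np.array([
        [0.9, 0.8, 0.7],   # strongly extracted, supported by the prior -> likely true
        [0.2, 0.0, 0.1],   # weakly extracted, low prior                -> likely false
        [0.7, 0.0, 0.9],
        [0.1, 0.3, 0.0],
    ])
    y_train = np.array([1, 0, 1, 0])  # labels from an existing KB (distant supervision)

    fuser = LogisticRegression()
    fuser.fit(X_train, y_train)

    candidate = np.array([[0.6, 0.5, 0.8]])
    print("P(fact is true) =", fuser.predict_proba(candidate)[0, 1])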

Citation Context

...

                           # Entity types   # Entity instances   # Relation types   # Confident facts (relation instances)
    Knowledge Vault (KV)   1100             45M                  4469               271M
    DeepDive [32]          4                2.7M                 34                 7M (a)
    NELL [8]               271              5.19M                306                0.435M (b)
    PROSPERA [30]          11               N/A                  14                 0.1M
    YAGO2 [19]             350,000          9.8M                 100                4M (c)
    Freebase [4]           1,500            40M                  35,000             637M (d)
    Knowledge Graph (KG)   1,500            570M                 35,000             18,000M (e)

Table 1: Comparison of knowledge bases. KV, DeepDive, NELL, and PROSPERA rely solely on extrac...

Entity Linking meets Word Sense Disambiguation: A Unified Approach

by Andrea Moro, Alessandro Raganato, Roberto Navigli - Transactions of the Association for Computational Linguistics , 2014
"... Entity Linking (EL) and Word Sense Disam-biguation (WSD) both address the lexical am-biguity of language. But while the two tasks are pretty similar, they differ in a fundamen-tal respect: in EL the textual mention can be linked to a named entity which may or may not contain the exact mention, while ..."
Abstract - Cited by 46 (17 self) - Add to MetaCart
Entity Linking (EL) and Word Sense Disambiguation (WSD) both address the lexical ambiguity of language. But while the two tasks are pretty similar, they differ in a fundamental respect: in EL the textual mention can be linked to a named entity which may or may not contain the exact mention, while in WSD there is a perfect match between the word form (better, its lemma) and a suitable word sense. In this paper we present Babelfy, a unified graph-based approach to EL and WSD based on a loose identification of candidate meanings coupled with a densest subgraph heuristic which selects high-coherence semantic interpretations. Our experiments show state-of-the-art performances on both tasks on 6 different datasets, including a multilingual setting. Babelfy is online at
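
The densest subgraph heuristic mentioned in the abstract can be approximated with the classic greedy procedure that repeatedly removes a minimum-degree vertex and keeps the densest intermediate subgraph. A minimal sketch of that generic heuristic; Babelfy's actual algorithm scores vertices semantically and adds task-specific constraints on top of this idea:

    # Hedged sketch: greedy densest-subgraph heuristic (Charikar-style).
    # Repeatedly delete a minimum-degree vertex; return the intermediate
    # subgraph with the highest density (edges per node).
    def densest_subgraph(adj):
        adj = {v: set(ns) for v, ns in adj.items()}
        best_density, best_nodes = -1.0, set(adj)
        while adj:
            edges = sum(len(ns) for ns in adj.values()) / 2
            density = edges / len(adj)
            if density > best_density:
                best_density, best_nodes = density, set(adj)
            v = min(adj, key=lambda u: len(adj[u]))  # minimum-degree vertex
            for n in adj[v]:
                adj[n].discard(v)
            del adj[v]
        return best_nodes

    # Toy graph: candidate meanings as nodes, semantic relations as edges.
    graph = {
        "bank#finance": {"loan#finance", "money#finance"},
        "loan#finance": {"bank#finance", "money#finance"},
        "money#finance": {"bank#finance", "loan#finance"},
        "bank#river": {"water#nature"},
        "water#nature": {"bank#river"},
    }
    print(densest_subgraph(graph))  # -> the tightly connected finance cluster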

Citation Context

... large semistructured resources, such as Wikipedia, and knowledge resources built from them (Hovy et al., 2013), such as BabelNet (Navigli and Ponzetto, 2012a), DBpedia (Auer et al., 2007) and YAGO2 (Hoffart et al., 2013), has favoured the emergence of new tasks, such as Entity Linking (EL) (Rao et al., 2013), and opened up new possibilities for tasks such as Named Entity Disambiguation (NED) and Wikification. The ai...

Large-scale Semantic Parsing via Schema Matching and Lexicon Extension

by Qingqing Cai, Alexander Yates - In Proceedings of the Annual Meeting of the Association for Computational Linguistics , 2013
"... Supervised training procedures for semantic parsers produce high-quality semantic parsers, but they have difficulty scaling to large databases because of the sheer number of logical constants for which they must see labeled training data. We present a technique for developing semantic parsers for la ..."
Abstract - Cited by 30 (0 self) - Add to MetaCart
Supervised training procedures for semantic parsers produce high-quality semantic parsers, but they have difficulty scaling to large databases because of the sheer number of logical constants for which they must see labeled training data. We present a technique for developing semantic parsers for large databases based on a reduction to standard supervised training algorithms, schema matching, and pattern learning. Leveraging techniques from each of these areas, we develop a semantic parser for Freebase that is capable of parsing questions with an F1 that improves by 0.42 over a purely-supervised learning algorithm.

Citation Context

...e between natural language questions and database queries over large-scale databases. Yahya et al. (2012) report on a system for translating natural language queries to SPARQL queries over the Yago2 (Hoffart et al., 2013) database. Yago2 consists of information extracted from Wikipedia, WordNet, and other resources using manually-defined extraction patterns. The manual extraction patterns pre-define a link between na...

AMIE: association rule mining under incomplete evidence in ontological knowledge bases

by Luis Galárraga, Christina Teflioudi, Katja Hose, Fabian M. Suchanek - In WWW, 2013
"... ABSTRACT Recent advances in information extraction have led to huge knowledge bases (KBs), which capture knowledge in a machine-readable format. Inductive Logic Programming (ILP) can be used to mine logical rules from the KB. These rules can help deduce and add missing knowledge to the KB. While IL ..."
Abstract - Cited by 23 (5 self) - Add to MetaCart
Recent advances in information extraction have led to huge knowledge bases (KBs), which capture knowledge in a machine-readable format. Inductive Logic Programming (ILP) can be used to mine logical rules from the KB. These rules can help deduce and add missing knowledge to the KB. While ILP is a mature field, mining logical rules from KBs is different in two aspects: First, current rule mining systems are easily overwhelmed by the amount of data (state-of-the-art systems cannot even run on today's KBs). Second, ILP usually requires counterexamples. KBs, however, implement the open world assumption (OWA), meaning that absent data cannot be used as counterexamples. In this paper, we develop a rule mining model that is explicitly tailored to support the OWA scenario. It is inspired by association rule mining and introduces a novel measure for confidence. Our extensive experiments show that our approach outperforms state-of-the-art approaches in terms of precision and coverage. Furthermore, our system, AMIE, mines rules orders of magnitude faster than state-of-the-art approaches.
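
The novel confidence measure is AMIE's PCA (partial completeness assumption) confidence: a predicted fact r(x, y) counts as a counterexample only if the KB already knows some object for x under relation r, so genuinely absent data is not penalized. A toy sketch with made-up facts and a single hard-coded rule; AMIE itself mines rules and computes these measures at KB scale, not by naive enumeration:

    # Hedged sketch: standard vs. PCA confidence for the rule
    #   livesIn(x, p) <= marriedTo(x, s) AND livesIn(s, p)
    marriedTo = {("adam", "bella"), ("carl", "dana"), ("erik", "fay")}
    livesIn = {("bella", "rome"), ("dana", "paris"), ("fay", "berlin"),
               ("adam", "rome"), ("erik", "london")}

    # All predictions of the rule's head atom.
    predictions = {(x, p) for (x, s) in marriedTo for (s2, p) in livesIn if s == s2}

    support = len(predictions & livesIn)                      # predictions already in the KB
    std_conf = support / len(predictions)                     # CWA: every unknown fact counts against
    has_place = {x for (x, _) in livesIn}                     # subjects with at least one known place
    pca_body = [p for p in predictions if p[0] in has_place]  # only these can be counterexamples
    pca_conf = support / len(pca_body)

    print(f"standard confidence = {std_conf:.2f}")  # 0.33: carl's unknown home hurts the rule
    print(f"pca confidence      = {pca_conf:.2f}")  # 0.50: carl is ignored under the OWA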

Citation Context

...tions that go beyond the current KB. We are not interested in describing the existing data, but in generating new data. Therefore, we proceed as follows: We run the systems on an older dataset (YAGO2 [16]). We generate all predictions, i.e., the head atoms of the instantiated rules (see Section 3). We remove all predictions that are in the old KB. Then we compare the remaining predicted facts to the s...

Unsupervised graph-based topic labelling using DBpedia

by Ioana Hulpus, Conor Hayes, Marcel Karnstedt, Derek Greene - In Proceedings of the 6th ACM International Conference on Web Search and Data Mining, WSDM '13, 2013
"... ..."
Abstract - Cited by 18 (2 self) - Add to MetaCart
Abstract not found

Citation Context

...os:broader and skos:broaderOf. This structure is not a proper hierarchy as it contains cycles [1]. YAGO The YAGO vocabulary represents an ontology automatically extracted from Wikipedia and WordNet [6]. It is linked to DBpedia and contains 365,372 classes. Classes are organised hierarchically and can be navigated using the rdf:type property and rdfs:subClassOf property. For example, the DBpedia en...
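
The navigation described here (entity to YAGO class via rdf:type, then up the hierarchy via rdfs:subClassOf) can be tried against the public DBpedia SPARQL endpoint. A sketch using the SPARQLWrapper library; the endpoint URL, the example entity, and the YAGO namespace filter are our assumptions, and class URIs may differ across DBpedia releases:

    # Hedged sketch: fetch the YAGO classes of a DBpedia entity plus their
    # direct superclasses, via rdf:type and rdfs:subClassOf.
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper("https://dbpedia.org/sparql")
    sparql.setQuery("""
        PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT ?cls ?super WHERE {
            <http://dbpedia.org/resource/Berlin> rdf:type ?cls .
            OPTIONAL { ?cls rdfs:subClassOf ?super }
            FILTER (STRSTARTS(STR(?cls), "http://dbpedia.org/class/yago/"))
        } LIMIT 10
    """)
    sparql.setReturnFormat(JSON)
    for row in sparql.query().convert()["results"]["bindings"]:
        print(row["cls"]["value"], "->", row.get("super", {}).get("value"))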

Two is bigger (and better) than one: the wikipedia bitaxonomy project.

by Tiziano Flati, Daniele Vannella, Tommaso Pasini, Roberto Navigli - In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014
"... Abstract We present WiBi, an approach to the automatic creation of a bitaxonomy for Wikipedia, that is, an integrated taxonomy of Wikipage pages and categories. We leverage the information available in either one of the taxonomies to reinforce the creation of the other taxonomy. Our experiments sho ..."
Abstract - Cited by 18 (10 self) - Add to MetaCart
We present WiBi, an approach to the automatic creation of a bitaxonomy for Wikipedia, that is, an integrated taxonomy of Wikipedia pages and categories. We leverage the information available in either one of the taxonomies to reinforce the creation of the other taxonomy. Our experiments show higher quality and coverage than state-of-the-art resources like DBpedia, YAGO, MENTA, WikiNet and WikiTaxonomy. WiBi is available at http://wibitaxonomy.org.

Citation Context

...can be harvested and transformed into structured form (Medelyan et al., 2009; Hovy et al., 2013). Prominent examples include DBpedia (Bizer et al., 2009), BabelNet (Navigli and Ponzetto, 2012), YAGO (Hoffart et al., 2013) and WikiNet (Nastase and Strube, 2013). The types of semantic relation in these resources range from domain-specific, as in Freebase (Bollacker et al., 2008), to unspecified relations, as in BabelNe...

Kore: keyphrase overlap relatedness for entity disambiguation

by Johannes Hoffart, Martin Theobald, Stephan Seufert, Gerhard Weikum, Dat Ba Nguyen - In Proceedings of the 21st ACM CIKM , 2012
"... Measuring the semantic relatedness between two entities is the basis for numerous tasks in IR, NLP, and Web-based knowledge extraction. This paper focuses on disambiguating names in a Web or text document by jointly mapping all names onto semantically related entities registered in a knowledge base. ..."
Abstract - Cited by 14 (2 self) - Add to MetaCart
Measuring the semantic relatedness between two entities is the basis for numerous tasks in IR, NLP, and Web-based knowledge extraction. This paper focuses on disambiguating names in a Web or text document by jointly mapping all names onto semantically related entities registered in a knowledge base. To this end, we have developed a novel notion of semantic relatedness between two entities represented as sets of weighted (multi-word) keyphrases, with consideration of partially overlapping phrases. This measure improves the quality of prior link-based models, and also eliminates the need for (usually Wikipedia-centric) explicit interlinkage between entities. Thus, our method is more versatile and can cope with long-tail and newly emerging entities that have few or no links associated with them. For efficiency, we have developed approximation techniques based on min-hash sketches and locality-sensitive hashing. Our experiments on semantic relatedness and on named entity disambiguation demonstrate the superiority of our method compared to state-of-the-art baselines.
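
The min-hash sketches mentioned above estimate the Jaccard overlap of two keyphrase sets without an exhaustive comparison. A minimal sketch using plain unweighted MinHash over whole phrases; KORE's actual measure weights keyphrases and credits partial phrase overlap, which this deliberately omits:

    # Hedged sketch: estimating keyphrase-set overlap with MinHash.
    # Unweighted and whole-phrase only, unlike KORE's weighted measure.
    import hashlib

    def minhash_signature(phrases, num_hashes=128):
        # One "hash function" per slot, simulated by salting MD5 with the index.
        return [
            min(int(hashlib.md5(f"{i}|{p}".encode()).hexdigest(), 16) for p in phrases)
            for i in range(num_hashes)
        ]

    def estimated_jaccard(sig_a, sig_b):
        # The fraction of slots where the minima agree approximates Jaccard.
        return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

    apple = {"steve jobs", "iphone", "cupertino", "personal computer"}
    microsoft = {"bill gates", "windows", "redmond", "personal computer"}
    print(estimated_jaccard(minhash_signature(apple), minhash_signature(microsoft)))
    # True Jaccard is 1/7 ~ 0.14; the estimate converges as num_hashes grows.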

Citation Context

... these reasons, we settled for relative ranking judgments and use a crowdsourcing platform (Crowdflower) to average out the subjectivity of such judgments. We selected a set of 20 entities from YAGO2 [15] from 4 different domains: IT companies, Hollywood celebrities, video games, and television series. For each of the 20 seed entities we selected 20 candidates from the set of entities linked to by the...

Representing Multilingual Data as Linked Data: the Case of BabelNet 2.0

by Maud Ehrmann, Francesco Cecconi, Daniele Vannella, John Mccrae, Roberto Navigli - In Proc. of LREC , 2014
"... Recent years have witnessed a surge in the amount of semantic information published on the Web. Indeed, the Web of Data, a subset of the Semantic Web, has been increasing steadily in both volume and variety, transforming the Web into a ‘global database ’ in which resources are linked across sites. L ..."
Abstract - Cited by 14 (5 self) - Add to MetaCart
Recent years have witnessed a surge in the amount of semantic information published on the Web. Indeed, the Web of Data, a subset of the Semantic Web, has been increasing steadily in both volume and variety, transforming the Web into a ‘global database’ in which resources are linked across sites. Linguistic fields – in a broad sense – have not been left behind, and we observe a similar trend with the growth of linguistic data collections on the so-called ‘Linguistic Linked Open Data (LLOD) cloud’. While both Semantic Web and Natural Language Processing communities can obviously take advantage of this growing and distributed linguistic knowledge base, they are today faced with a new challenge, i.e., that of facilitating multilingual access to the Web of data. In this paper we present the publication of BabelNet 2.0, a wide-coverage multilingual encyclopedic dictionary and ontology, as Linked Data. The conversion made use of lemon, a lexicon model for ontologies particularly well-suited for this enterprise. The result is an interlinked multilingual (lexical) resource which can not only be accessed on the LOD, but also be used to enrich existing datasets with linguistic information, or to support the process of mapping datasets across languages.
Keywords: Linguistic Linked Data, Multilingual Semantic Web, lexical-semantic resource, semantic network

Citation Context

...y Linking can be performed jointly and with state-of-the-art performance in virtually any language of interest (Moro et al., 2014). With a similar focus on encyclopedic knowledge, the YAGO2 ontology (Hoffart et al., 2013) provides millions of facts and entities, some of which are spatially and temporally anchored. As for YAGO (Suchanek et al., 2008), it is based on the integration of Wikipedia and WordNet, whose mapp...

SPred: Large-scale Harvesting of Semantic Predicates

by Tiziano Flati, Roberto Navigli - In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013
"... We present SPred, a novel method for the creation of large repositories of semantic predicates. We start from existing collocations to form lexical predicates (e.g., break ∗) and learn the semantic classes that best fit the ∗ argument. To do this, we extract all the occurrences in Wikipedia which ma ..."
Abstract - Cited by 13 (4 self) - Add to MetaCart
We present SPred, a novel method for the creation of large repositories of semantic predicates. We start from existing collocations to form lexical predicates (e.g., break ∗) and learn the semantic classes that best fit the ∗ argument. To do this, we extract all the occurrences in Wikipedia which match the predicate and abstract its arguments to general semantic classes (e.g., break BODY PART, break AGREEMENT, etc.). Our experiments show that we are able to create a large collection of semantic predicates from the Oxford Advanced Learner’s Dictionary with high precision and recall, and perform well against the most similar approach.
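
The abstraction step, mapping the ∗ argument of a lexical predicate to a semantic class, can be approximated with a WordNet hypernym lookup. A sketch using NLTK with made-up harvested arguments; SPred's actual class selection over Wikipedia occurrences is considerably more involved:

    # Hedged sketch: abstracting arguments of the lexical predicate "break *"
    # to semantic classes via WordNet hypernyms. Toy arguments; SPred learns
    # its classes from Wikipedia occurrences, not from a fixed list.
    from collections import Counter
    from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")

    harvested_args = ["leg", "arm", "promise", "agreement", "record"]

    class_counts = Counter()
    for arg in harvested_args:
        synsets = wn.synsets(arg, pos=wn.NOUN)
        if not synsets:
            continue
        # Direct hypernyms of the first sense serve as candidate classes.
        for hyper in synsets[0].hypernyms():
            class_counts[hyper.name()] += 1

    # The most frequent hypernyms play the role of SPred's semantic classes,
    # e.g. something like limb.n.01 covering "leg" and "arm".
    print(class_counts.most_common(5))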

Citation Context

...ts of knowledge acquisition (Hovy et al., 2013), leading to the creation of several large-scale knowledge resources, such as DBPedia (Bizer et al., 2009), BabelNet (Navigli and Ponzetto, 2012), YAGO (Hoffart et al., 2013), MENTA (de Melo and Weikum, 2010), to name but a few. This wealth of acquired knowledge is known to have a positive impact on important fields such as Information Retrieval (Chu-Carroll and Prager, ...
