Results 1 - 10 of 76
BabelNet: The automatic construction, evaluation and application of a . . .
- Artificial Intelligence, 2012
"... ..."
BabelNet: Building a very large multilingual semantic network
- In Proc. of ACL-10, 2010
"... In this paper we present BabelNet – a very large, wide-coverage multilingual semantic network. The resource is automatically constructed by means of a methodology that integrates lexicographic and encyclopedic knowledge from WordNet and Wikipedia. In addition Machine Translation is also applied to e ..."
Abstract
-
Cited by 73 (12 self)
- Add to MetaCart
(Show Context)
In this paper we present BabelNet, a very large, wide-coverage multilingual semantic network. The resource is automatically constructed by means of a methodology that integrates lexicographic and encyclopedic knowledge from WordNet and Wikipedia. In addition, Machine Translation is applied to enrich the resource with lexical information for all languages. We conduct experiments on new and existing gold-standard datasets to show the high quality and coverage of the resource.
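The construction step described above hinges on linking Wikipedia pages to WordNet senses. As a rough illustration of how such a link could be chosen, here is a bag-of-words overlap heuristic in Python; the contexts, sense labels and function are invented for illustration and this is not the paper's actual mapping algorithm:

```python
def best_sense(page_context, sense_contexts):
    # Map a Wikipedia page to the WordNet sense whose context words overlap
    # most with the page's context words (a simple overlap heuristic;
    # illustrative only, not the paper's actual mapping method).
    best, best_overlap = None, -1
    for sense, context in sense_contexts.items():
        overlap = len(set(page_context) & set(context))
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best

# Toy contexts for the Wikipedia page "Apple" (the fruit) and two candidate
# WordNet senses of "apple"; all words and sense labels are invented.
page_context = ["fruit", "tree", "edible", "cultivar", "orchard"]
sense_contexts = {
    "apple.n.01 (fruit)": ["fruit", "tree", "edible", "pome"],
    "apple.n.02 (tree)": ["tree", "orchard", "malus", "wood"],
}
print(best_sense(page_context, sense_contexts))   # -> "apple.n.01 (fruit)"
```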
Knowledge Base Population: Successful Approaches and Challenges
"... In this paper we give an overview of the Knowledge Base Population (KBP) track at the 2010 Text Analysis Conference. The main goal of KBP is to promote research in discovering facts about entities and augmenting a knowledge base (KB) with these facts. This is done through two tasks, Entity Linking – ..."
Abstract
-
Cited by 51 (9 self)
- Add to MetaCart
(Show Context)
In this paper we give an overview of the Knowledge Base Population (KBP) track at the 2010 Text Analysis Conference. The main goal of KBP is to promote research in discovering facts about entities and augmenting a knowledge base (KB) with these facts. This is done through two tasks: Entity Linking, which links names in context to entities in the KB, and Slot Filling, which adds information about an entity to the KB. A large source collection of newswire and web documents is provided from which systems are to discover information. Attributes (“slots”) derived from Wikipedia infoboxes are used to create the reference KB. In this paper we provide an overview of the techniques which can serve as a basis for a good KBP system, lay out the remaining challenges by comparison with traditional Information Extraction (IE) and Question Answering (QA) tasks, and provide some suggestions to address these challenges.
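The two tasks have a simple input/output shape. Below is a minimal sketch of how entity-linking queries/answers and slot fills might be represented; the field names, identifiers and slot labels are illustrative assumptions, not the official TAC KBP formats:

```python
from dataclasses import dataclass

@dataclass
class EntityLinkingQuery:
    # A name mention in a specific source document, to be resolved against the KB.
    query_id: str
    name: str
    doc_id: str

@dataclass
class EntityLinkingAnswer:
    # Either the id of a KB node or "NIL" when the entity is not in the KB.
    query_id: str
    kb_id: str

@dataclass
class SlotFill:
    # One attribute value discovered for a KB entity, with document provenance.
    entity_id: str
    slot: str
    value: str
    source_doc: str

# Hypothetical outputs for the two tasks.
query = EntityLinkingQuery("EL_0001", "Springfield", "NYT_0001")
answer = EntityLinkingAnswer("EL_0001", "NIL")        # name not found in the KB
fill = SlotFill("E0042", "org:founded", "1998", "NYT_0002")
print(query, answer, fill)
```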
An open-source toolkit for mining Wikipedia
- In Proc. New Zealand Computer Science Research Student Conf
"... The online encyclopedia Wikipedia is a vast repository of information. For developers and researchers it represents a giant multilingual database of concepts and semantic relations; a promising resource for natural language processing and many other research areas. In this paper we introduce the Wik ..."
Abstract
-
Cited by 49 (0 self)
- Add to MetaCart
The online encyclopedia Wikipedia is a vast repository of information. For developers and researchers it represents a giant multilingual database of concepts and semantic relations; a promising resource for natural language processing and many other research areas. In this paper we introduce the Wikipedia Miner toolkit: an open-source collection of code that allows researchers and developers to easily integrate Wikipedia's rich semantics into their own applications. The Wikipedia Miner toolkit is already a mature product. In this paper we describe how it provides simplified, object-oriented access to Wikipedia’s structure and content, how it allows terms and concepts to be compared semantically, and how it can detect Wikipedia topics when they are mentioned in documents. We also describe how it has already been applied to several different research problems. However, the toolkit is not intended to be a complete, polished product; it is instead an entirely open-source project that we hope will continue to evolve.
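The toolkit itself is a Java code base, so the following Python fragment is only a language-neutral illustration of the kind of semantic comparison it exposes: a relatedness score computed from the overlap of the articles linking to two pages. The function and the toy data are invented for illustration and are not the toolkit's actual API:

```python
import math

def link_relatedness(inlinks_a, inlinks_b, total_articles):
    # Relatedness of two articles from the overlap of the sets of articles
    # that link to them, in the style of a normalized link-distance measure;
    # illustrative only, not the toolkit's own implementation.
    a, b = set(inlinks_a), set(inlinks_b)
    common = a & b
    if not common:
        return 0.0
    distance = (math.log(max(len(a), len(b))) - math.log(len(common))) / (
        math.log(total_articles) - math.log(min(len(a), len(b))))
    return max(0.0, 1.0 - distance)

# Toy inlink sets standing in for data read from a Wikipedia dump.
print(link_relatedness({"Fruit", "Tree", "Orchard"},
                       {"Fruit", "Tree", "Pie"},
                       total_articles=6_000_000))
```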
Overview of the TAC 2010 knowledge base population track
- In Third Text Analysis Conference (TAC), 2010
"... In this paper we give an overview of the Knowledge Base Population (KBP) track at TAC 2010. The main goal of KBP is to promote research in discovering facts about entities and expanding a structured knowledge base with this information. A large source collection of newswire and web documents is prov ..."
Abstract
-
Cited by 44 (13 self)
- Add to MetaCart
(Show Context)
In this paper we give an overview of the Knowledge Base Population (KBP) track at TAC 2010. The main goal of KBP is to promote research in discovering facts about entities and expanding a structured knowledge base with this information. A large source collection of newswire and web documents is provided for systems to discover information. Attributes (a.k.a. “slots”) derived from Wikipedia infoboxes are used to create the reference knowledge base (KB). KBP 2010 includes the following four tasks: (1) Regular Entity Linking, where names must be aligned to entities in the KB; (2) Optional Entity Linking, without using Wikipedia texts; (3) Regular Slot Filling, which requires a system to automatically discover the attributes of specified entities from the source document collection and use them to expand the KB; (4) Surprise Slot Filling, which requires a system to return answers regarding new slot types within a short time period. KBP 2010 attracted many participants: over 45 teams registered (not including the RTE-KBP Validation Pilot task), of which 23 submitted results. In this paper we provide an overview of the task definition and annotation challenges associated with KBP 2010. Then we summarize the evaluation results and discuss the lessons that we have learned based on detailed analysis.
A Framework for Benchmarking Entity-Annotation Systems
"... In this paper we design and implement a benchmarking framework for fair and exhaustive comparison of entity-annotation systems. The framework is based upon the definition of a set of problems related to the entity-annotation task, a set of measures to evaluate systems performance, and a systematic c ..."
Abstract
-
Cited by 30 (1 self)
- Add to MetaCart
(Show Context)
In this paper we design and implement a benchmarking framework for fair and exhaustive comparison of entity-annotation systems. The framework is based upon the definition of a set of problems related to the entity-annotation task, a set of measures to evaluate system performance, and a systematic comparative evaluation involving all publicly available datasets, containing texts of various types such as news, tweets and Web pages. Our framework is easily extensible with novel entity annotators, datasets and evaluation measures for comparing systems, and it has been released to the public as open source. We use this framework to perform the first extensive comparison among all available entity annotators over all available datasets, and draw many interesting conclusions about their efficiency and effectiveness. We also compare academic and commercial annotators.
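At its core, such a framework scores predicted annotations against gold ones on each dataset. Here is a minimal sketch of that comparison, assuming exact mention-span matching (a simplification; real frameworks typically also define weaker match criteria):

```python
def score(predicted, gold):
    # Micro-averaged precision/recall/F1 over annotations given as
    # (doc_id, start, end, entity) tuples; exact matching is a simplifying
    # assumption for this sketch.
    pred, ref = set(predicted), set(gold)
    tp = len(pred & ref)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy gold standard and system output for one document.
gold = [("d1", 0, 12, "Barack_Obama"), ("d1", 20, 26, "Hawaii")]
pred = [("d1", 0, 12, "Barack_Obama"), ("d1", 30, 37, "Chicago")]
print(score(pred, gold))   # (0.5, 0.5, 0.5)
```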
Fast and accurate annotation of short texts with Wikipedia pages
- arXiv preprint arXiv:1006.3498, 2010
"... We address the problem of cross-referencing text fragments with Wikipedia pages, in a way that synonymy and poly-semy issues are resolved accurately and efficiently. We take inspiration from a recent flow of work [3, 10, 12, 14], and ex-tend their scenario from the annotation of long documents to th ..."
Abstract
-
Cited by 25 (2 self)
- Add to MetaCart
We address the problem of cross-referencing text fragments with Wikipedia pages, in a way that synonymy and polysemy issues are resolved accurately and efficiently. We take inspiration from a recent flow of work [3, 10, 12, 14], and extend their scenario from the annotation of long documents to the annotation of short texts, such as snippets of search-engine results, tweets, news, blogs, etc. These short and poorly composed texts pose new challenges in terms of efficiency and effectiveness of the annotation process, which we address by designing and engineering Tagme, the first system that performs an accurate and on-the-fly annotation of these short textual fragments. A large set of experiments shows that Tagme outperforms state-of-the-art algorithms when they are adapted to work on short texts, and it remains fast and competitive on long texts.
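Annotators in this family typically spot mentions with an anchor dictionary mined from Wikipedia link texts and then disambiguate each mention by how well its candidate pages agree with those of the other mentions. Below is a simplified sketch of that voting step; the data and scores are invented for illustration and do not reproduce Tagme's actual algorithm:

```python
def disambiguate(mentions, candidates, relatedness):
    # For each spotted mention, pick the candidate page whose average
    # relatedness to the candidate pages of all other mentions is highest
    # (a simplified collective-agreement vote; illustrative only).
    chosen = {}
    for m in mentions:
        best_page, best_score = None, -1.0
        for page in candidates[m]:
            votes = [relatedness.get(frozenset((page, other_page)), 0.0)
                     for other in mentions if other != m
                     for other_page in candidates[other]]
            score = sum(votes) / len(votes) if votes else 0.0
            if score > best_score:
                best_page, best_score = page, score
        chosen[m] = best_page
    return chosen

# Toy example for the short text "jaguar in the amazon": anchors, candidate
# senses and relatedness scores are all invented for illustration.
mentions = ["jaguar", "amazon"]
candidates = {"jaguar": ["Jaguar", "Jaguar_Cars"],
              "amazon": ["Amazon_rainforest", "Amazon_(company)"]}
relatedness = {frozenset(("Jaguar", "Amazon_rainforest")): 0.8,
               frozenset(("Jaguar_Cars", "Amazon_(company)")): 0.3}
print(disambiguate(mentions, candidates, relatedness))
```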
WikiNet: A Very Large Scale Multi-Lingual Concept Network
"... This paper describes a multi-lingual concept network obtained automatically by mining for concepts and relations and exploiting a variety of sources of knowledge from Wikipedia. Concepts and their lexicalizations are extracted from Wikipedia pages. Relations are extracted from the category and page ..."
Abstract
-
Cited by 23 (3 self)
- Add to MetaCart
(Show Context)
This paper describes a multi-lingual concept network obtained automatically by mining for concepts and relations and exploiting a variety of sources of knowledge from Wikipedia. Concepts and their lexicalizations are extracted from Wikipedia pages. Relations are extracted from the category and page network, infoboxes and the body of the articles. The network consists of a central, language-independent list of concepts (keeping track of their lexicalizations in various languages), interconnected with a variety of relations to form a very large scale multi-lingual concept network.
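A minimal sketch of the kind of structure the abstract describes, with language-independent concept ids, per-language lexicalizations and typed relations; identifiers and relation names are invented and this is not WikiNet's actual format:

```python
from collections import defaultdict

class ConceptNetwork:
    # Toy multilingual concept network: language-independent concept ids,
    # per-language lexicalizations, and typed relations between concepts
    # (illustrative only, not the WikiNet data model).
    def __init__(self):
        self.lexicalizations = defaultdict(lambda: defaultdict(list))  # id -> language -> lemmas
        self.relations = defaultdict(list)                             # id -> [(relation, target id)]

    def add_lexicalization(self, concept_id, lang, lemma):
        self.lexicalizations[concept_id][lang].append(lemma)

    def add_relation(self, source_id, relation, target_id):
        self.relations[source_id].append((relation, target_id))

    def neighbours(self, concept_id):
        return list(self.relations[concept_id])

net = ConceptNetwork()
net.add_lexicalization("C1", "en", "apple")
net.add_lexicalization("C1", "de", "Apfel")
net.add_lexicalization("C2", "en", "fruit")
net.add_relation("C1", "is_a", "C2")   # e.g. mined from the category network or an infobox
print(net.neighbours("C1"))
```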
Two is bigger (and better) than one: the Wikipedia bitaxonomy project
- In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014
"... Abstract We present WiBi, an approach to the automatic creation of a bitaxonomy for Wikipedia, that is, an integrated taxonomy of Wikipage pages and categories. We leverage the information available in either one of the taxonomies to reinforce the creation of the other taxonomy. Our experiments sho ..."
Abstract
-
Cited by 18 (10 self)
- Add to MetaCart
(Show Context)
We present WiBi, an approach to the automatic creation of a bitaxonomy for Wikipedia, that is, an integrated taxonomy of Wikipedia pages and categories. We leverage the information available in either one of the taxonomies to reinforce the creation of the other taxonomy. Our experiments show higher quality and coverage than state-of-the-art resources like DBpedia, YAGO, MENTA, WikiNet and WikiTaxonomy. WiBi is available at http://wibitaxonomy.org.
WikiPop - Personalized Event Detection System Based on Wikipedia Page View Statistics
"... In this paper, we describe WikiPop service, a system designed to detect significant increase of popularity of topics related to users’ interests. We exploit Wikipedia page view statistics to identify concepts with significant increase of the interest from the public. Daily, there are thousands of ar ..."
Abstract
-
Cited by 15 (0 self)
- Add to MetaCart
(Show Context)
In this paper, we describe the WikiPop service, a system designed to detect significant increases in the popularity of topics related to users’ interests. We exploit Wikipedia page view statistics to identify concepts with a significant increase in interest from the public. Every day there are thousands of articles with increased popularity; thus, personalization is needed to provide the user only with results related to his or her interests. The WikiPop system allows a user to define a context by stating a set of Wikipedia articles describing topics of interest. The system is then able to search, for a given date, for popular topics related to the user-defined context.
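Detecting a significant increase in popularity from daily page view counts can be illustrated with a naive baseline: flag context articles whose views on a given day far exceed their recent average. The threshold, data and function below are invented for illustration and are not WikiPop's actual method:

```python
from statistics import mean

def popular_today(history, today_views, context, ratio=3.0, min_views=100):
    # Return context articles whose page views today are at least `ratio`
    # times their average over the preceding days (a naive spike detector;
    # illustrative only).
    spikes = []
    for article in context:
        past = history.get(article, [])
        baseline = mean(past) if past else 0.0
        views = today_views.get(article, 0)
        if views >= min_views and views > ratio * max(baseline, 1.0):
            spikes.append((article, views, baseline))
    return sorted(spikes, key=lambda t: t[1], reverse=True)

# Toy page-view data standing in for the Wikipedia page view dumps.
history = {"Volcano": [200, 180, 220], "Chess": [500, 520, 480]}
today = {"Volcano": 5000, "Chess": 510}
print(popular_today(history, today, context={"Volcano", "Chess"}))
```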