Automatising the learning of lexical patterns: An application to the enrichment of WordNet by extracting semantic relationships from Wikipedia (2007)

by M Ruiz-Casado, E Alfonseca, P Castells
Venue: Data and Knowledge Engineering

Results 1 - 5 of 5

Combining Statistical Techniques and Lexico-syntactic Patterns for Semantic Relations Extraction from Text

by Emiliano Giovannetti, Simone Marchi, Simonetta Montemagni
Cited by 1 (0 self)
Abstract. We describe here a methodology to combine two different techniques for Semantic Relation Extraction from texts. On the one hand, generic lexico-syntactic patterns are applied to the linguistically analyzed corpus to detect a first set of pairs of co-occurring words, possibly involved in “syntagmatic” relations. On the other hand, a statistical unsupervised association system is used to obtain a second set of pairs of “distributionally similar” terms, that appear to occur in similar contexts, thus possibly involved in “paradigmatic” relations. The approach aims at learning ontological information by filtering the candidate relations obtained through generic lexico-syntactic patterns and by labelling the anonymous relations obtained through the statistical system. The resulting set of relations can be used to enrich existing ontologies and for semantic annotation of documents or web pages.

Refining Non-Taxonomic Relation Labels with External Structured Data to Support Ontology Learning

by unknown authors
This paper presents a method to integrate external knowledge sources such as DBpedia and OpenCyc into an ontology learning system that automatically suggests labels for unknown relations in domain ontologies based on large corpora of unstructured text. The method extracts and aggregates verb vectors from semantic relations identified in the corpus. It composes a knowledge base which consists of (i) verb centroids for known relations between domain concepts, (ii) mappings between concept pairs and the types of known relations, and (iii) ontological knowledge retrieved from external sources. Applying semantic inference and validation to this knowledge base yields a refined relation label suggestion. A formal evaluation compares the accuracy and average ranking precision of this hybrid method with the performance of methods that solely rely on corpus data and those that are only based on reasoning and external data sources.

Semantic Relationship Extraction and Ontology Building using Wikipedia: A Comprehensive Survey

by Nora I. Al-Rajebah
The Semantic Web, as envisioned by Tim Berners-Lee, is highly dependent upon the availability of machine-readable information. Ontologies are one of the different machine-readable formats that have been widely investigated. Several studies focus on how to extract concepts and semantic relations in order to build ontologies. Wikipedia is considered one of the important knowledge sources that have been used to extract semantic relations, due to its characteristics as a semi-structured knowledge source that would facilitate such a challenge. In this paper we will focus on the current state of this challenging field by discussing some of the recent studies about Wikipedia and semantic extraction and highlighting their main contributions and results.

Conceptual image retrieval over the Wikipedia corpus

by Adrian Popescu, Hervé Le Borgne, Pierre-Alain Moëllic
Abstract. Image retrieval in large-scale databases is currently based on a textual chains matching procedure, a technique that produces good results as long as the annotations associated to pictures are accurate and detailed enough. These conditions are not met for a large majority of image corpuses, such as the Wikipedia collection, and it is interesting to explore methods that go beyond chain matching. In this paper, we present our approach to image retrieval, tested in the ImageCLEF 2008 WikipediaMM. The approach is based on a query reformulation using concepts that are semantically related to those in the initial query. For each interesting entity in the query, we used Wikipedia and WordNet to extract a list of related concepts, which were further ranked in order to propose the most salient first. We also made a list of visual concepts which were used in order to re-rank the answers to queries that included, implicitly or explicitly, these visual concepts. The CEA submitted two automatic runs, one based on query reformulation only and one combining query reformulation and visual concepts, which were ranked 4th and 2nd using the MAP measure.

CEA

by Adrian Popescu, Gregory Grefenstette, Pierre-Alain Moëllic
Geolocalized databases are becoming necessary in a wide variety of application domains. Thus far, the creation of such databases has been a costly, manual process. This drawback has stimulated interest in automating their construction, for example, by mining geographical information from the Web. Here we present and evaluate a new automated technique for creating and enriching a geographical gazetteer, called Gazetiki. Our technique merges disparate information from Wikipedia, Panoramio, and web search engines in order to identify geographical names, categorize these names, find their geographical coordinates and rank them. The information produced in Gazetiki enhances and complements the Geonames database, using a similar domain model. We show that our method provides a richer structure and an improved coverage compared to another known attempt at automatically building a geographic database and, where possible, we compare our Gazetiki to Geonames.