G.: Constructing Virtual Documents for Ontology Matching
- In: 15th International World Wide Web Conference, 2006
Cited by 79 (9 self)
Abstract. Ontology matching is a crucial task for data integration and management on the Semantic Web. Today's ontology matching techniques can, to some extent, solve many problems arising from the heterogeneity of ontologies. However, when matching large ontologies, most ontology matchers take too long to run and place strong requirements on the running environment. Based on the MapReduce framework and the virtual document technique, in this paper we propose a 3-stage MapReduce-based approach called V-Doc+ for matching large ontologies, which significantly reduces the run time while keeping good precision and recall. First, we establish four MapReduce processes to construct a virtual document for each entity (class, property or instance): a simple process for the descriptions of entities, an iterative process for the descriptions of blank nodes, and two processes for exchanging the descriptions with neighbors. Then, we use a word-weight-based partition method to calculate similarities between entities in the corresponding reducers. We report our results from two experiments on an OAEI dataset and a dataset from the biology domain. The performance of the approach is assessed by comparison with existing ontology matchers. Additionally, we show how the run time decreases as the size of the cluster increases.
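The virtual document idea in the abstract above can be illustrated with a minimal in-memory sketch. It is not the V-Doc+ implementation: there is no MapReduce, no blank-node handling and no word-weight-based partitioning. The function names, the `neighbor_weight` parameter and the toy data are assumptions made only for illustration. Each entity gets a weighted bag of words built from its own description plus down-weighted words from its neighbors, and two entities are compared by cosine similarity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a description and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def virtual_document(entity, descriptions, neighbors, neighbor_weight=0.5):
    """Build a weighted bag of words for one entity.

    Words from the entity's own description count 1.0 each; words from
    neighboring entities count `neighbor_weight` each (a hypothetical
    weighting, not taken from the paper).
    """
    doc = Counter()
    for word in tokenize(descriptions[entity]):
        doc[word] += 1.0
    for neighbor in neighbors.get(entity, []):
        for word in tokenize(descriptions[neighbor]):
            doc[word] += neighbor_weight
    return doc


def cosine(a, b):
    """Cosine similarity between two weighted bags of words."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy ontologies (invented data): descriptions and neighbor links.
desc_1 = {"Author": "author of a publication", "Paper": "scientific paper"}
links_1 = {"Author": ["Paper"]}
desc_2 = {"Writer": "author who writes a paper", "Article": "paper article"}
links_2 = {"Writer": ["Article"]}

doc_author = virtual_document("Author", desc_1, links_1)
doc_writer = virtual_document("Writer", desc_2, links_2)
print(cosine(doc_author, doc_writer))
```

In a distributed setting such as the one the abstract describes, the construction step and the pairwise comparison would be split across mappers and reducers; this sketch only shows the data each step operates on.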
Abstract. Evidence-based policy is policy informed by rigorously established objective evidence. An important aspect of evidence-based policy is the use of scientifically rigorous studies to identify programs and practices capable of improving policy-relevant outcomes. Statistics represent a crucial means to determine whether progress is made towards policy targets. In May 2010, the European Commission adopted the Digital Agenda for Europe, a strategy to take advantage of the potential offered by the rapid progress of digital technologies. The Digital Agenda contains commitments to undertake a number of specific policy actions intended to stimulate a circle of investment in and usage of digital technologies. It identifies 13 key performance targets. In order to chart the progress of both the announced policy actions and the key performance targets, a scoreboard is published, thus allowing the monitoring and benchmarking of the main developments of the information society in European countries. In addition to these human-readable browsing, visualization and exploration methods, machine-readable access facilitating re-usage and interlinking of the underlying data is provided by means of RDF and Linked Open Data. We sketch the transformation process from raw data up to rich, interlinked RDF, and describe its publishing and the lessons learned.
Complex Matching of RDF Datatype Properties
Property mapping is a fundamental component of ontology matching, and yet there is little support that goes beyond the identification of single property matches. Real data often requires some degree of composition, trivially exemplified by the mapping of “first name”, “last name” to “full name” on one end, to complex matchings, such as parsing and pairing symbol/digit strings to SSN numbers, at the other end of the spectrum. In this paper, we propose a two-phase instance-based technique for complex datatype property matching. Phase 1 computes the estimate mutual information matrix of the property values to (1) find simple, 1:1 matches, and (2) compute a list of possible complex matches. Phase 2 applies genetic programming to the much reduced search space of candidate matches to find complex matches. We conclude with experimental results that illustrate how the technique works. Furthermore, we show that the proposed technique greatly improves results over those obtained if the estimate mutual information matrix or the genetic programming techniques were to be used independently.
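As a rough illustration of Phase 1, the sketch below computes a plain empirical mutual information score between every pair of property value columns. The abstract does not say how the paper estimates mutual information, so the plug-in estimator, the assumption that instances are already aligned row by row, and all names and data here are illustrative choices rather than the authors' method.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information, in bits, of two aligned value lists."""
    if len(xs) != len(ys):
        raise ValueError("value lists must be aligned and of equal length")
    n = len(xs)
    if n == 0:
        return 0.0
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def mi_matrix(source, target):
    """Mutual information for every (source property, target property) pair.

    `source` and `target` map property names to value lists, where row i of
    every list describes the same instance (an assumption of this sketch).
    """
    return {
        (s_name, t_name): mutual_information(s_values, t_values)
        for s_name, s_values in source.items()
        for t_name, t_values in target.items()
    }


# Invented toy data: four instances described in two datasets.
source = {
    "firstName": ["Ann", "Bob", "Ann", "Eve"],
    "city": ["Rome", "Rome", "Oslo", "Oslo"],
}
target = {
    "givenName": ["Ann", "Bob", "Ann", "Eve"],
    "country": ["IT", "IT", "NO", "NO"],
}

for pair, score in sorted(mi_matrix(source, target).items()):
    print(pair, round(score, 3))
```

High-scoring cells suggest candidate 1:1 matches. A property that scores moderately against several others would be a candidate for the complex matches that the abstract assigns to the genetic programming phase.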
Ontology matching benchmarks:
, 2013
The OAEI Benchmark test set has been used for many years as a main reference to evaluate and compare ontology matching systems. However, this test set has barely varied since 2004 and has become a relatively easy task for matchers. In this paper, we present the design of a flexible test generator based on an extensible set of alterators which may be used programmatically for generating different test sets from different seed ontologies and different alteration modalities. It has been used for reproducing Benchmark both with the original seed ontology and with other ontologies. This highlights the remarkable stability of results over different generations and the preservation of difficulty across seed ontologies, as well as a systematic bias towards the initial Benchmark test set and the inability of such tests to identify an overall winning matcher. These were exactly the properties for which Benchmark had been designed. Furthermore, the generator has been used for providing new test sets aiming at increasing the difficulty and discriminability of Benchmark. Although difficulty may be easily increased with the generator, attempts to increase discriminability proved unfruitful. However, efforts towards this goal raise questions about the very nature of discriminability.