Results 1 - 10 of 20
An Experimental Comparison of RDF Data Management Approaches in a SPARQL Benchmark Scenario
In Proceedings of the 7th International Semantic Web Conference (ISWC), 2008
"... Abstract. Efficient RDF data management is one of the cornerstones in realizing the Semantic Web vision. In the past, different RDF storage strategies have been proposed, ranging from simple triple stores to more advanced techniques like clustering or vertical partitioning on the predicates. We pres ..."
Abstract
-
Cited by 23 (1 self)
- Add to MetaCart
Abstract. Efficient RDF data management is one of the cornerstones in realizing the Semantic Web vision. In the past, different RDF storage strategies have been proposed, ranging from simple triple stores to more advanced techniques like clustering or vertical partitioning on the predicates. We present an experimental comparison of existing storage strategies on top of the SP²Bench SPARQL performance benchmark suite and put the results into context by comparing them to a purely relational model of the benchmark scenario. We observe that (1) in terms of performance and scalability, a simple triple store built on top of a column-store DBMS is competitive with the vertically partitioned approach when choosing a physical (predicate, subject, object) sort order, (2) in our scenario with real-world queries, none of the approaches scales to documents containing tens of millions of RDF triples, and (3) none of the approaches can compete with a purely relational model. We conclude that further research is necessary to advance RDF data management.
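To make the two storage layouts concrete, here is a minimal sketch (not from the paper; table names, predicates, and data are illustrative) contrasting a single triple table carrying a (predicate, subject, object) index against a vertically partitioned per-predicate table, in Python with SQLite:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Strategy 1: a single triple table. A composite (predicate, subject,
    # object) index approximates the physical PSO sort order from the abstract.
    cur.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    cur.execute("CREATE INDEX pso ON triples (p, s, o)")

    # Strategy 2: vertical partitioning -- one two-column (subject, object)
    # table per predicate; here just a hypothetical dc:creator predicate.
    cur.execute("CREATE TABLE creator (s TEXT, o TEXT)")

    cur.execute("INSERT INTO triples VALUES ('ex:article1', 'dc:creator', 'ex:alice')")
    cur.execute("INSERT INTO creator VALUES ('ex:article1', 'ex:alice')")

    # The SPARQL pattern { ?doc dc:creator ?who } becomes, respectively:
    print(cur.execute("SELECT s, o FROM triples WHERE p = 'dc:creator'").fetchall())
    print(cur.execute("SELECT s, o FROM creator").fetchall())

With the PSO index, the predicate-bound scan reads one contiguous index range, which is the intuition behind observation (1) above.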
Storing and Querying Scientific Workflow Provenance Metadata Using an RDBMS
In Proceedings of the Third IEEE International Conference on e-Science and Grid Computing, 2007
"... Provenance management has become increasingly important to support scientific discovery reproducibility, result interpretation, and problem diagnosis in scientific workflow environments. This paper proposes an approach to provenance management that seamlessly integrates the interoperability, extensi ..."
Abstract
-
Cited by 14 (10 self)
- Add to MetaCart
(Show Context)
Provenance management has become increasingly important to support scientific discovery reproducibility, result interpretation, and problem diagnosis in scientific workflow environments. This paper proposes an approach to provenance management that seamlessly integrates the interoperability, extensibility, and reasoning advantages of Semantic Web technologies with the storage and querying power of an RDBMS. Specifically, we propose: i) two schema mapping algorithms to map an arbitrary OWL provenance ontology to a relational database schema that is optimized for common provenance queries; ii) two efficient data mapping algorithms to map provenance RDF metadata to relational data according to the generated relational database schema, and iii) a schema-independent SPARQL-to-SQL translation algorithm that is optimized on-the-fly by using the type information of an instance available from the input provenance ontology and the statistics of the sizes of the tables in the database. Experimental results are presented to show that our algorithms are efficient and scalable.
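As a rough illustration of the first ingredient, a class-per-table schema mapping could look like the following sketch (the ontology model, class names, and properties are invented for this example; the paper's algorithms are more involved):

    # Minimal sketch of an ontology-to-relational schema mapping: each class
    # becomes a table, and each datatype property with that class as domain
    # becomes a column.
    ontology = {
        "Process": ["startTime", "endTime"],
        "DataProduct": ["location", "checksum"],
    }

    def schema_for(classes):
        ddl = []
        for cls, props in classes.items():
            cols = ", ".join(["id TEXT PRIMARY KEY"] + [p + " TEXT" for p in props])
            ddl.append("CREATE TABLE %s (%s)" % (cls, cols))
        return ddl

    for stmt in schema_for(ontology):
        print(stmt)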
SPARQL Query Rewriting for Implementing Data Integration over Linked Data
"... There has been lately an increased activity of publishing structured data in RDF due to the activity of the Linked Data community 1. The presence on the Web of such a huge information cloud, ranging from academic to geographic to gene related information, poses a great challenge when it comes to rec ..."
Abstract
-
Cited by 14 (2 self)
- Add to MetaCart
(Show Context)
Lately there has been increased activity in publishing structured data in RDF, driven by the Linked Data community. The presence on the Web of such a huge information cloud, ranging from academic to geographic to gene-related information, poses a great challenge when it comes to reconciling the heterogeneous schemas adopted by data publishers. For several years, the Semantic Web community has been developing algorithms for aligning data models (ontologies). Nevertheless, exploiting such ontology alignments for achieving data integration is still an under-supported research topic. The semantics of ontology alignments, often defined over a logical framework, implies a reasoning step over huge amounts of data that is often hard to implement and rarely scales to Web dimensions. This paper presents an algorithm for achieving RDF data mediation based on SPARQL query rewriting. The approach is based on the encoding of rewriting rules for RDF patterns that constitute part of the structure of a SPARQL query.
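The following is a minimal sketch of such pattern-based rewriting under a single alignment rule (the vocabularies and the rule representation are illustrative, not taken from the paper):

    # Rewrite the triple patterns of a SPARQL basic graph pattern according
    # to alignment rules mapping a source predicate to a target predicate.
    rules = {"foaf:name": "vcard:FN"}   # an illustrative alignment

    def rewrite(bgp):
        return [(s, rules.get(p, p), o) for (s, p, o) in bgp]

    query = [("?person", "foaf:name", "?n")]
    print(rewrite(query))   # [('?person', 'vcard:FN', '?n')]

The rewritten pattern can then be evaluated directly against a data source that publishes under the target vocabulary.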
RDFMatView: Indexing RDF Data for SPARQL Queries
2010
"... Abstract. The Semantic Web as an evolution of the World Wide Web aims to create a universal medium for the exchange of semantically described data. The idea of representing this information by means of directed labelled graphs, RDF, has been widely accepted by the scientific community. However query ..."
Abstract
-
Cited by 7 (2 self)
- Add to MetaCart
(Show Context)
Abstract. The Semantic Web, as an evolution of the World Wide Web, aims to create a universal medium for the exchange of semantically described data. The idea of representing this information by means of directed labelled graphs, RDF, has been widely accepted by the scientific community. However, querying RDF data sets to find the desired information is often highly time-consuming due to the number of comparisons that are needed. In this article we propose indexes on RDF to reduce the search space and the SPARQL query processing time. Our approach is based on materialized queries, i.e., precomputed query patterns and their occurrences in the data sets. We provide a formal definition of RDFMatView indexes for SPARQL queries, a cost model to evaluate their potential impact on query performance, and a rewriting algorithm to use indexes in SPARQL queries. We also develop and compare different approaches to integrate such indexes into an existing SPARQL query engine. Our preliminary results show that our approach can drastically decrease query processing time compared to conventional query processing.
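As a toy illustration of why such indexes help (the paper's formal cost model is more refined; the counting below is a simplification assumed for the example):

    # Toy cost intuition: a basic graph pattern of n triple patterns needs
    # n - 1 joins; an index materializing k of those patterns collapses them
    # into a single precomputed relation, leaving (n - k + 1) - 1 joins.
    def joins_without_index(n):
        return n - 1

    def joins_with_index(n, k):
        return (n - k + 1) - 1

    print(joins_without_index(6))   # 5 joins
    print(joins_with_index(6, 4))   # 2 joins once 4 patterns come from the index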
OWSCIS: Ontology and Web Service Based Cooperation of Information Sources
In Proceedings of the Third International IEEE Conference on Signal-Image Technologies and Internet-Based Systems (SITIS), 2007
"... Abstract ..."
(Show Context)
Scientific Workflow Provenance Metadata Management Using an RDBMS
2007
"... Abstract. Provenance management has become increasingly important to support scientific discovery reproducibility, result interpretation, and problem diagnosis in scientific workflow environments. This paper proposes an approach to provenance management that seamlessly integrates the interoperabilit ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
(Show Context)
Abstract. Provenance management has become increasingly important to support scientific discovery reproducibility, result interpretation, and problem diagnosis in scientific workflow environments. This paper proposes an approach to provenance management that seamlessly integrates the interoperability, extensibility, and reasoning advantages of Semantic Web technologies with the storage and querying power of an RDBMS. Specifically, we propose: i) two schema mapping algorithms to map an arbitrary OWL provenance ontology to a relational database schema that is optimized for common provenance queries; ii) three efficient data mapping algorithms to map provenance RDF metadata to relational data according to the generated relational database schema; and iii) a schema-independent SPARQL-to-SQL translation algorithm that is optimized on-the-fly by using the type information of an instance available from the input provenance ontology and the statistics of the sizes of the tables in the database. While the schema mapping and query translation and optimization algorithms are applicable to general RDF storage and query systems, the data mapping algorithms are optimized for and applicable only to scientific workflow provenance metadata. Moreover, we extend SPARQL with negation, aggregation, and set operations to support additional important provenance queries. Experimental results are presented to show that our algorithms are efficient and scalable. A comparison with the existing RDF stores Jena and Sesame shows that our optimizations result in improved performance and scalability for provenance metadata management.
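To give a flavor of the translation step, here is a deliberately tiny sketch (the storage layout, property, and table names are assumptions for the example; the paper's algorithm handles full SPARQL and also exploits table statistics):

    # Sketch: translate the pattern { ?x :startTime ?t } to SQL, assuming a
    # class-per-table layout and an ontology declaring Process as the domain
    # of startTime.
    domain_of = {"startTime": "Process"}   # type information from the ontology

    def triple_to_sql(subj_var, prop, obj_var):
        table = domain_of[prop]            # the instance's type picks the table
        return "SELECT id AS %s, %s AS %s FROM %s" % (subj_var, prop, obj_var, table)

    print(triple_to_sql("x", "startTime", "t"))
    # SELECT id AS x, startTime AS t FROM Process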
SPARQL Query Containment under SHI Axioms
"... SPARQL query containment under schema axioms is the problem of determining whether, for any RDF graph satisfying a given set of schema axioms, the answers to a query are contained in the answers of another query. This problem has major applications for verification and optimization of queries. In or ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
SPARQL query containment under schema axioms is the problem of determining whether, for any RDF graph satisfying a given set of schema axioms, the answers to a query are contained in the answers of another query. This problem has major applications for verification and optimization of queries. In order to solve it, we rely on the µ-calculus. Firstly, we provide a mapping from RDF graphs into transition systems. Secondly, SPARQL queries and RDFS and SHI axioms are encoded into µ-calculus formulas. This allows us to reduce query containment and equivalence to satisfiability in the µ-calculus. Finally, we prove a double exponential upper bound for containment under SHI schema axioms.
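Schematically, and using notation of our own rather than the paper's exact encoding, the reduction has the form

    \[ q \sqsubseteq_{\mathcal{S}} q' \iff \eta(\mathcal{S}) \wedge A(q) \wedge \neg A(q')\ \text{is unsatisfiable}, \]

where \(\eta(\mathcal{S})\) encodes the SHI schema axioms and \(A(\cdot)\) encodes a query over the transition-system representation of RDF graphs; query equivalence is then mutual containment, and the stated double exponential bound comes from deciding satisfiability of the resulting µ-calculus formula.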
RDFMatView: Indexing RDF Data Using Materialized SPARQL Queries
In International Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS), 2010
"... Abstract. The Semantic Web aims to create a universal medium for the exchange of semantically tagged data. The idea of representing and querying this information by means of directed labelled graphs, i.e., RDF and SPARQL, has been widely accepted by the scientific community. However, even when most ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
Abstract. The Semantic Web aims to create a universal medium for the exchange of semantically tagged data. The idea of representing and querying this information by means of directed labelled graphs, i.e., RDF and SPARQL, has been widely accepted by the scientific community. However, even though most current implementations of RDF/SPARQL are based on ad-hoc storage systems, processing complex queries on large data sets incurs a high number of joins, which may slow down performance. In this article we propose materialized SPARQL queries as indexes on RDF data sets to reduce the number of necessary joins and thus query processing time. We provide a formal definition of materialized SPARQL queries, a cost model to evaluate their impact on query performance, a storage scheme for the materialization, and an algorithm to find the optimal set of indexes given a query. We also present and evaluate different approaches to integrate materialized queries into an existing SPARQL query engine. An evaluation shows that our approach can drastically decrease query processing time compared to direct evaluation.
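A minimal sketch of the rewriting idea (the data structures and vocabulary below are invented for illustration): triple patterns covered by a materialized query are served from its stored occurrences, leaving only the uncovered remainder to evaluate:

    # The materialized "index": a query pattern plus its precomputed occurrences.
    mat_pattern = [("?a", "dc:creator", "?p"), ("?a", "dcterms:issued", "?y")]
    mat_rows = [{"a": "ex:art1", "p": "ex:alice", "y": "2008"}]

    def split_by_view(bgp):
        covered = [tp for tp in bgp if tp in mat_pattern]
        rest = [tp for tp in bgp if tp not in mat_pattern]
        return covered, rest

    bgp = [("?a", "dc:creator", "?p"),
           ("?a", "dcterms:issued", "?y"),
           ("?p", "foaf:name", "?n")]
    covered, rest = split_by_view(bgp)
    print(len(covered), "patterns served from", len(mat_rows), "stored rows;",
          len(rest), "pattern left to join")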
Using Description Logics for the Provision of Context-Driven Content Adaptation Services
2010
"... This paper presents our design and development of a description logics-based planner for providing context-driven content adaptation services. This approach dynamically transforms requested Web content into a proper format conforming to receiving contexts (e.g., access condition, network connection, ..."
Abstract
- Add to MetaCart
This paper presents our design and development of a description-logics-based planner for providing context-driven content adaptation services. This approach dynamically transforms requested Web content into a proper format conforming to the receiving context (e.g., access conditions, network connection, and receiving device). Aiming to establish a semantic foundation for content adaptation, we apply description logics to formally define context profiles and requirements. We also propose a formal Object Structure Model as the basis of content adaptation management for higher reusability and adaptability. To automate content adaptation decisions, our content adaptation planner is driven by a stepwise procedure equipped with algorithms and techniques to enable rule-based, context-driven content adaptation over the mobile Internet. Experimental results demonstrate the effectiveness and efficiency of our content adaptation planner in saving transmission bandwidth when users are on handheld devices. By reducing the size of adapted content, we moderately decrease the computational overhead caused by content adaptation.
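As a schematic example of the kind of context profile the abstract describes (the class and role names below are our own illustration, not the paper's model), a receiving context for a handheld device on a slow link could be axiomatized as

    \[ \mathit{HandheldContext} \sqsubseteq \mathit{Context} \sqcap \exists \mathit{hasDevice}.\mathit{Handheld} \sqcap \exists \mathit{hasNetwork}.\mathit{LowBandwidth} \]

so that a description-logic reasoner can classify an incoming request's context and trigger the matching adaptation rule.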