• Documents
  • Authors
  • Tables
  • Other Seers ▼
    RefSeer AckSeer CollabSeer SeerSeer
  • Log in
  • Sign up
  • MetaCart

CiteSeerX logo

Advanced Search Include Citations
Advanced Search Include Citations | Disambiguate

Gutierrez C: Survey of graph database models (0)

by R Angles
Venue:ACM Comput Surv
Add To MetaCart

Tools

Sorted by:
Results 1 - 10 of 19
Next 10 →

The Open Provenance Model

by Luc Moreau, Natalia Kwasnikowska, Jan Van Den Bussche , 2008
"... The Open Provenance Model (OPM) is a community-driven data model for Provenance that is designed to support inter-operability of provenance technology. Underpinning OPM, is a notion of directed acyclic graph, used to represent data products and processes involved in past computations, and causal dep ..."
Abstract - Cited by 28 (4 self) - Add to MetaCart
The Open Provenance Model (OPM) is a community-driven data model for Provenance that is designed to support inter-operability of provenance technology. Underpinning OPM, is a notion of directed acyclic graph, used to represent data products and processes involved in past computations, and causal dependencies between these. The Open Provenance Model was derived following two “Provenance Challenges”, international, multidisciplinary activities trying to investigate how to exchange information between multiple systems supporting provenance and how to query it. The OPM design was mostly driven by practical and pragmatic considerations, and is being tested in a third Provenance Challenge, which has just started. The purpose of this paper is to investigate the theoretical foundations of this data model. The formalisation consists of a set-theoretic definition of the data model, a definition of the inferences by transitive closure that are permitted, a formal description of how the model can be used to express dependencies in past computations, and finally, a description of the kind of time-based inferences that are supported. A novel element that OPM introduces is the concept of an account, by which multiple descriptions of a same execution are allowed to co-exist in a same graph. Our formalisation gives a precise meaning to such accounts and associated notions of alternate and refinement. Warning It was decided that this paper should be released as early as possible since it brings useful clarifications on the Open Provenance Model, and therefore can benefit the Provenance Challenge 3 community. The reader should recognise that this paper is however an early draft, and several sections are incomplete. Additionally, figures rely on colours but these may be difficult to read when printed in a black and white. It is advisable to print the paper in colour. 1 1

nSPARQL: A Navigational Language for RDF

by Jorge Pérez, Marcelo Arenas, Claudio Gutierrez
"... Navigational features have been largely recognized as fundamental for graph database query languages. This fact has motivated several authors to propose RDF query languages with navigational capabilities. In this paper, we propose the query language nSPARQL that uses nested regular expressions to na ..."
Abstract - Cited by 26 (4 self) - Add to MetaCart
Navigational features have been largely recognized as fundamental for graph database query languages. This fact has motivated several authors to propose RDF query languages with navigational capabilities. In this paper, we propose the query language nSPARQL that uses nested regular expressions to navigate RDF data. We study some of the fundamental properties of nSPARQL and nested regular expressions concerning expressiveness and complexity of evaluation. Regarding expressiveness, we show that nSPARQL is expressive enough to answer queries considering the semantics of the RDFS vocabulary by directly traversing the input graph. We also show that nesting is necessary in nSPARQL to obtain this last result, and we study the expressiveness of the combination of nested regular expressions and SPARQL operators. Regarding complexity of evaluation, we prove that given an RDF graph G and a nested regular expression E, this problem can be solved in time O(|G|·|E|).

An algebra for basic graph patterns

by George H. L. Fletcher - In Proc. of the Workshop on Logic in Databases (LID , 2008
"... Abstract. Motivated by recent developments in the dataspaces, web, and personal information management communities, we outline research directions on query processing for SPARQL, the W3C recommendation language for querying RDF triple stores. The core of each SPARQL query is a basic graph pattern (B ..."
Abstract - Cited by 4 (2 self) - Add to MetaCart
Abstract. Motivated by recent developments in the dataspaces, web, and personal information management communities, we outline research directions on query processing for SPARQL, the W3C recommendation language for querying RDF triple stores. The core of each SPARQL query is a basic graph pattern (BGP). BGP is a little logic for extracting subsets of related nodes in an RDF graph. In this paper we undertake a formal study of BGP with an eye towards efficient SPARQL query evaluation. Our main contributions are (1) an algebraization of BGP, and (2) first steps towards a framework for the design of structural indexes to accelerate processing of queries in this algebra. 1

An Extension of SPARQL for RDFS

by Marcelo Arenas, Claudio Gutierrez, Jorge Pérez, Católica Chile - In SWDB-ODBIS 2007
"... Abstract. RDF Schema (RDFS) extends RDF with a schema vocabulary with a predefined semantics. Evaluating queries which involve this vocabulary is challenging, and there is not yet consensus in the Semantic Web community on how to define a query language for RDFS. In this paper, we introduce a langua ..."
Abstract - Cited by 3 (2 self) - Add to MetaCart
Abstract. RDF Schema (RDFS) extends RDF with a schema vocabulary with a predefined semantics. Evaluating queries which involve this vocabulary is challenging, and there is not yet consensus in the Semantic Web community on how to define a query language for RDFS. In this paper, we introduce a language for querying RDFS data. This language is obtained by extending SPARQL with nested regular expressions that allow to navigate through an RDF graph with RDFS vocabulary. This language is expressive enough to answer SPARQL queries involving RDFS vocabulary, by directly traversing the input graph. 1

A Query Language for Analyzing Networks

by Anton Dries, Siegfried Nijssen, Luc De Raedt
"... With more and more large networks becoming available, mining and querying such networks are increasingly important tasks which are not being supported by database models and querying languages. This paper wants to alleviate this situation by proposing a data model and a query language for facilitati ..."
Abstract - Cited by 3 (1 self) - Add to MetaCart
With more and more large networks becoming available, mining and querying such networks are increasingly important tasks which are not being supported by database models and querying languages. This paper wants to alleviate this situation by proposing a data model and a query language for facilitating the analysis of networks. Key features include support for executing external tools on the networks, flexible contexts on the network each resulting in a different graph, primitives for querying subgraphs (including paths) and transforming graphs. The data model provides for a closure property, in which the output of every query can be stored in the database and used for further querying.

Scalable Indexing of RDF Graphs for Efficient Join Processing ABSTRACT

by George H. L. Fletcher
"... Current approaches to RDF graph indexing suffer from weak data locality, i.e., information regarding a piece of data appears in multiple locations, spanning multiple data structures. Weak data locality negatively impacts storage and query processing costs. Towards stronger data locality, we propose ..."
Abstract - Cited by 3 (0 self) - Add to MetaCart
Current approaches to RDF graph indexing suffer from weak data locality, i.e., information regarding a piece of data appears in multiple locations, spanning multiple data structures. Weak data locality negatively impacts storage and query processing costs. Towards stronger data locality, we propose a Three-way Triple Tree (TripleT) secondary memory indexing technique to facilitate flexible and efficient join evaluation on RDF data. The novelty of TripleT is that the index is built over the atoms occurring in the data set, rather than at a coarser granularity, such as whole triples occurring in the data set; and, the atoms are indexed regardless of the roles (i.e., subjects, predicates, or objects) they play in the triples of the data set. We show through extensive empirical evaluation that TripleT exhibits multiple orders of magnitude improvement over the state-of-the-art, in terms of both storage and query processing costs.

Representing, querying and transforming social networks with rdf/sparql

by Mauro San Martín, Claudio Gutierrez - Semantic Web: Research and Applications , 2009
"... Abstract. As social networks are becoming ubiquitous on the Web, the Semantic Web goals indicate that it is critical to have a standard model allowing exchange, interoperability, transformation, and querying of social network data. In this paper we show that RDF/SPARQL meet this desiderata. Building ..."
Abstract - Cited by 3 (0 self) - Add to MetaCart
Abstract. As social networks are becoming ubiquitous on the Web, the Semantic Web goals indicate that it is critical to have a standard model allowing exchange, interoperability, transformation, and querying of social network data. In this paper we show that RDF/SPARQL meet this desiderata. Building on developments of social network analysis, graph databases and Semantic Web, we present a social networks data model based on RDF, and a query and transformation language based on SPARQL meeting the above requirements. We study its expressive power and complexity showing that it behaves well, and present an illustrative prototype. 1

Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud

by Yucheng Low, Danny Bickson, Joseph Gonzalez, Carlos Guestrin, Aapo Kyrola, Joseph M. Hellerstein
"... While high-level data parallel frameworks, like MapReduce, simplify the design and implementation of large-scale data processing systems, they do not naturally or efficiently support many important data mining and machine learning algorithms and can lead to inefficient learning systems. To help fill ..."
Abstract - Cited by 3 (0 self) - Add to MetaCart
While high-level data parallel frameworks, like MapReduce, simplify the design and implementation of large-scale data processing systems, they do not naturally or efficiently support many important data mining and machine learning algorithms and can lead to inefficient learning systems. To help fill this critical void, we introduced the GraphLab abstraction which naturally expresses asynchronous, dynamic, graph-parallel computation while ensuring data consistency and achieving a high degree of parallel performance in the shared-memory setting. In this paper, we extend the GraphLab framework to the substantially more challenging distributed setting while preserving strong data consistency guarantees. We develop graph based extensions to pipelined locking and data versioning to reduce network congestion and mitigate the effect of network latency. We also introduce fault tolerance to the GraphLab abstraction using the classic Chandy-Lamport snapshot algorithm and demonstrate how it can be easily implemented by exploiting the GraphLab abstraction itself. Finally, we evaluate our distributed implementation of the GraphLab abstraction on a large Amazon EC2 deployment and show 1-2 orders of magnitude performance gains over Hadoop-based implementations. 1.

Finding Maximum Degrees in Hidden Bipartite Graphs

by Yufei Tao, Cheng Sheng, Jianzhong Li , 2010
"... An (edge) hidden graph is a graph whose edges are not explicitly given. Detecting the presence of an edge requires expensive edgeprobing queries. We consider thek most connected vertex problem on hidden bipartite graphs. Specifically, given a bipartite graph G with independent vertex sets B and W, t ..."
Abstract - Cited by 1 (0 self) - Add to MetaCart
An (edge) hidden graph is a graph whose edges are not explicitly given. Detecting the presence of an edge requires expensive edgeprobing queries. We consider thek most connected vertex problem on hidden bipartite graphs. Specifically, given a bipartite graph G with independent vertex sets B and W, the goal is to find the k vertices in B with the largest degrees using the minimum number of queries. This problem can be regarded as a top-k extension of a semi-join, and is encountered in many applications in practice (e.g., top-k spatial join with arbitrarily complex join predicates). If B and W have n and m vertices respectively, the number of queries needed to solve the problem isnm in the worst case. This, however, is a pessimistic estimate on how many queries are necessary on practical data. In fact, on some easy inputs, the problem can be efficiently settled with onlykm+n edges, which is significantly

Foundations of RDF Databases

by Marcelo Arenas, Claudio Gutierrez, Jorge Pérez
"... Abstract The goal of this paper is to give an overview of the basics of the theory of RDF databases. We provide a formal definition of RDF that includes the features that distinguish this model from other graph data models. We then move into the fundamental issue of querying RDF data. We start by co ..."
Abstract - Cited by 1 (0 self) - Add to MetaCart
Abstract The goal of this paper is to give an overview of the basics of the theory of RDF databases. We provide a formal definition of RDF that includes the features that distinguish this model from other graph data models. We then move into the fundamental issue of querying RDF data. We start by considering the RDF query language SPARQL, which is a W3C Recommendation since January 2008. We provide an algebraic syntax and a compositional semantics for this language, study the complexity of the evaluation problem for different fragments of SPARQL, and consider the problem of optimizing the evaluation of SPARQL queries, showing that a natural fragment of this language has some good properties in this respect. We furthermore study the expressive power of SPARQL, by comparing it with some well-known query languages such as relational algebra. We conclude by considering the issue of querying RDF data in the presence of RDFS vocabulary. In particular, we present a recently proposed extension of SPARQL with navigational capabilities. 1
The National Science Foundation
  • About CiteSeerX
  • Submit Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2010 The Pennsylvania State University