Results 1 - 10 of 57
Linked Data -- The Story So Far
"... The term Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the last three years, leading to the creation of a global data space containing billions of assertion ..."
Abstract
-
Cited by 739 (15 self)
- Add to MetaCart
The term Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the last three years, leading to the creation of a global data space containing billions of assertions -- the Web of Data. In this article we present the concept and technical principles of Linked Data, and situate these within the broader context of related technological developments. We describe progress to date in publishing Linked Data on the Web, review applications that have been developed to exploit the Web of Data, and map out a research agenda for the Linked Data community as it moves forward.
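As a concrete illustration of the lookup principle behind these best practices (a minimal sketch, not taken from the article itself): a Linked Data client dereferences an HTTP URI and asks for RDF via content negotiation. The DBpedia URI below is simply a well-known example resource, and the request needs network access.

import urllib.request

# Dereference an HTTP URI that identifies a real-world thing and negotiate
# for an RDF representation instead of HTML.
uri = "http://dbpedia.org/resource/Berlin"
req = urllib.request.Request(uri, headers={"Accept": "text/turtle"})
with urllib.request.urlopen(req) as resp:    # follows the 303 redirect
    print(resp.headers.get("Content-Type"))  # expected: an RDF media type
    print(resp.read(300).decode("utf-8", errors="replace"))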
Publishing and Consuming Provenance Metadata on the Web of Linked Data
In: Proc. of 3rd Int. Provenance and Annotation Workshop, 2010
"... Abstract. The World Wide Web evolves into a Web of Data, a huge, globally distributed dataspace that contains a rich body of machineprocessable information from a virtually unbound set of providers covering a wide range of topics. However, due to the openness of the Web little is known about who cre ..."
Abstract
-
Cited by 44 (4 self)
- Add to MetaCart
(Show Context)
The World Wide Web is evolving into a Web of Data: a huge, globally distributed dataspace that contains a rich body of machine-processable information from a virtually unbounded set of providers covering a wide range of topics. However, due to the openness of the Web, little is known about who created the data and how. The fact that a large amount of the data on the Web is derived by replication, query processing, modification, or merging raises concerns about information quality. Poor-quality data may propagate quickly and contaminate the Web of Data. Provenance information about who created and published the data, and how, provides the means for quality assessment. This paper takes a first step towards creating a quality-aware Web of Data: we present approaches to integrate provenance information into the Web of Data and we illustrate how this information can be consumed. In particular, we introduce a vocabulary to describe the provenance of Web data as metadata and we discuss possibilities for making such provenance metadata accessible as part of the Web of Data. Furthermore, we describe how this metadata can be queried and consumed to identify outdated information.
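To make the idea of provenance metadata concrete, here is a minimal rdflib sketch using plain Dublin Core terms as a stand-in; the paper introduces its own dedicated vocabulary, and the dataset, agent, and source URIs below are hypothetical.

from rdflib import Graph, URIRef, Literal
from rdflib.namespace import DCTERMS, XSD

g = Graph()
dataset = URIRef("http://example.org/dataset/cities")  # hypothetical
# Who created the data, when, and from which source: the kind of metadata
# that enables quality assessment.
g.add((dataset, DCTERMS.creator, URIRef("http://example.org/people/alice")))
g.add((dataset, DCTERMS.created,
       Literal("2010-03-01T12:00:00", datatype=XSD.dateTime)))
g.add((dataset, DCTERMS.source, URIRef("http://example.org/raw/cities.csv")))
print(g.serialize(format="turtle"))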
Using Web Data Provenance for Quality Assessment
In: Proc. of the Workshop on Semantic Web and Provenance Management at ISWC, 2009
"... Abstract—The Web of Data cannot be a trustworthy data source unless an approach for evaluating the quality of data on the Web is established and integrated as part of the data publication and access process. In this paper, we propose an approach of using provenance information about the data on the ..."
Abstract
-
Cited by 43 (3 self)
- Add to MetaCart
(Show Context)
The Web of Data cannot be a trustworthy data source unless an approach for evaluating the quality of data on the Web is established and integrated into the data publication and access process. In this paper, we propose an approach that uses provenance information about data on the Web to assess its quality and trustworthiness. Our contributions include a model for Web data provenance and an assessment method that can be adapted for specific quality criteria. We demonstrate how this method can be used to evaluate the timeliness of data on the Web, i.e., how up-to-date the data is. We also propose a possible solution for dealing with missing provenance information by associating certainty values with calculated quality values.
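The following Python sketch captures the general idea under stated assumptions (the scoring function, decay window, and certainty values are invented, not the paper's formulas): timeliness decays with the age of the data, and the certainty of the score drops when provenance is missing.

from datetime import datetime, timezone

def timeliness(last_modified, max_age_days=365.0, now=None):
    """Return (score, certainty), both in [0, 1]."""
    if last_modified is None:    # provenance missing: neutral guess,
        return 0.5, 0.1          # reported with very low certainty
    now = now or datetime.now(timezone.utc)
    age_days = (now - last_modified).total_seconds() / 86400.0
    return max(0.0, 1.0 - age_days / max_age_days), 1.0

print(timeliness(datetime(2009, 1, 1, tzinfo=timezone.utc)))
print(timeliness(None))   # missing provenance case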
An empirical survey of Linked Data conformance
2009
"... There has been a recent, tangible growth in RDF published on the Web in accordance with the Linked Data principles and best practices, the result of which has been dubbed the “Web of Data”. Linked Data guidelines are designed to facilitate ad hoc re-use and integration of conformant structured data— ..."
Abstract
-
Cited by 31 (3 self)
- Add to MetaCart
There has been a recent, tangible growth in RDF published on the Web in accordance with the Linked Data principles and best practices, the result of which has been dubbed the “Web of Data”. Linked Data guidelines are designed to facilitate ad hoc re-use and integration of conformant structured data -- across the Web -- by consumer applications; however, thus far, systems have yet to emerge that convincingly demonstrate the potential applications for consuming currently available Linked Data. Herein, we compile a list of fourteen concrete guidelines as given in the “How to Publish Linked Data on the Web” tutorial. Thereafter, we evaluate the conformance of current RDF data providers with respect to these guidelines. Our evaluation is based on quantitative empirical analyses of a crawl of ∼4 million RDF/XML documents constituting over 1 billion quadruples, where we also look at the stability of hosted documents for a corpus consisting of nine monthly snapshots from a sample of 151 thousand documents. Backed by our empirical survey, we provide insights into the current level of conformance with respect to various Linked Data guidelines, enumerating lists of the most (non-)conformant data providers. We show that certain guidelines are broadly adhered to (esp. use HTTP URIs, keep URIs stable), whilst others are commonly overlooked (esp. provide licensing and human-readable metadata). We also compare PageRank scores for the data providers and their conformance to Linked Data guidelines, showing that the two factors correlate negatively for guidelines restricting the use of RDF features, and positively for guidelines encouraging external linkage and vocabulary re-use. Finally, we present a summary of conformance for the different guidelines, and present the top-ranked data providers in terms of a combined PageRank and Linked Data conformance score.
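As a toy illustration of what such conformance checking involves (not the survey's actual evaluation code), the sketch below scores one guideline, "use HTTP URIs", over an rdflib graph; the scoring rule and sample data are invented.

from rdflib import Graph, URIRef, BNode

def http_uri_ratio(g):
    # Fraction of subject/object resources that are HTTP(S) URIs rather
    # than blank nodes or URIs in other schemes (literals are ignored).
    ok = total = 0
    for s, p, o in g:
        for node in (s, o):
            if isinstance(node, (URIRef, BNode)):
                total += 1
                if (isinstance(node, URIRef)
                        and str(node).startswith(("http://", "https://"))):
                    ok += 1
    return ok / total if total else 1.0

g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/> .
ex:a ex:knows ex:b .
_:x ex:knows ex:a .
""", format="turtle")
print(http_uri_ratio(g))   # 0.75: one of four resource positions is blank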
Querying Trust in RDF Data with tSPARQL
2009
"... Today a large amount of RDF data is published on the Web. However, the openness of the Web and the ease to combine RDF data from different sources creates new challenges. The Web of data is missing a uniform way to assess and to query the trustworthiness of information. In this paper we present tSP ..."
Abstract
-
Cited by 27 (1 self)
- Add to MetaCart
Today a large amount of RDF data is published on the Web. However, the openness of the Web and the ease of combining RDF data from different sources create new challenges. The Web of Data lacks a uniform way to assess and to query the trustworthiness of information. In this paper we present tSPARQL, a trust-aware extension to SPARQL. Two additional keywords enable users to describe trust requirements and to query the trustworthiness of RDF data. Hence, tSPARQL makes it easy to add trust awareness to RDF-based applications. As a foundation, we propose a trust model that associates RDF statements with trust values, and we extend the SPARQL semantics to access these trust values in tSPARQL. Furthermore, we discuss opportunities for optimizing the execution of tSPARQL queries.
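A sketch of the trust model underneath (not tSPARQL's actual syntax; the trust-value range [-1, 1] and the example data are assumptions): each statement carries a trust value, and a trust requirement filters query results much as the paper's added keywords do inside SPARQL.

# RDF statements annotated with trust values (invented example data).
annotated = [
    (("ex:Berlin", "ex:population", "3400000"),  0.9),
    (("ex:Berlin", "ex:population", "9999999"), -0.4),
]

def ensure_trust(statements, threshold):
    # Keep only statements whose trust value meets the requirement.
    return [(t, v) for t, v in statements if v >= threshold]

print(ensure_trust(annotated, 0.5))   # only the trusted statement survives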
A General Framework for Representing, Reasoning and Querying with Annotated Semantic Web Data
2011
"... published as conference papers in AAAI 2010 [40] and ISWC 2010 [28]. ..."
Abstract
-
Cited by 21 (4 self)
- Add to MetaCart
(Show Context)
Published as conference papers in AAAI 2010 [40] and ISWC 2010 [28].
Data-Gov Wiki: Towards Linked Government Data
"... Abstract. The Data-gov Wiki is the delivery site for a project where we investigate the role of linked data in producing, processing and utilizing the government datasets found in data.gov. Towards facilitating the Web developers and users access the public government data transparently, the Data-go ..."
Abstract
-
Cited by 20 (6 self)
- Add to MetaCart
The Data-gov Wiki is the delivery site for a project in which we investigate the role of linked data in producing, processing and utilizing the government datasets found on data.gov. To help Web developers and users access public government data transparently, the Data-gov Wiki currently features: (i) an RDF dump of interlinked US government data (over 2 billion triples covering hundreds of data.gov datasets) with dereferenceable URIs; (ii) a Semantic Wiki-based user interface mashing up the catalog data published at data.gov, machine-generated statistics at TWC, and user-contributed data that connects the RDF dump to Linked Open Data; (iii) a number of visual demos illustrating the practical value of the linked data.gov datasets as well as the corresponding technical details; and (iv) web services that publish changes in data.gov datasets (e.g. recently added/updated datasets) via RSS and Twitter. Extensions underway include applications and demonstrations that show how semantically linked government data can be used to combine information from different datasets, how these datasets can be combined with information found elsewhere on the Web, and how the US data.gov efforts can be linked with the UK linked-data release currently under development.
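In the spirit of the mashup demos described above, the sketch below merges two tiny RDF extracts (invented stand-ins for converted data.gov datasets) and joins them in a single SPARQL query with rdflib.

from rdflib import Graph

budget = """
@prefix ex: <http://example.org/gov/> .
ex:agency1 ex:name "EPA" ; ex:budget 9000000 .
"""
staff = """
@prefix ex: <http://example.org/gov/> .
ex:agency1 ex:employees 17000 .
"""
g = Graph()
g.parse(data=budget, format="turtle")
g.parse(data=staff, format="turtle")   # merging = parsing into one graph

for row in g.query("""
    PREFIX ex: <http://example.org/gov/>
    SELECT ?name ?budget ?employees WHERE {
      ?a ex:name ?name ; ex:budget ?budget ; ex:employees ?employees .
    }"""):
    print(row.name, row.budget, row.employees)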
An HTTP-Based Versioning Mechanism for Linked Data
"... Dereferencing a URI returns a representation of the current state of the resource identified by that URI. But, on the Web representations of prior states of a resource are also available, for example, as resource versions in Content Management Systems or archival resources in Web Archives such as th ..."
Abstract
-
Cited by 16 (9 self)
- Add to MetaCart
(Show Context)
Dereferencing a URI returns a representation of the current state of the resource identified by that URI. But on the Web, representations of prior states of a resource are also available, for example as resource versions in Content Management Systems or as archival resources in Web Archives such as the Internet Archive. This paper introduces a resource versioning mechanism that is fully based on HTTP and uses datetime as a global version indicator. The approach allows “follow your nose” style navigation both from the current time-generic resource to associated time-specific version resources and among version resources. The proposed versioning mechanism is congruent with the Architecture of the World Wide Web, and is based on the Memento framework, which extends HTTP with transparent content negotiation in the datetime dimension. The paper shows how the versioning approach applies to Linked Data and, by means of a demonstrator built for DBpedia, illustrates how it can be used to conduct a time-series analysis across versions of Linked Data descriptions.
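Datetime content negotiation of this kind was later standardized as Memento (RFC 7089). The sketch below shows the pattern, assuming the Internet Archive's public TimeGate URI form as an example endpoint: send Accept-Datetime, follow the redirect to the closest archived version, and read the Memento-Datetime header.

import urllib.request

target = "http://dbpedia.org/page/Berlin"
timegate = "https://web.archive.org/web/" + target  # assumed TimeGate form
req = urllib.request.Request(
    timegate,
    headers={"Accept-Datetime": "Thu, 01 Apr 2010 00:00:00 GMT"})
with urllib.request.urlopen(req) as resp:            # follows redirects
    print(resp.headers.get("Memento-Datetime"))      # datetime of the version
    print(resp.url)                                  # URI of the archived copy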
Using SPARQL and SPIN for Data Quality Management on the Semantic Web
"... Abstract. The quality of data is a key factor that determines the performance of information systems, in particular with regard (1) to the amount of exceptions in the execution of business processes and (2) to the quality of decisions based on the output of the respective information system. Recentl ..."
Abstract
-
Cited by 13 (0 self)
- Add to MetaCart
(Show Context)
The quality of data is a key factor that determines the performance of information systems, in particular with regard to (1) the number of exceptions in the execution of business processes and (2) the quality of decisions based on the output of the respective information system. Recently, the Semantic Web and Linked Data activities have started to provide substantial data resources that may be used for real business operations. Hence, it will soon be critical to manage the quality of such data. Unfortunately, we can observe a wide range of data quality problems in Semantic Web data. In this paper, we (1) evaluate how the state of the art in data quality research fits the characteristics of the Web of Data, and (2) describe how the SPARQL query language and the SPARQL Inferencing Notation (SPIN) can be utilized to identify data quality problems in ...
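An illustrative data quality rule written as plain SPARQL (SPIN itself represents such queries as RDF so they can be attached to class definitions): flag resources with an illegal value, here a negative population. The vocabulary and data are invented.

from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/> .
ex:Atlantis ex:population -500 .
ex:Berlin   ex:population 3400000 .
""", format="turtle")

violations = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?city ?pop WHERE {
      ?city ex:population ?pop .
      FILTER (?pop < 0)             # illegal-value rule
    }""")
for row in violations:
    print("quality problem:", row.city, row.pop)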
Weaving a Social Data Web with Semantic Pingback
"... Abstract. In this paper we tackle some of the most pressing obstacles of the emerging Linked Data Web, namely the quality, timeliness and coherence as well as direct end user benefits. We present an approach for complementing the Linked Data Web with a social dimension by extending the well-known Pi ..."
Abstract
-
Cited by 12 (2 self)
- Add to MetaCart
(Show Context)
In this paper we tackle some of the most pressing obstacles of the emerging Linked Data Web, namely quality, timeliness and coherence, as well as direct end-user benefits. We present an approach for complementing the Linked Data Web with a social dimension by extending the well-known Pingback mechanism, a technological cornerstone of the blogosphere, towards a Semantic Pingback. It is based on advertising an RPC service for propagating typed RDF links between Data Web resources. Semantic Pingback is downwards compatible with conventional Pingback implementations, thus allowing resources on the Social Web to be connected and interlinked with resources on the Data Web. We demonstrate its usefulness by showcasing use cases of Semantic Pingback implementations in the semantic wiki OntoWiki and in Triplify, the Linked Data interface for database-backed Web applications.
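For reference, a conventional Pingback, with which the Semantic Pingback approach stays compatible, is a single XML-RPC call. The sketch below uses Python's standard library; the endpoint and resource URLs are hypothetical, while "pingback.ping" with a (source, target) pair is the standard Pingback RPC method.

import xmlrpc.client

endpoint = "http://example.org/pingback-endpoint"   # hypothetical server
source = "http://example.org/alice/foaf.rdf#me"     # resource that links
target = "http://example.org/bob/foaf.rdf#me"       # resource linked to

server = xmlrpc.client.ServerProxy(endpoint)
reply = server.pingback.ping(source, target)        # standard Pingback method
print(reply)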