CiteSeerX
Robust and scalable Linked Data reasoning incorporating provenance and trust annotations

by P A Bonatti, A Hogan, A Polleres, L Sauro
Venue: J. Web Sem.
Results 1 - 10 of 24

Scalable and Distributed Methods for Entity Matching, Consolidation and Disambiguation over Linked Data Corpora

by Aidan Hogan, Antoine Zimmermann, Jürgen Umbrich, Axel Polleres, Stefan Decker, 2011
Abstract - Cited by 21 (6 self)
With respect to large-scale, static, Linked Data corpora, in this paper we discuss scalable and distributed methods for entity consolidation (aka. smushing, entity resolution, object consolidation, etc.) to locate and process names that signify the same entity. We investigate (i) a baseline approach, which uses explicit owl:sameAs relations to perform consolidation; (ii) extended entity consolidation which additionally uses a subset of OWL 2 RL/RDF rules to derive novel owl:sameAs relations through the semantics of inverse-functional properties, functional-properties and (max-)cardinality restrictions with value one; (iii) deriving weighted concurrence measures between entities in the corpus based on shared inlinks/outlinks and attribute values using statistical analyses; (iv) disambiguating (initially) consolidated entities based on inconsistency detection using OWL 2 RL/RDF rules. Our methods are based upon distributed sorts and scans of the corpus, where we deliberately avoid the requirement for indexing all data. Throughout, we offer evaluation over a diverse Linked Data corpus consisting of 1.118 billion quadruples derived from a domain-agnostic, open crawl of 3.985 million RDF/XML Web documents, demonstrating the feasibility of our methods at that scale, and giving insights into the quality of the results for real-world data.
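The baseline approach (i), consolidation via explicit owl:sameAs relations, can be sketched as a union-find over the sameAs statements followed by picking one canonical identifier per partition. This is a minimal illustration under assumed quad and identifier conventions, not the paper's distributed sort/scan implementation:

```python
# Sketch of baseline owl:sameAs consolidation: union-find over explicit
# owl:sameAs statements, then one canonical identifier per partition.
# The quad layout and "lexicographically least" canon rule are assumptions.

SAME_AS = "owl:sameAs"

def find(parent, x):
    # Path-halving find: walk to the partition root, shortening the chain.
    while parent.setdefault(x, x) != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def consolidate(quads):
    """Map each coreferent term to a canonical identifier."""
    parent = {}
    for s, p, o, _ctx in quads:
        if p == SAME_AS:
            parent[find(parent, s)] = find(parent, o)
    # Choose the lexicographically least member of each partition as canon.
    canon = {}
    for x in parent:
        root = find(parent, x)
        canon[root] = min(canon.get(root, x), x)
    return {x: canon[find(parent, x)] for x in parent}

mapping = consolidate([
    ("ex:a", SAME_AS, "ex:b", "doc1"),
    ("ex:b", SAME_AS, "ex:c", "doc2"),
])
print(mapping["ex:a"] == mapping["ex:c"])  # True: a and c share one canon
```

The extended approach (ii) would feed additional owl:sameAs pairs, derived from inverse-functional and functional properties, into the same union-find.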

Exploiting RDFS and OWL for Integrating Heterogeneous, Large-Scale, Linked Data Corpora

by Aidan Hogan, 2011
Abstract - Cited by 17 (11 self)
The Web contains a vast amount of information on an abundance of topics, much of which is encoded as structured data indexed by local databases. However, these databases are rarely interconnected and information reuse across sites is limited. Semantic Web standards offer a possible solution in the form of an agreed-upon data model and set of syntaxes, as well as metalanguages for publishing schema-level information, offering a highly-interoperable means of publishing and interlinking structured data on the Web. Thanks to the Linked Data community, an unprecedented lode of such data has now been published on the Web—by individuals, academia, communities, corporations and governmental organisations alike—on a medley of often overlapping topics. This new publishing paradigm has opened up a range of new and interesting research topics with respect to how this emergent “Web of Data” can be harnessed and exploited by consumers. Indeed, although Semantic

Assessing linked data mappings using network measures

by Paul Groth, Claus Stadler, Jens Lehmann - In ESWC, 2012
Abstract - Cited by 15 (4 self)
Abstract. Linked Data is at its core about the setting of links between resources. Links provide enriched semantics, pointers to extra information and enable the merging of data sets. However, as the amount of Linked Data has grown, there has been the need to automate the creation of links and such automated approaches can create low-quality links or unsuitable network structures. In particular, it is difficult to know whether the links introduced improve or diminish the quality of Linked Data. In this paper, we present LINK-QA, an extensible framework that allows for the assessment of Linked Data mappings using network metrics. We test five metrics using this framework on a set of known good and bad links generated by a common mapping system, and show the behaviour of those metrics.
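One family of network measures such a framework can apply is illustrated below: the local clustering coefficient of a link's endpoint, evaluated before and after a candidate mapping link is added. The graph layout and the reading of the metric are illustrative assumptions, not LINK-QA's exact metrics:

```python
# Hedged sketch of a network measure for assessing a candidate link:
# local clustering coefficient before vs. after the link is added.
# An undirected adjacency-set graph stands in for the Linked Data network.
from collections import defaultdict

def add_edge(adj, a, b):
    adj[a].add(b)
    adj[b].add(a)

def clustering(adj, n):
    # Fraction of the node's neighbour pairs that are themselves linked.
    nbrs = adj[n]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
    return 2 * links / (k * (k - 1))

adj = defaultdict(set)
for a, b in [("a", "b"), ("b", "c")]:
    add_edge(adj, a, b)
before = clustering(adj, "b")   # 0.0: neighbours a and c are not linked
add_edge(adj, "a", "c")         # candidate mapping link closes the triangle
after = clustering(adj, "b")    # 1.0: all neighbour pairs now linked
print(before, after)
```

A quality assessment would compare such before/after values against the distribution observed for known-good links.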

Citation Context

... takes a wider view of quality beyond just robustness. The closest work is most likely the work by Bonatti et al., which uses a variety of techniques for determining trust to perform robust reasoning [16]. In particular, they use a PageRank style algorithm to rank the quality of various sources while performing reasoning. Their work focuses on using these inputs for reasoning whereas LINK-QA specifica...

RDFS & OWL Reasoning for Linked Data

by Axel Polleres, Aidan Hogan, Renaud Delbru, Jürgen Umbrich
Abstract - Cited by 9 (6 self)
Abstract. Linked Data promises that a large portion of Web Data will be usable as one big interlinked RDF database against which structured queries can be answered. In this lecture we will show how reasoning – using RDF Schema (RDFS) and the Web Ontology Language (OWL) – can help to obtain more complete answers for such queries over Linked Data. We first look at the extent to which RDFS and OWL features are being adopted on the Web. We then introduce two high-level architectures for query answering over Linked Data and outline how these can be enriched by (lightweight) RDFS and OWL reasoning, enumerating the main challenges faced and discussing reasoning methods that make practical and theoretical trade-offs to address these challenges. In the end, we also ask whether or not RDFS and OWL are enough and discuss numeric reasoning methods that are beyond the scope of these standards but that are often important when integrating Linked Data from several, heterogeneous sources.
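The lightweight RDFS reasoning mentioned here can be sketched as a fixpoint over two of the standard RDFS entailment rules, rdfs11 (rdfs:subClassOf transitivity) and rdfs9 (type propagation through subclasses). Term spellings are abbreviated for illustration:

```python
# Minimal sketch of lightweight RDFS reasoning: a naive fixpoint applying
# rdfs11 (subClassOf is transitive) and rdfs9 (instances of a subclass are
# instances of its superclass). Not an optimised or complete RDFS engine.
SUB = "rdfs:subClassOf"
TYPE = "rdf:type"

def rdfs_closure(triples):
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in facts:
            if p == SUB:
                # rdfs11: (s sub o), (o sub o2) => (s sub o2)
                new |= {(s, SUB, o2) for s2, p2, o2 in facts
                        if p2 == SUB and s2 == o}
                # rdfs9: (x type s), (s sub o) => (x type o)
                new |= {(x, TYPE, o) for x, p2, o2 in facts
                        if p2 == TYPE and o2 == s}
        fresh = new - facts
        if fresh:
            facts |= fresh
            changed = True
    return facts

kb = {(":Cat", SUB, ":Mammal"), (":Mammal", SUB, ":Animal"),
      (":tom", TYPE, ":Cat")}
closed = rdfs_closure(kb)
print((":tom", TYPE, ":Animal") in closed)  # True
```

Production systems avoid the naive re-scan per iteration (e.g. via semi-naive evaluation), but the inferences drawn are the same.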

Citation Context

...ies. pD* does not support disjoint (15) or union classes (19). Regarding the standard OWL 2 profiles, OWL 2 EL and OWL 2 QL both omit support for important top-20 features. Neither include functional (12) or inverse-functional properties (18), or union classes (19). OWL 2 EL further omits support for inverse (14) and symmetric properties (20). OWL 2 QL does not support the prevalent same-as (16) featu...

Improving the recall of live Linked Data querying through reasoning

by Jürgen Umbrich, Aidan Hogan, Axel Polleres, Stefan Decker - In RR, 2012
Abstract - Cited by 7 (4 self)
Abstract. Linked Data principles allow for processing SPARQL queries on-the-fly by dereferencing URIs. Link-traversal query approaches for Linked Data have the benefit of up-to-date results and decentralised execution, but operate only on explicit data from dereferenced documents, affecting recall. In this paper, we show how inferable knowledge—specifically that found through owl:sameAs and RDFS reasoning—can improve recall in this setting. We first analyse a corpus featuring 7 million Linked Data sources and 2.1 billion quadruples: we (1) measure expected recall by only considering dereferenceable information, (2) measure the improvement in recall given by considering rdfs:seeAlso links as previous proposals did. We further propose and measure the impact of additionally considering (3) owl:sameAs links, and (4) applying lightweight RDFS reasoning for finding more results, relying on static schema information. We evaluate different configurations for live queries covering different shapes and domains, generated from random walks over our corpus.
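Extension (3), following owl:sameAs links during traversal, can be sketched as follows. An in-memory dictionary stands in for HTTP dereferencing, and all URIs and data are hypothetical placeholders:

```python
# Sketch of link-traversal with owl:sameAs following: dereference the seed
# URI, and whenever a document asserts owl:sameAs, also dereference the
# co-referent URIs so their documents contribute triples (improving recall).
SAME_AS = "owl:sameAs"

# Hypothetical "web": URI -> triples served by its dereferenced document.
WEB = {
    "ex:alice": [("ex:alice", SAME_AS, "dbp:Alice"),
                 ("ex:alice", "foaf:name", "Alice")],
    "dbp:Alice": [("dbp:Alice", "foaf:age", "30")],
}

def traverse(seed):
    """Collect triples from the seed document and owl:sameAs co-referents."""
    seen, frontier, triples = set(), [seed], []
    while frontier:
        uri = frontier.pop()
        if uri in seen:
            continue
        seen.add(uri)
        for t in WEB.get(uri, []):
            triples.append(t)
            if t[1] == SAME_AS:
                frontier.extend([t[0], t[2]])
    return triples

data = traverse("ex:alice")
print(("dbp:Alice", "foaf:age", "30") in data)  # True: recall improved
```

Without the sameAs step, the age triple served only by dbp:Alice would be invisible to a query seeded at ex:alice.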

Citation Context

...∼. Further note that, wrt. the Web of Data, our sample recall measures specify an upper bound. RDFS Schema From our corpus, we extract a static set of schema data for the RDFS reasoning. As argued in [6], schema data on the Web is often noisy, where third-party publishers “redefine” popular terms outside of their namespace; for example, one document defines nine properties as the domain of rdf:type, ...

Generating and summarizing explanations for linked data

by Rakebul Hasan - In: Proc. of the 11th Extended Semantic Web Conference, 2014
Abstract - Cited by 4 (3 self)
Abstract. Linked Data consumers may need explanations for debugging or understanding the reasoning behind producing the data. They may need the possibility to transform long explanations into more understandable short explanations. In this paper, we discuss an approach to explain reasoning over Linked Data. We introduce a vocabulary to describe explanation related metadata and we discuss how publishing these metadata as Linked Data enables explaining reasoning over Linked Data. Finally, we present an approach to summarize these explanations taking into account user specified explanation filtering criteria.

Citation Context

...oduction In the recent years, we have seen a growth of publishing Linked Data from community driven efforts, governmental bodies, social networking sites, scientific communities, and corporate bodies [7]. These data publishers from various domains publish their data in an interlinked fashion using vocabularies defined in RDFS/OWL. This presents opportunities for large-scale data integration and reas...

Practical rdf schema reasoning with annotated semantic web data

by C. V. Damásio, F. Ferreira - In The Semantic Web–ISWC 2011
Abstract - Cited by 2 (0 self)
Abstract. Semantic Web data with annotations is becoming available, with the YAGO knowledge base being a prominent example. In this paper we present an approach to perform the closure of large RDF Schema annotated semantic web data using standard database technology. In particular, we exploit several alternatives to address the problem of computing transitive closure with real fuzzy semantic data extracted from YAGO in the PostgreSQL database management system. We benchmark the several alternatives and compare to classical RDF Schema reasoning, providing the first implementation of annotated RDF Schema in persistent storage.
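The core database operation here, transitive closure over a schema relation, can be expressed as a recursive CTE. The sketch below uses SQLite purely so it is self-contained; table and column names are assumptions, and the fuzzy-annotation handling (combining degrees along paths) is omitted:

```python
# Sketch of transitive closure of rdfs:subClassOf via a recursive CTE,
# the kind of in-database computation the paper benchmarks in PostgreSQL.
# SQLite stands in here; schema names are illustrative assumptions.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE subclass (sub TEXT, sup TEXT)")
con.executemany("INSERT INTO subclass VALUES (?, ?)",
                [("Cat", "Mammal"), ("Mammal", "Animal")])

closure = con.execute("""
    WITH RECURSIVE tc(sub, sup) AS (
        SELECT sub, sup FROM subclass          -- base case: direct edges
        UNION                                  -- UNION deduplicates rows
        SELECT tc.sub, s.sup                   -- extend each path one step
        FROM tc JOIN subclass s ON tc.sup = s.sub
    )
    SELECT sub, sup FROM tc
""").fetchall()
print(("Cat", "Animal") in closure)  # True: inferred transitive edge
```

An annotated variant would carry a degree column through the recursion and aggregate (e.g. max over paths of min along a path), which is where the alternatives the paper benchmarks diverge.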

Citation Context

...QL, Rules 1 Introduction The Semantic Web rests on large amounts of data expressed in the form of RDF triples. The need to extend this data with meta-information like trust, provenance and confidence [26, 22, 3] imposed new requirements and extensions to the Resource Description Framework (Schema) [19] to handle annotations appropriately. Briefly, an annotation v from a suitable mathematical structure is add...

Evolution of Workflow Provenance Information in the Presence of Custom Inference Rules

by Christos Strubulis, Yannis Tzitzikas, Martin Doerr, Giorgos Flouris
Abstract - Cited by 2 (1 self)
Abstract. Workflow systems can produce very large amounts of provenance information. In this paper we introduce provenance-based inference rules as a means to reduce the amount of provenance information that has to be stored, and to ease quality control (e.g., corrections). We motivate this kind of (provenance) inference and identify a number of basic inference rules over a conceptual model appropriate for representing provenance. The proposed inference rules concern the interplay between (i) actors and carried out activities, (ii) activities and devices that were used for such activities, and (iii) the presence of information objects and physical things at events. However, since a knowledge base is not static but changes over time for various reasons, we also study how we can satisfy change requests while supporting and respecting the aforementioned inference rules. Towards this end, we elaborate on the specification of the required change operations.
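One provenance-based inference rule of the kind motivated here (category (ii), activities and devices) can be sketched as forward chaining: if an activity used a composite device, infer that it used the device's parts, so part-level provenance need not be stored explicitly. The predicate names are illustrative, not the paper's conceptual model:

```python
# Hedged sketch of a provenance inference rule: an activity that used a
# composite device is inferred to have used the device's parts as well.
# Forward chaining to a fixpoint over (subject, predicate, object) facts.
USED, HAS_PART = "used", "hasPart"

def infer_used_parts(facts):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for a, p1, d in list(facts):
            if p1 != USED:
                continue
            for d2, p2, part in list(facts):
                if p2 == HAS_PART and d2 == d and (a, USED, part) not in facts:
                    facts.add((a, USED, part))  # inferred, not stored input
                    changed = True
    return facts

kb = {(":digitisation", USED, ":scanner"),
      (":scanner", HAS_PART, ":lens")}
inferred = infer_used_parts(kb)
print((":digitisation", USED, ":lens") in inferred)  # True
```

The storage saving comes from keeping only the two base facts while the part-level usage remains derivable on demand.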

Citation Context

...”, which traces causal dependencies of individual data elements between input and output. In the latter category we could mention [17], [18]. Lastly, inference rules with annotations are exploited in [4] for scalable reasoning on web data. Although these annotations are indicators of data provenance, they do not directly model it. 5.2 Knowledge Evolution in RDF/S The research field of ontology evolut...

Link Traversal Querying for a Diverse Web of Data

by Jürgen Umbrich, Aidan Hogan, Axel Polleres, Stefan Decker, 2014
Abstract - Cited by 2 (0 self)
Traditional approaches for querying the Web of Data often involve centralised warehouses that replicate remote data. Conversely, Linked Data principles allow for answering queries live over the Web by dereferencing URIs to traverse remote data sources at runtime. A number of authors have looked at answering SPARQL queries in such a manner; these link-traversal based query execution (LTBQE) approaches for Linked Data offer up-to-date results and decentralised (i.e., client-side) execution, but must operate over incomplete dereferenceable knowledge available in remote documents, thus affecting response times and “recall” for query answers. In this paper, we study the recall and effectiveness of LTBQE, in practice, for the Web of Data. Furthermore, to integrate data from diverse sources, we propose lightweight reasoning extensions to help find additional answers. From the state-of-the-art which (1) considers only dereferenceable information and (2) follows rdfs:seeAlso links, we propose extensions to consider (3) owl:sameAs links and reasoning, and (4) lightweight RDFS reasoning. We then estimate the recall of link-traversal query techniques in practice: we analyse a large crawl of the Web of Data (the BTC’11 dataset), looking at the ratio of raw data contained in dereferenceable documents vs. the corpus as a whole and determining how much more raw data our extensions make available for query answering. We then stress-test LTBQE (and our extensions) in real-world settings using the FedBench and DBpedia SPARQL Benchmark frameworks, and propose a novel benchmark called QWalk based on random ...

Towards An Approximative Ontology-Agnostic Approach for Logic Programs

by João C. P. Da Silva, André Freitas - In Proc. of the 8th Intl. Symposium on Foundations of Information and Knowledge Systems, 2014
Abstract - Cited by 2 (2 self)
Abstract. Distributional semantics focuses on the automatic construction of a semantic model based on the statistical distribution of co-located words in large-scale texts. Deductive reasoning is a fundamental component for semantic understanding. Despite the generality and expressivity of logical models, from an applied perspective, deductive reasoners are dependent on highly consistent conceptual models, which limits the application of reasoners to highly heterogeneous and open domain knowledge sources. Additionally, logical reasoners may present scalability issues. This work focuses on advancing the conceptual and formal work on the interaction between distributional semantics and logic, focusing on the introduction of a distributional deductive inference model for large-scale and heterogeneous knowledge bases. The proposed reasoning model targets the following features: (i) an approximative ontology-agnostic reasoning approach for logical knowledge bases, (ii) the inclusion of large volumes of distributional semantics commonsense knowledge into the inference process and (iii) the provision of a principled geometric representation of the inference process.

Citation Context

...ar = battle involked(jackson,war) in the context of the Semantic Web. The composition with scalable and selective reasoning models (e.g. in Bonatti et al. [4]) should be investigated in order to minimize the impact of the additional inference process. The average predicate distributional matching time is 1,523 ms in a core i5 8GB RAM machine. The τ-Space ...

© 2007-2019 The Pennsylvania State University