OptimAX: Optimizing Distributed ActiveXML Applications
- In ICWE, 2008
Cited by 6 (1 self)
The Web has become a platform of choice for the deployment of complex applications involving several business partners. Typically, such applications interoperate by means of Web services, exchanging XML information. We present OptimAX, an optimization Web service that applies at the static level (prior to enacting an application) in order to rewrite it into one that will execute more efficiently. OptimAX builds on the ActiveXML (AXML) data-centric Web service composition language, and demonstrates how database-style techniques can be efficiently integrated in a loosely-coupled, distributed application based on Web services. OptimAX has been fully implemented and we describe its experimental performance.
WebContent: Efficient P2P Warehousing of Web Data
- 2008
Cited by 6 (2 self)
We present the WebContent platform for managing distributed repositories of XML and semantic Web data. The platform allows integrating various data processing building blocks (crawling, translation, semantic annotation, full-text search, structured XML querying, and semantic querying), presented as Web services, into a large-scale efficient platform. Calls to various services are combined inside ActiveXML [8] documents, which are XML documents including service calls. An ActiveXML optimizer is used to: (i) efficiently distribute computations among sites; (ii) perform XQuery-specific optimizations by leveraging an algebraic XQuery optimizer; and (iii) given an XML query, choose among several distributed indices the most appropriate one to answer the query.
Materialized views for P2P XML warehousing
Cited by 5 (5 self)
We consider the efficient, scalable management of XML documents in structured peer-to-peer networks based on distributed hash table (DHT) indices. We present an approach for exploiting materialized views deployed in the DHT network independently by the peers, to answer an interesting dialect of tree pattern queries. We provide algorithms to index and materialize views in the DHT, show that the rewriting problem is polynomial in the number of views, and describe rewriting algorithms. Our approach is validated by experiments on the complete platform deployed on 1000 peers in a wide area network.
RDF in the clouds: A survey
- VLDB J, 2014
Cited by 3 (0 self)
The Resource Description Framework (RDF) pioneered by
LCA-based selection for XML document collections
- in WWW, 2010
Cited by 2 (0 self)
In this paper, we address the problem of database selection for XML document collections, that is, given a set of collections and a user query, how to rank the collections based on their goodness to the query. Goodness is determined by the relevance of the documents in the collection to the query. We consider keyword queries and support Lowest Common Ancestor (LCA) semantics for defining query results, where the relevance of each document to a query is determined by properties of the LCA of those nodes in the XML document that contain the query keywords. To avoid evaluating queries against each document in a collection, we propose maintaining, in a preprocessing phase, information about the LCAs of all pairs of keywords in a document and using it to approximate the properties of the LCA-based results of a query. To improve storage and processing efficiency, we use appropriate summaries of the LCA information based on Bloom filters. We address both a boolean and a weighted version of the database selection problem. Our experimental results show that our approach incurs low errors in the estimation of the goodness of a collection and provides rankings that are very close to the actual ones.
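The idea of summarizing keyword-pair information in a Bloom filter can be illustrated with a minimal Python sketch. This is not the paper's method: it records only whether two keywords co-occur in some document, not any property of their LCA, and the names `BloomFilter`, `pair_key`, `summarize`, and `may_answer`, as well as the filter size and hash count, are assumptions made for illustration.

```python
import hashlib
from itertools import combinations


class BloomFilter:
    """Fixed-size Bloom filter: false positives are possible, false negatives are not."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits      # assumed size, chosen for illustration
        self.num_hashes = num_hashes  # assumed number of hash functions
        self.bits = 0                 # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def pair_key(a, b):
    # Order-independent key for an unordered keyword pair.
    return "|".join(sorted((a, b)))


def summarize(keywords_per_document):
    """Hypothetical preprocessing step: record every keyword pair co-occurring in a document."""
    summary = BloomFilter()
    for keywords in keywords_per_document:
        for a, b in combinations(sorted(set(keywords)), 2):
            summary.add(pair_key(a, b))
    return summary


def may_answer(summary, query_keywords):
    """True if every pair of query keywords may co-occur in some summarized document."""
    return all(
        pair_key(a, b) in summary
        for a, b in combinations(sorted(set(query_keywords)), 2)
    )
```

A collection whose summary rejects a query can be skipped without touching its documents; a collection that passes may still be a false positive, which is the kind of estimation error the abstract refers to.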
ViP2P: Efficient XML management in DHT networks
- In ICWE, 2012. 24 François Goasdoué et al
Cited by 2 (2 self)
Affinity-based XML Fragmentation
Cited by 1 (1 self)
In this paper we tackle the fragmentation problem for highly distributed databases. In such an environment, a suitable fragmentation strategy may provide scalability and availability by minimizing distributed transactions. We propose an approach for XML fragmentation that takes as input both the application’s expected workload and a storage threshold, and produces as output an XML fragmentation schema. Our workload-aware method aims to minimize the execution of distributed transactions by packing up related data in a small set of fragments. We present experiments that compare alternative fragmentation schemas, showing that the one produced by our technique provides a finer-grained result and better system throughput.
Routing of Structured Queries in Large-Scale Distributed Systems
Cited by 1 (0 self)
In order to search XML-document collections, structural information – given by a user in the form of a structured query or provided by the self-describing structure of XML-documents – has been used in the past years to improve Information Retrieval (IR) quality in terms of recall and precision. However, all known approaches have only been used in classical client/server (C/S) architectures. None has ever been applied to improve retrieval in large-scale distributed systems such as Peer-to-Peer (P2P) networks, where efficiency issues have to be dealt with carefully, e.g. in order to reduce communication overhead between distributed nodes. As P2P networks can be considered promising alternatives to C/S-systems for storing large amounts of information, including XML-documents, possibilities for improving retrieval in such networks should be investigated. In this paper, we concentrate on query routing in such a scenario and raise the question of how structured queries can be routed in a highly distributed environment so as to increase both efficiency and effectiveness. We provide an infrastructure for investigating this question and propose techniques for performing routing based on a mixture of document-, element-, collection- and peer-evidence. We also report on preliminary evaluation results with the INEX collection.
Optimized union of non-disjoint distributed data sets
- In EDBT, 2009
Cited by 1 (0 self)
In a variety of applications, ranging from data integration to distributed query evaluation, there is a need to obtain sets of data items from several sources (peers) and compute their union. As these sets often contain common data items, avoiding the transmission of redundant information is essential for effective union computation. In this paper we define the notion of optimal union plans for non-disjoint data sets residing on distinct peers, and present efficient algorithms for computing and executing such optimal plans. Our algorithms avoid redundant data transmission and optimally exploit the network bandwidth capabilities. A challenge in the design of optimal plans is the lack of a complete map of the distribution of the data items among peers. We analyze the information required for optimal planning and propose novel techniques to obtain compact, cheap-to-communicate descriptions of the data sources. We then exploit them for efficient union computation with reasonable accuracy. We demonstrate experimentally the superiority of our approach over the common naive union computation, showing it improves the performance by an order of magnitude.
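The gap between naive union computation and a plan that avoids redundant transmission can be seen in a small Python sketch. It is an idealized illustration rather than the paper's algorithm: it assumes each peer knows exactly which items have already been sent, which is precisely the complete map the abstract says is unavailable, and it ignores bandwidth and the cost of communicating that knowledge. The function names and the fixed peer order are hypothetical.

```python
def naive_union(peers):
    """Every peer ships its full set; duplicates cross the network repeatedly."""
    transmitted = sum(len(items) for items in peers.values())
    result = set().union(*peers.values())
    return result, transmitted


def deduplicated_union(peers):
    """Idealized plan: each peer ships only items no earlier peer has sent.

    Assumes perfect knowledge of what was already sent (an assumption
    made for illustration only).
    """
    seen = set()
    transmitted = 0
    for name in sorted(peers):        # fixed, arbitrary peer order
        fresh = peers[name] - seen    # items not yet transmitted
        transmitted += len(fresh)
        seen |= fresh
    return seen, transmitted


peers = {"p1": {1, 2, 3}, "p2": {2, 3, 4}, "p3": {3, 4, 5}}
print(naive_union(peers))         # ({1, 2, 3, 4, 5}, 9)
print(deduplicated_union(peers))  # ({1, 2, 3, 4, 5}, 5)
```

Both functions return the same union; only the number of transmitted items differs, and that difference grows with the overlap between peers.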