Data Driven Approximation with Bounded Resources
Cited by 1 (0 self)
This paper proposes BEAS, a resource-bounded scheme for querying relations. It is parameterized with a resource ratio α ∈ (0, 1], indicating that given a big dataset D, we can only afford to access an α-fraction of D with limited resources. For a query Q posed on D, BEAS computes exact answers Q(D) if doable and otherwise approximate answers, accessing an amount of data at most α|D| in the entire process. Underlying BEAS are (1) an access schema, which helps us identify and fetch the part of the data needed to answer Q, (2) an accuracy measure to assess approximate answers in terms of their relevance and coverage w.r.t. exact answers, (3) an Approximability Theorem for the feasibility of resource-bounded approximation, and (4) algorithms for query evaluation with bounded resources. A unique feature of BEAS is its ability to answer unpredictable queries, aggregate or not, using bounded resources and assuring a deterministic accuracy lower bound. Using real-life and synthetic data, we empirically verify the effectiveness and efficiency of BEAS.
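To make the resource ratio concrete, here is a minimal Python sketch of a budget of ⌊α|D|⌋ tuple accesses. It is not the BEAS algorithm: the names `BudgetedRelation` and `bounded_select` are hypothetical, and the candidate positions are assumed to come from some index playing the role that the abstract assigns to the access schema. The sketch only shows how an evaluation can report exact answers when the needed tuples fit in the budget and partial answers otherwise.

```python
import math


class BudgetedRelation:
    """Hypothetical wrapper: counts every tuple handed to the caller."""

    def __init__(self, rows, alpha):
        if not 0 < alpha <= 1:
            raise ValueError("alpha must be in (0, 1]")
        self.rows = rows
        # Assumption: the budget is measured in tuples, floor(alpha * |D|).
        self.budget = math.floor(alpha * len(rows))
        self.accessed = 0

    def fetch(self, indices):
        """Yield rows at the given positions until the budget is exhausted."""
        for i in indices:
            if self.accessed >= self.budget:
                return
            self.accessed += 1
            yield self.rows[i]


def bounded_select(relation, candidate_indices, predicate):
    """Evaluate a selection over candidate positions within the budget.

    Returns (answers, exact). `exact` is True only if every candidate was
    fetched, which assumes the candidates cover all rows that can match.
    """
    candidate_indices = list(candidate_indices)
    before = relation.accessed
    answers = [r for r in relation.fetch(candidate_indices) if predicate(r)]
    exact = (relation.accessed - before) == len(candidate_indices)
    return answers, exact


if __name__ == "__main__":
    rows = [(i, i % 3) for i in range(10)]
    candidates = [0, 3, 6, 9]  # assumed to come from an index on the second column
    for alpha in (0.5, 0.2):
        rel = BudgetedRelation(rows, alpha)
        print(alpha, bounded_select(rel, candidates, lambda r: r[1] == 0))
```

With α = 0.5 the four candidates fit in a budget of five accesses, so the answer is exact; with α = 0.2 only two tuples may be read and the result is flagged as approximate. How accuracy is then measured and bounded is the subject of the paper and is not modelled here.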
Querying big data: Bridging theory and practice
 J. Comput. Sci. Technol
Cited by 1 (1 self)
Big data introduces challenges to query answering, from theory to practice. A number of questions arise. What queries are "tractable" on big data? How can we make big data "small" so that it is feasible to find exact query answers? When exact answers are beyond reach in practice, what approximation theory can help us strike a balance between the quality of approximate query answers and the costs of computing such answers? To get sensible query answers in big data, what else must we do in addition to coping with the size of the data? This position paper aims to provide an overview of recent advances in the study of querying big data. We propose approaches to tackling these challenging issues, and identify open problems for future research.
Data-driven Visual Graph Query Interface Construction and Maintenance: Challenges and Opportunities
Visual query interfaces make it easy for scientists and other non-expert users to query a data collection. Heretofore, visual query interfaces have been statically constructed, independent of the data. In this paper we outline a vision of a different kind of interface, one that is built (in part) from the data. In our data-driven approach, the visual interface is dynamically constructed and maintained. A data-driven approach has many benefits, such as reducing the cost of constructing and maintaining an interface, superior support for query formulation, and increased portability of the interface. We focus on graph databases, but our approach is applicable to several other kinds of databases such as JSON and XML.
Answering Pattern Queries Using Views
 IEEE Transactions on Knowledge and Data Engineering, DOI 10.1109/TKDE.2015.2429138
Answering queries using views has proven effective for querying relational and semistructured data. This paper investigates this issue for graph pattern queries based on graph simulation. We propose a notion of pattern containment to characterize graph pattern matching using graph pattern views. We show that a pattern query can be answered using a set of views if and only if it is contained in the views. Based on this characterization, we develop efficient algorithms to answer graph pattern queries. We also study problems for determining (minimal, minimum) containment of pattern queries. We establish their complexity (from cubic-time to NP-complete) and provide efficient checking algorithms (approximation when the problem is intractable). In addition, when a pattern query is not contained in the views, we study maximally contained rewriting to find approximate answers; we show that such a rewriting can be computed in cubic time, and present a rewriting algorithm. We experimentally verify that these methods are able to efficiently answer pattern queries on large real-world graphs.
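For readers unfamiliar with graph simulation, the matching semantics the abstract builds on, the following Python sketch computes the maximum simulation relation by the standard fixpoint refinement. It does not implement the paper's pattern containment or view-based algorithms; the function name `graph_simulation` and the dictionary-based graph encoding are assumptions made for illustration.

```python
def graph_simulation(p_labels, p_edges, g_labels, g_edges):
    """Return the maximum simulation from a pattern to a data graph.

    p_labels / g_labels map node -> label; p_edges / g_edges are lists of
    directed (source, target) pairs. The result maps each pattern node to
    the set of data nodes that simulate it, or to empty sets for every
    pattern node if the pattern has no match.
    """
    # Successor sets of the data graph.
    succ = {v: set() for v in g_labels}
    for a, b in g_edges:
        succ[a].add(b)

    # Start from all label-compatible candidates.
    sim = {
        u: {v for v, lab in g_labels.items() if lab == p_lab}
        for u, p_lab in p_labels.items()
    }

    # Refine until no candidate violates a pattern edge.
    changed = True
    while changed:
        changed = False
        for u, u2 in p_edges:
            # v can stay only if some successor of v simulates u2.
            keep = {v for v in sim[u] if succ[v] & sim[u2]}
            if keep != sim[u]:
                sim[u] = keep
                changed = True

    # Every pattern node needs at least one match.
    if any(not s for s in sim.values()):
        return {u: set() for u in p_labels}
    return sim


if __name__ == "__main__":
    pattern_labels = {"x": "A", "y": "B"}
    pattern_edges = [("x", "y")]
    graph_labels = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
    graph_edges = [("a1", "b1")]
    print(graph_simulation(pattern_labels, pattern_edges, graph_labels, graph_edges))
```

In the usage example, `a2` is dropped because it has no B-labeled successor, while both `b1` and `b2` remain because the pattern places no outgoing-edge requirement on `y`. This simple loop favors clarity over efficiency.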
Association Rules with Graph Patterns
We propose graph-pattern association rules (GPARs) for social media marketing. Extending association rules for itemsets, GPARs help us discover regularities between entities in social graphs, and identify potential customers by exploring social influence. We study the problem of discovering top-k diversified GPARs. While this problem is NP-hard, we develop a parallel algorithm with an accuracy bound. We also study the problem of identifying potential customers with GPARs. While it is also NP-hard, we provide a parallel scalable algorithm that guarantees a polynomial speedup over sequential algorithms with the increase of processors. Using real-life and synthetic graphs, we experimentally verify the scalability and effectiveness of the algorithms.
Processing SPARQL Queries Over Linked Data: A Distributed Graph-based Approach
 2014
We propose techniques for processing SPARQL queries over linked data. We follow a graph-based approach where answering a query Q is equivalent to finding its matches over a distributed RDF data graph G. We adopt a "partial evaluation and assembly" framework. Partial evaluation results of query Q over each repository, called local partial matches, are found. In the assembly stage, we propose a centralized and a distributed assembly strategy. We analyze our algorithms both theoretically and experimentally. Extensive experiments over both real and benchmark RDF repositories with billions of triples demonstrate the high performance and scalability of our methods compared with existing solutions.