Results 1 - 10 of 26
A Survey on Representation, Composition and Application of Preferences in Database Systems
ACM TODS, 2011
Cited by 30 (6 self)
Preferences have been traditionally studied in philosophy, psychology, and economics and applied to decision making problems. Recently, they have attracted the attention of researchers in other fields, such as databases where they capture soft criteria for queries. Databases bring a whole fresh perspective to the study of preferences, both computational and representational. From a representational perspective, the central question is how we can effectively represent preferences and incorporate them in database querying. From a computational perspective, we can look at how we can efficiently process preferences in the context of database queries. Several approaches have been proposed but a systematic study of these works is missing. The purpose of this survey is to provide a framework for placing existing works in perspective and highlight critical open challenges to serve as a springboard for researchers in database systems. We organize our study around three axes: preference representation, preference composition, and preference query processing.
Stochastic Skyline Operator
Cited by 11 (3 self)
In many applications involving multiple criteria optimal decision making, users may often want to make a personal tradeoff among all optimal solutions. As a key feature, the skyline in a multidimensional space provides the minimum set of candidates for such purposes by removing all points not preferred by any (monotonic) utility/scoring functions; that is, the skyline removes all objects not preferred by any user no matter how their preferences vary. Driven by many applications with uncertain data, the probabilistic skyline model is proposed to retrieve uncertain objects based on skyline probabilities. Nevertheless, skyline probabilities cannot capture the preferences of monotonic utility functions. Motivated by this, in this paper we propose a novel skyline operator, namely stochastic skyline. In the light of the expected utility principle, stochastic skyline guarantees to provide the minimum set of candidates for the optimal solutions over all possible monotonic multiplicative utility functions. In contrast to the conventional skyline or the probabilistic skyline computation, we show that the problem of stochastic skyline is NP-complete with respect to the dimensionality. Novel algorithms are developed to efficiently compute stochastic skyline over multidimensional uncertain data, which run in polynomial time if the dimensionality is fixed. We also show, by theoretical analysis and experiments, that the size of stochastic skyline is quite similar to that of conventional skyline over certain data. Comprehensive experiments demonstrate that our techniques are efficient and scalable regarding both CPU and IO costs.
Efficient all top-k computation: a unified solution for all top-k, reverse top-k and top-m influential queries
IEEE Trans. Knowl. Data Eng
Cited by 4 (0 self)
Given a set of objects P and a set of ranking functions F over P, an interesting problem is to compute the top ranked objects for all functions. Evaluation of multiple top-k queries finds application in systems where there is a heavy workload of ranking queries (e.g., online search engines and product recommendation systems). The simple solution of evaluating the top-k queries one-by-one does not scale well; instead, the system can make use of the fact that similar queries share common results to accelerate search. To our knowledge, this paper is the first thorough study of this problem. We propose methods that compute all top-k queries in batch. Our first solution applies the block indexed nested loops paradigm, while our second technique is a view-based algorithm. We propose appropriate optimization techniques for the two approaches and demonstrate experimentally that the second approach is consistently the best. Our approach facilitates evaluation of other complex queries that depend on the computation of multiple top-k queries, such as reverse top-k and top-m influential queries. We show that our batch processing technique for these complex queries outperforms the state-of-the-art by orders of magnitude. Index Terms: all top-k queries, view-based index
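For reference, the naive one-by-one evaluation that this abstract describes as scaling poorly can be sketched as follows. This is a minimal illustration, not the paper's batch algorithm: the linear scoring functions given as weight vectors, the `all_top_k` name, and the sample data are all assumptions, and a higher score is assumed to rank first.

```python
import heapq

def score(weights, obj):
    # Hypothetical linear ranking function: weighted sum of the attributes.
    return sum(w * x for w, x in zip(weights, obj))

def all_top_k(objects, functions, k):
    """Naive baseline: run one independent top-k query per ranking function."""
    results = {}
    for name, weights in functions.items():
        results[name] = heapq.nlargest(
            k, objects, key=lambda o: score(weights, o)
        )
    return results

# Illustrative data (assumed, not from the paper).
objects = [(1, 9), (6, 6), (9, 1), (2, 2)]
functions = {"f1": (1.0, 0.0), "f2": (0.6, 0.4), "f3": (0.0, 1.0)}
top = all_top_k(objects, functions, k=2)
```

Each query rescans every object, so the cost grows with |F| times |P|; the object (6, 6) appears in all three results here, which is the kind of overlap between similar queries that a batch method could exploit.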
Flexible and Efficient Resolution of Skyline Query Size Constraints
2010
Cited by 4 (1 self)
Given a set of multidimensional points, a skyline query returns the interesting points that are not dominated by other points. It has been observed that the actual cardinality (s) of a skyline query result may differ substantially from the desired result cardinality (k), which has prompted studies on how to reduce s for the case where k < s. This paper goes further by addressing the general case where the relationship between k and s is not known beforehand. Due to their complexity, the existing pointwise ranking and set-wide maximization techniques are not well suited for this problem. Moreover, the former often incurs too many ties in its ranking, and the latter is inapplicable for k > s. Based on these observations, the paper proposes a new approach, called skyline ordering, that forms a skyline-based partitioning of a given data set, such that an order exists among the partitions. Then set-wide maximization techniques may be applied within each partition. Efficient algorithms are developed for skyline ordering and for resolving size constraints using the skyline order. The results of extensive experiments show that skyline ordering yields a flexible framework for the efficient and scalable resolution of arbitrary size constraints on skyline queries.
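One way to picture an ordered, skyline-based partitioning is to peel off successive skylines, as in the sketch below. This is an illustrative assumption rather than the paper's algorithm: smaller values are assumed to be better in every dimension, the names `skyline_layers` and `select_k` are hypothetical, and the last partition is simply truncated in input order where the abstract would apply a set-wide maximization technique.

```python
def dominates(a, b):
    """a dominates b: no worse in every dimension, strictly better in one.
    Assumption: smaller values are preferred."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def skyline(points):
    # Points not dominated by any other point (quadratic scan).
    return [p for p in points if not any(dominates(q, p) for q in points)]

def skyline_layers(points):
    """Partition points into ordered layers by repeatedly removing the skyline."""
    remaining = list(points)
    layers = []
    while remaining:
        layer = skyline(remaining)
        layers.append(layer)
        remaining = [p for p in remaining if p not in layer]
    return layers

def select_k(points, k):
    """Return k points, taking layers in order; works for k below or above
    the size of the first skyline."""
    chosen = []
    for layer in skyline_layers(points):
        need = k - len(chosen)
        if need <= 0:
            break
        chosen.extend(layer[:need])  # placeholder for set-wide maximization
    return chosen

# Illustrative data (assumed).
points = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5), (2, 5)]
layers = skyline_layers(points)
picked = select_k(points, 4)
```

Here the plain skyline has s = 3 points, so a request for k = 4 is served by taking the whole first layer plus one point from the second.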
Efficient Skyline Evaluation over Partially Ordered Domains
2010
Cited by 4 (0 self)
Although there has been a considerable body of work on skyline evaluation in multidimensional data with totally ordered attribute domains, there are only a few methods that consider attributes with partially ordered domains. Existing work maps each partially ordered domain to a total order and then adapts algorithms for totally ordered domains to solve the problem. Nevertheless, these methods either use stronger notions of dominance, which generate false positives, or require expensive dominance checks. In this paper, we propose two new methods, which do not have these drawbacks. The first method uses an appropriate mapping of a partial order to a total order, inspired by the lattice theorem and an off-the-shelf skyline algorithm. The second technique uses an appropriate storage and indexing approach, inspired by column stores, which enables efficient verification of whether a pair of objects are incompatible. We demonstrate that both our methods are up to an order of magnitude more efficient than previous work and scale well with different problem parameters, such as complexity of partial orders.
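To make the notion of dominance over a partially ordered attribute concrete, the sketch below checks dominance directly against the transitive closure of a preference relation. It is a brute-force baseline under stated assumptions, not either of the paper's two methods: objects are hypothetical (price, category) pairs, lower price is better, and the partial order is given as (better, worse) edges.

```python
from itertools import product

def transitive_closure(edges):
    """All (better, worse) pairs implied by the given preference edges."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        # product() snapshots the list, so adding to the set here is safe.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

def dominates_po(a, b, better):
    """a dominates b if a is no worse on price and category, and strictly
    better on at least one. Incomparable categories never dominate."""
    cat_better = (a[1], b[1]) in better
    no_worse = a[0] <= b[0] and (a[1] == b[1] or cat_better)
    strictly = a[0] < b[0] or cat_better
    return no_worse and strictly

def skyline_po(objects, better):
    return [o for o in objects
            if not any(dominates_po(q, o, better) for q in objects)]

# Illustrative partial order (assumed): A > B > C and A > D.
better = transitive_closure({("A", "B"), ("B", "C"), ("A", "D")})
objects = [(100, "B"), (100, "C"), (90, "D"), (120, "A"), (110, "C")]
result = skyline_po(objects, better)
```

Categories B and D are incomparable, so (90, "D") does not dominate (100, "B") even though it is cheaper; both remain in the skyline.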
BSkyTree: Scalable Skyline Computation Using A Balanced Pivot Selection
Cited by 4 (0 self)
Skyline queries have gained a lot of attention for multicriteria analysis in large-scale datasets. While existing skyline algorithms have focused mostly on exploiting data dominance to achieve efficiency, we propose that data incomparability should be treated as another key factor in optimizing skyline computation. Specifically, to optimize both factors, we first identify common modules shared by existing non-index skyline algorithms, and then analyze them to develop a cost model to guide a balanced pivot point selection. Based on the cost model, we then implement our balanced pivot selection in two algorithms, BSkyTree-S and BSkyTree-P, treating both dominance and incomparability as key factors. Our experimental results demonstrate that the proposed algorithms outperform state-of-the-art skyline algorithms by up to two orders of magnitude.
Stochastic Skylines
2012
Cited by 4 (0 self)
In many applications involving multiple criteria optimal decision making, users may often want to make a personal tradeoff among all optimal solutions for selecting one object that best fits their personal needs. As a key feature, the skyline in a multidimensional space provides the minimum set of candidates for such purposes by removing all points not preferred by any (monotonic) utility/scoring functions; that is, the skyline removes all objects not preferred by any user no matter how their preferences vary. Driven by many recent applications with uncertain data, the probabilistic skyline model is proposed to retrieve uncertain objects based on skyline probabilities. Nevertheless, skyline probabilities cannot capture the preferences of monotonic utility functions. Motivated by this, in this article we propose a novel skyline operator, namely stochastic skylines. In the light of the expected utility principle, stochastic skylines guarantee to provide the minimum set of candidates to optimal solutions over a family of utility functions. We first propose the lskyline operator based on the lower orthant orders. lskyline guarantees to provide the minimum set of candidates to the optimal solutions for the family of monotonic multiplicative utility functions. While lskyline works very effectively for the family of multiplicative functions, it may miss optimal solutions for other utility/scoring functions (e.g., linear functions). To resolve this, we also propose a general stochastic skyline operator, gskyline, based on the usual orders. gskyline provides the minimum candidate set to the optimal
Skyline Operator on Anticorrelated Distributions
Cited by 2 (0 self)
Finding the skyline in a multidimensional space is relevant to a wide range of applications. The skyline operator over a set of d-dimensional points selects the points that are not dominated by any other point on all dimensions. Therefore, it provides a minimal set of candidates for the users to make their personal tradeoff among all optimal solutions. The existing algorithms establish both the worst-case complexity by discarding distributions and the average-case complexity by assuming dimensional independence. However, the data in the real world is more likely to be anticorrelated. The cardinality and complexity analysis on dimensionally independent data is meaningless when dealing with anticorrelated data. Furthermore, the performance of the existing algorithms becomes impractical on anticorrelated data. In this paper, we establish a cardinality model for anticorrelated distributions. We propose an accurate polynomial estimation for the expected value of the skyline cardinality. Because the high skyline cardinality downgrades the performance of most existing algorithms on anticorrelated data, we further develop a determination and elimination framework which extends the well-adopted elimination strategy. It achieves remarkable effectiveness and efficiency. The comprehensive experiments on both real datasets and benchmark synthetic datasets demonstrate that our approach significantly outperforms the state-of-the-art algorithms under a wide range of settings.
QSkycube: Efficient Skycube Computation Using Point-Based Space Partitioning
Cited by 1 (0 self)
Skyline queries have gained considerable attention for multicriteria analysis of large-scale datasets. However, the skyline queries are known to return too many results for high-dimensional data. To address this problem, a skycube is introduced to efficiently provide users with multiple skylines with different strengths. For efficient skycube construction, state-of-the-art algorithms amortized redundant computation among subspace skylines, or cuboids, either (1) in a bottom-up fashion with the principle of sharing result or (2) in a top-down fashion with the principle of sharing structure. However, we observed further room for optimization in both principles. This paper thus aims to design a more efficient skycube algorithm that shares multiple cuboids using more effective structures. Specifically, we first develop each principle by leveraging multiple parents and a skytree, representing recursive point-based space partitioning. We then design an efficient algorithm exploiting these principles. Experimental results demonstrate that our proposed algorithm is significantly faster than state-of-the-art skycube algorithms in extensive datasets.