Results 1-4 of 4
Dissociation and Propagation for Efficient Query Evaluation over Probabilistic Databases
2010
Cited by 11 (8 self)
Queries over probabilistic databases are either safe, in which case they can be evaluated entirely in a relational database engine, or unsafe, in which case they need to be evaluated with a general-purpose inference engine at a high cost. This paper proposes a new approach by which every query is evaluated like a safe query inside the database engine, by using a new method called dissociation. A dissociated query is obtained by adding extraneous variables to some atoms until the query becomes safe. We show that the probability of the original query and that of the dissociated query correspond to two well-known scoring functions on graphs, namely graph reliability (which is #P-hard) and the propagation score (which is related to PageRank and is in PTIME): when restricted to graphs, standard query probability is graph reliability, while the dissociated probability is the propagation score. We define a propagation score for conjunctive queries without self-joins and prove (i) that it is always an upper bound for query reliability, and (ii) that both scores coincide for all safe queries. Given the widespread and successful use of graph propagation methods in practice, we argue for the dissociation method as a good and efficient way to rank probabilistic query results, especially for those queries which are highly intractable for exact probabilistic inference.
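The upper-bound claim in this abstract can be illustrated on a toy instance. The sketch below is not the paper's implementation: it assumes the classic unsafe query q :- R(x), S(x,y), T(y) over a tuple-independent database, and the relation contents and function names are hypothetical. It compares a brute-force exact probability with the probability obtained after dissociating T on x, that is, treating T(y) as an independent copy for every x.

```python
from itertools import product

def exact_probability(R, S, T):
    """Brute force: sum the weights of all possible worlds that satisfy q."""
    facts = ([("R", k, p) for k, p in R.items()]
             + [("S", k, p) for k, p in S.items()]
             + [("T", k, p) for k, p in T.items()])
    total = 0.0
    for bits in product([False, True], repeat=len(facts)):
        weight = 1.0
        world = set()
        for (rel, key, p), present in zip(facts, bits):
            weight *= p if present else 1.0 - p
            if present:
                world.add((rel, key))
        # q holds if some pair (x, y) has R(x), S(x, y) and T(y) all present.
        if any(("R", x) in world and ("S", (x, y)) in world and ("T", y) in world
               for (x, y) in S):
            total += weight
    return total

def dissociated_probability(R, S, T):
    """Safe-plan style evaluation after dissociating T on x (assumed plan)."""
    none = 1.0  # probability that no x yields a witness
    for x, p_r in R.items():
        no_witness = 1.0  # probability that x has no (S, T) partner
        for (sx, y), p_s in S.items():
            if sx == x:
                no_witness *= 1.0 - p_s * T.get(y, 0.0)
        none *= 1.0 - p_r * (1.0 - no_witness)
    return 1.0 - none

# Hypothetical toy data: S is certain, R and T tuples have probability 0.5.
R = {"a1": 0.5, "a2": 0.5}
T = {"b1": 0.5, "b2": 0.5}
S = {(x, y): 1.0 for x in R for y in T}

exact = exact_probability(R, S, T)
bound = dissociated_probability(R, S, T)
```

On this instance the dissociated value is strictly larger than the exact one, because each T tuple is counted as if it were independent for a1 and a2.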
Approximate Lifted Inference with Probabilistic Databases
Cited by 2 (1 self)
This paper proposes a new approach for approximate evaluation of #P-hard queries with probabilistic databases. In our approach, every query is evaluated entirely in the database engine by evaluating a fixed number of query plans, each providing an upper bound on the true probability, then taking their minimum. We provide an algorithm that takes into account important schema information to enumerate only the minimal necessary plans among all possible plans. Importantly, this algorithm is a strict generalization of all known results of PTIME self-join-free conjunctive queries: a query is safe if and only if our algorithm returns one single plan. We also apply three relational query optimization techniques to evaluate all minimal safe plans very fast. We give a detailed experimental evaluation of our approach and, in the process, provide a new way of thinking about the value of probabilistic methods over nonprobabilistic methods for ranking query answers.
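The idea of taking the minimum over several upper-bounding plans can be sketched as follows. This is an illustration under stated assumptions, not the paper's plan-enumeration algorithm: the query q :- R(x), S(x,y), T(y), the two hand-written plans, the function names, and the data are all hypothetical.

```python
def bound_group_by_x(R, S, T):
    """Plan 1 (assumed): treat T(y) as an independent copy for every x."""
    none = 1.0
    for x, p_r in R.items():
        no_witness = 1.0
        for (sx, y), p_s in S.items():
            if sx == x:
                no_witness *= 1.0 - p_s * T.get(y, 0.0)
        none *= 1.0 - p_r * (1.0 - no_witness)
    return 1.0 - none

def bound_group_by_y(R, S, T):
    """Plan 2 (assumed): treat R(x) as an independent copy for every y."""
    none = 1.0
    for y, p_t in T.items():
        no_witness = 1.0
        for (x, sy), p_s in S.items():
            if sy == y:
                no_witness *= 1.0 - p_s * R.get(x, 0.0)
        none *= 1.0 - p_t * (1.0 - no_witness)
    return 1.0 - none

def approximate_probability(R, S, T):
    """Each plan is an upper bound, so their minimum is the tightest one."""
    return min(bound_group_by_x(R, S, T), bound_group_by_y(R, S, T))

# Hypothetical toy data: two R tuples share a single T tuple via certain S edges.
R = {"a1": 0.5, "a2": 0.5}
S = {("a1", "b1"): 1.0, ("a2", "b1"): 1.0}
T = {"b1": 0.5}

plan_x = bound_group_by_x(R, S, T)
plan_y = bound_group_by_y(R, S, T)
best = approximate_probability(R, S, T)
```

Here the true probability is P(T(b1)) * P(R(a1) or R(a2)) = 0.5 * 0.75 = 0.375. Plan 1 overestimates it because it duplicates T(b1), while plan 2 happens to be exact on this data, so the minimum recovers the true value.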
Open-World Probabilistic Databases
Large-scale probabilistic knowledge bases are becoming increasingly important in academia and industry alike. They are constantly extended with new data, powered by modern information extraction tools that associate probabilities with database tuples. In this paper, we revisit the semantics underlying such systems. In particular, the closed-world assumption of probabilistic databases, that facts not in the database have probability zero, clearly conflicts with their everyday use. To address this discrepancy, we propose an open-world probabilistic database semantics, which relaxes the probabilities of open facts to intervals. While still assuming a finite domain, this semantics can provide meaningful answers when some probabilities are not precisely known. For this open-world setting, we propose an efficient evaluation algorithm for unions of conjunctive queries. Our open-world algorithm incurs no overhead compared to closed-world reasoning and runs in time linear in the size of the database for tractable queries. All other queries are #P-hard, implying a data complexity dichotomy between linear time and #P. For queries involving negation, however, open-world reasoning can become NP-hard, or even NP^PP-hard. Finally, we discuss additional knowledge-representation layers that can further strengthen open-world reasoning about big uncertain data.
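The interval semantics can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a simple safe, monotone query q :- R(x), S(x) over a finite domain, and it assumes that every fact missing from the database may take any probability in [0, lam], where lam, the relation contents, and the function names are hypothetical. Because the query is monotone, its probability is lowest when all open facts have probability 0 (the closed-world answer) and highest when they all have probability lam.

```python
def query_probability(R, S, domain, default):
    """P(exists x: R(x) and S(x)) with independent tuples.

    Facts absent from R or S are given the probability `default`.
    """
    none = 1.0
    for x in domain:
        none *= 1.0 - R.get(x, default) * S.get(x, default)
    return 1.0 - none

def open_world_interval(R, S, domain, lam):
    """Lower and upper query probability when open facts range over [0, lam]."""
    lower = query_probability(R, S, domain, 0.0)  # closed-world endpoint
    upper = query_probability(R, S, domain, lam)  # all open facts at lam
    return lower, upper

# Hypothetical toy data over a three-element domain.
domain = ["a", "b", "c"]
R = {"a": 0.5, "b": 0.5}
S = {"a": 0.5}

lower, upper = open_world_interval(R, S, domain, lam=0.5)
```

Both endpoints are computed by the same single pass over the domain, which matches the abstract's remark that open-world evaluation need not cost more than closed-world evaluation for tractable queries.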
Approximate Lifted Inference in Probabilistic Databases
This paper proposes a new approach for approximate evaluation of #P-hard queries over probabilistic databases. In our approach, every query is evaluated entirely in the database engine by evaluating a fixed number of query plans, each providing an upper bound on the true probability, then taking their minimum. We provide an algorithm that takes into account important schema information to enumerate only the minimal necessary plans among all possible plans. Importantly, this algorithm is a strict generalization of all known results of PTIME self-join-free conjunctive queries: a query is safe if and only if our algorithm returns one single plan. We also apply three relational query optimization techniques to evaluate all minimal safe plans very fast. We give a detailed experimental evaluation of our approach and, in the process, provide a new way of thinking about the value of probabilistic methods over nonprobabilistic methods for ranking query answers.