## Bridging the gap between intensional and extensional query evaluation in probabilistic databases (2010)

Citations: 13 - 6 self

### Citations

8896 | Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference - Pearl - 1988 |

1128 | An introduction to variational methods for graphical models - Jordan, Ghahramani, et al. - 1999 |
Citation Context: ...for unsafe queries, and hence many approximation strategies have also been proposed based on sampling [21, 13] or compilation [19]. There are other approximate inference approaches in graphical models [15, 6, 26] that can also be leveraged. Note that these approximation strategies can be used on the And-Or Networks as well. Our method basically reduces the original problem into an inference problem of smaller...

773 | Probabilistic Networks and Expert Systems - Cowell, Dawid, et al. - 1999 |

473 | Generalized belief propagation - Yedidia, Freeman, et al. - 2000 |
Citation Context: ...for unsafe queries, and hence many approximation strategies have also been proposed based on sampling [21, 13] or compilation [19]. There are other approximate inference approaches in graphical models [15, 6, 26] that can also be leveraged. Note that these approximation strategies can be used on the And-Or Networks as well. Our method basically reduces the original problem into an inference problem of smaller...

450 | Query evaluation on probabilistic databases - Ré, Dalvi, et al. |
Citation Context: ...ctive queries without self-joins on so-called tuple-independent probabilistic databases: A query is either tractable, that is, it can be evaluated in polynomial time data complexity, or it is #P-hard [8]. The former queries are called safe (or hierarchical), whereas the latter are unsafe. A safe query can be evaluated by using an extensional query plan. This is a regular relational plan, where every ...

310 | ULDBs: Databases with uncertainty and lineage - Benjelloun, Sarma, et al. - 2006 |
Citation Context: ...s with any join ordering, and not only those orderings required by safe plans. Intensional approaches can evaluate any conjunctive query by using general-purpose inference techniques on query lineage [2], or specialized compilation techniques [16, 17]. [25] construct a Bayesian Network, instead of lineage, and evaluate the answer probability of tuples by doing inference over these networks. Exact eva...

236 | A Linear Time Algorithm for Finding Tree-Decompositions of Small Treewidth - Bodlaender - 1996 |
Citation Context: ...ed by ignoring the direction of edges in G. Then given a tree decomposition of G and any W ⊆ V(G), x : W → {0, 1}, the marginal probability N 0 (x) can be computed in time O(|G| · 16^tw(G)). Note that [3] has already shown that tree-decomposition for graphs of bounded treewidth can be found in linear time. Proof: First we state a known fact about treewidth that will enable us to present the algorithm ...

177 | Efficient top-k query evaluation on probabilistic data (extended version) - Ré, Dalvi, et al. - 2006 |
Citation Context: ...nts associated with the tuples in the input database. Second, a general purpose probabilistic inference algorithm is run on every lineage expression to compute that tuple’s probability, e.g. sampling [21], or inference on graphical models [25], or variable elimination (DPLL) [16]. Thus, there is a significant efficiency gap between safe queries and unsafe queries. The query processor has to make a decision whether the query is safe, and th...

140 | Representing and querying correlated tuples in probabilistic databases - Sen, Deshpande - 2007 |
Citation Context: ...nput database. Second, a general purpose probabilistic inference algorithm is run on every lineage expression to compute that tuple’s probability, e.g. sampling [21], or inference on graphical models [25], or variable elimination (DPLL) [16]. Thus, there is a significant efficiency gap between safe queries and unsafe queries. The query processor has to make a decision whether the query is safe, and th...

108 | MCDB: a Monte Carlo approach to managing uncertain data - Jampani, Xu, et al. - 2008 |
Citation Context: ...ability of tuples by doing inference over these networks. Exact evaluation is not always feasible for unsafe queries, and hence many approximation strategies have also been proposed based on sampling [21, 13] or compilation [19]. There are other approximate inference approaches in graphical models [15, 6, 26] that can also be leveraged. Note that these approximation strategies can be used on the And-Or Net...

106 | Management of probabilistic data: Foundations and challenges - Dalvi, Suciu - 2007 |

95 | Mystiq: a system for finding more answers by using probabilities - Boulos, Dalvi, et al. - 2005 |
Citation Context: ...r11s11 ∨ r11s12 ∨ r12s21 ∨ r12s22 ∨ r21s11 ∨ r21s12 ∨ r22s21 ∨ r22s22, where rij and sij are random variables corresponding to tuples (i, j) in R and S respectively. In probabilistic database systems [2, 1, 4], if the query is unsafe then the system first computes its lineage, then uses some general-purpose probabilistic inference method for computing the probability of the lineage; it is known that the pr...

88 | Fast and simple relational processing of uncertain data - Antova, Jansen, et al. - 2008 |
Citation Context: ...ersion of a paper that appeared in EDBT’10 [14]. 2 Background A probabilistic relation R = (R, ρ) represents a probability distribution over all subsets of R, also called instances, given by ρ : 2^R → [0, 1] s.t. ∑ ω⊆R ρ(ω) = 1. Given k probabilistic relations (R1, ρ1), ..., (Rk, ρk), a probabilistic database is the product space D = (R, ρ), where R = (R1, ..., Rk) and ∀1 ≤ i ≤ k, ωi ⊆ Ri : ρ(ω1, ....

63 | Conditioning probabilistic databases - Koch, Olteanu |
Citation Context: ...ose probabilistic inference algorithm is run on every lineage expression to compute that tuple’s probability, e.g. sampling [21], or inference on graphical models [25], or variable elimination (DPLL) [16]. Thus, there is a significant efficiency gap between safe queries and unsafe queries. The query processor has to make a decision whether the query is safe, and then it can be evaluated using an exten...

54 | An optimal approximation algorithm for Bayesian inference - Dagum, Luby - 1997 |
Citation Context: ...for unsafe queries, and hence many approximation strategies have also been proposed based on sampling [21, 13] or compilation [19]. There are other approximate inference approaches in graphical models [15, 6, 26] that can also be leveraged. Note that these approximation strategies can be used on the And-Or Networks as well. Our method basically reduces the original problem into an inference problem of smaller...

54 | Sprout: Lazy vs. eager query plans for tuple-independent probabilistic databases - Olteanu, Huang, et al. - 2009 |
Citation Context: ... evaluated by using an extensional query plan. This is a regular relational plan, where every operator performs some simple manipulation of the probabilities, e.g. using multiplication or aggregation [8, 18]. However, such plans can only exist for safe queries: every unsafe query can be shown to be #P-hard, and therefore is unlikely to admit any efficient evaluation algorithm. The common practice to eval...

42 | MayBMS: a probabilistic database management system - Huang, Antova, et al. - 2009 |

40 | Using OBDDs for efficient query evaluation on probabilistic databases - Olteanu, Huang - 2008 |
Citation Context: ...t order is itself an intractable problem, and there are no guarantees that one can find such a good order based on the data. The approaches that provide some theoretical guarantees on the performance [12, 10, 17] have their running time exponential in the treewidth of the lineage. We next define treewidth. A hypergraph is a pair H = (V, E), where V is the set of vertices and E ⊆ 2^V is a set of subsets of V. ...

39 | Counting truth assignments of formulas of bounded tree-width or clique-width - Fischer, Makowsky, et al. - 2008 |

39 | Exploiting lineage for confidence computation in uncertain and probabilistic databases - Sarma, Theobald, et al. - 2008 |

28 | Approximate confidence computation in probabilistic databases - Olteanu, Huang, et al. - 2010 |
Citation Context: ...ng inference over these networks. Exact evaluation is not always feasible for unsafe queries, and hence many approximation strategies have also been proposed based on sampling [21, 13] or compilation [19]. There are other approximate inference approaches in graphical models [15, 6, 26] that can also be leveraged. Note that these approximation strategies can be used on the And-Or Networks as well. Our m...

19 | Using DPLL for efficient OBDD construction - Huang, Darwiche - 2004 |
Citation Context: ...t order is itself an intractable problem, and there are no guarantees that one can find such a good order based on the data. The approaches that provide some theoretical guarantees on the performance [12, 10, 17] have their running time exponential in the treewidth of the lineage. We next define treewidth. A hypergraph is a pair H = (V, E), where V is the set of vertices and E ⊆ 2^V is a set of subsets of V. ...

11 | Efficient Reasoning in Graphical Models - Rish - 1999 |
Citation Context: ...ried out by moralizing the graph G (connecting all the parents of every node) to convert it into a Markov network M(G). This would lead to networks of very high treewidth, but [25] exploit decomposability [22] to reduce the factors in G to size 3 factors: call this graph D(G). Figure 2 illustrates the construction of M(G) and D(G). Hence the complexity of their approach depends on the treewidth of the grap...

2 | Optimal ordered Binary Decision Diagrams for read-once formulas - Sauerhoff, Wegener, et al. |
Citation Context: ...for SAT and #SAT problems and generally involve using some heuristic to eliminate the variables and simplify the formulae. There are some classes of formulae such as symmetric and read-once functions [24] that can be solved efficiently. The most effective methods rely on finding a good variable order; however, finding the best order is itself an intractable problem, and there are no guarantees that on...