
## Parameter learning in probabilistic databases: A least squares approach (2008)

Citations: 21 (6 self)

### Citations

3517 | Graph-based algorithms for boolean function manipulation
- Bryant
- 1986
Citation Context: ...(a,c) is ac ∨ (ab ∧ bc), where we use xy as the Boolean variable representing edge(x,y). To effectively calculate the probability of such a monotone DNF formula, we employ Binary Decision Diagrams (BDDs) [19], an efficient graphical representation of a Boolean function over a set of variables; see Section 6 for more details. As the size of the DNF formula grows with the number of proofs, its evaluation ca...
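The context above evaluates a monotone DNF such as ac ∨ (ab ∧ bc) over independent edge variables. As a minimal sketch of that computation, the brute-force enumeration below stands in for the BDD-based evaluation the paper actually uses (BDDs avoid the exponential blow-up); the edge probabilities are hypothetical values chosen for illustration.

```python
from itertools import product

# Hypothetical edge probabilities for illustration; the real values
# live in the probabilistic database.
probs = {"ac": 0.8, "ab": 0.6, "bc": 0.7}

def dnf_probability(proofs, probs):
    """Brute-force P(DNF) over independent Boolean edge variables.

    Each proof is a set of edge variables; the DNF holds in a world if
    some proof has all its variables true. BDDs make this tractable for
    large formulas; this sketch simply enumerates all worlds."""
    variables = sorted({v for proof in proofs for v in proof})
    total = 0.0
    for bits in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, bits))
        if any(all(world[v] for v in proof) for proof in proofs):
            weight = 1.0
            for v, is_true in world.items():
                weight *= probs[v] if is_true else 1.0 - probs[v]
            total += weight
    return total

# path(a,c) has the proofs {ac} and {ab, bc}, i.e. the DNF ac ∨ (ab ∧ bc);
# by inclusion-exclusion this equals 0.8 + 0.42 - 0.8 * 0.42:
print(dnf_probability([{"ac"}, {"ab", "bc"}], probs))
```

For n variables this enumerates 2^n worlds, which is exactly why the paper resorts to BDDs once the number of proofs grows.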

2082 | Foundations of Logic Programming
- Lloyd
- 1984
Citation Context: ...the explanation probability can easily be realized using a best-first search – guided by the probability of the current derivation – through standard logic programming techniques based on the SLD-tree [18]. On the other hand, evaluating the success probability of ProbLog queries is computationally hard, as different proofs of a query are not independent in general. As shown in [4], the problem can be t...

613 | Learning probabilistic relational models
- Friedman, Getoor, et al.
- 1999
Citation Context: ...pproach in that there is no underlying generative model. Indeed, consider for instance the learning of stochastic logic programs (SLPs) [5], PRISM programs [6], probabilistic relational models (PRMs) [7] or Bayesian logic programs (BLPs) [8]. In all these approaches, a generative model is assumed. For SLPs (and stochastic context-free grammars) as well as for PRISM, the learning procedure assumes tha...

450 | Query evaluation on probabilistic databases
- Re, Dalvi, et al.
Citation Context: ... both the structure and parameters of probabilistic logics, cf. [1, 2], but so far seems to have devoted little attention to the learning of probabilistic database formalisms. Probabilistic databases [3, 4] associate probabilities to facts, indicating the probabilities with which these facts hold. This information is then used to define and compute the success probability of queries or derived facts or ...

339 | Introduction to Statistical Relational Learning
- Getoor, Taskar
- 2007
Citation Context: ...can deal with uncertainty. Over the last years, the statistical relational learning community has devoted a lot of attention to learning both the structure and parameters of probabilistic logics, cf. [1, 2], but so far seems to have devoted little attention to the learning of probabilistic database formalisms. Probabilistic databases [3, 4] associate probabilities to facts, indicating the probabilities ...

252 | Tree-Bank Grammars.
- Charniak
- 1996
Citation Context: ...ing from sentences belonging to the grammar (learning from entailment / from queries), or alternatively, one could learn it from parse-trees (learning from proofs), cf. the work on tree-bank grammars [11, 12]. The former setting is typically a lot harder than the latter one because one query may have multiple proofs, which introduces hidden parameters into the learning setting that are not present when l...

154 | Mean field theory for sigmoid belief networks.
- Saul, Jaakkola, et al.
- 1996
Citation Context: ...ss each pj ∈ ]0, 1[ in terms of the sigmoid function pj = σ(aj) := 1/(1 + exp(−aj)) applied to aj ∈ R. This technique has been used for Bayesian networks and in particular for sigmoid belief networks [23]. We derive the partial derivative ∂Ps(qi|T)/∂aj in the same way as (7), but we have to apply the chain rule one more time due to the σ function: σ(aj) · (1 − σ(aj)) · ∑_{S⊆L_T, L|=q_i} δ_jS ∏_{c_x∈S, x≠j} σ(a_x)...
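The reparameterization quoted above rewrites each probability as pj = σ(aj), so every gradient picks up the extra chain-rule factor σ(aj)·(1 − σ(aj)). A minimal sketch of that factor, checked against a finite difference (the value a = 0.3 is an arbitrary test point, not from the paper):

```python
import math

def sigmoid(a):
    """sigma(a) = 1 / (1 + exp(-a)): maps an unconstrained a_j to p_j in ]0, 1[."""
    return 1.0 / (1.0 + math.exp(-a))

def sigmoid_grad(a):
    """d sigma / d a = sigma(a) * (1 - sigma(a)) -- the extra chain-rule
    factor that appears once p_j is rewritten as sigma(a_j)."""
    s = sigmoid(a)
    return s * (1.0 - s)

# Finite-difference check of the chain-rule factor at an arbitrary a_j:
a, eps = 0.3, 1e-6
numeric = (sigmoid(a + eps) - sigmoid(a - eps)) / (2 * eps)
print(abs(numeric - sigmoid_grad(a)) < 1e-8)  # True
```

Because σ maps all of R into ]0, 1[, gradient steps on aj can never push a probability outside its valid range, which is the point of the reparameterization.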

144 | ProbLog: A probabilistic Prolog and its application in link discovery.
- Raedt, Kimmig, et al.
- 2007
Citation Context: ... both the structure and parameters of probabilistic logics, cf. [1, 2], but so far seems to have devoted little attention to the learning of probabilistic database formalisms. Probabilistic databases [3, 4] associate probabilities to facts, indicating the probabilities with which these facts hold. This information is then used to define and compute the success probability of queries or derived facts or ...

122 | Parameter learning of logic programs for symbolicstatistical modeling.
- Sato, Kameya
- 2001
Citation Context: ... the usual statistical relational learning approach in that there is no underlying generative model. Indeed, consider for instance the learning of stochastic logic programs (SLPs) [5], PRISM programs [6], probabilistic relational models (PRMs) [7] or Bayesian logic programs (BLPs) [8]. In all these approaches, a generative model is assumed. For SLPs (and stochastic context-free grammars) as well as f...

82 | Parameter estimation in stochastic logic programs
- Cussens
- 2001
Citation Context: ...coring function and learning setting, it is important to realize that there is also a major difference between probabilistic databases and alternative probabilistic logics, such as PRISM [6] and SLPs [5], even though the probabilistic database semantics seems closely related at first sight. To see this, assume that we now want to estimate the parameters of a ProbLog program starting from example quer...

72 | Probabilistic inductive logic programming.
- Raedt, Kersting
- 2008
Citation Context: ... weighted examples such as 0.6:locatedIn(a,b) and 0.7:interacting(a,c), as already argued e.g. by Gupta and Sarawagi [14] and Chen et al. [9]. The situation fits the general learning setting stated in [21]: Given is a set of examples E, a probabilistic coverage relation P(e|D) that denotes the probability that the database D covers the example e ∈ E, a theory T in a probabilistic logic, and a scoring ...

63 | Creating probabilistic databases from information extraction models
- Gupta, Sarawagi
- 2006
Citation Context: ...st estimating parameters as empirical frequencies among matching rules and then selecting the subset of rules with the lowest expected quadratic loss on a hold-out validation set. Gupta and Sarawagi [14] also consider a closely related learning setting but only extract probabilistic facts from data. Finally, the new setting and algorithm comprise a natural and interesting addition to the existing l...

46 | Logic Programming, Abduction and Probability: a top-down anytime algorithm for estimating prior and posterior probabilities,
- Poole
- 1993
Citation Context: ...anation probability respectively. Using k = 1 in parameter learning has also been called Viterbi learning. Finding the k best proofs can be realized using a simple branch-and-bound approach (cf. also [20]). To illustrate k-probability, we consider again our example graph, but this time with query path(a,d). This query has four proofs, represented by the conjunctions ac∧cd, ab∧bc∧cd, ac∧ce∧ed and ab∧bc...
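The k-probability described above keeps only the k most likely proofs of a query and evaluates the probability of their disjunction (k = 1 recovers Viterbi learning). The sketch below uses the four proofs of path(a,d) quoted in the context; the edge probabilities are hypothetical stand-ins, and the disjunction is evaluated by brute-force world enumeration rather than the BDDs used in practice.

```python
from itertools import product

# Hypothetical edge probabilities for the example graph; the paper's
# actual Fig. 1 values may differ.
probs = {"ab": 0.6, "bc": 0.7, "ac": 0.8, "cd": 0.5, "ce": 0.9, "ed": 0.5}

# The four proofs of path(a,d) quoted in the context above:
proofs = [{"ac", "cd"}, {"ab", "bc", "cd"},
          {"ac", "ce", "ed"}, {"ab", "bc", "ce", "ed"}]

def proof_prob(proof):
    """Probability of one proof: the product of its edge probabilities."""
    out = 1.0
    for edge in proof:
        out *= probs[edge]
    return out

def k_probability(proofs, k):
    """P_k: probability that at least one of the k most likely proofs
    holds, computed by enumerating worlds over the involved edges.
    k = 1 recovers the explanation probability (Viterbi learning)."""
    best = sorted(proofs, key=proof_prob, reverse=True)[:k]
    variables = sorted({v for p in best for v in p})
    total = 0.0
    for bits in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, bits))
        if any(all(world[v] for v in p) for p in best):
            w = 1.0
            for v, is_true in world.items():
                w *= probs[v] if is_true else 1.0 - probs[v]
            total += w
    return total

print(k_probability(proofs, 1))  # best single proof: ac ∧ cd
```

Increasing k toward the total number of proofs makes P_k converge to the full success probability, at the cost of a larger DNF to evaluate.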

43 | Probabilistic logic learning.
- Raedt, Kersting
- 2003
Citation Context: ... – in part – why so far only few learning techniques for probabilistic databases have been developed. The learning setting, however, is in line with the general theory of probabilistic logic learning [10] and inductive logic programming. From an inductive logic programming perspective, a query corresponds to a formula that is entailed by the database, and hence, queries correspond to well-known learni...

37 | Link discovery in graphs derived from biological databases
- SEVON, ERONEN, et al.
- 2006
Citation Context: ...pes for the diseases are from OMIM. Most of the other information comes from EntrezGene, String, UniProt, HomoloGene, Gene Ontology, and OMIM databases. Weights were assigned to edges as described in [24]. In the experiments below, we used a fixed number of randomly chosen (Alzheimer disease or asthma) genes for graph extraction. Subgraphs were extracted by taking all acyclic paths of no more than len...

28 | Extensibility in Data Mining Systems.
- Wrobel, Wettschereck, et al.
- 1996
Citation Context: ...form of theory revision. The present task extends the compression setting in that parameters of all facts can now be tuned starting from evidence. This realizes a more general form of theory revision [16], albeit that only the parameters are changed and not the structure. [Fig. 1. (a) Example of a probabilistic graph, where edge labels indicate the...]

21 | Basic principles of learning Bayesian logic programs
- Kersting, De Raedt
- 2002
Citation Context: ... generative model. Indeed, consider for instance the learning of stochastic logic programs (SLPs) [5], PRISM programs [6], probabilistic relational models (PRMs) [7] or Bayesian logic programs (BLPs) [8]. In all these approaches, a generative model is assumed. For SLPs (and stochastic context-free grammars) as well as for PRISM, the learning procedure assumes that ground atoms for a single predicate ...

17 | Learning probabilistic logic models from probabilistic examples (extended abstract
- Chen, Muggleton, et al.
Citation Context: ...icate (or in the grammar case, sentences belonging to the language) are sampled and that the sum of the probabilities of all different atoms obtainable in this way is at most 1. Recently, Chen et al. [9] also proposed a learning setting similar to ours. The probabilities associated with examples, however, are viewed as specifying the degree of being sampled from some distribution specified by a gener...

12 | Compressing probabilistic Prolog programs
- Raedt, Kersting, et al.
- 2008
Citation Context: ... Finally, the new setting and algorithm comprise a natural and interesting addition to the existing learning algorithms for ProbLog. It is most closely related to the theory compression setting of [15]. There the task was to remove all but the k best facts from the database (that is, to set the probability of such facts to 0), which realizes an elementary form of theory revision. The present task ex...

9 | Learning probabilistic Datalog rules for information classification and transformation
- Nottelmann, Fuhr
Citation Context: ...g that learning from proofs is integrated with learning from entailment. Within the probabilistic database community, parameter estimation has received surprisingly little attention. Nottelmann and Fuhr [13] consider learning probabilistic Datalog rules in a similar setting where the underlying distribution semantics is similar to ProbLog. However, their setting and approach also significantly differ fro...

8 | Towards learning stochastic logic programs from proof-banks
- Raedt, Kersting, et al.
- 2005
Citation Context: ...ing from sentences belonging to the grammar (learning from entailment / from queries), or alternatively, one could learn it from parse-trees (learning from proofs), cf. the work on tree-bank grammars [11, 12]. The former setting is typically a lot harder than the latter one because one query may have multiple proofs, which introduces hidden parameters into the learning setting that are not present when l...

7 | H.: Probabilistic explanation based learning
- Kimmig, Raedt, et al.
Citation Context: ...then defined as the probability of the most likely explanation or proof of the query q: Px(q|T) = max_{e∈E(q)} P(e|T) = max_{e∈E(q)} ∏_{ci∈e} pi (2), where E(q) is the set of all explanations for query q [17]. For our example graph and query path(a,c), the set of all explanations contains the edge from a to c (with probability 0.8) as well as the path consisting of the edges from a to b and from b to c (w...
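Equation (2) quoted above takes the maximum, over all explanations of a query, of the product of the fact probabilities in each explanation. A minimal sketch for the path(a,c) example: the 0.8 for the direct edge comes from the context, while the 0.6 and 0.7 on the two-edge path are hypothetical values for illustration.

```python
def explanation_probability(explanations, probs):
    """P_x(q|T) = max over explanations e of the product of the p_i in e,
    following equation (2) in the context above."""
    best = 0.0
    for e in explanations:
        p = 1.0
        for fact in e:
            p *= probs[fact]
        best = max(best, p)
    return best

# path(a,c): direct edge (0.8, per the context) vs. the two-edge path
# a->b->c (hypothetical 0.6 * 0.7 = 0.42); the maximum wins.
probs = {"ac": 0.8, "ab": 0.6, "bc": 0.7}
print(explanation_probability([{"ac"}, {"ab", "bc"}], probs))  # 0.8
```

Unlike the success probability, this maximum needs no disjoint-sum computation, which is why it can be found by best-first search over the SLD-tree.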

7 | Probabilistic inductive logic programming: theory and applications
- Raedt, Frasconi
- 2008

1 | Statistical relational learning. The MIT Press.
- Saul, Jaakkola, et al.
- 1996
Citation Context: ... pj = σ(aj) := 1/(1 + exp(−aj)) applied to aj ∈ R. This technique has also been used for Bayesian networks and in particular for sigmoid belief networks (Saul et al., 1996). We can derive the partial derivative ∂Ps(qi|T)/∂aj in the same way as (4), but we have to apply the chain rule one more time due to the σ function: σ(aj) · (1 − σ(aj)) · ∑_{S⊆L...} ∏ σ(a_x) ∏ (1 − σ(a_x))...