Results 1 - 8 of 8
On the Conditional Independence Implication Problem: A Lattice-Theoretic Approach
Cited by 7 (4 self)
A lattice-theoretic framework is introduced that permits the study of the conditional independence (CI) implication problem relative to the class of discrete probability measures. Semilattices are associated with CI statements, and a finite, sound and complete inference system relative to semilattice inclusions is presented. This system is shown to be (1) sound and complete for saturated CI statements, (2) complete for general CI statements, and (3) sound and complete for stable CI statements. These results yield a criterion that can be used to falsify instances of the implication problem, and several heuristics are derived that approximate this “lattice-exclusion” criterion in polynomial time. Finally, we provide experimental results that relate our work to results obtained from other existing inference algorithms.
Itemset Frequency Satisfiability: Complexity and Axiomatization
, 2007
Cited by 3 (1 self)
Computing frequent itemsets is one of the most prominent problems in data mining. We study the following related problem, called FREQSAT, in depth: given some itemset-interval pairs, does there exist a database such that for every pair the frequency of the itemset falls into the interval? This problem is shown to be NP-complete. The problem is then further extended to include arbitrary Boolean expressions over items and conditional frequency expressions in the form of association rules. We also show that, unless P equals NP, the related function problem (find the best interval for an itemset under some frequency constraints) cannot be approximated efficiently. Furthermore, it is shown that FREQSAT is recursively axiomatizable, but that there cannot exist an axiomatization of finite arity.
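To make the FREQSAT question concrete, here is a minimal Python sketch. It is not the method of the paper: it simply searches by brute force for a witness database among all databases with at most max_rows transactions. The names find_witness, frequency and powerset, and the max_rows cutoff, are assumptions introduced for illustration. Because of the cutoff, a None result in general only says that no small witness was found.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement


def powerset(items):
    """All subsets of `items` as frozensets (the possible transaction types)."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def frequency(itemset, database):
    """Fraction of transactions in `database` that contain `itemset`."""
    hits = sum(1 for transaction in database if itemset <= transaction)
    return Fraction(hits, len(database))


def find_witness(constraints, max_rows=4):
    """Look for a database satisfying every (itemset, low, high) constraint.

    Hypothetical helper: tries every multiset of transactions with
    1..max_rows rows and returns the first one that fits, or None.
    """
    items = set().union(*(itemset for itemset, _, _ in constraints))
    transaction_types = powerset(items)
    for n_rows in range(1, max_rows + 1):
        for database in combinations_with_replacement(transaction_types, n_rows):
            if all(low <= frequency(itemset, database) <= high
                   for itemset, low, high in constraints):
                return list(database)
    return None


half = Fraction(1, 2)
satisfiable = [
    (frozenset("a"), half, half),
    (frozenset("b"), half, half),
    (frozenset("ab"), Fraction(0), Fraction(0)),
]
# The frequency of {a, b} can never exceed the frequency of {a}.
contradictory = [
    (frozenset("a"), Fraction(0), Fraction(1, 4)),
    (frozenset("ab"), half, Fraction(1)),
]
print(find_witness(satisfiable))
print(find_witness(contradictory, max_rows=3))
```

The search space grows exponentially in both the number of items and max_rows, which is consistent with the NP-completeness result stated in the abstract; the sketch is only meant for toy instances.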
Logical inference algorithms and matrix representations for probabilistic conditional independence
 In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence
, 2009
Cited by 3 (1 self)
Logical inference algorithms for conditional independence (CI) statements have important applications from testing consistency during knowledge elicitation to constraint-based structure learning of graphical models. We prove that the implication problem for CI statements is decidable, given that the size of the domains of the random variables is known and fixed. We will present an approximate logical inference algorithm which combines a falsification and a novel validation algorithm. The validation algorithm represents each set of CI statements as a sparse 0-1 matrix A and validates instances of the implication problem by solving specific linear programs with constraint matrix A. We will show experimentally that the algorithm is both effective and efficient in validating and falsifying instances of the probabilistic CI implication problem.
On the Effectiveness and Efficiency of Computing Bounds on the Support of Item-Sets in the Frequent Item-Sets Mining Problem
Cited by 2 (0 self)
We study the relative effectiveness and the efficiency of computing support-bounding rules that can be used to prune the search space in algorithms to solve the frequent itemsets mining problem (FIM). We develop a formalism wherein these rules can be stated and analyzed using the concept of differentials and density functions of the support function. We derive a general bounding theorem, which provides lower and upper bounds on the supports of itemsets in terms of the supports of their subsets. Since, in general, many lower and upper bounds exist for the support of an itemset, we show how to find the best bounds. The result of this optimization shows that the best bounds are among those that involve the supports of all the strict subsets of an itemset of a particular size q. These bounds are determined on the basis of so-called q-rules. In this way, we derive the bounding theorem established by Calders
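The idea of bounding an itemset's support from the supports of its strict subsets can be illustrated with the standard inclusion-exclusion bounds. The Python sketch below is an illustration under stated assumptions, not the q-rules formalism of the paper: for an itemset I and each X that is a strict subset of I, it sums (-1)^(|I \ J| + 1) * supp(J) over all J with X ⊆ J ⊂ I, and treats the sum as an upper bound when |I \ X| is odd and as a lower bound when it is even. The names support_bounds, support and subsets are hypothetical.

```python
from itertools import combinations


def subsets(items):
    """All subsets of `items` as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def support(itemset, database):
    """Number of transactions in `database` that contain `itemset`."""
    return sum(1 for transaction in database if itemset <= transaction)


def support_bounds(itemset, supp):
    """Best inclusion-exclusion (lower, upper) bounds on supp(itemset).

    `supp` maps every strict subset of `itemset` (as a frozenset) to its
    support; supp[frozenset()] is the number of transactions.
    """
    itemset = frozenset(itemset)
    lower, upper = 0, supp[frozenset()]
    for x in subsets(itemset):
        if x == itemset:
            continue
        missing = itemset - x
        total = 0
        for extra in subsets(missing):
            if extra == missing:
                continue  # J must be a strict subset of the itemset
            j = x | extra
            sign = (-1) ** (len(itemset - j) + 1)
            total += sign * supp[j]
        if len(missing) % 2 == 1:
            upper = min(upper, total)  # odd |I \ X|: upper bound
        else:
            lower = max(lower, total)  # even |I \ X|: lower bound
    return lower, upper


database = [frozenset("ab"), frozenset("a"), frozenset("b"),
            frozenset("ab"), frozenset()]
supp = {s: support(s, database) for s in subsets("ab")}
bounds = support_bounds("ab", supp)
print(bounds, supp[frozenset("ab")])
```

For the two-item case this reduces to the familiar facts that supp(ab) is at most min(supp(a), supp(b)) and at least supp(a) + supp(b) minus the number of transactions.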
On when and how to use SAT to mine frequent itemsets
 CoRR
Cited by 1 (0 self)
A new stream of research was born in the last decade with the goal of mining itemsets of interest using Constraint Programming (CP). This has promoted a natural way to combine complex constraints in a highly flexible manner. Although CP state-of-the-art solutions formulate the task using Boolean variables, the few attempts to adopt propositional Satisfiability (SAT) provided an unsatisfactory performance. This work deepens the study on when and how to use SAT for the frequent itemset mining (FIM) problem by defining different encodings with multiple task-driven enumeration options and search strategies. Although for the majority of the scenarios SAT-based solutions appear to be non-competitive with CP peers, results show a variety of interesting cases where SAT encodings are the best option.
On Implication Problems for Disjunctive Statements (Extended Abstract)
, 2009
Implication problems occur in many areas of computer science. Examples include, of course, logic, but also database systems (constraints), data mining (association rules), and reasoning under uncertainty (conditional independence). We provide a general framework for implication problems based on the observation that many can be reduced to an implication problem for additive constraints on specific classes of real-valued functions. Furthermore, we provide inference systems and properties of classes of real-valued functions which imply the soundness and completeness of these systems. We present computational complexity results for an important class of implication problems for which a finite axiomatization exists. We also derive properties of classes of real-valued functions that imply the nonexistence of finite, complete axiomatizations.
Itemset Frequency Satisfiability: Complexity and Axiomatization
Computing frequent itemsets is one of the most prominent problems in data mining. We study the following related problem, called FREQSAT, in depth: given some itemset-interval pairs, does there exist a database such that for every pair the frequency of the itemset falls into the interval? This problem is shown to be NP-complete. The problem is then further extended to include arbitrary Boolean expressions over items and conditional frequency expressions in the form of association rules. We also show that, unless P equals NP, the related function problem (find the best interval for an itemset under some frequency constraints) cannot be approximated efficiently. Furthermore, it is shown that FREQSAT is recursively axiomatizable, but that there cannot exist an axiomatization of finite arity.