| Mannila H. and Räihä K. Dependency inference. In Proceedings of the 13th International Conference on Very Large Databases (VLDB) (1987), pp. 155--158. |
....all, of the data and not that any probabilistic techniques are used. that FDs represent interesting patterns existent in the data. In this setting, FDs are not regarded as declared constraints. Researchers have investigated the problem of efficiently discovering FDs that hold in a given instance [11, 13, 16, 18, 20, 21, 23, 38]. Researchers have also considered the concept of an FD approximately holding in an instance and have developed measures to characterize the degree of approximation. Piatetsky-Shapiro [26] describes a measure derived from probabilistic considerations (this measure corresponds to the measure of ....
Mannila H. and Räihä K. Dependency inference. In Proceedings of the 13th International Conference on Very Large Databases (VLDB) (1987), pp. 155--158.
....and business management. Therefore, mining association rules from large data sets has been a focused topic in recent research into knowledge discovery in databases [1, 2, 3, 9, 12, 14]. Studies on mining association rules have evolved from techniques for discovery of functional dependencies [10], strong rules [14], classification rules [7, 15], causal rules [11], clustering [6], etc. to disk-based, efficient methods for mining association rules in large sets of transaction data [1, 2, 3, 12]. However, previous work has been focused on mining association rules at a single concept ....
H. Mannila and K-J. Raiha. Dependency inference. In Proc. 1987.
....a second on a PC. 1.1. Related work Several algorithms for the discovery of functional dependencies have been presented [1, 3, 5, 6, 12, 13, 14]. We review these algorithms and compare them with our method in Section 5.3. The complexity of discovering functional dependencies has been studied in [2, 12, 15]. Approximate functional dependencies have been considered in [7, 8, 16, 17]. Kivinen and Mannila [16] define several measures for the error of a dependency and derive bounds for discovering dependencies with errors; they denote the measure e by g3. The use of partitions to describe and define ....
Mannila, H. and Raiha, K.-J. (1987) Dependency inference. In Proc. 13th Int. Conf. on Very Large Data Bases (VLDB'87), Los Altos, CA, pp. 155--158. Morgan Kaufmann.
....research since the 1970s. Several algorithms now exist for using functional dependencies to create normalized databases that minimize redundancies and facilitate update [138]. An asymptotically optimal algorithm also exists for finding the minimal set of functional dependencies in a database [94]. Wong et al. [148] suggest a bottom-up procedure for discovering multivalued dependencies (MVDs) in observed data without knowing a priori the relationships among the attributes. A prototype system for automated database schema design has been implemented. In recent years, research by Glymour et ....
Mannila H. and Raiha K.J., "Dependency Inference", in Proceedings 13th International Conference Very Large Data Bases, Brighton, England, pp. 155--158, 1987.
.... to which an FD is approximate 1 Introduction Over approximately the last ten years, a new research direction has emerged involving functional dependencies (FDs). Researchers have been addressing the problem of finding all of the FDs which hold in a given relation instance ([4], [5], [6], [7], [9], [10], [12], [15]). We call this FD discovery research. The primary motivation for FD discovery research is different than that for the original FD research in the 70s. The research in the 70s was primarily motivated by database design (e.g. schema normal forms). The primary motivation for FD ....
Mannila H. and Raiha K. Dependency inference. In Proceedings of the 13th International Conference on Very Large Databases (VLDB), pages 155--158, 1987.
....If used for finding all association rules, this algorithm will make as many passes over the data as the number of combinations of items in the antecedent, which is exponentially large. Related work in the database literature is the work on inferring functional dependencies from data [Bit92, MR87]. Functional dependencies are rules requiring strict satisfaction. Consequently, having determined a dependency X → A, the algorithms in [Bit92, MR87] consider any other dependency of the form X + Y → A redundant and do not generate it. The association rules we consider are probabilistic in ....
....which is exponentially large. Related work in the database literature is the work on inferring functional dependencies from data [Bit92, MR87]. Functional dependencies are rules requiring strict satisfaction. Consequently, having determined a dependency X → A, the algorithms in [Bit92, MR87] consider any other dependency of the form X + Y → A redundant and do not generate it. The association rules we consider are probabilistic in nature. The presence of a rule X → A does not necessarily mean that X + Y → A also holds because the latter may not have minimum support. Similarly, the ....
Heikki Mannila and Kari-Jouko Raiha. Dependency inference. In Proc. of the VLDB Conference, pages 155--158, Brighton, England, 1987.
....of a second on a PC. Related work Several algorithms for the discovery of functional dependencies have been presented [10, 3, 12, 22, 21, 14, 2]. We review these algorithms and compare them with our method in Section 6. The complexity of discovering functional dependencies has been studied in [11, 13, 12]. Approximate functional dependencies have been considered in [7, 19, 8, 4]. Kivinen and Mannila [7] define several measures for the error of a dependency, and derive bounds for discovering dependencies with errors. The measure g3 is one of their measures. The use of partitions to describe and ....
H. Mannila and K.-J. Raiha. Dependency inference. In Proceedings of the Thirteenth International Conference on Very Large Data Bases (VLDB'87), pages 155--158, Los Altos, CA, 1987. Morgan Kaufmann.
....of a second on a PC. Related work Several algorithms for the discovery of functional dependencies have been presented [7, 2, 9, 18, 17, 11, 1]. We review these algorithms and compare them with our method in Section 6. The complexity of discovering functional dependencies has been studied in [8, 10, 9]. Approximate functional dependencies have been considered in [5, 15, 6, 3]. Kivinen and Mannila [5] define several measures for the error of a dependency, and derive bounds for discovering dependencies with errors. The measure g3 is one of their measures. The use of partitions to describe and ....
H. Mannila and K.-J. Raiha. Dependency inference. In Proceedings of the Thirteenth International Conference on Very Large Data Bases (VLDB'87), pages 155--158, Los Altos, CA, 1987. Morgan Kaufmann.
....of TRANSVERSAL HYPERGRAPH is currently known. Similarly, it is an open problem whether Tr(H) can be computed in output-polynomial total time (i.e. in time polynomial in the combined sizes of the input and the output). This complexity problem was posed independently by several researchers [MR87, DT87, JYP88]. Note that the existence of an output-polynomial algorithm for computing Tr(H) would imply the polynomial solvability of TRANSVERSAL HYPERGRAPH; vice versa, if TRANSVERSAL HYPERGRAPH is co-NP-complete, then no output-polynomial algorithm for the computation of Tr(H) is likely to ....
....these problems in input-polynomial time. In the face of the inherent complexity, it is of interest to have algorithms which solve these problems in output-polynomial total time. Unfortunately, even under this relaxed complexity requirement no efficient algorithms are known for both problems [MR86, MR87, BDFS84]. Note that problems AP1 and AP2, which are search problems in terms of complexity theory, are solvable by algorithms in output-polynomial time only if the following decision problem, which we call FD RELATION EQUIVALENCE, is in P: Problem: FD RELATION EQUIVALENCE Instance: A relation R ....
[Article contains additional citation context not shown here]
Heikki Mannila and Kari-Jouko Raiha. Dependency Inference. In Proceedings of the 13th VLDB, pages 155--158, 1987.
....it is also possible to discover formats (e.g. mmddyy or dd mm yyyy for a date). Dependency Analysis. Finding functional, multivalued and inclusion dependencies in large databases is an expensive operation. But at least for functional dependencies it is feasible for narrow data samples [9], [30], [31]. The results gained from data samples will probably not be exact, but can nevertheless be helpful as candidates which can then be confirmed or rejected by the user. The general problem of finding inclusion dependencies is NP-complete [31]. It can be reduced to O(n^2 p log p) with p = ....
Mannila, H., Raiha, K.J.: Dependency Inference. Proceedings of the 13th VLDB Conference, Brighton 1987.
....since the 1970s. Several algorithms now exist for using functional dependencies to create normalized databases that minimize redundancies and facilitate updates [Ullman, 1982]. An asymptotically optimal algorithm also exists for finding the minimal set of functional dependencies in a database [Mannila and Raiha, 1987]. In recent years, research by Pearl [1988, 1991], Glymour et al. [1987], and others has resulted in major advances in the area of discovering dependency or causal graphs. Because standard statistical techniques cannot distinguish causation from covariation, data precedence information or ....
H. Mannila and K.-J. Raiha. Dependency inference. In Proceedings of the Thirteenth International Conference on Very Large Data Bases (VLDB'87), pages 155--158, 1987.
.... filtering conditions [12, 17]. The notion of a frontier border set, crucial for efficient finding of all large frequent itemsets [35, 54], is closely related to the GUHA concept of prime sentences [16]. The gap between association rules and functional dependencies known from databases [2, 34] can be partially bridged in GUHA by means of improving literals [12, 16]. 4 Impact of Soft Computing Soft computing is a relatively new name for a branch of research including fuzzy logic, neural networks, genetic and probabilistic computing. Here we contemplate on soft exploratory data ....
Mannila, H., and Räihä, K. Dependency inference. In Proceedings of the 13th International Conference on Very Large Data Bases (1987), pp. 155--158.
....and business management. Therefore, mining association rules from large data sets has been a focused topic in recent research into knowledge discovery in databases [2, 4, 5, 65, 82, 85]. Studies on mining association rules have evolved from techniques for discovery of functional dependencies [71], strong rules [85], classification rules [46, 91], causal rules [77], clustering [34], etc. to disk-based, efficient methods for mining association rules in large sets of transaction data [2, 4, 5, 82]. However, previous work has been focused on mining association rules at a single conceptual ....
H. Mannila and K-J. Raiha. Dependency inference. In Proc. 1987 Int. Conf. Very Large Data Bases, pages 155--158, Brighton, England, Sept. 1987.
.... Therefore, mining association rules or sequential patterns from large data sets has been a focused topic in recent research into knowledge discovery in databases [17, 2, 1, 3, 4, 13]. Studies on mining association rules have evolved from techniques for discovery of functional dependencies [12], strong rules [17], classification rules [18, 8], causal rules [14], clustering [7], inductive logic programming [15], etc. to disk-based, efficient methods for mining association rules in large sets of transaction data [2, 1, 3, 4]. However, previous work has been focused on mining association ....
H. Mannila and K-J. Raiha. Dependency inference. In Proc. 1987 Int. Conf. Very Large Data Bases, pages 155--158, Brighton, England, Sept. 1987.
....and business management. Therefore, mining association rules from large data sets has been a focused topic in recent research into knowledge discovery in databases [1, 2, 3, 9, 12, 14]. Studies on mining association rules have evolved from techniques for discovery of functional dependencies [10], strong rules [14], classification rules [7, 15], causal rules [11], clustering [6], etc. to disk-based, efficient methods for mining association rules in large sets of transaction data [1, 2, 3, 12]. However, previous work has been focused on mining association rules at a single concept level. ....
H. Mannila and K-J. Raiha. Dependency inference. In Proc. 1987 Int. Conf. Very Large Data Bases, pp. 155--158, Brighton, England, Sept. 1987.
....If used for finding all association rules, this algorithm will make as many passes over the data as the number of combinations of items in the antecedent, which is exponentially large. Related work in the database literature is the work on inferring functional dependencies from data [Bit92, MR87]. Functional dependencies are rules requiring strict satisfaction. Consequently, having determined a dependency X → A, the algorithms in [Bit92, MR87] consider any other dependency of the form X + Y → A redundant and do not generate it. The association rules we consider are probabilistic in ....
....which is exponentially large. Related work in the database literature is the work on inferring functional dependencies from data [Bit92, MR87]. Functional dependencies are rules requiring strict satisfaction. Consequently, having determined a dependency X → A, the algorithms in [Bit92, MR87] consider any other dependency of the form X + Y → A redundant and do not generate it. The association rules we consider are probabilistic in nature. The presence of a rule X → A does not necessarily mean that X + Y → A also holds because the latter may not have minimum support. Similarly, the ....
Heikki Mannila and Kari-Jouko Raiha. Dependency inference. In Proc. of the VLDB Conference, pages 155--158, Brighton, England, 1987.
....idea and expands it by identifying pruning rules and techniques for efficiently implementing those pruning rules. The application of the idea is also slightly more general in this paper; instead of focusing on a fixed classification, the search identifies minimal domains for any range of interest. Mannila and Räihä (1987) present an algorithm for inducing functional dependencies. Like Almuallim and Dietterich, their approach also uses disagreements between pairs of data. Basically, their algorithm computes all pairwise disagreements between tuples in a database. This takes O( time. Then, it collects up all these ....
Mannila, H., & Räihä, K. (1987). Dependency inference (extended abstract). Proceedings of the Thirteenth Very Large Database Conference (pp. 155--158). Brighton.
....in databases has become an important issue in recent research [1, 3, 5, 16]. Most previous studies on data mining have been focused at mining rules at single concept levels, i.e. either at a primitive concept level or at a rather high concept level. For example, mining functional dependency rules [12], strong rules [15], causal rules [13], and association rules [1, 2] in relational or transaction databases is related to finding rules usually at primitive concept levels but sometimes at a given high concept level, depending on the available data. The generalization based data mining The ....
H. Mannila and K-J. Raiha. Dependency inference. In Proc. 1987 Int. Conf. Very Large Data Bases, pp. 155--158, Brighton, England, Sept. 1987.
....in [20]. If used for finding all association rules, this algorithm will make as many passes over the data as the number of combinations of items in the antecedent, which is exponentially large. Related work in the database literature is the work on inferring functional dependencies from data [16]. Functional dependencies are rules requiring strict satisfaction. Consequently, having determined a dependency X → A, the algorithms in [16] consider any other dependency of the form X + Y → A redundant and do not generate it. The association rules we consider are probabilistic in nature. The ....
....of items in the antecedent, which is exponentially large. Related work in the database literature is the work on inferring functional dependencies from data [16]. Functional dependencies are rules requiring strict satisfaction. Consequently, having determined a dependency X → A, the algorithms in [16] consider any other dependency of the form X + Y → A redundant and do not generate it. The association rules we consider are probabilistic in nature. The presence of a rule X → A does not necessarily mean that X + Y → A also holds because the latter may not have minimum support. Similarly, the ....
H. Mannila and K.-J. Raiha. Dependency inference. In Proc. of the VLDB Conference, pages 155--158, Brighton, England, 1987.
....an efficient solution and actual performance results for a problem that clearly has the exponential worst case behavior in number of itemsets. There has been work in the database community on inferring functional dependencies from data, and efficient inference algorithms have been presented in [3], [8]. Functional dependencies are very specific predicate rules while our rules are propositional in nature. Contrary to our framework, the algorithms in [3], [8] consider strict satisfaction of rules. Due to the strict satisfaction, these algorithms take advantage of the implications between rules and ....
....work in the database community on inferring functional dependencies from data, and efficient inference algorithms have been presented in [3], [8]. Functional dependencies are very specific predicate rules while our rules are propositional in nature. Contrary to our framework, the algorithms in [3], [8] consider strict satisfaction of rules. Due to the strict satisfaction, these algorithms take advantage of the implications between rules and do not consider rules that are logically implied by the rules already discovered. That is, having inferred a dependency X → A, any other dependency of the ....
Heikki Mannila and Kari-Jouko Raiha, "Dependency Inference", VLDB-87, Brighton, England, 1987, 155--158.