S. Bell and P. Brockhausen. Discovery of data dependencies in relational databases. In Y. Kodratoff, G. Nakhaeizadeh, and C. Taylor, editors, Statistics, Machine Learning and Knowledge Discovery in Databases, ML-Net Familiarization Workshop, pages 53-58, 1995.
....dependency states that the value of an attribute is uniquely determined by the values of some other attributes. For example, in an address database, zip code is determined by city and street address. The discovery of functional dependencies from relations has received considerable interest (e.g. [1, 2, 3, 4, 5, 6, 7, 8]). Automated database analysis is, of course, interesting for knowledge discovery and data mining (KDD) purposes. For instance, consider a database of chemical compounds and their outcomes on various bioassays. Discovering that an essential quality, such as carcinogenicity, of a compound depends ....
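The definition quoted in the context above can be made concrete with a small check. The following is a minimal sketch, not code from any of the cited papers: the function name `fd_holds`, the dictionary-based row layout, and the sample address rows are all assumptions made for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the attributes in `lhs` uniquely determine `rhs`.

    `rows` is a list of dicts (one per tuple); this layout is an
    assumption of the sketch, not something fixed by the source.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = row[rhs]
        # The first value seen for a key is remembered; any later row with
        # the same key but a different value violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Made-up address rows, for illustration only.
addresses = [
    {"city": "Springfield", "street": "1 Main St", "zip": "10001"},
    {"city": "Springfield", "street": "9 Elm St", "zip": "10002"},
    {"city": "Shelbyville", "street": "1 Main St", "zip": "20001"},
    {"city": "Springfield", "street": "1 Main St", "zip": "10001"},
]

city_street_determines_zip = fd_holds(addresses, ["city", "street"], "zip")
city_determines_zip = fd_holds(addresses, ["city"], "zip")
```

In this sample, city and street address together determine the zip code, while city alone does not, because the two Springfield streets carry different zip codes.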
....hundreds of thousands of tuples. Dependency discovery tasks that have been reported to take minutes or even hours are solved with the new algorithm in seconds or fractions of a second on a PC. 1.1. Related work. Several algorithms for the discovery of functional dependencies have been presented [1, 3, 5, 6, 12, 13, 14]. We review these algorithms and compare them with our method in Section 5.3. The complexity of discovering functional dependencies has been studied in [2, 12, 15]. Approximate functional dependencies have been considered in [7, 8, 16, 17]. Kivinen and Mannila [16] define several measures for the ....
Bell, S. and Brockhausen, P. (1995) Discovery of Data Dependencies in Relational Databases. Technical Report LS8 Report-14, University of Dortmund.
.... concepts can be formulated as logic programs (e.g. database states, views, integrity constraints, implicit data, schema definitions) [8, 10, 24] there has been interest in using ILP techniques to (semi-)automate design tasks such as the formulation of views, integrity constraints, and others [5, 6], using stored data as, and perhaps additionally complementing it with, examples. However, just as the database research community has overlooked the potential for ILP, the ILP research community has also not attempted to exploit fully many results from DDB research. The main aim of this paper ....
S. Bell and P. Brockhausen. Discovery of Data Dependencies in Relational Databases. Technical Report LS-8 Report 14, Computer Science Department, University of Dortmund, 1995.
.... concepts can be formulated as logic programs (e.g. database states, views, integrity constraints, implicit data, schema definitions) [6, 7, 15] there has been interest in using ILP techniques to (semi-)automate design tasks such as the formulation of views, integrity constraints, and others [3, 4], using stored data as, and perhaps additionally complementing it with, examples. This paper indicates how different logic programming technologies can underpin an architecture for KIDM in which the problem of knowledge creation is addressed in a well-founded and systematic manner. The paper does ....
S. Bell and P. Brockhausen. Discovery of Data Dependencies in Relational Databases. Technical Report LS-8 Report 14, Computer Science Department, University of Dortmund, 1995.
....Functional dependencies are relationships between attributes of a relation: a functional dependency states that the value of an attribute is uniquely determined by the values of some other attributes. The discovery of functional dependencies from relations has received considerable interest (e.g. [3, 13, 21, 23, 14, 2, 8, 4]). Automated database analysis is, of course, interesting for knowledge discovery and data mining (KDD) purposes, and functional dependencies have applications in the areas of database management, reverse engineering [18, 24] and query optimization [25]. Formally, a functional dependency over a ....
....dependencies with a small left-hand side, i.e. from the ones that are not very likely to hold. The algorithm then works towards larger and larger dependencies, until the minimal dependencies that hold are found. The levelwise approach has been used for discovering functional dependencies before [22, 2], but the partition-based method for validity testing combined with better pruning rules and more efficient implementation makes the new algorithm much faster. The worst-case time complexity of the algorithm with respect to the number of attributes is exponential, but this is inevitable since ....
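The levelwise idea described in the context above can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of the citing paper: it omits that paper's pruning rules and efficient partition handling, ignores dependencies with an empty left-hand side, and recomputes partition sizes from scratch. The names `partition_size` and `minimal_fds` and the sample rows are hypothetical. The validity test relies on the fact that X -> A holds exactly when grouping rows by X yields as many groups as grouping them by X together with A.

```python
from itertools import combinations


def partition_size(rows, attrs):
    """Number of groups of rows that agree on all attributes in `attrs`."""
    return len({tuple(row[a] for a in attrs) for row in rows})


def minimal_fds(rows, attributes):
    """Levelwise search: try left-hand sides of size 1, then 2, and so on.

    Returns a list of (lhs, rhs) pairs where lhs is a tuple of attribute
    names and no proper subset of lhs already determines rhs.
    """
    found = []
    for size in range(1, len(attributes)):
        for lhs in combinations(attributes, size):
            for rhs in attributes:
                if rhs in lhs:
                    continue  # trivial dependency
                # Not minimal if a smaller left-hand side already works.
                if any(a == rhs and set(x) <= set(lhs) for x, a in found):
                    continue
                # X -> A holds iff adding A does not split any group of X.
                if partition_size(rows, lhs) == partition_size(rows, lhs + (rhs,)):
                    found.append((lhs, rhs))
    return found


# Made-up relation in which no single attribute determines another,
# but every pair of attributes determines the third.
rows = [
    {"a": 1, "b": 1, "c": 1},
    {"a": 1, "b": 2, "c": 2},
    {"a": 2, "b": 1, "c": 2},
    {"a": 2, "b": 2, "c": 1},
]

result = minimal_fds(rows, ["a", "b", "c"])
```

The sketch still enumerates an exponential number of candidate left-hand sides in the worst case, which matches the complexity remark in the quoted context.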
S. Bell and P. Brockhausen. Discovery of data dependencies in relational databases. Tech. Rep. LS-8 Report-14, University of Dortmund, Apr. 1995.
....Functional dependencies are relationships between attributes of a relation: a functional dependency states that the value of an attribute is uniquely determined by the values of some other attributes. The discovery of functional dependencies from relations has received considerable interest (e.g. [2, 10, 17, 19, 11, 1, 6, 3]). Automated database analysis is, of course, interesting for knowledge discovery and data mining (KDD) purposes, and functional dependencies have applications in the areas of database management, reverse engineering [14, 20] and query optimization [21]. Formally, a functional dependency over a ....
....with even hundreds of thousands of rows. Dependency discovery tasks that have been reported to take minutes or even hours are solved with the new algorithm in seconds or fractions of a second on a PC. Related work. Several algorithms for the discovery of functional dependencies have been presented [7, 2, 9, 18, 17, 11, 1]. We review these algorithms and compare them with our method in Section 6. The complexity of discovering functional dependencies has been studied in [8, 10, 9]. Approximate functional dependencies have been considered in [5, 15, 6, 3]. Kivinen and Mannila [5] define several measures for the error ....
S. Bell and P. Brockhausen. Discovery of data dependencies in relational databases. Tech. Rep. LS-8 Report-14, University of Dortmund, Apr. 1995.
....a user-defined threshold. The reliability measure is supposed to measure the functional degree of the map given subsequent data. As we have argued in [Pfahringer & Kramer, 1995], this measure does not avoid overfitting the data, since it does not have a penalty for overly complex dependencies. [Bell & Brockhausen, 1995] hint at a simple modification of their functional dependency search algorithm to cope with noise. But their modification can only take into account the number of errors in the projection, whereas a reliable estimate would need to assess the global number of errors. [Kivinen & Mannila, 1995] ....
Bell S., Brockhausen P.: Discovery of Data Dependencies in Relational Databases, FB Informatik, Universitaet Dortmund, LS-8 Report 14, 1995.
....dependences, in a relational database. The computation of maximal frequent sets is a fundamental data mining problem which is required in discovering association rules [1, 2]. Minimal keys can be used for semantic query optimization, which leads to fast query processing in database systems [22, 18, 5, 26]. Here we refer to possible keys that exist in a specific instance of a relational database and are not designed as such. The computation of sentences of a theory is an enumeration problem. The computation involves listing combinatorial substructures related to the input. For enumeration ....
S. Bell and P. Brockhausen. Discovery of data dependencies in relational databases. Technical Report LS-8 14, Universität Dortmund, Fachbereich Informatik, Lehrstuhl VIII, Künstliche Intelligenz, 1995.
....files, databases, or spreadsheets. Studies have shown that data is an invaluable resource for extracting business knowledge [23] as well as systems design information [26]. Research in constraint discovery has also confirmed that data is a reliable source for recovering database constraints [3, 5, 9, 12, 13, 20, 21]. There might be situations where data is the only resource available for design recovery. However, there is no research dedicated to that scenario. Our focus in this part of the information systems reverse engineering research is on extraction of design constraints from data for design recovery ....
....is on extraction of design constraints from data for design recovery purposes. The discovery of functional dependencies (FDs) is a fundamental activity in the database design recovery process. Many techniques and algorithms for discovering functional dependencies from the data have been proposed [3, 5, 9, 12, 13, 20, 21]. These techniques and algorithms share a common goal: generate a minimum number of functional dependency (FD) hypotheses and verify them against the database. These algorithms suffer from a common drawback: the performance of the algorithm deteriorates when the number of attributes and/or the size ....
Siegfried Bell, Peter Brockhausen, Discovery of Data Dependencies in Relational Databases, University of Dortmund, Computer Science Department, LS-8 Report 14, Apr. 1995.