Results 1 - 4 of 4
Efficient substructure discovery from large semi-structured data
2002. Cited by 87 (9 self)
With the rapid progress of network and storage technologies, a huge amount of electronic data such as Web pages and XML data [23] has become available on intranets and the Internet. These electronic data are heterogeneous collections of ill-structured data that have no rigid structure, and are often called semi-structured data [1]. Hence, there have been
Discovering All Most Specific Sentences
ACM Transactions on Database Systems, 2003. Cited by 39 (2 self)
In this article, we show how the problems of finding frequent sets in relations and of finding minimal keys in databases can be reduced to this formulation. Using this theory extraction formulation [Mannila 1995, 1996; Mannila and Toivonen 1997], one can formulate general results about the complexity of algorithms for these data mining tasks
Mining Association Rules in Entity-Relationship Modeled Databases
2001. Cited by 5 (0 self)
Current data mining algorithms handle databases consisting of a single table. This paper addresses the problem of mining association rules in databases consisting of multiple tables and designed using the entity-relationship model. We discuss previous approaches to this problem and point out some unaddressed issues, and we present a couple of algorithms to address these issues and experimental results showing the scalability of these algorithms with respect to the increase in size of the database. The paper concludes with a discussion of the possibility of extending our algorithms to database schemas more complex than a star schema.
Keywords: entity-relationship model, itemset, association rule, entity support, join support
A Tight Upper Bound on the Number of Candidate Patterns
2001. Cited by 4 (1 self)
In the context of mining for frequent patterns using the standard levelwise algorithm, the following question arises: given the current level and the current set of frequent patterns, what is the maximal number of candidate patterns that can be generated on the next level? We answer this question by providing a tight upper bound, derived from a combinatorial result from the sixties by Kruskal and Katona. Our result is useful for reducing the number of database scans.
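As a rough illustration of the kind of bound this abstract describes, the sketch below applies the counting form of the Kruskal-Katona theorem to itemsets. It assumes patterns are itemsets and that a (k+1)-itemset is a candidate only when all of its k-subsets are frequent; the function names `cascade` and `max_candidates` are hypothetical, and this is not taken from the paper itself, which may state its bound differently.

```python
from math import comb


def cascade(m, k):
    """Return the k-cascade of m as a list of (n_i, i) pairs.

    The pairs satisfy m = sum of C(n_i, i), with i running k, k-1, ...
    and each n_i chosen greedily as the largest n with C(n, i) <= remainder.
    """
    terms = []
    i = k
    while m > 0 and i >= 1:
        n = i  # C(i, i) = 1 <= m always holds here
        while comb(n + 1, i) <= m:
            n += 1
        terms.append((n, i))
        m -= comb(n, i)
        i -= 1
    return terms


def max_candidates(num_frequent, k):
    """Upper bound on (k+1)-candidates given num_frequent frequent k-itemsets.

    Each cascade term C(n_i, i) contributes C(n_i, i + 1) candidates.
    """
    return sum(comb(n, i + 1) for n, i in cascade(num_frequent, k))


if __name__ == "__main__":
    # 5 frequent pairs can yield at most 2 candidate triples.
    print(max_candidates(5, 2))
```

For example, five frequent 2-itemsets decompose as C(3,2) + C(2,1), so at most C(3,3) + C(2,2) = 2 candidate 3-itemsets can exist. A levelwise miner could compare such a bound against available memory before deciding whether to generate several levels of candidates within one database scan.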

