| N. Ramakrishnan and A. Y. Grama. Data Mining: From Serendipity to Science. IEEE Computer, 32(8):34--37, 1999. |
....not based on sampling and access every item in the full data set. However, this does not imply that sampling is unimportant in data mining, in fact, it is often the only way to deal with very large data sets. The different disciplines are also reflected in different goals of data mining techniques [65]: 1. Induction. Find general patterns and equations which characterise the data. 2. Compression. Reduce the complexity of the data, replace by simpler concepts 3. Querying. Find better ways to query data, i.e. extract information and properties. 4. Approximation. Find models which ....
....concepts 3. Querying. Find better ways to query data, i.e. extract information and properties. 4. Approximation. Find models which approximate the observations well. 5. Search. Look for recurring patterns. Furthermore, one may classify data mining algorithms based on three key elements [65]: Model representation. Decision trees, regression functions and associations. Data. Continuous, time series, discrete, labeled, multimedia or nominal data. Applications domains. Finance and economy, biology, web logs, web text mining. The discovered patterns can be characterised ....
[Article contains additional citation context not shown here]
N. Ramakrishnan and A.Y. Grama. Data mining: From serendipity to science. Computer, 32(8):34--37, August 1999.
.... emergence as a large, distributed data repository and the realization that on line transaction databases can be analyzed for commercial gains, data mining in large databases has attracted wide interest from both academia and the industry, and in the meanwhile, has also uncovered new challenges [22]. Data mining has its distinctive goal from related fields such as machine learning, databases and statistics, and accordingly requires distinctive tools. Which types of tools do we need for practical data mining development Yu. For a practical data mining tool, just having advanced algorithms ....
Ramakrishnan, N. and Grama, A.Y., Data Mining: From Serendipity to Science. IEEE Computer, 32, 1999, 8: 34-37.
....analysis, where transactions are records representing point of sale data, while items represent products on sale. The importance for marketing decisions of association rules like "80% of customers which buy products x1 and x2 also buy y" is intuitive, and explains the strong interest for ARM [10, 11, 19]. Given a database of transactions D, an association rule has the form X ⇒ Y, where X and Y are sets of items (itemsets) and X ∩ Y = ∅. A rule X ⇒ Y holds in D with a minimum confidence c and a minimum support s, if at least c% of all the transactions containing X also contain Y, and X ∪ Y is ....
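The rule definition in the excerpt above can be illustrated with a minimal Python sketch. The excerpt is cut off before it finishes defining support, so the sketch assumes the standard definition: the fraction of all transactions that contain X ∪ Y. The function names and the fractional (0 to 1) thresholds are illustrative choices, not taken from the citing paper.

```python
from typing import Iterable, Set, Tuple


def rule_support_confidence(
    transactions: Iterable[Set[str]], x: Set[str], y: Set[str]
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule X => Y as fractions in [0, 1]."""
    x, y = frozenset(x), frozenset(y)
    if x & y:
        # The definition requires X and Y to be disjoint itemsets.
        raise ValueError("X and Y must be disjoint")
    rows = [frozenset(t) for t in transactions]
    n_x = sum(1 for t in rows if x <= t)          # transactions containing X
    n_xy = sum(1 for t in rows if (x | y) <= t)   # transactions containing X and Y
    # Assumed (standard) definition: support is the share of all transactions
    # that contain every item of X union Y.
    support = n_xy / len(rows) if rows else 0.0
    confidence = n_xy / n_x if n_x else 0.0
    return support, confidence


def rule_holds(transactions, x, y, min_support: float, min_confidence: float) -> bool:
    """True if X => Y meets both thresholds (given as fractions, not percentages)."""
    support, confidence = rule_support_confidence(transactions, x, y)
    return support >= min_support and confidence >= min_confidence
```

For example, with five transactions of which three contain {x1, x2} and two of those also contain y, the rule {x1, x2} ⇒ {y} has support 2/5 and confidence 2/3.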
N. Ramakrishnan and A. Y. Grama. Data Mining: From Serendipity to Science. IEEE Computer, 32(8):34-37, 1999.
....outperforms the others previously proposed. Our test bed was a Pentium based Linux workstation, while the datasets used for tests were synthetically generated. 1 Introduction The Frequent Set Counting (FSC) [1] problem has been extensively studied as a method of unsupervised Data Mining [6, 7, 12] for discovering all the subsets of items (or attributes) that frequently occur in the transactions of a given database. Knowledge on the frequent sets is generally used to extract Association Rules stating how a subset of items influences the presence of another itemset in the transaction ....
N. Ramakrishnan and A. Y. Grama. Data Mining: From Serendipity to Science. IEEE Computer, 32(8):34-37, 1999.
....support, and the exploitation of effective pruning techniques which significantly reduce the size of the dataset as execution progresses. 1 Introduction The Frequent Set Counting (FSC) [1,2,3,4,6,11,12,13,15,17,19] problem has been extensively studied as a method of unsupervised Data Mining [8,9,16] for discovering all the subsets of items (or attributes) that frequently occur in the transactions of a given database. Knowledge of the frequent sets is generally used to extract Association Rules stating how a set of items (itemset) influences the presence of another itemset in the transaction ....
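To make the Frequent Set Counting problem concrete, the following is a naive level-wise sketch in Python. It is not the algorithm of either citing paper (their pruning techniques are not shown in the excerpts); it simply counts every k-item subset of every transaction and keeps those meeting an absolute support threshold. The names `frequent_itemsets`, `min_count`, and `max_size` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Set


def frequent_itemsets(
    transactions: Iterable[Set[str]], min_count: int, max_size: int = 3
) -> Dict[FrozenSet[str], int]:
    """Map each itemset occurring in at least `min_count` transactions to its count."""
    rows = [frozenset(t) for t in transactions]
    frequent: Dict[FrozenSet[str], int] = {}
    for k in range(1, max_size + 1):
        counts: Counter = Counter()
        for t in rows:
            # Enumerate every k-item subset of the transaction (brute force).
            for combo in combinations(sorted(t), k):
                counts[frozenset(combo)] += 1
        level = {s: c for s, c in counts.items() if c >= min_count}
        if not level:
            # Every subset of a frequent set is itself frequent, so an empty
            # level means no larger frequent itemsets can exist.
            break
        frequent.update(level)
    return frequent
```

The brute-force enumeration grows quickly with transaction length, which is the cost that candidate pruning in practical FSC algorithms is meant to reduce.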
N. Ramakrishnan and A. Y. Grama. Data Mining: From Serendipity to Science. IEEE Computer, 32(8):34--37, 1999.
....the entire process is usually repeated for different data sets. Discovered patterns are usually represented using a certain well known knowledge representation technique, including inference rules (If Then rules), decision trees, tables, diagrams, images, analytical expressions, and so on [1], [24]. If Then rules are the most frequently used technique [12], [21]. The following example is a pattern from financial domain, discovered by applying the MKS system described in [3]: IF Home Loan = Yes THEN Post Code = POST RURAL and Gender = MALE and Marital Status = MARRIED and Access Card = ....
....Data Mining Tasks DM tasks depend on the kind of knowledge that the KDD DM system looks for. Each DM task has its specific features and follows specific steps in the discovery process. The following DM tasks are among the most frequently used ones in nowadays KDD DM application [1], [11], [21], [24], [26]. Classification. The task is to discover whether an item from the database belongs to one of some previously defined classes. The main problem, however, is how to define classes. In practice, classes are often defined using specific values of certain fields in the data records or some ....
[Article contains additional citation context not shown here]
Ramakrishnan, N., Grama, A.Y., Data mining: from serendipity to science. IEEE Computer 32, August 1999., pp. 34-37.
.... emergence as a large, distributed data repository and the realization that on line transaction databases can be analyzed for commercial gains, data mining in large databases has attracted wide interest from both academia and the industry, and in the meanwhile, has also uncovered new challenges [20]. Data mining has its distinctive goal from related fields such as machine learning, databases and statistics, and accordingly requires distinctive tools. An intelligent learning database system is one of such tools to implement automatic knowledge acquisition from databases. Acknowledgements ....
N. Ramakrishnan and A.Y. Grama, Data Mining: From Serendipity to Science, IEEE Computer, 32(1999), 8: 34--37.
.... lack of a sophisticated conceptual model for personalization constitutes a serious bottleneck from a designer's perspective (and indirectly, to users). Compare and contrast this with the availability of powerful models for other aspects of online information access, such as querying [25], mining [1, 61], and navigation [67, 69]. Example 1: We motivate our ideas with a simple example involving personalization using link labels. In this paper, we specifically concentrate on web site personalization. Extension of our proposed methodologies to other domains of personalization are addressed toward ....
....individual pages and documents. Such techniques can be included in our methodology if the problem domain requires the modeling of information at this granularity. Clustering and Topical Compression: A final variety of schema compression involves using clustering techniques and approximations [53, 61] to collapse together types and labels that satisfy, say, a certain distance metric. Various suggestions based on geometrical considerations are provided in [53]. The BEV case study, discussed in [60], shows that for some domains, it is acceptable (and even desirable) to be less strict in ....
N. Ramakrishnan and A.Y. Grama. Data Mining: From Serendipity to Science. IEEE Computer, Vol. 32(8):pp. 34--37, August 1999.
.... spaces and good paradigms of human computer interaction [Valdes Perez, 1999b]. It thus cannot be overemphasized that data mining processes are fundamentally iterative and interactive [Hellerstein et al. 1999]. Systems that enable such interaction and perform integrated exploration and mining [Ramakrishnan and Grama, 1999] can thus improve performance in the long run, while sacrificing some exploration time in the short term. 3.2 Best Practices One of the key issues to be addressed prior to data mining is the decision on the right mechanism for representation. For example, while a simple structure might exist, ....
Ramakrishnan, N. and Grama, A. (1999). Data Mining: From Serendipity to Science (Guest Editors' Introduction to the Special Issue on Data Mining). IEEE Computer, Vol. 32(8):pp. 34--37.
....equipment used for this work was supported by National Science Foundation MRI grant EIA 9871053 and by the Intel Corp. 1 1 Introduction and Motivation With the availability of large online databases, there has been significant interest in mining this data for underlying patterns of interest [6, 4]. Despite the strong information theoretic underpinnings of such tasks, relatively little work has been done on adapting information theoretic results and techniques to data mining and analysis with some noted exceptions. In this paper, we address the problem of summarizing transactions with a view ....
Naren Ramakrishnan and Ananth Grama. Data mining: From serendipity to science. IEEE Computer Special Issue on Data Mining, August 1999.
No context found.
N. Ramakrishnan and A. Y. Grama. Data Mining: From Serendipity to Science. IEEE Computer, 32(8):34--37, 1999.
No context found.
N. Ramakrishnan and A. Y. Grama. Data Mining: From Serendipity to Science. IEEE Computer, 32(8):34--37, 1999.
No context found.
Ramakrishnan, N. and Grama, A.Y., "Data Mining: From Serendipity to Science", IEEE Computer, Vol. 32, No.8, August 1999, pp. 34-37.
No context found.
N. Ramakrishnan and A.Y. Grama. Data mining: From serendipity to science. Computer, 32(8):34--37, August 1999.
No context found.
N. Ramakrishnan and A. Y. Grama. Data Mining: From Serendipity to Science. IEEE Computer, 32(8):34--37, 1999.
No context found.
N. Ramakrishnan and A. Y. Grama, "Data Mining: From Serendipity to Science," IEEE Computer Special Issue on Data Mining 32(8), pp. 34-37, 1999.