Abstract: In this paper we propose a scaling-up method that is applicable
to essentially any induction algorithm based on discrete
search. The result of applying the method to an algorithm
is that its running time becomes independent of the size of
the database, while the decisions made are essentially identical
to those that would be made given infinite data. The
method works within pre-specified memory limits and, as
long as the data is iid, only requires accessing it sequentially.
It gives anytime...
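As a rough illustration of how running time can become independent of database size, the sketch below uses the probability inequality of Hoeffding (1963) to choose between two candidate search steps after reading only as many examples as the bound requires. The abstract does not spell out the procedure, so the function names (hoeffding_epsilon, choose_between), the two-candidate setup, and the default delta are assumptions made for illustration, not the authors' algorithm. The sketch also checks the bound after every example without correcting for repeated testing.

```python
import math


def hoeffding_epsilon(value_range, delta, n):
    """Hoeffding bound: with probability at least 1 - delta, the mean of n
    iid samples of a variable spanning value_range lies within the returned
    epsilon of its true mean (one-sided)."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def choose_between(score_a, score_b, stream, delta=1e-6, score_range=1.0):
    """Hypothetical helper: read examples sequentially and stop as soon as
    the observed mean score difference exceeds the Hoeffding epsilon.

    score_a, score_b: functions mapping an example to a score in
    [0, score_range]. Returns (winner, examples_read).
    """
    diff_sum = 0.0
    n = 0
    for example in stream:
        n += 1
        diff_sum += score_a(example) - score_b(example)
        # The per-example difference lies in [-score_range, score_range],
        # so its range is twice the range of a single score.
        eps = hoeffding_epsilon(2.0 * score_range, delta, n)
        if abs(diff_sum / n) > eps:
            break  # decision is settled; the rest of the data is never read
    return ("a" if diff_sum >= 0 else "b"), n
```

The number of examples read depends on the gap between the two candidates and on delta, not on how long the stream is, which is the sense in which the cost stops growing with the database.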
Context of citations to this paper:
... This is based on our previous work in applying subsampling techniques to propositional learners [Domingos and Hulten, 2000; Hulten and Domingos, 2002]. Beyond this, we envisage that intelligent control of which tuples a learner looks at, and which join paths it pursues,...
...continuously generate data, and the distribution of this data often changes drastically as time goes by. In previous work [Hulten and Domingos, 2002] we developed a framework capable of semi-automatically scaling up a wide class of propositional learning algorithms to address...
Cited by:
Tractable Learning of Large Bayes Net Structures from Sparse.. - Anna Goldenberg Anya
Learning in First-Order Probabilistic Representations - Matthew Richardson Ph (2003)
A General Framework for Mining Massive Data Stream - Domingos, Hulten (2003)
Active bibliography (related documents):
0.5: A Scalable Constant-Memory Sampling Algorithm for Pattern.. - Scheffer, Wrobel (2002)
0.5: Speeding up k-means Clustering by Bootstrap Averaging - Ian Davidson And
0.2: Sequential Sampling Algorithms: Unified Analysis and Lower.. - Gavalda, Watanabe (2001)
Similar documents based on text:
0.5: Mining High-Speed Data Streams - Domingos, Hulten (2000)
0.3: Mining Time-Changing Data Streams - Hulten, Spencer, Domingos (2001)
0.2: A General Method for Scaling Up Machine Learning Algorithms.. - Domingos, Hulten (2001)
Related documents from co-citation:
4: Mining High-Speed Data Streams - Domingos, Hulten (2000)
4: Learning probabilistic relational models - Getoor, Friedman et al. (1999)
3: sparse candidate - Friedman, Nachman et al. (1999)
BibTeX entry:
G. Hulten and P. Domingos. Mining complex models from arbitrarily large databases in constant time. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 525--531, Edmonton, Alberta, Canada, 2002. ACM Press. http://citeseer.ist.psu.edu/hulten02mining.html
@inproceedings{ hulten02mining,
author = "G. Hulten and P. Domingos",
title = "Mining complex models from arbitrarily large databases in constant time",
booktitle = "Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",
pages = "525--531",
address = "Edmonton, Alberta, Canada",
publisher = "ACM Press",
year = "2002",
url = "http://citeseer.ist.psu.edu/hulten02mining.html" }
Citations (may not include all citations):
351: Learning Bayesian networks: The combination of knowledge and.. - Heckerman, Geiger et al. (1995)
157: Probability inequalities for sums of bounded random variable.. - Hoeffding (1963)
92: Mining high-speed data streams - Domingos, Hulten (2000)
91: Improving generalization with active learning - Cohn, Atlas et al. (1994)
81: Sequential analysis - Wald (1947)
53: Mining time-changing data streams - Hulten, Spencer et al. (2001)
27: the sample complexity of learning Bayesian networks - Friedman, Yakhini (1996)
23: PALO: A probabilistic hill-climbing algorithm - Greiner (1996)
21: Adaptive sampling methods for scaling up knowledge discovery.. - Domingo, Gavalda et al. (2002)
17: sparse candidate - Friedman, Nachman et al. (1999)
12: Hoeffding races: Accelerating model selection search for classifi.. - Maron, Moore (1994)
11: KDD-Cup - Kohavi, Brodley et al. (2000)
8: Efficient progressive sampling - Provost, Jensen et al. (1999)
7: Learning from infinite data in finite time - Domingos, Hulten (2002)
3: Incremental maximization of non-instance-averaging utility f.. - Scheffer, Wrobel (2001)
2: The learning-curve method applied to model-based clustering - Meek, Thiesson et al. (2002)
1: A general method for scaling up learning algorithms and its .. - Hulten, Domingos (2002)
Documents on the same site (http://www.cse.unsw.edu.au/~qzhang/papers/):
Single-shot Detection of Multiple Categories of Text using.. - Ueda, Saito (2002)
Relational Markov Models and their Application to Adaptive.. - Anderson, Domingos (2002)
A Model for Discovering Customer Value for E-Content - Jagannathan, Nayak.. (2002)
CiteSeer.IST - Copyright Penn State and NEC