Incrementally Maintaining Classification using an RDBMS

by M. Levent Koc, Christopher Ré
Citations: 4 - 0 self

Citations

734 Support vector machine active learning with applications to text classification - Tong, Koller

Citation Context

...histicated statistical frameworks have been developed such as Factor Graphs [33], and the Monte Carlo Database [15]. An interesting related, but orthogonal, problem is active learning [32,35], where the goal is to leverage the user feedback to interactively build a model. Technically, our goal is to solicit feedback (which can dramatically help improve the model). In fact one of our initi...

312 Sprint: A scalable parallel classifier for data mining - Shafer, Agrawal, et al. - 1996

Citation Context

...ntary problem of incrementally maintaining the output of the models as the underlying data change. There has been work on maintaining data mining models incrementally, notably association rule mining [29, 34, 36, 37], but not for the general class of linear classifiers. Researchers have considered scaling machine learning tool kits that contain many more algorithms than we discuss here, e.g., WekaDB [38]. These a...

258 Random features for large-scale kernel machines - Rahimi, Recht - 2007

Citation Context

...r techniques from the body of the paper by transforming these kernels to (low) dimensional linear spaces. The idea is based on a technique of Rahimi and Recht called random non-linear feature vectors [30]. Suppose that all vectors are in S^d, the unit ball in d dimensions (any compact set will do). The idea is to find a (random) map z : S^d → R^D for some D such that for x, y ∈ S^d we have z(x)^T z(y...
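
The random map z described in this context can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the kernel being approximated is the Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2)), which the truncated excerpt does not state, and the names `make_random_feature_map`, `dot`, and `gaussian_kernel` are hypothetical helpers introduced here.

```python
import math
import random


def make_random_feature_map(d, D, sigma=1.0, seed=0):
    """Build a random map z: R^d -> R^D (assumed Gaussian kernel).

    The inner product z(x)^T z(y) approximates
    exp(-||x - y||^2 / (2 * sigma^2)) as D grows.
    """
    rng = random.Random(seed)
    # Frequencies sampled from N(0, 1/sigma^2), the Fourier transform
    # of the Gaussian kernel.
    W = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(d)] for _ in range(D)]
    # Random phases, uniform on [0, 2*pi].
    b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(D)]
    scale = math.sqrt(2.0 / D)

    def z(x):
        return [
            scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + bj)
            for w, bj in zip(W, b)
        ]

    return z


def dot(u, v):
    """Plain inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def gaussian_kernel(x, y, sigma=1.0):
    """Exact Gaussian kernel value, for comparison with dot(z(x), z(y))."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Two points inside the unit ball in d = 3 dimensions.
x = [0.3, -0.2, 0.1]
y = [0.1, 0.4, -0.3]
z = make_random_feature_map(d=3, D=4000)
approx = dot(z(x), z(y))
exact = gaussian_kernel(x, y)
```

Once the points are mapped through z, a linear classifier over z(x) stands in for a kernel classifier over x, which is the sense in which the kernel is transformed to a linear space. The approximation error shrinks roughly as 1/sqrt(D).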

242 Interactive Deduplication Using Active Learning - Sarawagi, Bhamidipaty - 2002

Citation Context

...histicated statistical frameworks have been developed such as Factor Graphs [33], and the Monte Carlo Database [15]. An interesting related, but orthogonal, problem is active learning [32,35], where the goal is to leverage the user feedback to interactively build a model. Technically, our goal is to solicit feedback (which can dramatically help improve the model). In fact one of our initi...

225 Maintenance of discovered association rules in large databases: an incremental updating approach - Cheung, Han, et al. - 1996

Citation Context

...ntary problem of incrementally maintaining the output of the models as the underlying data change. There has been work on maintaining data mining models incrementally, notably association rule mining [29, 34, 36, 37], but not for the general class of linear classifiers. Researchers have considered scaling machine learning tool kits that contain many more algorithms than we discuss here, e.g., WekaDB [38]. These a...

37 Mining frequent itemsets in evolving databases - Veloso, Carvalho, et al. - 2002

Citation Context

...ntary problem of incrementally maintaining the output of the models as the underlying data change. There has been work on maintaining data mining models incrementally, notably association rule mining [29, 34, 36, 37], but not for the general class of linear classifiers. Researchers have considered scaling machine learning tool kits that contain many more algorithms than we discuss here, e.g., WekaDB [38]. These a...

30 PrDB: managing and exploiting rich correlations in probabilistic databases - Sen, Deshpande, et al.

Citation Context

... imprecise; this observation has motivated research to treat this data as probabilistic databases [1, 2, 10] and several sophisticated statistical frameworks have been developed such as Factor Graphs [33], and the Monte Carlo Database [15]. An interesting related, but orthogonal, problem is active learning [32,35], where the goal is to leverage the user feedback to interactively build ...

28 TF-ICF: A new term weighting scheme for clustering dynamic data streams - Reed, Jiao, et al. - 2006

Citation Context

...needed from the corpus. A final example is TFICF (term frequency inverse corpus frequency) in which the term frequencies are obtained from a corpus, but explicitly not updated after each new document [31]. In general, an application will know what feature functions are appropriate, and so we design Hazy to be extensible in this regard, but we expect that the administrator (or a similar expert) writes ...

14 Efficient evaluation of queries with mining predicates - Chaudhuri, Narasayya, et al. - 2002

Citation Context

...act one of our initial motivations behind the hybrid approach is to allow active learning over large data sets. There has been research on how to optimize the queries on top of data mining predicates [28]; we solve the complementary problem of incrementally maintaining the output of the models as the underlying data change. There has been work on maintaining data mining models incrementally, notably a...

8 Efficiently mining approximate models of associations in evolving databases - Veloso, Gusmao, et al. - 2002

Citation Context

...ntary problem of incrementally maintaining the output of the models as the underlying data change. There has been work on maintaining data mining models incrementally, notably association rule mining [29, 34, 36, 37], but not for the general class of linear classifiers. Researchers have considered scaling machine learning tool kits that contain many more algorithms than we discuss here, e.g., WekaDB [38]. These a...

5 Data mining using relational database management systems - Zou, Kemme, et al. - 2006

Citation Context

..., 34, 36, 37], but not for the general class of linear classifiers. Researchers have considered scaling machine learning tool kits that contain many more algorithms than we discuss here, e.g., WekaDB [38]. These approaches may result in lower performance than state-of-the-art approaches. In contrast, our goal is to take advantage of incremental facilities to achieve higher levels of performance. E. REF...
