Dmitry Pavlov and Padhraic Smyth. Probabilistic query models for transaction data. In KDD '01, 2001.
....anymore. The probe algorithm is quite resistant to the extension of varying topic probabilities: the Sammon maps are remarkably similar to those obtained for the nonvarying probability topic models. Maximum entropy model. We also considered whether the maximum entropy method described in e.g. [16, 15] might be useful in finding topics. The method is used to answer queries about the data as follows: first, one mines frequent sets with some threshold [1, 2] and then finds the maximum entropy distribution [3, 9] consistent with the frequent sets. We performed experiments using simulated data to ....
D. Pavlov and P. Smyth. Probabilistic query models for transaction data. In KDD 2001.
No context found.
D. Pavlov and P. Smyth. Probabilistic query models for transaction data. In Proc. of Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 164--173. ACM Press, 2001.
No context found.
D. Pavlov and P. Smyth. Probabilistic query models for transaction data. In Proceedings of Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 164--173. New York, NY: ACM Press, 2001.
.... Boolean queries [21]. The restriction to sparse binary data sets is motivated by the availability of such data sets in real-world applications (e.g. retail transaction data sets, Web logs); again, however, the methodologies in this paper can be generalized for categorical non-binary data sets [23]. We further assume that the database consists of one table and our task is to estimate selectivity only, i.e. we do not need, for example, to perform JOINs or return the actual records satisfying the query. In this paper we are only interested in estimating the count of the query. historical ....
....12 attributes) demonstrate success of the methodology. The use of interaction models in dependency-based histograms for query selectivity estimation was described by [11]. These ideas are similar to ours, in particular, the maximum entropy Markov random field [16, 21] and Bayesian network models [22, 23]. Nonetheless, this work still uses relatively low-dimensional data cubes (12 dimensions) and does not contain a complete and a systematic investigation of different probabilistic modeling techniques for query approximation. Finally, a nice motivation for using Bayesian networks for the query ....
[Article contains additional citation context not shown here]
D. Y. Pavlov. Probabilistic query models for transaction data. Unpublished PhD Dissertation, University of California, Irvine, http://www.ics.uci.edu/~pavlovd/research.html, 2002.
....12 attributes) demonstrate success of the methodology. The use of interaction models in dependency-based histograms for query selectivity estimation was described by [11]. These ideas are similar to ours, in particular, the maximum entropy Markov random field [16, 21] and Bayesian network models [22, 23]. Nonetheless, this work still uses relatively low-dimensional data cubes (12 dimensions) and does not contain a complete and a systematic investigation of different probabilistic modeling techniques for query approximation. Finally, a nice motivation for using Bayesian networks for the query ....
....k = 1000 attributes and a query on 4 attributes. To calculate the marginal probability of interest on the 4 query variables requires summing out up to 960 other variables. The structure of the graph can of course be leveraged to make this sum tractable, but for large values of k we have found [22, 23] that the resulting graph can end up with quite large cliques making exact inference intractable. Approximate inference can be done using Gibbs sampling but is not particularly well suited to generating approximate query answers in real time given its relatively slow convergence (especially for ....
[Article contains additional citation context not shown here]
D. Pavlov and P. Smyth. Probabilistic query models for transaction data. In Proc. of Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 164-173. ACM Press, 2001.
....rather than as simple conjunctions. The methods can also be extended to arbitrary categorical data (or discretizations of real-valued data) rather than being restricted to binary data alone. Anytime algorithms for approximate query answering are also of significant practical interest. In [28] we present a more detailed description of the framework and its extensions. Acknowledgements: The research described in this paper was supported in part by NSF CAREER award IRI 9703120. ....
D. Y. Pavlov. Probabilistic Query Models for Transaction Data. PhD thesis, Department of Information and Computer Science, University of California, Irvine, 2002.
.... In addition, in all of this prior work the techniques used were only tested on relatively low-dimensional data cubes (10 or fewer dimensions). In our earlier work on this topic we have reported initial results on the use of maximum entropy methods [17, 24], mixture models, and Bayesian networks [26] for online query answering in sparse binary data sets of high dimension (50 dimensions and higher). This present paper studies a new query answering method based on the inclusion-exclusion principle and provides a systematic characterization and empirical evaluation of the performance of a ....
D. Pavlov and P. Smyth. Probabilistic query models for transaction data. In Proceedings of Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD'01), pages 164-173. ACM Press, 2001.
No context found.
Dmitry Pavlov and Padhraic Smyth. Probabilistic query models for transaction data. In KDD '01, 2001.
No context found.
D. Yu. Pavlov. Probabilistic query models for transaction data. Unpublished PhD Dissertation, University of California, Irvine, http://www.ics.uci.edu/pavlovd/research.html, 2002.
No context found.
D. Pavlov and P. Smyth. Probabilistic query models for transaction data. In Knowledge Discovery and Data Mining, pages 164--173, 2001.