  Estimating frequency of change (2000) [68 citations — 7 self]

by Junghoo Cho, Hector Garcia-Molina
ACM Transactions on Internet Technology
http://www-db.stanford.edu/pub/papers/cho-freq.ps

Abstract:

Many online data sources are updated autonomously and independently. In this paper, we make the case for estimating the change frequency of the data, both to improve web crawlers and web caches and to help data mining. We first identify various scenarios in which different applications have different requirements on the accuracy of the estimated frequency. We then develop several "frequency estimators" for the identified scenarios. In developing the estimators, we analytically show how precise and effective they are, and we show that the estimators we propose can improve precision significantly.
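As a rough illustration of why the choice of estimator matters, the sketch below assumes that a source changes according to a Poisson process and is polled at regular intervals, so that each visit reveals only whether at least one change occurred since the previous visit. Neither the Poisson assumption nor the two formulas are stated in the abstract; the function names and the 0.5 bias-correction term are illustrative choices, not necessarily the estimators the paper proposes. The naive estimator divides detected changes by elapsed time and undercounts when several changes fall between two visits. The log-based estimator inverts the probability of seeing no change in one interval.

```python
import math
import random


def simulate_detected_changes(rate, interval, n_visits, rng):
    """Count visits at which at least one change is detected.

    Assumes (hypothetically) Poisson changes with the given rate, so the
    chance of seeing a change at any one visit is 1 - exp(-rate * interval).
    """
    p_change = 1.0 - math.exp(-rate * interval)
    return sum(1 for _ in range(n_visits) if rng.random() < p_change)


def naive_estimate(detected, n_visits, interval):
    """Detected changes divided by total monitoring time."""
    return detected / (n_visits * interval)


def log_estimate(detected, n_visits, interval):
    """Invert P(no change in one interval) = exp(-rate * interval).

    The 0.5 terms keep the logarithm finite when every visit detects a
    change; they are an illustrative smoothing choice.
    """
    unchanged = n_visits - detected
    return -math.log((unchanged + 0.5) / (n_visits + 0.5)) / interval


if __name__ == "__main__":
    rng = random.Random(0)
    true_rate, interval, n_visits = 2.0, 1.0, 10_000
    detected = simulate_detected_changes(true_rate, interval, n_visits, rng)
    print("naive:", naive_estimate(detected, n_visits, interval))
    print("log-based:", log_estimate(detected, n_visits, interval))
```

When the source changes faster than it is polled, as in this run with a true rate of two changes per polling interval, the naive estimate cannot exceed one change per interval, while the log-based estimate can recover a rate above that ceiling.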
