| GRAY, J., AND GRAEFE, G. The Five-Minute Rule Ten Years Later, and Other Computer Storage Rules of Thumb. SIGMOD Record 26, 4 (1997). |
....which has poor cache performance. Another problem is that in order to insert an entry into a sorted array, half of the page (on average) must be copied to make room for the new entry. To make matters worse, the optimal disk page size for B Trees is increasing with disk technology trends [12, 15], making the above problems even more serious in the future. Techniques for improving B Tree cache performance. One approach that was briefly mentioned by Lomet [16] is micro indexing, which is illustrated in Figure 4. The idea behind micro indexing is that the first key of every cache line in ....
....cache performance only, there is an optimal in page node size, determined by memory system parameters and key and pointer sizes [6]. Ideally, in page trees based on this optimal size fit tightly within a page. However, the optimal page size is determined by I/O parameters and disk and memory prices [12, 15]. Thus there is likely a mismatch between the two sizes, as depicted in Figure 6. Figure 6(a) shows an overflow scenario in which a two level tree with cache optimal node sizes fails to fit within the page. Figure 6(b) shows an underflow scenario in which a two level tree with cache optimal node sizes ....
J. Gray and G. Graefe. The Five-Minute Rule Ten Years Later. ACM SIGMOD Record, 26(4):63-68, Dec. 1997.
....up by the server. This optimization helps clients improve their response times. 2.2 Client and Server Cache Model Conventional caching and prefetching strategies are typically page based since the optimal unit of transfer between systems resources are pages with sizes ranging from 8 KB to 32 KB [GG97]. In mobile data delivery networks caching and prefetching data on a coarse granularity such as pages is inefficient due to the physical constraints and characteristics of the mobile environment. As mentioned before, the communication in client server direction is handicapped by low bandwidth ....
....data objects. To limit the memory size allocated for historical reference information, [OOW93] suggests storing that information only for a limited period of time after the reference had been recorded. As a reasonable rule of thumb for the length of this period they use the Five Minute Rule [GG97]. However, applying it in a mobile environment may be inappropriate for the following reason: a time-based approach for keeping reference information ignores the available cache size and reference behavior of the client. For example, if a client operates in disconnected mode due to a lack of ....
J. Gray, G. Graefe. The Five-Minute Rule Ten Years Later, and Other Computer Storage Rules of Thumb. SIGMOD Record 26(4), pages 63-68, 1997.
....a search, which has poor cache performance. Another problem is that in order to insert an entry into a sorted array, half of the page (on average) must be copied to make room for the new entry. To make matters worse, the optimal disk page size for B Trees is increasing with disk technology trends [12, 15], making the above problems even more serious in the future. Techniques for improving B Tree cache performance. One approach that was briefly mentioned by Lomet [16] is micro indexing, which is illustrated in Figure 4. The idea behind micro indexing is that the first key of every cache line in ....
....performance only, there is an optimal in page node size, determined by memory system parameters and key and pointer sizes [6]. Ideally, in page trees based on this optimal size fit tightly within a page. However, the optimal page size is determined by I/O parameters and disk and memory prices [12, 15]. Thus there is likely a mismatch between the two sizes, as depicted in Figure 6. Figure 6(a) shows an overflow scenario in which a two level tree with cache optimal node sizes fails to fit within the page. Figure 6(b) shows an underflow scenario in which a two level tree with cache optimal node ....
J. Gray and G. Graefe. The Five-Minute Rule Ten Years Later. ACM SIGMOD Record, 26(4):63-68, Dec. 1997.
....up by the server. This optimization helps clients improve their response times. 2.2 Client and Server Cache Model Conventional caching and prefetching strategies are typically page based since the optimal unit of transfer between systems resources are pages with sizes ranging from 8 KB to 32 KB [GG97]. In mobile data delivery networks caching and prefetching data on a coarse granularity such as pages is inefficient due to the physical constraints and characteristics of the mobile environment. As mentioned before, the communication in client server direction is handicapped by low bandwidth ....
....storing data objects. To limit the memory size allocated for historical reference information, [OOW93] suggests storing that information only for a limited period of time after the reference has been recorded. As a reasonable rule of thumb for the length of this period they use the Five Minute Rule [GG97]. However, applying it in a mobile environment may be inappropriate for the following reason: a time-based approach for keeping reference information disobeys the available cache size and reference behavior of the client. For example, if a client operates in disconnected mode due to a lack of ....
J. Gray, G. Graefe. The Five-Minute Rule Ten Years Later, and Other Computer Storage Rules of Thumb. SIGMOD Record 26(4), 1997, pages 63-68.
....be identified using the Five Minutes Rule which states that only pages which will be accessed again in 5 minutes are worth keeping in the buffer [16]. Furthermore, we are conservative in using 8 Kbyte pages in our experiments. This is expected to be too small for index page sizes by the year 2005 [17], when 64 Kbyte pages may be required for optimum performance. A larger page size means that more rules can be stored inside one buffer page, so BROOM will not need as many extra buffer frames, and therefore perform better than indicated in our experiments. As for the extra time needed by BROOM ....
....therefore perform better than indicated in our experiments. As for the extra time needed by BROOM to monitor and activate rules during buffer management, we measured this in our experiments and found it to be (on average) 1.5 times that of the other policies. Assuming a 10.8 ms disk access time [17], our calculations show that this increase in processing time is still worth the increase in hit rate, especially where observation (B2) is concerned. 6.1.2 Frequency of Mining Another pertinent issue is how often mining should be done. Naturally, this depends on the ....
J. Gray and G. Graefe. The Five-Minute Rule ten years later, and other computer storage rules of thumb. SIGMOD Record, Vol. 26, No. 4, Dec 1997, pp. 63-68.
....Page Size When Caching is Considered David Lomet Microsoft Research Redmond, WA 98052 lomet@microsoft.com 1. Introduction The recent article by Gray and Graefe in the Sigmod Record [1] included a study of B tree page size and the trade off between page size and the cost of accessing B tree data. Its goal was to find the page size that would result in the lowest access cost per record. This insightful analysis showed how increasing page size permitted the B tree to be traversed ....
....page size permitted the B tree to be traversed faster while increasing the amount of data that needed to be read to perform the traversal. Their analysis captures the trade off between the cost (in time) of each access and how many accesses are needed to traverse the tree. What the analysis in [1] does not capture is the impact on B tree cost performance that results from caching parts of the index tree in main memory. Substantial parts of a B tree index (above the leaves) will often be memory resident. It does not matter what the size of those pages is. What does matter is how much memory ....
Gray, J. and Graefe, G. The Five-Minute Rule Ten Years Later, and Other Computer Storage Rules of Thumb. ACM Sigmod Record 26,4 (Dec. 1997) 63-68.
....factors. First, we wanted to support more generic workloads such as 3D virtual world navigation, instead of being restricted to video or audio only. Second, we note that current trends in disk technology show that the cost of storage space is decreasing faster than the cost of disk bandwidth [16] [15]. Thus, we expect that multimedia servers will be increasingly limited by bandwidth as opposed to being limited by storage space. In this scenario, disks will typically be bought for their bandwidth and extra storage space will be available at no cost, favoring a simpler design based on data ....
....since the load is already optimally balanced across all disks, by keeping all streams synchronized and equally distributed among the disks. Moreover, we note that current trends in disk technology show that the cost of storage space is decreasing faster than the cost of disk bandwidth [16] [15]. In [15] the authors observe that disk access speeds have increased 10 fold in the last twenty years, but during the same period, disk unit capacity has increased hundred fold and storage cost decreased ten thousand fold. This trend is likely to continue in the future, and multimedia ....
J. Gray and G. Graefe. The five-minute rule ten years later, and other computer storage rules of thumb. SIGMOD Record, 26(4):63--68, December 1997.
....justifiable. Some readers may object that technology costs fluctuate too rapidly to guide design decisions. While it is true that memory and bandwidth prices change rapidly, engineering principles and rules of thumb based on technology price ratios have remained remarkably robust for long periods [18, 20], and the main results of this section are stated in terms of ratios. In order to apply the methods of Section 2.1 or Section 2.2 in an optimal cache size computation, we require both detailed workload data (R and p) and technology costs (M and B) for the same site at which the workload is ....
J. Gray, G. Graefe, The five-minute rule ten years later, and other computer storage rules of thumb. Technical Report MSR-TR-97-33, Microsoft Research, September 1997, http://www.research.microsoft.com/scripts/pubs/trpub.asp.
....the same initial dataset. This will allow us to compare each structure with respect to the dimensionality of the dataset. The tree node in most access structures is directly linked to the disk page size and most published research has used node sizes of 4 Kb. Not too long ago, Gray and Graefe [6] presented arguments indicating that current index pages should probably be 16 Kb large. As a matter of fact in the near future, pages of 8 Kb may be considered too small given the predicted throughput of future I/O systems. The effect of page size is seldomly investigated in the indexing ....
....Figures 5 and 6 confirm the fact that the SS tree is indeed very compact, and reveal that the M tree is not as space efficient as the other ones. Particularly it cannot take advantage of larger nodes, whereas all other can, especially the SR tree. If disk pages continue to grow as predicted in [6], the SR tree is also bound to become very space efficient. [Figure 2: Index construction time versus disk page size; x-axis: page size (bytes); series: M tree, R tree, SR tree, SS tree] ....
J. Gray and G. Graefe. The five-minute rule ten years later, and other computer storage rules of thumb. ACM SIGMOD Record, 26(4):63--68, 1997.
....justifiable. Some readers may object that technology costs fluctuate too rapidly to guide design decisions. While it is true that memory and bandwidth prices change rapidly, engineering principles and rules of thumb based on technology price ratios have remained remarkably robust for long periods [18, 20], and the main results of this section are stated in terms of ratios. In order to apply the methods of Section 2.1 or Section 2.2 in an optimal cache size computation, we require both detailed workload data (R and p_i) and technology costs (M and B) for the same site at which the workload is ....
Jim Gray and Goetz Graefe. The five-minute rule ten years later, and other computer storage rules of thumb. Technical Report MSR-TR-97-33, Microsoft Research, September 1997. http://www.research.microsoft.com/scripts/pubs/trpub.asp.
....are continuously growing. For example, the BaBar experiment is storing about 200 Terabytes of physics data per year [6] and the CMS experiment plans to store 1000 Terabytes (1 Petabyte) per year starting in 2005 [3]. Storing a large dataset on tape is some 10 times cheaper than storing it on disk [4]. For massive datasets, the use of tertiary storage remains the only cost effective option in the foreseeable future. The work reported on here is part of a larger research project aimed at exploring database technology options for the storage and analysis of massive high energy physics datasets ....
J. Gray, G. Graefe, The Five-Minute Rule Ten Years Later, and Other Computer Storage Rules of Thumb. SIGMOD Record 26(4): 63--68 (1997).
....However very little effort is made to cluster data across page boundaries. For instance in a B Tree index, the leaf pages are not stored consecutively on disk. This ad hoc, intra page clustering has very bad I/O performance since reading random pages is about 10 times slower than sequential reads [GG97]. Increasing the page size is not a solution since that will increase the cost of each update and hence slow transaction processing. [Footnote: To appear in ACM SIGACT SIGMOD SIGART Symposium on Principles of Database Systems, Philadelphia, June 1999.] In this paper we present locality preserving ....
....depend to a great deal on the bus characteristics, on whether file system buffering is on, on whether write caching is present, and on the request size, on the file system used, and on whether the operation is a read or a write. These values are roughly indicative of the actual values. See [GG97, RvIG98, GK97] for recent detailed measurements on different disks and file systems. [Figure 4: Time (milliseconds) taken for range searches for LPDs and clustered B Trees; x-axis: number of records in range to be read; y-axis: time (msecs); series: clustered B Tree, LPD] ....
J. Gray and G. Graefe. The five minute rule ten years later. SIGMOD Record, 26(4), 1997.
....all pages that are currently in the cache plus some more pages that are potential caching candidates. Determining the latter kind of relevant pages may be seen as a tricky fine tuning problem again, but one can easily make a pragmatic and practically robust choice based on the five minute rule [10]: the history of pages that have not been referenced within the last five minutes can be safely discarded from the method's bookkeeping. • Prediction: The time interval between the k-th last reference of a page p and the current time gives us a statistical estimator for the reference ....
Gray, J., Graefe, G., The Five-Minute Rule Ten Years Later, and Other Computer Storage Rules of Thumb, ACM SIGMOD Record 26(4), 1997.
No context found.
GRAY, J., AND GRAEFE, G. The Five-Minute Rule Ten Years Later, and Other Computer Storage Rules of Thumb. SIGMOD Record 26, 4 (1997).
No context found.
Gray J., Graefe G. The Five-Minute Rule Ten Years Later, and Other Computer Storage Rules of Thumb // SIGMOD Record. 1997. Vol. 26. No. 4. P. 63-68.