Histogram-based Temporal Joins
Abstract:
Histograms are used in most commercial database systems to estimate query result sizes and evaluation plan costs. They can also be used to optimize join algorithms. In this paper we consider how to use histograms to improve join processing in temporal databases. We define histograms for temporal data and present two temporal join algorithms that make use of this histogram information. The first is a temporal partition-join with dynamic buffer allocation, in which histogram information is used to determine partition boundaries that maximize overall buffer usage. The second is a temporal merge-join designed for the special case of append-only relations, in which the order of read operations is determined by histogram information to minimize buffer overflow and the cost of read operations. We compare the performance of these join algorithms to temporal join evaluation strategies that do not use histograms. The results demonstrate that the temporal partition-join is substantially improved through the incorporation of histogram information. In contrast, adding histogram information to the merge-join does not generally improve its performance, but it does make the algorithm more robust in the presence of many long-lived tuples.
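To make the idea of histogram-driven partitioning concrete, the following is a minimal sketch and not the algorithm presented in this paper. It assumes half-open valid-time intervals [start, end) with integer timestamps in [t_min, t_max), fixed-width buckets, and a buffer capacity measured in tuples. The names `build_histogram` and `choose_boundaries` are hypothetical, and the greedy rule is a simple stand-in for a boundary choice that maximizes buffer usage.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open valid-time interval [start, end)


def build_histogram(tuples: List[Interval], t_min: int, t_max: int,
                    width: int) -> Tuple[List[int], List[int]]:
    """Return (starts, alive) for fixed-width buckets over [t_min, t_max).

    starts[i]: number of tuples whose start time falls in bucket i.
    alive[i]:  number of tuples that began before bucket i and are
               still valid at its lower bound (long-lived tuples).
    """
    n = (t_max - t_min + width - 1) // width
    starts = [0] * n
    alive = [0] * n
    for s, e in tuples:
        starts[(s - t_min) // width] += 1
        for i in range(n):
            lo = t_min + i * width
            if s < lo < e:
                alive[i] += 1
    return starts, alive


def choose_boundaries(starts: List[int], alive: List[int], t_min: int,
                      width: int, capacity: int) -> List[int]:
    """Greedily pick partition boundaries from the histogram.

    A partition covering buckets i..j overlaps exactly
    alive[i] + sum(starts[i..j]) tuples. A partition is extended
    bucket by bucket until adding the next bucket would exceed the
    buffer capacity (assumed here to be counted in tuples).
    A single bucket that alone exceeds the capacity still forms
    its own partition in this sketch.
    """
    boundaries: List[int] = []
    count = alive[0] + starts[0]
    for i in range(1, len(starts)):
        if count + starts[i] > capacity:
            boundaries.append(t_min + i * width)  # new partition starts here
            count = alive[i] + starts[i]
        else:
            count += starts[i]
    return boundaries
```

Tracking the `alive` counts separately is what lets the estimate account for long-lived tuples, which must be buffered in every partition they overlap.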

