and Binomial Trees into Parallel Memory Modules for Fast and Conflict-Free Access to Path and Subtree Templates
Abstract:
Main memory access latency can significantly slow down the overall performance of a computer system because the average cycle time of main memory is typically a factor of 5-10 higher than that of a processor. To cope with this problem, in addition to the use of caches, the main memory of a multiprocessor architecture is usually organized in multiple modules or banks. Although such an organization enhances memory bandwidth, that is, the amount of data the multiprocessor can retrieve in the same memory cycle, conflicts caused by simultaneous attempts to access the same memory module may reduce the effective bandwidth. Therefore, efficient mapping schemes are required to distribute data in such a way that regular patterns, called templates, of various structures can be retrieved in parallel without memory conflicts. Prior work on data mappings mostly deals with conflict-free access to templates such as rows, columns, or diagonals of (multi-dimensional) arrays, and only limited attention has been paid to accessing templates of non-numeric structures such as trees. In this paper, we study optimal and balanced mappings for accessing path and subtree templates of trees, where a mapping is called optimal if it allows conflict-free access to templates with as few memory banks as possible. An optimal mapping is also called balanced if it distributes as

