Accession Number : ADA293978
Title : Reducing Cache Misses in Numerical Applications Using Data Relocation and Prefetching.
Descriptive Note : Technical rept.
Corporate Author : ILLINOIS UNIV AT URBANA COORDINATED SCIENCE LAB
Personal Author(s) : Yamada, Yoji ; Johnson, Teresa L. ; Haab, Grant E. ; Gyllenhaal, John C. ; Hwu, Wen-mei W.
PDF Url : ADA293978
Report Date : APR 1995
Pagination or Media Count : 31
Abstract : Numerical applications frequently contain nested loops that process large arrays of data. The execution of these loop structures often produces memory reference patterns that utilize data caches poorly. Indeed, poor reuse of the data, large working set sizes, and frequent non-unit stride accesses all combine to cause many cache misses. To improve cache performance, data copying has been proposed. However, this technique has high overhead. In this paper, instead, we propose a combined hardware and software technique called data relocation and prefetching which eliminates much of the overhead of data copying through the use of special hardware. Furthermore, by relocating the data while performing software prefetching, the overhead of copying the data can be reduced further. This technique performs better than prefetching alone because it reduces cache misses through relocation, and it reduces overhead by prefetching multiple elements at once. The hardware is designed to overlap relocation and prefetching with normal execution, and to highly utilize the available bus bandwidth. Simulation results show that this technique greatly reduces data cache miss rates. As a result, large applications including PERFECT and SPEC benchmarks achieve up to 2.5 times speedup. The hardware support required by this technique has been greatly refined over that presented in an earlier paper. (AN)
Descriptors : *SOFTWARE ENGINEERING, *OPTIMIZATION, *DATA MANAGEMENT, COMPUTERIZED SIMULATION, COMPUTER ARCHITECTURE, RELOCATION, OVERLAP, DATA COMPRESSION, NUMERICAL METHODS AND PROCEDURES, COMPUTER PROGRAM VERIFICATION, LOOPS, COMPILERS, BUFFER STORAGE, EXECUTIVE ROUTINES, CONTROL SEQUENCES, BLOCK ORIENTED RANDOM ACCESS MEMORIES, REPRODUCTION(COPYING), COMPUTER BENCHMARKING.
Subject Categories : Computer Programming and Software
Distribution Statement : APPROVED FOR PUBLIC RELEASE