Accession Number : ADA329899

Title :   Using Peer Support to Reduce Fault-Tolerant Overhead in Distributed Shared Memories

Descriptive Note : Technical rept

Corporate Author : ROCHESTER UNIV NY DEPT OF COMPUTER SCIENCE

Personal Author(s) : Hunt, G. C. ; Scott, M. L.

PDF Url : ADA329899

Report Date : JUN 1996

Pagination or Media Count : 16

Abstract : We present a peer logging system for reducing performance overhead in fault-tolerant distributed shared memory (DSM) systems. Our system provides fault-tolerant shared memory using individual checkpointing and rollback. Peer logging logs DSM modification messages to remote nodes instead of to local disks. We present results for implementations of our fault-tolerant technique using simulations of both TreadMarks, a software-only DSM, and Cashmere, a DSM using memory-mapped hardware. We compare simulations with no fault tolerance to simulations with local disk logging and peer logging. We present results showing that fault-tolerant TreadMarks can be achieved with an average of 17 percent overhead for peer logging. We also present results showing that while almost any DSM protocol can be made fault tolerant, systems with localized DSM page meta-data have much lower overheads.

Descriptors :   *MEMORY DEVICES, *FAULT TOLERANT COMPUTING, COMPUTER PROGRAMS, DISTRIBUTED DATA PROCESSING, DISKS, FAULT TOLERANCE.

Subject Categories : Computer Programming and Software

Distribution Statement : APPROVED FOR PUBLIC RELEASE