Accession Number : ADA296392

Title :   An Efficient Technique for Tracking Nondeterministic Execution and its Applications.

Corporate Author : CARNEGIE-MELLON UNIV PITTSBURGH PA SCHOOL OF COMPUTER SCIENCE

Personal Author(s) : Elnozahy, E. N.

PDF Url : ADA296392

Report Date : MAY 1995

Pagination or Media Count : 20

Abstract : This report describes a technique for using instruction counters to track nondeterminism in the execution of operating system kernels and user programs. The operating system records the number of instructions between consecutive nondeterministic events and information about their nature during normal operation. During an analysis phase, the execution is repeated under the control of a monitor, and the nondeterministic events are applied at the same instructions as during the monitored execution. We describe the application of this technique to four areas: Performance monitoring: The technique can be used to instrument an operating system to capture long traces of memory references. Unlike current techniques, it performs the gathering in a postmortem phase and therefore has negligible effect on the computation itself during the monitoring phase. We expect trace periods that are longer than what existing techniques can capture by orders of magnitude with little or no noticeable perturbation to the monitored system itself. Kernel Debugging: This technique can be used to repeat the execution of an operating system that precedes a crash due to a Heisenbug. This gives developers a systematic approach to eliminating these bugs during testing. Support for Rollback-Recovery: Systems that use checkpointing and execution replay can adopt this technique to ensure that execution replay during recovery is identical to the one before failure, despite the occurrence of nondeterministic events that cannot be captured efficiently otherwise. Software-based TMR systems: Using this technique, a TMR system based on active replication can be built out of off-the-shelf workstations connected by a general purpose network.
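
Illustrative sketch (not taken from the report): the record-then-replay idea in the abstract can be modeled with a toy machine in Python. Everything here is an assumption made for illustration: the names ToyMachine, record, and replay are hypothetical, a seeded random generator stands in for the unpredictable environment, and a software counter stands in for whatever instruction-counting mechanism the report actually uses. During recording, the log stores only the number of instructions since the previous event and the event's payload. During replay, each event is applied at the same instruction count, so the final state should match.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyMachine:
    """Hypothetical deterministic machine; only deliver() injects outside input."""
    state: int = 0
    icount: int = 0  # instructions executed so far

    def step(self):
        # One deterministic "instruction".
        self.state = (self.state * 31 + 7) % 1_000_003
        self.icount += 1

    def deliver(self, payload):
        # A nondeterministic event (for example an interrupt) perturbs the state.
        self.state ^= payload


def record(total_instructions, seed):
    """Normal run: log (instructions since previous event, payload) pairs."""
    rng = random.Random(seed)  # stand-in for the unpredictable environment
    machine = ToyMachine()
    log = []
    last_event_at = 0
    while machine.icount < total_instructions:
        machine.step()
        if rng.random() < 0.05:  # an event happens to arrive here
            payload = rng.randrange(1 << 16)
            log.append((machine.icount - last_event_at, payload))
            last_event_at = machine.icount
            machine.deliver(payload)
    return machine.state, log


def replay(total_instructions, log):
    """Analysis run: apply each logged event at the same instruction count."""
    machine = ToyMachine()
    pending = iter(log)
    event = next(pending, None)
    next_at = event[0] if event is not None else None
    while machine.icount < total_instructions:
        machine.step()
        if event is not None and machine.icount == next_at:
            machine.deliver(event[1])
            event = next(pending, None)
            if event is not None:
                next_at = machine.icount + event[0]
    return machine.state


if __name__ == "__main__":
    final_state, log = record(10_000, seed=1)
    print("events logged:", len(log))
    print("replay matches:", replay(10_000, log) == final_state)
```

In this model the log grows with the number of events rather than the number of instructions, which is the property the abstract relies on when it claims negligible effect during the monitoring phase. Expensive work, such as gathering a memory-reference trace, could then be attached to the replay loop instead of the recorded run.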

Descriptors :   *OPERATING SYSTEMS(COMPUTERS), *DEBUGGING(COMPUTERS), *COUNTERS, MONITORING, OFF THE SHELF EQUIPMENT, TRACKING, EFFICIENCY, USER NEEDS, WORK STATIONS, COMPUTER NETWORKS, INSTRUCTIONS, RECORDS.

Subject Categories : Computer Programming and Software

Distribution Statement : APPROVED FOR PUBLIC RELEASE