Accession Number : ADA191688
Title : Effectiveness Evaluation of Fault-Tolerant Multiprocessor Systems.
Descriptive Note : Final rept. 1 Oct 83-30 Aug 87
Corporate Author : DUKE UNIV DURHAM NC DEPT OF COMPUTER SCIENCE
Personal Author(s) : Trivedi, Kishor S
PDF Url : ADA191688
Report Date : 27 Jan 1988
Pagination or Media Count : 5
Abstract : An important area of research is the analysis of the coverage of a fault-tolerant system, that is, the probability that the system can recover from a fault. The author has studied a variety of models, from simple phase-type models to very complex stochastic Petri net models, and has investigated solution techniques for each model type. His methodology allows consideration of external events that can interfere with recovery, such as a hard limit on recovery time or the occurrence of a second, near-coincident fault. It was discovered that a policy of attempting transient recovery upon detection of an error (as opposed to automatically reconfiguring the affected component out of the system) may actually increase the unreliability of the system. This result holds when error detectability is not nearly perfect, so that the risk of producing an undetectable error (if the transient error is present) outweighs the benefit gained by not discarding the component. Keywords: Bibliographies.
Descriptors : *FAULT TOLERANT COMPUTING, *MULTIPROCESSORS, DETECTION, ERRORS, EXTERNAL, FAULTS, LIMITATIONS, MODELS, RECOVERY, RISK, SOLUTIONS(GENERAL), TIME, TOLERANCE, TRANSIENTS, PERFORMANCE(ENGINEERING), COMPUTER PROGRAMS
Subject Categories : Computer Programming and Software
Distribution Statement : APPROVED FOR PUBLIC RELEASE