Date of Award

Spring 1993

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Electrical/Computer Engineering

Committee Director

John W. Stoughton

Committee Member

D. L. Livingston

Committee Member

L. Wilson

Committee Member

Roland Mielke

Abstract

The modeling and design of a fault-tolerant multiprocessor system is addressed in this dissertation. In particular, the behavior of the system during recovery and restoration after a fault has occurred is investigated. Given that a multicomputer system is designed using the Algorithm to Architecture To Mapping Model (ATAMM) model, and that a fault (death of a computing resource) occurs during its normal steady-state operation, a model is presented as a viable research tool for predicting the performance bounds of the system during its recovery and restoration phases. Furthermore, the bounds of the performance behavior of the system during this transient mode can be assessed. These bounds include: time recover from the fault (trec), time to restore the system (tres} and whether there is a permanent delay in the system's Time Between Input and Output (TBIO) after the system has reached a steady state. An implementation of an ATAMM based computer was developed with the Generic VHSIC Spaceborne Computer (GVSC) as the target system. A simulation of the GVSC was also written based on the code used in ATAMM Multicomputer Operating System (AMOS). The simulation is in turn used to validate the new model in the usefulness and accuracy in tracking the propagation of the delay through the system and predicting the behavior in the transient state of recovery and restoration. The model is validated as an accurate method to predict the transient behavior of an ATAMM based multicomputer during recovery and restoration.

DOI

10.25777/bcpk-dm48

Share

COinS