An integrated methodology for the performance and reliability evaluation of fault-tolerant systems
Author(s): Domínguez-García, Alejandro D. (Alejandro Dan)
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Advisors: John G. Kassakian and Joel E. Schindall.
This thesis proposes a new methodology for the integrated performance and reliability evaluation of embedded fault-tolerant systems used in aircraft, space, tactical, and automotive applications. The methodology uses a behavioral model of the system dynamics, similar to the models control engineers use when designing the control system, but incorporates additional artifacts to model the failure behavior of the system components. These artifacts include component failure modes (with their associated failure rates) and the way each failure mode affects the dynamic behavior of the component. The methodology bases the system evaluation on the analysis of the dynamics of the different configurations the system can reach after component failures occur. For each possible system configuration, a performance evaluation of its dynamic behavior is carried out to check whether its properties (e.g., accuracy, overshoot, or settling time), which are called performance metrics, meet system requirements. Markov chains are used to model the stochastic process associated with the different configurations that a system can adopt when failures occur. Reliability and unreliability measures, as well as probabilistic measures of performance, can be quantified by merging the values of the performance metrics for each configuration with the system configuration probabilities yielded by the corresponding Markov model. The methodology is used not only for system evaluation but also to guide the design process and further optimization. Thus, within the context of the new methodology, we define new importance measures to rank the contributions of model parameters to system reliability and performance. To support the methodology, we developed a MATLAB/SIMULINK® tool, which also provides a common environment with a common language for control engineers and reliability engineers to develop fault-tolerant systems.
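The merging step described above can be illustrated with a minimal sketch. It is not the thesis's MATLAB/SIMULINK® tool; it is a small Python example under stated assumptions. The three-configuration duplex system, the failure rate `lam`, the mission time `t`, the per-configuration settling times, and the 2 s requirement are all hypothetical values chosen for illustration. The function `transient_probabilities` is likewise a hypothetical helper that uses uniformization, one standard way to compute continuous-time Markov chain state probabilities, which may differ from the numerical method used in the thesis.

```python
import math


def transient_probabilities(Q, p0, t, tol=1e-12, max_terms=10_000):
    """Return CTMC state probabilities at time t using uniformization."""
    n = len(Q)
    rate = max(-Q[i][i] for i in range(n))  # uniformization rate
    if rate == 0.0 or t == 0.0:
        return list(p0)
    # Embedded discrete-time chain: P = I + Q / rate
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
         for i in range(n)]
    weight = math.exp(-rate * t)   # Poisson weight for k = 0
    v = list(p0)                   # holds p0 * P^k, starting at k = 0
    result = [weight * x for x in v]
    total = weight
    for k in range(1, max_terms):
        if 1.0 - total <= tol:     # remaining Poisson mass is negligible
            break
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= rate * t / k
        result = [r + weight * x for r, x in zip(result, v)]
        total += weight
    return result


# Hypothetical duplex system: configuration 0 = both units working,
# 1 = one unit failed (degraded), 2 = both units failed (absorbing).
lam = 1e-4        # assumed failure rate per unit, per hour
t = 1000.0        # assumed mission time, hours
Q = [[-2 * lam, 2 * lam, 0.0],
     [0.0, -lam, lam],
     [0.0, 0.0, 0.0]]
probs = transient_probabilities(Q, [1.0, 0.0, 0.0], t)

# Hypothetical settling time (seconds) per configuration; None = no control.
settling_time = [1.0, 1.8, None]
requirement = 2.0  # assumed requirement: settle within 2 s

# Reliability = total probability of configurations that meet the requirement.
reliability = sum(p for p, ts in zip(probs, settling_time)
                  if ts is not None and ts <= requirement)
unreliability = 1.0 - reliability

# Expected settling time, conditioned on the requirement being met.
expected_ts = sum(p * ts for p, ts in zip(probs, settling_time)
                  if ts is not None and ts <= requirement) / reliability

print(f"configuration probabilities: {probs}")
print(f"reliability: {reliability:.6f}, unreliability: {unreliability:.6f}")
print(f"expected settling time given success: {expected_ts:.3f} s")
```

Tightening `requirement` below 1.8 s in this sketch would move the degraded configuration into the unreliable set. This shows how a performance requirement, and not only component survival, determines which configurations count as failures.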
We illustrate the use of the methodology and the capabilities of the tool with two case studies. The first corresponds to the lateral-directional control system of an advanced fighter aircraft. This case study shows how the methodology can identify weak points in the system design and point out possible solutions to eliminate them, compare different architecture alternatives from different perspectives, and test different failure detection, isolation, and reconfiguration (FDIR) techniques. It also shows the effectiveness of the MATLAB/SIMULINK® tool in analyzing large and complex systems. The second case study compares two very different solutions for achieving fault tolerance in a steer-by-wire (SbW) system. The first solution is based on the replication of components and the introduction of failure detection, isolation, and reconfiguration mechanisms. In the second solution, a dissimilar backup mechanism called brake-actuated steering (BAS) is used to achieve fault tolerance rather than replicating each component within the system. This case study complements the flight control system one by showing how the methodology and the MATLAB/SIMULINK® tool can be used to compare very different architectural approaches to fault tolerance and, therefore, how the methodology can be used to choose the best design in terms of performance and reliability.
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007. Includes bibliographical references (leaves 220-224).
Department: Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Massachusetts Institute of Technology
Electrical Engineering and Computer Science.