Due to the increasing integration of safety-critical functionality into electronic devices, safety-related system design and certification have become a major challenge. Among other things, components must react suitably to internal errors in order to prevent a function from failing and to guarantee a certain degree of reliability. In this context a wide variety of fault tolerance mechanisms have been developed, together with analytical treatments of their error coverage and the resulting reliability. However, most of these mechanisms induce a timing overhead, which in turn may degrade the real-time capabilities of the system. More concretely, even if each error is handled adequately such that no logical failure occurs, a timing failure caused by a missed deadline cannot be ruled out. Thus, there is a growing need for appropriate methods to calculate the probability of timing failures and to prove that reliability and safety constraints are not violated.
In this paper we present an analysis approach for networked systems as well as highly integrated multi-core architectures to calculate reliability with respect to timing failures. Simulation techniques are poorly suited to this purpose: because fault events are rare, they require excessive simulation time before the results become statistically significant. Therefore, formal methods have been developed to prove that the considered embedded real-time system works correctly and that failure rates are bounded according to the required safety level. Furthermore, we present an extension of the basic analysis ideas that includes the influence of different error models in the reliability analysis. Special emphasis is put on mixed-criticality systems, i.e. systems with applications of different safety requirements. We propose an approach to decouple the reliability analyses for these applications and to determine an individual safety integrity level for each application. This makes it possible to refine the conservative concept of IEC 61508, which takes the most critical application as the basis for the whole system, and thereby enables cost reduction and automated qualification. Using a prototype implementation for Symtavision's SymTA/S tool suite, we show how the presented methodologies can be integrated into a safety-related design flow. With this kind of tool support, the presented approaches can be applied at different stages of the design process, such as design space exploration and optimization as well as verification and certification.
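To illustrate why simulation scales poorly with rare fault events, the following minimal sketch estimates how many independent simulation runs a plain Monte Carlo estimate would need. It is not part of the presented analysis; the function name `required_runs`, the assumption of independent runs, and the example failure probability are hypothetical and chosen only for illustration. The sketch uses the standard result that an estimate of a probability p from n independent runs has relative standard error sqrt((1 - p) / (n p)).

```python
import math


def required_runs(p_fail: float, rel_error: float) -> int:
    """Number of independent runs needed so that a plain Monte Carlo
    estimate of p_fail reaches the given relative standard error.

    Hypothetical helper for illustration; assumes independent runs.
    """
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must lie strictly between 0 and 1")
    if rel_error <= 0.0:
        raise ValueError("rel_error must be positive")
    # Relative standard error of the estimator: sqrt((1 - p) / (n * p)).
    # Solving for n gives n = (1 - p) / (p * rel_error^2).
    return math.ceil((1.0 - p_fail) / (p_fail * rel_error ** 2))


if __name__ == "__main__":
    # Assumed example: a timing-failure probability of 1e-9 per run,
    # estimated with 10 % relative standard error.
    print(required_runs(1e-9, 0.1))
```

Under these assumptions the required number of runs is on the order of 10^11, and it grows inversely with the failure probability, which is why analytical bounds are attractive for systems with very low target failure rates.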