Fail-Awareness: An Approach to Construct Fail-Safe Applications
Christof Fetzer and Flaviu Cristian
Appeared in: Journal of Real-Time Systems
Date: March 2003

Pages: 203-238
We present a framework for building fail-safe hard real-time applications on top of an asynchronous distributed system subject to communication partitions, i.e., a system whose processors and communication facilities cannot guarantee real-time delays. The basic assumption behind our approach is that each processor has a local hardware clock that proceeds within a linear envelope of real time. This makes it possible to compute an upper bound on the actual delay incurred by a particular processing sequence or message transmission. Services and applications can use these computed bounds to detect when excessive delays prevent them from guaranteeing all of their standard properties. An application can thereby detect when it must switch to an exceptional fail-safe mode.
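The delay-bound idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes a hypothetical maximum clock drift rate `rho`, so that a locally measured interval `d` corresponds to a real-time interval between `d / (1 + rho)` and `d / (1 - rho)`, and it assumes a known minimum one-way message delay `delta_min`. The function names, the round-trip scheme, and the `"normal"` / `"fail-safe"` labels are all illustrative assumptions.

```python
def max_real_duration(local_elapsed: float, rho: float) -> float:
    """Upper bound on real time elapsed while a local clock advanced by
    local_elapsed, assuming the clock drifts by at most rho (0 <= rho < 1)."""
    return local_elapsed / (1.0 - rho)


def min_real_duration(local_elapsed: float, rho: float) -> float:
    """Lower bound on real time elapsed under the same drift assumption."""
    return local_elapsed / (1.0 + rho)


def reply_delay_upper_bound(send_req: float, recv_reply: float,
                            recv_req: float, send_reply: float,
                            rho: float, delta_min: float) -> float:
    """Upper bound on the transmission delay of a reply message.

    send_req and recv_reply are read from the requester's clock;
    recv_req and send_reply are read from the responder's clock.
    The round trip takes at most the requester's measured interval
    (stretched by drift). Subtracting the least time the responder can
    have held the request and the least delay of the request itself
    leaves an upper bound for the reply.
    """
    round_trip_max = max_real_duration(recv_reply - send_req, rho)
    hold_min = min_real_duration(send_reply - recv_req, rho)
    return round_trip_max - hold_min - delta_min


def select_mode(delay_bound: float, deadline: float) -> str:
    """Hypothetical policy: stay in normal mode only if the computed
    bound proves the message was timely; otherwise go fail-safe."""
    return "normal" if delay_bound <= deadline else "fail-safe"


if __name__ == "__main__":
    # Illustrative numbers only (milliseconds); rho is an assumed drift bound.
    bound = reply_delay_upper_bound(send_req=0.0, recv_reply=10.0,
                                    recv_req=3.0, send_reply=5.0,
                                    rho=1e-4, delta_min=1.0)
    print(select_mode(bound, deadline=8.0))
```

The bound is pessimistic by construction: a message that was in fact delivered on time may still be classified as late, but a message classified as timely is guaranteed to have met the deadline under the stated assumptions. That one-sided error is what lets an application safely decide to enter its fail-safe mode.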