SAFEXPLAIN shares strategies for diverse redundancy in ML/AI Critical Systems session at ERTS ’24

Date: June 13, 2024
Martí Caro from the Barcelona Supercomputing Center presents at the 2024 Embedded Real Time System Congress

Barcelona Supercomputing Center researcher Martí Caro presented “Software-Only Semantic Diverse Redundancy for High-Integrity AI-Based Functionalities” at the 2024 Embedded Real Time System Congress in Toulouse, France, on 12 June 2024.

His presentation was delivered during the session on “ML/AI for Critical Systems” chaired by Eric Jenn (IRT Saint-Exupery). The presentation delved into the use of Dual Modular Redundancy (DMR) and Triple Modular Redundancy (TMR) in safety-critical systems to ensure that functionalities at the highest integrity level provide fault detection and/or tolerance capabilities.

The presentation, delivered to a conference audience from universities, research centers and industry, observed that many emerging AI-based functionalities are intrinsically stochastic (as in the case of camera-based object detection), and hence their correctness must be judged semantically, with room for variations across correct outcomes (e.g., confidence must be above a given threshold).

Based on this observation, SAFEXPLAIN proposes strategies to create DMR and TMR implementations of AI-based functionalities that bring fault tolerance not only against random hardware faults, but also against AI model inaccuracies. These strategies, which can be realized with software-only means and ported to virtually any computing platform, build on input data modifications that affect the inference computations but not the expected semantic output (e.g., introducing some limited random noise in the input data).
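
To make the idea more concrete, the following is a minimal Python sketch of a software-only TMR scheme with semantic voting. It is an illustration under stated assumptions, not the implementation presented at the congress: the toy_detector model, the noise amplitude, the confidence threshold, the choice to keep one unmodified replica, and all function names are hypothetical and chosen only for this example.

```python
import random
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple

Prediction = Tuple[str, float]  # (class label, confidence in [0, 1])


def toy_detector(pixels: Sequence[float]) -> Prediction:
    """Hypothetical stand-in for an AI model: classifies by mean brightness."""
    mean = sum(pixels) / len(pixels)
    label = "object" if mean >= 0.5 else "background"
    confidence = min(1.0, 0.5 + abs(mean - 0.5))
    return label, confidence


def add_noise(pixels: Sequence[float], amplitude: float,
              rng: random.Random) -> List[float]:
    """Return a copy with small bounded uniform noise, clamped to [0, 1]."""
    return [min(1.0, max(0.0, p + rng.uniform(-amplitude, amplitude)))
            for p in pixels]


def semantic_tmr(model: Callable[[Sequence[float]], Prediction],
                 pixels: Sequence[float],
                 amplitude: float = 0.02,
                 min_confidence: float = 0.6,
                 seed: int = 0) -> Optional[str]:
    """Run three inferences on diverse inputs and vote on their meaning.

    Assumption: one replica uses the original input and two use noisy copies.
    Replicas agree semantically when they report the same label with a
    confidence at or above min_confidence; exact confidences may differ.
    Returns the majority label, or None when no majority exists.
    """
    rng = random.Random(seed)
    replicas = [list(pixels)] + [add_noise(pixels, amplitude, rng)
                                 for _ in range(2)]
    votes = []
    for replica in replicas:
        label, confidence = model(replica)
        if confidence >= min_confidence:  # semantic check, not bitwise equality
            votes.append(label)
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= 2 else None


if __name__ == "__main__":
    bright_frame = [0.9] * 16
    print(semantic_tmr(toy_detector, bright_frame))
```

Because the replicas are compared by label and confidence threshold rather than by exact output values, small variations between them do not count as a mismatch. A DMR variant of the same sketch would run two replicas and flag any semantic disagreement for detection instead of voting.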