Executive summary
NASA's Graceful Architecture for Mitigation of System Failures (GRAMS) project addresses long-duration mission risk by combining machine learning, digital twin simulation, and autonomous decision support. Phase I validated the feasibility of detecting and responding to spacecraft failures in real time.
Challenge
Deep-space missions cannot rely on immediate Earth-based intervention. When communication delays combine with hardware failures, radiation exposure, micrometeoroid impacts, or cascading faults, real-time autonomous response becomes essential.
- Communication delays can make Earth-based troubleshooting impractical.
- Traditional fault trees do not cover novel or unexpected anomalies.
- Redundant hardware increases mission weight and cost.
Problem requirements
Autonomous spacecraft systems need to detect anomalies, model likely failure paths, recommend corrective action, and adapt across mission profiles without depending on constant ground control.
Solution
GRAMS uses a modular cognitive architecture that combines anomaly detection, alert routing, failure simulation, digital twin modeling, and recommended corrective action.
- Risk Identification Algorithm to detect and classify anomalies.
- Alert Generator to prioritize alerts and reduce cognitive burden.
- Failure Simulator to create synthetic training and evaluation scenarios.
- Digital Twin to model spacecraft behavior under failure conditions.
- Action Recommender to suggest corrective mitigation steps.
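To make the flow between these modules concrete, here is a minimal sketch of how four of them might hand data to one another. It is not the GRAMS implementation: every function name, the z-score detector, the priority levels, the channel name, and the action table are assumptions chosen for illustration, and the Digital Twin is left out entirely.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Alert:
    channel: str
    index: int
    score: float   # distance from the nominal mean, in standard deviations
    priority: str  # "high" or "low" (hypothetical levels)


def simulate_pressure_leak(baseline, start, drop_per_step):
    """Failure Simulator stand-in: inject a linear pressure drop from `start`."""
    return [
        value - drop_per_step * max(0, i - start + 1)
        for i, value in enumerate(baseline)
    ]


def detect_anomalies(channel, readings, reference, threshold=3.0):
    """Risk Identification stand-in: z-score each reading against nominal data."""
    mu, sigma = mean(reference), stdev(reference)
    alerts = []
    for i, value in enumerate(readings):
        score = abs(value - mu) / sigma
        if score > threshold:
            priority = "high" if score > 2 * threshold else "low"
            alerts.append(Alert(channel, i, score, priority))
    return alerts


def prioritize(alerts):
    """Alert Generator stand-in: keep only the most severe alert per channel."""
    worst = {}
    for alert in alerts:
        if alert.channel not in worst or alert.score > worst[alert.channel].score:
            worst[alert.channel] = alert
    return sorted(worst.values(), key=lambda a: a.score, reverse=True)


# Hypothetical lookup table; the source does not list real mitigation steps.
ACTIONS = {"cabin_pressure": "isolate module and check seals"}


def recommend(alert):
    """Action Recommender stand-in: map an alert to a canned mitigation step."""
    return ACTIONS.get(alert.channel, "escalate to crew")


# Nominal cabin pressure in kPa (made-up values), then a synthetic leak.
nominal = [101.3, 101.2, 101.4, 101.3, 101.2, 101.4, 101.3, 101.3]
faulty = simulate_pressure_leak(nominal, start=4, drop_per_step=0.5)
alerts = prioritize(detect_anomalies("cabin_pressure", faulty, nominal))
recommendations = [(a.channel, a.priority, recommend(a)) for a in alerts]
```

Collapsing the raw detections to one alert per channel is one simple way to reflect the stated goal of reducing cognitive burden; a real system would presumably use learned models and richer routing rules than this threshold-based stand-in.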
Implementation
The system was tested in simulated ISS-relevant environments with scenarios such as pressure leaks, sensor malfunctions, and pump failures. VISIMO deployed GRAMS on Hewlett Packard Enterprise's Spaceborne Computer-2 Test and Development System to validate compatibility with space-grade computing constraints.
Results
The Phase I effort exceeded anomaly-detection benchmarks, validated digital twin modeling against complex failure scenarios, and positioned GRAMS for further validation in an ISS testing pathway.