AI assurance and model validation

MVPT: Validating Mission AI Against the Unknown

MVPT is a NASA SBIR Phase I effort that helps autonomous mission systems estimate whether AI models remain valid when they encounter novel, shifted, or degraded data during high-consequence missions.

Case study

Executive summary

Autonomous space systems increasingly rely on AI and machine learning to interpret data, support decisions, and respond to conditions that cannot be managed in real time from Earth. VISIMO developed Model Validation via Precomputed Transformations (MVPT) through a NASA SBIR Phase I effort to explore whether model-validation evidence could be computed before a mission and then retrieved quickly when autonomous systems encounter new or shifted data. The Phase I effort completed all three technical objectives, reached a Fraction of Correctly Computed Validations (FCCV) of 0.95 +/- 0.05 in proof-of-concept testing, and established a Phase II pathway for deep-space rotorcraft autonomy.

Challenge

AI can help autonomous spacecraft, rovers, rotorcraft, and other mission systems make decisions faster than human operators can support from Earth. But high-consequence autonomy depends on confidence. Mission teams need to know when a model can be trusted, when it is operating outside expected conditions, and when fallback procedures may be necessary.

AI models may encounter data that differs from pre-mission training or validation data.
Traditional validation can be computationally expensive and difficult to repeat onboard.
Mission systems need fast, reviewable estimates of whether models remain valid.

Problem requirements

The Phase I effort needed to determine whether MVPT could provide a technically credible path for AI and machine-learning validation under novel mission conditions. The work had to show compatibility across NASA-relevant models and data, demonstrate a working proof-of-concept, and create a specific Phase II scenario for more mission-relevant simulation and testing.

Assess compatibility across model families, validation methods, and data modalities.
Build and test a proof-of-concept with real NASA-relevant data.
Define a Phase II test path for autonomous mission software.

Solution

MVPT is built around one idea: perform the expensive validation work before the mission, then rely on fast lookup and similarity methods during it. Before deployment, the system evaluates how a model behaves across many transformed versions of known data. When new data arrives, the system compares it to that precomputed validation space and estimates whether the model remains reliable, without recomputing full validation onboard.

Precompute validation behavior across plausible data transformations.
Use similarity search to retrieve relevant validation evidence when new data arrives.
Support fast runtime estimates of whether a model remains valid under shifted conditions.
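
The precompute-then-lookup pattern described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not VISIMO's implementation: the embedding (mean and standard deviation of a window), the toy constant-prediction model, the shift and scale transformations, the tolerance of 2.0, and the function names embed, precompute_table, and estimate_validity are all hypothetical choices made for the example.

```python
import math
import random
import statistics

def embed(window):
    # Hypothetical embedding: (mean, population std) of a data window.
    return (statistics.fmean(window), statistics.pstdev(window))

def rmse(window, prediction):
    # Root-mean-square error of a constant prediction over the window.
    return math.sqrt(statistics.fmean([(x - prediction) ** 2 for x in window]))

def precompute_table(base, shifts, scales, tolerance=2.0):
    # Pre-mission step: score a toy model on transformed copies of known data.
    prediction = statistics.fmean(base)  # toy "model": always predicts the training mean
    limit = tolerance * rmse(base, prediction)  # assumed validity criterion
    table = []
    for scale in scales:
        for shift in shifts:
            transformed = [x * scale + shift for x in base]
            is_valid = rmse(transformed, prediction) <= limit
            table.append((embed(transformed), is_valid))
    return table

def estimate_validity(window, table, k=3):
    # In-mission step: retrieve the k nearest precomputed entries and
    # return the fraction of them in which the model was judged valid.
    query = embed(window)
    nearest = sorted(table, key=lambda row: math.dist(row[0], query))[:k]
    return sum(is_valid for _, is_valid in nearest) / k

rng = random.Random(0)
base = [rng.gauss(0.0, 1.0) for _ in range(500)]
table = precompute_table(base, shifts=(0.0, 1.0, 3.0, 6.0), scales=(1.0, 1.5, 2.5, 4.0))

nominal = [rng.gauss(0.0, 1.0) for _ in range(200)]  # resembles the known data
shifted = [rng.gauss(6.0, 4.0) for _ in range(200)]  # strongly shifted and rescaled
nominal_fraction = estimate_validity(nominal, table)
shifted_fraction = estimate_validity(shifted, table)
print(nominal_fraction, shifted_fraction)
```

The runtime step only embeds the new window and sorts a small table, so its cost does not depend on how expensive the original validation was. A production system would likely replace the linear scan with an approximate nearest-neighbor index and the toy embedding with a model-derived one.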

Implementation

VISIMO pursued two research tracks. First, the team conducted a structured compatibility analysis across NASA-relevant machine-learning models, validation methods, and data modalities. Second, the team built a proof-of-concept around NASA solar wind data from the Deep Space Climate Observatory. Historical solar wind data was separated into quiet and volatile regimes so the demonstration could evaluate how model-validity estimates changed when a probabilistic forecasting model encountered data outside its nominal training regime.
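
One way to separate a time series into quiet and volatile regimes is to threshold a rolling measure of spread. The sketch below assumes this approach for illustration only: the case study does not state how the regimes were defined, and the synthetic speed values, the 12-sample window, the 15.0 threshold, and the function names are hypothetical.

```python
import random
import statistics

def rolling_std(values, window):
    # Population standard deviation over each trailing window of fixed length.
    return [statistics.pstdev(values[i - window:i])
            for i in range(window, len(values) + 1)]

def split_regimes(values, window=12, threshold=15.0):
    # Label a window "volatile" when its spread exceeds the assumed threshold.
    quiet, volatile = [], []
    for end, spread in enumerate(rolling_std(values, window), start=window):
        segment = values[end - window:end]
        (volatile if spread > threshold else quiet).append(segment)
    return quiet, volatile

# Synthetic stand-in for a solar wind speed series (not real DSCOVR data):
# a calm stretch followed by a disturbed stretch.
rng = random.Random(1)
calm = [400.0 + rng.gauss(0.0, 5.0) for _ in range(60)]
storm = [550.0 + rng.gauss(0.0, 80.0) for _ in range(60)]
speeds = calm + storm

quiet, volatile = split_regimes(speeds)
print(len(quiet), len(volatile))
```

A model trained only on the quiet windows can then be evaluated on the volatile ones, which gives the out-of-regime case the demonstration relies on.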

Results

The Phase I work demonstrated MVPT's feasibility. VISIMO completed all three Phase I technical objectives and achieved a final Fraction of Correctly Computed Validations of 0.95 +/- 0.05 in the proof-of-concept. The compatibility analysis showed a 0.979 +/- 0.015 compatibility fraction across reviewed machine-learning model families, while validation-method and data-modality compatibility were statistically consistent with the project's 0.9 benchmark. The proof-of-concept used real NASA solar wind data and tested model behavior under quiet and volatile regimes, with the volatile regime serving as an out-of-sample validation case. The project also produced a concrete Phase II path focused on a deep-space rotorcraft using camera-based navigation and hazard-detection models.

Completed all Phase I objectives and produced a mission-relevant Phase II simulation plan.
Reached 0.95 +/- 0.05 FCCV in proof-of-concept validation-state testing.
Showed broad compatibility with reviewed NASA-relevant AI model families.
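
Results of the form "fraction +/- uncertainty" can be read against a benchmark with a short calculation. The sketch below assumes a binomial standard error and a two-sigma consistency rule; the case study does not say how its uncertainties were derived, and the 45-of-50 counts are illustrative, not project data. Only the 0.979 +/- 0.015 figure and the 0.9 benchmark come from the results above.

```python
import math

def fraction_with_uncertainty(outcomes):
    # outcomes: list of booleans, True when a validation was computed correctly.
    n = len(outcomes)
    p = sum(outcomes) / n
    return p, math.sqrt(p * (1.0 - p) / n)  # binomial standard error (assumed)

def consistent_with(p, sigma, benchmark, n_sigma=2.0):
    # True when the benchmark lies within n_sigma standard errors of p.
    return abs(p - benchmark) <= n_sigma * sigma

# Illustrative counts only: 45 correct out of 50 trials.
p, sigma = fraction_with_uncertainty([True] * 45 + [False] * 5)
matches_benchmark = consistent_with(p, sigma, benchmark=0.9)

# The reported model-family fraction sits more than two sigma above 0.9,
# so under this rule it exceeds the benchmark rather than merely matching it.
model_family_matches = consistent_with(0.979, 0.015, benchmark=0.9)
print(round(p, 3), round(sigma, 3), matches_benchmark, model_family_matches)
```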

Applications

Where the work can be applied or adapted.

NASA and space exploration

Support deep-space missions where autonomous systems must make decisions faster than ground teams can intervene, including navigation, landing-site assessment, failure detection, and autonomous science operations.

Commercial space

Apply precomputed validation evidence to lunar landers, orbital robotics, in-space servicing, payload operations, and other missions where autonomous systems must remain reliable under changing conditions.

Broader trusted autonomy

Adapt the approach to unmanned systems, disaster response, robotic inspection, industrial automation, and other high-consequence AI systems exposed to drift or novel data.

Conclusion

Transition path

MVPT demonstrates how VISIMO approaches AI assurance for high-consequence mission systems. The Phase I work showed that precomputed transformations, model-derived embeddings, validation metrics, and mission-relevant simulation planning can help teams estimate when AI models remain valid and when they may require fallback, review, or additional validation. That makes MVPT a strong example of testable, traceable, and defensible AI decision support.

More case studies

Continue through the selected VISIMO work examples.

Synthetic imagery and evaluation data

EIKON

A synthetic-imagery R&D effort that improved image-similarity metrics and showed how synthetic visual data can strengthen object detection when real examples are scarce.

Evidence-grounded knowledge systems

Pelarion

A scientific-review workflow that achieved 80% recall and 86% precision in screening tests while keeping source-linked evidence and human review at the center of the process.

Carbon capture forecasting

EmiFor

A DOE Phase I example focused on amine-emissions forecasting, physics-informed modeling, counterfactual analysis, and operational decision support for carbon capture systems.

AI-assisted legal review

AILA

A decision-support concept for accelerating legal-review workflows while keeping evidence, explainability, and human judgment visible.

Autonomous spacecraft R&D

NASA GRAMS

A Phase I SBIR example focused on autonomous failure detection, digital twins, simulated testing, and transition evidence.

Image forensics

Aletheia

An adversarial data detection example built around image forensics, scalable analysis, explainability, and defensible review.

Next step

Turn a mission question into a testable prototype.

VISIMO works with federal stakeholders, primes, research institutions, and technical collaborators on focused AI R&D efforts where software, data, and model evaluation can create practical mission value.

Decision support

AI assurance

Adaptive testing

Geospatial risk
