Contractual AI
How I/O contracts and continuous validation make AI outputs trustworthy.
The Trust Problem
When an AI system produces a result, how do you know it's correct? With a deterministic API call, you can validate the response against a schema and trust the contract. With an LLM reasoning step, you get unstructured output with no guarantees. The AI might produce exactly what you need, or it might hallucinate, drift, or produce something structurally invalid.
Most AI systems deal with this by adding post-hoc validation: check the output, retry if it's wrong, hope for the best. This is fragile. The more autonomous the AI becomes, the more you need a systematic answer to the question: can I trust this output?
The I/O Contract
Every action invoked by RARS has an explicit I/O contract: a defined input schema (the payload) and a defined output schema (the result). The contract is the same regardless of how the action executes. Whether it's a deterministic service call, a SPARQL script, an agentic reasoning task, or a human approval step, the contract defines what goes in and what must come out.
This uniformity is the foundation of trust. The caller doesn't need to know or care whether a result was produced by a direct API call or by an AI that reasoned through multiple steps. The contract guarantees the output schema. RARS validates every result against the contract before it's accepted.
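The mechanics can be sketched in a few lines. This is a hypothetical illustration, not RARS's actual contract format: the `Contract`, `conforms`, and `invoke` names are invented here, and plain Python type checks stand in for real schema validation. The point is the shape of the flow: validate the payload, run the handler, and reject any result that violates the output schema before it is accepted.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Contract:
    input_fields: dict[str, type]    # field name -> expected type
    output_fields: dict[str, type]

def conforms(payload: dict[str, Any], fields: dict[str, type]) -> bool:
    """Every declared field must be present and of the declared type."""
    return all(
        name in payload and isinstance(payload[name], expected)
        for name, expected in fields.items()
    )

def invoke(contract: Contract, handler: Callable[[dict], dict], payload: dict) -> dict:
    if not conforms(payload, contract.input_fields):
        raise ValueError("payload violates input schema")
    result = handler(payload)
    if not conforms(result, contract.output_fields):
        # Rejected before the caller ever sees it, regardless of handler type.
        raise ValueError("result violates output schema")
    return result

# The caller sees only the contract, never how the handler produced the result.
summarize = Contract(
    input_fields={"report_id": str},
    output_fields={"summary": str, "confidence": float},
)
result = invoke(summarize, lambda p: {"summary": "ok", "confidence": 0.9},
                {"report_id": "R-17"})
```

Whether `handler` is a service call or a multi-step reasoning agent, it passes through the same `invoke` gate.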
Agentic Actions, Same Standard
This is where the architecture diverges from the rest of the AI industry. In most systems, deterministic operations have contracts (API schemas, type systems) and agentic operations don't (you get back whatever the LLM generates). The two operate under fundamentally different trust models.
In RARS, agentic actions are held to the same standard as deterministic ones. When a sub-agent reasons through multiple steps to produce a risk assessment, the output is validated against the same SHACL shapes as a risk assessment produced by a direct API call. If the output doesn't conform, RARS catches it.
This means you can swap an action's handler from deterministic to agentic (or vice versa) without changing anything for the caller. A SummarizeReport action might start as a direct LLM call. Later, you could change it to an agentic handler that reads source data, cross-references multiple systems, and produces a richer summary. The contract stays the same. Callers are unaffected. The trust model is unchanged.
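A minimal sketch of that swap, using invented names (`handlers`, `call`) rather than RARS's real dispatch API. Because the caller goes through the contract-checking dispatcher, replacing the handler is invisible at every call site:

```python
from typing import Callable

def deterministic_summary(payload: dict) -> dict:
    # Direct handler: e.g. a single LLM completion (stubbed here).
    return {"summary": f"Report {payload['report_id']}: no issues."}

def agentic_summary(payload: dict) -> dict:
    # Agentic handler: would read source data and cross-reference systems.
    # The output shape is identical, so the same contract still applies.
    return {"summary": f"Report {payload['report_id']}: cross-checked, no issues."}

handlers: dict[str, Callable[[dict], dict]] = {"SummarizeReport": deterministic_summary}

def call(action: str, payload: dict) -> dict:
    result = handlers[action](payload)
    assert isinstance(result["summary"], str)  # contract check, as always
    return result

call("SummarizeReport", {"report_id": "R-17"})   # deterministic handler
handlers["SummarizeReport"] = agentic_summary    # swap the implementation
call("SummarizeReport", {"report_id": "R-17"})   # caller code unchanged
```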
Continuous Validation
SHACL shapes don't just validate action outputs. They validate the entire operational state continuously. Every mutation to the graph (whether from an action result, a direct insertion, or a sub-agent's work) is checked against the shapes from all activated matrices.
Think of this as compiler diagnostics for your business state. Violations surface as errors, warnings, or info-level findings in real time. RARS can inspect these findings, trigger governance workflows, or self-correct. You can review them in the validation view of the IDE.
This is what produces pseudo-deterministic AI. RARS can reason flexibly (the probabilistic layer adapts, makes judgment calls, handles ambiguity), but the outputs are grounded in formal specifications (the symbolic layer validates, enforces contracts, and tracks provenance). The result is AI behavior that's as reliable and auditable as deterministic code, with the adaptability of machine learning.
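The validation loop can be sketched as follows. This is an illustrative stand-in: RARS validates RDF graphs with SHACL shapes, while here plain Python sets and functions play both roles, and all names (`mutate`, `shape_risk_has_level`) are invented. Every mutation re-runs the active shapes and surfaces findings by severity, like compiler diagnostics:

```python
Graph = set[tuple[str, str, str]]  # (subject, predicate, object) triples

def shape_risk_has_level(graph: Graph) -> list[tuple[str, str]]:
    """Every :RiskAssessment node must carry a :level; violations are errors."""
    risks = {s for (s, p, o) in graph if p == "a" and o == ":RiskAssessment"}
    leveled = {s for (s, p, o) in graph if p == ":level"}
    return [("error", f"{node} missing :level") for node in risks - leveled]

active_shapes = [shape_risk_has_level]

def mutate(graph: Graph, additions: Graph) -> list[tuple[str, str]]:
    graph |= additions
    findings = [f for shape in active_shapes for f in shape(graph)]
    for severity, message in findings:
        print(f"{severity}: {message}")  # or trigger a governance workflow
    return findings

g: Graph = set()
first = mutate(g, {(":r1", "a", ":RiskAssessment")})   # surfaces an error
second = mutate(g, {(":r1", ":level", "high")})        # state now conforms
```

The key property is that validation runs against the whole state on every mutation, not just against the output of the action that caused it.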
The Diff View
Every change RARS makes to the operational state shows up in the IDE as structured, reviewable code. Additions and retractions are presented as a diff, just like a code review. Each line traces back to the specific observation and process that produced it, with full provenance.
This is the final layer of trust. Even if every validation passes and every contract is satisfied, you can still review exactly what changed, why it changed, and who (which agent, which process) made the change. The diff view isn't a log; it's a structured, per-statement audit.
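A per-statement diff of this kind is straightforward to sketch. The names below (`diff`, `Triple`) are hypothetical, and real RARS attributes each statement to the specific observation and process that produced it; here a single agent/process pair stands in for that provenance record:

```python
Triple = tuple[str, str, str]

def diff(before: set[Triple], after: set[Triple], agent: str, process: str):
    """Each added or retracted statement is one reviewable line, with provenance."""
    lines = []
    for t in sorted(after - before):
        lines.append(("+", t, agent, process))   # addition
    for t in sorted(before - after):
        lines.append(("-", t, agent, process))   # retraction
    return lines

before = {(":r1", ":level", "low")}
after = {(":r1", ":level", "high"), (":r1", ":reviewed", "true")}

changes = diff(before, after, "risk-agent", "ReassessRisk")
for sign, triple, agent, process in changes:
    print(sign, triple, f"[{agent} / {process}]")
```

For a sensitive operation, the commit step would simply wait for human approval of `changes` before applying them.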
For sensitive operations, you can review and approve changes before they're committed. For routine operations, the contracts and validation provide confidence without requiring manual review. The level of human oversight scales with the sensitivity of the work.
Summary
- I/O contracts on every action: defined input and output schemas, validated uniformly regardless of handler type
- Agentic outputs held to the same standard: AI-produced results are validated against the same schemas as deterministic API responses
- Handler swappability: change an action from deterministic to agentic without breaking callers or the trust model
- Continuous validation: SHACL shapes check the full operational state in real time, surfacing violations as compiler diagnostics
- Pseudo-deterministic AI: flexible reasoning grounded by formal validation, producing reliable and auditable behavior
- Diff view for operations: every change reviewable as structured code with per-statement provenance
See Also
- Actions: the framework for defining operations with I/O contracts
- Agentic Actions: how non-deterministic reasoning steps are held to the same contractual standard
- Constraints and Validation: how SHACL shapes enforce continuous validation of operational state
- Provenance: how every observation is tracked with full attribution