The Semantic Operating System
Why everything is data, and why that's the only way to build stateful AI that can truly plan.
The Problem with AI Planning
AI can reason. It can break down complex problems, propose multi-step plans, and adapt when things change. But there's a gap between planning and execution.
Today's LLM-based agents plan by generating text, then execute each step one at a time, reacting to state between stages. They can't look ahead at the full plan and know it will execute correctly. They can't guarantee that step 3 won't invalidate the assumptions of step 7. Every step is a fresh inference, a fresh chance for the plan to drift. This is why "agentic AI" is unreliable at scale: the execution is as non-deterministic as the reasoning.
What's needed is a machine-readable planning language that can execute workflows deterministically. Something like a programming runtime. But traditional programming runtimes are opaque. State lives in memory addresses, call stacks, and heap allocations that can't be meaningfully introspected. You can't pause a Java program mid-execution and ask "what is the current state of every business object and how did it get there?" You'd need to build observability layers, logging, tracing, and monitoring just to approximate an answer.
AI needs to be able to introspect everything, at any point, to understand what's happening. The runtime state needs to be readable, queryable, and semantically meaningful.
Everything Is Readable Data
This is the core insight behind RARS: the entire runtime state is represented as structured, queryable, human and machine-readable data that can be introspected at any point during execution.
The operating model (your matrices) is expressed as RDF. The runtime state (the context graph) is RDF. The execution history (process trees, observations) is RDF. The logic, the instructions, the operational state, the plan, the execution trace: all of it is structured data that RARS can read, query, and reason over at any moment.
This is fundamentally different from traditional runtimes. In a conventional system, you'd need to instrument your code, attach debuggers, or parse log files to understand what happened. In RARS, the state is the data. You can query the graph mid-execution to see every business object, how it got there, which process created it, and whether it's valid. Nothing is hidden in opaque memory. Everything is introspectable.
RARS reads this data to understand your domain, plan its approach, execute operations, and verify results. The AI and the runtime are one system, operating on one unified, fully transparent representation.
This is what makes NeuroSymbolic AI possible. The probabilistic reasoning of the LLM (understanding natural language, making judgment calls, adapting to novel situations) combines with the symbolic execution of the SPARQL engine (deterministic workflows, formal validation, provenance tracking). They work together because they operate on the same data.
A Pragmatic Take on NeuroSymbolic AI
There's a debate in the AI world right now between neural architecture purists, NeuroSymbolic enthusiasts, and symbolic AI proponents. Each camp has strong claims about what makes AI systems work. Our position is pragmatic.
We don't believe symbolic systems make AI reasoning smarter. Neural architectures are remarkably good at understanding intent, making judgment calls, and handling ambiguity. That's not the problem. The problem is on the other side: it is fundamentally impossible for a purely neural system to manipulate the real world. At some point, it needs to execute code to actually do something. It needs to call an API, update a record, approve a request. The question isn't whether the AI can reason about what to do. The question is whether it can do it reliably.
This is where symbolic systems earn their place. Not by making the reasoning smarter, but by making the execution plane smarter. We call this Object-Oriented AI: the AI operates through a concept-oriented programming runtime where domain objects have types, behaviors, and inheritance hierarchies, just like classes in OOP. Semantic inference and the symbolic runtime give the system capabilities that pure neural architectures can't provide on their own:
- Type-based dispatch: action overloads resolve based on class hierarchies, the same way method dispatch works in OOP. The runtime selects the right implementation for the right type without the AI having to figure out which specific handler to call.
- Formal plan execution: a SPARQL script executes deterministically against the graph. The plan runs as written, not as re-inferred at each step.
- Continuous validation: SHACL shapes verify outputs against business rules without the AI having to remember and check every constraint.
- Inference: RDFS entailment derives facts automatically (every ElectricalWorkOrder is a WorkOrder), reducing what the AI needs to explicitly reason about.
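Type-based dispatch can be sketched in a few lines. This is an illustrative model only, not the RARS implementation: the class names, handler names, and resolution order are all hypothetical, and RDF triples are modeled as plain Python tuples.

```python
# Illustrative sketch of type-based dispatch over a class hierarchy.
# Hierarchy fact: ElectricalWorkOrder is a subclass of WorkOrder.
SUBCLASS_OF = {
    ("ElectricalWorkOrder", "WorkOrder"),
}

def superclasses(cls):
    """All classes an instance of `cls` also belongs to (RDFS-style closure)."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for sub, sup in SUBCLASS_OF:
            if sub in result and sup not in result:
                result.add(sup)
                changed = True
    return result

# Action overloads registered per class; the most specific match wins.
HANDLERS = {
    ("Dispatch", "ElectricalWorkOrder"): "dispatch_with_permit_check",
    ("Dispatch", "WorkOrder"): "dispatch_generic",
}

def resolve(action, cls):
    """Pick the handler for the most specific class that defines one.

    (Ordering beyond the direct class is simplified for this sketch.)
    """
    for candidate in [cls] + sorted(superclasses(cls) - {cls}):
        handler = HANDLERS.get((action, candidate))
        if handler:
            return handler
    raise LookupError(f"No handler for {action} on {cls}")

print(resolve("Dispatch", "ElectricalWorkOrder"))  # dispatch_with_permit_check
print(resolve("Dispatch", "WorkOrder"))            # dispatch_generic
```

The AI never has to pick the concrete handler: it requests `Dispatch` on whatever it observed, and the hierarchy does the rest, exactly as virtual method dispatch does in OOP.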
The neural architecture handles the intelligence: understanding what you want, translating intent into a plan, adapting when things change. The symbolic runtime handles the execution: carrying out that plan deterministically, validating the results, and tracking provenance.
This is what we call a ProActive architecture, replacing the ReAct pattern (reason, then act, then react to what happened, then reason again) with an architecture where the AI proactively plans a complete workflow and the symbolic engine executes it with deterministic guarantees. The AI isn't reacting step-by-step. It's planning ahead and letting a smart execution layer carry out the plan.
From Data Models to Operating Models
The industry has been building "semantic layers" for years: canonical data models that give analytics tools a consistent view of your data warehouse. This is useful, but it treats semantics as a data concept. We think semantics is an operations concept.
In traditional systems, business logic is code that acts on data. You write rules, workflows, and decision trees in a programming language, and they read and modify records in a database. The data and the logic are separate concerns.
Poliglot dissolves this separation. The data is the operating model. The ontology defines what can exist. The shapes define what's valid. The actions define what can be done. And critically, the instance data (the actual state of your system) is what drives what does happen.
When RARS materializes a work order from your system of record, it doesn't follow a predetermined code path. It observes the state. If that work order has properties that make it an electrical work order, the class hierarchy resolves, the electrical-specific actions become available, the permit validation rules apply. The business logic emerges from the data at the moment of observation. Until RARS observes the state, the possible paths are open. Once it does, the semantics of the data determine what happens next.
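The "logic emerges at the moment of observation" idea can be made concrete with a toy classifier. Everything here is hypothetical (the property names, the classification rule, the action sets); it only illustrates the shape of the mechanism, not RARS internals.

```python
# Illustrative sketch: the class of a work order, and therefore its
# available actions, is derived from observed data, not a fixed code path.

def classify(properties):
    """Derive the most specific class from the observed properties."""
    if "voltage" in properties or "circuitId" in properties:
        return "ElectricalWorkOrder"
    return "WorkOrder"

# Which actions become available depends on the resolved class.
ACTIONS = {
    "WorkOrder": {"Dispatch", "Close"},
    "ElectricalWorkOrder": {"Dispatch", "Close", "ValidatePermit"},
}

# Until observation, both paths are open; observing the state fixes the class.
observed = {"workOrderId": "WO-2024-0891", "voltage": "480V"}
cls = classify(observed)
print(cls, sorted(ACTIONS[cls]))
```

The electrical-specific action (`ValidatePermit`) appears only because the observed data made the work order electrical; no one wrote an `if` branch into a workflow for this particular order.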
This is why we don't build "semantic layers" for data. We build canonical operating models where the semantics drive operations. The data isn't separate from business logic. The data encodes the business logic. RARS derives its behavior from the state of the system, not from separately written code.
Why RDF?
Given all of this, why RDF specifically? Not because of "semantic web" idealism or knowledge graph trends. Because RDF has a unique property that no other data representation shares: it's simultaneously a programming state representation and a natural language-adjacent data format.
- It's structured enough to execute against deterministically (SPARQL queries, pattern matching, inference rules)
- It's readable enough that an LLM can reason about it (triples are close to natural language: subject, predicate, object)
- It's composable across independent authors (URI-based identity, standard vocabularies, no schema coordination required)
- It supports inference (class hierarchies, property relationships, RDFS/OWL entailment)
- It's self-describing (every resource can carry labels, definitions, annotations)
- It's queryable as a live graph (not a static document format, but a runtime state representation)
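The "structured yet natural-language-adjacent" claim is easiest to see with triples in hand. Below is a minimal sketch using Python tuples as stand-in triples and a pattern matcher as a stand-in for SPARQL; the identifiers (`wo:`, `crew:17`) are made up for illustration.

```python
# Each fact is a (subject, predicate, object) statement, and a query is
# just the same shape with holes in it.
graph = {
    ("wo:WO-2024-0891", "rdf:type", "wo:ElectricalWorkOrder"),
    ("wo:WO-2024-0891", "wo:status", "Open"),
    ("wo:WO-2024-0891", "wo:assignedTo", "crew:17"),
}

def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [(ts, tp, to) for ts, tp, to in graph
            if s in (None, ts) and p in (None, tp) and o in (None, to)]

# "What is the status of WO-2024-0891?" reads almost like the triple itself.
print(match(graph, s="wo:WO-2024-0891", p="wo:status"))
```

An LLM can read these triples nearly as prose, while an engine can match them deterministically; that dual readability is the property the bullet list above is claiming.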
No other format gives you all of these. JSON is readable but not composable or inferrable. SQL is queryable but not self-describing or composable across authors. Programming languages are executable but their state is opaque to the AI. RDF is the intersection of all the properties needed for a NeuroSymbolic runtime.
The Three Layers
The semantic model has three layers that work together:
Ontology (RDFS/OWL)
Defines the vocabulary: what types of things exist, what properties they have, how they relate. This is the "type system" of your operating model. RARS uses it for inference (every ElectricalWorkOrder is automatically a WorkOrder) and for understanding your domain (reading labels and definitions to reason about concepts it hasn't seen before).
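The subclass entailment mentioned here corresponds to the standard RDFS rule that propagates `rdf:type` along `rdfs:subClassOf`. A minimal sketch, again with tuples standing in for triples and illustrative identifiers, not the RARS inference engine:

```python
# Asserting that WO-42 is an ElectricalWorkOrder entails that it is
# a WorkOrder, given the subclass axiom.
graph = {
    ("wo:ElectricalWorkOrder", "rdfs:subClassOf", "wo:WorkOrder"),
    ("wo:WO-42", "rdf:type", "wo:ElectricalWorkOrder"),
}

def entail(graph):
    """Materialize rdf:type triples implied by rdfs:subClassOf."""
    graph = set(graph)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(graph):
            if p != "rdf:type":
                continue
            for sub, p2, sup in list(graph):
                if p2 == "rdfs:subClassOf" and sub == o:
                    if (s, "rdf:type", sup) not in graph:
                        graph.add((s, "rdf:type", sup))
                        changed = True
    return graph

inferred = entail(graph)
print(("wo:WO-42", "rdf:type", "wo:WorkOrder") in inferred)  # True
```

The derived triple never had to be asserted, reasoned about, or remembered by the AI; it falls out of the ontology.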
See Modeling Your Domain for how to think about designing your ontology.
Constraints (SHACL)
Defines what valid data looks like. SHACL shapes are validated continuously, at assembly time and at runtime. They act as compiler diagnostics for your operational state: errors, warnings, and info-level findings that surface as RARS works. This continuous validation is a core part of what makes RARS pseudo-deterministic. Even when an AI reasoning step produces the output, the shapes verify that the result conforms to your business rules.
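The "compiler diagnostics" framing can be sketched as shape checks that return severity-tagged findings. This is not SHACL syntax, just an illustration of the behavior; the rules and field names are invented for the example.

```python
# Illustrative shape checks over a work order, producing findings at
# severity levels rather than hard failures.
def validate(entity):
    findings = []
    if "wo:status" not in entity:
        findings.append(("Violation", "every WorkOrder must have a status"))
    if entity.get("wo:priority") not in ("Low", "Medium", "High", None):
        findings.append(("Violation", "priority must be Low, Medium, or High"))
    if "wo:assignedTo" not in entity:
        findings.append(("Warning", "unassigned work orders will not dispatch"))
    return findings

work_order = {"wo:status": "Open", "wo:priority": "Urgent"}
for severity, message in validate(work_order):
    print(severity, message)
```

Run continuously, checks like these catch a bad value the moment it enters the graph, whether it came from an external system or from an AI reasoning step.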
Instance Data (RDF)
The live operational state. Every entity, every property value, every relationship is a triple in the context graph. Every triple is an observation with provenance (see Provenance and Observability). This isn't a static database. It's the runtime state of your operating model, where every change is tracked and attributable.
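What "every triple is an observation" might look like as a record: a sketch with hypothetical field names (the actual provenance vocabulary is covered in Provenance and Observability, not here).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Observation:
    subject: str
    predicate: str
    obj: str
    source: str       # which process or integration asserted this
    observed_at: str  # when it entered the graph

graph = [
    Observation("wo:WO-42", "wo:status", "Open",
                source="proc:GetWorkOrder", observed_at="2024-06-01T09:00Z"),
    Observation("wo:WO-42", "wo:status", "Dispatched",
                source="proc:DispatchWorkOrder", observed_at="2024-06-01T10:15Z"),
]

# "How did this value get here?" is a query, not a log-archaeology exercise.
history = [o for o in graph if o.predicate == "wo:status"]
print([(o.obj, o.source) for o in history])
```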
From Query Language to Operating System
Standard SPARQL is a query language. You ask questions about a graph and get answers back. It's powerful for analytics and data retrieval, but it doesn't do anything. It can't call an API, approve a work order, or send a notification.
We rebuilt SPARQL from the ground up, transforming it from a query language into a declarative, procedural DSL that actually operates on the world. In RARS's SPARQL engine, a single script can:
- Traverse the knowledge graph to understand the current state
- Invoke service integrations that call your external systems of record
- Execute actions that mutate those systems (create records, update statuses, trigger workflows)
- Delegate to sub-agents for non-deterministic reasoning steps
- Pause for human-in-the-loop approval
- Write observations back to the graph with full provenance
This is what turns RDF from a knowledge representation into a semantic operating system. The graph isn't just something you query. It's something you operate through. A SPARQL script is a deterministic workflow definition that blends graph traversal, external service calls, AI reasoning, and human judgment into a single executable plan.
```sparql
# A single SPARQL script that orchestrates a complete workflow
CONSTRUCT {
  ?workOrder wo:status ?status ;
      wo:priority ?priority ;
      wo:approvedBy ?approver .
}
WHERE {
  # Read current state from the graph
  ?workOrder wo:GetWorkOrder (
      wo:workOrderId "WO-2024-0891"
  ) .

  # Call an AI sub-agent to assess risk
  ?assessment wo:AssessRisk (?workOrder) .
  ?assessment wo:priority ?priority .

  # Pause for human approval
  ?approval wo:RequestApproval (
      ?workOrder
      wo:assessment ?assessment
  ) .
  ?approval wo:approvedBy ?approver .

  # Mutate the external system
  ?dispatch wo:DispatchWorkOrder (
      ?workOrder
      wo:approval ?approval
      wo:priority ?priority
  ) .
  ?workOrder wo:status ?status .
}
```

This script is a complete business workflow expressed as data. RARS can inspect it, reason about it, and execute it deterministically. The graph traversals, API calls, AI reasoning, and human approvals are all operations within the same declarative language. Because the plan itself is data, RARS can:
- Plan: inspect the current state, understand what actions are available, and construct workflows like this as SPARQL scripts
- Execute deterministically: the script runs against the graph with formal semantics. Each step produces defined outputs. The plan executes as written, not re-inferred at each step.
- Verify: SHACL shapes validate results continuously. I/O contracts on actions guarantee output schemas. The provenance system records exactly what happened.
- Adapt: when something unexpected happens (a service fails, data doesn't match expectations), the probabilistic AI reasons about the situation and adjusts. The symbolic engine executes the adjusted plan with the same deterministic guarantees.
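The plan/execute/verify loop above can be shown schematically. All names here are stand-ins (the planner is a function instead of an LLM, the engine a loop instead of a SPARQL runtime), and the adapt branch is omitted; the point is only that the plan is data, produced once and then run as written.

```python
# Schematic of the ProActive loop: plan once, execute deterministically,
# verify the result.
def plan(state):
    """Stand-in for the LLM planner: emit a whole workflow as data."""
    return ["GetWorkOrder", "AssessRisk", "RequestApproval", "DispatchWorkOrder"]

def execute(step, state):
    """Stand-in for the symbolic engine: deterministic step execution."""
    return dict(state, **{step: "done"})

def verify(state):
    """Stand-in for shape validation of the resulting state."""
    return all(v == "done" for v in state.values())

state = {}
workflow = plan(state)            # the AI plans the complete workflow up front
for step in workflow:
    state = execute(step, state)  # the engine runs it as written
assert verify(state)              # validation confirms the outcome
print(state)
```

Contrast this with ReAct, where `plan` would be called again before every single step, each call a fresh chance for the plan to drift.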
This is the NeuroSymbolic loop: the LLM reasons and plans, the symbolic engine executes and validates, and the shared data representation makes it all one coherent system. See The NeuroSymbolic Engine for a deeper look at the runtime architecture.
Summary
- Everything is data: the runtime state, the logic, and the execution history are all represented as queryable, human- and machine-readable data
- Operating models, not data models: you're codifying business logic, not cataloging data for analytics
- RDF is the runtime state: not a storage format, but the live representation that both the AI and the symbolic engine operate on
- Deterministic execution on semantic state: SPARQL provides a formal execution language against a graph the AI can inspect and reason over
- NeuroSymbolic by design: probabilistic reasoning and symbolic execution work together because they share the same data representation
See Also
- RDF & SPARQL: the data representation and query language that underpins the semantic operating system
- Domain Modeling: how to design ontologies that define your operating model
- Actions: the framework for defining operations that RARS can execute
- Constraints and Validation: how SHACL shapes enforce business rules continuously