

Anatomy of a Fail-Closed System

Alexandru Mareș

Published: 10/02/2026 · Read time: 3 min · Topics: Architecture, Safety, YON

The Weight

I built the Runner to say no.

Not because I distrust the machine. Because I respect the weight of action. An agent does not just transmit data. It generates intent. It writes plans. It asks to execute code. If I treat an agent's output as a script to be run blindly, I surrender control.

I needed a boundary.

The Separation

The YON architecture separates the Plan from the Actor. I think of it as a firewall.

YON is the notation of intent. It is a document. It is inert. A plan cannot delete a database. A plan cannot spend money.

The Parser is the reader. It translates text into structure. It validates syntax. It never touches the system.

The Runner is the actor. It translates structure into consequence. It holds the keys. It is the only component capable of harm.

Because the Runner holds the power, the Runner must hold the discipline.
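The separation can be sketched in a few lines. This is a hypothetical model, not the real YON types: the Plan is frozen data that can describe an action but never perform one, and only the Runner holds the effectful handlers.

```python
from dataclasses import dataclass

# Hypothetical sketch of the Plan/Actor separation; the real YON types differ.
@dataclass(frozen=True)
class Step:
    op: str        # e.g. "std:fs.write"
    target: str

@dataclass(frozen=True)
class Plan:
    steps: tuple[Step, ...]    # inert: pure data, no side effects

class Runner:
    """The only component that holds the keys (here, a handlers table)."""
    def __init__(self, handlers):
        self._handlers = handlers   # op name -> effectful callable

    def run(self, plan: Plan):
        for step in plan.steps:
            self._handlers[step.op](step.target)   # consequence happens here
```

A Plan that names std:fs.write does nothing until a Runner that actually holds a std:fs.write handler agrees to run it.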

The Pipeline

I designed the Runner around five phases. A YON document must survive all five before any side effect occurs:

Parse. Build the AST from text. Line boundaries are the markers.

Validate. Check structural rules and profile constraints. Verify required header fields.

Resolve. Build the dependency graph. Map in and out references to establish the flow.

Plan. Topologically sort the @STEP records to determine a valid, parallelizable execution order.

Execute. Run operations within a sandboxed context. Perform per-operation permission checks.
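The five phases can be sketched as a chain of small functions. The step format here (an @STEP line carrying a name and comma-separated dependencies) is a toy assumption for illustration, not YON's actual syntax; the point is that side effects are confined to the final phase.

```python
from graphlib import TopologicalSorter

def parse(text):
    """Phase 1: text -> step dicts (toy "@STEP name dep1,dep2" format)."""
    steps = []
    for line in text.splitlines():
        if line.startswith("@STEP"):
            _, name, *rest = line.split()
            deps = rest[0].split(",") if rest else []
            steps.append({"name": name, "deps": deps})
    return steps

def validate(steps):
    """Phase 2: structural rules; here, step names must be unique."""
    names = [s["name"] for s in steps]
    if len(names) != len(set(names)):
        raise ValueError("duplicate step names")
    return steps

def resolve(steps):
    """Phase 3: build the dependency graph; every reference must exist."""
    known = {s["name"] for s in steps}
    graph = {}
    for s in steps:
        missing = set(s["deps"]) - known
        if missing:
            raise ValueError(f"unresolved references: {missing}")
        graph[s["name"]] = set(s["deps"])
    return graph

def plan(graph):
    """Phase 4: topological sort; a cycle raises here, before execution."""
    return list(TopologicalSorter(graph).static_order())

def execute(order, actions):
    """Phase 5: the first point where side effects are allowed to happen."""
    return [actions[name]() for name in order]
```

A malformed document dies in phase 1, a duplicate name in phase 2, a dangling reference in phase 3, a cycle in phase 4. Nothing reaches phase 5 unless everything before it passed.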

The Check

When the Runner reaches a @STEP in the Execute phase, it enters a strict evaluation:

  1. Identify. Extract the operation identifier (e.g., std:fs.write).
  2. Lookup. Consult the active Policy for a matching rule.
  3. Evaluate. Check for an explicit ALLOW. Evaluate any conditions.
  4. Execute or Reject.

If no rule matches, the operation fails. If the rule says DENY, the operation fails. If a condition fails, the operation fails.

The system does not fail open. It fails closed.
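The whole check fits in one function. The rule shape below is an assumption for illustration, but the structure is the point: True is reachable only through an explicit ALLOW whose condition holds, and every other path, including the absence of a rule, falls through to rejection.

```python
# Toy fail-closed policy check; the rule representation is assumed.
ALLOW, DENY = "ALLOW", "DENY"

def check(policy, op, context):
    """Permit an operation only on an explicit ALLOW with a passing condition."""
    rule = policy.get(op)                     # 2. Lookup
    if rule is None:
        return False                          # no matching rule: fail closed
    if rule["action"] != ALLOW:
        return False                          # explicit DENY: fail
    cond = rule.get("condition")              # 3. Evaluate
    if cond is not None and not cond(context):
        return False                          # condition failed: fail
    return True                               # 4. Execute
```

Note that an unknown op and a denied op are indistinguishable to the caller: both are simply not permitted.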

I remember the moment I first tested this. I deliberately submitted a @STEP without a matching policy. The Runner returned E003: Permission denied. It felt like relief. The machine had refused to act without my explicit consent.

Risk Profiles

Not all operations carry equal weight. I categorize them:

🟢 SAFE. Pure functions. Transformations. Logic. std:data.render or std:control.if. No external side effects. I auto-allow these for internal processing.

🟡 GATE. External calls. std:http.get or std:ai.prompt. These leave the boundary of the system. They incur cost or leak information. They require a domain allowlist.

🔴 RISK. Local mutations. std:fs.write or std:sys.shell. These change the state of the host. They are dangerous.
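A classifier for these tiers might key off the operation's namespace. The namespace-to-tier table below is my assumption for illustration; what matters is the default at the bottom, which treats anything unrecognized as the most dangerous tier.

```python
# Hypothetical tier table keyed by operation namespace; the mapping is assumed.
SAFE, GATE, RISK = "SAFE", "GATE", "RISK"

TIERS = {
    "std:data": SAFE, "std:control": SAFE,   # pure: no external side effects
    "std:http": GATE, "std:ai": GATE,        # leave the system boundary
    "std:fs": RISK, "std:sys": RISK,         # mutate the host
}

def tier_of(op: str) -> str:
    """Classify an op by namespace; unknown namespaces default to RISK."""
    namespace = op.rsplit(".", 1)[0]          # "std:fs.write" -> "std:fs"
    return TIERS.get(namespace, RISK)
```

The default is the fail-closed principle again: an operation the system has never heard of gets the strictest treatment, not the loosest.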

For 🔴 RISK operations, a static allowlist is often insufficient. The Runner supports a PROMPT action:

@RULE op=std:fs.write | action=PROMPT

When the Runner encounters this rule, it pauses. It delegates the decision to the human. If the human approves, the step executes. If the human denies, or the prompt times out, the step fails.
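Sketched as code, assuming the Runner can ask the human through a callback where a None return models a timeout, both denial and timeout collapse into the same outcome:

```python
# Hypothetical PROMPT handling; the ask-callback interface is assumed.
def resolve_prompt(ask, op) -> bool:
    """Delegate the decision to the human. Only an explicit 'y' allows."""
    answer = ask(f"Allow {op}? [y/N] ")    # may block until input or timeout
    if answer is None:                     # timeout: fail closed
        return False
    return answer.strip().lower() == "y"   # anything but 'y' is a denial
```

The default in the prompt string is N for the same reason the policy default is rejection: silence must never grant permission.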

The machine never leads. The human always decides.

The Architecture of Trust

Trust is not a feeling. It is a verified state.

By forcing every action through the five-phase pipeline and the policy engine, I transformed the nature of the agent. It stopped being a black box script. It became a verifiable participant in a governed system.

The parser ensures the plan is readable. The Runner ensures the action is permissible.

I do not rely on the agent to be good. I built the Runner to be strict.