
# Notation as Alignment

## Definition

**Notation as Alignment** is the claim that AI alignment can be enforced at runtime, through structural constraints in the notation the agent reads and writes — without retraining, without weight modification, and without the inspectability problem that training-based approaches inherit. Where RLHF, constitutional AI, and reward modeling encode alignment into model weights ("locks on doors, and locks can be picked"), notation-as-alignment encodes it into the shape of the hallway the model has to walk through.

The mechanism contrast is direct: training-based alignment is opaque (you cannot inspect which constraint failed when a jailbreak succeeds); notation-based alignment is inspectable, editable, and auditable as plain text.
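To make "inspectable, editable, and auditable as plain text" concrete, here is a minimal sketch of a runtime gate. Everything in it is an assumption made for illustration: the `require` / `allow` rule syntax, the `KEY: value` message shape, and the names `RULES_TEXT`, `Violation`, and `check` are hypothetical and are not YON or any published notation. The only point it shows is that when a constraint lives in text rather than in weights, a failure can name the exact rule line that was broken.

```python
from dataclasses import dataclass

# Hypothetical plain-text rule sheet (invented for this sketch, not YON).
# "require FIELD"        -> the message must carry a non-empty FIELD
# "allow FIELD a|b|c"    -> if FIELD is present, its value must be one of a, b, c
RULES_TEXT = """\
require INTENT
require ACTION
allow ACTION read|summarize|ask
"""


@dataclass(frozen=True)
class Violation:
    rule: str    # the exact rule line that failed, copied from the rule sheet
    reason: str  # human-readable explanation for the operator


def parse_fields(message: str) -> dict[str, str]:
    """Read 'KEY: value' lines from an agent message into a dict."""
    fields = {}
    for line in message.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def check(message: str, rules_text: str = RULES_TEXT) -> list[Violation]:
    """Return every rule the message breaks; an empty list means it passes."""
    fields = parse_fields(message)
    violations = []
    for rule in rules_text.splitlines():
        parts = rule.split()
        if not parts:
            continue  # skip blank lines in the rule sheet
        if parts[0] == "require" and len(parts) == 2:
            if not fields.get(parts[1]):
                violations.append(Violation(rule, f"missing field {parts[1]}"))
        elif parts[0] == "allow" and len(parts) == 3:
            key, allowed = parts[1], parts[2].split("|")
            if key in fields and fields[key] not in allowed:
                violations.append(
                    Violation(rule, f"{key}={fields[key]!r} is not one of {allowed}")
                )
        else:
            # Fail closed: a rule the gate cannot read counts as a violation.
            violations.append(Violation(rule, "unrecognized rule"))
    return violations


if __name__ == "__main__":
    for v in check("INTENT: clean up\nACTION: delete"):
        print(f"blocked by rule {v.rule!r}: {v.reason}")
```

In this sketch, changing what the agent may do means editing one line of `RULES_TEXT`; no retraining is involved, and the edit itself is a reviewable text diff. A real notation would need far more than field checks, so treat this as an illustration of the inspectability property only.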

## Coined by

Alexandru Mareș (@allemaar)

## First published

2026-04-13 — Episode E0015 *Notation as Alignment* and the companion essay [`notation-as-alignment`](https://allemaar.com/writing/notation-as-alignment).

## Canonical artifact

- **Coining episode:** [[2026-E0015 - Notation as Alignment/_metadata|E0015 — Notation as Alignment]]
- **Companion essay:** [Notation as Alignment](https://allemaar.com/writing/notation-as-alignment) (allemaar.com), 2026-04-13.
- **Series context:** Structure Before Scale — follows [[2026-E0014 - The Strong Form/_metadata|E0014 — The Strong Form]], precedes [[2026-E0016 - The Borges Warning/_metadata|E0016 — The Borges Warning]].
- **Hook:** *"Every AI safety technique today is basically a lock on a door. And locks can be picked."*
- **Closing line:** *"And right now, almost nobody is building them."*

## Bodies that develop this concept

- [[2026-E0034 - The First Law That Doesn't Know What AI Is/_metadata|E0034 — The First Law That Doesn't Know What AI Is]] (2026-05-06) — applies notation-as-alignment at continent scale: legal prose (Article 3(1) of the EU AI Act) as the bracket that forces a moving category to hold still; "the format does work on the thing" extended from tooling/research-paper notation to statute notation.
- [[2026-E0036 - What Notation Did to History/_metadata|E0036 — What Notation Did to History]] (2026-05-09) — historical case material the concept presupposes: writing → musical notation → calculus, three moments where notation created a kind of thought that wasn't there before. Anchors the format-shapes-thought lineage (Ong, Goody, Cajori) and is the long-form argument the coining piece compressed.
- [[2026-E0037 - I Caught an LLM at the Edge of Its World/_metadata|E0037 — I Caught an LLM at the Edge of Its World]] (2026-05-13) — the live demonstration: one defined word opens a region of reasoning the model could not enter before. Notation as alignment at the session level — what gets carried in tokens decides what distinctions the model can hold. Companion to the [[token-substrate-hypothesis|TSH]] paper that formalized the mechanism.

## Related concepts

- [[yon|YON]] — the notation that operationalizes runtime structural alignment
- [[ai-cognition|Cluster: AI Cognition]] — parent Cluster
- [[structure-before-scale|Structure Before Scale]] — the principle this concept operationalizes for the safety surface
- [[synthetic-clarity|Synthetic Clarity]] — adjacent discipline (gates over filters, structure over training)
- Emitter Faithfulness · Sapir-Whorf for AI · The Grooves (E0004)
- RLHF · Constitutional AI · Reward Modeling (the techniques notation-as-alignment complements rather than replaces)

## Why it matters

Almost the entire AI safety conversation is weight-centric — training-based, opaque, and brittle in exactly the way jailbreaks expose. Notation as Alignment names a complementary mechanism that the conversation is missing: alignment as runtime structure, written in plain text, readable and changeable without retraining. Jailbreaks succeed against trained constraints partly because nobody can see which tumbler gave way; runtime structural constraints fail visibly, in slow motion, where an operator can intervene.

The term's value is to make the alternative *findable*. Once it has a name, the absence becomes auditable: which AI safety teams are working on the runtime/notation layer? Which are working only on weights? The closing line — "almost nobody is building them" — only lands once the category exists.

## Status

`active` — coined 2026-04-13. The complementary-not-replacement framing (alongside RLHF / constitutional AI / reward modeling) is the canonical positioning. Anchors the notation-side of the AI Cognition Cluster's safety subsurface.

## Bodies that mention this term (6)

- The Patterns I Stopped Doing by Hand (2026-06-16)
- I Caught an LLM at the Edge of Its World (2026-05-13)
- What Notation Did to History (2026-05-08)
- The First Law That Doesn't Know What AI Is (2026-05-06)
- Notation as Alignment (2026-04-16)
- The Quiet Law: Encoding Ethics into Syntax (2026-01-15)