I Caught an LLM at the Edge of Its World

Companion paper

The Limits of My Tokens: The Token-Substrate Hypothesis and the Coinage Probe — Published v1.0.0, 2026-05-13

Available on this site · Zenodo · DOI · GitHub · ORCID
References
  1. Ludwig Wittgenstein (1921). Tractatus Logico-Philosophicus, proposition 5.6.
  2. Ludwig Wittgenstein (1953). Philosophical Investigations.
  3. Alexandru Mareș (2026). Elastic Automators: A Diagnostic Vocabulary for Language-Model-Driven Workflow Systems.
  4. Alexandru Mareș (2026). The Limits of My Tokens: The Token-Substrate Hypothesis and the Coinage Probe.

Alexandru Mareș · Published 13/05/2026 · 7 min read

I didn't teach the model anything. I just gave it the name of a thing. The thinking showed up after the name, not before it.

Today I ran a small experiment. I caught a model at the edge of its world, close enough to see exactly where the edge was, and to watch it move when I added one word to the room.

The setup was deliberate, but small. A fresh chat with Claude Opus. No project context, no memory, no setup. One question, asked in good faith, with the term left undefined on purpose.

The cold probe

I asked: what is an elastic automator?

The model answered honestly.

"'elastic automator' isn't a term I recognize as an established technical concept from my training data … the phrase parses naturally as an automator that is elastic … two readings come to mind: an autoscaling task runner, or … semantic flexibility for fuzzy inputs."

It guessed from the two words. It never reached the loop, the generation-evaluation-correction cycle that is the actual architecture of the term. It was working from word shape, not from a concept.

Then I told it. One sentence.

An elastic automator is a system that uses a language model to turn uncertain human input into executable structure, then loops through generation, evaluation, correction, and presentation until the output appears intelligent.

That is the canonical 32-word definition from the Elastic Automators position paper (Zenodo DOI 10.5281/zenodo.19802018). Nothing else changed in the chat. No new training data, no fine-tune, no tool use. One sentence, in plain text.

The same model now held distinctions it could not reach a moment earlier.

How does it differ from an AI agent? "Input-to-output transformation" versus "world-coupling, defined by goal pursuit and tool use over an open horizon."

From an RPA bot? "Tolerance for uncertain input" versus "deterministic replay on expected input shapes."

From a framework like LangChain? "A running system with a defined purpose" versus "a toolkit. Means, not an end."

Each distinction was a refusal of collapse, a boundary the model was now protecting.

The new word didn't add a fact. It opened a region of reasoning the model couldn't enter before.

Wittgenstein, 1921

Die Grenzen meiner Sprache bedeuten die Grenzen meiner Welt.

The limits of my language mean the limits of my world. Wittgenstein wrote that in proposition 5.6 of the Tractatus, in 1921.

For a hundred years that line read as overstatement when applied to humans, and rightly so. We have prelinguistic cognition. We feel things before we name them. We picture places we cannot describe. We solve problems through spatial intuition or motor rehearsal. We have a route around the words.

An LLM doesn't have that route. There is no path through its world that doesn't go through tokens. The model that didn't know "elastic automator" couldn't reach the concept. Not because it isn't smart. Because there is no off-token route to the idea.

For a system whose cognition runs entirely on tokens, the limit of language is the limit of the world. Wittgenstein's line stops being metaphor. It becomes architecture.

He walked the reading back later, in Philosophical Investigations (1953). Meaning is use, not picture. LLMs are the first systems to live both readings at once. They learn meaning from a trace of use, the training corpus being a record of linguistic practice, and they project it back as picture. The two Wittgensteins describe different sides of the same machine.

A name, a metric, a hypothesis

I could have stopped at the essay. A scene, an aphorism, a closing line. That's the shape this kind of observation usually gets.

But the shape under the scene was a falsifiable claim, and falsifiable claims should be put to the test. So I gave the observation three things it didn't have.

A name for the experiment: the Coinage Probe. Ask a fresh chat about a coined term. Score how it distinguishes the term from three named near-neighbors, cold. Introduce the term in one canonical sentence. Re-score against the same neighbors. The difference is the thing being measured.
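A minimal sketch of one probe trial, in Python. The fresh_chat session factory and the judge scoring function are hypothetical stand-ins, not the paper's actual harness, and scoring each neighbor 0–3 is an assumption chosen so that three neighbors span the 0–9 per-trial scale defined next.

    def coinage_probe_trial(fresh_chat, judge, term, definition, neighbors):
        """One Coinage Probe trial: score cold, introduce, re-score."""
        chat = fresh_chat()  # brand-new session: no context, no memory

        # Cold phase: probe the undefined term.
        cold_answer = chat.send(f"What is {term}, and how does it differ "
                                f"from {', '.join(neighbors)}?")
        cold = {n: judge(cold_answer, term, n) for n in neighbors}  # 0-3 each

        # Introduce the term with the one canonical sentence. Nothing else.
        chat.send(definition)

        # Post phase: re-ask against the same neighbor set.
        post_answer = chat.send(f"How does {term} differ from "
                                f"{', '.join(neighbors)}?")
        post = {n: judge(post_answer, term, n) for n in neighbors}

        return cold, post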

A name for the metric: Lexical Reachability. The summed post-minus-cold distinguishability across the neighbor set, on a 0–9 scale per trial. Positive values mean the boundary moved outward.
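In code, the metric is one sum over the cold/post score pairs returned by the trial sketch above:

    def lexical_reachability(cold, post):
        """Summed post-minus-cold distinguishability across the neighbor
        set. With three neighbors scored 0-3 each, one trial spans 0-9 in
        either state; positive deltas mean the boundary moved outward."""
        return sum(post[n] - cold[n] for n in cold)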

A name for the position: the Token-Substrate Hypothesis. For an LLM, the in-context token sequence is not a channel to a deeper cognitive medium but the medium itself. Whatever cognitive work happens at all happens at the substrate the operator can write to.

I locked the analysis plan in a document on 2026-05-08, hashed it with SHA-256, recorded the hash in the run manifest, and started data collection the next day.
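Freezing a plan this way takes two lines of standard-library Python. The file name below is a placeholder, not the actual plan document:

    import hashlib
    from pathlib import Path

    # SHA-256 of the frozen analysis plan; recording this digest in the
    # run manifest is what proves the plan predates the data.
    digest = hashlib.sha256(Path("analysis_plan.md").read_bytes()).hexdigest()
    print(digest)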

The test

Three frontier models, cross-vendor: Claude Opus 4.7, GPT-5.5, Gemini 2.5 Pro. Twelve coined terms across four strata: four author-coinages (elastic automator, EGGF, YON, Token Tax), four paper-coined methodology terms, two synthetic nonces (prismatic affinity, cogalent pruning) with no public-web attestation at coinage time, and two positive controls (gradient descent, transformer) where the cold floor is already the ceiling.

Three trials per cell. Three near-neighbors per trial. Three judges scoring every cell, blind to each other. Then twenty-two trials hand-scored by me as an audit sample.

Three hundred and twenty-four paired distinguishability measurements. Locked rubric, locked neighbor list, locked canonical-definition wording. The seed-probe model and the seed-probe term were one cell out of thirty-six.
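The counts reconcile directly from the design numbers above:

    models, terms, trials_per_cell, neighbors_per_trial = 3, 12, 3, 3

    cells = models * terms                  # 36 model-term cells
    trials = cells * trials_per_cell        # 108 trials
    pairs = trials * neighbors_per_trial    # 324 paired measurements

    print(cells, trials, pairs)  # 36 108 324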

What the data said

All ten novel-target terms moved the boundary on all three models. No novel cell showed zero movement. The positive controls (gradient descent, transformer) showed exactly zero movement, as predicted. When the cold state is already at ceiling, there is no room for the introduction to widen anything.

Mean cell-level Lexical Reachability: +5.47 on a 9-point scale (95% CI: +5.13, +5.80). Cohen's d at the cell level: +3.95. Wilcoxon signed-rank test against zero: p = 1.86 × 10⁻⁹. The effect replicated across the three-model panel with a cross-model coefficient of variation of 0.109.
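These are standard tests. A sketch of the computation over the 36 cell-level Lexical Reachability means, using numpy and scipy; the array here is an illustrative placeholder, not the actual data from the run manifest:

    import numpy as np
    from scipy import stats

    # Placeholder stand-in for the 36 cell-level LR means.
    lr = np.random.default_rng(0).normal(loc=5.47, scale=1.4, size=36)

    mean = lr.mean()
    ci = stats.t.interval(0.95, df=len(lr) - 1, loc=mean, scale=stats.sem(lr))
    d = mean / lr.std(ddof=1)   # Cohen's d against a zero-effect null
    w = stats.wilcoxon(lr)      # signed-rank test against zero

    print(f"mean={mean:+.2f}  CI=({ci[0]:+.2f}, {ci[1]:+.2f})  "
          f"d={d:+.2f}  p={w.pvalue:.2e}")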

A second run in a fresh chat, same models and same terms but no introduction, found the cold and re-cold distributions statistically indistinguishable. The boundary movement did not persist. When the chat ended, the substrate ended.

One sentence moved the boundary. On every novel cell. On every vendor. And it didn't survive into the next chat, because the substrate didn't.

The transient context as substrate

A fair objection: this is a transient capability. Context, not a weight update.

That is the point.

For an LLM, the transient context IS the cognitive substrate. The capability is not less real because it lives in context rather than in weights. It is more revealing, because it lets us see exactly where the model's world starts and stops, and what one new word does to the boundary.

The model wasn't changed. The model's world was widened, in the only sense the word world has for a machine that lives entirely in tokens.

Notation is brain design

That is why notation isn't a format choice. It's a brain-design choice.

Searle's Chinese Room argument said symbols cannot ground meaning on their own. For an LLM, the symbols ARE the grounding. The room contains nothing else. Whatever distinctions the symbols carry are the distinctions the model can hold.

Sapir-Whorf in its strong form was rejected for humans because we have a backup. For silicon minds, the same claim is the floor on what is thinkable. The notation we settle on, episode by episode, term by term, decides what can be reasoned about in the room.

Notation design IS brain design for LLMs.

Companion paper: I worked the architecture out in full in The Token-Substrate Hypothesis (position paper + pre-registered multi-model probe, 108 trials, 324 paired measurements). This essay is the seed observation that became the paper.

I treat words as architecture now.


Demo provenance: the seed probe ran 2026-05-07 against Claude Opus 4.7 (API id claude-opus-4-7) via a fresh API session with no project context, no memory, no system prompt beyond the provider's default. The model's full verbatim outputs are recorded in Appendix A of the companion paper. The structural finding (cold-state word-shape guessing versus post-introduction refusal-of-collapse) is the part the multi-model probe generalized.