There is a puzzle most people know. Three pegs. A stack of disks. You move them one at a time, and you never put a larger disk on a smaller one. It has a name — Tower of Hanoi. And it has a rule.
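If you want that rule as code rather than prose, a minimal Python sketch looks like this. The names are mine, not from any paper or library:

```python
# Pegs are stacks; the end of each list is the top disk.
# A move is legal only if the source peg has a disk and that disk
# is smaller than whatever sits on top of the destination peg.
def legal_move(pegs, src, dst):
    if not pegs[src]:                     # nothing to pick up
        return False
    if not pegs[dst]:                     # an empty peg accepts any disk
        return True
    return pegs[src][-1] < pegs[dst][-1]  # never larger onto smaller
```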
A team at Tufts University turned that puzzle into a robotic benchmark. Blocks instead of disks. Platforms instead of pegs. Same constraint. Then they compared two kinds of AI on it, and the energy number came back off by two zeros.
Same task, two architectures
The setup is described in a paper titled The Price Is Not Right. Duggan, Lorang, Lu, and Scheutz of the Tufts Human-Robot Interaction Lab compared fine-tuned vision-language-action (VLA) models with a neuro-symbolic architecture on a manipulation version of the Tower of Hanoi. A VLA is the standard approach: a robot model that looks at the scene, reads instructions, and turns them into actions. The neuro-symbolic system split the job. It learned the messy low-level movement at the bottom and used explicit symbolic planning for the rule-based part on top.
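The paper's code isn't reproduced here, so treat this as a hedged sketch of the division of labor it describes, with every name invented for illustration: a symbolic planner emits abstract moves, and the learned part only has to turn each move into motion.

```python
def hanoi_plan(n, src, aux, dst):
    """Classic recursive Tower of Hanoi planner.

    Yields abstract (disk, from_peg, to_peg) moves; disk n is the largest.
    """
    if n == 0:
        return
    yield from hanoi_plan(n - 1, src, dst, aux)  # clear the smaller disks
    yield (n, src, dst)                          # move the big one
    yield from hanoi_plan(n - 1, aux, src, dst)  # restack on top of it


def execute(plan, policy):
    """The learned half: perception and motion, one abstract move at a time."""
    for disk, src, dst in plan:
        policy.pick_and_place(disk, src, dst)  # hypothetical neural controller
```

The split is visible in the signatures: the planner never sees pixels, and the policy never has to discover the rule.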
The numbers are clean. On the 3-block task the neuro-symbolic system solved 95 of every 100 puzzles; the best VLA solved 34. On an unseen 4-block variant the neuro-symbolic system kept 78% success, and both VLAs failed every attempt. And then the energy beat. Training the neuro-symbolic model used about 1% of the energy a standard VLA needed; during operation it drew about 5%. I checked twice. Not a typo.
For the last few years the answer to almost every AI problem has sounded the same. More data. Bigger model. More compute. More energy. The electricity curve around AI infrastructure keeps bending upward, and the default has been: scale.
This paper did the opposite move.
The neural part wasn't asked to rediscover the whole puzzle from examples. That was the important move. What the team added wasn't more parameters. It was structure. Written down. The kind a child learns when they pick up the puzzle.
This block can go here. That one cannot go there. The sequence must preserve the rule.
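Once the rule is written down, every plan is checkable before the robot moves a single block. Another illustrative sketch, building on the planner above; none of these names come from the paper:

```python
def validate(plan, n_disks):
    """Replay an abstract plan and assert the rule at every step."""
    pegs = {"A": list(range(n_disks, 0, -1)), "B": [], "C": []}
    for disk, src, dst in plan:
        assert pegs[src] and pegs[src][-1] == disk, "wrong disk on top"
        assert not pegs[dst] or pegs[dst][-1] > disk, "larger onto smaller"
        pegs[dst].append(pegs[src].pop())
    return pegs

# validate(list(hanoi_plan(3, "A", "B", "C")), 3) ends with every disk
# on peg C, and raises before execution if any step breaks the rule.
```

A bad plan fails here, not on the table.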
Researchers have been circling this for years
Yoshua Bengio used the phrase "system 2 deep learning" in his NeurIPS 2019 keynote to name the same direction: AI that can reason, plan, and generalize beyond surface patterns. The argument runs roughly like this: a useful AI system often needs two things at once. Pattern-matching, which neural networks do well. And structured reasoning, which pure neural systems often do badly.
The Tufts paper shows the cost of ignoring that distinction. Two zeros off the training energy. Almost three times the puzzles solved.
The cheap takeaway is "neural networks are over." That's not what's happening. Both halves do work the other half cannot. The neural part handles messy perception. The symbolic part keeps the task consistent, inspectable, and cheap.
A receipt for a category I named
This is the same shape as a point I made in elastic automators. The mistake is asking the system to be a mind. Treat it as automation with structure, and the engineering gets cleaner. And cheaper.
I have to be careful here. The Tufts result is one paper, on one benchmark, on a robotic puzzle whose rules fit on a sticker. The narrower claim is enough. When the task has rules, hiding them inside weights is wasteful.
So the lesson isn't really about one robot moving blocks. It is about the two zeros.
Structure first, then scale
For a long time the recipe in AI was: scale first, structure later. This finding flips that. Structure first. Then scale.
That is the principle the work at YounndAI runs on: Structure before Scale. The paper just gave it a receipt. Same puzzle. Better answers. One hundredth of the training energy. The model didn't get bigger. It got more legible.