The stack got smaller, and that turned out to be progress.
Over the last stretch, I spent real time pushing on a few local tools to see what actually held up in a local-first working loop. Some of it was promising. Some of it was clever in theory and expensive in practice. The useful outcome was not that one tool won everything. The useful outcome was that the stack finally told the truth about what it wanted to be.
What the smaller tests actually proved
One of the better checks came from a Claude test where two agents ran work in parallel. That mattered. Delegation is real leverage. But the same experiment also made something else obvious: once too many runtimes, boards, gateways, and vendor assumptions get stacked together, you stop building leverage and start building maintenance debt.
The issue was not ambition. The issue was trust. A system can look advanced on a whiteboard and still become brittle the moment it touches real channels, real state, and real workflow pressure.
The failure that forced the standard
The cleanest lesson came from the ugliest moment. While wiring Hermes into OpenClaw, a bad messaging setup fired random messages into my own WhatsApp groups. That was not a cute bug. That was the point where the stack stopped being interesting and started being untrustworthy.
That mishap forced a harder rule: if the boundary between local execution and outward communication is not brutally clear, the system is not ready. So the WhatsApp and Telegram surfaces were switched off, and the extra control-plane complexity stopped getting the benefit of the doubt.
If a local AI stack can leak into the wrong channel while you are still trying to reason about ownership and routing, it is too loose for serious work.
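One way to picture that rule is a deny-by-default gate in front of every outward channel. This is a minimal sketch, not how Hermes or OpenClaw actually work: `OutboundGate`, `ChannelBlocked`, and the channel names are hypothetical, and the "send" only records the message instead of calling a real messaging adapter.

```python
from dataclasses import dataclass, field


class ChannelBlocked(Exception):
    """Raised when code tries to send on a channel that is not enabled."""


@dataclass
class OutboundGate:
    # Hypothetical allowlist of channel names permitted to send outward.
    # It starts empty, so nothing leaves the machine unless explicitly opted in.
    enabled: set[str] = field(default_factory=set)
    # Record of messages that passed the gate (stand-in for a real adapter call).
    sent: list[tuple[str, str, str]] = field(default_factory=list)

    def send(self, channel: str, target: str, text: str) -> None:
        if channel not in self.enabled:
            raise ChannelBlocked(
                f"{channel} is switched off; refusing to send to {target}"
            )
        # A real adapter call would go here; this sketch only records the message.
        self.sent.append((channel, target, text))


# Usage: with no channels enabled, an accidental WhatsApp send fails loudly.
gate = OutboundGate()
try:
    gate.send("whatsapp", "some-group", "test message")
except ChannelBlocked as err:
    print(err)
```

The design choice is that a misrouted message becomes an exception at the boundary rather than a surprise in a group chat; turning a surface back on is a deliberate, visible change to `enabled`.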
Why Hermes stayed
Hermes still feels like the right place for self-learning, iteration, and a runtime that can actually compound over time. It is close to the terminal, close to the working loop, and close to the messy reality of how operator work actually happens. The important part is not that Hermes can do everything. The important part is that it still earns its place when the work becomes repetitive, stateful, and slightly ugly.
Why Pi stayed
Pi survived too, but for a different reason. Pi is where I want the code to get a little unhinged. It is the local sandbox for fast coding passes, weird experiments, and tool-heavy sessions that do not need to pretend they are polished orchestration. Right now that lane is cleaner with LM Studio as the default local provider. When the goal is focused coding, LM Studio feels closer to just running the model and farther from managing another routing ideology.
What moved out of the center
That does not mean the rest of the stack was useless. It means each piece taught the boundary it belongs behind.
- Agent Zero taught me that interface quality, visible boundaries, and a strong cockpit matter, but those qualities alone do not guarantee a permanent place in the core working loop.
- Paperclip taught a sharper lesson: a board is not the runtime. A management layer can be useful, but the moment it starts pretending to be the worker, the system gets noisier than it needs to be.
- OpenClaw proved that explicit role routing for orchestration, reasoning, and coding is intellectually attractive, but that does not automatically make it a good everyday fit.
- Ollama was useful as a local engine, but once it was asked to help fake a multi-vendor orchestration layer and carry too many personalities in parallel threads, the maintenance cost got harder to justify.
- LM Studio has felt cleaner in the places where I simply want a local model lane to exist and behave.
The smaller stack is the better story
So the direction now is intentionally narrower. Hermes stays for self-learning. Pi stays for local code. The rest moves out of the center of the story.
That is not a retreat. It is a correction. A smaller stack with honest boundaries beats an impressive stack that keeps leaking state, leaking channels, and leaking trust. I would rather keep two surfaces that earn their place every week than five that sound advanced in a diagram and become a tax in practice.
This is the long-arc story. As smaller public notes land on Hermes, Pi, LM Studio, and the stack cleanup, the related timeline below will update from the published feed.