The harness I run over the models.
Most people use AI through a chat window — the raw model, one conversation at a time, forgotten by morning. Jenn OS is the layer I built in between: the scaffolding that turns frontier models into a system that remembers my work, enforces my standards, and compounds across every session.
The model is the engine. The harness is everything around it: the agent loop, the tools, the memory, the guardrails. That is the whole bet: a model is a commodity you rent and swap, but the harness is the part that's actually mine, and it's where judgment, memory, and verification live.
Where I sit as an AI user
There is a real split forming between people who use AI and people who build a harness around it. A harness is the scaffolding that turns a raw model into a working agent: the loop that calls the model and runs its tools, the context it carries between steps, the memory it writes, and the guardrails that catch it when it drifts. Calling a model in a loop is the trivial part of building an agent. The rest is context management, tool execution, and error handling, and that “rest” is the system.
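To make that split concrete, here is a minimal sketch of a harness loop. It is illustrative only, not the Jenn OS implementation: `Step`, `Harness`, `fake_model`, and the `add` tool are all hypothetical names, and the "model" is a stand-in function so the example runs without any API. The model call is a single line; nearly everything else is context, tool execution, error handling, a step budget, and a memory write.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Step:
    """What the (hypothetical) model returns on each turn."""
    tool: Optional[str] = None   # tool to run, or None when the model is done
    arg: str = ""                # argument passed to the tool
    answer: str = ""             # final answer when tool is None


@dataclass
class Harness:
    model: Callable[[list], Step]
    tools: dict
    max_steps: int = 5                           # guardrail: bound the loop
    memory: list = field(default_factory=list)   # survives across tasks

    def run(self, task: str) -> str:
        context = [f"task: {task}"]              # carried between steps
        for _ in range(self.max_steps):
            step = self.model(context)           # the "trivial" part
            if step.tool is None:
                self.memory.append(f"{task} -> {step.answer}")
                return step.answer
            tool = self.tools.get(step.tool)
            if tool is None:
                # Error handling: report the problem back instead of crashing.
                context.append(f"error: unknown tool {step.tool!r}")
                continue
            try:
                context.append(f"{step.tool}: {tool(step.arg)}")
            except Exception as exc:
                context.append(f"error: {step.tool} failed: {exc}")
        raise RuntimeError("step budget exhausted")


def fake_model(context):
    """Stand-in for a real model: asks for one tool call, then answers."""
    if not any(line.startswith("add:") for line in context):
        return Step(tool="add", arg="2 3")
    return Step(answer=context[-1].split(": ", 1)[1])


def add(arg: str) -> str:
    a, b = arg.split()
    return str(int(a) + int(b))


harness = Harness(model=fake_model, tools={"add": add})
result = harness.run("sum 2 and 3")
```

Swapping `fake_model` for a call to a real model would leave the rest of the class unchanged, which is the sense in which the engine is interchangeable and the harness is not.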
So I stopped treating the model as the product. The model is the interchangeable engine; the harness is what determines whether the work is safe, repeatable, and worth trusting tomorrow. Jenn OS is my harness. It is the difference between asking a chatbot a question and running an operating layer that already knows my projects, my stakeholders, my standards, and what I touched yesterday.
Figure 1A
The harness stack
The model sits at the bottom and can be swapped. Everything above it is the part I built and keep — the layer that turns a raw model into a system that remembers and enforces.
My data flows
Signal comes in raw and scattered.
Voice notes, two agents' worth of session logs, email, documents, app traces. None of it is useful while it sits in the place it landed.
It gets cleaned, then indexed.
Sync scripts redact secrets and names at the write boundary. An index builder runs every session, so there is one place to ask instead of cold-searching.
It lands in records I can read.
Build logs, people memory, the standing index, the failure ledger — plain local files. Inspectable beats magical, every time.
Figure 2A
How my data moves
Nothing important lives only in a chat window. Raw signal comes in, gets cleaned and indexed, lands in plain local files I can read by hand, and only then becomes a surface I act on. The store is the source of truth; the interface is just a view of it.
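The flow described above (redact at the write boundary, land in plain files, rebuild one index) can be sketched in a few lines. This is a hedged illustration, not the actual sync scripts: the regex patterns, the `KNOWN_NAMES` set, the `.md` record layout, and the `index.json` filename are all assumptions made for the example.

```python
import json
import re
import tempfile
from pathlib import Path

# Illustrative patterns only; a real redactor would need a vetted list.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{8,}"),        # assumed shape of an API-key-like token
    re.compile(r"(?i)password\s*=\s*\S+"),    # inline password assignments
]
KNOWN_NAMES = {"Alice Example"}               # hypothetical name list


def redact(text: str) -> str:
    """Scrub secrets and names before anything is written to disk."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    for name in KNOWN_NAMES:
        text = text.replace(name, "[NAME]")
    return text


def write_record(store: Path, name: str, raw: str) -> Path:
    """The write boundary: raw signal never reaches the store unredacted."""
    store.mkdir(parents=True, exist_ok=True)
    path = store / f"{name}.md"
    path.write_text(redact(raw), encoding="utf-8")
    return path


def build_index(store: Path) -> dict:
    """Rebuild a word -> files index so there is one place to ask."""
    index = {}
    for path in sorted(store.glob("*.md")):
        words = set(re.findall(r"[a-z0-9]+", path.read_text(encoding="utf-8").lower()))
        for word in words:
            index.setdefault(word, set()).add(path.name)
    result = {word: sorted(files) for word, files in sorted(index.items())}
    # The index is itself a plain local file, readable by hand.
    (store / "index.json").write_text(json.dumps(result, indent=2), encoding="utf-8")
    return result


store = Path(tempfile.mkdtemp()) / "records"
write_record(store, "session-log", "Met Alice Example. password=hunter2 rotate the deploy key.")
index = build_index(store)
```

Because redaction happens inside `write_record`, the index can only ever see cleaned text, and both the records and the index stay inspectable with an ordinary text editor.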
The external tools I run on
Claude Code
primary agent
The deepest programmable harness available: CLI, skills, lifecycle hooks, subagents, and dynamic workflows. This is where I shape my own agent instead of renting a managed one.
Codex / GPT-5.3
second agent
The long-running, autonomous half. It owns its own worktree namespace and shares my skills and build logs, so the two agents never overwrite each other's work.
Model tiers
engines
Opus 4.7 for judgment and synthesis, Sonnet 4.6 for parallel fan-out, Haiku 4.5 for fast routing and hooks. Chosen by the job in front of me, not out of habit.
MCP servers
tools
Gmail, Playwright, Supabase, Vercel, Pinecone, and a cross-model interlocutor. Written once against the Model Context Protocol and reachable by either agent.
GitHub + Vercel
durable spine
The operating layer is a private repo; the public site auto-deploys on push. Local archives are buffers, never the source of truth.
Plaud
capture
A voice recorder for meetings, calls, and stray thinking. Transcripts become the raw input layer, paraphrased into my own records and never quoted into anything I ship.
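Choosing a tier "by the job" rather than by habit amounts to an explicit lookup. The sketch below is one hypothetical way to write it down: the `Job` enum and `pick_tier` are invented for illustration, and the tier strings are display labels taken from the list above, not API model identifiers.

```python
from enum import Enum


class Job(Enum):
    """Kinds of work named in the tier description (hypothetical enum)."""
    JUDGMENT = "judgment"
    SYNTHESIS = "synthesis"
    FAN_OUT = "parallel fan-out"
    ROUTING = "fast routing"
    HOOK = "hook"


# Display labels only; a real harness would map these to provider model IDs.
TIER_FOR_JOB = {
    Job.JUDGMENT: "Opus 4.7",
    Job.SYNTHESIS: "Opus 4.7",
    Job.FAN_OUT: "Sonnet 4.6",
    Job.ROUTING: "Haiku 4.5",
    Job.HOOK: "Haiku 4.5",
}


def pick_tier(job: Job) -> str:
    """Return the tier for a job; an unmapped job raises KeyError on purpose."""
    return TIER_FOR_JOB[job]
```

Keeping the mapping in one table means a model swap is a one-line change, and a job with no assigned tier fails loudly instead of silently falling back to a default.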
Claude and Codex enter through different doors but read the same contract, the same skills, and the same build logs. Each owns a deterministic branch-and-worktree namespace, so two agents can run at once without stepping on each other. The same rule fires whether work starts through a shell wrapper or directly. One language, two engines.
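A deterministic namespace means the branch name and worktree path are a pure function of the agent and the task, so the same inputs always produce the same location and two agents can never collide. A minimal sketch, assuming a naming scheme of `<agent>/<task-slug>` under a `worktrees` directory (the scheme, the `AGENTS` set, and the function names are assumptions, not the real layout):

```python
import re
from pathlib import PurePosixPath

AGENTS = {"claude", "codex"}   # hypothetical agent identifiers


def slugify(task: str) -> str:
    """Lowercase the task and collapse anything non-alphanumeric to '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", task.lower()).strip("-")
    if not slug:
        raise ValueError("task must contain at least one letter or digit")
    return slug


def namespace(agent: str, task: str, root: str = "worktrees"):
    """Return (branch, worktree_path) for an agent and task.

    The results could be passed to `git worktree add -b <branch> <path>`.
    """
    if agent not in AGENTS:
        raise ValueError(f"unknown agent: {agent!r}")
    slug = slugify(task)
    branch = f"{agent}/{slug}"
    return branch, PurePosixPath(root) / agent / slug


claude_branch, claude_path = namespace("claude", "Fix login bug")
codex_branch, codex_path = namespace("codex", "Fix login bug")
```

Because the function has no state and no randomness, it gives the same answer whether it is called from a shell wrapper or directly, which is what lets one rule fire from either entry point.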
What makes it good
The daily loop it ships
Morning brief
Start from local truth rather than from whatever happens to be in the current chat window.
Context pack
Prepare the right project before work begins, not halfway through the task.
Working surface
Keep the active contract visible while the work is moving so the agent does not drift.
Scoped memory
Treat closeout as part of the product rather than as an optional afterthought.
What it refuses to be
The interesting question stopped being whether models can collaborate. They can. The question is where judgment, memory, and verification should live when the models, shells, and plugins keep changing under you. A harness that hides its own memory or can't explain a step isn't an operating layer; it's just a prettier way to forget.
So Jenn OS holds a few hard lines. They're what keep it from drifting into the demo-friendly version of itself.
Bottom line
Own the harness, rent the model. The harness is where my work compounds — and it stays mine no matter which model is winning that week.