Workflow essay · Technology & Intelligence · Updated April 2026 · Architecture synthesis + product framing
Jenn OS · Operating layer · 2026

The harness I run over the models.

Most people use AI through a chat window — the raw model, one conversation at a time, forgotten by morning. Jenn OS is the layer I built in between: the scaffolding that turns frontier models into a system that remembers my work, enforces my standards, and compounds across every session.

The model is the engine. The harness is everything around it: the agent loop, the tools, the memory, the guardrails. That is the whole bet: a model is a commodity you rent and swap, but the harness is the part that's actually mine, and it's where judgment, memory, and verification live.

What runs the harness
Two agents — Claude Code and Codex — speaking one shared language.
Three model tiers: Opus for judgment, Sonnet for fan-out, Haiku for routing.
MCP tools written once, reachable by either agent.
Memory in plain local files, not a hidden vector store.
Agents: 2. Claude Code and Codex, one shared contract.
Model tiers: 3. Opus, Sonnet, Haiku, chosen by job, not by habit.
Memory store: local. Plain files I can read by hand.
Hidden telemetry: 0. The source of truth stays explicit.
01

Where I sit as an AI user

There is a real split forming between people who use AI and people who build a harness around it. A harness is the scaffolding that turns a raw model into a working agent: the loop that calls the model and runs its tools, the context it carries between steps, the memory it writes, and the guardrails that catch it when it drifts. Calling a model in a loop is the trivial part of building an agent. The rest is context management, tool execution, and error handling. That “rest” is the system.
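
A minimal sketch of that split, not the actual Jenn OS code: the model call is one line, and everything else in the loop is context, tool execution, and error handling. The names `Step`, `run_agent`, `fake_model`, and the `add` tool are hypothetical, and the model is a stub so the example runs without any API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    """One model reply: either a tool request or a final answer."""
    tool: Optional[str] = None
    arg: str = ""
    answer: str = ""


def run_agent(model: Callable[[List[str]], Step],
              tools: Dict[str, Callable[[str], str]],
              task: str,
              max_steps: int = 5) -> str:
    context = [f"task: {task}"]            # context carried between steps
    for _ in range(max_steps):             # guardrail: the loop is bounded
        step = model(context)              # the "trivial" part
        if step.tool is None:
            return step.answer
        handler = tools.get(step.tool)
        if handler is None:                # error handling: unknown tool
            context.append(f"error: no tool named {step.tool}")
            continue
        try:
            context.append(f"{step.tool} -> {handler(step.arg)}")
        except Exception as exc:           # tool failures go back to the model
            context.append(f"error: {step.tool} failed: {exc}")
    return "stopped: step budget exhausted"


def fake_model(context: List[str]) -> Step:
    """Stub model: asks for the add tool once, then answers with its result."""
    if context[-1].startswith("add -> "):
        return Step(answer=context[-1].split("-> ")[1])
    return Step(tool="add", arg="2 3")


def add(arg: str) -> str:
    a, b = arg.split()
    return str(int(a) + int(b))


result = run_agent(fake_model, {"add": add}, "sum two numbers")
```

With the `add` tool registered the stub finishes in two steps; with no tools registered the loop records an error each turn and stops at the step budget instead of spinning forever.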

So I stopped treating the model as the product. The model is the interchangeable engine; the harness is what determines whether the work is safe, repeatable, and worth trusting tomorrow. Jenn OS is my harness. It is the difference between asking a chatbot a question and running an operating layer that already knows my projects, my stakeholders, my standards, and what I touched yesterday.

Figure 1A

The harness stack

The model sits at the bottom and can be swapped. Everything above it is the part I built and keep — the layer that turns a raw model into a system that remembers and enforces.

Surfaces: Morning brief, context packs, this site, the Command Center
Memory: Build logs, people memory, the standing index, the failure ledger (plain local records)
Harness: The agent loop, tool calls, context management, hooks, guards, subagents, workflows
Models: Opus 4.7, Sonnet 4.6, Haiku 4.5, Codex / GPT-5.3 (interchangeable engines)
02

My data flows

Signal comes in raw and scattered.

Voice notes, two agents' worth of session logs, email, documents, app traces. None of it is useful while it sits in the place it landed.

It gets cleaned, then indexed.

Sync scripts redact secrets and names at the write boundary. An index builder runs every session, so there is one place to ask instead of cold-searching.
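
Redacting at the write boundary could look roughly like the sketch below: text is scrubbed inside the one function that writes records, so nothing unredacted reaches disk. The patterns, the `[REDACTED]` and `[NAME]` placeholders, and the function names are assumptions for illustration, not the real sync scripts.

```python
import re
from pathlib import Path
from typing import List

# Hypothetical secret shapes; a real list would be tuned to the actual inputs.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{8,}"),
    re.compile(r"(?i)(password|token)\s*[:=]\s*\S+"),
]


def redact(text: str, names: List[str]) -> str:
    """Replace secret-looking strings and listed names with placeholders."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    for name in names:
        text = re.sub(rf"\b{re.escape(name)}\b", "[NAME]", text)
    return text


def write_record(path: Path, text: str, names: List[str]) -> None:
    """The only write path: redaction cannot be skipped by a caller."""
    path.write_text(redact(text, names), encoding="utf-8")


sample = redact("Call Dana Smith, token=abc123 and sk-ABCDEFGH1234", ["Dana Smith"])
```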

It lands in records I can read.

Build logs, people memory, the standing index, the failure ledger — plain local files. Inspectable beats magical, every time.

Figure 2A

How my data moves

Nothing important lives only in a chat window. Raw signal comes in, gets cleaned and indexed, lands in plain local files I can read by hand, and only then becomes a surface I act on. The store is the source of truth; the interface is just a view of it.

Inputs: Plaud voice transcripts; Claude + Codex session logs; Gmail and forwarded mail; documents and app traces
Processing: Sync scripts with secret + name redaction; the index builder, run every session; concept-graph extraction
Local stores: Build logs at three layers; people memory + stakeholder ledgers; the standing index (jenn-index.json); failure ledger + complaints registry
Surfaces: Morning brief from open loops; context packs before a task starts; this site + the Command Center
03

The external tools I run on

Claude Code

primary agent

The deepest programmable harness I have found: CLI, skills, lifecycle hooks, subagents, and dynamic workflows. This is where I shape my own agent instead of renting a managed one.

Codex / GPT-5.3

second agent

The long-running, autonomous half. It owns its own worktree namespace and shares my skills and build logs, so the two agents never overwrite each other's work.

Model tiers

engines

Opus 4.7 for judgment and synthesis, Sonnet 4.6 for parallel fan-out, Haiku 4.5 for fast routing and hooks. Chosen by the job in front of me, not out of habit.
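
Choosing by job rather than habit can be as plain as a lookup table. In this sketch the tier names come from the text above, while the job labels and the `pick_tier` function are hypothetical.

```python
# Job label -> model tier. The labels are illustrative assumptions.
TIER_BY_JOB = {
    "judgment": "Opus 4.7",
    "synthesis": "Opus 4.7",
    "fan_out": "Sonnet 4.6",
    "routing": "Haiku 4.5",
    "hook": "Haiku 4.5",
}


def pick_tier(job: str) -> str:
    """Return the tier for a job, refusing to guess for unknown jobs."""
    if job not in TIER_BY_JOB:
        raise ValueError(f"no tier mapped for job {job!r}")
    return TIER_BY_JOB[job]
```

Raising on an unmapped job, rather than defaulting to the largest model, keeps the choice explicit.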

MCP servers

tools

Gmail, Playwright, Supabase, Vercel, Pinecone, and a cross-model interlocutor. Written once against the Model Context Protocol and reachable by either agent.

GitHub + Vercel

durable spine

The operating layer is a private repo; the public site auto-deploys on push. Local archives are buffers, never the source of truth.

Plaud

capture

A voice recorder for meetings, calls, and stray thinking. Transcripts become the raw input layer — paraphrased into my own records, never quoted into anything I ship.

The cross-agent bridge

Claude and Codex enter through different doors but read the same contract, the same skills, and the same build logs. Each owns a deterministic branch-and-worktree namespace, so two agents can run at once without stepping on each other. The same rule fires whether work starts through a shell wrapper or directly. One language, two engines.
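
One way a deterministic namespace might be derived is sketched below: the branch and worktree path are pure functions of the agent and the task, so the same inputs always land in the same place and the two agents can never share a directory. The `agent/slug` naming scheme, the `worktrees` root, and the function names are assumptions; the real convention is not described here.

```python
import re
from pathlib import PurePosixPath
from typing import Tuple

AGENTS = {"claude", "codex"}


def slugify(task: str) -> str:
    """Lowercase the task and collapse anything non-alphanumeric into '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", task.lower()).strip("-")
    return slug or "task"


def namespace(agent: str, task: str, root: str = "worktrees") -> Tuple[str, str]:
    """Return (branch name, worktree path) for an agent and a task."""
    if agent not in AGENTS:
        raise ValueError(f"unknown agent {agent!r}")
    slug = slugify(task)
    branch = f"{agent}/{slug}"
    worktree = str(PurePosixPath(root) / agent / slug)
    return branch, worktree


claude_ns = namespace("claude", "Fix Nav Bug!")
codex_ns = namespace("codex", "Fix Nav Bug!")
```

The resulting pair could then be handed to `git worktree add`, whether the work starts from a shell wrapper or directly.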

04

What makes it good

Rules become machinery
Enforcement, not posters
Most defects already had a rule against them; the gap is enforcement. So a rule that keeps getting broken becomes a hook or a guard that fails the build. The most recent example shipped today: a hard naming rule for this very site is now a pre-build check that blocks any deploy that violates it, plus a redactor that scrubs the name at generation. A cultural rule turning into code is the part that compounds.
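
A rule turned into a pre-build check might look like the following sketch. The actual naming rule is not spelled out here, so the banned string, the scanned file suffixes, and the function names are all hypothetical; the point is only that a non-zero return code is what blocks the deploy.

```python
import re
import sys
from pathlib import Path
from typing import List, Pattern, Tuple


def find_violations(root: Path, banned: Pattern[str],
                    suffixes: Tuple[str, ...] = (".md", ".html")) -> List[str]:
    """Return 'file:line' for every line that matches the banned pattern."""
    hits = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8")
            for number, line in enumerate(text.splitlines(), start=1):
                if banned.search(line):
                    hits.append(f"{path.relative_to(root).as_posix()}:{number}")
    return hits


def prebuild_check(root: str, banned_name: str) -> int:
    """Exit-code style result: 0 lets the build continue, 1 fails it."""
    banned = re.compile(re.escape(banned_name), re.IGNORECASE)
    hits = find_violations(Path(root), banned)
    for hit in hits:
        print(f"naming rule violated at {hit}", file=sys.stderr)
    return 1 if hits else 0
```

Wired into a build script, the return value would be passed to `sys.exit`, so a violation stops the pipeline instead of relying on someone remembering the rule.
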
One place that already knows
The standing index
Every session rebuilds a single catalog of people, pages, projects, routes, skills, and past failures. The first move is to ask the index, not to cold-search or ask me for something the system already holds. When the index can't answer, that's a signal to widen what it captures.
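
The "ask the index first" move can be sketched as a lookup that also records its misses, since a miss is the signal to widen what the index captures. The JSON layout below is an assumption for illustration and is not the real schema of `jenn-index.json`; `ask_index` is a hypothetical name.

```python
import json
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical index shape: category -> key -> record.
SAMPLE_INDEX: Dict[str, Dict[str, Any]] = json.loads("""
{
  "projects": {"site": {"route": "/building"}},
  "failures": {}
}
""")


def ask_index(index: Dict[str, Dict[str, Any]], kind: str, key: str,
              misses: List[Tuple[str, str]]) -> Optional[Any]:
    """Return the indexed record, or None after logging the miss."""
    entry = index.get(kind, {}).get(key)
    if entry is None:
        misses.append((kind, key))   # candidates for the next index build
    return entry


misses: List[Tuple[str, str]] = []
found = ask_index(SAMPLE_INDEX, "projects", "site", misses)
absent = ask_index(SAMPLE_INDEX, "people", "someone", misses)
```
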
Failures are kept, not buried
Cross-project learning
When something breaks in a way that could repeat, it goes in a shared failure ledger with how it was detected and what now prevents it. A fix isn't done until it answers: what failed, how we caught it, how we know it's resolved, and what stops a recurrence.
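
The four questions map naturally onto a record with four required fields. This is a sketch under that assumption; the field names and the `is_closed` check are hypothetical, not the ledger's real format.

```python
from dataclasses import dataclass, fields


@dataclass
class FailureEntry:
    what_failed: str
    how_detected: str
    how_verified_resolved: str
    recurrence_guard: str

    def is_closed(self) -> bool:
        """A fix counts as done only when every question has an answer."""
        return all(getattr(self, f.name).strip() for f in fields(self))


draft = FailureEntry(
    what_failed="deploy shipped with a stale index",
    how_detected="morning brief listed a deleted page",
    how_verified_resolved="",
    recurrence_guard="",
)
done = FailureEntry(
    what_failed="deploy shipped with a stale index",
    how_detected="morning brief listed a deleted page",
    how_verified_resolved="rebuilt index matches the repo tree",
    recurrence_guard="index build runs before every deploy",
)
```

The example entries are invented to exercise the check; they do not describe real incidents.
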
Decisions get priced
Run the numbers
Before an irreversible fork, the system prices the options — base rate, prior, a posterior with a range, and what evidence would move it — instead of going on a hunch. It doesn't decide; it makes the decision legible enough to decide well.
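
One common way to turn a prior into a posterior with a range is Bayes' rule in odds form, applied with a low, middle, and high estimate of how strongly the evidence favours the option. The sketch below assumes that method; the text does not say which calculation the system actually uses, and the function names and numbers are illustrative.

```python
from typing import Tuple


def posterior(prior: float, likelihood_ratio: float) -> float:
    """Update a probability: posterior odds = prior odds * likelihood ratio."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood ratio must be positive")
    odds = prior / (1.0 - prior) * likelihood_ratio
    return odds / (1.0 + odds)


def price(prior: float, lr_low: float, lr_mid: float,
          lr_high: float) -> Tuple[float, float, float]:
    """Return (low, mid, high) posteriors for a range of evidence strengths."""
    return (posterior(prior, lr_low),
            posterior(prior, lr_mid),
            posterior(prior, lr_high))


# Start from a base rate of 20% and evidence judged 2x to 8x more likely
# if the option works than if it does not.
low, mid, high = price(0.20, 2.0, 4.0, 8.0)
```

The spread between `low` and `high` is the useful part: if the decision flips inside that range, the next step is finding evidence that narrows the likelihood ratio, not arguing about the midpoint.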
05

The daily loop it ships

1 / Open

Morning brief

Start from local truth rather than from whatever happens to be in the current chat window.

Open loops from build logs and pinned commitments
Active repo and project state
What changed since the last touch
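
Pulling open loops out of build logs could be as small as the sketch below. It assumes open items are written as unchecked Markdown checkboxes (`- [ ]`), which is a guess at a convention rather than the documented log format; `open_loops` is a hypothetical name.

```python
import re
from typing import List

# Assumed convention: "- [ ] item" is open, "- [x] item" is done.
OPEN_LOOP = re.compile(r"^\s*- \[ \] (.+)$")


def open_loops(log_text: str) -> List[str]:
    """Return the text of every unchecked item in a build log."""
    loops = []
    for line in log_text.splitlines():
        match = OPEN_LOOP.match(line)
        if match:
            loops.append(match.group(1).strip())
    return loops


sample_log = """\
## 2026-04-01
- [x] ship redactor
- [ ] wire pre-build check into deploy
  - [ ] add test fixture
"""
brief = open_loops(sample_log)
```
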
2 / Route

Context pack

Prepare the right project before work begins, not halfway through the task.

Relevant `_WORKSPACE.md` and `_BUILD_LOG.md`
Skill files and canonical commands
Expected output and closeout path
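
Assembling a context pack before work starts might look like this sketch: load the two named files from a project directory and report anything missing up front rather than halfway through the task. The file names come from the list above; the function name and the returned structure are assumptions.

```python
from pathlib import Path
from typing import Dict, List, Union

PACK_FILES = ("_WORKSPACE.md", "_BUILD_LOG.md")


def build_context_pack(project_dir: Path) -> Dict[str, Union[Dict[str, str], List[str]]]:
    """Load the pack files that exist and list the ones that do not."""
    loaded: Dict[str, str] = {}
    missing: List[str] = []
    for name in PACK_FILES:
        path = project_dir / name
        if path.is_file():
            loaded[name] = path.read_text(encoding="utf-8")
        else:
            missing.append(name)
    return {"loaded": loaded, "missing": missing}
```
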
3 / Build

Working surface

Keep the active contract visible while the work is moving so the agent does not drift.

Objective and success condition
Loaded sources and quality checks
Decision point when a real tradeoff appears
4 / Close

Scoped memory

Treat closeout as part of the product rather than as an optional afterthought.

Build log draft from real artifacts
Canonical output recorded
Tomorrow hook and skill promotion cue
06

What it refuses to be

The interesting question stopped being whether models can collaborate. They can. The question is where judgment, memory, and verification should live when the models, shells, and plugins keep changing under you. A harness that hides its own memory or can't explain a step isn't an operating layer; it's just a prettier way to forget.

So Jenn OS holds a few hard lines. They're what keep it from drifting into the demo-friendly version of itself.

Hard lines
No hidden vector-memory religion pretending to replace explicit local records.
No universal life dashboard before the work loop itself is trustworthy.
No plugin-first framing that makes the system depend on one host surface.
No autonomous theater where the interface looks alive but cannot explain itself.

Bottom line

Own the harness, rent the model. The harness is where my work compounds — and it stays mine no matter which model is winning that week.