Method

The PO-audience filter

Three writer agents produced 28 doc pages in parallel. A separate critic agent passed the output through one rule: would a founder skip past this sentence because it talks about plumbing they don't run? The agent-criticises-agent pattern is what made the result readable. The shape generalises.

Denys Kuzin··7 min read·methoddocsagentstwo-readersbuild-in-public

Three writer agents were running in parallel. One was working through the process pages, one through the operator pages, one through the integration pages. Each had a slug tree, a target reader, and a voice anchor. The reader was a product owner — the founder or PM who logged into Ship to ship a feature, not the engineer who installed it.

While they wrote, a fourth agent waited. Not idle — reading each page the writers had just finished, asking one question of every sentence.

Would the reader skip past this sentence because it talks about plumbing they do not run.

That fourth agent is the post. The writers produced twenty-eight pages over an afternoon. The critic produced a quieter document — one that read like it had been written for the person who would actually open it.

What the writers were asked

The brief was generous. The writers had the codebase, the policy docs, the artifact bundles, the live tracker. They could cite implementation when it helped them think. The reader contract was on the first line of every prompt: a product owner using Ship to ship a feature. The writers held it well enough to start, then drifted, because writers always drift toward the context they were given.

A writer who has read the source produces sentences about the source. A writer who has seen the auth wiring names the auth library. A writer who has watched the chat memory roll over knows what does the rollover, and the name slips into the page because the writer cannot un-know it. It is the gravity of context.

The reader has none of that gravity. The reader has a UI — state names on a card, role labels in the picker, button text that says Send to Dev or Mark Out of Scope. Anything the reader will not meet inside Ship is, for the reader, noise.

What the critic was asked

The critic had one rule. Read every sentence as a founder reading the page on a Monday morning, with two tabs open and a half-finished sprint, and ask whether the sentence buys them anything. If it names a library, an internal ticket id, a protocol, a queue, an adapter — flag it. If it names a state the founder sees in the UI, a role they will pick, a label that appears on their tickets — keep it.

The boundary was a vocabulary one. The cleaned page did not lose information; it lost vocabulary that belonged to a different audience. The shape of every flagged sentence was the same — useful claim, smuggled-in word, optional clause explaining the smuggled word. The critic stripped the smuggled word and the explanation. The claim stayed.

A small concrete one, near-verbatim from the run, in the page about how clarifications work.

Writer: The intake role posts the question to the tracker through the Linear adapter, which holds the workspace OAuth and applies a needs:clarification label so the FSM stops re-picking the ticket.

Critic, after the pass: The intake role posts the question on the ticket and pauses the ticket until a human answers. While it waits, the ticket shows up in your Inbox under Needs clarification.

Same event, two readers. The first sentence is true; it is also one the founder skips. The adapter is not theirs. The OAuth is not theirs. The label is theirs only because it appears in the UI, and the UI already calls the lane Needs clarification. The second sentence trades five proper nouns for two states the founder will see with their own eyes.

The whole run produced about three hundred edits of that shape across the twenty-eight pages. None were rewrites — cuts, with the occasional one-clause replacement.

Why a separate agent, not a stricter prompt

The first instinct was to give the writers the rule. Do not name libraries. Do not name internal ticket ids. Do not say adapter, classifier, embedding. We tried it. The writers became sterile.

A writer policing its own vocabulary writes around the words it is not allowed to use. It hedges. It generates phrases like the underlying mechanism and the relevant subsystem. The prose becomes a series of polite refusals to name things, and the reader feels the absence even when the words themselves are absent.

A writer trusted to use the full vocabulary of its context writes confidently. The page has weight, specificity, examples. It also has the wrong words in it. Then the critic comes through and removes those words, and the weight stays — because the weight was never in the proper nouns; it was in the claims.

Production and enforcement are different jobs. A producer trying to enforce is a producer working against itself. A separate enforcer protects the producer's voice. That, more than the wordlist, is what made the docs readable.

There is a policy version of this argument we wrote earlier in the spring: do not put standing rules inside the prompt for a task. Standing rules live outside, where a different reviewer can check them. The critic agent is the policy made operational. The wordlist is the policy. The pass is the review.

The pattern, named

Three writers, one critic. The writers produce; the critic represents the audience inside the workflow. The artifact passes through both, and the second pass strips anything the first smuggled in.

Producer ↔ critic ↔ artifact. The critic does not write; the critic guards a boundary the writer was supposed to honour and could not, because the writer had too much context.

We already use this shape in three other places, and only noticed once the docs run made it visible.

Release notes vs commit messages. Commit messages on a Ship branch are written for the engineer reviewing the PR — file paths, functions, migration order. The release notes for the same change are written for the founder watching the changelog. Today they are still hand-edited from the commits. The PO-audience filter is the same job, on a smaller artifact.

Status pages vs incident channels. During the cold-pool outage in April, the engineers' channel was full of stack traces, region names, and the pod that flapped. The status page said some users are seeing slow ticket loads; we have identified the cause and are deploying a fix. Same event, two readerships. The incident commander was, in that moment, doing the critic's job by hand.

Sales copy vs engineering Slack. The deck a founder uses to talk to a prospect about Ship has none of the words the engineers use in standups. That translation is a person today. It is exactly the shape the docs run automated.

What these three share: the production surface has more context than the consumption surface needs, and someone has to call which context to drop. The critic agent is that call, made by a role whose job is only that.

What this also means

The pages went up under /ship/docs on Friday. They read like a product manual now. The writers' names — A, B, C — are on the page metadata; the critic's is on the same metadata, one line down. Audience pass by: po-filter-v1. Versioned, because the wordlist will grow.

Docs were the obvious place to start, because the gap between Ship's internal vocabulary and a founder's was the widest surface we had. The pattern would be less visible on a narrower one. It is not less needed there.

Release notes do not have this filter yet. Marketing site copy goes through a human editor playing the critic role part-time. The in-product empty states and error messages — the ones a founder reads when something goes wrong, when their tolerance for our vocabulary is at its lowest — do not have it either. Those are the next three surfaces. The shape is the same; only the wordlist changes, and most of it carries over.

Two readers, two manuals was the structural version of this idea — pick the reader, then pick the words. The PO-audience filter is the operational version. The words got picked twice: once by the writer, once by the critic. Only the second pass survives.

Every artifact a producer writes for an audience it does not belong to deserves a second agent whose only job is to belong to that audience. We have not built ours for release notes, or error copy, or the deck. We will.

The long read

This is one field note. The full argument lives in the book.

Read the book