Case study · Ship on Ship

Method, not boast

Every commit in Ship was shipped through Ship.

We did not build a product and then claim our team uses it. We built a product by using a workspace that was sometimes broken, sometimes half-implemented, sometimes the only thing standing between us and a weekend of manual rebases. The loop you read about on the homepage is the loop that produced the homepage. This page is the audit.

Same numbers. Different framing.

The receipts read the same whether you call them a case study or a confession.

608

Commits in 30 days

0

Hand-typed by humans

99%

Live system uptime

Ship Analytics dashboard with DORA-4 charts and live system status from the Ship-on-Ship workspace

The Ship-on-Ship workspace, today.

Movement one

The first time Ship deployed a Ship feature.

For the first few weeks Ship was a thing we ran for the reference deployment and built by hand in a separate repo. The chicken-and-egg moment had a specific shape: the workspace we were building did not yet trust itself to merge changes to its own backend. We had a process model and a pipeline drawing on a whiteboard. We did not have the courage to point that pipeline at the code that defined it.

The first ticket Ship merged about Ship was small. A copy fix on the inbox empty state — the screen that says nothing needs you right now when the queue is empty. The agent opened the branch. The pipeline ran. The validation bundle flagged that we had three different phrasings of nothing needs you in three different files; we picked one. The reviewer specialist asked, politely, whether we wanted the em-dash or the colon. We picked the em-dash. The PR merged.

It was a copy fix. We treated it like a moon landing. From that morning forward, every change to Ship — schema migrations, auth flows, the pipeline visualization itself — walked the same loop. Some of those changes broke the loop. Then the broken loop could not fix itself, and the operator on duty had to pick up a wrench. That is the actual price of dogfooding: when the thing you are building is also the thing you are using, the cost of a regression is doubled, and the discipline you learn from it is doubled too.

Movement two

What dogfooding catches that a paying customer never would.

The workspace has a long-running epic called QA Debt. It is the place we list the failures only someone who lives in the product every day will find. A paying customer would shrug at most of these. We do not get to shrug, because the workspace breaking quietly is the same thing as our own work stopping.

The taxonomy: a clarification reply that lands in the wrong thread because two tickets share a prefix. A self-heal routine that retries forever instead of giving up after the second attempt. A planning artifact that re-generates its own checklist on every render, so the operator's progress dots reset. A dashboard chart that reads Elite on the workspace home and High on the analytics page because two different cron jobs ran the math fifteen seconds apart and one of them rounded.
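
The retry failure in that list is the easiest one to show in code. What follows is a minimal sketch, not Ship's actual routine: `self_heal`, `GaveUp`, and the `repair` callable are hypothetical names, and the only thing taken from the text is the rule that the routine should stop after the second attempt.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class GaveUp(RuntimeError):
    """Raised when the repair fails on every allowed attempt."""


def self_heal(repair: Callable[[], T], max_attempts: int = 2) -> T:
    """Run `repair` at most `max_attempts` times, then stop and escalate."""
    last_error: Optional[BaseException] = None
    for _ in range(max_attempts):
        try:
            return repair()
        except Exception as exc:  # broad on purpose: this is only a sketch
            last_error = exc
    # Hypothetical escalation path: surface the failure instead of looping.
    raise GaveUp(f"gave up after {max_attempts} attempts") from last_error
```

The bound is the whole point: a loop that cannot give up never puts anything in front of the operator, which is how a failure like this stays quiet.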

None of these failures would survive a sales demo, because none of them happen inside a sales demo. They happen on a Tuesday morning, three weeks after release, when the operator opens the inbox and notices something is off. The QA Debt epic is the file where we keep those noticings, scored and queued. Every time we close one, the loop gets a little less embarrassing for the people who run it for a living. That is the whole reason this list exists.

Movement three

The night the agent shipped nothing.

The commit that opens the book is a five-line fix to a scheduled workflow. The workflow was a cron job that asked "is it an even UTC hour?" and routed the day's work based on the answer. GitHub Actions delivered the cron eight minutes late, which turned an even hour odd, which made the guard skip, which meant no ticket was picked, no branch was cut, no PR was opened. The dashboard stayed green. The system had quietly decided it was somebody else's turn.
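
One way to picture the failure, as a minimal Python sketch and not the actual workflow: a guard that reads the wall clock at delivery time changes its answer when delivery is late. The minute-55 schedule, the function names, and the example timestamps are all assumptions made for illustration. The text above says only that the cron arrived eight minutes late and crossed an hour boundary, and it does not describe what the five-line fix looked like.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical: assume the job is scheduled at minute 55 of every hour.
SCHEDULED_MINUTE = 55


def wall_clock_guard(now: datetime) -> bool:
    """Fragile: answers for the moment the job happened to arrive."""
    return now.hour % 2 == 0


def scheduled_slot_guard(now: datetime) -> bool:
    """Answers for the most recent scheduled tick at or before `now`.

    Only absorbs delays shorter than one hour.
    """
    tick = now.replace(minute=SCHEDULED_MINUTE, second=0, microsecond=0)
    if tick > now:
        tick -= timedelta(hours=1)
    return tick.hour % 2 == 0


scheduled = datetime(2024, 1, 1, 2, 55, tzinfo=timezone.utc)
delivered = scheduled + timedelta(minutes=8)  # 03:03 UTC, eight minutes late

print(wall_clock_guard(delivered))      # False: the late run is skipped
print(scheduled_slot_guard(delivered))  # True: the run counts for 02:55
```

The second guard is one possible remedy among several. It trades the wall-clock assumption for a different one, that the schedule minute is known and delays stay under an hour.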

We opened this case study with that story for the same reason we opened the book with it: almost every interesting failure in this product has been a variation of it. The model did not lie. The prompt did not drift. A small, reasonable assumption about how the world ticks lost an argument with the world, and the only one who noticed was the human who came in the next morning and saw too little.

The full version is in the book. It is one chapter long. It is the truest thing we wrote about this product.

Movement four

Three projects, visible in the screenshot below.

If you open the workspace right now and scroll, you will see the projects we are running on Ship to build Ship. Three of them, in particular, are worth naming by what they did rather than what they were called.

The provider coverage project rewired the executor underneath the loop so that swapping models — Claude to GPT to Codex to whatever ships next month — is a configuration change, not a rebuild. We did not want the process model to assume any particular vendor. Now it does not.

The mid-thread planning pivot project gave the workspace a way to handle the operator who is halfway through drafting one ticket and realizes they actually need a different one. Before, the right move was to abandon the draft and start fresh; now the workspace recognizes the intent flip, offers a clean exit, and remembers where the original thread left off. It sounds small. It is the single highest-impact change we shipped to the operator surface in a month.

The code exploration project gave the agents proper read access to a repository's structure — symbol tables, call graphs, file-level summaries — without re-implementing a code search engine. The output: fewer PRs that touch the wrong file, fewer reviews that hinge on "did you check this other module," fewer afternoons spent explaining the layout to an agent that should already know it.

All three landed in the same calendar month. All three were walked through the same loop you read about on every other page on this site. The workspace below is the one we used to do it.

Ship-on-Ship workspace home with priority buckets, active projects, and merged PR #262 in the footer

The Ship-on-Ship workspace, this morning. Footer: merged · #262 · ElMundiUA/ship · 11:40 AM.

Movement five

The screenshot loop.

Every screenshot on this site — the analytics dashboard, the inbox, the pipeline, the workspace home, the merged-PR footer — came out of the same workspace. Not a staged tenant. Not a demo seeded with prettier numbers. The actual one. The one we opened this morning to ship the change that put this paragraph on the page you are reading.

The catalog of evidence is recursive. The screenshots prove the workspace runs. The workspace produced the screenshots. The workspace also produced the website that hosts the screenshots, including the page you are looking at now. If you crawl this site for proof, you will find that the proof keeps pointing back at itself. That is the only honest way to do a case study on the team that wrote both the product and the book about it.

We are not asking you to take our word for it. We are asking you to read the receipts.

Next

Read the long version. Or talk to the founder directly.

The book is forty short chapters on the same operating model — the scars, the trade-offs, the rules we learned to write down. The mailbox is exactly what it sounds like: a person, at a desk, reading.