Blog

Field notes from the workshop.

Real commits, real failures, real deletions. Every post earns its keep by pointing at code we merged or tore out. No thought leadership. No vibes.

Build in public · Autopsies · Deletions we’re proud of · Numbers from the log

All posts

32 published

  1. Best practice · 7 min read

    Two readers, two manuals

    Documentation written for a human and documentation written for an agent are different artifacts. We learned that the hard way and split them. Here is what changed when we did.

    Read the post
  2. Brand · 3 min read

    Champagne over teal

    We swapped the accent colour from teal to muted champagne gold this week. Small change in the diff. Bigger change in how the product reads. Notes on why colour is a positioning lever, not a vibe choice.

    Read the post
  3. Best practice · 4 min read

    Boring failures are good failures

    The worst failure mode for an agent isn't a crash — it's silence. The pipeline keeps running, the dashboard stays green, and the work that was supposed to happen quietly didn't. The cure is to make failure modes mundane and named, before the agent ever runs.

    Read the post
  4. Autopsy · 5 min read

    The mapping table the runtime never read

    We shipped an editor surface, an LLM resolver, and a PR-write flow for a config field. Twenty-four hours later we walked all of it back. The runtime didn't read the field. The story is about where config belongs.

    Read the post
  5. Manifesto · 8 min read

    The dark factory is closed

    The dark factory — Foxconn lights out, robots making robots — is the wrong metaphor for what AI agents do to software teams. The team isn't shrinking; it's becoming the part that decides. And that part runs on more brains, not fewer.

    Read the post
  6. Best practice · 6 min read

    PO ideas belong in the epic, not the ticket

    When the agent that writes the ticket and the agent that builds it are different processes, scope and motivation have to live somewhere both of them can read. We tried chat. We tried fat tickets. The thing that actually works is putting it in the project description and keeping the tickets thin.

    Read the post
  7. Pattern · 4 min read

    The cheap classifier goes first

    We were paying for an LLM call every turn to detect topic shifts that the user had already announced in plain language. Adding a fifteen-line regex in front of the classifier removed the cost and made the UX more decisive at the same time. The rule generalises.

    Read the post
  8. Field note · 7 min read

    The agent that finished without committing

    A real intake agent did the right work, in the wrong shape — it answered the ticket but never pushed a branch. The old exit protocol assumed every agent writes code. Branchless agents had nothing to commit, so we replaced the file contract with an HTTP one.

    Read the post
  9. Best practice · 5 min read

    Clarification is not failure

    Most agent loops treat "I have a question" as the same outcome as "I crashed." Both stop the pipeline. Both look red on a dashboard. They are completely different things, and conflating them teaches agents to invent rather than ask.

    Read the post
  10. Product · 7 min read

    We moved the front door

    Ship did not stop caring about engineers. We stopped making the engineer's tool the first thing a buyer had to understand.

    Read the post
  11. Best practice · 6 min read

    Policies before prompts

    A good prompt asks for work. A good policy says what kind of work is allowed, what proof is required, and when the machine must stop.

    Read the post
  12. Best practice · 6 min read

    The Inbox is not a backlog

    A backlog stores work. An Inbox stores attention. Mixing the two is how teams turn every agent question into another queue.

    Read the post
  13. Field note · 5 min read

    The book was written on Sunday

    Prologue, manifesto, nine lettered sub-chapters, eight field notes. Every new passage anchored to a specific commit SHA from a real reference org. All of it keyed in between commits to the cloud console, on one Sunday, in a single session that started after midnight.

    Read the post
  14. Autopsy · 11 min read

    The catalog rename and the matrix lane

    21 patterns renamed across 78 files. A new six-category scheme. Five duplicates deleted. Then a multi-pattern lane with three fan-out modes. RFC-0008, in the order it actually shipped, and why the matrix execution model fell out of the rename.

    Read the post
  15. Case study · 8 min read

    Wizard v2 — the art of saying no

    Ten steps became three, then stayed three across seven backend rewrites in one day. A case study in keeping a flow honest while the mechanism under it moves.

    Read the post
  16. Architecture · 11 min read

    Lanes as config — or how we killed the workflow artifact

    A full RFC, ten commits, one repo — in a single day we retired a first-class artifact kind, introduced lanes-as-config, and made shipctl run the single entry-point for everything a repo schedules. An autopsy of RFC-0007.

    Read the post
  17. Case study · 11 min read

    Knowledge buckets and the Distiller

    Eight phases in one day — a scope ladder, a dual-written articles table, an LLM-backed ingest classifier, Notion and Linear connectors, and a per-user memory bucket that the agent can actually cite. The knowledge layer Ship needed before it could grow a second brain.

    Read the post
  18. Architecture · 7 min read

    We deleted the worker. The system got simpler.

    Five moving parts in the morning, two by the end of the day. A worker, a Redis queue, a repo cache, and a git-sync loop — and why deleting them made the Ship Console cheaper, faster, and easier to reason about.

    Read the post
  19. Case study · 9 min read

    From chat to Navigator

    A chat window is a failure mode dressed as a product surface. Over two days the Ship chat became a Navigator — fewer bubbles, word-by-word reveal, typed widgets, and a turn that no longer jumps the viewport. A case study in treating a surface as part of the agent.

    Read the post
  20. Autopsy · 9 min read

    Artifacts are frontmatter now — the RFC-0005 autopsy

    Two files per artifact, one manifest, and two sources of truth that never agreed on a Monday. How we collapsed 61 artifacts into a single-file shape in one day — and why the cleanup commit mattered more than the migration.

    Read the post
  21. Build in public · 11 min read

    Ship — the first two weeks

    189 commits. 16 days. One repo. The story of how Ship, shipctl, and the Ship Console went from an extracted folder to a running cloud platform — read off the actual git log.

    Read the post
  22. Case study · 10 min read

    The protocol before the product

    Between the Apr 7 extraction and the Apr 19 cloud console, there is a quiet 11-day stretch that looks like nothing happened. One commit on a Sunday did the whole thing — shipctl v0.9. We wrote the protocol before we wrote the product; here is how and why.

    Read the post
  23. Origin · 6 min read

    How we cut Ship out of elmundi

    Ship did not begin as a greenfield repo. It began as a folder inside a product called elmundi. On Apr 7 we cut it out, and twenty commits later it was a standalone thing with its own CI, its own docs, and its own CLI. This is the prequel to every other post on this blog.

    Read the post
  24. Architecture · 4 min read

    The methodology API — one endpoint, two consumers

    shipctl reads from it. Every agent reads from it. The customer's repo never sees the source files. One HTTP API, deliberately small, cleanly versioned. The shape that made the rest of April land smoothly.

    Read the post
  25. Architecture · 4 min read

    Multi-tracker adapters before there were customers

    We shipped Linear, GitHub Issues, Notion, Jira, and Asana adapters before any of them had a real user. It is the exception to the "build for one before many" rule. Here is why we did it anyway and what kept the cost down.

    Read the post
  26. Field note · 4 min read

    adopt-ship.sh and the wrong adopters

    A 60-line shell script meant to make adoption frictionless. The first three people who ran it did the wrong thing in three different ways. What we learned about the gap between "easy to start" and "easy to start correctly."

    Read the post
  27. Autopsy · 4 min read

    The docs-mcp-server experiment we deleted

    We built a Model Context Protocol server to expose the docs to agents. It worked. Then we deleted it. Why building the right thing the wrong way is sometimes worse than not building it.

    Read the post
  28. Architecture · 4 min read

    The repo refactor that gave shape to everything else

    Three top-level folders — documentation, prompts, runtime. Each one had a different reader. Renaming and moving files for half a day made the next four weeks possible.

    Read the post
  29. Architecture · 4 min read

    Killed MkDocs, kept the URLs

    We replaced the docs runtime in one commit. Same content, same URL paths, different stack. The constraint that did most of the work was "no broken links."

    Read the post
  30. Field note · 3 min read

    Translating cloud-prompts to English

    The prompts that came over from elmundi were partly Ukrainian. We sat down one afternoon and translated them. Three things changed; one of them was unexpected.

    Read the post
  31. Field note · 4 min read

    Bunny Magic Containers, three weeks of fights

    Sixteen consecutive `fix(bunny)` commits. The Magic Containers API was new, the docs were thin, and our deploy pipeline learned each lesson the same way every team learns it. Here's the shape of the fight, read off the actual commit log.

    Read the post
  32. Origin · 4 min read

    What "extracted from elmundi" actually carried

    One commit, six months of methodology, and the LICENSE file that did more work than any line of code. A look at what was actually inside the first import.

    Read the post
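
The rule from post 7, "The cheap classifier goes first", can be sketched in a few lines. This is an illustration only, not the code from the post: the phrase list, the `detect_topic_shift` name, and the `llm_classifier` callable are all assumptions made for the example. The idea is that a regex catches the topic shifts the user announces in plain language, and the paid call only runs when the regex has nothing to say.

```python
import re
from typing import Callable

# Hypothetical phrases a user might use to announce a topic change.
# The real list would come from reading actual conversations.
TOPIC_SHIFT = re.compile(
    r"^\s*(?:new topic|different question|changing the subject|switching gears)\b",
    re.IGNORECASE,
)


def detect_topic_shift(message: str, llm_classifier: Callable[[str], bool]) -> bool:
    """Return True if the message starts a new topic.

    The regex runs first. The expensive classifier is called only
    when the regex does not match.
    """
    if TOPIC_SHIFT.search(message):
        return True
    return llm_classifier(message)
```

Passing the classifier in as a callable keeps the cheap path testable without any model behind it: a stub that records its calls is enough to show that announced shifts never reach the expensive step.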