Field notes · 46 published

Receipts, not slogans.

Field notes from running our own delivery on Ship. Every post earns its keep by pointing at code we merged or tore out. No thought leadership. No vibes. The long read lives in the book.

Build in public · Autopsies · Deletions we’re proud of · Numbers from the log

All notes

  1. Aside · 4 min read

    We're changing the industry, anyway

    A small team encoded its judgment into a few thousand lines of prompt, handed the typing to a cheap model, and now the work ships itself. You are supposed to feel the floor move. From inside, it feels like almost nothing — and that, it turns out, is the whole point.

    Read the note
  2. Principle · 6 min read

    Human, machine, idea

    Since April our changelog has read like an apology: we cut the scheduler, the worker, the Inbox screen, the custom client, the expensive model. It looked like austerity. It was the opposite: we were removing glass. What is left when you remove everything removable is a human, a machine, and an idea, and that is where the truth is.

    Read the note
  3. Field note · 3 min read

    The 4-column footer

    The header could not hold the navigation. We rebuilt the footer as four columns — Ship, Lighthouse, Harbor Gang, Contact & legal — and watched it become a navigation surface, not a graveyard for legal links.

    Read the note
  4. Method · 7 min read

    The PO-audience filter

    Three writer agents produced 28 doc pages in parallel. A separate critic agent passed the output through one rule: would a founder skip past this sentence because it talks about plumbing they don't run? The agent-criticises-agent pattern is what made the result readable. The shape generalises.

    Read the note
  5. Field note · 4 min read

    The sidebar nobody noticed

    The new docs pages have a sticky left nav and a sticky right TOC. We did not invent the pattern — we picked it. The post is about which IA layers to invent and which to copy wholesale from a vendor twenty times your size.

    Read the note
  6. Brand · 4 min read

    The cross-product ribbon

    A one-line ribbon under the hero subhead on /ship and /lighthouse points at the other product. Small visible decision. Large positioning consequence. Contextual placement beats global placement when the connection is conditional.

    Read the note
  7. Method · 5 min read

    Freezing the run

    /lighthouse/evals is an alias of the latest run. The actual Run 1 lives at /lighthouse/evals/runs/v1 and is frozen forever. When Run 2 ships, the alias moves; v1 keeps its original numbers. Cite-friendly URLs, RFC-style versioning, pulled into a public product artifact.

    Read the note
  8. Brand · 4 min read

    What we got wrong the first time

    Most teams write a 'limitations' section and bury it in a footnote. We put 'What we got wrong' above 'Where Lighthouse loses' on the evals page. Same visual weight as the headline numbers. Admitting the dumb thing is the part that earns the right to publish the wins.

    Read the note
  9. Autopsy · 7 min read

    Empty versus empty: the bug that made reasoning models look smart

    Our first Lighthouse benchmark scored reasoning models as competitive when both sides had returned nothing. The judge counted empty-vs-empty as a tie. Here is how we caught it, what changed when we re-ran, and why the corrected numbers say Llama 3.3 got worse.

    Read the note
  10. Autopsy · 6 min read

    The /docs mashup that broke the IA

    One user message — 'I can't find /docs from /ship' — triggered eight person-days of IA migration. The post is about which user complaints are architectural and which are cosmetic. Architectural complaints are gifts. We were lucky we listened.

    Read the note
  11. Build in public · 6 min read

    The page an agent can read

    We rewrote the Lighthouse pitch page with the explicit constraint that an LLM agent reading the HTML should be able to self-configure its MCP client. Then we ran the test. Here is what changed and what the new constraint asked of the writing.

    Read the note
  12. Brand · 9 min read

    Skills are not knowledge

    The industry is publishing skill catalogs as the answer for AI agents. Recipe cards, instruction sheets, 'how to do X' packs. All of them rot in twenty minutes. The shape is wrong. Here is the argument for search over catalog, and the secret ingredient nobody else uses.

    Read the note
  13. Method · 6 min read

    No broken links, second time

    When we replaced MkDocs in April, the constraint was simple — every old URL keeps working. We applied it again this week to reshape the entire IA. The rule isn't a one-time migration discipline. It is how a site matures without burning its inbound link graph.

    Read the note
  14. Brand · 5 min read

    Three deletions for every build

    We counted our own field notes. Far more 'we deleted X' posts than 'we built X' posts. That ratio is the startup loop — ship a hypothesis, learn it was wrong, delete it, ship the next one. The deletion is the receipt.

    Read the note
  15. Best practice · 7 min read

    Two readers, two manuals

    Documentation written for a human and documentation written for an agent are different artifacts. We learned that the hard way and split them. Here is what changed when we did.

    Read the note
  16. Brand · 3 min read

    Champagne over teal

    We swapped the accent colour from teal to muted champagne gold this week. Small change in the diff. Bigger change in how the product reads. Notes on why colour is a positioning lever, not a vibe choice.

    Read the note
  17. Best practice · 4 min read

    Boring failures are good failures

    The worst failure mode for an agent isn't a crash; it's silence. The pipeline keeps running, the dashboard stays green, and, quietly, the work that was supposed to happen didn't. The cure is to make failure modes mundane and named, before the agent ever runs.

    Read the note
  18. Autopsy · 5 min read

    The mapping table the runtime never read

    We shipped an editor surface, an LLM resolver, and a PR-write flow for a config field. Twenty-four hours later we walked all of it back. The runtime didn't read the field. The story is about where config belongs.

    Read the note
  19. Manifesto · 8 min read

    The dark factory is closed

    The dark factory — Foxconn lights out, robots making robots — is the wrong metaphor for what AI agents do to software teams. The team isn't shrinking; it's becoming the part that decides. And that part runs on more brains, not fewer.

    Read the note
  20. Best practice · 6 min read

    PO ideas belong in the epic, not the ticket

    When the agent that writes the ticket and the agent that builds it are different processes, scope and motivation have to live somewhere both of them can read. We tried chat. We tried fat tickets. The thing that actually works is putting it in the project description and keeping the tickets thin.

    Read the note
  21. Pattern · 4 min read

    The cheap classifier goes first

    We were paying for an LLM call every turn to detect topic shifts that the user had already announced in plain language. Adding a fifteen-line regex in front of the classifier removed the cost and made the UX more decisive at the same time. The rule generalises.

    Read the note
  22. Field note · 7 min read

    The agent that finished without committing

    A real intake agent did the right work, in the wrong shape — it answered the ticket but never pushed a branch. The old exit protocol assumed every agent writes code. Branchless agents had nothing to commit, so we replaced the file contract with an HTTP one.

    Read the note
  23. Best practice · 5 min read

    Clarification is not failure

    Most agent loops treat "I have a question" as the same outcome as "I crashed." Both stop the pipeline. Both look red on a dashboard. They are completely different things, and conflating them teaches agents to invent rather than ask.

    Read the note
  24. Product · 7 min read

    We moved the front door

    Ship did not stop caring about engineers. We stopped making the engineer's tool the first thing a buyer had to understand.

    Read the note
  25. Best practice · 6 min read

    Policies before prompts

    A good prompt asks for work. A good policy says what kind of work is allowed, what proof is required, and when the machine must stop.

    Read the note
  26. Best practice · 6 min read

    The Inbox is not a backlog

    A backlog stores work. An Inbox stores attention. Mixing the two is how teams turn every agent question into another queue.

    Read the note
  27. Field note · 5 min read

    The book was written on Sunday

    Prologue, manifesto, nine lettered sub-chapters, eight field notes. Every new passage anchored to a specific commit SHA from a real reference org. All of it keyed in between commits to the cloud console, on one Sunday, in a single session that started after midnight.

    Read the note
  28. Autopsy · 11 min read

    The catalog rename and the matrix lane

    21 patterns renamed across 78 files. A new six-category scheme. Five duplicates deleted. Then a multi-pattern lane with three fan-out modes. RFC-0008, in the order it actually shipped, and why the matrix execution model fell out of the rename.

    Read the note
  29. Case study · 8 min read

    Wizard v2 — the art of saying no

    Ten steps became three, then stayed three across seven backend rewrites in one day. A case study in keeping a flow honest while the mechanism under it moves.

    Read the note
  30. Architecture · 11 min read

    Lanes as config — or how we killed the workflow artifact

    A full RFC, ten commits, one repo — in a single day we retired a first-class artifact kind, introduced lanes-as-config, and made shipctl run the single entry-point for everything a repo schedules. An autopsy of RFC-0007.

    Read the note
  31. Case study · 11 min read

    Knowledge buckets and the Distiller

    Eight phases in one day — a scope ladder, a dual-written articles table, an LLM-backed ingest classifier, Notion and Linear connectors, and a per-user memory bucket that the agent can actually cite. The knowledge layer Ship needed before it could grow a second brain.

    Read the note
  32. Architecture · 7 min read

    We deleted the worker. The system got simpler.

    Five moving parts in the morning, two by the end of the day. A worker, a Redis queue, a repo cache, and a git-sync loop — and why deleting them made the Ship Console cheaper, faster, and easier to reason about.

    Read the note
  33. Case study · 9 min read

    From chat to Navigator

    A chat window is a failure mode dressed as a product surface. Over two days the Ship chat became a Navigator — fewer bubbles, word-by-word reveal, typed widgets, and a turn that no longer jumps the viewport. A case study in treating a surface as part of the agent.

    Read the note
  34. Autopsy · 9 min read

    Artifacts are frontmatter now — the RFC-0005 autopsy

    Two files per artifact, one manifest, and two sources of truth that never agreed on a Monday. How we collapsed 61 artifacts into a single-file shape in one day — and why the cleanup commit mattered more than the migration.

    Read the note
  35. Build in public · 11 min read

    Ship — the first two weeks

    189 commits. 16 days. One repo. The story of how Ship, shipctl, and the Ship Console went from an extracted folder to a running cloud platform — read off the actual git log.

    Read the note
  36. Case study · 10 min read

    The protocol before the product

    Between the Apr 7 extraction and the Apr 19 cloud console, there is a quiet 11-day stretch that looks like nothing happened. One commit on a Sunday did the whole thing — shipctl v0.9. We wrote the protocol before we wrote the product; here is how and why.

    Read the note
  37. Origin · 6 min read

    How we cut Ship out of elmundi

    Ship did not begin as a greenfield repo. It began as a folder inside a product called elmundi. On Apr 7 we cut it out, and twenty commits later it was a standalone thing with its own CI, its own docs, and its own CLI. This is the prequel to every other post on this blog.

    Read the note
  38. Architecture · 4 min read

    The methodology API — one endpoint, two consumers

    shipctl reads from it. Every agent reads from it. The customer's repo never sees the source files. One HTTP API, deliberately small, cleanly versioned. The shape that made the rest of April land smoothly.

    Read the note
  39. Architecture · 4 min read

    Multi-tracker adapters before there were customers

    We shipped Linear, GitHub Issues, Notion, Jira, and Asana adapters before any of them had a real user. It is the exception to the "build for one before many" rule. Here is why we did it anyway and what kept the cost down.

    Read the note
  40. Field note · 4 min read

    adopt-ship.sh and the wrong adopters

    A 60-line shell script meant to make adoption frictionless. The first three people who ran it did the wrong thing in three different ways. What we learned about the gap between "easy to start" and "easy to start correctly."

    Read the note
  41. Autopsy · 4 min read

    The docs-mcp-server experiment we deleted

    We built a Model Context Protocol server to expose the docs to agents. It worked. Then we deleted it. Why building the right thing the wrong way is sometimes worse than not building it.

    Read the note
  42. Architecture · 4 min read

    The repo refactor that gave shape to everything else

    Three top-level folders — documentation, prompts, runtime. Each one had a different reader. Renaming and moving files for half a day made the next four weeks possible.

    Read the note
  43. Architecture · 4 min read

    Killed MkDocs, kept the URLs

    We replaced the docs runtime in one commit. Same content, same URL paths, different stack. The constraint that did most of the work was "no broken links."

    Read the note
  44. Field note · 3 min read

    Translating cloud-prompts to English

    The prompts that came over from elmundi were partly Ukrainian. We sat down one afternoon and translated them. Three things changed; one of them was unexpected.

    Read the note
  45. Field note · 4 min read

    Bunny Magic Containers, three weeks of fights

    Sixteen consecutive `fix(bunny)` commits. The Magic Containers API was new, the docs were thin, and our deploy pipeline learned each lesson the same way every team learns it. Here's the shape of the fight, told off the actual commit log.

    Read the note
  46. Origin · 4 min read

    What "extracted from elmundi" actually carried

    One commit, six months of methodology, and the LICENSE file that did more work than any line of code. A look at what was actually inside the first import.

    Read the note

Field notes are the receipts

The full argument lives in the book.

Each note is a single scar — one commit, one deletion, one decision. The book strings forty of them into the operating model that comes out the other side.