Skills are not knowledge
The industry is publishing skill catalogs as the answer for AI agents. Recipe cards, instruction sheets, "how to do X" packs. All of them rot in twenty minutes. The shape is wrong. Here is the argument for search over catalog, and the secret ingredient nobody else uses.
The industry is doing a thing right now, with a lot of enthusiasm, that is the wrong thing.
It is publishing skill catalogs for AI agents. Recipe cards. Instruction sheets. "How to do X" packs. Directories of named, versioned, neatly formatted little documents that say here is how to generate a PDF, here is how to set up a Stripe webhook, here is how to write a Postgres migration that does not lock the table. The catalogs are big, well-organised, funded. Conference talks are built around them.
I do not think they are mostly bad. I think they are all bad, including the ones we would build if we tried. The shape is wrong. This post is about the shape.
A skill is not a recipe
Start with the word.
When you say a person has a skill, you do not mean they have a recipe card. You mean they have judgement. A devops engineer is a skill. Not "the steps a devops engineer would take" — the person. The thing they carry around that lets them stand in front of an alert at 2am and say, with no help from anyone, this is probably the load balancer, not the database, and here is the order I would check things in. That is taste, built from a decade of decisions, most of which were wrong before they were right. You cannot write it down. If you try, you get a six-page document that misses the part that mattered, because the part that mattered was which of the seven plausible things to do first.
A developer is a skill. A product manager is a skill. A medical writer who knows pharma is a skill. People with priors.
What a devops engineer uses to deploy something in a specific company's stack is a different thing entirely. The runbook for the cluster the regulated team runs on. The fact that staging has a different password rotation cadence than production. The custom kubectl wrapper one of the seniors wrote in 2023 that everyone now depends on. The reason migrations on Tuesdays are forbidden. Those are facts about a company. That is a knowledge base.
A skill is a person. A knowledge base is the world that person operates inside.
Collapse the two — write a document called "PDF developer skill" containing both the judgement and the company-specific instructions, call it a skill — and you have made a category error. A recipe card pretending to be a person. The card will look authoritative. The card will be confidently wrong, because the parts of it that are true are the parts that did not need to be written down, and the parts that needed to be written down are the parts that change.
A skill is a person. A knowledge base is the world that person operates inside. Collapsing the two gives you a recipe card pretending to be a person.
This is the same move I made once before, with policies and prompts. Two things that look adjacent turn out to be different categories. Collapsing them makes one of them rot.
Instructions rot at about one update per hour
The second problem is the speed.
Take the "PDF developer" skill. Monday morning, it works. It tells the agent which library to install, which method to call, which font path to use. By Tuesday lunch the library has shipped a new minor version, the method has been renamed, and the recommended font path has moved because the maintainers switched vendors. The skill still says what it said on Monday. It is now actively wrong. An agent loading it gets a confidently formatted document that points at the wrong API.
This is not a hypothetical. It is the average week of any moderately active package. The half-life of a written-down "how to do X with Y" is, by my count of the last month of Ship's own dependencies, somewhere between an evening and two days. Bake an instruction into a skill and you have built something that decays in twenty minutes.
The defenders of the shape will say: we will update the skill. In principle, sure. In practice the catalog has thirty entries today and three hundred next quarter, every entry depends on three to five libraries, and the maintainers are a small team. Three hundred entries times three to five libraries each is over a thousand moving dependencies against a handful of people. The math does not work. You cannot keep three hundred recipe cards fresh against a world that updates several times a day. You can keep them fresh against the world as it was the day you started.
What you have, after six months, is a museum.
The catalog itself does not scale
The third problem is the easiest to check.
A catalog of thirty skills is moderatable. A human curator can read every entry, decide whether it belongs, duplicates another, is stale. At thirty, there is a person who knows the corpus.
At a hundred, that person is gone. There is now a process for adding skills, a labelling system, an issue tracker, a Slack channel for maintainers, a quarterly audit. The audit does not happen. The labels are wrong. Duplicates pile up. Nobody can answer which of these is the canonical way to do X, because the answer is political rather than technical.
At five hundred, the catalog is a landfill. No human alive has read all of it. An agent loading skills cannot navigate it either, because the skills overlap, contradict one another, and lie about their own freshness. The directory becomes a search problem on top of a search problem — first you find the skill, then you check whether anyone uses it, then whether it is still correct, and only then do you do the work the skill was supposed to do for you.
A large public catalog of skills is running right now — bright, well-funded, beautifully organised, categories and tags and version numbers and a search bar across the top. I will not name it. The reader will know which one I mean. Past the second page, every entry is used by something like ten people in the world. Not ten per skill per day. Ten people in the world, total, across the lifetime of the entry. The maintainers know this. The metrics know this. The page exists because the catalog exists, not because anyone needs it.
What catalogs deliver, then, is roughly: something impressive in a deck, something impressive in a demo, a real paid layer of work for enterprise consultants who will author and audit and govern skills quarterly, and — for the agent doing the actual work — nothing. None of it is the operational knowledge an agent needs to do work in your company, because that knowledge is not in a public catalog. It cannot be. It lives in your company.
The right shape is a search engine, not a library
So if not catalogs, then what.
The right shape, the one I think the industry will land on after the catalog phase burns itself out, is a search engine over fresh sources. Not a frozen library of recipe cards. A live retrieval surface, scoped to a single company, aggregating the documents that company actually uses — architecture decisions, runbooks, post-mortems, design docs, on-call rotations, prior incidents and how they were resolved, the strange thing the senior developer typed into Slack last March that turned out to be the answer. Aggregated from where those documents already live, not copied into a separate authoring tool. With recency built in, so a document from Monday outranks one from a year ago, and with priority signals built in, so the canonical architecture page outranks a meeting note from a one-off Zoom.
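Concretely, the ranking half is small enough to sketch. A minimal version in Python, with the half-life, the boost table, and the field names invented for illustration rather than taken from Lighthouse's actual tuning:

```python
import time

# Invented numbers, for illustration; not Lighthouse's actual tuning.
RECENCY_HALF_LIFE_DAYS = 14
PRIORITY = {"architecture": 2.0, "runbook": 1.5, "meeting-note": 0.5}

def score(doc: dict, relevance: float, now: float | None = None) -> float:
    """Blend text relevance with recency decay and a document-kind boost."""
    now = time.time() if now is None else now
    age_days = (now - doc["updated_at"]) / 86400
    recency = 0.5 ** (age_days / RECENCY_HALF_LIFE_DAYS)  # halves every two weeks
    return relevance * recency * PRIORITY.get(doc["kind"], 1.0)

# A two-day-old architecture page outranks a year-old meeting note
# even when their text relevance is identical:
arch = {"kind": "architecture", "updated_at": time.time() - 2 * 86400}
note = {"kind": "meeting-note", "updated_at": time.time() - 365 * 86400}
assert score(arch, 1.0) > score(note, 1.0)
```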
Then you give a skill — a devops engineer, a developer, a PM, kept as a person — the ability to query that search engine while it works. The skill stays a skill. It carries its judgement, its priors, its taste. When it needs to know what is the deploy command for this service, it asks. The answer is fresh because it came from the source, not from a recipe card someone wrote eight months ago.
This is the inversion. Skill catalogs put knowledge inside the skill and watched it rot. The right shape keeps the skill empty of company-specific knowledge and gives it a way to look things up. The knowledge stays alive because it is not duplicated. The skill stays a skill because it is not pretending to be a knowledge base.
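The inversion fits in a dozen lines. A sketch, with invented names: the only company-specific thing the skill carries is a handle to the retrieval tool.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Skill:
    """A person with priors: the judgement travels with the skill,
    the company facts do not."""
    persona: str                        # the priors, the taste
    search: Callable[[str], list[str]]  # live lookup into the company corpus

    def deploy(self, service: str) -> str:
        # The skill knows *that* the runbook is the place to check.
        # It does not know, and never stores, what the runbook says.
        hits = self.search(f"deploy command for {service}")
        if not hits:
            return f"no runbook found for {service}; escalate to a human"
        return hits[0]
```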
"But humans are bad at docs"
The serious counter-argument to all of this is: humans are bad at documentation. Documents go stale because humans do not update them. A search engine over a slowly rotting corpus has the same problem in a different shape.
This is the strongest objection and it is largely right.
The answer is that the LLMs that ride on top of the search engine should also write and maintain the docs. We do this in Ship today. At least three scheduled routines run against any team's workspace: a daily retro where the agent reads what happened the previous day and updates the relevant runbooks, a post-mortem capture where the agent takes the chat transcript of an incident and writes it into the knowledge base, and a weekly audit where the agent walks the corpus and flags entries that have gone untouched longer than a document of their kind should. The agents are not just consumers of the documentation. They are the maintenance layer.
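For shape, not for interface: here is that loop sketched with Python's `schedule` package. The routine names mirror the ones above; the wiring and signatures are invented, not the Ship CLI's actual configuration.

```python
import schedule  # pip install schedule

def daily_retro():
    """Read yesterday's activity; update the runbooks it touched."""

def weekly_audit():
    """Walk the corpus; flag entries untouched longer than their kind should be."""

def capture_postmortem(transcript: str):
    """Write an incident chat transcript into the knowledge base."""

schedule.every().day.at("06:00").do(daily_retro)
schedule.every().monday.at("09:00").do(weekly_audit)
# Post-mortem capture is event-driven rather than scheduled: it fires
# when an incident channel closes, with the transcript in hand.
# The host process then drives the loop:
#   while True: schedule.run_pending(); time.sleep(60)
```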
Once you accept that the agents can write the docs as fast as they read them, the search-engine shape is much cheaper than the catalog shape. The cost of staleness — the load-bearing problem of any documentation system — drops by an order of magnitude, because the thing fixing it is the same kind of thing that consumes it.
The secret ingredient is delete
Here is the part I think nobody else is doing.
The agents have to delete.
Anything stale. Anything superseded. Anything that no longer applies. Anything that contradicts the current architecture. Anything describing a retired service. Anything mentioning a person who left two years ago. Anything whose most recent comment is "this is out of date, do not follow." All of it. Gone.
Not archived. Not tagged "stale." Not moved to a separate folder. Deleted from the corpus the agents search.
The instinct to preserve is a human instinct. Humans are slow, the past was expensive to produce, and the cost of regenerating a document was always higher than the cost of keeping the old one around just in case. That instinct is correct for humans. It is wrong for systems that include LLMs. An LLM can rewrite half the requirements doc in an evening for the cost of an API call. There is no scenario in which I need yesterday's version more than I need the right one today. If I ever need yesterday's, it is in git. Git already preserves the history. The corpus is not the place for the archive; the corpus is the place for what is true now.
Catalogs that tag stale entries and keep them visible for a month are making the landfill smell better while still being a landfill. The agents still have to wade through entries and decide whether to ignore them. The cost is not in the disk space; the cost is in the decision the consumer has to make. Delete removes the decision.
The corpus is not the place for the archive. The corpus is the place for what is true now. Git already preserves the history.
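The pass itself is almost embarrassingly small. A sketch, with invented field names and an invented threshold; the tests are the ones listed above, and the point is the last line, which removes rather than archives:

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=90)  # invented threshold; we tune it

def should_delete(entry: dict, live_services: set, current_staff: set) -> bool:
    if entry.get("superseded_by"):                        # superseded
        return True
    if entry.get("service") not in live_services:         # retired service
        return True
    if entry.get("owner") not in current_staff:           # person left
        return True
    if "out of date" in entry.get("last_comment", "").lower():
        return True                                       # self-declared stale
    return datetime.now() - entry["updated_at"] > STALE_AFTER

def delete_pass(corpus: list, live_services: set, current_staff: set) -> None:
    for entry in list(corpus):
        if should_delete(entry, live_services, current_staff):
            corpus.remove(entry)  # gone from search; git still has the history
```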
This is a bet, and I want to be clear it is a bet. The bet is that the right shape for agent knowledge is search-plus-delete, not library-plus-archive. The agents are fast enough now that the cheapest path to a correct answer is to fix the current source, not to navigate a catalog of past-correct ones. If the bet is wrong, we end up with a corpus that loses things people wanted. So far it has not been wrong — the things the agents delete are things no human had read in months. If it is wrong, git will tell us, and we will tune the threshold.
Closing
The catalog phase will burn through its budget and end. The teams that built the big public skill libraries will pivot to retrieval. Conference talks will change. Consultants will rebrand. What survives is the shape where company-specific knowledge lives in a search engine the agents maintain, and skills stay what skills always were — people with priors, asking the right questions, looking things up as they go.
Maybe I am wrong about the timeline. I do not think I am wrong about the shape.
Lighthouse is the search-engine half of this argument. It aggregates company-specific knowledge from where it already lives, ranks it with fast hybrid retrieval and recency-and-priority signals, and exposes it over MCP so any agent — ours, yours, theirs — can use it as a tool. The Ship CLI calibrates it: the daily retros, the post-mortems, the weekly audits, the delete passes that keep the corpus current. The argument came first. Lighthouse is the operational form of it, and it is the part of Harbor Gang that exists because the catalogs do not.
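For the MCP half, a minimal client sketch using the MCP Python SDK's stdio pattern. The `lighthouse serve --stdio` command and the `search` tool name are assumptions for illustration, not the published interface:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Assumed launch command and tool name; placeholders, not the published CLI.
server = StdioServerParameters(command="lighthouse", args=["serve", "--stdio"])

async def main() -> None:
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "search", arguments={"query": "deploy command for billing-api"}
            )
            print(result.content)

asyncio.run(main())
```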