Product · Lighthouse

Live · global instance

Canonical docs your agent can cite.

Lighthouse hosts a curated knowledge graph for coding agents — RFCs, OWASP, NIST, framework documentation, methodology pages — indexed, chunked, citation-ready over MCP. Plug your client in and the agent stops hallucinating axios v0.21.

One product, two ways to run it: plug into the hosted instance, or self-host the Apache-2.0 engine with your own corpus ↗

By the numbers

The corpus, today.

A live snapshot of what your agent gets the moment you plug in. Refreshed by our ingest crons on per-role cadences.

71k+

Chunks

Indexed paragraphs your agent can search and cite verbatim.

14k

Sources

RFCs, OWASP, NIST SP-800, framework docs, methodology pages.

21

Role recipes

Per-role manifests tuned for security, ML, frontend, DevOps, more.

Connect

One config block — any MCP client.

A read-only global instance is running at https://lighthouse.harborgang.com. No signup for the anonymous tier; sign in to get a personal MCP token. Lighthouse appears in your tool picker with search and fetch_source.

Claude Code

Project (.mcp.json) or user (~/.claude/mcp.json)

claude mcp add lighthouse \
  --transport http \
  --url https://lighthouse.harborgang.com/mcp/

Adds Lighthouse to the current project. Use --scope user for a global registration. Restart and `search` appears in the tool picker.

Cursor

~/.cursor/mcp.json (or .cursor/mcp.json per project)

{
  "mcpServers": {
    "lighthouse": {
      "url": "https://lighthouse.harborgang.com/mcp/"
    }
  }
}

Settings → MCP picks it up. Lighthouse appears in the composer's tool list after a reload.

Claude Desktop

~/Library/Application Support/Claude/claude_desktop_config.json (macOS)

{
  "mcpServers": {
    "lighthouse": {
      "type": "http",
      "url": "https://lighthouse.harborgang.com/mcp/"
    }
  }
}

Merge into any existing mcpServers block. Quit + relaunch Claude Desktop.

Other clients work too — Continue · Cline · Zed · Aider · LangGraph · Codex CLI · ChatGPT (with connectors). Anything that speaks the MCP protocol.

Why hosted

A curated corpus is the floor. We keep it refreshed.

You could run it yourself — the engine is Apache-2.0 and complete. Hosted Lighthouse buys you three things the engine alone can't.

Curation

The corpus is opinionated.

We picked the canonical sources for each role and we maintain the ingest recipes. You don't have to argue with your team about whether MDN or W3C is the source of truth.

Freshness

Crons refresh on cadence.

Per-role refresh schedules. Hash-delta detects changes — re-runs are cheap. When a framework ships a new release, the relevant chunks roll over within a day.

Rerank

Cross-encoder, always on.

The hosted instance runs a tuned cross-encoder on top of BM25 + vector. Pro queries jump the rerank queue under load. Most of the eval lift comes from this layer.

Pricing

Same corpus on every tier. Caps reset 00:00 UTC.

Anonymous works without an account. Paid tiers unlock larger result sets, the cross-encoder rerank, and per-agent MCP tokens. Cancel anytime, prorated.

Anonymous

Free

Kicking the tyres.

  • 30 searches / day per IP
  • Top-10 results
  • Links to original sources
  • No MCP token

Reader

Free

Daily driver for one engineer.

  • 200 searches / day
  • Top-15 results
  • 1 personal MCP endpoint
  • Sort by date or relevance

Pro

$12/ month

One engineer, serious about it.

  • 1,500 searches / day
  • Top-30 results
  • Cross-encoder rerank + summary boost
  • Per-agent MCP tokens, revocable
  • Notify on changes to bookmarked sources
  • Priority in rerank queue

Team

$9/ seat / month

Shared usage view, org SSO.

  • Everything in Pro · 1,500/seat/day
  • GitHub-org SSO (rolling out)
  • Shared bookmarks + usage dashboard
  • Org-scoped private corpus (opt-in)

Prefer to run it yourself? Self-hosting is free — the full engine, Apache-2.0, your infra.

Annual billing on Pro and Team saves 25%. Switch on the pricing page.

Does it actually help?

376 side-by-side tasks. 11 frontier models. Eight wins, two losses, the receipts.

We ran the same prompts twice — once cold, once with Lighthouse — across ten engineering roles. Gemini, Kimi, Qwen, and DeepSeek saw the largest gains; Claude moved slightly, GPT moved a hair. Six of ten roles improved overall. The page is honest about the roles that regressed.

Self-host

Open source · Apache-2.0

The same engine, your hardware, your corpus.

Everything in this repo is the full product: hybrid BM25 + pgvector retrieval, cross-encoder rerank, MCP server, 30 importers (Notion, Confluence, Slack, S3, GitHub, sitemaps, …), admin UI, coverage-gap analytics, API keys, multi-workspace tenancy. Private repos, internal runbooks, vendor docs your compliance team won't let leave the boundary — one container + Postgres on a $20/mo VPS.

1

Clone and boot — Postgres included, no API keys required.

git clone https://github.com/ElMundiUA/lighthouse.git && cd lighthouse
cp .env.example .env
docker compose up --build
2

Open the admin UI, add a source (your docs sitemap, Notion, a GitHub repo — 30 importer types), hit Run.

open http://localhost:8000/ui/
3

Point your agent at it. Done — your corpus, citation-ready.

claude mcp add my-docs \
  --transport http \
  --url http://localhost:8000/mcp/

What will always be free: the engine is complete and Apache-2.0 — we will never move an existing feature behind a paywall. Hosted tiers sell curation, operations, and scale, not withheld code.

Live · global instance

The cheapest way to lift weaker models is to give them a memory.

Plug into the public instance, sign in for a personal MCP token, or start Pro for the rerank-and-priority lane.