colbymchenry/codegraph
CodeGraph hit 29,800 stars, climbed to #2 on GitHub Trending May 23, and shipped v0.9.6 yesterday. It cuts your Claude Code token spend roughly 35% and runs 100% locally. Nobody is calling it a game-changer because the readme just shows you the benchmark numbers and gets on with it.
colbymchenry/codegraph is the local-first code knowledge graph built for multi-agent coding workflows. MIT license. TypeScript (91.8%). Version 0.9.6 published May 27, 2026, one day before this listing. Surged from roughly 15,000 stars to 29,800 in the week of May 21-28 after adding 2,434 stars in a single 24-hour window on May 23.
What CodeGraph Actually Does
It pre-indexes your codebase as a semantic graph (functions, classes, imports, call chains, type dependencies) and exposes the graph through MCP (Model Context Protocol) tools that integrate directly with Claude Code, Codex CLI, Cursor, OpenCode, Gemini, AntiGravity, Kiro, and Hermes Agent. When the AI agent asks a question, it queries the graph instead of re-scanning files.
The difference at the prompt layer is brutal. Without CodeGraph, asking Claude Code 'what calls this function' triggers a multi-file search that pulls hundreds of unrelated lines into the context window. With CodeGraph, the same question routes to a graph traversal that returns just the call sites, the parameters they pass, and the test coverage. The agent reads what matters, not what is near.
The Benchmarks That Justify The Star Surge
The repo publishes a benchmark across seven real-world codebases. The median results:
- ~35% cheaper inference cost per session
- 57% fewer tokens consumed
- 46% faster responses on common coding queries
- 70% fewer tool calls per task completed
The 70% reduction in tool calls is the underrated number. Tool calls are where agentic coding loops bleed: each one round-trips through the model with the full context, and a typical agentic refactor session fires 50-200 of them.
Cutting that count by 70% is what drives the 46% latency improvement and the 35% cost cut at the same time.
The Money Math
For a single developer on Claude Code at $20/month plus roughly $30/month in API tokens, CodeGraph saves about $10-12/month in token spend. Not life-changing on one seat. But the savings scale linearly with API spend, and most teams are paying far more per developer than the seat fee. For a 10-developer team paying $200/month in combined Claude Code API spend, CodeGraph saves roughly $70/month, enough to buy another seat outright. For a 50-developer team running Cursor or Claude Code at scale on a 500K+ LOC codebase, the math gets serious: a 35% cost cut on a $1,500/month combined API bill is $525/month saved, or $6,300/year. The free MIT license means no recurring license fee to capture those savings.
Why Local-First Actually Matters Here
Every alternative in this space ships a hosted index. Your code goes to a vendor's servers. Your graph is built on someone else's machine. CodeGraph is the opposite: the indexer runs on your laptop, the graph is stored on your laptop, queries route to your laptop. No code or graph data leaves the developer's machine.
For regulated codebases (defense, healthcare, finance, automotive), this is the difference between adoption and procurement freeze. For everyone else, it is the difference between adding another vendor to the security review checklist and just running an npm install. The local-first design is why the 14,000-star surge in a single week happened in regulated industries first.
What The 500K+ LOC Use Case Unlocks
Claude Code and Cursor both have practical limits on codebases above 500,000 lines of code. Beyond that, the agent's context window cannot hold enough of the repo to reason about cross-file changes without re-reading files on every prompt, which times out, blows the token budget, or both.
CodeGraph fixes this by precomputing the cross-file relationships once and letting the agent query the precomputed graph instead. For monorepo teams, this is the unlock that turns Claude Code from 'works on services smaller than X' into 'works on the whole repo.' The seat license that was previously borderline-useful on a giant codebase becomes productive again.
What To Do With It This Week
If you run Claude Code or Cursor on any codebase north of 100,000 lines, install it tonight. The setup is one CLI command. The benchmark numbers replicate on real codebases: several commenters from the May 23 Trending surge posted their own measured 25-45% token reductions within hours. If you are on a regulated codebase that previously blocked AI coding tools because of the data-egress problem, this is the week to revisit procurement, because the local-first design is the audit story.
For broader context on the trend toward a more efficient AI coding stack, see Pi Coding Agent, the model-agnostic OSS CLI that pairs naturally with CodeGraph as the index layer, and Code Review Graph MCP, which applies the same graph-as-context idea specifically to PR review with a 38x-528x token reduction benchmark. The pattern across the stack is the same: constrain what the AI reads, hand the human the steering wheel, and watch the failure rate collapse. Our breakdown this week of why the AI-replacement narrative is a sales pitch walks through why this is the right architecture.
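The 'what calls this function' lookup described above can be illustrated with a toy reverse-edge index. This is a minimal sketch, not CodeGraph's actual schema, storage format, or MCP tool interface: the ToyCallGraph class, its method names, and the sample functions and files are all hypothetical, chosen only to show why a precomputed graph answers the question without reading any file contents at query time.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CallSite:
    caller: str  # function that contains the call
    file: str    # file where the call appears
    line: int    # line number of the call


class ToyCallGraph:
    """Hypothetical in-memory index; not CodeGraph's real schema or API."""

    def __init__(self) -> None:
        # callee name -> call sites, populated once at index time
        self._callers: dict[str, list[CallSite]] = defaultdict(list)

    def add_call(self, caller: str, callee: str, file: str, line: int) -> None:
        self._callers[callee].append(CallSite(caller, file, line))

    def callers_of(self, callee: str) -> list[CallSite]:
        # Direct lookup: no source files are opened at query time.
        return list(self._callers.get(callee, []))

    def transitive_callers(self, callee: str) -> set[str]:
        # Walk reverse edges to find every function that can reach `callee`.
        seen: set[str] = set()
        stack = [callee]
        while stack:
            current = stack.pop()
            for site in self._callers.get(current, []):
                if site.caller not in seen:
                    seen.add(site.caller)
                    stack.append(site.caller)
        return seen


# Illustrative data only: these functions and files are made up.
graph = ToyCallGraph()
graph.add_call("checkout", "charge_card", "orders.py", 42)
graph.add_call("retry_payment", "charge_card", "billing.py", 17)
graph.add_call("handle_request", "checkout", "api.py", 8)

direct = graph.callers_of("charge_card")
reachable = graph.transitive_callers("charge_card")
```

Here direct holds only the two call sites of charge_card, and reachable adds handle_request through the checkout edge. The answer handed to the agent is a few short records rather than the contents of every file that mentions the name.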
Why It Matters
AI coding agents re-scan your codebase on every prompt. On a 500K+ LOC repo that means burning $7-20 per session in tokens just for the agent to figure out what it is looking at — before it does any work. CodeGraph pre-indexes your repo as a semantic graph (functions, classes, imports, call chains) and exposes it through MCP tools so the agent reads only what is relevant. Published benchmarks across seven real-world codebases: ~35% cheaper inference cost, 57% fewer tokens, 46% faster responses, 70% fewer tool calls. Runs 100% locally — no code or graph data leaves the developer's machine. For any team paying for Claude Code or Cursor at scale, this is a direct cost cut backed by benchmark data. The 29,800-star surge (#2 on GitHub Trending May 23, 2026, +14K stars in a week) is the OSS community recognizing that the bottleneck on agentic coding tools right now is context inefficiency, not model quality.
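The cost claim reduces to one multiplication, which makes it easy to rerun with your own bill. A minimal sketch, assuming the ~35% median reduction applies uniformly to monthly API token spend (it is a benchmark median, so an individual team's result may differ) and reusing the monthly spend figures quoted in this listing. The function name and scenario labels are illustrative.

```python
def monthly_savings(api_spend: float, reduction: float = 0.35) -> float:
    """Estimated monthly savings in dollars for a given API token spend."""
    return round(api_spend * reduction, 2)


# Monthly API spend figures quoted in this listing (seat fees excluded).
scenarios = {
    "solo developer": 30.0,
    "10-developer team": 200.0,
    "50-developer team": 1500.0,
}

savings = {label: monthly_savings(spend) for label, spend in scenarios.items()}
annual_50 = round(savings["50-developer team"] * 12, 2)

for label, amount in savings.items():
    print(f"{label}: ${amount:.2f}/month")
print(f"50-developer team, annualized: ${annual_50:.2f}")
```

Swap in your own monthly spend, or pass a different reduction (for example 0.25 or 0.45, the range commenters reported for token reduction), to see how sensitive the estimate is to that assumption.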