VectifyAI/PageIndex
PageIndex hit GitHub Trending on May 6, 2026 with a thesis that should make every vector-database investor uncomfortable: throw out embeddings entirely. No chunking, no cosine similarity, no Pinecone bill. Just a hierarchical document tree the LLM reasons over the way a human flips through a book's index.

The number that earned the trending slot: 98.7% accuracy on FinanceBench, a long-document QA benchmark where vector-RAG pipelines have plateaued in the low 80s for over a year. That result puts PageIndex at state-of-the-art on a benchmark the entire RAG-as-a-service industry has been quietly losing on.

The mechanism is what makes it worth reading the source. PageIndex works in two steps. Step one builds a hierarchical "table of contents" tree of the document: sections, subsections, page references, and a semantic summary at every node. Step two hands that tree to the LLM and lets it tree-search: the model reads the table of contents, picks a section, drills down, picks a subsection, drills down again, then reads only the leaves it needs. The LLM is doing what a human researcher does: flip to the index, find the right chapter, scan the headings, read the paragraph. No embedding model in the loop.

Why this matters: vector RAG fails in exactly the places that matter most for enterprise deployments. Long financial filings (300+ pages of dense regulated text). Legal contracts with cross-references between clauses. Technical manuals where context two sections back changes the meaning. Chunking destroys those relationships, and cosine similarity cannot reason about hierarchy. PageIndex preserves the structure and lets the LLM walk it.

The repo is MIT-licensed Python. Integrations land for Claude Agent SDK, Vercel AI SDK, OpenAI Agents SDK and LangChain, so pick your harness. Two deployment paths: self-host the open-source library, or hit the hosted cloud at chat.pageindex.ai if you do not want to run the infrastructure.
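To make the two-step mechanism concrete, here is a minimal sketch of a table-of-contents tree and a top-down search over it. It is not PageIndex's actual API: the `Node` schema, `tree_search`, and `keyword_chooser` are hypothetical names invented for illustration, and the chooser is a plain keyword-overlap function standing in for the LLM call that would make each navigation decision in a real system.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    """One entry in the table-of-contents tree (hypothetical schema)."""
    title: str
    summary: str
    pages: Tuple[int, int]          # (first_page, last_page)
    text: str = ""                  # only leaves carry body text here
    children: List["Node"] = field(default_factory=list)


# A chooser sees the question plus the sibling nodes at one level and
# returns the index of the node to descend into. In a real system this
# would be an LLM call; here it is a plain function so the sketch runs.
Chooser = Callable[[str, List[Node]], int]


def _words(s: str) -> set:
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def keyword_chooser(question: str, options: List[Node]) -> int:
    """Stand-in for the LLM: pick the option sharing the most words."""
    q = _words(question)
    scores = [len(q & _words(n.title + " " + n.summary)) for n in options]
    return scores.index(max(scores))


def tree_search(root: Node, question: str, choose: Chooser) -> Node:
    """Walk root -> leaf, reading only titles and summaries on the way."""
    node = root
    while node.children:
        node = node.children[choose(question, node.children)]
    return node


# Toy document tree with placeholder text (not real filing data).
doc = Node("Annual Report", "full filing", (1, 300), children=[
    Node("Risk Factors", "market, credit and regulatory risk", (10, 60), children=[
        Node("Credit Risk", "exposure to borrower default", (10, 35),
             text="placeholder body text"),
        Node("Market Risk", "interest rate and currency exposure", (36, 60),
             text="placeholder body text"),
    ]),
    Node("Financial Statements", "balance sheet, income and cash flow", (100, 160), children=[
        Node("Income Statement", "revenue, expenses and net income", (100, 130),
             text="placeholder body text"),
        Node("Cash Flow", "operating, investing and financing cash flow", (131, 160),
             text="placeholder body text"),
    ]),
])

leaf = tree_search(doc, "What was net income and revenue?", keyword_chooser)
print(leaf.title, leaf.pages)
```

Swapping `keyword_chooser` for a function that prompts a model with the sibling titles and summaries gives the reasoning-driven version the article describes. The search loop stays the same, and only the selected leaf's text ever needs to enter the final answer prompt.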
A sister repo, VectifyAI/pageindex-mcp, exposes the same retrieval natively over the Model Context Protocol so any MCP client (Claude, Cursor, Codex CLI, Gemini CLI) can use it without a custom integration.

The honest trade-offs: PageIndex's tree-search costs more LLM tokens per query than a vector lookup, because you are paying for reasoning passes the embedding system skips. Latency is higher (multi-step search vs single similarity query). And the tree-build step is computationally expensive on first ingest; expect 30 seconds to 2 minutes per long PDF. If your workload is millions of short queries against short documents, vector RAG is still the right answer. If your workload is high-stakes questions on long, structured documents, PageIndex earns its tokens.

The repo just crossed 27.6k stars. The trending placement on May 6 was not an accident: the FinanceBench result is the kind of benchmark gain that gets engineers to migrate workloads.

Where it fits in the stack: if you want it served over MCP without writing your own client, see PageIndex MCP. For the broader trend toward consolidating AI tools (which the productivity research strongly supports), read why more AI tools makes your team slower. For governance over the agents that consume PageIndex retrievals, look at Microsoft Agent 365.
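The token and latency trade-off described above can be put into a rough back-of-envelope model. The sketch below is an assumption-laden illustration, not a measurement: the function names are hypothetical, and every number passed in (chunk sizes, tree depth, branching factor, summary length) is made up to show the shape of the comparison. Plug in figures from your own documents before drawing conclusions.

```python
from dataclasses import dataclass


@dataclass
class QueryCost:
    llm_calls: int      # sequential LLM round trips (a proxy for latency)
    tokens: int         # total tokens read or written by the LLM


def vector_rag_cost(top_k: int, chunk_tokens: int, answer_tokens: int) -> QueryCost:
    """One LLM call that reads the top_k retrieved chunks and answers."""
    return QueryCost(llm_calls=1, tokens=top_k * chunk_tokens + answer_tokens)


def tree_search_cost(depth: int, branching: int, summary_tokens: int,
                     leaf_tokens: int, answer_tokens: int) -> QueryCost:
    """One navigation call per tree level, then one call to answer.

    Assumes each navigation call reads `branching` node summaries and
    that the final call reads a single leaf section.
    """
    navigation = depth * branching * summary_tokens
    return QueryCost(llm_calls=depth + 1,
                     tokens=navigation + leaf_tokens + answer_tokens)


# Illustrative inputs only; none of these come from PageIndex benchmarks.
vec = vector_rag_cost(top_k=5, chunk_tokens=400, answer_tokens=200)
tree = tree_search_cost(depth=3, branching=8, summary_tokens=60,
                        leaf_tokens=1500, answer_tokens=200)

print(vec)
print(tree)
```

Under these made-up inputs the tree search uses four sequential calls against one, and more tokens overall. Shallower trees or shorter leaf sections narrow the gap, so the model is only a way to reason about your own workload, not a verdict.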
Why It Matters
PageIndex is the first credible challenger to the vector-database thesis that has dominated RAG architecture for three years. Hitting 98.7% on FinanceBench, a benchmark where vector RAG has plateaued, with an embedding-free pipeline forces every team running enterprise RAG to ask whether the chunking and cosine-similarity pipeline they are paying for is the right architecture. The MIT license means anyone can study the design and build derivatives.