PageIndex MCP
Featured · Official · by VectifyAI
PageIndex MCP is the Model Context Protocol server that turns the trending vectorless RAG approach into a one-line install for any MCP client. Drop it into Claude Code, Cursor, Codex CLI, Gemini CLI or Claude Desktop and you get long-document retrieval with no vector database, no chunking pipeline and no embedding model in the loop.

The architectural bet is the same one the parent PageIndex repo made: reasoning beats similarity for long, structured documents. The MCP server exposes that retrieval as a clean tool surface. The agent calls one MCP tool to query a PDF; PageIndex builds (or retrieves) the hierarchical document tree and lets the LLM tree-search it. The agent gets back the relevant passages plus the structural context needed to interpret them. No vector DB to provision, no embeddings to refresh, no chunking strategy to tune. A minimal client-side sketch of that call flow appears at the end of this overview.

The benchmark backing this approach is the part that should pull engineering leads off the fence. The parent project hits 98.7% on FinanceBench, a long-document QA benchmark where vector RAG has plateaued in the low 80s. The MCP server uses the same retrieval engine, so the benchmark transfers directly. If you are running RAG over financial filings, legal contracts or technical manuals, the accuracy gain is not marginal.

Distribution comes down to three paths. The npm package pageindex-mcp is the easiest install: one command and your MCP client sees the server. For teams that need on-prem deployment or strict data control, the open-source repo at VectifyAI/pageindex-mcp self-hosts cleanly, and the parent repo's MIT license carries through. The hosted cloud (chat.pageindex.ai) is the third option, for teams that want zero infrastructure.

The MCP server lists on mcpmarket.com under the tagline "Self-Hosted Vectorless PDF RAG for Claude." That is the sales pitch in seven words. The MCP server crowd is currently dominated by AWS Labs (infrastructure tools), Jama Software (engineering specs) and a handful of database connectors; PageIndex MCP is the first MCP server on the market focused entirely on the document-retrieval problem.

The honest trade-offs: tree-search per query burns more model tokens than a vector lookup, and the first ingest of a long document takes 30 seconds to a couple of minutes while the tree builds. For workloads with millions of short queries on short documents, traditional vector RAG is still the cheaper architecture. For high-stakes questions on long, structured documents, the kind of work where a wrong answer has audit consequences, the accuracy gain is the right trade.

Where it fits on Skila: the parent project at PageIndex on Skila Repos covers the architecture and benchmark detail. For the broader productivity research suggesting teams should consolidate retrieval into one MCP server rather than stack three, read why more AI tools make your team slower. For governance over the agents using this MCP server in production, see Microsoft Agent 365.
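To make the one-tool call flow concrete, here is a minimal client-side sketch using the official TypeScript MCP SDK (@modelcontextprotocol/sdk). The tool name query_document and its argument shape are illustrative assumptions, not PageIndex MCP's published schema; a real agent discovers the actual tool names via listTools.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Launch the server as a subprocess, the same way an MCP client would.
const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "pageindex-mcp"],
});

const client = new Client({ name: "demo-agent", version: "0.1.0" });
await client.connect(transport);

// Discover whatever tools the server actually exposes.
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name));

// Hypothetical call: one tool, one question, no vector index in sight.
// PageIndex builds (or reuses) the document tree and tree-searches it server-side.
const result = await client.callTool({
  name: "query_document", // assumed name; check the listTools output
  arguments: {
    document: "annual-report.pdf", // assumed parameter names
    query: "What drove the change in services gross margin?",
  },
});
console.log(result.content);
```

The point of the sketch is the shape of the interaction: a single tool call in, passages plus structural context out, with tree building and tree search happening entirely server-side.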
Installation
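Because distribution is an npm package, installation reduces to registering the server with your MCP client. A minimal sketch for Claude Desktop's claude_desktop_config.json, assuming pageindex-mcp exposes a same-named executable runnable via npx; the "pageindex" key is an arbitrary local label, and the VectifyAI/pageindex-mcp README is the authority on the exact command and any required API-key environment variables:

```json
{
  "mcpServers": {
    "pageindex": {
      "command": "npx",
      "args": ["-y", "pageindex-mcp"]
    }
  }
}
```

Other MCP clients (Claude Code, Cursor, Codex CLI, Gemini CLI) accept the same command-plus-args registration through their own config files or CLIs, and self-hosters can point the command at a local checkout of the open-source repo instead.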
Key Features
- ✓Exposes PageIndex's vectorless reasoning-based RAG over the Model Context Protocol
- ✓Compatible with Claude Code, Cursor, Codex CLI, Gemini CLI and Claude Desktop
- ✓No vector database setup — eliminates the chunking and embedding pipeline entirely
- ✓Backed by 98.7% accuracy on FinanceBench at the parent project level
- ✓Available as an npm package (pageindex-mcp) for one-command install
- ✓Self-host the open-source repo for on-prem or strict-data-control deployments
- ✓Listed on mcpmarket.com as 'Self-Hosted Vectorless PDF RAG for Claude'
Use Cases
- →Claude Code agents that need to query 300-page financial filings without losing cross-reference context
- →Cursor workflows that pull current acceptance criteria from a long technical specification
- →Codex CLI sessions that reason over multi-section legal contracts on the fly
- →Gemini CLI agents that retrieve regulatory documents without standing up a vector database
- →Self-hosted RAG deployments where compliance forbids sending document content to a third-party embedding API
- →Teams replacing a Pinecone or Weaviate deployment for long-document workloads