Firecrawl
by Firecrawl
The Firecrawl MCP Server is the official Model Context Protocol integration from Firecrawl that brings production-grade web scraping, crawling, and structured data extraction directly into AI assistants like Claude, Cursor, Windsurf, and VS Code. Firecrawl is backed by Y Combinator and trusted by over 80,000 companies, with its core open-source project earning more than 88,000 GitHub stars. The MCP server itself has accumulated roughly 5,700 stars and receives over 8,800 weekly npm downloads, making it one of the most widely adopted web scraping servers in the MCP ecosystem.

At its core, the server converts any publicly accessible website into clean, LLM-ready markdown or structured JSON by stripping ads, navigation elements, footers, cookie banners, and other boilerplate content. It handles JavaScript-rendered single-page applications, dynamically loaded content, and even PDF and DOCX documents without requiring the developer to manage headless browsers or proxy infrastructure. Firecrawl covers approximately 96 percent of the web without proxies and maintains a 95.3 percent scraping success rate with an average response time of seven seconds.

The server exposes twelve MCP tools organized around five core capabilities:

- scrape extracts content from individual pages with support for markdown, JSON, and screenshot output formats.
- batch_scrape processes multiple URLs in parallel for high-throughput extraction workflows.
- crawl traverses entire domains with configurable depth limits, URL filtering, and deduplication.
- map discovers all indexed URLs on a site without requiring a sitemap.
- search performs web searches with geographic targeting and time-based filtering, optionally scraping the full content of each result.
- extract uses either cloud AI or self-hosted LLMs to pull structured data matching a developer-defined JSON schema.
- agent conducts autonomous multi-step research by browsing multiple sources and synthesizing findings.
- Four browser tools (create, execute, delete, list) provide persistent Chrome DevTools Protocol sessions for interactive automation tasks like form filling, clicking, and scrolling.

Configuration requires a single FIRECRAWL_API_KEY environment variable for cloud usage, with an optional FIRECRAWL_API_URL for self-hosted deployments. The server includes built-in resilience features: automatic retry with exponential backoff (configurable up to three attempts with delays from one to ten seconds), rate limit handling, and credit usage monitoring with configurable warning and critical thresholds.

Installation takes a single command via npx, and the server supports both STDIO and Server-Sent Events transports for compatibility with remote and local MCP client configurations. A free tier provides 500 scraped pages to get started, with paid plans scaling from hobby to enterprise usage levels.
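The extract capability described above is driven by a developer-defined JSON schema, so only the requested fields come back. The sketch below shows what such a schema-driven request body might look like; the field names `product_name`, `price_usd`, and `in_stock` are hypothetical, and the overall body structure is illustrative rather than authoritative, so verify parameter names against the current Firecrawl API reference.

```python
import json

# Hypothetical JSON schema describing the fields we want back.
# Returning only these fields keeps the LLM context window small.
product_schema = {
    "type": "object",
    "properties": {
        "product_name": {"type": "string"},
        "price_usd": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["product_name", "price_usd"],
}

# Sketch of an extract-style request body pairing target URLs
# with the schema (structure is illustrative, not authoritative).
request_body = {
    "urls": ["https://example.com/products/widget"],
    "schema": product_schema,
}

print(json.dumps(request_body, indent=2))
```

The same schema-first pattern applies whether the extraction runs against Firecrawl's cloud AI or a self-hosted LLM.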
Installation
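A typical MCP client configuration looks like the sketch below (shown in the format used by Claude Desktop's `claude_desktop_config.json`). The `firecrawl-mcp` npm package name and the `FIRECRAWL_API_KEY` variable match the description above, but the placeholder key is hypothetical and the exact config file location varies by client, so consult your client's documentation.

```json
{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "fc-YOUR_API_KEY"
      }
    }
  }
}
```

For self-hosted deployments, an additional `FIRECRAWL_API_URL` entry in the `env` block points the server at your own instance.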
Key Features
- ✓ Twelve MCP tools covering single-page scraping, batch processing, full-site crawling, URL discovery, web search, structured data extraction, autonomous agent research, and persistent browser automation sessions
- ✓ Converts any website into clean, LLM-ready markdown or structured JSON by stripping ads, navigation, footers, cookie banners, and boilerplate, achieving a 95.3 percent success rate with a 7-second average response time
- ✓ Structured data extraction via JSON schema definitions using cloud AI or self-hosted LLMs, preventing context window overflow by returning only the fields your application needs
- ✓ Built-in resilience with automatic retry and exponential backoff, rate limit handling, and credit usage monitoring with configurable warning and critical thresholds
- ✓ Handles JavaScript-rendered SPAs, dynamically loaded content, PDFs, and DOCX files without requiring developers to manage headless browsers or proxy infrastructure
- ✓ Supports both cloud-hosted and self-hosted deployments, with STDIO and Server-Sent Events transports for compatibility across Claude Desktop, Claude Code, Cursor, Windsurf, VS Code, and n8n
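The retry behaviour listed above (up to three attempts, delays from one to ten seconds) amounts to a simple exponential backoff schedule. The sketch below assumes a doubling factor between attempts; the real server reads these values from environment variables, so treat the numbers as illustrative defaults, not the server's exact implementation.

```python
def backoff_delays(max_attempts: int = 3,
                   initial_delay: float = 1.0,
                   max_delay: float = 10.0,
                   factor: float = 2.0) -> list[float]:
    """Delay (in seconds) waited before each retry, capped at max_delay."""
    delays = []
    delay = initial_delay
    for _ in range(max_attempts):
        delays.append(min(delay, max_delay))
        delay *= factor
    return delays

print(backoff_delays())                # [1.0, 2.0, 4.0]
print(backoff_delays(max_attempts=5))  # [1.0, 2.0, 4.0, 8.0, 10.0]
```

Note how the ten-second cap kicks in on later attempts, which keeps long retry chains from stalling a scraping pipeline.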
Use Cases
- → Giving AI coding assistants the ability to scrape documentation sites, API references, and technical blogs in real time to provide accurate, current answers without leaving the editor
- → Building automated research pipelines where AI agents crawl competitor websites, extract product data into structured JSON schemas, and compile market intelligence reports
- → Extracting structured datasets from web pages at scale using batch scraping and JSON schema extraction for training data collection, price monitoring, or content aggregation
- → Enabling AI assistants to search the web, retrieve full page content from results, and synthesize findings across multiple sources for comprehensive research tasks
- → Automating browser-based workflows like form submissions, login sequences, and multi-step data entry through persistent Chrome DevTools Protocol sessions controlled by AI agents
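As a concrete example of the crawling use cases above, a research pipeline would typically constrain a crawl with a page limit, a depth limit, and path filters. The parameter names below mirror Firecrawl's documented crawl options, but the exact option set can differ between API versions, so treat this as a hedged sketch and check the current API reference.

```python
import json

# Illustrative crawl configuration. Parameter names follow
# Firecrawl's crawl options, but verify them against the
# current API reference before relying on this structure.
crawl_request = {
    "url": "https://example.com",
    "limit": 50,                        # stop after 50 pages
    "maxDepth": 2,                      # follow links two hops deep
    "includePaths": ["^/blog/.*"],      # only crawl blog pages
    "excludePaths": ["^/blog/tag/.*"],  # skip tag index pages
    "scrapeOptions": {"formats": ["markdown"]},
}

print(json.dumps(crawl_request, indent=2))
```

Combining a tight `limit` with path filters keeps credit usage predictable when pointing an agent at a large competitor site.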