promptfoo/promptfoo
Promptfoo is a CLI and library that takes the trial-and-error out of LLM application development. Instead of manually testing prompts and hoping they work, you define test cases with expected outputs, run them against multiple models simultaneously, and get a pass/fail report with a visual dashboard.

The tool supports every major provider, including OpenAI, Anthropic, Azure, Bedrock, Ollama, and dozens more, so you can compare model performance side by side without rewriting code. Define your evaluations in YAML, run them from the terminal, and view results in a browser-based comparison UI.

What sets promptfoo apart from other eval frameworks is its red-teaming capability. Beyond functional testing, it scans your LLM apps for security vulnerabilities: prompt injection, jailbreaks, PII leakage, and harmful content generation. This makes it both a quality assurance tool and a security scanner in one package.

The developer experience is polished. Local-first execution means your data never leaves your machine. Built-in caching speeds up repeated runs. CI/CD integration lets you block deployments when prompt quality drops. And the PR review feature automatically flags LLM security issues in pull requests.

With 16.4K GitHub stars, 398 releases, and active maintenance (latest release March 12, 2026), promptfoo has become the de facto standard for teams that take LLM output quality seriously.
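To make the YAML-based workflow concrete, here is a minimal sketch of an evaluation config. The structure (`prompts`, `providers`, `tests` with `vars` and `assert`) follows promptfoo's documented format; the specific model IDs, prompt text, and test values are illustrative placeholders.

```yaml
# promptfooconfig.yaml — minimal sketch; model IDs and test data are illustrative
description: "Summarization quality check"

prompts:
  - "Summarize the following in one sentence: {{article}}"

providers:
  - openai:gpt-4o-mini
  - anthropic:messages:claude-3-5-haiku-20241022

tests:
  - vars:
      article: "Promptfoo is a CLI and library for systematically testing LLM prompts."
    assert:
      # case-insensitive substring check on the model output
      - type: icontains
        value: "promptfoo"
      # fail if the response takes longer than 5 seconds
      - type: latency
        threshold: 5000
```

Running `npx promptfoo@latest eval` executes every test against every provider and prints the pass/fail matrix; `npx promptfoo@latest view` opens the browser-based comparison UI described above.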
Why It Matters
Most teams ship LLM features with zero systematic testing: they eyeball outputs and pray. Promptfoo turns prompt engineering from guesswork into a measurable engineering discipline with automated evaluations, regression testing, and security scanning. It's the missing QA layer for the AI-native stack.