
AgentOps-AI/agentops

Two lines of Python. That is the entire cost of full observability into your AI agents: session replays, LLM cost breakdowns, step-by-step execution graphs, and exception traces. AgentOps has pulled 5.3K GitHub stars not because it does something new, but because it does something necessary with almost zero friction. Before AgentOps, monitoring an autonomous agent meant stitching together custom logging, cost calculators, and homegrown dashboards. Now you call `agentops.init()` at the top and `agentops.end_session('Success')` at the bottom, and the SDK auto-instruments every LLM call, tool invocation, and reasoning step in between.

The integration surface is what separates AgentOps from generic observability platforms like Datadog or New Relic. AgentOps speaks agent natively, with first-class support for CrewAI, AG2 (formerly AutoGen), the OpenAI Agents SDK (both Python and TypeScript), LangChain, LlamaIndex, Camel AI, Google ADK, Haystack, Smolagents, and Agno. On the model provider side, it covers OpenAI, Anthropic, Cohere, Google Generative AI, LiteLLM, IBM Watsonx, xAI, and Mem0. If your agent touches any combination of these, AgentOps captures the full interaction graph without per-integration configuration.

The session replay dashboard is where most teams get hooked. It renders a waterfall view of every LLM call, action, tool call, and error with precise timestamps: time-travel debugging for agents. You can rewind an agent's execution to the exact decision point where its reasoning diverged from the goal, spot recursive thought patterns that burn tokens in infinite loops, and catch failure modes that only surface under production load. PII redaction and audit trails come built in for compliance-sensitive teams.

On cost management, AgentOps tracks token usage and spend across every foundation model provider in real time.
For teams running multi-agent systems where a single orchestration run chains GPT-4, Claude, and a local Llama model, unified per-session cost visibility is the difference between a $50/day agent and a $500/day one. The SDK is MIT-licensed, with self-hosting support so credentials and data stay inside your security perimeter. Benchmarks show roughly 12% latency overhead, a reasonable trade-off for full execution visibility. Installation is `pip install agentops`, and the project maintains consistent development velocity with 800+ commits on main.


Why It Matters

The AI agent ecosystem has a dangerous blind spot: everyone is building agents, but almost nobody can explain what their agents actually do in production. When an agent fails at 3am, hallucinating a bad API call, entering a recursive reasoning loop, or silently burning $200 in tokens on a task that should cost $2, most teams have no replay, no trace, and no diagnosis path. AgentOps exists to close that gap with the lowest integration cost on the market.

The timing matters more than the technology. As enterprises move from single-model chatbots to multi-agent orchestration (CrewAI crews, AutoGen groups, OpenAI swarms), the observability problem compounds. Debugging one agent is hard; debugging five agents coordinating across three model providers with shared tool access is nearly impossible without purpose-built instrumentation. AgentOps is the only open-source SDK that covers all the major agent frameworks under a unified dashboard. If you are shipping agents to production without session-level observability, your costs will prove it before your users do.
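The $200-versus-$2 gap above is plain arithmetic once per-call token counts are visible. A back-of-envelope sketch of the per-session aggregation a dashboard like this performs (the model names and per-token prices below are illustrative placeholders, not real rates or AgentOps code):

```python
# Back-of-envelope per-session cost aggregation across providers.
# Prices and model names are hypothetical placeholders.
PRICE_PER_1K_TOKENS = {            # (input, output) USD per 1K tokens
    "gpt-4": (0.03, 0.06),
    "claude": (0.015, 0.075),
    "local-llama": (0.0, 0.0),     # self-hosted: no per-token API cost
}

def session_cost(events):
    """Sum spend across all LLM calls in one orchestration run.
    Each event is (model, input_tokens, output_tokens)."""
    total = 0.0
    for model, tokens_in, tokens_out in events:
        p_in, p_out = PRICE_PER_1K_TOKENS[model]
        total += tokens_in / 1000 * p_in + tokens_out / 1000 * p_out
    return round(total, 4)

# One multi-agent run chaining three providers:
events = [
    ("gpt-4", 12_000, 3_000),
    ("claude", 8_000, 2_000),
    ("local-llama", 50_000, 10_000),
]
cost = session_cost(events)  # 0.36 + 0.18 + 0.12 + 0.15 = 0.81
```

Without per-call attribution like this, a single runaway agent quietly dominates the bill; with it, the outlier session is one sort away.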

Repository Stats

Stars: 5.4k
Forks: 543
Last Commit: 10/30/2025
