aiming-lab/MetaClaw
MetaClaw turns live conversations into continuous training data: no GPU cluster, no offline datasets. You talk to your agent; it learns and evolves. The repo wraps your model behind an OpenAI-compatible API, intercepts interactions from OpenClaw, scores each turn with a reward model (e.g. GPT-5.2), and runs online LoRA fine-tuning via Tinker's cloud API. Updated weights are hot-swapped into production with zero downtime. If you've ever wanted an agent that gets better from real usage instead of static prompt engineering, MetaClaw is built for that.

It supports two learning modes: GRPO (RL from implicit feedback) and On-Policy Distillation (OPD), in which a teacher model supplies per-token log-probabilities so a smaller student can match the teacher's distribution.

Skill injection pulls the most relevant skill instructions into the system prompt at every turn. With skill evolution enabled, when the agent fails, an LLM analyzes the trajectory and generates new skills automatically.

Serving, reward modeling, and training are fully decoupled: the agent keeps responding in real time while optimization runs asynchronously.

Stack: Python, FastAPI, Tinker SDK, Kimi-2.5 (~200B MoE) or Qwen3-4B.

Quick start: install deps, point OpenClaw at the MetaClaw proxy, set TINKER_API_KEY, and run the conversation RL script. From there you just chat; MetaClaw collects turns, scores them, and trains.

About 250 stars, actively maintained (March 2026), from aiming-lab, with citations and clear config.
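To make the GRPO mode concrete: GRPO computes group-relative advantages, normalizing each sampled response's reward against the mean and spread of its sampling group. This is a minimal illustrative sketch, not MetaClaw's actual training code; the function and variable names are made up.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: center each reward on the group mean and
    scale by the group std, so 'better than siblings' is positive and
    'worse than siblings' is negative. Illustrative only."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: four sampled responses to the same turn, scored by a judge model.
rewards = [0.9, 0.2, 0.5, 0.4]
advantages = group_relative_advantages(rewards)
```

Because advantages are mean-centered, they sum to roughly zero: the policy update reinforces only the responses that beat their own group, which is what lets implicit per-turn feedback work without absolute reward calibration.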
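The OPD mode above can be sketched in the same spirit. For tokens sampled from the student, a common on-policy distillation signal is the per-token reverse-KL estimate, log p_student minus log p_teacher; minimizing it pulls the student toward the teacher's distribution. Again a hedged sketch with invented names, not the repo's implementation.

```python
import math

def opd_token_losses(student_logps, teacher_logps):
    """Per-token distillation signal on student-sampled tokens:
    positive where the student is more confident than the teacher,
    negative where it is less. Illustrative only."""
    return [s - t for s, t in zip(student_logps, teacher_logps)]

# Token 0: student overconfident relative to the teacher -> positive loss.
# Token 1: student underconfident -> negative loss.
student_logps = [math.log(0.9), math.log(0.3)]
teacher_logps = [math.log(0.6), math.log(0.5)]
losses = opd_token_losses(student_logps, teacher_logps)
```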
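Skill injection, at its simplest, is retrieval plus prompt assembly: rank stored skills by relevance to the incoming message and prepend the best matches to the system prompt. MetaClaw's actual retrieval may be more sophisticated; this keyword-overlap version is only a sketch, and the skill entries are invented.

```python
def inject_skills(base_prompt, skills, user_message, top_k=2):
    """Rank skills by word overlap with the user message and append the
    top matches to the system prompt. Hypothetical helper, not MetaClaw's API."""
    words = set(user_message.lower().split())
    ranked = sorted(
        skills.items(),
        key=lambda kv: len(words & set(kv[0].lower().split())),
        reverse=True,
    )
    chosen = [text for _, text in ranked[:top_k]]
    return base_prompt + "\n\nRelevant skills:\n" + "\n".join(chosen)

skills = {
    "git merge conflicts": "Resolve conflicts hunk by hunk; run tests after.",
    "api rate limits": "Back off exponentially on HTTP 429.",
    "csv parsing": "Always sniff the delimiter before parsing.",
}
prompt = inject_skills("You are a coding agent.", skills,
                       "How do I fix these git merge conflicts?")
```

Skill evolution would then add new entries to the `skills` store whenever the failure-analysis LLM distills a lesson from a bad trajectory, so the next turn's prompt already contains it.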
Why It Matters
Most agent frameworks are static: you ship a policy and it never improves from production traffic. MetaClaw closes the loop. Deploy once, talk to the agent, and it gets better from real conversations without you maintaining a GPU cluster — Tinker handles LoRA in the cloud. That makes continual learning feasible for teams that don't have dedicated ML infra. The OpenClaw integration is a direct fit for the current wave of proactive, autonomous agents; MetaClaw is the piece that lets those agents learn from success and failure in the wild. If you're building on OpenClaw or any OpenAI-compatible agent stack and want the agent to evolve from usage, this is the repo to clone.