Transilience AI Pentest Skills
Official, by Transilience AI
A model scored 100% (104/104) on a published capture-the-flag security benchmark. No fine-tuning. No special training run. Just Claude Code reading a folder of markdown files. That folder is the Transilience AI Community Tools repo, and it is the most convincing proof yet that agent skills (plain text instructions) can turn a general model into a domain specialist.

The repo (about 319 GitHub stars, primarily Python) ships 27 Claude Code skills covering vulnerability testing, reconnaissance, and specialized security domains, plus agents and slash commands for AI-powered penetration testing, bug-bounty hunting, and security research. On the same CTF benchmark suite where these skills hit 100%, Claude Sonnet 4.6 reached 96.2% and Claude Haiku 4.5 reached 62.5%, so the skills lift weaker models and max out the strong ones.

Coverage is the headline. The skills claim 100% of the OWASP Top 10 and the OWASP LLM Top 10, 90%+ of the SANS Top 25 CWEs, and MITRE ATT&CK TTP mapping. Under the hood there are 53 attack types mapped to skills and 160+ reference files drawn from PayloadsAllTheThings techniques. Three tool integrations wire it into the real world: Playwright for browser-driven attacks, Kali Linux tooling, and NVD/CVE lookup for live vulnerability data.

The architecture is genuinely agentic. A coordination skill defines three roles (coordinator, executor, and validator) so the agent plans an attack, runs it, and checks its own findings instead of blasting payloads blindly. Setup uses symlinks from a canonical skills/ and tools/ source into a project's .claude/ directory, with a containerized path via bash scripts/kali-claude-setup.sh projects/pentest for a sandboxed Kali environment.
Installation
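The page describes setup as symlinking the canonical skills/ and tools/ directories into a project's .claude/ directory. The repo's own setup steps are not reproduced here, so the following is only a minimal sketch of that idea. The function name `link_skills` and the link names `.claude/skills` and `.claude/tools` are assumptions made for illustration, not the repo's confirmed layout.

```python
from pathlib import Path


def link_skills(repo_root: Path, project_dir: Path,
                names=("skills", "tools")) -> list[Path]:
    """Symlink repo_root/<name> into project_dir/.claude/<name>.

    Hypothetical helper: the target layout is an assumption, not
    taken from the repo's documentation.
    """
    claude_dir = project_dir / ".claude"
    claude_dir.mkdir(parents=True, exist_ok=True)

    created = []
    for name in names:
        source = (repo_root / name).resolve()
        if not source.is_dir():
            raise FileNotFoundError(f"missing source directory: {source}")

        link = claude_dir / name
        if link.is_symlink():
            link.unlink()  # replace a stale link from an earlier run
        elif link.exists():
            # Refuse to overwrite a real file or directory.
            raise FileExistsError(f"{link} exists and is not a symlink")

        link.symlink_to(source, target_is_directory=True)
        created.append(link)
    return created
```

Because the links point back at one canonical source, updating the cloned repo would update every linked project at once. For the sandboxed Kali environment, the page gives the command `bash scripts/kali-claude-setup.sh projects/pentest`.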
Key Features
- ✓ 27 Claude Code skills across vulnerability testing, recon, and specialized domains
- ✓ Scored 100% (104/104) on a published CTF benchmark with no fine-tuning
- ✓ 100% OWASP Top 10 and OWASP LLM Top 10 coverage, plus 90%+ SANS Top 25 CWE
- ✓ 53 attack types mapped to skills with 160+ PayloadsAllTheThings reference files
- ✓ Coordinator / executor / validator agent roles for self-checking workflows
- ✓ Tool integrations: Playwright, Kali Linux tooling, and NVD/CVE lookup
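The coordinator / executor / validator split can be pictured as a plan, run, check loop. In the repo these roles are defined in markdown skill files rather than code, so the sketch below is only an illustration of the control flow. Every name in it (`Task`, `Finding`, `plan`, `run_workflow`, and the stand-in executor and validator) is hypothetical, and nothing here touches a network.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    target: str
    attack_type: str


@dataclass
class Finding:
    task: Task
    evidence: str
    validated: bool = False


def plan(target: str, attack_types: list[str]) -> list[Task]:
    """Coordinator: turn a target and a list of attack types into tasks."""
    return [Task(target, attack_type) for attack_type in attack_types]


def run_workflow(target, attack_types, execute, validate) -> list[Finding]:
    """Run each planned task, then keep only findings the validator confirms."""
    confirmed = []
    for task in plan(target, attack_types):
        finding = execute(task)       # executor: attempt the task
        if finding is None:
            continue                  # nothing reported for this task
        if validate(finding):         # validator: independent check
            finding.validated = True
            confirmed.append(finding)
    return confirmed


# Stand-in executor and validator with canned data (illustration only).
def fake_execute(task: Task) -> Optional[Finding]:
    canned = {"sqli": "database error echoed in response", "xss": ""}
    if task.attack_type not in canned:
        return None
    return Finding(task, canned[task.attack_type])


def fake_validate(finding: Finding) -> bool:
    # Reject findings that carry no supporting evidence.
    return bool(finding.evidence.strip())


results = run_workflow("https://target.example", ["sqli", "xss", "ssrf"],
                       fake_execute, fake_validate)
print([f.task.attack_type for f in results])  # prints ['sqli']
```

The point of the separate validator step is that an unsupported finding (the empty-evidence `xss` entry here) is dropped before it reaches a report.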
Use Cases
- → Run an autonomous penetration test against an authorized target
- → Automate bug-bounty reconnaissance and vulnerability triage
- → Map findings to OWASP, SANS CWE, and MITRE ATT&CK for reporting
- → Teach Claude Code real attack techniques without fine-tuning a model
- → Spin up a sandboxed Kali + Claude security research environment
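To make the reporting use case concrete, here is a small sketch of tagging a finding with its OWASP category, CWE, and ATT&CK technique. The lookup table is a hand-written illustration using well-known public identifiers (OWASP Top 10 2021, CWE-89/79/918, ATT&CK T1190). It is not the repo's actual mapping, and the names `Mapping`, `MAPPINGS`, and `report_line` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mapping:
    owasp: str             # OWASP Top 10 (2021) category
    cwe: str               # CWE identifier
    attack: Optional[str]  # MITRE ATT&CK technique ID, if one is assigned here


# Illustrative table only; a real mapping would come from the skills themselves.
MAPPINGS = {
    "sqli": Mapping("A03:2021 Injection", "CWE-89", "T1190"),
    "xss": Mapping("A03:2021 Injection", "CWE-79", None),
    "ssrf": Mapping("A10:2021 Server-Side Request Forgery", "CWE-918", "T1190"),
}


def report_line(finding_type: str, url: str) -> str:
    """Format one finding as a single report line with its framework tags."""
    mapping = MAPPINGS.get(finding_type)
    if mapping is None:
        return f"{finding_type} at {url}: unmapped"
    attack = mapping.attack or "n/a"
    return (f"{finding_type} at {url}: "
            f"{mapping.owasp} | {mapping.cwe} | ATT&CK {attack}")


print(report_line("sqli", "https://target.example/login"))
```

Keeping the mapping in one table means a report can be regrouped by framework (all CWE-89 findings, all A03 findings) without re-running any test.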