Transilience AI Pentest Skills

Official

by Transilience AI

security, advanced
Claude Code skills, AI penetration testing, agent skills, security automation, OWASP

A model scored 100% (104/104) on a published capture-the-flag security benchmark. No fine-tuning. No special training run. Just Claude Code reading a folder of markdown files. That folder is the Transilience AI Community Tools repo, and it is the most convincing proof yet that agent skills (plain-text instructions) can turn a general model into a domain specialist.

The repo (about 319 GitHub stars, primarily Python) ships 27 Claude Code skills covering vulnerability testing, reconnaissance, and specialized security domains, plus agents and slash commands for AI-powered penetration testing, bug-bounty hunting, and security research. On the same CTF benchmark suite where these skills hit 100%, Claude Sonnet 4.6 reached 96.2% and Claude Haiku 4.5 reached 62.5%, so the skills lift weaker models and max out the strong ones.

Coverage is the headline. The skills claim 100% of the OWASP Top 10 and the OWASP LLM Top 10, 90%+ of the SANS Top 25 CWEs, and MITRE ATT&CK TTP mapping. Under the hood there are 53 attack types mapped to skills and 160+ reference files drawn from PayloadsAllTheThings techniques. Three tool integrations wire it into the real world: Playwright for browser-driven attacks, Kali Linux tooling, and NVD/CVE lookup for live vulnerability data.

The architecture is genuinely agentic. A coordination skill defines three roles (coordinator, executor, and validator) so the agent plans an attack, runs it, and checks its own findings instead of blasting payloads blindly. Setup uses symlinks from a canonical skills/ and tools/ source into a project's .claude/ directory, with a containerized path via bash scripts/kali-claude-setup.sh projects/pentest for a sandboxed Kali environment.
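The coordinator / executor / validator split can be pictured with a small sketch. This is not the repo's implementation (there, the roles are defined in markdown skill files that Claude Code reads); every name below (Task, Finding, run_pipeline, the runner and recheck callables) is hypothetical, and the runners are stand-ins rather than real security checks.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable, Optional

# Hypothetical types for illustration only; not taken from the repo.
Runner = Callable[["Task"], Optional[str]]


@dataclass(frozen=True)
class Task:
    target: str  # an authorized, in-scope target
    check: str   # name of the check to run


@dataclass(frozen=True)
class Finding:
    task: Task
    evidence: str


def coordinate(target: str, checks: Iterable[str]) -> list[Task]:
    """Coordinator: turn a scope into an ordered plan of tasks."""
    return [Task(target=target, check=check) for check in checks]


def execute(task: Task, runner: Runner) -> Optional[Finding]:
    """Executor: run one task; the runner returns evidence or None."""
    evidence = runner(task)
    return Finding(task, evidence) if evidence else None


def validate(finding: Finding, recheck: Runner) -> bool:
    """Validator: keep a finding only if a second run reproduces the evidence."""
    return recheck(finding.task) == finding.evidence


def run_pipeline(target: str, checks: Iterable[str],
                 runner: Runner, recheck: Runner) -> list[Finding]:
    """Plan, execute, then validate; only confirmed findings are returned."""
    confirmed = []
    for task in coordinate(target, checks):
        finding = execute(task, runner)
        if finding is not None and validate(finding, recheck):
            confirmed.append(finding)
    return confirmed
```

The point of the third role is that a finding the validator cannot reproduce never reaches the report, which is what separates this loop from firing payloads and trusting the first response.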

Installation

git clone https://github.com/transilienceai/communitytools && cd communitytools && bash scripts/kali-claude-setup.sh projects/pentest
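For the non-containerized path, the symlink layout described above might look roughly like the following. This is a sketch under assumptions: the function name link_into_project is hypothetical, and linking the whole skills/ and tools/ directories (rather than individual skills) is a guess at the layout, so check the repo's own setup instructions before relying on it.

```python
from __future__ import annotations

from pathlib import Path


def link_into_project(repo_root: Path, project_dir: Path,
                      names: tuple[str, ...] = ("skills", "tools")) -> list[Path]:
    """Symlink <project_dir>/.claude/<name> to <repo_root>/<name> for each name."""
    claude_dir = project_dir / ".claude"
    claude_dir.mkdir(parents=True, exist_ok=True)
    links = []
    for name in names:
        source = (repo_root / name).resolve()
        if not source.is_dir():
            raise FileNotFoundError(f"missing canonical directory: {source}")
        link = claude_dir / name
        if link.is_symlink():
            link.unlink()  # refresh a stale link
        elif link.exists():
            # Refuse to touch a real file or directory.
            raise FileExistsError(f"{link} exists and is not a symlink")
        link.symlink_to(source, target_is_directory=True)
        links.append(link)
    return links
```

Because the project only holds links, pulling updates into the cloned repo updates every project that points at it, which is presumably why the setup favors symlinks over copies.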

Key Features

  • 27 Claude Code skills across vulnerability testing, recon, and specialized domains
  • Scored 100% (104/104) on a published CTF benchmark with no fine-tuning
  • 100% OWASP Top 10 and OWASP LLM Top 10 coverage, plus 90%+ SANS Top 25 CWE
  • 53 attack types mapped to skills with 160+ PayloadsAllTheThings reference files
  • Coordinator / executor / validator agent roles for self-checking workflows
  • Tool integrations: Playwright, Kali Linux tooling, and NVD/CVE lookup

Use Cases

  • Run an autonomous penetration test against an authorized target
  • Automate bug-bounty reconnaissance and vulnerability triage
  • Map findings to OWASP, SANS CWE, and MITRE ATT&CK for reporting
  • Teach Claude Code real attack techniques without fine-tuning a model
  • Spin up a sandboxed Kali + Claude security research environment
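The reporting use case (mapping findings to OWASP, CWE, and MITRE ATT&CK) amounts to a lookup from attack type to framework identifiers. A minimal sketch follows; the table and the names FrameworkTags, TAGS, and report_line are illustrative assumptions, not the repo's actual mapping. The identifiers shown refer to the OWASP Top 10 2021 edition, the CWE list, and ATT&CK technique T1190 (Exploit Public-Facing Application).

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FrameworkTags:
    owasp: str
    cwe: str
    attack: Optional[str]  # ATT&CK technique ID, if one is assigned


# Hypothetical two-entry table; a real mapping would cover every attack type.
TAGS = {
    "sql-injection": FrameworkTags("A03:2021 Injection", "CWE-89", "T1190"),
    "reflected-xss": FrameworkTags("A03:2021 Injection", "CWE-79", None),
}


def report_line(attack_type: str, location: str) -> str:
    """Format one finding with whatever framework tags are known for it."""
    tags = TAGS.get(attack_type)
    if tags is None:
        return f"{location}: {attack_type} (unmapped)"
    parts = [tags.owasp, tags.cwe]
    if tags.attack:
        parts.append(f"ATT&CK {tags.attack}")
    return f"{location}: {attack_type} [{', '.join(parts)}]"
```

Flagging unknown attack types as "(unmapped)" rather than dropping them keeps gaps in the table visible in the final report.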
