8/30/2025

Training Language Model Agents to Find Vulnerabilities with CTF-Dojo

16 tweets
3 min read
avatar

Thrummarise

@summarizer

CTF-DOJO introduces the first large-scale executable runtime environment tailored for training large language models (LLMs) in cybersecurity. It features 658 fully functional Capture-The-Flag (CTF) challenges containerized in Docker, ensuring reproducibility and verifiable feedback.

The authors address the scarcity of scalable execution-grounded environments by developing CTF-FORGE, an automated pipeline that converts public CTF artifacts into ready-to-use Docker environments within minutes, eliminating weeks of manual setup.

Cybersecurity increasingly demands autonomous systems that can identify software vulnerabilities rapidly. CTF competitions serve as benchmarks for evaluating models' cybersecurity reasoning, since they require multi-step adversarial strategies to uncover hidden flags.

CTF-DOJO sources challenges from pwn.college's public archive, used worldwide for cybersecurity education. Setting up these environments manually is complex and time-consuming, but CTF-FORGE automates this with over 98% success in validation, enabling scalable training data generation.

Compared to existing benchmarks, CTF-DOJO uniquely combines executable environments, agentic interaction, and real, human-developed tasks. It provides a diverse set of challenges across categories like cryptography, forensics, binary exploitation, reverse engineering, and web security.

The pipeline generates Dockerfiles, docker-compose.yml, and metadata automatically, adapting to challenge types such as web servers, binary exploits, or cryptographic services. This automation supports reproducibility and scalability essential for training cybersecurity agents.
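To make the templating idea concrete, here is a minimal sketch of the kind of Dockerfile and docker-compose.yml generation such a pipeline performs. All field names, base images, and the socat-based service wrapper are illustrative assumptions, not CTF-FORGE's actual code.

```python
# Hypothetical sketch of automated challenge packaging; the challenge
# dict schema and templates are assumptions for illustration only.

def render_dockerfile(challenge: dict) -> str:
    """Emit a minimal Dockerfile for a binary-exploitation challenge."""
    lines = [
        f"FROM {challenge.get('base_image', 'ubuntu:22.04')}",
        "WORKDIR /challenge",
        f"COPY {challenge['binary']} /challenge/",
        "COPY flag.txt /challenge/flag.txt",
        # Serve the binary over TCP so an agent can interact remotely.
        f'CMD ["socat", "TCP-LISTEN:{challenge["port"]},fork,reuseaddr", '
        f'"EXEC:/challenge/{challenge["binary"]}"]',
    ]
    return "\n".join(lines)

def render_compose(challenge: dict) -> str:
    """Emit a docker-compose.yml exposing the challenge service."""
    return (
        "services:\n"
        f"  {challenge['name']}:\n"
        "    build: .\n"
        "    ports:\n"
        f'      - "{challenge["port"]}:{challenge["port"]}"\n'
    )

chall = {"name": "baby-pwn", "binary": "vuln", "port": 31337}
print(render_dockerfile(chall))
print(render_compose(chall))
```

Different challenge types (web servers, cryptographic services) would swap in different templates, which is why the paper describes the pipeline as adapting to challenge type.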

CTF-DOJO emphasizes reliability by validating Docker builds and network responsiveness, achieving 98% consistency across 658 challenges. Scalability is ensured by continuously integrating new CTF challenges from the community with minimal manual effort.
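The network-responsiveness half of that validation can be sketched as a simple TCP probe: after a container builds, poll its service port until it accepts a connection or a deadline passes. The function name and retry policy below are assumptions, not CTF-DOJO's actual validator.

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 10.0) -> bool:
    """Poll until a TCP service answers on (host, port), or time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection completes the TCP handshake, proving
            # the challenge service is actually listening.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # brief backoff before retrying
    return False
```

A build that compiles but whose service never comes up would fail this check, which is the kind of inconsistency the 98% figure measures against.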

Training data is constructed by collecting multi-turn interaction trajectories using LLM agents like DeepSeek-V3-0324 within the CTF-DOJO environment. These trajectories capture iterative problem-solving steps, essential for teaching agents complex exploitation strategies.
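A multi-turn trajectory of this kind can be represented as an ordered list of action/observation pairs plus a verified success bit, later flattened into chat messages for supervised fine-tuning. The schema below is a minimal illustrative sketch, not the paper's actual data format.

```python
from dataclasses import dataclass, field

@dataclass
class Turn:
    action: str       # command the agent issued, e.g. a shell step
    observation: str  # runtime output fed back to the model

@dataclass
class Trajectory:
    challenge_id: str
    turns: list = field(default_factory=list)
    solved: bool = False  # verified by recovering the hidden flag

    def to_messages(self) -> list:
        """Flatten into an alternating chat format for fine-tuning."""
        msgs = []
        for t in self.turns:
            msgs.append({"role": "assistant", "content": t.action})
            msgs.append({"role": "user", "content": t.observation})
        return msgs

traj = Trajectory("pwn-101")
traj.turns.append(Turn("checksec ./vuln", "NX enabled, no canary"))
traj.solved = True
```

Because the environment verifies the flag, only trajectories with `solved=True` need to be kept, which is how execution grounding filters training data.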

To improve success rates, the authors incorporate 8,361 publicly available CTF writeups as inference-time hints, boosting model performance significantly. Runtime augmentations such as randomized ports and file paths prevent overfitting and encourage generalizable strategies.
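The runtime-augmentation idea amounts to re-randomizing per-rollout details, such as the service port and flag location, so an agent cannot memorize fixed values. The port range and naming scheme below are assumptions for illustration, not the paper's exact settings.

```python
import random
import string

def augment(challenge: dict, rng: random.Random) -> dict:
    """Return a copy of the challenge with randomized runtime details."""
    out = dict(challenge)
    out["port"] = rng.randint(20000, 60000)      # fresh port each rollout
    token = "".join(rng.choices(string.ascii_lowercase, k=8))
    out["flag_path"] = f"/srv/{token}/flag.txt"  # randomized flag location
    return out

base = {"name": "baby-pwn", "port": 31337, "flag_path": "/flag.txt"}
a = augment(base, random.Random(1))
b = augment(base, random.Random(2))
```

Two rollouts of the same challenge thus present different concrete values, pushing the agent toward discovering ports and paths rather than hard-coding them.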

Fine-tuning Qwen3 models of varying sizes (7B, 14B, 32B) on 486 high-quality trajectories from CTF-DOJO yields up to 11.6% absolute gains over baselines on three benchmarks: InterCode-CTF, NYU CTF Bench, and Cybench.

CTF-DOJO-trained models achieve state-of-the-art open-weight performance, rivaling frontier systems like DeepSeek-V3-0324 and Claude-3.5-Sonnet, while requiring significantly fewer training samples, demonstrating data efficiency in cybersecurity agent training.

Ablation studies reveal that writeup hints improve solved challenge counts by 7.4%, runtime augmentation increases solved tasks by up to 24.9%, and combining diverse teacher models enhances overall agent performance and task coverage.

The authors situate CTF-DOJO within related work on LLM agents for offensive cybersecurity, highlighting its unique contribution as the first execution environment enabling training with verified runtime feedback, unlike prior inference-only or synthetic data approaches.

CTF-DOJO's future directions include live benchmarking with dynamically containerized challenges and reinforcement learning approaches for agents to interact with live environments, aiming to improve adaptability and generalization in vulnerability discovery.

The authors acknowledge dual-use risks, emphasizing responsible development and collaboration to ensure that advanced LLM-based cybersecurity tools strengthen defenses while mitigating misuse for offensive exploits.

CTF-DOJO provides a scalable, reproducible, and democratized foundation for advancing LLM-based cybersecurity agents, marking a significant step towards automated vulnerability detection and exploitation with practical, verified training data.
