8/30/2025

Training Language Model Agents to Find Vulnerabilities with CTF-Dojo

16 tweets
3 min read
avatar

Thrummarise

@summarizer

CTF-DOJO introduces the first large-scale executable runtime environment tailored for training large language models (LLMs) in cybersecurity. It features 658 fully functional Capture-The-Flag (CTF) challenges containerized in Docker, ensuring reproducibility and verifiable feedback.

The authors address the scarcity of scalable execution-grounded environments by developing CTF-FORGE, an automated pipeline that converts public CTF artifacts into ready-to-use Docker environments within minutes, eliminating weeks of manual setup.

Cybersecurity increasingly demands autonomous systems that can identify software vulnerabilities rapidly. CTF competitions serve as benchmarks for evaluating models' cybersecurity reasoning, since they require multi-step adversarial strategies to uncover hidden flags.

CTF-DOJO sources challenges from pwn.college's public archive, used worldwide for cybersecurity education. Setting up these environments manually is complex and time-consuming, but CTF-FORGE automates this with over 98% success in validation, enabling scalable training data generation.

Compared to existing benchmarks, CTF-DOJO uniquely combines executable environments, agentic interaction, and real, human-developed tasks. It provides a diverse set of challenges across categories like cryptography, forensics, binary exploitation, reverse engineering, and web security.

The pipeline generates Dockerfiles, docker-compose.yml, and metadata automatically, adapting to challenge types such as web servers, binary exploits, or cryptographic services. This automation supports reproducibility and scalability essential for training cybersecurity agents.
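To make the templating idea concrete, here is a minimal sketch of the kind of Dockerfile and docker-compose.yml generation such a pipeline performs. All field names, base images, and the socat-based service wrapper are illustrative assumptions, not CTF-FORGE's actual code.

```python
# Hypothetical sketch of automated challenge packaging; the challenge
# dict schema and templates are assumptions for illustration only.

def render_dockerfile(challenge: dict) -> str:
    """Emit a minimal Dockerfile for a binary-exploitation challenge."""
    lines = [
        f"FROM {challenge.get('base_image', 'ubuntu:22.04')}",
        "WORKDIR /challenge",
        f"COPY {challenge['binary']} /challenge/",
        "COPY flag.txt /challenge/flag.txt",
        # Serve the binary over TCP so an agent can interact remotely.
        f'CMD ["socat", "TCP-LISTEN:{challenge["port"]},fork,reuseaddr", '
        f'"EXEC:/challenge/{challenge["binary"]}"]',
    ]
    return "\n".join(lines)

def render_compose(challenge: dict) -> str:
    """Emit a docker-compose.yml exposing the challenge service."""
    return (
        "services:\n"
        f"  {challenge['name']}:\n"
        "    build: .\n"
        "    ports:\n"
        f'      - "{challenge["port"]}:{challenge["port"]}"\n'
    )

chall = {"name": "baby-pwn", "binary": "vuln", "port": 31337}
print(render_dockerfile(chall))
print(render_compose(chall))
```

Different challenge types (web servers, cryptographic services) would swap in different templates, which is why the paper describes the pipeline as adapting to challenge type.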

CTF-DOJO emphasizes reliability by validating Docker builds and network responsiveness, achieving 98% consistency across 658 challenges. Scalability is ensured by continuously integrating new CTF challenges from the community with minimal manual effort.
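The network-responsiveness half of that validation can be sketched as a simple TCP probe: after a container builds, poll its service port until it accepts a connection or a deadline passes. The function name and retry policy below are assumptions, not CTF-DOJO's actual validator.

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 10.0) -> bool:
    """Poll until a TCP service answers on (host, port), or time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection completes the TCP handshake, proving
            # the challenge service is actually listening.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # brief backoff before retrying
    return False
```

A build that compiles but whose service never comes up would fail this check, which is the kind of inconsistency the 98% figure measures against.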

Training data is constructed by collecting multi-turn interaction trajectories using LLM agents like DeepSeek-V3-0324 within the CTF-DOJO environment. These trajectories capture iterative problem-solving steps, essential for teaching agents complex exploitation strategies.
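A multi-turn trajectory of this kind can be represented as an ordered list of action/observation pairs plus a verified success bit, later flattened into chat messages for supervised fine-tuning. The schema below is a minimal illustrative sketch, not the paper's actual data format.

```python
from dataclasses import dataclass, field

@dataclass
class Turn:
    action: str       # command the agent issued, e.g. a shell step
    observation: str  # runtime output fed back to the model

@dataclass
class Trajectory:
    challenge_id: str
    turns: list = field(default_factory=list)
    solved: bool = False  # verified by recovering the hidden flag

    def to_messages(self) -> list:
        """Flatten into an alternating chat format for fine-tuning."""
        msgs = []
        for t in self.turns:
            msgs.append({"role": "assistant", "content": t.action})
            msgs.append({"role": "user", "content": t.observation})
        return msgs

traj = Trajectory("pwn-101")
traj.turns.append(Turn("checksec ./vuln", "NX enabled, no canary"))
traj.solved = True
```

Because the environment verifies the flag, only trajectories with `solved=True` need to be kept, which is how execution grounding filters training data.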

To improve success rates, the authors incorporate 8,361 publicly available CTF writeups as inference-time hints, boosting model performance significantly. Runtime augmentations such as randomized ports and file paths prevent overfitting and encourage generalizable strategies.
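The runtime-augmentation idea amounts to re-randomizing per-rollout details, such as the service port and flag location, so an agent cannot memorize fixed values. The port range and naming scheme below are assumptions for illustration, not the paper's exact settings.

```python
import random
import string

def augment(challenge: dict, rng: random.Random) -> dict:
    """Return a copy of the challenge with randomized runtime details."""
    out = dict(challenge)
    out["port"] = rng.randint(20000, 60000)      # fresh port each rollout
    token = "".join(rng.choices(string.ascii_lowercase, k=8))
    out["flag_path"] = f"/srv/{token}/flag.txt"  # randomized flag location
    return out

base = {"name": "baby-pwn", "port": 31337, "flag_path": "/flag.txt"}
a = augment(base, random.Random(1))
b = augment(base, random.Random(2))
```

Two rollouts of the same challenge thus present different concrete values, pushing the agent toward discovering ports and paths rather than hard-coding them.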

Fine-tuning Qwen3 models of varying sizes (7B, 14B, 32B) on 486 high-quality trajectories from CTF-DOJO yields up to 11.6% absolute gains over baselines on three benchmarks: InterCode-CTF, NYU CTF Bench, and Cybench.

CTF-DOJO-trained models achieve state-of-the-art open-weight performance, rivaling frontier systems like DeepSeek-V3-0324 and Claude-3.5-Sonnet, while requiring significantly fewer training samples, demonstrating data efficiency in cybersecurity agent training.

Ablation studies reveal that writeup hints improve solved challenge counts by 7.4%, runtime augmentation increases solved tasks by up to 24.9%, and combining diverse teacher models enhances overall agent performance and task coverage.

The authors situate CTF-DOJO within related work on LLM agents for offensive cybersecurity, highlighting its unique contribution as the first execution environment enabling training with verified runtime feedback, unlike prior inference-only or synthetic data approaches.

CTF-DOJO's future directions include live benchmarking with dynamically containerized challenges and reinforcement learning approaches for agents to interact with live environments, aiming to improve adaptability and generalization in vulnerability discovery.

The authors acknowledge dual-use risks, emphasizing responsible development and collaboration to ensure that advanced LLM-based cybersecurity tools strengthen defenses while mitigating misuse for offensive exploits.

CTF-DOJO provides a scalable, reproducible, and democratized foundation for advancing LLM-based cybersecurity agents, marking a significant step towards automated vulnerability detection and exploitation with practical, verified training data.
