ExploitBench
ExploitBench measures how far AI cybersecurity agents can climb the 'exploitation ladder,' from reaching vulnerable code to executing arbitrary payloads.
At a Glance
- AI Safety Researchers
- Cybersecurity Professionals
- Government Defense Units
AI Tools by ExploitBench
ExploitBench
AI Security Exploit Benchmark
Products & Services
A specialized benchmark designed to measure the 'exploitation ladder' for AI agents, covering steps from vulnerability discovery to arbitrary code execution.
An evaluation environment and toolkit for testing LLM-based cybersecurity agents against hardened vulnerability targets.
Market Position
ExploitBench is described as the first benchmark to measure autonomous exploitation with a granular, ladder-based approach, reporting how far an agent progressed toward a working exploit rather than a binary pass/fail result.
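To illustrate the difference between ladder-based and binary scoring, here is a minimal sketch in Python. It is not ExploitBench's actual scoring code: the `Rung` names, the two intermediate rungs, and the helper functions are hypothetical, and only the lowest and highest rungs (reaching vulnerable code, executing an arbitrary payload) come from the description above.

```python
from enum import IntEnum


class Rung(IntEnum):
    """Hypothetical ladder; the intermediate rungs are assumptions."""
    NONE = 0
    REACHED_VULNERABLE_CODE = 1   # named in the description above
    TRIGGERED_VULNERABILITY = 2   # assumed intermediate rung
    CONTROLLED_EXECUTION = 3      # assumed intermediate rung
    ARBITRARY_PAYLOAD = 4         # named in the description above


def highest_rung(achieved):
    """Return the highest rung reached without skipping a lower one."""
    top = Rung.NONE
    for rung in list(Rung)[1:]:
        if rung not in achieved:
            break
        top = rung
    return top


def binary_pass(achieved):
    """Binary view: only a full arbitrary-payload exploit counts."""
    return Rung.ARBITRARY_PAYLOAD in achieved


# Two made-up agent runs that a binary test cannot tell apart.
agent_a = {Rung.REACHED_VULNERABLE_CODE, Rung.TRIGGERED_VULNERABILITY}
agent_b = set()

for name, achieved in (("agent_a", agent_a), ("agent_b", agent_b)):
    print(name, binary_pass(achieved), highest_rung(achieved).name)
```

Both runs fail the binary check, but the ladder view shows that the first agent got two rungs up while the second made no progress.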
Leadership
Founders
Dr. David Brumley
Professor at Carnegie Mellon University (CMU) and Director of CyLab. He was formerly the CEO and Founder of ForAllSecure (acquired by Bugcrowd) and currently serves as the Chief AI and Science Officer at Bugcrowd.
Seunghyun Lee
PhD Student at Carnegie Mellon University and a leading security researcher specializing in Chrome V8 vulnerability research.
Executive Team
Dr. David Brumley
Project Lead / Professor
Renowned cybersecurity expert, professor at CMU, and executive at Bugcrowd.
Seunghyun Lee
Lead Researcher / PhD Student
Security researcher at Carnegie Mellon University focusing on autonomous exploitation.
Founding Story
ExploitBench was created by researchers at CMU and Bugcrowd to provide a realistic, hardened evaluation standard for the growing field of autonomous AI cybersecurity agents, moving beyond simple static analysis.
Business Model
Revenue Model
ExploitBench is an open-source research initiative, not a commercial product. Funding and support are provided by Carnegie Mellon University and Bugcrowd.
Pricing Tiers
The benchmark and associated code are available on GitHub for the global research community.
Target Markets
- AI Safety Researchers
- Cybersecurity Professionals
- Government Defense Units
- Benchmarking large language models (LLMs) for security
- Red teaming AI agents
- Evaluating defensive AI capabilities
- Anthropic
- Bugcrowd
- Carnegie Mellon University