Frontier AI has broken the open CTF format

May 16, 2026 · ai security

TLDR

A top-tier CTF competitor argues Claude Opus 4.5 and GPT-5.5 have automated enough of the scoreboard that open online CTFs no longer measure human security skill.

Claude Code plus the CTFd API makes it trivial to spin up one agent per challenge; teams now compete on orchestration budget, not security depth.
GPT-5.5 Pro can one-shot an Insane-rated leakless heap pwn challenge on HackTheBox, which suggests most open CTF challenge pools are within reach of frontier models.
Open CTFs are effectively pay-to-win: more tokens thrown at a 48-hour event means faster flag capture, independent of human skill.
The beginner ladder is broken; new players are pushed to use AI before building the instincts that AI is replacing, collapsing the learning feedback loop.
Alternatives like picoGym and HackTheBox lab environments are better fits now because their stated goal is education, not a public scoreboard.
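
To make the "one agent per challenge" point concrete, here is a minimal sketch of the fan-out pattern it describes. It is an illustration under stated assumptions, not the competitor's actual setup: the sample payload only mimics the shape of a CTFd challenge listing, `run_agent` is a hypothetical stub that launches nothing, and no network call is made.

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Sample payload shaped like a CTFd challenge listing.
# Assumed shape for illustration; nothing is fetched over the network.
SAMPLE_LISTING = json.dumps({
    "success": True,
    "data": [
        {"id": 1, "name": "warmup-rev", "category": "rev", "value": 100},
        {"id": 2, "name": "heap-note", "category": "pwn", "value": 500},
        {"id": 3, "name": "baby-rsa", "category": "crypto", "value": 200},
    ],
})


def parse_challenges(listing_json):
    """Return the challenge dicts from a listing, or an empty list on failure."""
    payload = json.loads(listing_json)
    if not payload.get("success"):
        return []
    return payload.get("data", [])


def run_agent(challenge):
    """Hypothetical stand-in for starting one agent session on one challenge.

    A real setup would start a model session here; this stub only records
    what would have been dispatched.
    """
    return {"id": challenge["id"], "name": challenge["name"], "status": "dispatched"}


def fan_out(challenges, max_workers=8):
    """Dispatch one worker per challenge and collect results in input order."""
    if not challenges:
        return []
    workers = min(max_workers, len(challenges))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_agent, challenges))


results = fan_out(parse_challenges(SAMPLE_LISTING))
for r in results:
    print(r["id"], r["name"], r["status"])
```

The whole dispatch loop is a few lines, and the only knob that scales is `max_workers`, which is the sense in which the competition shifts toward orchestration budget.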

Commenters drew a direct parallel to education collapse: the same “do it for me” temptation that breaks CTF learning is degrading homework, coding assignments, and university coursework broadly.
The chess engine analogy was contested sharply; engines are banned during competitive play, so allowing frontier models in CTFs is not augmentation but replacement of the competitor.
Offline or air-gapped competition formats were suggested as a structural fix, mirroring how competitive programming handles fairness, though commenters noted that this only gates access rather than solving the skill-signal problem.

@sumeno: “Using AI on CTF is like using a car to get better at the 100 yard dash” – sharpest framing of the skill-replacement distinction.
@lg5689: Notes that code golf, an even more niche domain with minimal training data, is also falling to frontier models, suggesting no competitive programming format is safe.