Frontier AI has broken the open CTF format

TLDR

  • A top-tier CTF competitor argues Claude Opus 4.5 and GPT-5.5 have automated enough of the scoreboard that open online CTFs no longer measure human security skill.

Key Takeaways

  • Claude Code plus the CTFd API makes it trivial to spin up one agent per challenge; teams now compete on orchestration budget, not security depth.
  • GPT-5.5 Pro can one-shot Insane-rated leakless heap pwn challenges on HackTheBox, which suggests most open CTF challenge pools are within frontier model range.
  • Open CTFs are effectively pay-to-win: spending more tokens over a 48-hour event means faster flag capture, independent of human skill.
  • The beginner ladder is broken; new players are pushed to use AI before building the instincts that AI is replacing, collapsing the learning feedback loop.
  • Alternatives like picoGym and HackTheBox lab environments are better fits now because the stated goal is education, not a public scoreboard.
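
The one-agent-per-challenge pattern from the first takeaway can be sketched roughly as follows. This is a minimal illustration, not the author's setup: it assumes a CTFd instance that exposes GET /api/v1/challenges with token authentication and returns a JSON object whose "data" field is a list of challenges. The run_agent function is a hypothetical placeholder that only records a dispatch; it does not launch a model or attempt a solve.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen


def fetch_challenges(base_url: str, token: str) -> list[dict]:
    """Fetch the challenge list from a CTFd instance (assumed v1 API shape)."""
    req = Request(
        f"{base_url.rstrip('/')}/api/v1/challenges",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )
    with urlopen(req) as resp:
        payload = json.load(resp)
    return payload["data"]


def run_agent(challenge: dict) -> dict:
    """Hypothetical placeholder: a real setup would start an agent session here."""
    return {"id": challenge["id"], "name": challenge["name"], "status": "dispatched"}


def fan_out(challenges: list[dict], max_workers: int = 8) -> list[dict]:
    """Dispatch one worker per challenge; results keep the input order."""
    if not challenges:
        return []
    workers = min(max_workers, len(challenges))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_agent, challenges))
```

In this sketch max_workers stands in for the orchestration budget: raising it is a matter of spend rather than skill, which is the point the takeaway makes.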

Hacker News Comment Review

  • Commenters drew a direct parallel to education collapse: the same “do it for me” temptation that breaks CTF learning is degrading homework, coding assignments, and university coursework broadly.
  • The chess engine analogy was contested sharply; engines are banned during competitive play, so allowing frontier models in CTFs is not augmentation but replacement of the competitor.
  • Offline or air-gapped competition formats were suggested as a structural fix, mirroring how competitive programming handles fairness, though commenters noted that this only gates access to the tools rather than solving the skill-signal problem.

Notable Comments

  • @sumeno: “Using AI on CTF is like using a car to get better at the 100 yard dash” – sharpest framing of the skill-replacement distinction.
  • @lg5689: Notes that code golf, an even more niche domain with minimal training data, is also falling to frontier models, suggesting no competitive programming format is safe.
