Mozilla says 271 vulnerabilities found by Mythos and "almost no false positives"

TLDR

  • Mozilla used Anthropic’s Mythos model plus a custom agent harness to find 271 Firefox security flaws in two months with almost no false positives.

Key Takeaways

  • The agent harness, not the model alone, drove results: it wraps the LLM, gives it instructions and tools, and runs it in a loop until completion.
  • Mozilla gave Mythos access to the same tools and Firefox test builds that human developers use, enabling realistic code analysis.
  • Earlier attempts at AI vulnerability detection produced many hallucinated bug reports, which took significant human triage to discard.
  • Mozilla Distinguished Engineer Brian Grinstead credited two factors: model improvements and Mozilla’s investment in project-specific harness customization.
  • Mozilla’s CTO has publicly claimed AI means “zero-days are numbered” and “defenders finally have a chance to win, decisively.”
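
The harness pattern described in the first takeaway (wrap the model, hand it instructions and tools, loop until it finishes) can be sketched in a few lines of Python. This is an illustrative sketch only, not Mozilla's or Anthropic's code: `ModelReply`, `run_harness`, `scripted_model`, the `read_file` tool, and the file name `ipc_handler.cpp` are all hypothetical, and the model is a scripted stand-in rather than a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ModelReply:
    """Hypothetical reply shape: either a tool request or a final report."""
    tool: Optional[str] = None          # name of the tool the model wants to run
    argument: str = ""                  # single string argument for that tool
    final_report: Optional[str] = None  # set once the model says it is done


Model = Callable[[List[str]], ModelReply]
Tool = Callable[[str], str]


def run_harness(model: Model, tools: Dict[str, Tool],
                instructions: str, max_steps: int = 10) -> Optional[str]:
    """Run the model in a loop, feeding tool output back, until it finishes."""
    transcript = [instructions]
    for _ in range(max_steps):
        reply = model(transcript)
        if reply.final_report is not None:
            return reply.final_report
        tool = tools.get(reply.tool or "")
        if tool is None:
            # Tell the model about the mistake instead of crashing the run.
            transcript.append(f"error: unknown tool {reply.tool!r}")
            continue
        result = tool(reply.argument)
        transcript.append(f"{reply.tool}({reply.argument}) -> {result}")
    return None  # step budget exhausted without a final report


def scripted_model(transcript: List[str]) -> ModelReply:
    """Stand-in for an LLM: requests one tool call, then finishes."""
    if len(transcript) == 1:
        return ModelReply(tool="read_file", argument="ipc_handler.cpp")
    return ModelReply(final_report=f"done after {len(transcript) - 1} tool call(s)")


# Hypothetical tool; a real harness would expose build, test, and debug tools.
tools = {"read_file": lambda path: f"<contents of {path}>"}
report = run_harness(scripted_model, tools, "Look for memory-safety bugs.")
print(report)
```

The `max_steps` budget is a design choice of this sketch: it keeps a model that never declares completion from looping forever. The project-specific customization the article credits would live mostly in the `tools` table and the instructions, neither of which the source describes in detail.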

Hacker News Comment Review

  • Commenters flagged a key definitional gap: the 271 items are bugs or potential vulnerabilities, not confirmed exploits with proof-of-concept; Mozilla’s own severity breakdown splits them into sec-critical, sec-high, and lower tiers.
  • One commenter with apparent insider context noted Mythos showed cross-domain reasoning, e.g., identifying that a floating-point value sent over IPC could be attacker-modified in non-obvious ways, suggesting genuine depth beyond pattern matching.
  • Commenters questioned what distinguishes Mythos from standard frontier models such as Opus for security work; Mozilla has not published a clear benchmark comparison.

Notable Comments

  • @jerrythegerbil: “Mythos didn’t write 271 PoC for vulnera” – flags that bug counts and verified vulnerabilities are not the same metric.
  • @canucker2016: Coverity already tracks 5000+ outstanding Firefox defects; overlap with Mythos findings is unknown.
