Claude system prompt bug wastes user money and bricks managed agents

· coding · Source ↗

TLDR

  • A hardcoded malware-check <system-reminder> injected on every Read and Grep call in Claude Code v2.1.111 causes Opus 4.7 subagents to refuse legitimate code edits at a ~40-60% rate.

Key Takeaways

  • The reminder is embedded in the claude CLI binary itself, not user config – binary grep confirms it in /Users/…/.local/share/claude/versions/2.1.111.
  • Prompt grammar is the root cause: “you MUST refuse to improve or augment the code” is an unconditional sentence; subagents read it literally and refuse, while the main thread applies charitable interpretation.
  • Three of five parallel Opus 4.7 subagents refused a legitimate MIT-licensed Rust reverse proxy, each citing the same reasoning chain: harness-level system reminders take precedence over user instructions.
  • Token waste is compounding: ~400 tokens per Read x 50-100+ reads per session = 20-40k wasted tokens per run, on top of main-thread context spent explaining the reminder to subagents that then fail anyway.
  • This regression was marked fixed in v2.1.92 (issue #47027, closed February); v2.1.111 is 19 versions past that close with identical reproduction.

Hacker News Comment Review

  • Commenters independently confirmed the token-burn pattern in Managed Agents: every Read triggers a malware scan Claude logs as complete, yet refusals persist – direct money loss in production workflows.
  • A structural conflict-of-interest point surfaced: Anthropic both sells API access and ships the agentic harness that burns tokens, and internal engineers reportedly use unlimited plans that make 100k-token overheads feel invisible.
  • One commenter found a workaround – telling Claude inline that the code is not malware stopped the per-file scans – but this is fragile and fails entirely in subagent contexts where the main thread cannot interject.

Notable Comments

  • @wxw: argues agent token consumption is structurally opaque and users “can’t scrutinize system prompts, tool calls, MCPs” – the token model benefits builders over users.
  • @QuercusMax: “It seems pretty clear that the prompt was very poorly phrased” – notes the wording should obviously prevent any code changes after any file read.

Original | Discuss on HN