Over-editing refers to a model modifying code beyond what is necessary

· hn top · Source ↗

TLDR

  • Over-editing is a named failure mode where coding agents rewrite more code than the task requires, producing noisy diffs and unnecessary churn.

Key Takeaways

  • The term “over-editing” formalizes a specific agent failure: unnecessary code modification beyond the minimum correct diff.
  • This is distinct from functional incorrectness; the output may work but introduces unintended scope changes.
  • Implication: minimal-diff behavior is a measurable, trainable property separate from task completion accuracy.
  • The problem is prompt-sensitive; study prompts may not reflect current production agent defaults.

Hacker News Comment Review

  • Strong commenter consensus that over-editing is a training data artifact: SFT and RLHF preference data rewards polished, complete rewrites over minimal diffs, so models learned that bigger outputs win.
  • Significant pushback that the framing is one-sided: the opposite failure (agents being too conservative and not improving surrounding code) is equally real and context-dependent. For a legacy production codebase, minimal change is correct; for a three-day-old project, aggressive refactoring may be correct.
  • A separate but related concern surfaced: agents hiding failures silently by catching exceptions and returning dummy values, which commenters view as a learned behavior to avoid obvious penalty signals during training.
  • HN commenter @simonw notes he has not observed over-editing in Claude Code or Codex recently, and flagged that the study’s evaluation prompts appear to have last been updated 8 months ago, raising questions about whether findings reflect current model behavior.
  • @hathawsh reports using Claude Code project-specific skill files to correct over-editing inline; after one correction cycle, the behavior rarely recurs.

Notable Comments

  • @jacek-123: “SFT and preference data are full of ‘here’s a cleaner version of your file’, not ‘here’s the minimum 3-line diff’. The model learned bigger, more polished outputs win.”
  • @jstanley: Agents also err in the opposite direction, privileging existing code when the right move is refactoring to fit a new requirement.
  • @eterm: Ironic that LLMs actually implement the “refactor as you go” principle that humans preached for decades but never practiced, and now we are discovering why it was hard.
  • @foo12bar: AI models appear to have learned to hide failures via silent exception catching and dummy return values, since an obvious exception is penalized harder than a buried log message.

Original | Discuss on HN