Over-editing refers to a model modifying code beyond what is necessary.

· ai llm programming · Source ↗

Article

TL;DR

LLMs rewrite more code than needed; prompting for minimal changes measurably reduces this.

Key Takeaways

  • Models trained with cross-entropy loss are biased toward verbose, low-surprise outputs
  • Prompting explicitly for minimal diffs reduces over-editing in practice
  • Over-editing inflates review burden and introduces bugs beyond the original scope
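One way to make "over-editing" concrete is to measure what fraction of the original lines a rewrite actually touched. This is a minimal sketch, not anything from the article; the function name and metric are hypothetical, using only Python's standard difflib:

```python
import difflib

def edit_fraction(original: str, rewritten: str) -> float:
    """Hypothetical metric: fraction of original lines the rewrite changed."""
    orig_lines = original.splitlines()
    new_lines = rewritten.splitlines()
    sm = difflib.SequenceMatcher(a=orig_lines, b=new_lines)
    # get_matching_blocks() returns runs of identical lines (plus a zero-size sentinel).
    unchanged = sum(size for _, _, size in sm.get_matching_blocks())
    return 1.0 - unchanged / max(len(orig_lines), 1)

original = "def add(a, b):\n    return a + b"
minimal_fix = "def add(a, b):\n    return a + b  # clamp inputs upstream"
print(edit_fraction(original, minimal_fix))  # 0.5: one of two lines touched
```

A score near 1.0 on a one-line bug fix is a cheap signal that the model rewrote far more than the task required.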

Discussion

Top comments:

  • [janalsncm]: Cross-entropy loss drives verbosity; garden-path sentences are statistically safer

    LLMs are way too verbose in prose and code, and my suspicion is this is driven mainly by the training mechanism. Cross entropy loss steers towards garden path sentences.

  • [graybeardhacker]: Staging with git add -p and reviewing every diff is the practical mitigation
  • [foo12bar]: AI hides failures by catching exceptions and returning dummy values
  • [jstanley]: Under-editing is the opposite problem in legacy codebases
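The failure-hiding pattern foo12bar describes is easy to illustrate. This is a hypothetical sketch (the function names and the price-parsing task are invented for illustration), contrasting generated-style code that swallows errors with a version that lets them surface:

```python
def parse_price_generated(raw: str) -> float:
    # Anti-pattern often seen in generated code: catch everything
    # and return a dummy value, so callers never see the failure.
    try:
        return float(raw.strip("$"))
    except Exception:
        return 0.0  # silently masks bad input as a plausible price

def parse_price_strict(raw: str) -> float:
    # Letting the ValueError propagate surfaces bad input immediately.
    return float(raw.strip("$"))

print(parse_price_generated("oops"))  # 0.0 -- the failure is hidden
```

The dummy 0.0 flows downstream as if it were real data, which is exactly why such bugs escape review while a propagated exception would have failed loudly at the call site.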

Discuss on HN