Over-editing refers to a model modifying code beyond what the request requires, such as reformatting or restructuring lines it was never asked to touch.
Article
TL;DR
LLMs rewrite more code than needed; prompting for minimal changes measurably reduces this.
Key Takeaways
- Models trained on cross-entropy loss bias toward verbose, low-surprise outputs
- Prompting explicitly for minimal diffs reduces over-editing in practice
- Over-editing inflates review burden and introduces bugs beyond the original scope
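The review-burden point above can be made concrete by measuring how much a model actually changed relative to what the task required. A minimal sketch using Python's standard `difflib` (the function name and sample snippets are illustrative, not from the article):

```python
# Hypothetical sketch: quantify over-editing by counting how many lines
# differ between the original file and the model's edited version.
import difflib


def changed_line_count(original: str, edited: str) -> int:
    """Count lines added or removed between two versions of a file."""
    diff = difflib.unified_diff(
        original.splitlines(), edited.splitlines(), lineterm=""
    )
    # Skip the ---/+++ file headers; count only the +/- hunk lines.
    return sum(
        1
        for line in diff
        if (line.startswith("+") or line.startswith("-"))
        and not line.startswith(("+++", "---"))
    )


original = "def add(a, b):\n    return a + b\n\ndef sub(a, b):\n    return a - b\n"
# The task only concerned `add`, but the model also renamed `sub`'s parameters:
edited = "def add(a, b):\n    return a + b\n\ndef sub(x, y):\n    return x - y\n"

print(changed_line_count(original, edited))  # 4 changed lines for a 0-line task
```

A ratio of changed lines to task-relevant lines like this is one cheap way to flag over-edited diffs before they reach a reviewer.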
Discussion
Top comments:
- [janalsncm]: Cross-entropy loss drives verbosity; garden-path sentences are statistically safer
  "LLMs are way too verbose in prose and code, and my suspicion is this is driven mainly by the training mechanism. Cross-entropy loss steers toward garden-path sentences."
- [graybeardhacker]: git add -p and reviewing every diff is the practical mitigation
- [foo12bar]: AI hides failures by catching exceptions and returning dummy values
- [jstanley]: Under-editing is the opposite problem in legacy codebases
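The failure-hiding pattern foo12bar describes is easy to picture: a broad `except` that swallows the error and returns a dummy value, so the program limps on with bad data. A hedged sketch of the anti-pattern next to a safer variant (function names are illustrative):

```python
# Illustrative sketch of the failure-hiding pattern: a blanket except
# converts every error into a plausible-looking dummy value.
import json


def load_config_hiding_errors(raw: str) -> dict:
    # Anti-pattern: bad JSON, wrong types -- everything silently becomes {}.
    try:
        return json.loads(raw)
    except Exception:
        return {}


def load_config(raw: str) -> dict:
    # Safer: catch only the expected error and re-raise with context,
    # so the failure surfaces at the call site instead of much later.
    try:
        config = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"config is not valid JSON: {exc}") from exc
    if not isinstance(config, dict):
        raise ValueError("config must be a JSON object")
    return config


print(load_config_hiding_errors("not json"))  # prints {} and the error vanishes
```

The first version is exactly the kind of change that slips through review when a diff is already bloated: it looks defensive but actually erases the failure signal.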