Rars: a Rust RAR implementation, mostly written by LLMs

TLDR

  • One developer used Claude Opus and OpenAI Codex over 5 weeks and ~£40 in tokens to produce a working free-software Rust RAR compressor covering all format versions.

Key Takeaways

  • Workflow split by model strength: Claude for architecture chat and code review, Codex for sustained autonomous spec-to-code generation using the new /goal loop feature.
  • RAR has no official spec; the author reverse-engineered format docs from unar, libarchive, UNRARLIB, Ghidra, and DOSBox-X before writing a single line of compressor code.
  • Test mass matters: fragile unit tests and fixture-driven regression suites acted as context ballast, pulling models back on track and catching hallucinations that survived 10+ review passes.
  • Codex /goal ran 6-16 hour uninterrupted sessions, flood-filling ~40k lines covering encryption, multi-volume, and recovery records with minimal supervision.
  • Compression ratios landed within 5-10% of WinRAR on test data; the implementation runs several times slower, which is attributed both to LLM-generated safe Rust and to missing low-level optimization tricks.
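
As an illustration of the fixture-driven regression suites mentioned above, here is a minimal Rust sketch. It is not taken from Rars: the codec is a toy run-length encoder standing in for a real compressor, and the names `Fixture`, `FIXTURES`, `rle_compress`, `rle_decompress`, and `failing_fixtures` are all hypothetical. The idea it shows is that each fixture pins both the raw input and the expected packed bytes, so any change in encoder output or decoder behaviour is reported by fixture name.

```rust
// Toy stand-in codec: byte-oriented run-length encoding (NOT the RAR algorithm).
// Output is a sequence of (run_length, byte) pairs, with runs capped at 255.
fn rle_compress(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < input.len() {
        let byte = input[i];
        let mut run = 1usize;
        while i + run < input.len() && input[i + run] == byte && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
    out
}

// Inverse of rle_compress. Returns None for malformed input
// (odd length or a zero-length run) instead of panicking.
fn rle_decompress(input: &[u8]) -> Option<Vec<u8>> {
    if input.len() % 2 != 0 {
        return None;
    }
    let mut out = Vec::new();
    for pair in input.chunks_exact(2) {
        let (run, byte) = (pair[0] as usize, pair[1]);
        if run == 0 {
            return None;
        }
        out.extend(std::iter::repeat(byte).take(run));
    }
    Some(out)
}

// Hypothetical fixture: a name, the raw input, and the packed bytes
// pinned from a known-good run.
struct Fixture {
    name: &'static str,
    raw: &'static [u8],
    packed: &'static [u8],
}

const FIXTURES: &[Fixture] = &[
    Fixture { name: "empty", raw: b"", packed: &[] },
    Fixture { name: "single run", raw: b"aaaa", packed: &[4, b'a'] },
    Fixture { name: "mixed", raw: b"aabccc", packed: &[2, b'a', 1, b'b', 3, b'c'] },
];

// Returns the names of fixtures whose encoder output drifted from the
// pinned bytes or whose pinned bytes no longer decode to the raw input.
fn failing_fixtures() -> Vec<&'static str> {
    FIXTURES
        .iter()
        .filter(|f| {
            rle_compress(f.raw) != f.packed
                || rle_decompress(f.packed).as_deref() != Some(f.raw)
        })
        .map(|f| f.name)
        .collect()
}
```

In a real project the fixtures would more likely be archive files on disk, and `failing_fixtures` would be called from a `#[test]` function that asserts the returned list is empty. Checking the pinned bytes in both directions is a design choice: it catches an encoder and decoder that drift together, which a pure round-trip check would miss.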

Hacker News Comment Review

  • Copyright taint is the sharpest concern: the reverse-engineering workflow kept the same human in the loop throughout disassembly and code generation, making clean-room separation arguments weak.
  • Commenters dismiss the “5 years” baseline estimate as a wild overestimate, suggesting a skilled developer could do it in weeks, which deflates the project’s framing.
  • Correctness questions center on test coverage versus real-world archive compatibility; the practical answer offered is empirical use against live archives.

Notable Comments

  • @themafia: raises the unresolved question of copyright and licensing for LLM-generated output derived from decompiled sources.
  • @perching_aix: “basically guaranteed tainted” given no disasm-to-formal-spec firewall before code generation.
