Writing Z80 assembly, 4 decades later:-)

TLDR

  • A developer ports a software 3D points renderer to the ZX Spectrum 48K in Z80 assembly, raising the frame rate from 6.2 fps in C to 14 fps in ASM and 40 fps with full precomputation.

Key Takeaways

  • Toolchain: z88dk cross-compiler with a simple Makefile; outputs .tap files runnable in the FUSE emulator; repo includes prebuilt taps for statue and sphere models.
  • Core optimization: replaces the two runtime divisions with multiplications by values from a reciprocal lookup table, and uses Z80 page-based (HL) register tricks to avoid costly memory addressing.
  • Precompute branch encodes each pixel's VRAM address and bit offset into 16 bits per pixel at build time, leaving the inner loop as near-pure reads and writes; roughly a 3x speedup over runtime ASM (14 fps to 40 fps).
  • Inline Z80 ASM blitter runs 3.5x faster than equivalent C for the same reads/shifts/writes, confirming Z80 C compilers leave significant register efficiency on the table.
  • Projection math, as derived, uses no runtime multiplications: two divisions plus additions, obtained from the standard perspective equations with a fixed orbit camera and pre-scaled integer coordinates (S=8960). These two divisions are the ones the reciprocal lookup table replaces.
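
The reciprocal-table idea from the takeaways can be sketched in C. This is a minimal illustration, not the author's code: the 8-bit operand range, the 16-bit fixed-point scale, and the names `recip`, `recip_init`, `div_by_recip`, and `recip_mismatches` are all assumptions chosen for the example. Rounding each reciprocal up makes the multiply-and-shift result match truncating integer division over this small range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical table: recip[d] = ceil(65536 / d) for d = 1..255.
   Entry 0 is unused. 32-bit entries keep recip[1] = 65536 representable. */
static uint32_t recip[256];

/* Fill the table once at startup (a real build could emit it offline). */
void recip_init(void)
{
    for (uint32_t d = 1; d < 256; ++d)
        recip[d] = (65536u + d - 1u) / d;   /* round up */
}

/* Compute x / d as (x * recip[d]) >> 16.
   For 8-bit x and d in 1..255 this equals the truncating quotient. */
uint8_t div_by_recip(uint8_t x, uint8_t d)
{
    assert(d != 0);
    return (uint8_t)(((uint32_t)x * recip[d]) >> 16);
}

/* Count mismatches against real division over the whole input range. */
unsigned recip_mismatches(void)
{
    unsigned bad = 0;
    for (unsigned d = 1; d < 256; ++d)
        for (unsigned x = 0; x < 256; ++x)
            if (div_by_recip((uint8_t)x, (uint8_t)d) != x / d)
                ++bad;
    return bad;
}
```

The Z80 has no multiply instruction, so on the real machine the multiplication is itself a software routine; the gain depends on that routine being cheaper than a division.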

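The precomputation takeaway can be illustrated in the same way. The Spectrum's 256x192 bitmap occupies 6144 bytes starting at 0x4000, so a byte offset needs 13 bits and a bit position needs 3, which together fill exactly 16 bits. The sketch below assumes one possible packing (offset in the top 13 bits, bit index in the low 3). The summary does not specify the author's actual layout, and `pixel_offset`, `pack_pixel`, `packed_addr`, `packed_mask`, and `plot_packed` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define SCREEN_BASE  0x4000u   /* start of the ZX Spectrum bitmap */
#define SCREEN_BYTES 6144u     /* 256 x 192 pixels, 1 bit each */

/* Byte offset of pixel (x, y) inside the bitmap, following the Spectrum's
   interleaved row order: offset bits are Y7 Y6 Y2 Y1 Y0 Y5 Y4 Y3 X7..X3. */
uint16_t pixel_offset(uint8_t x, uint8_t y)   /* y must be 0..191 */
{
    return (uint16_t)(((y & 0xC0u) << 5) |
                      ((y & 0x07u) << 8) |
                      ((y & 0x38u) << 2) |
                      (x >> 3));
}

/* Build-time step (assumed layout): 13-bit offset, then 3-bit bit index. */
uint16_t pack_pixel(uint8_t x, uint8_t y)
{
    return (uint16_t)((pixel_offset(x, y) << 3) | (x & 7u));
}

/* Decode the absolute VRAM address and the bit mask from a packed value. */
uint16_t packed_addr(uint16_t p) { return (uint16_t)(SCREEN_BASE + (p >> 3)); }
uint8_t  packed_mask(uint16_t p) { return (uint8_t)(0x80u >> (p & 7u)); }

/* Run-time step: set one pixel in a 6144-byte buffer standing in for VRAM. */
void plot_packed(uint8_t *screen, uint16_t p)
{
    assert((p >> 3) < SCREEN_BYTES);   /* reject values outside the bitmap */
    screen[p >> 3] |= packed_mask(p);
}
```

With the address arithmetic moved to build time, the per-pixel work at run time shrinks to a shift, a mask, and one read-modify-write, which is the kind of inner loop the takeaway describes.
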
Hacker News Comment Review

  • Commenters converge on tooling as the key insight: modern assembler IDEs and cross-compilers make 8-bit assembly far more productive than 1980s workflows, narrowing the gap with high-level languages.
  • A second commenter is independently porting modern renderers to the Spectrum and flagged the reciprocal lookup and axis-swap tricks as ideas worth adopting.