Show HN: TRiP – a complete transformer engine in C built from scratch just by me


TLDR

  • Solo 18-month project implementing a full transformer engine in C – inference, training, a BPE tokenizer, and vision – supporting Gemma, Llama2, PaliGemma, and GPT2.

Key Takeaways

  • ~19,300 lines across 7 files; math.c pairs every forward op with its backward counterpart, so the file reads as annotated learning material.
  • Supports SafeTensors and Karpathy formats, bf16/fp16/fp32 weights, AdamW with cosine LR, top-k/top-p sampling, and mmap RAM mode.
  • PaliGemma multimodal inference works via JPEG input with X11 display; X11 code and JSON parser are the only AI-generated sections.
  • No cmake, no Python, no external ML frameworks – make is the only build step; the only dependencies are gcc 13+, libjpeg, and libx11.
  • float32 outperforms bf16/fp16 for CPU inference because CPUs lack native arithmetic for those formats, so every value must be widened to float32 on the fly – the author notes this was surprising.
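TRiP's math.c is not quoted in the post, but the forward/backward pairing it describes can be sketched for a matrix-vector product. This is an illustrative example only – function names and layout are mine, not TRiP's:

```c
#include <stdlib.h>

/* Forward: y = W x, with W stored row-major as rows x cols. */
static void matvec_forward(const float *W, const float *x,
                           float *y, int rows, int cols) {
    for (int r = 0; r < rows; r++) {
        y[r] = 0.0f;
        for (int c = 0; c < cols; c++)
            y[r] += W[r * cols + c] * x[c];
    }
}

/* Backward, defined right next to its forward op: given the upstream
   gradient dy = dL/dy, accumulate dL/dW and compute dL/dx = W^T dy. */
static void matvec_backward(const float *W, const float *x, const float *dy,
                            float *dW, float *dx, int rows, int cols) {
    for (int c = 0; c < cols; c++) dx[c] = 0.0f;
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            dW[r * cols + c] += dy[r] * x[c];           /* dL/dW[r][c] */
            dx[c]            += dy[r] * W[r * cols + c]; /* dL/dx[c]   */
        }
    }
}
```

Keeping both directions adjacent like this makes it easy to check, op by op, that each backward pass is the transpose of its forward pass.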
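The post lists top-k/top-p sampling but shows no code; as a hedged sketch of what a C implementation of top-p (nucleus) sampling typically looks like (names and structure are assumptions, not TRiP's):

```c
#include <stdlib.h>

typedef struct { int id; float p; } tok_t;

/* qsort comparator: descending by probability. */
static int cmp_desc(const void *a, const void *b) {
    float pa = ((const tok_t *)a)->p, pb = ((const tok_t *)b)->p;
    return (pa < pb) - (pa > pb);
}

/* Keep the smallest prefix of tokens (sorted by probability) whose
   cumulative mass reaches top_p, then sample inside that prefix using
   a uniform random number u in [0, 1). */
static int sample_top_p(const float *probs, int n, float top_p, float u) {
    tok_t *t = malloc((size_t)n * sizeof *t);
    for (int i = 0; i < n; i++) { t[i].id = i; t[i].p = probs[i]; }
    qsort(t, n, sizeof *t, cmp_desc);

    float cum = 0.0f; int keep = 0;
    while (keep < n && cum < top_p) cum += t[keep++].p;

    int id = t[keep - 1].id;   /* fallback: last kept token */
    float r = u * cum;         /* rescale u to the kept mass */
    for (int i = 0; i < keep; i++) {
        r -= t[i].p;
        if (r <= 0.0f) { id = t[i].id; break; }
    }
    free(t);
    return id;
}
```

Top-k is the same idea with a fixed cutoff (`keep = k`) instead of a cumulative-probability threshold.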
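The fp32-beats-bf16 observation follows from how bf16 is laid out: it is simply the top 16 bits of an IEEE-754 float32, so without native hardware support every weight must be widened back to float32 before each multiply-add. A minimal sketch of that per-element conversion cost (helper names are illustrative):

```c
#include <stdint.h>
#include <string.h>

/* bf16 -> f32: restore the truncated low mantissa bits as zeros.
   Without native CPU support, this widening runs once per weight per
   use – overhead that float32 weights never pay. */
static float bf16_to_f32(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f); /* bit copy avoids aliasing issues */
    return f;
}

/* f32 -> bf16 by truncation (no rounding), keeping sign, exponent,
   and the top 7 mantissa bits. */
static uint16_t f32_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (uint16_t)(bits >> 16);
}
```

The memory savings of bf16 halve the weight footprint, but on CPUs the decode in the inner loop can outweigh the bandwidth win – consistent with the author's finding.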

Hacker News Comment Review

  • No substantive HN discussion yet.

Original | Discuss on HN