DeepSeek-V4 on Day 0: From Fast Inference to Verified RL with SGLang and Miles


TLDR

  • SGLang and Miles claim the first open-source Day-0 support for DeepSeek-V4, covering both fast inference serving and verified RL training.

Key Takeaways

  • LMSYS ships support for DeepSeek-V4 on the day the model launches (Day 0), a deliberate speed-to-ecosystem strategy.
  • SGLang handles the inference side and Miles handles the RL training side: two separate systems announced together as one stack.
  • “Verified RL” in the title signals that the training support goes beyond supervised fine-tuning (SFT) into reinforcement learning with verification.
  • This is an open-source release, meaning builders can run inference and RL training on DeepSeek-V4 without proprietary tooling.
  • LMSYS's framing positions SGLang + Miles as the reference open-source stack for both serving and post-training DeepSeek-V4 workloads.

Hacker News Comment Review

  • No substantive HN discussion yet.
