DeepSeek v4

TLDR

  • DeepSeek v4 launches two models (flash, pro) with native thinking support, fully compatible with OpenAI and Anthropic SDKs via base_url swap.

Key Takeaways

  • Drop-in OpenAI/Anthropic replacement: point base_url to api.deepseek.com or api.deepseek.com/anthropic and reuse existing SDK code.
  • Two production models: deepseek-v4-flash (fast, cheap) and deepseek-v4-pro (frontier); legacy aliases deepseek-chat and deepseek-reasoner deprecated 2026-07-24.
  • Thinking mode is a first-class API param: pass {"type": "enabled"} plus reasoning_effort: "high" in the request body.
  • Legacy alias behavior: deepseek-chat maps to v4-flash non-thinking mode; deepseek-reasoner maps to v4-flash thinking mode.
  • Streaming supported on all models via stream: true.
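Putting the bullets above together, here is a minimal sketch of the request body such a call would POST to `https://api.deepseek.com/chat/completions`, built with the standard library so the shape is easy to inspect. With an OpenAI-compatible SDK you would instead point `base_url` at `api.deepseek.com` and reuse existing client code; the top-level `"thinking"` field name is an assumption, while the `{"type": "enabled"}` value and `reasoning_effort: "high"` come from the summary itself.

```python
import json

# Assembles a chat-completions request body for DeepSeek v4.
# NOTE: the "thinking" key is a hypothetical field name; the summary only
# states that thinking mode is a first-class request-body parameter.
def build_request(prompt: str, model: str = "deepseek-v4-pro") -> str:
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "thinking": {"type": "enabled"},  # thinking as a first-class param
        "reasoning_effort": "high",
        "stream": True,                   # streaming supported on all models
    }
    return json.dumps(body)

payload = json.loads(build_request("Explain the base_url swap briefly."))
print(payload["model"])             # deepseek-v4-pro
print(payload["thinking"]["type"])  # enabled
```

Because the endpoint is OpenAI-compatible, the same body sent to the legacy aliases would behave per the mapping above: `deepseek-chat` resolves to v4-flash without thinking, `deepseek-reasoner` to v4-flash with thinking.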

Hacker News Comment Review

  • Pricing undercuts the “subsidized inference” argument: pro at $3.48/M output and flash at $0.28/M output led commenters to question whether frontier labs are actually losing money on inference at all.
  • Weights for the 1.6T-parameter pro base model are live on HuggingFace, which is notable scale for an open release; commenters compared benchmark performance favorably against Opus 4.6.
  • Security skepticism surfaced quickly: at least one commenter raised the possibility of deliberate backdoors embedded in the weights, invoking the xz-utils supply-chain incident as a reference point.
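The pricing argument in that first bullet is easy to make concrete. Below is quick arithmetic using the output-token prices quoted above; the 50M-tokens-per-day workload is a hypothetical chosen for illustration, not a figure from the thread.

```python
# Output-token pricing cited in the HN discussion (USD per million tokens).
PRO_PER_M = 3.48    # deepseek-v4-pro
FLASH_PER_M = 0.28  # deepseek-v4-flash

def output_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD for generating the given number of output tokens."""
    return tokens / 1_000_000 * price_per_million

# Hypothetical workload: 50M output tokens per day.
tokens = 50_000_000
print(f"pro:   ${output_cost(tokens, PRO_PER_M):.2f}/day")   # $174.00/day
print(f"flash: ${output_cost(tokens, FLASH_PER_M):.2f}/day") # $14.00/day
print(f"pro/flash price ratio: {PRO_PER_M / FLASH_PER_M:.1f}x")  # 12.4x
```

At roughly a 12x spread between the two tiers, the commenters' point is that even the frontier-priced model is cheap enough to cast doubt on the claim that inference is broadly sold below cost.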

Notable Comments

  • @fblp: developer docs shipped before any press release, which commenters read as a signal of engineering-first culture.
  • @bandrami: alleges weights may contain “xz-level easter eggs” planted with enough lead time for plausible deniability.
