AI's Economics Don't Make Sense

· ai devtools · Source ↗

TLDR

  • GitHub Copilot’s shift to usage-based pricing June 1, 2026 exposes a structural flaw: monthly AI subscriptions have always required subsidizing wildly variable token costs.

Key Takeaways

  • Microsoft was losing $20-$80/user/month on Copilot in 2023; the flat-fee model was unsustainable for three years before this announcement.
  • A single Copilot “premium request” consumed ~60,000 context tokens plus tool calls, costing ~$11 – well above per-request plan value.
  • Reasoning models increased inference costs even as per-token prices for older models fell, breaking the subsidy math further over time.
  • Anthropic reportedly allowed users to burn ~$8 in compute per $1 of subscription; OpenAI flat plans have similar structural losses.
  • The author argues monthly subscriptions are fundamentally incompatible with LLMs: one user makes 5 requests, another refactors an entire codebase – both pay the same.

Hacker News Comment Review

  • The article’s core premise is heavily contested: commenters cite frontier lab margins reportedly above 80% on tokens, and commodity providers like Kimi K2.6 serving profitably at $4/1M output tokens – suggesting the losses are a deliberate product-layer subsidy decision, not a structural impossibility.
  • Commenters disagree on cost trajectory: the article treats inference cost as permanently high or rising, but several argue 2-5x annual efficiency gains (hardware, quantization, caching, architecture) make flat-fee economics viable once the subsidy phase ends – unlike Uber where fuel costs are physically bounded.
  • There is broad agreement on the UX risk of metered billing: users trained on flat-fee have no intuition for token burn, agentic sessions can rack up costs invisibly, and this pattern has caused user pain from slashdot-era hosting bills to overnight CI retry loops.

Notable Comments

  • @joshjob42: Argues frontier labs run ~80% profit margins; Kimi K2.6 profitable at $4/1Mtok suggests flat-fee losses are policy, not physics.
  • @wood_spirit: Metered billing for compute has always caused surprise cost overruns; agentic AI sessions with unpredictable token counts are the newest version of this.

Original | Discuss on HN