Human Typing Habits and Token Counts

Ordinary typing habits—typos, shorthand, filler words, pasted UUIDs—change token counts without changing intent, and tokenizers bill by pattern regardless of recovered meaning.

What Matters

  • template → 1 token; tempalte → 3 tokens (OpenAI). Same word, 3× cost from a single transposition.
  • assistant → 1 token; assitant → 2 (OpenAI), 3 (Claude). Claude consistently tokenizes misspellings more expensively.
  • Shorthand backfires: pls → 2 tokens (Claude), thx → 2, w/o → 3 (Claude) vs. 1 token each for the full dictionary words.
  • A UUID like 019d6ce9-7cfe-753a-b6d6-df719510c9e3 costs 24 tokens (OpenAI) or 26 (Claude); an RFC 3339 timestamp costs 16–17 tokens.
  • Expressive punctuation costs extra: Yes!! → 2 tokens, yesss → 3, reeeally → 3—tone markers that rarely help the task.
  • Suffixes fragment unpredictably: describe → 1, describer → 2, describers → 3; a tiny morpheme can double or triple the split.
  • Boundary whitespace (leading/trailing spaces) inflates counts; normal internal spacing is generally safe.
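The cost pattern behind these bullets can be sketched with a toy greedy longest-match tokenizer over a made-up vocabulary (real BPE tokenizers merge byte pairs learned from data, and the vocabulary here is hypothetical, but the effect is the same: a word in the vocabulary is one token, while a misspelling falls back to several smaller pieces):

```python
# Hypothetical mini-vocabulary; real tokenizers have ~100k learned entries.
VOCAB = {"template", "assistant", "temp", "al", "te", "ant", "it", "ass"}

def tokenize(word, vocab=VOCAB):
    """Greedy longest-match split: try the longest substring in the
    vocabulary first, fall back to single characters."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # longest candidate first
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # unknown character: one token by itself
            i += 1
    return tokens

print(tokenize("template"))  # ['template'] — whole word, 1 token
print(tokenize("tempalte"))  # ['temp', 'al', 'te'] — transposition, 3 tokens
```

The transposed word never matches the full-word entry, so the greedy loop falls through to shorter fragments, which is why a one-character typo can triple the bill.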
