KV Cache Compression: 900,000x Beyond TurboQuant and the Per-Vector Shannon Limit
https://arxiv.org/abs/2504.15356

Article
- Claims KV cache compression exceeding 900,000x over prior state-of-the-art
- Uses the model itself as the compression dictionary
- Purports to exceed the per-vector Shannon entropy limit
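The article itself is light on mechanism, but the "model as the compression dictionary" framing has a classical analogue: when sender and receiver share side information (here, the model weights), a vector can be transmitted as a short index into a shared codebook rather than coded element by element, which is how a scheme can appear to beat a per-vector coding bound. The sketch below is a toy vector-quantization illustration of that idea, not the paper's actual method; the dimension `D`, codebook size `K`, and all function names are hypothetical.

```python
import math
import random

random.seed(0)

D = 128   # vector dimension (hypothetical)
K = 4096  # shared codebook size (hypothetical)

# Shared "dictionary": both sender and receiver hold the same codebook,
# analogous to both sides holding the same model weights.
codebook = [[random.gauss(0, 1) for _ in range(D)] for _ in range(K)]

def compress(vec):
    # Transmit only the index of the nearest codeword, not the vector.
    return min(
        range(K),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], vec)),
    )

def decompress(idx):
    # Receiver reconstructs from the shared codebook.
    return codebook[idx]

raw_bits = 32 * D                    # an fp32 vector: 4096 bits
sent_bits = math.ceil(math.log2(K))  # just an index: 12 bits
ratio = raw_bits / sent_bits         # ≈ 341x for this toy setup
```

The compression here is lossy (the receiver gets the nearest codeword, not the original vector), and the ratio scales with how much structure the shared dictionary captures; the paper's claimed 900,000x would require the model to predict the cached vectors far more tightly than this toy codebook does.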
Discussion
- Commenters intrigued, but find the argument hard to follow
- Model-as-dictionary is the novel framing
- Skepticism about extraordinary claims without clear proof
| Type | Link |
| --- | --- |
| Added | Apr 21, 2026 |
| Modified | Apr 21, 2026 |