Anatomy of High-Performance Matrix Multiplication (2008)
https://www.cs.utexas.edu/~flame/pubs/GotoTOMS_revision.pdf
- Classic 2008 paper by Kazushige Goto and Robert A. van de Geijn on BLAS-level matrix multiply optimization
- Explains the cache-blocking strategy and why it maps onto the memory hierarchy
- Foundation for understanding how modern BLAS/LAPACK achieves near-peak FLOPS
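
The cache-blocking idea noted above can be illustrated with a minimal sketch. This is not the paper's algorithm, which also packs panels into contiguous buffers and relies on a tuned inner kernel; it only shows the loop tiling that keeps a small working set in use at a time. The function name `blocked_matmul` and the `block` parameter are assumptions made for illustration.

```python
def blocked_matmul(A, B, block=2):
    """Compute C = A * B for list-of-lists matrices, one tile at a time.

    `block` is a hypothetical tile edge length; a real implementation
    would choose separate sizes per dimension to fit specific cache levels.
    """
    m, k, n = len(A), len(B), len(B[0])
    C = [[0.0] * n for _ in range(m)]
    # Outer three loops walk over tiles of the i, p, and j index ranges.
    for i0 in range(0, m, block):
        for p0 in range(0, k, block):
            for j0 in range(0, n, block):
                # Inner three loops multiply one tile of A by one tile of B
                # and accumulate into the matching tile of C.
                for i in range(i0, min(i0 + block, m)):
                    for p in range(p0, min(p0 + block, k)):
                        a_ip = A[i][p]
                        for j in range(j0, min(j0 + block, n)):
                            C[i][j] += a_ip * B[p][j]
    return C


A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8], [9, 10], [11, 12]]
result = blocked_matmul(A, B, block=2)
```

Pure Python will not show a speedup from tiling; the sketch is only meant to make the loop structure concrete. In a compiled language the same reordering lets each tile be reused from cache many times before it is evicted.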
| Type | Link |
| Added | Apr 21, 2026 |
| Modified | Apr 21, 2026 |