After 8 years, I rewrote my open-source PyTorch curvature library


TLDR

  • pytorch-hessian-eigenthings v1 adds Lanczos, Hutch++ trace, GGN/Fisher operators, and fused Triton kernels for scalable Hessian analysis on real models.

Key Takeaways

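  • Computes top eigenvalues and eigenvectors via Lanczos or stochastic power iteration without materializing the full Hessian, whose memory cost is quadratic in the parameter count; only Hessian-vector products (HVPs), which need linear memory, are required.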
  • v1 adds GGNOperator and EmpiricalFisherOperator sharing the same interface, so all algorithms work interchangeably across curvature matrices.
  • Hutch++ trace estimation and Stochastic Lanczos Quadrature (SLQ) spectral density estimation are new in the rewrite.
  • For LM-scale work, a fused Triton kernel for the cross-entropy (CE) Hessian-vector product delivers a ~3.4x speedup and a 2x peak-memory reduction over eager PyTorch.
  • A finite-difference HVP path supports FSDP and other setups where double-backward is impractical; param_filter enables per-block analysis on transformers.
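
To make the HVP-based approach in the takeaways concrete, here is a minimal sketch in plain PyTorch. It is not the library's API: the names `hvp`, `fd_hvp`, and `power_iteration` are hypothetical helpers written for illustration, and the toy quadratic loss is an assumption chosen so that the Hessian (the matrix `A`) and its eigenvalues (1 and 3) are known in advance. The sketch shows a double-backward HVP, a central finite-difference HVP that needs only first-order gradients, and a deterministic power iteration that touches the Hessian only through matrix-vector products.

```python
import torch

def hvp(loss_fn, params, vec):
    """Hessian-vector product via double backward; the Hessian is never built."""
    loss = loss_fn(params)
    # First backward pass, keeping the graph so the gradient is differentiable.
    (grad,) = torch.autograd.grad(loss, params, create_graph=True)
    # Second backward pass: d(grad . vec)/d(params) = H @ vec.
    (hv,) = torch.autograd.grad(grad, params, grad_outputs=vec)
    return hv

def fd_hvp(loss_fn, params, vec, eps=1e-3):
    """Central finite-difference HVP: (g(p + eps*v) - g(p - eps*v)) / (2*eps)."""
    def grad_at(point):
        point = point.detach().requires_grad_(True)
        (g,) = torch.autograd.grad(loss_fn(point), point)
        return g
    return (grad_at(params + eps * vec) - grad_at(params - eps * vec)) / (2 * eps)

def power_iteration(matvec, dim, steps=200, seed=0):
    """Estimate the largest-magnitude eigenpair using only matvec calls."""
    gen = torch.Generator().manual_seed(seed)
    v = torch.randn(dim, generator=gen, dtype=torch.float64)
    v /= v.norm()
    eig = torch.tensor(0.0, dtype=torch.float64)
    for _ in range(steps):
        hv = matvec(v)
        eig = torch.dot(v, hv)   # Rayleigh quotient for the current unit vector
        v = hv / hv.norm()
    return eig.item(), v

# Toy problem (an assumption for illustration): loss = 0.5 * p^T A p, so H = A.
A = torch.tensor([[2.0, 1.0], [1.0, 2.0]], dtype=torch.float64)

def loss_fn(p):
    return 0.5 * torch.dot(p, A @ p)

params = torch.tensor([0.3, -0.7], dtype=torch.float64, requires_grad=True)

top_eig, top_vec = power_iteration(lambda v: hvp(loss_fn, params, v), dim=2)
top_eig_fd, _ = power_iteration(lambda v: fd_hvp(loss_fn, params, v), dim=2)
print(top_eig, top_eig_fd)
```

Each HVP costs about as much as a couple of gradient evaluations and stores only vectors the size of the parameters, which is why methods built on it can scale where an explicit Hessian cannot. The finite-difference variant trades some accuracy (it depends on the choice of `eps`) for needing no second backward pass.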

Hacker News Comment Review

  • No substantive HN discussion yet.
