pytorch-hessian-eigenthings v1 adds Lanczos, Hutch++ trace, GGN/Fisher operators, and fused Triton kernels for scalable Hessian analysis on real models.
Key Takeaways
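Computes the top Hessian eigenvalues and eigenvectors via Lanczos or stochastic power iteration without materializing the full Hessian, whose size is quadratic in the parameter count; only Hessian-vector products (HVPs), which need memory linear in the parameter count, are required.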
v1 adds GGNOperator and EmpiricalFisherOperator sharing the same interface, so all algorithms work interchangeably across curvature matrices.
Hutch++ trace estimation and Stochastic Lanczos Quadrature spectral density estimation are new in the rewrite.
For LM-scale work, a fused cross-entropy (CE) Hessian-vector-product Triton kernel delivers a ~3.4x speedup and a 2x reduction in peak memory over eager PyTorch.
Finite-difference HVP path supports FSDP and setups where double-backward is impractical; param_filter enables per-block analysis on transformers.
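
The takeaways above rest on one idea: the top of the spectrum can be probed with Hessian-vector products alone. The sketch below illustrates that idea in plain PyTorch, under stated assumptions. It is not the library's API: the names hvp_double_backward, hvp_finite_difference, and power_iteration are hypothetical, and it uses deterministic full-batch power iteration rather than Lanczos or the stochastic variant. It compares a double-backward HVP with a central finite-difference HVP on a toy quadratic whose Hessian is known.

```python
import torch
from torch.nn.utils import parameters_to_vector, vector_to_parameters


def _flatten(tensors):
    """Concatenate a sequence of tensors into one flat vector."""
    return torch.cat([t.reshape(-1) for t in tensors])


def hvp_double_backward(loss_fn, params, vec):
    """Exact HVP: differentiate (grad . vec) a second time. No Hessian is formed."""
    grads = torch.autograd.grad(loss_fn(), params, create_graph=True)
    hv = torch.autograd.grad(torch.dot(_flatten(grads), vec), params)
    return _flatten(hv)


def hvp_finite_difference(loss_fn, params, vec, eps=1e-3):
    """Approximate HVP from two first-order gradients (central difference)."""
    base = parameters_to_vector(params).detach().clone()

    def grad_at(point):
        vector_to_parameters(point, params)  # overwrite parameter values
        return _flatten(torch.autograd.grad(loss_fn(), params))

    g_plus = grad_at(base + eps * vec)
    g_minus = grad_at(base - eps * vec)
    vector_to_parameters(base, params)  # restore the original parameters
    return (g_plus - g_minus) / (2 * eps)


def power_iteration(matvec, dim, steps=100, seed=0, dtype=torch.float64):
    """Estimate the largest-magnitude eigenpair using only matrix-vector products."""
    gen = torch.Generator().manual_seed(seed)
    v = torch.randn(dim, generator=gen, dtype=dtype)
    v /= v.norm()
    eigval = 0.0
    for _ in range(steps):
        hv = matvec(v)
        eigval = torch.dot(v, hv).item()  # Rayleigh quotient for the current v
        v = hv / hv.norm()
    return eigval, v


# Toy check (assumed example): loss = 0.5 * sum(d_i * w_i^2) has Hessian diag(d),
# so the top eigenvalue is max(d) = 5.
curvatures = torch.tensor([1.0, 2.0, 5.0], dtype=torch.float64)
w = torch.tensor([0.3, -0.7, 1.1], dtype=torch.float64, requires_grad=True)
loss_fn = lambda: 0.5 * (curvatures * w * w).sum()

top_exact, _ = power_iteration(lambda v: hvp_double_backward(loss_fn, [w], v), dim=3)
top_fd, _ = power_iteration(lambda v: hvp_finite_difference(loss_fn, [w], v), dim=3)
print(top_exact, top_fd)  # both should be close to 5.0
```

Both HVP functions share the same (loss_fn, params, vec) signature, so the eigenvalue routine does not care which one it receives. The finite-difference variant needs only first-order gradients, which is why such a path can be useful where double backward is impractical, at the cost of an approximation error controlled by eps.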