High Performance Git

Apr 28, 2026 · databases devtools systems

TLDR

A book by Ted Nyman covering Git internals, packfiles, partial clone, reftable, and sparse-checkout, aimed at engineers running monorepos and CI at scale.

Git is simultaneously a content-addressed object store, filesystem index, graph walker, and transfer protocol; performance degrades when any layer is misunderstood.
The book covers the commit-graph, changed-path Bloom filters, the multi-pack-index (MIDX), and reachability bitmaps as local acceleration structures that most engineers never configure.
Partial clone and promisor remotes let teams avoid materializing the full object graph at clone time, critical for large monorepos.
Protocol v2, bundle URIs, and Scalar are the transport and repo-initialization levers for cutting CI clone cost.
Section V provides a diagnosis playbook: instrument Git, isolate the slow layer, apply targeted config, and recover corrupt state.
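As a rough illustration of the levers listed above, the Python sketch below assembles (but does not run) a few Git command lines: a blobless partial clone over protocol v2, the writes that build the commit-graph with changed-path Bloom filters and a MIDX with a bitmap, and an environment that turns on Trace2 performance logging. The helper names (`partial_clone_cmd`, `acceleration_cmds`, `trace_env`) and the example URL are hypothetical and do not come from the book. The flags are standard Git options, but availability depends on the installed Git version, so treat this as a starting point under those assumptions.

```python
import os
import shlex
from typing import Dict, List, Optional


def partial_clone_cmd(url: str, dest: str, blob_filter: str = "blob:none") -> List[str]:
    """Build a partial-clone command that defers blob download (promisor remote)."""
    # protocol.version=2 is already the default in recent Git; set here for clarity.
    return ["git", "-c", "protocol.version=2", "clone", f"--filter={blob_filter}", url, dest]


def acceleration_cmds(repo: str) -> List[List[str]]:
    """Build commands that write the local acceleration structures."""
    return [
        # Commit-graph plus changed-path Bloom filters (speeds up path-limited log).
        ["git", "-C", repo, "commit-graph", "write", "--reachable", "--changed-paths"],
        # Multi-pack-index with a reachability bitmap (assumes a recent Git).
        ["git", "-C", repo, "multi-pack-index", "write", "--bitmap"],
    ]


def trace_env(trace_file: str, base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return an environment that enables Trace2 perf output to a file."""
    env = dict(os.environ if base is None else base)
    # Trace2 file targets must be absolute paths.
    env["GIT_TRACE2_PERF"] = os.path.abspath(trace_file)
    return env


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    # The URL below is a placeholder, not a real repository.
    print(shlex.join(partial_clone_cmd("https://example.com/big-repo.git", "big-repo")))
    for cmd in acceleration_cmds("big-repo"):
        print(shlex.join(cmd))
```

To execute the commands, each list can be passed to `subprocess.run(cmd, check=True, env=trace_env("git.perf"))`; the resulting trace file is one way to see which layer a slow command spends its time in.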

Author Ted Nyman confirmed a 1.1 edition that fixes errors and trims filler, with a free PDF at gitperf.com; he also signaled that a future piece on “Git Futures” or post-Git tooling is coming.
One commenter challenged the prose as LLM-generated, quoting a passage from the epilogue as evidence; the exchange caused visible friction, but no one offered a technical rebuttal of the content itself.
Commenters surfaced two practical gaps: Git LFS adds noticeable latency to every command that touches the remote, even in small repos, and simple sparse checkout (SVN-style path selection) remains a missing ergonomic feature.
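For context on the sparse-checkout complaint, the sketch below shows roughly what path selection takes with current Git: a sparse, blobless clone followed by a cone-mode `sparse-checkout set`. The function name, URL, and directory names are hypothetical, and the `--cone` option on `set` assumes a reasonably recent Git. The code only builds the commands and does not run them.

```python
import shlex
from typing import List


def sparse_clone_cmds(url: str, dest: str, paths: List[str]) -> List[List[str]]:
    """Build the two commands for a sparse, blobless checkout of selected directories."""
    return [
        # --sparse starts with only top-level files checked out.
        ["git", "clone", "--filter=blob:none", "--sparse", url, dest],
        # Cone mode selects whole directories rather than arbitrary patterns.
        ["git", "-C", dest, "sparse-checkout", "set", "--cone", *paths],
    ]


if __name__ == "__main__":
    # Placeholder URL and directories, for illustration only.
    for cmd in sparse_clone_cmds("https://example.com/mono.git", "mono", ["services/api", "libs/core"]):
        print(shlex.join(cmd))
```

The two-step shape, plus the choice of filter and cone mode, is arguably the kind of ceremony the commenters had in mind when comparing it with SVN-style path selection.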

@hmpc: Recommends “Building Git” by James Coglan as a complementary deep-dive that reconstructs Git from scratch in Ruby.
@john-tells-all: Points to eagain.net’s “Git for Computer Scientists” for diagrams clarifying why HEAD differs from a commit object.