Mapping a general graph onto a hierarchical tree structure is a third fundamental hard problem in CS, alongside naming and cache invalidation.
Key Takeaways
Tree mapping appears across file systems (Linux package shredding vs. macOS bundles), code repos (component vs. language layout), writing, city planning, and biological taxonomy.
No canonical tree mapping exists for webs of ideas or relationships; every choice distorts the underlying graph structure.
Google’s monorepo (Blaze/Bazel, Pants, Buck) is a concrete attempt to escape forced taxonomies in large codebases.
Christopher Alexander’s 1965 “A City is not a Tree” is cited as an early formal diagnosis: designed cities fail because they impose tree structure on semilattice human relationships.
Cladistics over morphological taxonomy illustrates that preserving real connections beats imposing artificial hierarchies.
Hacker News Comment Review
Commenters converged on a unifying framing: tree mapping is fundamentally a dimensional reduction from high-dimensional graphs, and aggregability is NP-Hard per a cited ACM proof, giving the problem formal teeth.
There is practical consensus that flat structures often beat forced hierarchies; commenters linked to posts on flat Rust workspaces and flat module layouts as concrete alternatives to deep trees.
One commenter argued all three “hard problems” collapse into one: deciding whether two things are the same thing or separate, framing naming, caching, and tree mapping as identity problems.
Notable Comments
@efitz: Names this the “one true taxonomy” problem – stakeholder rooms never converge because every hierarchy privileges one classification dimension over others.
@chaboud: Grounds the difficulty formally: aggregability is NP-Hard, so any tree reduction of a graph is computationally intractable to optimize.