GenCAD


TLDR

  • The paper introduces GenCAD, an image-conditioned generative model that outputs parametric CAD command sequences (CAD programs), which a geometry kernel can convert into 3D solids.
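
To make "parametric CAD command sequence" concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the `Line` and `Extrude` classes, the `solid_volume` helper, and the toy "kernel" that only computes a volume are assumptions, not GenCAD's actual command format or a real geometry kernel.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Line:
    """Hypothetical sketch command: add a profile vertex at (x, y)."""
    x: float
    y: float


@dataclass(frozen=True)
class Extrude:
    """Hypothetical solid command: extrude the closed profile by `distance`."""
    distance: float


Command = Union[Line, Extrude]


def profile_area(points: List[Tuple[float, float]]) -> float:
    """Shoelace formula for the area of a closed polygon."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def solid_volume(program: List[Command]) -> float:
    """Toy stand-in for a geometry kernel: volume of one extruded profile."""
    points = [(c.x, c.y) for c in program if isinstance(c, Line)]
    height = sum(c.distance for c in program if isinstance(c, Extrude))
    return profile_area(points) * height


# A 2 x 1 rectangular profile extruded by 3 units.
program: List[Command] = [
    Line(0.0, 0.0), Line(2.0, 0.0), Line(2.0, 1.0), Line(0.0, 1.0),
    Extrude(3.0),
]

# Editing one parameter regenerates a different solid from the same program.
edited = program[:-1] + [replace(program[-1], distance=5.0)]

print(solid_volume(program), solid_volume(edited))
```

The point of the sketch is the last step: because the output is a program with named parameters, a downstream user can change one value and rebuild the part, which is not possible with a bare mesh or point cloud.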

Key Takeaways

  • GenCAD combines four components: an autoregressive transformer encoder, contrastive learning that aligns CAD images with command sequences, a latent diffusion model, and a CAD-latent decoder.
  • Output is a full parametric CAD program, not just a mesh or point cloud, preserving modifiability critical for engineering and manufacturing workflows.
  • Training data uses B-rep-compatible command sequences; the paper acknowledges a limited CAD vocabulary lacking revolve, fillet, and chamfer operations.
  • Self-reported reliability is around 60% on training-distribution data; input images are constrained to isometric, noise-free CAD renders.
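
The contrastive alignment between CAD images and command sequences mentioned above can be illustrated with a symmetric, CLIP-style InfoNCE loss. This is a minimal sketch under stated assumptions: the summary does not say which loss GenCAD uses, so the loss form, the `contrastive_loss` function name, and the `temperature` value of 0.07 are all assumptions chosen for illustration.

```python
import math
from typing import List

Vector = List[float]


def normalize(v: Vector) -> Vector:
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def cross_entropy_diagonal(logits: List[List[float]]) -> float:
    """Mean cross-entropy where the correct class for row i is column i."""
    loss = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        loss += log_sum - row[i]
    return loss / len(logits)


def contrastive_loss(image_emb: List[Vector], cad_emb: List[Vector],
                     temperature: float = 0.07) -> float:
    """Symmetric InfoNCE: pair i of image and CAD embeddings is the positive,
    every other pairing in the batch is a negative. (Assumed loss form.)"""
    imgs = [normalize(v) for v in image_emb]
    cads = [normalize(v) for v in cad_emb]
    logits = [[dot(i, c) / temperature for c in cads] for i in imgs]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits)
                  + cross_entropy_diagonal(transposed))


# Matching pairs score a much lower loss than deliberately mismatched pairs.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < mismatched)
```

Minimizing a loss of this kind pulls each image embedding toward the embedding of its own command sequence and away from the others in the batch, which is what lets an image later stand in as the conditioning signal for generation.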

Hacker News Comment Review

  • Practitioners are skeptical of practical utility: the hard CAD work is specifying dimensions, tolerances, and parametric constraints, none of which GenCAD addresses.
  • Several people who tried to run it found the setup broken and the results viable only on images matching the training distribution, consistent with the paper’s own stated limitations.
  • Consensus is that current weights lack sufficient training data and vocabulary to generalize; the gap between demo images and real-world part photos or hand-drawn sketches is large.

Notable Comments

  • @hspeiser: tested 10 images matching demo complexity and could not produce usable results outside the training data; notes the ~60% reliability figure from the GitHub repository.
  • @ponyous: compares it to MeshCoder; concludes that no current project, GenCAD included, has enough training to handle arbitrary models.
