Commenters are cautiously optimistic about replacing cudarc-based workflows, but note that build-time comparisons against nvcc remain unresolved and depend heavily on how incremental compilation is set up.
The safety model is acknowledged as partial: Rust’s borrow checker cannot enforce GPU thread-level aliasing, so DisjointSlice and ThreadIndex do the heavy lifting where the type system can reach.
Some commenters suspect the codebase is largely AI-generated, which raises concerns about the long-term maintainability and correctness of the custom IR and codegen backend.
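To make the partial safety model above concrete, here is a minimal CPU-only sketch of the general idea behind an index-gated output buffer. It borrows the names DisjointSlice and ThreadIndex from the discussion, but everything else is an assumption for illustration: the fields, the `write` and `get` methods, and the `launch` and `square_all` helpers are hypothetical, they run on ordinary OS threads rather than a GPU, and they are not cuda-oxide's actual API. The point is only that the borrow checker does not prove the writes are disjoint; a small `unsafe` core relies on the launcher handing each thread a distinct index.

```rust
use std::marker::PhantomData;

/// Hypothetical per-thread index. The field is private and the type is not
/// Clone, so in this sketch only `launch` can create one.
pub struct ThreadIndex(usize);

impl ThreadIndex {
    pub fn get(&self) -> usize {
        self.0
    }
}

/// Hypothetical shared view of an output buffer. Many threads hold `&self`,
/// and each may write only the element named by its own ThreadIndex.
pub struct DisjointSlice<'a, T> {
    ptr: *mut T,
    len: usize,
    _marker: PhantomData<&'a mut [T]>,
}

// Assumption of this sketch: sharing is sound only because `launch` hands out
// each index value once per launch. The compiler does not check that claim.
unsafe impl<'a, T: Send> Sync for DisjointSlice<'a, T> {}

impl<'a, T> DisjointSlice<'a, T> {
    pub fn new(data: &'a mut [T]) -> Self {
        Self {
            ptr: data.as_mut_ptr(),
            len: data.len(),
            _marker: PhantomData,
        }
    }

    /// Overwrites the element owned by `idx`; panics if it is out of bounds.
    pub fn write(&self, idx: &ThreadIndex, value: T) {
        assert!(idx.0 < self.len, "thread index out of bounds");
        // SAFETY (by convention, not by the type system): no other thread in
        // the same launch holds a ThreadIndex with this value.
        unsafe {
            *self.ptr.add(idx.0) = value;
        }
    }
}

/// Hypothetical launcher: runs `kernel` once per thread, each with a unique index.
pub fn launch<F>(threads: usize, kernel: F)
where
    F: Fn(&ThreadIndex) + Sync,
{
    std::thread::scope(|s| {
        for i in 0..threads {
            let kernel = &kernel;
            s.spawn(move || kernel(&ThreadIndex(i)));
        }
    });
}

/// Example "kernel": each thread squares one input element into the output.
pub fn square_all(input: &[u32]) -> Vec<u32> {
    let mut out = vec![0u32; input.len()];
    {
        let view = DisjointSlice::new(&mut out);
        launch(input.len(), |idx| {
            let i = idx.get();
            view.write(idx, input[i] * input[i]);
        });
    }
    out
}
```

Calling `square_all(&[1, 2, 3, 4])` spawns four threads that each write one slot of the output. The sketch is not a soundness proof: an index that escaped one launch and was reused in another would break the disjointness assumption, which is the kind of gap the "partial" label in the discussion refers to.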
Notable Comments
@nextaccountic: quotes the safety docs directly – the borrow checker was “not designed for 2048 threads per SM all pointing at the same output buffer.”
@the__alchemist: asks whether cuda-oxide enables shared host/device structs, which existing Rust/CUDA workflows still lack.