bobbyzhu2008 21 hours ago [-]
67% less kernel code is the more interesting number here — Hopper's async capabilities have been underutilized largely because the programming model is painful. Curious how it handles cases where compute and memory phases aren't cleanly separable.
jhap 18 hours ago [-]
This seems like a better version of CUDA, for Hopper GPUs?