fergusfinn.com

What happens when you run a CUDA kernel?

mezark · 288 points · 32 comments · 2 days ago

Comments

kinow · yesterday

I just finished a master's in HPC where I had to take some classes on CUDA, MPI+CUDA, and OpenCL. Reading an article like this before the classes would have been very helpful! Especially the part just before and after "What does it mean for a warp to be eligible?".

mschuetz · yesterday

That was an interesting read. I also enjoyed reading about the semaphores in the default stream. It's great that CUDA implicitly handles synchronization of commands for users and makes parallel commands opt-in via streams, unlike Vulkan, which offloads the full complexity of synchronization onto users right from the start.

fooblaster · yesterday

The hardware has some open documentation. You don't actually need to read the kernel source to find some of the method documentation or QMD formats. See https://github.com/NVIDIA/open-gpu-doc/blob/master/classes/c...

aberrahmane_b · yesterday

It's very useful. The doorbell and QMD parts were the most useful for me, because they connect the CUDA launch syntax to what actually gets submitted to the GPU. Most explanations stop around kernels, blocks, and warps, but this made the CPU-to-driver-to-GPU path much easier to follow.

orliesaurus · yesterday

There are companies whose whole job right now is to optimize kernels so that things run faster. I wonder if those companies are going to be dethroned by some sort of open source library that can do that really well (I bet Nvidia could release it any day), or if they're going to thrive and be acquired by the big providers as a "moat" to speed up their inference.