Hello once more, dear internet. Last post I wrote about the results of my planning and how I was putting off the whole FFT algorithm implementation so I could work on a placeholder. Well, the Gerstner wave placeholder is finished. Now it is time to traverse the great sea of implementing FFT (sea what I did there?). Unfortunately, I’m still in the shallows… okay, the metaphor has limits, but long story short: I can’t implement Cooley and Tukey’s famous algorithm quite yet. I need to have a compute pipeline first.
Without a compute pipeline I have no way to run “compute” shaders on the GPU, and I really want to implement FFT on the GPU. See, simulations are traditionally really rough on the CPU and don’t leave a lot of performance left over for other CPU tasks. So it would be great if we could calculate all the wave geometry somewhere else, right? Well, we can. Enter “compute” shaders. It turns out the GPU is tremendously general purpose these days. Check out Programming Massively Parallel Processors: A Hands-On Approach by David B. Kirk and Wen-mei W. Hwu if you’d like to learn about the wonders of GPU compute (particularly with an emphasis on Nvidia’s CUDA). Anyway, by moving the wave geometry computation to the GPU, I can hopefully leave some extra headroom for the dynamics simulation on the CPU.
Ever used a modern game engine? They have all kinds of fancy stuff: entity-component-systems organizing data in a cache-friendly, composable way, scripting engines allowing easy behavior implementation, renderers capable of drawing detailed models with physically based shading, access to vertex shaders, access to fragment shaders. You know what they don’t have by default? A compute pipeline with customizable compute shaders. You have to set that up yourself. Even in the behemoth Unity you have to do a little special setup for compute shaders. Now, Bevy is no Unity. It’s small, it’s lean, and it’s extremely new. In Bevy you have to create the pipeline pretty close to the underlying graphics API (which in Bevy’s case is wgpu, a Rust implementation of the WebGPU API). Granted, you have some helpers in the “renderer” module, but it’s still quite the process. You have to manually set up all the bind groups, add the compute task to the render graph yourself, and even implement the render_graph::Node trait for your type, which involves invoking low-level internal renderer functionality. Thankfully, Bevy ships exactly one compute shader example (a “game of life” demo) to learn from.
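To make that concrete, here’s roughly what the node half ends up looking like. This is a hedged sketch modeled on that example, written against an approximately Bevy 0.9-era API (the renderer churns fast, so names may differ in other versions); WaveComputeNode, WavePipeline, and WaveBindGroup are placeholder names of my own.

```rust
use bevy::{
    prelude::*,
    render::{render_graph, render_resource::*, renderer::RenderContext},
};

// Placeholder resources; in the real pipeline these would be created and
// inserted during the setup/queue stages.
#[derive(Resource)]
struct WavePipeline {
    id: CachedComputePipelineId,
}

#[derive(Resource)]
struct WaveBindGroup(BindGroup);

// The node the render graph invokes each frame to record the compute pass.
struct WaveComputeNode;

impl render_graph::Node for WaveComputeNode {
    fn run(
        &self,
        _graph: &mut render_graph::RenderGraphContext,
        render_context: &mut RenderContext,
        world: &World,
    ) -> Result<(), render_graph::NodeRunError> {
        let pipeline_cache = world.resource::<PipelineCache>();
        let pipeline = world.resource::<WavePipeline>();
        let bind_group = world.resource::<WaveBindGroup>();

        // Pipelines compile asynchronously, so this may not be ready yet.
        if let Some(compute_pipeline) = pipeline_cache.get_compute_pipeline(pipeline.id) {
            let mut pass = render_context
                .command_encoder
                .begin_compute_pass(&ComputePassDescriptor::default());
            pass.set_bind_group(0, &bind_group.0, &[]);
            pass.set_pipeline(compute_pipeline);
            // One 8x8 workgroup per tile of an assumed 256x256 target texture.
            pass.dispatch_workgroups(256 / 8, 256 / 8, 1);
        }
        Ok(())
    }
}
```

That if let is load-bearing: the pipeline compiles in the background, so the node has to quietly do nothing until the shader is actually ready.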
So I’ve mostly been trying to understand the why behind that example for the past two weeks. (You’d think it would be easy, but Bevy has almost no conceptual documentation.) At the beginning of those weeks I didn’t even know what a render graph was. Now I sort of know what a render graph is! According to this blog (which is actually really well written and helpful), the idea is that you encode all the different render resources/passes you need into nodes, and you organize those nodes into a dependency graph so that each pass only runs once the data it depends on is ready. The wiring looks something like the snippet below.
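Here’s a sketch of that wiring, under the same assumed Bevy version and reusing the WaveComputeNode type from the earlier snippet; the edge just declares “run my node before the cameras draw”.

```rust
use bevy::{
    prelude::*,
    render::{main_graph, render_graph::RenderGraph, RenderApp},
};

// Sketch: register the compute node (from the earlier snippet) and make the
// camera driver depend on it, so the wave textures are written before any
// drawing happens. "wave_compute" is a placeholder name.
fn setup_render_graph(app: &mut App) {
    let render_app = app.sub_app_mut(RenderApp);
    let mut render_graph = render_app.world.resource_mut::<RenderGraph>();
    render_graph.add_node("wave_compute", WaveComputeNode);
    render_graph
        .add_node_edge("wave_compute", main_graph::node::CAMERA_DRIVER)
        .unwrap();
}
```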
I also understand the idea of “extraction” now (the Bevy-specific kind, not extraction from a compressed file/stream). It needs to happen because in Bevy the renderer has its own world, so every frame the relevant data has to be extracted from the “game world” into the “render world”.
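For the simple “just copy this resource over” case, Bevy has a convenience plugin. A minimal sketch, where WaveSeedImage is a hypothetical resource of mine holding a handle to the seed texture:

```rust
use bevy::{
    prelude::*,
    render::extract_resource::{ExtractResource, ExtractResourcePlugin},
};

// Deriving ExtractResource means "clone me from the game world into the
// render world every frame". It requires Clone.
#[derive(Resource, Clone, ExtractResource)]
struct WaveSeedImage(Handle<Image>);

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        // A weak default handle as a stand-in; the real one comes from setup.
        .insert_resource(WaveSeedImage(Handle::default()))
        // Registers the extract system that performs the per-frame copy.
        .add_plugin(ExtractResourcePlugin::<WaveSeedImage>::default())
        .run();
}
```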
Finally, I understand the basic idea of binding: we need to get data GPU-side for reading and writing, whether that data is a texture, a number, or something else. Through bind group descriptors we can take the data from the render world and hand it to the GPU, and then our compute shader can access and mutate that data appropriately.
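As a sketch of what that might look like for my case, here’s a possible bind group layout: the seed texture read at binding 0, and a displacement texture written at binding 1. The formats and access modes here are guesses for illustration, not final choices.

```rust
use bevy::render::{render_resource::*, renderer::RenderDevice};

fn wave_bind_group_layout(render_device: &RenderDevice) -> BindGroupLayout {
    render_device.create_bind_group_layout(&BindGroupLayoutDescriptor {
        label: Some("wave_compute_layout"),
        entries: &[
            // binding 0: the seed texture, read by the compute shader
            BindGroupLayoutEntry {
                binding: 0,
                visibility: ShaderStages::COMPUTE,
                ty: BindingType::Texture {
                    sample_type: TextureSampleType::Float { filterable: false },
                    view_dimension: TextureViewDimension::D2,
                    multisampled: false,
                },
                count: None,
            },
            // binding 1: the displacement texture the shader writes into
            BindGroupLayoutEntry {
                binding: 1,
                visibility: ShaderStages::COMPUTE,
                ty: BindingType::StorageTexture {
                    access: StorageTextureAccess::WriteOnly,
                    format: TextureFormat::Rgba32Float,
                    view_dimension: TextureViewDimension::D2,
                },
                count: None,
            },
        ],
    })
}
```

A matching BindGroup then gets created against this layout with the actual texture views, which is what the node sketch earlier binds at slot 0.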
Now that I have some idea what’s happening, I can start building the compute pipeline for my use case. I’ve already begun adapting the code from the one compute example, and hopefully next week I’ll have a working compute shader that outputs displacement and normal textures based on a seed texture! Then I can finally start work on the FFT (maybe over the holidays?). The first piece of that adaptation, queueing the pipeline itself, is sketched below.
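Queueing the pipeline might look like the following; this is a guess at the shape under the same assumed Bevy version, and the shader path and entry point are placeholders since the shader doesn’t exist yet.

```rust
use std::borrow::Cow;
use bevy::{prelude::*, render::render_resource::*};

// Same resource as in the node sketch earlier.
#[derive(Resource)]
struct WavePipeline {
    id: CachedComputePipelineId,
}

impl FromWorld for WavePipeline {
    fn from_world(world: &mut World) -> Self {
        // "shaders/waves.wgsl" and the "main" entry point are placeholders.
        let shader = world.resource::<AssetServer>().load("shaders/waves.wgsl");
        let pipeline_cache = world.resource::<PipelineCache>();
        let id = pipeline_cache.queue_compute_pipeline(ComputePipelineDescriptor {
            label: Some(Cow::from("wave_compute_pipeline")),
            layout: None, // let wgpu derive the layout from the shader for now
            shader,
            shader_defs: vec![],
            entry_point: Cow::from("main"),
        });
        WavePipeline { id }
    }
}
```

This would be initialized in the render sub-app (something like init_resource::<WavePipeline>()), since PipelineCache lives in the render world.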
That’s all for this term; be sure to check back in the spring for more tales from my journey!
Till then, thank you for reading my little blog!