Layout is a core concept in Triton for representing and optimizing distribution mappings from source problems to the target hardware compute and memory hierarchy. In this blog post I will talk about linear layout in Triton, the new unifying mechanism over existing bespoke layouts for different purposes. The aim is to provide motivation and an intuitive understanding of linear layout; I will rely on examples and illustrations instead of theories and proofs.
Time flies—almost 9 years have passed since I joined Google. Now the time has come for me to leave and move on. While here, I’m super lucky to mostly work on open source projects that I can publicly talk about. So at the end of my tenure with Google, I’d like to reflect and summarize the incredible journey, which I am super grateful for and thoroughly enjoyed, before I forget some details.
2023-09-26
7 min read
Nowadays GPUs are utilized for both graphics rendering and general-purpose compute (GPGPU). For the latter, CUDA is the indisputable leading solution. Though, with so many other GPU vendors, the quest for a GPGPU standard never stops. OpenCL was a great attempt and is used widely; but still it falls short on many aspects. Given the success of Vulkan in graphics and it being both a graphics and compute API, one would wonder whether it can actually be the next-generation GPGPU standard. I certainly believe so; but the road is not full of roses.
Vulkan is designed to be both a graphics and compute API. However, there is no formal definition of the compute subset from the Khronos group, the industry consortium behind Vulkan. The unified specification of Vulkan does not help here either as it contains everything, both graphics and compute. Unlike the complicated graphics subset, the compute subset is actually quite straightforward and clean. So in this blog post I try to explain what Vulkan compute is, from my point of view.