These days if you would like to learn about machine learning, there are abundant great resources on the web discussing model architectures and how to code and train them. Materials about inference, though, are generally much harder to find, especially for edge and mobile. You might ask, inference is just the forward pass of training, so how hard can it be? Actually, it faces lots of unique challenges, to the extent that we are basically solving completely different major problems. I have been working on inference at the edge for a while, so let me capture them in this blog post, by contrasting training and inference in the cloud.
In a previous blog post I gave a general introduction to GPU driver internals in Android/Linux systems. Following up with it, today I will explain how a specific functionality, hardware performance counter (perf counter) queries, is handled in both Qualcomm Adreno and ARM Mali drivers, by walking through the kernel driver source code.
Recently I have been working on a library that needs to directly interact with GPU kernel drivers from various vendors on Android/Linux systems. Compared to various GPU APIs, information at this level is quite sparse; so it is not a straightforward task, to say the least, and ends up requiring me to piece multiple sources together to figure out the details. So I am logging these driver internals and resources down in case it can be useful to others that are interested in these low-level bits.
Vulkan is designed to be both a graphics and compute API. However, there is no formal definition of the compute subset from the Khronos group, the industry consortium behind Vulkan. The unified specification of Vulkan does not help here either as it contains everything, both graphics and compute. Unlike the complicated graphics subset, the compute subset is actually quite straightforward and clean. So in this blog post I try to explain what Vulkan compute is, from my point of view.
On 2018 Vulkan Developer Day in Montréal, I gave a talk regarding “Shader Toolchain: HLSL in Vulkan”. Here are the links to the video recording, slides, and documentation/downloads for DirectX Shader Compiler (DXC) SPIR-V CodeGen.
This blog post discusses how HLSL semantic strings are translated into SPIR-V location numbers for Vulkan shader inter-stage interface matching in the SPIR-V CodeGen of DirectXShaderCompiler (DXC). It is one of the “HLSL for Vulkan” series.
This blog post discusses how to manage resources in HLSL for Vulkan, using the SPIR-V CodeGen of DirectXShaderCompiler (DXC). It is one of the “HLSL for Vulkan” series.
This blog post discusses how HLSL matrices are translated into SPIR-V for Vulkan consumption in the SPIR-V CodeGen of DirectXShaderCompiler. It is one of the “HLSL for Vulkan” series.