This blog post discusses how HLSL semantic strings are translated into SPIR-V location numbers for Vulkan shader inter-stage interface matching in the SPIR-V CodeGen of DirectXShaderCompiler (DXC). It is part of the “HLSL for Vulkan” series.
There are a few places where DirectX and Vulkan adopt different mechanisms (and different terms!) for the same task; how to match/link shader inputs and outputs is one of them.
Terminology
Before diving in, to confuse everyone, here are a few concepts related to shader inter-stage interface. :)
DirectX/HLSL/DXIL
- Input/output signature: all input/output parameters of a shader entry function.
- Signature packing: all signature parameters must fit into a finite space of N 4x32-bit registers. For efficiency reasons, parameters are packed together in a way that does not violate specification constraints.
- Semantic: a string attached to an input/output parameter of a shader entry function to convey the intended usage of the parameter.
- System-value (SV) semantic: a special semantic string prefixed by SV_ that has specific meanings and constraints. SV semantics are shader stage dependent.
More details can be found in the DXIL spec, “HLSL signatures and semantics”.
Vulkan/GLSL/SPIR-V
- Shader input and output interface: when multiple stages are present in a pipeline, the outputs of one stage form an interface with the inputs of the next stage.
- Interface matching: there are two classes of variables that can be matched between shader stages, built-in variables and user-defined variables. Each class has a different set of matching criteria.
- Location and Component: the Location value specifies an interface slot comprised of a 32-bit four-component vector conveyed between stages. The Component specifies components within these vector locations.
More details can be found in the Vulkan spec, “Shader Input and Output Interfaces”.
Shader Inter-stage Interface
Now let’s look at shader inter-stage interface matching for both DirectX and Vulkan.
DirectX
For a single shader stage, the input/output parameters must be packed into a space of N 4x32-bit registers according to runtime constraints. The C/C++ type system adopted by HLSL introduces complexity in the process; therefore, to simplify packing, all signature variables of composite types are flattened first: turning struct parameters into constituent components and making strided arrays contiguous. The DXIL spec, “Signature packing” contains the complete set of packing rules.
Each signature parameter can take one or more registers; and the register assignment process happens in declaration order. For example, for the following source code:
struct VSOut {
float4 pos : SV_Position;
float3 vec1 : VECTOR1;
float3 vec2 : VECTOR2;
};
VSOut main(
float3x2 mat : MATRIX,
float4 pos : POSITION
) { ... }
fxc.exe gives the following register allocation:
// Input signature:
//
// Name                 Index   Mask Register SysValue  Format   Used
// -------------------- ----- ------ -------- -------- ------- ------
// MATRIX                   0   xyz         0     NONE   float   xyz
// MATRIX                   1   xyz         1     NONE   float   xyz
// POSITION                 0   xyzw        2     NONE   float   xyz
//
//
// Output signature:
//
// Name                 Index   Mask Register SysValue  Format   Used
// -------------------- ----- ------ -------- -------- ------- ------
// SV_Position              0   xyzw        0      POS   float   xyzw
// VECTOR                   1   xyz         1     NONE   float   xyz
// VECTOR                   2   xyz         2     NONE   float   xyz
For input parameters, pos is assigned its register after mat.
For output parameters, pos is assigned its register before vec1 and vec2.
Semantic string
Additionally, each signature parameter must have a semantic string attached to it. Semantic strings can contain indices, which are used to disambiguate parameters that use the same semantic name (e.g., VECTOR in the above example), or occupy multiple registers (e.g., MATRIX in the above example).
If semantic strings are attached to both a struct and its fields, the one on the struct overrides those on the fields. For example,
struct S {
float4 a : AAA;
float4 b : BBB;
float4 c : CCC;
};
S main (...) : DDD5 { ... }
fxc.exe gives the following register and index allocation:
// Output signature:
//
// Name                 Index   Mask Register SysValue  Format   Used
// -------------------- ----- ------ -------- -------- ------- ------
// DDD                      5   xyzw        0     NONE   float   xyzw
// DDD                      6   xyzw        1     NONE   float   xyzw
// DDD                      7   xyzw        2     NONE   float   xyzw
All member fields in S have DDD as the semantic name. Semantic indices are assigned to them incrementally in declaration order, starting with the index on the struct.
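As a rough illustration of the index rule above, here is a small Python sketch. It is not what fxc or DXC actually run; the helper names split_semantic and apply_struct_semantic are made up for this post, and the sketch assumes every field occupies exactly one register, so the index advances by one per field.

```python
import re

def split_semantic(semantic):
    """Split a semantic string such as 'DDD5' into ('DDD', 5).

    A semantic without a trailing index is treated as index 0.
    """
    match = re.fullmatch(r"(.*?)(\d*)", semantic)
    name, index = match.group(1), match.group(2)
    return name, int(index) if index else 0

def apply_struct_semantic(struct_semantic, field_names):
    """Give every field the struct's semantic name and an incrementing index.

    Assumption for this sketch: each field takes a single register.
    """
    name, start = split_semantic(struct_semantic)
    return [(field, name, start + i) for i, field in enumerate(field_names)]

# The struct S from the example, returned from main with ": DDD5".
for field, name, index in apply_struct_semantic("DDD5", ["a", "b", "c"]):
    print(field, name, index)
```

For the example struct this yields DDD 5, DDD 6 and DDD 7 for a, b and c, the same indices as in the fxc.exe listing.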
Interface matching
To match the output signature of a source stage with the input signature of the destination stage, DirectX requires that each register have a matching type and semantic (both name and index). Because registers are assigned in declaration order, this means signature parameters should match in type and semantic string in declaration order. There is one exception though: tail parameters on the destination stage can be omitted. For example, given the following vertex shader (VS) output signature,
struct VSOut {
float4 pos : SV_Position;
float2 uv : UV;
float3 norm : NORMAL;
};
The following pixel shader input signature will match:
struct PSIn {
float4 pos : SV_Position;
float2 uv : UV;
};
But not this one, which swaps uv and norm:
struct PSIn {
float4 pos : SV_Position;
float3 norm : NORMAL;
float2 uv : UV;
};
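The matching rule can be sketched in a few lines of Python. This is a simplified model, not the DirectX runtime's actual check: it treats each parameter as a (type, semantic) pair, ignores packing and multi-register parameters, and the function name signatures_match is made up for illustration.

```python
def signatures_match(source_outputs, dest_inputs):
    """Check a destination input signature against a source output signature.

    Both arguments are lists of (type, semantic) pairs in declaration order.
    The destination may omit tail parameters, but may not reorder or skip.
    """
    if len(dest_inputs) > len(source_outputs):
        return False
    # zip stops at the shorter list, so omitted tail parameters are ignored.
    return all(src == dst for src, dst in zip(source_outputs, dest_inputs))

vs_out = [("float4", "SV_Position"), ("float2", "UV"), ("float3", "NORMAL")]
ps_in_ok = [("float4", "SV_Position"), ("float2", "UV")]
ps_in_swapped = [("float4", "SV_Position"), ("float3", "NORMAL"), ("float2", "UV")]

print(signatures_match(vs_out, ps_in_ok))       # tail parameter omitted
print(signatures_match(vs_out, ps_in_swapped))  # uv and norm swapped
```

The first call reports a match and the second does not, mirroring the two PSIn structs above.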
Vulkan
On the Vulkan side, shader input and output interfaces are matched using location numbers. Locations in Vulkan are conceptually similar to registers in DirectX. But Vulkan has different location assignment rules compared to DirectX signature packing rules. Details can be found in the Vulkan spec, “Location Assignment”.
Vulkan basically requires an exact type match for the same location number. Quoting interface matching rules from the Vulkan spec, “Interface Matching”:
A user-defined output variable is considered to match an input variable in the subsequent stage if the two variables are declared with the same Location and Component decoration and match in type and decoration, except that interpolation decorations are not required to match. For the purposes of interface matching, variables declared without a Component decoration are considered to have a Component decoration of zero.
For composite types, the spec requires each member to match exactly:
Variables or block members declared as structures are considered to match in type if and only if the structure members match in type, decoration, number, and declaration order. Variables or block members declared as arrays are considered to match in type only if both declarations specify the same element type and size.
For tessellation and geometry shaders, the outermost array dimension on per-vertex parameters is ignored for the purpose of interface matching:
Tessellation control shader per-vertex output variables and blocks, and tessellation control, tessellation evaluation, and geometry shader per-vertex input variables and blocks are required to be declared as arrays, with each element representing input or output values for a single vertex of a multi-vertex primitive. For the purposes of interface matching, the outermost array dimension of such variables and blocks is ignored.
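To make the first quoted rule (the one for user-defined variables) concrete, here is a minimal Python model of it. The InterfaceVar class and variables_match function are hypothetical names for this post; types are compared as plain strings, and decorations other than Location, Component and interpolation are left out of the sketch.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class InterfaceVar:
    location: int
    type_name: str                       # e.g. "v4float"
    component: Optional[int] = None      # None means no Component decoration
    interpolation: Optional[str] = None  # e.g. "Flat"; not used for matching

def variables_match(out_var, in_var):
    """Same Location, same Component (missing means 0), same type."""
    out_component = out_var.component or 0
    in_component = in_var.component or 0
    return (out_var.location == in_var.location
            and out_component == in_component
            and out_var.type_name == in_var.type_name)

vs_uv = InterfaceVar(location=1, type_name="v2float")
ps_uv = InterfaceVar(location=1, type_name="v2float", component=0,
                     interpolation="Flat")
ps_bad = InterfaceVar(location=1, type_name="v3float")

print(variables_match(vs_uv, ps_uv))   # interpolation difference is ignored
print(variables_match(vs_uv, ps_bad))  # type mismatch at the same location
```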
Semantic String to Location Number
As said above, Vulkan locations are conceptually similar to DirectX registers. So a natural translation scheme is to
- 1) follow the declaration order of input/output parameters and assign locations incrementally, and
- 2) flatten struct types and map each member field separately.
For example, for the following source code:
struct S {
float3x2 mat : MATRIX;
float3 vec : VECTOR;
};
struct T {
float arr[4] : ARRAY;
S s;
};
T main() { ... }
We will create 3 stand-alone output variables:
- out_var_ARRAY: takes 4 locations
- out_var_MATRIX: takes 3 locations
- out_var_VECTOR: takes 1 location
OpDecorate %out_var_ARRAY Location 0
OpDecorate %out_var_MATRIX Location 4
OpDecorate %out_var_VECTOR Location 7
%out_var_ARRAY = OpVariable %_ptr_Output__arr_float_uint_4 Output
%out_var_MATRIX = OpVariable %_ptr_Output_mat3v2float Output
%out_var_VECTOR = OpVariable %_ptr_Output_v3float Output
(You may wonder why float3x2 takes 3 Vulkan locations but 2 DirectX registers. Read more about how matrix types are handled in “HLSL for Vulkan: Matrices”).
The above scheme implicitly requires that developers keep consistent declaration order for input/output parameters in adjacent shader stages. But since DirectX basically has the same requirement, and sharing the same struct definition among different shader stages is common, it won’t be a problem.
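Rules 1) and 2) can be modeled with a short Python sketch. It is only an approximation of the scheme described here, not DXC's implementation: the type classes and function names are made up, and the location counts follow the example in this post (a scalar or vector takes 1 location, a float3x2 takes 3, an array takes its length times the element count). Built-ins and 64-bit types are ignored.

```python
from dataclasses import dataclass

@dataclass
class Vector:
    width: int  # 1 for a scalar

@dataclass
class Matrix:
    rows: int
    cols: int

@dataclass
class Array:
    elem: object
    length: int

@dataclass
class Struct:
    fields: list  # (semantic or None, type) pairs in declaration order

def location_count(ty):
    """Number of locations a flattened (non-struct) type takes in this model."""
    if isinstance(ty, Vector):
        return 1
    if isinstance(ty, Matrix):
        return ty.rows  # assumption taken from the float3x2 example
    if isinstance(ty, Array):
        return ty.length * location_count(ty.elem)
    raise TypeError(f"unexpected type: {ty!r}")

def flatten(semantic, ty):
    """Rule 2: yield (semantic, type) for every non-struct leaf, in order."""
    if isinstance(ty, Struct):
        for field_semantic, field_ty in ty.fields:
            yield from flatten(field_semantic, field_ty)
    else:
        yield semantic, ty

def assign_locations(params):
    """Rule 1: walk parameters in declaration order, assigning incrementally."""
    locations, next_location = {}, 0
    for semantic, ty in params:
        for leaf_semantic, leaf_ty in flatten(semantic, ty):
            locations[leaf_semantic] = next_location
            next_location += location_count(leaf_ty)
    return locations

S = Struct([("MATRIX", Matrix(3, 2)), ("VECTOR", Vector(3))])
T = Struct([("ARRAY", Array(Vector(1), 4)), (None, S)])
print(assign_locations([(None, T)]))
# {'ARRAY': 0, 'MATRIX': 4, 'VECTOR': 7}
```

The result lines up with the Location decorations in the SPIR-V listing above.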
HS/DS/GS arrayness
Input parameters for per-vertex data in hull shader (HS), domain shader (DS), and geometry shader (GS) are arrays, since each primitive/patch contains multiple vertices. Typically, they are declared as arrays of structs in the HLSL source code:
#define NumVertices ...
// Hull shader
HSPerVertexOut main (InputPatch <HSPerVertexIn, NumVertices> inData, ...) { ... }
// Domain shader
DSPerVertexOut main (OutputPatch<HSPerVertexOut, NumVertices> inData, ...) { ... }
// Geometry shader
void main (in triangle DSPerVertexOut inData[NumVertices], ...) { ... }
To meet the requirements of Vulkan interface matching, 3) the flattened struct members retain the outermost array dimension. That is, these arrays of structs are transformed into arrays of their fields. For example, for the following source code:
struct PerVertex {
float4 a : AAA;
float3 b : BBB;
};
PerVertex main (InputPatch<PerVertex, 3> inData, ...) { ... }
The corresponding SPIR-V code:
%out_var_AAA = OpVariable %_ptr_Output__arr_v4float_uint_3 Output
%out_var_BBB = OpVariable %_ptr_Output__arr_v3float_uint_3 Output
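Rule 3) is essentially a struct-of-arrays transformation. A tiny Python sketch of it follows; the function name flatten_per_vertex is hypothetical, and the type strings simply imitate the naming seen in the SPIR-V listing above.

```python
def flatten_per_vertex(fields, num_vertices):
    """Turn an array of structs into one array per field.

    fields: (semantic, SPIR-V-style element type name) pairs.
    Each field keeps the outermost array dimension of num_vertices.
    """
    return [(semantic, f"_arr_{type_name}_uint_{num_vertices}")
            for semantic, type_name in fields]

per_vertex = [("AAA", "v4float"), ("BBB", "v3float")]
for semantic, array_type in flatten_per_vertex(per_vertex, 3):
    print(semantic, array_type)
```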
HS output and DS input
One more interesting thing about HS is that, apart from the entry function, it also has a patch constant function. These two functions both write to shader output parameters: some for per-vertex data, some for per-patch data. Using declaration order to assign location numbers becomes problematic and error-prone here: it depends on the subtle positioning of these two functions and the compiler’s visiting order of them.
HS and DS must be both present or both absent, which makes the interface between HS output and DS input much tighter than the others. So we decided to 4) use alphabetical order of semantics to assign location numbers to HS output and DS input.
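The benefit of alphabetical order is that the result no longer depends on which function the compiler visits first. A small Python sketch of the idea, with made-up semantics and a hypothetical function name; the exact string comparison DXC uses is not modeled here:

```python
def assign_locations_alphabetically(params):
    """params: (semantic, location_count) pairs in any order."""
    locations, next_location = {}, 0
    for semantic, count in sorted(params, key=lambda p: p[0]):
        locations[semantic] = next_location
        next_location += count
    return locations

# Hypothetical HS outputs: two from the entry function, one from the
# patch constant function.
entry_outputs = [("POS", 1), ("NORMAL", 1)]
patch_constant_outputs = [("EDGE", 1)]

entry_first = assign_locations_alphabetically(entry_outputs + patch_constant_outputs)
patch_first = assign_locations_alphabetically(patch_constant_outputs + entry_outputs)
print(entry_first)
print(entry_first == patch_first)  # visiting order does not matter
```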
SV semantic vs. non-SV semantic
I haven’t talked about the differences between SV semantics and non-SV semantics thus far.
A related concept is built-in variables and normal input/output variables in Vulkan. Vulkan built-in variables do not need location assignment, but normal input/output variables do. They are also subject to different matching rules.
SV semantics are mostly translated into Vulkan built-in variables, but not always. And the translation is shader stage dependent. For example, SV_Position will be translated into a normal input variable if used for VS input, into Vulkan built-in Position if used for VS output, and FragCoord if used for pixel shader (PS) input. So there is no single rule for SV semantic mapping; they are handled case by case.
Non-SV semantics are always translated into normal input/output variables.
SV_Target
And an exception for SV_Target of course since it is special!
When you write SV_TargetX in the source code, you typically would like to associate it with the X-th render target, regardless of the declaration order. So, 5) for SV_TargetX, the location number would be X, regardless of declaration order.
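In code form, rule 5) is just "parse the index and use it as the location". A minimal Python sketch with a hypothetical function name; treating a missing index as 0 and ignoring case are assumptions of this sketch rather than something spelled out above:

```python
import re

def sv_target_location(semantic):
    """Return X for SV_TargetX, or None for any other semantic."""
    match = re.fullmatch(r"SV_Target(\d*)", semantic, flags=re.IGNORECASE)
    if match is None:
        return None
    return int(match.group(1) or 0)

# Declaration order is irrelevant: each output lands on its own index.
for semantic in ["SV_Target2", "SV_Target0", "COLOR"]:
    print(semantic, sv_target_location(semantic))
```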
SV_ClipDistance & SV_CullDistance
Yes, more exceptions. :)
You can have multiple HLSL variables annotated with SV_ClipDistanceX/SV_CullDistanceX, and each of them can be of float or vector of float type. But for Vulkan, there is only one built-in for ClipDistance/CullDistance and it must be of array of float type. So to map them into one float array, we 6) first sort them in ascending order of X, and then concatenate them tightly. For example,
struct T {
float clip0: SV_ClipDistance0;
};
struct S {
float3 clip5: SV_ClipDistance5;
...
};
void main (T t, S s, float2 clip2 : SV_ClipDistance2) { ... }
Then we have a float array of size (1 + 2 + 3 =) 6 for ClipDistance, with clip0 at offset 0, clip2 at offset 1, and clip5 at offset 3.
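The sort-then-concatenate step for the example above can be sketched in Python as follows. The function name pack_clip_distances and the tuple layout are made up for illustration; this is a model of rule 6), not DXC's code.

```python
def pack_clip_distances(variables):
    """variables: (name, semantic index X, number of floats) tuples.

    Returns ({name: offset into the float array}, total array size).
    """
    offsets, offset = {}, 0
    for name, _, size in sorted(variables, key=lambda v: v[1]):
        offsets[name] = offset
        offset += size
    return offsets, offset

# Declaration order from the example: t.clip0, s.clip5, then clip2.
declared = [("clip0", 0, 1), ("clip5", 5, 3), ("clip2", 2, 2)]
offsets, total = pack_clip_distances(declared)
print(offsets, total)
# {'clip0': 0, 'clip2': 1, 'clip5': 3} 6
```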
Explicit control
As always, we are trying to provide a meaningful and hassle-free default behavior in the compiler together with explicit control mechanisms for developers to override the default behavior. For this case, a C++11 style attribute, [[vk::location(X)]], can be used in the source code to explicitly specify the location of an input/output parameter:
struct S {
[[vk::location(3)]] float4 a : AAA;
[[vk::location(6)]] float4 b : BBB;
};
S main() { ... }
The corresponding SPIR-V code:
OpDecorate %out_var_AAA Location 3
OpDecorate %out_var_BBB Location 6
%out_var_AAA = OpVariable %_ptr_Output_v4float Output
%out_var_BBB = OpVariable %_ptr_Output_v4float Output
Dual-source blending
This explicit control mechanism is also how you can use dual-source blending in Vulkan:
struct PSOut {
[[vk::location(0), vk::index(0)]] float4 a: SV_Target0;
[[vk::location(0), vk::index(1)]] float4 b: SV_Target1;
};
PSOut main() { ... }
The corresponding SPIR-V code:
OpDecorate %out_var_SV_Target0 Location 0
OpDecorate %out_var_SV_Target0 Index 0
OpDecorate %out_var_SV_Target1 Location 0
OpDecorate %out_var_SV_Target1 Index 1
%out_var_SV_Target0 = OpVariable %_ptr_Output_v4float Output
%out_var_SV_Target1 = OpVariable %_ptr_Output_v4float Output
Takeaways
- SV semantics are mostly mapped into Vulkan built-in variables, but not always. Non-SV semantics are always mapped into normal Vulkan input/output variables.
- Vulkan location numbers are sequentially assigned to HLSL input/output parameters following their declaration order, excluding those mapped to Vulkan built-in variables.
- The process flattens struct types and assigns Vulkan locations to struct fields.
- Flattened struct fields will retain the outermost array dimension for number of vertices in HS/DS/GS.
- But there are a few exceptions:
  - Location numbers are assigned to HS output and DS input according to alphabetical order of semantics.
  - SV_TargetX will be assigned to location X.
  - SV_ClipDistanceX/SV_CullDistanceX are sorted and aggregated into one float array.