Shifts should always use 32bit shift values, and when lowering to
masked, we need to use 32-bit atomics. That means that we should also
treat 24bit stores as a single masked op rather than one 16bit unmasked
and one 8bit masked.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
DXIL doesn't have a "subgroup ID" or "num subgroups" construct,
so add lowering to construct them. Subgroup ID is done using
once-per-subgroup atomics on a workgroup-shared variable, and
then broadcasting that (using read_first_invocation) to the other
threads. Num subgroups is just a division with the workgroup size.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20777>
This adds support for D3D12-native view instancing to the compiler.
Essentially, it's just the ability to load SV_ViewID (dx.op.viewID),
set the right capability, and fill out some more PSV data. Note that
the PSV data is currently garbage. Ideally, we'd fill out a proper
input -> output and viewID -> output dependency table, but AFAIK
this is only used to enforce D3D API validation, and drivers ignore
it, so it's less critical to get it right.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20650>
User varyings are linked by both name and register. The name is based
on how many *variables* are before it in final driver_location sort
order, not necessarily how many registers are before it.
In some cases where clip/cull distance are involved, it's possible
for one shader to write into a part of the cull distance that's
ignored by a downstream shader, but because linking is done by
*whole* register locations, and clip/cull can be combined using
*fractional* register locations, this is hard to detect. Since no
non-sysvals end up using fractional locations, just put all non-sysvals
first so they always generate the same semantic names for the same
register locations.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20346>
In a future commit, nit_type_conversion_op won't be able to handle i2b
(and in a much later commit f2b), so switch many users to the fully
featured function.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
It is invalid to have Boolean variables as either shader inputs or
outputs, so there is no point to try to lower them in general. The only
use for this was some two-phase lowering of
nir_intrinsic_load_front_face that could be done in a single phase.
Create the SYSTEM_VALUE_FRONT_FACE as a uint and compare it with zero at
the same time.
No shader-db or fossil-db changes on any Intel platform.
v2: Remove dxil_nir_lower_bool_input from dxil_nir.h and drop it from
the other caller in the spirv_to_dxil codepath. Noticed by Jesse. Fix
setting bit size when loading SYSTEM_VALUE_FRONT_FACE. Caught by CI.
v3: Use nir_ine_imm. Change type of gl_FrontFacing GS output in
d3d12_nir_passes from Boolean to integer. Both suggested by Jesse.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
No shader-db or fossil-db changes on any Intel platform.
v2: Add missed i2b1 in ir3_nir_opt_preamble.c.
v3: Add missed i2b1 in ac_nir_lower_ngg.c.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
We're about to lower I/O to scalar, which means we'll end up with
multiple writes to position, and none of them has enough info to
fill in the blanks.
This causes a test that previously crashed on WARP (due to
StoreOutput with an undef not being handled) to fail more
gracefully - but that failure means that the test spends
forever just outputting errors, so explicitly skip it.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17603>
While SPIR-V's OpKill is block terminating, the converted discard
intrinsic is not block terminating. This can lead to issues where
instruction could be placed after discard.
This patch adds an extra pass that drops all instructions after discard
before we convert discards.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17474>
Fix defect reported by Coverity Scan.
Side effect in assertion (ASSERT_SIDE_EFFECT)
assignment_where_comparison_intended: Assignment var->type =
glsl_int_type() has a side effect. This code will work differently in a
non-debug build.
Fixes: afb64e10c1 ("microsoft/compiler: Move d3d12_fix_io_uint_type() to dxil_nir.c")
Suggested-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17097>
We currently have two implementations of the same logic. Let's pick
the d3d12 one, move it to dxil_nir.c and let nir_to_dxil() call it
when appropriate.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17039>
As far as I can't tell, there's no native operation doing this
equivalent of fquantize2f16. Let's lower this operation to
if (val < MIN_FLOAT16)
return -INFINITY;
else if (val > MAX_FLOAT16)
return -INFINITY;
else if (fabs(val) < SMALLER_NORMALIZED_FLOAT16)
return 0;
else
return val;
which matches the definition of OpQuantizeToF16:
"
If Value is an infinity, the result is the same infinity.
If Value is a NaN, the result is a NaN, but not necessarily the same NaN.
If Value is positive with a magnitude too large to represent as a 16-bit
floating-point value, the result is positive infinity. If Value is negative
with a magnitude too large to represent as a 16-bit floating-point value,
the result is negative infinity. If the magnitude of Value is too small to
represent as a normalized 16-bit floating-point value, the result may be
either +0 or -0.
"
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16959>
Sometimes you can end up with tex instructions that have sampler deref srcs, even though
they don't need them, e.g. a txs. In this case, still fix up those derefs in the sampler
splitting pass rather than leaving them pointing to a typed sampler.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16639>
Reusing the shadow sampler's variable causes problems when the sampler
is used more than once. The remaining `deref_var`s will be using the
wrong type.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14988>
The input attachment lowering pass turns input attachment loads into
texel fetch operation, and insert an image -> texture deref cast along
the way. In this situation, we can end up with a texture deref chain
pointing to an image variable, which is not a combined sampler+texture
object. Bail out when an image type is found, like we do for bare
textures.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13993>
In order to get the semantics right, we need to know how many of the clip/
cull fields are designated for which purpose. In the case of a shader that
can receive these fields as both input and output, the shader_info property
is reserved to store the output info. We could add a dedicated input field
to shader_info, but since it'd probably only be useful for us, just send
it through a side channel during shader linking.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14399>
This way, patch varyings come before the patch sysvals (tess levels).
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14399>
Also add tess factors to the list of sysvals that can cause vars to be sorted last.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14399>
It doesn't depend on the clc data being provided externally, so no
need to tie it there, we can re-use it for GL and Vulkan compute.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>
Previously, the lowering was for 8/16/64-bit values, or 8/16-component
vectors. Now, it also handles write masks on 32-bit 1/2/3/4-component
vectors.
DXIL looks like it supports putting an interesting write mask in the
buffer store intrinsic, but DXC never generates stores with write
masks, and multiple drivers completely ignore the write mask.
Also, set the write mask properly on the output intrinsic.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14294>
dxil_nir_opt_alu_deref_srcs will always return false because
the progress variable is declared both for the function and also
inside the loop.
Spotted by a unused-but-set-variable warning from clang
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14290>
_mesa_hash_table_u64_search() returns the data directly, not an
hash_entry object. We also need to take the descriptor set into account
for this pass to work properly on Vulkan shaders.
Fixes: 46bc7cf678 ("microsoft/compiler: Rewrite sampler splitting pass to be smarter and handle derefs")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13912>
After running the (renamed) dxil_nir_split_typed_samplers pass, the
shader will have either:
* Textures, which map to D3D SRVs
* Bare samplers, which map to D3D bare samplers
* Images, which map to D3D UAVs
There shouldn't be any remaining samplers with type information
Reviewed-by: Enrico Galli <enrico.galli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13390>