this is for drivers (like freedreno) which need the format in the sampler
state in order to accurately handle border colors
when set, drivers MAY receive a format in the sampler state if the frontend
supports it (e.g., nine does not), and the cso sampler cache will include
the format member of the struct
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17189>
EOT messages need to use g112-g127 for their sources. With the new
opt_split_sends pass, we may be constructing an EOT message from two
different registers, and be able to copy propagate the original values
into those SENDs.
This can cause problems if we copy propagate from a large register
(say an RGBA value which is 4 GRFs in SIMD8 or 8 GRFs in SIMD16), in a
situation where the SEND only read a subset of that (say the alpha value
out of an RGBA texturing result). g112-127 can only hold 16 registers
worth of data, and sometimes we can only use g112-126. So, we can't
propagate if the GRFs in question are larger than 15 GRFs.
Fixes a shader validation failure in Alan Wake. Thanks to Ian Romanick
for catching this!
shader-db on Icelake shows that only SIMD32 programs are affected, and
the results are pretty negligable:
total instructions in shared programs: 19615228 -> 19615269 (<.01%)
instructions in affected programs: 10702 -> 10743 (0.38%)
helped: 1 / HURT: 43 / largest change: +/- 2 instructions
total cycles in shared programs: 852001706 -> 852001566 (<.01%)
cycles in affected programs: 767098 -> 766958 (-0.02%)
helped: 68 / HURT: 64 / largest change: +/- 774 cycles
GAINED: 2 / LOST: 0
Fixes: 589b03d02f ("intel/fs: Opportunistically split SEND message payloads")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6803
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17390>
A general cleanop of the nir compiler options including separating
the handling for FS and all the other shaders.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17076>
Currently, the NIR code path doesn't use clause local registers,
but these seem to help a lot with some work loads.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17076>
This is a rewite of the NIR backend. it adds some optimization
and a scheduler.
v2: - replace some magic numbers by constants
- make sure constructor is always used with new
- use default initialization in more places
(changes suggested by Filip Gawin)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17076>
Not sure whether this is really needed. With the TGSI code path no
other bank swizzle is emitted for LDS ops, so make sure it's the same here.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17076>
The number of ALU groups is important for good sccheduling, so
let's add it to the stats.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17076>
When utrace/perfetto is active, we allocate/free utrace buffers at the
same rate as command buffers. It's useful to have a pool that avoids
GEM_CREATE/GEM_CLOSE ioctls.
v2: Use the pool more
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16613>
Dozen is currently a SW driver as far as WSI is concerned so it's going
to wait on a fence anyway. Also, I highly doubt it's actually hooked
these up properly. It's probably just a copy+paste from ANV.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>
D3D12 requires the offset to be 512B-aligned and row pitch to be
256B-aligned for copy commands. There will need to be a fallback
written eventually because Vulkan has no such requirements but these
will remain the optimal limits as they allow using the D3D12 copy
commands directly.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>
Some drivers such as lavapipe are 100% fine with using linear for WSI
images. Most HW drivers, however, would rather render tiled and eat a
blit.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>
This isn't a big deal for the current buffer paths because the required
alignment for PRIME is already higher than any driver advertises.
However, the SW path we're about to add won't have the PRIME requirement.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>
Instead of taking a single boolean for device-local, take a set of
required properties and denied properties. This will let us require
additional things like being CPU mappable in the future.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>
This turned out to be not a CTS bug but rather hardware bug around the
cache handle BCn textures.
It requires significant tracking to detect such cases, and it's likely
not worth a workaround since reading a texture as both compressed and
uncompressed in succession shall not be a realistic use case.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6689
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17345>
This fixes tests using imageLoad/imageStore on texture
created using glEGLImageTargetTexture2DOES.
Before this change, the format was guessed as GL_RGBA,
which would be rejected by _mesa_get_shader_image_format.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16662>
For images created from textures or renderbuffer, the internal
format is known so store it.
This will be used in the next commit to replaces guessing it.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16662>
Not had time to investiguate what is going is on but it's definitely a
contributor to failures.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16104>
We can't unconditionally support separable shader objects on the host,
so submit the property only if the shader is actually separable, the
host knows about the property, and supports SSO.
Without support for SSOs, the host can still compile and link the shaders,
it needs to do more work on interface matching though.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17344>
Most drivers don't care about the property, and virgl should only handle
it if the host supports it.
This is a partial revert of b634030, i.e. we keep the definition of the
property, but we don't set it only based on the shader info.
Fixes: b634030542
tgsi: Add SEPARABLE_PROGRAM property
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17344>
We can produce slightly better code for these in the backend, so
do that. For this we need to:
1. Fix our implementation of uadd_carry (which wasn't used) to return
an integer instead of a boolean value.
2. Add an implementation of usub_borrow.
Notice these are only used in Vulkan. In GL these instructions are
always unconditionally lowered by the state tracker in GLSL IR so
we never get to see them in the backend.
Shader-db stats from a collection of Vulkan samples:
total instructions in shared programs: 122351 -> 122345 (<.01%)
instructions in affected programs: 196 -> 190 (-3.06%)
helped: 2
HURT: 0
total uniforms in shared programs: 18670 -> 18672 (0.01%)
uniforms in affected programs: 59 -> 61 (3.39%)
helped: 0
HURT: 2
total max-temps in shared programs: 13145 -> 13147 (0.02%)
max-temps in affected programs: 27 -> 29 (7.41%)
helped: 0
HURT: 2
total inst-and-stalls in shared programs: 123052 -> 123046 (<.01%)
inst-and-stalls in affected programs: 197 -> 191 (-3.05%)
helped: 2
HURT: 0
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17372>
These opcodes where fixed to return an integer instead of a boolean
value some time ago but the documentation for them was not updated
and still talked about a boolean result.
Fixes: b0d4ee520 ('nir/opcodes: Fix up uadd_carry and usub_borrow')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17372>
The thread dispatch SEND instructions will dispatch new threads
immediately even before the caller of the SEND instruction has reached
EOT. So we really need to make sure all the memory writes are visible
to other threads within the DSS before the SEND.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15755>