The logic in st_atom_shader.c leads me to believe this was supposed
to work, but the implementation was never actually finished. This fixes
compatibility tess tests on d3d12.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14662>
Indentation fail. This should happen once per instruction, not once per
destination. In theory, this is a minor performance win; in practice,
it's simply less wrong.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Icecream95 <ixn@disroot.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14575>
The spec says only polygons, not points/lines, should be culled when
culling is enabled. The hardware does not make this distinction, so we
have to.
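
A minimal sketch of the distinction the driver has to make in software.
The names below (prim_kind, effective_cull) are hypothetical and are not
taken from the driver; the real code works on Gallium's primitive enum.

```c
#include <stdbool.h>

/* Hypothetical primitive classes, for illustration only. */
enum prim_kind { PRIM_POINTS, PRIM_LINES, PRIM_TRIANGLES };

/* Decide whether culling should be programmed for a draw.
 * Only polygons may be culled; points and lines never are,
 * even when the API cull state is enabled. */
static inline bool
effective_cull(bool cull_enabled, enum prim_kind kind)
{
   return cull_enabled && kind == PRIM_TRIANGLES;
}
```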
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Icecream95 <ixn@disroot.org>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14575>
Use a Gallium helper that papers over the differences between primitive
types, as required by hardware operation.
[Cc'd to mesa-stable for use in the next commit.]
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14575>
We are going to need to extend the cache key to add state that affects
the program stateobj, but not necessarily the shader itself (i.e.
ir3_shader_key wouldn't be the correct place to add it).
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14643>
Lowered clip planes should respect the enabled/disabled GL_CLIP_PLANEn
(aka GL_CLIP_DISTANCEn), which means updating the rast state as well.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14643>
This lets us support indirect access to UBOs easily. The existing
constant special case disappears too, since the peephole optimizer can
inline the constant later. (note: this is too conservative since we can
go up to 16-bit immediates...)
Unfortunately, nir_opt_algebraic can't seem to optimize expressions like
"((a << 3) + 4) >> 2" to "(a << 1) + 1" which would be necessary for
reasonable perf out of this...
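
For reference, the two forms quoted above can be compared directly. This
is only a sketch with made-up function names; it shows that the forms
agree as long as "a << 3" does not wrap in 32 bits (a < 2^29), and differ
once it does.

```c
#include <stdint.h>

/* Form as written in the message: ((a << 3) + 4) >> 2 */
static inline uint32_t
addr_unoptimized(uint32_t a)
{
   return ((a << 3) + 4) >> 2;
}

/* Desired simplified form: (a << 1) + 1 */
static inline uint32_t
addr_simplified(uint32_t a)
{
   return (a << 1) + 1;
}
```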
Fixes:
dEQP-GLES2.functional.shaders.indexing.uniform_array.float_dynamic_loop_read_fragment
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14581>
Based on the new vk_sync functions.
Copied the version from anv as that seemed more thorough by using the
temporary sync payload. However, that does mean we have to use the vk_sync
functions instead of being able to layer it on top of the dispatch table.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14365>
This change introduces the anv_descriptor_size_for_mutable_type and
anv_descriptor_data_for_mutable_type helpers to compute the size and
data flags respectively for mutable descriptor types.
In order to make handling these types easier we now store a precomputed
descriptor stride for all types and use it in the appropriate places.
We also need to adjust the compiler to take into account this descriptor
stride. To that end, we now pack the stride into the upper 16 bits
alongside the index and the dynamic offset index and use it later to
compute the correct offset.
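
A rough sketch of the packing idea. Only "stride in the upper 16 bits" is
from the description above; the split of the lower 16 bits between the
index and the dynamic offset index, and all function names, are
assumptions made for illustration.

```c
#include <stdint.h>

/* ASSUMED layout, for illustration only:
 *   bits 31..16  descriptor stride
 *   bits 15..8   dynamic offset index
 *   bits  7..0   index
 */
static inline uint32_t
pack_desc_info(uint32_t stride, uint32_t dyn_offset_idx, uint32_t index)
{
   return ((stride & 0xffff) << 16) |
          ((dyn_offset_idx & 0xff) << 8) |
          (index & 0xff);
}

/* Recover the stride from the upper 16 bits. */
static inline uint32_t
unpack_stride(uint32_t packed)
{
   return packed >> 16;
}

/* Offset of array element array_index, using the packed stride. */
static inline uint32_t
desc_offset(uint32_t packed, uint32_t base_offset, uint32_t array_index)
{
   return base_offset + array_index * unpack_stride(packed);
}
```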
Closes: #4250
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14633>
Simplify the buffer chaining process with a single loop and
a helper function from Lionel Landwerlin's input.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Jianxun Zhang <jianxun.zhang@linux.intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14578>
Without this the simulator wrapper will abort upon seeing this
query, rendering the driver unusable in that context.
Also, it seems the simulator environment doesn't quite work with
multisync at present, so do not enable it until we figure out what
the issue is.
Reviewed-by: Melissa Wen <mwen@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14678>
iris_resource_bo() is convenient when we only have a pipe_resource *
variable, and don't need to do a lot with it other than get at the BO.
When we need to do more with a resource, we usually cast it to
iris_resource *, at which point we can just use res->bo instead.
This patch updates iris_copy_region to use src_res->bo, dst_res->bo,
rather than iris_resource_bo(src) and iris_resource_bo(dst), since we
already have those cast versions on hand.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14667>
This should be safe as long as the driver restores descriptors and
push constants correctly for compute pipelines.
This might also reduce the number of compute pipeline changes, e.g.
when consecutive subpasses perform fast clears with compute.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14616>
While the VRS image can't have mips (and no layers, because they are
still not supported by RADV), applications might still want to bind a
depth/stencil attachment where the base level isn't 0.
Found by inspection.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14518>
If the application binds a fragment shading rate attachment to a
subpass and also clears the depth stencil attachment, the VRS rates
would have been reinitialized to 1x1 with fast clears. It makes more
sense to clear and then copy instead of the opposite.
Found by inspection.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14518>
Add L1 cache control bit field to RENDER_SURFACE_STATE and
STATE_BASE_ADDRESS instruction.
v1: (Jason)
- Add prefix to bit field
- Don't miss out STATE_BASE_ADDRESS instruction
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14676>
If the file ptr is not NULL then foz_destroy will also try to destroy it.
Fixes: eca6bb9540 ("util/fossilize_db: add basic fossilize db util to read/write shader caches")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14671>
Implements a workaround for HSDES#14014414195. Note that this change
deviates heavily from the workaround suggested in the HSDES, since all
of the suggestions are either costly at runtime or outright
non-compliant, so they would require us to apply the workaround
selectively for affected applications.
Instead of adding hacks to the compiler that manually implement the
LOD computation of 3D texturing operations in the shader, initialize
an extra sampler state structure in the driver that has anisotropic
filtering forced off, and use it instead of the normal sampler state
structure whenever a 3D texture is bound to the same sampler unit.
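
The selection described above could look roughly like this. The struct
and function names are hypothetical stand-ins, not the driver's actual
types; the integers merely represent the two pre-built sampler states.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a sampler object holding two variants. */
struct demo_sampler {
   int normal_state;    /* regular hardware sampler state */
   int no_aniso_state;  /* copy with anisotropic filtering forced off */
};

/* Pick the variant to emit for a sampler unit, based on whether
 * a 3D texture is bound to that same unit. */
static inline int
select_sampler_state(const struct demo_sampler *s, bool texture_is_3d)
{
   return texture_is_3d ? s->no_aniso_state : s->normal_state;
}
```

Building both variants up front keeps the per-draw cost to a single
selection, rather than recomputing state or changing the shader.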
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14489>
For non-discard writes, we need to make sure that the resource has valid contents
so they can be *updated*, not overwritten.
We have to skip this when doing asynchronous maps, but that should be okay, because
the threading context should only do asynchronous map-write when the resource is
known to be idle/empty.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14624>
In the case of a multi-stream GS that is attempting to output wide
points to stream 0, we can support this by lowering stream 0 to
triangles and then removing the other streams. This is only valid
to do if the other streams are not being written to stream output,
either if they're not present in the SO info or no buffer is bound.
Fixes the arb_gpu_shader5/arb_gpu_shader5-emitstreamvertex_nodraw
piglit test which does this.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14624>
I couldn't find this in a spec but the builtin-gl-sample-mask piglit
seems to expect writing to the output sample mask to do nothing when
max num samples == 0.
The ForcedSampleCount property should make everything appear as if
MSAA is disabled. However, it's undefined behavior if depth is
bound, so in that case, we can at least use a lowering pass to
make things *look* like MSAA is off, unless you use atomics to
count invocations.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14624>