this works around most cts coverage which violates spec by creating a view
sized using a range that isn't a multiple of the format's blocksize
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12631>
Previously, we misunderstood how conditional modifiers and saturate
interacted. We thought the condition was evaulated before the saturate
was applied. For the floating point cases, we went to some heroics to
modify the condition to maintain the same results. For integer cases,
it was not clear that this could even work. We had no use-cases and no
tests-cases, so we just disallowed everything.
Now we understand that the condition is evaluated after the saturate.
Earlier commits in this series removed the various floating point
heroics. It is easier to just delete the code that prevents some cases
that just work.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
Originally this was part of "intel/fs: Remove condition-based
restriction for cmod propagation to saturated operations". With some
additional changes to that commit, it caused a lot of extra churn in the
unit tests. I felt that made it harder to see the actual changes in the
unit tests, so I split it out.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
I don't know why the float_saturate_l_mov test was #if'ed out, but it
passes... so this commit enables it.
No shader-db or fossil-db changes.
In a previous iteration of this MR, this commit helped ~200 shaders in
shader-db. Now all of those same shaders are helped by "intel/fs: cmod
propagate from MOV with any condition". All of these shaders come from
Mad Max. After initial translation from NIR to assembly, these shader
contain patterns like:
mul(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */
...
mov.sat(8) g90<1>F g90<8,8,1>F
...
cmp.nz.f0(8) null<1>F g90<8,8,1>F 0 /* 0F */
An initial pass of cmod propagation converts this to
mul(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */
...
mov.sat.XX.f0(8) g90<1>F g90<8,8,1>F
Without this commit, XX is G. With this commit, XX is NZ. Saturate
propagation moves the saturate:
mul.sat(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */
...
mov.XX.f0(8) g90<1>F g90<8,8,1>F
Without this commit (but with "intel/fs: cmod propagate from MOV with
any condition"), the G gets propagated:
mul.sat.g.f0(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */
With this commit (with or without "intel/fs: cmod propagate from MOV
with any condition"), the NZ gets propagated:
mul.sat.nz.f0(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
There were tests related to propagating conditional modifiers from a MOV
to an instruction with a .SAT modifier for a very long time, but they
were #if'ed out.
There are restrictions later in the function that limit the kinds of MOV
instructions that can propagate. This avoids the dangers of
type-converting MOVs that may generate flags in different ways.
v2: Update the added comment to look more like the existing comment.
That makes the small differences between the two cases more obvious.
Noticed by Marcin.
All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19827127 -> 19826924 (<.01%)
instructions in affected programs: 62024 -> 61821 (-0.33%)
helped: 201
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.01 x̃: 1
helped stats (rel) min: 0.13% max: 0.60% x̄: 0.35% x̃: 0.36%
95% mean confidence interval for instructions value: -1.02 -1.00
95% mean confidence interval for instructions %-change: -0.36% -0.34%
Instructions are helped.
total cycles in shared programs: 954655879 -> 954655356 (<.01%)
cycles in affected programs: 1212877 -> 1212354 (-0.04%)
helped: 155
HURT: 6
helped stats (abs) min: 1 max: 6 x̄: 3.65 x̃: 4
helped stats (rel) min: <.01% max: 0.17% x̄: 0.07% x̃: 0.07%
HURT stats (abs) min: 2 max: 12 x̄: 7.00 x̃: 8
HURT stats (rel) min: 0.04% max: 0.23% x̄: 0.14% x̃: 0.15%
95% mean confidence interval for cycles value: -3.60 -2.90
95% mean confidence interval for cycles %-change: -0.07% -0.05%
Cycles are helped.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
...and rename it to brw_reg_type_is_unsigned_integer. It is now next to
brw_reg_type_is_floating_point and brw_reg_type_is_integer.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
We were previously operating with the mindset "a MOV is just a compare
with zero." As a result, we were trying to share as much code between
the MOV path and the CMP path as possible. However, MOV instructions
can perform type conversions that affect the result of the comparison.
There was some code added to better handle this for cases like
and(16) g31<1>UD g20<8,8,1>UD g22<8,8,1>UD
mov.nz.f0(16) null<1>F g31<8,8,1>D
The flaw in these changed special cases is that it allowed things like
or(8) dest:D src0:D src1:D
mov.nz(8) null:D dest:F
Because both destinations were integer types, the propagation was
allowed. The source type of the MOV and the destination type of the OR
do not match, so type conversion rules have to be accounted for.
My solution was to just split the MOV and non-MOV paths with completely
separate checks. The "else" path in this commit is basically the old
code with the BRW_OPCODE_MOV special case removed.
The new MOV code further splits into "destination of scan_inst is float"
and "destination of scan_inst is integer" paths. For each case I
enumerate the rules that I belive apply. For the integer path, only the
"Z or NZ" rules are listed as only NZ is currently allowed (hence the
conditional_mod assertion in that path). A later commit relaxes this
and adds the rule.
The new rules slightly relax one of the previous rules. Previously the
sizes of the MOV destination and the MOV source had to be the same. In
some cases now the sizes can be different by the following conditions:
- Floating point to integer conversion are not allowed.
- If the conversion is integer to floating point, the size of the
floating point value does not matter as it will not affect the
comparison result.
- If the conversion is float to float, the size of the destination
must be greater than or equal to the size of the source.
- If the conversion is integer to integer, the size of the destination
must be greater than or equal to the size of the source.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
Of particular interest are the tests where the MOV performs a type
conversion. If the restriction on conditional modifier for a MOV is
ever relaxed, some of these cases must still be disallowed.
v2: s/NZ/Z/ in one of the comments. Notice by Marcin.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
This foreach_inst_in_block_reverse_starting_from loop only applies
CMP, MOV, and AND. AND instructions break out of the loop before this
point.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
This will simplify some later changes to these tests.
v2: Combine test_positive_saturate_prop and test_negative_saturate_prop
into a single function. Suggested by Marcin.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
==244108== Conditional jump or move depends on uninitialised value(s)
==244108== at 0x48498D5: bcmp (vg_replace_strmem.c:1129)
==244108== by 0x1C37B7DD: radv_bind_dynamic_state (radv_cmd_buffer.c:237)
==244108== by 0x1C388027: radv_CmdBindPipeline (radv_cmd_buffer.c:4794)
==244108== by 0x14E9C01E: bool update_gfx_pipeline<true>(zink_context*, zink_batch_state*, pipe_prim_type) (zink_draw.cpp:406)
==244108== by 0x14E9AAB9: void zink_draw_vbo<(zink_multidraw)1, (zink_dynamic_state)1, (zink_dynamic_state2)1, (zink_dynamic_vertex_input)1, true>(pipe_cont>
==244108== by 0x14B017EB: tc_call_draw_single (u_threaded_context.c:3033)
==244108== by 0x14AF9C0E: tc_batch_execute (u_threaded_context.c:190)
==244108== by 0x14AFA24F: _tc_sync (u_threaded_context.c:341)
==244108== by 0x14B006E7: tc_texture_subdata (u_threaded_context.c:2549)
==244108== by 0x14238F8C: st_TexSubImage (st_cb_texture.c:2134)
==244108== by 0x14239931: st_TexImage (st_cb_texture.c:2363)
==244108== by 0x1453698A: teximage (teximage.c:3154)
==244108== by 0x1453698A: teximage_err (teximage.c:3181)
==244108== by 0x145388BD: _mesa_TexImage2D (teximage.c:3252)
==244108== by 0x5E88D4: ??? (in /home/zmike/src/piglit/tesseract/bin_unix/linux_64_client)
==244108== by 0x5E9527: ??? (in /home/zmike/src/piglit/tesseract/bin_unix/linux_64_client)
==244108== by 0x5E9B72: ??? (in /home/zmike/src/piglit/tesseract/bin_unix/linux_64_client)
==244108== by 0x5F1092: ??? (in /home/zmike/src/piglit/tesseract/bin_unix/linux_64_client)
==244108== by 0x5F10AC: ??? (in /home/zmike/src/piglit/tesseract/bin_unix/linux_64_client)
==244108== by 0x48CC66: ??? (in /home/zmike/src/piglit/tesseract/bin_unix/linux_64_client)
==244108== by 0x48DDC7: ??? (in /home/zmike/src/piglit/tesseract/bin_unix/linux_64_client)
==244108== by 0x40D525: ??? (in /home/zmike/src/piglit/tesseract/bin_unix/linux_64_client)
==244108== by 0x4FF7B74: (below main) (in /usr/lib64/libc-2.33.so)
==244108== Uninitialised value was created by a stack allocation
==244108== at 0x14ECDF55: zink_create_gfx_pipeline (zink_pipeline.c:53)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12618>
the specified stride is irrelevant for this case since there's only one
result to write
Cc: mesa-stable
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12619>
Everything should be already supported, except patch list.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12611>
This patch basically maps pipe_map_flags to pb_flags. Since we are mapping it,
STATIC_ASSERTS won't be required.
Fixes: 00c30dad78 ("gallium: renumber PIPE_MAP_* enums to remove holes")
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12606>
Allow to clear range of layers with vkCmdClear{Color,DepthStencil}Image().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12557>
The lowered LS and NGG stages use local_invocation_index and they
can benefit from the unsigned upper bound because they can emit a
less expensive integer multiplication instruction.
This was working in the past, but accidentally borked by a refactor.
Fossil DB changes on Sienna Cichlid:
Totals from 956 (0.74% of 128647) affected shaders:
CodeSize: 2354172 -> 2344712 (-0.40%)
Instrs: 434359 -> 434327 (-0.01%)
Latency: 1883949 -> 1876814 (-0.38%)
InvThroughput: 762638 -> 757405 (-0.69%)
Fossil DB changes on Sienna Cichlid (with NGGC enabled):
Totals from 57873 (44.99% of 128647) affected shaders:
CodeSize: 155844192 -> 155607064 (-0.15%)
Instrs: 29799184 -> 29799152 (-0.00%)
Latency: 130959764 -> 130814224 (-0.11%); split: -0.11%, +0.00%
InvThroughput: 21100300 -> 20928635 (-0.81%); split: -0.81%, +0.00%
Fixes: 8af6766062
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12558>
Consider the following sequence in a shader:
b = p_extract a
c = v_mad_u32_u16 b, X, 0
The optimizer applies extract, resulting in:
c = v_mad_u32_u16 a, X, 0 (correct)
Then it mistakenly turns that into:
c = v_mul_u32_u24 a, X, 0 (incorrect)
In this case, the p_extract is applied to v_mad_u32_u16 by
apply_extract. After this, we can no longer be sure that
the operands are still 16 or 24-bit, so we have to remove
this flag.
No Fossil DB changes.
Fixes: 54292e99c7
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12558>
We still need to implement quite some functionality before it would make
sense to run dEQP in CI, but it will be already useful to build-test it.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12612>
Images that are fast cleared with the comp-to-single mode clears DCC
to 0x10 which tells the hardware to get the clear value from the
main surface instead of the reg.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12565>
This allocate the mit-shm pixmap instead of dri3 pixmaps and
uses the present paths when mit-shm is enabled
Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12482>
This pipes the allocation of the MIT-SHM pixmap into the wsi common
code to callback to the x11 path to allocate things in the right place.
Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12482>
This just adds the xcb bits to detect is the host supports shared
shm pixmaps or whether the old paths should be used.
shm pixmaps will only be used if dri3 is available
Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12482>
These won't work since a workgroup can span more than one thread, and
the temporaries are not shared memory.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
Mesh shader outputs are either:
- non-array builtins
- array builtins that are either per-primitive or per-vertex
- user-defined outputs that must be either per-primitive or per-vertex
So we can identify any array output as "arrayed" for the purposes of
I/O lowering.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
Per-primitive is similar to per-vertex attributes, but applies to all
fragments of the primitive without any interpolation involved.
Because they are regular input and outputs, keep track in shader_info
of which I/O is per-primitive so we can distinguish them after deref
lowering. These fields can be used combined with the regular
`inputs_read`, `outputs_written` and `outputs_read`.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
Observed in a shader from Resident Evil Village.
This also helps prevent emitting invalid IR.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12599>