Commit Graph

36032 Commits

Author SHA1 Message Date
Vasily Khoruzhick 34a75ce15c lima: fix blending with min/max ops
It turns out that BLEND_MIN and BLEND_MAX in Utgard take blend factors
into account. My guess is that the actual equation looks like:

OP(As * S + Ad * D, Ad) for alpha, and
OP(Cs * S + Cd * D, Cd) for color.

So we have to set S factor to 1 and D factor to 0 to be compliant with
GL spec.
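
The guessed equation can be illustrated with a small single-channel model. This is only a sketch: the function names are hypothetical, this is not the lima driver code, and the equation itself is a guess about the hardware.

```c
/* Hypothetical model of the guessed Utgard MIN equation for one channel:
 * OP(src * S + dst * D, dst), where S and D are the blend factors. */
static float
utgard_blend_min(float src, float dst, float s_factor, float d_factor)
{
   float lhs = src * s_factor + dst * d_factor;
   return lhs < dst ? lhs : dst;
}

/* What GL expects for MIN: the blend factors are ignored. */
static float
gl_blend_min(float src, float dst)
{
   return src < dst ? src : dst;
}
```

With S = 1 and D = 0 the first operand collapses to the source value, so the model matches the GL result; with any other factors the two can diverge.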

Fixes following piglit tests:
spec@!opengl 1.4@blendminmax
spec@arb_blend_func_extended@arb_blend_func_extended-fbo-extended-blend
(with my patch for ES2_compatibility and EXT_blend_func_extended)

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13873>
2021-11-29 19:31:59 +00:00
Vasily Khoruzhick 5f9434b611 lima: use 1 as blend factor for dst_alpha for SRC_ALPHA_SATURATE
As per [1] alpha blend factors for Sa and Da should be 1 for
SRC_ALPHA_SATURATE

[1] https://www.khronos.org/registry/OpenGL/extensions/ARB/ARB_blend_func_extended.txt
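
As a rough illustration of the factor in question, assuming the usual GL definition where the RGB factor is min(As, 1 - Ad) and the alpha factor is 1 (the struct and function names are hypothetical):

```c
/* Hypothetical pair of blend factors for one blend input. */
struct blend_factor {
   float rgb;
   float alpha;
};

/* SRC_ALPHA_SATURATE: RGB uses min(As, 1 - Ad), alpha uses 1. */
static struct blend_factor
src_alpha_saturate(float src_alpha, float dst_alpha)
{
   float inv_dst = 1.0f - dst_alpha;
   struct blend_factor result;

   result.rgb = src_alpha < inv_dst ? src_alpha : inv_dst;
   result.alpha = 1.0f;
   return result;
}
```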

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13873>
2021-11-29 19:31:59 +00:00
Vasily Khoruzhick d1d3ebb48c lima: implement dual source blend
It was a bit trickier to RE, since blob doesn't expose this
functionality at all, however we had a clue from the very beginning:
lima_blend_factor is 3 bits, i.e. 8 values, but only 5 of them were
used, it just waited till someone tried what 3 unused values do.

Interestingly enough, it turns out "5" works just as "0" (which is
PIPE_BLENDFACTOR_*SRC_*), but only if output register for gl_FragColor
is $0. So it looks suspiciously similar to PIPE_BLENDFACTOR_*SRC1_*
behavior, and looks like secondary output is taken from $0.

Since output regs for all other outputs are configured via RSW, there
must be a field in RSW for output register for secondary color, it's
likely 4 bits and it's currently set to 0 for reg $0.

Then it was just a matter of brute-forcing various consecutive 4 bits
in RSW - and indeed, setting top 4 bits of rsw->aux0 to the index of
gl_FragColor output register fixes blending tests when we use "5"
blend factor instead of "0".

So it must be a register number for gl_SecondaryFragColor. Unlike
gl_FragColor, the field is only repeated once in RSW.
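
Writing a 4-bit register index into the top bits of an RSW word could look roughly like this. The helper name is hypothetical and the 32-bit width of aux0 is an assumption, not something stated above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store a 4-bit register index in the top 4 bits
 * of a 32-bit word, leaving the other bits untouched. */
static uint32_t
set_secondary_output_reg(uint32_t aux0, unsigned reg)
{
   assert(reg < 16);                 /* the field is only 4 bits wide */
   aux0 &= ~(UINT32_C(0xf) << 28);   /* clear the old register index */
   aux0 |= (uint32_t)reg << 28;      /* write the new one */
   return aux0;
}
```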

Wire it up in compiler, and piglit arb_blend_func_extended now passes.

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13873>
2021-11-29 19:31:59 +00:00
Vasily Khoruzhick b8f4d36ee4 lima: disasm: call util_cpu_detect() to init CPU caps
It's needed by _mesa_half_to_float(), without this change it hits
assertion failure in util_get_cpu_caps().

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13968>
2021-11-29 18:34:58 +00:00
Vasily Khoruzhick 711a4ccddb lima: disasm: use last argument as a filename
Otherwise it fails to open a file.

Fixes: 9660427ab7 ("lima: Print usage if --help is any of the arguments.")
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13968>
2021-11-29 18:34:58 +00:00
Vasily Khoruzhick 437b97de1c lima: fix crash with sparse samplers
Fixes following piglit tests:
spec@arb_fragment_program@fp-fragment-position
spec@arb_fragment_program@sparse-samplers

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13939>
2021-11-29 18:19:19 +00:00
Marek Olšák 1df7c0ce7e radeonsi: print the shader stage for shader-db dumps
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13869>
2021-11-26 11:58:27 +00:00
Marek Olšák 59926f25fa radeonsi: print source_sha1 as part of shader dumps
It's not part of the shader key, but I don't know where else to put it.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13869>
2021-11-26 11:58:27 +00:00
Marek Olšák cd86f1dc2b radeonsi: rename si_get_shader_wave_size and make it non-inline
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13878>
2021-11-26 11:35:05 +00:00
Marek Olšák 676d4ddcf8 radeonsi: centralize wave size computation in si_get_shader_wave_size
The big comment was not really true.

The other debug options are unused right now, but will be used again
in the future.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13878>
2021-11-26 11:35:05 +00:00
Marek Olšák b5665bd46c radeonsi: don't use compute_wave_size directly
It will be removed.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13878>
2021-11-26 11:35:05 +00:00
Marek Olšák 1ef027851d radeonsi: propagate si_shader::wave_size to VGT_SHADER_STAGES
instead of hardcoding them

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13878>
2021-11-26 11:35:05 +00:00
Marek Olšák 913e1b9138 radeonsi: clean up compute_wave_size use in si_compute_blit.c
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13878>
2021-11-26 11:35:05 +00:00
Marek Olšák 8290cae2b7 radeonsi: don't use si_get_wave_size in si_get_ir_cache_key
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13878>
2021-11-26 11:35:05 +00:00
Marek Olšák d08b09cb7e radeonsi: use si_shader::wave_size
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13878>
2021-11-26 11:35:05 +00:00
Marek Olšák bc57488936 radeonsi: add si_shader::wave_size because it will vary
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13878>
2021-11-26 11:35:05 +00:00
Marek Olšák 41523773f5 radeonsi: add wave32 flag into prolog/epilog keys
It will vary between shaders.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13878>
2021-11-26 11:35:05 +00:00
Marek Olšák 3b2a6e1b21 radeonsi: don't print uninitialized inlined_uniform_values
We don't set them and we don't read them if they are disabled, so don't
print them either. This silences valgrind warnings.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13878>
2021-11-26 11:35:05 +00:00
Marius Hillenbrand a46d155329 util/cpu_detect, gallium: use cpu_family CPU_S390X instead of separate flag
to also get rid of the additional function that I introduced before.

Fixes: 82b261417e ("util/cpu_detect: Add flag for IBM Z (s390x)")

Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13958>
2021-11-25 12:57:20 +00:00
Vasily Khoruzhick 3b15fb3575 lima/ppir: implement gl_FragDepth support
Mali4x0 supports writing depth and stencil from fragment shader
and we've been using it for quite a while for depth/stencil buffer reload.

The missing part was specifying output register for depth/stencil.
To figure it out, I changed reload shader to use register $4 as output
and poked RSW bits (or rather consecutive 4 bit groups) until tests
that rely on reload started to pass again.

It turns out that register number for gl_FragDepth/gl_FragStencil is in
rsw->depth_test and register number for gl_FragColor is in
rsw->multi_sample and it's repeated 4 times for some reason (likely for
MSAA?)

With this knowledge we now can modify ppir compiler to support multiple
store_output intrinsics.

To do that just add destination SSA for store_output to the registers
list for regalloc and mark them explicitly as output. Since it's never
read in shader we have to take care about it in liveness analysis -
basically just mark it alive from the time when it's written to the end
of the block. If it's live only in the last instruction, mark it as
live_internal, so regalloc doesn't clobber it.

Then just let regalloc do its job, and then copy register number to the
shader state and program it in RSW.

The tricky part is gl_FragStencil, since it resides in the same register
as gl_FragDepth and with the current design of the compiler it's hard to
merge them. However gl_FragStencil doesn't seem to be part of GL2
or GLES2, so we can just leave it not implemented.

Also we need to take care of stop bit for instructions - now we can't
just set it in every instruction that stores output, since there may be
several outputs. So if there's any store_output instructions in the
block just mark that block has a stop, and set stop bit in the last
instruction in the block. The only exception is discard - we always need
to set stop bit in discard instruction.

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13830>
2021-11-24 02:26:08 +00:00
Vasily Khoruzhick 98a7c4c6f8 lima/ppir: check if mul node is a source of add node before inserting
We can't insert mul node into add node instruction if it's a virtual dep
(sequence or write_or_read dep), so use ppir_node_has_single_src_succ
in addition to ppir_node_has_single_succ.

We can't use ppir_node_has_single_src_succ alone, since node may have
a virtual dependency in addition to source dependency, and we can't
insert it either in this case.

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13830>
2021-11-24 02:26:08 +00:00
Thomas H.P. Andersen 64292c0f05 svga: fix bitwise/logical and mixup
The function need_temp_reg_initialization looks suspicious.

It will only ever return true if we get past this if:
if (!(emit->info.indirect_files && (1u << TGSI_FILE_TEMPORARY)) ...

Using the logical && means the intended initialization done
based on the result of this check is not performed.
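
The difference between the two operators can be shown in isolation. The function names below are hypothetical and the file index is passed as a plain parameter rather than using the real TGSI enum.

```c
#include <stdbool.h>

/* Buggy form: `&&` only asks whether both operands are non-zero. The
 * shifted mask is never zero, so any non-zero bitfield yields true. */
static bool
has_file_logical(unsigned files, unsigned file)
{
   return files && (1u << file);
}

/* Intended form: `&` tests the one bit that belongs to `file`. */
static bool
has_file_bitwise(unsigned files, unsigned file)
{
   return (files & (1u << file)) != 0;
}
```

For a bitfield that has only bit 1 set, asking about file 4 gives true with the logical form and false with the bitwise form.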

This code was both introduced and altered in MR 5317.
ccb4ea5a introduces the function.
ba37d408 is a collection of performance improvements and misc
fixes. This altered the if from using bitwise to logical and.

This commit changes it back to bitwise.

Spotted from a compile warning.

Fixes: ba37d408da ("svga: Performance fixes")

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12157>
2021-11-24 01:59:36 +00:00
Marius Hillenbrand c5d6e57e42 llvmpipe: Use lp_build_round_arch on IBM Z (s390x)
LLVM has all the required intrinsics available on IBM Z, so use them for
rounding operations (they will be implemented as a single instruction).
This change makes the test case lp_test_arit pass, because it avoids
using the buggy generic code.

v2: update .gitlab-ci/cross-xfail-s390x to reflect passing lp_test_arit

Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13927>
2021-11-23 17:49:02 +00:00
Antonio Caggiano 902c5bf468 virgl: Link shader program
Add a new command associated to glLinkProgram. With this we should be
able to compile and link shaders when requested by the user, thus
avoiding that to happen in the middle of a frame.

Together with the command we pass an array of shader handles attached to
the program, where each position of the array corresponds to a pipe
shader type.

Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13674>
2021-11-23 16:14:16 +00:00
Ilia Mirkin bb6fb6065f freedreno/a[345]xx: fix unorm/snorm blend factors when they're "over"
The float value may be out of range, so must be clamped to the allowed
range. Unclear if a3xx also has a SNORM factor that we're just missing
there, but that will be a separate investigation.
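
A minimal sketch of such a clamp, assuming the allowed ranges are [0, 1] for UNORM and [-1, 1] for SNORM (the function name is hypothetical and this is not the freedreno code):

```c
#include <stdbool.h>

/* Clamp a float blend constant to the range the render target format
 * can represent: [0, 1] for UNORM, [-1, 1] for SNORM. */
static float
clamp_blend_constant(float value, bool is_snorm)
{
   float lo = is_snorm ? -1.0f : 0.0f;

   if (value < lo)
      return lo;
   if (value > 1.0f)
      return 1.0f;
   return value;
}
```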

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13903>
2021-11-22 18:09:44 +00:00
Ilia Mirkin 43f94ee9f1 freedreno/a5xx: add missing L8A8_UNORM format to support TBOs
Fixes arb_texture_buffer_object-formats test.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13906>
2021-11-22 17:44:59 +00:00
Ilia Mirkin c87967bf17 freedreno/a4xx: add some missing legacy formats to help TBOs
Unlike with regular textures, we really have to support all the formats
directly for TBOs to work properly. Add the missing formats to fix
arb_texture_buffer_object-formats piglit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13906>
2021-11-22 17:44:59 +00:00
Ilia Mirkin 5a69f34aeb freedreno/a4xx: add missing SNORM formats to help tests pass
Otherwise some of these fall back to RGBA_SNORM, which can screw up
blend factors.

Fixes spec@ext_texture_snorm@fbo-blending-formats.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13904>
2021-11-22 17:18:56 +00:00
Marek Olšák cdeecadcb6 radeonsi: deduplicate min_esverts code in gfx10_ngg_calculate_subgroup_info
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13829>
2021-11-20 00:03:45 +00:00
Marek Olšák 9d7ac70ffb radeonsi: implement shader culling in GS
It already does compaction, so we just need to load vertex positions
and cull. This was easier than expected.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13829>
2021-11-20 00:03:45 +00:00
Marek Olšák 492a61fe72 radeonsi: don't use ctx.stage outside of si_llvm_translate_nir
si_llvm_translate_nir() changes ctx.stage, so the outside code shouldn't
use it. This hasn't caused any issues yet. Since ctx.stage starts as 0,
the first use in this commit was a tautology.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13829>
2021-11-20 00:03:45 +00:00
Marek Olšák 1c5899900d radeonsi: simplify si_get_vs_key_outputs for GS
ngg_culling is always 0 when GS is enabled. This will change in the future.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13829>
2021-11-20 00:03:45 +00:00
Marek Olšák a368385b23 radeonsi: add is_gs parameter into si_vs_needs_prolog
and disable the VS prolog code for GS.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13829>
2021-11-20 00:03:45 +00:00
Marek Olšák f96d1757bb radeonsi: restructure code that declares merged VS-GS and TES-GS SGPRs
no change in the SGPR layout

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13829>
2021-11-20 00:03:45 +00:00
Marek Olšák 2418da2d4a radeonsi: separate culling code from VS/TES (to be reused by GS)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13829>
2021-11-20 00:03:45 +00:00
Jesse Natalie b8f41c5c4e d3d12: Validate opened D3D12 resource matches pipe template
Unlike Linux dma-bufs, D3D12 resources are strongly typed, and
can't necessarily just reinterpret the memory arbitrarily.

Allow importing resources with no description coming from the frontend,
and populate the resource desc from the driver instead. If there was
a template, make sure that it matches the incoming resource.

Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13054>
2021-11-19 22:54:46 +00:00
Jesse Natalie 9740141b2e d3d12: Generate a pipe format -> typeless mapping table too
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13054>
2021-11-19 22:54:46 +00:00
Jesse Natalie ca7d4fcb3f d3d12: Generate format table using a macro list
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13054>
2021-11-19 22:54:46 +00:00
Jesse Natalie 25bcc56027 d3d12: Make format list all use macros
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13054>
2021-11-19 22:54:46 +00:00
Jesse Natalie 96012b686e d3d12: Handle import/export of fd shared handles
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13054>
2021-11-19 22:54:46 +00:00
Jesse Natalie 2771fd4a3f gallium, windows: Use HANDLE instead of FD for external objects
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13054>
2021-11-19 22:54:46 +00:00
Jesse Natalie 2188607014 d3d12: Support RGBX formats mapped to RGBA
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13054>
2021-11-19 22:54:46 +00:00
Jesse Natalie ab9948997a d3d12: Support PIPE_CAP_MIXED_COLOR_DEPTH_BITS
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13054>
2021-11-19 22:54:46 +00:00
Jesse Natalie e0576ec148 d3d12: Support BGRA 555 and 565 formats
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13054>
2021-11-19 22:54:46 +00:00
Mike Blumenkrantz 81cc94b8f0 zink: be consistent about waiting on context queue on context destroy
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13885>
2021-11-19 18:56:10 +00:00
Mike Blumenkrantz e92b8956c7 zink: set batch state queue on creation
make this easier to find

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13885>
2021-11-19 18:56:10 +00:00
Emma Anholt b8ffd7a888 freedreno/a5xx: Emit MSAA state for sysmem rendering, too.
This looked obviously wrong, we want to set the sample counts for sysmem
too just like we do on 6xx.  Turns out it fixes some piglits.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13867>
2021-11-19 17:24:11 +00:00
Emma Anholt cad0b6e2e5 freedreno/a6xx: Disable sample averaging on non-ubwc z24s8 MSAA blits.
The fallback path averages unorm textures, but if we don't have ubwc on
either then we can just cast them to uint which then just takes sample 0.

The proper UBWC format I think ends up averaging, though.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13867>
2021-11-19 17:24:11 +00:00
Emma Anholt 93eb697a8d freedreno/a6xx: Disable sample averaging on z/s or integer blits.
We can't generally force fd_blitter_blit() to not average in our fallback
blits, but this should at least help some cases.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13867>
2021-11-19 17:24:11 +00:00
Mike Blumenkrantz 22d9d0f8b5 zink: add a compiler pass to scan for shader image use
other frontends and internal shaders won't set this

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13864>
2021-11-19 13:14:46 +00:00
Mike Blumenkrantz e386a57769 zink: explicitly init glsl
need this to be able to use other frontends

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13864>
2021-11-19 13:14:46 +00:00
Emma Anholt e277b13182 freedreno: Stop exposing MSAA image load/store on desktop GL.
GLES doesn't support it, and blob VK doesn't support it.  We could
theoretically lower it, but don't bother since it's not required.  Fixes
various piglit image load/store tests.

Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13852>
2021-11-18 23:47:58 +00:00
Alyssa Rosenzweig 81d22da6de asahi: Fix BIND_PIPELINE sizing and alignment
Fix a bug in BIND_PIPELINE XML reported by Dougall, which cleans up
a bit of both decoder and driver.

Instead of...

   * 17 bytes BIND_PIPELINE  (17)
   * An unused 8 byte record (25)
   * A set of N 8 byte records (25 + 8 * N)
   * Oops, 1 byte too many! One just disappeared (24 + 8 * N)

It seems to instead be

   * 24 bytes BIND_PIPELINE (24)
   * A set of N 8 byte records (24 + 8 * N)

without the sentinel record. This means the 8 byte records themselves
are shuffled, with the high byte of the pointers split from the low
word, but that's less gross than an off-by-one.
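
The two readings of the layout give the same total size, which is easy to check with the byte counts listed above (the function names are hypothetical):

```c
#include <stddef.h>

/* Old reading: 17-byte header, one unused 8-byte record, N 8-byte
 * records, minus the one byte that "disappeared". */
static size_t
bind_pipeline_size_old(size_t n_records)
{
   return 17 + 8 + 8 * n_records - 1;
}

/* New reading: 24-byte header followed by N 8-byte records. */
static size_t
bind_pipeline_size_new(size_t n_records)
{
   return 24 + 8 * n_records;
}
```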

It's still not clear what the last 8 bytes of the BIND_VERTEX_PIPELINE
structure mean, or the last 4 bytes of the BIND_FRAGMENT_PIPELINE
structure which seems to be a bit shorter.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13784>
2021-11-18 23:35:25 +00:00
Alyssa Rosenzweig a28775046c asahi: Remove silly magic numbers
These are unnecessary now that the structure of agx_map_* is better
understood.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13784>
2021-11-18 23:35:25 +00:00
Alyssa Rosenzweig d55a1a77bd asahi: Fix agx_map_* structures
Dougall Johnson observed these structures make more sense with indices[]
first in the entries and indices[] absent from the header. Then the
sentinel entry disappears, nr_entries makes more sense, and a few magic
numbers pop out. Many thanks to Dougall's astute eyes.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13784>
2021-11-18 23:35:25 +00:00
Alyssa Rosenzweig 6637fbb211 asahi: Allocate special scratch buffers
Seem to be used for preemption.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13784>
2021-11-18 23:35:25 +00:00
Mike Blumenkrantz 04cc1b93b1 zink: enable PIPE_TEXTURE_TRANSFER_COMPUTE on non-cpu drivers
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13859>
2021-11-18 22:12:58 +00:00
Mike Blumenkrantz ea761a40d5 zink: use pb_slab_alloc_reclaimed(reclaim_all) for BAR heap sometimes
this forces a full slab reclaim any time the device is known to have a
too-small BAR in order to keep memory usage at a minimum when it might otherwise
balloon out and crash us

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13850>
2021-11-18 21:22:30 +00:00
Roland Scheidegger b7e2214b3c llvmpipe: adjust rounding for viewport scissoring
Some apps may try to use a viewport adjusted by 0.5 pixels (among other
things) to emulate d3d9 pixel center, and in this case we would end up
with incorrect "fake scissor" box (shifted by 1 pixel), hence pixels
being incorrectly scissored away when permit_linear_rasterizer is set
(this happens even if the linear rasterizer is not used in the end).

So adjust the offset so that the half-way points get rounded down instead
of up.
(This is all a bit iffy I think since we don't use fractional
boxes (with 8 subpixel bits) anywhere yet, but at least without msaa
it should work out.)
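
The effect of the tie-breaking direction can be seen in a generic sketch. It assumes non-negative coordinates and hypothetical helper names, and it does not reproduce the actual llvmpipe scissor code.

```c
/* Round to nearest, exact half-way values go up. Valid for x >= 0. */
static int
round_half_up(float x)
{
   return (int)(x + 0.5f);
}

/* Round to nearest, exact half-way values go down. Valid for x >= 0,
 * where truncation is the same as floor. */
static int
round_half_down(float x)
{
   int whole = (int)x;

   return (x - (float)whole > 0.5f) ? whole + 1 : whole;
}
```

A viewport edge at 10.5 lands on 11 with the first helper and on 10 with the second, which is the one-pixel shift described above.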

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13794>
2021-11-18 19:23:13 +00:00
Tomeu Vizoso 81f25d8f27 virgl/ci: Run each dEQP instance in its own VM
Currently we run deqp-runner inside a single VM, which makes very poor
use of the available CPUs because Virgl has a bottleneck in the VMM that
serializes everything.

With this change, we can run several Crosvm instances in a runner and
make full use of the CPUs, getting the same coverage with 3 runners
instead of 6.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12828>
2021-11-18 13:36:24 +00:00
Tomeu Vizoso d542e978e9 virgl/ci: Set GALLIVM_PERF=nopt,no_quad_lod
nopt will disable some shader optimizations that slow down test runs for
no gain.

no_quad_lod will disable some speed hacks that can cause inaccurate
results.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12828>
2021-11-18 13:36:24 +00:00
Mike Blumenkrantz c9a47c85da gallium: rename PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER
this is now a bitfield enum for more functionality

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11984>
2021-11-18 07:58:29 -05:00
Pierre-Eric Pelloux-Prayer df8aeb4598 radeonsi/sqtt: increase the default buffer size to 32MB
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13838>
2021-11-18 10:53:37 +01:00
Pierre-Eric Pelloux-Prayer 56382ec071 radeonsi: unreference framebuffer state after use
util_copy_framebuffer_state increases refcounts, so we have
to decrement them afterwards.

Fixes: b1b491cdbb ("radeonsi: add a faster clear path for glClearTexImage")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5631
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13838>
2021-11-18 10:53:34 +01:00
Mike Blumenkrantz 35ffadb9e7 zink: clamp to 500 max batch states on nvidia
I've been advised that leaving this unclamped will use up all the fds
allotted to a process

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13844>
2021-11-18 00:00:16 +00:00
Mike Blumenkrantz a3be30665f zink: fail context creation more gracefully
handle some cases where context creation fails earlier than expected

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13844>
2021-11-18 00:00:16 +00:00
Mike Blumenkrantz 72a88c77de zink: fix memory availability reporting
this shouldn't report the budgeted available memory, it should return
the total memory, as that's what this api expects

Fixes: ff4ba3d4a7 ("zink: support PIPE_CAP_QUERY_MEMORY_INFO")

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13849>
2021-11-17 22:59:43 +00:00
Mike Blumenkrantz 5f140a723d zink: use IMMUTABLE for dummy xfb buffer
this is never getting read back or anything so don't waste BAR allocation

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13849>
2021-11-17 22:59:43 +00:00
Mike Blumenkrantz 1eb2f0d41e zink: demote BAR allocations to device-local on oom
ideally this shouldn't happen, but it's better than crashing even if
it may crash later from attempting to map

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13849>
2021-11-17 22:59:43 +00:00
Mike Blumenkrantz 8f97af050e zink: set zink_resource_object::host_visible based on actual bo placement
the properties determined before allocation may not be the same as what gets
allocated

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13849>
2021-11-17 22:59:43 +00:00
Mike Blumenkrantz 74d2e89201 zink: always use slab allocation placement for domains
this allows the actual bo to have its memory type changed if necessary

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13849>
2021-11-17 22:59:43 +00:00
Mike Blumenkrantz 4fc216b4ba zink: add error for bo allocation failure
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13849>
2021-11-17 22:59:43 +00:00
Mike Blumenkrantz b1a32d1432 zink: implement multiplanar modifier handling
it turns out this is trivial as long as dri gives usable resource templates

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13799>
2021-11-17 19:22:02 +00:00
Mike Blumenkrantz 943f6a038d zink: always set matching resource export type for dmabuf creation
both of these need to be set if one is

cc: mesa-stable

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13799>
2021-11-17 19:22:02 +00:00
Mike Blumenkrantz 11c79a8bd7 zink: stop using VK_IMAGE_LAYOUT_PREINITIALIZED for dmabuf
this is illegal

cc: mesa-stable

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13799>
2021-11-17 19:22:02 +00:00
Omar Akkila 58a0d8d0de llvmpipe: page-align memory allocations
Allows memory allocated by llvmpipe_allocate_memory_fd to be
mappable to guests in virtualized environments like KVM which
requires page-aligned memory.

llvmpipe_allocate_memory is updated similarly for consistency.
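
Rounding an allocation size up to a page boundary is typically done as below. This is a generic sketch with a hypothetical helper name, not the llvmpipe change itself, and it assumes the page size is a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of page_size.
 * page_size must be a non-zero power of two. */
static size_t
align_to_page(size_t size, size_t page_size)
{
   assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
   return (size + page_size - 1) & ~(page_size - 1);
}
```

On POSIX systems the page size would usually come from sysconf(_SC_PAGESIZE).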

Signed-off-by: Omar Akkila <omar.akkila@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13793>
2021-11-17 09:25:37 -05:00
Connor Abbott 508f917d8c util/dag: Make edge data a uintptr_t
Nobody was actually using it as a pointer, and I'm going to introduce a
shared function which relies on it not being a pointer so let's fix this
once and for all.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
2021-11-17 13:41:47 +00:00
Erico Nunes ee2e14b352 ci: temporarily disable lima CI
The lima board farm will be unavailable for a few days, so disable it
to avoid CI failures.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13595>
2021-11-17 11:40:19 +00:00
Kenneth Graunke 3b78f17532 iris: Tidy code in iris_use_pinned_bo a bit
Now that we aren't short-circuiting most of the code, we should probably
reorganize it a little bit.  Tagged with fixes just so we pull all the
refactors together as one group.

Fixes: b21e916a62 ("iris: Combine iris_use_pinned_bo and add_exec_bo")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13808>
2021-11-17 02:43:30 -08:00
Kenneth Graunke 6e90984934 iris: Check for cross-batch flushing whenever a buffer is newly written.
We need to perform cross-batch flushing if any batch writes to a BO
while others refer to it.  We checked this case when recording a new
BO in the list which we'd never seen before.  However, we neglected to
handle the case when we already read from a BO, but then began writing
to it.  That new write may provoke a conflict between existing reads
in other batches, so we need to re-check the cross-batch flushing.

Caught by Piglit's copyteximage when forcing blits and copies to use
a new IRIS_BATCH_BLITTER that isn't upstream yet.  But this bug could
be provoked by render/compute work today...we just hadn't noticed it.

Fixes: b21e916a62 ("iris: Combine iris_use_pinned_bo and add_exec_bo")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13808>
2021-11-17 02:43:30 -08:00
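
Note on 6e90984934: a toy model of the rule described above, assuming each batch only tracks which BOs it references and which it writes. Batch, use_bo and flush_for_cross_batch_dependencies are hypothetical names for illustration; this is not the iris code.

```python
from dataclasses import dataclass, field


@dataclass
class Batch:
    """Hypothetical stand-in for a command batch."""
    name: str
    refs: set = field(default_factory=set)    # BOs the batch reads or writes
    writes: set = field(default_factory=set)  # BOs the batch writes
    flush_count: int = 0

    def flush(self):
        self.flush_count += 1
        self.refs.clear()
        self.writes.clear()


def flush_for_cross_batch_dependencies(batch, all_batches, bo, writable):
    """Flush other batches whose use of bo conflicts with the new access."""
    for other in all_batches:
        if other is batch:
            continue
        # A write elsewhere conflicts with any access here; a write here
        # conflicts with any reference elsewhere.
        if bo in other.writes or (writable and bo in other.refs):
            other.flush()


def use_bo(batch, all_batches, bo, writable):
    first_use = bo not in batch.refs
    # The case the commit describes: bo was already read by this batch,
    # and is only now being written.
    newly_written = writable and bo not in batch.writes
    if first_use or newly_written:
        flush_for_cross_batch_dependencies(batch, all_batches, bo, writable)
    batch.refs.add(bo)
    if writable:
        batch.writes.add(bo)


render, compute = Batch("render"), Batch("compute")
batches = [render, compute]
use_bo(render, batches, "bo0", writable=False)
use_bo(compute, batches, "bo0", writable=False)
use_bo(render, batches, "bo0", writable=True)  # read-to-write transition
```

If the check ran only on first use, the last call would skip it and the compute batch would keep its stale read of bo0.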
Kenneth Graunke 76030964a6 iris: Make a helper function for cross-batch dependency flushing
This should have no functional change, but it's tagged with Fixes
anyway because it's needed for the bug fix in the next patch.

Fixes: b21e916a62 ("iris: Combine iris_use_pinned_bo and add_exec_bo")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13808>
2021-11-17 02:43:30 -08:00
Alejandro Piñeiro cbf0d83eac v3d,v3dv: move TFU register definition to a common header
We are using the same definitions for both OpenGL and Vulkan, so let's
move it to common.

While we are here, we are also adding versioning to the TFU register
definition. These are basically register bit positions, so they are quite
likely to change between versions.

Adding 33 as it is the first version for which they were defined.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13832>
2021-11-17 11:04:31 +01:00
Pavel Asyutchenko 8ee7309e57 llvmpipe: enable PIPE_CAP_FBFETCH_COHERENT
llvmpipe's fragment shaders are always run sequentially and
in API order for a single tile, so it's impossible to have
out of order render target writes requiring fetch barriers.

The issues fixed in the previous commits were actually breaking most
piglit/deqp tests for the coherent variant of the extension.

Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13252>
2021-11-17 04:08:54 +00:00
Pavel Asyutchenko e403c1c23e llvmpipe: remove dead args from load_unswizzled_block
They were only used in fs_fb_fetch.

Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13252>
2021-11-17 04:08:54 +00:00
Pavel Asyutchenko ea6eeb70e6 llvmpipe: fix FB fetch with non 32-bit render target formats
Use lp_build_fetch_rgba_soa instead of lp_build_unpack_rgba_soa.
This one was failing most of deqp *framebuffer_fetch* tests.

Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13252>
2021-11-17 04:08:54 +00:00
Pavel Asyutchenko 2b3a020928 llvmpipe: protect from doing FB fetch of missing buffers
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13252>
2021-11-17 04:08:54 +00:00
Pavel Asyutchenko 3ebd6498c4 llvmpipe: fix gl_FragColor and gl_LastFragData[0] combination
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13252>
2021-11-17 04:08:54 +00:00
Pavel Asyutchenko b1de61dd38 llvmpipe: fix wrong assumption on FB fetch shader opacity
In certain cases variant->opaque could be set to true, which
reset command list for tiles fully covered by a triangle
with this shader. This is obviously wrong in presence of
framebuffer fetch.

Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13252>
2021-11-17 04:08:54 +00:00
Mike Blumenkrantz 86eb1549ef zink: implement pipe_context::draw_vertex_state
rough implementation, but it should be a decent start

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13692>
2021-11-17 03:16:13 +00:00
Vasily Khoruzhick 02e5f4fb10 lima: add more wrap modes
Using 1 bit per wrap mode looked very suspicious, and after some
experiments it turns out it's a 3-bit enum.

The border color is also here; it sits right after the depth field. For
some reason it uses 16 bits per channel, just like the clear color in RSW.

GL_CLAMP mode is broken for nearest filter just as on Midgard, so add
the same workaround - use GL_CLAMP_TO_EDGE for nearest filter.

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13213>
2021-11-16 22:58:12 +00:00
Vasily Khoruzhick cbed4d784e lima: handle 1D samplers
It's just a matter of changing number of dimensions in texture
descriptor.

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13213>
2021-11-16 22:58:12 +00:00
Vasily Khoruzhick fa86a2a94d lima: add support for 3D textures
It looks like the MBS format used by the blob doesn't distinguish sampler2D
from sampler3D, so the load texture instruction is the same for 2D and 3D
textures.

So all we need to RE is the texture descriptor for 3D textures, but the blob
doesn't implement it, so we need to do some guesswork:

- unknown_3_1 looks like a depth since it sits after height/width and
  is always set to 1
- unknown_2_2 is exactly 3 bits and it follows wrap_t, so it must be
  wrap_r
- the missing part is the texture type for 3D textures. By trial and error
  it seems to be 4. The first bit is only set for cubemaps, so it's likely a
  separate flag, and the remaining 2 bits look like the number of tex
  dimensions, akin to midgard and later (thanks, panfrost!), with 0 for 1D,
  1 for 2D and 2 for 3D.

Put it all together and we have working 3D textures on lima!

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13213>
2021-11-16 22:58:12 +00:00
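
Note on fa86a2a94d: a small sketch of the guessed texture type layout, assuming "first bit" means the least-significant bit, which is consistent with 3D encoding to 4. The function names are hypothetical. The commit message does not say which dimension value accompanies the cubemap flag, so the sketch only decodes the flag and never encodes it.

```python
def encode_texture_type(dimensions: int) -> int:
    """Encode a non-cubemap texture type: bits 1-2 hold (dimensions - 1)."""
    if dimensions not in (1, 2, 3):
        raise ValueError("dimensions must be 1, 2 or 3")
    return (dimensions - 1) << 1


def decode_texture_type(value: int):
    """Split the 3-bit field into (dimension field, cubemap flag)."""
    if not 0 <= value <= 7:
        raise ValueError("texture type is a 3-bit field")
    dim_field = value >> 1        # 0 for 1D, 1 for 2D, 2 for 3D
    cubemap_flag = bool(value & 1)
    return dim_field, cubemap_flag
```

Under these assumptions a 1D texture encodes to 0, a 2D texture to 2 and a 3D texture to 4, which matches the value found by trial and error.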
Mike Blumenkrantz 97b92c9c32 zink: set suballocator bo size to aligned allocation size
this is the actual memory size

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13824>
2021-11-16 22:29:20 +00:00
Mike Blumenkrantz eb6f1d5348 zink: block suballocator caching for swapchain/dmabuf images
these have pNext pointers which makes their memory uncacheable

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13824>
2021-11-16 22:29:20 +00:00
Marek Olšák ba6d389fa7 radeonsi: don't use GS SGPR6 for the small prim cull info
use a user SGPR instead. This will be needed in the future.

Also don't upload small_prim_precision because it's passed via
VS_STATE_BITS.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13811>
2021-11-16 19:41:07 +00:00
Marek Olšák 0690a44e69 radeonsi: inline declare_vs_specific_input_sgprs
I think it was getting a little hard to follow.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13811>
2021-11-16 19:41:07 +00:00
Marek Olšák 513bd6acca radeonsi: cull against clip planes, clipvertex, clip/cull distances in shader
The downside is that this duplicates shader code for clip/cull distances
in both the position and parameter portions of the shader.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13811>
2021-11-16 19:41:07 +00:00
Marek Olšák 881c459191 radeonsi: unify how ngg_cull_flags are set
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13811>
2021-11-16 19:41:07 +00:00
Jesse Natalie a818f7b686 d3d12: Fix incorrect hash table usage
I'd assumed that since insert didn't take a deleter, it was
find-or-insert, not insert-or-replace. This caused a bo reference
leak if the same bo was used more than once in a batch.

Fixes: fde36d7992 ("d3d12: Don't wait for GPU reads to do CPU reads")
Reviewed By: Bill Kristiansen <billkris@microsoft.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13819>
2021-11-16 19:27:16 +00:00
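
Note on a818f7b686: a toy model of the described failure mode, assuming the batch takes one reference per tracked bo and drops one reference per table entry when the batch ends. All names here are hypothetical; the commit message does not show the actual fix.

```python
class Bo:
    def __init__(self, name):
        self.name = name
        self.refcount = 1  # reference held by the creator


def track_insert_or_replace(table, bo):
    """Buggy pattern: takes a reference on every use, but the table keeps
    a single entry, so later uses overwrite it without dropping a ref."""
    bo.refcount += 1
    table[bo.name] = bo


def track_find_or_insert(table, bo):
    """Only takes a reference the first time the bo is seen in the batch."""
    if bo.name not in table:
        bo.refcount += 1
        table[bo.name] = bo


def end_batch(table):
    """Drop one reference per tracked entry."""
    for bo in table.values():
        bo.refcount -= 1
    table.clear()


def refcount_after_batch(track, uses=2):
    bo, table = Bo("vb"), {}
    for _ in range(uses):
        track(table, bo)
    end_batch(table)
    return bo.refcount
```

With two uses in one batch, the insert-or-replace path leaves the refcount one higher than the creator's single reference, while the find-or-insert path returns it to 1.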
Vasily Khoruzhick 764760314d lima: add native txp support
Currently lima uses generic TXP lowering that results in downgrading
coords precision to FP16 since we have to do some calculations with
coords instead of loading them directly from varying.

Mali4x0 has native TXP support, however coords and projector have to
come from a single source.

Add NIR lowering pass that combines coords and projector into a single
backend-specific source and use it instead of generic lowering.

Unfortunately this change regresses one test, but it also fails in blob and
disassembly is now identical.

shader-db diff:

total instructions in shared programs: 15623 -> 15603 (-0.13%)
instructions in affected programs: 877 -> 857 (-2.28%)
helped: 7
HURT: 0
helped stats (abs) min: 2 max: 8 x̄: 2.86 x̃: 2
helped stats (rel) min: 0.87% max: 10.53% x̄: 4.93% x̃: 1.85%
95% mean confidence interval for instructions value: -4.95 -0.76
95% mean confidence interval for instructions %-change: -9.31% -0.55%
Instructions are helped.

total loops in shared programs: 3 -> 3 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 136 -> 137 (0.74%)
spills in affected programs: 0 -> 1
helped: 0
HURT: 1

total fills in shared programs: 598 -> 602 (0.67%)
fills in affected programs: 0 -> 4
helped: 0
HURT: 1

Tested-by: Denis Pauk <pauk.denis@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13111>
2021-11-16 19:13:42 +00:00
Kenneth Graunke ebc0099d89 intel/genxml: Collapse leading underscores on prefixed value defines
We prefix names with an underscore to make them "safe" C identifiers
when necessary.  For example, a value of "32x32" would become "_32x32".

However, when specifying something like

   <field ... prefix="BLOCK_SIZE">
     <value name="32x32" value="0"/>
   </field>

we already have a prefix that makes the field name safe.  We'd rather
generate a name with a single underscore, i.e.

    #define BLOCK_SIZE_32x32 0

rather than

    #define BLOCK_SIZE__32x32 0

This also fixes up affected defines in crocus.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13809>
2021-11-16 11:38:30 +00:00
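
Note on ebc0099d89: a minimal sketch of the naming rule, assuming a name needs the underscore only when it starts with a digit. safe_name and value_define are hypothetical helpers and are not the real genxml generator code.

```python
def safe_name(name: str) -> str:
    """Make a value name a valid C identifier by prefixing an underscore."""
    return "_" + name if name[0].isdigit() else name


def value_define(name: str, prefix: str = None) -> str:
    """Build the identifier used in the generated #define."""
    safe = safe_name(name)
    if prefix is None:
        return safe
    # The prefix already makes the identifier safe, so collapse the
    # leading underscore instead of emitting PREFIX__name.
    return prefix + "_" + safe.lstrip("_")
```

With this sketch, value_define("32x32") still yields "_32x32", while value_define("32x32", "BLOCK_SIZE") yields "BLOCK_SIZE_32x32".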
Kenneth Graunke f4004fde26 iris: Fix parameters to iris_copy_region in reallocate_resource_inplace
We had accidentally passed <x, y, z, l> instead of <l, x, y, z>.

Fixes: b8ef3271c8 ("iris: Move suballocated resources to a dedicated allocation on export")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13815>
2021-11-16 11:22:04 +00:00
Ilia Mirkin bf14a63e1d freedreno/a4xx: hook up sample mask/id, used to determine helper invocs
This fixes the various gl_HelperInvocation-based tests. There's a
lowering pass which converts it to (1 << sampleid) & samplemask.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13806>
2021-11-16 05:08:26 +00:00
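
Note on bf14a63e1d: the expression from the message, written out as a sketch. The assumption here, which the message does not state, is that an invocation counts as a helper when its sample's bit is clear in the mask. The function names are hypothetical.

```python
def sample_bit(sample_id: int, sample_mask: int) -> int:
    """The lowered expression: (1 << sampleid) & samplemask."""
    return (1 << sample_id) & sample_mask


def is_helper_invocation(sample_id: int, sample_mask: int) -> bool:
    """Assumed interpretation: no coverage bit means a helper invocation."""
    return sample_bit(sample_id, sample_mask) == 0
```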
Ilia Mirkin 45606b51cc freedreno/a4xx: indicate whether outputs are uint/sint
Unclear whether this fixes anything, but the blob does seem to set
these. (Discovered while trying to determine if value clamping was
missing for non-32-bit integer formats, which fail in some tests.)

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13806>
2021-11-16 05:08:26 +00:00
Ilia Mirkin 14087cb9ea freedreno/a4xx: fix stencil-textured border colors
These are implemented with unusual sampler formats, so the usual approach
of looking at the format descriptors fails.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13806>
2021-11-16 05:08:26 +00:00
Ilia Mirkin 8c041f4bf3 freedreno/a5xx: re-express buffer textures more logically
Instead of treating it as 2 bits to enable, make BUFFER a type (and
extend the bitfield width), and then add a separate BUFFER bit
(ostensibly to perform the width/height concatenation but who knows).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13805>
2021-11-16 04:44:23 +00:00
Ilia Mirkin 6566eae933 freedreno/a4xx: add proper buffer texture support
Rather than faking it as a 1d texture, add the buffer texture type, and
allow a full range of sizes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13805>
2021-11-16 04:44:23 +00:00
Marek Olšák 42dbfd7206 radeonsi: make si_llvm_emit_clipvertex non-static
it will be used in culling code

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13700>
2021-11-16 02:11:46 +00:00
Marek Olšák d3d5777536 radeonsi: remove an incorrect comment at lds_byte0_accept_flag
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13700>
2021-11-16 02:11:46 +00:00
Marek Olšák 20e83abf06 radeonsi: improve memory instruction tracking
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13700>
2021-11-16 02:11:46 +00:00
Marek Olšák 901697654a radeonsi: add dcc_msaa option to enable DCC for MSAA
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13700>
2021-11-16 02:11:46 +00:00
Marek Olšák 5a5263d65d radeonsi: unify GFX9_VSGS_NUM_USER_SGPR and GFX9_TESGS_NUM_USER_SGPR
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13700>
2021-11-16 02:11:46 +00:00
Marek Olšák 9151ac3531 ac,radeonsi: cull small lines in the shader using the diamond exit rule
It also splits clip_half_line_width into X and Y components for tighter
view culling.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13700>
2021-11-16 02:11:46 +00:00
Marek Olšák 701a0b5165 radeonsi: add si_state_rasterizer::ngg_cull_flags_lines and rename the others
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13700>
2021-11-16 02:11:46 +00:00
Marek Olšák 3166d4428d radeonsi: set EXTRA_DX_DY_PRECISION for lines where it's supported
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13700>
2021-11-16 02:11:46 +00:00
Marek Olšák 4571778008 radeonsi: set PERPENDICULAR_ENDCAP_ENA for wide AA lines
This is more correct.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13700>
2021-11-16 02:11:45 +00:00
Marek Olšák 3338956268 radeonsi: make si_get_small_prim_cull_info static
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13700>
2021-11-16 02:11:45 +00:00
Marek Olšák 963b7475a9 radeonsi: use ac_build_load_to_sgpr in gfx10_emit_ngg_culling_epilogue
This is more correct because we are loading constants into an SGPR even
though there is no effect on behavior in this case.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13700>
2021-11-16 02:11:45 +00:00
Marek Olšák f8a0aa6852 radeonsi: fix view culling for wide lines
We need to cull wide lines as quads, but only for view culling.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13700>
2021-11-16 02:11:45 +00:00
Marek Olšák 8f687bb5dc radeonsi: fix shader culling with integer pixel centers
Only Nine was using them.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13700>
2021-11-16 02:11:45 +00:00
Paulo Zanoni a9c1cc63c6 iris: call brw_process_intel_debug_variable() earlier
We're currently only calling it after creating the screen and the
bufmgr. There are a few cases where Iris checks for the DEBUG_BUFMGR
bit before we call brw_process_intel_debug_variable(), which means
intel_debug is 0 and so we don't run the debug code. Today, these are
all related to the creation of the workaround bo and its mmap.

I found this in a custom branch after I converted an environment
variable that I had to INTEL_DEBUG.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13780>
2021-11-15 23:33:18 +00:00
Vasily Khoruzhick 15013958d0 lima: enable PIPE_CAP_PREFER_POT_ALIGNED_VARYINGS
Mali4x0 PP doesn't have a swizzle for load_input, so use POT-aligned
varyings to avoid unnecessary movs for vec3 and a precision downgrade
when this vec3 holds coordinates for a sampler.

shader-db:

total instructions in shared programs: 15707 -> 15623 (-0.53%)
instructions in affected programs: 3906 -> 3822 (-2.15%)
helped: 47
HURT: 18
helped stats (abs) min: 1 max: 9 x̄: 3.09 x̃: 2
helped stats (rel) min: 1.49% max: 23.53% x̄: 8.20% x̃: 6.45%
HURT stats (abs)   min: 1 max: 7 x̄: 3.39 x̃: 3
HURT stats (rel)   min: 0.78% max: 20.59% x̄: 10.45% x̃: 10.97%
95% mean confidence interval for instructions value: -2.18 -0.41
95% mean confidence interval for instructions %-change: -5.70% -0.38%
Instructions are helped.

total spills in shared programs: 146 -> 136 (-6.85%)
spills in affected programs: 39 -> 29 (-25.64%)
helped: 6
HURT: 0

total fills in shared programs: 617 -> 598 (-3.08%)
fills in affected programs: 125 -> 106 (-15.20%)
helped: 6
HURT: 0

HURT shaders are vertex shaders where we may need more instructions
for non-packed vec3s. It's an acceptable trade-off since we don't get a
precision downgrade if this varying holds coordinates for a sampler.

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13151>
2021-11-15 22:52:55 +00:00
Mike Blumenkrantz 43c457a6ec zink: always add VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT for 3D images
there's no way to know what an image will be used for, so this bit needs
to always be added

fixes KHR-GL46.packed_pixels.varied_rectangle.compressed_rgb

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13798>
2021-11-15 21:24:05 +00:00
Mike Blumenkrantz 93a55537f2 zink: stop running discard_if in generated tcs
just embarrassing smh

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13798>
2021-11-15 21:24:05 +00:00
Samuel Pitoiset df526aae1b zink: skip one GLES31 subset to avoid GPU hangs on Navi10
Weird bug... I will figure it out later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13796>
2021-11-15 20:33:22 +00:00
Rob Clark f53e1823c2 freedreno: caps for clover
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12500>
2021-11-15 18:06:39 +00:00
Rob Clark 9e7f5b75ec freedreno: Add PIPE_SHADER_IR_NIR_SERIALIZED support
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12500>
2021-11-15 18:06:39 +00:00
Ilia Mirkin 31d6cd224a a5xx: remove astc srgb workaround logic
This was copied from a4xx, which only needs it on one chip model (A420).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13782>
2021-11-15 17:31:53 +00:00
Samuel Pitoiset cb56b83572 zink: update the CI lists for RADV
A lot of GPU hangs have been fixed lately.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13792>
2021-11-15 16:19:29 +00:00
Iago Toral Quiroga f384c763fc v3d,v3dv: move tile size calculation to a common helper
We had this code replicated in 3 places across both drivers.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13790>
2021-11-15 11:40:39 +00:00
Dave Airlie 27903abbb6 llvmpipe: fix compressed image sizes.
VK CTS just added some new tests to write to a compressed image
from a compute shader, which was overrunning memory.

The image width/height need to be sized according to the block
sizes to avoid overwriting memory.

dEQP-VK.image.sample_texture.*bit_compressed*

Cc: mesa-stable

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13618>
2021-11-15 07:15:36 +10:00
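
Note on 27903abbb6: a sketch of sizing an image in whole compression blocks, assuming the usual round-up division. The 4x4 block size and the function names are illustrative assumptions; the commit message does not show the exact computation llvmpipe uses.

```python
def blocks_needed(pixels: int, block_dim: int) -> int:
    """Number of blocks covering `pixels` texels, rounding up."""
    return -(-pixels // block_dim)


def block_grid(width: int, height: int, block_w: int, block_h: int):
    """Image extent measured in blocks."""
    return blocks_needed(width, block_w), blocks_needed(height, block_h)


def padded_extent(width: int, height: int, block_w: int, block_h: int):
    """Image extent in texels once rounded up to whole blocks."""
    bx, by = block_grid(width, height, block_w, block_h)
    return bx * block_w, by * block_h
```

A 10x6 image with 4x4 blocks needs a 3x2 block grid, so storage sized for only 10x6 texels is too small for writes that touch the last row and column of blocks.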
Dave Airlie 53a8faafc1 llvmpipe: disable 64-bit integer textures.
This fixes some crashes in VK-GL-CTS where it doesn't deal with these.

Cc: mesa-stable

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13617>
2021-11-14 20:47:15 +00:00
Emma Anholt 32b51d5e60 freedreno/a6xx: Do sparse setup of the TFB program.
We don't need to init the whole program RAM, just the locations we are
actually writing from.  Syncs this code up with tu a bit more.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13747>
2021-11-12 20:26:22 +00:00
Ilia Mirkin 170e1aa647 freedreno/a[345]xx: add R8/RG8 SRGB formats
These enable the GL_EXT_texture_sRGB_R8 / GL_EXT_texture_sRGB_RG8
extensions.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13765>
2021-11-12 17:22:02 +00:00
Ilia Mirkin 8db29109be freedreno: prefer float immediates when float values are involved
Using double immediates can cause a natively-float value to have to get
upgraded to a double unnecessarily. Use float immediates where possible.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13764>
2021-11-12 16:48:49 +00:00
Ilia Mirkin 269b4dec9e nv50,nvc0: expose R8/RG8_SRGB formats for texturing
This enables the GL_EXT_texture_sRGB_R8/RG8 extensions.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13769>
2021-11-12 15:34:45 +00:00
Iago Toral Quiroga 0cb58f80d2 v3d: use V3D_MAX_DRAW_BUFFERS instead of hardcoded constant
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13775>
2021-11-12 11:04:07 +00:00
Qiang Yu 3900551894 radeonsi: add radeonsi_force_use_fma32 driconf option
fma32 rounds only once, so it has 0.5 ULP accuracy. mad32 rounds twice, so
it has 1 ULP accuracy. This accuracy difference sometimes makes the result
differ in the last bit.

Applications like META need the extra accuracy to display the right result.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13686>
2021-11-12 09:01:58 +00:00
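
Note on 3900551894: a small emulation of one rounding versus two roundings in binary32, using only the standard library. It is a sketch under stated assumptions: the fused path is emulated in binary64, which is exact for the inputs chosen here but not for arbitrary inputs, and the function names are hypothetical.

```python
import struct


def f32(x: float) -> float:
    """Round a binary64 value to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def mad32(a: float, b: float, c: float) -> float:
    """Multiply, round, add, round: two roundings."""
    return f32(f32(a * b) + c)


def fma32_emulated(a: float, b: float, c: float) -> float:
    """Single rounding at the end. The product of two binary32 values is
    exact in binary64; the addition is exact for the inputs below."""
    return f32(a * b + c)


# a * b = 1 + 2**-11 + 2**-24, which lies exactly halfway between two
# neighbouring binary32 values, so the intermediate rounding drops 2**-24.
a = b = f32(1.0 + 2.0 ** -12)
c = -f32(1.0 + 2.0 ** -11)
two_roundings = mad32(a, b, c)
one_rounding = fma32_emulated(a, b, c)
```

For these inputs the two-rounding path returns 0.0 while the single-rounding path keeps 2**-24.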
Ilia Mirkin d903eb156a freedreno/a4xx: fix min/max/bias lod sampler settings
This makes a4xx look more like a3xx for these settings. Most importantly
it adds the workaround for allowing the hw to decide between min and mag
filtering. This fixes a number of dEQP texture filtering tests.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13763>
2021-11-12 01:12:35 +00:00
Ilia Mirkin 4ffcef821c freedreno/ir3: fix setting the max tf vertex when there are no outputs
Fixes dEQP-GLES3.functional.transform_feedback.* on a4xx.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13760>
2021-11-11 23:49:19 +00:00
Ilia Mirkin c0de7ea0ab freedreno: check batch size after the fallback blitter clear
When force-flushing after every draw, this would otherwise hit a NULL
batch in fd_blitter_clear.

Tested on a4xx.

Suggested-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13761>
2021-11-11 23:26:00 +00:00
Alejandro Piñeiro 3f3820a3a5 v3d: remove static v3d_start_binning
v3dx(start_binning) is just a call to that method, so let's just use
it directly.

Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13754>
2021-11-11 14:04:22 +01:00
Alejandro Piñeiro 2a65db2458 v3d: remove unused include
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13754>
2021-11-11 14:04:16 +01:00
Andreas Baierl ee41e1bbd2 lima: Fix drawing wide lines
GLES2.0 spec allows parts of wide lines and points to be drawn even if
their center is outside the viewport.
Therefore 0x2000 in PLBU_CMD_PRIMITIVE_SETUP has to be set for points.
This is already our default setting as it seems to have no negative
effect when this bit is always set. Points work as expected but lines
don't. It's hard to RE it, because the affected deqp tests also fail
with the blob.

To respect this behaviour for lines and fix another 2 tests, we need
a workaround: temporarily extend the viewport by half of the
line width. The scissor rectangle still equals the initial
viewport.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12971>
2021-11-11 11:25:58 +00:00
Samuel Pitoiset 3e7bac80ce ac/rgp: add support for dumping SPM data
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13704>
2021-11-11 10:05:49 +00:00
Neil Roberts bdaf185889 v3d: Update prim_counts when prims generated query in flight without TF
In order to implement GL_PRIMITIVES_GENERATED, v3d allocates a small
resource and adds a command to the job to store the prim counts to it.
However it was only doing this when TF was enabled which meant that if
the query was used with a geometry shader but no TF then the query would
always be zero. This patch makes the driver keep track of how many
PRIMITIVES_GENERATED queries are in flight and then enable writing the
prim count if it's more than zero.

Fixes dEQP-GLES31.functional.geometry_shading.query.primitives_generated_*

v2: Update CI expectations and references to fixed tests in commit log.
v3: - Add comment that GL_PRIMITIVES_GENERATED query is included because
      OES_geometry_shader, but it is not part of OpenGL ES 3.1. (Iago)
    - Update Fixes to commit introducing geometry shaders. (Iago)

Fixes: a1b7c084 ("v3d: fix primitive queries for geometry shaders")
Signed-off-by: Neil Roberts <nroberts@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13712>
2021-11-11 08:02:04 +00:00
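
Note on bdaf185889: a sketch of the bookkeeping described above, assuming a simple counter of active queries on the context. Context and the function names are hypothetical and do not come from the v3d driver.

```python
from dataclasses import dataclass


@dataclass
class Context:
    """Hypothetical stand-in for driver context state."""
    tf_enabled: bool = False
    prims_generated_queries_in_flight: int = 0


def begin_prims_generated_query(ctx: Context) -> None:
    ctx.prims_generated_queries_in_flight += 1


def end_prims_generated_query(ctx: Context) -> None:
    assert ctx.prims_generated_queries_in_flight > 0
    ctx.prims_generated_queries_in_flight -= 1


def should_store_prim_counts(ctx: Context) -> bool:
    """Store prim counts when TF is on or any such query is active."""
    return ctx.tf_enabled or ctx.prims_generated_queries_in_flight > 0
```

Checking only tf_enabled reproduces the old behaviour, where a query used with a geometry shader but no TF always read zero.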
Emma Anholt 07aaef5721 freedreno/a6xx: Inline remaining fd6_tex_const_0() call.
Less indirection and fixups for figuring out what's going on.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13443>
2021-11-11 00:10:57 +00:00
Emma Anholt 7230058e8a freedreno/a6xx: Drop an unused tile_mode arg.
I added this in ebaeddcbb3 ("freedreno/a6xx: Rewrite the format table
format/swap helpers.")  but it had already become unused through some
bugfixing.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13443>
2021-11-11 00:10:57 +00:00
Emma Anholt a9057d45a4 freedreno/a6xx: Clean up sysmem fb read patching using fd6_view.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13443>
2021-11-11 00:10:57 +00:00
Emma Anholt c90220e449 freedreno/a6xx: Use fd6_view for non-buffer image descriptors, too.
This deletes a whole lot of code, but there's a modest drawoverhead perf
loss:

drawoverhead 1-image change -6.48856% +/- 4.28269% (n=50)
drawoverhead 8-image change -5.29195% +/- 2.62549% (n=90)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13443>
2021-11-11 00:10:57 +00:00
Emma Anholt 533e486923 freedreno/a6xx: Switch to relying on fd6_view for our texture descriptors.
Having checked the deltas between fdl6_view and what we did before, switch
over to fdl6_view now.

No statistically significant difference on no-hw drawoverhead 8-texture
change (n=50) with the texture cache disabled from this and the previous
commit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13443>
2021-11-11 00:10:57 +00:00
Emma Anholt 84377785a4 freedreno/a6xx: Create a fd6_view at sampler view update time.
The goal is to share the same code as turnip for descriptor setup. This
just calls it and cross-checks.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13443>
2021-11-11 00:10:57 +00:00
Emma Anholt 5b3a6ff9f7 freedreno: Set layer_first on (2D) resource imports.
Prevents getting a weird layer stride if you ask for it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13443>
2021-11-11 00:10:57 +00:00
Iago Toral Quiroga 3a95e25e84 v3dv,v3d: don't store swizzle pointer in shader/pipeline keys
We had been storing pointers to a driver owned swizzle table
rather than storing the actual swizzle value in various shader
and pipeline keys on both GL and Vulkan drivers.

This doesn't look very robust, particularly since we also
compute sha1 hashes from these values and we may store these
hashes to disk (for the disk cache).

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13738>
2021-11-10 11:24:26 +00:00
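The concern in the commit above can be illustrated with a small sketch. It assumes a hypothetical key struct (`sketch_key_val`) that copies the four swizzle bytes into the key instead of pointing at a driver-owned table, and it swaps in a simple FNV-1a hash in place of the SHA-1 the commit mentions; none of these names come from the v3d or v3dv sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key layout: the swizzle is stored by value, so the key's
 * bytes depend only on the swizzle contents. A key holding
 * `const uint8_t *swizzle` would instead embed a process-specific address. */
struct sketch_key_val {
    uint8_t swizzle[4];
};

void sketch_key_val_init(struct sketch_key_val *key, const uint8_t swizzle[4])
{
    assert(key != NULL && swizzle != NULL);
    memset(key, 0, sizeof(*key));
    memcpy(key->swizzle, swizzle, sizeof(key->swizzle));
}

int sketch_key_val_equal(const struct sketch_key_val *a,
                         const struct sketch_key_val *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}

/* FNV-1a over raw bytes; a stand-in for SHA-1, for illustration only. */
uint32_t sketch_hash_bytes(const void *data, size_t size)
{
    const uint8_t *bytes = data;
    uint32_t hash = 2166136261u;
    for (size_t i = 0; i < size; i++) {
        hash ^= bytes[i];
        hash *= 16777619u;
    }
    return hash;
}
```

With value storage, two keys built from different tables that hold the same swizzle compare equal and hash identically, which is the property a hash written to a disk cache would need.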
Mike Blumenkrantz 4dfb5818ed zink: update gfx pipeline shader module pointer even if the program is unchanged
this is used for pipeline comparisons, so it has to always be accurate

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13727>
2021-11-10 01:15:39 +00:00
Mike Blumenkrantz bfa81c1e8c zink: be more consistent about applying module hash for gfx pipeline
this was a little spaghetti-ish: the module hash was sometimes being applied
during module update, sometimes in draw during program create, and then also
it was removed when a shader unbind would cause the program to no longer be reachable

now things are more consistent:
* keep removing module hash when program becomes unreachable
* only apply module hash in draw during updates there

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13727>
2021-11-10 01:15:39 +00:00
Mike Blumenkrantz 937a841b57 zink: ci updates
these don't spend forever in llvmpipe optimization passes anymore

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13727>
2021-11-10 01:15:39 +00:00
Mike Blumenkrantz 2ac23b4d58 zink: always inline uniforms when running on a cpu driver
the overhead from creating new inlined shader variants is likely to be less than
the time required to fully optimize and run those variants, so just
inline 100% of the time to cut down shader runs

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13727>
2021-11-10 01:15:39 +00:00
Mike Blumenkrantz a8d90c8ed5 zink: implement cs uniform inlining
this implements shader variants for compute

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13727>
2021-11-10 01:15:39 +00:00
Mike Blumenkrantz 06f2054cb5 zink: radv ci updates for 1dshadow stuff
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13583>
2021-11-09 23:59:04 +00:00
Mike Blumenkrantz 64e0ca15d6 zink: add 1DShadow sampler handling for drivers (radv) that don't support it
some drivers won't create zs textures in any shape but 2D. this can be
handled instead by using 2D textures and then performing shader rewrites to
convert shadow samplers for 1D and 1DArray types to 2D/array

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13583>
2021-11-09 23:59:04 +00:00
Mike Blumenkrantz 62983f276b zink: add another compiler pass to convert 64bit vertex attribs
gallium always provides uint types, so rewrite the shader to load a 64bit
attrib and then cast back to whatever it was before

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13566>
2021-11-09 21:51:06 +00:00
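As a CPU-side analogy for "load as uint, then cast back", the sketch below reassembles a 64-bit float from two 32-bit unsigned words. It is not the NIR pass from the commit; the function name is hypothetical, and it assumes IEEE 754 doubles and that the low word comes first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Rebuild a double from the two 32-bit words it was delivered as.
 * Assumption: `lo` holds bits 0..31 and `hi` holds bits 32..63. */
double sketch_load_double_from_uints(uint32_t lo, uint32_t hi)
{
    assert(sizeof(double) == sizeof(uint64_t));

    uint64_t bits = ((uint64_t)hi << 32) | lo;
    double value;

    /* memcpy is the well-defined way to reinterpret the bit pattern. */
    memcpy(&value, &bits, sizeof(value));
    return value;
}
```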
Mike Blumenkrantz 39bdb00d77 zink: simplify 64bit vertex attrib lowering
this was a cool myfirstcompilerpass.exe but there's easier ways to do
things like this

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13566>
2021-11-09 21:51:06 +00:00
Mike Blumenkrantz 854fd242fa zink: declare int/float size caps inline with type usage
this is much more accurate than trying to use shader info

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13566>
2021-11-09 21:51:05 +00:00
Jesse Natalie fde36d7992 d3d12: Don't wait for GPU reads to do CPU reads
Reviewed By: Bill Kristiansen <billkris@microsoft.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13669>
2021-11-09 18:31:19 +00:00
Jesse Natalie 8ea1e58f0e d3d12: Don't wait for *all* batches when synchronizing a resource
Reviewed By: Bill Kristiansen <billkris@microsoft.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13669>
2021-11-09 18:31:19 +00:00
Samuel Pitoiset 5bb72ff750 zink: update the CI lists for RADV
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13726>
2021-11-09 16:41:13 +00:00
Jesse Natalie 1ab906d17f d3d12: Handle non-infinite wait timeouts > 49.7 days as infinite
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12268>
2021-11-09 04:05:55 +00:00
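The 49.7 day figure corresponds to 2^32 milliseconds, the range of a 32-bit millisecond timeout. A minimal sketch of the clamping idea follows, assuming a nanosecond input with an all-ones "infinite" value and a 32-bit millisecond wait with an all-ones "infinite" sentinel; the names are hypothetical and not taken from the d3d12 driver.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TIMEOUT_INFINITE_NS UINT64_MAX /* assumed "wait forever" input  */
#define SKETCH_WAIT_INFINITE_MS    UINT32_MAX /* assumed "wait forever" output */

/* Convert a nanosecond timeout to a 32-bit millisecond timeout.
 * Finite values that do not fit are treated as infinite instead of
 * being truncated to a short wait. */
uint32_t sketch_timeout_ns_to_ms(uint64_t timeout_ns)
{
    if (timeout_ns == SKETCH_TIMEOUT_INFINITE_NS)
        return SKETCH_WAIT_INFINITE_MS;

    uint64_t ms = timeout_ns / UINT64_C(1000000);
    if (ms >= SKETCH_WAIT_INFINITE_MS)
        return SKETCH_WAIT_INFINITE_MS;

    /* A finite result never aliases the infinite sentinel. */
    assert(ms < SKETCH_WAIT_INFINITE_MS);
    return (uint32_t)ms;
}
```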
Jesse Natalie accd8326c5 d3d12: Fix Linux fence wait return value
zero is for success, nonzero is failure.

Fixes: 0b60d6a2 ("d3d12: Support Linux eventfds for fences")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12268>
2021-11-09 04:05:55 +00:00
Jesse Natalie e7502c5404 d3d12: Fully init primconvert config
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13670>
2021-11-09 00:44:52 +00:00
Jesse Natalie c151e9d087 d3d12: Hook up threaded context
Reviewed By: Bill Kristiansen <billkris@microsoft.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13670>
2021-11-09 00:44:52 +00:00
Jesse Natalie 2c90fa19a8 d3d12: Pass explicit context to pre/post draw surface blits
Reviewed By: Bill Kristiansen <billkris@microsoft.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13670>
2021-11-09 00:44:52 +00:00
Jesse Natalie cd41ed53b2 d3d12: Use thread safe slab allocators in transfer_map handling
Reviewed By: Bill Kristiansen <billkris@microsoft.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13670>
2021-11-09 00:44:52 +00:00
Jesse Natalie 17a46e2cf9 d3d12: Inherit from threaded_transfer
Reviewed By: Bill Kristiansen <billkris@microsoft.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13670>
2021-11-09 00:44:52 +00:00
Jesse Natalie e9a1e1c21e d3d12: Resources inherit from threaded_resource
Reviewed By: Bill Kristiansen <billkris@microsoft.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13670>
2021-11-09 00:44:52 +00:00
Jesse Natalie a463aa0099 d3d12: Inherit from threaded_query
Reviewed By: Bill Kristiansen <billkris@microsoft.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13670>
2021-11-09 00:44:52 +00:00
Jordan Justen 7eb13fc2f2 anv,blorp,iris: Set MOCS for COMPUTE_WALKER post-sync operation
We don't currently enable post sync operations, but it is probably
better to set them to "internal" MOCS than to remove the non-zero
checking for this genxml field.

Reworks:
 * Fix COMPUTE_WALKER in cmd_buffer_trace_rays (s-b Jason)

Fixes: 7b78b2fcac ("intel/genxml: Assert that all MOCS fields are non-zero on Gfx7+")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13624>
2021-11-08 23:29:51 +00:00
Jason Ekstrand 419b02c90c anv,iris: Advertise a max 3D workgroup size of 1024^3
On GFX version 12.5+ with COMPUTE_WALKER, this is the limit based on the
size of the HW packet.  On older HW, we can technically go a bit bigger
but there's not much point.  Technically, some hardware can support a
scalar workgroup size up to 2048 but most apps don't go any bigger than
1024.

As discussed on the merge request page, the current limit assumes
SIMD32, but it is unclear if we want to encourage applications to use
SIMD32 if it may lead to additional register spilling in shader
programs. Many applications have likely tuned for a limit of 1024
based on the OpenGL minimum limit, so it might not gain much by
advertising more than 1024.

Reworks:
 * Jordan: Use MIN2 and limit total invocations as well.
 * Jordan: Add second paragraph to commit message based on merge
   request discussion.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13538>
2021-11-08 23:07:42 +00:00
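The "MIN2 and limit total invocations" rework can be sketched roughly as below. The struct, the function names, and the way the hardware maximum is derived (thread count times SIMD width) are assumptions for illustration, not the anv or iris code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MIN2(a, b) ((a) < (b) ? (a) : (b))
#define SKETCH_MAX_DIM 1024u

struct sketch_limits {
    uint32_t max_size[3];     /* per-dimension workgroup size limit */
    uint32_t max_invocations; /* limit on x * y * z                 */
};

/* hw_threads and simd_width are hypothetical inputs describing how many
 * invocations one workgroup could hold on a given device. */
struct sketch_limits sketch_compute_limits(uint32_t hw_threads, uint32_t simd_width)
{
    assert(simd_width != 0);

    struct sketch_limits limits;
    uint32_t hw_invocations = hw_threads * simd_width;

    limits.max_invocations = SKETCH_MIN2(SKETCH_MAX_DIM, hw_invocations);
    for (int i = 0; i < 3; i++)
        limits.max_size[i] = SKETCH_MIN2(SKETCH_MAX_DIM, hw_invocations);
    return limits;
}

/* A workgroup must respect both the per-dimension and the total limit. */
bool sketch_workgroup_ok(const struct sketch_limits *limits,
                         uint32_t x, uint32_t y, uint32_t z)
{
    if (x > limits->max_size[0] || y > limits->max_size[1] ||
        z > limits->max_size[2])
        return false;
    return (uint64_t)x * y * z <= limits->max_invocations;
}
```

Under these assumptions a 1024x1x1 or 8x8x16 workgroup is accepted, while 32x32x2 passes every per-dimension check but is rejected on the total.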
Mike Blumenkrantz 8626949f07 zink: flatten out draw templates a bit
having this be super granular was a neat idea, but really I don't care
even a little bit about a driver that's weirdly implementing *only*
dynamic vertex input or *only* dynamic state2

this massively cuts down the combinatorics and provides a more accurate
gauge of driver feature levels, since this is the general level of support
that they're likely to have

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13715>
2021-11-08 21:49:40 +00:00
Marek Olšák 3d80d6b696 radeonsi: enable nir_group_loads for better performance
The best case I have is one viewperf subtest getting +9% performance.

56979 shaders in 34726 tests
Totals:
SGPRS: 2667522 -> 2669178 (0.06 %)
VGPRS: 1543608 -> 1553472 (0.64 %)
Spilled SGPRs: 4090 -> 4100 (0.24 %)
Spilled VGPRs: 1600 -> 1791 (11.94 %)
Private memory VGPRs: 256 -> 256 (0.00 %)
Scratch size: 1872 -> 2076 (10.90 %) dwords per thread
Code Size: 59443980 -> 59479804 (0.06 %) bytes
Max Waves: 867280 -> 865634 (-0.19 %)

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>

v2: No change in pixels but the hash changed.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13604>
2021-11-08 21:20:11 +00:00
Mike Blumenkrantz acddf83c95 zink: update radv ci passes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13716>
2021-11-08 20:02:26 +00:00
Gert Wollny 63c4c559cb virgl: obtain supported number of shader sampler views from host
Modern games may use more than 16 sampler views, so get what the host
actually supports, and default to 16 on old hosts that don't pass the
value.

Since the possible maximal value of PIPE_MAX_SHADER_SAMPLER_VIEWS doesn't
fit into a uint32_t, remove the binding flags; they were only used for
releasing the sampler views, and this can be achieved differently.

v2: Fix compilation error

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: John Bates <jbates@chromium.org> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13646>
2021-11-08 19:34:30 +00:00
Pierre-Eric Pelloux-Prayer e26dd92957 radeonsi/sqtt: fix FINISH_DONE / BUSY usage
They're using more than a single bit so use the proper mask.

Based on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13694

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13696>
2021-11-08 17:16:11 +00:00
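The class of bug described above, testing a multi-bit field as if it were a single bit, can be shown with a made-up register layout. The shift, width, and names below are hypothetical, and whether the real code tests for nonzero or for a specific value is not stated in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: a 2-bit field starting at bit 4. */
#define SKETCH_FINISH_DONE_SHIFT 4
#define SKETCH_FINISH_DONE_WIDTH 2
#define SKETCH_FINISH_DONE_MASK \
    (((1u << SKETCH_FINISH_DONE_WIDTH) - 1u) << SKETCH_FINISH_DONE_SHIFT)

/* Looks only at the lowest bit of the field, so it misses value 2. */
bool sketch_finish_done_single_bit(uint32_t status)
{
    return (status & (1u << SKETCH_FINISH_DONE_SHIFT)) != 0;
}

/* Extracts the whole field through its mask before testing it. */
bool sketch_finish_done_masked(uint32_t status)
{
    uint32_t field =
        (status & SKETCH_FINISH_DONE_MASK) >> SKETCH_FINISH_DONE_SHIFT;

    assert(field < (1u << SKETCH_FINISH_DONE_WIDTH));
    return field != 0;
}
```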
Pierre-Eric Pelloux-Prayer 3de072aaec radeonsi/sqtt: fix shader stage values
shader_stages_mask and others expect MESA_SHADER_* based values,
not PIPE_SHADER_*...

Without this the fragment shader wouldn't appear in the "Pipelines"
pane of RGP.

Fixes: c276bde34a ("radeonsi/sqtt: export shader code to RGP")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13696>
2021-11-08 17:16:11 +00:00
Lionel Landwerlin 361b3fee3c intel: move away from booleans to identify platforms
v2: Drop changes around GFX_VERx10 == 75 (Luis)

v3: Replace
   (GFX_VERx10 < 75 && devinfo->platform != INTEL_PLATFORM_BYT)
   by
   (devinfo->platform == INTEL_PLATFORM_IVB)
   Replace
   (devinfo->ver >= 5 || devinfo->platform == INTEL_PLATFORM_G4X)
   by
   (devinfo->verx10 >= 45)
   Replace
   (devinfo->platform != INTEL_PLATFORM_G4X)
   by
   (devinfo->verx10 != 45)

v4: Fix crocus typo

v5: Rebase

v6: Add GFX3, ILK & I965 platforms (Jordan)
    Move ifdef to code expressions (Jordan)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12981>
2021-11-08 16:48:06 +00:00
Mike Blumenkrantz fbd61d2b02 zink: set new point/line caps
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13676>
2021-11-08 14:37:49 +00:00
Marek Olšák 78337728d1 radeonsi: set correct point and line limits
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13676>
2021-11-08 14:37:49 +00:00
Marek Olšák cf9afc7b0c gallium: add missing point and line CAPs
The returned values are the same as the GL frontend.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13676>
2021-11-08 14:37:49 +00:00
Marek Olšák b80dca86c3 gallium: rename PIPE_CAPF_MAX_POINT_WIDTH -> MAX_POINT_SIZE
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13676>
2021-11-08 14:37:49 +00:00
Lionel Landwerlin a543a94404 intel/dev: fix subslice/eu total computations with some fused configurations
When a device has its first slice/subslice fused off, we can't use the
number of slices/subslices to iterate the mask array.

v2: Fix spelling (Marcin)
    Use size_t for iterator (Marcin)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Matt Roper <matthew.d.roper@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5601
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10015>
2021-11-05 10:22:18 +00:00
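Why the slice count cannot bound the loop is easier to see in a small sketch. It assumes a fixed-size array of per-slice subslice masks (`SKETCH_MAX_SLICES` entries, a made-up constant) in which a fused-off slice leaves a zero entry at its original index; the function names are illustrative and not from `intel/dev`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_SLICES 8

unsigned sketch_popcount8(uint8_t v)
{
    unsigned n = 0;
    while (v) {
        n += v & 1u;
        v >>= 1;
    }
    return n;
}

/* Problematic pattern: stops after `num_slices` entries, so an enabled
 * slice sitting behind a fused-off slice 0 is never visited. */
unsigned sketch_total_by_count(const uint8_t *masks, unsigned num_slices)
{
    assert(masks != NULL);
    unsigned total = 0;
    for (size_t i = 0; i < num_slices; i++)
        total += sketch_popcount8(masks[i]);
    return total;
}

/* Walks the whole mask array; fused-off entries contribute zero. */
unsigned sketch_total_by_array(const uint8_t masks[SKETCH_MAX_SLICES])
{
    assert(masks != NULL);
    unsigned total = 0;
    for (size_t i = 0; i < SKETCH_MAX_SLICES; i++)
        total += sketch_popcount8(masks[i]);
    return total;
}
```

With slice 0 fused off and four subslices enabled in slice 1, the device has one slice, so the count-bounded loop only inspects the empty entry at index 0.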
orbea 0a6f079afe build: add sha1_h for lp_texture.c
../mesa-9999/src/gallium/drivers/llvmpipe/lp_texture.c:55:10: fatal error: git_sha1.h: No such file or directory

Fixes: 1608a815e3 ("llvmpipe: add support for EXT_memory_object(_fd)")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: orbea <orbea@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13665>
2021-11-05 05:54:20 +00:00
Jordan Justen 6ffdcc335e iris: Use mi_builder in iris_load_indirect_location()
For example, this allows us to take advantage of command-streamer
based register offsets in mi_builder.

Ref: 06cf838cbd ("intel/mi_builder: Support gen11 command-streamer based register offsets")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13652>
2021-11-04 21:23:21 -07:00
Mike Blumenkrantz 833c0394e0 Revert "gallium/u_blitter: work around broken sample shading in llvmpipe and zink"
This reverts commit 8b287c3f92.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13679>
2021-11-05 02:36:32 +00:00
Mike Blumenkrantz 8c37cd8860 zink: rework cached fbfetch descriptor fallback
this ended up being a little trickier than I thought; lazy
descriptors don't use dynamic ubo types for the push set,
which means drivers that (correctly) assert dynamic offset existence
explode because the descriptor template will never work with the
push set

the better, though slightly more annoying, option here is to use the
lazy manager's faster descriptor allocation and lesser complexity to
quickly grab a push set, then tweak the existing cached codepath slightly
in order to update a raw vkdescriptorset

Fixes: 417477f60e ("zink: always use lazy (non-push) updating for fbfetch descriptors")

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13677>
2021-11-05 02:21:01 +00:00
Jesse Natalie 2d1f5e3dcb d3d12: Don't accumulate timestamp queries
If an app re-issues a timestamp query a lot, but doesn't ever ask
for the results, we could end up running off the end of our query
heap. But we don't actually need to advance/accumulate, so just
use a single entry in the heap.

Reviewed By: Bill Kristiansen <billkris@microsoft.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12920>
2021-11-05 00:44:15 +00:00
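A rough model of the slot selection described above: accumulating queries advance through the heap, while a timestamp query keeps rewriting one entry. The struct, the field names, and the `-1` "heap exhausted" return are hypothetical simplifications; the real driver's handling of a full heap is not described in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_query {
    bool is_timestamp;   /* timestamps only need the latest value */
    unsigned curr_slot;  /* next free slot for accumulating types  */
    unsigned num_slots;  /* size of this query's heap range        */
};

/* Pick the heap slot for the next issue of the query.
 * Returns -1 when an accumulating query has used every slot. */
int sketch_next_slot(struct sketch_query *q)
{
    assert(q != NULL);

    if (q->is_timestamp)
        return 0; /* never advances, so re-issuing cannot overrun the heap */

    if (q->curr_slot >= q->num_slots)
        return -1;

    return (int)q->curr_slot++;
}
```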
Emma Anholt b0f2b0e980 freedreno/a5xx: Clean up a little bit of blitter array pitch setup.
We have a nice helper function for determining an array pitch.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13668>
2021-11-04 22:49:29 +00:00
Emma Anholt b26e0cdf44 freedreno/a5xx: Try to fix drawing to z/s miplevel/layer offsets.
Terrifyingly, no testcases are fixed by this.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13668>
2021-11-04 22:49:29 +00:00
Emma Anholt 99f5b7ba1e freedreno/a5xx: Remove bogus assertion about BO size.
The slice->size0 temp is being used as both the array stride (incorrectly)
and as the size of the slice (for this assert).  This assert doesn't seem
to be in the right place to me; if you want to check that offset+slice
size is < bo size, you could just do that at the end of layout setup.

This caused troubles when fixing the temp to be the actual array stride
for filling out the HW state, since then rendering to nonzero levels would
think that the rendering overflowed the BO when it doesn't.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13668>
2021-11-04 22:49:29 +00:00
Emma Anholt 03d8677bca freedreno/a6xx: Try to fix drawing to z/s miplevel/layer offsets.
Terrifyingly, no testcases are fixed by this.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13668>
2021-11-04 22:49:29 +00:00
Caio Oliveira 8fc6a11f0e intel/blorp: Add option to emit packets that disable Mesh
If a driver doesn't support Mesh, don't emit anything.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13660>
2021-11-04 14:41:06 -07:00