Commit Graph

144959 Commits

Author SHA1 Message Date
Jesse Natalie 29e3094d1e symbols-check: Fix symbol demangling for Windows
Only strip leading underscores if there's also a trailing @
Fixes shared-glapi symbol check for x64

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12881>
2021-09-16 17:38:58 +00:00
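The rule described in 29e3094d1e can be sketched as below. This is an illustration only, not the code from the symbols-check script: the function name `strip_stdcall_decoration` and the sample symbols are assumptions. On 32-bit x86 Windows, stdcall symbols are decorated as `_name@N`; x64 symbols carry no such decoration, so a leading underscore there belongs to the real name.

```python
import re

# 32-bit stdcall decoration: leading underscore, the name, then "@" and a byte count.
_STDCALL = re.compile(r"^_(?P<name>[^@]+)@\d+$")


def strip_stdcall_decoration(symbol: str) -> str:
    """Undecorate only when both the underscore prefix and the @N suffix are present."""
    match = _STDCALL.match(symbol)
    return match.group("name") if match else symbol
```

A name such as `_glapi_get_dispatch` has no `@N` suffix, so it passes through unchanged.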
Mike Blumenkrantz ed794207b5 zink: pass all modifiers through to image creation
let the driver figure these out after zink guarantees that at least one of them
will work

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12857>
2021-09-16 12:36:20 -04:00
Mike Blumenkrantz 4e6d78c3b0 zink: pre-filter multi-plane modifiers
only single plane modifiers are supported now

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12857>
2021-09-16 12:36:20 -04:00
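A pre-filter like the one in 4e6d78c3b0 might look like the sketch below. The function name, the modifier values, and the `plane_counts` mapping are all hypothetical; in a real driver the plane count would be queried from the Vulkan implementation rather than supplied as a dictionary.

```python
def filter_single_plane(modifiers, plane_counts):
    """Keep only modifiers whose layout uses exactly one memory plane, preserving order."""
    return [m for m in modifiers if plane_counts.get(m) == 1]
```

Modifiers missing from the mapping are dropped too, which errs on the side of advertising less.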
Mike Blumenkrantz cfe0bf5a4a zink: unbreak dmabuf handling
this does need kms handling to do literally anything.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12857>
2021-09-16 12:36:20 -04:00
Mike Blumenkrantz b0318216fa zink: store drm fd to screen
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12857>
2021-09-16 12:36:19 -04:00
Jason Ekstrand 6c7d23e6ca nir: Stop sweeping indirects
They're no longer ralloc'd.

Fixes: 879a569884 "nir: Switch from ralloc to malloc for NIR instructions."
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12884>
2021-09-16 11:28:36 +00:00
Jason Ekstrand d1eae6f36b nir: Properly clean up nir_src/dest indirects
Now that they're no longer ralloc'd, we have to be much more careful
about indirects.  We have to make sure every time a source or
destination is overwritten, its indirect (if any) is freed.  We also
have to choose a memory ownership convention for the rewrite functions.
Assuming that they will be called with the source from some other
instruction, we choose to always make a copy of the indirect (if any).
It's the responsibility of the caller to ensure its copy of the indirect
is freed.

Unfortunately, all this extra logic is going to make
nir_instr_rewrite/move_src/dest more expensive because they now have
all the logic of nir_src/dest_copy instead of a simple struct
assignment.  Fortunately, the vast majority of rewrite calls are done by
nir_ssa_def_rewrite_uses which is an SSA-only fast-path.

Fixes: 879a569884 "nir: Switch from ralloc to malloc for NIR instructions."
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12884>
2021-09-16 11:28:36 +00:00
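The ownership convention from d1eae6f36b can be modelled with a toy allocation counter. This is a sketch in Python, not NIR code: `Indirect`, `Src`, and `rewrite_src` are hypothetical stand-ins, and `Indirect.live` merely simulates outstanding heap allocations.

```python
class Indirect:
    """Stands in for a heap-allocated indirect; `live` counts outstanding allocations."""

    live = 0

    def __init__(self, base):
        self.base = base
        Indirect.live += 1

    def free(self):
        Indirect.live -= 1


class Src:
    def __init__(self, indirect=None):
        self.indirect = indirect


def rewrite_src(dst, new):
    """Overwrite dst with new: copy new's indirect first, then free dst's old one."""
    copy = Indirect(new.indirect.base) if new.indirect is not None else None
    if dst.indirect is not None:
        dst.indirect.free()
    dst.indirect = copy
```

Because `rewrite_src` always copies, the caller still owns `new.indirect` afterwards and has to free it.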
Timur Kristóf 92e1981a80 radv: Remove PSIZ output when it isn't needed.
PSIZ output is only needed when:
1. There is a next stage and it reads it.
2. Primitive topology is point list, in the last vertex pipeline stage.

Zink always adds this output in its vertex (and other) shaders,
because it helps Zink avoid recompiling shader variants.

However, this has a performance impact for RADV because
it needs a scalar memory load. That becomes noticeable
at high primitive rates.

The Fossil stats are unremarkable because our DB doesn't include any
shaders from Zink or D9VK, but there are a few affected shaders.

Note that there may be an increase in LDS use in some GS. This is
because with PSIZ removed the ES per-vertex LDS size is smaller, so
we can squeeze more GS threads in the same workgroup.

Fossil DB stats on Sienna Cichlid:

Totals from 14 (0.01% of 128647) affected shaders:
CodeSize: 119884 -> 119732 (-0.13%)
LDS: 235008 -> 228864 (-2.61%); split: -2.83%, +0.22%
Instrs: 23076 -> 23048 (-0.12%)
Latency: 71667 -> 71625 (-0.06%)
InvThroughput: 19155 -> 18870 (-1.49%)
Copies: 1586 -> 1572 (-0.88%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10725>
2021-09-16 11:06:05 +00:00
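The two conditions listed in 92e1981a80 reduce to a small predicate. The sketch below only restates them; the function and parameter names are assumptions and do not come from RADV.

```python
def psiz_output_needed(has_next_stage, next_stage_reads_psiz,
                       is_last_vertex_stage, topology_is_point_list):
    """Return True when the PSIZ output must be kept, per the two listed conditions."""
    # 1. There is a next stage and it reads PSIZ.
    if has_next_stage and next_stage_reads_psiz:
        return True
    # 2. Point-list topology, in the last vertex pipeline stage.
    if is_last_vertex_stage and topology_is_point_list:
        return True
    return False
```

In every other case the output can be removed.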
Dave Airlie a2c30c1488 docs: update docs for new llvmpipe/lavapipe features
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11816>
2021-09-16 04:15:41 +00:00
Dave Airlie c1de9eff01 lavapipe: enable KHR_shader_subgroup_extended_types
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11816>
2021-09-16 04:15:41 +00:00
Dave Airlie 143167f2a0 gallivm/nir: handle subgroup reduction across all types
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11816>
2021-09-16 04:15:41 +00:00
Dave Airlie 3a27e406ed lavapipe: enable KHR_shader_float16_int8
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11816>
2021-09-16 04:15:41 +00:00
Dave Airlie f814a2449e llvmpipe: enable FP16 and update CL + traces piglit results.
The fails will be addressed later.

This adds a failure in the GLSL compiler, caused by a workaround
that breaks when fp16 constants are lowered.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11816>
2021-09-16 04:15:41 +00:00
Dave Airlie 0d3b285360 gallivm: use llvm intrinsics for 16-bit round/trunc/roundeven
Otherwise the inf translations don't seem to work, and the VK CTS
fails

Fixes VK CTS dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic*

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11816>
2021-09-16 04:15:41 +00:00
Dave Airlie 2277386565 gallivm: increase tgsi nesting call stack size
Some VK CTS tests are topping this out around 76, increase it to 80 for now.

Fixes:
dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.*44*

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11816>
2021-09-16 04:15:41 +00:00
Dave Airlie c118888f92 gallivm/nir: pass the correct float builder to ddx/y
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11816>
2021-09-16 04:15:41 +00:00
Dave Airlie bceae73b3f gallivm/nir: call pow with correct flt builder
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11816>
2021-09-16 04:15:41 +00:00
Dave Airlie 836b0ace10 gallivm/nir: handle 16-bit exp/log using intrinsics.
This just passes the 16-bit float versions to the llvm intrinsics

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11816>
2021-09-16 04:15:41 +00:00
Dave Airlie 3e773501d9 llvmpipe: lower_flrp16
fixes a bunch of spir-v 16-bit tests

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11816>
2021-09-16 04:15:41 +00:00
Dave Airlie 6decb1b896 gallivm: add 16-bit sin/cos via llvm intrinsic
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11816>
2021-09-16 04:15:41 +00:00
Dave Airlie af49f9697a gallivm/nir: handle non-32bit mask scatter stores
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11816>
2021-09-16 04:15:41 +00:00
Dave Airlie 0d4f17fe1f gallivm/nir: fix f2b32
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11816>
2021-09-16 04:15:41 +00:00
Dave Airlie 0776628d1d gallivm/nir: handle conversion to 16-bit texel fetch
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11816>
2021-09-16 04:15:41 +00:00
Dave Airlie c396067366 gallivm: add initial support for 16-bit float builder.
This is an initial patch that is needed for OpenCL and Vulkan
support for proper 16-bit floats.

This doesn't enable the cap bit yet

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11816>
2021-09-16 04:15:41 +00:00
Mike Blumenkrantz 0418b98569 zink: cap max shader variants with inlined uniforms
avoid making a new shader for every frame forever

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:32 -04:00
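One way to stop compiling a new shader for every frame is to cap the number of specialized variants and fall back to a generic one. The sketch below assumes that approach; the cap value, the class, and the fallback policy are hypothetical and are not taken from zink.

```python
MAX_INLINED_VARIANTS = 5  # hypothetical cap, chosen only for illustration


class VariantCache:
    def __init__(self, compile_fn):
        self._compile = compile_fn
        self._inlined = {}     # uniform values -> specialized variant
        self._generic = None   # variant that reads uniforms at run time

    def get(self, uniform_values):
        key = tuple(uniform_values)
        if key in self._inlined:
            return self._inlined[key]
        if len(self._inlined) < MAX_INLINED_VARIANTS:
            variant = self._compile(key)
            self._inlined[key] = variant
            return variant
        # Cap reached: compile the generic variant once and reuse it from now on.
        if self._generic is None:
            self._generic = self._compile(None)
        return self._generic
```

With a uniform that changes every frame, this compiles at most `MAX_INLINED_VARIANTS + 1` shaders.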
Mike Blumenkrantz 1aa0b2777d zink: simplify shader variant update loop
a single continue makes this much easier to read

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:32 -04:00
Mike Blumenkrantz fb9e9401c9 zink: split out inlined uniform shader variants into separate cache
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:32 -04:00
Mike Blumenkrantz ad32e41efe zink: remove default_variants storage in program struct
these should naturally be the first entry in the list when it matters

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:32 -04:00
Mike Blumenkrantz 334576569e zink: replace shader module hash table with a list
this should be significantly more performant in the majority of cases:
shaders rarely have multiple variants outside of unit tests, so a plain
list of shaders can be iterated instead, with the most recently used
variant as its first entry

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:32 -04:00
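The list described in 334576569e behaves like a move-to-front list. A minimal sketch, assuming variants are stored as `(key, module)` pairs and that `create` is a hypothetical callback that builds a missing module:

```python
def find_variant(variants, key, create):
    """Linear search over (key, module) pairs; a hit is moved to the front of the list."""
    for i, (k, module) in enumerate(variants):
        if k == key:
            if i:
                variants.insert(0, variants.pop(i))
            return module
    module = create(key)
    variants.insert(0, (key, module))
    return module
```

When a shader has a single variant, the search ends at the first comparison, with no hashing involved.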
Mike Blumenkrantz c995d7bad8 zink: move shader cache to gfx program struct
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:32 -04:00
Mike Blumenkrantz 834cc07e5b zink: stop using hash table for compute programs
this is pointless since there are no variants yet

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:32 -04:00
Mike Blumenkrantz 19e99e46db zink: store shader key to shader module
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:32 -04:00
Mike Blumenkrantz 8e78a6f67d zink: move uniform size calc for shader keys into keybox
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:32 -04:00
Mike Blumenkrantz 19fbdb9064 zink: move shader keys to be persistent on pipeline state
save a cycle or two zeroing and populating this on every recalc

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:32 -04:00
Mike Blumenkrantz 80604fee4a zink: move xfb updates to just before draw
it's illegal to bind the pipeline after xfb has begun

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:32 -04:00
Mike Blumenkrantz ff5991e86a zink: simplify flagging last vertex stage for updating
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:32 -04:00
Mike Blumenkrantz e515d9791e zink: only update gfx pipeline cache after creating a real pipeline
async pipelines may not require updates here

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:32 -04:00
Mike Blumenkrantz 7438d670dd zink: remove some ctx references from shader/pipeline compile
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:32 -04:00
Mike Blumenkrantz c587152eba zink: remove ctx references from shader compile path
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:31 -04:00
Mike Blumenkrantz ab4d8ed1e9 zink: make tcs shader generation take screen param
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:31 -04:00
Mike Blumenkrantz da10f13de9 zink: move pending prim type to gfx pipeline struct
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12842>
2021-09-15 23:52:31 -04:00
Dave Airlie 259e26e5e3 llvmpipe/cs: rework coroutine context handling (v2)
Get comfy.

llvmpipe coroutines have a stack frame. This is created by hooking
in malloc and coro.alloc and coro.size intrinsics.

LLVM has a CoroElide pass that is meant to allow that stack frame
to be done as an alloca in the caller instead of using the malloc path.

The CoroElide pass relies on the coroutine being inlined (fixed that).

The CoroElide pass relies on there being a direct connection between
coro.destroy(i8 *arg) and arg = coro.begin(id). However, due to the
way the compute shaders are launched, there is no way to ensure that
link. Fixing the CoroElide pass seems quite difficult. I considered
adding a flag to always force CoroElide so it does the right thing,
but I'm not sure how ugly that would end up.

My first attempt tried to preallocate the stacks at a fixed size.
This turned out to be naive, as the stack frame size was not what
I expected. Instead, the first coro to run allocates enough for
everyone, which avoids the massive number of small allocations.

This removes coro malloc from a lot of profiles and shaves another 30s
or so from OpenCL ./conversions/test_conversions uchar_uin
(from 4.40m to just under 4m on my ryzen 7 1800x)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12432>
2021-09-16 13:21:34 +10:00
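The "first coro to run allocs enough for everyone" idea from 259e26e5e3 can be sketched as a lazily created shared buffer. This is a Python model, not llvmpipe code: the class name is hypothetical, and it assumes every coroutine in a launch needs the same frame size.

```python
class CoroFramePool:
    """One shared buffer, allocated lazily by whichever coroutine asks first."""

    def __init__(self, num_coros):
        self.num_coros = num_coros
        self.buffer = None
        self.frame_size = 0
        self.allocations = 0

    def frame_for(self, coro_index, frame_size):
        if self.buffer is None:
            # The first caller sizes the buffer for all coroutines at once.
            self.frame_size = frame_size
            self.buffer = bytearray(frame_size * self.num_coros)
            self.allocations += 1
        start = coro_index * self.frame_size
        return memoryview(self.buffer)[start:start + self.frame_size]
```

Each coroutine gets a disjoint slice, so one large allocation replaces one small allocation per coroutine.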
Dave Airlie 8d3e97344c llvmpipe: shorten hold time on the screen mutex
There is no requirement to hold this mutex over the wait. I doubt
it matters much in practice.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12432>
2021-09-16 13:21:29 +10:00
Dave Airlie 4ccee031e9 gallivm/coro: use a phi instead of alloca
this just matches what the docs recommend

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12432>
2021-09-16 13:21:27 +10:00
Dave Airlie 69109e0b19 llvmpipe/cs: rework thread pool to avoid mtx locking
This helps reduce the mtx lock/unlock overhead for the threadpool
if the work evenly distributes across the number of threads.

The CL CTS conversions tests really hit this, and this takes maybe 10-20s
off a 5min test run.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12432>
2021-09-16 13:21:06 +10:00
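One way to avoid taking the mutex per work item is to hand each thread a fixed range up front. The sketch below assumes that scheme; the function name and the way the remainder is spread are illustrative and not taken from llvmpipe.

```python
def partition_iterations(total_iters, num_threads):
    """Return one (start, count) range per thread; the first `remainder` threads take one extra."""
    per_thread, remainder = divmod(total_iters, num_threads)
    ranges = []
    start = 0
    for t in range(num_threads):
        count = per_thread + (1 if t < remainder else 0)
        ranges.append((start, count))
        start += count
    return ranges
```

Each thread can then walk its own range without synchronizing with the others, which pays off when the work divides evenly.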
Mike Blumenkrantz 53aade0ef0 zink: fix enabled vertex buffer mask calculation
the mask can't entirely be calculated based on the integer parameters,
as it's possible for some of the "bind" slots to actually be unbinds,
so remove bits as necessary to fix this

also add some debug asserts to ensure I don't break this again for the
tenth time

Fixes: 6dd02a5139 ("zink: stop using util_set_vertex_buffers_mask()")

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12871>
2021-09-16 01:43:40 +00:00
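The mask fix in 53aade0ef0 can be illustrated as follows. This is a sketch with hypothetical names: `buffers` is the list passed for slots `start_slot` onward, and a `None` entry stands for an unbind.

```python
def enabled_buffers_mask(old_mask, start_slot, buffers):
    """Update a bitmask of bound vertex buffer slots; `None` entries unbind their slot."""
    count = len(buffers)
    range_mask = ((1 << count) - 1) << start_slot
    mask = old_mask | range_mask  # what the integer parameters alone would suggest
    for i, buf in enumerate(buffers):
        if buf is None:
            mask &= ~(1 << (start_slot + i))  # an unbind must clear its bit
    return mask
```

Without the loop, a slot that is "bound" to nothing would still be reported as enabled.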
Icecream95 09bb8602f3 pan/bi: Don't set dependencies for +BLEND in blend shaders
The dependency wait should already have been done in the fragment
shader.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12461>
2021-09-15 22:42:03 +00:00
Dave Airlie d9a784520a lavapipe: enable dynamic index ubo/ssbo
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12689>
2021-09-16 08:05:59 +10:00
Dave Airlie fc0bf57632 gallivm/ssbo: cast ssbo index to int type.
Since these can be loaded from ubos or other places now.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12689>
2021-09-16 08:05:56 +10:00
Dave Airlie 1ccac4abff gallivm/ssbo: fix up dynamic indexed ssbo load/stores/atomics
Although the index has to be dynamically uniform, if we don't ever
execute a few lanes then we'll have 0, so it is important to read the
ssbo index from the first active lane.

Just loop over them all.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12689>
2021-09-16 08:05:51 +10:00
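Reading the index from the first active lane, as described in 1ccac4abff, amounts to a scan over the execution mask. The sketch below models lanes as Python lists; the function name and the list representation are assumptions, and the actual change is made in generated LLVM IR.

```python
def first_active_lane_index(exec_mask, per_lane_index):
    """Scan every lane and return the index held by the first active one, or None if none is active."""
    for active, index in zip(exec_mask, per_lane_index):
        if active:
            return index
    return None
```

Taking lane 0 unconditionally would return 0 whenever that lane never executed, even though the active lanes agree on a different index.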