Commit Graph

150929 Commits

Author SHA1 Message Date
Iago Toral Quiroga fed51585c4 nir/schedule: allow drivers to decide about instruction latency
On V3D reading UBOs from uniform addresses uses a more efficient
mechanism with lower latency. On other platforms there may be
simular scenarios.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
2022-03-09 15:53:04 +00:00
Iago Toral Quiroga e7a4e97076 nir/schedule: use larger delay for non-filtered memory reads
This has been pending for a long time. It is not very consistent to
add a significant delay for textures and not do it for UBOs, etc
The reason we have not been doing this so far is the accumulated effect
on register pressure for V3D as shown by shader-db results below, but
from the point of view of a generic scheduler it makes sense to do this.

Later patches will address V3D specific issues with register pressure
derived from this by letting the driver control its instruction delay
settings.

total instructions in shared programs: 12662138 -> 13126587 (3.67%)
instructions in affected programs: 1813091 -> 2277540 (25.62%)
helped: 2410
HURT: 10499

total threads in shared programs: 415858 -> 407208 (-2.08%)
threads in affected programs: 17348 -> 8698 (-49.86%)
helped: 8
HURT: 4333

total uniforms in shared programs: 3711483 -> 3812698 (2.73%)
uniforms in affected programs: 128012 -> 229227 (79.07%)
helped: 3474
HURT: 2143

total max-temps in shared programs: 2138763 -> 2318430 (8.40%)
max-temps in affected programs: 318780 -> 498447 (56.36%)
helped: 588
HURT: 11997

total spills in shared programs: 3860 -> 49086 (1171.66%)
spills in affected programs: 709 -> 45935 (6378.84%)
helped: 23
HURT: 1595

total fills in shared programs: 5573 -> 55810 (901.44%)
fills in affected programs: 1067 -> 51304 (4708.25%)
helped: 23
HURT: 1595

LOST:   3
GAINED: 0

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
2022-03-09 15:53:04 +00:00
Iago Toral Quiroga 3bd041e2fb nir/schedule: handle nir_intrinsic_group_memory_barrier
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
2022-03-09 15:53:04 +00:00
Iago Toral Quiroga 46e330c07e nir/schedule: fix handling of generic memory barrier
We can get a generic nir_intrinsic_memory_barrier to represent a
barrier involving multiple semantics (instead of getting individual
specific barriers for each semantic). This means that we need to
consider these as potentially affecting shared memory access as well.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
2022-03-09 15:53:04 +00:00
Iago Toral Quiroga 9ef499b315 broadcom/compiler: stop moving UBO loads before NIR scheduling
This doesn't have any significant impact shader-db stats and would
reduce our capacity to hide latency from the loads, so it is probably
undesirable:

total instructions in shared programs: 12663189 -> 12663186 (<.01%)
instructions in affected programs: 4222 -> 4219 (-0.07%)
helped: 9
HURT: 4

total uniforms in shared programs: 3711624 -> 3711629 (<.01%)
uniforms in affected programs: 186 -> 191 (2.69%)
helped: 0
HURT: 2

total max-temps in shared programs: 2138822 -> 2138857 (<.01%)
max-temps in affected programs: 569 -> 604 (6.15%)
helped: 1
HURT: 9

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
2022-03-09 15:53:03 +00:00
Michel Zou bf7777a5d4 lavapipe: set non-zero device/driver uuid
Closes #5875

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15230>
2022-03-09 15:10:45 +00:00
Danylo Piliaiev 2e878293f4 turnip: Make autotuner work with reusable command buffers
To achieve it each command buffer now has its own GPU memory.

However the BOs usage by autotuner is not optimal, the ideal
pattern would be to use some memory pool to suballocate small
GPU memory chunks, since most command buffers have only a few
renderpasses.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5990

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14996>
2022-03-09 12:56:31 +00:00
Gert Wollny d9e400b9b6 virgl: Add a few more formats to the format table
These formats are used by the piglit

   arb_texture_buffer_object-formats fs arb

Adding them here keeps the piglit from crashing, but most of the related
tests don't pass.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13645>
2022-03-09 09:59:53 +00:00
Kenneth Graunke 8b9045e7a4 intel: Use 3DSTATE_BINDING_TABLE_POOL_ALLOC exclusively on Gfx11+
On Icelake and later, we can use a new 3DSTATE_BINDING_TABLE_POOL_ALLOC
command to update the location of the binder (buffer containing binding
table entries), rather than having to move Surface State Base Address
via a STATE_BASE_ADDRESS command.  This has less stalling and also means
our surface addresses can remain relative to a fixed 4GB address range,
meaning we don't have to re-stream them any time the binder changes.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
2022-03-09 09:18:59 +00:00
Kenneth Graunke e3a0e97300 intel: Limit Wa_1607854226 to Gfx12.0 only
This workaround is needed on all Gfx12.0 parts, but doesn't appear to be
necessary on XeHP.  The other drivers do not appear to be applying this
workaround on those parts.  As further evidence, we accidentally added
the 3DSTATE_BINDING_TABLE_POOL_ALLOC commands after switching back to
GPGPU mode, which would be an incorrect way to implement the workaround,
and things seem to be working.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
2022-03-09 09:18:59 +00:00
Kenneth Graunke ab47cad4fb iris: Rename surface_base_address to binder_address in a few places
On Gfx11+, we're going to stop changing Surface State Base Address
and instead start changing the Binding Table Pool address instead.

So, rename a few things to track the last binder address, which is
what we're actually changing, regardless of how we program it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
2022-03-09 09:18:59 +00:00
Kenneth Graunke db34c71513 iris: Use more efficient binding table pointer formats on Icelake+.
Skylake and older use a 15:5 binding table pointer format, which means
our binder can be at most 64kB in size.  Each binding table within the
binder must be aligned to 32B.

XeHP uses a new 20:5 binding table format, which allows us to increase
the binder size to 1MB while retaining the nice 32B alignment.  Larger
binders mean fewer stalls as we update the base address for the binder.

Icelake and Tigerlake can either use the 15:5 format or an 18:8 format.
18:8 mode requires the base of each binding table to be aligned to 256B
instead of 32B, but it gives us a maximum binder size of 512kB.

We can store 64 binding table entries in a 256B chunk (256B / 4B = 64),
but only 8 entries in a 32B chunk (32B / 4B = 8).  Assuming that most
binding tables have fewer than 64 entries, this means that with the 18:8
format, we're likely to be able to fit 2048 (512KB / 256B) tables into a
a buffer before needing to allocate a new one and stall.

Technically, the old format could also store 2048 binding tables per
buffer as well (64KB / 32B = 2048).  However, tables that needed more
than 8 entries would need multiple 32B chunks.  A single table would
take multiple aligned chunks, while with the larger 256B format, it
could fit in a single one.

This cuts binder resets by 6.3% on a Shadow of Mordor benchmark trace.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
2022-03-09 09:18:59 +00:00
Jason Ekstrand a83c91a261 blorp: Add a binding_table_offset_to_pointer helper
On Gen11+, we have a feature that requires us to shift binding table
offsets by 3.  This adds a helper which gives the driver a hook to do
this if it so chooses.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
2022-03-09 09:18:59 +00:00
Pierre-Eric Pelloux-Prayer 3c3a8f853d gallium/tc: zero alloc transfers
Otherwise this causes trouble with unitialized memory, eg with:
   struct si_transfer {
      struct threaded_transfer b;
      struct si_resource *staging;
   };
'staging' will not be initialized and this causes #6109.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6109

Cc: mesa-stable
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15298>
2022-03-09 08:48:59 +00:00
Pierre-Eric Pelloux-Prayer caeec6262d util/slab: add slab_zalloc
A a variant that clears the allocated object to 0.

Cc: mesa-stable

Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15298>
2022-03-09 08:48:59 +00:00
Danylo Piliaiev 2cd30266f1 tu: Refactor VS DECODE/DEST to be emitted in two pkt4
Refactor to emit VFD_DECODE and VFD_DEST_CNTL in two packets
regardless of attribute count.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14584>
2022-03-09 08:21:40 +00:00
Mike Blumenkrantz fa323cb93a mesa/st: make export_point_size shader key clobber existing psiz
this is necessary to upload the API value using the uniform constant

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
2022-03-09 05:10:21 +00:00
Mike Blumenkrantz 9ae0cdc453 mesa/st: check max output components for adding pointsize during precompile
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
2022-03-09 05:10:21 +00:00
Mike Blumenkrantz 070a7b506d mesa/st: count FF shaders as needing psiz export for precompile
this is consistent with logic for regular compiles

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
2022-03-09 05:10:21 +00:00
Mike Blumenkrantz 1183988d74 mesa/st: precompile with API pointsize only if the shader doesn't have pointsize
this is a more accurate hint, and maintains the existing behavior given
subsequent changes to this area

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
2022-03-09 05:10:21 +00:00
Mike Blumenkrantz 96098d7c25 mesa/st: simplify pointsize precompile conditional
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
2022-03-09 05:10:21 +00:00
Mike Blumenkrantz 882abacde3 mesa/st: simplify pointsize shader update conditional
ES contexts have no API toggle for this, so it will never be flagged

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
2022-03-09 05:10:21 +00:00
Mike Blumenkrantz b73663c51e mesa: always set PointSizeEnabled for API_OPENGLES2
this is implicit, so make it explicit

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
2022-03-09 05:10:21 +00:00
Mike Blumenkrantz 9d98bc238e mesa/st: only add pointsize output if it doesn't exceed max component limit
fixes (zink-nvidia):
dEQP-GLES31.functional.geometry_shading.basic*

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
2022-03-09 05:10:21 +00:00
Mike Blumenkrantz 0d80aed363 nir/gather_info: check copy_deref instrs for writing outputs
this is a valid way to write an output even though it usually gets rewritten
to some other instruction later on

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
2022-03-09 05:10:21 +00:00
Mike Blumenkrantz fdee7240ff mesa/st: conditionally add pointsize outputs to ES tess/geom shaders
if the driver requires this value to be set, add it if the shader doesn't
use the ext to allow exporting it

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
2022-03-09 05:10:21 +00:00
Mike Blumenkrantz 597bd11b7b mesa/st: add a gl_program struct flag to skip psiz exports for xfb
if this output did not exist in the original shader,
then it must not be exported in xfb

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
2022-03-09 05:10:21 +00:00
Mike Blumenkrantz 53cbba83eb glsl: store OES/EXT point_size extension enablement to shader struct
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
2022-03-09 05:10:21 +00:00
Boris Brezillon 02906baba7 panvk: Implement vkCmdDispatch()
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15248>
2022-03-09 04:50:41 +00:00
Boris Brezillon 1056b3e39e panvk: Add support for storage image
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15248>
2022-03-09 04:50:41 +00:00
Boris Brezillon eca0a0e29e panvk: Move dummy attribute buffer emission out of emit_{attribute,varying}_bufs
So we can easily add entries after the standard varyings/attributes
(like image descriptors in the attribute array).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15248>
2022-03-09 04:50:41 +00:00
Boris Brezillon 0e7e1b64fc panvk: Add support for storage/uniform buffers with dynamic offsets
The idea of storing offsets in a separate UBO and lowering accesses to
UBOs/SSBOs with a dynamic offset was not great. Let's apply the offset
at UBO/SSBO emission time instead.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15248>
2022-03-09 04:50:41 +00:00
Boris Brezillon 13378e4129 panvk: Support creation of compute pipelines
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15248>
2022-03-09 04:50:41 +00:00
Boris Brezillon b05ffc9fec panvk: Add support for storage buffers
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15248>
2022-03-09 04:50:41 +00:00
Boris Brezillon 9a20467cf2 panvk: Add support for push constants
Push constants are stored in a separate UBO.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15248>
2022-03-09 04:50:41 +00:00
Mike Blumenkrantz 3634063b08 zink: handle spirv xfb insanity
this comes in two flavors:
* streamout of array<struct/block>
* partial streamout of array/struct/block

for the former:
* arrays of structs can just be blasted out in the initial var declaration (easy)
* arrays of blocks must be output to separate xfb buffers for each array block,
  which requires skipping initial xfb blast-off and instead propagating the values
  using tmp variables at a later point

for the latter:
* the optimal way to do this is to unwrap the struct first to figure out what's being
  emitted, at which point the value can be extracted and exported

fixes the rest of spec@arb_gl_spirv@execution@xfb
...except spec@arb_gl_spirv@execution@xfb@vs_block_array, which I'm suspecting is broken
due to vtn bugs

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15224>
2022-03-09 04:39:52 +00:00
Mike Blumenkrantz dfbb7c1f73 zink: store shader to ntv_context
it's insane the gymnastics that have to be done because this wasn't stored

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15224>
2022-03-09 04:39:52 +00:00
Mike Blumenkrantz 76fe6f7235 zink: handle remaining xfb corner cases during analysis
this now handles inlining of stupid types (dvec3, dmatX) and complex
types (goku) as seen in cts

fixes:
KHR-Single-GL46.enhanced_layouts.xfb_explicit_location
KHR-Single-GL46.enhanced_layouts.xfb_struct_explicit_location

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15224>
2022-03-09 04:39:52 +00:00
Mike Blumenkrantz 3f7da0c584 zink: fix xfb analysis variable finding for arrays
this fixes clipdistance exports

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15224>
2022-03-09 04:39:52 +00:00
Mike Blumenkrantz 4ed7329236 zink: correctly set xfb packed output offsets
cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15224>
2022-03-09 04:39:52 +00:00
Mike Blumenkrantz 432700fc61 zink: store the correct number of components for xfb packing outputs
cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15224>
2022-03-09 04:39:52 +00:00
Mike Blumenkrantz a5c7d34fdf zink: use 64bit mask for xfb analysis
I don't know how this worked before since all the values are oob?

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15224>
2022-03-09 04:39:52 +00:00
Mike Blumenkrantz a0feafbcc8 ci: more stoney flakes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14854>
2022-03-09 03:54:37 +00:00
Mike Blumenkrantz ddfa9e7cc9 ci: add another stoney flake
from #6109

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14854>
2022-03-09 03:54:37 +00:00
Mike Blumenkrantz 6ab720f1f4 aux/cso: stop tracing during cso_unbind()
this unnecessarily bloats lavapipe traces

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14854>
2022-03-09 03:54:37 +00:00
Mike Blumenkrantz 0e51c47816 aux/trace: dump more rasterizer state members
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14854>
2022-03-09 03:54:37 +00:00
Mike Blumenkrantz ec45b7ed32 aux/trace: dump clear_texture colors
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14854>
2022-03-09 03:54:37 +00:00
Mike Blumenkrantz 296e26eec8 aux/trace: dump clear colors as uints
dumping as float is nice if the clear color is a float, but if it isn't
then the value is useless

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14854>
2022-03-09 03:54:37 +00:00
Mike Blumenkrantz 8142fc5a45 aux/trace: rzalloc the context struct
this has problems if pointers are garbage

cc: mesa-stable

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14854>
2022-03-09 03:54:37 +00:00
Mike Blumenkrantz f1cdaf36df aux/trace: more screen methods
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14854>
2022-03-09 03:54:37 +00:00