We have to store the allocator that was used; otherwise the memory ends up
being freed with the wrong one.
Fix for
dEQP-VK.api.object_management.alloc_callback_fail.descriptor_set_layout*
Fixes: f94a5f30e0 ("lavapipe: add reference counting to descriptor set layout")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9247>
On Gen8, updating the clear color will end up allocating new
SURFACE_STATE entries. These might end up living in a different BO
than the original copies, which means that we have to pin _after_
updating the clear color, not before.
Found by inspection.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9257>
Now that there is a global one in util/bitscan.h
Note this version had an extra assert which is not really suitable for a
generic foreach_bit(), so just move the assert to the two users of the
iterator macro.
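As a rough illustration of what a generic set-bit iterator does, here is a
Python sketch. It is not the macro from util/bitscan.h; `foreach_bit` and
`MAX_SLOTS` below are hypothetical names used only for this example. The
limit check lives in the caller, as described above.

```python
def foreach_bit(mask):
    """Yield the index of each set bit in mask, lowest bit first."""
    while mask:
        low = mask & -mask          # isolate the lowest set bit
        yield low.bit_length() - 1  # convert it to a bit index
        mask ^= low                 # clear that bit and continue


# The caller, not the iterator, checks any usage-specific limit.
MAX_SLOTS = 8  # hypothetical limit for this example
slots = []
for b in foreach_bit(0b10010010):
    assert b < MAX_SLOTS
    slots.append(b)
```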
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9191>
this is a standardized (and very slightly improved for usability) version
of the macro that has been copied into every vulkan driver
includes fixup from Rob Clark <robclark@freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9191>
this requires setting up a spec constant on the pipeline state which can
then propagate to the shader and be used like a regular constant
all ARB_compute_variable_group_size tests should pass now
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9242>
the hardware supports it, the driver supports it, but the driver reports
a lower value due to subtracting some usage that we shouldn't exceed anyway
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9198>
non-intel platforms need border colors pre-swizzled
this is an internal khronos spec bug that will (someday) be resolved in
a more detectable manner
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9136>
For platforms which do not have support for parsing driconf from xml
files on the filesystem, build in driconf tables generated from
00-mesa-defaults.conf at compile time and use that for option matching.
This allows us to have game/engine specific overrides built in to mesa.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9179>
For the static-table alternative to WITH_XMLCONFIG, we are going to want
to re-use the element attribute processing, to avoid duplicating things
like engine name regexp matching and version range matching. This just
shuffles things around a bit so we can re-use useful parts in the next
patch.
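A minimal Python sketch of the kind of attribute matching that is worth
sharing between the XML path and a static table. The attribute names
(`engine_name_match`, `engine_versions`) and the inclusive `lo:hi` range
format are assumptions made for this example, not a statement of what the
C code does.

```python
import re


def version_in_range(version, spec):
    """spec is 'lo:hi', inclusive on both ends (assumed format)."""
    lo, hi = (int(part) for part in spec.split(":"))
    return lo <= version <= hi


def entry_matches(entry, engine_name, engine_version):
    """Return True if an entry (a dict of attributes) applies to this engine."""
    pattern = entry.get("engine_name_match")
    if pattern is not None and re.search(pattern, engine_name) is None:
        return False
    versions = entry.get("engine_versions")
    if versions is not None and not version_in_range(engine_version, versions):
        return False
    return True
```

Because the matcher only looks at a dict of attributes, the same function
can be fed from a parsed XML element or from a pre-generated table row.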
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9179>
For builds without runtime xmlconfig parsing, generate a static table
from 00-mesa-defaults.conf.
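For illustration only, a Python sketch of flattening a driconf-style XML
file into a table at build time. The element and attribute layout shown in
`SAMPLE` is an assumption, `build_static_table` is a hypothetical helper,
and a real generator would emit C source rather than a Python list.

```python
import xml.etree.ElementTree as ET


def build_static_table(xml_text):
    """Flatten <device>/<application>/<option> elements into a list of dicts."""
    table = []
    root = ET.fromstring(xml_text)
    for device in root.iter("device"):
        for app in device.iter("application"):
            table.append({
                "driver": device.get("driver"),  # None if not restricted
                "name": app.get("name"),
                "executable": app.get("executable"),
                "options": {o.get("name"): o.get("value")
                            for o in app.iter("option")},
            })
    return table


# Hypothetical input; names and values are placeholders.
SAMPLE = """
<driconf>
  <device driver="example">
    <application name="Example Game" executable="game.bin">
      <option name="example_option" value="true"/>
    </application>
  </device>
</driconf>
"""
table = build_static_table(SAMPLE)
```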
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9179>
PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE translates into
GL_MAX_*_UNIFORM_COMPONENTS, all of which are allowed to be as
low as 1024 by the GL 4.6 spec.
PIPE_CAP_MAX_SHADER_BUFFER_SIZE translates into
GL_MAX_SHADER_STORAGE_BLOCK_SIZE, which has different minimum values in
different versions of the GL spec. In the GL 4.6 spec for instance, it
is required to be 2^27, the same as what Vulkan requires.
But what these limits are in GL is irrelevant at this level of
abstraction. The OpenGL state-tracker cares, but the Gallium driver
shouldn't have to. So let's just delete those parts of the comments.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9216>
It's unnecessary to iterate twice for instructions outside loops.
Compile-time (nir_opt_dce):
Difference at 95.0% confidence
-630.64 +/- 6.18761
-27.0751% +/- 0.223134%
(Student's t, pooled s = 7.30785)
Compile-time (entire run):
Difference at 95.0% confidence
-749.54 +/- 48.8272
-1.82644% +/- 0.117838%
(Student's t, pooled s = 57.6672)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7691>
Compile-time (nir_opt_dce):
Difference at 95.0% confidence
-319.51 +/- 5.67632
-12.0627% +/- 0.208076%
(Student's t, pooled s = 6.70399)
Compile-time (overall):
Difference at 95.0% confidence
-385.025 +/- 42.1124
-0.929489% +/- 0.10139%
(Student's t, pooled s = 49.7367)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7691>
Instead of keeping a worklist of live instructions, use a bitset of live
ssa defs and iterate over instructions in reverse.
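A simplified Python sketch of the idea, restricted to straight-line SSA
code (no loops, no phis). The tuple layout and the `dce` name are
hypothetical and chosen only for this example; it is not the nir_opt_dce
implementation.

```python
def dce(instrs):
    """Remove dead instructions from straight-line SSA code.

    instrs is a list of (dest, srcs, has_side_effects) tuples, where dest
    and srcs are SSA def indices (dest may be None).
    Returns the instructions that stay, in original order.
    """
    live = 0   # bitset: bit i set means SSA def i is live
    kept = []
    for dest, srcs, side_effects in reversed(instrs):
        if side_effects or (dest is not None and (live >> dest) & 1):
            for s in srcs:
                live |= 1 << s  # sources of a live instruction are live
            kept.append((dest, srcs, side_effects))
    kept.reverse()
    return kept
```

In straight-line SSA every def precedes its uses, so one reverse pass sees
all uses of a def before reaching the def itself and no worklist is needed.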
Compile-time (nir_opt_dce):
Difference at 95.0% confidence
-931.911 +/- 4.41383
-26.0263% +/- 0.105781%
(Student's t, pooled s = 5.21293)
Compile-time (overall):
Difference at 95.0% confidence
-882.245 +/- 28.3492
-2.08541% +/- 0.0665121%
(Student's t, pooled s = 33.4818)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7691>
Split the piglit jobs into multiple parallel executions to reduce the
runtime.
v2:
- Set parallel in V3D piglit jobs.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9022>
This allows splitting a piglit job into several parallel jobs, to speed up
the execution.
Due to piglit restrictions, this only works for single profiles. Otherwise
an error will be shown in the runner.
Also, a new gitlab job variable `PIGLIT_TESTS` is introduced that
contains the excluded/included tests with `-x` or `-n`. The rest of the
piglit options go to `PIGLIT_OPTIONS` (like `--timeout n`).
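One way a test list could be divided among parallel jobs is sketched below
in Python. This is an assumption about the general approach, not the
runner's actual code; `shard` is a hypothetical helper, and the 1-based
index mirrors GitLab's `CI_NODE_INDEX` / `CI_NODE_TOTAL` variables.

```python
def shard(tests, node_index, node_total):
    """Return the tests for job node_index (1-based) out of node_total jobs."""
    if not 1 <= node_index <= node_total:
        raise ValueError("node_index out of range")
    # Striding spreads neighbouring tests across different jobs.
    return tests[node_index - 1::node_total]
```

Every test lands in exactly one shard, so the union of all parallel jobs
covers the original list.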
v2 (Andres):
- Replay profile is supported in parallel jobs.
- Bail out immediately if parallel jobs are attempted with multiple
profiles.
- Use testlist only when doing parallel jobs.
- Do not drop pass tests when filtering executed tests.
- Get rid of PIGLIT_FRACTION.
v4:
- uncommit unrelated change (Andres).
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9022>
Straightforward by using the pixel hashing table computation helper
previously introduced, assuming we know the fraction of work that
needs to be submitted to each pixel pipe. Note that AFAIA the
hardware maps indices in the table to pixel pipes from largest to
smallest, so it shouldn't be necessary to permute indices based on the
physical IDs of the pixel pipes as we are doing on Gen11.
Improves performance of most non-trivial graphics workloads I've tried
on an 80 EU TGL. E.g. the following testcases improve performance
significantly with sample size 27 and statistical significance 1%:
gputest/pixmark_piano: 62.89% ±0.10%
gputest/pixmark_volplosion: 61.51% ±0.06%
unigine/valley: 26.72% ±0.25%
gfxbench/gl_5_high: 24.70% ±0.19%
unigine/heaven: 23.54% ±0.17%
steam/csgo: 22.75% ±4.36%
gfxbench/gl_manhattan31: 22.43% ±0.29%
gfxbench/gl_4: 20.92% ±0.35%
warsow/benchsow: 19.15% ±2.53%
gfxbench/gl_trex_off: 18.84% ±0.27%
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8749>
Pixel hashing tables are a pain to type in, review and maintain IMHO.
In order to obtain satisfactory load balancing on all Gen12 parts
currently in production this series would need to add 5 different
additional tables. Instead this introduces a simple algorithm able to
calculate a table on the fly based on a handful of parameters.
Note that the Gen11 tables generated with this algorithm are not
identical to the hardcoded ones, however the only difference should be
a phase shift that isn't expected to have any effect on performance,
since it shouldn't change the fraction of work submitted to each pixel
pipe.
The CPU overhead from this change is negligible since the tables only
need to be programmed once at context init time.
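To give a feel for computing such a table from a few parameters, here is a
hypothetical Python sketch that hands out table entries in proportion to
per-pipe weights. It is not the algorithm introduced by this change:
`compute_hash_table` and its parameters are made up for the example, and
it makes no attempt to optimize the spatial pattern.

```python
def compute_hash_table(width, height, weights):
    """Return a height x width table of pipe indices.

    weights[i] is the relative share of work for pipe i. Each entry goes
    to the pipe that is currently furthest behind its target share.
    """
    total = sum(weights)
    counts = [0] * len(weights)
    table = []
    for y in range(height):
        row = []
        for x in range(width):
            n = y * width + x + 1  # number of entries once this one is placed
            pipe = max(range(len(weights)),
                       key=lambda i: weights[i] * n / total - counts[i])
            counts[pipe] += 1
            row.append(pipe)
        table.append(row)
    return table
```

With equal weights the entries are split evenly; with unequal weights the
larger pipe receives proportionally more entries.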
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8749>
Unlike Gen11, Gen12 hardware supports up to three pixel pipes per
slice.
Unfortunately the kernel interface is somewhat inconsistent between
Gen11 and Gen12: I915_PARAM_SUBSLICE_MASK returns a mask of enabled
*dual* subslices since TGL, so there is half the number of bits per
pixel pipe in the mask. This is worked around here so we're able to
calculate the correct size of each pixel pipe, but the result is
returned in dual subslice units, inheriting the inconsistency from the
kernel. The reason is that as of now all our Gen12 subslice counts
returned by gen_device_info.c are really dual subslice counts, and the
num_eu_per_subslice counts are also scaled accordingly, so it seems
like it would only make the matter worse if I fixed the units of this
field only without also fixing the rest.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8749>
This command allows programming custom pixel hashing tables
controlling the balancing of load across pixel pipes. Rather
confusingly 3DSTATE_SLICE_TABLE_STATE_POINTERS was serving the same
purpose on Gen11: A pixel is mapped to the pixel pipe with index
specified by the entry in the table corresponding to the LSBs of the
pixel coordinates [Yes, you read that right: the entries are neither subslice
nor slice indices!]. Either a 2-way or a 3-way table can be
programmed based on whether the platform has two or three pixel pipes
per slice. In addition the 16x8 tables defined below can hold two
separate 8x8 tables when in DUAL_TABLE mode (which AFAIA is only
useful for platforms with multiple asymmetric slices, i.e. no
production platforms as of today to my knowledge).
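The lookup described above can be pictured with a short Python sketch. It
assumes the table is stored as 8 rows of 16 entries and that dimensions
are powers of two; `pixel_pipe` is a hypothetical helper, not a hardware
or driver interface.

```python
def pixel_pipe(table, x, y):
    """Return the pixel pipe index for pixel (x, y).

    The table is indexed by the low bits of the coordinates, so the
    pattern repeats across the render target.
    """
    height = len(table)
    width = len(table[0])
    return table[y & (height - 1)][x & (width - 1)]


# Example 3-way table: 8 rows of 16 entries, values are pipe indices.
example_table = [[(x + y) % 3 for x in range(16)] for y in range(8)]
```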
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8749>
The former "Subslice Hashing Mode" field is no longer used by the
hardware; Gen12 parts always do 16x16 subslice pixel hashing. Remove
it since it's no longer useful. In addition, add a couple of bits that
will be useful in order to make some adjustments to the default pixel
pipe hashing behavior.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8749>
Enable the pipe capability for exporting stencil from the shader when the
Vulkan extension is available.
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9244>
Enable SPV_EXT_stencil_export and SpvCapabilityStencilExportEXT, and
mark the output with FragStencilRefEXT when the fragment shader writes the
stencil reference value.
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9244>
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9244>
this attempts to dynamically establish an upper bound for per-batch descriptor
use, flushing all batches and resetting the pools on alloc failure in
an attempt to be more robust about it
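The recovery path can be sketched in Python as follows. `Pool`,
`PoolExhausted` and `alloc_descriptor` are hypothetical stand-ins invented
for this example; they only model the control flow (fail, flush, reset,
retry), not the driver's data structures.

```python
class PoolExhausted(Exception):
    """Raised when the pool has no free descriptors left."""


class Pool:
    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0

    def alloc(self):
        if self.used >= self.capacity:
            raise PoolExhausted()
        self.used += 1
        return self.used - 1  # index of the allocated descriptor

    def reset(self):
        self.used = 0


def alloc_descriptor(pool, flush_batches):
    """Allocate, and on failure flush all batches, reset the pool, retry once."""
    try:
        return pool.alloc()
    except PoolExhausted:
        flush_batches()  # nothing may still reference old descriptors
        pool.reset()
        return pool.alloc()
```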
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9117>
previously if any of the pending clears required an explicit clear then
we'd clear them explicitly, but with this patch we're shifting the first
pending clear into the renderpass begin if possible and then applying the
remaining clears on top of that in order to reduce gpu operations
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9206>
we have src regions for all the blit/copy/map calls, so we can use those to
verify whether we actually need to apply the clears now or if we can keep
sitting on them a while longer
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9206>
if we know we're going to be reading from a region then we can examine the
pending clears to see if there's any overlap, which helps us decide whether
we need to apply them immediately
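a small Python sketch of the overlap test, assuming 2D regions given as
(x0, y0, x1, y1) with exclusive upper bounds; the function names are
hypothetical and the real code may track regions differently:

```python
def regions_overlap(a, b):
    """a and b are (x0, y0, x1, y1) boxes with exclusive upper bounds."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def must_apply_clears(read_region, pending_clears):
    """True if any pending clear touches the region about to be read."""
    return any(regions_overlap(read_region, c) for c in pending_clears)
```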
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9206>