New extensions properties/feature are being put in the `vn_physical_device`
which is not ideal from an organization point of view.
Here the `vn_physical_device_{features,properties}` are two new struct to
help the `vn_physical_device` organzation.
Signed-off-by: Igor Torrente <igor.torrente@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15170>
Something is slightly off in the integer values returned. It passes many
tests without the fixup, but the dEQP-GLES31 tests complain. The blob
ends up doing 3x gathers, and selects between them based on getinfo
results. Since we already have a per-sampler key with some spare bits,
just stick the bit-size info in there. And we can derive signedness from
the associated type info.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>
It can't be done. This just provides bad results. The blob had a
comparable approach where they fixed up coordinates, but that also can't
work with a separate texture definition with nearest filtering. By then,
might as well provide a unswizzled variant instead, and using native
functionality.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>
v_cvt_pknorm_{u16,i16}_f32 can be emitted instead, it's supported on
all generations.
No fossils-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15215>
From Section 4.4.1 (Input Layout Qualifiers) of the GLSL 4.50 spec:
"For some blocks declared as arrays, the location can only be applied
at the block level: When a block is declared as an array where
additional locations are needed for each member for each block array
element, it is a compile-time error to specify locations on the block
members. That is, when locations would be under specified by applying
them on block members, they are not allowed on block members. For
arrayed interfaces (those generally having an extra level of
arrayness due to interface expansion), the outer array is stripped
before applying this rule"
From Section 1.2.1 (Changes from Revision 6 of GLSL Version) of the GLSL 4.50 spec:
"Private Bug 15678: Don’t allow location = on block members where
the block needs an array of locations"
From Section 4.4.1 (Input Layout Qualifiers) of the GLSL ES 3.20 spec
"If an input is declared as an array of blocks, excluding per-vertex-arrays
as required for tessellation, it is an error to declare a member of
the block with a location qualifier"
From Section 1.1.3 (Changes from GLSL ES 3.2 revision 3) of the GLSL ES 3.20 spec:
"Arrayed blocks cannot have layout location qualifiers on members"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11522>
if one of these states change then it affects which result needs to be
used for that query, so split it up over multiple query ids to make sure
the correct result is obtained
fixes (lavapipe):
GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_pause_resume
GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_states
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>
Looks like an extension number was assigned in late 2020. This makes it
possible to hook up this format to teximage-colors without teaching it
about ES.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15211>
When the AOS/linear code was added it only worked with TGSI which
meant nothing in mesa upstream was really using it.
This adds support to analyse NIR shaders, and adds aos support
to the backend.
AOS support is limited to mov,vec,fmul,tex sampling in order to
accelerate mostly compositing operations. I've tested weston uses
the fast path. gnome-shell can't use it yet as we can't optimise
the depth test paths.
Acked-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15140>
This fixes a regressions with overlap in llvmpipe, this is pessimistic
we should write code to make it work properly.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15219>
Technically this is just JUMP on Valhall, but the semantic is an indirect branch
based on comparing with zero. It can also be used as a conservative branch (like
BRANCHC), but this isn't modeled.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
For Bifrost, we model load/store segments, for example for thread local storage.
We need something similar on Valhall -- access modifiers. There are four access
modifiers on Valhall, controlling memory subsystem optimizations for the access:
none: Nothing may be assumed. Corresponds to "global".
istream: Internally streaming within the GPU. Corresponds to "pos", as it's
used for position stores.
estream: Externally streaming outside the GPU. Corresponds to "vary", as it's
used for varying stores.
force: Force access in discarded threads. Corresponds to "tl", as it's required
for correct behaviour of helper invocations that use the stack.
If these access modifiers end up being useful outside these fixed purposes, we
may need to rework this part of the IR. For now, this should suffice.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
For now, the difference does not matter. However it's better to model the actual
hardware behaviour, rather than isomorphic driver behaviour, when we can do so.
So fix the names.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Technically the table is folded, too; the 0xD refers to table 61. But this
instruction is more general than previously thought.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
So close! However, LD_VAR_IMM is something else -- Bifrost-style varying
interpolation, without a hardware buffer. For ES3, we'll need to support both.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
This can avoid some cases where a constant has to be loaded into a
temporary register.
v2: Update i915-g33-fails.txt.
total instructions in shared programs: 788625 -> 782376 (-0.79%)
instructions in affected programs: 166269 -> 160020 (-3.76%)
helped: 1578
HURT: 0
helped stats (abs) min: 3 max: 21 x̄: 3.96 x̃: 3
helped stats (rel) min: 1.56% max: 33.33% x̄: 4.82% x̃: 3.45%
95% mean confidence interval for instructions value: -4.06 -3.86
95% mean confidence interval for instructions %-change: -5.00% -4.64%
Instructions are helped.
LOST: 0
GAINED: 35
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15210>
I noticed the SGE case while looking at the output of
shaders/closed/steam/trine-2/fp-3.shader_test on i915g. These are
especially bad on i915 that needs two instructions to implement SNE.
An alternative would be to duplicate the sne(sXX(a, b), 0.0) rules in an
algebraic pass that occurs after bool_to_float. Doing the work earlier
seems preferable.
i915
total instructions in shared programs: 788274 -> 788223 (<.01%)
instructions in affected programs: 666 -> 615 (-7.66%)
helped: 5
HURT: 0
helped stats (abs) min: 9 max: 12 x̄: 10.20 x̃: 9
helped stats (rel) min: 5.00% max: 11.11% x̄: 8.12% x̃: 8.16%
95% mean confidence interval for instructions value: -12.24 -8.16
95% mean confidence interval for instructions %-change: -10.81% -5.43%
Instructions are helped.
LOST: 0
GAINED: 2
The two gained shaders are assembly fragment programs in Euro Truck
Simulator 2.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15210>
This should reduce follow-on optimization work to copy-propagate and
dead-code away the movs generated in construction of vectors.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14865>
Trivial, but will help users avoid some struct constructions that can be
awkward in C.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14865>
Many users of nir_vec() do so by nir_channel()-ing a new ssa defs as movs
from other vectors to put the new vector together, which then just have to
get copy-propagated into the ALU srcs and DCEed away the temporary movs.
If they instead take nir_ssa_scalar, we can avoid that extra work.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14865>
"queueFamilyIndexCount is the number of queue families having access to the image(s) of the
swapchain when imageSharingMode is VK_SHARING_MODE_CONCURRENT.
pQueueFamilyIndices is a pointer to an array of queue family indices having access to the
images(s) of the swapchain when imageSharingMode is VK_SHARING_MODE_CONCURRENT."
If the type isn't concurrent, don't attempt to access the arrays.
dEQP-VK.wsi.xlib.swapchain.create.exclusive_nonzero_queues on lavapipe.
Fixes: 5b13d74583 ("vulkan/wsi/drm: Break create_native_image in pieces")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15101>
We were just emitting the bad reloc (either an assert fail on a debug
build or for a release build likely a GPU hang from the resulting fault).
Given that the GLES 3.2 spec's robust context requirement says we should
return undefined data but not terminate for element indices outside of the
VB, ignoring the offset in this case seems like a better behavior to have
in all cases.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15198>
We only have a single prog_data::total_scratch for all shader variants
(SIMD 8, 16, 32). Therefore we should always max the total_scratch on
top of existing variant.
We probably haven't run into that issue before because we compile by
increasing SIMD size and higher SIMD size is more likely to spill. But
for bindless shaders with return shaders, if the last return part
doesn't spill, we completely ignore the previous parts' scratch
computation.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15193>
Seems we already had implemented this feature (see commit 521e1d0275
"broadcom/vc5: Add support for anisotropic filtering"), but we didn't
enable the proper capability.
Also update the maximum level of anistropy supported.
Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4201
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15180>
Also change the code to preserve certain metadata: control flow is not changed
so both block indices and dominance information is preserved.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15206>
Now that we don't sort our nodes we can arrange them so we can
easily translate between nodes and temps without a mapping table,
just applying an offset.
To do this we have a single array of nodes where twe put first the nodes
for accumulators and then the nodes for temps. With this setup we can
ensure that for any given temp T, its node is always T + ACC_COUNT.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
Nodes are allocated in order to registers so initially sorting
was used to ensure that nodes with smaller life ranges would
be assigned first and therefore be more likely to get
accumulators.
However, since d81a6e5f1d now we don't rely on order to make
decisions about accumulators and instead we make policy decisions
based on actual liveness, so sorting is no longer strictly
relevant to this decision.
Furthermore, we are not re-sorting nodes after each spill either,
since that would probably require that we rebuild the interference
graph after each spill (the graph identifies nodes by their index).
Shader-db results show a significant improvement in instruction
counts, due to more optimal accumulator assignments. The reason for
this is that we use a round-robin policy for choosing the next
accumulator to assign. The idea behind this is preventing nearby
temps to be assigned to the same accumulator so that QPU scheduling
is more flexible, but if we sort our nodes, we are basically not
assigning temps in program order any more and the round-robin policy
becomes less effective:
total instructions in shared programs: 13000420 -> 12663189 (-2.59%)
instructions in affected programs: 11791267 -> 11454036 (-2.86%)
helped: 62890
HURT: 19987
total threads in shared programs: 415874 -> 415870 (<.01%)
threads in affected programs: 20 -> 16 (-20.00%)
helped: 2
HURT: 4
total uniforms in shared programs: 3711652 -> 3711624 (<.01%)
uniforms in affected programs: 43430 -> 43402 (-0.06%)
helped: 134
HURT: 173
total max-temps in shared programs: 2144876 -> 2138822 (-0.28%)
max-temps in affected programs: 123334 -> 117280 (-4.91%)
helped: 4112
HURT: 1195
total spills in shared programs: 3870 -> 3860 (-0.26%)
spills in affected programs: 1013 -> 1003 (-0.99%)
helped: 14
HURT: 12
total fills in shared programs: 5560 -> 5573 (0.23%)
fills in affected programs: 1765 -> 1778 (0.74%)
helped: 14
HURT: 17
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
For us they are basically uniforms too so we want to make their
lifespans short to facilitate allocating them to accumulators.
total instructions in shared programs: 13043585 -> 13015385 (-0.22%)
instructions in affected programs: 8326040 -> 8297840 (-0.34%)
helped: 24939
HURT: 19894
total threads in shared programs: 415860 -> 415858 (<.01%)
threads in affected programs: 4 -> 2 (-50.00%)
helped: 0
HURT: 1
total uniforms in shared programs: 3721953 -> 3720451 (-0.04%)
uniforms in affected programs: 96134 -> 94632 (-1.56%)
helped: 744
HURT: 435
total max-temps in shared programs: 2173431 -> 2154260 (-0.88%)
max-temps in affected programs: 264598 -> 245427 (-7.25%)
helped: 10858
HURT: 841
total spills in shared programs: 4005 -> 4010 (0.12%)
spills in affected programs: 700 -> 705 (0.71%)
helped: 5
HURT: 10
total fills in shared programs: 5801 -> 5817 (0.28%)
fills in affected programs: 1346 -> 1362 (1.19%)
helped: 6
HURT: 11
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
If we are compiling with a strategy that does not allow TMU spills
we should not allow spilling anything that is not a uniform.
Otherwise the RA cost/benefit algorithm may choose to spill a
temp that is not uniform and that will cause us to immediately
fail the strategy and fallback to the next one, even if we
could've instead chosen to spill more uniforms to compile the
program successfully with that strategy.
Some relevant shader-db stats:
total instructions in shared programs: 13040711 -> 13043585 (0.02%)
instructions in affected programs: 234238 -> 237112 (1.23%)
helped: 73
HURT: 172
total threads in shared programs: 415664 -> 415860 (0.05%)
threads in affected programs: 196 -> 392 (100.00%)
helped: 98
HURT: 0
total uniforms in shared programs: 3717266 -> 3721953 (0.13%)
uniforms in affected programs: 12831 -> 17518 (36.53%)
helped: 6
HURT: 100
total max-temps in shared programs: 2174177 -> 2173431 (-0.03%)
max-temps in affected programs: 4597 -> 3851 (-16.23%)
helped: 79
HURT: 21
total spills in shared programs: 4010 -> 4005 (-0.12%)
spills in affected programs: 55 -> 50 (-9.09%)
helped: 5
HURT: 0
total fills in shared programs: 5820 -> 5801 (-0.33%)
fills in affected programs: 186 -> 167 (-10.22%)
helped: 5
HURT: 0
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
Our cost was 5 which matches the number of instructions we have to
add for a TMU spill (a fill is 4 instructions).
Uniform spills on the other hand add an extra instruction for each
fill and remove one instruction for the spill itself. These have
a cost of 1.
Therefore, if we have a single spill+fill, we end up with +9
instructions if it is a TMU spill and +0 instructions with a uniform
spill, so making the former only 5 times more costly is probably
not a good idea, and this is without even considering the added
latency of the TMU accesses.
Relevant shader-db changes show this causes as a marginal instruction
count increase in a few shaders but better thread counts and lower
TMU spilling overall:
total instructions in shared programs: 13037315 -> 13040711 (0.03%)
instructions in affected programs: 370106 -> 373502 (0.92%)
helped: 187
HURT: 321
total threads in shared programs: 415090 -> 415664 (0.14%)
threads in affected programs: 574 -> 1148 (100.00%)
helped: 287
HURT: 0
total uniforms in shared programs: 3706674 -> 3717266 (0.29%)
uniforms in affected programs: 63075 -> 73667 (16.79%)
helped: 40
HURT: 395
total max-temps in shared programs: 2176080 -> 2174177 (-0.09%)
max-temps in affected programs: 15838 -> 13935 (-12.02%)
helped: 316
HURT: 34
total spills in shared programs: 4247 -> 4010 (-5.58%)
spills in affected programs: 2599 -> 2362 (-9.12%)
helped: 107
HURT: 14
total fills in shared programs: 6121 -> 5820 (-4.92%)
fills in affected programs: 3622 -> 3321 (-8.31%)
helped: 108
HURT: 13
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
The problem is that dirty_states must be 0 for any state that is NULL
in "queued". This code was flagging dirty_states for such states because
it was only looking at "emitted". It should have been looking at "queued".
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15209>
moved from radeonsi without the vectorization, which won't be needed for
now. We will lower IO in st/mesa instead of radeonsi to get the transform
feedback info into store instructions.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
This is for drivers that have separate store instructions for varyings,
system value outputs (such as clip distances), and transform feedback.
The flags tell the driver not to store the output to those locations.
This will be used by radeonsi initially, and then maybe by a new linker.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
NIR now fully contains pipe_stream_output_info in shader_info and IO
intrinsics if lower_io_variables is true. radeonsi will not use
pipe_stream_output_info after this, and other drivers are free to follow
that.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
This will allow compaction of transform feedback varyings because they
are no longer tied to varying slots with this information.
It will also make transform feedback info available to all NIR passes
after IO is lowered. It's meant to replace pipe_stream_output_info.
Other intrinsics are not used with transform feedback.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
For dmabuf imports, configure the primary surface without support for
compression if the modifier doesn't specify it. This helps to create VkImages
with memory requirements that are compatible with the buffers apps provide.
Suggested-by: Philip Langdale <philipl@overt.org>
Cc: 22.0 <mesa-stable>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5940
Tested-by: Philip Langdale <philipl@overt.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15181>
Use a variable to store the anv_image_create_info struct. We'll modify it for a
bug fix in the next patch.
Cc: 22.0 <mesa-stable>
Tested-by: Philip Langdale <philipl@overt.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15181>
Replace the create_info parameter with isl_extra_usage_flags to more closely
match the parameters of explicit layout function.
Tested-by: Philip Langdale <philipl@overt.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15181>
These are unified in the hardware, so let's unify them in pan_shader_info.
Hoisting this logic to pan_shader.c avoids the need to duplicate this logic for
Midgard/Bifrost (RSD packing) and Valhall (SPD packing).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
Instead of specifying the tiling on the texture descriptor, Valhall specifies it
on the plane descriptor. There is a new flag on the texture descriptor
specifying only whether the planes are interleaved or not.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
The plane descriptor is larger than earlier surface descriptors, so we need to
be somewhat careful here. This removes a memory micro-optimization in the
interest of simplifying the code.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
Unnecessary. To avoid even more #if/#endif soup, merge the v4, v5-v8, and v9
paths together -- by returning 0 as the compression tag on v4 or v9.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
It's probably harmless, but it is logically meaningless. The DDK doesn't do it,
I don't see a reason for us to, either. In theory this should be a small
overhead win.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
To dump all graphics memory via the new pandecode_dump_mappings function(),
since for Valhall I have to do this often enough to warrant a dynamic flag.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
This hook will be repurposed on Valhall to prepare the Shader Program
Descriptor, which takes the role of the RSD. Rename to avoid confusion.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
Influences cache prefetching. I don't see a good reason to put anything other
than descriptors inside shader resources, meaning always setting this bit is
appropriate (at least for GLES).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
Disable fast clear if not supported by the external modifier.
v2: Set fast_clear value to NONE in case of import/export from/to external
v3: Move logic next to existing acquire/release checks (Nanley)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6056
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15096>
Add support of GUEST_VRAM type of blob. These are dedicated heap memory
allocations required for vk support on hypervisors that don't support
runtime injections of host memory into guest physical address space.
The flow of usage:
1) Host VM reserves dedicated heap memory
2) Device get info about memory reservations and report it to guest
using mmio registers
3) Guest virtio-gpu driver on starts checks mmio registers for
physical address and length of reserved region. Then it reserves it
in guest.
4) On each call of vkAllocateMemory() guest driver gets chunk of
required memory and send it to host using sg list. It uses one sg
entry for 1 blob call. Heap is managed on guest using drm memory
manager (drm_mm).
Signed-off-by: Oleksandr.Gabrylchuk <Oleksandr.Gabrylchuk@opensynergy.com>
Signed-off-by: Andrii Pauk <Andrii.Pauk@opensynergy.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14536>
It provides similar solution as in [1].
This was workaround for the users of gbm_bo_create_with_modifiers(),
which were unable to specify the buffer usage (GPU / GPU+DISPLAY).
But after the commit [2] this become possible. And forcing usage to
GBM_BO_USE_SCANOUT migrated directly into gbm_bo_create_with_modifiers
[3], allowing us to remove such workarounds from the drivers.
[1]: ef3b31c9 ("v3d: Don't force SCANOUT for PIPE_BIND_SHARED requests")
[2]: 268e12c6 ("gbm: add gbm_{bo,surface}_create_with_modifiers2")
[3]: ad50b47a ("gbm: assume USE_SCANOUT in create_with_modifiers")
Suggested-by: Roman Stratiienko <roman.o.stratiienko@globallogic.com>
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5642
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14264>
v2.
- Build the runner image as part of the CI for the boot2container
project, rather than as a manually step using the build instructions
in valve-trigger.dockerfile.
- Depend on a non-default kernel build hosted in the valve-infra
package repository. This does reduce the current caching feature of
local artifacts, but makes it easier to chop and change kernels on a
per-project or even per-test basis.
v3.
- Depend on a kernel built and stored in the valve-infra generic
package repo.
- Build the runner container using ci-templates as part of the CI in
valve-infra.
- Now that the runner container is built in the valve-infra CI, I
dropped the source import of client.py and message.py. They are
built in the runner container.
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14660>
See 786fa3435c for the rationale of this variable, but the point is to
avoid many error reports for conformance conformance issues within the
VK-CTS shaders.
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14660>
nir_var_shader_out writes are only used for later TES invocations, so I
don't think there's any need for the TCS workgroup to wait for them.
fossil-db (Sienna Cichlid):
Totals from 1691 (1.04% of 162293) affected shaders:
Instrs: 710699 -> 709008 (-0.24%)
CodeSize: 3830168 -> 3823404 (-0.18%)
Latency: 3396997 -> 3007934 (-11.45%)
InvThroughput: 1212094 -> 1082823 (-10.67%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15195>
I happened to have these plugged in. Ran them against mesa 21.3 and
recent VK-GL-CTS tree (shortly after vulkan-cts-1.2.8).
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14797>
This reverts commit 382f6ccda8.
The spec requires that all color images created with the same tiling
(and a few other properties) support the same memoryTypeBits. So this
wasn't a valid change. It also wasn't necessary - we already have a
mechanism in anv_BindImageMemory2 for disabling compression if the BO
doesn't support it.
With this, XeHP passes the tests in
dEQP-VK.memory.requirements.*tiling_optimal
Fixes: 382f6ccd ("anv: Require the local heap for CCS on XeHP")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15068>
When an image configured for HIZ_CCS/HIZ_CCS_WT is bound to a BO lacking
implicit CCS, we disable any compression it may have had. Such images
are still compatible with ISL_AUX_USAGE_HIZ however. Fall back to that
aux usage to retain the performance benefit.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15068>
When an image is bound to a BO lacking implicit CCS, we disable any
compression it may have had. This is unnecessary in the cases where the
compression type doesn't depend on the BO having implicit CCS support.
Avoid this disabling for ISL_AUX_USAGE_MCS and ISL_AUX_USAGE_HIZ.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15068>
In the bindless case, we don't have to keep any shadow descriptors and
can just reuse the IBO descriptor as a texture descriptor. Now that
we're emitting the swizzle we can just flip this on.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15114>
ncoords includes the array index, and the NIR source has the array index
as its last component, so we have to insert the extra y coordinate in
the middle in this case.
Fixes: 0bb0cac ("freedreno/ir3: handle image buffer")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15114>
Since these were reverse-engineered, it's become clear that IBO
descriptors are just a subset of texture descriptors, and bindless reads
of readonly images actually use isam on the IBO descriptor, further
confirming that the two are always compatible, even if not all of the
texture fields exist for IBOs. It's pointless to have a separate type
for IBOs, and just leads to things getting out-of-sync unnecessarily
which has already happened. Just remove it and use TEX_CONST insted.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15114>
CAN_REORDER takes volatile into account, and is closer to what we
actually require to use texture instructions, which is that we can
arbitrarily reorder loads.
Fixes: aa93896 ("freedreno/ir3: adjust condition for when to use ldib")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15114>
From Vulkan spec for VkDrmFormatModifierProperties2EXT:
"drmFormatModifierTilingFeatures is a bitmask of VkFormatFeatureFlagBits
that are supported by any image created with format and drmFormatModifier."
"The returned drmFormatModifierTilingFeatures must contain at least one bit."
"Therefore, if the returned drmFormatModifier is DRM_FORMAT_MOD_LINEAR,
then drmFormatModifierPlaneCount must equal the format planecount, and
drmFormatModifierTilingFeatures must be identical to the
VkFormatProperties2::linearTilingFeatures returned in the same pNext chain."
Relevant tests: dEQP-VK.drm_format_modifiers.*
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15032>
this creates the view type expected by the shader instead of doing weird stuff
like trying to create a 3D imageview with layers > 1
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15172>
now when gl_PointSize and gl_PointSizeMESA are both present, the former
will be used for xfb with a new location and the latter will be
exported by the shader
fixes (zink):
GTF-GL46.gtf30.GL3Tests.transform_feedback.transform_feedback_builtins
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15184>
for this pass to work with xfb, the original value in the shader must be
preserved when xfb is active, and the driver must export only the newly
created output
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15184>
creating this at the start of the shader means it will get optimized out
when the pass is used to overwrite existing psiz values, and creating it
at the end means it will get optimized out in geometry shaders, so instead
just walk the instructions and create another store right after the existing one
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15184>
Fixes build on OpenBSD/macppc powerpc
error: incompatible pointer types passing 'int *' to parameter of type 'size_t *'
(aka 'unsigned long *') [-Werror,-Wincompatible-pointer-types]
Fixes: 01bd21eef8 ("gallium: Import Dennis Smit cpu detection code.")
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6511>
On mips64, the compiler does not allow use of non-zero argument with
__builtin_frame_address(). However, the returned frame address is only
used when PIPE_ARCH_X86 is defined. The compile error can be avoided
by making #ifdef PIPE_ARCH_X86 cover the getting of frame address too.
The argument checking of __builtin_frame_address() has been present
as a debug assert in clang 8. In clang 10, there is a proper runtime
check for the argument. This is why the build has not failed before.
Fixes: dc94a0506f ("gallium: Do not add -Wframe-address option for gcc <= 4.4.")
from Visa Hankala
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6511>
Make this build on clang architectures that don't have 64-bit atomic
instructions. Clang doesn't allow redeclaration (and therefore
redefinition) of the __sync_* builtins. Use #pragma redefine_extname
to work around that restriction. Clang also turns __sync_add_and_fetch
into __sync_fetch_and_add (and __sync_sub_and_fetch into
__sync_fetch_and_sub) in certain cases, so provide these functions as
well.
Fixes: a6a38a038b ("util/u_atomic: provide 64bit atomics where they're missing")
patch from Mark Kettenis
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6511>
The structure wrapped around the rb tree node was being freed, but not the node
itself, which caused a segmentation fault when accessing its parent node.
Add rb tree node remove call to fix it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15188>
need to iterate over the descriptors in the binding to invalidate the whole
thing here
=================================================================
==546534==ERROR: AddressSanitizer: heap-use-after-free on address 0x61a0000ae6c0 at pc 0x7fe20e26fd9d bp 0x7ffd92be6bc0 sp 0x7ffd92be6bb8
READ of size 8 at 0x61a0000ae6c0 thread T0
#0 0x7fe20e26fd9c in zink_descriptor_set_refs_clear ../src/gallium/drivers/zink/zink_descriptors.c:950
#1 0x7fe20e401304 in zink_destroy_surface ../src/gallium/drivers/zink/zink_surface.c:340
#2 0x7fe20e21311b in zink_surface_reference ../src/gallium/drivers/zink/zink_surface.h:106
#3 0x7fe20e21a5b9 in zink_sampler_view_destroy ../src/gallium/drivers/zink/zink_context.c:835
#4 0x7fe20c41d35f in tc_sampler_view_destroy ../src/gallium/auxiliary/util/u_threaded_context.c:1848
#5 0x7fe20e210ff7 in pipe_sampler_view_reference ../src/gallium/auxiliary/util/u_inlines.h:216
#6 0x7fe20e22d592 in zink_set_sampler_views ../src/gallium/drivers/zink/zink_context.c:1532
#7 0x7fe20c41a3d8 in tc_call_set_sampler_views ../src/gallium/auxiliary/util/u_threaded_context.c:1393
#8 0x7fe20c411706 in tc_batch_execute ../src/gallium/auxiliary/util/u_threaded_context.c:211
#9 0x7fe20c4124ba in _tc_sync ../src/gallium/auxiliary/util/u_threaded_context.c:362
#10 0x7fe20c42b728 in tc_destroy ../src/gallium/auxiliary/util/u_threaded_context.c:4250
#11 0x7fe20b65176a in st_destroy_context_priv ../src/mesa/state_tracker/st_context.c:387
#12 0x7fe20b65669f in st_destroy_context ../src/mesa/state_tracker/st_context.c:1009
#13 0x7fe20b7055ab in st_context_destroy ../src/mesa/state_tracker/st_manager.c:944
#14 0x7fe20a9c75bd in dri_destroy_context ../src/gallium/frontends/dri/dri_context.c:256
#15 0x7fe20a9d4bef in driDestroyContext ../src/gallium/frontends/dri/dri_util.c:534
#16 0x7fe22361f25c in drisw_destroy_context ../src/glx/drisw_glx.c:429
#17 0x7fe223625d95 in glXDestroyContext ../src/glx/glxcmds.c:523
#18 0x7fe22636aaeb in glXDestroyContext /home/zmike/src/libglvnd-v1.3.2/src/GLX/libglx.c:332
#19 0x7fe2269d9e7d in glXDestroyContext /home/zmike/src/libglvnd-v1.3.2/src/GL/g_libglglxwrapper.c:384
#20 0x41b88a in tcu::lnx::x11::glx::GlxRenderContext::~GlxRenderContext() /home/zmike/src/VK-GL-CTS/framework/platform/lnx/X11/tcuLnxX11GlxPlatform.cpp:734
#21 0x41b8e9 in tcu::lnx::x11::glx::GlxRenderContext::~GlxRenderContext() /home/zmike/src/VK-GL-CTS/framework/platform/lnx/X11/tcuLnxX11GlxPlatform.cpp:735
#22 0x2323aa7 in deqp::gles31::Context::destroyRenderContext() /home/zmike/src/VK-GL-CTS/modules/gles31/tes31Context.cpp:77
#23 0x2323969 in deqp::gles31::Context::~Context() /home/zmike/src/VK-GL-CTS/modules/gles31/tes31Context.cpp:55
#24 0x232278e in deqp::gles31::TestPackage::deinit() /home/zmike/src/VK-GL-CTS/modules/gles31/tes31TestPackage.cpp:102
#25 0x2c866c2 in tcu::DefaultHierarchyInflater::leaveTestPackage(tcu::TestPackage*) /home/zmike/src/VK-GL-CTS/framework/common/tcuTestHierarchyIterator.cpp:75
#26 0x2c87058 in tcu::TestHierarchyIterator::next() /home/zmike/src/VK-GL-CTS/framework/common/tcuTestHierarchyIterator.cpp:252
#27 0x2c365da in tcu::TestSessionExecutor::iterate() /home/zmike/src/VK-GL-CTS/framework/common/tcuTestSessionExecutor.cpp:122
#28 0x2c00b0c in tcu::App::iterate() /home/zmike/src/VK-GL-CTS/framework/common/tcuApp.cpp:221
#29 0x4141b7 in main /home/zmike/src/VK-GL-CTS/framework/platform/tcuMain.cpp:58
#30 0x7fe2263e155f in __libc_start_call_main (/lib64/libc.so.6+0x2d55f)
#31 0x7fe2263e160b in __libc_start_main_impl (/lib64/libc.so.6+0x2d60b)
#32 0x413fa4 in _start (/home/zmike/src/VK-GL-CTS/build/external/openglcts/modules/glcts+0x413fa4)
0x61a0000ae6c0 is located 64 bytes inside of 1328-byte region [0x61a0000ae680,0x61a0000aebb0)
freed by thread T0 here:
#0 0x7fe226cb6627 in free (/usr/lib64/libasan.so.6+0xae627)
#1 0x7fe20aab1751 in unsafe_free ../src/util/ralloc.c:302
#2 0x7fe20aab16c8 in unsafe_free ../src/util/ralloc.c:295
#3 0x7fe20aab13c3 in ralloc_free ../src/util/ralloc.c:265
#4 0x7fe20e269234 in descriptor_pool_free ../src/gallium/drivers/zink/zink_descriptors.c:286
#5 0x7fe20e26937d in descriptor_pool_delete ../src/gallium/drivers/zink/zink_descriptors.c:296
#6 0x7fe20e26ff53 in zink_descriptor_pool_reference ../src/gallium/drivers/zink/zink_descriptors.c:967
#7 0x7fe20e270db2 in zink_descriptor_program_deinit ../src/gallium/drivers/zink/zink_descriptors.c:1071
#8 0x7fe20e3b6536 in zink_destroy_gfx_program ../src/gallium/drivers/zink/zink_program.c:695
#9 0x7fe20e1eaaf9 in zink_gfx_program_reference ../src/gallium/drivers/zink/zink_program.h:242
#10 0x7fe20e20d386 in zink_shader_free ../src/gallium/drivers/zink/zink_compiler.c:2099
#11 0x7fe20e3b9f0b in zink_delete_shader_state ../src/gallium/drivers/zink/zink_program.c:1074
#12 0x7fe20c3e29ad in util_shader_reference ../src/gallium/auxiliary/util/u_live_shader_cache.c:188
#13 0x7fe20e3ba11e in zink_delete_cached_shader_state ../src/gallium/drivers/zink/zink_program.c:1093
#14 0x7fe20c41709e in tc_call_delete_fs_state ../src/gallium/auxiliary/util/u_threaded_context.c:998
#15 0x7fe20c411706 in tc_batch_execute ../src/gallium/auxiliary/util/u_threaded_context.c:211
#16 0x7fe20c4124ba in _tc_sync ../src/gallium/auxiliary/util/u_threaded_context.c:362
#17 0x7fe20c423683 in tc_flush ../src/gallium/auxiliary/util/u_threaded_context.c:3003
#18 0x7fe20b62d996 in st_flush ../src/mesa/state_tracker/st_cb_flush.c:60
#19 0x7fe20b62dbe3 in st_glFlush ../src/mesa/state_tracker/st_cb_flush.c:94
#20 0x7fe20ae4bded in _mesa_make_current ../src/mesa/main/context.c:1493
#21 0x7fe20ae49702 in _mesa_free_context_data ../src/mesa/main/context.c:1187
#22 0x7fe20b65668b in st_destroy_context ../src/mesa/state_tracker/st_context.c:1005
#23 0x7fe20b7055ab in st_context_destroy ../src/mesa/state_tracker/st_manager.c:944
#24 0x7fe20a9c75bd in dri_destroy_context ../src/gallium/frontends/dri/dri_context.c:256
#25 0x7fe20a9d4bef in driDestroyContext ../src/gallium/frontends/dri/dri_util.c:534
#26 0x7fe22361f25c in drisw_destroy_context ../src/glx/drisw_glx.c:429
#27 0x7fe223625d95 in glXDestroyContext ../src/glx/glxcmds.c:523
#28 0x7fe22636aaeb in glXDestroyContext /home/zmike/src/libglvnd-v1.3.2/src/GLX/libglx.c:332
#29 0x7fe2269d9e7d in glXDestroyContext /home/zmike/src/libglvnd-v1.3.2/src/GL/g_libglglxwrapper.c:384
previously allocated by thread T0 here:
#0 0x7fe226cb691f in __interceptor_malloc (/usr/lib64/libasan.so.6+0xae91f)
#1 0x7fe20aab0c81 in ralloc_size ../src/util/ralloc.c:120
#2 0x7fe20aab0e33 in rzalloc_size ../src/util/ralloc.c:153
#3 0x7fe20aab12c8 in rzalloc_array_size ../src/util/ralloc.c:233
#4 0x7fe20e26c76d in allocate_desc_set ../src/gallium/drivers/zink/zink_descriptors.c:657
#5 0x7fe20e26e9cb in zink_descriptor_set_get ../src/gallium/drivers/zink/zink_descriptors.c:840
#6 0x7fe20e2747aa in zink_descriptors_update ../src/gallium/drivers/zink/zink_descriptors.c:1424
#7 0x7fe20e36fc48 in void zink_draw<(zink_multidraw)1, (zink_dynamic_state)2, true, false>(pipe_context*, pipe_draw_info const*, unsigned int, pipe_draw_indirect_info const*, pipe_draw_start_count_bias const*, unsigned int, pipe_vertex_state*, unsigned int) ../src/gallium/drivers/zink/zink_draw.cpp:788
#8 0x7fe20e29166d in zink_draw_vbo<(zink_multidraw)1, (zink_dynamic_state)2, true> ../src/gallium/drivers/zink/zink_draw.cpp:907
#9 0x7fe20c424982 in tc_call_draw_single ../src/gallium/auxiliary/util/u_threaded_context.c:3155
#10 0x7fe20c411706 in tc_batch_execute ../src/gallium/auxiliary/util/u_threaded_context.c:211
#11 0x7fe20c4124ba in _tc_sync ../src/gallium/auxiliary/util/u_threaded_context.c:362
#12 0x7fe20c41f7a9 in tc_texture_map ../src/gallium/auxiliary/util/u_threaded_context.c:2279
#13 0x7fe20b630757 in pipe_texture_map_3d ../src/gallium/auxiliary/util/u_inlines.h:572
#14 0x7fe20b6341f6 in st_ReadPixels ../src/mesa/state_tracker/st_cb_readpixels.c:546
#15 0x7fe20b42fea7 in read_pixels ../src/mesa/main/readpix.c:1178
#16 0x7fe20b42fea7 in _mesa_ReadnPixelsARB ../src/mesa/main/readpix.c:1195
#17 0x7fe20b42ffc0 in _mesa_ReadPixels ../src/mesa/main/readpix.c:1210
#18 0x2a6d094 in glu::readPixels(glu::RenderContext const&, int, int, tcu::PixelBufferAccess const&) /home/zmike/src/VK-GL-CTS/framework/opengl/gluPixelTransfer.cpp:61
#19 0x29eaa06 in deqp::gls::ShaderExecUtil::FragmentOutExecutor::execute(int, void const* const*, void* const*) /home/zmike/src/VK-GL-CTS/modules/glshared/glsShaderExecUtil.cpp:677
#20 0x25a600b in iterate /home/zmike/src/VK-GL-CTS/modules/gles31/functional/es31fOpaqueTypeIndexingTests.cpp:585
#21 0x2322b53 in deqp::gles31::TestCaseWrapper<deqp::gles31::TestPackage>::iterate(tcu::TestCase*) /home/zmike/src/VK-GL-CTS/modules/gles31/tes31TestCaseWrapper.hpp:86
#22 0x2c376fd in tcu::TestSessionExecutor::iterateTestCase(tcu::TestCase*) /home/zmike/src/VK-GL-CTS/framework/common/tcuTestSessionExecutor.cpp:302
#23 0x2c366e3 in tcu::TestSessionExecutor::iterate() /home/zmike/src/VK-GL-CTS/framework/common/tcuTestSessionExecutor.cpp:139
#24 0x2c00b0c in tcu::App::iterate() /home/zmike/src/VK-GL-CTS/framework/common/tcuApp.cpp:221
#25 0x4141b7 in main /home/zmike/src/VK-GL-CTS/framework/platform/tcuMain.cpp:58
#26 0x7fe2263e155f in __libc_start_call_main (/lib64/libc.so.6+0x2d55f)
cc: mesa-stable
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15173>
this ensures that swapchain images will have been acquired before potentially
accessing swapchain images bound as descriptors
fixes caselist like:
dEQP-GLES31.functional.fbo.color.texcubearray.r8ui
dEQP-GLES31.functional.primitive_bounding_box.blit_fbo.blit_default_to_fbo
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15173>
Correct type for sysctl argument to fix the build.
../src/util/u_cpu_detect.c:631:29: error: incompatible pointer types passing 'int *' to parameter of type 'size_t *' (aka 'unsigned long *') [-Werror,-Wincompatible-pointer-types]
sysctl(mib, 2, &ncpu, &len, NULL, 0);
^~~~
Fixes: 5623c75e40 ("util: Fix setting nr_cpus on some BSD variants")
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13448>
move include so va_list will be picked up via stdarg.h
In file included from ../src/util/u_printf.cpp:24:
../src/util/u_printf.h:43:41: error: unknown type name 'va_list'; did you mean '__va_list'?
size_t u_printf_length(const char *fmt, va_list untouched_args);
^~~~~~~
__va_list
/usr/include/machine/_types.h:126:27: note: '__va_list' declared here
typedef __builtin_va_list __va_list;
^
and add includes to u_printf.h as suggested by Ilia Mirkin
stdarg.h for va_list and stddef.h for size_t
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13448>
Put linux specific path inside an ifdef. Unbreaks mips64 build on
OpenBSD and likely other systems without Elf64_auxv_t.
Fixes: 88b234d7a7 ("gallivm: add basic mips64 support and set mcpu to mips64r5 on ls3a4000")
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15166>
If we say that per-primitive attributes are flat (which is communicated by
3DSTATE_SBE.ConstantInterpolationEnable), GPU freaks out and applies it
to other (non-flat) attributes.
Fixes: be89ea3231 ("intel/compiler: Handle per-primitive inputs in FS")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15169>
Fix the definitions of the basic texturing instructions. In particular, a
register format and a write mask were previously missing, as well as incorrect
handling of staging registers.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15182>
Ocassionally the 0 value has a meaningful value that's not meaningfully default,
so we want an enum to encode both possible states.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15182>
This required relaxing a core NIR assertion which I don't think is doing
any important validation.
The shader-db effects here are small, but they're important for avoiding a
regression when we start doing per-component DCE in opt_shrink_vectors
(https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468)
softpipe shader-db:
total instructions in shared programs: 2859777 -> 2859454 (-0.01%)
instructions in affected programs: 18881 -> 18558 (-1.71%)
total temps in shared programs: 293994 -> 293914 (-0.03%)
temps in affected programs: 418 -> 338 (-19.14%)
i915g:
total instructions in shared programs: 407562 -> 407544 (<.01%)
instructions in affected programs: 570 -> 552 (-3.16%)
r300:
total instructions in shared programs: 1414450 -> 1414459 (<.01%)
instructions in affected programs: 44494 -> 44503 (0.02%)
total vinst in shared programs: 473782 -> 473727 (-0.01%)
vinst in affected programs: 1102 -> 1047 (-4.99%)
total sinst in shared programs: 231224 -> 231216 (<.01%)
sinst in affected programs: 432 -> 424 (-1.85%)
total temps in shared programs: 197605 -> 197607 (<.01%)
temps in affected programs: 103 -> 105 (1.94%)
crocus hsw:
total instructions in shared programs: 8158185 -> 8158134 (<.01%)
instructions in affected programs: 10927 -> 10876 (-0.47%)
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15178>
Because we have to maintain two different packages of Mesa, one
specific to RADV and another one for RadeonSI and such, it's a bit
annoying to have to synchronize the drirc entries. Currently, only our
Mesa package installs 00-mesa-defaults.conf which means we have to
backport the drirc RADV changes.
This splits 00-mesa-defaults.conf in two to move the drirc RADV entries
to src/amd/vulkan/00-radv-defaults.conf. Meson will install the file
only if RADV is built.
There is still a caveat for common drirc workarounds like for WSI but
they are rare enough and we could still duplicate them if needed.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15152>
This new mode will be only used for the actual payload variables and
not the number of launched mesh shader workgroups, which will still
be treated as an output.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14930>
Task shader outputs work differently than other shaders, so they
need special consideration. Essentially, they have two kinds of
outputs:
1. Number of mesh shader workgroups to launch.
Will be still represented by a shader output.
2. Optional payload of up to (at least) 16K bytes.
These payload variables behave similarly to shared memory, but
the spec doesn't actually define them as shared memory (also, they
may be implemented differently by each backend), so we need to add
a new NIR variable mode for them.
These payload variables can't be represented by shader outputs
because the 16K bytes don't fit the 32x vec4 model that NIR uses
for its output variables.
This patch adds a new NIR variable mode: nir_var_mem_task_payload
and corresponding explicit I/O intrinsics, as well as support for
this new mode in nir_lower_io.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14930>
Several of the new draw packets need this argument
including all of the taskmesh commands, so it's
best to always declare it.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
This makes our implementation friendlier to potentially buggy shaders,
meaning that it will less likely to hang the GPU.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
The problem is that the real workgroup launched on NGG HW
can be larger than the size specified by the API, and the
extra waves need to keep up with barriers in the API waves.
There are 2 different cases:
1. The whole API workgroup fits in a single wave.
We can shrink the barriers to subgroup scope and
don't need to insert any extra ones.
2. The API workgroup occupies multiple waves, but not
all. In this case, we emit code that consumes every
barrier on the extra waves.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
This makes it impossible for out of bounds vertex and primitive
attribute stores and indices stores to overwrite this.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
This commit adds support for Vulkan backend on a630_skqp job.
= Needed changes
- Needed to install libvulkan-dev package on system
- Refactored the way the available skqp reports are printed
tested in development builds with skia tools
Piglit expectations had to be updated in various drivers due to !14750 not
having bumped the tags when it tried to uprev.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14686>
The Android CTS 10 version is relative old when compared with skia main
branch, which was being used before. Some modifications in the skqp
build/runner scripts were needed to make it run on CI.
- skqp versions from android-cts have already all assets inside
platform_tools folder.
- along with the assets, are the render and unit files which are
expected to pass in the Android CTS execution.
- removed custom test files from the a630 folder, to make it comply
with the CTS expectations.
- include new patches to remove Python2 dependencies and avoid the
installation of it in rootfs.
- strip binariesthe built binaries `skqp` and `list_gpu_unit_tests`, as
`is_debug = false` gn argument did not work, maybe it is not well
tested in development builds with skia tools
- use Clang instead of GCC. The GCC support is not so graceful as it is
in the skia main branch, some NEON instructions needs to be turned off
in the GCC compilation, causing different tests result. This change
does not imply a bigger rootfs, since the built skqp binary uses GCC
libc++ and other library runtimes. So clang is just a build
dependency.
= Changes in skqp results =
Some errors were found for GL backend and unit tests. GLES and VK tests are green.
All the failed tests were classified as expected to fail in the render and unit tests list.
```
gl_blur2rectsnonninepatch
gl_bug339297_as_clip
gl_bug6083
gl_dashtextcaps
```
```
SRGBReadWritePixels (../../tests/SRGBReadWritePixelsTest.cpp:214 Could not create sRGB surface context. [OpenGL])
```
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14686>
Now the main deqp and piglit run takes about 4:30 of runner time in a
single job.
Added a couple of flakes that hit this MR, but which I think predate it
(probably due to not having #zink-ci until recently).
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15159>
When we shadow a resource, the backing BO is changed; as such,
existing references to the resource become invalid. So batches accessing the
resource need to be flushed (or otherwise have their references invalidated).
The wrong behaviour change (not flushing) was introduced when we started
tracking resources instead of BOs. The issue manifested as a severe performance
regression in glmark2's -bbuffer test, particular the subdata subtest. The issue
is magnified on slow CPUs; without the fix, the test becomes completely CPU
bound
Relevant glmark2 -bbuffer test from 43fps to 84fps.
Apparently, this causes functional issues too -- this performance-minded change
also fixes a few piglits.
Fixes: cecb889481 ("panfrost: Do tracking of resources, not BOs")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Chris Healy <cphealy@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13502>
Roughly use the freedreno logic to handle all the extra things that will
come up in our Piglit sooner than later.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13203>
The MI_FLUSH_DW post-sync operation uses the same encoding as the
PIPE_CONTROL one so we can use the same helper. Write PS Depth Count
is not supported, of course, as the blitter has no depth pipeline.
This means that we can write the timestamp register from the blitter.
Fixes: 604d97671b ("iris: Add support for flushing the blitter (hackily)")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15157>
All of IF, ELSE, ENDIF, BREAK and CONTINUE were already translated
to the predication instructions in rc_vert_fc so all the flow control
we count at the moment is just BGNLOOP and ENDLOOP.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15077>
This allows to disregard the huge shaders that won't run anyway
and hopefully make catching shader regressions that result in a
compile failure easier.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15077>
To make sure it is applied in the cases we expect it to be, to avoid code
generation regressions. Functional regressions are expected to be caught by
integration-testing, so that is not focused on here.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9438>
If a message-passing instruction like LD_VAR is preloaded, it will no longer be
counted in the shader cycle counts. Add a special message preload counter that
approximates the cost of preloading, so this information doesn't get a lost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9438>
Include full message preload descriptors in the RSD on v7, and do the obvious
packing for fragment shader message preloads.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9438>
The compiler will soon produce preloaded messages, but it should not pack them
itself, as this would require depending on GenXML or handcoding bitfields / bit
packs in the compiler. Instead, add a struct encoding the unpacked form of the
message, used as ABI between the compiler and the common driver.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9438>
Currently we have to add almost the same code to the
`vn_physical_device_init_{features, properties}` to add
the extension to the `physical_dev->{features, properties}`
list.
These macros improves the code reusage.
Signed-off-by: Igor Torrente <igor.torrente@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15059>
Actually an xfail but occassionally passes and gives us no new information, only
noise.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-and-acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15154>
On V3D the quality of the code we generate is significantly affected by
how we decide to assign accumulators during register allocation, which
is determined by liveness, favoring short-lived temps.
There are many shaders that end up doing a whole lot of uniform loads
first, and using them later, which is very inconvenient for our register
allocation process because this increases uniform liveness and causes
us to use accumulators less efficientely, leading to significant churn.
To fix this, we move uniforms right before their first use in the same
block, but we need to do this after NIR scheduling, which means we are
doing it in non-SSA form, since the scheduler has a tendency to undo
this optimization and it is not easy to modify it to avoid it, since it
works in more abstract terms, using instruction dependencies, estimated
register pressure and instruction delay information to do its work,
which are very different concepts.
total instructions in shared programs: 13316738 -> 13033613 (-2.13%)
instructions in affected programs: 10389172 -> 10106047 (-2.73%)
helped: 55442
HURT: 16144
total threads in shared programs: 413722 -> 415048 (0.32%)
threads in affected programs: 1428 -> 2754 (92.86%)
helped: 680
HURT: 17
total loops in shared programs: 1716 -> 1690 (-1.52%)
loops in affected programs: 26 -> 0
helped: 26
HURT: 0
total uniforms in shared programs: 3704313 -> 3705181 (0.02%)
uniforms in affected programs: 687730 -> 688598 (0.13%)
helped: 2920
HURT: 7384
total max-temps in shared programs: 2364785 -> 2175190 (-8.02%)
max-temps in affected programs: 1215387 -> 1025792 (-15.60%)
helped: 49667
HURT: 1556
total spills in shared programs: 4241 -> 4248 (0.17%)
spills in affected programs: 642 -> 649 (1.09%)
helped: 11
HURT: 19
total fills in shared programs: 6115 -> 6125 (0.16%)
fills in affected programs: 1276 -> 1286 (0.78%)
helped: 11
HURT: 21
total sfu-stalls in shared programs: 34381 -> 36578 (6.39%)
sfu-stalls in affected programs: 16055 -> 18252 (13.68%)
helped: 3647
HURT: 5206
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15056>
Add a global-level variable that allows disabling all jobs that would
have gone to the Collabora lab, to be used in case of outages.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15150>
Many tests had been fixed but weren't being run due to test reshuffles
from uprevs. Add some explanations for what remains.
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15133>
On Bifrost and Valhall, push uniforms are loaded into Fast Access Uniform
Random Access Memory (FAU-RAM). FAU-RAM is organized as an array of 64-bit
slots. A given tuple (Bifrost) or instruction (Valhall) may access at most a
single 64-bit slot. If an instruction requires uniforms from multiple 64-bit
slots, a uniform-to-register move must be inserted to avoid the hazard. However,
if an instruction requires a pair of 32-bit uniforms from the same 64-bit slot,
no move is required.
To reduce the number of moves we emit, this commit adds an optimization pass
that reorders pushed uniforms, trying to group uniforms used by the same
instruction. The pass works by creating a graph of pushed uniforms, where edges
denote the "both 32-bit uniforms required by the same instruction" relationship.
We perform depth-first search on this graph to find the connected components,
where each connected component is a cluster of uniforms that are used together.
We then select pairs of uniforms from each connected component. The remaining
unpaired uniforms (from components of odd sizes) are paired together
arbitrarily.
In principle, we should weight the graph by number of occurences and choose
pairs that maximize the total selected edge weight. This is left for
future work, as it is nontrivial -- selecting these edges optimally appears to
be NP-hard at first blush.
Implementation note: As position and varying shaders share FAU on Bifrost, extra
care is taken with a `push_offset` shader stage info parameter that ensures
varying shaders do not reorder uniforms selected by the previous position
shader.
total instructions in shared programs: 2503343 -> 2451758 (-2.06%)
instructions in affected programs: 1553309 -> 1501724 (-3.32%)
helped: 14256
HURT: 8
helped stats (abs) min: 1.0 max: 80.0 x̄: 3.62 x̃: 3
helped stats (rel) min: 0.06% max: 36.36% x̄: 7.31% x̃: 6.67%
HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.38 x̃: 1
HURT stats (rel) min: 1.30% max: 12.50% x̄: 4.99% x̃: 3.85%
95% mean confidence interval for instructions value: -3.66 -3.58
95% mean confidence interval for instructions %-change: -7.41% -7.20%
Instructions are helped.
total tuples in shared programs: 2008399 -> 1969627 (-1.93%)
tuples in affected programs: 1146344 -> 1107572 (-3.38%)
helped: 12867
HURT: 147
helped stats (abs) min: 1.0 max: 61.0 x̄: 3.03 x̃: 2
helped stats (rel) min: 0.17% max: 42.86% x̄: 6.79% x̃: 4.65%
HURT stats (abs) min: 1.0 max: 3.0 x̄: 1.20 x̃: 1
HURT stats (rel) min: 0.29% max: 20.00% x̄: 2.12% x̃: 1.19%
95% mean confidence interval for tuples value: -3.03 -2.93
95% mean confidence interval for tuples %-change: -6.82% -6.57%
Tuples are helped.
total clauses in shared programs: 408005 -> 401708 (-1.54%)
clauses in affected programs: 90760 -> 84463 (-6.94%)
helped: 6006
HURT: 164
helped stats (abs) min: 1.0 max: 9.0 x̄: 1.08 x̃: 1
helped stats (rel) min: 0.45% max: 33.33% x̄: 12.44% x̃: 14.29%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 1.64% max: 25.00% x̄: 9.81% x̃: 5.26%
95% mean confidence interval for clauses value: -1.03 -1.01
95% mean confidence interval for clauses %-change: -12.03% -11.66%
Clauses are helped.
total cycles in shared programs: 203308.37 -> 202737.83 (-0.28%)
cycles in affected programs: 19264.71 -> 18694.17 (-2.96%)
helped: 3024
HURT: 41
helped stats (abs) min: 0.041665999999999315 max: 2.5416680000000014 x̄: 0.19 x̃: 0
helped stats (rel) min: 0.17% max: 33.33% x̄: 3.83% x̃: 2.83%
HURT stats (abs) min: 0.041665999999999315 max: 0.125 x̄: 0.06 x̃: 0
HURT stats (rel) min: 0.30% max: 5.88% x̄: 1.41% x̃: 0.93%
95% mean confidence interval for cycles value: -0.19 -0.18
95% mean confidence interval for cycles %-change: -3.89% -3.64%
Cycles are helped.
total arith in shared programs: 76265.67 -> 74669.25 (-2.09%)
arith in affected programs: 45001.50 -> 43405.08 (-3.55%)
helped: 12945
HURT: 97
helped stats (abs) min: 0.041665999999999315 max: 2.5416680000000014 x̄: 0.12 x̃: 0
helped stats (rel) min: 0.17% max: 50.00% x̄: 8.06% x̃: 4.88%
HURT stats (abs) min: 0.041665999999999315 max: 0.125 x̄: 0.05 x̃: 0
HURT stats (rel) min: 0.21% max: 33.33% x̄: 2.16% x̃: 0.96%
95% mean confidence interval for arith value: -0.12 -0.12
95% mean confidence interval for arith %-change: -8.16% -7.81%
Arith are helped.
total quadwords in shared programs: 1796563 -> 1766803 (-1.66%)
quadwords in affected programs: 948830 -> 919070 (-3.14%)
helped: 12078
HURT: 219
helped stats (abs) min: 1.0 max: 42.0 x̄: 2.49 x̃: 2
helped stats (rel) min: 0.10% max: 33.33% x̄: 5.57% x̃: 5.26%
HURT stats (abs) min: 1.0 max: 4.0 x̄: 1.21 x̃: 1
HURT stats (rel) min: 0.33% max: 6.67% x̄: 2.00% x̃: 1.14%
95% mean confidence interval for quadwords value: -2.46 -2.38
95% mean confidence interval for quadwords %-change: -5.52% -5.36%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14163>
Gives us memory back faster which is useful for pathalogical CTS
tests.
The GLSL IR was previously used after converting to NIR for things
like building the GL resource list but we have had a NIR version
for this for some time and I don't believe there are any other
use cases left for keeping the old IR hanging around this long.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15127>
These runners are configured to have a single job take up the whole
runner, which means we get to use threads to our hearts content. The pile
of cores means we don't need to spawn separate jobs to try to load-balance
across fdo's shared runner capacity. Having dedicated runners means we
won't get our MRs blocked as much waiting on non-Mesa testing happening on
fd.o.
We manage to complete all of this llvmpipe testing in about 6:15.
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14962>