I would like to reuse pan_pool for persistent uploads (shaders and CSOs)
in Gallium. In theory u_upload_mgr is more appropriate, but pan_pool is
already a knockoff u_upload_mgr, so might as well finish the job.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10866>
LOCAL_STORAGE.zw is workgroup local memory, whereas LOCAL_STORAGE.xy is
thread local memory. Likewise PC_SP.zw is the stack pointer, which is
initialized to (LOCAL_STORAGE.zw + offset) but is modifiable by the
shader. Panfrost doesn't modify the s tack pointer, and the register
allocation logic assumes a zero offset, so let's always spill to thread
local memory = LOCAL_STORAGE.xy, as was intended by Italo's cleanup.
This is visible on any shader that spills. Compute shaders aren't
advertised yet, so WLS will be null, causing a fault like the following
(reproduced on Mali T860 with the glyphy trace):
[15634.148873] panfrost ff9a0000.gpu: Unhandled Page fault in AS0 at VA 0x0000000000000000
Reason: TODO
raw fault status: 0x70003C2
decoded fault status: SLAVE FAULT
exception type 0xC2: TRANSLATION_FAULT_LEVEL2
access type 0x3: WRITE
source id 0x700
[15634.658170] panfrost ff9a0000.gpu: gpu sched timeout, js=0, config=0x3300, status=0x8, head=0x31d4540,
tail=0x31d4540, sched_job=00000000e8101b2e
Fixes: 6a12ea02fe ("pan/mdg: properly encode/decode ldst instructions")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10866>
Add the Intel pps driver using functionalities provided by
libintel_perf.
v2: Fix build with perfetto not enabled.
v3: Open perf stream with no filtering.
v4: Drop usage of inc/dec_n_users.
v5: Isolate intel_perf in its own class.
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10216>
A bunch of performance counters rely on register snapshots on top of
the OA reports. Those are already conditional to the query mode in the
equations :
availability="true $QueryMode &&"
This change allows to disable counters that are only available with
additional register snapshots. This will be useful if you only want to
OA reports to extract performance counter values.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10216>
Allow libintel_perf to be included as a dependency from a C++ project by
wrapping some declaration within an extern "C" block, and then add a
function to allow direct reading of the OA stream.
v2: Don't expose internal helpers (Lionel)
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10216>
We support this extension on Zink even if we don't support ES 3.2,
because the compatibility extension doesn't require advanced blending.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10861>
By skipping slower tests such as `.*3_level_array.*`, we can add more
tests to CI without dramatically increase job duration.
Besides that, this patch also increases coverage by adding different
kinds of tests that were previously untested, failing or slowed CI too
much.
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10855>
When exec is constant, we can remember the constant as the old exec,
and just copy the condition and use it as the new exec. There is no
need to save the constant.
Due to using p_parallelcopy which is lowered to s_mov_b64 (or 32),
many exec restores now become copies, hence the increase in the copy
stats.
Fossil DB changes on Sienna Cichlid:
Totals from 73969 (49.37% of 149839) affected shaders:
SpillSGPRs: 1768 -> 1610 (-8.94%)
CodeSize: 99053892 -> 99047884 (-0.01%); split: -0.02%, +0.01%
Instrs: 19372852 -> 19370398 (-0.01%); split: -0.02%, +0.01%
VClause: 515154 -> 515142 (-0.00%); split: -0.00%, +0.00%
SClause: 719236 -> 718395 (-0.12%); split: -0.14%, +0.02%
Copies: 1109770 -> 1254634 (+13.05%); split: -0.07%, +13.12%
Branches: 374338 -> 374348 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1776481 -> 1653761 (-6.91%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10691>
Previously, we would store even the constant -1 exec mask from the
beginning of every merged shader. With this change it is no longer
necessary because we can restore to constant exec mask directly.
Hence, this frees up a register pair (single register for Wave32)
in every merged shader.
No Fossil DB changes.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10691>
This will enable us to store non-temporary values,
such as constant operands there.
No Fossil DB changes.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10691>
These don't really need the exec mask (and never have), but we haven't
needed to include them in needs_exec_mask yet.
No Fossil DB changes.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10691>
In Vulkan we configure our integer RTs to clamp automatically, so with logic
operations we need to be careful and avoid overflows by discarding any bits
that won't fit in the RT component size.
Fixes remaining CTS test failures in:
dEQP-VK.pipeline.logic_op.*
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10801>
This avoids debug builds to assert crash. Components that don't exist
won't be used and will be eventually DCEd, so simply lower them to 0.
Fixes CTS tests like these in debug builds:
dEQP-VK.pipeline.logic_op.r8_uint.clear
dEQP-VK.pipeline.logic_op.r8_uint.and
dEQP-VK.pipeline.logic_op.r8_uint.and_reverse
dEQP-VK.pipeline.logic_op.r8_uint.copy
dEQP-VK.pipeline.logic_op.r8_uint.and_inverted
dEQP-VK.pipeline.logic_op.r8_uint.no_op
dEQP-VK.pipeline.logic_op.r8_uint.xor
dEQP-VK.pipeline.logic_op.r8_uint.or
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10801>
There is a regression that made it impossible to export gem
handles with write access.
That is, a client may export gem handles of each buffer plane, then
export dmabuf fds using these handles, and mmap these dmabuf in
a different process (this is what Chromium does).
After https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4861,
it became impossible as mmap resulted in EACCESS error as slightly
different approach was taken for exporting these gem handles.
This CL fixes exporting gem handles (which are exported from dmabuf
fds) by adding the DRM_RDWR flag.
Cc: mesa-stable
Fixes#3119
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10850>
The start offset of the vertex samplers isn't zero, but the indexing of
the passed in views array is still zero based. Use the correct indexing
variable to fix vertex sampler setup.
Cc: <mesa-stable@lists.freedesktop.org>
Fixes: 81ab9fe2d0 ("etnaviv: handle NULL views in set_sampler_views")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10724>
There is a regression that made it impossible to export gem
handles with write access.
That is, a client may export gem handles of each buffer plane, then
export dmabuf fds using these handles, and mmap these dmabuf in
a different process (this is what Chromium does).
After https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4861,
it became impossible as mmap resulted in EACCESS error as slightly
different approach was taken for exporting these gem handles.
This CL fixes exporting gem handles (which are exported from dmabuf
fds) by adding the DRM_RDWR flag.
Cc: mesa-stable
Fixes#3119
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10851>
It's perfectly legal to declare multiple SSBOs that point to the same
binding/descriptor_set with different access mask. Currently, it will
always get the first one in the list that matches binding/desc_set
regardless of the access mask, but other variables might have different
access mask.
Fix this by being conservative if another variable uses the same
binding/desc_set because we can't get it reliably without adding
a new field to vulkan_resource_index.
This fixes rendering issues in Resident Evil Village with vkd3d-proton.
This bug has been uncovered by ("spirv: Don't remove variables used by
resource indexing intrinsics") because variables are no longer removed
No fossils-db changes.
Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10692>
Would allow earlier faults instead of having corrupted cmdstream.
This was already done to Freedreno long ago in:
04aff7e4 "freedreno: make cmdstream bo's read-only to GPU"
Since private memory should be GPU writable it is now allocated
separately, instead of suballocation from now read-only cmdstream.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10807>
GPU won't be able to write to such BOs, which would to useful for
cmdstream BOs.
Move "bool dump" to the new flags along the way.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10807>
Mostly arbitrary. Use the Gallium limits which are all less than the
hardware limits, and static_assert that this is the case to future
proof.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10845>
Should improve memory usage at the expense of more agressive batch
submission when the free batch pool is empty. The limit has been chosen
such that a batches bitmap fits nicely in a 32-bit integer.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10842>
Flush batches when a new batch accessing the same set of BOs comes in
instead of delaying that operation. This greatly simplifies the
dependency tracking logic and shouldn't hurt the perfs (it might even
improve the latency since jobs are now flushed earlier).
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10842>