ENOSYS is commented as "Invalid system call number". This is returned
by qemu-user for unbridged ioctls.
Fixes: 30b57f10b3 ("vulkan/wsi: Signal semaphores and fences from the dma-buf")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17325>
It was previously done dzn_cmd_buffer_flush_transition_barriers(),
leaving the queue+flush case unhandled. Let's fix that by moving
this piece of code to dzn_cmd_buffer_exec_transition_barriers().
Fixes: 35356b1173 ("dzn: Cache and pack transition barriers")
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17314>
While we've taken advantage of split-sends in select situations, there
are many other cases (such as sampler messages, framebuffer writes, and
URB writes) that have never received that treatment, and continued to
use monolithic send payloads.
This commit introduces a new optimization pass which detects SEND
messages with a single payload, finds an adjacent LOAD_PAYLOAD that
produces that payload, splits it two, and updates the SEND to use both
of the new smaller payloads.
In places where we manually used split SENDS, we rely on underlying
knowledge of the message to determine a natural split point. For
example, header and data, or address and value.
In this pass, we instead infer a natural split point by looking at the
source registers. Often times, consecutive LOAD_PAYLOAD sources may
already be grouped together in a contiguous block, such as a texture
coordinate. Then, there is another bit of data, such as a LOD, that
may come from elsewhere. We look for the point where the source list
switches VGRFs, and split it there. (If there is a message header, we
choose to split there, as it will naturally come from elsewhere.)
This not only reduces the payload sizes, alleviating register pressure,
but it means that we may be able to eliminate some payload construction
altogether, if we have a contiguous block already and some extra data
being tacked on to one side or the other.
shader-db results for Icelake are:
total instructions in shared programs: 19602513 -> 19369255 (-1.19%)
instructions in affected programs: 6085404 -> 5852146 (-3.83%)
helped: 23650 / HURT: 15
helped stats (abs) min: 1 max: 1344 x̄: 9.87 x̃: 3
helped stats (rel) min: 0.03% max: 35.71% x̄: 3.78% x̃: 2.15%
HURT stats (abs) min: 1 max: 44 x̄: 7.20 x̃: 2
HURT stats (rel) min: 1.04% max: 20.00% x̄: 4.13% x̃: 2.00%
95% mean confidence interval for instructions value: -10.16 -9.55
95% mean confidence interval for instructions %-change: -3.84% -3.72%
Instructions are helped.
total cycles in shared programs: 848180368 -> 842208063 (-0.70%)
cycles in affected programs: 599931746 -> 593959441 (-1.00%)
helped: 22114 / HURT: 13053
helped stats (abs) min: 1 max: 482486 x̄: 580.94 x̃: 22
helped stats (rel) min: <.01% max: 78.92% x̄: 4.76% x̃: 0.75%
HURT stats (abs) min: 1 max: 94022 x̄: 526.67 x̃: 22
HURT stats (rel) min: <.01% max: 188.99% x̄: 4.52% x̃: 0.61%
95% mean confidence interval for cycles value: -222.87 -116.79
95% mean confidence interval for cycles %-change: -1.44% -1.20%
Cycles are helped.
total spills in shared programs: 8387 -> 6569 (-21.68%)
spills in affected programs: 5110 -> 3292 (-35.58%)
helped: 359 / HURT: 3
total fills in shared programs: 11833 -> 8218 (-30.55%)
fills in affected programs: 8635 -> 5020 (-41.86%)
helped: 358 / HURT: 3
LOST: 1 SIMD16 shader, 659 SIMD32 shaders
GAINED: 65 SIMD16 shaders, 959 SIMD32 shaders
Total CPU time (seconds): 1505.48 -> 1474.08 (-2.09%)
Examining these results: the few shaders where spills/fills increased
were already spilling significantly, and were only slightly hurt. The
applications affected were also helped in countless other shaders, and
other shaders stopped spilling altogether or had 50% reductions. Many
SIMD16 shaders were gained, and overall we gain more SIMD32, though many
close to the register pressure line go back and forth.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17018>
SEND messages with EOT need to use g112-g127 for their sources so that
the hardware is able to launch new threads while old ones are finishing
without worrying about register overlap when pushing payloads. For the
newer split-send messages, this applies to both source registers.
Our special case for this in the register allocator was only considering
the first source. This wasn't a problem because we hadn't ever tried to
use split-sends with EOT before. However, my new optimization pass is
going to introduce some shortly, so we'll need to handle them properly.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17018>
This moves the radv specific code into radv, and calls back from
aco into radv.
This should allow easier radeonsi integration later.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16445>
Now that we've removed the thread_local lookup tables using
pointer-to-member C++ features, this can go back to being a standard
C file, like it was in the past. We just need to annotate a couple
of things with "struct".
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17309>
We had been using thread_local index -> opcode_desc tables to avoid
plumbing through a storage location throughout all the code. But now
we have done so with the new brw_isa_info structure. So we can just
store the tables there, and initialize it with the compiler.
This fixes crashes in gtk4-demo on iris, and should help with some
programs on zink as well. Something was going wrong with the
thread_local variables not being set up correctly. While we might be
able to work around that issue, there's really no advantage to storing
these lookup tables in TLS (beyond it being simpler to do originally).
So let's simply stop doing so.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6728
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6229
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17309>
This structure will contain the opcode mapping tables in the next
commit. For now, this is the mechanical change to plumb it into all
the necessary places, and it continues simply holding devinfo.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17309>
This patch creates a new header file, brw_isa_info.h, which will
contains all the functions related to opcode encoding on various
generations. Opcode numbers may have different meanings on different
hardware, so we remap them between an enum we can easily work with
and the hardware encoding.
We move the brw_inst setters and getters to brw_inst.h.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17309>
This header file didn't include normal guards against being included
multiple times. It also defined a function in a header file without
marking it static inline.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17309>
src/mesa/main includes are for Mesa's OpenGL implementation, and the
compiler is used in Vulkan drivers and other tools. We really only
needed one #define, which is that we offer 32 samplers. It probably
makes more sense to have our own defined limit for that rather than
importing a project-wide value which theoretically could be adjusted,
so swap MAX_SAMPLERS for a new BRW_MAX_SAMPLERS and call it a day.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17309>
Gallium drivers shouldn't be including src/mesa/main headers, but we're
picking up a rogue main/config.h via the compiler, so this code I ported
over from i965 kept compiling. Use the PIPE_* defines instead so that
we can stop including that.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17309>
Gallium drivers shouldn't be including src/mesa/main headers, but we're
picking up a rogue main/config.h via the compiler, so this code I ported
over from i965 kept compiling. Use the PIPE_* defines instead so that
we can stop including that.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17309>
VUID-VkViewport-minDepth-01234 specifies that depth must be in the range [0.0, 1.0],
so the viewport must always be clamped to this range
this affects texture clears using u_blitter, as this expects to be able
to use the GL range of [-1.0, 1.0], so pass the depth value as though it's
been de-converted back to a GL z coordinate to account for viewport transform
cc: mesa-stable
fixes#6757
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17319>
This matches up with the native gl drivers as well as the media stack.
Test: android.graphics.cts.MediaVulkanGpuTest
Test: android.media.cts.EncodeDecodeTest
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17306>
This avoids the risks of concatenating our rulesets missing out on some
file list because of a "never" in the middle.
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17287>
Not only runs less testing when only one driver is impacted, but also
makes sure zink+turnip is turned off when the farm is.
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17287>
This drops the mesa/gallium lists from some build rules, since zink common
rules brings them in already. If we do more driver common rules, we might
end up with those core lists appearing in the yaml multiple times, but
that seems like a small price to pay for not being able to forget some.
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17287>
Because !references merging happens after yaml parsing, this lets us
remove a duplicated definition between .test-source-dep.yml and
.gitlab-ci.yml.
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17287>
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17242>
nir_ine_imm(b, nir_iand_imm(b, x, mask), 0) and
nir_i2b(b, nir_iand_imm(b, x, mask)) are common
patterns which become quite messy when they are
part of a larger expression. Clang-format does
not improve things either and we can end up with
some rather interesting looking code.
(RADV ray tracing pipeline and query lowering)
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17242>
Change NGG GS emit vertex code to emit combined shared stores,
also change the export vertex code to emit combined shared loads.
This results in more optimal code generation, ie. fewer LDS
instructions are generated.
GS vertices are stored using an odd stride to minimize the chance
of bank conflicts, which means that unfortunately
we still can't use an alignment higher than 4 here,
so the best we can get are some ds_read2_b32 instructions.
Fossil DB stats on Navi 21 (formerly Sienna Cichlid):
Totals from 135 (0.10% of 128653) affected shaders:
VGPRs: 6416 -> 6512 (+1.50%)
CodeSize: 529436 -> 503792 (-4.84%)
MaxWaves: 2952 -> 2924 (-0.95%)
Instrs: 93384 -> 90176 (-3.44%)
Latency: 290283 -> 293611 (+1.15%); split: -0.36%, +1.50%
InvThroughput: 81218 -> 82598 (+1.70%)
Copies: 6603 -> 6606 (+0.05%)
PreVGPRs: 5037 -> 5076 (+0.77%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11425>