Unconditionally lowering prevents GL drivers from natively
implementing these ops. Drivers that need lowering should set
lower_uadd_carry and lower_usub_borrow on nir_shader_compiler_options to
get the nir lowerings.
Tested with dEQP-GLES31.functional.shaders.builtin_functions.integer.*
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19704>
Use "util/detect_os.h" instead of "pipe/p_config.h" and "pipe/p_compiler.h"
in src/util/os_mman.h
This is a prepare to implement os_mman on windows
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19645>
While Panfrost allocates linear images with strides that are a multiple of 64
bytes, other dma-buf producers on the system may not satisfy this requirement.
However, at least on v7 and newer, any image with a regular format must have a
stride that is a multiple of 64 bytes.
This fixes a real bug in an application that created a linear R8_UNORM image
with stride 480 bytes, imported it as an EGL_image, and then tried to texture
from it with the GPU. Previously, the driver allowed this situation but it
resulted in an imprecise fault from the GPU. This patch corrects the driver to
reject the import as invalid due to the unaligned stride, ensuring we never
attempt to texture from such a resource.
To implement, we add some new layout queries to centralize knowledge about the
stride alignment requirements, and we sprinkle in asserts to show how the
invariant is upheld throughout the lifecycle of image creation to texturing.
Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19620>
For 2D UI workloads and even most 3D workloads, the indirect dispatch shader
won't actually be needed, but we currently compile it during eglInitialize() on
every v7 application. That hurts app start-up time, especially given that this
shader doesn't hit the disk cache. We can instead defer compiling this shader
until it's actually needed, when glDispatchComputeIndirect() gets called.
The tradeoff is that the first glDispatchComputeIndirect() call will be (much)
slower than successive calls, since we need to build and compile this internal
shader. I'm unconvinced that's a problem in practice.
An app would need to call glDispatchComputeIndirect for the first time in the
middle of a scene. 2D apps never would call that, OpenCL doesn't have that, and
GL compute will have the same costs just moved around. So it's down to a 3D
GLES3.1 app that indirectly dispatches compute for the first time time in the
middle of a scene. Which, meh? It's not entirely implausible but we have bigger
fish to fry, and this fixes a real problem (about 5% of eglInitialize time spent
building this shader that won't actually get used).
es2_info starts slightly faster with this change.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19622>
Without additional signalling of modifiers, CRCs cannot possibly in a correct
way work across process boundaries. Since we don't do that signalling, we should
not be allocating private CRCs for imported resources, and we should not be
using our own private CRCs for internal resources.
The entire out-of-bands CRC infrastructure is a hack to let us do CRCs even for
imported/exported BOs, but that can't possibly work. Remove it, and remove a
pile of special cases across the driver.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19576>
We don't need to login anymore, but we can't use plain minio commands
now. `ci-fairy` got a helper as `s3cp` to keep an almost identical
API.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19076>
If nir->info.shared_size = 0 but grid->variable_shared_mem > 0, the shader uses
shared memory but the compiler may not realize that. We need to disable
workgroup merging even in this case. The alternate approach is to statically
check for shared intrinsics in the compiler, but this is a bit easier all things
considered.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18581>
We have no vertex shader key, and unless legacy GL features are used, the
fragment shader key is known ahead-of-time. That means we can precompile shaders
at CSO create time, hopefully avoiding some draw-time jank.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
NIR deemphasizes nir_variable. We want to transition off it. Instead of walking
the list of variables and playing games with the GLSL types to collect varying
information, walk the list of instructions and use the I/O semantics to collect
similar information.
In addition to avoiding the reliance on nir_variable, this fixes handling of
struct varyings under certain circumstances. Such programs are compiled by the
GLES3.1 CTS but not used, so without this fix, the affected tests would regress
when precompiling.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
Move the pass from the Bifrost compiler to the Midgard/Bifrost common code
directory, and take advantage of it on Midgard, where it fixes the same
tests as it fixed originally on Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
Bifrost onwards handle this in hardware, and the Midgard lowering isn't
too terrible. Enable the format, otherwise desktop GL apps such as
Hacknet try to render to the format and get an incomplete framebuffer.
Cc stable because apparently we've been advertising this format
unintentionally as a result of some other interaction? Unclear how
Hacknet is hitting this, maybe it's an app bug. Shrug, it's not a big
deal regardless.
Additionally, we need to restrict texturing from 32-bit normalized due
to a restriction added with the v7 pixel format fiasco. That means
restricting rendering to 32-bit normalized on v7 onwards.
Closes: #7251
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Tested-by: Dang Huynh <danct12@disroot.org>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19358>
Trace-based testing has not worked for Panfrost. It was a neat
experiment, and I'm glad we tried it, but the results have been mostly
negative for the driver. Disable the trace-based tests.
For testing that specific API features work correctly, we run the
conformance tests (dEQP), which are thorough for OpenGL ES. For big GL
features, we run Piglit, and if there are big GL features that we are
not testing adequately, we should extend Piglit for these. For
fine-grained driver correctness, we are already covered.
Where trace-based testing can fit in is as a smoke test, ensuring that
the overall rendering of complex scenes does not regress. In principle,
that's a lovely idea, but the current implementation has not worked out
for Panfrost thus far. The crux of the issue is that the trace based
tests are based on checksums, not fuzzy-compared reference images. That
requires updating checksums any time rendering changes. However, a
rendering change to a trace is NOT a regression. The behaviour of OpenGL
is specified very loosely. For a given trace, there are many different
valid checksums. That means that correct changes to core code frequently
fail CI after running through the rest of CI, only because a checksum
changed in a still correct way. That's a pain to deal with, exacerbated
by rebase pains, and provides negative value to the project. Some recent
examples of this I've hit in the past two weeks alone:
panfrost: Enable rendering to 16-bit and 32-bit
4b49241f7d ("panfrost: Use proper formats for pntc varying")
ac2964dfbd ("nir: Be smarter fusing ffma")
The last example were virgl traces, but were especially bad: due to a
rebase fail, I had to update traces /twice/, wasting two full runs of
pre-merge CI across *all* hardware. This was extremely wasteful.
The value of trace-based testing is as a smoke test to check that traces
still render correctly. That is useful, but it turns out that checksums
are the wrong way to go about it. A better implementation would be
storing only a single reference image from a software rasterizer per
trace. No driver-specific references would be stored. That reference
image must never change, provided the trace never changes. CI would then
check rendered results against that image with tolerant fuzzy
comparisons. That tolerance matches with the fuzzy comparison that the
human eye would do when investigating a checksum change anyway. Yes, the
image comparison JavaScript will now report that
0 pixels changed within the tolerance, but there's nothing a human eye
can do with that information other than an error prone copypaste of new
checksums back in the yaml file and kicking it back to CI, itself a
waste of time.
Finally, in the time we've had trace-based testing alongside the
conformance tests, I cannot remember a single actual regression in one
of my commits the trace jobs have identified that the conformance tests
have not also identified. By contrast, the conformance test coverage has
prevented the merge of a number of actual regressions, with very few
flakes or xfail changes, and I am grateful we have that coverage. That
means the value added from the trace jobs is close to zero, while the
above checksum issues means that the cost is tremendous, even ignoring
the physical cost of the extra CI jobs.
If you work on trace-based testing and would like to understand how it
could adapted to be useful for Panfrost, see my recommendations above.
If you work on CI in general and would like to improve Panfrost's CI
coverage, what we need right now is not trace-based testing, it's
GLES3.1 conformance runs on MediaTek MT8192 or MT8195. That hardware is
already in the Collabora LAVA lab, but it's not being used for Mesa CI
as the required kernel patches haven't made their way to mainline yet
and nobody has cherry-picked them to the gfx-ci kernel. If you are a
Collaboran and interested in improving Panfrost CI, please ping
AngeloGioacchino for information on which specific patches need to be
backported or cherry-picked to our gfx-ci kernel. Thank you.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19358>
Now we're back to a single XFB implementation for all gens. Fixes:
KHR-GLES31.core.draw_indirect.advanced-twoPasses-transformFeedback-arrays
KHR-GLES31.core.draw_indirect.advanced-twoPasses-transformFeedback-elements
Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19238>
The formats of special attributes are supposed to match their architectural
definitions, and point coordinates are architecturally defined as RGBA32F. In
practice this doesn't seem to fix anything.
Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19237>
gl_PointCoord is implemented via a special attribute descriptor on Midgard. This
descriptor has an orientation bit, the orientation is driver-controlled. That
means we can map rast->sprite_coord_mode to this bit, rather than lowering in
the shader.
This is a bug fix for point sprites, which are implemented natively on Midgard
for dubious reasons and need to be flipped this way. It is also an optimization
for apps reading gl_PointCoord, removing the extra arithmetic to flip, although
the value of this is somewhat dubious.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19237>
The job fits into 15 minutes of runtime, so deparalelize.
Stress-tested.
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19211>
The loop over sources has to happen for every instruction, regardless of whether
we also need to register allocate the destination. The other source loops handle
this properly, but this one was missed.
Fixes spilling failure in shaders/android/angle/aztec_ruins/16.shader_test when
the input NIR is shuffled a bit (from reordering passes).
Fixes: 129d390bd8 ("pan/mdg: Fix bound setting in RA for sources")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19093>
Now the state tracker's responsible to lower away for us (and the state tracker
can do it correctly, our implementation is incorrect with a strict reading of
the Gallium contract).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18658>
This function is only used if PAN_ARCH >= 5
Fixes a clang warning about unused static inlined functions.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18800>
This duplicates the lowering from nir_lower_packing. However, nir_lower_packing
also lowers a pile of other instructions that we do implement natively, and this
is easier than adding a bunch of knobs to nir_lower_packing to get just what we
need.
Fixes test-printf address_space_4.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18656>