If two instructions in a single bundle both write to a spilt
destination, then we need to reuse the fill and spill instructions,
otherwise the value will be overwritten.
This and the rest of this set of Midgard bug fixes were found from a
vertex shader in Firefox WebRender that is used when a video is
clipped, for example by setting the border-radius CSS property.
CC: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16382>
Check the bytemask against 0xFFFF rather than 0xF so that the fill is
skipped for a .xyzw write rather than a .x write.
Set the mask on the store to 0xF when doing a read so that all
components are written back.
Fixes: 31d26ebf1b ("pan/mdg: Fill from TLS before spilling non-SSA nodes")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16382>
If a value is written in a vector CSEL but then written again by other
instructions, it still needs full alignment, so set min_alignment
using MAX2 to avoid ever reducing it.
Fixes: 1798f6bfc3 ("pan/midgard: Fix masks/alignment for 64-bit loads")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16382>
To make the disassembly easier to read, add whitespace after disassembled
branches. This makes the basic blocks of the original control flow graph more
obvious, to aid comparison with the IR.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16409>
Fixes
dEQP-VK.glsl.indexing.tmp_array.vec2_static_loop_write_static_loop_read_vertex
which otherwise fails due to nir_opt_sink being "clever" around unused
loop exit blocks.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16155>
The depth equations weren't quite right, with spec citations to prove it. This
didn't fix the test I was debugging, but it surely fixed /something/.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16155>
Moves the needle from Crash to Fail on:
dEQP-VK.synchronization.op.single_queue.fence.write_clear_color_image_read_image_compute.image_64x64x8_r32_sfloat
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16155>
Required to avoid tiler heap out-of-memory condition on Valhall on tests
including:
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_1200x1200_drawcount_8
This test passes on Bifrost without the fix because varyings are only allocated
from the tiler heap on Valhall.
Minimal perf or memory usage impacted is expected, as even old versions of
panfrost.ko support growable memory.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16330>
For some days, the CI was bypassing LAVA and bare-metal jobs due to an
issue in the init-stage2.sh script. After the fix some tests
crashed/failed. This commit updates the expectations for them.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16325>
For some days, the CI was bypassing LAVA and bare-metal jobs due to an
issue in the init-stage2.sh script. After the fix the neverball trace on
panfrost-t860 is producing a different image, due to a bugfix in
Mesa itself driver. This commit updates the neverball trace on that
device.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16325>
Valhall introduces hardware-allocated varyings. Instead of allocating varying
descriptors on the CPU with a slot based interface, the driver just tells the
hardware how many bytes to allocate per vertex and loads/stores with byte
offsets. This is much nicer!
However, this requires us to rework our linking code to account for separable
shaders. With separable shaders, we can't rely on driver_location matching
between stages, and unlike on Midgard, we can't resolve the differences with
curated command stream descriptors. However, we *can* rely on slots matching. So
we should "just" determine the byte offsets based on the slot, and then
separable shaders work.
For GLES, it really is that easy.
For desktop GL, it's not -- desktop GL brings unpredictable extra varyings like
COL1 and TEX2. Allocating space for all of these unconditionally would hamper
performance. To cope, we key fragment shaders to the set of non-GLES varyings
written by the linked vertex shader. Then we may define an efficient ABI, where
only apps only pay for what they use.
Fixes various tests in dEQP-GLES31.functional.separate_shader.random.* on
Valhall.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16310>
Spilling is tricky and doesn't get much testing in CI. Run
a subset of dEQP-GLES2.functional.shaders.* with spilling forced to get spilling
tested in CI.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16314>
Packed TLS has cache-locality benefits on Valhall, compared to Bifrost's flat
TLS. Valhall does support flat TLS, but requires extra arithmetic in the shader
for correct results. At least until we get to generic pointers (and maybe even
then), we can use packed TLS. So just use packed TLS always for proper spilling.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16314>
The set of blend shaders is closed. They are completely internal. As such, we
know that the registers we reserve for them suffice, and we don't permit
register spilling. Refusing to support spilling in blend shaders simplifies a
number of parts of the compiler. Add a check that we don't try to spill anyway,
which will silently fail.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16314>
BIFROST_MESA_DEBUG=spill now restricts the register file to 1/4 its usual size,
useful for testing register spilling (e.g. running CTS) as well as debugging
spilling on small shaders.
Note blend shaders are exempt, as we don't allow blend shaders to spill.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16314>
This is deterministic, unlike a set. Note we need the extra dereferencing to
keep the macro safe, simple, and standards compliant:
1. Nesting two for-loops would cause break/continue to fail.
2. Declaring variables outside the loop would pollute the namespace.
3. Declaring an anonymous struct is not conformant and doesn't compile in clang.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16279>
These have reasonable interpretations now, and the three row strides have been
deduplicated. So add stride expectations to our ASTC unit tests.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16201>
Row stride is defined in terms of header blocks for AFBC. Usually,
afbc.row_stride is used for AFBC images and row_stride for non-AFBC images;
however, the nonsense non-AFBC stride leaked into the UABI. So handle that in
the legacy conversion path and use a unified row stride (equal to
afbc.row_stride for AFBC images and row_stride otherwise).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16201>
Line strides don't make sense for linear images, so use row strides instead in
the API. Then update the layout code accordingly.
Note: we need to preserve the old UABI (bug for bug compatibility), so we still
use legacy strides externally. But now we use row strides internally, which is
better than using both everywhere.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16201>
Unfortunately, the botched nonlinear "line strides" have become ingrained in the
UABI. We need to be work with them. Add safe helpers to convert to/from the
legacy strides.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16201>
...Rather than line_stride. For linear images, these are equivalent. For
nonlinear images, rowPitch is implementation-defined. So this isn't strictly a
bug fix, but it gets rid of the nonsense nonlinear line_stride.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16201>
PanVK switched to the common Panfrost layout code, but these duplicate structs
stuck around. Garbage collect them to prevent confusion.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16201>
The range helper is taken from ANV; the gpu_ptr one is original. This
also fixes a few more bugs where we weren't adding offsets in properly.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16216>
Rather than a single stack for all threads to share -- that can't work! :-) We
use the same helper that the GLES driver does. Fixes anything using scratch or
spilling, including:
dEQP-VK.glsl.indexing.varying_array.vec3_static_write_dynamic_read
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16283>
TODO: Make the secondary opcode field wider so that FATAN_ASSIST can
be split into two instructions
[Alyssa: Fixes to the hardware behaviour.]
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15588>
On Bifrost, this is very easy: there's an RSD bit to Y-flip gl_PointCoord. It
should map perfectly to the Gallium bit. With this change, we no longer use
lower_pntc_ytransform on Bifrost, saving a bit of ALU when reading point
coordinates.
On Valhall, this is quite hard: the bit is in the framebuffer descriptor now!
That means it can't be changed in a batch. This is expected to be ok: on GLES
and VK, the origin is controlled only by the framebuffer orientation. It's a
bigger problem on big GL, where GL_POINT_SPRITE_COORD_ORIGIN can be set freely.
To cope, a tri-state data structure is used for the state tracking. This has a
failure case on Valhall: every draw toggling the coord origin. However, the
intention of the ORIGIN state bit is smoothing over coordinate system
differences; it should never /actually/ change once set. Until we see an app
doing something so stupid, I don't think we should worry about.
We need all the Valhall tri-state infrastructure for handling provoking vertices
on big GL anyway.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16173>
Since we don't export the relevant CAP, the state tracker calls
nir_lower_clip_vs for us. However, for some reason we're still responsible for
calling nir_lower_clip_fs. Now that we have sane shader key infrastructure,
let's do so.
Fixes the floor rendering wrong in the title screen of Neverball.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16173>
In Vulkan, it's possible to create a pipeline with no fragment shader that's
still expected to rasterize. This is useful for depth/stencil side effects, and
is closely related to the "fragment shader required" optimization we do in the
GLES driver. Refactor the RSD emit code to handle this case.
Fixes dEQP-VK.pipeline.stencil.nocolor.*
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16204>
The "fragment shader required?" computed state is about fragment shader side
effects. There may be no fragment shader required but depth/stencil side effects
meaning that rasterization is nonoptional. What actually gates rasterization is
the rasterizer discard bit. Use that instead.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16204>
Otherwise wide lines break. The alternative approach is to eliminate the points
writes when not drawing points since we do have topology information at compile
time. I'm admittedly stuck in my GL mindset. That's the approach we'll need for
Valhall anyway.
Fixes dEQP-VK.rasterization.interpolation.basic.lines_wide
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16204>
The VAR_TEX definition in ISA.xml only has a field for texture_index,
so trying to read sampler_index will return zero; read from
texture_index instead, and rename other fields for consistency.
The texture and sampler indices must be equal for VAR_TEX to be used,
so either name could be used for the field.
Fixes the wrong textures being used in Thief.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6219
Fixes: eb1479bda2 ("pan/bi: Support message preloading")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16255>
This controls the whole lowering of "make tex ops with implicit
derivatives on non-implicit-derivative stages be tex ops with an explicit
lod of 0 instead", but it's really hard to describe that in a git commit
summary.
All existing callers get it added except:
- nir_to_tgsi which didn't want it.
- nouveau, which didn't want it (fixes regressions in shadowcube and
shadow2darray with NIR, since the shading languages don't expose txl of
those sampler types and thus it's not supported in HW)
- optional lowering passes in mesa/st (lower_rect, YUV lowering, etc)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16156>
We use nir_assign_io_var_locations() which compacts the varyings and
eliminates any unused input slots. We need to do the same thing when
processing pVertexAttributeDescriptions[] or else we'll end up with
mismatches between the shader and the state setup code.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16183>
The GLSL lowering of half float packing involves software conversion
to half-float; instead, use the lowering in NIR.
Both Midgard and Bifrost are already set to lower the instructions to
bit operations, but change mdg_should_scalarize so that the lowerable
split variants of the pack/unpack instructions are generated.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16175>
Would have caught a significant issue with ETC2 handling. Luckily Midgard dEQP
failed on this, even though Bifrost didn't (due to explicit strides?)
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
Rather than using it as a catch-all initialize, use it to fill in derived from
fields from a partially initialized image_layout. This is easier to understand
and, more importantly, easier to unit test.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
We can always align the width/height, now that block_size is defined (as 1x1)
for linear textures. We can also remove the useless effective_depth assignment.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
Handle linear, interleaved, and AFBC formats. This requires taking a format, as
block compressed u-interleaved textures have a different tile size than other
u-interleaved textures.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
This gets rid of the weird "call block_dim twice with a mystery argument"
pattern, and will allow us to further unify code.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
Rather than open-code the > 16 check in multiple places and have to justify it
in each. This is easier to understand at the call sites.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
This requires tons of driver changes we're not ready for. In the mean time, this
will just get in the way of refactoring AFBC support.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
Midgard has multiple Surface Descriptor formats selectable in the texture
descriptor. Previously, we have used both the "64-bit surface descriptor" and
the "64-bit surface descriptor with 32-bit line stride and 32-bit layer stride".
A delicate routine tried to guess what stride the hardware will use if we don't
specify it explicitly, and omit the stride if it matches. Unfortunately, that
routine is broken in at least two ways:
* Textures with ASTC must always specify an explicit stride. Failing to do so
(like we were doing) is invalid.
* It applies even for interleaved textures. The comment above the function
saying otherwise is incorrect. (TODO: double check this)
Bifrost onwards always specify the strides explicitly. Let's just do that and
unify the gens. What is lost from doing this? A ludicrously trivial amount of
memory and texture descriptor cache space. 8 bytes per layer*level per texture,
in fact. Compared to the size of the textures being addressed, the memory usage
is trivial. The texture descriptor cache size maybe matters more. But given
Arm's hardware people went this direction for Bifrost and stuck to it, I doubt
it matters much.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
Before we used GenXML, pan_texture mixed layout code with texture descriptor
packing code. For the most part, the layout code is generation-independent; the
pack code is not. We introduced an anti-pattern where the file was compiled N+1
times: N times for each PAN_ARCH value, and an extra time with no PAN_ARCH
value. And then the contents of the file changed completely depending on
PAN_ARCH. This is a pretty weird construction.
Let's instead split off the layout file from the descriptor file, compile the
layout file once, and compile the descriptor file per-gen.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
Now that everything is appropriately refactored, we can support Valhall's data
structures in the blitter. Things look similar to Bifrost, but the RSD no longer
exists.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16035>
Valhall's data structures are organized differently. In particular, they don't
use RSDs. So we need to reshuffle the blitter's data structures so we can map to
Valhall.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16035>
On Valhall, the fragment shader differs based on whether IDVS or the legacy
geometry flow is used be. In particular, varyings are accessed differently.
We use the legacy geometry flow for blitting on all GPUs, so indicate this in
the shader inputs.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16035>
Required to query texture features on Valhall. It's technically the same as
previous Malis (except for narrow ASTC), but conceptually it's different as
plane descriptors have superseded indexed pixel formats for block compressed
textures.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16035>
This helper has built-in support to be quieted, which seems like a good
idea to do on ci.
We're already setting the env var in the CI environment, so no need to
do that here.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16033>
This maps nicely to Mali's weirdo MKVEC, so implement it rather than
scalarizing. The scalarization wants an extract implemented which we don't have.
Fixes dEQP-VK.glsl.builtin.function.pack_unpack.*
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16120>
Change ASSERT_EQ to EXPECT_EQ to avoid aborting before freeing memory.
Fix defects reported by Coverity Scan.
Resource leak (RESOURCE_LEAK)
leaked_storage: Variable tiled going out of scope leaks the storage it points to.
leaked_storage: Variable linear going out of scope leaks the storage it points to.
leaked_storage: Variable ref going out of scope leaks the storage it points to.
Fixes: bb6c14a697 ("panfrost: Unit test u-interleaved tiling routines")
Suggested-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16127>
Copy the code. Fixes workgroup tests, now compute kernels should work properly
on Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16123>
Implement as f2f32(f2f16(x)) with the conversions in flush-to-zero mode.
Accessing flush-to-zero mode on Bifrost is nontrivial: it is specified
per-clause, rather than per-instruction. I've opted to pipe support for ftz
clauses through the scheduler. This solution has two nice properties:
* It uses the native hardware for flushing subnormals, avoiding extra lowering.
* It's "smart" about scheduling around FTZ requirements, meaning we get good
code generated even for a shader that e.g. quantizes a vector.
With an unrelated scheduler fix, the *V2F32_TO_V2F16/+F16_TO_F32 operation fits
in a single tuple, minimizing the overhead of the special FTZ clause.
We'll have to do something a bit different for Valhall (FLUSH.f32), but we'll
worry about when we actually have PanVK brought up on Valhall.
Fixes dEQP-VK.spirv_assembly.instruction.compute.opquantize.*
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16123>
Fix invalid usage of meson objects which violates official meson
specification and thus breaks muon, an implementation of meson
written in C.
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15715>
Originally, PAN_GPU_ID was checked in the driver itself. I added the mechanism
to run Bifrost shader-db on my Midgard laptop. There was no drm-shim support at
this point, and this was a reasonable stop gap at the time.
Nowadays, we have a competent drm-shim implementation, which wholly replaces
this use case. So PAN_GPU_ID is only useful for drm-shim. Let's pull the code
into drm-shim and get it out of the driver. This allows NDEBUG drm-shim builds
to work properly.
While we're at it, the default emulated GPU is changed from Mali-T860 to
Mali-G52. This reflects our shifting development priorities.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-by: Icecream95 <ixn@disroot.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15930>
The optimized routine documented the tiling format together with the software
algorithm. The reference implementation wants the tiling format alone
documented. Let's break out the high level documentation into somewhere
centrally accessible, and refocus the comments in the optimized file on the
optimization.
This documentation is linked bidirectionally with both implementations, so it
should be easy to find.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15803>
These are complex and not used in all dEQP paths. They're also easy to unit
test, so add some tests to prevent regressions.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15803>
The exact semantics of these routines are subtle, although they match what
Gallium wants. We're about to add unit tests. Add some comments that make it
obvious what it is we expect these routines to do. (In particular, it's not a
general region-of-interest copy, it's a region-of-interest of the tiled image
and the entire linear staging image.)
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15803>
We depend on this invariant implicitly. Make that dependence explicit so we
don't get confused and add broken unit tests.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15803>
Thanks to our macros and some type trickery, our generic tiling routines are
type-generic. So we just need to add 48-bit and 96-bit texel types to tile. Note
we only support power-of-two bit sizes in the specialized tile routines for the
sake of replacing a multiplication with a shift.
With this change, all pixel formats supported in Panfrost are tileable.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15803>
Much less noisy, and provides a path to further improvements. There is a slight
behaviour change: int-to-float conversions now use RTE instead of RTZ. For
32-bit opcodes, this affects conversions of integers with magnitude greater than
2^23 by at most 1 ulp. As this behaviour is unspecified in GLSL, this change is
believed to be acceptable.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15187>
To prepare for defeaturing round modes, replace uses of round-to-positive with
round-to-even in our unit tests. This doesn't meaningfully impact test coverage;
there is no way to generate that round mode.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15187>
The relevant data structures have been shuffled a bit. We need to wire up AFBC
for Valhall; however, that's out of scope for the initial bring up. Just hide it
so we can build.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15795>
Use a Bifrost compatible path. It's not clear this is optimal but it passes the
tests and is no worse than what we do on Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15795>
Surface descriptors have been replaced by plane descriptors, which facilitate
the intermediate layout of textures. This allows for more sophisticated handling
of texture compressions, of particular to interest to copy_image. However, it
requires a considerable amount of new logic to handle.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15795>
Based on hardware behaviour, it appears vertex_id is zero-based with the legacy
geometry flow but not with the new malloc IDVS flow. Since the geometry flow is
per-shader (not per-machine), there's not a good way to communicate this to NIR.
Rather than trying to shoehorn this obscure detail into NIR, just do the
lowering ourselves instead of in NIR. It's not much more code anyway.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
Our swizzle lowering optimizations depend on replication of scalar fp16. This
holds on Bifrost (at least for now), but not on Valhall which has proper support
for write masks. For now, enforce Bifrost-compatible behaviour as we do not make
use of the write masks on Valhall yet.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
Replace LOAD.ubo with LD_BUFFER since the .ubo segment doesn't exist on Valhall.
We could do this with a lowering pass instead but this is probably fine.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
It is unclear if FP32 point sizes are supported on Valhall -- I can't get the
DDK to use them at any rate. Always lower them to FP16 and store them as FP16
for hardware use.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
Varying stores was changed in Valhall. Rather than using attribute descriptors
like on Bifrost and Midgard, on Valhall we store to memory directly with
hardware-allocated buffers. This requires a new implementation of store_output,
with special provisions for writing gl_PointSize from a position shader.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
Memory-allocated IDVS requires special varying load instructions that take an
offset into the hardware-allocated varying buffer, as opposed to a varying slot.
Emit these instructions.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
This affects what instructions the fragment shader uses. Will be used for the
legacy geometry flow in blit shaders. Whether that is a good idea remains to be
seen, admittedly.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
Valhall uses an updated version fo the TEXC path. To avoid disrupting the
existing Bifrost code, add a new Valhall-specific texture path that generates
the new-style texture instructions.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
Fewer arguments compared to Bifrost; the corresponding information is encoded in
a Valhall-specific blend shader prologue instead.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
On Bifrost, this is handled in the scheduler. Until we grow a Valhall scheduler,
add a NOP with the appropriate flow control. This is correct but carries a small
performance cost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
We need a slightly different idiom on Valhall, since the segment modifiers no
longer exist but we now have an immediate offset.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
Something an instruction has two logic flow controls, namely wait + reconverge.
These are orthogonal -- we need to insert a NOP to handle this. Add a lowering
pass that works out flow control to replace the ad hoc previous va_pack_flow.
Fixes dEQP-GLES31.functional.ssbo.layout.single_basic_type.shared.lowp_vec3.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15756>
These use the attribute pipe, the new versions of LD_ATTR_TEX, but reading
texture descriptors instead of attribute descriptors unlike their Bifrost
predecessors.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15756>
They're both implemented in common code as long as you use
vk_command_buffer.
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15560>
We need to map to the interchange format, since there is no longer a pixel
format for the memory layout. Use this new format table on v9.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15586>
This is the sort of leakiness I hate about blend shaders. MSAA + blend shader is
somewhat obscure but gets hit in the CTS.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15586>
Although TEX_GATHER looks like TEX_FETCH, it does support shadow comparators
like TEX_SINGLE. Model this in the IR so we can use it.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15586>
Logic is lifted from bi_layout.c, adapted to work on instructions (not
clauses) and for Valhall's off-by-one semantic which is annoyingly
different than Bifrost. (But the same as Midgard -- Bifrost was
annoyingly different than Midgard!)
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
We already collect enums in the ISA description XML. Export them for use in the
compiler backend, particularly the packing code.
Usually we'd use Mako for templating. In this case, the script is so trivial a
template engine didn't seem worth it. (The obvious version with Mako was about
10 lines longer than just prints and f-strings used here.)
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-by: Icecream95 <ixn@disroot.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
Group together dependency waits and flow control into a single enum. This
simplifies the code, clarifies some detail, and ensures consistency moving
forward.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
These are indirect versions of LD_VAR_BUF_IMM, taking their index in bytes. Used
for indirect varying loads (the NIR lowering is inefficient).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
Required on Valhall, where jumping to 0x0 doesn't automatically terminate the
program. Luckily the check is free there too.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
This is basically what's native on Valhall. Use the Valhall naming for the
pseudo-instruction on Bifrost for consistency.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
If there are modifiers only used by pseudo instructions, not the real
instructions, bi_packer can get out-of-sync with bi_opcodes, causing
hard-to-debug issues. Do the stupid-simple thing to ensure this doesn't happen.
This may be a temporary issue, depending whether ISA.xml and the IR get split
out for better Valhall support.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
We're trying to replace VK_OUTARRAY_MAKE() by VK_OUTARRAY_MAKE_TYPED()
so people don't get tempted to use it and make things incompatible with
MSVC (which doesn't support typeof()).
Suggested-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15522>
It's not usually valid, but sr_count == 0 is encodable and used for the
non-RETURN variant of ATOM1. Allow dis/assembling this syntax.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15515>
Otherwise, if batch->scissor_culls_everything is set for a single draw,
every draw after it in the batch will be skipped because the new
scissor/viewport state will never be processed. Process scissor state
early in draw_vbo to fix this interaction.
We do need to be careful: setting something on the batch can only happen when
we've decided on a batch. If we have to select a fresh batch due to too many
draws, that must happen first. This is pretty clear in the code but worth noting
for the diff.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Icecream95 <ixn@disroot.org>
Reviewed-by: Icecream95 <ixn@disroot.org>
Fixes: 79356b2e ("panfrost: Skip rasterizer discard draws without side effects")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5839
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6136
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15365>
We use bi_dontcare() to specify any encoding where we don't care about
the value, with a preference for power-efficient encodings. On Bifrost,
a (possibly nonexistant) FAU read is the best encoding. On Valhall, that
encoding doesn't exist so just use a zero. That should be good enough in
practice.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15461>
When emitting code during or after register allocation, we need to be able to
emit constants without running the constant->{LUT, move, uniform} pass running
after. In particular, we need to access the constant 0 to implement spill code.
Add a Valhall-specific zero for this purpose.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15461>
These are preloaded in different places across Bifrost and Valhall. Abstract
that away so code using the builder isn't littered with "is Valhall?" checks.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15461>
The unknown field is a descriptor type, which we model as an opcode2 since it's
a fixed constant. This allows us to disambiguate LEA_TEX_IMM from LEA_ATTR_IMM.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15461>
The position and varying shader environment descriptors are additional sections
of the job, rather than part of the (fragment only) DCD. This distinction
matters for non-IDVS jobs.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15461>
Lifetime of descriptor sets and pipeline layouts are odd. Let's refcount
them so we don't end up with use-after-free patterns.
That means we can't use custom allocators for those objects.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14406>
Now that we've switched to the common sync/submit framework, this is
implemented in runtime/vk_queue.c. We don't need to provide the stub.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15436>
This is just 2 versions of all the copy/blit entrypoings. The common
Vulkan runtime code will implement the 1.0 versions in terms of the 2
versions.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15436>
Put the 1.0 features at top followed by 1.1 and then 1.2. For filling
out the actual 1.1 and 1.2 structs, use the helpers.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15436>
Now all special FAU handling is unified, which makes both assembler and
disassembler considerably nicer. This adds some more special FAU indices from
page 0 that were previously missing, allowing them to be assembled and
disasembled.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15364>
Like Bifrost, Valhall can access 2x as many fast acess uniforms as previously
thought. However, on Valhall this requires using the pagination mechanism.
Support this in the dis/assembler.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15364>
FAU pages do not need to be specified explicitly in the assembly. Rather, they
should be inferred by the assembler by the instructions used. Rewrite the code
handling this in alignment with new information about the hardware.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15364>
These take up two slots, reading an aligned register pair, even though they are
in a 32-bit instruction. Required to correctly model BLEND.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15364>