Required to query texture features on Valhall. It's technically the same as
previous Malis (except for narrow ASTC), but conceptually it's different as
plane descriptors have superseded indexed pixel formats for block compressed
textures.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16035>
This helper has built-in support to be quieted, which seems like a good
idea to do on ci.
We're already setting the env var in the CI environment, so no need to
do that here.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16033>
This maps nicely to Mali's weirdo MKVEC, so implement it rather than
scalarizing. The scalarization wants an extract implemented which we don't have.
Fixes dEQP-VK.glsl.builtin.function.pack_unpack.*
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16120>
Change ASSERT_EQ to EXPECT_EQ to avoid aborting before freeing memory.
Fix defects reported by Coverity Scan.
Resource leak (RESOURCE_LEAK)
leaked_storage: Variable tiled going out of scope leaks the storage it points to.
leaked_storage: Variable linear going out of scope leaks the storage it points to.
leaked_storage: Variable ref going out of scope leaks the storage it points to.
Fixes: bb6c14a697 ("panfrost: Unit test u-interleaved tiling routines")
Suggested-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16127>
Copy the code. Fixes workgroup tests, now compute kernels should work properly
on Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16123>
Implement as f2f32(f2f16(x)) with the conversions in flush-to-zero mode.
Accessing flush-to-zero mode on Bifrost is nontrivial: it is specified
per-clause, rather than per-instruction. I've opted to pipe support for ftz
clauses through the scheduler. This solution has two nice properties:
* It uses the native hardware for flushing subnormals, avoiding extra lowering.
* It's "smart" about scheduling around FTZ requirements, meaning we get good
code generated even for a shader that e.g. quantizes a vector.
With an unrelated scheduler fix, the *V2F32_TO_V2F16/+F16_TO_F32 operation fits
in a single tuple, minimizing the overhead of the special FTZ clause.
We'll have to do something a bit different for Valhall (FLUSH.f32), but we'll
worry about when we actually have PanVK brought up on Valhall.
Fixes dEQP-VK.spirv_assembly.instruction.compute.opquantize.*
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16123>
Fix invalid usage of meson objects which violates official meson
specification and thus breaks muon, an implementation of meson
written in C.
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15715>
Originally, PAN_GPU_ID was checked in the driver itself. I added the mechanism
to run Bifrost shader-db on my Midgard laptop. There was no drm-shim support at
this point, and this was a reasonable stop gap at the time.
Nowadays, we have a competent drm-shim implementation, which wholly replaces
this use case. So PAN_GPU_ID is only useful for drm-shim. Let's pull the code
into drm-shim and get it out of the driver. This allows NDEBUG drm-shim builds
to work properly.
While we're at it, the default emulated GPU is changed from Mali-T860 to
Mali-G52. This reflects our shifting development priorities.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-by: Icecream95 <ixn@disroot.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15930>
The optimized routine documented the tiling format together with the software
algorithm. The reference implementation wants the tiling format alone
documented. Let's break out the high level documentation into somewhere
centrally accessible, and refocus the comments in the optimized file on the
optimization.
This documentation is linked bidirectionally with both implementations, so it
should be easy to find.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15803>
These are complex and not used in all dEQP paths. They're also easy to unit
test, so add some tests to prevent regressions.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15803>
The exact semantics of these routines are subtle, although they match what
Gallium wants. We're about to add unit tests. Add some comments that make it
obvious what it is we expect these routines to do. (In particular, it's not a
general region-of-interest copy, it's a region-of-interest of the tiled image
and the entire linear staging image.)
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15803>
We depend on this invariant implicitly. Make that dependence explicit so we
don't get confused and add broken unit tests.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15803>
Thanks to our macros and some type trickery, our generic tiling routines are
type-generic. So we just need to add 48-bit and 96-bit texel types to tile. Note
we only support power-of-two bit sizes in the specialized tile routines for the
sake of replacing a multiplication with a shift.
With this change, all pixel formats supported in Panfrost are tileable.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15803>
Much less noisy, and provides a path to further improvements. There is a slight
behaviour change: int-to-float conversions now use RTE instead of RTZ. For
32-bit opcodes, this affects conversions of integers with magnitude greater than
2^23 by at most 1 ulp. As this behaviour is unspecified in GLSL, this change is
believed to be acceptable.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15187>
To prepare for defeaturing round modes, replace uses of round-to-positive with
round-to-even in our unit tests. This doesn't meaningfully impact test coverage;
there is no way to generate that round mode.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15187>
The relevant data structures have been shuffled a bit. We need to wire up AFBC
for Valhall; however, that's out of scope for the initial bring up. Just hide it
so we can build.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15795>
Use a Bifrost compatible path. It's not clear this is optimal but it passes the
tests and is no worse than what we do on Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15795>
Surface descriptors have been replaced by plane descriptors, which facilitate
the intermediate layout of textures. This allows for more sophisticated handling
of texture compressions, of particular to interest to copy_image. However, it
requires a considerable amount of new logic to handle.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15795>
Based on hardware behaviour, it appears vertex_id is zero-based with the legacy
geometry flow but not with the new malloc IDVS flow. Since the geometry flow is
per-shader (not per-machine), there's not a good way to communicate this to NIR.
Rather than trying to shoehorn this obscure detail into NIR, just do the
lowering ourselves instead of in NIR. It's not much more code anyway.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
Our swizzle lowering optimizations depend on replication of scalar fp16. This
holds on Bifrost (at least for now), but not on Valhall which has proper support
for write masks. For now, enforce Bifrost-compatible behaviour as we do not make
use of the write masks on Valhall yet.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
Replace LOAD.ubo with LD_BUFFER since the .ubo segment doesn't exist on Valhall.
We could do this with a lowering pass instead but this is probably fine.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
It is unclear if FP32 point sizes are supported on Valhall -- I can't get the
DDK to use them at any rate. Always lower them to FP16 and store them as FP16
for hardware use.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
Varying stores was changed in Valhall. Rather than using attribute descriptors
like on Bifrost and Midgard, on Valhall we store to memory directly with
hardware-allocated buffers. This requires a new implementation of store_output,
with special provisions for writing gl_PointSize from a position shader.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
Memory-allocated IDVS requires special varying load instructions that take an
offset into the hardware-allocated varying buffer, as opposed to a varying slot.
Emit these instructions.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
This affects what instructions the fragment shader uses. Will be used for the
legacy geometry flow in blit shaders. Whether that is a good idea remains to be
seen, admittedly.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
Valhall uses an updated version fo the TEXC path. To avoid disrupting the
existing Bifrost code, add a new Valhall-specific texture path that generates
the new-style texture instructions.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
Fewer arguments compared to Bifrost; the corresponding information is encoded in
a Valhall-specific blend shader prologue instead.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
On Bifrost, this is handled in the scheduler. Until we grow a Valhall scheduler,
add a NOP with the appropriate flow control. This is correct but carries a small
performance cost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
We need a slightly different idiom on Valhall, since the segment modifiers no
longer exist but we now have an immediate offset.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15793>
Something an instruction has two logic flow controls, namely wait + reconverge.
These are orthogonal -- we need to insert a NOP to handle this. Add a lowering
pass that works out flow control to replace the ad hoc previous va_pack_flow.
Fixes dEQP-GLES31.functional.ssbo.layout.single_basic_type.shared.lowp_vec3.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15756>
These use the attribute pipe, the new versions of LD_ATTR_TEX, but reading
texture descriptors instead of attribute descriptors unlike their Bifrost
predecessors.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15756>
They're both implemented in common code as long as you use
vk_command_buffer.
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15560>
We need to map to the interchange format, since there is no longer a pixel
format for the memory layout. Use this new format table on v9.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15586>
This is the sort of leakiness I hate about blend shaders. MSAA + blend shader is
somewhat obscure but gets hit in the CTS.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15586>
Although TEX_GATHER looks like TEX_FETCH, it does support shadow comparators
like TEX_SINGLE. Model this in the IR so we can use it.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15586>
Logic is lifted from bi_layout.c, adapted to work on instructions (not
clauses) and for Valhall's off-by-one semantic which is annoyingly
different than Bifrost. (But the same as Midgard -- Bifrost was
annoyingly different than Midgard!)
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
We already collect enums in the ISA description XML. Export them for use in the
compiler backend, particularly the packing code.
Usually we'd use Mako for templating. In this case, the script is so trivial a
template engine didn't seem worth it. (The obvious version with Mako was about
10 lines longer than just prints and f-strings used here.)
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-by: Icecream95 <ixn@disroot.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
Group together dependency waits and flow control into a single enum. This
simplifies the code, clarifies some detail, and ensures consistency moving
forward.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
These are indirect versions of LD_VAR_BUF_IMM, taking their index in bytes. Used
for indirect varying loads (the NIR lowering is inefficient).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
Required on Valhall, where jumping to 0x0 doesn't automatically terminate the
program. Luckily the check is free there too.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
This is basically what's native on Valhall. Use the Valhall naming for the
pseudo-instruction on Bifrost for consistency.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
If there are modifiers only used by pseudo instructions, not the real
instructions, bi_packer can get out-of-sync with bi_opcodes, causing
hard-to-debug issues. Do the stupid-simple thing to ensure this doesn't happen.
This may be a temporary issue, depending whether ISA.xml and the IR get split
out for better Valhall support.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15223>
We're trying to replace VK_OUTARRAY_MAKE() by VK_OUTARRAY_MAKE_TYPED()
so people don't get tempted to use it and make things incompatible with
MSVC (which doesn't support typeof()).
Suggested-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15522>
It's not usually valid, but sr_count == 0 is encodable and used for the
non-RETURN variant of ATOM1. Allow dis/assembling this syntax.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15515>
Otherwise, if batch->scissor_culls_everything is set for a single draw,
every draw after it in the batch will be skipped because the new
scissor/viewport state will never be processed. Process scissor state
early in draw_vbo to fix this interaction.
We do need to be careful: setting something on the batch can only happen when
we've decided on a batch. If we have to select a fresh batch due to too many
draws, that must happen first. This is pretty clear in the code but worth noting
for the diff.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Icecream95 <ixn@disroot.org>
Reviewed-by: Icecream95 <ixn@disroot.org>
Fixes: 79356b2e ("panfrost: Skip rasterizer discard draws without side effects")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5839
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6136
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15365>
We use bi_dontcare() to specify any encoding where we don't care about
the value, with a preference for power-efficient encodings. On Bifrost,
a (possibly nonexistant) FAU read is the best encoding. On Valhall, that
encoding doesn't exist so just use a zero. That should be good enough in
practice.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15461>
When emitting code during or after register allocation, we need to be able to
emit constants without running the constant->{LUT, move, uniform} pass running
after. In particular, we need to access the constant 0 to implement spill code.
Add a Valhall-specific zero for this purpose.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15461>
These are preloaded in different places across Bifrost and Valhall. Abstract
that away so code using the builder isn't littered with "is Valhall?" checks.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15461>
The unknown field is a descriptor type, which we model as an opcode2 since it's
a fixed constant. This allows us to disambiguate LEA_TEX_IMM from LEA_ATTR_IMM.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15461>
The position and varying shader environment descriptors are additional sections
of the job, rather than part of the (fragment only) DCD. This distinction
matters for non-IDVS jobs.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15461>
Lifetime of descriptor sets and pipeline layouts are odd. Let's refcount
them so we don't end up with use-after-free patterns.
That means we can't use custom allocators for those objects.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14406>
Now that we've switched to the common sync/submit framework, this is
implemented in runtime/vk_queue.c. We don't need to provide the stub.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15436>
This is just 2 versions of all the copy/blit entrypoings. The common
Vulkan runtime code will implement the 1.0 versions in terms of the 2
versions.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15436>
Put the 1.0 features at top followed by 1.1 and then 1.2. For filling
out the actual 1.1 and 1.2 structs, use the helpers.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15436>
Now all special FAU handling is unified, which makes both assembler and
disassembler considerably nicer. This adds some more special FAU indices from
page 0 that were previously missing, allowing them to be assembled and
disasembled.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15364>
Like Bifrost, Valhall can access 2x as many fast acess uniforms as previously
thought. However, on Valhall this requires using the pagination mechanism.
Support this in the dis/assembler.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15364>
FAU pages do not need to be specified explicitly in the assembly. Rather, they
should be inferred by the assembler by the instructions used. Rewrite the code
handling this in alignment with new information about the hardware.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15364>
These take up two slots, reading an aligned register pair, even though they are
in a 32-bit instruction. Required to correctly model BLEND.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15364>
We were assuming per-vertex attributes so far. Let's extend the logic
to support per-instance attributes with or without custom instance
divisors. Now that we've got it all hooked up, we can enable
VK_EXT_vertex_attribute_divisor.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15295>
This adds a new get_resource_deref_binding helper which decodes a
resource deref into set, binding, and index. To make texture
instructions nicer, the index can optionally be split into immediate
and SSA parts.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15382>
In the NIR domain, some texture operations don't require a sampler, but
Bifrost/Midgard always want one. Let's add a dummy sampler to handle
that case.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15334>
We're nowhere close to even having Vulkan 1.0 working yet, there's no
reason to get too excited about 1.1. It just means piles more test
crashes for features we're claiming to support but don't. If we want to
enable more tests, we can turn on the extensions for those features once
we actually have them working.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15334>
Handle arrays generically by using the last component of the coordinate source
as the array index. That works for both 2D arrays and cube arrays, fixing cube
arrays. Cube arrays were already handled correctly in core Panfrost code.
This code path is not tested in dEQP-GLES31 without exposing OES_cube_map_array,
which depends on OES_geometry_shader, which we don't have. Yet we do expose
PIPE_CAP_CUBE_ARRAY, so ARB_cube_map_array is exposed.
Disabling PIPE_CAP_CUBE_ARRAY would be an easy band-aid fix, but it's easy
enough to handle correctly.
dEQP-GLES31 passes with a hack enabling OES_cube_map_array [without geometry
shaders].
Also fixes 1D arrays on Bifrost for the same reasons.
Fixes: 70d6c5675d ("pan/bi: Emit TEXC with builder")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15254>
Hardware support was removed with Midgard. Use mesa/st to emulate GL_CLAMP with
nir_lower_tex automatically (the Zink lowering), and disable GL_MIRROR_CLAMP
which isn't lowered correctly.
Fixes *texwrap* Piglit tests on G52.
Fixes: f9ceab7b23 ("panfrost: Fix CLAMP wrap mode")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15253>
nir_lower_explicit_io generates mul+add chains even for constants. One
round of constant folding should get rid of these. This fixes all of
the dEQP-VK.glsl.conversions.* tests on panvk.
GoGoGoGo'd-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15349>
We leave the undecorated ops as fine so we don't disturb panfrost. For
coarse derivatives, we use a lane ID of 0 for the first lane and 1 or 2
for the second depending on axis. This ensures that coarse derivatives
are quad-uniform.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15352>
Instead of two magic ternary operations, define a new `axis` temporary
which is 1 for X and 2 for Y. Then define everything else in terms of
this variable. In particular, the mask operation we do on LANE_ID is a
mask so it makes more sense to use ~axis than 1/2 but in the other
order.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15352>
So we can easily add entries after the standard varyings/attributes
(like image descriptors in the attribute array).
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15248>
The idea of storing offsets in a separate UBO and lowering accesses to
UBOs/SSBOs with a dynamic offset was not great. Let's apply the offset
at UBO/SSBO emission time instead.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15248>
This is to match other NIR terminology.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15103>
Another instruction might write to the second source, and then an
INSTR_INVALID_ENC fault will be raised because the tuple will write to
and read from the register at the same time.
Fixes: 795638767d ("pan/bi: Use fused dual source blending")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>
BITSET_MASK returns ~0 when given an input of zero, when we need it to
return 0 instead.
Fixes shaders with only sysvals but no UBOs when push constants are
disabled.
This breaks when 31 or 32 UBOs are used, but PAN_MAX_CONST_BUFFERS is
currently set to 16.
Fixes: c246af0dd8 ("panfrost: Only upload UBOs when needed")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>
This could be made slightly more efficient by only setting the dirty
state that is needed, but eventually you reach a point where it's
cheaper to re-emit everything than work out what can or can't be kept.
Fixes rendering issues in Duckstation.
Fixes: cd2c1ef9da ("panfrost: Dirty track textures/samplers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>
TEXC can have two destinations; the value for neither of them can be
used in the same bundle, so extend the code to check for this to
iterate over both destinations.
Fixes artefacts in the game "LIMBO".
Fixes: a303076c1a ("pan/bi: Add bi_instr_schedulable predicate")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>
Trying to write to overlapping register ranges from a single
instruction is undefined behaviour, so add interference between the
nodes to avoid this.
Hit in a dual-texture instruction in LIMBO.
Fixes: 9146bafbb4 ("pan/bi: Add dual texture fusing pass")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>
Technically this is just JUMP on Valhall, but the semantic is an indirect branch
based on comparing with zero. It can also be used as a conservative branch (like
BRANCHC), but this isn't modeled.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
For Bifrost, we model load/store segments, for example for thread local storage.
We need something similar on Valhall -- access modifiers. There are four access
modifiers on Valhall, controlling memory subsystem optimizations for the access:
none: Nothing may be assumed. Corresponds to "global".
istream: Internally streaming within the GPU. Corresponds to "pos", as it's
used for position stores.
estream: Externally streaming outside the GPU. Corresponds to "vary", as it's
used for varying stores.
force: Force access in discarded threads. Corresponds to "tl", as it's required
for correct behaviour of helper invocations that use the stack.
If these access modifiers end up being useful outside these fixed purposes, we
may need to rework this part of the IR. For now, this should suffice.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
For now, the difference does not matter. However it's better to model the actual
hardware behaviour, rather than isomorphic driver behaviour, when we can do so.
So fix the names.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Technically the table is folded, too; the 0xD refers to table 61. But this
instruction is more general than previously thought.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
So close! However, LD_VAR_IMM is something else -- Bifrost-style varying
interpolation, without a hardware buffer. For ES3, we'll need to support both.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
These are unified in the hardware, so let's unify them in pan_shader_info.
Hoisting this logic to pan_shader.c avoids the need to duplicate this logic for
Midgard/Bifrost (RSD packing) and Valhall (SPD packing).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
Instead of specifying the tiling on the texture descriptor, Valhall specifies it
on the plane descriptor. There is a new flag on the texture descriptor
specifying only whether the planes are interleaved or not.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
The plane descriptor is larger than earlier surface descriptors, so we need to
be somewhat careful here. This removes a memory micro-optimization in the
interest of simplifying the code.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
Unnecessary. To avoid even more #if/#endif soup, merge the v4, v5-v8, and v9
paths together -- by returning 0 as the compression tag on v4 or v9.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
To dump all graphics memory via the new pandecode_dump_mappings function(),
since for Valhall I have to do this often enough to warrant a dynamic flag.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
Influences cache prefetching. I don't see a good reason to put anything other
than descriptors inside shader resources, meaning always setting this bit is
appropriate (at least for GLES).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
The structure wrapped around the rb tree node was being freed, but not the node
itself, which caused a segmentation fault when accessing its parent node.
Add rb tree node remove call to fix it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15188>
Fix the definitions of the basic texturing instructions. In particular, a
register format and a write mask were previously missing, as well as incorrect
handling of staging registers.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15182>
Ocassionally the 0 value has a meaningful value that's not meaningfully default,
so we want an enum to encode both possible states.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15182>
When we shadow a resource, the backing BO is changed; as such,
existing references to the resource become invalid. So batches accessing the
resource need to be flushed (or otherwise have their references invalidated).
The wrong behaviour change (not flushing) was introduced when we started
tracking resources instead of BOs. The issue manifested as a severe performance
regression in glmark2's -bbuffer test, particular the subdata subtest. The issue
is magnified on slow CPUs; without the fix, the test becomes completely CPU
bound
Relevant glmark2 -bbuffer test from 43fps to 84fps.
Apparently, this causes functional issues too -- this performance-minded change
also fixes a few piglits.
Fixes: cecb889481 ("panfrost: Do tracking of resources, not BOs")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Chris Healy <cphealy@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13502>
To make sure it is applied in the cases we expect it to be, to avoid code
generation regressions. Functional regressions are expected to be caught by
integration-testing, so that is not focused on here.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9438>
If a message-passing instruction like LD_VAR is preloaded, it will no longer be
counted in the shader cycle counts. Add a special message preload counter that
approximates the cost of preloading, so this information doesn't get a lost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9438>
Include full message preload descriptors in the RSD on v7, and do the obvious
packing for fragment shader message preloads.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9438>
The compiler will soon produce preloaded messages, but it should not pack them
itself, as this would require depending on GenXML or handcoding bitfields / bit
packs in the compiler. Instead, add a struct encoding the unpacked form of the
message, used as ABI between the compiler and the common driver.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9438>
Actually an xfail but occassionally passes and gives us no new information, only
noise.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-and-acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15154>
Add a global-level variable that allows disabling all jobs that would
have gone to the Collabora lab, to be used in case of outages.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15150>
On Bifrost and Valhall, push uniforms are loaded into Fast Access Uniform
Random Access Memory (FAU-RAM). FAU-RAM is organized as an array of 64-bit
slots. A given tuple (Bifrost) or instruction (Valhall) may access at most a
single 64-bit slot. If an instruction requires uniforms from multiple 64-bit
slots, a uniform-to-register move must be inserted to avoid the hazard. However,
if an instruction requires a pair of 32-bit uniforms from the same 64-bit slot,
no move is required.
To reduce the number of moves we emit, this commit adds an optimization pass
that reorders pushed uniforms, trying to group uniforms used by the same
instruction. The pass works by creating a graph of pushed uniforms, where edges
denote the "both 32-bit uniforms required by the same instruction" relationship.
We perform depth-first search on this graph to find the connected components,
where each connected component is a cluster of uniforms that are used together.
We then select pairs of uniforms from each connected component. The remaining
unpaired uniforms (from components of odd sizes) are paired together
arbitrarily.
In principle, we should weight the graph by number of occurences and choose
pairs that maximize the total selected edge weight. This is left for
future work, as it is nontrivial -- selecting these edges optimally appears to
be NP-hard at first blush.
Implementation note: As position and varying shaders share FAU on Bifrost, extra
care is taken with a `push_offset` shader stage info parameter that ensures
varying shaders do not reorder uniforms selected by the previous position
shader.
total instructions in shared programs: 2503343 -> 2451758 (-2.06%)
instructions in affected programs: 1553309 -> 1501724 (-3.32%)
helped: 14256
HURT: 8
helped stats (abs) min: 1.0 max: 80.0 x̄: 3.62 x̃: 3
helped stats (rel) min: 0.06% max: 36.36% x̄: 7.31% x̃: 6.67%
HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.38 x̃: 1
HURT stats (rel) min: 1.30% max: 12.50% x̄: 4.99% x̃: 3.85%
95% mean confidence interval for instructions value: -3.66 -3.58
95% mean confidence interval for instructions %-change: -7.41% -7.20%
Instructions are helped.
total tuples in shared programs: 2008399 -> 1969627 (-1.93%)
tuples in affected programs: 1146344 -> 1107572 (-3.38%)
helped: 12867
HURT: 147
helped stats (abs) min: 1.0 max: 61.0 x̄: 3.03 x̃: 2
helped stats (rel) min: 0.17% max: 42.86% x̄: 6.79% x̃: 4.65%
HURT stats (abs) min: 1.0 max: 3.0 x̄: 1.20 x̃: 1
HURT stats (rel) min: 0.29% max: 20.00% x̄: 2.12% x̃: 1.19%
95% mean confidence interval for tuples value: -3.03 -2.93
95% mean confidence interval for tuples %-change: -6.82% -6.57%
Tuples are helped.
total clauses in shared programs: 408005 -> 401708 (-1.54%)
clauses in affected programs: 90760 -> 84463 (-6.94%)
helped: 6006
HURT: 164
helped stats (abs) min: 1.0 max: 9.0 x̄: 1.08 x̃: 1
helped stats (rel) min: 0.45% max: 33.33% x̄: 12.44% x̃: 14.29%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 1.64% max: 25.00% x̄: 9.81% x̃: 5.26%
95% mean confidence interval for clauses value: -1.03 -1.01
95% mean confidence interval for clauses %-change: -12.03% -11.66%
Clauses are helped.
total cycles in shared programs: 203308.37 -> 202737.83 (-0.28%)
cycles in affected programs: 19264.71 -> 18694.17 (-2.96%)
helped: 3024
HURT: 41
helped stats (abs) min: 0.041665999999999315 max: 2.5416680000000014 x̄: 0.19 x̃: 0
helped stats (rel) min: 0.17% max: 33.33% x̄: 3.83% x̃: 2.83%
HURT stats (abs) min: 0.041665999999999315 max: 0.125 x̄: 0.06 x̃: 0
HURT stats (rel) min: 0.30% max: 5.88% x̄: 1.41% x̃: 0.93%
95% mean confidence interval for cycles value: -0.19 -0.18
95% mean confidence interval for cycles %-change: -3.89% -3.64%
Cycles are helped.
total arith in shared programs: 76265.67 -> 74669.25 (-2.09%)
arith in affected programs: 45001.50 -> 43405.08 (-3.55%)
helped: 12945
HURT: 97
helped stats (abs) min: 0.041665999999999315 max: 2.5416680000000014 x̄: 0.12 x̃: 0
helped stats (rel) min: 0.17% max: 50.00% x̄: 8.06% x̃: 4.88%
HURT stats (abs) min: 0.041665999999999315 max: 0.125 x̄: 0.05 x̃: 0
HURT stats (rel) min: 0.21% max: 33.33% x̄: 2.16% x̃: 0.96%
95% mean confidence interval for arith value: -0.12 -0.12
95% mean confidence interval for arith %-change: -8.16% -7.81%
Arith are helped.
total quadwords in shared programs: 1796563 -> 1766803 (-1.66%)
quadwords in affected programs: 948830 -> 919070 (-3.14%)
helped: 12078
HURT: 219
helped stats (abs) min: 1.0 max: 42.0 x̄: 2.49 x̃: 2
helped stats (rel) min: 0.10% max: 33.33% x̄: 5.57% x̃: 5.26%
HURT stats (abs) min: 1.0 max: 4.0 x̄: 1.21 x̃: 1
HURT stats (rel) min: 0.33% max: 6.67% x̄: 2.00% x̃: 1.14%
95% mean confidence interval for quadwords value: -2.46 -2.38
95% mean confidence interval for quadwords %-change: -5.52% -5.36%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14163>
There are up to 4 instructions in the latter stage (if a branch is included),
not 3. Bump the limit to fix memory corruption.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Icecream95 <ixn@disroot.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15147>
We always blit from a particular level, so it's a waste to compute the LOD.
This corresponds to a simple texture instruction with implement 0 LOD, which is
the optimal texturing path on Bifrost -- it maps to TEXS_2D but does not require
helper invocations.
Functional change on Bifrost: Blit shaders no longer set .computed_lod or
shader_contains_barrier.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
There are always set to true. Don't pollute the driver code with them, make
their existence a local detail to pre-Valhall XML and that's it.
Functional change: "four components per vertex" is now set on vertex job DCDs.
This should be a no-op.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
This reduces code duplication and will ease Valhall porting. Functional changes
on v7:
* Shader contains barrier is now set (perf loss, fixed later in series)
* Shader register allocation is now set (perf win)
* Point sprite inverted, no-op for blit shaders
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
cont -> skip, last -> kill, and fix the special case handling. It's just an
enum. Makes the disassembly easier to read and closer to Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
Extend our existing bi_scoreboard infrastructure with a simple data flow
analysis pass that calculates which dependency slots need waiting. We
still lack a heuristic for selecting dependency slots.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
Bifrost post-RA dead code elimination can cull the destinations of
regular ALU instructions, by weakening from a register write to a
temporary write. However, there is no way to suppress staging writes, so
culling the destinations will result in invalid code generation.
Fixes a regression in
dEQP-GLES3.functional.shaders.switch.switch_in_for_loop_static_vertex
with scoreboarding. The root cause there is the backend dead code
elimination not being sufficiently aggressive in the presence of control
flow. Usually this does not matter, since the backend optimizations are
intended to be local with global optimizations happening in NIR.
Unfortunately, our implementation of IDVS hits this hard. That will need
to be optimized (probably by specializing IDVS shaders in NIR instead of
the backend). In the mean time, let's fix the actual bug affecting
scoreboarding.
No shader-db changes.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
They are useless (given the semantics of DTSEL_IMM) and complicate
scoreboarding. Just remove them in the pass that removes all the other silly
register destinations.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
Barriers need to wait on all outstanding messages. This is more of an API
requirement than a hardware requirement, but it's still an invariant the
scoreboarding pass must respect.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
The important thing isn't the number of words pushed, it's that there are no
UBOs required for us to upload. Check that instead.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15090>
This reverts commit 29d319c767.
Now that we use nir_lower_bool_to_bitsize, we don't see 1-bit booleans
anymore, so the issue this fixed doesn't apply. Actually, that issue was
(in part) why I started looking into boolean handling in the first
place.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>
lower_bool_to_bitsize can generate i2i32 from a 32-bit source, which is
trivial but needs to be handled explicitly to avoid going down the 8-bit
conversion path.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>
This lets us avoid generating SWZ instructions. Those instructions could
be constant folded but that complicates the replication analysis
introduced in the next commit.
Almost no shader-db changes.
quadwords HURT: shaders/glmark/1-22.shader_test MESA_SHADER_FRAGMENT: 718 -> 722 (0.56%)
total quadwords in shared programs: 68169 -> 68173 (<.01%)
quadwords in affected programs: 718 -> 722 (0.56%)
helped: 0
HURT: 1
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>
Label IDVS variants as being MESA_SHADER_{POSITION, VARYING} stages;
reserve the MESA_SHADER_VERTEX label for non-IDVS shaders. This reduces
confusion where a single shader compiles to two MESA_SHADER_VERTEX
shaders with different stats.
While we're at it, de-vendor the blend shader stage name; these stats
are internal anyway.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15086>
There are many of them, and integration testing of the scheduler won't hit every
case. Add targeted unit tests for the various scheduling hazards of this funny
instruction.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15072>
This hazard exists but is obscure enough to be missed on our existing test
coverage (e.g the conformance tests). Add piles of unit tests for it.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15072>
This is a very obscure encoding restriction in the Bifrost ISA. Unknown if any
real apps or tests hit this, but we still need to get it right sadly.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15072>
To help identify problems across the compiler, print more forms of the shader
with MIDGARD_MESA_DEBUG=shaders. Roughly matches the Bifrost compiler.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14888>
nir_builder_init_simple_shader does this automatically now.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14936>
Commit 38800b38 changed nir_opcodes.py, but that doesn't seem to have
triggered nir_opt_algebraic.py. The change in 75ef5991 depends on
opt_algebraic lowering 16-bit versions of slt, but if opt_algebraic is
not rebuilt, this may not happen. This resulted in some people seeing
assertion failures in, for example,
dEQP-VK.spirv_assembly.instruction.compute.float16.arithmetic_3.step,
due to the backend seeing nir_op_slt that it didn't know how to handle.
v2: Add nir_opcodes.py to nir_algebraic_py so that all the per-driver
algebraic passes pick up the dependency too. Rename it to
nir_algebraic_depends. Suggested by Emma.
Closes: #6047
Fixes: d1992255bb ("meson: Add build Intel "anv" vulkan driver")
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15050>
Valhall has a new twist on Mali's task splitting voodoo, plus compute offset
support.
On Bifrost + Vulkan, compute offsets needed lowering on Bifrost (gl_GlobalID).
Valhall saves a few instructions here.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15047>
Only breaking change since Bifrost is that the shader contains barrier? flag is
now fragment-only, meaning it is just a spawn helper threads flag. This affects
compute shaders slightly.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15003>
Contains stuff needed for layered rendering. Unfortunately, there's no more
provoking vertex per draw -- ugh! That's fine for Vulkan (just don't set
provokingVertexModePerPipeline), but requires inserting extra flushes on desktop
OpenGL.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15003>
Looks like 3 implementations already have that field in their private
command_buffer struct, and having it at the vk_command_buffer opens the
door for generic (but suboptimal) secondary command buffer support.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14917>
The main thing is VK 1.3 testing, but also includes test bugfixes. The
1.3 CTS required an uprev of deqp-runner to handle a new style of test
output, and that deqp-runner brings in some neat new features, too (piglit
in your deqp-runner suite, and extension list checking).
A bunch of VK tests got renamed, so I replaced panvk's custom test list
with simple include filters on the main test list.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (panvk)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14920>
Block compressed formats like ETC2 are now indicated in the plane descriptor,
rather than the pixel format descriptor. Various other minor formats were
removed in Valhall; remove them from the XML so we don't accidentally try to use
them.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14935>
Remove a few that no longer exist, and rename IDVS helper to Malloc Vertex. The
distinction between Malloc Vertex jobs and regular Indexed Vertex jobs is that
the hardware allocates varying buffers dynamically for Malloc Vertex jobs.
Regular IDVS and even legacy tiler jobs are also supported where desired.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14935>
Merged with the Buffer descriptor, hence why it shares a type nibble. However,
Bifrost uses a dedicated tiler heap descriptor, and I see no benefit to merging.
So pretending it's a dedicated descriptor on Valhall too allows us to reuse the
Bifrost code with no modifications.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14935>
Compute shaders might do vec3(Xbits) loads which are translated
to LD.<next_pow2(3 * X)> by the midgard compiler. This might cause
out-of-bound accesses potentially leading to pagefaults if the
access is at the end of a BO. One solution to avoid that (maybe not
the best) is to split non-log2 loads to make sure we only read what's
requested.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14885>
In that case, the mask is specified on 32bit lanes, so we need to shift
it if it's > 0x3. The expand modifier will take care of selecting the
right side of the 32bit vector.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14885>