GLES doesn't support it, and blob VK doesn't support it. We could
theoretically lower it, but don't bother since it's not required. Fixes
various piglit image load/store tests.
Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13852>
Fix a bug in BIND_PIPELINE XML reported by Dougall, which cleans up
a bit of both decoder and driver.
Instead of...
* 17 bytes BIND_PIPELINE (17)
* An unused 8 byte record (25)
* A set of N 8 byte records (25 + 8 * N)
* Oops, 1 byte too many! One just disappeared (24 + 8 * N)
It seems to instead be
* 24 bytes BIND_PIPELINE (24)
* A set of N 8 byte records (24 + 8 * N)
without the sentinel record. These means the 8 byte records themselves
are shuffled, with the high byte of the pointers split from the low
word, but that's less gross than an off-by-one.
It's still not clear what the last 8 bytes of the BIND_VERTEX_PIPELINE
structure mean, or the last 4 byte of the BIND_FRAGMENT_PIPELINE
structure which seems to be a bit shorter.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13784>
Dougall Johnson observed these structures make more sense with indices[]
first in the entries and indices[] absent from the header. Then the
sentinel entry disappears, nr_entries makes more sense, and a few magic
numbers pop out. Many thanks to Dougall's astute eyes.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13784>
this forces a full slab reclaim any time the device is known to have a
too-small BAR in order to keep memory usage at a minimum when it might otherwise
balloon out and crash us
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13850>
Some apps may try to use a viewport adjusted by 0.5 pixels (among other
things) to emulate d3d9 pixel center, and in this case we would end up
with incorrect "fake scissor" box (shifted by 1 pixel), hence pixels
being incorrectly scissored away when permit_linear_rasterizer is set
(this happens even if the linear rasterizer is not used in the end).
So adjust the offset so that the half-way points get rounded down instead
of up.
(This is all a bit iffy I think since we don't use fractional
boxes (with 8 subpixel bits) anywhere yet, but at least without msaa
it should work out.)
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13794>
Currently we run deqp-runner inside a single VM, which makes very poor
use of the available CPUs because Virgl has a bottleneck in the VMM that
serializes everything.
With this change, we can run several Crosvm instances in a runner and
make full use of the CPUs. Getting the same coverage with 3 runners
instead of 6.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12828>
nopt will disable some shader optimizations that slow down test runs for
no gain.
no_quad_lod will disable some speed hacks that can cause inaccurate
results.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12828>
this shouldn't report the budgeted available memory, it should return
the total memory, as that's what this api expects
Fixes: ff4ba3d4a7 ("zink: support PIPE_CAP_QUERY_MEMORY_INFO")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13849>
Allows memory allocated by llvmpipe_allocate_memory_fd to be
mappable to guests in virtualized environments like KVM which
requires page-aligned memory.
llvmpipe_allocate_memory is updated similarly for consistency.
Signed-off-by: Omar Akkila <omar.akkila@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13793>
The lima board farm will be unavailable for a few days, so disable it
to avoid CI failures.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13595>
Now that we aren't short-circuiting most of the code, we should probably
reorganize it a little bit. Tagged with fixes just so we pull all the
refactors together as one group.
Fixes: b21e916a62 ("iris: Combine iris_use_pinned_bo and add_exec_bo")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13808>
We need to perform cross-batch flushing if any batch writes to a BO
while others refer to it. We checked this case when recording a new
BO in the list which we'd never seen before. However, we neglected to
handle the case when we already read from a BO, but then began writing
to it. That new write may provoke a conflict between existing reads
in other batches, so we need to re-check the cross-batch flushing.
Caught by Piglit's copyteximage when forcing blits and copies to use
a new IRIS_BATCH_BLITTER that isn't upstream yet. But this bug could
be provoked by render/compute work today...we just hadn't noticed it.
Fixes: b21e916a62 ("iris: Combine iris_use_pinned_bo and add_exec_bo")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13808>
This should have no functional change, but it's tagged with Fixes
anyway because it's needed for the bug fix in the next patch.
Fixes: b21e916a62 ("iris: Combine iris_use_pinned_bo and add_exec_bo")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13808>
We are using the same definitions for both OpenGL and Vulkan, so let's
move it to common.
As we are here we are also adding versioning on the TFU register
definition. Those are basically register bit places, so really likely
to change between versions.
Adding 33 as it is the first version they got defined.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13832>
llvmpipe's fragment shaders are always run sequentially and
in API order for a single tile, so it's impossible to have
out of order render target writes requiring fetch barriers.
Issues fixed in previous commits were actually breaking most
piglit/deqp tests for coherent extension variant.
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13252>
Use lp_build_fetch_rgba_soa instead of lp_build_unpack_rgba_soa.
This one was failing most of deqp *framebuffer_fetch* tests.
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13252>
In certain cases variant->opaque could be set to true, which
reset command list for tiles fully covered by a triangle
with this shader. This is obviously wrong in presence of
framebuffer fetch.
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13252>
Using 1 bit per wrap mode looked very suspicious and after some
experiments it turns out it's 3-bit enum.
Border color is also here, it sits right after depth field. For
some reason it uses 16 bit per channel just like for clear color in RSW
GL_CLAMP mode is broken for nearest filter just as on Midgard, so add
the same workaround - use GL_CLAMP_TO_EDGE for nearest filter.
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13213>
It's just a matter of changing number of dimensions in texture
descriptor.
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13213>
It looks like MBS format used by blob doesn't distinguish sampler2D from
sampler3D, so load texture instruction is the same for 2D and 3D
textures.
So all we need to RE is texture descriptor for 3D textures, but blob
doesn't implement it, so we need to do some guesswork:
- unknown_3_1 looks like a depth since it sits after height/width and
always set to 1
- unknown_2_2 is exactly 3 bits and it follows wrap_t, so it must be
wrap_r
- missing part is texture type for 3D textures. By trial and error it
seems to be 4. First bit is only set for cubemap, so it's likely a
separate flag, and rest 2 bits look like number of tex dimensions akin
to midgard and later (thanks, panfrost!) with 0 for 1D, 1 for 2D
and 2 for 3D.
Put it all together and we have working 3D textures on lima!
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13213>
use a user SGPR instead. This will be needed in the future.
Also don't upload small_prim_precision because it's passed via
VS_STATE_BITS.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13811>
The downside is that this duplicates shader code for clip/cull distances
in both the position and parameter portions of the shader.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13811>
I'd assumed that since insert didn't take a deleter, it was
find-or-insert, not insert-or-replace. This caused a bo reference
leak if the same bo was used more than once in a batch.
Fixes: fde36d7992 ("d3d12: Don't wait for GPU reads to do CPU reads")
Reviewed By: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13819>
Currently lima uses generic TXP lowering that results in downgrading
coords precision to FP16 since we have to do some calculations with
coords instead of loading them directly from varying.
Mali4x0 has native TXP support, however coords and projector have to
come from a single source.
Add NIR lowering pass that combines coords and projector into a single
backend-specific source and use it instead of generic lowering.
Unfortunately this change regresses one test, but it also fails in blob and
disassembly is now identical.
shader-db diff:
total instructions in shared programs: 15623 -> 15603 (-0.13%)
instructions in affected programs: 877 -> 857 (-2.28%)
helped: 7
HURT: 0
helped stats (abs) min: 2 max: 8 x̄: 2.86 x̃: 2
helped stats (rel) min: 0.87% max: 10.53% x̄: 4.93% x̃: 1.85%
95% mean confidence interval for instructions value: -4.95 -0.76
95% mean confidence interval for instructions %-change: -9.31% -0.55%
Instructions are helped.
total loops in shared programs: 3 -> 3 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total spills in shared programs: 136 -> 137 (0.74%)
spills in affected programs: 0 -> 1
helped: 0
HURT: 1
total fills in shared programs: 598 -> 602 (0.67%)
fills in affected programs: 0 -> 4
helped: 0
HURT: 1
Tested-by: Denis Pauk <pauk.denis@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13111>
We prefix names with an underscore to make them "safe" C identifiers
when necessary. For example, a value of "32x32" would become "_32x32".
However, when specifying something like
<field ... prefix="BLOCK_SIZE">
<value name="32x32" value="0"/>
</field>
we already have a prefix that makes the field name safe. We'd rather
generate a name with a single underscore, i.e.
#define BLOCK_SIZE_32x32 0
rather than
#define BLOCK_SIZE__32x32 0
This also fixes up affected defines in crocus.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13809>
We had accidentally passed <x, y, z, l> instead of <l, x, y, z>.
Fixes: b8ef3271c8 ("iris: Move suballocated resources to a dedicated allocation on export")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13815>
This fixes the various gl_HelperInvocation-based tests. There's a
lowering pass which converts it to (1 << sampleid) & samplemask.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13806>
Unclear whether this fixes anything, but the blob does seem to set
these. (Discovered while trying to determine if value clamping was
missing for non-32-bit integer formats, which fail in some tests.)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13806>
Instead of treating it as 2 bits to enable, make BUFFER a type (and
extend the bitfield width), and then add a separate BUFFER bit
(ostensibly to perform the width/height concatenation but who knows).
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13805>
This is more correct because we are loading constants into an SGPR even
though there is no effect on behavior in this case.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13700>
We're currently only calling it after creating the screen and the
bufmgr. There are a few cases where Iris checks for the DEBUG_BUFMGR
bit before we call brw_process_intel_debug_variable(), which means
intel_debug is 0 and so we don't run the debug code. Today, these are
all related to the creation of the workaround bo and its mmap.
I found this in a custom branch after I converted to INTEL_DEBUG an
environment variable that I had.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13780>
Mali4x0 PP doesn't have a swizzle for load_input, so use POT-aligned
varyings to avoid unnecessary movs for vec3 and precision downgrade
in case if this vec3 is coordinates for a sampler
shader-db:
total instructions in shared programs: 15707 -> 15623 (-0.53%)
instructions in affected programs: 3906 -> 3822 (-2.15%)
helped: 47
HURT: 18
helped stats (abs) min: 1 max: 9 x̄: 3.09 x̃: 2
helped stats (rel) min: 1.49% max: 23.53% x̄: 8.20% x̃: 6.45%
HURT stats (abs) min: 1 max: 7 x̄: 3.39 x̃: 3
HURT stats (rel) min: 0.78% max: 20.59% x̄: 10.45% x̃: 10.97%
95% mean confidence interval for instructions value: -2.18 -0.41
95% mean confidence interval for instructions %-change: -5.70% -0.38%
Instructions are helped.
total spills in shared programs: 146 -> 136 (-6.85%)
spills in affected programs: 39 -> 29 (-25.64%)
helped: 6
HURT: 0
total fills in shared programs: 617 -> 598 (-3.08%)
fills in affected programs: 125 -> 106 (-15.20%)
helped: 6
HURT: 0
HURT shaders are vertex shaders where we may need more instructions
for non-packed vec3s. It's acceptable trade-off since we don't get
precision downgrade if this varying is coordinates for a sampler.
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13151>
there's no way to know what an image will be used for, so this bit needs
to always be added
fixes KHR-GL46.packed_pixels.varied_rectangle.compressed_rgb
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13798>
VK CTS just added some new tests to write to a compressed image
from a compute shader, which was overrunning memory.
The image width/height need to be sized according to the block
sizes to avoid overwriting memory.
dEQP-VK.image.sample_texture.*bit_compressed*
Cc: mesa-stable
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13618>
Using double immediates can cause a natively-float value to have to get
upgraded to a double unnecessarily. Use float immediates where possible.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13764>
fma32 only round once so has 0.5UP accuracy. mad32 round twice so
has 1UP accuracy. This accuracy difference sometimes make the result
different at the last bit.
Applications like META need more accuracy for display right result.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13686>
This makes a4xx look more like a3xx for these settings. Most importantly
it adds the workaround for allowing the hw to decide between min and mag
filtering. This fixes a number of dEQP texture filtering tests.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13763>
When force-flushing after every draw, this would otherwise hit a NULL
batch in fd_blitter_clear.
Tested on a4xx.
Suggested-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13761>
GLES2.0 spec allows parts of wide lines and points to be drawn even if
their center is outside the viewport.
Therefore 0x2000 in PLBU_CMD_PRIMITIVE_SETUP has to be set for points.
This is already our default setting as it seems to have no negative
effect when this bit is always set. Points work as expected but lines
don't. It's hard to RE it, because the affected deqp tests also fail
with the blob.
To respect this behaviour for lines and solve another 2 tests, we need
to do a workaround and temporarily extend the viewport by half of the
line width. The scissor rectangle is still equal with the initial
viewport.
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12971>
In order to implement GL_PRIMITIVES_GENERATED, v3d allocates a small
resource and adds a command to the job to store the prim counts to it.
However it was only doing this when TF was enabled which meant that if
the query was used with a geometry shader but no TF then the query would
always be zero. This patch makes the driver keep track of how many
PRIMITIVES_GENERATED queries are in flight and then enable writing the
prim count if its more than zero.
Fix dEQP-GLES31.functional.geometry_shading.query.primitives_generated_*
v2: Update CI expectations and references to fixed tests in commit log.
v3: - Add comment that GL_PRIMITIVES_GENERATED query is included because
OES_geometry_shader, but it is not part of OpenGL ES 3.1. (Iago)
- Update Fixes to commit introducing geometry shaders. (Iago)
Fixes: a1b7c084 ("v3d: fix primitive queries for geometry shaders")
Signed-off-by: Neil Roberts <nroberts@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13712>
This deletes a whole lot of code, but there's a modest drawoverhead perf
loss:
drawoverhead 1-image change -6.48856% +/- 4.28269% (n=50)
drawoverhead 8-image change -5.29195% +/- 2.62549% (n=90)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13443>
Having checked the deltas between fdl6_view and what we did before, switch
over to fdl6_view now.
No statistically significant difference on no-hw drawoverhead 8-texture
change (n=50) with the texture cache disabled from this and the previous
commit.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13443>
We had been storing pointers to a driver owned swizzle table
rather than storing the actual swizzle value in various shader
and pipeline keys on both GL and Vulkan drivers.
This doesn't look very robust, particularly since we also
compute sha1 hashes from these values and we may store these
hashes to disk (for the disk cache).
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13738>
this was a little spaghetti-ish: the module hash was sometimes being applied
during module update, sometimes in draw during program create, and then also
it was removed when a shader unbind would cause the program to no longer be reachable
now things are more consistent:
* keep removing module hash when program becomes unreachable
* only apply module hash in draw during updates there
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13727>
the overhead from creating new inlined shader variants is likely to be less than
the time required to fully optimize and run those variants, so just
inline 100% of the time to cut down shader runs
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13727>
some drivers won't create zs textures in any shape but 2D. this can be
handled instead by using 2D textures and then performing shader rewrites to
convert shadow samplers for 1D and 1DArray types to 2D/array
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13583>
We don't current enable post sync operations, but it is probably
better to set them to "internal" MOCS than to remove the non-zero
checking for this genxml field.
Reworks:
* Fix COMPUTE_WALKER in cmd_buffer_trace_rays (s-b Jason)
Fixes: 7b78b2fcac ("intel/genxml: Assert that all MOCS fields are non-zero on Gfx7+")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13624>
On GFX version 12.5+ with COMPUTE_WALKER, this is the limit based on the
size of the HW packet. On older HW, we can technically go a bit bigger
but there's not much point. Technically, some hardware can support a
scalar workgroup size up to 2048 but most apps don't go any bigger than
1024.
As discussed on the merge request page, the current limit assumes
SIMD32, but it is unclear if we want to encourage applications to use
SIMD32 if it may lead to additional register spilling in shader
programs. Many applications have likely tuned for a limit of 1024
based on the OpenGL minimum limit, so it might not gain much by
advertising more than 1024.
Reworks:
* Jordan: Use MIN2 and limit total invocations as well.
* Jordan: Add second paragraph to commit message based on merge
request discussion.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13538>
having this be super granular was a neat idea, but really I don't care
even a little bit about a driver that's weirdly implementing *only*
dynamic vertex input or *only* dynamic state2
this massively cuts down the combinatorics and provides a more accurate
gauge of driver feature levels, since this is the general level of support
that they're likely to have
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13715>
Modern games may use more than 16 sampler views, so get what the host
actually supports, and default to 16 on old hosts that don't pass the
value.
Since the possible maximal value of PIPE_MAX_SHADER_SAMPLER_VIEWS doesn't
fit into an uint32_t remove the binding flags, they were only used for
releasing the sampler views, and this can be achieved differently.
v2: Fix compilation error
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: John Bates <jbates@chromium.org> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13646>
shader_stages_mask and others expect MESA_SHADER_* based values,
not PIPE_SHADER_*...
Without this the fragment shader wouldn't appear in the "Pipelines"
pane of RGP.
Fixes: c276bde34a ("radeonsi/sqtt: export shader code to RGP")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13696>
When a device has its first slice/subslice fused off, we can't use the
number of slices/subslices to iterate the mask array.
v2: Fix spelling (Marcin)
Use size_t for iterator (Marcin)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Matt Roper <matthew.d.roper@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5601
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10015>
../mesa-9999/src/gallium/drivers/llvmpipe/lp_texture.c:55:10: fatal error: git_sha1.h: No such file or directory
Fixes: 1608a815e3 ("llvmpipe: add support for EXT_memory_object(_fd)")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: orbea <orbea@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13665>
For example, this allows us to take advantage of command-streamer
based register offsets in mi_builder.
Ref: 06cf838cbd ("intel/mi_builder: Support gen11 command-streamer based register offsets")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13652>
this ended up being a little trickier than I thought; lazy
descriptors don't use dynamic ubo types for the push set,
which means drivers that (correctly) assert dynamic offset existence
explode because the descriptor template will never work with the
push set
the better, though slightly more annoying, option here is to use the
lazy manager's faster descriptor allocation and lesser complexity to
quickly grab a push set, then tweak the existing cached codepath slightly
in order to update a raw vkdescriptorset
Fixes: 417477f60e ("zink: always use lazy (non-push) updating for fbfetch descriptors")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13677>
If an app re-issues a timestamp query a lot, but doesn't ever ask
for the results, we could end up running off the end of our query
heap. But we don't actually need to advance/accumulate, so just
use a single entry in the heap.
Reviewed By: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12920>
The slice->size0 temp is being used as both the array stride (incorrectly)
and as the size of the slice (for this assert). This assert doesn't seem
to be in the right place to me, if you want to check that offset+slice
size is < bo size, you could just do that at the end of layout setup.
This caused troubles when fixing the temp to be the actual array stride
for filling out the HW state, since then rendering to nonzero levels would
think that the rendering overflowed the BO when it doesn't.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13668>
- Unset depth_cleared_level_mask for non-clear blits. Set the flag after
the clear, so that we don't have to check blitter_running.
- Set depth_cleared_level_mask only when we set depth_clear_value.
Fixes: ff8a930cf7 - radeonsi: add _once suffix to depth_cleared_level_mask
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13603>
This can happen if view->resource is false.
Fixes a warning in GCC 9+ that's been bugging me for a very long time when building Mesa.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12955>
It returned false when a render condition was not bound, but it should
have returned true.
The bool stuff is random and incomplete, but that's life.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13602>
vertex shader stages that can produce xfb must have
their input size clamped to the compiler define MAX_VARYING
to successfully be able to export an xfb output for each input
fixes KHR-GL46.geometry_shader.limits.max_input_components
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13632>
si_update_ps_colorbuf0_slot used blitter_running as a way to detect
recursive calls.
Unfortunately this catch too many cases; for instance a backtrace
like:
#0 si_update_ps_colorbuf0_slot
#1 si_set_framebuffer_state
#2 do_blits
[...]
#5 si_blit
#6 si_copy_region_with_blit
Would end-up not updating ps_uses_fbfetch; so if the new fb_state is
something like:
cbufs = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, zsbuf = 0x55b8987545e0}
We can have ps_uses_fbfetch=true but cbufs[0] = NULL, which causes a
crash later in si_ps_key_update_framebuffer.
This commit fixes intermittent crashes in KHR-GL46.stencil_texturing.functional.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13550>
src can use dcc even for non sdma v5 variants because si_decompress_dcc
is called in si_sdma_copy_image.
Fixes: 46c95047bd ("radeonsi: implement si_sdma_copy_image for gfx7+")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13550>
We have to set 8c01 to say "leave these channels alone" when
clearing/storing just Z or S of z24s8. Fixes the bypass path for
KHR-GLES3.packed_depth_stencil.verify_read_pixels.depth24_stencil8.
Cc: mesa-stable
Fixes: #5592
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13659>
fbfetch descriptors are uncacheable due to having mixed descriptor types
in the same set, so this needs to always use lazy updating to avoid
exploding the cache and crashing
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13654>
There are three approaches for problem:
- use dummy shader (current solution)
- use software rendering
- stub
First option always gonna give bad results, second one
gonna be slideshow (r300/r400 at best are running with
old 2 cores cpu), third one can sometimes give graphical
gliches.
IMHO third one is least annoying option.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13642>
sparse binds have to be processed synchronously with cmdbuf recording to
avoid resource object desync in the vk driver, which means they have to be
done in the driver thread instead of the flush thread. this necessitates
adding locking for the queue since there is now a case when submissions occur
in a different thread
fixes illegal multithread usage in KHR-GL46.CommonBugs.CommonBug_SparseBuffersWithCopyOps
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13597>
The single-plane descriptor emit helper doesn't strictly need the UBWC
reloc, since imageBuffer can't be UBWC, but it means the function is ready
to be used for non-buffer image descriptors later.
no-hw drawoverhead 1-imageBuffer change throughput 1.95457% +/- 1.44325%
(n=127).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13635>
The gmem store stores both depth and stencil for z24s8. So, if we're
doing a write (clear or draw) to one or the other of the channels, we need
the other one restored as well.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13649>
VVL already prints its messages using configurable settings. there's no
reason for zink to unconditionally repeat them immediately after
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13633>
After running the (renamed) dxil_nir_split_typed_samplers pass, the
shader will have either:
* Textures, which map to D3D SRVs
* Bare samplers, which map to D3D bare samplers
* Images, which map to D3D UAVs
There shouldn't be any remaining samplers with type information
Reviewed-by: Enrico Galli <enrico.galli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13390>
In the menu of CS:GO R8_SRGB textures are uploaded and read back, and
since R8_SRGB can't be read back on GLES, because it is not a rendertarget
format and glGetTexImage and siblings don't exists, we can't default to
enabling reading back this format. This leads to an emulation of the
glGetTexImage calls issued by CS:GO, and this slows down the menus a lot
(below 1 fps on Intel XE hosts).
So add this driconf tweak and enable it for CS:GO to work around the issue.
It can be done safely, because in this case we actually can use the data
that is stored on the host in the backing IOV.
This tweak lets the CS:GO menu run at around 60 FPS when run with virgl
on a Intel XE host when it would run with less than 1 FPS without the tweak.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: John Bates <jbates@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13572>
This shouldn't fix any real world bugs, except it will now clear more
stale syncobjs than it was previously doing, and actually do what the
comment says it does.
I could not find a real workload where this change would be relevant,
although I didn't try too much. I wrote my own little egl program to
test this.
I spotted this while reading the code when investigating a Piglit
failure [0]. It turns out this part the code was not relvant for the
failure.
[0]: ext_external_objects-vk-image-display-overwrite
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13536>
These are manual jobs, so the expected results got out of date.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13089>
Fix defect reported by Coverity Scan.
Evaluation order violation (EVALUATION_ORDER)
write_write_typo: In unsized = unsized =
glsl_array_type(glsl_uintN_t_type(bit_size), 0U, bit_size / 8U),
unsized is written twice with the same value.
Fixes: f79a25653b ("zink: move all shader bo/sharedmem access to compiler passes")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13607>
the max driver limit for these is irrelevant if there isn't enough memory
to allocate a buffer of that size
KHR-GL46.texture_buffer.texture_buffer_max_size
cc: mesa-stable
fixes#5568
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13584>
We can't rely on regid to be unique, shaders can have multiple varyings
with the same output value. Normally shader linking deduplicates these,
but we still need to handle the case for xfb. So use slot instead as
the unique identifier.
Fixes KHR-GLES31.core.gpu_shader5.fma_precision_*
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13605>
there's other variants of implicit lod sampling, and none of them are valid
outside fragment stage
Fixes: 3ad06b6949 ("zink: always use explicit lod for texture() when legal in non-fragment stages")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13585>
This is used by our local CI (ie. vk-cts-image) which is a separate
project outside of Mesa. We use it for testing RADV since a while.
The CI lists have been created against Navi2x (Sienna Cichlid).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13573>
This eliminates the overhead of invoking si_decompress_depth.
The complication here is that we need to update needs_depth_decompress_mask
every time we update dirty_level_mask.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13492>
It was mistakenly disabled, decreasing performance a lot.
Only valid for Mesa 21.3.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Cc: 21.3 <mesa-stable@lists.freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13539>
Remove redundancy functions for vcn2 encode. Re-using the vcn1 quality params
function as a result.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13511>
Update vcn 1 encode interface, upgrade interface minor version from 2 to 9,
and add necessary parameters accordingly.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13511>
We've noticed issues with these tests when uprevving Mesa in Chrome OS.
This CI catches some existing failures, and some debug-build assertion
failures as well.
To do this, uprev deqp-runner for its new gtest-runner command. This
runner is not as efficient as I would hope, due to some expensive code in
gtest. I've reported the issue to gtest and it should be easily fixable,
but for now it at least means we get to use the same baseline/skip/flake
handling we have from deqp and piglit runners.
I also fixed build-libdrm for our rootfses to not throw away libdrm's
share directory, which was causing a bunch of test-time spam from radeon's
libdrm when trying to look up its marketing name tables (not that big of a
deal for deqp-runner, but really noisy for piglit and libva-utils which
make gallium screens approximatly per-test).
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13419>
isl now uses info->mocs regardless of whether there's any actual
depth/stencil/HiZ buffers involved, so pass it a legitimate one,
rather than zero. When we have entirely NULL surfaces, we just
default to the MOCS value for an internal buffer.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
We'd like to add safeguards against accidental use of MOCS 0 (uncached),
which can have large performance implications. One case where we use
MOCS of 0 is disabled stream output targets, MOCS shouldn't matter, as
there's no actual buffer to be cached.
That said, it should be harmless to set MOCS for these null stream
output buffers; we can just assume a MOCS for generic internal buffers.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
This reorganizes the code so that we set fields in a tidy order:
1. Set the base addresses
2. Set either buffer sizes (Gfx8) or upper bound values (Gfx4-7)
(These are logically the same thing, but expressed differently.)
3. Set MOCS (Gfx6+)
I find this easier to follow.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
We were only setting MOCS for dynamic state, surface state, instruction,
and indirect base addresses on Gen8+. We should set them on Gen6+.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
We'd like to add safeguards against accidental use of MOCS 0 (uncached),
which can have large performance implications. One case where we use
MOCS of 0 is disabled stream output targets, MOCS shouldn't matter, as
there's no actual buffer to be cached.
That said, it should be harmless to set MOCS for these null stream
output buffers; we can just assume a MOCS for generic internal buffers.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
We'd like to add safeguards against accidental use of MOCS 0 (uncached),
which can have large performance implications. One case where we use
MOCS of 0 is 3DSTATE_VERTEX_BUFFERS where we set NullVertexBuffer.
It shouldn't matter here, as there's no actual buffer to be cached.
That said, it should be harmless to set MOCS for null vertex buffers.
We can assume an internal buffer and request isl's vertex buffer MOCS.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
We'd like to add safeguards against accidental use of MOCS 0 (uncached),
which can have large performance implications. One case where we missed
setting a non-zero MOCS was in 3DSTATE_CONSTANT_ALL packets which fully
disable all constant buffers. (If any constant buffer was present, we
would set an actual MOCS value.)
MOCS really shouldn't matter here, as there are no actual constant
buffers to be cached. That said, it should be harmless to do so, and
we can just assume a generic MOCS for internal buffers.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
We were leaving this blank due to a Broadwell restriction, causing our
constant buffers to be uncached. We later fixed this for Gfx12+, but
left Gfx9-11 without a fix. We should specify one.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
isl now uses info->mocs regardless of whether there's any actual
depth/stencil/HiZ buffers involved, so pass it a legitimate one,
rather than zero. When we have entirely NULL surfaces, we just
default to isl's MOCS value for an internal depth buffer.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
We don't use bindless sampler states today, but when we do, we'll want
them to have proper MOCS values. This also avoids asserts in upcoming
patches which enforce that MOCS isn't zero.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
this check was illegal because the usage bits weren't yet populated,
so add another check after usage bits are determined to figure out if
CUBE_COMPATIBLE can be applied
additionally, checking sample counts was never needed since the spec
prohibits CUBE_COMPATIBLE use with multisampling
zink DEBUG: ERR: 'Validation Error: [ VUID-vkGetPhysicalDeviceImageFormatProperties-usage-requiredbitmask ] Object 0: VK_NULL_HANDLE, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x991b3105 | vkGetPhysicalDeviceImageFormatProperties: value of usage must not be 0. The Vulkan spec states: usage must not be 0 (https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/vkspec.html#VUID-vkGetPhysicalDeviceImageFormatProperties-usage-requiredbitmask)'
Fixes: 71494c4874 ("zink: only mark resources as cube-compatible if supported")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12580>
The test names are definitely unique (deqp has specific prefixes, piglit
uses '@' as a separator instead of '.'), so we can just have a single file
regardless of test type. Merges the two groups of xfails together so you
can't mix up which file to edit (I certainly have), and so that we don't
need to introduce yet another set of files when we add gtest for libva.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13517>
We have two testsuites with the same format for fails/flakes/skips files,
and test names that are definitely unique. As I'm about to add a third
testsuite (gtest for libva-utils), so let's have just one file each for
fails/flakes/skips instead of one per type of testsuite. This starts the
move with just the bulk rename of deqp.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13517>