The "fragment shader required?" computed state only covers fragment shader side
effects. Even when no fragment shader is required, there may be depth/stencil
side effects, which make rasterization non-optional. What actually gates
rasterization is the rasterizer discard bit. Use that instead.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16204>
Otherwise wide lines break. The alternative approach is to eliminate the points
writes when not drawing points since we do have topology information at compile
time. I'm admittedly stuck in my GL mindset. That's the approach we'll need for
Valhall anyway.
Fixes dEQP-VK.rasterization.interpolation.basic.lines_wide
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16204>
The VAR_TEX definition in ISA.xml only has a field for texture_index,
so trying to read sampler_index will return zero; read from
texture_index instead, and rename other fields for consistency.
The texture and sampler indices must be equal for VAR_TEX to be used,
so either name could be used for the field.
Fixes the wrong textures being used in Thief.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6219
Fixes: eb1479bda2 ("pan/bi: Support message preloading")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16255>
While using three component texture formats results in CTS failures,
three component vertex attributes are fine, and not allowing them
results in significant performance regressions.
Fixes: e41958e344 ("r600: Disable eight bit three channel formats")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6399
v2: rename function to is_buffer_format_supported (Emma)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16267>
Discrete platforms don't have LLC, but on those, we mmap our buffers
with WC. So we shouldn't need to clflush there.
Anv already had a boolean field on the physical device to know whether
we need to use clflush(), based off the memory heaps available. So use
that instead.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15780>
981bd8cbe2 moved the handling of output removal to NIR, but instead of
applying it only to the last stage before the FS, it now applies
it to both the GS and the VS.
This commit fixes this by clearing the kill_outputs field for
the VS when using an ES-GS shader.
Fixes: 981bd8cbe2 ("radeonsi: apply key.ge.opt.kill_{outputs,pointsize,clipdistance} in NIR")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16249>
Layout transitions are not relevant to us, we only care about barriers
that involve a sync point between read/write actions on the image across
GPU jobs.
Image transitions from undefined layout can only happen before the image
is ever used by the GPU, which means they are never relevant to our
implementation.
This improves performance in vkQuake.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16235>
The current DCE pass hits issues around phi nodes. These need to be
solved properly eventually, but for now work around them by doing
something obviously correct (but suboptimal for compile time).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16268>
We need to insert parallel copies at the logical end of blocks, before branches.
Add a pseudo instruction signaling that. Cribbed from ACO.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16268>
Lifted from ir3. Algorithm is the same; the data structures and interface are
lightly modified to decouple from ir3's IR.
Sequentializing parallel copies after RA is tricky. ir3's implementation works
well enough, so I use that one.
Original implementation by Connor Abbott.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16268>
Rather than using builder magic (implicitly lowered on emit), add actual pseudo
operations (explicitly lowered before encoding). In theory this is slower, but I
doubt it matters. This makes the instruction aliases first-class for IR printing
and machine inspection, which will make optimization passes easier to write.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16268>
Lifted from Bifrost. Add some basic optimizer tests (they pass!) to show the
compiler is ready to be unit tested. Given we can't have hardware CI for Asahi
yet -- and dEQP is still pretty janky -- unit testing should prove quite useful.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16268>
Instructions, bytes, and registers -- this should hold us over until we
can reverse the underlying uarch and get proper cycle estimations.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16268>
We usually use pdevice for "physical device" and not "device pointer".
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16259>
This changes the intel_device_info calculation to call an additional
DRM query requesting the geometry topology from the kernel, which may
differ from the result of the current topology query on XeHP+
platforms with compute-only and 3D-only DSSes. This seems more
reliable than the current guesswork done in intel_device_info.c trying
to figure out which DSSes are available for the render CS.
Cc: 22.1 <mesa-stable>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14143>
The glsl-to-tgsi code generation and GLSL IR linker are going away
(!8044), so we need to make the call on whether to use nir-to-tgsi (See
!15932 and !15541), or switch over to the NIR code generator. The NIR
backend should reduce the compile time regression while providing more
direct control over the IR we receive than going through NTT, while still
providing the optimization that NIR-to-TGSI was bringing us.
nv92 shader-db:
total local in shared programs: 2048 -> 1988 (-2.93%)
local in affected programs: 2048 -> 1988 (-2.93%)
total gpr in shared programs: 688468 -> 724705 (5.26%)
gpr in affected programs: 437159 -> 473396 (8.29%)
total instructions in shared programs: 6115978 -> 5874401 (-3.95%)
instructions in affected programs: 5038041 -> 4796464 (-4.80%)
total loops in shared programs: 1361 -> 835 (-38.65%)
loops in affected programs: 538 -> 12 (-97.77%)
total bytes in shared programs: 42389752 -> 40480416 (-4.50%)
bytes in affected programs: 36311616 -> 34402280 (-5.26%)
LOST: 0
GAINED: 1 (pixmark-piano)
nv120 shader-db:
total local in shared programs: 4416 -> 1988 (-54.98%)
local in affected programs: 4416 -> 1988 (-54.98%)
total gpr in shared programs: 870534 -> 893490 (2.64%)
gpr in affected programs: 564210 -> 587166 (4.07%)
total instructions in shared programs: 6379402 -> 6243210 (-2.13%)
instructions in affected programs: 5430790 -> 5294598 (-2.51%)
total bytes in shared programs: 68184224 -> 66729672 (-2.13%)
bytes in affected programs: 58013544 -> 56558992 (-2.51%)
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15949>
nvc0 aligns to 0x10 in setting up its program header, but nv50 TLS
allocation expects the incoming value to be aligned already (like TGSI
always did). Avoids regression in
KHR-GL33.shaders.arrays.declaration.dynamic_expression_array_access_* with
the nir backend.
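For illustration only, aligning a size up to 0x10 can be sketched as below.
align_pot() and tls_space_aligned() are hypothetical names made up for this
sketch, not the helpers nouveau actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment.
 * alignment must be a nonzero power of two. */
static inline uint32_t
align_pot(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: the TLS size as it would be handed to code that
 * expects an already-aligned value. */
static inline uint32_t
tls_space_aligned(uint32_t tls_space)
{
   return align_pot(tls_space, 0x10);
}
```

Already-aligned values pass through unchanged, so applying this in more than
one place is harmless.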
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15949>
The nir_move/sink caused instructions to sink interleaved into the output
stores at the end of the shader. nouveau's RA doesn't track liveness of
FS outputs in registers after the export instruction, so they could end up
overwritten. To work around it, after normal NIR move/sink, move the
output stores back to the end of the shader.
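A rough sketch of the "move the output stores back to the end" step, written
as a stable partition over a flat instruction array. struct instr and
move_output_stores_to_end() are hypothetical and much simpler than real NIR;
in particular this assumes nothing after a store depends on it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified instruction: an id and a store flag. */
struct instr {
   int id;
   bool is_output_store;
};

/* Stable partition: everything that is not an output store keeps its
 * relative order at the front, output stores keep their relative order
 * at the end. scratch must have room for n entries. */
static void
move_output_stores_to_end(struct instr *instrs, struct instr *scratch,
                          size_t n)
{
   size_t out = 0;

   for (size_t i = 0; i < n; i++) {
      if (!instrs[i].is_output_store)
         scratch[out++] = instrs[i];
   }
   for (size_t i = 0; i < n; i++) {
      if (instrs[i].is_output_store)
         scratch[out++] = instrs[i];
   }

   memcpy(instrs, scratch, n * sizeof(*instrs));
}
```

With the stores grouped at the end, no other instruction can be allocated a
register between an export and the end of the shader.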
Fixes: b1fa2068b8 ("nouveau/nir: Enable nir_opt_move/sink.")
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15949>
The ARB_shader_objects spec says the following:
> The error INVALID_VALUE is generated by any command that takes one or
> more handles as input, and one or more of these handles are not an
> object handle generated by OpenGL.
And a long, long time ago, we used to do just that for
glDeleteObjectARB... Until 9ac9605de1, all the way back in February 2006,
where the error condition was removed without explanation.
Let's restore it, because it should really be there.
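To make the error condition concrete, here is a minimal sketch of a delete
entry point that rejects unknown handles. The handle table and function names
are hypothetical, and silently ignoring handle 0 is an assumption of this
sketch; only the numeric values of GL_NO_ERROR and GL_INVALID_VALUE are taken
from the GL headers.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NO_ERROR      0x0000u /* value of GL_NO_ERROR */
#define SKETCH_INVALID_VALUE 0x0501u /* value of GL_INVALID_VALUE */
#define SKETCH_MAX_OBJECTS   16u

/* Hypothetical handle table: live[h] is true if h was generated by us. */
struct object_table {
   bool live[SKETCH_MAX_OBJECTS];
};

/* Returns the error that would be recorded by the delete call. */
static uint32_t
delete_object(struct object_table *table, uint32_t handle)
{
   /* Assumption: a zero handle is silently ignored. */
   if (handle == 0)
      return SKETCH_NO_ERROR;

   /* Not a handle we generated: this is the restored error condition. */
   if (handle >= SKETCH_MAX_OBJECTS || !table->live[handle])
      return SKETCH_INVALID_VALUE;

   table->live[handle] = false;
   return SKETCH_NO_ERROR;
}
```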
This was noticed by running the tests that are in the mesa-demos
repository, that actually tested this condition.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16211>
We haven't been doing what the comment says for about a decade, it's
about time to update the comment!
Fixes: 5f60a00743 ("st/glx: remove STENCIL_BITS, DEFAULT_SOFTWARE_DEPTH_BITS")
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16213>
Otherwise, the code to actually run Release() might not be loaded or
callable anymore.
Fixes: 193cf76c ("microsoft/compiler: add common dxil-validator API")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16225>
I can't reproduce GPU hangs after 5 CTS runs and Timur also confirmed
that his Bonaire GPU didn't hang after one CTS run.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16244>
With NGG GS, the hardware can't know the number of generated primitives
and we have to increment it manually from the shader using a plain GDS
atomic operation.
However, this had a serious problem (see the old TODO): if the bound
pipeline was using legacy GS, the query was broken because the query
implementation relied on NGG GS. Likewise, if one draw used NGG GS and
the next one used legacy GS (or the opposite), the query result
would have been broken.
The solution is to allocate two 64-bit values for storing the begin/end
values if the query pool is supposed to need GDS, and to accumulate the
result with the number of primitives generated by the hw.
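The accumulation can be sketched roughly as follows. The slot layout and the
names are hypothetical and do not reflect the actual query pool layout; the
point is only that both deltas are summed when GDS is in use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout of one primitives-generated query slot. */
struct prim_query_slot {
   uint64_t hw_begin;  /* hardware counter at query begin */
   uint64_t hw_end;    /* hardware counter at query end */
   uint64_t gds_begin; /* shader-maintained GDS counter at query begin */
   uint64_t gds_end;   /* shader-maintained GDS counter at query end */
};

/* Draws using legacy GS are counted by the hardware, draws using NGG GS
 * by the GDS atomic, so a query spanning both needs both deltas. */
static uint64_t
prims_generated(const struct prim_query_slot *slot, bool pool_uses_gds)
{
   uint64_t result = slot->hw_end - slot->hw_begin;

   if (pool_uses_gds)
      result += slot->gds_end - slot->gds_begin;

   return result;
}
```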
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15892>
When HW binning is used tile loads/stores could be skipped
if there is no geometry in the tile.
Loads could be skipped when:
- The attachment won't be resolved, otherwise if load is skipped
there would be holes in the resolved attachment;
- There is no vkCmdClearAttachments afterwards since it is likely
a partial clear done via 2d blit (2d blit doesn't produce geometry).
Stores could be skipped when:
- The attachment was not cleared, which may happen by load_op or
vkCmdClearAttachments;
- When store is not a resolve.
I chose to predicate each load/store separately to allow them to be
skipped when only some attachments are cleared or resolved.
Gmem loads are moved into separate cs because whether to emit
CP_COND_REG_EXEC depends on HW binning being enabled and usage of
vkCmdClearAttachments.
CP_COND_REG_EXEC predicate could be changed during draw_cs only
by perf query, in such case the predicate should be re-emitted.
(At the moment it is always re-emitted before stores)
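The load/store conditions listed above can be written down as two small
predicates. This is only a sketch: struct attachment_state and both function
names are hypothetical, and the real decision is made per tile at runtime via
CP_COND_REG_EXEC. These predicates only say whether a load or store is
eligible for that predication.

```c
#include <stdbool.h>

/* Hypothetical per-attachment summary for one render pass. */
struct attachment_state {
   bool will_be_resolved; /* attachment is resolved later on */
   bool cleared;          /* cleared by load_op or vkCmdClearAttachments */
   bool store_is_resolve; /* the store itself performs a resolve */
};

/* A load may be skipped for empty tiles only if skipping cannot leave
 * holes in a resolve and no partial clear follows. */
static bool
load_is_skippable(bool hw_binning, bool clear_attachments_follows,
                  const struct attachment_state *att)
{
   return hw_binning && !att->will_be_resolved &&
          !clear_attachments_follows;
}

/* A store may be skipped for empty tiles only if there is nothing new
 * to write back: no clear happened and the store is not a resolve. */
static bool
store_is_skippable(bool hw_binning, const struct attachment_state *att)
{
   return hw_binning && !att->cleared && !att->store_is_resolve;
}
```

Evaluating the predicates per attachment is what allows some attachments to
be skipped while others in the same pass are not.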
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15974>
The input is an array so moving it to a single temporary value doesn't
seem to make much sense. I also don't see any piglit regressions when
not moving the value to a temporary.
Fixes: bc912bace1 ("virgl: Add workarounds for virglrenderer input/sv signedness bugs.")
v2: remove unused enum for SAMPLEMASK (Emma)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15997>
This way we can make allow_draw_out_of_order true by default for all
apps, iff the driver allows it.
And allow_draw_out_of_order=false can still be used in drirc, for
apps that need this optim to be turned off.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16139>
This changes the code so that it only looks at the passed in families
when concurrent, otherwise it always allocates one.
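In sketch form, with a plain bool standing in for the sharing mode and a
hypothetical function name, the rule is:

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of queue family entries to allocate. The application-provided
 * list is only meaningful in concurrent sharing mode; otherwise exactly
 * one entry is allocated regardless of what was passed in. */
static uint32_t
queue_family_alloc_count(bool concurrent, uint32_t passed_family_count)
{
   return concurrent ? passed_family_count : 1;
}
```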
Fixes: 48b3ef625e ("vulkan/wsi: handle queue families properly for non-concurrent sharing mode.")
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15331>
Casts shouldn't change the bit pattern of the deref and you have to cast
again after you're done with the ALU anyway so we can ignore casts on
ALU sources. This means we can actually start constant folding NULL
checks even if there are annoying casts in the way.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15673>
This just is an initial wrapping of all calls into the driver
to check for codec support.
The idea is to add more to this function to support the meson
level disables.
Acked-by: Christian König <christian.koenig@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15258>
This controls the whole lowering of "make tex ops with implicit
derivatives on non-implicit-derivative stages be tex ops with an explicit
lod of 0 instead", but it's really hard to describe that in a git commit
summary.
All existing callers get it added except:
- nir_to_tgsi which didn't want it.
- nouveau, which didn't want it (fixes regressions in shadowcube and
shadow2darray with NIR, since the shading languages don't expose txl of
those sampler types and thus it's not supported in HW)
- optional lowering passes in mesa/st (lower_rect, YUV lowering, etc)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16156>
extended_dynamic_state.*_raster tests timeout because the new
VK_DYNAMIC_STATE_RASTERIZER_DISCARD_ENABLE is not handled in venus.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16222>
This patch adds Tile 4 modifier support to Mesa and allows Mesa to
use Tile 4 on gen12-hp with GBM.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: 22.1 <mesa-stable>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14521>
Although modifiers which use a clear color plane specify that the
plane's pitch should be ignored, some kernels have been found to require
64-byte alignment.
Cc: mesa-stable
Fixes: db475c81b7 ("iris: Return non-zero stride for clear color plane")
Reported-by: Dongwon Kim <dongwon.kim@intel.com>
Suggested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14521>
This will only work if all contexts have been destroyed. If the app
attempts to re-create one context, while another outstanding context
exists and is still in the removed state, then the screen is not
recovered and the new context will fail to create.
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15002>
This also breaks screen init/deinit into two parts. The first part of
creation cannot fail, and is not repeatable. The second part of creation
can fail, and is repeatable, to be used for reset recovery.
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15002>
We can at least correctly return whether the context was lost, but
at this point can't correctly tear down and create a new one, nor
do we support the callback for dynamic notification.
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15002>
is_pixmap is defined in kopper_allocate_textures() as being (!window && x11),
which is very different from this check, which determines whether the drawable
is a window, so rename it to keep things consistent.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16190>
On platforms where we're not using DXGI swapchains, there's no reason
to disallow DISPLAY for formats like B5G6R5. In fact, on Android,
we need to support this format as BIND_DISPLAY.
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16154>
Convert all SNORM formats to SINT.
This fixes SNORM blits for radeonsi.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16132>
I added this hack to my tree when testing another MR and ended up
squashing it into c2a3236d1a (etnaviv: clean up tiling setup in
etna_compile_rs_state) by accident when doing some changes to this
commit. Reinstate the assert.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16212>
Force the default device if the MESA_VK_DEVICE_SELECT_FORCE_DEFAULT_DEVICE
environment variable is set. The app is then not given multiple devices
to choose from. Some apps select the GPU to use based on their
own criteria; this patch can force the default behaviour for these apps
by exposing only one GPU device to select from.
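A minimal sketch of the resulting behaviour is below. Both function names are
hypothetical, and treating any value of the variable as "set" is an
assumption of this sketch; the layer may parse the value differently.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: the variable merely being present enables the override. */
static bool
force_default_device_requested(void)
{
   return getenv("MESA_VK_DEVICE_SELECT_FORCE_DEFAULT_DEVICE") != NULL;
}

/* Number of physical devices reported to the app. With the override,
 * only the default device is exposed; with no devices at all, report 0. */
static uint32_t
reported_device_count(uint32_t physical_device_count, bool force_default)
{
   if (physical_device_count == 0)
      return 0;

   return force_default ? 1 : physical_device_count;
}
```

A caller would pass force_default_device_requested() as the second argument;
the two are kept separate so the counting logic does not depend on the
environment.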
v2: return 0 if no physical device present (Mihai Preda)
v3: document environment variables (Mihai Preda)(Marek Olšák)
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15585>
Without this being atomically incremented and decremented, I observed
this assert triggering in debug builds:
src/vulkan/wsi/wsi_common_x11.c:x11_present_to_x11_dri3():
assert(chain->sent_image_count <= chain->base.image_count);
I think this was happening since,
src/vulkan/wsi/wsi_common_x11.c:x11_handle_dri3_present_event()
which decrements chain->sent_image_count may be run in a separate
thread.
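The pattern, sketched with C11 <stdatomic.h> rather than Mesa's own atomic
helpers, looks roughly like this. struct swapchain_sketch and the function
names are hypothetical stand-ins for the real swapchain code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, reduced swapchain state. */
struct swapchain_sketch {
   uint32_t image_count;
   atomic_uint sent_image_count;
};

static void
swapchain_sketch_init(struct swapchain_sketch *chain, uint32_t image_count)
{
   chain->image_count = image_count;
   atomic_init(&chain->sent_image_count, 0);
}

/* Present path: bump the counter atomically, then check the invariant
 * against the value this thread actually observed. */
static void
present_image(struct swapchain_sketch *chain)
{
   unsigned prev = atomic_fetch_add(&chain->sent_image_count, 1);
   assert(prev < chain->image_count);
   (void)prev;
}

/* Event path, possibly running on another thread. */
static void
handle_present_complete(struct swapchain_sketch *chain)
{
   atomic_fetch_sub(&chain->sent_image_count, 1);
}
```

With a plain read-modify-write, a decrement from the event thread can be
lost, which leaves the counter too high and eventually trips the assert.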
Fixes: d0bc1ad377 ("vulkan/wsi/x11: add sent image counter")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15908>
With externally imported resources, we can have situations where we can't
mmap and directly access linear buffers. So use the staging blit path
for this case.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16086>
This is needed for the VIRTGPU_WAIT ioctl to work.
TODO we could perhaps limit this, since it is not needed for residency,
but only fencing. Ie. we could omit cmdstream, and probably anything
that has FD_BO_NOMAP flag.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16086>
This syncs up with the protocol of what eventually landed in virglrenderer.
1) Move all static params to capset to avoid having to query host
(reduce synchronous round trips at startup)
2) Use res_id instead of host_handle.. costs extra hashtable lookups in
host during submit, but this lets us (with userspace allocated IOVA)
make bo alloc and import completely async.
3) Require userspace allocated IOVA to simplify the protocol and not
have to deal with GEM_NEW/GEM_INFO potentially being synchronous.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16086>
These paths should be corner cases, but still it is a bad idea to block
in the host (because it is single threaded), so instead just turn waits
in the host into polling in the guest.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16086>
If supported by host virglrenderer and host kernel, use userspace
allocated GPU virtual addresses. This lets us avoid stalling on
waiting for response from host kernel until we need to know the
host handle (which is usually not until submit time).
Handling the async response from host to get host_handle is done
thru the submit_queue, so that in the submit path (hot) we do not
need any additional synchronization to know that the host_handle
is valid.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16086>
ring_idx zero is the CPU ring, others map to the priority level, as each
priority level for a given drm_file on the host kernel side maps to a
single fence timeline.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16086>
Check the shader IR type first before freeing the NIR IR in
draw_delete_xxx_shader() in case the IR has been converted to TGSI
and the NIR IR has already been freed.
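The hazard and the check can be sketched as follows. The struct and functions
are hypothetical simplifications, with malloc/free standing in for whatever
allocator owns the real IR.

```c
#include <stdlib.h>

enum ir_type_sketch { IR_SKETCH_TGSI, IR_SKETCH_NIR };

/* Hypothetical shader state: nir is only owned while type is NIR. */
struct shader_state_sketch {
   enum ir_type_sketch type;
   void *nir;
   void *tokens;
};

/* Conversion frees the NIR and switches the type, but leaves the stale
 * nir pointer behind. This is the situation the check guards against. */
static void
convert_to_tgsi(struct shader_state_sketch *s)
{
   free(s->nir);
   s->tokens = malloc(16);
   s->type = IR_SKETCH_TGSI;
}

/* Check the IR type first: the NIR pointer is only freed if the shader
 * still owns NIR, which avoids a double free after conversion. */
static void
delete_shader(struct shader_state_sketch *s)
{
   if (s->type == IR_SKETCH_NIR)
      free(s->nir);
   free(s->tokens);

   s->nir = NULL;
   s->tokens = NULL;
}
```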
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16199>
This doesn't fix anything because memcpy is only used before secondary
buffer execution and we dirty everything after that.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16189>
JPEG does not require create and destroy codec messages.
It is not firmware based, so these messages are redundant.
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16160>
Dumb buffers do not work with AMD GPUs. So use the AMD ioctl to create
proper buffers.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16187>
Instead of calling an ioctl later to get the device id, let's store it
while initializing the physical device.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16187>
We use nir_assign_io_var_locations() which compacts the varyings and
eliminates any unused input slots. We need to do the same thing when
processing pVertexAttributeDescriptions[] or else we'll end up with
mismatches between the shader and the state setup code.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16183>
This commit makes it simple to add tests which use both GL(ES) and VK.
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16048>
Due to both Lavapipe on Windows and Dozen, we need to support MSVC in
the shared Vulkan code. So let's make sure we compile with the
compatibility flags for it.
Technically speaking, we also need this in the wsi subdir, because we
also compile wsi_common_win32.c with MSVC. But wsi_common_wayland.c
contains void-pointer arithmetic, causing compiler errors if we do.
Fixing that properly is a bit more involved, because Meson doesn't love
passing different compiler arguments per source-file. The alternative is
to remove the void-pointer arithmetic, but that seems a bit pointless as
this code will never be compiled on MSVC.
So, let's leave that one out for now. We can probably do better in the
future, but this gets us a step further.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6386
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16162>
To work around game bugs where partial derivatives are used in
non-uniform control flow. A proper solution needs to be implemented,
but as a quick fix disabling nir_opt_sink() works.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16165>
On older GPUs a color tile was always 64 bytes. On new GPUs with
CACHE128B256BPERLINE support the tile size is either 128 bytes or
256 bytes depending on the TS mode. Add a helper to return the
color tile size and use it in places that use hard-coded tile
size values or do their own calculation.
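Such a helper could look roughly like the sketch below. The struct, enum and
function names are hypothetical and only mirror the description above, not
the actual driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical TS modes on GPUs with big tile support. */
enum ts_mode_sketch {
   TS_MODE_SKETCH_128B,
   TS_MODE_SKETCH_256B,
};

/* Hypothetical subset of the GPU specs. */
struct gpu_specs_sketch {
   bool cache128b256bperline;
};

/* Color tile size in bytes: 64 on older GPUs, 128 or 256 depending on
 * the TS mode when CACHE128B256BPERLINE is supported. */
static uint32_t
color_tile_size(const struct gpu_specs_sketch *specs,
                enum ts_mode_sketch mode)
{
   if (!specs->cache128b256bperline)
      return 64;

   return mode == TS_MODE_SKETCH_256B ? 256 : 128;
}
```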
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9255>
128B/256B tile support is not a HALTI5 property, but has its own
separate feature bit.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9255>
With access to HALTI5 GPUs with and without DEC400 compression it's
obvious that the previous compression state setup only worked when
DEC400 was present. Properly set up the compression state bits.
This is only the second part of the fix, first part is moving the
compression state to the correct bit location, which has already
happened via the import of new rnndb headers.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9255>
On GPUs with the CACHE128B256BPERLINE feature the RS gained some
new state bits to deal with the new additional information required
for this big tile support.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9255>
Using the raw layout bits in the tiling setup makes this function harder
to read than necessary. Use the tiling bit defines and assign them to
some local bools with a proper name to make this easier to read.
No functional change.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9255>
Support for multiple constant sources per instruction is not a HALTI5
capability, there is a separate feature bit to signal the availability
of this shader core enhancement.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9255>
We used the number of pipes to determine which state registers to use
for the RS pipe address configuration, as the dual pipe GPUs were the
first ones where the new states were used. This isn't correct though,
as now there are single pipe GPUs which also use the new state
addresses.
There actually is a feature flag telling us to use the new RS pipe
address states, so use it. As this feature flag is not available on early
GPUs using the new base address (mostly because we don't have HWDB
entries for them), still check for more than a single pipe as an
additional clue to use new states.
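Put as code, the check becomes something like the following sketch, with
hypothetical field and function names:

```c
#include <stdbool.h>

/* Hypothetical subset of the GPU specs relevant to the RS setup. */
struct rs_specs_sketch {
   bool has_rs_pipe_addr_feature; /* feature flag for the new states */
   unsigned pixel_pipes;
};

/* Prefer the feature flag. Fall back to the pipe count for early GPUs
 * that use the new states but do not advertise the flag. */
static bool
use_new_rs_pipe_addr_states(const struct rs_specs_sketch *specs)
{
   return specs->has_rs_pipe_addr_feature || specs->pixel_pipes > 1;
}
```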
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9255>
We used the number of pipes to determine which state registers to use
for the PE pipe address configuration, as the dual pipe GPUs were the
first ones where those new states were used. Now there are some new
single pipe GPUs where this logic breaks. HALTI0 added the new PE
address states and all GPUs with at least this feature level are using
the new states exclusively, even if they only have a single PE pipe.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9255>
Bits per tile and the tile clear value are not determined by the
HALTI version, but by two separate feature bits that are not always
present on HALTI5 GPUs. With big 128B/256B tile support the bits
per tile are always 4.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9255>
Update to rnndb commit ad665b720421.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9255>
The kernel exposes more minor GPU feature registers. Fill them
all into our internal feature struct.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9255>
The debug option only disables the general can_supertile spec of the GPU, so
we should also take this into account when deciding about the layout of a
sampler resource.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9255>
When the divisor is 0, the compiler should generate a different VS
prolog instead of re-using a previous prolog that uses nontrivial
divisors. This is because divisor == 0 and divisor > 1 should use
a different path to guarantee that the index is correctly computed.
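As a rough illustration of why a cached prolog cannot be reused here, the
sketch below classifies each divisor into a path and only allows reuse when
the paths match. All names are hypothetical and do not reflect the driver's
actual prolog key; it also assumes the divisor value itself is passed as data,
and it leaves out any base-instance offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of an instanced attribute's divisor. */
enum divisor_path {
   DIVISOR_ZERO,       /* divisor == 0: every instance reads the same element */
   DIVISOR_ONE,        /* divisor == 1: index advances with every instance */
   DIVISOR_NONTRIVIAL, /* divisor > 1: index is instance / divisor */
};

static enum divisor_path
classify_divisor(uint32_t divisor)
{
   if (divisor == 0)
      return DIVISOR_ZERO;
   if (divisor == 1)
      return DIVISOR_ONE;
   return DIVISOR_NONTRIVIAL;
}

/* Index computation per path (zero-based instance, no base offset). */
static uint32_t
instance_index(uint32_t instance_id, uint32_t divisor)
{
   switch (classify_divisor(divisor)) {
   case DIVISOR_ZERO:
      return 0; /* dividing here would be a division by zero */
   case DIVISOR_ONE:
      return instance_id;
   default:
      return instance_id / divisor;
   }
}

/* Assumption: a prolog is only reusable if it was built for the same path. */
static bool
can_reuse_prolog(uint32_t cached_divisor, uint32_t new_divisor)
{
   return classify_divisor(cached_divisor) == classify_divisor(new_divisor);
}
```

With this classification, a prolog built for a divisor of 3 would be rejected
for a divisor of 0, since the division path cannot produce a valid index there.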
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16009>
With NTT these opcodes are now emitted and need to be handled.
Fixes: a4840e15ab
r600: Use nir-to-tgsi instead of TGSI when the NIR debug opt is disabled.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16130>
* Don't lower fp64 to software when on Cayman.
* Lower fpow only when on native NIR; the TGSI backend handles
TGSI_OPCODE_POW.
Fixes: a4840e15ab
r600: Use nir-to-tgsi instead of TGSI when the NIR debug opt is disabled.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16130>
Since NaNs can be involved, the result can't be deduced like this.
Also, with NTT in place now, we can assume that most possible
arithmetic optimizations have already been applied.
Piglit: spec@glsl-1.30@execution@range_analysis_fsat_of_nan
Fixes: a4840e15ab
r600: Use nir-to-tgsi instead of TGSI when the NIR debug opt is disabled.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16130>
Because for fragment shaders we still use the variables, and
lower_io_to_vector may leave dead variables that duplicate inputs
that are now vectorized, we have to call this pass; otherwise
we may hit the assertion
src/gallium/auxiliary/tgsi/tgsi_ureg.c:318:
ureg_DECL_fs_input_centroid_layout:
Assertion `(ureg->input[i].usage_mask & usage_mask) == 0'
This is relevant for
spec@arb_enhanced_layouts@execution@component-layout@*
on r600/ntt
Fixes: a4840e15ab
r600: Use nir-to-tgsi instead of TGSI when the NIR debug opt is disabled
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16130>
The following AHB VU applies to the image view:
VUID-VkImageViewCreateInfo-image-02399
If image has an external format, format must be VK_FORMAT_UNDEFINED
This is well hidden and was completely missed in the original venus AHB
implementation.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16159>
The GLSL lowering of half float packing involves software conversion
to half-float; instead, use the lowering in NIR.
Both Midgard and Bifrost are already set to lower the instructions to
bit operations, but change mdg_should_scalarize so that the lowerable
split variants of the pack/unpack instructions are generated.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16175>
due to desync between the frontend and the driver, the size that the
depth buffer was created with may not match the size of the swapchain if
the window is being resized very quickly, so just go ahead and clobber
the existing depth buffer with a series of very illegal internal object
replacements to make everything match up
do not try at home.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16151>
The vkd3d-proton ray tracing tests delete shader modules after creating
pipeline libraries from them. This resulted in a use after free when
creating ray tracing pipelines.
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16027>
Would have caught a significant issue with ETC2 handling. Luckily Midgard dEQP
failed on this, even though Bifrost didn't (due to explicit strides?)
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
Rather than using it as a catch-all initializer, use it to fill in derived
fields from a partially initialized image_layout. This is easier to understand
and, more importantly, easier to unit test.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
We can always align the width/height, now that block_size is defined (as 1x1)
for linear textures. We can also remove the useless effective_depth assignment.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
Handle linear, interleaved, and AFBC formats. This requires taking a format, as
block compressed u-interleaved textures have a different tile size than other
u-interleaved textures.
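A minimal sketch of such a block size query follows. The enum, struct and
function names are hypothetical stand-ins, and every size other than the 1x1
linear case is an illustrative assumption rather than a value taken from the
driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the real modifier and format types. */
enum layout_kind { LAYOUT_LINEAR, LAYOUT_U_INTERLEAVED, LAYOUT_AFBC };

struct block_size {
   unsigned width, height;
};

static struct block_size
layout_block_size(enum layout_kind kind, bool block_compressed)
{
   switch (kind) {
   case LAYOUT_U_INTERLEAVED:
      /* Assumed sizes: the point is only that block-compressed formats
       * need a different tile size, hence the format argument. */
      return block_compressed ? (struct block_size){4, 4}
                              : (struct block_size){16, 16};
   case LAYOUT_AFBC:
      /* Assumed superblock size, for illustration only. */
      return (struct block_size){16, 16};
   case LAYOUT_LINEAR:
   default:
      /* Linear textures use a 1x1 block, so aligning is a no-op. */
      return (struct block_size){1, 1};
   }
}

/* Round a dimension up to a multiple of the block dimension. */
static unsigned
align_to_block(unsigned value, unsigned block)
{
   return ((value + block - 1) / block) * block;
}
```

Because the linear case returns 1x1, callers can align width and height
unconditionally instead of special-casing linear layouts.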
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
This gets rid of the weird "call block_dim twice with a mystery argument"
pattern, and will allow us to further unify code.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
Rather than open-code the > 16 check in multiple places and have to justify it
in each. This is easier to understand at the call sites.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
This requires tons of driver changes we're not ready for. In the meantime, this
will just get in the way of refactoring AFBC support.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
Midgard has multiple Surface Descriptor formats selectable in the texture
descriptor. Previously, we used both the "64-bit surface descriptor" and
the "64-bit surface descriptor with 32-bit line stride and 32-bit layer stride".
A delicate routine tried to guess what stride the hardware will use if we don't
specify it explicitly, and omit the stride if it matches. Unfortunately, that
routine is broken in at least two ways:
* Textures with ASTC must always specify an explicit stride. Failing to do so
(like we were doing) is invalid.
* It applies even for interleaved textures. The comment above the function
saying otherwise is incorrect. (TODO: double check this)
Bifrost onwards always specify the strides explicitly. Let's just do that and
unify the gens. What is lost from doing this? A ludicrously trivial amount of
memory and texture descriptor cache space. 8 bytes per layer*level per texture,
in fact. Compared to the size of the textures being addressed, the memory usage
is trivial. The texture descriptor cache size maybe matters more. But given
Arm's hardware people went this direction for Bifrost and stuck to it, I doubt
it matters much.
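To put the 8 bytes per layer*level figure in perspective, here is a small
worked calculation. The helper name and the example texture (1024x1024, 4
bytes per pixel, 11 mip levels, 1 layer) are assumptions chosen for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Extra descriptor bytes from always writing explicit strides:
 * 8 bytes for each (layer, level) pair. */
static uint64_t
stride_overhead_bytes(uint32_t layers, uint32_t levels)
{
   return 8ull * layers * levels;
}

/* Size of the base level alone, ignoring alignment and padding. */
static uint64_t
base_level_bytes(uint32_t width, uint32_t height, uint32_t bytes_per_pixel)
{
   return (uint64_t)width * height * bytes_per_pixel;
}
```

For the example texture the overhead is 8 * 1 * 11 = 88 bytes, against
1024 * 1024 * 4 = 4194304 bytes for the base level alone.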
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
Before we used GenXML, pan_texture mixed layout code with texture descriptor
packing code. For the most part, the layout code is generation-independent; the
pack code is not. We introduced an anti-pattern where the file was compiled N+1
times: N times for each PAN_ARCH value, and an extra time with no PAN_ARCH
value. And then the contents of the file changed completely depending on
PAN_ARCH. This is a pretty weird construction.
Let's instead split off the layout file from the descriptor file, compile the
layout file once, and compile the descriptor file per-gen.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15991>
New blob versions always emit this state on GPUs that don't have the
NEW_GPIPE feature bit before drawing a primitive, as it needs to be
set according to the primitive type.
Closes: #2933
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16094>
Most of the time when the logging code is invoked, it means we're
already in an edge case. It should be as robust as possible, otherwise
we risk making hard to debug things even harder. To that end, instead
of blowing up if passed a NULL object on the list, handle it as
gracefully as we can.
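A minimal sketch of the idea, with a hypothetical object type and formatting
helper rather than the real logging code: NULL entries are printed as a
placeholder instead of being dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical object type; only the name matters for this example. */
struct log_object {
   const char *name;
};

/* Format a comma-separated list of object names into buf. NULL objects
 * (or NULL names) become "(null)". Returns the number of characters
 * written, not counting the terminating NUL. */
static size_t
format_objects(char *buf, size_t size,
               const struct log_object *const *objs, size_t count)
{
   size_t len = 0;

   if (size == 0)
      return 0;
   buf[0] = '\0';

   for (size_t i = 0; i < count && len + 1 < size; i++) {
      const char *name =
         (objs[i] && objs[i]->name) ? objs[i]->name : "(null)";
      int n = snprintf(buf + len, size - len, "%s%s", i ? ", " : "", name);

      if (n < 0)
         break;
      /* snprintf reports the untruncated length; clamp to what fit. */
      len += (size_t)n < size - len ? (size_t)n : size - len - 1;
   }

   return len;
}
```

Passing a list such as { &image, NULL, &buffer } then yields a readable line
with a "(null)" entry in the middle instead of a crash in the error path.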
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16107>
These look similar to Bifrost IDVS but with a twist: memory allocation is
handled by the hardware, and the descriptors are split up. Add the handling for
these.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16035>
Instead of being globbed into the RSD, Valhall uses minimal shader program
descriptors. For IDVS, we need separate descriptors for position and varying
shaders. It's actually worse -- we need separate descriptors for drawing points
and drawing lines/triangles in order to skip over the gl_PointSize write. Adapt
prepare_shader to upload all these descriptors.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16035>
The split between attribute descriptors and buffer descriptors parallels that of
Bifrost's attribute descriptors and attribute buffer descriptors, with some
shuffling and simplification.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16035>