evergreen_emit_atomic_buffer_setup_count is only called if chip >= EVERGREEN
otherwise atomic_used_mask is left uninitialized when unconditionally used by
r600_need_cs_space so it might want more space than needed
fix this by always initializing atomic_used_mask
Fixes: 32529e6084 ("r600/eg: rework atomic counter emission with flushes")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7417>
spirv can't handle this, so instead we have to convert this into constant values
for any driver passing this sort of instruction through to vtn
eventually this will get removed in favor of using direct bo derefs
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7199>
We just freed the NIR after turning it into TGSI, no using it in that last
switch statement.
Closes: #3725
Fixes: 57effa342b ("st/mesa: Drop the TGSI paths for PBOs and use nir-to-tgsi if needed.")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7407>
Detects the MoltenVK layer and extension.
If present, get the ext function pointers and use to enable full swizzeling suport.
Fixes issues with Swizzling behaviour fro MoltenVk is disabled by default and needed to be enable via this API.
This also supplied the ground work to allow IOSurfaces to be used later for surface passing.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7383>
This will create code that is easier to combine into MADs/FMA when the
last component of the vector is 1.0.
nir_opt_algebraic_late has an optimization to do something similar but it
only works for inexact code, if the multiplication-by-1 optimization is
done before it and if the backend enables fuse_ffma.
fossil-db (Navi):
Totals from 4296 (3.75% of 114665) affected shaders:
SGPRs: 283468 -> 283764 (+0.10%); split: -0.02%, +0.12%
VGPRs: 172868 -> 172904 (+0.02%); split: -0.09%, +0.11%
CodeSize: 14045312 -> 14027128 (-0.13%); split: -0.15%, +0.02%
MaxWaves: 59285 -> 59282 (-0.01%); split: +0.04%, -0.05%
Instrs: 2703507 -> 2683187 (-0.75%); split: -0.76%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5631>
This will create code that is easier to combine into MADs/FMA when the
last component is 1.0.
nir_opt_algebraic_late has an optimization to do something similar but it
only works for inexact code, if the multiplication-by-1 optimization is
done before it and if the backend enables fuse_ffma.
fossil-db (Navi):
Totals from 85583 (74.64% of 114665) affected shaders:
SGPRs: 4556060 -> 4558596 (+0.06%); split: -0.07%, +0.12%
VGPRs: 3315060 -> 3312984 (-0.06%); split: -0.23%, +0.17%
SpillSGPRs: 13552 -> 13553 (+0.01%)
CodeSize: 184962756 -> 184431388 (-0.29%); split: -0.32%, +0.03%
MaxWaves: 1208693 -> 1209361 (+0.06%); split: +0.17%, -0.11%
Instrs: 35678819 -> 35361617 (-0.89%); split: -0.91%, +0.02%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5631>
Drop the PAN_MESA_DEBUG=bifrost flag. Load on Bifrost chips by default.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7408>
It doesn't look like the lowering in !6473 will land before the branch
point. Let's nop out point sprites in the backend to avoid MMU faults
from creating invalid Midgard-style varyings.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7408>
We need to inject a *CUBEFACE1 at pack-time so everything works out.
This is a pretty ugly hack but it'll hold us over until we have a real
scheduler, at which point it won't be necessary at all.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7408>
We need to do the transform specified in the OpenGL spec ourselves, with
some assistance from the hardware.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7408>
Some special instructions are scheduled on the FMA unit, let's add a
new class for this case and rename the old one accordingly.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7408>
Like with other getters, invalid enum is dealt in find_value by setting
error to GL_INVALID_ENUM and returning INVALID_TYPE which makes
get_value_size return 0.
Fixes false 'implementation errors' seen with Piglit test:
ext_external_objects-memory-object-api-errors
"Mesa 20.3.0-devel implementation error: invalid value type in GetUnsignedBytei_vEXT()
Please report at https://gitlab.freedesktop.org/mesa/mesa/-/issues"
v2: add assert to get_value_size() (Lionel)
Fixes: e064d66020 ("mesa: implement glGetUnsignedByte{v|i_v}")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eleni Maria Stea <estea@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7403>
Something may go wrong during import which leaves pointer to null and
when ctx and it's shared state gets destroyed we will attempt to call
memobj_destroy. Instead of forcing every driver to handle it, add check
here.
Fixes crashes with Piglit test:
ext_external_objects_fd-memory-object-api-errors
Fixes: 99cf910834 ("mesa/st: Actually free the driver part of memory objects on destruction.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eleni Maria Stea <estea@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7403>
This stops issues if a driver returns a value that is greater than a signed int.
Also make it match many of the other limit versions conversions.
Seen on Radeon drivers, IIRC. gpuinfo.org also reports many GPUs returning 4GB values.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7402>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 6f21995f98 ("radv: add new drirc option radv_enable_mrt_output_nan_fixup")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7423>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: bdd7587414 ("radv: use nir_lower_discard_to_demote to work around game bugs")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7423>
Addresses the following linker error when building for Android:
ld.lld: error: undefined symbol: freedreno_dev_info_init
>>> referenced by freedreno_screen.c:1001 (external/mesa3d/src/gallium/drivers/freedreno/freedreno_screen.c:1001)
>>> freedreno_screen.o:(fd_screen_create) in archive [..]/libmesa_pipe_freedreno_intermediates/libmesa_pipe_freedreno.a
These functions were introduced in a file that was not included in the
Android build yet. Also sort the list of files alphabetically as
requested in an earlier MR.
Fixes: 4a0bdf47e4 ("freedreno: Introduce common device info struct")
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7411>
It is not used right now, so keeping it adds some noise/confusion.
So far configuring Z test are done through the CFG_BITS. See
v3dX(emit_state) at v3dx_emit.c for v3d, and pack_cfg_bits at
v3dv_pipeline.c for v3dv. There flags like z_updates_enable and others
are filled up.
That key field seems like a leftover coming from using vc4 as
reference, as that driver defines and uses a field with name name.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7421>
As a result of this patch, compiler chooses SIMD32 shaders more
frequently.
Current logic is designed to avoid regressions from enabling SIMD32 at
all cost, even though the cases where regression can happen are probably
for smaller draw calls (far away from the camera and though smaller).
In Intel perf CI this patch improves FPS in:
- gfxbench5 alu2: 21.92% (gen9), 23.7% (gen11)
- synmark OglShMapVsm: 3.26% (gen9), 4.52% (gen11)
- gfxbench5 car chase: 1.34% (gen9), 1.32% (gen11)
No observed regressions there.
In my testing, it also improves FPS in:
- The Talos Principle: 2.9% (gen9)
The other 16 games I tested had very minor changes in performance
(2/3 positive, but not significant enough to list here).
Note: this patch harms synmark OglDrvState (which is not in Intel perf
CI) by ~2.9%, but this benchmark renders multiple scenes from other
workloads (including OglShMapVsm, which is helped in standalone mode)
in tiny rectangles. Rendering so small drastically changes branching
statistics, which favors smaller SIMD modes. I assume this matters
only in micro-benchmarks, as in real workloads more expensive (with
more uniform branching behavior) draw calls dominate.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7137>
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7382>
I need this for emitting the SO program for turnip, where we want to
skip over unused slots by manually advancing the counter. freedreno will
also want to use it when it supports multistream streamout.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6962>