ra_block_compute_live_ranges() treats zero as "not yet defined", so
probably best to not let this be a valid instruction #
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
After RA, we can schedule to increase parallelism (reduce nop's) without
worrying about increasing register pressure. This pass lets us cut down
the instruction count ~10%, and prioritize bary.f, kill, etc, which
would tend to increase register pressure if we tried to do that before
RA.
It should be more useful if RA round-robin'd register choices.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
kill (and other cat0/flow instructions) do not have a dst register.
Which was mostly harmless before, other than RA thinking it would need
a free register to write. (But nothing consumed it, so the value would
be immediately dead.) But this would cause more problems with postsched
which would see a bogus dependency.
Also, post-RA sched *does* need to see the dependency on the predicate
register.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
Originally these were nested functions, which worked nicely, giving us
the function of a local macro that was actual 'c' syntax (ie. not token
pasted macro). But these were converted to macros because clang doesn't
let us have nice gcc extensions.
Extract these back out into functions, before adding more things and
making the macros even more cumbersome.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
A post-RA sched pass will move the extra mov's to the wrong place, so
rework the fixup so it can run after RA (and therefore after postsched)
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
We want to do this only once. If we have post-RA sched pass, then we
don't want to do it pre-RA. Since legalize is where we resolve the
branch/jumps, we might as well move this into legalize.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
This way we can deal with it in one place, *after* all the blocks have
been scheduled. Which will simplify life for a post-RA sched pass.
This has the benefit of already taking into account nop's that legalize
has to insert for non-delay related reasons.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
This scenario can come up with block-sched and nop-sched moved to after
RA. So lets fix it first to keep things bisectable.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
This patch fixes the way_size_per_bank for Gen12+
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Sagar Ghuge<sagar.ghuge@intel.com>
This patch is required to fix 11K+ vulkan CTS failures we were
getting with way_size_per_bank of 4 (see next patch).
Thanks to Sagar Ghuge and Jordan Justen for all the hard work of
debugging and testing.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Sagar Ghuge<sagar.ghuge@intel.com>
The old code would only break at stride boundaries if the stride was
less than 32B; otherwise it would just break every 32B. This commit
makes it break at stride boundaries and 32B boundaries (starting from
the last stride). This makes reading large vertex buffers in aubinator
much nicer.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3642>
Otherwise, the validator tries to read the type of src1 of a SEND/SENDS
which doesn't actually have a type field. This prevents validation
issues in the next commit.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3642>
GCC 10 added _mm256_storeu2_m128i.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91341
This patch fixes this build error with GCC 10.
In file included from src/gallium/drivers/swr/rasterizer/codegen/gen_knobs.cpp:39:
../src/gallium/drivers/swr/rasterizer/common/os.h:178:20: error: ‘void _mm256_storeu2_m128i(__m128i*, __m128i*, __m256i)’ redeclared inline without ‘gnu_inline’ attribute
178 | static INLINE void _mm256_storeu2_m128i(__m128i* hi, __m128i* lo, __m256i a)
| ^~~~~~~~~~~~~~~~~~~~
In file included from /usr/lib/gcc/x86_64-redhat-linux/10/include/immintrin.h:51,
from /usr/lib/gcc/x86_64-redhat-linux/10/include/x86intrin.h:32,
from ../src/gallium/drivers/swr/rasterizer/common/os.h:107,
from src/gallium/drivers/swr/rasterizer/codegen/gen_knobs.cpp:39:
/usr/lib/gcc/x86_64-redhat-linux/10/include/avxintrin.h:1580:1: note: ‘void _mm256_storeu2_m128i(__m128i_u*, __m128i_u*, __m256i)’ previously defined here
1580 | _mm256_storeu2_m128i (__m128i_u *__PH, __m128i_u *__PL, __m256i __A)
| ^~~~~~~~~~~~~~~~~~~~
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3650>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3650>
To avoid hitting the assert in the default case, add a nop for this
intrinsic.
dEQP-GLES3.functional.transform_feedback.random.interleaved.lines.3
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3625>
Previously, i965/iris tried to reuse the currently programmed URB config
if it was good enough for BLORP, rather than reprogramming it each time.
However, this will make some things harder on Gen12+ and we've not seen
any performance impact from emitting URB more frequently in ANV.
This makes the blorp <-> driver interface a bit simpler on Gen7+ because
now all the driver has to do is to provide the L3$ config rather than
trying to hand off URB re-config to blorp.
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
If BLORP is the first thing to execute, we may not have set the L3$
config yet. That's not normally a problem but we're about to add code
to BLORP which will look at brw_context::l3::config and we'd like that
to be initialized. It's also just good practice.
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
We only calculate them based on device info and never change them so
this seems like a reasonable place to put them. We could also put them
in the context, but that's not accessible from iris_init_*_context.
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
SML is no longer in the L3$ on Gen11+. It's not incredibly clear from
the docs but no Gen11 platforms are in the list of platforms on which
this bit exists. Also, we've been always setting it false on Gen11 in
ANV and i965 thanks to GEN_L3P_SLM being zero with no ill effects.
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
According to the BSpec, this should prevent hangs when using shaders
with large URB entries. A more precise fix can be done but it requires
re-arranging URB setup.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>