We'll have to extend this at some point, and using a bitfield union in
this way makes it easier to get the right index without excessive
branching.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
debugoptimized builds don't define NDEBUG, but they also don't define
DEBUG. We want to enable cheap debug code for these builds.
I only chose those occurences that I care about.
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Acked-by: Dave Airlie <airlied@redhat.com>
We already use GFX9 and I don't want us to have confusing naming
in the driver. GFXn naming is better from the driver perspective,
because it's the real version of the gfx portion of the hw. Also,
CIK means Bonaire-Kaveri-Kabini, it doesn't mean CI.
It shouldn't confuse our SDMA, UVD, VCE etc. code much. Those have
nothing to do with GFXn and they have their own version numbers.
The overall goal is to support unaligned loads from vertex buffers
natively on SI.
In the unaligned case, we fall back to the general case implementation in
ac_build_opencoded_load_format. Since this function is fully general,
we will also use it going forward for cases requiring fully manual format
conversions of dwords anyway.
This requires a different encoding of the fix_fetch array, which will now
contain the entire format information if a fixup is required.
Having to check the alignment of vertex buffers is awkward. To keep the
impact on the fast path minimal, the si_context will keep track of which
vertex buffers are (not) at least dword-aligned, while the
si_vertex_elements will note which vertex buffers have some (at most dword)
alignment requirement. Vertex buffers should be dword-aligned most of the
time, which allows a fast early-out in almost all cases.
Add the radeonsi_vs_fetch_always_opencode configuration variable for
testing purposes. Note that it can only be used reliably on LLVM >= 9,
because support for byte and short load is required.
v2:
- add a missing check to si_bind_vertex_elements
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Needed to track context rolls caused by streamout and ACQUIRE_MEM.
ACQUIRE_MEM can occur outside of draw calls.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110355
v2: squashed patches and done more rework
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
Move the definition of radeonsi_clear_db_cache_before_clear there,
as well as radeonsi_enable_nir.
This removes the AMD_DEBUG=nir option.
We currently still have two places for options: the driconf machinery
and AMD_DEBUG/R600_DEBUG. If we are to have a single place for options,
then the driconf machinery should be preferred since it's more flexible.
The only downside of the driconf machinery was that adding new options
was quite inconvenient. With this change, a simple boolean option can
be added with a single line of code, same as for AMD_DEBUG.
One technical limitation of this particular implementation is that while
almost all driconf features are available, the translation machinery doesn't
pick up the description strings for options added in si_debvug_options. In
practice, translations haven't been provided anyway, and this is intended
for developer options, so I'm not too worried. It could always be added
later if anybody really cares.
v2:
- use bool instead of uint8_t for options
- si_debug_options.inc -> si_debug_options.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
so that bound compute shader resources won't be added when they are not
needed and same for graphics.
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
The OpenMAX state tracker will use this.
RadeonSI is adapted to use pipe_grid_info::last_block instead of its
internal state.
Acked-by: Leo Liu <leo.liu@amd.com>
v2: use tc.stream_uploader in si buffer_transfer_map if not called from
the driver thread
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
We should get fewer context rolls with the SET_CONTEXT_REG optimization,
but it would have been for nothing if the scissor state rolled the context
anyway. Don't emit the scissor state if there is no context roll.