Commit Graph

288 Commits

Author SHA1 Message Date
Eric Engestrom 2c67457e5e util/list: rename LIST_ENTRY() to list_entry()
This follows the Linux kernel convention, and avoids collision with
macOS header macro.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6751
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6840
Cc: mesa-stable
Signed-off-by: Eric Engestrom <eric@igalia.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17772>
2022-07-28 10:10:44 +00:00
Konstantin Seurer 66344fae4d r600: Remove format desc null checks
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17490>
2022-07-21 12:48:01 +00:00
Gert Wollny 8222840e3f r600: limit loops when trying to merge alu groups
On Cayman bank_swizzle[4] is never counted up, so add an
additional condition to make sure the loop is finished
at one point. This is a hot-fix, the logic below should be
improved.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17484>
2022-07-13 15:10:28 +02:00
Gert Wollny 79ca456b48 r600/sfn: rewrite NIR backend
This is a rewite of the NIR backend. it adds some optimization
and a scheduler.

v2: - replace some magic numbers by constants
    - make sure constructor is always used with new
    - use default initialization in more places
      (changes suggested by Filip Gawin)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17076>
2022-07-07 20:11:02 +00:00
Gert Wollny 3525d29a8d r600: Make sure that LDS instructions only use bank swizzle 012
Not sure whether this is really needed. With the TGSI code path no
other bank swizzle is emitted for LDS ops, so make sure it's the same here.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17076>
2022-07-07 20:11:01 +00:00
Gert Wollny 105b03a5ed r600: Add number of ALU groups to statistics
The number of ALU groups is important for good sccheduling, so
let's add it to the stats.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17076>
2022-07-07 20:11:01 +00:00
Marek Olšák 39800f0fa3 amd: change chip_class naming to "enum amd_gfx_level gfx_level"
This aligns the naming with PAL.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pellou-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16469>
2022-05-13 14:56:22 -04:00
Emma Anholt 25836895f3 r600: Fix reading back from a temp array immediately after writing on RV770.
KHR-GL33.shaders.indexing.tmp_array.vertexid regressed with the switch to
NIR-to-TGSI because the shader got optimized enough to emit a read just
after writing to the array (the kind of situation where a non-rel write
would have been followed by a PV/PS read).  The R600 and EG docs say you
always need to do this, but apparently some hardware gives you the right
answer anyway so we don't flag it on all of them.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14319>
2022-04-20 21:46:09 +00:00
Gert Wollny e466d73368 r600: make r600_load_ar available to driver code
This is needed for the new NIR assembler

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15714>
2022-04-12 12:10:19 +00:00
Gert Wollny 050e05db22 r600: Set the last bit if an alu group is split by kcache allocation
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15714>
2022-04-12 12:10:19 +00:00
Gert Wollny d920200ad6 r600: Force last instruction of group when starting a new CF
When emitting the AR forces splitting an ALU group, and at the same time
a new CF instruction is started, then the last instrcution in the finished
CF block might not have the "last" bit set, which results in an invalid
shader that might hang, or crash SB.
So when a new CF is started, force the last bit in the last ALU instruction.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15714>
2022-04-12 12:10:19 +00:00
Gert Wollny 04fd9a6488 r600: don't reschedule INTERP_LOAD_P0
With the NIR code, we have instructions groups that use
INTERP_LOAD_P0 that don't fill all slots. Just make sure
the backend scheduler doesn't fill in INTERP_LOAD_P0
instructions with a different LDS location.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15714>
2022-04-12 12:10:19 +00:00
Gert Wollny 3c4644afb0 r600: ignore dest sel for non-write targets when counting registers
Since the value is not written, there is no need to allocate
a register for it, so don't take it into account.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15714>
2022-04-12 12:10:19 +00:00
Gert Wollny 67d145d9ab r600: Don't limit scheduling of PARAM_SRC values
ALU_SRC_PARAM_BASE is an inline constant that defines the
address for pulling data from LDS memory for interpolation
and not a value from the kcache, so there is no need to
take these values into account when allocating kcache
load slots.

v2: Fix the constant range check to not exclude the translated
    ranges for kcache banks 2 and 3.
v3: limit range check to only include kcache values and and
    rename relevant function (Emma).

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15714>
2022-04-12 12:10:19 +00:00
Gert Wollny c63424b2eb r600: Only emit the NOP group triggered by dest.rel after a full group
In addition really fill all slots, because otherwise the alu-group merger
might move a read from the indirectly written register into the 't' slot.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15848>
2022-04-12 10:33:58 +00:00
Emma Anholt b8324a7387 r600: Implement memoryBarrier() in the non-SFN path.
Previously we were just doing a group barrier for both membar and barrier.
This sometimes worked out, because atomics and reads waited for ack
already, but writes were not waiting for ack.  Use the need_wait_ack
pattern that scratch writes used, with a little refactoring for
reusability.

The refactor also incidentally fixes the atomics waiting for outstanding
acks to be > 1 instead of > 0.

Cc: mesa-stable
Fixes: #6028
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14429>
2022-03-28 10:54:46 -07:00
Emma Anholt ef151d24ab r600: Fix ordering of SSBO loads versus texturing.
The two types of instructions get added to the same CF list, but not the
same instr list within the CF list.  So, if you SSBO fetched your
texcoord, the emission of the SSBO fetch would come *after* the texcoord
fetch.

Avoids regressions when NIR-to-TGSI starts optimizing more.

Cc: mesa-stable
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14429>
2022-03-28 10:54:46 -07:00
Gert Wollny 9bf5033941 r600: don't put INTERP_X and INTERP_Z into one instruction group
Apparently this is not allowed and results in interpolation errors.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10608>
2021-05-18 22:07:10 +02:00
Gert Wollny 46fc8ed432 r600: Enable sb for nir only on specific request
SB si known to be buggy and the ultimate aim is to make it go away. To
test workloads with better optimizations it makes sense to be able to
enable SB, but for the NIR backend it should not be enabled together
with NIR the default. Therefore an a specific debug option "nirsb" that
enables NIR with SB.

Fixes: 3b27243b01
    r600: Enable sb also for NIR

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10108>
2021-04-09 07:32:41 +00:00
Marek Olšák 65495e6caa radeon_winsys.h: add a winsys parameter to most winsys buffer functions
This will allow removing the winsys pointer from buffers.

The amdgpu winsys adds dummy_ws to get radeon_winsys because there can be
no radeon_winsys around (e.g. while amdgpu_winsys is being destroyed), but
we still need some way to call buffer functions.

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9809>
2021-04-06 22:31:15 +00:00
Gert Wollny 4d91812d3c r600: Don't optimize using source modifiers on literals
The code improvement is limited and it interferes with using literals
directly in LDS index ops, since here source modifiers are not
supported, but the current assembler code might inject the modifiers.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>
2021-03-02 18:46:17 +01:00
Marek Olšák 8904fcca6d gallium: inline struct u_suballocator to remove dereferences
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7901>
2020-12-03 21:41:19 +00:00
Gert Wollny 9a6b11a733 r600/sfn: Fix indirect const buffer access
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6879>
2020-09-28 18:16:28 +00:00
Marek Olšák 8e1193b8d3 radeon: rename RADEON_TRANSFER_* -> RADEON_MAP_*
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5749>
2020-09-22 03:20:54 +00:00
Marek Olšák 22253e6b65 gallium: rename PIPE_TRANSFER_* -> PIPE_MAP_*
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5749>
2020-09-22 03:20:54 +00:00
Gert Wollny 901793d558 r600: Fix duplicated subexpression in r600_asm.c
Fixes: 4422ce1b04
    r600: force new CF with TEX only if any texture value is written

Closes #3012

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5187>
2020-05-26 06:17:42 +00:00
Gert Wollny 4422ce1b04 r600: force new CF with TEX only if any texture value is written
This works aound splitting the CF when the gradient is set.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3225>
2020-02-10 19:09:07 +00:00
Timothy Arceri c578600489 util: remove LIST_DEL macro
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa54.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Timothy Arceri 255de06c59 util: remove LIST_ADDTAIL macro
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa54.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Timothy Arceri 7ae1be1028 util: remove LIST_INITHEAD macro
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa54.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Nicolai Hähnle eb94b6bd5c winsys/amdgpu: explicitly declare whether buffer_map is permanent or not
Introduce a new driver-private transfer flag RADEON_TRANSFER_TEMPORARY
that specifies whether the caller will use buffer_unmap or not. The
default behavior is set to permanent maps, because that's what drivers
do for Gallium buffer maps.

This should eliminate the need for hacks in libdrm. Assertions are added
to catch when the buffer_unmap calls don't match the (temporary)
buffer_map calls.

I did my best to update r600 for consistency (r300 needs no changes
because it never calls buffer_unmap), even though the radeon winsys
ignores the new flag.

As an added bonus, this should actually improve the performance of
the normal fast path, because we no longer call into libdrm at all
after the first map, and there's one less atomic in the winsys itself
(there are now no atomics left in the UNSYNCHRONIZED fast path).

Cc: Leo Liu <leo.liu@amd.com>
v2:
- remove comment about visible VRAM (Marek)
- don't rely on amdgpu_bo_cpu_map doing an atomic write
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-28 18:24:14 +01:00
Gert Wollny 9c1ae6a1a1 r600: Add R4G4B4A4 and A1B5G5R5 to supported vertex formats
Below tests would fail with an error message
  "Vertex format (R4G4B4A4|R5G5B5A1) not supported."
Add the formate to the translation routine to enable these formats.

Fixes:
  dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgba4_2d
  dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgba4_cube
  dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb5_a1_2d
  dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb5_a1_cube
  dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgba4_2d
  dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgba4_cube
  dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb5_a1_2d
  dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb5_a1_cube
  dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgba4_2d_array
  dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgba4_3d
  dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb5_a1_2d_array
  dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb5_a1_3d
  dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgba4_2d_array
  dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgba4_3d
  dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb5_a1_2d_array
  dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb5_a1_3d
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-07-05 07:57:28 +02:00
Dave Airlie 4d72a1efea r600: add time lo/hi debugging output.
This just adds the these to the debug prints.
2018-02-26 11:05:26 +10:00
Glenn Kennard 1d871aa626 r600g: Implement spilling of temp arrays (v2)
Pessimistically spills arrays if GPR limit is exceeded.

v2: fix r600 support [airlied]

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:26 +10:00
Glenn Kennard 9d31596d7a r600g: Add pending output function
Spills have to happen after the VLIW bundle currently
processed, so defer emitting the spill op.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:08 +10:00
Glenn Kennard 9c48a139b0 r600g: Support emitting scratch ops
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:52:48 +10:00
Dave Airlie 8fa5aade43 r600: initial attempt at gl_HelperInvocation (v3)
This passes the CTS and piglit tests.

This also disable sb for helper invocations until it doesn't
mess up the VPM flags.

Thanks to Ilia and Glenn for advice, and Roland for working
out the working evergreen path.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-02 09:46:05 +10:00
Dave Airlie df155a73f4 r600: fix buffer resinfo opcode translation.
The vtx operations never got translated, so things worked by
0 being equal to 0, translate them so we can use the proper buffer
resinfo code.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-01 11:59:55 +10:00
Roland Scheidegger c5162fd3c4 r600: use GET_BUFFER_RESINFO vtx fetch on eg instead of setting up consts
Contrary to what the comment said, this appears to work just fine on my rv770
(tested with piglit textureSize 140 fs/vs samplerBuffer).
Dave Airlie confirmed it working on cayman too.
I have no clue though if it's actually preferrable to use it (unfortunately
we cannot get rid of the tex constants completely, as we still require them
for cube map txq).
Albeit filling in the format (1 channels or 4?) and the stuff related to mega-
or mini-fetch (what the hell is this...) is just a guess based on other usage
of vtx fetch instructions...

v2: it really needs to be done through texture cache (I botched the
testing because sb optimizations turned it automatically into tc, but
can't rely on it and isn't happening on tes).

Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-10 04:59:00 +01:00
Roland Scheidegger 0be1dc25cf r600: increase number of ubos by one to 14
Ideally we'd support 16 (d3d11 requires 15, and mesa subtracts one for non-ubo
constants), but that's kind of impossible (it would be only doable if either
we'd somehow merge the mesa non-ubo constants with the driver constants, or
only use the driver constants with vtx fetch instead of through the kcache
mechanism - the latter probably wouldn't be too bad).
For now just do as the comment already said, place the gs ring (not really
a const buffer in any case) which is only ever referred to through vc fetch
clauses at index 16. Throw in a couple asserts for good measure to make sure
the hw limit isn't exceeded.

Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-10 04:59:00 +01:00
Gert Wollny 1d076aafbc r600: Emit EOP for more CF instruction types
So far on pre-cayman chipsets the CF instructions CF_OP_LOOP_END,
CF_OP_CALL_FS, CF_OP_POP, and CF_OP_GDS an extra CF_NOP instruction
was added to add the EOP flag, even though this is not actually
needed, because all these instrutions support the EOP flag.

This patch removes the fixup code, adds setting the EOP flag for the
according instructions as well as others like CF_OP_TEX and CF_OP_VTX,
and adds writing out EOP for this type of instruction in the disassembler.

This also fixes a bug where shaders were created that didn't actually have
the EOP flag set in the last CF instruction, which might have resulted
in GPU lockups.

[airlied: cleaned up a little]
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-22 22:39:42 +00:00
Dave Airlie f3c6149c26 r600: add support for emitting RAT instructions to the assembler.
This adds support for emitting RAT instructions to the assembler.
RAT instructions are used to implement image accessors.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-03 09:33:33 +10:00
Dave Airlie 159bf38c3a r600: add support for mark bit to the assembler.
This adds support to the assembler for the mark bit
 on the export word1.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-03 09:33:30 +10:00
Dave Airlie d584b4671f r600: add support for some ALU sources.
These special ALU sources provide the shader engine,
simd and hw wave ids.

These are required for images support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-03 09:31:50 +10:00
Michal Srb e6d7937b86 r600: Add support for B5G5R5A1.
Fixes rendercheck errors when using glamor acceleration in X server.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-07-25 19:17:03 +02:00
Constantine Charlamov dacb319777 r600g: constify some args at r600_asm.c
Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-25 09:24:27 +02:00
Constantine Charlamov 3823e4905b r600g: remove unused "bc" args, and one unneeded forward declaration
To ease review just highlight "bc," string.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-25 09:24:17 +02:00
Dave Airlie 27380d6b3e r600/asm: add support for other GDS operations.
This adds support for the GDS operations needed to do atomic
counters.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 16:27:51 +10:00
Dave Airlie ccab3f7e1b r600: don't merge GDS into VTX
We don't want vtx/tex instructions ending up in GDS sections.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 16:23:21 +10:00
Dave Airlie 043f16eba1 r600: for memory instructions dump index gpr for read indirects also.
This just makes sure we can see the index gpr in the asm dumps.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 16:23:21 +10:00