Commit Graph

161 Commits

Author SHA1 Message Date
Marek Olšák 492ad9a402 ac: remove unused variable from ac_build_ddxy
trivial
2019-01-07 14:51:25 -05:00
Nicolai Hähnle 8efaffa893 amd/common: add i1 special case to ac_build_{inclusive,exclusive}_scan
Allow for a unified but efficient treatment of adding a bitmask over a
wave or an entire threadgroup.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:19 +01:00
Nicolai Hähnle 300876a9a7 amd/common: scan/reduce across waves of a workgroup
Order-aware scan/reduce can trade-off LDS traffic for external atomics
memory traffic in producer/consumer compute shaders.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:17 +01:00
Nicolai Hähnle 3963402fd3 amd/common: add ac_build_ifcc
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:15 +01:00
Nicolai Hähnle 3c77f26ccc amd/common: whitespace fixes
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:12 +01:00
Rhys Perry 12dc7cb202 ac: refactor visit_load_buffer
This is so that we can split different types of loads more easily.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-16 14:56:10 +00:00
Samuel Pitoiset 3fbdcd942f amd: remove support for LLVM 6.0
User are encouraged to switch to LLVM 7.0 released in September 2018.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-06 14:02:56 +01:00
Dave Airlie ec9fe8abc7 ac: avoid casting pointers on bcsel and stores
For variable pointers we really don't want to case the pointers to int
without a good reason, just add a wrapper for bcsel loading and result
storing.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 08:54:25 +10:00
Bas Nieuwenhuizen dd0172e865 radv: Use structured intrinsics instead of indexing workaround for GFX9.
These force the index to be used in the instruction so we don't need the
workaround.

Totals:
SGPRS: 1321642 -> 1321802 (0.01 %)
VGPRS: 943664 -> 943788 (0.01 %)
Spilled SGPRs: 28468 -> 28480 (0.04 %)
Spilled VGPRs: 88 -> 89 (1.14 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 80 -> 80 (0.00 %) dwords per thread
Code Size: 52415292 -> 52338932 (-0.15 %) bytes
LDS: 400 -> 400 (0.00 %) blocks
Max Waves: 233903 -> 233803 (-0.04 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 238344 -> 238504 (0.07 %)
VGPRS: 232732 -> 232856 (0.05 %)
Spilled SGPRs: 13125 -> 13137 (0.09 %)
Spilled VGPRs: 88 -> 89 (1.14 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 80 -> 80 (0.00 %) dwords per thread
Code Size: 15752712 -> 15676352 (-0.48 %) bytes
LDS: 139 -> 139 (0.00 %) blocks
Max Waves: 31680 -> 31580 (-0.32 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-19 23:36:00 +01:00
Marek Olšák 8676af12c8 ac: fix ac_build_fdiv for f64
trivial

Fixes: a5f35aa742
2018-10-29 17:24:21 -04:00
Connor Abbott 59535b05cf ac: Introduce ac_build_expand()
And implement ac_bulid_expand_to_vec4() on top of it.

Fixes: 7e7ee82698 ("ac: add support for 16bit buffer loads")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 09:44:51 +02:00
Marek Olšák bfc795670e ac: add helpers for fast integer division by a constant 2018-10-16 17:23:25 -04:00
Samuel Pitoiset 416013b4f5 radv: emit the GLC bit for SSBO loads/stores when needed
This fixes some new memory model tests:
dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.*

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108112
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-12 08:42:08 +02:00
Marek Olšák 77903c8cfb ac: add ac_build_round 2018-10-06 21:50:09 -04:00
Marek Olšák 82f5f89bf6 ac: simplify LLVM alloca helpers 2018-10-06 21:50:09 -04:00
Marek Olšák a668c8d6ba ac: define all address spaces properly 2018-10-06 21:50:09 -04:00
Samuel Pitoiset cd76ce0078 ac: add 16-bit support to ac_build_bitfield_reverse()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:37 +02:00
Samuel Pitoiset fc398f4d67 ac: add 16-bit support to ac_build_bit_count()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:34 +02:00
Samuel Pitoiset 94dd08eb7c ac: add 16-bit support to ac_find_lsb()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:32 +02:00
Samuel Pitoiset 5a6c8ca3e8 ac: add 16-bit support to ac_build_umsb()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:30 +02:00
Samuel Pitoiset 3e7f3e2cd1 ac: add 16-bit support to ac_build_isign()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:28 +02:00
Samuel Pitoiset cfd6314cfe ac: add 16-bit constant values for zero and one
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:26 +02:00
Samuel Pitoiset 074e29183c ac: add ac_build_bifield_reverse() helper
Are we missing 64-bit support?

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:23 +02:00
Samuel Pitoiset 371c35e5bb ac: add ac_build_bit_count() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:20 +02:00
Marek Olšák be0bd95abf radeonsi: fix GPU hangs with bindless textures and LLVM 7.0
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák cc36ebbdc3 ac: use iN_0/1 constants
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák a5f35aa742 ac: revert new LLVM 7.0 behavior for fdiv
Cc: 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák 60beac9efc ac,radeonsi: use ac_build_fmad
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-21 20:50:37 -04:00
Marek Olšák 659f2e0fcb ac: add imad & fmad helpers
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-21 20:50:37 -04:00
Marek Olšák 2276f8f064 ac: add ac_build_s_barrier
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-21 20:50:37 -04:00
Marek Olšák fd1121e839 amd: remove support for LLVM 5.0
Users are encouraged to switch to LLVM 6.0 released in March 2018.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-03 18:36:11 -04:00
Daniel Schürmann a6a21e651d ac: add support for 16bit UBO loads
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-23 23:16:25 +02:00
Daniel Schürmann f582367d49 ac: add 16bit conversion operations
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-23 23:16:25 +02:00
Marek Olšák 4695984dbc ac: fold LLVMContext creation into ac_llvm_context_init
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-04 15:48:18 -04:00
Marek Olšák e5e57c3a5e ac: handle undefined EQAA samples in ac_apply_fmask_to_sample
RADV might wanna use this helper too.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:12 -04:00
Timothy Arceri fae3b38770 ac: fix possible truncation of intrinsic name
Fixes the gcc warning:
snprintf’ output between 26 and 33 bytes into a destination of size 32

Fixes: d5f7ebda3e ("ac: add LLVM build functions for subgroup instrinsics")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-08 09:24:15 +10:00
Bas Nieuwenhuizen 4fc2d5e141 amd/common: Fix number of coords for getlod.
The LLVM 6 code reduced it to a non-array call. We need to do that
with the new code too.

This fixes dEQP-VK.glsl.texture_functions.query.texturequerylod.*array* for radv.

Fixes: a9a7993441 "amd/common: use the dimension-aware image intrinsics on LLVM 7+"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-07 23:59:52 +02:00
Nicolai Hähnle a9a7993441 amd/common: use the dimension-aware image intrinsics on LLVM 7+
Requires LLVM trunk r329166.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-06-04 21:34:59 +02:00
Bas Nieuwenhuizen 699e1f5aac ac: Use DPP for build_ddxy where possible.
WQM is pretty reliable now on LLVM 7, so let us just use
DPP + WQM.

This gives approximately a 1.5% performance increase on the
vrcompositor built-in benchmark.

v2: Use ac_build_quad_swizzle.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-23 21:02:45 +02:00
Marek Olšák f9eb1ef870 amd: remove support for LLVM 4.0
It doesn't support GFX9.

Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-17 14:54:41 -04:00
Dave Airlie eba4cf797c ac/llvm: use amdgcn.tbuffer.store instead of SI.tbuffer.store intrinsic
Drop the use of the old intrinsic.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-17 11:46:53 +10:00
Nicolai Hähnle c0acb596f4 amd/common: use llvm.amdgcn.wqm for explicit derivatives
To comply with an upcoming change in LLVM, see
https://reviews.llvm.org/D46051

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-04 11:02:48 +02:00
Samuel Pitoiset d136a5fad9 ac: fix the number of coordinates for ac_image_get_lod and arrays
This fixes crashes for the following CTS:
dEQP-VK.glsl.texture_functions.query.texturequerylod.*

Cubemaps are the same as 2D arrays.

Fixes: 625dcbbc45 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-23 21:48:38 +02:00
Nicolai Hähnle 24fb3e6aa1 ac/nir: use ac_build_image_opcode for image intrinsics
So that we'll use the dimension-aware intrinsics in the future.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:30:07 +02:00
Nicolai Hähnle 74063431f1 radeonsi: generate image load/store/atomic ops using ac_build_image_opcode
In preparation of dimension-aware LLVM image intrinsics.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:29:57 +02:00
Nicolai Hähnle 625dcbbc45 amd/common: pass address components individually to ac_build_image_intrinsic
This is in preparation for the new image intrinsics.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:23:52 +02:00
Nicolai Hähnle f931583828 amd/common: pass new enum ac_image_dim to ac_build_image_opcode
This is in preparation for the new, dimension-aware LLVM image
intrinsics.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:23:40 +02:00
Daniel Schürmann d5f7ebda3e ac: add LLVM build functions for subgroup instrinsics
Co-authored-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 01:03:09 +02:00
Daniel Schürmann d19f20e793 ac: make ballot and umsb capable of 64bit inputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Marek Olšák dc04e4bba2 radeonsi: move FMASK shader logic to shared code
We'll need it for FBFETCH in both TGSI and NIR paths.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:22 -04:00