Marek Olšák
492ad9a402
ac: remove unused variable from ac_build_ddxy
...
trivial
2019-01-07 14:51:25 -05:00
Nicolai Hähnle
8efaffa893
amd/common: add i1 special case to ac_build_{inclusive,exclusive}_scan
...
Allow for a unified but efficient treatment of adding a bitmask over a
wave or an entire threadgroup.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:19 +01:00
Nicolai Hähnle
300876a9a7
amd/common: scan/reduce across waves of a workgroup
...
Order-aware scan/reduce can trade-off LDS traffic for external atomics
memory traffic in producer/consumer compute shaders.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:17 +01:00
Nicolai Hähnle
3963402fd3
amd/common: add ac_build_ifcc
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:15 +01:00
Nicolai Hähnle
3c77f26ccc
amd/common: whitespace fixes
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:12 +01:00
Rhys Perry
12dc7cb202
ac: refactor visit_load_buffer
...
This is so that we can split different types of loads more easily.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-16 14:56:10 +00:00
Samuel Pitoiset
3fbdcd942f
amd: remove support for LLVM 6.0
...
User are encouraged to switch to LLVM 7.0 released in September 2018.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-06 14:02:56 +01:00
Dave Airlie
ec9fe8abc7
ac: avoid casting pointers on bcsel and stores
...
For variable pointers we really don't want to case the pointers to int
without a good reason, just add a wrapper for bcsel loading and result
storing.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 08:54:25 +10:00
Bas Nieuwenhuizen
dd0172e865
radv: Use structured intrinsics instead of indexing workaround for GFX9.
...
These force the index to be used in the instruction so we don't need the
workaround.
Totals:
SGPRS: 1321642 -> 1321802 (0.01 %)
VGPRS: 943664 -> 943788 (0.01 %)
Spilled SGPRs: 28468 -> 28480 (0.04 %)
Spilled VGPRs: 88 -> 89 (1.14 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 80 -> 80 (0.00 %) dwords per thread
Code Size: 52415292 -> 52338932 (-0.15 %) bytes
LDS: 400 -> 400 (0.00 %) blocks
Max Waves: 233903 -> 233803 (-0.04 %)
Wait states: 0 -> 0 (0.00 %)
Totals from affected shaders:
SGPRS: 238344 -> 238504 (0.07 %)
VGPRS: 232732 -> 232856 (0.05 %)
Spilled SGPRs: 13125 -> 13137 (0.09 %)
Spilled VGPRs: 88 -> 89 (1.14 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 80 -> 80 (0.00 %) dwords per thread
Code Size: 15752712 -> 15676352 (-0.48 %) bytes
LDS: 139 -> 139 (0.00 %) blocks
Max Waves: 31680 -> 31580 (-0.32 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-19 23:36:00 +01:00
Marek Olšák
8676af12c8
ac: fix ac_build_fdiv for f64
...
trivial
Fixes: a5f35aa742
2018-10-29 17:24:21 -04:00
Connor Abbott
59535b05cf
ac: Introduce ac_build_expand()
...
And implement ac_bulid_expand_to_vec4() on top of it.
Fixes: 7e7ee82698
("ac: add support for 16bit buffer loads")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 09:44:51 +02:00
Marek Olšák
bfc795670e
ac: add helpers for fast integer division by a constant
2018-10-16 17:23:25 -04:00
Samuel Pitoiset
416013b4f5
radv: emit the GLC bit for SSBO loads/stores when needed
...
This fixes some new memory model tests:
dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.*
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108112
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-12 08:42:08 +02:00
Marek Olšák
77903c8cfb
ac: add ac_build_round
2018-10-06 21:50:09 -04:00
Marek Olšák
82f5f89bf6
ac: simplify LLVM alloca helpers
2018-10-06 21:50:09 -04:00
Marek Olšák
a668c8d6ba
ac: define all address spaces properly
2018-10-06 21:50:09 -04:00
Samuel Pitoiset
cd76ce0078
ac: add 16-bit support to ac_build_bitfield_reverse()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:37 +02:00
Samuel Pitoiset
fc398f4d67
ac: add 16-bit support to ac_build_bit_count()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:34 +02:00
Samuel Pitoiset
94dd08eb7c
ac: add 16-bit support to ac_find_lsb()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:32 +02:00
Samuel Pitoiset
5a6c8ca3e8
ac: add 16-bit support to ac_build_umsb()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:30 +02:00
Samuel Pitoiset
3e7f3e2cd1
ac: add 16-bit support to ac_build_isign()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:28 +02:00
Samuel Pitoiset
cfd6314cfe
ac: add 16-bit constant values for zero and one
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:26 +02:00
Samuel Pitoiset
074e29183c
ac: add ac_build_bifield_reverse() helper
...
Are we missing 64-bit support?
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:23 +02:00
Samuel Pitoiset
371c35e5bb
ac: add ac_build_bit_count() helper
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:20 +02:00
Marek Olšák
be0bd95abf
radeonsi: fix GPU hangs with bindless textures and LLVM 7.0
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
cc36ebbdc3
ac: use iN_0/1 constants
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
a5f35aa742
ac: revert new LLVM 7.0 behavior for fdiv
...
Cc: 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
60beac9efc
ac,radeonsi: use ac_build_fmad
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-21 20:50:37 -04:00
Marek Olšák
659f2e0fcb
ac: add imad & fmad helpers
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-21 20:50:37 -04:00
Marek Olšák
2276f8f064
ac: add ac_build_s_barrier
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-21 20:50:37 -04:00
Marek Olšák
fd1121e839
amd: remove support for LLVM 5.0
...
Users are encouraged to switch to LLVM 6.0 released in March 2018.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-03 18:36:11 -04:00
Daniel Schürmann
a6a21e651d
ac: add support for 16bit UBO loads
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-23 23:16:25 +02:00
Daniel Schürmann
f582367d49
ac: add 16bit conversion operations
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-23 23:16:25 +02:00
Marek Olšák
4695984dbc
ac: fold LLVMContext creation into ac_llvm_context_init
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-04 15:48:18 -04:00
Marek Olšák
e5e57c3a5e
ac: handle undefined EQAA samples in ac_apply_fmask_to_sample
...
RADV might wanna use this helper too.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:12 -04:00
Timothy Arceri
fae3b38770
ac: fix possible truncation of intrinsic name
...
Fixes the gcc warning:
snprintf’ output between 26 and 33 bytes into a destination of size 32
Fixes: d5f7ebda3e
("ac: add LLVM build functions for subgroup instrinsics")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-08 09:24:15 +10:00
Bas Nieuwenhuizen
4fc2d5e141
amd/common: Fix number of coords for getlod.
...
The LLVM 6 code reduced it to a non-array call. We need to do that
with the new code too.
This fixes dEQP-VK.glsl.texture_functions.query.texturequerylod.*array* for radv.
Fixes: a9a7993441
"amd/common: use the dimension-aware image intrinsics on LLVM 7+"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-07 23:59:52 +02:00
Nicolai Hähnle
a9a7993441
amd/common: use the dimension-aware image intrinsics on LLVM 7+
...
Requires LLVM trunk r329166.
Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-06-04 21:34:59 +02:00
Bas Nieuwenhuizen
699e1f5aac
ac: Use DPP for build_ddxy where possible.
...
WQM is pretty reliable now on LLVM 7, so let us just use
DPP + WQM.
This gives approximately a 1.5% performance increase on the
vrcompositor built-in benchmark.
v2: Use ac_build_quad_swizzle.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-23 21:02:45 +02:00
Marek Olšák
f9eb1ef870
amd: remove support for LLVM 4.0
...
It doesn't support GFX9.
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-17 14:54:41 -04:00
Dave Airlie
eba4cf797c
ac/llvm: use amdgcn.tbuffer.store instead of SI.tbuffer.store intrinsic
...
Drop the use of the old intrinsic.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-17 11:46:53 +10:00
Nicolai Hähnle
c0acb596f4
amd/common: use llvm.amdgcn.wqm for explicit derivatives
...
To comply with an upcoming change in LLVM, see
https://reviews.llvm.org/D46051
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-04 11:02:48 +02:00
Samuel Pitoiset
d136a5fad9
ac: fix the number of coordinates for ac_image_get_lod and arrays
...
This fixes crashes for the following CTS:
dEQP-VK.glsl.texture_functions.query.texturequerylod.*
Cubemaps are the same as 2D arrays.
Fixes: 625dcbbc45
("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-23 21:48:38 +02:00
Nicolai Hähnle
24fb3e6aa1
ac/nir: use ac_build_image_opcode for image intrinsics
...
So that we'll use the dimension-aware intrinsics in the future.
Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:30:07 +02:00
Nicolai Hähnle
74063431f1
radeonsi: generate image load/store/atomic ops using ac_build_image_opcode
...
In preparation of dimension-aware LLVM image intrinsics.
Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:29:57 +02:00
Nicolai Hähnle
625dcbbc45
amd/common: pass address components individually to ac_build_image_intrinsic
...
This is in preparation for the new image intrinsics.
Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:23:52 +02:00
Nicolai Hähnle
f931583828
amd/common: pass new enum ac_image_dim to ac_build_image_opcode
...
This is in preparation for the new, dimension-aware LLVM image
intrinsics.
Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:23:40 +02:00
Daniel Schürmann
d5f7ebda3e
ac: add LLVM build functions for subgroup instrinsics
...
Co-authored-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 01:03:09 +02:00
Daniel Schürmann
d19f20e793
ac: make ballot and umsb capable of 64bit inputs
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Marek Olšák
dc04e4bba2
radeonsi: move FMASK shader logic to shared code
...
We'll need it for FBFETCH in both TGSI and NIR paths.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:22 -04:00