1330 Commits

Author SHA1 Message Date
Nicolai Hähnle 74a26af913 amd/common/gfx10: add register JSON
A small number of fields now need new disambiguation.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle 536782b0b7 amd/common: add GFX10 chips
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Marek Olšák 78cdf9a99f amd/addrlib: add gfx10 support
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Samuel Pitoiset 83297baf2d ac: compute the DCC fast clear size per slice on GFX8
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-02 09:37:44 +02:00
Samuel Pitoiset 6517d226ac ac: compute the size of one DCC slice on GFX8
Addrlib doesn't provide this info. Because DCC is linear, at least
on GFX8, it's easy to compute the size of one slice.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-02 09:37:41 +02:00
Emil Velikov 4ec32413f3 ac: change ac_query_gpu_info() signature
Currently libdrm_amdgpu provides a typedef of the various handles. While
the goal was to make those opaque, it effectively became part of the API.

To the best of my knowledge there are two ways to have opaque handles:
 - "typedef void *foo;" - rather messy IMHO
 - "struct foo;" and use "struct foo *" through the API

In our case amdgpu_device_handle is used only internally, plus
respective code is not used or applicable for r300 and r600. Hence we
copied the typedef.

Seemingly this will be a problem since libdrm_amdgpu wants to change the
API, while not updating the code(?).

Either way, we can safely s/amdgpu_device_handle/void */ and carry on.

Cc: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak at amd.com>
2019-06-28 17:49:32 +01:00
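The two ways of declaring an opaque handle mentioned in the commit above can be compared in a small C sketch. All names here (example_void_handle, example_device and its functions) are hypothetical and are not taken from libdrm_amdgpu or Mesa; the sketch only illustrates the trade-off between the two styles.

```c
#include <assert.h>
#include <stdlib.h>

/* Style 1: "typedef void *foo;". Any object pointer converts to it
 * implicitly, so passing the wrong kind of handle is not diagnosed. */
typedef void *example_void_handle;

/* Style 2: forward-declare the struct and pass "struct foo *" around.
 * Callers cannot see the fields, but the pointer type is still checked. */
struct example_device;

struct example_device *example_device_create(int fd);
int example_device_fd(const struct example_device *dev);
void example_device_destroy(struct example_device *dev);

/* The definition would normally live in a private .c file. */
struct example_device {
    int fd;
};

struct example_device *example_device_create(int fd)
{
    struct example_device *dev = malloc(sizeof(*dev));
    if (dev)
        dev->fd = fd;
    return dev;
}

int example_device_fd(const struct example_device *dev)
{
    return dev->fd;
}

void example_device_destroy(struct example_device *dev)
{
    free(dev);
}

/* Small demo that uses both styles and returns the stored fd. */
int example_opaque_handles_demo(void)
{
    struct example_device *dev = example_device_create(3);
    assert(dev != NULL);

    /* Style 1 erases the type; the cast back is the caller's problem. */
    example_void_handle h = dev;
    int fd = example_device_fd((struct example_device *)h);

    example_device_destroy(dev);
    return fd;
}
```

With the forward-declared struct, a caller that passes an unrelated pointer gets a compiler diagnostic; with the void pointer typedef it does not, which is presumably what the commit message means by "rather messy".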
Samuel Pitoiset 34bef8a0d7 radv: clear CMASK layers instead of the whole buffer on GFX8
This reduces the size of fill operations needed to clear CMASK
for layered color textures.

GFX9 unsupported for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-25 16:36:28 +02:00
Samuel Pitoiset 476b907a3b radv: clear FMASK layers instead of the whole buffer on GFX8
This reduces the size of fill operations needed to clear FMASK
for layered color textures.

GFX9 unsupported for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-25 16:36:25 +02:00
Marek Olšák ac4b1e2f0a radeonsi: set the calling convention for inlined function calls
otherwise the behavior is undefined

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-06-24 21:04:10 -04:00
Nicolai Hähnle bd3a3fd25a amd/rtld: update the ELF representation of LDS symbols
The initial prototype used a processor-specific symbol type, but
feedback suggests that an approach using processor-specific section
name that encodes the alignment analogous to SHN_COMMON symbols is
preferred.

This patch keeps both variants around for now to reduce problems
with LLVM compatibility as we switch branches around.

This also cleans up the error reporting in this function.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-24 21:04:10 -04:00
Marek Olšák 0032f6b8a0 ac/surface: remove addrlib_family_rev_id
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-24 21:04:10 -04:00
Daniel Schürmann 0daeb1d127 amd/common: lower bitfield_extract to ubfe/ibfe.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-24 18:42:20 +02:00
Daniel Schürmann 48a75e7af0 amd/common: lower bitfield_insert to bfm & bitfield_select
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-24 18:42:20 +02:00
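As a rough illustration of the bitfield_insert lowering named in the commit above, the following scalar C model expresses an insert in terms of a mask builder and a select. The patch itself is not shown in this log, so the function names and the handling of bits >= 32 are assumptions modeled on the usual meaning of these operations, not the actual Mesa code.

```c
#include <assert.h>
#include <stdint.h>

/* bfm: a mask of `bits` ones starting at bit `offset`.
 * Only the low 5 bits of each operand are used in this model. */
uint32_t example_bfm(uint32_t bits, uint32_t offset)
{
    bits &= 31;
    offset &= 31;
    return ((1u << bits) - 1u) << offset;
}

/* bitfield_select: take bits from `a` where mask is 1, else from `b`. */
uint32_t example_bitfield_select(uint32_t mask, uint32_t a, uint32_t b)
{
    return (mask & a) | (~mask & b);
}

/* bitfield_insert expressed through the two helpers above. */
uint32_t example_bitfield_insert(uint32_t base, uint32_t insert,
                                 uint32_t offset, uint32_t bits)
{
    /* Assumption: a full-width insert simply replaces the base. */
    if (bits >= 32)
        return insert;

    uint32_t mask = example_bfm(bits, offset);
    return example_bitfield_select(mask, insert << (offset & 31), base);
}

/* Small demo: clear bits 8..15 of an all-ones word. */
uint32_t example_bitfield_insert_demo(void)
{
    uint32_t r = example_bitfield_insert(0xFFFFFFFFu, 0x0u, 8, 8);
    assert(r == 0xFFFF00FFu);
    return r;
}
```

The full-width case is split out because shifting a 32-bit value by 32 is undefined in C; hardware instructions typically mask the shift amount instead.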
Nicolai Hähnle 21dd881416 ac/rtld: report better error messages for LDS overallocation
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-06-19 20:30:32 -04:00
Marek Olšák b64bd5887e ac/rtld: check correct LDS max size
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-06-19 20:30:32 -04:00
Nicolai Hähnle 1ee0f0d315 radeonsi: add s_sethalt to shaders for debugging
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-06-19 20:30:32 -04:00
Nicolai Hähnle 87182200c7 ac/rtld: fix sorting of LDS symbols by alignment
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-06-19 20:30:32 -04:00
Connor Abbott 53a7649e5d ac/nir: Set speculatable for buffer loads where allowed
This brings the nir path in line with the TGSI path.

Totals from affected shaders:
SGPRS: 2984 -> 2984 (0.00 %)
VGPRS: 2792 -> 2652 (-5.01 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 247380 -> 248072 (0.28 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 121 -> 132 (9.09 %)
Wait states: 0 -> 0 (0.00 %)

Most of the change came from DiRT: Showdown, and came from sinking SSBO
loads.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-06-19 14:08:28 +02:00
Connor Abbott 3bf8981c51 ac,radeonsi: Always mark buffer stores as inaccessiblememonly
inaccessiblememonly means that it doesn't modify memory accessible via
normal LLVM pointers. This lets LLVM's dead store elimination, memcpy
forwarding, etc. ignore functions with this attribute. We don't
represent descriptors as pointers, so this property is always true of
buffer and image stores. There are plans to represent descriptors via
pointers, but this just means that now nothing is inaccessiblememonly,
as LLVM will then understand loads/stores via its usual alias analysis.

Radeonsi was mistakenly only setting it if the driver could prove that
there were no reads, and then it was cargo-culted into ac_llvm_build
and ac_llvm_to_nir. Rip it out of everything.

statistics with nir enabled:

Totals from affected shaders:
SGPRS: 152 -> 152 (0.00 %)
VGPRS: 128 -> 132 (3.12 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 9324 -> 9244 (-0.86 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Max Waves: 17 -> 17 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

The only difference was a manhattan31 shader.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-19 14:08:27 +02:00
Samuel Pitoiset 4c7ef1b02e ac: make ac_compute_cmask() a static function
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-17 11:30:47 +02:00
Samuel Pitoiset b5012a0518 ac: update llvm.amdgcn.icmp intrinsic name for LLVM 9+
LLVM r363339 changed llvm.amdgcn.icmp.i* to llvm.amdgcn.icmp.i64.i*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-17 08:58:33 +02:00
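The rename described in the commit above amounts to picking the intrinsic name by LLVM version. A minimal sketch, assuming a hypothetical helper that takes the LLVM major version and the operand bit size as plain integers (the real code in ac is not shown in this log and may be structured differently):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the icmp intrinsic name for the given operand bit size into buf.
 * Returns the snprintf result. llvm_major and bit_size are hypothetical
 * parameters chosen for this sketch. */
int example_icmp_intrinsic_name(char *buf, size_t size,
                                unsigned llvm_major, unsigned bit_size)
{
    if (llvm_major >= 9) {
        /* Newer name: the result type is part of the mangling. */
        return snprintf(buf, size, "llvm.amdgcn.icmp.i64.i%u", bit_size);
    }
    /* Older name: only the operand type is mangled. */
    return snprintf(buf, size, "llvm.amdgcn.icmp.i%u", bit_size);
}

/* Small demo comparing the names chosen for LLVM 8 and LLVM 9. */
int example_icmp_name_demo(void)
{
    char old_name[64], new_name[64];

    example_icmp_intrinsic_name(old_name, sizeof(old_name), 8, 32);
    example_icmp_intrinsic_name(new_name, sizeof(new_name), 9, 32);

    assert(strcmp(old_name, "llvm.amdgcn.icmp.i32") == 0);
    assert(strcmp(new_name, "llvm.amdgcn.icmp.i64.i32") == 0);
    return 0;
}
```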
Marek Olšák abe9a51d27 ac: add radeon_info::is_amdgpu instead of checking drm_major == 3
and clean up

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-06-14 13:31:18 -04:00
Daniel Schürmann deedc0b31d amd/common: add support for AMD_shader_ballot functions
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-13 12:44:23 +00:00
Nicolai Hähnle f8315ae04b amd/rtld: layout and relocate LDS symbols
Upcoming changes to LLVM will emit LDS objects as symbols in the ELF
symbol table, with relocations that will be resolved with this change.

Callers will also be able to define LDS symbols that are shared between
shader parts. This will be used by radeonsi for the ESGS ring in gfx9+
merged shaders.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 20:28:23 -04:00
Nicolai Hähnle 1ff2440eee amd/common: use ARRAY_SIZE for the LLVM command line options
This is more convenient for changing it around during debug.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 20:28:23 -04:00
Nicolai Hähnle 3c958d924a amd/common: add ac_compile_module_to_elf
A new variant of ac_compile_module_to_binary that allows us to
keep the entire ELF around.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 20:28:23 -04:00
Nicolai Hähnle 77b05cc42d radeonsi: use ac_shader_config
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 20:28:23 -04:00
Nicolai Hähnle b3be346c68 amd/common: add a more powerful runtime linker
Using an explicit linker instead of just concatenating .text
sections will allow us to start using .rodata sections and
explicit descriptions of data on LDS that is shared between
stages.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 20:28:23 -04:00
Nicolai Hähnle c129cb3861 amd/common: clarify ac_shader_binary::lds_size
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 18:33:21 -04:00
Nicolai Hähnle 2e96c01073 amd/common: extract ac_parse_shader_binary_config
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 18:33:08 -04:00
Marek Olšák 4773f5a293 radeonsi: use the ac helper for index buffer stores in the culling shader
2019-06-11 20:05:21 -04:00
Connor Abbott 9d93d2a404 ac/nir: Remove stale TODO
While we're here, copy the comment explaining this from radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-06-06 17:14:28 +02:00
Marek Olšák ff63b99531 ac: rename LLVM <= 7 helpers for readability
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-06-04 18:53:46 -04:00
Marek Olšák c9b64b58de ac: fix a typo in ac_build_wg_scan_bottom
Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-06-04 18:53:46 -04:00
Rhys Perry 73dda85512 ac/nir: mark some texture intrinsics as convergent
Otherwise LLVM can sink them and their texture coordinate calculations
into divergent branches.

v2: simplify the conditions on which the intrinsic is marked as convergent
v3: only mark as convergent in FS and CS with derivative groups

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-04 17:30:53 +01:00
Samuel Pitoiset 33f4e04d5a ac,radv: do not emit vec3 for raw load/store on SI
Raw vec3 load/store is unsupported on SI; only format load/store supports vec3.

Fixes: 6970a9a6ca ("ac,radv: remove the vec3 restriction with LLVM 9+")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-04 08:47:26 +02:00
Marek Olšák b2bbd1a27b ac/registers: don't use the si, cik, vi names, use gfxN
trivial
2019-06-03 20:06:41 -04:00
Nicolai Hähnle f480b8aaa4 amd/common: use generated register header
2019-06-03 20:05:20 -04:00
Nicolai Hähnle cf51009ad2 amd/common: unify PITCH_GFX6 and PITCH_GFX9
The definition of the fields differs, but PITCH_GFX9 is a mere extension
of PITCH_GFX6 that does not conflict with any other fields.

This aligns the definitions with what will be generated from the
register JSON.

The information about how large the fields really are is preserved in
the register database.
2019-06-03 20:05:20 -04:00
Nicolai Hähnle e04215815e amd/common: rename R_3F2_CONTROL to IB_CONTROL for disambiguation
This "register" name collides with R_370_CONTROL.

This aligns the definitions with what will be generated from the
register JSON.
2019-06-03 20:05:20 -04:00
Nicolai Hähnle cd247cf456 amd/common: cleanup DATA_FORMAT/NUM_FORMAT field names
The field layout wasn't actually changed in gfx9, so having the suffix
isn't very useful. The field *contents* were changed, but this is
reflected in the V_xxx_xxx definitions and is taken into account by
the ac_debug logic based on the register JSON.

This aligns the definitions with what will be generated from the
register JSON.
2019-06-03 20:05:20 -04:00
Nicolai Hähnle ef6ef098af amd/common: derive ac_debug tables from register JSON
2019-06-03 20:05:20 -04:00
Marek Olšák 486bc1e17e ac: use amdgpu-flat-work-group-size
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-03 14:32:47 -04:00
Samuel Pitoiset 6970a9a6ca ac,radv: remove the vec3 restriction with LLVM 9+
This change requires LLVM r356755.

32706 shaders in 16744 tests
Totals:
SGPRS: 1448848 -> 1455984 (0.49 %)
VGPRS: 1016684 -> 1016220 (-0.05 %)
Spilled SGPRs: 25871 -> 25815 (-0.22 %)
Spilled VGPRs: 122 -> 122 (0.00 %)
Scratch size: 11964 -> 11956 (-0.07 %) dwords per thread
Code Size: 55324500 -> 55301152 (-0.04 %) bytes
Max Waves: 235660 -> 235586 (-0.03 %)

Totals from affected shaders:
SGPRS: 293704 -> 300840 (2.43 %)
VGPRS: 246716 -> 246252 (-0.19 %)
Spilled SGPRs: 159 -> 103 (-35.22 %)
Scratch size: 188 -> 180 (-4.26 %) dwords per thread
Code Size: 8653664 -> 8630316 (-0.27 %) bytes
Max Waves: 60811 -> 60737 (-0.12 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-03 11:30:08 +02:00
Marek Olšák b257956021 ac: treat Mullins as Kabini, remove the enum
it's the same design
2019-05-27 15:10:51 -04:00
Jason Ekstrand f2dc0f2872 nir: Drop imov/fmov in favor of one mov instruction
The difference between imov and fmov has been a constant source of
confusion in NIR for years.  No one really knows why we have two or when
to use one vs. the other.  The real reason is that they do different
things in the presence of source and destination modifiers.  However,
without modifiers (which many back-ends don't have), they are identical.
Now that we've reworked nir_lower_to_source_mods to leave one abs/neg
instruction in place rather than replacing them with imov or fmov
instructions, we don't need two different instructions at all anymore.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Acked-by: Rob Clark <robdclark@chromium.org>
2019-05-24 08:38:11 -05:00
Samuel Pitoiset d7501834cd radv: add a workaround for Monster Hunter World and LLVM 7&8
The load/store optimizer pass doesn't handle WaW hazards correctly
and this is the root cause of the reflection issue with Monster
Hunter World. AFAIK, it's the only game that is affected by this
issue.

This is fixed with LLVM r361008, but we need a workaround for older
LLVM versions unfortunately.

Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-17 11:41:19 +02:00
Marek Olšák 9d1485554c ac: match radeonsi code in ac_shader_binary_read_config
2019-05-16 13:15:36 -04:00
Marek Olšák 894e017c9c r600+radeonsi: use ctx_query_reset_status on radeon
This allows a nice cleanup, because the winsys always handles it.
2019-05-16 13:15:36 -04:00
Marek Olšák b19884e08e winsys/amdgpu: add a parallel compute IB coupled with a gfx IB
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:07:00 -04:00
Marek Olšák eda281e977 ac: add LLVM code for triangle culling
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:58 -04:00
Marek Olšák ccfcb9d818 ac: rename SI-CIK-VI to GFX6-GFX7-GFX8
Acked-by: Dave Airlie <airlied@redhat.com>

We already use GFX9 and I don't want us to have confusing naming
in the driver. GFXn naming is better from the driver perspective,
because it's the real version of the gfx portion of the hw. Also,
CIK means Bonaire-Kaveri-Kabini, it doesn't mean CI.

It shouldn't confuse our SDMA, UVD, VCE etc. code much. Those have
nothing to do with GFXn and they have their own version numbers.
2019-05-15 20:54:10 -04:00
Marek Olšák e5cc363f43 ac: add comments to chip enums
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (except GFX2 changes)
Reviewed-by: Dave Airlie <airlied@redhat.com> (except <= GFX5 changes)
2019-05-15 20:54:10 -04:00
Marek Olšák 6b0b8f132a ac: use 1D GEPs for descriptors and constants
just a cleanup

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-14 15:15:11 -04:00
Nicolai Hähnle 81fe33735a amd/common: add ac_build_opencoded_fetch_format
Implement software emulation of buffer_load_format for all types required
by vertex buffer fetches.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-05-13 17:07:23 +02:00
Samuel Pitoiset 4f18c43d1d radv: apply the indexing workaround for atomic buffer operations on GFX9
Because the new raw/struct intrinsics are buggy with LLVM 8
(they weren't marked as source of divergence), we fall back to the
old intrinsics for atomic buffer operations only. This means we need
to apply the indexing workaround for GFX9. The load/store
operations still use the new LLVM 8 intrinsics.

The fact that we need another workaround is painful, but we should
be able to clean that up a bit once LLVM 7 support is dropped.

This fixes a GPU hang with AC Odyssey and some rendering problems
with Nioh.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110573
Fixes: 31164cf5f7 ("ac/nir: only use the new raw/struct image atomic intrinsics with LLVM 9+")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-03 17:59:12 +02:00
Samuel Pitoiset 492e828848 ac: tidy up ac_build_llvm8_tbuffer_{load,store}
This is for consistency with the ac_build_llvm8_buffer_{load,store}_common
helpers, and it will help a bit with removing the vec3 restriction.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-02 09:24:05 +02:00
Eric Engestrom 7ca8ba199f delete autotools .gitignore files
One special case, `src/util/xmlpool/.gitignore` is not entirely deleted,
as `xmlpool.pot` still gets generated (eg. by `ninja xmlpool-pot`).

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-04-29 21:17:19 +00:00
Rhys Perry bd4c661ad0 ac,ac/nir: use a better sync scope for shared atomics
https://reviews.llvm.org/rL356946 (present in LLVM 9 and later) changed
the meaning of the "system" sync scope, making it no longer restricted to
the memory operation's address space. So a single address space sync scope
is needed for shared atomic operations (such as "system-one-as" or
"workgroup-one-as") otherwise buffer_wbinvl1 and s_waitcnt instructions
can be created at each shared atomic operation.

This mostly reimplements LLVMBuildAtomicRMW and LLVMBuildAtomicCmpXchg
to allow for more sync scopes and uses the new functions in ac->nir with
the "workgroup-one-as" or "workgroup" sync scopes.

      F1 2017 (4K, Ultra High settings, TAA), avg FPS : 59 -> 59.67 (+1.14%)
     Strange Brigade (4K, ~highest settings), avg FPS : 51.5 -> 51.6 (+0.19%)
RotTR/mountain (4K, VeryHigh settings, FXAA), avg FPS : 57.2 -> 57.2 (+0.0%)
    RotTR/tomb (4K, VeryHigh settings, FXAA), avg FPS : 42.5 -> 43.0 (+1.17%)
  RotTR/valley (4K, VeryHigh settings, FXAA), avg FPS : 40.7 -> 41.6 (+2.21%)
                         Warhammer II/fallen, avg FPS : 31.63 -> 31.83 (+0.63%)
                         Warhammer II/skaven, avg FPS : 37.77 -> 38.07 (+0.79%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-29 18:20:44 +01:00
Bas Nieuwenhuizen 427024bf2e ac/nir: Add support for planes.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-25 19:56:20 +00:00
Marek Olšák 2313176817 ac: add REWIND and GDS registers to register headers
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-04-23 11:28:56 -04:00
Marek Olšák 35cd57df2e ac: add ac_get_i1_sgpr_mask
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-04-23 11:28:56 -04:00
Marek Olšák bfb9287599 ac: add radeon_info::is_pro_graphics
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-04-23 11:28:56 -04:00
Marek Olšák 64d6cc982d ac: add radeon_info::marketing_name, replacing the winsys callback
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-04-23 11:28:56 -04:00
Samuel Pitoiset 2b515a8259 ac/nir: use the new raw/struct SSBO atomic intrinsics for comp_swap
This is actually fixed now.

This change requires LLVM r358579. Make sure to have it in
your tree, otherwise the following piglit will hang:

tests/spec/arb_shader_storage_buffer_object/execution/ssbo-atomicCompSwap-int.shader_test

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-04-19 09:20:15 +02:00
Samuel Pitoiset 895e10d2db ac/nir: only use the new raw/struct SSBO atomic intrinsics with LLVM 9+
They are buggy with older LLVM versions, see r358579.

Fixes: 78c551aca1 ("ac/nir: use new LLVM 8 intrinsics for SSBO atomics except cmpswap")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-04-19 09:20:13 +02:00
Samuel Pitoiset 31164cf5f7 ac/nir: only use the new raw/struct image atomic intrinsics with LLVM 9+
They are buggy with LLVM 8 because they weren't marked as source
of divergence, see r358579.

Fixes: dd0172e865 ("radv: Use structured intrinsics instead of indexing workaround for GFX9.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-04-19 09:20:09 +02:00
Samuel Pitoiset ad6dc13fc7 ac: use struct/raw store intrinsics for 8-bit/16-bit int with LLVM 9+
This change requires LLVM r356465.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-17 22:10:30 +02:00
Samuel Pitoiset 26ea506235 ac: use struct/raw load intrinsics for 8-bit/16-bit int with LLVM 9+
This change requires LLVM r356465.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-17 22:10:28 +02:00
Samuel Pitoiset 6fd5e39b60 ac: add support for more types with struct/raw LLVM intrinsics
LLVM 9+ now supports 8-bit and 16-bit types.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-17 22:10:25 +02:00
Samuel Pitoiset d118e382dd ac/nir: add 64-bit SSBO atomic operations support
Except compare&swap which is still buggy.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-17 21:59:54 +02:00
Samuel Pitoiset 78c551aca1 ac/nir: use new LLVM 8 intrinsics for SSBO atomics except cmpswap
Use the raw version (ie. IDXEN=0) because vindex is unused.
Use the old intrinsic for compare&swap because the new one
hangs the GPU for some reason.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-17 21:59:52 +02:00
Bas Nieuwenhuizen af9534b9f3 ac: Move has_local_buffers disable to radeonsi.
In radv we had a separate flag to actually use it + an env option
to experimentally use it.

The common code setting has_local_buffers to false of course broke
that experimental option.

Also the "enable on APU" did not make sense for RADV as it is still
disabled by default.

Fixes: b21a4efb55 "radv/winsys: allow local BOs on APUs"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-15 20:39:28 +02:00
Marek Olšák dbab755ecf ac: fix incorrect bindless atomic code in visit_image_atomic
Coverity: CID 1444664

Fixes: d62d434fe9 ("ac/nir_to_llvm: add image bindless support")

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-04-15 12:52:02 -04:00
Rhys Perry 8671cfe2a2 nir,ac/nir: fix cube_face_coord
Seems it was missing the "/ ma + 0.5" and the order was swapped.

Fixes: a1a2a8dfda ('nir: add AMD_gcn_shader extended instructions')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-15 17:22:47 +01:00
Karol Herbst 14531d676b nir: make nir_const_value scalar
v2: remove & operator in a couple of memsets
    add some memsets
v3: fixup lima

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
2019-04-14 22:25:56 +02:00
Karol Herbst adb2263014 amd/nir: some cleanups
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-04-14 22:25:56 +02:00
Marek Olšák f4ae188d50 ac: use the common helper ac_apply_fmask_to_sample
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-12 11:35:31 -04:00
Marek Olšák 971bc10177 radeonsi: set AC_FUNC_ATTR_READNONE for image opcodes where it was missing
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-12 11:34:39 -04:00
Samuel Pitoiset 6718bb57ac ac/nir: remove some useless integer casts for ALU operations
Sources are always cast to integers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:55 +02:00
Samuel Pitoiset 8a6442075f ac/nir: remove useless integer cast in visit_image_load()
ac_build_image_opcode() casts if necessary, and buffer images
are cast too.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:55 +02:00
Samuel Pitoiset ffbb62f808 ac/nir: remove useless integer cast in adjust_sample_index_using_fmask()
It's already cast if necessary in ac_build_image_opcode().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:55 +02:00
Samuel Pitoiset 7b5b27a685 ac/nir: remove useless LLVMGetUndef for nir_op_pack_64_2x32_split
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:55 +02:00
Samuel Pitoiset fd4041987b ac: add ac_build_load_helper_invocation() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:55 +02:00
Samuel Pitoiset 590a4c8981 ac: add ac_build_ddxy_interp() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:55 +02:00
Samuel Pitoiset 4cb13e9462 ac: add ac_build_umax() and use it where possible
This changes the predicate from LessThan to Equal.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:55 +02:00
Samuel Pitoiset cf88bfa75a ac/nir: make use of ac_build_umin() where possible
This changes the predicate from LessThan to Equal.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:54 +02:00
Samuel Pitoiset 15dd81913f ac/nir: make use of ac_build_imin() where possible
This changes the predicate from LessThan to Equal.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:54 +02:00
Samuel Pitoiset d7a0c0d53b ac/nir: make use of ac_build_imax() where possible
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:54 +02:00
Timothy Arceri d62d434fe9 ac/nir_to_llvm: add image bindless support
With this all piglit bindless image tests pass on radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 09:02:59 +02:00
Timothy Arceri 55fb93b586 ac/nir_to_llvm: make get_sampler_desc() more generic and pass it the image intrinsic
This will be required by the bindless support in the following patches.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 09:02:59 +02:00
Karol Herbst d7bbb3caf1 glsl_to_nir: handle bindless textures
v2: add support for AMD

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 09:02:59 +02:00
Rhys Perry fd1fc255d9 ac: add 16-bit support to ac_build_ddxy()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-10 09:05:58 +02:00
Samuel Pitoiset bc6d486c78 ac/nir: fix nir_op_b2f16
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-10 09:05:55 +02:00
Samuel Pitoiset 27b8f3ecc3 ac/nir: fix intrinsic names for atomic operations with LLVM 9+
This fixes the following LLVM error when using RADV_DEBUG=checkir:
Intrinsic name not mangled correctly for type arguments! Should be: llvm.amdgcn.buffer.atomic.add.i32
i32 (i32, <4 x i32>, i32, i32, i1)* @llvm.amdgcn.buffer.atomic.add

The cmpswap operation still uses the old intrinsic.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-08 13:16:50 +02:00
Marek Olšák b563460b49 radeonsi: enable displayable DCC on Ravens
2019-04-04 09:53:24 -04:00
Marek Olšák 1f21396431 radeonsi: add support for displayable DCC for multi-RB chips
A compute shader is used to reorder DCC data from aligned to unaligned.
2019-04-04 09:53:24 -04:00
Marek Olšák 2c09eb4122 radeonsi: add support for displayable DCC for 1 RB chips
This is the simpler codepath - just disable RB and pipe alignment for DCC.
2019-04-04 09:53:24 -04:00
Samuel Pitoiset d099bc5829 ac: add 8-bit and 64-bit support to ac_build_bitfield_reverse()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-01 18:53:57 +02:00
Samuel Pitoiset 2cecf6c5cc ac: add 8-bit support to ac_build_umsb()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-01 18:53:55 +02:00
Samuel Pitoiset a45d9e3e8d ac: add 8-bit support to ac_find_lsb()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-01 18:53:53 +02:00
Samuel Pitoiset 89cf8ca0ae ac: add 8-bit support to ac_build_bit_count()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-01 18:53:52 +02:00
Samuel Pitoiset 869af0464a ac/nir: add support for nir_op_b2i8
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-01 18:53:49 +02:00
Samuel Pitoiset 4d5fce29c3 ac: fix ac_build_umsb() for 16-bit integer type
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-01 09:51:56 +02:00
Samuel Pitoiset 7a088d1ac8 ac: fix ac_find_lsb() for 16-bit integer type
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-01 09:51:54 +02:00
Samuel Pitoiset b16dffff23 ac: fix ac_build_bitfield_reverse() for 16-bit integer type
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-01 09:51:52 +02:00
Samuel Pitoiset 9d13b9e53e ac: fix ac_build_bit_count() for 16-bit integer type
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-01 09:51:49 +02:00
Samuel Pitoiset e39a6b940f ac/nir: fix nir_op_b2i16
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-01 09:51:47 +02:00
Timothy Arceri 4478c5374b Revert "ac/nir: use new LLVM 8 intrinsics for SSBO atomic operations"
This reverts commit 29132af234.

It seems the new intrinsic causes a hang on radeonsi (VEGA) when running the
piglit test:

tests/spec/arb_shader_storage_buffer_object/execution/ssbo-atomicCompSwap-int.shader_test
2019-03-29 21:04:01 +11:00
Samuel Pitoiset cc752dea61 ac: fix return type for llvm.amdgcn.frexp.exp.i32.64
This fixes the following piglit test with RadeonSI:
tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-03-29 09:18:24 +01:00
Samuel Pitoiset 52c02d921f ac: add ac_build_frex_exp() helper and 16-bit/32-bit support
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-28 13:02:48 +01:00
Samuel Pitoiset 1bf9311c59 ac: add ac_build_frexp_mant() helper and 16-bit/32-bit support
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-28 13:02:46 +01:00
Samuel Pitoiset d6a07732c9 ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-27 14:45:52 +01:00
Nicolai Hähnle e16ac33f37 amd/surface: provide firstMipIdInTail for metadata surface calculations
This field was added in a recent addrlib update, and while there
currently seems to be no issue with skipping it, we will have to
set it correctly in the future.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-03-26 10:00:55 +01:00
Bas Nieuwenhuizen 82075e3c42 ac/nir: Return frag_coord as integer.
To preserve the invariant that nir ssa defs are integers or pointers
in LLVM.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-03-26 09:41:15 +01:00
Rhys Perry f736250ab4 ac/nir: implement 16-bit pack/unpack opcodes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-03-22 12:50:16 +01:00
Samuel Pitoiset 00327f827f ac: fix incorrect argument type for tbuffer.{load,store} with LLVM 7
GLC/SLC are boolean.

This fixes the following LLVM error when checkir is set:
Intrinsic has incorrect argument type!
void (i32, <4 x i32>, i32, i32, i32, i32, i32, i32, i32, i32)* @llvm.amdgcn.tbuffer.store.i32

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 14:02:00 +01:00
Samuel Pitoiset 20cac1f498 ac: fix 16-bit shifts
This fixes the following LLVM error when checkir is set:
Type too small for ZExt

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 14:01:58 +01:00
Samuel Pitoiset 2ac5c5c1b5 ac: add 16-bit support to fract
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 12:13:09 +01:00
Samuel Pitoiset 0eb1478ac2 ac: add 16-bit support to fsign
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 12:13:07 +01:00
Samuel Pitoiset ff11c9dcc7 ac: add f16_0 and f16_1 constants
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 12:13:05 +01:00
Rhys Perry 3cc72a88d8 ac/nir: implement 8-bit conversions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:25 +01:00
Rhys Perry c73f8b6576 ac/nir: add 8-bit types to glsl_base_to_llvm_type
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:22 +01:00
Rhys Perry 9c5067acf1 ac/nir: implement 8-bit ssbo stores
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:20 +01:00
Samuel Pitoiset b235d77e18 ac: add ac_build_tbuffer_store_byte() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:18 +01:00
Rhys Perry b12e074b89 ac/nir: implement 8-bit push constant, ssbo and ubo loads
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:16 +01:00
Samuel Pitoiset 104dbc64a5 ac: add ac_build_tbuffer_load_byte() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:14 +01:00
Samuel Pitoiset 6e632eb24b ac: add various int8 definitions
Original patch by Rhys Perry.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:10 +01:00
Samuel Pitoiset 72e366b4c2 ac: use new LLVM 8 intrinsics in ac_build_buffer_store_dword()
New buffer intrinsics have a separate soffset parameter.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:19 +01:00
Samuel Pitoiset 9d960c17a8 ac: use new LLVM 8 intrinsic when storing 16-bit values
vindex is always 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:14 +01:00
Samuel Pitoiset 2a9d331898 ac: add ac_build_{struct,raw}_tbuffer_store() helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:12 +01:00
Samuel Pitoiset 30c2aca67f ac: use new LLVM 8 intrinsics in ac_build_buffer_load()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:08 +01:00
Samuel Pitoiset da46dbb1be ac/nir: use ac_build_buffer_store_dword() for SSBO store operations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:06 +01:00
Samuel Pitoiset 6b573c00c9 ac/nir: use ac_build_buffer_load() for SSBO load operations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:02 +01:00
Samuel Pitoiset 29132af234 ac/nir: use new LLVM 8 intrinsics for SSBO atomic operations
Use the raw version (ie. IDXEN=0) because vindex is unused.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:18:56 +01:00
Samuel Pitoiset b39844457f ac/nir: remove one useless check in visit_store_ssbo()
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:18:54 +01:00
Samuel Pitoiset a2073f49f1 ac: add ac_build_buffer_store_format() helper
Similar to ac_build_buffer_load_format().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:18:50 +01:00
Samuel Pitoiset 4debe49d44 ac/nir: set attrib flags for SSBO and image store operations
For consistency with other store operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:18:37 +01:00
Samuel Pitoiset 1b553dd47f ac: make use of ac_get_store_intr_attribs() where possible
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:18:35 +01:00
Samuel Pitoiset f4f0e3a395 ac: use llvm.amdgcn.fract intrinsic for nir_op_ffract
Noticed with a Doom shader.

29077 shaders in 15096 tests
Totals:
SGPRS: 1282125 -> 1282133 (0.00 %)
VGPRS: 908716 -> 908616 (-0.01 %)
Spilled SGPRs: 24811 -> 24779 (-0.13 %)
Code Size: 49048176 -> 48936488 (-0.23 %) bytes
Max Waves: 244232 -> 244226 (-0.00 %)

Totals from affected shaders:
SGPRS: 229584 -> 229592 (0.00 %)
VGPRS: 163268 -> 163168 (-0.06 %)
Spilled SGPRs: 8682 -> 8650 (-0.37 %)
Code Size: 12819572 -> 12707884 (-0.87 %) bytes
Max Waves: 24398 -> 24392 (-0.02 %)
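
The percentages in the report above are plain relative changes rounded to two decimals. A minimal sketch of that arithmetic, assuming the usual (after - before) / before definition; the helper name pct_change is hypothetical and is not part of shader-db:

```python
def pct_change(before: int, after: int) -> str:
    """Relative change from before to after, as a percentage with two decimals."""
    return f"{(after - before) / before * 100:.2f}"

# Totals taken from the report above, as (before, after) pairs.
totals = {
    "SGPRS": (1282125, 1282133),
    "VGPRS": (908716, 908616),
    "Spilled SGPRs": (24811, 24779),
    "Code Size": (49048176, 48936488),
    "Max Waves": (244232, 244226),
}

for name, (before, after) in totals.items():
    print(f"{name}: {before} -> {after} ({pct_change(before, after)} %)")
```

The absolute code size delta is the same in both tables (111688 bytes), which is why the relative saving looks larger when only the affected shaders are counted.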

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 09:06:35 +01:00
Timothy Arceri 010570c8e3 ac/nir_to_llvm: add assert to emit_bcsel()
nir to llvm assumes we have already split vectors to scalars via
nir_lower_alu_to_scalar().

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-18 09:39:04 +11:00
Samuel Pitoiset cbf022cb31 ac: use the raw tbuffer version for 16-bit SSBO loads
vindex is always 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-13 14:16:14 +01:00
Samuel Pitoiset 045fae0f73 ac: add ac_build_{struct,raw}_tbuffer_load() helpers
The struct version sets IDXEN=1, while the raw version sets IDXEN=0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-13 14:15:05 +01:00
Samuel Pitoiset 489dac0d21 ac: rework typed buffers loads for LLVM 7
Be more generic; this will be used by an upcoming series.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-13 13:31:06 +01:00
Rhys Perry 0f025bbccc ac/nir: fix 16-bit ssbo stores
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-03-12 15:51:52 +01:00
Timothy Arceri 54522d0506 nir: rename glsl_type_is_struct() -> glsl_type_is_struct_or_ifc()
Replace done using:
find ./src -type f -exec sed -i -- \
's/glsl_type_is_struct(/glsl_type_is_struct_or_ifc(/g' {} \;

Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-06 13:10:02 +11:00
Timothy Arceri 8294295dbd glsl: rename record_location_offset() -> struct_location_offset()
Replace done using:
find ./src -type f -exec sed -i -- \
's/record_location_offset(/struct_location_offset(/g' {} \;

Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-06 13:10:02 +11:00
Bas Nieuwenhuizen a1fdd4a4a7 radv: Fix float16 interpolation set up.
float16 types can have non-flat interpolation so set up the HW
correctly for that.

Fixes: 62024fa775 "radv: enable VK_KHR_16bit_storage extension / 16bit storage features"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-22 17:06:55 +01:00
Bas Nieuwenhuizen 1ef2855692 radv: Handle clip+cull distances more generally as compact arrays.
Needed for https://gitlab.freedesktop.org/mesa/mesa/merge_requests/248 .

That MR keeps the clip and cull arrays split.

So we have to handle
 - compact arrays with location_frac != 0
 - VARYING_SLOT_CLIP_DIST1

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-20 22:49:52 +00:00
Kenneth Graunke ba7519ca36 radeonsi: Go back to using llvm.pow intrinsic for nir_op_fpow
ARB_vertex_program and ARB_fragment_program define 0^0 = 1 (while GLSL
leaves it undefined).  Performing fpow lowering in NIR would break this
behavior, preventing us from using prog_to_nir.

According to llvm/lib/Target/AMDGPU/SIInstructions.td, POW_common
expands to <V_LOG_F32_e32, V_EXP_F32_e32, V_MUL_LEGACY_F32_e32>,
which presumably does a zero-wins multiply.

Lowering in NIR results in a non-legacy multiply, where:

   pow(0, 0) = 2^(log2(0) * 0)
             = 2^(-INF * 0)
             = 2^(-NaN)
             = -NaN

which isn't the desired result.
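
The difference between the two multiplies can be reproduced on the CPU. This is a minimal sketch, assuming that "zero-wins" means the product is 0 whenever either operand is 0; the helpers log2_hw, mul_ieee, mul_legacy and pow_via_exp2 are hypothetical names for illustration and are not Mesa or LLVM functions:

```python
import math

def log2_hw(x: float) -> float:
    # math.log2(0.0) raises ValueError, so model a log that returns -inf for 0.
    return -math.inf if x == 0.0 else math.log2(x)

def mul_ieee(a: float, b: float) -> float:
    # Ordinary IEEE 754 multiply: inf * 0 is NaN.
    return a * b

def mul_legacy(a: float, b: float) -> float:
    # Assumed "zero-wins" behaviour: a zero operand forces a zero result.
    return 0.0 if a == 0.0 or b == 0.0 else a * b

def pow_via_exp2(x: float, y: float, mul) -> float:
    # pow(x, y) lowered to exp2(log2(x) * y), with a selectable multiply.
    return 2.0 ** mul(log2_hw(x), y)

print(pow_via_exp2(0.0, 0.0, mul_ieee))    # nan
print(pow_via_exp2(0.0, 0.0, mul_legacy))  # 1.0
```

With the IEEE multiply the NaN produced by -inf * 0 propagates through the exponential, while the zero-wins multiply yields 2^0 = 1, which is the value ARB_vertex_program and ARB_fragment_program require.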

This reverts:
- commit d6b7539206
  (ac/nir: remove emission of nir_op_fpow)
- commit 22430224fe
  (radeonsi/nir: enable lowering of fpow)

and prevents a regression in gl-1.0-spot-light with AMD_DEBUG=nir
after enabling prog_to_nir in st/mesa later in this series.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 15:56:19 -08:00