Commit Graph

6836 Commits

Author SHA1 Message Date
Samuel Pitoiset f6755eee0c radv: enable SQTT support on GFX10.3
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8616>
2021-01-22 14:25:16 +00:00
Samuel Pitoiset aedcaff356 ac,radv: add SQTT support on GFX10.3
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8616>
2021-01-22 14:25:16 +00:00
Samuel Pitoiset cd53f24fbf ac/rgp: add support for GFX10.3
According to AMDVLK, GFX10.3 uses SQTT version 2.4.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8616>
2021-01-22 14:25:16 +00:00
Samuel Pitoiset 5b5cd18853 radv: inhibit clock gating when tracing with SQTT
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8616>
2021-01-22 14:25:16 +00:00
Samuel Pitoiset c40ea24ee0 radv: fix overflow when computing the SQTT buffer size
With RADV_THREAD_TRACE_BUFFER_SIZE=1073741824, the computed size
will overflow and be 4096 instead of 4294967296.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8616>
2021-01-22 14:25:16 +00:00
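The overflow described in c40ea24ee0 can be reproduced with a small model. This is only a sketch: the layout (four shader engines plus a 4096-byte info header) and the function names are assumptions chosen to match the numbers in the message, not the driver's actual code. Python integers do not wrap, so the 32-bit case is emulated with a mask.

```python
U32_MASK = 0xFFFFFFFF  # emulates arithmetic in a 32-bit unsigned type


def sqtt_size_u32(buffer_size, num_se=4, info_size=4096):
    """Total size when every intermediate is truncated to 32 bits.

    num_se and info_size are hypothetical values for illustration.
    """
    per_se_total = (buffer_size * num_se) & U32_MASK  # wraps to 0 for 1 GiB * 4
    return (info_size + per_se_total) & U32_MASK


def sqtt_size_u64(buffer_size, num_se=4, info_size=4096):
    """Same computation carried out in a type wide enough for the result."""
    return info_size + buffer_size * num_se


if __name__ == "__main__":
    requested = 1073741824  # RADV_THREAD_TRACE_BUFFER_SIZE from the message
    print(sqtt_size_u32(requested))
    print(sqtt_size_u64(requested))
```

Under these assumptions the truncated result is just the header (4096), while the wide result is the 4294967296-byte buffer part plus that header.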
Rhys Perry e115b01948 aco: return references in instruction cast methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:33 +00:00
Rhys Perry 1d245cd18b aco: use format-check methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:32 +00:00
Rhys Perry 70dbcfa1c9 aco: use instruction cast methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:32 +00:00
Rhys Perry fb12302b8e aco: add instruction cast and format-check methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:32 +00:00
Rhys Perry 441ead5fb3 aco: remove Format::{VOP3A,VOP3B}
These are really the same as Format::VOP3.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:32 +00:00
Rhys Perry 824eba2148 aco: don't consider a phi trivial if same's register doesn't match the def
For example:
 s2: %688:s[32-33] = p_linear_phi %3:s[10-11], %688:s[32-33]
would have been considered trivial.

This might happen due to parallelcopies when assigning phi registers.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 69b6069dd2 ("aco: refactor try_remove_trivial_phi() in RA")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8645>
2021-01-22 12:42:47 +00:00
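The rule from 824eba2148 can be sketched as a predicate. This is a simplified model, not the register allocator's code: `Temp`, its `reg` string and `trivial_phi_replacement` are hypothetical names, and a phi that only references itself is simply treated as non-trivial here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Temp:
    id: int
    reg: str  # assigned register range, e.g. "s[32-33]"


def trivial_phi_replacement(definition: Temp, operands: Sequence[Temp]) -> Optional[Temp]:
    """Return the value that can replace the phi, or None if it is not trivial."""
    same = None
    for op in operands:
        if op.id == definition.id:
            continue  # self-reference, ignore
        if same is not None and op.id != same.id:
            return None  # two distinct incoming values
        same = op
    if same is None:
        return None
    # The added check: the single incoming value must already live in the
    # register assigned to the phi's definition.
    if same.reg != definition.reg:
        return None
    return same


if __name__ == "__main__":
    phi_def = Temp(688, "s[32-33]")
    ops = [Temp(3, "s[10-11]"), Temp(688, "s[32-33]")]
    print(trivial_phi_replacement(phi_def, ops))
```

For the example in the message, `%3` sits in `s[10-11]` while the definition is in `s[32-33]`, so the sketch reports the phi as non-trivial.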
Rhys Perry 9f389af35f radv: sink load_ssbo
fossil-db (GFX10.3):
Totals from 11485 (8.24% of 139391) affected shaders:
SGPRs: 1032456 -> 1033696 (+0.12%); split: -0.69%, +0.81%
VGPRs: 815332 -> 807448 (-0.97%); split: -1.04%, +0.07%
SpillSGPRs: 18014 -> 13497 (-25.07%); split: -28.28%, +3.20%
SpillVGPRs: 1821 -> 1749 (-3.95%)
CodeSize: 101194172 -> 101235028 (+0.04%); split: -0.06%, +0.10%
Scratch: 198656 -> 178176 (-10.31%)
MaxWaves: 86703 -> 87219 (+0.60%); split: +0.67%, -0.07%
Instrs: 19224250 -> 19238562 (+0.07%); split: -0.05%, +0.13%
Cycles: 1486045388 -> 1487481292 (+0.10%); split: -0.03%, +0.13%
VMEM: 2040484 -> 2127647 (+4.27%); split: +6.64%, -2.37%
SMEM: 724060 -> 674966 (-6.78%); split: +1.22%, -8.00%
VClause: 312375 -> 314735 (+0.76%); split: -0.26%, +1.02%
SClause: 702274 -> 711991 (+1.38%); split: -0.77%, +2.15%
Copies: 1413440 -> 1422782 (+0.66%); split: -0.45%, +1.11%
Branches: 658696 -> 658838 (+0.02%); split: -0.12%, +0.14%
PreSGPRs: 884666 -> 879736 (-0.56%); split: -1.30%, +0.74%
PreVGPRs: 777374 -> 769947 (-0.96%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6490>
2021-01-21 18:07:03 +00:00
Rhys Perry af4c6605a8 radv: use nir_opt_access
fossil-db (GFX10.3):
Totals from 3231 (2.32% of 139391) affected shaders:
SGPRs: 168654 -> 167454 (-0.71%); split: -0.72%, +0.00%
VGPRs: 152352 -> 152416 (+0.04%)
CodeSize: 13872836 -> 13806376 (-0.48%); split: -0.50%, +0.02%
MaxWaves: 36640 -> 36634 (-0.02%)
Instrs: 2639959 -> 2626852 (-0.50%); split: -0.52%, +0.03%
Cycles: 77706000 -> 77496792 (-0.27%); split: -0.28%, +0.01%
VMEM: 809496 -> 790847 (-2.30%); split: +2.06%, -4.36%
SMEM: 267843 -> 253187 (-5.47%); split: +0.76%, -6.23%
VClause: 61353 -> 60426 (-1.51%); split: -1.86%, +0.35%
SClause: 95409 -> 92355 (-3.20%); split: -3.24%, +0.04%
Copies: 194951 -> 196702 (+0.90%); split: -0.53%, +1.43%
Branches: 84320 -> 84331 (+0.01%); split: -0.00%, +0.02%
PreSGPRs: 110162 -> 110203 (+0.04%); split: -0.04%, +0.07%
PreVGPRs: 127021 -> 127037 (+0.01%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6490>
2021-01-21 18:07:03 +00:00
Rhys Perry dc19fe0e9f radv,aco: use deref_buffer_array_length
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3993
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8163>
2021-01-21 11:53:12 +00:00
Daniel Schürmann e10779a9f0 radv: don't vectorize shift operations
Currently, these cannot be vectorized because NIR shift
operands are 32-bit, while 16-bit vectorization needs
them to be 16-bit.

No fossildb changes.

Fixes: fcd2ef23e5 ('radv: vectorize 16bit instructions')

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8612>
2021-01-21 11:44:00 +00:00
Daniel Schürmann 7dcb9a0d8c aco/optimizer: convert extract_vector with index 0 into parallelcopies if possible
Totals from 273 (0.20% of 139391) affected shaders (Navi10):
VGPRs: 11600 -> 11792 (+1.66%)
CodeSize: 1389304 -> 1383152 (-0.44%); split: -0.53%, +0.08%
MaxWaves: 3848 -> 3752 (-2.49%)
Instrs: 240228 -> 239478 (-0.31%); split: -0.37%, +0.06%
Cycles: 20637708 -> 20580024 (-0.28%); split: -0.46%, +0.18%
VMEM: 39164 -> 38831 (-0.85%); split: +0.06%, -0.91%
SMEM: 21743 -> 22204 (+2.12%)
VClause: 4787 -> 4783 (-0.08%)
Copies: 39057 -> 38308 (-1.92%); split: -2.28%, +0.37%
Branches: 6556 -> 6557 (+0.02%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
2021-01-21 11:05:36 +00:00
Daniel Schürmann ebbf5fe716 aco/optimizer: expand subdword vectors with SGPRs on all generations
No fossildb changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
2021-01-21 11:05:36 +00:00
Daniel Schürmann 96fafcca63 aco: propagate temporaries into PSEUDO instructions if it can take it
This patch relaxes copy-propagation for PSEUDO instructions with
subdword Operands / Definitions:
general:
- only propagate VGPR temps if the Definition is VGPR (or on p_as_uniform)

parallelcopy/create_vector/phis:
- size has to be the same

extract_vector/split_vector:
- propagate SGPR temps on GFX9+ or if the Definitions are not subdword
- split_vector: size must not increase

Totals from 282 (0.20% of 140985) affected shaders (Polaris10):
VGPRs: 14520 -> 14408 (-0.77%)
CodeSize: 2693956 -> 2694316 (+0.01%); split: -0.20%, +0.21%
Instrs: 512874 -> 512864 (-0.00%); split: -0.16%, +0.16%
Cycles: 26338860 -> 26320652 (-0.07%); split: -0.36%, +0.29%
VMEM: 49460 -> 49634 (+0.35%); split: +0.47%, -0.12%
SMEM: 10035 -> 10036 (+0.01%)
VClause: 7675 -> 7674 (-0.01%)
Copies: 66012 -> 65943 (-0.10%); split: -1.31%, +1.20%
Branches: 17265 -> 17281 (+0.09%); split: -0.10%, +0.19%
PreVGPRs: 12211 -> 12124 (-0.71%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
2021-01-21 11:05:36 +00:00
Daniel Schürmann 21a7bea342 aco/validate: relax subdword restrictions
This affects constants/SGPRs on GFX6-8 and
the operand regClass of SDWA instructions.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
2021-01-21 11:05:36 +00:00
Daniel Schürmann 77c9629046 aco/validate: ensure that Operand and Definition size matches for parallelcopies
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
2021-01-21 11:05:36 +00:00
Daniel Schürmann 8fb66187ec aco/validate: validate that p_create_vector operands are aligned unless they are subdword operands
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
2021-01-21 11:05:36 +00:00
Daniel Schürmann c0cec3a29b aco: generalize subdword constant copy lowering
This allows sub-register constants to be propagated and emitted
on all hardware generations.

Also fixes GFX8 constant emission to not use SDWA.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
2021-01-21 11:05:36 +00:00
Daniel Schürmann 856fd4750d aco/optimizer: don't propagate subdword temps of different size
It could happen that due to inconsistent copy-propagation

  v1 = p_parallelcopy v2b

instructions were left after optimization on GFX8.

Cc: 20.3
Cc: 21.0

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
2021-01-21 11:05:36 +00:00
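The size mismatch in `v1 = p_parallelcopy v2b` (856fd4750d, and the validator rule added in 77c9629046) can be illustrated with a small check. This sketch assumes the register-class naming used in the messages, where a trailing `b` means the count is in bytes and otherwise in dwords; the helper names are hypothetical.

```python
import re


def reg_class_bytes(name: str) -> int:
    """Size in bytes of a register class written like 'v1', 's2' or 'v2b'."""
    match = re.fullmatch(r"([sv])(\d+)(b?)", name)
    if match is None:
        raise ValueError(f"unrecognized register class: {name!r}")
    count = int(match.group(2))
    # 'b' suffix: subdword class counted in bytes; otherwise counted in dwords.
    return count if match.group(3) else count * 4


def parallelcopy_sizes_match(definition_rc: str, operand_rc: str) -> bool:
    """True if a copy from operand_rc to definition_rc keeps the same size."""
    return reg_class_bytes(definition_rc) == reg_class_bytes(operand_rc)


if __name__ == "__main__":
    print(parallelcopy_sizes_match("v1", "v2b"))
    print(parallelcopy_sizes_match("v2b", "v2b"))
```

Here `v1` is 4 bytes and `v2b` is 2 bytes, so the copy from the message is rejected by the sketch.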
Daniel Schürmann cd870d1b6a aco/optimizer: don't copy-prop logical phis
This is dangerous w.r.t. LCSSA-phis.

Totals from 746 (0.54% of 139391) affected shaders (Navi10):
CodeSize: 8592160 -> 8568156 (-0.28%); split: -0.30%, +0.02%
MaxWaves: 5172 -> 5171 (-0.02%); split: +0.02%, -0.04%
Instrs: 1653949 -> 1648489 (-0.33%); split: -0.36%, +0.03%
Cycles: 49474892 -> 49329224 (-0.29%); split: -0.33%, +0.03%
VMEM: 137574 -> 137421 (-0.11%); split: +0.18%, -0.29%
SMEM: 42391 -> 42439 (+0.11%); split: +0.12%, -0.01%
VClause: 26946 -> 26943 (-0.01%)
Copies: 130902 -> 126176 (-3.61%); split: -4.05%, +0.43%
Branches: 54891 -> 54556 (-0.61%); split: -0.64%, +0.03%
PreVGPRs: 53941 -> 53939 (-0.00%)

This has a slight effect on RA due to affinity changes.

Cc: 20.3
Cc: 21.0

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
2021-01-21 11:05:36 +00:00
Samuel Pitoiset 085e2ce3d4 radv: fix a sync issue with geometry shader primitives query on GFX10+
When NGG is used, the hw can't know the number of geometry shader
primitives. To fix that, the NGG geometry shader itself accumulates
the number of primitives by using an atomic operation directly on GDS.

Then, begin/query copy the start/stop values from GDS to the
query pool buffer using a PS_DONE event. This was actually wrong
because PS_DONE is completely asynchronous to everything and executed
when the preceding draws finish pixel shaders.

Fix this by using a COPY_DATA packet which is synced with CP. This
fixes random failures on Sienna Cichlid with
dEQP-VK.query_pool.statistics_query.*.geometry_shader_primitives.*.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8590>
2021-01-21 08:15:43 +01:00
Rhys Perry 914c61d6c0 radv,aco: don't use MUBUF for multi-channel loads on GFX8 with robustness2
Fixes several dEQP-VK.robustness.robustness2.* tests on GFX8. Generations
other than GFX8 don't fail the tests because bounds-checking is done using
the index (making it per-vertex).

fossil-db (Polaris):
Totals from 1387 (0.99% of 140385) affected shaders:
(no statistics affected)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 03a0d39366 ("aco: use MUBUF in some situations instead of splitting vertex fetches")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7834>
2021-01-20 17:57:56 +00:00
Samuel Pitoiset 4eec0fb55c radv: remove redundant check in depth_view_can_fast_clear()
We check below if HTILE is in compressed state, so checking if
the image has HTILE is useless because radv_layout_is_htile_compressed()
will return FALSE if no HTILE.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8579>
2021-01-20 17:18:39 +00:00
Samuel Pitoiset 27d4a15528 radv: remove unnecessary radv_image::tc_compatible_htile
Use the surface flags directly instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8579>
2021-01-20 17:18:39 +00:00
Samuel Pitoiset c30f010e8f radv: remove redundant check in radv_process_depth_stencil()
This is already checked in radv_handle_depth_image_transition().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8579>
2021-01-20 17:18:39 +00:00
Rhys Perry af9977a3d5 aco: add affinity for non-sequential MIMG operands
fossil-db (GFX10.3):
Totals from 42008 (30.14% of 139391) affected shaders:
VGPRs: 2139116 -> 2147696 (+0.40%); split: -0.06%, +0.46%
CodeSize: 199109120 -> 198637852 (-0.24%); split: -0.24%, +0.01%
Instrs: 37713901 -> 37714574 (+0.00%); split: -0.02%, +0.03%
Cycles: 1621911328 -> 1621634168 (-0.02%); split: -0.02%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
2021-01-20 16:46:54 +00:00
Rhys Perry 4015b3651a aco: only require texture coordinates to be in WQM if NSA is used
From comment in emit_mimg():
We don't need the bias, sample index, compare value or offset to be
computed in WQM but if the p_create_vector copies the coordinates, then it
needs to be in WQM.

fossil-db (GFX10.3):
Totals from 1778 (1.28% of 139391) affected shaders:
SGPRs: 105080 -> 105072 (-0.01%); split: -0.02%, +0.01%
VGPRs: 96800 -> 96776 (-0.02%); split: -0.07%, +0.05%
CodeSize: 10001120 -> 10001384 (+0.00%); split: -0.04%, +0.04%
MaxWaves: 18164 -> 18163 (-0.01%)
Instrs: 1883750 -> 1883598 (-0.01%); split: -0.06%, +0.05%
Cycles: 34800176 -> 34767840 (-0.09%); split: -0.10%, +0.01%

We don't have a p_create_vector if we use NSA.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
2021-01-20 16:46:54 +00:00
Rhys Perry c353895c92 aco: use non-sequential addressing
fossil-db (GFX10.3):
Totals from 70493 (50.57% of 139391) affected shaders:
SGPRs: 4232624 -> 4231808 (-0.02%); split: -0.09%, +0.07%
VGPRs: 2831772 -> 2764740 (-2.37%); split: -2.53%, +0.17%
CodeSize: 225584412 -> 225048740 (-0.24%); split: -0.44%, +0.21%
MaxWaves: 875319 -> 878837 (+0.40%); split: +0.44%, -0.04%
Instrs: 43157803 -> 42496421 (-1.53%); split: -1.54%, +0.01%
Cycles: 1656380132 -> 1641532056 (-0.90%); split: -0.94%, +0.04%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
2021-01-20 16:46:54 +00:00
Rhys Perry faf3e9a27f aco: move VADDR to the end of the operand list
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
2021-01-20 16:46:54 +00:00
Rhys Perry cd29210fce aco: add emit_mimg() helper
Some fossil-db noise from slightly different order of instructions.

fossil-db (GFX10.3):
Totals from 73 (0.05% of 139391) affected shaders:
SGPRs: 3424 -> 3440 (+0.47%)
CodeSize: 199076 -> 199064 (-0.01%); split: -0.01%, +0.00%
Instrs: 37303 -> 37300 (-0.01%); split: -0.01%, +0.00%
Cycles: 786328 -> 786316 (-0.00%); split: -0.00%, +0.00%
VMEM: 19448 -> 19454 (+0.03%); split: +0.04%, -0.01%
SMEM: 5241 -> 5305 (+1.22%); split: +1.70%, -0.48%
SClause: 1282 -> 1281 (-0.08%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
2021-01-20 16:46:54 +00:00
Rhys Perry 9890dabb1b aco: have emit_wqm() take Builder instead of isel_context
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
2021-01-20 16:46:54 +00:00
Rhys Perry 489aa8c7cb aco: fix num_waves on GFX10+
There are half the SIMDs per CU and physical_vgprs should be 512 instead
of 256.

fossil-db (GFX10.3):
Totals from 3622 (2.60% of 139391) affected shaders:
VGPRs: 298192 -> 289732 (-2.84%); split: -3.43%, +0.59%
CodeSize: 29443432 -> 29458388 (+0.05%); split: -0.00%, +0.06%
MaxWaves: 21703 -> 23395 (+7.80%); split: +7.84%, -0.05%
Instrs: 5677920 -> 5681438 (+0.06%); split: -0.01%, +0.07%
Cycles: 280715524 -> 280895676 (+0.06%); split: -0.00%, +0.07%
VMEM: 981142 -> 981894 (+0.08%); split: +0.18%, -0.10%
SMEM: 243315 -> 243454 (+0.06%); split: +0.07%, -0.02%
VClause: 88991 -> 89767 (+0.87%); split: -0.02%, +0.89%
SClause: 200660 -> 200659 (-0.00%); split: -0.00%, +0.00%
Copies: 430729 -> 434160 (+0.80%); split: -0.07%, +0.86%
Branches: 158004 -> 158021 (+0.01%); split: -0.01%, +0.02%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
2021-01-20 16:46:54 +00:00
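Why the `physical_vgprs` value in 489aa8c7cb affects MaxWaves can be seen in a deliberately simplified occupancy model. Everything here is an assumption for illustration: the function name, the absence of allocation granularity, and the `wave_limit` value are not taken from the compiler.

```python
def waves_limited_by_vgprs(vgprs_per_wave: int, physical_vgprs: int, wave_limit: int) -> int:
    """Waves that fit on one SIMD when only the VGPR file is the constraint.

    wave_limit is a hypothetical hardware cap, passed in by the caller.
    """
    if vgprs_per_wave <= 0:
        return wave_limit
    return min(wave_limit, physical_vgprs // vgprs_per_wave)


if __name__ == "__main__":
    demand = 64  # hypothetical shader using 64 VGPRs per wave
    print(waves_limited_by_vgprs(demand, 256, 16))
    print(waves_limited_by_vgprs(demand, 512, 16))
```

In this model a register-bound shader gets twice as many waves per SIMD once the register file is counted as 512 instead of 256 entries, which is the direction of the MaxWaves change reported above.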
Rhys Perry 12ea0143de radv: fix max_waves estimation on GFX10.3
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
2021-01-20 16:46:54 +00:00
Samuel Pitoiset 4c99d6ff54 radv: flush L2 for images affected by the pipe misaligned issue on GFX10+
In some rare cases, L2 needs to be flushed if an image is affected
by the pipe misaligned issue. This is roughly based on AMDVLK.

I confirmed that disabling TC-compat HTILE, and respectively DCC,
for the relevant images also fixes the regressions below.

This fixes some regressions introduced with L2 coherency for
dEQP-VK.renderpass2.depth_stencil_resolve.image_2d_* and for
dEQP-VK.renderpass2.suballocation.multisample_resolve.*.

Fixes: 4a783a3c78 ("radv: Use L2 coherency on GFX9+.")
Co-Authored-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8557>
2021-01-19 19:51:44 +00:00
Samuel Pitoiset 8882abe47e radv: restore invalidating the vector cache for internal meta operations
The driver used to invalidate the vector cache for meta operations
but this has been removed and I think it should be restored to fix
a bunch of regressions on GFX8.

This probably needs to be cleaned up but this is a hotfix.

This fixes a bunch of regressions and flakes on GFX8 like
dEQP-VK.pipeline.multisample.sample_locations_ext.draw.color.samples_4.*.

Fixes: 8f8d72af55 ("radv: Use access helpers for flushing with meta operations.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8573>
2021-01-19 19:15:39 +00:00
Samuel Pitoiset c28401ab43 radv: enable TC-compat HTILE for D16S8 on GFX9+
I don't know why this wasn't enabled but I think it should be.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8562>
2021-01-19 18:16:35 +00:00
Samuel Pitoiset cc5b6a0e89 radv: enable TC-compat HTILE with D32S8 and MSAA on GFX9+
Only GFX8 has some depth/stencil resolve failures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8562>
2021-01-19 18:16:35 +00:00
Samuel Pitoiset 60ead6e04b radv: add a comment explaining the micro tile mode resolve
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8558>
2021-01-19 18:52:43 +01:00
Rhys Perry 4c1953a9b8 aco: add test for incorrect convert_to_SDWA() check
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8577>
2021-01-19 15:38:56 +00:00
Rhys Perry fcda9b6737 aco: fix convert_to_SDWA() check in add_subdword_definition()
v_or_b32 with a v2b definition should use SDWA if is_partial=true.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 56345b8c61 ("aco: allow reading/writing upper halves/bytes when possible")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8577>
2021-01-19 15:38:51 +00:00
Samuel Pitoiset c3ac6f7cd7 radv: flush L2 metadata as part of CB/DB flush instead of CS_DONE on GFX9
This restores the previous logic because L2 coherency was fully
implemented. It appears that flushing L2 metadata with a CS_DONE
event hangs.

This fixes GPU hangs with Monster Hunter World.

Fixes: 4a783a3c ("radv: Use L2 coherency on GFX9+.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8566>
2021-01-19 07:47:34 +01:00
Bas Nieuwenhuizen c4ea4e026b radv: Add a trivial implementation of VK_KHR_deferred_host_operation
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8545>
2021-01-19 01:25:38 +01:00
Bas Nieuwenhuizen af1aef10f9 radv: Do not use a pipe offset for aliased sparse images.
Otherwise the offset might not match between the images that are
aliased.

Fixes: e553ea51e8 ("radv: Create sparse images.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4072
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8535>
2021-01-18 11:12:45 +00:00
Bas Nieuwenhuizen c28469bae1 ac/surface: Fix GFX9 sparse mip info.
Used the wrong offset & pitch for gfx9.

Fixes: 50bafb85ec ("ac/surf: Add sparse texture info to radeon_surf.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4072
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8526>
2021-01-16 14:09:18 +00:00
Samuel Pitoiset c6849f9687 radv: do not invalidate the L2 metadata cache on compute queues
The flush VA space was only allocated for command buffers on the
graphics queue. Also, the ZPASS_DONE event should never be emitted
on compute queues because it hangs.

Invalidating the L2 metadata cache is only required for coherency
between the RBs and L2, so only on the graphics queue.

The L2 cache is invalidated at beginning of any IBs and that should
also invalidate the L2 metadata cache for compute anyways.

Fixes: 4a783a3c ("radv: Use L2 coherency on GFX9+.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8494>
2021-01-15 07:36:11 +01:00
Pierre-Eric Pelloux-Prayer c4b7a0d61d ac: add ifdef __cplusplus guard to header
ac_shadowed_regs.h can be included from si_state_draw.cpp so this commit
adds the needed guards.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8433>
2021-01-14 10:33:10 +01:00
Rhys Perry dfe429eb41 nir/loop_unroll: unroll more aggressively if it can improve load scheduling
Significantly improves performance of a Control compute shader. Also seems
to increase FPS at the very start of the game by ~5% (RX 580, 1080p,
medium settings, no MSAA).

fossil-db (Sienna):
Totals from 81 (0.06% of 139391) affected shaders:
SGPRs: 3848 -> 4362 (+13.36%); split: -0.99%, +14.35%
VGPRs: 4132 -> 4648 (+12.49%)
CodeSize: 275532 -> 659188 (+139.24%)
MaxWaves: 986 -> 906 (-8.11%)
Instrs: 54422 -> 126865 (+133.11%)
Cycles: 1057240 -> 750464 (-29.02%); split: -42.61%, +13.60%
VMEM: 26507 -> 61829 (+133.26%); split: +135.56%, -2.30%
SMEM: 4748 -> 5895 (+24.16%); split: +31.47%, -7.31%
VClause: 1933 -> 6802 (+251.89%); split: -0.72%, +252.61%
SClause: 1179 -> 1810 (+53.52%); split: -3.14%, +56.66%
Branches: 1174 -> 1157 (-1.45%); split: -23.94%, +22.49%
PreVGPRs: 3219 -> 3387 (+5.22%); split: -0.96%, +6.18%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6538>
2021-01-13 18:54:18 +00:00
Tony Wasserka b603875482 aco/ra: Use PhysRegInterval for count_zero
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799>
2021-01-13 18:21:06 +00:00
Tony Wasserka c30e83cc51 aco/ra: Use PhysRegInterval for collect_vars parameters
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799>
2021-01-13 18:21:06 +00:00
Tony Wasserka 0959b7c435 aco/ra: Use PhysReg when indexing into RegisterFile's containers
This gets rid of a lot of implicit/explicit conversions from PhysReg to
unsigned.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799>
2021-01-13 18:21:06 +00:00
Tony Wasserka c3660f4781 aco/ra: Use PhysReg for member functions of PhysRegInterval
This replaces the various PhysReg{lb} casts that had been all over the place.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799>
2021-01-13 18:21:06 +00:00
Tony Wasserka d2d0096c0c aco/ra: Remove unused function parameter
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799>
2021-01-13 18:21:06 +00:00
Tony Wasserka d9e1375e27 aco/ra: Use std::all_of to simplify a loop
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799>
2021-01-13 18:21:06 +00:00
Tony Wasserka f7e6b61379 aco/ra: Add helpers to test for intersection/containment of reg intervals
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799>
2021-01-13 18:21:06 +00:00
Tony Wasserka 88f21ad87a aco/ra: Move commonly repeated code to a helper function
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799>
2021-01-13 18:21:06 +00:00
Tony Wasserka 8962510e38 aco/ra: Conservatively refactor get_reg_specified to use PhysRegInterval
All expressions have been replaced by their closest equivalent. No major
simplification efforts have been made to minimize risk of regressions.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799>
2021-01-13 18:21:06 +00:00
Tony Wasserka 46c9d76134 aco/ra: Use std::all_of to simplify a loop
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799>
2021-01-13 18:21:06 +00:00
Tony Wasserka 2b3b2f7ff5 aco/ra: Use std::find_if(_not) to clean up get_reg_simple
This makes for a more self-describing iteration behavior, and it gets rid
of the need for the duplicated "final check" at the bottom.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799>
2021-01-13 18:21:06 +00:00
Tony Wasserka ebdb362937 aco/ra: Add iterator interface for PhysRegInterval
This enables various loops to use range-based for.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799>
2021-01-13 18:21:06 +00:00
Tony Wasserka 689ce1f39d aco/ra: Remove always-false conditions
All code paths that set "found" to true either break or return before the
loop header is reached again, so the checks are unnecessary.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799>
2021-01-13 18:21:05 +00:00
Tony Wasserka 46eee40abc aco/ra: Conservatively refactor existing code to use PhysRegInterval
All expressions have been replaced by their closest equivalent. No major
simplification efforts have been made to minimize risk of regressions.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799>
2021-01-13 18:21:05 +00:00
Tony Wasserka 9bbd6162a9 aco/ra: Introduce PhysRegInterval helper class
This mainly clarifies the semantics of register bounds (inclusive vs
exclusive), and further groups related variables together to clarify
sliding-window-style loops.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799>
2021-01-13 18:21:05 +00:00
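The idea behind the PhysRegInterval series (explicit bounds, iteration, and intersection/containment helpers) can be modeled in a few lines. This is a sketch in Python rather than the C++ class itself; the name `RegInterval`, the half-open convention (lower bound inclusive, upper bound exclusive) and the method names are assumptions of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegInterval:
    lo: int    # first register in the interval (inclusive)
    size: int  # number of registers

    @property
    def hi(self) -> int:
        """One past the last register (exclusive upper bound)."""
        return self.lo + self.size

    def __iter__(self):
        # Allows `for reg in interval:` instead of manual index loops.
        return iter(range(self.lo, self.hi))

    def contains(self, other: "RegInterval") -> bool:
        return self.lo <= other.lo and other.hi <= self.hi

    def intersects(self, other: "RegInterval") -> bool:
        return self.lo < other.hi and other.lo < self.hi


if __name__ == "__main__":
    window = RegInterval(4, 3)
    print(list(window))
    print(RegInterval(0, 4).intersects(RegInterval(4, 4)))
```

Grouping the lower bound and size in one value keeps the inclusive/exclusive question in a single place, so adjacent intervals such as `[0, 4)` and `[4, 8)` are clearly disjoint.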
Tony Wasserka 67c1f32228 aco/ra: Update register use bounds before recursing in get_regs_for_copies
Delaying the call to adjust_max_used_regs until after get_regs_for_copies
returns puts the RA context into a state where registers past max_used_gpr
may be blocked. This isn't an issue on its own, but it adds a surprising
corner case to get_reg_simple that is easily avoided now.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799>
2021-01-13 18:21:05 +00:00
Daniel Schürmann 288032a873 aco: remove divergent branches which only jump over very few instructions
Totals from 18436 (13.23% of 139391) affected shaders (NAVI10):
CodeSize: 138428504 -> 138172588 (-0.18%)
Instrs: 26605127 -> 26541176 (-0.24%)
Cycles: 1624994088 -> 1622461620 (-0.16%)
VMEM: 3689892 -> 3689102 (-0.02%)
SMEM: 1131767 -> 1131761 (-0.00%)
Branches: 851796 -> 787852 (-7.51%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7814>
2021-01-13 18:04:28 +00:00
Daniel Schürmann 412291ddef aco: propagate swizzles when optimizing packed clamp & fma
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:56 +00:00
Daniel Schürmann 6ecbccfb23 aco: optimize v_pk_fma_f16 -> v_pk_fmac_f16 on GFX10
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:56 +00:00
Daniel Schürmann b03be30e07 aco: optimize packed fneg
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:56 +00:00
Daniel Schürmann e3790fc458 aco: optimize packed clamp
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:56 +00:00
Daniel Schürmann a9fd9187e8 aco: optimize packed mul+add to v_pk_fma_f16
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:56 +00:00
Daniel Schürmann 01134b0bfe aco: simplify multiply-add combining
When both operands of a v_sub (the same applies to v_add) are mul and one
already uses clamp/omod, pick the other operand to get a chance to
combine to a MAD.

No fossils-db changes.

Co-authored-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:56 +00:00
Daniel Schürmann fcd2ef23e5 radv: vectorize 16bit instructions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:56 +00:00
Daniel Schürmann 454bbf8f23 aco: emit packed 16bit instructions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:56 +00:00
Daniel Schürmann 5ad52ac906 aco: create helpers to emit vop3p instructions
Also make get_alu_src() capable of returning
unswizzled multi-component SGPR sources.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:56 +00:00
Daniel Schürmann 036a369f46 aco: change usesModifiers() considering opsel_hi on packed instructions
opsel_hi == 1 means that the high operand selects the
high bits of the input, which is the normal behavior.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:56 +00:00
Daniel Schürmann 178b33c870 aco: allow SGPRs on every src position for VOP3P
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:56 +00:00
Daniel Schürmann 0db4263a3a aco: allow constants/literals on every src position for VOP3P
and prevent literals on VOP3P pre-GFX10.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:56 +00:00
Daniel Schürmann 4a75a28698 aco/RA: fix subdword operands on VOP3P instructions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:55 +00:00
Daniel Schürmann 2caba08c1a aco: fix VOP3P assembly, VN and validation
aco/opcodes: rename v_pk_fma_mix* -> v_fma_mix*
and add modifier capabilities for VOP3P.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:55 +00:00
Samuel Pitoiset 3c1275ccae radv: enable DCC for MSAA on GFX10+
It should work fine now.

This gives +1-2% improvements with Control MSAA (2x and 4x)
on Sienna.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8413>
2021-01-13 17:24:31 +00:00
Bas Nieuwenhuizen 4a783a3c78 radv: Use L2 coherency on GFX9+.
Especially on GFX10 we can avoid pretty much all L2 flushes.

However, instead of that we have to do L2_METADATA invalidations. We
do that every time we could possibly be reading new DCC/HTILE info
from the L2 cache in shaders.

Benchmark results, basemark on high preset with a navi10 on profile_standard
(which is slower than a navi10 on default settings, please don't compare
 to random navi10 results you find)

before:
  5932
  5928
  5937

after:
  6011
  6013
  6009

So this looks like a >1% increase.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7202>
2021-01-13 16:27:19 +00:00
Bas Nieuwenhuizen 0af86341a2 radv: Use L2 for CP DMA on GFX9+.
This enables assuming that the L2 is always up to date for barriers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7202>
2021-01-13 16:27:19 +00:00
Bas Nieuwenhuizen 8f8d72af55 radv: Use access helpers for flushing with meta operations.
This way we're properly using the Vulkan barrier paradigm instead
of ad hoc guessing what caches need to be flushed. This is more robust
for cache policy changes as we now don't have to revisit all the meta
operations all the time.

Note that a barrier has both a src and dst part though. So

barrier:
   flush src
   meta op
   flush dst

becomes

barrier:
  flush barrier src
  flush meta op dst
  meta op
  flush meta op src
  flush barrier dst

And there are some places where we've been able to replace a CB flush
with a shader flush because that is what we'd need according to Vulkan rules
(and it turns out that in the cases where the CB flush mattered, the app will
set the bit in one of the relevant flushes, or it was needed as a result of an
optimization that we counteracted in the previous patch).

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7202>
2021-01-13 16:27:19 +00:00
Bas Nieuwenhuizen dba0a523a0 radv: Do dst invalidations for write accesses.
For write-after-write hazards.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7202>
2021-01-13 16:27:19 +00:00
Bas Nieuwenhuizen 9026f10cda radv: Invalidate CB on SHADER_WRITE for meta operations.
To cancel the optimization in radv_dst_access_flush if these helpers
get used by meta operations.

We could also remove that optimization but I think this triggers less
often as all SHADER_WRITE flushes on images not supporting STORAGE should
be meta

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7202>
2021-01-13 16:27:19 +00:00
Bas Nieuwenhuizen 3d7713b5a2 radv: Remove redundant WB_L2 flush.
INV_L2 already does that.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7202>
2021-01-13 16:27:19 +00:00
Samuel Pitoiset afad13700a radv: disable VK_EXT_sample_locations again on GFX10+
I attempted to enable it for 21.0 with only 2x and 4x supported,
but there are new failures if DCC+MSAA is enabled.

Disable it again because DCC is more important than this feature and
no Mesa releases have it on GFX10+.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8472>
2021-01-13 15:04:56 +00:00
Samuel Pitoiset 001c1105f1 radv: enable DCC for mipmaps on GFX10+
Seems to work fine.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8468>
2021-01-13 13:42:04 +00:00
Samuel Pitoiset 825e2386dc radv: do not enable DCC for 3D images with mipmaps on GFX10+
This is broken for some reason, and probably rare enough not to
care about for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8468>
2021-01-13 13:42:04 +00:00
Samuel Pitoiset 755a8313fc radv: add support for fast-clearing DCC levels on GFX10+
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8468>
2021-01-13 13:42:04 +00:00
Samuel Pitoiset 5537c9de73 radv: prevent fast-clearing uncompressed DCC levels
When size is 0, this means the level can't be compressed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8468>
2021-01-13 13:42:04 +00:00
Samuel Pitoiset a4876f055c ac/surface: store DCC mip info into the surface
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8468>
2021-01-13 13:42:04 +00:00
Rhys Perry 8301d483ff aco/tests: don't rely on argument evaluation order
The argument evaluation order is unspecified and affects the
order the instructions are inserted.
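
A minimal sketch of the general pattern, written in C rather than taken from
the ACO tests: emit_instr, make_operands and build_in_order are hypothetical
names. Because the order in which function arguments are evaluated is
unspecified, a call such as make_operands(emit_instr(), emit_instr()) may run
either emit_instr() first. Evaluating into named locals pins the order.

```c
/* Counter standing in for "the next instruction to be inserted". */
static int next_id;

/* Hypothetical side-effecting builder call: returns the index of the
 * instruction it just "inserted". */
static int emit_instr(void)
{
   return next_id++;
}

struct operands {
   int first;
   int second;
};

static struct operands make_operands(int first, int second)
{
   struct operands ops = {first, second};
   return ops;
}

/* Order-independent version: each side effect is sequenced by its own
 * statement before the call that consumes the results. */
static struct operands build_in_order(void)
{
   int first = emit_instr();
   int second = emit_instr();
   return make_operands(first, second);
}
```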

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3938
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7945>
2021-01-13 13:04:26 +00:00
Samuel Pitoiset fcd5925612 radv: skip fast-clear eliminate for CMASK based on a predicate
If we have CMASK, we can also skip FCE like we do for DCC.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8332>
2021-01-13 12:24:32 +01:00
Samuel Pitoiset 697c93abc1 radv: update the FCE predicate for fast clears using CMASK
Fast clearing with CMASK should always be eliminated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8332>
2021-01-13 12:24:30 +01:00
Samuel Pitoiset 051e2bfe80 radv: allocate and initialize the FCE predicate value for CMASK too
In case we don't have DCC, we can still predicate FCE with CMASK.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8332>
2021-01-13 12:24:29 +01:00
Samuel Pitoiset 735b808639 radv: only use predication if the FCE value is allocated
The FCE predicate value is only allocated if DCC is enabled.
We only want to use predication for DCC decompressions and for FCE
but not having FMASK doesn't mean the predicate is allocated.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4075
Fixes: 6e7008e94b ("radv: do not predicate FMASK decompression when DCC+MSAA is used")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8441>
2021-01-13 11:13:47 +00:00
Samuel Pitoiset dbe845624b radv: fix clearing DCC on GFX9
dcc_slice_size is in DWORD on GFX9... Also, layers aren't supported
because they might be interleaved. Fix this by clearing the entire
DCC buffer.

Fixes: 5e8f6967b1 ("radv: add support for fast-clearing DCC layers on GFX9+")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8443>
2021-01-13 08:33:40 +01:00
Bas Nieuwenhuizen 9a937330ef radeonsi: Only set modifier creation function for GFX9+ & with kernel support.
Fixes: c786150dfa ("radeonsi: Add modifier support.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3963
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8407>
2021-01-12 23:47:09 +00:00
Bas Nieuwenhuizen 4956f6d0bf radv: Add Android module info to linker script.
The Android Vulkan loader needs this symbol, so the addition of the
linker script broke Vulkan for Android.

(For non-Android builds: I checked that having a non-existent symbol in
 the linker script works ok and doesn't put the symbol in the library)

Fixes: 41bb6459d3 ("radv: restrict exported symbols with static llvm")
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8437>
2021-01-12 20:17:52 +00:00
Timur Kristóf 4ee6d68d1f aco: Wait for stores when NGG or legacy VS can finish early.
When there are no param exports in an NGG (or legacy VS) shader,
NO_PC_EXPORT=1 is set, which means PS waves can launch before
the current stage finishes.

If the current stage has any stores, we need to make sure to wait for
those before we allow PS waves to start, so that PS can read what
these instructions stored.

Fossil DB results on Navi 10:
Totals from 45 (0.03% of 136420) affected shaders:
CodeSize: 87224 -> 87404 (+0.21%)
Instrs: 16750 -> 16795 (+0.27%)
Cycles: 69580 -> 69760 (+0.26%)
VMEM: 8022 -> 8167 (+1.81%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7868>
2021-01-12 16:43:27 +00:00
Timur Kristóf 38da379b3e aco: Note if rasterization can start early.
When there are no param exports in an NGG (or legacy VS) shader,
NO_PC_EXPORT=1 is set by RADV, which means PS waves can launch
before the current stage finishes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7868>
2021-01-12 16:43:27 +00:00
Daniel Schürmann 00cf077c15 aco/ra: fix infinite recursion in get_reg_simple() with subdword registers
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Fixes: f8c7661eca ('aco: try to better align 8+ dword SGPR vectors')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8425>
2021-01-12 16:14:00 +00:00
Daniel Schürmann 7b669ff789 aco: simplify and fix operand/definition sizes
These are mainly needed for constant propagation
and subdword register allocation.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8425>
2021-01-12 16:14:00 +00:00
Daniel Schürmann d495a5c183 radv: enable .lower_ineg
We already emit ineg as isub most of the time.

The results are a bit mixed, but shouldn't really make a difference.
A couple of additional copies are needed as isub writes scc.

Totals from 5975 (4.29% of 139391) affected shaders:
CodeSize: 31508648 -> 31509264 (+0.00%); split: -0.00%, +0.00%
Instrs: 6073379 -> 6073531 (+0.00%); split: -0.00%, +0.00%
Cycles: 47186280 -> 47187116 (+0.00%); split: -0.00%, +0.00%
VMEM: 2528515 -> 2529139 (+0.02%); split: +0.03%, -0.01%
SMEM: 596842 -> 596924 (+0.01%); split: +0.02%, -0.00%
SClause: 280596 -> 280594 (-0.00%)
Copies: 288554 -> 288669 (+0.04%); split: -0.00%, +0.04%
PreSGPRs: 240390 -> 240397 (+0.00%)
PreVGPRs: 349630 -> 349749 (+0.03%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8425>
2021-01-12 16:14:00 +00:00
Daniel Schürmann e92bd57008 radv: don't lower_pack() after load-store-vectorization
Totals from 7 (0.01% of 139391) affected shaders:
CodeSize: 282900 -> 283324 (+0.15%); split: -0.01%, +0.16%
Instrs: 45287 -> 45338 (+0.11%); split: -0.01%, +0.12%
Cycles: 11496332 -> 11510396 (+0.12%); split: -0.00%, +0.12%
VMEM: 2355 -> 2335 (-0.85%)
Copies: 15506 -> 15561 (+0.35%)

A bit of noise in some parallel-rdp shaders.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8425>
2021-01-12 16:14:00 +00:00
Daniel Schürmann 987a0e6a67 radv: call nir_opt_algebraic_late() after lowering idiv for small bitsizes
This is needed because lower_idiv() introduces ineg again
which we'll remove next.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8425>
2021-01-12 16:14:00 +00:00
Daniel Schürmann 1ab9dd22a2 radv: optimize idiv_const for small bitsizes
No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8425>
2021-01-12 16:14:00 +00:00
Samuel Pitoiset 20af07d089 radv: fix color resolves if the dest image has DCC
Using the graphics resolve path when DCC is enabled should only be
a hint to avoid DCC fixup.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3388
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8326>
2021-01-12 16:03:36 +00:00
Samuel Pitoiset 3e781056b9 radv: fixup DCC after color resolves using the compute path
If the dest image has DCC it should be re-initialized to the
uncompressed state.

Note that the driver always selects the graphics path if the dest
image has DCC, so this has no effect for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8326>
2021-01-12 16:03:36 +00:00
Samuel Pitoiset 1f548b7670 radv: decompress DCC for partial resolves using the compute path
DCC is re-initialized to the uncompressed state after the
resolve, so if the app does a partial resolve it should be
decompressed first.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8326>
2021-01-12 16:03:36 +00:00
Samuel Pitoiset 095a428844 radv: set depth to 1 for subpass resolves using the compute path
To match Vulkan convention.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8326>
2021-01-12 16:03:36 +00:00
Rhys Perry 04e3d7ad93 aco: improve nir_op_vec with constant operands
Could still be improved a little. For example, 8-bit pack without
constants could be:
(s_pack_ll(x, z) & 0x00ff00ff) | ((s_pack_ll(y, w) & 0x00ff00ff) << 8)
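
The expression above can be checked on the CPU with a small C model. This is
a sketch only: s_pack_ll here is an emulation that assumes the semantics of
s_pack_ll_b32_b16 (low 16 bits of the first source in bits 15:0, low 16 bits
of the second in bits 31:16), and pack_4x8 is a hypothetical helper name.

```c
#include <stdint.h>

/* Emulates s_pack_ll_b32_b16: low halves of a and b packed into one dword. */
static uint32_t s_pack_ll(uint32_t a, uint32_t b)
{
   return (a & 0xffffu) | (b << 16);
}

/* Packs the low byte of x, y, z, w into bytes 0, 1, 2, 3 of the result. */
static uint32_t pack_4x8(uint32_t x, uint32_t y, uint32_t z, uint32_t w)
{
   /* x lands in byte 0 and z in byte 2; the mask drops their upper bits. */
   uint32_t even = s_pack_ll(x, z) & 0x00ff00ffu;
   /* y and w are packed the same way, then shifted into bytes 1 and 3. */
   uint32_t odd = s_pack_ll(y, w) & 0x00ff00ffu;
   return even | (odd << 8);
}
```

For example, pack_4x8(0x11, 0x22, 0x33, 0x44) yields 0x44332211.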

fossil-db (Sienna):
Totals from 136 (0.10% of 139391) affected shaders:
CodeSize: 279776 -> 278144 (-0.58%)
Instrs: 50742 -> 50470 (-0.54%)
Cycles: 211560 -> 210472 (-0.51%)
SMEM: 3607 -> 3557 (-1.39%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8421>
2021-01-12 15:50:54 +00:00
Rhys Perry 255ca7ecda radv: set invariantgeom for Shadow of the Tomb Raider
Work around flickering foliage on GFX10.3

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4064
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8104>
2021-01-12 15:11:49 +00:00
Rhys Perry f17de6a803 radv: add RADV_DEBUG=invariantgeom
This can be used to work around a common class of bugs appearing as
flickering.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8104>
2021-01-12 15:11:49 +00:00
Samuel Pitoiset c24d6916e6 aco: fix inserting expcnt for MIMG on GFX6
MIMG VDATA has moved to its own operand.

Fixes: 962c917cea ("aco: move MIMG VDATA to its own operand")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8435>
2021-01-12 11:32:12 +00:00
Daniel Schürmann bd8e84eb8d nir: replace .lower_sub with .has_fsub and .has_isub
This allows more fine-grained control over whether
a backend supports one of these instructions.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597>
2021-01-11 19:13:51 +00:00
Samuel Pitoiset d2524ed4a0 radv: mark VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT as unsupported on GFX6-7
This is only supported on GFX8+, this fixes a ton of CTS failures
on my Pitcairn (GFX6).

Fixes: af7fb4df50 ("radv: Add sparse image queries.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8415>
2021-01-11 17:17:42 +00:00
Rhys Perry 4ea0ce2f55 aco: remove can_reorder semantic in get_sync_info_with_hack
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8416>
2021-01-11 16:35:19 +00:00
Rhys Perry f8c7661eca aco: try to better align 8+ dword SGPR vectors
This doesn't have much of an effect, but it helps avoid a
pathological case for Assassin's Creed Valhalla and a RDR2 shader with a
future change.

fossil-db (Sienna):
Totals from 55074 (39.51% of 139391) affected shaders:
SGPRs: 3515076 -> 3567744 (+1.50%); split: -0.01%, +1.51%
CodeSize: 206942120 -> 206941868 (-0.00%); split: -0.00%, +0.00%
Instrs: 39625900 -> 39625837 (-0.00%); split: -0.00%, +0.00%
Cycles: 1640088780 -> 1640088828 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4070
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8416>
2021-01-11 16:35:19 +00:00
Samuel Pitoiset 7d44ba7217 radv: enable DCC for layered color images on GFX10+
There are still some CTS failures on GFX9.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8368>
2021-01-11 15:42:22 +00:00
Samuel Pitoiset 8754f9e8f9 radv: do not use predication when the range doesn't cover the whole image
The predication is based on the mip level, so if the image has layers
and DCC is enabled, it should only be used if the range of layers
covers the whole image.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8368>
2021-01-11 15:42:22 +00:00
Samuel Pitoiset 5420ab9cdf radv: clean up radv_decompress_dcc_compute()
Remove one old comment because it supports decompressing layers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8368>
2021-01-11 15:42:22 +00:00
Samuel Pitoiset 5e8f6967b1 radv: add support for fast-clearing DCC layers on GFX9+
Layers are contiguous in memory.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8368>
2021-01-11 15:42:22 +00:00
Samuel Pitoiset 7a3e6f5ac2 ac/surface: initialize dcc_slice_size on GFX9+
Will be used by RADV to implement DCC layers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8368>
2021-01-11 15:42:22 +00:00
Timur Kristóf b75d8052a7 aco: Spill more optimally before loops.
This further reduces the dead code emitted by the spiller.
Some minimal amount of dead IR is still emitted sometimes,
but that doesn't generate any compiled code at the end.

Totals from 1953 (1.40% of 139391) affected shaders:
VGPRs: 206980 -> 206588 (-0.19%)
SpillSGPRs: 24719 -> 16423 (-33.56%); split: -33.58%, +0.02%
CodeSize: 28448516 -> 28343836 (-0.37%); split: -0.38%, +0.01%
MaxWaves: 8960 -> 8992 (+0.36%)
Instrs: 5422049 -> 5408334 (-0.25%); split: -0.26%, +0.01%
Cycles: 511240864 -> 512460764 (+0.24%); split: -0.02%, +0.26%
VMEM: 346681 -> 346468 (-0.06%); split: +0.27%, -0.33%
SMEM: 124160 -> 122802 (-1.09%); split: +0.33%, -1.42%
VClause: 81102 -> 81163 (+0.08%); split: -0.01%, +0.09%
SClause: 174404 -> 174237 (-0.10%); split: -0.23%, +0.13%
Copies: 530216 -> 532961 (+0.52%); split: -0.90%, +1.42%
Branches: 189114 -> 189221 (+0.06%); split: -0.13%, +0.18%
PreSGPRs: 206017 -> 206526 (+0.25%); split: -0.08%, +0.33%
PreVGPRs: 183103 -> 182964 (-0.08%)

Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8026>
2021-01-11 12:25:29 +00:00
Timur Kristóf b03fbec4f1 aco: Keep live-through variables and constants spilled.
This noticeably reduces the amount of dead code emitted by our
spiller, when e.g. previously a constant was spilled then
rematerialized before a loop, but then spilled again inside the loop.

Fossil DB changes on Navi 10:
Totals from 263 (0.19% of 139391) affected shaders:
VGPRs: 30044 -> 30028 (-0.05%)
SpillSGPRs: 8800 -> 4948 (-43.77%)
CodeSize: 4496040 -> 4335448 (-3.57%); split: -3.57%, +0.00%
Instrs: 843942 -> 819219 (-2.93%); split: -2.93%, +0.00%
Cycles: 76485744 -> 73549080 (-3.84%); split: -4.04%, +0.20%
VMEM: 38204 -> 38147 (-0.15%); split: +0.08%, -0.23%
SMEM: 17872 -> 17959 (+0.49%)
SClause: 24298 -> 24012 (-1.18%)
Copies: 98023 -> 82960 (-15.37%); split: -15.38%, +0.01%
Branches: 29074 -> 27632 (-4.96%)
PreVGPRs: 25291 -> 25241 (-0.20%)

Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8026>
2021-01-11 12:25:29 +00:00
Bas Nieuwenhuizen 9f43b44bf0 radv: Enable sparse buffer and image support.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7953>
2021-01-11 12:01:34 +00:00
Bas Nieuwenhuizen af7fb4df50 radv: Add sparse image queries.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7953>
2021-01-11 12:01:34 +00:00
Bas Nieuwenhuizen 3ac8804829 radv: Add image sparse memory update implementation.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7953>
2021-01-11 12:01:34 +00:00
Bas Nieuwenhuizen e553ea51e8 radv: Create sparse images.
Disable all metadata for now.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7953>
2021-01-11 12:01:34 +00:00
Bas Nieuwenhuizen d3286bdd76 radv/winsys: Fix offset in range merging.
If we change the virtual address we also have to change the offset in the buffer
to be mapped.

Fixes: 715df30a4e "radv/amdgpu: Add winsys implementation of virtual buffers."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7953>
2021-01-11 12:01:34 +00:00
Bas Nieuwenhuizen 2b12e6931e radv/winsys: Fix inequality for sparse buffer remapping.
Found a case where we mapped a range too many.

Per the comment the constraint is:

	/* [first, last] is exactly the range of ranges that either overlap the
	 * new parent, or are adjacent to it. This corresponds to the bind ranges
	 * that may change.
	 */

So that means that after the ++last, ranges[last] should still
be adjacent. So we need to test the post-increment value to see whether
it is adjacent.

Failure case:
  ranges:
    0: 0 - ffff
    1: 10000 - 1ffff
    2: 20000 - 2ffff
    3: 30000 - 3ffff
  new range: 10000 - 1ffff

wrong first, last: 0,3
  However range 3 clearly isn't adjacent at all.
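
The failure case can be replayed with a small C sketch. This is not the winsys
code: struct range, find_first, find_last_pre and find_last_post are
hypothetical, and find_last_pre is only one loop shape that reproduces the
reported 0,3 result (it tests the element before the increment). Ranges are
stored as offset plus size, so "0 - ffff" becomes offset 0, size 0x10000.

```c
#include <stdint.h>

struct range {
   uint64_t offset;
   uint64_t size;
};

/* Skip ranges that end strictly before the new range starts. */
static unsigned find_first(const struct range *r, unsigned count,
                           uint64_t offset)
{
   unsigned first = 0;
   while (first + 1 < count && r[first].offset + r[first].size < offset)
      first++;
   return first;
}

/* Assumed pre-fix shape: tests r[last], i.e. the pre-increment element. */
static unsigned find_last_pre(const struct range *r, unsigned count,
                              unsigned first, uint64_t offset, uint64_t size)
{
   unsigned last = first;
   while (last + 1 < count && r[last].offset <= offset + size)
      last++;
   return last;
}

/* Post-fix shape: tests r[last + 1], the element `last` is about to become. */
static unsigned find_last_post(const struct range *r, unsigned count,
                               unsigned first, uint64_t offset, uint64_t size)
{
   unsigned last = first;
   while (last + 1 < count && r[last + 1].offset <= offset + size)
      last++;
   return last;
}
```

With the four ranges above and the new range 10000 - 1ffff, the pre-increment
test walks to index 3, while the post-increment test stops at index 2, the
last range that is actually adjacent.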

Fixes: 715df30a4e "radv/amdgpu: Add winsys implementation of virtual buffers."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7953>
2021-01-11 12:01:34 +00:00
Bas Nieuwenhuizen f56a28daa4 ac/surf: Use correct tilemodes on GFX8 for PRT.
Otherwise addrlib will assign the non-PRT tiling indices anyway ...

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7953>
2021-01-11 12:01:34 +00:00
Bas Nieuwenhuizen 50bafb85ec ac/surf: Add sparse texture info to radeon_surf.
For GFX9 I didn't reuse the existing mipmap offset/pitch because
last time we did that there was a revert request from Marek.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7953>
2021-01-11 12:01:34 +00:00
Bas Nieuwenhuizen cd5458f367 ac/surf: Implement PRT layout.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7953>
2021-01-11 12:01:34 +00:00
Bas Nieuwenhuizen dea1c06c9b ac/surf: Prepare for 64-bit flags.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7953>
2021-01-11 12:01:34 +00:00
Samuel Pitoiset 8914efb5b7 radv: only re-initialize HTILE after ds compute resolves if compressed
If the current layout isn't compressed we don't have to re-initialize
the HTILE metadata.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8389>
2021-01-11 11:27:05 +00:00
Samuel Pitoiset 1645d9ebab radv: re-initialize HTILE properly after depth/stencil compute resolves
This was added to work around some CTS failures which no longer happen.
Note that radv_clear_htile() will only clear the depth or stencil
bytes of the HTILE buffer based on the aspect.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8389>
2021-01-11 11:27:05 +00:00
Samuel Pitoiset 52b6adfbfb radv: disable TC-compat HTILE in GENERAL for Detroit: Become Human
The game has invalid usage of render loops and enabling TC-compat
HTILE in GENERAL introduces rendering issues.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3063
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8359>
2021-01-11 11:15:56 +00:00
Samuel Pitoiset 8f9b2afe70 radv: fix crashes when fast-clearing in a secondary command buffer
iview can be NULL inside a secondary command buffer.

Fixes: 00064713a3 ("radv: determine at creation if an image view can be fast cleared")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8408>
2021-01-11 11:07:09 +00:00
Pierre-Eric Pelloux-Prayer c4427c2b53 ac/rgp: add missing include
The build would fail without this include if -std=gnu17 is used.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4057
Fixes: ffdfe136e6 ("ac/sqtt: move rgp/sqtt def to ac")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8376>
2021-01-11 10:11:09 +00:00
Samuel Pitoiset 6e7008e94b radv: do not predicate FMASK decompression when DCC+MSAA is used
Even if the FCE predicate is FALSE, we might still need to decompress
FMASK if compressed rendering was used. FMASK decompressions should
never be predicated.

This fixes a ton of CTS failures and a rendering issue with Control
when DCC+MSAA is force-enabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8331>
2021-01-11 09:30:41 +00:00
Samuel Pitoiset 00064713a3 radv: determine at creation if an image view can be fast cleared
This can be determined once at creation instead of every time the
driver performs a clear, which probably saves a bunch of CPU cycles.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8370>
2021-01-11 08:31:11 +01:00
Mauro Rossi e7444bd3a6 android: ac/radv: fix typo in ac_rgp.h listed in Makefile.sources
Fixes the following build error:

error: external/mesa/src/amd/Android.mk: libmesa_amd_common: Unused source files: common/ac_rgph.

Fixes: 4ec5cf5318 ("ac/radv: move radv_rgp.c to ac")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8371>
2021-01-09 11:15:09 +01:00
Rhys Perry f01bca8100 radv/winsys: set has_packed_math_16bit in null winsys
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8392>
2021-01-08 16:16:19 +00:00
Simon Ser 7ef2046065 radv: only set BO metadata for the first plane
To properly support multi-planar images, we don't want to set metadata
on anything other than the first plane. To achieve this radv currently
checks for the image TILING and assumes LINEAR means it's not the first
plane.

However this doesn't account for images with a single LINEAR plane. We
still want to set metadata on those, e.g. to properly set the scanout
bit in the tiling flags.

Instead of checking for LINEAR, check if the offset is zero. Only the
first plane has a zero offset on AMD.

This mirrors the radeonsi logic [1].

While at it, move the metadata declaration into the if block.

[1]: 6fecdc6dda/src/gallium/drivers/radeonsi/si_texture.c (L710)

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8086>
2021-01-08 14:52:18 +00:00
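The plane check described in 7ef2046065 above can be illustrated with a small sketch. It is not the radv code: the `Plane` type and both helper functions are hypothetical names introduced here, and the old heuristic is modelled only from the wording of the commit message (LINEAR taken to mean "not the first plane").

```python
from dataclasses import dataclass


@dataclass
class Plane:
    """Hypothetical stand-in for one plane of an image (not a radv type)."""
    offset: int   # byte offset of the plane inside the buffer object
    linear: bool  # True if the plane uses LINEAR tiling


def should_set_metadata_old(plane: Plane) -> bool:
    # Old heuristic as described in the message: LINEAR means "not first plane".
    return not plane.linear


def should_set_metadata_new(plane: Plane) -> bool:
    # New heuristic: only the first plane starts at offset zero.
    return plane.offset == 0


# A single-plane LINEAR image (e.g. a scanout buffer) and a later plane.
single_linear = Plane(offset=0, linear=True)
later_plane = Plane(offset=4096, linear=True)
```

Under these assumptions the old check skips `single_linear` even though it is a first plane, while the offset check accepts it and still rejects `later_plane`.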
Rhys Perry d95fe8a25e radv: support SpvCapabilitySparseResidency
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775>
2021-01-08 14:27:07 +00:00
Rhys Perry 4c67423e99 radv: implement is_sparse_texels_resident and sparse_residency_code_and
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775>
2021-01-08 14:27:07 +00:00
Rhys Perry 6d5e26752c ac/nir: implement sparse image/texture loads
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775>
2021-01-08 14:27:07 +00:00
Rhys Perry 55aeac7af4 ac/nir: implement nir_op_vec5
Since sparse fetch/load uses vec5 destinations, it may be possible that we
encounter nir_op_vec5.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775>
2021-01-08 14:27:07 +00:00
Rhys Perry a502aa7b04 aco: form sparse load clauses
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775>
2021-01-08 14:27:07 +00:00
Rhys Perry 0bd14be962 aco: implement sparse image loads
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775>
2021-01-08 14:27:07 +00:00
Rhys Perry 382f50ad2c aco: implement sparse texture fetches
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775>
2021-01-08 14:27:07 +00:00
Rhys Perry 5a4f6313b1 aco: implement nir_op_vec5
Since sparse fetch/load uses vec5 destinations, it may be possible that we
encounter nir_op_vec5.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775>
2021-01-08 14:27:07 +00:00
Rhys Perry 962c917cea aco: move MIMG VDATA to its own operand
We will want both a VDATA operand and a sampler for some TFE/LWE MIMG
instructions.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775>
2021-01-08 14:27:07 +00:00
Rhys Perry 2aaf52bb85 aco: fix MIMG_instruction::lwe comment
The ISA docs were inconsistent about what this flag does, but that seems
fixed in the RDNA doc.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775>
2021-01-08 14:27:07 +00:00
Rhys Perry 816b7fb5cb aco: fix unreachable() for uniform 8/16-bit nir_op_mov from VGPR
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: d20a752c0d ("aco: use Builder::copy more")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8380>
2021-01-08 12:54:36 +00:00
Samuel Pitoiset f40a7d3c93 radv: fix performance regression by restoring TC-compat HTILE in GENERAL
This fixes a performance regression for games (eg. Youngblood) that
declare all images as concurrent. This is likely buggy for compute
queues but this just restores the previous behaviour for now.

Fixes: f4f096805b ("radv: fix TC-compat HTILE images with DST_OPTIMAL on the compute queue")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8351>
2021-01-08 09:22:32 +00:00
Samuel Pitoiset 0ae1cf46a6 radv: fix enabling TC-compat HTILE in GENERAL for writes on GFX10+
It wasn't supposed to also be enabled inside render loops.

Fixes: 4bb92d9145 ("radv: enable TC-compat HTILE in GENERAL on GFX10+")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8351>
2021-01-08 09:22:32 +00:00
Samuel Pitoiset 20683461e3 radv: configure the texture descriptor for TC-compat CMASK on GFX10+
This was missing. It can be enabled with RADV_PERFTEST=tccompatcmask.
Note that this feature is still experimental.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8350>
2021-01-08 08:21:17 +01:00
Samuel Pitoiset d2f4934121 radv/llvm,aco: always split typed vertex buffer loads on GFX6 and GFX10+
To avoid any alignment issues that trigger memory violations and
eventually a GPU hang. This can happen if the stride (static or dynamic)
is unaligned and also if the VBO offset is only scalar-aligned
(eg. stride is 8 and VBO offset is 2 for R16G16B16A16_SNORM).

The AMD Windows driver also always splits typed vertex fetches.

fossil-db (Sienna Cichlid):
Totals from 56508 (40.54% of 139391) affected shaders:
SGPRs: 2643545 -> 2664516 (+0.79%); split: -0.19%, +0.98%
VGPRs: 2007472 -> 1995408 (-0.60%); split: -0.74%, +0.13%
CodeSize: 70596372 -> 73913312 (+4.70%); split: -0.00%, +4.70%
MaxWaves: 772653 -> 774916 (+0.29%); split: +0.37%, -0.08%
Instrs: 14074162 -> 14567072 (+3.50%); split: -0.00%, +3.51%
Cycles: 69281276 -> 71253252 (+2.85%); split: -0.00%, +2.85%
VMEM: 22047039 -> 25554196 (+15.91%); split: +17.20%, -1.29%
SMEM: 4120370 -> 4360820 (+5.84%); split: +7.41%, -1.58%
VClause: 416913 -> 438361 (+5.14%); split: -1.86%, +7.01%
SClause: 536739 -> 542637 (+1.10%); split: -0.33%, +1.43%
Copies: 977194 -> 970015 (-0.73%); split: -2.43%, +1.69%
Branches: 241205 -> 241193 (-0.00%); split: -0.06%, +0.06%
PreVGPRs: 1505645 -> 1505379 (-0.02%)

This fixes GPU hangs with bin/draw-vertices from Piglit on GFX10+
with Zink.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8363>
2021-01-07 17:28:00 +00:00
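The fossil-db figures quoted in d2f4934121 above can be cross-checked with a few lines of arithmetic. This is only a sketch of how the overall percentages appear to be derived (relative change between the two totals); `percent_change` is a name made up here, and the "split" columns are not reproduced because they need per-shader data that the message does not include.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Share of affected shaders: 56508 out of 139391.
affected_share = 56508 / 139391 * 100.0

# Overall deltas for two of the reported metrics.
vmem_delta = percent_change(22047039, 25554196)
vgpr_delta = percent_change(2007472, 1995408)

print(f"affected: {affected_share:.2f}%")
print(f"VMEM: {vmem_delta:+.2f}%")
print(f"VGPRs: {vgpr_delta:+.2f}%")
```

Rounded to two decimals, these values agree with the 40.54%, +15.91% and -0.60% listed in the commit message.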
Samuel Pitoiset 68c2537062 aco: fix creating the dest vector when 16-bit vertex fetches are split
Compute the number of components of the destination vector from the
bitsize when eg. a 16-bit vec2 vertex fetch is split. This is
because the dst will be a v1, so the p_create_vector should be created
from two v2b for both sizes to match.

This prevents a regression from the next change which will split
typed vertex buffer loads on GFX6 and GFX10+.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8363>
2021-01-07 17:28:00 +00:00
Rhys Perry f5adf27fb9 nir,radv: add and use nir_vectorize_tess_levels()
fossil-db (Sienna):
Totals from 1342 (0.97% of 138791) affected shaders:
CodeSize: 3287996 -> 3269572 (-0.56%); split: -0.56%, +0.00%
Instrs: 629896 -> 628191 (-0.27%); split: -0.31%, +0.04%
Cycles: 2619244 -> 2612424 (-0.26%); split: -0.30%, +0.04%
VMEM: 388807 -> 389273 (+0.12%); split: +0.14%, -0.02%
SMEM: 90655 -> 90700 (+0.05%); split: +0.06%, -0.01%
VClause: 21831 -> 21812 (-0.09%)
PreVGPRs: 44155 -> 44058 (-0.22%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry bfc777f83e radv: vectorize shader I/O
Fixes code size regressions after enabling TCS/TES for ACO.

fossil-db (Sienna):
Totals from 2588 (1.86% of 138791) affected shaders:
SGPRs: 109950 -> 108480 (-1.34%); split: -1.43%, +0.09%
VGPRs: 107764 -> 112060 (+3.99%); split: -0.03%, +4.02%
CodeSize: 5957760 -> 5321656 (-10.68%)
MaxWaves: 31718 -> 30358 (-4.29%); split: +0.03%, -4.32%
Instrs: 1116300 -> 1029000 (-7.82%)
Cycles: 4600344 -> 4251072 (-7.59%)
VMEM: 980024 -> 812978 (-17.05%); split: +1.14%, -18.18%
SMEM: 275458 -> 258227 (-6.26%); split: +2.34%, -8.60%
VClause: 42925 -> 30533 (-28.87%); split: -31.02%, +2.15%
SClause: 31554 -> 31362 (-0.61%); split: -1.79%, +1.18%
Branches: 15689 -> 15697 (+0.05%)
PreVGPRs: 80399 -> 83953 (+4.42%); split: -0.00%, +4.42%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry f199b7188b nir/load_store_vectorize: add data as callback args
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry 00c8bec47b nir: add nir_load_store_vectorize_options
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry cacce76db9 radv: workaround games which assume full subgroups if cswave32 is enabled
This assumption becomes incorrect with RADV_PERFTEST=cswave32.

Games include Detroit: Become Human and Doom Eternal.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7918>
2021-01-07 15:01:02 +00:00
Rhys Perry 5bb94ab050 radv: implement CREATE_REQUIRE_FULL_SUBGROUPS_BIT with cswave32
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7918>
2021-01-07 15:01:02 +00:00
Michel Dänzer 1de2fd0cf2 wsi/x11: Always link against xcb-xrandr
The next commit will make use of it even without
VK_USE_PLATFORM_XLIB_XRANDR_EXT.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8197>
2021-01-07 14:57:45 +01:00
Pierre-Eric Pelloux-Prayer df5233b977 ac/sqtt: move radv_get_expected_buffer_size to ac
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:10:16 +01:00
Pierre-Eric Pelloux-Prayer ea6176e63e ac/sqtt: move ac_is_thread_trace_complete to ac
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:10:14 +01:00
Pierre-Eric Pelloux-Prayer ffdfe136e6 ac/sqtt: move rgp/sqtt def to ac
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:09:57 +01:00
Pierre-Eric Pelloux-Prayer 4ec5cf5318 ac/radv: move radv_rgp.c to ac
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:09:49 +01:00
Pierre-Eric Pelloux-Prayer bbc245ab2e ac/radv: move sqtt structs and helpers to amd/common
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:09:47 +01:00
Pierre-Eric Pelloux-Prayer 04f6ba113c ac/sqtt: add ac_thread_trace_data
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:09:45 +01:00
Rhys Perry 1fd8b46667 nir,spirv: add sparse image loads
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Samuel Pitoiset 4bb92d9145 radv: enable TC-compat HTILE in GENERAL on GFX10+
GFX10+ supports compressed writes to HTILE, so it should just work
to skip decompressions when transitioning from/to GENERAL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039>
2021-01-05 12:10:11 +00:00
Samuel Pitoiset 326c7312bf radv: only load the DS fast clear values for compressed rendering
Otherwise it's useless because we are unlikely to perform a
fast depth stencil clear.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039>
2021-01-05 12:10:11 +00:00
Samuel Pitoiset 76e33d528b radv: clean up radv_layout_is_htile_compressed()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039>
2021-01-05 12:10:11 +00:00
Samuel Pitoiset f4f096805b radv: fix TC-compat HTILE images with DST_OPTIMAL on the compute queue
This is probably rare but can happen if someone performs a depth-stencil
copy on the compute queue. This might work (untested by CTS) but it
looks more conservative to decompress before performing the operation.

Found by inspection.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039>
2021-01-05 12:10:11 +00:00
Samuel Pitoiset 1c539b6484 radv: add radv_htile_get_initial_value() and document the HTILE dword
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039>
2021-01-05 12:10:11 +00:00
Samuel Pitoiset 3038c88661 radv: fix potential HTILE issues for TC-compat images on GFX8
We can only use the entire HTILE buffer if TILE_STENCIL_DISABLE is
TRUE. On GFX8+, this is only true if the depth image has no stencil
and if it's not TC-compatible because of the ZRANGE_PRECISION issue.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039>
2021-01-05 12:10:11 +00:00
Samuel Pitoiset f7f6e9ad56 radv: always clear the SR0/SR1 bits of the HTILE buffer
To make sure the stencil compare state is properly initialized and
cleared when the driver performs a fast depth clear.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039>
2021-01-05 12:10:11 +00:00
Pierre-Eric Pelloux-Prayer d0767fc045 amd/addrlib: use cpp.has_argument() to filter compiler arguments
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7846>
2021-01-05 11:29:11 +00:00
Rhys Perry c5973ede01 ac/nir: use llvm.readcyclecounter for LLVM9+
Unlike llvm.amdgcn.s.memtime, this works on GFX10.3

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4033
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8306>
2021-01-05 10:27:00 +00:00
Samuel Pitoiset 831d9d406a radv: remove unused radv_image::aspects
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8324>
2021-01-05 09:46:01 +00:00
Samuel Pitoiset 58c68bac39 radv: fix clearing images with vkCmdClear{Color,DepthStencil}Image()
The image aspects field is actually never set and we should use the
range aspect anyways.

Fixes: 1a7b7b17ad ("radv: avoid oob read during clear")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8324>
2021-01-05 09:46:01 +00:00
Marek Olšák b94626d3ee ac,radeonsi: limit Smart Access Memory to Zen 3 and GFX10.3 due to perf issues
Many people experience performance degradation on some systems.
There will be a driconf option to enable SAM on other chips as well as
disable it on enabled systems.

Fixes: d3d6d38145 - ac: add radeon_info::all_vram_visible for Smart Access Memory
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3982

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8225>
2021-01-05 02:43:55 +00:00
Samuel Pitoiset 3ae1c6a4fb radv: disable A2 SNORM/SSCALED/SINT for texel buffers & images on all gens
AMDVLK and AMDGPU-PRO also don't support these formats for texel
buffers and images.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3386
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8315>
2021-01-04 17:19:41 +00:00
Rhys Perry b2d000513e aco: fix incorrect address calculation for load_barycentric_at_sample
Fix address calculation for indirect load_barycentric_at_sample on GFX6-8
with a uniform sample index.

A non-zero uniform sample index does not seem to be tested by CTS.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3966
Fixes: 93c8ebfa78 ("aco: Initial commit of independent AMD compiler")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8302>
2021-01-04 16:53:29 +00:00
Mike Blumenkrantz 1a7b7b17ad radv: avoid oob read during clear
When clearing a depth/stencil image, the data behind the passed color
value pointer is smaller than the VkClearValue struct.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8288>
2021-01-04 14:11:56 +00:00
Bas Nieuwenhuizen 3898f747ce radv: Use VRAM for the initial gfx cmdbuffer.
I don't expect this to make any real difference, but let's be consistent.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7979>
2021-01-04 13:10:16 +00:00
Bas Nieuwenhuizen b7cc5dc853 radv: Put commandbuffers in VRAM if all VRAM is CPU visible.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7979>
2021-01-04 13:10:15 +00:00
Bas Nieuwenhuizen f06e91d85a radv: Use VRAM for upload buffers if entire VRAM is CPU-visible.
Not doing this for APUs because spilling is quite likely, due to
overall VRAM pressure.

Also adding a flag to disable for performance debugging.

Finally, add some memsets in places where we depended on the memory
being initialized to zero, which we won't get with VRAM anymore.
(I think these places should stop depending on it since it hides
issues with executing the cmdbuffer multiple times, but this
preserves behavior.)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7979>
2021-01-04 13:10:15 +00:00
Samuel Pitoiset ef06f1bb03 radv: disable stippledBresenhamLines on GFX9
Some CTS tests fail on Vega10 but pass on Raven.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8242>
2020-12-31 09:17:21 +01:00
Tony Wasserka 9d59c84e31 aco/ra: Avoid redundant RegisterFile copies in get_reg_impl
Now that this function does not block RegisterFile entries anymore,
the temporary copy is only needed upon reaching the collect_vars call.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8261>
2020-12-30 17:36:33 +01:00
Samuel Pitoiset d90a102a01 radv: add a Python script to check if a VA was ever valid
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7891>
2020-12-30 08:40:21 +01:00
Samuel Pitoiset 6ed4332591 radv: dump VA ranges history when a GPU hang is detected
This is enabled only with RADV_DEBUG=hang. This adds a small

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3904
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7891>
2020-12-30 08:40:19 +01:00
Tony Wasserka 6b538506f2 aco/ra: Fix register allocation for subdword operands
ACO attempts to store the output of an instruction in the same register
occupied by its operands where possible. Importantly this only works if
the operands are large enough to store the result register size. The code
failed to consider subdword operands when checking for this, causing
entire register slots to be freed up even though subdword parts were still
used.

In Mafia 3, this affected the following code:
v2b: %363:v[2][0:16],  v2b: %362:v[2][16:32] = p_split_vector %360:v[2]
v1:  %116:v[2] = v_cvt_f32_f16 %362:v[2][16:32]
v1:  %117:v[2] = v_cvt_f32_f16 %363:v[2][0:16]
where v[2] is allocated to %116 even though its original lower 16 bits are
still used in the instruction after.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3717
Fixes: 031edbc4a5
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461>
2020-12-29 18:57:10 +00:00
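The condition described in 6b538506f2 above can be sketched as follows. This is a simplified model, not ACO's register allocator: a 32-bit VGPR is represented as a list of four byte slots holding the id of the temporary that lives there (or `None`), and `can_reuse_operand_register` is a hypothetical helper written for this illustration.

```python
from typing import List, Optional


def can_reuse_operand_register(reg_bytes: List[Optional[int]],
                               killed_operand: int,
                               result_bytes: int) -> bool:
    """Return True if the result may take over the operand's register.

    reg_bytes      -- owner of each byte of the register, None if free
    killed_operand -- id of the operand that dies at this instruction
    result_bytes   -- size of the result, placed at byte 0 of the register
    """
    # Free only the bytes that belong to the dying operand.
    after_kill = [None if owner == killed_operand else owner
                  for owner in reg_bytes]
    # Every byte the result needs must be free, not just the operand's part.
    return all(owner is None for owner in after_kill[:result_bytes])


# v[2] after the p_split_vector: %363 in bits 0..16, %362 in bits 16..32.
v2_before_first_cvt = [363, 363, 362, 362]
# State if %362 has died and only %363 is left in the register.
v2_before_second_cvt = [363, 363, None, None]
```

In this model the first `v_cvt_f32_f16` (which kills %362) must not place its 4-byte result in v[2] because %363 still occupies the lower half, whereas the second conversion, which kills %363, can.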
Tony Wasserka 187b185502 aco/ra: Add some documentation
This should make these somewhat tricky bits easier to follow.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461>
2020-12-29 18:57:10 +00:00
Tony Wasserka b841b4fde8 aco: Add tests for subdword register allocation
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461>
2020-12-29 18:57:10 +00:00
Tony Wasserka 6a246f5c6d aco/tests: Fix deadlock for too large test lists
The write() to the communication pipe shared with check_output.py would block
for large test output streams since the pipe's consumer wouldn't be launched
until the write already completed.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461>
2020-12-29 18:57:10 +00:00
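The ordering problem described in 6a246f5c6d above can be reproduced in miniature. The sketch below is not the aco test harness: it uses a thread as a stand-in for check_output.py, and `send_through_pipe` is a name invented here. It assumes a pipe with a bounded kernel buffer (typically 64 KiB on Linux), so a large write only completes if something is already reading.

```python
import os
import threading


def send_through_pipe(payload: bytes) -> bytes:
    """Write payload into a pipe while a consumer drains it concurrently."""
    read_fd, write_fd = os.pipe()
    received = []

    def consumer() -> None:
        # Reads until the write end is closed (EOF).
        with os.fdopen(read_fd, "rb") as reader:
            received.append(reader.read())

    # Start the consumer BEFORE writing. If it were started only after the
    # write returned, a payload larger than the pipe buffer would block the
    # writer forever, which is the deadlock the commit describes.
    worker = threading.Thread(target=consumer)
    worker.start()

    with os.fdopen(write_fd, "wb") as writer:
        writer.write(payload)  # closing the writer signals EOF to the reader

    worker.join()
    return received[0]


payload = b"x" * (1 << 20)  # 1 MiB, well above a typical pipe buffer
echoed = send_through_pipe(payload)
```

Swapping the order (write first, then start the consumer) would hang for this payload size on systems where the pipe buffer is smaller than 1 MiB.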
Tony Wasserka a240341ec9 aco/tests: Allow specifying the test subvariant in setup_cs
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461>
2020-12-29 18:57:10 +00:00
Tony Wasserka 05ca6758cb aco/tests: Fix GFX10_3 being printed as gfx11
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461>
2020-12-29 18:57:10 +00:00
Tony Wasserka d06abc263d aco/ra: Add policy parameter to select implementation details for testing
This new policy parameter allows disabling the optimistic path of get_reg
(i.e. get_reg_simple) to improve test coverage of the pessimistic path
provided by get_reg_impl.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461>
2020-12-29 18:57:10 +00:00
Samuel Pitoiset 9c176a7e63 Revert "radv: use 32-bit predication for skipping FCE on GFX10.3+"
This is actually wrong because we still assume 64-bit in a bunch
of places.

This reverts commit b24b3026cc.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8214>
2020-12-24 09:56:25 +01:00
Samuel Pitoiset 2d0c723ce6 radv: make sure FMASK compression is enabled for MSAA copies
Fixes dEQP-VK.api.copy_and_blit.*.4_bit. I think the MSAA2x and
MSAA8x just passed by luck.

Fixes: 7b21ce401f ("radv: disable FMASK compression when drawing with GENERAL layout")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7915>
2020-12-23 11:25:34 +00:00
Daniel Schürmann b1e12747b9 aco: create VMEM clauses slightly more aggressively
Totals from 3325 (2.39% of 139391) affected shaders (NAVI10):
SGPRs: 331528 -> 331056 (-0.14%); split: -0.14%, +0.00%
VGPRs: 306164 -> 337764 (+10.32%); split: -0.02%, +10.34%
CodeSize: 38843180 -> 38865388 (+0.06%); split: -0.04%, +0.10%
MaxWaves: 18908 -> 17028 (-9.94%); split: +0.01%, -9.95%
Instrs: 7423908 -> 7427934 (+0.05%); split: -0.06%, +0.12%
Cycles: 527411756 -> 526388408 (-0.19%); split: -0.21%, +0.02%
VMEM: 1148421 -> 992660 (-13.56%); split: +0.10%, -13.67%
SMEM: 227337 -> 232380 (+2.22%); split: +2.26%, -0.04%
VClause: 146416 -> 111171 (-24.07%); split: -24.10%, +0.03%
SClause: 243674 -> 243689 (+0.01%); split: -0.00%, +0.01%
Copies: 663496 -> 660333 (-0.48%); split: -0.85%, +0.37%
Branches: 223725 -> 223721 (-0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903>
2020-12-22 15:08:40 +01:00
Daniel Schürmann ac40301dbb aco: schedule position exports in the same pass as memory operations
No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903>
2020-12-22 15:08:40 +01:00
Daniel Schürmann 0287ebeef3 aco: fix def-use distance calculation when scheduling.
This change also increases the VMEM_MAX_MOVES
to mitigate some of the scheduling changes.

Totals from 34301 (24.61% of 139391) affected shaders:
SGPRs: 2515440 -> 2552304 (+1.47%); split: -1.25%, +2.71%
VGPRs: 1786676 -> 1794724 (+0.45%); split: -0.31%, +0.76%
CodeSize: 151079856 -> 151209828 (+0.09%); split: -0.06%, +0.15%
MaxWaves: 392454 -> 388966 (-0.89%); split: +0.39%, -1.28%
Instrs: 28870746 -> 28895907 (+0.09%); split: -0.09%, +0.17%
Cycles: 960450680 -> 961315796 (+0.09%); split: -0.09%, +0.18%
VMEM: 19027987 -> 19796223 (+4.04%); split: +7.49%, -3.45%
SMEM: 2434691 -> 2394829 (-1.64%); split: +2.80%, -4.43%
VClause: 551776 -> 543051 (-1.58%); split: -1.73%, +0.15%
SClause: 1230147 -> 1227637 (-0.20%); split: -1.40%, +1.20%
Copies: 1957640 -> 1963617 (+0.31%); split: -1.11%, +1.41%
Branches: 611747 -> 612504 (+0.12%); split: -0.11%, +0.23%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903>
2020-12-22 15:08:40 +01:00
Daniel Schürmann 3f14140f48 aco: allow to schedule SALU/SMEM through exec changes
Totals from 16794 (12.05% of 139391) affected shaders (NAVI10):
SGPRs: 757760 -> 762048 (+0.57%); split: -0.39%, +0.95%
VGPRs: 402844 -> 402744 (-0.02%); split: -0.04%, +0.02%
CodeSize: 22290900 -> 22285068 (-0.03%); split: -0.06%, +0.04%
MaxWaves: 294163 -> 294222 (+0.02%); split: +0.03%, -0.01%
Instrs: 4190074 -> 4188513 (-0.04%); split: -0.08%, +0.04%
Cycles: 40685028 -> 40678640 (-0.02%); split: -0.03%, +0.02%
VMEM: 7711867 -> 7704315 (-0.10%); split: +0.28%, -0.38%
SMEM: 942472 -> 1007052 (+6.85%); split: +7.15%, -0.30%
VClause: 92990 -> 92974 (-0.02%); split: -0.03%, +0.01%
SClause: 263700 -> 263810 (+0.04%); split: -0.38%, +0.42%
Copies: 277467 -> 276988 (-0.17%); split: -0.37%, +0.20%
Branches: 45899 -> 45896 (-0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903>
2020-12-22 15:08:40 +01:00
Daniel Schürmann 4a70c4d383 aco: make pred_by_exec_mask() accessible in other files
and rename to needs_exec_mask().

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903>
2020-12-22 15:08:40 +01:00
Daniel Schürmann 2116b4504e aco: don't emit parallelcopy when switching to WQM.
The parallelcopy was only needed because of an RA bug which was fixed a while ago.
This change fixes some register demand miscalculations.

Totals from 1013 (0.73% of 139391) affected shaders (NAVI10):
CodeSize: 6050408 -> 6047504 (-0.05%); split: -0.05%, +0.00%
Instrs: 1160533 -> 1159765 (-0.07%); split: -0.07%, +0.00%
Cycles: 8027212 -> 8024140 (-0.04%); split: -0.04%, +0.00%
VMEM: 296195 -> 296091 (-0.04%)
SMEM: 73003 -> 73011 (+0.01%); split: +0.05%, -0.04%
SClause: 37221 -> 37222 (+0.00%)
Copies: 70931 -> 70166 (-1.08%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903>
2020-12-22 15:08:40 +01:00
Samuel Pitoiset 4a4ea89a99 radv: add code that checks if the extension table is sorted correctly
Ported from ANV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8190>
2020-12-22 14:09:54 +01:00
Samuel Pitoiset e1d1e5b7bd radv: sort the extension table like Khronos
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8190>
2020-12-22 14:09:52 +01:00
Samuel Pitoiset 2d87e52b37 radv: enable VK_EXT_line_rasterization on GFX9
It was disabled because some CTS failed but they pass now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8189>
2020-12-22 09:25:48 +01:00
Bas Nieuwenhuizen 9339ed2f85 radv: Enable DCC in the GENERAL layout on GFX10+.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7004>
2020-12-21 18:32:24 +00:00
Bas Nieuwenhuizen 18ddd48e70 radv: Disable DCC explicitly for incompatible copies.
If we enable DCC for GENERAL we cannot set the layout to GENERAL to
disable DCC, so do it explicitly.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7004>
2020-12-21 18:32:24 +00:00
Bas Nieuwenhuizen f23eaf0db6 radv: Add option to disable DCC in renderpasses without layout.
If DCC is enabled for GENERAL then we cannot disable DCC by going
to the GENERAL layout.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7004>
2020-12-21 18:32:24 +00:00
Bas Nieuwenhuizen 88f392f6f8 radv: Never allow fast clears on DCC images that are not compressed.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7004>
2020-12-21 18:32:24 +00:00
Bas Nieuwenhuizen da36577558 radv: Don't skip layout transitions that only differ in render loop.
This can result in meaningful compression changes so we shouldn't skip.

Fixes: 66131ceb8b "radv: Pass through render loop detection to internal layout decisions."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7004>
2020-12-21 18:32:24 +00:00
Samuel Pitoiset 909e06075d radv: ignore the mutable bit for TC-compatible HTILE
All depth/stencil formats are incompatible with each other, so the
mutable bit and the image format list can be ignored.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8126>
2020-12-21 17:46:03 +00:00
Samuel Pitoiset 19e96d4566 radv: remove useless push constants data when resolving ds attachments
Depth/stencil resolves are only allowed inside a subpass, which means
the offset is always 0 and the draw/dispatch covers the whole image.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8127>
2020-12-18 17:58:54 +00:00
Samuel Pitoiset 30852b5b49 radv: fix maxFragmentShadingRateRasterizationSamples
It's not a bitfield. This limit is purely informational.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8100>
2020-12-18 14:15:28 +01:00
Samuel Pitoiset c9e1264ec7 radv: adjust the maximum number of coverage samples for VRS
It should actually be 4 because the maximum fragment size supported
by the hardware is 2x2.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8100>
2020-12-18 14:15:25 +01:00
Rhys Perry 271dd1837a ac/llvm: insert phis before demote kill
LLVM (like NIR) requires phi instructions to be before any other
instructions in the block. ac_branch_exited() can insert non-phi
instructions before visit_block() adds phis, so visit_block() should add
phi instructions before the non-phi instructions ac_branch_exited()
inserts.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Fixes: aa757f4f8c ("ac/llvm: fix demote inside conditional branches")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8054>
2020-12-18 09:56:43 +00:00
Samuel Pitoiset 81a6ee7a9b radv: enable TC-compat HTILE for D32_SFLOAT+MSAA on GFX10+
This was disabled due to some depth/stencil resolve CTS failures
which are now fixed.

I figured that disabling TC-compat HTILE for D32_SFLOAT+MSAA reduced
performance in Control by 11% on Vega10. In fact, the game only uses
D32_SFLOAT for depth rendering.

This gives a huge boost in Control on Navi10 (eg. +17% in MSAA4x).
Note that the game is still slower than PRO without MSAA on Navi10,
but as fast (or even a bit faster) on Vega10.

I think TC-compat HTILE could also be enabled for D32_SFLOAT_S8_UINT
but it needs more testing first.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8143>
2020-12-18 07:57:03 +01:00
Rhys Perry 661922f6ac aco: add block to worklist in mark_block_wqm()
Since we're requiring the branch condition to be in WQM, we have to ensure
that the block is in the worklist.

Fixes Trials Fusion hang at 4K and High settings.

fossil-db (Sienna):
Totals from 216 (0.15% of 139391) affected shaders:
SGPRs: 13392 -> 13360 (-0.24%)
CodeSize: 1321184 -> 1318592 (-0.20%)
Instrs: 255310 -> 254662 (-0.25%)
Cycles: 2178360 -> 2174652 (-0.17%)

Affected fossils in fossil-db are dirt4, nier and youngblood.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3863
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8145>
2020-12-17 19:22:55 +00:00
Samuel Pitoiset 7880faccc5 radv: add missing DB flush after depth/stencil resolve operations
I thought this was a bug in CTS but the Vulkan spec says:

    "VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT specifies write access
     to a color, resolve, or depth/stencil resolve attachment during
     a render pass or via certain subpass load and store operations."

So, VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT is used to synchronize
depth/stencil resolve attachments. Yes, it's counterintuitive.

This can't actually be fixed properly for now because RADV performs
the end subpass barrier *before* resolve attachments instead of after.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8138>
2020-12-17 15:19:57 +00:00
Daniel Schürmann b50d3e5760 aco/ra: fix phi operand renaming
In case one operand was renamed and another operand came
from an incomplete phi, it could happen that the original
name was not restored.

This has no impact on the code, but ensures correct SSA
is maintained during RA.

Cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8109>
2020-12-17 15:00:59 +00:00
Eric Anholt 6f52386544 amd: Fix leak in ac_surface_modifier_test.
Needed for meson test with asan enabled.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7936>
2020-12-15 19:39:29 +00:00
Tony Wasserka ada9be1ec9 radv,aco: Compile with -Wimplicit-fallthrough when available
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7847>
2020-12-15 18:22:46 +00:00
Tony Wasserka 6ba83d820c aco: Annotate switch fallthroughs
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7847>
2020-12-15 18:22:46 +00:00
Samuel Pitoiset 22790ef3d4 radv: add support for resolving layered depth/stencil images
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8025>
2020-12-15 18:04:39 +00:00
Rhys Perry 23488c3515 aco: allow divergent mbcnt_amd masks
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8085>
2020-12-14 20:35:21 +00:00
Rhys Perry feee375db9 aco: fix mbcnt_amd with wave32
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8085>
2020-12-14 20:35:21 +00:00
Daniel Schürmann ef4101d6d7 aco/spill: only prevent rematerializable vars from being DCE'd if they haven't been renamed
The small DCE of the spiller only removes the original instructions
of rematerialized variables in case they are unused. If a variable
has been renamed, it cannot match any original instruction anymore.
Thus, the lookup is then unnecessary and can be omitted.

No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8055>
2020-12-14 16:42:49 +00:00
Daniel Schürmann 0bccfd86f6 aco: fix DCE of rematerializable phi operands
Otherwise, if a phi gets spilled, the operand might be considered unused.

Fixes: d48d72e98a ('aco: Initial commit of independent AMD compiler')

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8055>
2020-12-14 16:42:49 +00:00
Samuel Pitoiset a791c1f3a7 radv: advertise VK_KHR_fragment_shading_rate on GFX10.3+
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7837>
2020-12-14 16:22:39 +00:00
Samuel Pitoiset 77343576eb aco: implement a workaround for gl_FragCoord.z with VRS on GFX10.3
Without it, FragCoord.z will have the value of one of the fine pixels
instead of the center of the coarse pixel.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7837>
2020-12-14 16:22:39 +00:00
Samuel Pitoiset 45524afe95 radv/llvm: implement a workaround for gl_FragCoord.z with VRS on GFX10.3
Without it, FragCoord.z will have the value of one of the fine pixels
instead of the center of the coarse pixel.

It's only enabled for RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7837>
2020-12-14 16:22:38 +00:00
Samuel Pitoiset 7a464f4296 radv: track if VRS is enabled to apply a workaround on GFX10.3
On some chips, gl_FragCoord.z has to be adjusted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7837>
2020-12-14 16:22:38 +00:00
Samuel Pitoiset c587eaadf6 aco: implement fragment shading rate
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7837>
2020-12-14 16:22:38 +00:00
Samuel Pitoiset 0bac0b7f19 radv/llvm: implement fragment shading rate
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7837>
2020-12-14 16:22:38 +00:00
Samuel Pitoiset bf69d89b5a radv: implement VK_KHR_fragment_shading_rate
Only supported on GFX10.3+. Attachment Fragment Shading Rate is
for later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7837>
2020-12-14 16:22:38 +00:00
Samuel Pitoiset d8c1931ca9 radv: add VK_KHR_fragment_shading_rate but leave it disabled
To declare new prototypes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7837>
2020-12-14 16:22:38 +00:00
Samuel Pitoiset 9770ffb07c amd/registers: add missing VRS registers
These register definitions are copied from AMDVLK because they
aren't even in the kernel.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7837>
2020-12-14 16:22:38 +00:00
Daniel Schürmann c4217ef2fc aco: don't create dead exec mask phis on merge blocks
Avoids some unnecessary exec copies and allows for a bit more
jump threading.

Totals from 112 (0.08% of 139391) affected shaders (NAVI10):
SpillSGPRs: 3084 -> 3050 (-1.10%)
CodeSize: 2657516 -> 2652376 (-0.19%)
Instrs: 492074 -> 490824 (-0.25%)
Cycles: 40369704 -> 40317052 (-0.13%)
VMEM: 24212 -> 24128 (-0.35%)
SClause: 12018 -> 12010 (-0.07%)
Copies: 72950 -> 72275 (-0.93%)
Branches: 13249 -> 12701 (-4.14%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8059>
2020-12-14 15:58:24 +00:00
Simon Ser ad19b0714a radv: fix access to uninitialized radeon_bo_metadata
If the image tiling is set to VK_IMAGE_TILING_LINEAR,
buffer_set_metadata will read an uninitialized radeon_bo_metadata.

Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: d5fd8cd46e ("radv: Allow non-dedicated linear images and buffer.")
Cc: mesa-stable
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7898>
2020-12-14 14:59:49 +00:00
Daniel Schürmann fd49ba59a3 aco/ra: use get_reg_specified() for p_extract_vector
On GFX6/7, it might violate validation rules, otherwise.

Fixes: 51f4b22fee ('aco: don't allow unaligned subdword accesses on GFX6/7')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8047>
2020-12-11 13:44:47 +00:00
Timur Kristóf 731f8fc9dd aco: Use program->num_waves as maximum in scheduler.
The scheduler doesn't take SGPR use into account, which can be
a limiting factor on older GPUs. This patch fixes a CTS test crash
on GFX6.

CC: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8040>
2020-12-11 13:32:26 +00:00
Timur Kristóf a9a8e05b69 aco: Skip TCS s_barrier when VS outputs are not stored in the LDS.
When VS outputs are known to be never stored in LDS, there is no
reason for HS waves to wait for all LS waves to complete. So, the
s_barrier between the LS and HS can be safely skipped.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7727>
2020-12-10 17:23:16 +00:00
Rob Clark 790144e65a util+treewide: container_of() cleanup
Replace mesa's slightly different container_of() with one more aligned
to the linux kernel's version which takes a type as the 2nd param.  This
avoids warnings like:

  freedreno_context.c:396:44: warning: variable 'batch' is uninitialized when used within its own initialization [-Wuninitialized]

At the same time, we can add additional build-time type-checking asserts.
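
As a rough illustration of the type-taking form described above, here is a
minimal sketch. CONTAINER_OF_SKETCH, check_member_pointer, list_node, batch
and batch_from_link are hypothetical names chosen for this example; this is
not Mesa's actual macro, only one way such a macro with a build-time type
check could look.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Build-time check: the pointer must point at the type of the named member.
template <typename MemberType, typename Ptr>
inline void check_member_pointer()
{
   using Pointee =
      std::remove_cv_t<std::remove_pointer_t<std::remove_reference_t<Ptr>>>;
   static_assert(std::is_same<std::remove_cv_t<MemberType>, Pointee>::value,
                 "pointer does not point at the named member's type");
}

// Kernel-style form: the second parameter is a type, not a variable, so the
// result can initialize a variable without referring to that variable.
#define CONTAINER_OF_SKETCH(ptr, type, member)                               \
   (check_member_pointer<decltype(type::member), decltype(ptr)>(),           \
    reinterpret_cast<type *>(reinterpret_cast<char *>(ptr) -                 \
                             offsetof(type, member)))

struct list_node {
   list_node *next;
};

struct batch {
   int id;
   list_node link; // embedded node; we recover the batch from its address
};

inline batch *batch_from_link(list_node *node)
{
   assert(node != nullptr);
   return CONTAINER_OF_SKETCH(node, batch, link);
}
```

Passing a pointer of the wrong type (for example an int * for the link
member) would fail to compile in this sketch instead of silently computing a
bogus address.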

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7941>
2020-12-10 16:48:36 +00:00
Marek Olšák 2b09bde1f5 radeonsi: use a C++ template to decrease draw_vbo overhead by 13 %
With GALLIUM_THREAD=0 to disable draw merging.

Before:
   1, DrawElements ( 1 VBO| 0 UBO|  0    ) w/ no state change,                 8736

After:
   1, DrawElements ( 1 VBO| 0 UBO|  0    ) w/ no state change,                10059
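
The relation between the two figures above and the 13 % in the subject can be
checked with a short sketch. It assumes the numbers are call rates (higher is
better), which the message does not state explicitly; overhead_reduction and
throughput_gain are hypothetical helper names.

```cpp
#include <cassert>

// Fractional drop in per-call cost when the call rate rises from rate_before
// to rate_after. Cost per call is proportional to 1 / rate.
inline double overhead_reduction(double rate_before, double rate_after)
{
   assert(rate_before > 0.0 && rate_after > 0.0);
   return 1.0 - rate_before / rate_after;
}

// Fractional increase in the call rate itself.
inline double throughput_gain(double rate_before, double rate_after)
{
   assert(rate_before > 0.0 && rate_after > 0.0);
   return rate_after / rate_before - 1.0;
}
```

Under that assumption, overhead_reduction(8736, 10059) is about 0.132, which
matches the 13 % in the subject, while the rate itself rises by about 15 %.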

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7807>
2020-12-09 16:01:32 -05:00
Marek Olšák fc212dcaa5 amd/llvm: fix C++ compile failures
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7807>
2020-12-09 16:01:21 -05:00
Marek Olšák 3d41712193 ac/llvm: handle no_(un)signed_wrap NIR flags
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7939>
2020-12-09 20:13:25 +00:00
Marek Olšák 3b67c6451f ac: unify shader arguments that are duplicated
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7939>
2020-12-09 20:13:25 +00:00
Marek Olšák 4a50096ab4 ac: add shader return values into ac_shader_args
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7939>
2020-12-09 20:13:24 +00:00
Marek Olšák 2cf44ad30a ac: correct ac_shader_args types, remove sgpr_count
sgpr_count is unused. The size of the others is too small.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7939>
2020-12-09 20:13:24 +00:00
Rhys Perry b08343c404 aco: rename s_subb_u32 operands to borrow
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8007>
2020-12-09 19:06:34 +00:00
Rhys Perry f4e649a205 aco: fix various s_subb_u32 operands to SCC
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8007>
2020-12-09 19:06:34 +00:00
Marek Olšák c5ae01dcf1 ac,radeonsi: implement GL_NV_compute_shader_derivatives
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6799>
2020-12-09 15:52:58 +00:00
Marek Olšák d3d6d38145 ac: add radeon_info::all_vram_visible for Smart Access Memory
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7951>
2020-12-09 10:33:46 -05:00
Jonathan Gray ebfb9e1817 aco: use UINT64_C on 64 bit constant arguments
avoids errors seen when building on OpenBSD/amd64

../src/amd/compiler/aco_instruction_selection.cpp:1677:62: error: ambiguous conversion for functional-style cast from 'unsigned long' to 'aco::Operand'
            bld.vop3(aco_opcode::v_mul_f64, Definition(dst), Operand(0x3FF0000000000000lu), tmp);
                                                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~
glibc uses unsigned long for uint64_t on LP64 archs and unsigned long long for
uint64_t on ILP32 archs.  On OpenBSD unsigned long long is used for uint64_t
on all archs.

The Operand constructors take uint8_t, uint16_t, uint32_t and uint64_t.
Use UINT64_C so that the lu or llu suffix is chosen as needed.
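
A minimal sketch of the overload problem follows. OperandSketch is a
hypothetical stand-in with one constructor per integer width, not the real
aco::Operand. With a hard-coded lu suffix on a platform where uint64_t is
unsigned long long, the argument matches none of the constructors exactly and
converts equally well to all four, hence the ambiguity. UINT64_C yields a
constant whose type matches uint64_t on the usual LP64 and ILP32 ABIs, so the
64-bit constructor is an exact match.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a class with one constructor per integer width.
struct OperandSketch {
   explicit OperandSketch(uint8_t v) : value(v), bits(8) {}
   explicit OperandSketch(uint16_t v) : value(v), bits(16) {}
   explicit OperandSketch(uint32_t v) : value(v), bits(32) {}
   explicit OperandSketch(uint64_t v) : value(v), bits(64) {}

   uint64_t value;
   unsigned bits;
};

// 0x3FF0000000000000 is the bit pattern of 1.0 as a 64-bit float.
// UINT64_C picks the literal suffix that matches uint64_t on this platform.
inline OperandSketch one_as_f64()
{
   return OperandSketch(UINT64_C(0x3FF0000000000000));
}

// Usage sketch: the 64-bit overload is the one selected.
inline void check_selected_overload()
{
   assert(one_as_f64().bits == 64);
}
```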

Fixes: df645fa369 ("aco: implement VK_KHR_shader_float_controls")
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7944>
2020-12-08 11:11:28 +00:00
Samuel Pitoiset 59b1578176 radv: disable alphaToOne feature
The feature was exposed but completely ignored by the driver. Other
AMD drivers don't expose it either, probably because it's complicated
to implement alpha-to-coverage properly. Let's disable it.

Cc: mesa-stable.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7966>
2020-12-08 10:42:27 +01:00
Mauro Rossi 2b1930a50a android: radv: add libcutils shared dependency
Fixes the following build error:

    FAILED: out/target/product/x86_64/obj/SHARED_LIBRARIES/vulkan.android-x86_intermediates/LINKED/vulkan.android-x86.so
    ...
    ld.lld: error: undefined symbol: property_get
    >>> referenced by os_misc.c:193 (external/mesa/src/util/os_misc.c:193)
    >>>               os_misc.o:(os_get_option) in archive out/target/product/x86_64/obj/STATIC_LIBRARIES/libmesa_util_intermediates/libmesa_util.a

Fixes: eeecc21d ("util: Add property_get() fallback for android")
Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7861>
2020-12-07 23:54:25 +01:00
Samuel Pitoiset ec3828add3 radv: fix clearing FMASK for layered MSAA images on GFX9+
If we always clear the whole FMASK buffer, layers can be corrupted.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3710
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7924>
2020-12-07 16:19:22 +00:00
Samuel Pitoiset 35964e9387 ac/surface: initialize the FMASK slice size for GFX9+
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7924>
2020-12-07 16:19:22 +00:00
Samuel Pitoiset c0319e4505 radv: advertise VK_EXT_sample_locations on GFX10+
Only MSAA2x and MSAA4x sample locations can be used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7850>
2020-12-07 15:45:49 +00:00
Samuel Pitoiset 3adf8121a0 radv: enable using MSAA2x and MSAA4x sample locations on GFX10+
These failures are really weird but MSAA2x and MSAA4x work fine.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7850>
2020-12-07 15:45:49 +00:00
Hans-Kristian Arntzen 86644b84b9 radv: Implement VK_VALVE_mutable_descriptor_type.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7967>
2020-12-07 15:25:17 +00:00
Samuel Pitoiset 562dd79bfa radv: fix using FS sample shading if the linker optimized inputs away
During NIR linking, constant varyings might be moved to the next
stage and the sample qualifier removed.

shader_info::uses_sample_shading remembers if the sample qualifier
was used before optimizations.

No fossil-db changes on Sienna Cichlid.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7892>
2020-12-07 11:42:17 +00:00
Samuel Pitoiset b24b3026cc radv: use 32-bit predication for skipping FCE on GFX10.3+
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7897>
2020-12-07 09:30:05 +00:00
Samuel Pitoiset 3494551d08 radv: set the predication boolean as 32-bit if necessary
CTS is missing tests.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7897>
2020-12-07 09:30:05 +00:00
Samuel Pitoiset fadcf13c8b radv: fix exporting multiviews with NGG
If a subpass uses multiview but the fragment shader doesn't load it,
we still have to export it.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7815>
2020-12-07 08:06:43 +00:00
Samuel Pitoiset 5cacb56041 radv: mark GFX10.3 as a non-conformant Vulkan implementation
In theory, GFX10.3 is not considered to be a conformant Vulkan
implementation because we didn't submit a conformance submission
package.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7913>
2020-12-07 08:26:47 +01:00
Rhys Perry c553084bf9 aco: remove rollback code when making an instruction vop3
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7656>
2020-12-04 20:27:32 +00:00
Rhys Perry 349908587f aco: move update_renames() out of get_reg()
This is necessary for the next commit, which will pass a temporary copy of
the register file to get_reg().

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7656>
2020-12-04 20:27:31 +00:00
Rhys Perry 8794f0348a aco: remove rollback code for blocked fixed definitions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7656>
2020-12-04 20:27:31 +00:00
Rhys Perry 6f7cb47ad8 aco: remove rollback code around parallelcopy creation
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7656>
2020-12-04 20:27:31 +00:00
Rhys Perry 9177fe8356 aco: simplify get_reg_impl()
Instead of copying the reg file as a backup, copy it so that we can remove
the rollback/undo code.
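
One way to read this is as the pattern sketched below: do all speculative
changes on a scratch copy and only adopt the copy on success, so a failed
attempt needs no undo code. RegFileSketch and place_at are hypothetical and
far simpler than ACO's real register file and get_reg_impl(); whether ACO
commits the copy back in this way is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct RegFileSketch {
   std::array<uint32_t, 4> regs; // 0 = free, otherwise a temporary's id
};

// Try to give temporary `id` the registers [start, start + size).
// Occupants of that window are moved to free registers outside it.
inline bool place_at(RegFileSketch &file, std::size_t start, std::size_t size,
                     uint32_t id)
{
   assert(id != 0 && start + size <= file.regs.size());

   RegFileSketch tmp = file; // every speculative change goes to the copy

   for (std::size_t i = start; i < start + size; i++) {
      if (tmp.regs[i] == 0)
         continue;
      bool moved = false;
      for (std::size_t j = 0; j < tmp.regs.size() && !moved; j++) {
         const bool in_window = j >= start && j < start + size;
         if (!in_window && tmp.regs[j] == 0) {
            tmp.regs[j] = tmp.regs[i];
            tmp.regs[i] = 0;
            moved = true;
         }
      }
      if (!moved)
         return false; // the copy is dropped; `file` was never modified
   }

   for (std::size_t i = start; i < start + size; i++)
      tmp.regs[i] = id;

   file = tmp; // adopt the result only on success
   return true;
}
```

The cost is one copy of the register file per attempt; in exchange, the
failure path is a plain return.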

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7656>
2020-12-04 20:27:31 +00:00
Rhys Perry 5c9d2ed78d aco: use clear() helper instead of writing reg file directly
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7656>
2020-12-04 20:27:31 +00:00
Rhys Perry d671cf7f53 aco: repeat get_reg_create_vector() with increased register demand if fail
We don't need rollback/undo code here because get_reg_create_vector() now
creates a temporary copy of the register file.

Works around RA failure with a bunch of dEQP-VK.robustness.robustness2.*

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3566
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7656>
2020-12-04 20:27:31 +00:00
Rhys Perry ebd8ab1757 aco: remove rollback code in get_reg_create_vector()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7656>
2020-12-04 20:27:31 +00:00
Rhys Perry ad26eae544 aco: don't fill killed operands in update_renames()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7656>
2020-12-04 20:27:31 +00:00
Rhys Perry 67860b99ce aco: clear operands in update_renames()
In the future, they might not have already been cleared.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7656>
2020-12-04 20:27:31 +00:00
Rhys Perry 7610630124 aco: coalesce constant copies
fossil-db (Navi):
Totals from 20108 (14.49% of 138791) affected shaders:
CodeSize: 117835376 -> 117830512 (-0.00%)
Instrs: 22813722 -> 22733245 (-0.35%)
Cycles: 1009135584 -> 1008543628 (-0.06%)
VMEM: 5401668 -> 5391247 (-0.19%)
SMEM: 1286824 -> 1283663 (-0.25%)
Copies: 1742154 -> 1661686 (-4.62%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7798>
2020-12-04 14:44:49 +00:00
Rhys Perry f53d4e5f60 aco: use v_lshrrev_b64 for 64-bit VGPR copies on GFX10+
This isn't worth it on GFX9-, but the proprietary compiler uses it on
GFX10.

fossil-db (Navi):
Totals from 23825 (17.17% of 138791) affected shaders:
CodeSize: 130623632 -> 130623800 (+0.00%); split: -0.00%, +0.00%
Instrs: 25185559 -> 25108597 (-0.31%)
Cycles: 709864740 -> 708910860 (-0.13%)
VMEM: 7205343 -> 7168839 (-0.51%); split: +0.00%, -0.51%
SMEM: 1584946 -> 1575183 (-0.62%)
Copies: 2043134 -> 1966230 (-3.76%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7798>
2020-12-04 14:44:48 +00:00
Rhys Perry 8c02a8e2d2 aco: add get_const/is_constant_representable helpers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7798>
2020-12-04 14:44:48 +00:00
Rhys Perry b10de4c1d8 aco: allow 64-bit literals if they can be sign/zero-extended from 32-bit
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7798>
2020-12-04 14:44:48 +00:00
Rhys Perry 24ee0f55f2 aco: remove sign-extension in constantValue64()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7798>
2020-12-04 14:44:48 +00:00
Rhys Perry 8451911156 aco: test self-intersecting copies when src=higher
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7798>
2020-12-04 14:44:48 +00:00
Rhys Perry 2c40846ab6 aco: don't assume src=lower when splitting self-intersecting copies
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 09c584caeb ("aco: split self-intersecting copies instead of swapping")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7798>
2020-12-04 14:44:48 +00:00
Samuel Pitoiset 055aff2613 radv: reduce maxTransformFeedbackBufferDataSize to 512
DRAW_OPAQUE_VERTEX_STRIDE only has 9 bits, so the register can
represent 511 bytes at most.
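
The 511 figure is just the largest value of a 9-bit unsigned field. A short
sketch of that arithmetic, with max_field_value and pack_stride as
hypothetical helpers that are not part of RADV:

```cpp
#include <cassert>
#include <cstdint>

// Largest value an unsigned bit field of the given width can hold.
constexpr uint32_t max_field_value(unsigned bits)
{
   return bits >= 32 ? UINT32_MAX : (UINT32_C(1) << bits) - 1;
}

static_assert(max_field_value(9) == 511, "a 9-bit field tops out at 511");

// Hypothetical helper: reject strides that would not fit the 9-bit field.
inline uint32_t pack_stride(uint32_t stride_bytes)
{
   assert(stride_bytes <= max_field_value(9));
   return stride_bytes;
}
```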

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7900>
2020-12-04 07:52:23 +01:00
Vinson Lee cf7bf7fade amd/addrlib: Initialize Lib members in constructors.
Fix defects reported by Coverity Scan.

uninit_member: Non-static class member m_maxBaseAlign is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member m_maxMetaBaseAlign is not initialized in this constructor nor in any functions that it calls.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7768>
2020-12-03 23:02:17 +00:00
Bas Nieuwenhuizen 9a3aaffeb8 radv: Don't invalidate the SCACHE for image barriers.
Even ACO never uses the constant cache for images.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7875>
2020-12-03 22:21:06 +00:00