KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Marek Olšák	27104ff647	radeonsi/gfx11: use the new TCS WaveID SGPR to compute vs_rel_patch_id Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16328>	2022-05-10 04:29:54 +00:00
Marek Olšák	273674dde1	ac/surface: add gfx11 support to modifiers tests Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16328>	2022-05-10 04:29:54 +00:00
Marek Olšák	3e85a0c90b	ac/surface: define gfx11 modifiers Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16328>	2022-05-10 04:29:54 +00:00
Marek Olšák	85c76518c9	ac/surface: gfx11 changes Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16328>	2022-05-10 04:29:54 +00:00
Marek Olšák	a419b53d12	ac/gpu_info: set cu_mask correctly for gfx11 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16328>	2022-05-10 04:29:54 +00:00
Marek Olšák	f24f8665db	ac: implement register shadowing for gfx11 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16328>	2022-05-10 04:29:54 +00:00
Marek Olšák	3a669558f2	ac: scratch buffer register changes for gfx11 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16328>	2022-05-10 04:29:54 +00:00
Marek Olšák	783b16b3c8	ac: implement ac_get_tbuffer_format for gfx11 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16328>	2022-05-10 04:29:54 +00:00
Pierre-Eric Pelloux-Prayer	9480ad2b1c	amd: update gfx10_format_table.py for gfx11 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16328>	2022-05-10 04:29:54 +00:00
Marek Olšák	931098d44d	ac: don't align VGPRs to 8 or 16 for gfx11 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16328>	2022-05-10 04:29:54 +00:00
Marek Olšák	980b7f75e8	amd: enable gfx11 in header generator, fix drivers with renamed gfx6-10 defs Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16328>	2022-05-10 04:29:54 +00:00
Rhys Perry	28da4359a3	ac/nir: skip s_barrier if TCS patches are within subgroup fossil-db (Sienna Cichlid): Totals from 538 (0.33% of 162293) affected shaders: Instrs: 125288 -> 123682 (-1.28%) CodeSize: 712384 -> 705960 (-0.90%) Latency: 632139 -> 623596 (-1.35%) InvThroughput: 218491 -> 215600 (-1.32%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16356>	2022-05-09 16:30:27 +00:00
Samuel Pitoiset	4f9ae10296	ac,radeonsi: add has_sqtt_auto_flush_mode_bug Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16303>	2022-05-04 16:13:49 +00:00
Marek Olšák	88f22f188e	ac,radeonsi: report SCALED formats as unsupported by samplers and color buffers This was never exercised and it doesn't work. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16215>	2022-05-03 11:11:08 -04:00
Marek Olšák	65c7b5ec20	ac: support GR channel order in ac_choose_spi_color_formats Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16215>	2022-05-03 11:11:08 -04:00
Marek Olšák	c7ec284024	ac: remove really_needs_scratch, parameter from ac_parse_shader_binary_config it's always true Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16215>	2022-05-03 11:11:08 -04:00
Marek Olšák	4b93dd215f	ac/gpu_info: rework how num_se is derived Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16215>	2022-05-03 11:11:08 -04:00
Konstantin Seurer	d639608b8b	ac/nir: Do not set cursor in try_extract_additions Fixes: `61ac5ac` ("radv,ac/nir: lower global access to _amd global access intrinsics") Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16270>	2022-05-03 09:23:49 +00:00
Marek Olšák	3d5ba0e1b7	ac/gpu_info: remove old and unused fields from radeon_info Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15966>	2022-04-23 01:45:17 +00:00
Marek Olšák	1bf39b1f9d	ac,radeonsi: rework how scratch_waves is used and move it to ac_gpu_info.c The addition of the "compute" parameter is for a future change. Reviewed-by: Mihai Preda <mhpreda@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15966>	2022-04-23 01:45:17 +00:00
Marek Olšák	f719085007	ac: add more non-shadowed registers to the lists Reviewed-by: Mihai Preda <mhpreda@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15966>	2022-04-23 01:45:17 +00:00
Marek Olšák	c16239d464	ac/surface/tests: generalize and extend gfx10 tests Reviewed-by: Mihai Preda <mhpreda@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15966>	2022-04-23 01:45:17 +00:00
Marek Olšák	dda718d2bf	amd: document chips Reviewed-by: Mihai Preda <mhpreda@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15966>	2022-04-23 01:45:17 +00:00
Marek Olšák	11c28d9798	ac: add ac_nir_optimize_outputs, a NIR version of ac_optimize_vs_outputs ac_optimize_vs_outputs is an LLVM IR pass, and it will be replaced by this. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14414>	2022-04-22 22:21:11 +00:00
Marek Olšák	91bc463a51	radeonsi: add an SQTT workaround for chips with disabled RBs Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15965>	2022-04-22 20:52:26 +00:00
Marek Olšák	c4ca059dee	ac/surface: fix an addrlib race condition on gfx9 Addrlib calls GetMetaEquation, which generates and saves address equations in a global table that is not thread safe. Fixes: `df2cbdd2e3` - amd/addrlib: expose DCC address equations to drivers Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6361 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16091>	2022-04-22 19:12:03 +00:00
Pierre-Eric Pelloux-Prayer	fcc499d5e1	ac/surface: adjust gfx9.pitch[*] based on surf->blk_w This is the same as `8275dc1ed5`, but since gfx9.pitch[...] is used for linear surfaces since `86262b6eac` we need to update it as well. Fixes: `86262b6eac` ("radeonsi,radv: fix usages of surf_pitch") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16053>	2022-04-22 17:21:47 +00:00
Pierre-Eric Pelloux-Prayer	ca40bad84a	ac/spm: setup write broadcasting correctly Based on PAL's PerfExperiment::BuildGrbmGfxIndex method. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15646>	2022-04-22 13:51:44 +02:00
Jason Ekstrand	1b8a43a0ba	util: Remove util_cpu_detect util_cpu_detect is an anti-pattern: it relies on callers high up in the call chain initializing a local implementation detail. As a real example, I added: ...a Mali compiler unit test ...that called bi_imm_f16() to construct an FP16 immediate ...that calls _mesa_float_to_half internally ...that calls util_get_cpu_caps internally, but only on x86_64! ...that relies on util_cpu_detect having been called before. As a consequence, this unit test: ...crashes on x86_64 with USE_X86_64_ASM set ...passes on every other architecture ...works on my local arm64 workstation and on my test board ...failed CI which runs on x86_64 ...needed to have a random util_cpu_detect() call sprinkled in. This is a bad design decision. It pollutes the tree with magic, it causes mysterious CI failures especially for non-x86_64 developers, and it is not justified by a micro-optimization. Instead, let's call util_cpu_detect directly from util_get_cpu_caps, avoiding the footgun where it fails to be called. This cleans up Mesa's design, simplifies the tree, and avoids a class of (possibly platform-specific) failures. To mitigate the added overhead, wrap it all in a (fast) atomic load check and declare the whole thing as ATTRIBUTE_CONST so the compiler will CSE calls to util_cpu_detect. Co-authored-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15580>	2022-04-20 18:44:35 +00:00
Rhys Perry	ab1409010a	ac/nir: fix 64-bit NGG GS output stores I don't know why this was here. The DIV_ROUND_UP ensures that it's always at least 1 and the MIN2 ensures that it's never greater than 1. Fixes some KHR-Single-GL46.enhanced_layouts.varying_* tests with zink: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6301 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15863>	2022-04-16 11:19:11 +00:00
Rhys Perry	8fe8c5dfd0	ac/nir: properly handle large global access constant offsets Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Fixes: `61ac5acca3` ("radv,ac/nir: lower global access to _amd global access intrinsics") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6321 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15951>	2022-04-15 10:39:40 +00:00
Rhys Perry	61ac5acca3	radv,ac/nir: lower global access to _amd global access intrinsics fossil-db (Sienna Cichlid): Totals from 400 (0.30% of 134621) affected shaders: VGPRs: 18696 -> 18688 (-0.04%) CodeSize: 2031348 -> 1946640 (-4.17%) Instrs: 374703 -> 360226 (-3.86%) Latency: 4200727 -> 4108628 (-2.19%); split: -2.20%, +0.01% InvThroughput: 1059935 -> 1029441 (-2.88%); split: -2.88%, +0.00% VClause: 5777 -> 5771 (-0.10%) SClause: 11890 -> 10891 (-8.40%); split: -8.57%, +0.17% Copies: 34035 -> 33259 (-2.28%); split: -2.98%, +0.70% Branches: 11108 -> 11100 (-0.07%); split: -0.08%, +0.01% PreSGPRs: 15999 -> 15942 (-0.36%); split: -0.44%, +0.08% PreVGPRs: 16994 -> 16970 (-0.14%) fossil-db (Polaris10): Totals from 400 (0.29% of 135668) affected shaders: SGPRs: 23799 -> 22919 (-3.70%); split: -4.30%, +0.61% VGPRs: 18480 -> 18472 (-0.04%) CodeSize: 2090316 -> 2041592 (-2.33%) Instrs: 395461 -> 385747 (-2.46%); split: -2.46%, +0.00% Latency: 5045768 -> 5020196 (-0.51%); split: -0.53%, +0.02% InvThroughput: 2694320 -> 2689886 (-0.16%); split: -0.23%, +0.07% VClause: 5982 -> 5968 (-0.23%) SClause: 12064 -> 10823 (-10.29%); split: -10.33%, +0.04% Copies: 48233 -> 48322 (+0.18%); split: -0.47%, +0.65% PreSGPRs: 16409 -> 16358 (-0.31%); split: -0.39%, +0.08% fossil-db (Pitcairn): Totals from 400 (0.29% of 135668) affected shaders: SGPRs: 22431 -> 22215 (-0.96%); split: -2.60%, +1.64% VGPRs: 18776 -> 18560 (-1.15%); split: -1.21%, +0.06% CodeSize: 2104440 -> 2017708 (-4.12%) MaxWaves: 2363 -> 2367 (+0.17%) Instrs: 413099 -> 397446 (-3.79%) Latency: 5507707 -> 5450251 (-1.04%); split: -1.12%, +0.07% InvThroughput: 2838867 -> 2786903 (-1.83%); split: -1.83%, +0.00% VClause: 10334 -> 10097 (-2.29%) SClause: 12346 -> 11005 (-10.86%); split: -10.89%, +0.02% Copies: 54034 -> 52065 (-3.64%); split: -3.99%, +0.35% PreSGPRs: 17916 -> 17857 (-0.33%); split: -0.40%, +0.07% PreVGPRs: 16917 -> 16893 (-0.14%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124>	2022-04-13 16:23:35 +00:00
Indrajit Kumar Das	3abc66dc9f	ac/gpu_info: disallow displayable DCC for Navi12 and Navi14 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15813>	2022-04-12 23:52:24 +00:00
Rhys Perry	f6262804af	radv: increase inline push constant limit if we can inline all constants fossil-db (Sienna Cichlid): Totals from 665 (0.49% of 134627) affected shaders: CodeSize: 4519620 -> 4491724 (-0.62%); split: -0.62%, +0.01% Instrs: 842745 -> 837313 (-0.64%); split: -0.66%, +0.01% Latency: 7289925 -> 7279661 (-0.14%); split: -0.30%, +0.16% InvThroughput: 1240770 -> 1240639 (-0.01%); split: -0.01%, +0.00% VClause: 15799 -> 15772 (-0.17%) SClause: 33773 -> 32604 (-3.46%); split: -3.66%, +0.20% Copies: 67695 -> 64992 (-3.99%); split: -4.49%, +0.50% PreSGPRs: 38597 -> 38640 (+0.11%); split: -0.14%, +0.25% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12145>	2022-04-12 11:44:30 +00:00
Rhys Perry	7f6262bb85	radv: allow holes in inline push constants Use a dword mask instead of a range to track which push constants to inline. fossil-db (Sienna Cichlid): Totals from 5724 (4.25% of 134621) affected shaders: CodeSize: 20894044 -> 20815748 (-0.37%); split: -0.39%, +0.02% Instrs: 4002568 -> 3988385 (-0.35%); split: -0.38%, +0.02% Latency: 29285060 -> 29224414 (-0.21%); split: -0.22%, +0.01% InvThroughput: 5529700 -> 5526893 (-0.05%); split: -0.05%, +0.00% VClause: 78093 -> 78240 (+0.19%); split: -0.23%, +0.41% SClause: 135495 -> 131027 (-3.30%); split: -3.30%, +0.00% Copies: 330856 -> 324552 (-1.91%); split: -2.37%, +0.46% PreSGPRs: 226031 -> 224778 (-0.55%); split: -0.61%, +0.05% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12145>	2022-04-12 11:44:30 +00:00
Boris Brezillon	2daae1fab4	amd: Fix ac_gpu_info.c compilation on windows Fixes: `75a783ea73` ("ac: Query the amdgpu MEC firmware version.") Acked-by: Daniel Stone <daniels@collabora.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15518>	2022-03-24 09:11:13 +00:00
Rhys Perry	15640e58d9	radv,aco: lower texture descriptor loads in NIR fossil-db (Sienna Cichlid): Totals from 39445 (24.30% of 162293) affected shaders: MaxWaves: 875988 -> 875972 (-0.00%) Instrs: 35372561 -> 35234909 (-0.39%); split: -0.41%, +0.03% CodeSize: 190237480 -> 189379240 (-0.45%); split: -0.47%, +0.02% VGPRs: 1889856 -> 1889928 (+0.00%); split: -0.00%, +0.01% SpillSGPRs: 10764 -> 10857 (+0.86%); split: -2.04%, +2.91% SpillVGPRs: 1891 -> 1907 (+0.85%); split: -0.32%, +1.16% Scratch: 260096 -> 261120 (+0.39%) Latency: 477701150 -> 477578466 (-0.03%); split: -0.06%, +0.03% InvThroughput: 87819847 -> 87830346 (+0.01%); split: -0.03%, +0.04% VClause: 673353 -> 673829 (+0.07%); split: -0.04%, +0.11% SClause: 1385396 -> 1366478 (-1.37%); split: -1.65%, +0.29% Copies: 2327965 -> 2229134 (-4.25%); split: -4.58%, +0.34% Branches: 906707 -> 906434 (-0.03%); split: -0.13%, +0.10% PreSGPRs: 1874153 -> 1862698 (-0.61%); split: -1.34%, +0.73% PreVGPRs: 1691382 -> 1691383 (+0.00%); split: -0.00%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12773>	2022-03-22 16:33:27 +00:00
Mihai Preda	ff2b2bc568	amd/ac_gpu_info: fix warning on fread unused result fixes this warning: ignoring return value of 'fread' declared with attribute 'warn_unused_result' [-Wunused-result] Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15502>	2022-03-22 11:50:37 +00:00
Samuel Pitoiset	53ccfbb996	amd: add PKT3_LOAD_SH_REG_INDEX It seems only available on GFX8+. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15053>	2022-03-14 08:54:23 +00:00
Timur Kristóf	75a783ea73	ac: Query the amdgpu MEC firmware version. MEC (Micro Engine Compute) is the firmware which is responsible for the compute-only queues on AMD GPUs. It is present on GFX7 and newer. This patch will query the version of this firmware and print it among the others. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15283>	2022-03-09 21:31:48 +00:00
Marek Olšák	66e20d2bf7	ac: add an environment variable that parses IBs in files Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15209>	2022-03-01 22:30:24 +00:00
Marek Olšák	3394f0ae14	ac: define PKT3_ATOMIC_MEM Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15209>	2022-03-01 22:30:24 +00:00
Marek Olšák	ff9e4409c1	ac: parse SET_SH_REG_INDEX packet Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15209>	2022-03-01 22:30:24 +00:00
Marek Olšák	f8cf5ea982	amd: add support for gfx1036 and gfx1037 chips Both are identified as GFX1036 for simplicity. Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Tested-by: Yifan Zhang <yifan1.zhang@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15155>	2022-03-01 17:03:00 +00:00
Marek Olšák	48046d5bd8	ac: set correct cache size per TCC for Yellow Carp Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Tested-by: Yifan Zhang <yifan1.zhang@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15155>	2022-03-01 17:03:00 +00:00
Timur Kristóf	93087f71e6	ac/nir: Extract final mesh shader output counts to a separate function. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15199>	2022-03-01 15:37:12 +00:00
Timur Kristóf	2d5aae032b	ac/nir: Properly invalidate mesh shader metadata. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15199>	2022-03-01 15:37:12 +00:00
Timur Kristóf	3a3bd9cff1	ac/nir: Fix workgroup ID in mesh shader waves other than the first. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15199>	2022-03-01 15:37:12 +00:00
Timur Kristóf	57775dd76a	ac/nir: Store mesh shader API and HW workgroup size in lowering state. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15199>	2022-03-01 15:37:12 +00:00
Timur Kristóf	d0f45c7c49	ac/nir: Reuse existing nir_builder for emit_ms_finale. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15199>	2022-03-01 15:37:12 +00:00
Timur Kristóf	74f1e7965e	ac/nir: Use vertex count minus 1 to determine max index in mesh shaders. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15199>	2022-03-01 15:37:12 +00:00
Rhys Perry	f800af2231	ac/nir: remove TCS nir_var_shader_out memory barrier nir_var_shader_out writes are only used for later TES invocations, so I don't think there's any need for the TCS workgroup to wait for them. fossil-db (Sienna Cichlid): Totals from 1691 (1.04% of 162293) affected shaders: Instrs: 710699 -> 709008 (-0.24%) CodeSize: 3830168 -> 3823404 (-0.18%) Latency: 3396997 -> 3007934 (-11.45%) InvThroughput: 1212094 -> 1082823 (-10.67%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15195>	2022-03-01 11:02:43 +00:00
Timur Kristóf	bf519a7d47	ac/nir: Refactor mesh shader output code to smaller functions. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>	2022-02-25 06:31:33 +00:00
Timur Kristóf	a84789f795	ac/nir: Make sure to exclude special outputs from arrayed output masks. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>	2022-02-25 06:31:33 +00:00
Timur Kristóf	3956c03b05	ac/nir: Sanitize mesh shader primitive indices using umin. This makes our implementation friendlier to potentially buggy shaders, meaning that it will be less likely to hang the GPU. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>	2022-02-25 06:31:33 +00:00
Timur Kristóf	0746b98f4a	ac/nir: Properly handle when mesh API workgroup size is smaller than HW. The problem is that the real workgroup launched on NGG HW can be larger than the size specified by the API, and the extra waves need to keep up with barriers in the API waves. There are 2 different cases: 1. The whole API workgroup fits in a single wave. We can shrink the barriers to subgroup scope and don't need to insert any extra ones. 2. The API workgroup occupies multiple waves, but not all. In this case, we emit code that consumes every barrier on the extra waves. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>	2022-02-25 06:31:33 +00:00
Timur Kristóf	d88516a23f	ac/nir: Move LDS area for primitive count to the beginning. This makes it impossible for out of bounds vertex and primitive attribute stores and indices stores to overwrite this. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>	2022-02-25 06:31:33 +00:00
Timur Kristóf	3759a16d8a	ac/nir/ngg: Fix mixed up primitive ID after culling. When NGG culling is enabled, make sure that the correct primitive ID is exported by each lane. Fixes: `e97f0463a8` "ac/nir: Implement NGG deferred attribute culling in NIR." Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6050 Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15055>	2022-02-22 18:15:24 +00:00
Marek Olšák	62074cb4ac	ac: update shadowed registers based on PAL Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>	2022-02-22 11:41:04 +00:00
Marek Olšák	79a7ab642a	ac/surface: add more elements to meta equations because HTILE can use them according to gfx10SwizzlePattern.h Fixes: `9fabbf2150` - ac/surface: copy the HTILE equations to the surface Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>	2022-02-22 11:41:04 +00:00
Marek Olšák	9a28f79f7b	ac/surface/tests: fix missing NUM_PKRS extraction in test_modifier Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>	2022-02-22 11:41:04 +00:00
Marek Olšák	21f169b2fb	ac,radeonsi: rework and optimize how TMPRING_SIZE is set Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>	2022-02-22 11:41:04 +00:00
Marek Olšák	cfaaa0892f	ac/surface: don't set the display flag for 1D textures Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>	2022-02-22 11:41:04 +00:00
Marek Olšák	2f2fca24d2	ac/gpu_info: print units for some radeon_info fields Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>	2022-02-22 11:41:04 +00:00
Marek Olšák	53f683ff67	ac: add a gfx9 workaround for high priority compute Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>	2022-02-22 11:41:04 +00:00
Marek Olšák	197467c238	amd: add a workaround for an SQ perf counter bug Cc: mesa-stable@lists.freedesktop.org Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>	2022-02-22 11:41:04 +00:00
Marek Olšák	95af3cc2f8	amd: remove the _UMD suffix from register definitions It was mistakenly added to indicate it's for a User-Mode Driver, but all defined registers in Mesa are. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>	2022-02-22 11:41:04 +00:00
Samuel Pitoiset	cdf9a1a911	ac: add ac_gpu_info::has_stable_pstate Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14038>	2022-02-21 11:16:11 +00:00
Samuel Pitoiset	85436896c4	radv: declare a new shader argument for loading the VRS rates Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14713>	2022-02-16 08:11:15 +01:00
Timur Kristóf	0445802ab2	compiler: Extract num_mesh_vertices_per_primitive function. Prevent code duplication. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15005>	2022-02-14 11:13:42 +01:00
Rhys Perry	9e171b6d49	ac/nir: use shorter builder names This makes a lot of lines shorter. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14455>	2022-01-21 13:45:33 +00:00
Rhys Perry	533118413b	ac/nir: avoid providing an align_mul to intrinsic builders Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14455>	2022-01-21 13:45:33 +00:00
Rhys Perry	c0a586bad7	ac/nir: avoid providing a write_mask to intrinsic builders Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14455>	2022-01-21 13:45:33 +00:00
Dave Airlie	1352e0ba0c	mesa/*: add a shader primitive type to get away from GL types. This creates an internal shader_prim enum, I've fixed up most users to use it instead of GL types. don't store the enum in shader_info as it changes size, and confuses other things. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>	2022-01-19 21:54:58 +00:00
Dave Airlie	d54c07b4c4	mesa/*: use an internal enum for tessellation primitive types. To avoid dragging gl.h into places it has no business being, define the tessellation primitive mode as an enum. This has a lot of fallout all over the place. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>	2022-01-19 21:54:58 +00:00
Dave Airlie	acd0afdba4	amd: move uvd decode definitions to common place This just makes sharing these easier later. Acked-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14607>	2022-01-20 07:07:32 +10:00
Dave Airlie	14551f2bde	amd: move vcn decoding regs + structs to a common file. This just moves the main regs + fw interface structs to a new shared file. Acked-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14607>	2022-01-20 07:07:30 +10:00
Marek Olšák	3cafa3e852	ac/surface: allow displayable DCC with any resolution (e.g. 8K) Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14529>	2022-01-18 01:44:17 -05:00
Pierre-Eric Pelloux-Prayer	86262b6eac	radeonsi,radv: fix usages of surf_pitch For linear textures, pitch[level] should be used instead. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14454>	2022-01-12 11:39:53 +00:00
Pierre-Eric Pelloux-Prayer	148b2d0040	amd: add SDMA_NOP_PAD And use it in amdgpu_cs.c. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13959>	2022-01-11 12:18:35 +00:00
Thomas H.P. Andersen	7daba1fe65	replace 0 with NULL for NULL pointers This updates many places where 0 is used as NULL pointer. There are a few warnings left when I build the default configuration but they either relate to code outside of mesa or where "None" is used instead. Found with static analysis (smatch) Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12174>	2022-01-10 22:53:32 +00:00
Rhys Perry	0f5d90c2a7	ac/nir: fix store_buffer_amd write_masks Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14447>	2022-01-10 19:01:04 +00:00
Danylo Piliaiev	b8d486f298	nir/algebraic: Separate has_dot_4x8 into has_sdot_4x8 and has_udot_4x8 Adreno GPUs have native instructions for unsigned and mixed dot_4x8 but not signed dot product. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>	2022-01-10 13:20:39 +02:00
Marek Olšák	116a05c721	ac: move ac_exp_param.h to ac_nir.h Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14266>	2022-01-05 12:46:31 +00:00
Marek Olšák	12b942bd16	radeonsi: pass sample_coverage VGPR index to the PS prolog instead of guessing The code was correct, but a little confusing. This is cleaner. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14266>	2022-01-05 12:46:30 +00:00
Marek Olšák	384014bebe	radeonsi: apply spi_cu_en to CU_EN Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14122>	2022-01-05 01:36:10 -05:00
Marek Olšák	470b61f3a9	ac/gpu_info: add AMD_CU_MASK environment variable to set CU_EN requested internally Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14122>	2022-01-05 01:36:10 -05:00
Marek Olšák	a68cb9db8d	ac/gpu_info: set cu_mask correctly for Arcturus Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14122>	2022-01-05 01:36:10 -05:00
Timur Kristóf	7aa42e023a	ac/nir/ngg: Lower NV mesh shaders to NGG semantics. Lower mesh shader outputs to shared memory. At the end of the shader, read the outputs from shared memory and export their values as NGG expects. We allocate separate shared memory (LDS) areas for per-vertex, per-primitive outputs, primitive indices, primitive count. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580>	2021-12-31 13:05:09 +00:00
Qiang Yu	1876285c27	ac/surface: add prt_tile_depth For supporting 3D sparse texture. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14223>	2021-12-30 16:11:19 +08:00
Qiang Yu	92d810fa74	ac/surface: fix prt_first_mip_tail calculation for gfx9+ Use firstMipIdInTail directly from addrlib which calculated this in a different way: Original way: either dimension size of mipmap should be less than the tile size. Addrlib way: all dimension sizes of the mipmap should be less than the tile size and at least one dimension size should be less than half of the tile size, so that all following mip levels can fit in one tile and any commit for a level in the mip tail also commits for all levels in the mip tail. Theoretically either way is OK but the addrlib way needs less care about the mip tail commit and better aligns with the true memory layout given by itself. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14223>	2021-12-30 16:11:19 +08:00
Bas Nieuwenhuizen	860532c5a1	radv: Add safety check for RGP traces on VanGogh. To avoid accidental hangs. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5260 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13730>	2021-12-17 21:25:01 +00:00
Dave Airlie	d051854cca	treewide: drop mtypes/macros includes from main These aren't required in lots of places, so remove them. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14127>	2021-12-08 22:14:45 +00:00
Samuel Pitoiset	8e3fbe7cc8	ac/nir: fix left shift of 1 by 31 places detected by UBSAN src/amd/common/ac_nir_lower_ngg.c:1135:62: runtime error: left shift of 1 by 31 places cannot be represented in type 'int' src/amd/common/ac_nir_lower_ngg.c:622:20: runtime error: left shift of 1 by 31 places cannot be represented in type 'int' Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13951>	2021-11-25 16:15:30 +00:00
Marek Olšák	694731ac13	ac/surface: allow gfx6-8 to enter the gfx9 DCC codepath for SI_FORCE_FAMILY Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13871>	2021-11-24 13:55:23 +00:00
Marek Olšák	d830d213b6	ac/gpu_info: don't fail on amdgpu_query_video_caps_info failures When VCN is unsupported, we don't want to break GL or Vulkan. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13871>	2021-11-24 13:55:23 +00:00
Samuel Pitoiset	cfc5c2abfd	ac: change family names to uppercase in ac_get_family_name() To print the same device name as real hw. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13812>	2021-11-23 08:07:41 +00:00
Timur Kristóf	5aa39253cb	nir: Rename nir_get_io_vertex_index_src and include per-primitive I/O. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13466>	2021-11-16 07:46:55 +00:00
Vinson Lee	008f5a127c	ac/rgp: Initialize clock_calibration with memset. Fix defect reported by Coverity Scan. Uninitialized scalar variable (UNINIT) uninit_use_in_call: Using uninitialized value clock_calibration. Field clock_calibration.reserved is uninitialized when calling fwrite. Fixes: `1ee85e8bab` ("ac/rgp: add support for clock calibration") Suggested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13783>	2021-11-15 22:59:05 -08:00
Samuel Pitoiset	a9c4e0c371	ac/spm: fix determining the counter slot Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `e928f475cc` ("ac: add initial SPM support") Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13758>	2021-11-15 11:24:36 +01:00
Samuel Pitoiset	11c6a32759	ac/spm: fix determining the SPM wire One SPM wire holds two 16-bit counters. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `e928f475cc` ("ac: add initial SPM support") Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13758>	2021-11-15 11:24:33 +01:00
James Park	195a379a7e	ac: Align ADDR_FASTCALL with addrlib Fixes linker errors for 32-bit Windows. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13749>	2021-11-12 09:46:10 +00:00
Samuel Pitoiset	3e7bac80ce	ac/rgp: add support for dumping SPM data Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13704>	2021-11-11 10:05:49 +00:00
Samuel Pitoiset	e928f475cc	ac: add initial SPM support SPM is a hardware feature that allows us to dump performance counters at a sampling interval to a buffer. It is used by RGP to report cache counters. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13704>	2021-11-11 10:05:49 +00:00
Samuel Pitoiset	1ee85e8bab	ac/rgp: add support for clock calibration Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13709>	2021-11-09 11:20:12 +01:00
Samuel Pitoiset	aebf04ab3f	ac/rgp: add support for queue event timings Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13709>	2021-11-09 11:20:10 +01:00
Samuel Pitoiset	824ce4ef40	ac/rgp: fix alignment of code object records to follow the RGP spec Should be aligned to 4 bytes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13711>	2021-11-09 09:42:20 +00:00
Timur Kristóf	b59614619b	radv: Use MESA_VULKAN_SHADER_STAGES to make room for mesh/task. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13440>	2021-11-04 13:32:07 +00:00
Pierre-Eric Pelloux-Prayer	dbf602a6b3	ac/surface: don't validate DCC settings if DCC isn't possible Cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13550>	2021-11-04 09:38:46 +01:00
Pierre-Eric Pelloux-Prayer	84d4bda8e5	ac/surface: use a less strict condition in is_dcc_supported_by_L2 While Mesa chooses to always use independent_128B_blocks, other drivers can make different choices. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13550>	2021-11-04 09:38:27 +01:00
Samuel Pitoiset	d8e4546707	ac/nir: remove bogus assertion about the position for culling It's undefined to not export a position but some applications rely on that. The position is always initialized to 0,0,0,1 everywhere else if not exported. Fixes KHR-GL46.shader_image_load_store.multiple-uniforms with Zink. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13470>	2021-10-28 10:44:20 +00:00
Bas Nieuwenhuizen	2c43fd4c41	amd/rgp: Use VGH clocks for RGP workaround. Hear that it matters for RGP. This is the most likely scenario where we would hit this workaround, given the tooling for profiling on the deck will set profile_peak as workaround for hangs. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13534>	2021-10-27 21:51:59 +00:00
Samuel Pitoiset	b797ecac7a	ac/rgp: remove useless code related to GFX6-7 RGP only supports GFX8+. RADV doesn't allow SQTT on < GFX8 and RadeonSI only allows it on GFX9+. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13451>	2021-10-21 06:44:50 +00:00
Marek Olšák	f9d7db0262	ac,radeonsi: print a lowercase codename in the renderer string to make it stand out less Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13392>	2021-10-18 18:37:09 +00:00
Marek Olšák	84d0f54e75	ac/surface: enable better display DCC for chips newer than Yellow Carp Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13214>	2021-10-13 06:20:13 +00:00
Marek Olšák	a18a7626a2	ac/surface: disallow display DCC for big resolutions Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13214>	2021-10-13 06:20:13 +00:00
Marek Olšák	1a8df6f1be	ac/surface: always use suboptimal display DCC with DRM <= 3.43.0 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13214>	2021-10-13 06:20:13 +00:00
Timur Kristóf	5de91cfc04	ac/nir/nggc: Write undef to variables in non-repacked ES threads. This helps the compiler generate a little bit more efficient code. Fossil DB stats on Sienna Cichlid with NGGC on: Totals from 4659 (3.62% of 128647) affected shaders: CodeSize: 7468320 -> 7404484 (-0.85%); split: -0.88%, +0.03% Instrs: 1423425 -> 1407454 (-1.12%); split: -1.16%, +0.03% Latency: 5250593 -> 5226163 (-0.47%); split: -0.47%, +0.00% InvThroughput: 739848 -> 733373 (-0.88%); split: -0.90%, +0.02% Copies: 200139 -> 190307 (-4.91%); split: -5.13%, +0.22% Branches: 87925 -> 85998 (-2.19%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13121>	2021-10-12 16:27:50 +00:00
Timur Kristóf	783f8f728c	ac/nir/cull: Accept NaN and +/- Inf in face culling. When the determinant that we use for calculating triangle area is NaN, it's not possible to decide the facing of the triangle. This can happen when a coordinate of one of the triangle's vertices is INFINITY. It's better to just accept these triangles in the shader and let the PA deal with them. Let's do the same for +/- Infinity too. Though we haven't seen this yet, it may be troublesome as well. Fixes: `651a3da1b5` Closes: #5470 Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13299>	2021-10-12 15:23:52 +00:00
Bas Nieuwenhuizen	33065149c1	amd/common: Add fallback for misreported clocks for RGP. Traces with clock = 0 are totally useless due to RGP getting very confused. Cc: mesa-stable Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13301>	2021-10-12 12:28:04 +00:00
Joshua Ashton	77e5f149eb	ac/surface: Expose modifiers capable of DCC image stores first These also have a higher compressed block size. Signed-off-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13056>	2021-10-11 11:43:39 +00:00
Joshua Ashton	9cffe1b9ea	ac/surface: Add ac_modifier_max_extent Currently, we aren't checking if the modifier supports the extent of the image. DCN only works with !64B && 128B on extents < 4K. Signed-off-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13056>	2021-10-11 11:43:39 +00:00
Samuel Pitoiset	a6298b1bc9	radv: remove unnecessary radv_shader_info:num_inline_push_consts This can be determined directly from the user SGPR loc. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13149>	2021-10-08 11:37:19 +00:00
Chia-I Wu	8cce6281e6	util/vector: make util_vector_init harder to misuse Make u_vector_init a wrapper to u_vector_init_pot. Let both take (element_count, element_size) as parameters. Motivated by `eed0fc4caf` ("vulkan/wsi/wayland: fix an invalid u_vector_init call") v2: rename u_vector_init_pot to u_vector_init_pow2 Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Simon Ser <contact@emersion.fr> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13201>	2021-10-08 00:15:11 +00:00
Joshua Ashton	72c0e57e7e	ac/surface: Use 64 && 128 for GFX10_3 on non-modifier path DCC_IND_BLK is not hooked up for this to work in the kernel in any released version, and it's unsafe to do so even if it was because it doesn't check the modifiers. There's no reason to change the legacy non-modifier path to be more performant at the expense of breaking backwards compatibility with older versions of Mesa. Fixes: `0f6251b3` ("ac/surface: use DCC compatible with image stores for < 4K resolutions") Closes: #5422 Signed-off-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13122>	2021-10-06 00:13:46 +00:00
Samuel Pitoiset	b52aaea630	radv: remove unnecessary ac_nir_ngg_config output struct Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13134>	2021-10-04 08:55:19 +00:00
Samuel Pitoiset	52e91f7640	radv: move ngg passthrough determination earlier Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13134>	2021-10-04 08:55:19 +00:00
Samuel Pitoiset	2ce78a30ff	move: move ngg lds bytes determination earlier Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13134>	2021-10-04 08:55:19 +00:00
Samuel Pitoiset	90858dd718	radv: move ngg early prim export determination earlier Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13134>	2021-10-04 08:55:19 +00:00
Rhys Perry	24501b5452	radv: move ngg culling determination earlier Co-Authored-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13134>	2021-10-04 08:55:19 +00:00
Marek Olšák	edc8a4a037	ac/surface: enable DCC image stores for all displayable DCC on gfx10.3 Co-authored-by: Joshua Ashton <joshua@froggi.es> Signed-off-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13153>	2021-10-02 22:56:48 +00:00
Joshua Ashton	e6fcf65578	ac/surface: Add helper for checking if a surface supports DCC Image stores We need to keep RADV and RadeonSI on the same page about this due to modifiers. Signed-off-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13153>	2021-10-02 22:56:48 +00:00
Marek Olšák	923c535ee8	ac/surface: don't overwrite DCC settings for imported buffers Fixes: `0f6251b31f` - ac/surface: use DCC compatible with image stores for < 4K resolutions Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13120>	2021-10-01 16:15:40 -04:00
Marek Olšák	279cd5821c	ac/gpu_info: fix the comment for the NGG->legacy transition bug Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13048>	2021-09-28 17:30:06 +00:00
Marek Olšák	a198c6b7dd	ac/surface: correct a comment about DCC image stores Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13013>	2021-09-25 08:49:05 +00:00
Marek Olšák	0f6251b31f	ac/surface: use DCC compatible with image stores for < 4K resolutions We don't have to use the special DCC settings for lower resolutions. This will cause corruption if X and a windowed app use different Mesa versions. The fix is to restart the X server. I expect to get false bug reports due to this. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13013>	2021-09-25 08:49:05 +00:00
Timur Kristóf	09f89d15e4	ac/nir/nggc: Don't reuse uniform values from divergent control flow. With NGG culling, the shaders are split into two parts: the top part that computes just the position output, and the bottom part which produces the other outputs. To reduce redundancy between the two, I added some code to reuse uniform variables between them. However, there is an edge case I didn't think about: because of vertex repacking, it is possible for the bottom part to process a different vertex. Therefore it can take a different divergent code path (though it must still take the same uniform code path). Due to this, when a uniform value comes from divergent control flow, this may be undefined in the bottom part. This commit stops reusing uniform variables from divergent control flow, to fix issues that arise from this. Fossil DB stats on Sienna Cichlid with NGGC on: Totals from 1723 (1.34% of 128647) affected shaders: VGPRs: 89312 -> 89184 (-0.14%); split: -0.15%, +0.01% SpillSGPRs: 4575 -> 120 (-97.38%) CodeSize: 10846424 -> 10873836 (+0.25%); split: -0.68%, +0.93% MaxWaves: 34582 -> 34602 (+0.06%); split: +0.06%, -0.01% Instrs: 2124471 -> 2128835 (+0.21%); split: -0.51%, +0.72% Latency: 7274569 -> 7293899 (+0.27%); split: -0.22%, +0.48% InvThroughput: 1637130 -> 1635490 (-0.10%); split: -0.17%, +0.07% VClause: 25141 -> 25414 (+1.09%); split: -0.02%, +1.10% SClause: 56367 -> 59503 (+5.56%); split: -1.36%, +6.93% Copies: 230704 -> 219313 (-4.94%); split: -5.49%, +0.55% Branches: 72781 -> 72681 (-0.14%); split: -0.21%, +0.07% PreSGPRs: 118766 -> 100176 (-15.65%); split: -15.70%, +0.05% PreVGPRs: 76876 -> 76833 (-0.06%) Fixes: `0bb543bb60` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13001>	2021-09-24 17:32:53 +00:00
Timur Kristóf	cb19ebe7ba	ac/nir/nggc: Refactor save_reusable_variables. This makes the code more elegant and also fixes the mistake of skipping the blocks that come before loops. Fossil DB changes on Sienna Cichlid with NGGC on: Totals from 1026 (0.80% of 128647) affected shaders: SpillSGPRs: 3817 -> 4035 (+5.71%) CodeSize: 5582856 -> 5538732 (-0.79%); split: -0.89%, +0.10% Instrs: 1106907 -> 1100180 (-0.61%); split: -0.68%, +0.07% Latency: 10084948 -> 10052197 (-0.32%); split: -0.37%, +0.05% InvThroughput: 1567012 -> 1564949 (-0.13%); split: -0.16%, +0.03% SClause: 39789 -> 39075 (-1.79%); split: -2.33%, +0.54% Copies: 95184 -> 96456 (+1.34%); split: -0.19%, +1.53% Branches: 44087 -> 44093 (+0.01%); split: -0.01%, +0.02% PreSGPRs: 47584 -> 51009 (+7.20%); split: -0.61%, +7.80% Fixes: `0bb543bb60` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13001>	2021-09-24 17:32:53 +00:00
Timur Kristóf	a7f2faea46	ac/nir: Emit edge flag instructions conditionally. They are not needed by RADV but will be needed by RadeonSI. Fossil DB results on Sienna Cichlid (with NGGC on): Totals from 56917 (44.24% of 128647) affected shaders: VGPRs: 1982664 -> 1975936 (-0.34%); split: -0.43%, +0.09% CodeSize: 152790880 -> 149510316 (-2.15%); split: -2.15%, +0.00% MaxWaves: 1617984 -> 1621900 (+0.24%) Instrs: 29272825 -> 28907038 (-1.25%); split: -1.26%, +0.01% Latency: 128744182 -> 127565678 (-0.92%); split: -1.14%, +0.22% InvThroughput: 20125915 -> 19805168 (-1.59%); split: -1.63%, +0.03% VClause: 521312 -> 519804 (-0.29%); split: -0.77%, +0.48% SClause: 688861 -> 688897 (+0.01%); split: -0.04%, +0.05% Copies: 3205421 -> 3177799 (-0.86%); split: -1.68%, +0.82% Branches: 1181457 -> 1183147 (+0.14%); split: -0.03%, +0.17% PreVGPRs: 1626681 -> 1595406 (-1.92%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12998>	2021-09-23 16:57:56 +02:00
Bas Nieuwenhuizen	1ca4fd31e6	radv: Add support for ray launch size. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12592>	2021-09-21 01:53:39 +00:00
Timur Kristóf	13e467a147	ac/nir: Fix match_mask to work correctly for VS outputs. match_mask checks the intrinsic type and decides whether it's per-patch or not. VS don't have per-patch outputs, so this causes wrong behaviour there. Found using the GCC undefined behavior sanitizer. Fixes the following error: runtime error: shift exponent 18446744073709551584 is too large for 64-bit type 'long unsigned int' Closes: #5319 Fixes: `bf966d1c1d` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12719>	2021-09-20 18:08:16 +00:00
Timur Kristóf	75dbb40439	ac/nir: Remove byte permute from prefix sum of the repack sequence. The byte-permute instruction v_perm_b32 is not exposed by older LLVM releases (only available on LLVM 13 and later), therefore a new sequence is needed which we can use with these LLVM versions too. The prefix sum is replaced by two alternatives: 1. For GPUs that support v_dot, we shift 0x01 to the wanted byte positions and then use v_dot to sum the results. 2. For older GPUs (Navi 10), we simply shift out the unwanted bytes and use v_sad_u8 to produce the sum. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12786>	2021-09-20 12:39:03 +02:00
Joshua Ashton	92ade3df05	ac/surface: Add ac_modifier_supports_dcc_image_stores helper Helper function to check if a modifier supports DCC image stores. Signed-off-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12862>	2021-09-18 00:01:01 +00:00
Joshua Ashton	fd08758bb1	ac/surface: Add modifiers capable of DCC image stores Signed-off-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12862>	2021-09-18 00:01:01 +00:00
Samuel Pitoiset	c952655693	ac/rgp, radv: report wave size for shaders Fills the "Wave mode" in "Pipelines" for GPUs that support Wave32. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12896>	2021-09-17 08:05:36 +00:00
Samuel Pitoiset	d29c381c64	ac/rgp, radv: report scratch memory size for shaders Fills the "Scratch Mem" with "Yes/No" in "Pipelines"; this requires instruction timing to be enabled. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12896>	2021-09-17 08:05:36 +00:00
Rhys Perry	40a0935899	ac/gpu_info: add has_accelerated_dot_product Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>	2021-09-03 13:21:27 +00:00
Timur Kristóf	fe6e4484ab	ac/nir/nggc: Move gs_alloc_req up in NGG culling shaders. This is the first part of a refactor to make vertex compaction optional. Additionally, it may yield a very small benefit to allocate the PC space slightly sooner. Fossil DB stats on Sienna Cichlid with NGGC on: Totals from 58239 (45.27% of 128647) affected shaders: CodeSize: 160502348 -> 160502340 (-0.00%) Instrs: 30722664 -> 30722662 (-0.00%) Latency: 137627419 -> 137782218 (+0.11%); split: -0.00%, +0.11% InvThroughput: 21698587 -> 21699068 (+0.00%); split: -0.00%, +0.00% Copies: 3288263 -> 3288261 (-0.00%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12246>	2021-09-01 14:45:14 +00:00
Timur Kristóf	f4a65e5628	ac/nir/nggc: Only repack arguments that are needed. Don't repack everything, only what is actually used. The goal of this commit is primarily to remove unnecessary LDS stores and loads. In addition to that, it also gets rid of a few VALU instructions and reduces VGPR use. Fossil DB stats on Sienna Cichlid with NGGC on: Totals from 6951 (5.40% of 128647) affected shaders: VGPRs: 206056 -> 205360 (-0.34%); split: -0.79%, +0.45% CodeSize: 12344568 -> 12269312 (-0.61%); split: -0.62%, +0.01% MaxWaves: 211206 -> 212196 (+0.47%) Instrs: 2319459 -> 2308483 (-0.47%); split: -0.50%, +0.03% Latency: 7220829 -> 7164721 (-0.78%); split: -1.21%, +0.43% InvThroughput: 1051450 -> 1049191 (-0.21%); split: -0.36%, +0.15% VClause: 25794 -> 25445 (-1.35%); split: -1.97%, +0.61% SClause: 39192 -> 39277 (+0.22%); split: -0.21%, +0.43% Copies: 315756 -> 313404 (-0.74%); split: -1.17%, +0.42% Branches: 127878 -> 127879 (+0.00%); split: -0.00%, +0.00% PreVGPRs: 168029 -> 160162 (-4.68%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12246>	2021-09-01 14:45:14 +00:00
Timur Kristóf	02bba6aab5	ac/nir/nggc: Don't stop applying reusable variables at prim export. This was a mistake that prevented reusing variables in shaders with late primitive export. Fossil DB stats on Sienna Cichlid with NGGC on: Totals from 6547 (5.09% of 128647) affected shaders: VGPRs: 323368 -> 323824 (+0.14%); split: -0.03%, +0.18% SpillSGPRs: 45 -> 4865 (+10711.11%) CodeSize: 34208732 -> 33855952 (-1.03%); split: -1.21%, +0.18% MaxWaves: 142538 -> 142456 (-0.06%); split: +0.04%, -0.09% Instrs: 6654252 -> 6606432 (-0.72%); split: -0.89%, +0.17% Latency: 30527770 -> 30452769 (-0.25%); split: -0.42%, +0.18% InvThroughput: 5604540 -> 5609450 (+0.09%); split: -0.04%, +0.13% VClause: 121531 -> 120448 (-0.89%); split: -1.17%, +0.27% SClause: 195388 -> 177902 (-8.95%); split: -9.14%, +0.19% Copies: 617949 -> 636397 (+2.99%); split: -0.44%, +3.42% Branches: 228184 -> 228281 (+0.04%); split: -0.09%, +0.13% PreSGPRs: 271395 -> 343555 (+26.59%); split: -0.01%, +26.60% PreVGPRs: 277650 -> 277710 (+0.02%); split: -0.01%, +0.03% Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12246>	2021-09-01 14:45:14 +00:00


1962 Commits