KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Danylo Piliaiev	6ad7be1b36	meson/pps: Check if libdrm exists to compile pps For Turnip with KGSL we may have perfetto enabled but we don't have libdrm. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Hyunjun Ko <zzoon@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17173>	2022-06-22 11:52:36 +03:00
Danylo Piliaiev	ee6a0c675b	meson: Define _GNU_SOURCE for android host system Otherwise sched_getaffinity isn't defined and util_cpu_detect_once fails to compile. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Hyunjun Ko <zzoon@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17173>	2022-06-22 11:52:36 +03:00
Samuel Pitoiset	ad3d6d9c6e	radv/llvm: always emit a null export even if the FS doesn't discard Even with a noop FS, the color blend state can still be non-zero, and then SPI color related registers won't be 0 and this would hang. Fixes: `bdf3797aeb` ("ac,radeonsi: don't export null from PS if it has no effect on gfx10+") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17169>	2022-06-22 08:31:30 +02:00
Pavel Asyutchenko	17645cb29c	llvmpipe: enable PIPE_CAP_FBFETCH_ZS Support for it was added in previous commits. Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>	2022-06-22 04:32:44 +00:00
Pavel Asyutchenko	ccaa7920ef	llvmpipe: implement FB fetch for depth/stencil Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>	2022-06-22 04:32:44 +00:00
Pavel Asyutchenko	0ba3e797ee	llvmpipe: simplify early/late zs tests selection This does not change selection logic. Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>	2022-06-22 04:32:44 +00:00
Pavel Asyutchenko	443ef18f0c	llvmpipe: enable per-sample shading when FB fetch is used This matches specifications of both color and ZS fetch extensions. Cc: mesa-stable Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>	2022-06-22 04:32:44 +00:00
Pavel Asyutchenko	8788b17596	nir_to_tgsi: Don't count ZS fbfetch vars as outputs Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>	2022-06-22 04:32:44 +00:00
Pavel Asyutchenko	959b748038	glsl: add language support for GL_ARM_shader_framebuffer_fetch_depth_stencil This extension adds built-in variables gl_LastFragDepthARM and gl_LastFragStencilARM which can be implemented almost the same as gl_LastFragData from color fetch extension. Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>	2022-06-22 04:32:44 +00:00
Pavel Asyutchenko	41f22a1823	gallium: add PIPE_CAP_FBFETCH_ZS and expose extension st/mesa will expose GL_ARM_shader_framebuffer_fetch_depth_stencil if this new capability is supported by the driver. Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>	2022-06-22 04:32:44 +00:00
Dave Airlie	68e8940114	glx/drisw: use xcb instead of X to query connection Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>	2022-06-22 03:28:21 +00:00
Dave Airlie	d3e723fb77	wsi/x11: add xcb_put_image support for larger transfers. This was noticed as a problem in the EGL code, just fixup wsi. Cc: mesa-stable Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>	2022-06-22 03:28:21 +00:00
Dave Airlie	c5dbb1139c	egl/x11: add missing put_image cookie cleanups These might not be required but be consistent with the wsi code. Cc: mesa-stable Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>	2022-06-22 03:28:21 +00:00
Dave Airlie	e6082ac62e	egl/x11: split large put image requests to avoid server destroy wezterm in fullscreen 4k was exceeding the xcb max request size on the put image with llvmpipe. This fixes it to send sub-images, the Xlib put image used in glx does this internally, but not the xcb one, so just do it in sections here. Cc: mesa-stable Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>	2022-06-22 03:28:21 +00:00
Mike Blumenkrantz	e8fc5cca90	zink: fix dual_src_blend driconf workaround not sure when this broke but it broke cc: mesa-stable Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17156>	2022-06-22 03:14:18 +00:00
Mike Blumenkrantz	ea005c9e04	glx/drisw: invalidate drawables upon binding context if flush extension exists this forces surface resize as expected cc: mesa-stable fixes #6706 Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17147>	2022-06-22 02:18:37 +00:00
Mike Blumenkrantz	23b63e536e	glx/drisw: store the flush extension to the screen cc: mesa-stable Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17147>	2022-06-22 02:18:37 +00:00
Guilherme Gallo	cee1c4fc7f	ci/lava: Filter out undesired messages Some LAVA jobs emit lots of messages "Listened to connection for namespace 'common' for up to 1s" in a row at the end of the logs, making difficult to see the result of the test script. This commit removes those lines until a proper solution is deployed on the LAVA side. Closes: #6116 Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com> Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17151>	2022-06-22 01:48:16 +00:00
Jason Ekstrand	64d074879b	vulkan/wsi: Use HAVE_LIBDRM to detect DRM instead of !_WIN32 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17170>	2022-06-22 01:15:20 +00:00
Jordan Justen	a7127fbc4c	intel/tools: Print memory info in intel_dev_info Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>	2022-06-22 00:30:49 +00:00
Jordan Justen	eaf2a35a76	iris/bufmgr: Use memory info from devinfo Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>	2022-06-22 00:30:49 +00:00
Jordan Justen	1505f94397	anv: Use memory info from devinfo Rework: * Jordan: Drop regions.valid (Lionel implemented a fallback) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>	2022-06-22 00:30:49 +00:00
Lionel Landwerlin	4289c9ec13	intel/dev: add a fallback when memory regions are not available We have this in Anv and it could be reused in Iris for integrated memory system. Rework: * Jordan: Drop regions.valid (Lionel implemented a fallback) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>	2022-06-22 00:30:49 +00:00
Lionel Landwerlin	4e727297e8	intel/dev: add a helper to update memory info Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>	2022-06-22 00:30:49 +00:00
Jordan Justen	4aecfbf0f4	intel/dev: Add devinfo::mem to store i915 regions information Reworks: * Lionel: Change check on memory region valid to vram size * Jordan: Drop regions.valid (Lionel implemented a fallback) * Jordan: Rename devinfo::regions to devinfo::mem. * Jordan: Add devinfo::mem::use_class_instance * Add mesa_logw for lmem requiring regions. (s-b Lionel) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>	2022-06-22 00:30:49 +00:00
Alyssa Rosenzweig	1222c86e34	panfrost: Bump ESSL_FEATURE_LEVEL on Valhall This advertises ARB_gpu_shader5 on Valhall, which should be working now. On the GLES3.1 side, this notably adds support for sample variables and dynamic offsets for texture gathers, both of which should now be working. No shader-db changes. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>	2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig	74460a5d75	panfrost: Enable CAP_INDIRECT_TEMP_ADDR on Valhall For parity with Bifrost. Apparently this pattern is sufficiently obscure that the shader-db results on Mali-G57 are mostly noise. total instructions in shared programs: 2675116 -> 2674820 (-0.01%) instructions in affected programs: 4336 -> 4040 (-6.83%) helped: 8 HURT: 1 helped stats (abs) min: 1.0 max: 52.0 x̄: 37.88 x̃: 49 helped stats (rel) min: 0.46% max: 8.20% x̄: 5.97% x̃: 7.56% HURT stats (abs) min: 7.0 max: 7.0 x̄: 7.00 x̃: 7 HURT stats (rel) min: 5.98% max: 5.98% x̄: 5.98% x̃: 5.98% 95% mean confidence interval for instructions value: -52.90 -12.88 95% mean confidence interval for instructions %-change: -8.48% -0.81% Instructions are helped. total cvt in shared programs: 14127.08 -> 14126.53 (<.01%) cvt in affected programs: 33.84 -> 33.30 (-1.62%) helped: 10 HURT: 1 helped stats (abs) min: 0.015625 max: 0.125 x̄: 0.06 x̃: 0 helped stats (rel) min: 0.71% max: 2.93% x̄: 1.76% x̃: 1.78% HURT stats (abs) min: 0.09375 max: 0.09375 x̄: 0.09 x̃: 0 HURT stats (rel) min: 7.89% max: 7.89% x̄: 7.89% x̃: 7.89% 95% mean confidence interval for cvt value: -0.09 -0.01 95% mean confidence interval for cvt %-change: -2.89% 1.13% Inconclusive result (%-change mean confidence interval includes 0). total sfu in shared programs: 7572 -> 7555.69 (-0.22%) sfu in affected programs: 37.19 -> 20.88 (-43.87%) helped: 6 HURT: 3 helped stats (abs) min: 2.75 max: 2.75 x̄: 2.75 x̃: 2 helped stats (rel) min: 47.31% max: 48.89% x̄: 48.63% x̃: 48.89% HURT stats (abs) min: 0.0625 max: 0.0625 x̄: 0.06 x̃: 0 HURT stats (rel) min: 5.56% max: 6.25% x̄: 5.79% x̃: 5.56% 95% mean confidence interval for sfu value: -2.89 -0.73 95% mean confidence interval for sfu %-change: -51.41% -9.57% Sfu are helped. 
total quadwords in shared programs: 1450040 -> 1449896 (<.01%) quadwords in affected programs: 1992 -> 1848 (-7.23%) helped: 6 HURT: 0 helped stats (abs) min: 24.0 max: 24.0 x̄: 24.00 x̃: 24 helped stats (rel) min: 6.82% max: 7.50% x̄: 7.24% x̃: 7.32% 95% mean confidence interval for quadwords value: -24.00 -24.00 95% mean confidence interval for quadwords %-change: -7.48% -6.99% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>	2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig	7d84bb00dc	panfrost: Enable more FP16 caps on Valhall This brings the FP16 capabilities of Valhall to parity with Bifrost. Supporting FP16 constant buffers in particular reduces ALU in a ton of GLES shaders, so that's a nice win. FP16 derivatives get vectorized which is a big win where that applies, but they are considerably less common. The lost shaders are from enabling PIPE_SHADER_CAP_FP16_CONST_BUFFERS (these shaders compile on Midgard but not on Bifrost). The shaders in question declare the same uniform in linked vertex and fragment shaders with different precisions. This is contrary to the GLSL ES specification, which states precisions must match for default uniforms of linked shaders. All the lost shaders are in 8 Ball Pool and Hill Climb Racing. As those are proprietary games, if that becomes a problem in the future, drirc is the solution. total instructions in shared programs: 2697897 -> 2674595 (-0.86%) instructions in affected programs: 1019922 -> 996620 (-2.28%) helped: 4838 HURT: 2599 helped stats (abs) min: 1.0 max: 52.0 x̄: 7.13 x̃: 5 helped stats (rel) min: 0.16% max: 46.51% x̄: 8.04% x̃: 5.33% HURT stats (abs) min: 1.0 max: 36.0 x̄: 4.30 x̃: 3 HURT stats (rel) min: 0.17% max: 133.33% x̄: 10.53% x̃: 3.85% 95% mean confidence interval for instructions value: -3.32 -2.95 95% mean confidence interval for instructions %-change: -1.89% -1.22% Instructions are helped. total cycles in shared programs: 141764.61 -> 140602.88 (-0.82%) cycles in affected programs: 5728.22 -> 4566.48 (-20.28%) helped: 665 HURT: 89 helped stats (abs) min: 0.015625 max: 15.0 x̄: 1.75 x̃: 0 helped stats (rel) min: 0.30% max: 61.54% x̄: 11.17% x̃: 4.62% HURT stats (abs) min: 0.015625 max: 0.265625 x̄: 0.04 x̃: 0 HURT stats (rel) min: 0.30% max: 66.67% x̄: 6.77% x̃: 1.94% 95% mean confidence interval for cycles value: -1.77 -1.31 95% mean confidence interval for cycles %-change: -10.11% -7.99% Cycles are helped. 
total fma in shared programs: 22577.56 -> 22575.91 (<.01%) fma in affected programs: 2422.78 -> 2421.12 (-0.07%) helped: 533 HURT: 653 helped stats (abs) min: 0.015625 max: 0.0625 x̄: 0.03 x̃: 0 helped stats (rel) min: 0.30% max: 50.00% x̄: 8.25% x̃: 1.35% HURT stats (abs) min: 0.015625 max: 0.125 x̄: 0.03 x̃: 0 HURT stats (rel) min: 0.19% max: 100.00% x̄: 4.53% x̃: 2.08% 95% mean confidence interval for fma value: -0.00 0.00 95% mean confidence interval for fma %-change: -1.98% -0.44% Inconclusive result (value mean confidence interval includes 0). total cvt in shared programs: 14460.95 -> 14122.50 (-2.34%) cvt in affected programs: 6159.02 -> 5820.56 (-5.50%) helped: 4827 HURT: 2577 helped stats (abs) min: 0.015625 max: 0.796875 x̄: 0.11 x̃: 0 helped stats (rel) min: 0.20% max: 81.82% x̄: 17.78% x̃: 12.90% HURT stats (abs) min: 0.015625 max: 0.546875 x̄: 0.07 x̃: 0 HURT stats (rel) min: 0.00% max: 600.00% x̄: 43.66% x̃: 13.04% 95% mean confidence interval for cvt value: -0.05 -0.04 95% mean confidence interval for cvt %-change: 2.28% 4.93% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). total sfu in shared programs: 7593.56 -> 7571.06 (-0.30%) sfu in affected programs: 357.19 -> 334.69 (-6.30%) helped: 149 HURT: 1 helped stats (abs) min: 0.0625 max: 0.25 x̄: 0.15 x̃: 0 helped stats (rel) min: 5.26% max: 36.36% x̄: 6.79% x̃: 5.56% HURT stats (abs) min: 0.0625 max: 0.0625 x̄: 0.06 x̃: 0 HURT stats (rel) min: 3.57% max: 3.57% x̄: 3.57% x̃: 3.57% 95% mean confidence interval for sfu value: -0.16 -0.14 95% mean confidence interval for sfu %-change: -7.51% -5.93% Sfu are helped. 
total v in shared programs: 8722.62 -> 8722.31 (<.01%) v in affected programs: 1.62 -> 1.31 (-19.23%) helped: 2 HURT: 0 total ls in shared programs: 129666 -> 128494 (-0.90%) ls in affected programs: 4163 -> 2991 (-28.15%) helped: 192 HURT: 0 helped stats (abs) min: 1.0 max: 15.0 x̄: 6.10 x̃: 5 helped stats (rel) min: 4.35% max: 75.00% x̄: 30.23% x̃: 26.32% 95% mean confidence interval for ls value: -6.67 -5.54 95% mean confidence interval for ls %-change: -32.67% -27.79% Ls are helped. total quadwords in shared programs: 1461496 -> 1449768 (-0.80%) quadwords in affected programs: 273592 -> 261864 (-4.29%) helped: 1992 HURT: 687 helped stats (abs) min: 8.0 max: 24.0 x̄: 8.76 x̃: 8 helped stats (rel) min: 1.43% max: 50.00% x̄: 16.30% x̃: 11.11% HURT stats (abs) min: 8.0 max: 16.0 x̄: 8.31 x̃: 8 HURT stats (rel) min: 1.92% max: 100.00% x̄: 36.39% x̃: 25.00% 95% mean confidence interval for quadwords value: -4.67 -4.08 95% mean confidence interval for quadwords %-change: -3.95% -1.62% Quadwords are helped. total threads in shared programs: 53496 -> 53551 (0.10%) threads in affected programs: 112 -> 167 (49.11%) helped: 74 HURT: 19 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for threads value: 0.42 0.76 95% mean confidence interval for threads %-change: 56.83% 81.88% Threads are helped. total loops in shared programs: 128 -> 127 (-0.78%) loops in affected programs: 1 -> 0 helped: 1 HURT: 0 total fills in shared programs: 684 -> 672 (-1.75%) fills in affected programs: 160 -> 148 (-7.50%) helped: 2 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>	2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig	3fedf22b60	pan/bi: Tune lower_vars_to_scratch Increase the threshold to lower indirect indexing of arrays to scratch memory all the way up to 256 bytes, which was the lowest power-of-two threshold for which enabling the pass on Mali-G57 was a win in shaderdb. It's difficult to tell what threshold is optimal here. The shader-db stats are based on a rough cycle model that assumes a 16:1 ratio between CVT and load/store on Valhall, and a 24:1 ratio between arithmetic and load/store on Bifrost. Those ratios are at most rules of thumb, as the number of cycles required by a load/store instruction will vary tremendously based on caching and the memory controller. However, they may well be lower bounds (if those are the upper bounds on instruction issuing in the Mali shader cores). As such, a large threshold seems well motivated. shader-db results on Mali-G52 follow, results on Mali-G57 were similar. Note the shader that's hurt for spills/fills is helped for load/store overall. 
cycles helped: 129 -> 98 (-24.03%) (spills: 17 -> 20 (17.65%); fills: 34 -> 40 (17.65%)) ldst helped: 129 -> 98 (-24.03%) (spills: 17 -> 20 (17.65%); fills: 34 -> 40 (17.65%)) total instructions in shared programs: 2415410 -> 2415372 (<.01%) instructions in affected programs: 1041 -> 1003 (-3.65%) helped: 3 HURT: 0 helped stats (abs) min: 2.0 max: 31.0 x̄: 12.67 x̃: 5 helped stats (rel) min: 2.08% max: 6.02% x̄: 3.90% x̃: 3.60% total tuples in shared programs: 1928558 -> 1928527 (<.01%) tuples in affected programs: 826 -> 795 (-3.75%) helped: 2 HURT: 1 helped stats (abs) min: 6.0 max: 26.0 x̄: 16.00 x̃: 16 helped stats (rel) min: 3.72% max: 9.68% x̄: 6.70% x̃: 6.70% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.54% max: 1.54% x̄: 1.54% x̃: 1.54% total clauses in shared programs: 355013 -> 354981 (<.01%) clauses in affected programs: 220 -> 188 (-14.55%) helped: 3 HURT: 0 helped stats (abs) min: 2.0 max: 27.0 x̄: 10.67 x̃: 3 helped stats (rel) min: 13.99% max: 21.43% x̄: 16.93% x̃: 15.38% total cycles in shared programs: 166610.27 -> 166574.90 (-0.02%) cycles in affected programs: 138 -> 102.62 (-25.63%) helped: 3 HURT: 0 helped stats (abs) min: 0.4583330000000001 max: 31.0 x̄: 11.79 x̃: 3 helped stats (rel) min: 15.28% max: 65.28% x̄: 34.86% x̃: 24.03% total arith in shared programs: 73690.13 -> 73690.58 (<.01%) arith in affected programs: 29.71 -> 30.17 (1.54%) helped: 1 HURT: 2 helped stats (abs) min: 0.0833339999999998 max: 0.0833339999999998 x̄: 0.08 x̃: 0 helped stats (rel) min: 3.85% max: 3.85% x̄: 3.85% x̃: 3.85% HURT stats (abs) min: 0.125 max: 0.4166659999999993 x̄: 0.27 x̃: 0 HURT stats (rel) min: 1.66% max: 5.17% x̄: 3.42% x̃: 3.42% total ldst in shared programs: 135611 -> 135571 (-0.03%) ldst in affected programs: 138 -> 98 (-28.99%) helped: 3 HURT: 0 helped stats (abs) min: 3.0 max: 31.0 x̄: 13.33 x̃: 6 helped stats (rel) min: 24.03% max: 100.00% x̄: 74.68% x̃: 100.00% total quadwords in shared programs: 1674599 -> 1674523 
(<.01%) quadwords in affected programs: 838 -> 762 (-9.07%) helped: 3 HURT: 0 helped stats (abs) min: 2.0 max: 65.0 x̄: 25.33 x̃: 9 helped stats (rel) min: 3.39% max: 15.00% x̄: 9.14% x̃: 9.04% total spills in shared programs: 37 -> 40 (8.11%) spills in affected programs: 17 -> 20 (17.65%) helped: 0 HURT: 1 total fills in shared programs: 190 -> 196 (3.16%) fills in affected programs: 34 -> 40 (17.65%) helped: 0 HURT: 1 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>	2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig	fd021a618f	pan/va: Replace MKVEC.v4i8 with MKVEC.v2i8 This is the instruction that the hardware actually supports. Do the rename, use the more specific accurate model in the IR, and rework the Valhall texturing code to emit MKVEC.v2i8 instead of MKVEC.v4i8. Will fix: dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.* Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>	2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig	c570693c19	pan/va: Pack MKVEC.v2i8 byte lanes They are in a different place, but the encoding is otherwise as usual. This will be required for texture gathers with dynamic offsets. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>	2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig	10301885ab	pan/bi: Constant fold MKVEC.v2i8 Constant MKVEC.v2i8 will be generated during texturing on Valhall, just like constant MKVEC.v4i8 is currently generated. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>	2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig	2833d0472a	pan/bi: Model MKVEC.v2i8 Valhall does not have Bifrost's 4-source MKVEC.v4i8. Instead, it has a (somewhat limited) 3-source MKVEC.v2i8. The full MKVEC.v4i8 may be lowered to a pair of MKVEC.v2i8 instructions. For good code quality on both Bifrost and Valhall, we need to model both instructions in their full generality. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>	2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig	6792b15971	pan/bi: Remove FRSCALE from IR It's just LDEXP in different clothing. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>	2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig	21bedd2c97	pan/va: Rename RSCALE to LDEXP This avoids needless variation from Bifrost. While at it, fix the opcode definition: there are no abs/neg/swizzle modifiers on the signed integer source, and there's no clamp. However, there are round and infinity modes, like on Bifrost. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>	2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig	0da28ee2c7	pan/va: Implement sample positions FAU packing This will fix: dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.at_sample_position.default_framebuffer Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>	2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig	9dd0bc92b5	pan/va: Lower FADD_RSCALE.f32 to FMA_RSCALE.f32 We generate FADD_RSCALE.f32 in our sample variables implementations. Valhall doesn't have a dedicated FADD_RSCALE.f32 implementation, it should be aliased to FMA_RSCALE.f32. Handle that alias in isel lowering. This will fix: dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.* Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>	2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig	1a882ecdab	pan/bi: Align accesses with packed TLS When lowering vars to scratch, we need to be careful with alignment on Valhall, where packed TLS access must not straddle a 16-byte boundary. Fixes regressions when enabling indirect access to temps on Valhall. Fixes: `6761dbf891` ("panfrost: Use packed TLS on Valhall") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>	2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig	5ee1179c94	pan/bi: Fix LD_BUFFER.i16 definition This was missing the message, breaking UBO-to-push and who-knows-what-else, when enabling fp16 const buffers. Fixes: `3dc2095b07` ("pan/bi: Model LD_BUFFER instructions") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>	2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig	40accfd3b7	pan/va: Unit test va_mark_last This pass is super easy to unit test, so we have no excuse not to test thoroughly. va_mark_last only inserts annotations in a shader without any annotations, so our test cases are simply annotated shaders. The CASE macro just has to compare the case against the case with the annotations stripped and added back with va_mark_last. In retrospect, I should have used that technique for the flow control insertion tests too. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>	2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig	4b7e337b45	pan/va: Mark last register reads On Valhall, register reads may be marked as "last" [1]. Setting the last flag promises the hardware that the value of the register is no longer required. This may enable hardware optimizations. In particular, it may permit the hardware to avoid register file writes if a write to the marked register is still in the forwarding buffer. This may improve power efficiency. In principle, this is trivial: run liveness analysis and mark killed sources, like we would in an SSA-based register allocator. In practice, there are a few wrinkles to avoid hazards around staging registers and 64-bit register pairs, requiring some additional data flow analysis and fix ups. However, nothing here is particularly "hard", and all the ideas are already in use for the Bifrost scheduler and the Bifrost/Valhall scoreboard analyses. [1] In Mesa's compiler, this is called discard for historical reasons. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>	2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig	d4377e1255	pan/va: Use validate_register_pair for BLEND pack Instead of open-coding. Noticed by inspection. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>	2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig	b48933d641	pan/va: Include BLEND for va_swap_12 This helps "contain the crazy" and avoids special casing BLEND in compiler passes. The Valhall instruction is roughly the same as its Bifrost counterpart, as long as we fix up the source order (as we already do for bitwise operations) everything works out. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>	2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig	738a1572d2	pan/va: Move va_flow_is_wait_or_none to common We want to use this helper in the "mark last" pass too. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>	2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig	1b29a99b7b	pan/va: Add header guards to valhall_enums.h Otherwise we can't #include in multiple places. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>	2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig	c5a8736552	pan/bi: Constify bi_is_staging_src argument Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>	2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig	2075bff4e8	pan/bi: Mark bi_postra_liveness_ins as MUST_CHECK Post-RA liveness relies on the caller updating the live variable with the results of bi_postra_liveness_ins. It is not automatic, as with regular liveness. This means ignoring the result of bi_postra_liveness_ins is surely an error. Mark it as MUST_CHECK to catch that error at compile time. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>	2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig	43d00c2971	pan/va: Unit test barrier handling Add a unit test for the quirk discovered in the previous commit, because this will cause flakes (instead of fails) if we get it wrong. Better to have a deterministic fail mode. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>	2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig	8c6b9b9c92	pan/va: Workaround quirk of barrier handling For some unknown reason, waiting for general slots (at least for memory stores) doesn't work properly on a BARRIER instruction. We need to wait for all general slots right before issuing the BARRIER in addition to the general wait on the BARRIER itself. I don't know if this is a hardware bug or some hideous gate-saving quirk, but I observe the Mali-G78 DDK using the same workaround, which implies this really is necessary. Fixes rare flakes in: dEQP-GLES31.functional.compute.shared_var.work_group_size.float_128_1_1 Note that the flakes from that test are extremely timing dependent. Without this change, that test is racy but we almost always win the race. Reproducing the issue reliably requires high system load (e.g. running the CTS in the background) and simultaneously running that test a large number of times. Minimal shader-db impact. In particular, no cycle count regressions. total instructions in shared programs: 2699419 -> 2699458 (<.01%) instructions in affected programs: 22014 -> 22053 (0.18%) helped: 2 HURT: 25 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12% HURT stats (abs) min: 1.0 max: 3.0 x̄: 1.64 x̃: 1 HURT stats (rel) min: 0.07% max: 2.82% x̄: 0.69% x̃: 0.49% 95% mean confidence interval for instructions value: 1.01 1.87 95% mean confidence interval for instructions %-change: 0.38% 0.88% Instructions are HURT. total cvt in shared programs: 14468.81 -> 14469.42 (<.01%) cvt in affected programs: 221.33 -> 221.94 (0.28%) helped: 2 HURT: 25 helped stats (abs) min: 0.015625 max: 0.015625 x̄: 0.02 x̃: 0 helped stats (rel) min: 0.18% max: 0.18% x̄: 0.18% x̃: 0.18% HURT stats (abs) min: 0.015625 max: 0.046875 x̄: 0.03 x̃: 0 HURT stats (rel) min: 0.10% max: 4.44% x̄: 1.06% x̃: 0.79% 95% mean confidence interval for cvt value: 0.02 0.03 95% mean confidence interval for cvt %-change: 0.57% 1.36% Cvt are HURT. 
total quadwords in shared programs: 1462496 -> 1462528 (<.01%) quadwords in affected programs: 4632 -> 4664 (0.69%) helped: 0 HURT: 4 HURT stats (abs) min: 8.0 max: 8.0 x̄: 8.00 x̃: 8 HURT stats (rel) min: 0.35% max: 7.69% x̄: 4.03% x̃: 4.03% 95% mean confidence interval for quadwords value: 8.00 8.00 95% mean confidence interval for quadwords %-change: -2.71% 10.76% Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>	2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig	7fa545528d	pan/va: Simplify insert flow tests Test cases for insert flow are necessarily the reference test cases with the NOPs stripped out. That means we don't need to duplicate the test bodies. Deduplicate. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>	2022-06-21 22:19:59 +00:00

155768 Commits