KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Emma Anholt	a5fa7e04d7	ci/lvp: Update the asan fails list. Many tests had been fixed but weren't being run due to test reshuffles from uprevs. Add some explanations for what remains. Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15133>	2022-02-24 02:09:02 +00:00
Alyssa Rosenzweig	6b2eda6b72	pan/bi: Reorder pushed uniforms to avoid moves On Bifrost and Valhall, push uniforms are loaded into Fast Access Uniform Random Access Memory (FAU-RAM). FAU-RAM is organized as an array of 64-bit slots. A given tuple (Bifrost) or instruction (Valhall) may access at most a single 64-bit slot. If an instruction requires uniforms from multiple 64-bit slots, a uniform-to-register move must be inserted to avoid the hazard. However, if an instruction requires a pair of 32-bit uniforms from the same 64-bit slot, no move is required. To reduce the number of moves we emit, this commit adds an optimization pass that reorders pushed uniforms, trying to group uniforms used by the same instruction. The pass works by creating a graph of pushed uniforms, where edges denote the "both 32-bit uniforms required by the same instruction" relationship. We perform depth-first search on this graph to find the connected components, where each connected component is a cluster of uniforms that are used together. We then select pairs of uniforms from each connected component. The remaining unpaired uniforms (from components of odd sizes) are paired together arbitrarily. In principle, we should weight the graph by number of occurrences and choose pairs that maximize the total selected edge weight. This is left for future work, as it is nontrivial -- selecting these edges optimally appears to be NP-hard at first blush. Implementation note: As position and varying shaders share FAU on Bifrost, extra care is taken with a `push_offset` shader stage info parameter that ensures varying shaders do not reorder uniforms selected by the previous position shader. 
total instructions in shared programs: 2503343 -> 2451758 (-2.06%) instructions in affected programs: 1553309 -> 1501724 (-3.32%) helped: 14256 HURT: 8 helped stats (abs) min: 1.0 max: 80.0 x̄: 3.62 x̃: 3 helped stats (rel) min: 0.06% max: 36.36% x̄: 7.31% x̃: 6.67% HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.38 x̃: 1 HURT stats (rel) min: 1.30% max: 12.50% x̄: 4.99% x̃: 3.85% 95% mean confidence interval for instructions value: -3.66 -3.58 95% mean confidence interval for instructions %-change: -7.41% -7.20% Instructions are helped. total tuples in shared programs: 2008399 -> 1969627 (-1.93%) tuples in affected programs: 1146344 -> 1107572 (-3.38%) helped: 12867 HURT: 147 helped stats (abs) min: 1.0 max: 61.0 x̄: 3.03 x̃: 2 helped stats (rel) min: 0.17% max: 42.86% x̄: 6.79% x̃: 4.65% HURT stats (abs) min: 1.0 max: 3.0 x̄: 1.20 x̃: 1 HURT stats (rel) min: 0.29% max: 20.00% x̄: 2.12% x̃: 1.19% 95% mean confidence interval for tuples value: -3.03 -2.93 95% mean confidence interval for tuples %-change: -6.82% -6.57% Tuples are helped. total clauses in shared programs: 408005 -> 401708 (-1.54%) clauses in affected programs: 90760 -> 84463 (-6.94%) helped: 6006 HURT: 164 helped stats (abs) min: 1.0 max: 9.0 x̄: 1.08 x̃: 1 helped stats (rel) min: 0.45% max: 33.33% x̄: 12.44% x̃: 14.29% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.64% max: 25.00% x̄: 9.81% x̃: 5.26% 95% mean confidence interval for clauses value: -1.03 -1.01 95% mean confidence interval for clauses %-change: -12.03% -11.66% Clauses are helped. 
total cycles in shared programs: 203308.37 -> 202737.83 (-0.28%) cycles in affected programs: 19264.71 -> 18694.17 (-2.96%) helped: 3024 HURT: 41 helped stats (abs) min: 0.041665999999999315 max: 2.5416680000000014 x̄: 0.19 x̃: 0 helped stats (rel) min: 0.17% max: 33.33% x̄: 3.83% x̃: 2.83% HURT stats (abs) min: 0.041665999999999315 max: 0.125 x̄: 0.06 x̃: 0 HURT stats (rel) min: 0.30% max: 5.88% x̄: 1.41% x̃: 0.93% 95% mean confidence interval for cycles value: -0.19 -0.18 95% mean confidence interval for cycles %-change: -3.89% -3.64% Cycles are helped. total arith in shared programs: 76265.67 -> 74669.25 (-2.09%) arith in affected programs: 45001.50 -> 43405.08 (-3.55%) helped: 12945 HURT: 97 helped stats (abs) min: 0.041665999999999315 max: 2.5416680000000014 x̄: 0.12 x̃: 0 helped stats (rel) min: 0.17% max: 50.00% x̄: 8.06% x̃: 4.88% HURT stats (abs) min: 0.041665999999999315 max: 0.125 x̄: 0.05 x̃: 0 HURT stats (rel) min: 0.21% max: 33.33% x̄: 2.16% x̃: 0.96% 95% mean confidence interval for arith value: -0.12 -0.12 95% mean confidence interval for arith %-change: -8.16% -7.81% Arith are helped. total quadwords in shared programs: 1796563 -> 1766803 (-1.66%) quadwords in affected programs: 948830 -> 919070 (-3.14%) helped: 12078 HURT: 219 helped stats (abs) min: 1.0 max: 42.0 x̄: 2.49 x̃: 2 helped stats (rel) min: 0.10% max: 33.33% x̄: 5.57% x̃: 5.26% HURT stats (abs) min: 1.0 max: 4.0 x̄: 1.21 x̃: 1 HURT stats (rel) min: 0.33% max: 6.67% x̄: 2.00% x̃: 1.14% 95% mean confidence interval for quadwords value: -2.46 -2.38 95% mean confidence interval for quadwords %-change: -5.52% -5.36% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14163>	2022-02-24 01:35:33 +00:00
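The pairing pass described in 6b2eda6b72 can be illustrated with a small standalone sketch. This is not the Mesa implementation: the function names (`pair_uniforms`, `moves_needed`) and the representation of instructions as tuples of uniform indices are assumptions made for illustration, and pairs are taken in DFS visiting order rather than by any weighting.

```python
from collections import defaultdict


def pair_uniforms(num_uniforms, instructions):
    """Return a new order of 32-bit uniform indices, two per 64-bit slot.

    `instructions` is a list of tuples of uniform indices read by each
    instruction (a hypothetical, simplified stand-in for compiler IR).
    """
    # Edge (a, b): both 32-bit uniforms are required by the same instruction.
    adj = defaultdict(set)
    for uses in instructions:
        uses = sorted(set(uses))
        for i, a in enumerate(uses):
            for b in uses[i + 1:]:
                adj[a].add(b)
                adj[b].add(a)

    seen = set()
    order = []      # uniforms paired within their own component
    leftovers = []  # the odd one out of each odd-sized component

    for root in range(num_uniforms):
        if root in seen:
            continue
        # Iterative depth-first search collects one connected component.
        component, stack = [], [root]
        seen.add(root)
        while stack:
            u = stack.pop()
            component.append(u)
            for v in sorted(adj[u], reverse=True):
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        # Keep an even number of uniforms so pairs stay inside the component.
        if len(component) % 2:
            leftovers.append(component.pop())
        order.extend(component)

    # Remaining unpaired uniforms are paired together arbitrarily.
    return order + leftovers


def moves_needed(order, instructions):
    """Count instructions that read from more than one 64-bit slot."""
    slot_of = {u: i // 2 for i, u in enumerate(order)}
    return sum(1 for uses in instructions
               if len({slot_of[u] for u in uses}) > 1)


instrs = [(0, 3), (1, 4), (2,)]
new_order = pair_uniforms(6, instrs)   # [0, 3, 1, 4, 2, 5]
before = moves_needed(list(range(6)), instrs)  # 2
after = moves_needed(new_order, instrs)        # 0
```

In this toy input, the original layout forces two instructions to straddle slots, while the reordered layout places each co-used pair in one slot. The weighted variant mentioned in the commit message is not attempted here.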
Timothy Arceri	6eec8fcbfa	glsl/nir: free GLSL IR right after we convert to NIR Gives us memory back faster which is useful for pathological CTS tests. The GLSL IR was previously used after converting to NIR for things like building the GL resource list but we have had a NIR version for this for some time and I don't believe there are any other use cases left for keeping the old IR hanging around this long. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15127>	2022-02-24 01:10:49 +00:00
Emma Anholt	0fda2ac4f0	ci/virgl: Drop the bvec4_from_mat4x2_vs xfail. The fix has landed in VK-GL-CTS 1.3.1.0, we were just not noticing it because this is also in the flakes list. Acked-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14962>	2022-02-23 23:09:20 +00:00
Emma Anholt	9e710af830	ci/softpipe: Move most of testing to shared 64-core runners at Google. The single job takes about 3:30 of runner time. I don't have a good explanation for the crash->fail test changes. Acked-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14962>	2022-02-23 23:09:20 +00:00
Emma Anholt	73b37f9ff0	ci/lavapipe: Test 1/3 of lavapipe on the shared 64-core google runners. Now we can get through 1/3 of the testsuite in about 3:30, while previously we did 1/10th. Acked-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14962>	2022-02-23 23:09:20 +00:00
Emma Anholt	0f64f4bdb5	ci/llvmpipe: Move most of testing to shared 64-core runners at Google. These runners are configured to have a single job take up the whole runner, which means we get to use threads to our hearts' content. The pile of cores means we don't need to spawn separate jobs to try to load-balance across fdo's shared runner capacity. Having dedicated runners means we won't get our MRs blocked as much waiting on non-Mesa testing happening on fd.o. We manage to complete all of this llvmpipe testing in about 6:15. Acked-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14962>	2022-02-23 23:09:20 +00:00
Emma Anholt	6859b614a2	ci: Stash the ldd and ccache stats output under collapsed sections. You rarely need to look at these, they're just nice to have sometimes. Acked-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14962>	2022-02-23 23:09:20 +00:00
Samuel Pitoiset	a2c1fa9137	radv: initialize extra state for internal pipelines at one place Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14650>	2022-02-23 22:29:55 +00:00
Samuel Pitoiset	959e8586aa	radv: remove useless radv_blend_state::single_cb_enable field This was only used for meta operations. DCC/FMASK/FCE pipelines only declare one color attachment and the color writemask of the second color attachment is 0 for the HW CB resolve. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14650>	2022-02-23 22:29:55 +00:00
Samuel Pitoiset	8347d3dfd7	radv: initialize VGT_GS_OUT_PRIM_TYPE earlier Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14650>	2022-02-23 22:29:55 +00:00
Samuel Pitoiset	9fb0831ca1	radv: initialize more depth/stencil states earlier Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14650>	2022-02-23 22:29:55 +00:00
Dmitry Baryshkov	b4bef890ee	freedreno/regs: remove 5nm DSI PHY regs 5nm PHY is a variation of 7nm PHY, they use the same register definitions. To remove duplication, drop 5nm defs. Cc: Robert Foss <robert.foss@linaro.org> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15051>	2022-02-23 21:25:22 +00:00
Eric Engestrom	c9e6d3ba73	docs: update calendar and link releases notes for 21.3.7 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15149>	2022-02-23 21:20:34 +00:00
Eric Engestrom	9bb16991b8	docs: add release notes for 21.3.7 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15149>	2022-02-23 21:20:34 +00:00
Dave Airlie	b77ef4dd60	draw/so: don't use pre clip pos if we have a tes either. This check for geom shader needed to be expanded for tess support. dEQP-VK.transform_feedback.simple.depth_clip_control_tese with lvp Fixes: `dacf8f5f5c` ("draw: hook up final bits of tessellation") Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15128>	2022-02-23 20:56:42 +00:00
Alyssa Rosenzweig	31b7ebcbc7	pan/mdg: Fix overflow in intra-bundle interference There are up to 4 instructions in the latter stage (if a branch is included), not 3. Bump the limit to fix memory corruption. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reported-by: Icecream95 <ixn@disroot.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15147>	2022-02-23 20:42:33 +00:00
Jordan Justen	0fffaa9fca	anv: Align state pools to 2MiB on XeHP Suggested-by: Jason Ekstrand <jason.ekstrand@collabora.com> Fixes: `c17e2216dd` ("anv: Align buffer VMA to 2MiB for XeHP") Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15054>	2022-02-23 20:15:24 +00:00
Jordan Justen	5a28d2482f	anv: Align GENERAL_STATE_POOL_MIN_ADDRESS to 2MiB Fixes: `c17e2216dd` ("anv: Align buffer VMA to 2MiB for XeHP") Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15054>	2022-02-23 20:15:24 +00:00
Alyssa Rosenzweig	d986731da9	iris,crocus,i915g: Don't stub flush_frontbuffer This callback is only intended for software rasterizers, layered drivers, and other special drivers that go through the software winsys path. Remove the unimplemented stubs from the Intel drivers. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Dave Airlie <airlied@redhat.com> [crocus] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15118>	2022-02-23 19:49:54 +00:00
Alyssa Rosenzweig	51689a2b80	panfrost: Simplify panfrost_resource_get_handle Unify the exit paths to clean up the logic. There are logically three modes we support (KMS without renderonly, KMS with renderonly, and FD); these each correspond to a leg of a small if statement. Outside of the small if's, everything else should be identical. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Simon Ser <contact@emersion.fr> Reviewed-by: James Jones <jajones@nvidia.com> Acked-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15120>	2022-02-23 18:31:55 +00:00
Alyssa Rosenzweig	b5734cc1c4	panfrost: Fix FD resource_get_handle When handle->type is WINSYS_HANDLE_TYPE_FD, the caller wants a file descriptor for the BO backing the resource. We previously had two paths for this: 1. If rsrc->scanout is available, we prime the GEM handle from the KMS device (rsrc->scanout->handle) to a file descriptor via the KMS device. 2. If rsrc->scanout is not available, we prime the GEM handle from the GPU (bo->gem_handle) to a file descriptor via the GPU device. In both cases, the caller passes in a resource (with BO) and expects out a file descriptor. There are no direct GEM handles in the function signature; the caller doesn't care which GEM handle we prime to get the file descriptor. In principle, both paths produce the same file descriptor for the same BO, since both GEM handles represent the same underlying resource (viewed from different devices). On grounds of redundancy alone, it makes sense to remove the rsrc->scanout path. Why have a path that only works sometimes, when we have another path that works always? In fact, the issues with the rsrc->scanout path are deeper. rsrc->scanout is populated by renderonly_create_gpu_import_for_resource, which does the following: 1. Get a file descriptor for the resource by resource_get_handle with WINSYS_HANDLE_TYPE_FD 2. Prime the file descriptor to a GEM handle via the KMS device. Here comes strike number 2: in order to get a file descriptor via the KMS device, we had to /already/ get a file descriptor via the GPU device. If we go down the KMS device path, we effectively round trip: GPU handle -> fd -> KMS handle -> fd There is no good reason to do this; if everything works, the fd is the same in each case. If everything works. If. The lifetimes of the GPU handle and the KMS handle are not necessarily bound. In principle, a resource can be created with scanout (constructing a KMS handle). 
Then the KMS view can be destroyed (invalidating the GEM handle for the KMS device), even though the underlying resource is still valid. Notice the GPU handle is still valid; its lifetime is tied to the resource itself. Then a caller can ask for the FD for the resource; as the resource is still valid, this is sensible. Under the scanout path, we try to get the FD by priming the GEM handle on the KMS device... but that GEM handle is no longer valid, causing the PRIME ioctl to fail with ENOENT. On the other hand, if we primed the GPU GEM handle, everything works as expected. These edge cases are not theoretical; recent versions of Xwayland trigger this ENOENT, causing issue #5758 on all Panfrost devices. As far as I can tell, no other kmsro driver has this 'special' kmsro path; the only part of resource_get_handle that needs special handling for kmsro is getting a KMS handle. Let's remove the broken, useless path, fix Xwayland, bring us in line with other drivers, and delete some code. Thank you for coming to my ted talk. Closes: #5758 Fixes: `7da251fc72` ("panfrost: Check in sources for command stream") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reported-and-tested-by: Jan Palus <jpalus@fastmail.com> Reviewed-by: Simon Ser <contact@emersion.fr> Reviewed-by: James Jones <jajones@nvidia.com> Acked-by: Daniel Stone <daniels@collabora.com> Tested-by: Dan Johansen <strit@manjaro.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15120>	2022-02-23 18:31:55 +00:00
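The lifetime problem described in b5734cc1c4 can be modelled with a toy sketch. Everything here is hypothetical: `ToyDevice` and its methods are illustrative stand-ins, not the DRM or Gallium API, and the "fd" is just a token naming the underlying buffer. The point is only that a handle table per device means one device's handle can go away while the other's stays valid.

```python
import errno


class ToyDevice:
    """Hypothetical stand-in for a DRM device: maps GEM handles to buffers."""

    def __init__(self, name):
        self.name = name
        self.handles = {}
        self.next_handle = 1

    def import_buffer(self, buffer_id):
        # Create a device-local handle referring to the shared buffer.
        handle = self.next_handle
        self.next_handle += 1
        self.handles[handle] = buffer_id
        return handle

    def close_handle(self, handle):
        # Invalidates the handle on this device only.
        del self.handles[handle]

    def prime_handle_to_fd(self, handle):
        if handle not in self.handles:
            raise OSError(errno.ENOENT,
                          f"no GEM handle {handle} on {self.name}")
        # The "fd" is modelled as a token identifying the buffer.
        return ("fd", self.handles[handle])


gpu = ToyDevice("gpu")
kms = ToyDevice("kms")

# One buffer, viewed through a handle on each device.
gpu_handle = gpu.import_buffer("bo0")
kms_handle = kms.import_buffer("bo0")

# While both handles are alive, either path yields the same buffer.
fd_via_gpu = gpu.prime_handle_to_fd(gpu_handle)
fd_via_kms = kms.prime_handle_to_fd(kms_handle)

# The KMS view is destroyed; the resource and its GPU handle live on.
kms.close_handle(kms_handle)

try:
    kms.prime_handle_to_fd(kms_handle)
    kms_error = None
except OSError as exc:
    kms_error = exc.errno  # ENOENT, as in the failure the commit describes

fd_after = gpu.prime_handle_to_fd(gpu_handle)  # still works
```

Under this model, exporting through the GPU handle succeeds in every state where the resource exists, which is the argument the commit message makes for dropping the scanout path.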
Dmitry Baryshkov	22efeec399	freedreno/registers: add new register for 7nm DSI PHY v4.3 (sm8450) Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15052>	2022-02-23 17:28:17 +00:00
Alyssa Rosenzweig	04b80489d5	ci: Disable windows-vs2019 Currently down. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15148>	2022-02-23 15:12:41 +00:00
Rhys Perry	ded9cb904f	anv: Enable nir_opt_access This commit enables the pass that searches for readonly / writeonly access when it's missing. We don't support shaderStorageImageReadWithoutFormat and the optimization pass causes those shaders to take the write-only path which does support formatless. The following games are affected, with positive results: - Wolfenstein: Youngblood - Wolfenstein II: The New Colossus https://gitlab.freedesktop.org/mesa/mesa/-/issues/3138 - Rage 2 https://gitlab.freedesktop.org/mesa/mesa/-/issues/5791 - The Surge 2 https://gitlab.freedesktop.org/mesa/mesa/-/issues/5805 - Metro Exodus https://gitlab.freedesktop.org/mesa/mesa/-/issues/4703 - DOOM Eternal https://gitlab.freedesktop.org/mesa/mesa/-/issues/4273 Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3138,https://gitlab.freedesktop.org/mesa/mesa/-/issues/5791,https://gitlab.freedesktop.org/mesa/mesa/-/issues/4273 Signed-off-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15082>	2022-02-23 13:11:12 +00:00
Alyssa Rosenzweig	abb7f04674	panfrost: Inline pan_emit_sfbd_tiler Easier to read, the common code was already common. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>	2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig	910d4f8245	panfrost: Remove pan_emit_fbd thunking Use a common interface. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>	2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig	8dc7757754	panfrost: Remove unrelated comment Not sure what this was supposed to describe, but it's not the code here. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>	2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig	099d61c95d	panfrost: Use txl instead of tex in the blitter We always blit from a particular level, so it's a waste to compute the LOD. This corresponds to a simple texture instruction with implement 0 LOD, which is the optimal texturing path on Bifrost -- it maps to TEXS_2D but does not require helper invocations. Functional change on Bifrost: Blit shaders no longer set .computed_lod or shader_contains_barrier. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>	2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig	5b1a00c565	panfrost: Inline pan_blit_emit_dcd Easier to follow the logic without having a million arguments passed around. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>	2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig	c9784c9512	panfrost: Decouple tiler job and DCD emit We can share the "emit quad" logic, even though the DCDs differ. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>	2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig	a13d87c484	panfrost: Annotate slow clears as such We should realistically be using the clear shaders from PanVK once they're moved to common. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>	2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig	1eb3dbafdb	panfrost: Set defaults for deprecated DCD fields These are always set to true. Don't pollute the driver code with them, make their existence a local detail to pre-Valhall XML and that's it. Functional change: "four components per vertex" is now set on vertex job DCDs. This should be a no-op. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>	2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig	bd3d7e33b6	panfrost: Use pan_shader_prepare_rsd in blitter This reduces code duplication and will ease Valhall porting. Functional changes on v7: * Shader contains barrier is now set (perf loss, fixed later in series) * Shader register allocation is now set (perf win) * Point sprite inverted, no-op for blit shaders Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>	2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig	6fc81f163e	pan/mdg: Fix partial execution mode names cont -> skip, last -> kill, and fix the special case handling. It's just an enum. Makes the disassembly easier to read and closer to Bifrost. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>	2022-02-23 12:56:30 +00:00
Danylo Piliaiev	7e703e4428	turnip: Always use GMEM for feedback loops in autotuner For ordinary feedback loops GMEM is a lot faster than sysmem since we don't set SINGLE_PRIM mode. For feedback loops with ordered rasterization GMEM should also be faster. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>	2022-02-23 11:31:59 +00:00
Danylo Piliaiev	ebc23ac963	turnip: Implement VK_ARM_rasterization_order_attachment_access Trivially implemented by using A6XX_GRAS_SC_CNTL_SINGLE_PRIM_MODE. This extension is useful for emulators e.g. AetherSX2 PS2 emulator and could drastically improve performance when blending is emulated. Relevant tests: dEQP-VK.rasterization.rasterization_order_attachment_access.* Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>	2022-02-23 11:31:59 +00:00
Danylo Piliaiev	d6c89e1e4a	turnip: Merge LRZ and DEPTH_PLANE draw states They were emitted at the same time. Frees 1 draw state for us to use. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>	2022-02-23 11:31:59 +00:00
Danylo Piliaiev	dab34bd5c8	turnip: Use LATE_Z when there might be depth/stencil feedback loop Otherwise a shader invocation would read the value which should have been set AFTER this shader invocation. Fixes tests: dEQP-VK.rasterization.rasterization_order_attachment_access.depth.samples_1.multi_draw_barriers dEQP-VK.rasterization.rasterization_order_attachment_access.stencil.samples_1.multi_draw_barriers Fixes: `71595a189a` ("tu: Fix feedback loops in sysmem mode") Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>	2022-02-23 11:31:59 +00:00
Paulo Zanoni	d10fd5b7c9	iris: fix register spilling on compute shaders on XeHP XeHP scratch space is handled differently. Commit `ae18e1e707` implemented support for it, but handled it differently between render and compute shaders: it calculates scratch_addr differently and doesn't pin the buffer on compute. Make it work on compute shaders by calling pin_scratch_space() from iris_compute_walker(), which fixes both the address and the pinning. This commit can be verified by the two-year-old-but-still-unreviewed Piglit MR 234. You can also verify this by running a very simple compute shader with INTEL_DEBUG=spill_fs. References: https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/234 Fixes: `ae18e1e707` ("iris: Add support for scratch on XeHP") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15070>	2022-02-22 22:16:57 +00:00
Kenneth Graunke	c46d3acf0e	anv: Raise vertex input bindings and attributes limits slightly This raises our vertex input bindings limit from 28 to 31, and our vertex input attribute limit from 28 to 29. We could theoretically go higher, but it will take additional work. The 3DSTATE_VERTEX_BUFFERS and 3DSTATE_VERTEX_ELEMENTS limits are 33 vertex buffers, and 34 vertex elements. But we need up to two vertex elements for system values (FirstVertex, BaseVertex, BaseInstance, DrawID), and we currently use two vertex bindings for those. There is another hidden limit: our compiler backend only supports the push model for VS inputs currently. 3DSTATE_VS only allows URB Read Lengths between [0, 15], which is measured in pairs of inputs, which means we can theoretically push no more than 32 vertex elements. This is no artificial limit either, as a vec4 element takes up 4 registers in the payload, and 32 * 4 = 128, the entire size of our register file. Plus, the VS Thread payload needs at least g0 and g1 for other things, so we can really only push 31. We can theoretically support one additional binding, by combining our two SGV bindings into a single upload. In order to support additional vertex elements, we would need to add support to the backend compiler for the pull model for VS inputs. References: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5917 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14991>	2022-02-22 21:31:06 +00:00
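The limits in c46d3acf0e can be checked with a few lines of arithmetic. The constants below are the figures quoted in the commit message; the variable names are made up for illustration, and the final subtraction for attributes is one reading that happens to reproduce the stated limit of 29, since the message does not spell that step out.

```python
# Figures quoted in the commit message.
HW_VERTEX_BUFFERS = 33       # 3DSTATE_VERTEX_BUFFERS limit
HW_VERTEX_ELEMENTS = 34      # 3DSTATE_VERTEX_ELEMENTS limit
SGV_BINDINGS = 2             # bindings currently used for system values
SGV_ELEMENTS = 2             # elements needed for system values
REGISTER_FILE = 128          # registers in the register file
REGS_PER_VEC4 = 4            # payload registers per vec4 element
RESERVED_PAYLOAD_REGS = 2    # g0 and g1

# Bindings: hardware limit minus the two system-value bindings.
max_bindings = HW_VERTEX_BUFFERS - SGV_BINDINGS

# Push model: elements that fit once g0 and g1 are set aside.
pushable_elements = (REGISTER_FILE - RESERVED_PAYLOAD_REGS) // REGS_PER_VEC4

# Attributes (assumed derivation): the tighter of the two element limits,
# minus the elements reserved for system values.
max_attributes = min(HW_VERTEX_ELEMENTS, pushable_elements) - SGV_ELEMENTS

print(max_bindings, pushable_elements, max_attributes)
```

Combining the two system-value bindings into one upload, as the message suggests, would change `SGV_BINDINGS` to 1 and raise the binding limit by one in this model.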
Mike Blumenkrantz	dabba7d726	zink: ci updates Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15067>	2022-02-22 21:16:55 +00:00
Mike Blumenkrantz	3029000389	zink: remove zink_descriptor_util_init_null_set() no longer used Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15067>	2022-02-22 21:16:55 +00:00
Mike Blumenkrantz	7266182be0	zink: allow null descriptor set layouts I got confused while writing this somehow because of the null descriptor feature, which enables drivers to consume a null descriptor, which has no relation to a descriptor layout containing no descriptors. Failing to accurately use zero descriptors can put layouts over the maximum per-stage limits, which causes tests to crash. Fixes (lavapipe): KHR-GL46.shading_language_420pack.binding_uniform_block_array KHR-GL46.multi_bind.dispatch_bind_buffers_base Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15067>	2022-02-22 21:16:55 +00:00
Timur Kristóf	3759a16d8a	ac/nir/ngg: Fix mixed up primitive ID after culling. When NGG culling is enabled, make sure that the correct primitive ID is exported by each lane. Fixes: `e97f0463a8` "ac/nir: Implement NGG deferred attribute culling in NIR." Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6050 Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15055>	2022-02-22 18:15:24 +00:00
Mike Blumenkrantz	c063d8ff64	zink: prune ci lists I don't know why I thought running GL3.2 and GL4.6 was a good idea, but it wasn't Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15065>	2022-02-22 18:02:00 +00:00
Emma Anholt	59bc17d57a	turnip: Request no implicit sync when we have no implicit-sync WSI BOs. I chose to implement this as a global flag in the device, because otherwise we would end up with extra draw overhead trying to avoid it in the implicit-sync WSI case, and you're probably going to end up needing implicit sync anyway because you used one of the BOs in any of the submitted cmdbufs. To do better than this, we would probably want a skip-implicit-sync flag on the BOs in the BO list, rather than global on the submit. Reports about venus on turnip say that this flag reduces worst-case QueueSubmit time in a game workload from ~10ms to ~4ms. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14838>	2022-02-22 17:36:05 +00:00
Samuel Pitoiset	83ee08f6d1	radv: fix build on BSD Just disable inotify for BSD systems. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6060 Fixes: `c50557d961` ("radv: allow applications to dynamically change RADV_FORCE_VRS") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15105>	2022-02-22 17:16:21 +00:00
Alyssa Rosenzweig	2e86767370	pan/bi: Add BIFROST_MESA_DEBUG=nosb option To disable the new scoreboarding optimizations when debugging. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>	2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig	c81c022e66	pan/bi: Implement basic scoreboarding pass Extend our existing bi_scoreboard infrastructure with a simple data flow analysis pass that calculates which dependency slots need waiting. We still lack a heuristic for selecting dependency slots. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>	2022-02-22 16:57:30 +00:00


150492 Commits
