mirrors/mesa - Frog Git

Commit Graph

Author	SHA1	Message	Date
Erik Faye-Lund	a16f3963d3	lavapipe: fix reported subpixel precision for lines We have no reason to report a subpixel precision of 4 for lines; in fact LLVMpipe uses 8 subpixel bits for lines, similar to other primitives. But let's use the pipe-cap for this instead of hard-coding it. Fixes: `9fbf6b2abf` ("lavapipe: implement VK_EXT_line_rasterization") Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12277>	2021-08-24 08:45:20 +00:00
Vinson Lee	0a4c4f4459	broadcom/compiler: Fix qpu.flags.muf typo. Fix defect reported by Coverity Scan. Same on both sides (CONSTANT_EXPRESSION_RESULT) pointless_expression: The expression inst->qpu.flags.auf != V3D_QPU_UF_NONE \|\| inst->qpu.flags.auf != V3D_QPU_UF_NONE does not accomplish anything because it evaluates to either of its identical operands, inst->qpu.flags.auf != V3D_QPU_UF_NONE. Fixes: `3f2c54a27f` ("broadcom/compiler: rewrite partial update liveness tracking") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12385>	2021-08-24 08:30:59 +00:00
Erik Faye-Lund	4d6e18b6cb	llvmpipe: improve polygon-offset precision This performs the polygon offset addition after interpolation, which prevents floating-point cancellation issues completely. This does mean that we have to perform a single floating-point addition more per fragment than before, unless we also want to spend a bit in the fragment-shader variant key to avoid this. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12442>	2021-08-24 07:36:31 +00:00
Erik Faye-Lund	1fa61483de	llvmpipe: split coefficient calculation and store This will be used for some underhanded smuggling of values in the next commit. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12442>	2021-08-24 07:36:31 +00:00
Erik Faye-Lund	8565333669	llvmpipe: clamp z to 0..1 range when using polygon offset The OpenGL 4.6 compatibility spec, section 14.6.5 (Depth Offset) says the following: > For fixed-point depth buffers, fragment depth values are always > limited to the range [0,1] by clamping after offset addition is > performed. Fragment depth values are clamped even when the depth > buffer uses a floating-point representation. So we need to properly clamp the result here. This fixes the following dEQP failures, that the CI has missed: - dEQP-GLES3.functional.polygon_offset.default_result_depth_clamp - dEQP-GLES3.functional.polygon_offset.default_factor_1_slope - dEQP-GLES3.functional.polygon_offset.fixed24_result_depth_clamp - dEQP-GLES3.functional.polygon_offset.fixed24_factor_1_slope Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12442>	2021-08-24 07:36:31 +00:00
Dave Airlie	63138c42c5	crocus: copy views before adjusting The current code overwrote the original view which meant if we had to reemit a surface the second emit would be wrong. This fixes cubemaps on gm45 and maybe some issues with 3D textures elsewhere. Fixes: `f3630548f1` ("crocus: initial gallium driver for Intel gfx 4-7") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12514>	2021-08-24 02:21:01 -04:00
Vinson Lee	4fc2a6cbdb	freedreno: Require C++17. Commit `3a772be026` ("freedreno: Add perfetto renderpass support") uses C++17 init-statement feature. GCC ../src/gallium/drivers/freedreno/freedreno_perfetto.cc: In lambda function: ../src/gallium/drivers/freedreno/freedreno_perfetto.cc:148:11: warning: init-statement in selection statements only available with ‘-std=c++17’ or ‘-std=gnu++17’ 148 \| if (auto state = tctx.GetIncrementalState(); state->was_cleared) { \| ^~~~ Clang ../src/gallium/drivers/freedreno/freedreno_perfetto.cc:148:11: warning: 'if' initialization statements are a C++17 extension [-Wc++17-extensions] if (auto state = tctx.GetIncrementalState(); state->was_cleared) { ^ Intel C++ Compiler ../src/gallium/drivers/freedreno/freedreno_perfetto.cc(148): error: expected a ")" if (auto state = tctx.GetIncrementalState(); state->was_cleared) { ^ Fixes: `3a772be026` ("freedreno: Add perfetto renderpass support") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5193 Suggested-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Signed-off-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Rob Clark <robdclark@chromium.org> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12293>	2021-08-23 21:15:48 -07:00
Jason Ekstrand	31fdd26d01	intel/compiler: Add unified barrier support for CS Program CS barrier message fields for producers/consumers. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>	2021-08-24 01:31:48 +00:00
Jordan Justen	6a950bab0c	intel/compiler: Add unified barrier support for TCS Program the producers/consumer fields for TCS Barrier messages. Producer and consumer fields are set to number of TCS threads. Ref: Bspec 54006 for Barrier Data Payload Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>	2021-08-24 01:31:48 +00:00
Jordan Justen	b4055a020f	intel/compiler: Regroup TCS barrier code paths Rearrange if/else fragments to unify case for Gen11 or later platforms. This will help the code look cleaner for adding unified barrier support to TCS. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>	2021-08-24 01:31:48 +00:00
Alyssa Rosenzweig	0606af1d4a	panfrost: Rip out primconvert code This is handled in common Gallium code if we set the appropriate CAP. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Suggested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12509>	2021-08-24 00:53:38 +00:00
Alyssa Rosenzweig	2d31d469f7	panfrost: Fix NULL dereference in allowlist code If a user attempts to run Panfrost on an unsupported GPU (e.g. Mali T604), Panfrost will refuse to load and will destroy the screen immediately, allowing for a graceful fallback to a software rasterizer. However, the screen destroy code calls a screen_destroy function in the GenXML vtbl -- and this function is still NULL when the allowlist is checked. This manifests as crashes on unsupported GPUs. Issue tracked down with Icecream95's mad Ghidra skills. Closes: #5269 Fixes: `88dc4db6be` ("panfrost: Init/destroy blitter from per-gen file") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reported-by: Icecream95 <ixn@disroot.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12512>	2021-08-24 00:38:31 +00:00
Nanley Chery	2944f49610	intel: Parse INTEL_NO_HW for devinfo construction This commit does several things: * Unify code common to several drivers by evaluating INTEL_NO_HW within intel_get_device_info_from_fd (suggested by Jordan). * For drivers that keep a copy of the intel_device_info struct, a separate copy of the no_hw field is now unnecessary. Remove them. * Minimize kernel queries when INTEL_NO_HW is true. This is done for code simplification, but we may find reason to undo this later on. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>	2021-08-24 00:12:47 +00:00
Nanley Chery	7d59a66e3a	intel: Use env_var_as_boolean for INTEL_NO_HW The prior method of checking the result of getenv() for NULL would cause the feature to be enabled for INTEL_NO_HW=0. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>	2021-08-24 00:12:47 +00:00
Alyssa Rosenzweig	e245468eb4	panfrost: Port v5 blend shader issue to blitter This is a presumed erratum workaround. Fixes INSTR_INVALID_PC faults on some draw_buffers_indexed.* cases on Midgard, where a blend shader is required to pack RT n > 0. Backport the workaround from the GL driver. The helper is now in common code for panvk to use as well; it has the same bug. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>	2021-08-23 20:54:34 +00:00
Alyssa Rosenzweig	3113dbd837	panfrost: Zero initialize blend_shaders Fixes an invalid read caught by valgrind when there is a hole in the valid render target mask: ==6749== Conditional jump or move depends on uninitialised value(s) ==6749== at 0x5E88EC0: panfrost_prepare_fs_state (pan_cmdstream.c:417) ==6749== by 0x5E88EC0: panfrost_emit_frag_shader (pan_cmdstream.c:501) ==6749== by 0x5E88EC0: panfrost_emit_frag_shader_meta (pan_cmdstream.c:573) ==6749== by 0x5E88EC0: panfrost_update_state_fs (pan_cmdstream.c:2593) ==6749== by 0x5E8B0BF: panfrost_direct_draw (pan_cmdstream.c:2839) Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Fixes: `a124c47b9f` ("panfrost: Fix NULL derefs in pan_cmdstream.c") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>	2021-08-23 20:54:34 +00:00
Alyssa Rosenzweig	5c4b54ce96	pan/mdg: Handle swapped 565 and 1010102 unorm Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>	2021-08-23 20:54:34 +00:00
Alyssa Rosenzweig	82a6b38d8c	pan/lower_framebuffer: Don't open-code pan_unpacked_type_for_format Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>	2021-08-23 20:54:34 +00:00
Alyssa Rosenzweig	5fe35012c9	pan/lower_framebuffer: Don't open-code pad_vec4 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>	2021-08-23 20:54:34 +00:00
Alyssa Rosenzweig	58e96e4aa2	pan/lower_framebuffer: Don't treat UNORM 4 special Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>	2021-08-23 20:54:34 +00:00
Alyssa Rosenzweig	0169f7aac8	pan/lower_framebuffer: Unify UNORM handling Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>	2021-08-23 20:54:34 +00:00
Alyssa Rosenzweig	851620562a	pan/lower_framebuffer: Use fmul_imm Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>	2021-08-23 20:54:33 +00:00
Alyssa Rosenzweig	eda3e7f32c	pan/lower_framebuffer: Don't replicate so much We need to replicate to deal with multisampling, but not otherwise. Simplify the logic substantially. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>	2021-08-23 20:54:33 +00:00
Alyssa Rosenzweig	f45ceb8182	pan/mdg: Insert moves before writeout when needed Otherwise we end up accessing overwritten registers. Fixes dEQP-GLES31.functional.draw_buffers_indexed.overwrite_common.common_enable_buffer_enable Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>	2021-08-23 20:54:33 +00:00
Alyssa Rosenzweig	7cc3a7ff45	panfrost: Delete unpacks for blendable formats Unnecessary. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>	2021-08-23 20:54:33 +00:00
Alyssa Rosenzweig	2cf581b195	panfrost: Use blendable check for tib read check These are the same! Either you're blendable and can use f32/f16 conversion, or you're raw and you can only get raw. It's that simple! Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>	2021-08-23 20:54:33 +00:00
Alyssa Rosenzweig	85ab479d24	panfrost: Fix UNORM 10 sizes Fixes: `56047fb64d` ("panfrost: Fix UNORM 16 rendering") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>	2021-08-23 20:54:33 +00:00
Alyssa Rosenzweig	6dfdeea213	panfrost: Remove unneeded quirks from T760 Will cause trouble later in the series when we start garbage collecting unneeded code. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>	2021-08-23 20:54:33 +00:00
Boris Brezillon	6b7b8eb046	panfrost: Add explicit padding to pan_blend_shader_key So the hash function doesn't end up hashing uninitialized values. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reported-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Fixes: `bbff09b952` ("panfrost: Move the blend shader cache at the device level") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>	2021-08-23 20:54:33 +00:00
Tomeu Vizoso	27367cf018	panfrost: Add padding to pan_blit_blend_shader_key So the hashtable helpers know the correct size of the struct. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>	2021-08-23 20:54:33 +00:00
Kenneth Graunke	9cc303ffbb	iris: Mark the aux table buffers with EXEC_OBJECT_CAPTURE. Having these could be useful when tracking down GPU hangs. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12420>	2021-08-23 13:28:23 -07:00
Kenneth Graunke	7bb4ada8e0	iris: Bypass the BO cache when allocating buffers for aux map tables When freeing a buffer, we may return a non-idle buffer to the cache, which means we cannot unmap aux entries at that time. Instead, we defer unmapping the stale aux entry until we reuse a BO from the cache. Unfortunately, this can lead to a recursive locking issue: 1. intel_aux_map_add_mapping wants to set up a new aux entry It takes the intel_aux_map_context::mutex lock, then calls: add_mapping -> get_aux_entry -> add_sub_table -> add_buffer -> intel_aux_map_buffer_alloc -> iris_bo_alloc 2. iris_bo_alloc tries to allocate a BO from the cache, doing: alloc_bo_from_cache -> intel_aux_map_unmap_range -> intel_aux_unmap_range ...which then tries to take the intel_aux_map_context::mutex lock. But it is already locked. One solution would be to rework the aux map handling code to allocate BOs without holding its lock, but that looks to be painful. Another is to make the lock recursive, but we try and avoid that. A third option would be to add a BO_ALLOC flag that makes alloc_bo_from_cache skip any buffers with aux_map_address != 0 so we don't have to unmap, making the cache less effective but fixing the recursive lock. A fourth option is to simply bypass the BO cache altogether for the buffers that hold the aux map itself. Allocating new BOs for the aux tables should be relatively rare, so there's probably not a lot of benefit in using the BO cache. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5191 Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12420>	2021-08-23 13:28:22 -07:00
Yiwei Zhang	e9be86adda	venus: scrub ignored fields of pipeline info when rasterization is disable v2: use vk_alloc instead of vk_zalloc because of full memcpy Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> (v1) Reviewed-by: Ryan Neph <ryanneph@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12499>	2021-08-23 20:00:58 +00:00
Yiwei Zhang	b816167312	venus: fix all missing vn_object_base_fini Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Ryan Neph <ryanneph@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12498>	2021-08-23 18:51:38 +00:00
Matt Turner	c600494a8e	tu: Enable VK_KHR_uniform_buffer_standard_layout This extension relaxes the alignment requirements to allow the GL std430 layout to be used. freedreno/ir3 already supports this (via PIPE_CAP_LOAD_CONSTBUF). Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12495>	2021-08-23 18:30:22 +00:00
Samuel Pitoiset	07cd30ca29	nir/opt_algebraic: optimize fmax(-fmin(b, a), b) -> fmax(b, -a) Found with Cyberpunk 2077. fossils-db (GFX10.3): Totals from 128 (2.34% of 5465) affected shaders: CodeSize: 769720 -> 767656 (-0.27%); split: -0.27%, +0.00% Instrs: 145748 -> 145229 (-0.36%) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11604>	2021-08-23 17:53:38 +00:00
Dave Airlie	0cddfba328	vulkan/wsi/sw: wait for image fence before submitting to queue With hw devices, when you submit a present, implicit sync will make sure the work submitted to the gpu on the client will end up happening before the present work submitted on the server. However with sw paths there is no real GPU, the lavapipe fake GPU thread is client side only and presenting is done directly from the pixmap (or later shared pixmap). In order for this to make sense the wsi common code should wait for the fence on the image before queueing the submit to the server so that all client work has been flushed to the pixmap before the copy or present operation is submitted. Fixes: `8004fa9c95` ("vulkan/wsi: add sw support. (v2)") Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12502>	2021-08-24 03:30:17 +10:00
Rhys Perry	b23a9dd1f6	aco/scheduler: allow moving down VMEM stores to below VMEM loads fossil-db (Vega10): Totals from 93 (0.06% of 150305) affected shaders: SGPRs: 4832 -> 4768 (-1.32%) VGPRs: 4084 -> 4144 (+1.47%) CodeSize: 316080 -> 317208 (+0.36%); split: -0.11%, +0.47% MaxWaves: 589 -> 580 (-1.53%) Instrs: 60229 -> 60511 (+0.47%); split: -0.15%, +0.61% Latency: 636477 -> 540029 (-15.15%); split: -15.26%, +0.10% InvThroughput: 293027 -> 283043 (-3.41%); split: -4.21%, +0.80% VClause: 2557 -> 2716 (+6.22%); split: -0.35%, +6.57% SClause: 1381 -> 1395 (+1.01%); split: -0.14%, +1.16% Copies: 9424 -> 9728 (+3.23%); split: -0.74%, +3.97% fossil-db (Sienna Cichlid): Totals from 88 (0.06% of 150170) affected shaders: VGPRs: 3840 -> 3872 (+0.83%) CodeSize: 300544 -> 300960 (+0.14%); split: -0.09%, +0.23% Instrs: 53714 -> 53871 (+0.29%); split: -0.05%, +0.35% Latency: 489854 -> 462001 (-5.69%); split: -6.30%, +0.61% InvThroughput: 100307 -> 95142 (-5.15%); split: -5.50%, +0.35% VClause: 2322 -> 2564 (+10.42%); split: -0.39%, +10.81% SClause: 1345 -> 1358 (+0.97%); split: -0.15%, +1.12% Copies: 4113 -> 4351 (+5.79%); split: -0.66%, +6.44% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12211>	2021-08-23 16:48:31 +00:00
Erik Faye-Lund	eb60d8c7b9	llvmpipe: use preferred attribute interpolation for wide lines When rasterizing legacy-lines, OpenGL defines the width as being an extrusion along the minor axis, repeating varyings. While the spec does allow for an alternative method that matches our current results, the OpenGL ES CTS doesn't allow these results even if OpenGL ES has the same wording of an alternative method. This is technically speaking a bug in the OpenGL ES CTS, but it seems like nobody else is using the alternative formulation, at least not while passing the OpenGL ES CTS. On top of this, the OpenGL specification explicitly lists the extrusion results as the preferred method. So it seems like a good idea for us to do this the way the OpenGL specification prefers regardless; it's going to give less surprising results to applications, and it's helping us pass some tests. This math to set these up would "trivially" be: dx = (dx * dx + dy * dy) / dx; dy = 0 and: dy = (dx * dx + dy * dy) / dy; dx = 0 ...but since we've already calculated dxdy, we can reformulate this to save a division. This fixes the following dEQP test-cases: - dEQP-GLES2.functional.rasterization.interpolation.basic.line_loop_wide - dEQP-GLES2.functional.rasterization.interpolation.basic.line_strip_wide - dEQP-GLES2.functional.rasterization.interpolation.basic.lines_wide - dEQP-GLES2.functional.rasterization.interpolation.projected.line_loop_wide - dEQP-GLES2.functional.rasterization.interpolation.projected.line_strip_wide - dEQP-GLES2.functional.rasterization.interpolation.projected.lines_wide - dEQP-GLES3.functional.rasterization.fbo.rbo_singlesample.interpolation.lines_wide - dEQP-GLES3.functional.rasterization.fbo.texture_2d.interpolation.lines_wide - dEQP-GLES3.functional.rasterization.interpolation.basic.line_loop_wide - dEQP-GLES3.functional.rasterization.interpolation.basic.line_strip_wide - dEQP-GLES3.functional.rasterization.interpolation.basic.lines_wide - dEQP-GLES3.functional.rasterization.interpolation.projected.line_loop_wide - dEQP-GLES3.functional.rasterization.interpolation.projected.line_strip_wide - dEQP-GLES3.functional.rasterization.interpolation.projected.lines_wide Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11315>	2021-08-23 15:43:48 +00:00
Rhys Perry	2201f5a58c	aco: remove label_extract if the extract is used by a non-VALU If an extract is used by a non-VALU instruction, it can't be applied to all instructions, so it's not beneficial to try to apply it. This check isn't needed because can_apply_extract()/can_use_SDWA() should already handle non-VALU instructions. fossil-db (Sienna Cichlid): Totals from 1020 (0.68% of 150170) affected shaders: SpillSGPRs: 1577 -> 1571 (-0.38%) CodeSize: 7863668 -> 7858336 (-0.07%); split: -0.07%, +0.00% Instrs: 1431583 -> 1431083 (-0.03%); split: -0.04%, +0.01% Latency: 25891250 -> 25890916 (-0.00%); split: -0.01%, +0.01% InvThroughput: 7248683 -> 7248655 (-0.00%); split: -0.01%, +0.01% SClause: 49072 -> 49071 (-0.00%) Copies: 126649 -> 126580 (-0.05%); split: -0.11%, +0.06% Branches: 39129 -> 39120 (-0.02%); split: -0.03%, +0.01% PreSGPRs: 53071 -> 52943 (-0.24%); split: -0.26%, +0.02% PreVGPRs: 57437 -> 57435 (-0.00%); split: -0.01%, +0.01% fossil-db (Polaris10): Totals from 654 (0.43% of 151696) affected shaders: CodeSize: 5814552 -> 5811568 (-0.05%); split: -0.05%, +0.00% Instrs: 1105783 -> 1105049 (-0.07%); split: -0.07%, +0.00% Latency: 20261458 -> 20259744 (-0.01%); split: -0.01%, +0.00% InvThroughput: 9011785 -> 9011749 (-0.00%); split: -0.00%, +0.00% Copies: 104693 -> 103904 (-0.75%); split: -0.76%, +0.00% PreSGPRs: 36105 -> 36095 (-0.03%); split: -0.03%, +0.01% PreVGPRs: 43813 -> 43809 (-0.01%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12212>	2021-08-23 14:56:37 +01:00
Samuel Pitoiset	e0353296da	radv: allocate shaders to 32-bit address to skip PGM_HI This reduces the number of emitted registers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12466>	2021-08-23 11:28:21 +00:00
Samuel Pitoiset	2dc90ca8a4	radv: don't use SQ_NON_EVENT before GE_PC_ALLOC for better perf on Navi1x It seems to make the perf worse. Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12466>	2021-08-23 11:28:21 +00:00
Daniel Schürmann	77ffdf41b1	aco: add more validation rules for SDWA operands Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>	2021-08-23 10:31:40 +00:00
Daniel Schürmann	077776a866	aco/opcodes: remove definition_size[] Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>	2021-08-23 10:31:40 +00:00
Daniel Schürmann	f6b281a1c2	aco/validate: simplify get_subdword_bytes_written() Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>	2021-08-23 10:31:40 +00:00
Daniel Schürmann	ec1bbfa608	aco/ra: refactor subdword operand stride Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>	2021-08-23 10:31:40 +00:00
Daniel Schürmann	c75138ed64	aco/ra: refactor subdword definition info Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>	2021-08-23 10:31:40 +00:00
Daniel Schürmann	e11b23f7cd	aco: add instr_is_16bit() helper function to indicate whether some instruction writes partial registers, only. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>	2021-08-23 10:31:40 +00:00
Daniel Schürmann	3d6ca41e44	aco: use VOPC_SDWA on GFX9+ Totals from 5138 (3.42% of 150170) affected shaders: (GFX10.3) VGPRs: 409520 -> 409416 (-0.03%); split: -0.03%, +0.00% CodeSize: 43056360 -> 43035696 (-0.05%); split: -0.06%, +0.02% MaxWaves: 69296 -> 69310 (+0.02%) Instrs: 8161016 -> 8153365 (-0.09%); split: -0.10%, +0.01% Latency: 109397002 -> 109756208 (+0.33%); split: -0.05%, +0.38% InvThroughput: 23238920 -> 23310761 (+0.31%); split: -0.11%, +0.42% VClause: 135141 -> 135100 (-0.03%); split: -0.05%, +0.02% SClause: 349511 -> 349489 (-0.01%); split: -0.01%, +0.00% Copies: 388107 -> 387754 (-0.09%); split: -0.48%, +0.38% Branches: 184629 -> 184503 (-0.07%); split: -0.08%, +0.01% PreSGPRs: 258807 -> 258839 (+0.01%) PreVGPRs: 372561 -> 372184 (-0.10%); split: -0.10%, +0.00% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>	2021-08-23 10:31:40 +00:00
Daniel Schürmann	60e171af06	aco/print_ir: fix printing of VOPC_SDWA definitions Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>	2021-08-23 10:31:40 +00:00
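
Note on commit 7d59a66e3a (intel: Use env_var_as_boolean for INTEL_NO_HW): the message says that testing the result of getenv() against NULL enabled the feature even for INTEL_NO_HW=0. The Python sketch below only illustrates that class of bug. The helper names `env_is_set` and `env_as_boolean` and the accepted spellings ("1", "true", "yes", "y", and so on) are assumptions made for this example; they are not Mesa's implementation of env_var_as_boolean.

```python
import os


def env_is_set(name):
    # Old-style check: true whenever the variable exists, whatever its value.
    return os.environ.get(name) is not None


def env_as_boolean(name, default=False):
    # Hypothetical boolean parser (an assumption, not Mesa's code):
    # the value itself decides, and unknown values fall back to the default.
    value = os.environ.get(name)
    if value is None:
        return default
    text = value.strip().lower()
    if text in ("1", "true", "yes", "y"):
        return True
    if text in ("0", "false", "no", "n"):
        return False
    return default


os.environ["INTEL_NO_HW"] = "0"
print(env_is_set("INTEL_NO_HW"))      # True: the presence check is fooled by "0"
print(env_as_boolean("INTEL_NO_HW"))  # False: the value is actually parsed
```

With a presence check, the only way to turn the option off is to unset the variable; parsing the value lets INTEL_NO_HW=0 mean "off", which is the behaviour the commit message describes wanting.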


144246 Commits
