KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Alyssa Rosenzweig	d429187bf3	pan/mdg: Analyze helper execution requirements Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>	2020-05-12 22:30:41 +00:00
Alyssa Rosenzweig	3228b3106a	pan/mdg: Analyze helper invocation termination Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>	2020-05-12 22:30:41 +00:00
Alyssa Rosenzweig	0da03c68ae	pan/mdg: Explain helper invocations dataflow theory Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>	2020-05-12 22:30:41 +00:00
Arcady Goldmints-Orlov	95fd950d35	intel/compiler: fix alignment assert in nir_emit_intrinsic Fixes: `c643979228` (intel/fs: Choose memory message type based on bit size) Fixes: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_i8vec2 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5000>	2020-05-12 22:14:31 +00:00
Eric Anholt	a663c595bc	freedreno: Skip taking the lock for resource usage if it's already flagged. Improves nohw drawoverhead 8-ubos update throughput by 13.493% +/- 0.391444% (n=15). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5011>	2020-05-12 21:42:00 +00:00
Eric Anholt	356f99161d	freedreno: Move the resource_read early out to an inline. Looking at perf, the drawoverhead test case was now spending 13% CPU (89% in that function) on stack management. nohw drawoverhead throughput 1.03902% +/- 0.380257% (n=13). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>	2020-05-12 21:19:50 +00:00
Eric Anholt	d393837332	freedreno: Add an early out for preparing to read a resource. nohw drawoverhead 8 UBOs test throughput 1.06093% +/- 0.363376% (n=10). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>	2020-05-12 21:19:50 +00:00
Eric Anholt	3e424bcdfc	freedreno: Split the fd_batch_resource_used by read vs write. This is for an optimization I plan in a following commit. I found I had to add likely()s to avoid a perf regression from branch prediction. On the drawoverhead 8 UBOs test, the HW can't quite keep up with the CPU, but if I set nohw then this change is 1.32023% +/- 0.373053% (n=10). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>	2020-05-12 21:19:50 +00:00
Eric Anholt	fdcadf611e	freedreno: Add a nohw flag to skip submitting to the kernel. For some CPU-side-only optimizations, it can be nice to disable rendering so that we can see what the impact is even on cases where the GPU can't quite keep up. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>	2020-05-12 21:19:50 +00:00
Brian Ho	a43e974064	turnip: Execute ir3_nir_lower_gs pass again This commit fixes a GS regression introduced in !4562 where ir3's GS lowering pass was moved from common code (ir3_nir) to freedreno-specific code (ir3_shader). For GS support in turnip, we need to add the GS lowering pass back in, this time in tu_shader. As for the nir_gather_info change, the GS lowering pass has always introduced a discard_if intrinsic into the GS. Previously, we simply ran nir_shader_gather_info before GS lowering, but now since we lower the GS before we need to remove the assertion that only a FS can use the discard_if intrinsic. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4892>	2020-05-12 13:42:55 -07:00
Rob Clark	1bd38746d5	freedreno/gmem: rework gmem layout algo And try a bit harder to find an optimal layout. Improves on a sub- optimal layout we arrive at in the 4 MRT pass in manhattan, picking up a bit more than 3%. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>	2020-05-12 18:16:48 +00:00
Rob Clark	c46f46befe	freedreno/gmem: relax alignment on a6xx The blob only uses single page alignment, and empirically that appears to work just fine. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>	2020-05-12 18:16:48 +00:00
Rob Clark	ad6e06621b	freedreno: add gmemtool A simple standalone thing to run through a bunch of GMEM layouts for a given gpu. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>	2020-05-12 18:16:48 +00:00
Rob Clark	ef5f238fd0	freedreno/gmem: add helper to dump GMEM layout Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>	2020-05-12 18:16:48 +00:00
Rob Clark	6a49d9c396	freedreno/gmem: add div_align() helper Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>	2020-05-12 18:16:48 +00:00
Rob Clark	96b5a70f45	freedreno: initialize max_scissor Somehow the initialization of this got lost somewhere along the way, resulting in assuming minx/miny are always zero. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>	2020-05-12 18:16:48 +00:00
Rob Clark	1387e77801	freedreno/gmem: don't assume scissor opt when estimating # of bins We potentially don't know yet what the resulting scissor bounds are, so we can't assume this when estimating number of bins per pipe for VSC size calculations. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>	2020-05-12 18:16:48 +00:00
Jason Ekstrand	3c87618d35	vulkan: Handle vkGet/SetPrivateDataEXT on Android swapchains There is an annoying spec corner on Android. Because VkSwapchain is implemented in the Vulkan loader on Android which may not know about this extension, we have to handle it as a special case inside the driver. We only have to do this on Android and only for VkSwapchainKHR. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4882>	2020-05-12 18:01:48 +00:00
Jason Ekstrand	51c6bc13ce	anv,vulkan: Implement VK_EXT_private_data Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4882>	2020-05-12 18:01:48 +00:00
Jonathan Marek	d76e722ed6	turnip: enable tiling for compressed formats Now that layout code supports this, we can enable it. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>	2020-05-12 17:25:38 +00:00
Jonathan Marek	f543d87f23	turnip: update "fetchsize" value to match fdl6_layout changes It seems this is actually a "minimum pitch" value. For example TFETCH6_2_BYTE means a minimum pitch of 128 bytes for mipmap levels. This fixes breakage with compressed formats. For example this test: dEQP-VK.pipeline.sampler.view_type.2d.format.eac_r11_snorm_block.mipmap.linear.lod.equal_min_3_max_3 Fixes: `a34b3fa198` ("freedreno/fdl: Align after dividing by block size") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>	2020-05-12 17:25:38 +00:00
Eric Anholt	f789c5975c	freedreno: Fix non-constbuf-upload UBO block indices and count. The nir_analyze_ubo_ranges pass removes all UBO block 0 loads to reverse what nir_lower_uniforms_to_ubo() had done, and we only upload UBO pointers to the HW for UBO block 1-N, so let's just fix up the shader state. Fixes an off by one in const state layout setup, and some really dodgy register addressing trying to deal with dynamic UBO indices when the UBO pointers happen to be at the start of the constbuf. There's no fixes tag, though this fixes a bug from September, because it would require the num_ubos fix in nir_lower_uniforms_to_ubo. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4992>	2020-05-12 17:01:55 +00:00
Eric Anholt	4553fc66a5	nir: Fix count when we didn't lower load_uniforms but did shift load_ubos. The fixed commit was really nice in mostly fixing num_ubos to reflect the shader after lowering, but for dEQP-GLES31.functional.compute.basic.ubo_to_ssbo_single_invocation there are no default uniforms and so we skipped the increment, even though we shifted the block index up. Fixes: `4777ee1a62` ("nir: Always create UBO variable when lowering uniforms to ubo") Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4992>	2020-05-12 17:01:55 +00:00
Eric Anholt	0f2e44d55b	freedreno: Drop the "write" arg to emit_const_bo now relocs don't care. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Eric Anholt	51d7a71bd4	freedreno: Replace OUT_RELOCW with OUT_RELOC. Final cleanup commit now that they're the same. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Eric Anholt	064f395a89	freedreno: Tell the kernel that all BOs are for writing. Using non-write flags is pretty dubious -- it means the kernel tracking an array of read-only consumers of the BO and having exclusive consumers wait on each reader's fence. It allows multiple readers through dma-bufs to do work in parallel, but at the cost of kernel CPU time and memory management of the shared array. Other drivers have dropped this distinction since dma-buf sharing is usually producer-consumer, not producer-two-consumers, and the userspace and kernel space tracking is expensive. For us, this lets us drop the flags passed in for relocs and tracked in the ringbuffer reloc lists. The end result of the flags reduction work is drawoverhead uniforms test throughput 2.37195% +/- 0.365579% (n=15) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Eric Anholt	b2c23b1e48	freedreno: Mark all ringbuffer BOs as to be dumped on crash. We can avoid passing these flags around in the DRM backends by just marking ring BOs up front. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Eric Anholt	554b959df0	freedreno: Replace OUT_RELOCD with permanently flagging shader BOs for it. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Eric Anholt	9d8d936dfc	freedreno: Start moving relocs flags into the BOs. It's silly to have all the reloc emitters passing around FD_RELOC_READ when you have to have it set on all relocs (that don't include WRITE, which implies read) for the kernel to actually track the fences on the BO. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Samuel Pitoiset	4235624b6a	aco: optimize add/sub(a, cndmask(b, 0, 1, cond)) -> addc/subbrev_co(0, a, b) v2: outline into a separate function and also optimize additions (by Daniel Schürmann) Totals from affected shaders: (VEGA) SGPRS: 938888 -> 941496 (0.28 %) VGPRS: 832068 -> 831532 (-0.06 %) Spilled SGPRs: 618 -> 618 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 3696 -> 3696 (0.00 %) dwords per thread Code Size: 72893900 -> 72558928 (-0.46 %) bytes LDS: 18201 -> 18201 (0.00 %) blocks Max Waves: 64256 -> 64268 (0.02 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Co-authored-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4419>	2020-05-12 16:15:17 +00:00
Daniel Schürmann	a5fc96b533	aco: coalesce parallelcopies during register allocation These are the result of lowering to CSSA, and should be removed if possible Totals from affected shaders: (VEGA) SGPRS: 544544 -> 544544 (0.00 %) VGPRS: 418224 -> 418224 (0.00 %) Spilled SGPRs: 141826 -> 141826 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 65853740 -> 64703380 (-1.75 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 13669 -> 13669 (0.00 %) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4952>	2020-05-12 15:59:31 +00:00
Jon Turney	38cc649fcb	glthread: Fix use of alloca() without #include "c99_alloca.h" ../src/mesa/main/glthread_draw.c: In function ‘_mesa_marshal_MultiDrawElementsBaseVertex’: ../src/mesa/main/glthread_draw.c:812:36: error: implicit declaration of function ‘alloca’; did you mean ‘malloc’? [-Werror=implicit-function-declaration] 812 \| const GLvoid *out_indices = alloca(sizeof(indices[0]) draw_count); \| ^~~~~~ \| malloc ../src/mesa/main/glthread_draw.c:812:36: error: initialization of ‘const GLvoid ’ {aka ‘const void ’} from ‘int’ makes pointer from integer without a cast [-Werror=int-conversion] cc1: some warnings being treated as errors Include c99_alloca.h to portably make the alloca() prototype available. Fixes: `2840bc30` Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4920>	2020-05-12 14:46:12 +00:00
Lucas Stach	dc6c42dc77	etnaviv: generalize FE stall before loading shader and sampler states It seems that some of the new shader and sampler states added with Halti0 are not self-synchronizing anymore. Make sure to stall the FE before loading those new states to avoid corruption of the in-flight draw state. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3963>	2020-05-12 16:13:31 +02:00
Daniel Stone	8e5fc97be6	CI: Re-enable Panfrost T7x0 jobs The hardware issue in the lab preventing jobs from being run on those machines (and limiting T820 availability), leading to them being disabled in !4965, has been fixed. Signed-off-by: Daniel Stone <daniels@collabora.com> Fixes: `696bafac40` ("CI: Disable Panfrost T7x0 jobs") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5006>	2020-05-12 11:33:06 +01:00
Samuel Pitoiset	8c6350d2bb	radv: update the list of allowed Android extensions Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4985>	2020-05-12 10:29:48 +02:00
Samuel Pitoiset	021270cb31	radv: handle different Vulkan API versions correctly Loosely based on ANV. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4985>	2020-05-12 10:29:46 +02:00
Samuel Pitoiset	69430921fc	radv: limit the Vulkan version to 1.1 for Android Vulkan 1.2 seems rejected. This hardcodes the Android version to 1.1.107. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2936 Fixes: `7f5462e349` ("radv: enable Vulkan 1.2") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4985>	2020-05-12 10:29:44 +02:00
Gert Wollny	50eabb7035	r600: Fix nir compiler options, i.e. don't lower IO to temps for TESS Also fix alignments and add umad24 and umul24 options. Fixes: `6747a984f5` r600: Enable tesselation for NIR Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4982>	2020-05-12 06:34:07 +00:00
Alejandro Piñeiro	f7fcbe9830	v3d/tex: use TMUSLOD register if possible TMUSLOD register is the same that TMUS but having the same effect that setting disable_autolod on the TMU configuration parameter 2. So using that register is potentially more efficient, as in several cases we would be able to skip writing P2. One case where we can't use it is for texture cube maps, as we need to use TMUSCM. v2: don't put a comment in the middle of the conditions (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4962>	2020-05-11 23:52:46 +00:00
Alejandro Piñeiro	c3af695bb0	v3d/tex: set up default values for Configuration Parameter 1 if possible Texture access has three configuration parameters, P0 (texture), P1 (sampler) and P2(lookup). P1 and P2 are optional, but if P2 is needed (like for example to set the offset for texelFetchOffset), then you need to set P1. But until now when setting up P1 we were asking the driver to fill up the address with the shader state. But in that case we can just fill that address with the default value NULL. So let's avoid asking the driver to fill that default values, and do it directly on the compiler. This is a good-to-have on OpenGL, and likely would be needed on Vulkan. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4962>	2020-05-11 23:52:46 +00:00
Alejandro Piñeiro	50c2c76ea3	v3d/tex: only look up the 2nd texture gather offset for 1d non-arrays Commit `1bc71e8b65` already did that for the 3rd offset, but it also needs to do it for the 2nd (to handle 1d array). Fixes assertion failures with Vulkan CTS tests using 1darray targets. Seems that there isn't too many 1darray tests on OpenGL CTS, and OpenGL-ES don't support 1d arrays, but the same problem could arise eventually on OpenGL. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4962>	2020-05-11 23:52:46 +00:00
Ani	ad8c5bba0a	drirc: Enable glthread for rpcs3 Closes: #2939 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4988>	2020-05-11 23:25:19 +00:00
Icecream95	d1290e7948	pan/midgard: Fix old style shadows This fixes the sky being red in OpenMW, as well as some of the Mesa demos using shadows (shadowtex, shadow_sampler). Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4997>	2020-05-12 10:36:30 +12:00
Axel Davy	47bfc799da	gallium/util: Fix leak in the live shader cache When the nir backend is used, the create_shader call is supposed to release state->ir.nir. When the cache hits, create_shader is not called, thus state->ir.nir should be freed. There is nothing to be done for the TGSI case as the tokens release is done by the caller. This fixes a leak noticed in: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2931 Fixes: `4bb919b0b8` Signed-off-by: Axel Davy <davyaxel0@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4980>	2020-05-11 19:42:37 +00:00
Ian Romanick	412e29c277	nir/algebraic: Eliminate useless extract before unpack The shader helped for spills and fills is the big compute shader in Dirt Showdown. One of the shaders hurt for spills and fills on Broadwell is the big compute shader in Bioshock Infinite, but combined with the previous commit, it's still an impovement. Tiger Lake total instructions in shared programs: 21833218 -> 21832449 (<.01%) instructions in affected programs: 66104 -> 65335 (-1.16%) helped: 106 HURT: 14 helped stats (abs) min: 1 max: 67 x̄: 7.87 x̃: 5 helped stats (rel) min: 0.19% max: 5.76% x̄: 1.27% x̃: 0.95% HURT stats (abs) min: 1 max: 14 x̄: 4.64 x̃: 1 HURT stats (rel) min: 0.19% max: 4.12% x̄: 1.41% x̃: 0.19% 95% mean confidence interval for instructions value: -8.51 -4.30 95% mean confidence interval for instructions %-change: -1.23% -0.69% Instructions are helped. total cycles in shared programs: 506180109 -> 506196314 (<.01%) cycles in affected programs: 1671429 -> 1687634 (0.97%) helped: 37 HURT: 84 helped stats (abs) min: 1 max: 490 x̄: 73.27 x̃: 24 helped stats (rel) min: 0.02% max: 7.98% x̄: 1.25% x̃: 0.41% HURT stats (abs) min: 1 max: 5000 x̄: 225.19 x̃: 8 HURT stats (rel) min: 0.03% max: 10.22% x̄: 1.22% x̃: 0.42% 95% mean confidence interval for cycles value: 2.85 265.00 95% mean confidence interval for cycles %-change: 0.04% 0.88% Cycles are HURT. Ice Lake and Skylake had similar results. (Ice Lake shown) total instructions in shared programs: 19961317 -> 19960543 (<.01%) instructions in affected programs: 30268 -> 29494 (-2.56%) helped: 39 HURT: 0 helped stats (abs) min: 1 max: 142 x̄: 19.85 x̃: 7 helped stats (rel) min: 0.19% max: 7.87% x̄: 2.33% x̃: 2.31% 95% mean confidence interval for instructions value: -29.46 -10.23 95% mean confidence interval for instructions %-change: -2.95% -1.71% Instructions are helped. total cycles in shared programs: 498863755 -> 498865843 (<.01%) cycles in affected programs: 1831136 -> 1833224 (0.11%) helped: 57 HURT: 65 helped stats (abs) min: 1 max: 1400 x̄: 128.93 x̃: 25 helped stats (rel) min: 0.05% max: 3.49% x̄: 0.89% x̃: 0.71% HURT stats (abs) min: 1 max: 1887 x̄: 145.18 x̃: 15 HURT stats (rel) min: 0.02% max: 9.88% x̄: 1.83% x̃: 0.73% 95% mean confidence interval for cycles value: -58.30 92.53 95% mean confidence interval for cycles %-change: 0.16% 0.97% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 8774 -> 8773 (-0.01%) spills in affected programs: 20 -> 19 (-5.00%) helped: 1 HURT: 0 total fills in shared programs: 9496 -> 9494 (-0.02%) fills in affected programs: 40 -> 38 (-5.00%) helped: 1 HURT: 0 Broadwell total instructions in shared programs: 17859373 -> 17858548 (<.01%) instructions in affected programs: 38452 -> 37627 (-2.15%) helped: 31 HURT: 0 helped stats (abs) min: 1 max: 143 x̄: 26.61 x̃: 10 helped stats (rel) min: 0.19% max: 7.87% x̄: 2.57% x̃: 2.69% 95% mean confidence interval for instructions value: -39.79 -13.44 95% mean confidence interval for instructions %-change: -3.25% -1.89% Instructions are helped. total cycles in shared programs: 525858109 -> 525869236 (<.01%) cycles in affected programs: 2058597 -> 2069724 (0.54%) helped: 44 HURT: 75 helped stats (abs) min: 2 max: 1330 x̄: 187.84 x̃: 23 helped stats (rel) min: 0.04% max: 31.31% x̄: 2.13% x̃: 0.85% HURT stats (abs) min: 1 max: 3915 x̄: 258.56 x̃: 47 HURT stats (rel) min: 0.02% max: 10.53% x̄: 2.81% x̃: 2.21% 95% mean confidence interval for cycles value: -26.06 213.07 95% mean confidence interval for cycles %-change: 0.19% 1.78% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 25744 -> 25730 (-0.05%) spills in affected programs: 1578 -> 1564 (-0.89%) helped: 4 HURT: 2 total fills in shared programs: 31710 -> 31689 (-0.07%) fills in affected programs: 4346 -> 4325 (-0.48%) helped: 3 HURT: 3 Haswell total instructions in shared programs: 16228399 -> 16227783 (<.01%) instructions in affected programs: 22201 -> 21585 (-2.77%) helped: 27 HURT: 0 helped stats (abs) min: 1 max: 68 x̄: 22.81 x̃: 11 helped stats (rel) min: 0.19% max: 7.87% x̄: 2.92% x̃: 2.86% 95% mean confidence interval for instructions value: -31.96 -13.66 95% mean confidence interval for instructions %-change: -3.68% -2.15% Instructions are helped. total cycles in shared programs: 538613967 -> 538701354 (0.02%) cycles in affected programs: 1653044 -> 1740431 (5.29%) helped: 36 HURT: 81 helped stats (abs) min: 2 max: 708 x̄: 104.50 x̃: 17 helped stats (rel) min: <.01% max: 15.01% x̄: 1.67% x̃: 0.65% HURT stats (abs) min: 1 max: 30100 x̄: 1125.30 x̃: 304 HURT stats (rel) min: 0.02% max: 16.21% x̄: 8.98% x̃: 11.60% 95% mean confidence interval for cycles value: 23.78 1470.01 95% mean confidence interval for cycles %-change: 4.29% 7.12% Cycles are HURT. total spills in shared programs: 23418 -> 23409 (-0.04%) spills in affected programs: 177 -> 168 (-5.08%) helped: 2 HURT: 0 total fills in shared programs: 25919 -> 25896 (-0.09%) fills in affected programs: 568 -> 545 (-4.05%) helped: 3 HURT: 0 Ivy Bridge total instructions in shared programs: 15265983 -> 15265759 (<.01%) instructions in affected programs: 8418 -> 8194 (-2.66%) helped: 5 HURT: 0 helped stats (abs) min: 18 max: 99 x̄: 44.80 x̃: 26 helped stats (rel) min: 1.74% max: 4.26% x̄: 3.12% x̃: 3.00% 95% mean confidence interval for instructions value: -86.29 -3.31 95% mean confidence interval for instructions %-change: -4.43% -1.81% Instructions are helped. total cycles in shared programs: 422930336 -> 422929589 (<.01%) cycles in affected programs: 59347 -> 58600 (-1.26%) helped: 3 HURT: 2 helped stats (abs) min: 72 max: 1060 x̄: 433.33 x̃: 168 helped stats (rel) min: 1.14% max: 3.48% x̄: 2.23% x̃: 2.06% HURT stats (abs) min: 265 max: 288 x̄: 276.50 x̃: 276 HURT stats (rel) min: 4.79% max: 5.64% x̄: 5.22% x̃: 5.22% 95% mean confidence interval for cycles value: -829.08 530.28 95% mean confidence interval for cycles %-change: -4.43% 5.93% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 4953 -> 4946 (-0.14%) spills in affected programs: 344 -> 337 (-2.03%) helped: 2 HURT: 0 total fills in shared programs: 5548 -> 5521 (-0.49%) fills in affected programs: 838 -> 811 (-3.22%) helped: 2 HURT: 0 No shader-db changes on any earlier Intel platform. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>	2020-05-11 12:07:01 -07:00
Ian Romanick	bc0bbb8f0b	nir/algebraic: Add some half packing optimizations for pack_half_2x16_split Like `1f72857739` ("nir/algebraic: add some half packing optimizations"), but for the pack_half_2x16_split variant. The shader helped for spills and fills is the big compute shader in Bioshock Infinite. Tiger Lake total instructions in shared programs: 21834539 -> 21833218 (<.01%) instructions in affected programs: 60119 -> 58798 (-2.20%) helped: 105 HURT: 0 helped stats (abs) min: 5 max: 50 x̄: 12.58 x̃: 9 helped stats (rel) min: 0.86% max: 26.46% x̄: 2.58% x̃: 1.70% 95% mean confidence interval for instructions value: -14.35 -10.81 95% mean confidence interval for instructions %-change: -3.20% -1.97% Instructions are helped. total cycles in shared programs: 506215169 -> 506180109 (<.01%) cycles in affected programs: 1445088 -> 1410028 (-2.43%) helped: 97 HURT: 8 helped stats (abs) min: 1 max: 16882 x̄: 387.76 x̃: 26 helped stats (rel) min: 0.05% max: 18.31% x̄: 1.77% x̃: 1.34% HURT stats (abs) min: 21 max: 635 x̄: 319.12 x̃: 212 HURT stats (rel) min: 0.39% max: 20.08% x̄: 8.96% x̃: 4.46% 95% mean confidence interval for cycles value: -782.96 115.15 95% mean confidence interval for cycles %-change: -1.74% -0.16% Inconclusive result (value mean confidence interval includes 0). Ice Lake, Skylake, and Broadwell had similar results. (Ice Lake shown) total instructions in shared programs: 19962974 -> 19961317 (<.01%) instructions in affected programs: 63471 -> 61814 (-2.61%) helped: 105 HURT: 0 helped stats (abs) min: 6 max: 82 x̄: 15.78 x̃: 11 helped stats (rel) min: 1.11% max: 28.65% x̄: 3.17% x̃: 2.16% 95% mean confidence interval for instructions value: -18.38 -13.18 95% mean confidence interval for instructions %-change: -3.86% -2.48% Instructions are helped. total cycles in shared programs: 498908953 -> 498863755 (<.01%) cycles in affected programs: 1566998 -> 1521800 (-2.88%) helped: 89 HURT: 15 helped stats (abs) min: 2 max: 17502 x̄: 532.19 x̃: 69 helped stats (rel) min: 0.07% max: 18.54% x̄: 4.71% x̃: 3.12% HURT stats (abs) min: 3 max: 661 x̄: 144.47 x̃: 16 HURT stats (rel) min: 0.14% max: 20.57% x̄: 4.29% x̃: 0.30% 95% mean confidence interval for cycles value: -903.93 34.74 95% mean confidence interval for cycles %-change: -4.50% -2.32% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 8776 -> 8774 (-0.02%) spills in affected programs: 25 -> 23 (-8.00%) helped: 1 HURT: 0 total fills in shared programs: 9500 -> 9496 (-0.04%) fills in affected programs: 46 -> 42 (-8.70%) helped: 1 HURT: 0 Haswell total instructions in shared programs: 16229912 -> 16228399 (<.01%) instructions in affected programs: 61257 -> 59744 (-2.47%) helped: 105 HURT: 0 helped stats (abs) min: 6 max: 51 x̄: 14.41 x̃: 11 helped stats (rel) min: 0.77% max: 28.65% x̄: 3.08% x̃: 2.15% 95% mean confidence interval for instructions value: -16.14 -12.68 95% mean confidence interval for instructions %-change: -3.77% -2.40% Instructions are helped. total cycles in shared programs: 538654481 -> 538613967 (<.01%) cycles in affected programs: 1448966 -> 1408452 (-2.80%) helped: 58 HURT: 47 helped stats (abs) min: 9 max: 22604 x̄: 957.00 x̃: 74 helped stats (rel) min: 0.40% max: 18.81% x̄: 6.22% x̃: 3.03% HURT stats (abs) min: 5 max: 3720 x̄: 318.98 x̃: 49 HURT stats (rel) min: 0.20% max: 34.50% x̄: 5.05% x̃: 2.12% 95% mean confidence interval for cycles value: -999.84 228.14 95% mean confidence interval for cycles %-change: -2.86% 0.51% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 15266086 -> 15265983 (<.01%) instructions in affected programs: 7272 -> 7169 (-1.42%) helped: 3 HURT: 0 helped stats (abs) min: 21 max: 41 x̄: 34.33 x̃: 41 helped stats (rel) min: 0.66% max: 5.43% x̄: 2.44% x̃: 1.23% total cycles in shared programs: 422930883 -> 422930336 (<.01%) cycles in affected programs: 49259 -> 48712 (-1.11%) helped: 3 HURT: 0 helped stats (abs) min: 106 max: 221 x̄: 182.33 x̃: 220 helped stats (rel) min: 0.71% max: 5.95% x̄: 2.46% x̃: 0.72% No changes on any earilier Intel platforms. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>	2020-05-11 12:07:01 -07:00
Ian Romanick	a2bf41ec65	nir/algebraic: Optimize ushr of pack_half, not ishr When a = -1.0, pack_half_2x16(vec2(0x0000, 0xBC00)) will produce 0xBC000000. The ishr will produce 0xFFFFBC00. The replacement pack_half_2x16(vec2(0xBC00, 0x0000)) will produce 0x0000BC00. Fixes: `1f72857739` ("nir/algebraic: add some half packing optimizations") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>	2020-05-11 12:07:01 -07:00
Kenneth Graunke	ab16bff97d	intel: Delete hardcoded devinfo->urb.size values for Gen7+ (sans DG1). On all Gen7+ platforms except DG1, the URB is a subsection of the configurable L3 cache, and so the size can vary. The size listed in the documentation on those platforms is an "example size", picked by calculating it based on an arbitrarily chosen L3 config. Hardcoding a value for those platforms provides no value and only confuses people trying to fill out these tables when doing hardware enabling. anv and iris never use this field. i965 uses it to initialize brw->urb.size, but then updates that in update_urb_size() to be the correct value, so the initial value doesn't matter. Delete the values for Gen7+ and update the comment accordingly. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4969>	2020-05-11 09:40:56 -07:00
Abhishek Kumar	0bea2a1321	egl: Limit the EGL ver for android Android support EGL 1.5 from Q onwards, so limit EGL ver to 1.4 for P and below. Closes: #2892 Signed-off-by: Abhishek Kumar <abhishek4.kumar@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4951>	2020-05-11 13:06:22 +00:00
Serge Martin	9c839e6394	amd/common: Fix incorrect use of asprintf instead of vasprintf Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2020-05-11 12:54:41 +02:00

... 2 3 4 5 6 ...

123775 Commits All Branches Search

123775 Commits

All Branches