KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Kenneth Graunke	10560f8506	iris: Minor tidying	2019-07-03 22:24:44 -07:00
Kenneth Graunke	8551dc17a7	iris: Disable loop unrolling in GLSL IR. Leave it to NIR instead, like i965 does. Thanks to Tim Arceri for noticing that I'd left this enabled by accident. shader-db results on Skylake: total instructions in shared programs: 15522628 -> 15521642 (<.01%) instructions in affected programs: 94008 -> 93022 (-1.05%) helped: 34 HURT: 33 helped stats (abs) min: 12 max: 48 x̄: 33.82 x̃: 42 helped stats (rel) min: 0.06% max: 22.14% x̄: 9.86% x̃: 10.89% HURT stats (abs) min: 1 max: 16 x̄: 4.97 x̃: 3 HURT stats (rel) min: 0.82% max: 3.77% x̄: 1.73% x̃: 1.53% 95% mean confidence interval for instructions value: -20.08 -9.35 95% mean confidence interval for instructions %-change: -5.95% -2.36% Instructions are helped. total cycles in shared programs: 367105221 -> 367074230 (<.01%) cycles in affected programs: 10017660 -> 9986669 (-0.31%) helped: 266 HURT: 184 helped stats (abs) min: 1 max: 9556 x̄: 151.35 x̃: 12 helped stats (rel) min: 0.08% max: 59.91% x̄: 4.66% x̃: 1.67% HURT stats (abs) min: 1 max: 1716 x̄: 50.37 x̃: 6 HURT stats (rel) min: <.01% max: 24.40% x̄: 2.42% x̃: 0.85% 95% mean confidence interval for cycles value: -133.90 -3.84 95% mean confidence interval for cycles %-change: -2.44% -1.10% Cycles are helped. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-26 22:55:03 -07:00
Caio Marcelo de Oliveira Filho	5bd48ff252	iris: Enable INTEL_shader_atomic_float_minmax Supported only for gen >= 9. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-06-13 09:03:58 -07:00
Caio Marcelo de Oliveira Filho	9c81db8adb	iris: Enable PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED This avoids lowering of CS system values by GLSL (configured by state tracker). In i965 we don't use that lowering, and we also shouldn't need that in Iris. Using it cause some unnecessary round trip between values, e.g.: shader uses gl_LocalInvocationIndex, GLSL rewrites it in terms of gl_LocalInvocationID, then driver rewrites those in terms of gl_LocalInvocationIndex again. Copy propagation can make some of those go away, but not all as seen below. Intel SKL shader-db results: total instructions in shared programs: 15595189 -> 15594556 (<.01%) instructions in affected programs: 74880 -> 74247 (-0.85%) helped: 81 HURT: 4 helped stats (abs) min: 2 max: 172 x̄: 7.88 x̃: 4 helped stats (rel) min: 0.19% max: 5.66% x̄: 1.71% x̃: 1.23% HURT stats (abs) min: 1 max: 2 x̄: 1.25 x̃: 1 HURT stats (rel) min: 0.45% max: 1.65% x̄: 0.76% x̃: 0.46% 95% mean confidence interval for instructions value: -11.56 -3.34 95% mean confidence interval for instructions %-change: -1.91% -1.28% Instructions are helped. total loops in shared programs: 4831 -> 4831 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 372136618 -> 372145628 (<.01%) cycles in affected programs: 9218230 -> 9227240 (0.10%) helped: 131 HURT: 86 helped stats (abs) min: 1 max: 798 x̄: 39.79 x̃: 12 helped stats (rel) min: <.01% max: 6.75% x̄: 0.42% x̃: 0.13% HURT stats (abs) min: 2 max: 2442 x̄: 165.38 x̃: 6 HURT stats (rel) min: <.01% max: 20.83% x̄: 0.74% x̃: 0.12% 95% mean confidence interval for cycles value: -2.07 85.11 95% mean confidence interval for cycles %-change: -0.22% 0.30% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 11956 -> 11950 (-0.05%) spills in affected programs: 77 -> 71 (-7.79%) helped: 3 HURT: 0 total fills in shared programs: 25619 -> 25549 (-0.27%) fills in affected programs: 593 -> 523 (-11.80%) helped: 4 HURT: 0 LOST: 0 GAINED: 0 Total CPU time (seconds): 1695.69 -> 1706.03 (0.61%) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-11 15:12:17 -07:00
Kenneth Graunke	a8588f512b	iris: Bypass half-float pack/unpack lowering. This skips GLSL IR lowering of pack/unpackHalf operations, allowing the NIR optimizer to see them. Improves performance in Synmark2's OglCSDof by about 2x, by cutting about 90% of the cycles from one of the compute shaders. shader-db statistics on Skylake: 4 compute shaders went from SIMD8 to SIMD16. total instructions in shared programs: 15598871 -> 15542568 (-0.36%) instructions in affected programs: 143016 -> 86713 (-39.37%) helped: 144 HURT: 0 helped stats (abs) min: 17 max: 4669 x̄: 390.99 x̃: 164 helped stats (rel) min: 7.48% max: 85.28% x̄: 30.17% x̃: 24.22% 95% mean confidence interval for instructions value: -510.50 -271.49 95% mean confidence interval for instructions %-change: -32.70% -27.65% Instructions are helped. total cycles in shared programs: 371973958 -> 368902103 (-0.83%) cycles in affected programs: 5557722 -> 2485867 (-55.27%) helped: 144 HURT: 0 helped stats (abs) min: 106 max: 1026600 x̄: 21332.33 x̃: 1697 helped stats (rel) min: 0.53% max: 88.98% x̄: 36.12% x̃: 34.67% 95% mean confidence interval for cycles value: -41570.02 -1094.64 95% mean confidence interval for cycles %-change: -38.44% -33.80% Cycles are helped. total spills in shared programs: 11936 -> 11903 (-0.28%) spills in affected programs: 110 -> 77 (-30.00%) helped: 3 HURT: 2 total fills in shared programs: 25644 -> 25178 (-1.82%) fills in affected programs: 677 -> 211 (-68.83%) helped: 5 HURT: 0 total loops in shared programs: 4830 -> 4829 (-0.02%) loops in affected programs: 1 -> 0 helped: 1 HURT: 0	2019-06-10 16:01:36 -07:00
Jason Ekstrand	e459d6d6df	iris: Enable nir_opt_large_constants Shader-db results on Kaby Lake: total instructions in shared programs: 15306230 -> 15304726 (<.01%) instructions in affected programs: 4570 -> 3066 (-32.91%) helped: 16 HURT: 0 total cycles in shared programs: 361703436 -> 361680041 (<.01%) cycles in affected programs: 129388 -> 105993 (-18.08%) helped: 16 HURT: 0 LOST: 0 GAINED: 2 The helped programs were in XCom 2, Deus Ex: Mankind Divided, and Kerbal Space Program Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-29 21:09:16 +00:00
Kenneth Graunke	25afbb04c2	iris: Advertise coherent framebuffer fetches This lets us advertise GL_EXT_shader_framebuffer_fetch and GL_KHR_blend_equation_advanced_coherent support.	2019-05-23 08:13:10 -07:00
Kenneth Graunke	a2d7834457	gallium: Change PIPE_CAP_TGSI_FS_FBFETCH bool to PIPE_CAP_FBFETCH count TGSI's FBFETCH instruction currently only supports reading from a single render target, but NIR intrinsics can support multiple render targets. radeonsi can only support fetching from RT 0, but other drivers may be able to support fetching from any render target. To express this, this patch renames PIPE_CAP_TGSI_FS_FBFETCH to simply PIPE_CAP_FBFETCH, and converts it from a boolean "is FBFETCH supported?" to an integer number of render targets which can be fetched. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-23 08:13:07 -07:00
Kenneth Graunke	fb1d08dcfd	iris: Expose the disk cache to the state tracker as well. This lets st/nir cache the NIR for shaders, based on the shader source string hash, allowing us to skip initial compiles altogether, and also letting us start from there should we need to recompile for NOS. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-21 15:05:38 -07:00
Dylan Baker	4756864cdc	iris: Start wiring up on-disk shader cache This creates the on-disk shader cache data structure, and handles the build-id keying aspects. The next commits will fill it out so it's actually used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-21 15:05:38 -07:00
Kenneth Graunke	752367b766	iris: Dodge more GLSL IR lowering This avoids some lower_instructions bits in st.	2019-05-15 19:44:21 -07:00
Kenneth Graunke	bb5db02bab	iris: Enable fragment shader interlock on Gen9+. There's some debate about whether we should support this on older hardware as well. Currently i965 turns it off on Gen8- though, so we follow suit. If this changes, we can update this as well. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-14 19:34:33 -07:00
Eric Anholt	0c31fe9ee7	gallium: Redefine the max texture 2d cap from _LEVELS to _SIZE. The _LEVELS assumes that the max is always power of two. For V3D 4.2, we can support up to 7680 non-power-of-two MSAA textures, which will let X11 support dual 4k displays on newer hardware. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-13 12:03:08 -07:00
Illia Iorin	a35269cf44	iris: Implement ARB_indirect_parameters iris_draw_vbo is divided into two functions to remove unnecessary operations from the loop. This implementation of ARB_indirect_parameters takes into account NV_conditional_render by saving MI_PREDICATE_RESULT at the start of a draw call and restoring it at the end. The result of NV_conditional_render is also taken into account when computing predicates that limit draw calls for ARB_indirect_parameters, in a similar way to `1952fd8d` in ANV. v2: Optimize indirect draws (suggested by Kenneth Graunke) v3: (by Kenneth Graunke) - Fix an issue where indirect draws wouldn't set patch information before updating the compiled TCS. - Move some code back to iris_draw_vbo to avoid duplicating it. - Fix minor indentation issues. Signed-off-by: Illia Iorin <illia.iorin@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-11 23:56:52 -07:00
Kenneth Graunke	c61862ddfc	iris: Expose PIPE_CAP_DEVICE_RESET_STATUS_QUERY This provides a way for the application to query whether any resets have happened, which lets us expose "robust" contexts. This also enables the KHR_robust_buffer_access_behavior tests.	2019-05-09 16:49:07 -07:00
Kenneth Graunke	d9b9bb91ff	iris: Report the same video memory settings as i965. This just copy and pastes Ian's code from i965.	2019-05-08 12:43:08 -07:00
Kenneth Graunke	a032a9665f	iris: Enable PIPE_CAP_SURFACE_REINTERPRET_BLOCKS This makes CompressedTexSubImage from a PBO source do proper GPU rendering to upload instead of stalling to map the PBO source on the CPU (then copying it on the CPU). Thanks Bas Nieuwenhuizen for pointing out that Vulkan includes this functionality, and to Jason Ekstrand for writing the code I adapted. Vulkan only supports a single layer, however, and this code tries to support multiple layers as long as it's miplevel 0. Improves performance in Sid Meier's Civilization VI: Average frame time (ms): -3.67423% +/- 1.46201% (n=5) 99th percentile frame time (ms): -5.09910% +/- 3.87874% (n=5)	2019-05-06 09:50:32 -07:00
Kenneth Graunke	f3bdffc33d	iris: Only enable GL_AMD_depth_clamp_separate on Gen9+ The hardware feature is new as of Gen9+. I accidentally enabled it on Gen8.	2019-04-29 13:25:12 -07:00
Kenneth Graunke	59aa7c924d	iris: Enable GL_AMD_depth_clamp_separate We support this, we just forgot to turn it on.	2019-04-24 16:49:13 -07:00
Kenneth Graunke	19b246257d	iris: Actually put Mesa in GL_RENDERER string I constructed the right thing and then returned the other one.	2019-04-24 12:54:27 -07:00
Mike Blumenkrantz	b53d256db8	iris: add support for INTEL_conservative_rasterization this hooks up the iris gallium driver to existing mesa bits which handle the implementation resolves kwg/mesa#8 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-23 16:36:30 -07:00
Kenneth Graunke	5ad0c88dbe	iris: Replace buffer backing storage and rebind to update addresses. This implements PIPE_CAP_INVALIDATE_BUFFER and invalidate_resource(), as well as the PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE flag. When either of these happen, we swap out the backing storage of the buffer for a new idle BO, allowing us to write to it immediately without stalling or queueing a blit. On my Skylake GT4e at 1920x1080, this improves performance in games: DiRT Rally +25% (avg), +17% (max); Bioshock Infinite +22% (avg), +11% (max); Shadow of Mordor +27% (avg), +83% (max).	2019-04-23 00:24:08 -07:00
Kenneth Graunke	36478b9f77	iris: Enable the dual_color_blend_by_location driconf option. This fixes rendering in Unigine Valley 1.0 and Heaven 4.0.	2019-04-22 09:36:36 -07:00
Kenneth Graunke	faa52e328e	iris: Add mechanism for iris-specific driconf options Based on Nicolai's `0f8c5de869`. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-22 09:35:36 -07:00
Kenneth Graunke	33314cf410	iris: Change vendor and renderer strings This patch changes the GL_VENDOR string from "Mesa Project" to "Intel". This makes GLX_MESA_query_renderer report "Vendor: Intel (0x8086)" instead of "Vendor: Mesa Project (0x8086)" which is arguably wrong. We now also use a consistent vendor string across Windows and Linux. It also prepends "Mesa" to the GL_RENDERER string, both to credit the community and have a distinguishing mark between the two drivers. We drop "DRI" compared to i965, as it's not really that important. Improves performance in Portal by 1.8x. Iris is now 3.86% faster than i965 at the portal-d1.dem timedemo on my Kabylake laptop. One change is that Portal selects the MapBufferRange path based on the vendor string, and iris's BufferSubData path is still missing the storage invalidation optimization.	2019-04-16 10:27:20 -07:00
Kenneth Graunke	024a57d23c	iris: Make shader_perf_log print to stderr if INTEL_DEBUG=perf is set This matches i965's behavior, and makes sure that shader compiler messages are visible when setting INTEL_DEBUG=perf.	2019-04-15 23:33:03 -07:00
Mike Blumenkrantz	03d6d01fe2	iris: support INTEL_NO_HW environment variable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-10 12:59:17 -07:00
Caio Marcelo de Oliveira Filho	956226c8ba	iris: Enable NV_compute_shader_derivatives Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-04-08 19:29:33 -07:00
Caio Marcelo de Oliveira Filho	3b20ca34ae	iris: Clean up compiler warnings about unused Removed a few unused variables and iris_getparam_boolean(). Kept 'name' around since there's a commented debug that make use of it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-29 12:07:26 -07:00
Timur Kristóf	fd5075e059	iris: Face should be a system value. This patch adds PIPE_CAP_TGSI_FS_FACE_IS_INTEGER_SYSVAL which despite its name is not a TGSI-specific capability, just lets the state tracker know that it should generate a system value for FACE. This is needed if we want to run tgsi_to_nir on iris. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-11 14:02:40 -07:00
Kenneth Graunke	9d1334d2a0	iris: Use copy_region and staging resources to avoid transfer stalls This is similar to intel_miptree_map_blit and intel_buffer_object.c's temporary blits in i965. Improves performance of DiRT Rally by 20-25% by eliminating stalls. Breaks piglit's spec/arb_shader_image_load_store/host-mem-barrier, by using the GPU to do uploads, exposing a st/mesa issue where it doesn't give us memory_barrier() calls. This is a pre-existing issue and will be fixed by a later patch (currently out for review).	2019-03-08 13:29:39 -08:00
Chris Wilson	04ddff1aa4	iris: Wire up EGL_IMG_context_priority Add the missing PIPE_CAP_CONTEXT_PRIORITY_MASK and parsing of the context construction flags. Testcase: piglit/egl-context-priority Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-07 20:27:10 -08:00
Kenneth Graunke	d53b1b6215	iris: Drop PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY This cap is mainly for working around a r600 texture swizzle issue, but it also controls whether ARB_texture_buffer_object (with legacy formats) is enabled. I suspect the missing I/L/A/LA faking is why I had it set in the first place. Thanks to Ilia for pointing out that I shouldn't be setting this. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-07 11:39:27 -08:00
Jose Maria Casanova Crespo	4122665dd9	iris: Enable ARB_shader_draw_parameters support Additional VERTEX_ELEMENT_STATE are used to store basevertex and baseinstance and drawid updating the DWordLength of the 3DSTATE_VERTEX_ELEMENTS command. This passes all piglit tests for spec.draw_parameters. tests and VK-GL-CTS KHR-GL45.shader_draw_parameters_tests.* tests. Now we only mark a dirty_update when parameters are changed or when we have an indirect draw. We enable PIPE_CAP_DRAW_PARAMETERS on Iris. There is no edge flag support in the Vertex Elements setup. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-26 13:28:38 -08:00
Kenneth Graunke	07ec1f0b25	iris: Make an IRIS_MAX_MIPLEVELS define	2019-02-21 10:26:12 -08:00
Kenneth Graunke	8ab82bd1fd	iris: Drop XXX about checking for swizzling Caio noted that this is not necessary on Gen8+: "Before Gen8, there was a historical configuration control field to swizzle address bit[6] for in X/Y tiling modes. This was set in three different places: TILECTL[1:0], ARB_MODE[5:4], and DISP_ARB_CTL[14:13]. For Gen8 and subsequent generations, the swizzle fields are all reserved, and the CPU's memory controller performs all address swizzling modifications." Since we don't support earlier hardware, we can skip it entirely.	2019-02-21 10:26:12 -08:00
Andre Heider	bffb65d28e	iris: improve PIPE_CAP_VIDEO_MEMORY bogus value -1 is a little too bogus for most games ;) Signed-off-by: Andre Heider <a.heider@gmail.com>	2019-02-21 10:26:12 -08:00
Kenneth Graunke	be49fb051d	iris: Stop chopping off the first nine characters of the renderer string	2019-02-21 10:26:12 -08:00
Kenneth Graunke	974229df46	iris: Add PIPE_CAP_MAX_VARYINGS	2019-02-21 10:26:11 -08:00
Kenneth Graunke	4bfd12bbf7	iris: minor tidying	2019-02-21 10:26:11 -08:00
Kenneth Graunke	edd3ce5a63	iris: Enable PIPE_CAP_COMPACT_ARRAYS	2019-02-21 10:26:11 -08:00
Kenneth Graunke	e17333ea1e	iris: fail to create screen for older unsupported HW loader shouldn't try, but let's be paranoid	2019-02-21 10:26:11 -08:00
Kenneth Graunke	1f91f688e8	iris: Switch to the new PIPELINE_STATISTICS_QUERY_SINGLE capability I had a hack in place earlier to pass the query type as q->index for the regular statistics query, but we ended up adjusting the interface and adding a new query type. Use that instead, fixing pipeline statistics queries since the rebase.	2019-02-21 10:26:11 -08:00
Dave Airlie	8806b29e16	iris: setup gen8 caps	2019-02-21 10:26:11 -08:00
Kenneth Graunke	68d531d7d7	iris: Destroy the bufmgr Plugs a 12360 byte leak	2019-02-21 10:26:10 -08:00
Kenneth Graunke	3d55e9a2aa	iris: Destroy transfer helper on screen teardown Plugs a 16 byte leak	2019-02-21 10:26:10 -08:00
Kenneth Graunke	855ff47d36	iris: Enable precompiles	2019-02-21 10:26:10 -08:00
Kenneth Graunke	beb2d5e065	iris: Lie about indirects fixes interpolateAt tests	2019-02-21 10:26:10 -08:00
Kenneth Graunke	b9ccb00e2c	iris: Enable ctx->Const.UseSTD430AsDefaultPacking hooray for obscurely named pipe caps with bizarre descriptions!	2019-02-21 10:26:10 -08:00
Chris Wilson	f459c56be6	iris: Add fence support using drm_syncobj	2019-02-21 10:26:10 -08:00

107 Commits