KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Gurchetan Singh	f0e71b1088	virgl: use transfer queue This improves Unigine Valley benchmark by 3 to 10 fps (depending on the scene). It also improves the Team Fortress 2 benchmark from 6 fps to 13 fps (host: 20 fps). Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	4a7857b377	virgl: introduce transfer queue Transfers will be placed here at unmap time instead of incurring a VM exit. There's an attempt to deduplicate intersecting 1D transfers, which are surprisingly common. This can also help with mipmapped texture upload and smaller textures, where the majority of the time is spent in the guest kernel / QEMU -- not virglrenderer. This is shown by the GLbench texture upload benchmark: Before: texture_upload_rgba_teximage2d_32 = 64.23 mtexel_sec After: texture_upload_rgba_teximage2d_32 = 367.44 mtexel_sec v2: Split up list iteration functions (@gerddie) v3: Support for optimizing glBufferSubData Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	9c4930946a	virgl: add encoder functions for new protocol Let's encode the new protocol with new helper functions. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	5510cc67e0	virgl: make winsys modifications for encoded transfers The idea is to have two command buffers: 1) One for transfers 2) One for commands, which can include transfers At flush time, (2) will be filled. Otherwise, (1) will be used to submit transfers if there are enough of them. v2: Pass size directly to cmd_buf_create (@gerddie) Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	90e9650585	virgl: add extra checks in virgl_res_needs_flush_wait This is motivated by the following scenario: glBufferSubData(GL_ARRAY_BUFFER, ...) glFlush(..) glBufferSubData(GL_ARRAY_BUFFER, ...) glBufferSubData(GL_ARRAY_BUFFER, ...) glBufferSubData(GL_ARRAY_BUFFER, ...) This increases @davidriley's Team Fortress 2 apitrace from 1 fps to 6 fps and helps with the Chromium glbench microbenchmarks: Before: texture_update_rgba_texsubimage2d_2048 = 554.96 mtexel_sec buffer_upload_dynamic_array_12 = 0.02 mbytes_sec buffer_upload_dynamic_array_576 = 1.07 mbytes_sec After: texture_update_rgba_texsubimage2d_2048 = 612.29 mtexel_sec buffer_upload_dynamic_array_12 = 2.22 mbytes_sec buffer_upload_dynamic_array_576 = 164.89 mbytes_sec Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	ab6ea6e9ce	virgl: pass virgl transfer to virgl_res_needs_flush_wait Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	d98fbd9c92	virgl: keep track of number of computations It's good to keep track of these things. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	35515985a9	virgl: limit command length to 16 bits Much of our logic is based around the idea that the upper 16 bits of a command dword can encode the length of the command. Now that the command buffer is >= 2^16 - 1, we should check for this. v2: alignment, and only check VIRGL_ENCODE_MAX_DWORDS Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	503ffe46bb	virgl: use virgl_transfer in inline write Let's define a helper function and use it. This commit also allows resources to be emitted into different command buffers. Like the ioctls, send 0 for layer_stride and stride. If we actually send the real values, there are various assumptions in virglrenderer for non-1D buffers that may need to be modified. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	0fcd48bac5	virgl: add protocol for resource transfers Mostly similar to VIRGL_CCMD_RESOURCE_INLINE_WRITE. However, this uses the resource's already attached iovecs rather than the command buffer to transfer the data. v2: Used (1 << 16) not (1 << 15) [@gerddie] Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	168c3ffce3	virgl: when creating / freeing transfers, pass slab pool directly This will allow us to destroy transfers w/o having a pointer to the context. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	d5c2dacc15	virgl: unmap uploader at flush time This should save some memory when allocating and freeing transfers. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	14f265b533	virgl: make alignment smaller when uploading index user buffers Since we're just uploading to guest memory, let's just align to dword size. Fixes: e0f932 ("u_upload_mgr: pass alignment to u_upload_data manually") Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	7626e6e189	virgl: track level cleanliness rather than resource cleanliness This allows a minor optimization for texture upload. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	c19aedcf1a	virgl: don't mark unclean after a flush The guest memory is still clean until host GL touches it, which we should track elsewhere. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	5b6a2ae987	virgl: use virgl_resource_dirty helper Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	1d294ad264	virgl: add ability to do finer grain dirty tracking There are levels to cleanliness. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Alyssa Rosenzweig	acc52fff20	panfrost: Improve logging and patch memory leaks Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-15 07:47:54 +00:00
Alyssa Rosenzweig	c70ed4ca18	panfrost: Don't align framebuffer dims Fixes regressions with EGL clients Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-15 07:46:30 +00:00
Alyssa Rosenzweig	5155bcf099	panfrost: Implement PIPE_QUERY_OCCLUSION_COUNTER Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-15 07:46:02 +00:00
Alyssa Rosenzweig	2d22b5380c	panfrost: Identify MALI_OCCLUSION_PRECISE bit Setting this is required for desktop-style occlusion queries. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-15 07:45:56 +00:00
Tapani Pälli	595af46f0f	drirc/i965: add option to disable 565 configs and visuals We have cases where we would not like to expose these. v2: call the option allow_rgb565_configs for consistency with existing allow_rgb10_configs (Eric, Jason) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-15 09:38:36 +02:00
Alyssa Rosenzweig	97aa05470a	panfrost: Backport driver to Mali T600/T700 There are a few differences between Mali T860 (Panfrost's primary reference target) and the older Midgard generations (T600/T700): - Miscellaneous different magic numbers. It's not clear what these numbers mean on either the old or new configurations yet. - Errata fixes. T800 is the final Midgard generation and presumably the least buggy. Older Midgard has some extra hardware errata we have to work around. - SFBD vs MFBD split. Essentially, older Midgard uses a Single FrameBuffer Descriptor (SFBD), which corresponds to single render-target rendering. Newer Midgard (T760+) uses a Multiple FrameBuffer Descriptor (MFBD), allowing multiple RTs. On ES 2.0, these descriptors serve the same function, but we implement both, depending on the version of the hardware. - CPU bitness. 32-bit systems generally use 32-bit GPU descriptors, and vice versa for 64-bit. Our target T760 systems are 32-bit whereas our target T860 systems are 64-bit. More work is needed in this area. This patch fixes support in these areas for supporting older Midgard hardware. It is tested on Mali T760 and Mali T860. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-15 07:22:42 +00:00
Alyssa Rosenzweig	f96e871c26	panfrost: Fix build; depend on libdrm Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-15 07:19:43 +00:00
Jason Ekstrand	08bfd710a2	nir/dead_cf: Stop relying on liveness analysis The liveness analysis pass is fairly expensive because it has to build large bit-sets and run a fix-point algorithm on them. Instead of requiring liveness for detecting if values escape a CF node, just take advantage of the structured nature of NIR and use block indices instead. This only requires the block index metadata, which is the fastest metadata we have to generate. No shader-db changes on Kaby Lake. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-14 23:06:29 -06:00
Jason Ekstrand	b50465d197	nir/dead_cf: Inline cf_node_has_side_effects We want to handle live SSA values differently and it's going to involve walking the instructions. We can make it a single instruction walk if we combine it with cf_node_has_side_effects. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-14 23:05:28 -06:00
Jason Ekstrand	367b0ede4d	intel/fs: Bail in optimize_extract_to_float if we have modifiers This fixes a bug in RuneScape where we were optimizing x >> 16 to an extract and then negating and converting to float. The NIR to fs pass was dropping the negate on the floor, breaking a geometry shader and causing it to render nothing. Fixes: `1f862e923c` "i965/fs: Optimize float conversions of byte/word..." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109601 Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-02-14 23:02:44 -06:00
Ilia Mirkin	8c859367df	swr: set PIPE_CAP_MAX_VARYINGS correctly Unfortunately swr was missed in the original commit. The number of varyings should generally match up to what's reported as the shader caps for fragment inputs. Fixes: `6010d7b8e8` (gallium: add PIPE_CAP_MAX_VARYINGS) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Alok Hota <alok.hota@intel.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-14 20:29:36 -05:00
Jason Ekstrand	5064464931	intel/fs: Silence a compiler warning Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-14 16:04:47 -06:00
Jason Ekstrand	9b202239ba	anv: Silence some compiler warnings in release builds Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-14 16:04:45 -06:00
Jason Ekstrand	cd60c995a6	anv/blorp: Delete a pointless assert Just a little higher up in the function we assert that the aspect masks are actually equal so there's no reason for the weaker check. Also, the temporary variables were causing compiler warnings in release builds. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-14 16:04:42 -06:00
Jason Ekstrand	b14d7a6b60	nir: Silence a couple of warnings in release builds [28/716] Compiling C object 'src/compiler/nir/068b2c8@@nir@sta/nir_gather_xfb_info.c.o'. ../src/compiler/nir/nir_gather_xfb_info.c: In function ‘nir_gather_xfb_info’: ../src/compiler/nir/nir_gather_xfb_info.c:171:13: warning: variable ‘max_offset’ set but not used [-Wunused-but-set-variable] unsigned max_offset[NIR_MAX_XFB_BUFFERS] = {0}; ^~~~~~~~~~ [36/716] Compiling C object 'src/compiler/nir/068b2c8@@nir@sta/nir_instr_set.c.o'. ../src/compiler/nir/nir_instr_set.c:502:1: warning: ‘instr_each_src_and_dest_is_ssa’ defined but not used [-Wunused-function] instr_each_src_and_dest_is_ssa(nir_instr *instr) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-14 16:04:35 -06:00
Kenneth Graunke	6775665e5e	spirv: Eliminate dead input/output variables after translation. spirv_to_nir can generate input/output variables which are illegal for the current shader stage, which would cause nir_validate_shader to balk. After my recent commit to start decorating arrays as compact, dEQP-VK.spirv_assembly.instruction.graphics.module.same_module started hitting validation errors due to outputs in a TCS (not intended for the TCS at all) not being per-vertex arrays. Thanks to Jason Ekstrand for suggesting this approach. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109573 Fixes: `ef99f4c8d1` compiler: Mark clip/cull distance arrays as compact before lowering. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-02-14 11:03:56 -08:00
Kenneth Graunke	39aee57523	anv: Put MOCS in the correct location My patch to switch from struct-based MOCS to numeric MOCS accidentally divided all MOCS entries by 2 in the Vulkan driver. MOCS on Gen9+ is just an array index into a table. But in the hardware packets, the index starts at bit 1. So we need to shift it. Fixes: `0b44644ca6` (genxml: Consistently use a numeric "MOCS" field) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-14 11:03:28 -08:00
Ian Romanick	9a918050e0	spirv: Add missing break Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: `c6465fec0c` ("spirv: add SpvCapabilityInt64Atomics") CID: 1442555	2019-02-14 08:35:59 -08:00
Eric Engestrom	c2b4b46fa9	util/tests: compile to something sensible in release builds assert()-based tests make no sense without asserts, so make sure asserts are compiled in, even if the rest of the code has asserts turned off. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-14 12:52:34 +00:00
Eric Engestrom	f7c56475d2	anv/tests: compile to something sensible in release builds assert()-based tests make no sense without asserts, so make sure asserts are compiled in, even if the rest of the code has asserts turned off. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-14 12:52:34 +00:00
Eric Engestrom	4c1ca5b074	etnaviv: drop duplicate #define Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-14 11:20:00 +00:00
Eric Engestrom	7f68b38439	st/dri: drop duplicate #define Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-14 11:20:00 +00:00
Eric Engestrom	2fa165e757	gbm: drop duplicate #defines Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-14 11:20:00 +00:00
Eric Engestrom	f1374805a8	drm-uapi: use local files, not system libdrm There was an issue recently caused by the system header being included by mistake, so let's just get rid of this include path and always explicitly #include "drm-uapi/FOO.h" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-14 11:20:00 +00:00
Eric Engestrom	69e4c273c4	drm-uapi/README: remove explicit list of driver names These headers are used by a lot more than just the intel drivers nowadays. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-14 11:20:00 +00:00
Samuel Pitoiset	227df98fa6	radv: fix radv_fixup_vertex_input_fetches() We should check that num_channels is 4, otherwise that breaks the world. Sorry for the short breakage. Fixes: `4b3549c084` ("radv: reduce the number of loaded channels for vertex input fetches") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-14 09:44:35 +01:00
Samuel Pitoiset	4b3549c084	radv: reduce the number of loaded channels for vertex input fetches It's unnecessary to load more channels than the vertex attribute format. The remaining channels are filled with 0 for y and z, and 1 for w. 29077 shaders in 15096 tests Totals: SGPRS: 1321605 -> 1318869 (-0.21 %) VGPRS: 935236 -> 932252 (-0.32 %) Spilled SGPRs: 24860 -> 24776 (-0.34 %) Code Size: 49832348 -> 49819464 (-0.03 %) bytes Max Waves: 242101 -> 242611 (0.21 %) Totals from affected shaders: SGPRS: 93675 -> 90939 (-2.92 %) VGPRS: 58016 -> 55032 (-5.14 %) Spilled SGPRs: 172 -> 88 (-48.84 %) Code Size: 2862740 -> 2849856 (-0.45 %) bytes Max Waves: 15474 -> 15984 (3.30 %) This mostly helps Croteam games (Talos/Sam2017). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-14 09:10:56 +01:00
Samuel Pitoiset	210aec3612	radv: store vertex attribute formats as pipeline keys The formats will be used for reducing the number of loaded channels. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-14 09:10:09 +01:00
Samuel Pitoiset	45382baef6	radv: use MAX_{VBS,VERTEX_ATTRIBS} when defining max vertex input limits Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-14 09:09:51 +01:00
Samuel Pitoiset	2154fac6f3	ac: make use of ac_build_expand_to_vec4() in visit_image_store() And make ac_build_expand() a static function. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-14 09:09:48 +01:00
Eric Anholt	338d399fd0	freedreno: Use the NIR lowering for isign. I think this will save an instruction and hopefully not increase any other costs (possibly the immediate -1 and 1?), but I haven't actually tested. Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-14 00:32:30 +00:00
Eric Anholt	8f3694e1ab	intel: Use the NIR lowering for isign. Drops one instruction from fs-sign-int.shader_test. No change in shader-db due to it having 0 instances of sign(genIType). This may hurt isign64 if algebraic runs before int64 lowering, but I wasn't sure how to mark the algebraic opt as "every bit size but 64". v2: Update commit message about shader-db. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)	2019-02-14 00:32:30 +00:00
Eric Anholt	3f22b35a43	v3d: Use the NIR lowering for isign instead of rolling our own. min/max instead of comparisons saves 2 instructions on fs-sign-int.shader_test.	2019-02-14 00:32:30 +00:00
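
Note on the isign commits (3f22b35a43, 8f3694e1ab, 338d399fd0): the v3d message mentions using min/max instead of comparisons. A minimal sketch of that idea in plain C follows. It assumes the lowering amounts to clamping the value to [-1, 1]; the helper names are hypothetical and this is not the NIR or driver code.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not taken from Mesa. */
static int32_t imin32(int32_t a, int32_t b) { return a < b ? a : b; }
static int32_t imax32(int32_t a, int32_t b) { return a > b ? a : b; }

/* Integer sign via min/max: clamping x to [-1, 1] maps every negative
 * value to -1, every positive value to 1, and leaves 0 unchanged. */
static int32_t isign_minmax(int32_t x)
{
    return imax32(imin32(x, 1), -1);
}

/* Comparison-based reference for the same result. */
static int32_t isign_compare(int32_t x)
{
    return (x > 0) - (x < 0);
}
```

Both functions should agree for every 32-bit input. Whether the min/max form actually saves instructions depends on the target, as the commit messages themselves point out.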

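
Note on commit 35515985a9: the message says the upper 16 bits of a command dword encode the command length. The sketch below shows what such an encoding and its bounds check could look like. Only the "length in the upper 16 bits" part comes from the commit message. The macro names, the function names, and the use of the lower 16 bits as an opaque opcode field are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: length in dwords in bits 31..16, everything else in
 * bits 15..0. These names are hypothetical, not virgl's. */
#define CMD_LEN_SHIFT 16u
#define CMD_LEN_MAX   0xFFFFu

/* True if the length can be represented in the 16-bit length field. */
static bool cmd_len_fits(uint32_t len_dwords)
{
    return len_dwords <= CMD_LEN_MAX;
}

/* Pack an opcode field and a length into one header dword. */
static uint32_t cmd_header_encode(uint32_t opcode, uint32_t len_dwords)
{
    /* A longer command would silently overflow out of the length field. */
    assert(cmd_len_fits(len_dwords));
    return (opcode & 0xFFFFu) | (len_dwords << CMD_LEN_SHIFT);
}

/* Recover the length from a header dword. */
static uint32_t cmd_header_len(uint32_t header)
{
    return header >> CMD_LEN_SHIFT;
}
```

With this layout, a caller would check `cmd_len_fits()` before encoding and reject or split any command longer than 65535 dwords.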
107524 Commits
