KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Christian Gmeiner	db7967ef9f	etnaviv: add deqp debug option This new debug option will fake some driver CAPs to be able to run dEQP for GLES3. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3351> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3351>	2020-01-11 22:05:35 +00:00
Timur Kristóf	44a6b17df7	aco/wave32: Set the definitions of v_cmp instructions to the lane mask. The output of v_cmp instructions is s1 (a single SGPR) in wave32 mode, as opposed to s2 (an SGPR-pair) in wave64 mode. A couple of cases where this should have been fixed were omitted from the previous patch by mistake. Fixes: `e0bcefc3a0` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2020-01-11 20:15:53 +01:00
Alyssa Rosenzweig	59d30fd4bc	pan/midgard: Support indirect UBO offsets ...in case we have arrays in a UBO block that we'd like to access indirectly. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3352> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3352>	2020-01-10 17:48:42 -05:00
Francisco Jerez	c20dc9b836	intel/fs: Make implied_mrf_writes() an fs_inst method. This will be convenient in a later commit enabling SIMD32 fragment shaders, and happens to fix the calculation for MATH instructions which is currently inaccurate for SIMD-lowered instructions on Gen4-5 platforms (all of them on Gen4 in SIMD16 mode), since it was based on the shader's dispatch width rather than on the actual execution size of the instruction. This causes some shader-db noise on Gen4 due to the more compact register allocation interacting with the SEND dependency workarounds, but otherwise no major changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-10 11:02:30 -08:00
Francisco Jerez	591f146fd2	intel/fs/cse: Fix non-deterministic behavior due to inaccurate liveness calculation. The liveness calculation done by the local CSE pass in order to prune AEB entries whose sources are no longer live is currently inaccurate, because the live intervals are calculated once at the beginning of the pass, so they don't take into account any of the copy instructions inserted by the CSE pass as it makes progress. However the IP counter used in that calculation is based on the start_ip of the basic block, which is updated automatically whenever any instructions are inserted into the CFG. This causes the IP counter and liveness intervals to get out of sync in programs with multiple basic blocks, causing the CSE pass to toss AEB entries prematurely, which can lead to missed optimization opportunities rather non-deterministically. On BDW this leads to the following shader-db changes: total instructions in shared programs: 14952488 -> 14951763 (-0.00%) instructions in affected programs: 45416 -> 44691 (-1.60%) helped: 40 HURT: 4 total spills in shared programs: 20989 -> 20970 (-0.09%) spills in affected programs: 103 -> 84 (-18.45%) helped: 3 HURT: 0 total fills in shared programs: 24981 -> 24926 (-0.22%) fills in affected programs: 127 -> 72 (-43.31%) helped: 3 HURT: 0 In addition it avoids a number of regressions in combination with some of the optimization changes I'm working on for SIMD32, which would have made CSE more effective... Causing it to be less effective elsewhere in the program astonishingly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-10 11:02:06 -08:00
Francisco Jerez	cc0ea482ad	intel/fs: Fix nir_intrinsic_load_barycentric_at_sample for SIMD32. For uniform sample ID, only the first channel of msg_data will be initialized. We need to pass that component only to the SEND message for SIMD lowering to unzip the descriptor source correctly. Fixes several dozens of conformance test failures with SIMD32 fragment shaders enabled, including: dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.dynamic_sample_number.* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-10 11:01:52 -08:00
Francisco Jerez	0703eab012	intel/fs/gen8+: Fix r127 dst/src overlap RA workaround for EOT message payload. The problem occurred when the return payload of a SIMD8 SEND instruction was re-used as source payload of an EOT SEND message. In such cases the interference edge added by that workaround between the payload and grf127_send_hack_node would have no effect, because the payload would be allocated to a fixed range of registers containing r127 by the special handling of EOT message payloads in the same function. This would cause things to blow up if the source payload of the first SIMD8 message ended up being allocated to a range which happened to overlap the destination. Fix it by avoiding r127 altogether in the allocation of EOT message payloads. The problem can be reproduced on ICL with the fp-indirections2 Piglit test-case in combination with the other optimizer changes of this series. Fixes: `232ed89802` "i965/fs: Register allocator shoudn't use grf127 for sends dest" Cc: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-10 11:00:42 -08:00
Francisco Jerez	0a6e46d44d	intel/fs/gen11+: Handle ROR/ROL in lower_simd_width(). Prevents invalid code from being emitted for ROR/ROL instructions in SIMD32 shaders. The problem can be reproduced with the following tests while forcing SIMD32 to be used for fragment shaders: piglit.shaders.glsl-rotate-left piglit.shaders.glsl-rotate-right However the issue could occur in production already with compute shaders and a workgroup size large enough to trigger SIMD32 dispatch. Fixes: `83fdec0f0d` "intel/compiler: Enable the emission of ROR/ROL instructions" Cc: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-10 11:00:24 -08:00
Francisco Jerez	a30bb25a7a	glsl: Fix software 64-bit integer to 32-bit float conversions. The current implementation was broken for any integers between 2^24 and 2^30 (it would return zero for me on ICL). The reason is that for such integers we wouldn't take the 'if (0 <= shiftCount)' early return path, however 'shiftCount + 7' would be positive, leading to a negative 'count' argument passed to __shift64RightJamming(), which would give undefined results. This reworks the affected conversion functions to use either __shortShift64Left() or __shift64RightJamming() based on the sign of the final shift count, which should avoid the problem. In addition this should qualify as a clean-up/optimization -- This implementation of the conversion functions translates to 7 instructions less than the original on Intel hardware. This fixes the 'KHR-GL46.shader_ballot_tests.ShaderBallotFunctionBallot' conformance tests on soft fp64 hardware with large enough subgroup size (>16). Fixes: `d5cf6e92b4` "glsl: Add built-in functions to do uint64_to_fp32(uint64_t)" Fixes: `c9d333a6b7` "glsl: Add built-in functions to do int64_to_fp32(int64_t)" Cc: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>	2020-01-10 10:51:58 -08:00
Daniel Schürmann	8b7a42d6d0	aco: compact aco::span<T> to use uint16_t offset and size instead of pointer and size_t. This reduces the size of the Instruction base class from 40 bytes to 16 bytes. No pipelinedb changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3332> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3332>	2020-01-10 17:49:18 +00:00
Daniel Schürmann	ffb4790279	aco: compact various Instruction classes No pipelinedb changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3332>	2020-01-10 17:49:18 +00:00
Andrii Simiklit	ebaab89761	mesa/st: fix a memory leak in get_version This patch prevents memory leak in get_version function in st_manager.c This issue was found by valgrind: 16 bytes in 1 blocks are definitely lost in loss record 6 of 1,418 at 0x483CD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) by 0x63D9476: st_init_extensions (st_extensions.c:1679) by 0x63B803B: get_version (st_manager.c:1271) by 0x63B8124: st_api_query_versions (st_manager.c:1289) by 0x63266EF: dri_init_screen_helper (dri_screen.c:583) by 0x6321B12: dri2_init_screen (dri2.c:2110) by 0x631AACC: driCreateNewScreen2 (dri_util.c:155) by 0x5D58192: dri3_create_screen (dri3_glx.c:897) by 0x5D39829: AllocAndFetchScreenConfigs (glxext.c:815) by 0x5D39C57: __glXInitialize (glxext.c:941) by 0x5D3290A: GetGLXPrivScreenConfig (glxcmds.c:174) by 0x5D34F38: glXQueryExtensionsString (glxcmds.c:1307) Fixes: `eca8032f20` ("gallium: Add ARB_gl_spirv support") Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3345> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3345>	2020-01-10 17:27:39 +00:00
Lasse Lopperi	3de2774dcb	freedreno/drm: Fix memory leak in softpin implementation Free the memory allocated for cmds/reloc_bos array when destroying the associated ringbuffer. For a similar fix for the non-softpin implementation see: `d014af98b7` Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2324 Fixes: `f3cc0d2` ("freedreno: import libdrm_freedreno + redesign submit") Signed-off-by: Lasse Lopperi <lasse.lopperi@ge.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3342> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3342>	2020-01-10 16:21:35 +00:00
Rhys Perry	b5c9688516	aco: limit register usage for large work groups Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2020-01-10 12:10:37 +00:00
Timur Kristóf	eccac46cdc	ac/llvm: Fix ac_build_reduce in wave32 mode. Previously, when cluster_size was set to 0, it always worked as if the cluster size was 64. This commit fixes it in wave32 mode by changing to work as if the cluster size was set to 32. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2020-01-10 12:30:44 +01:00
Pierre-Eric Pelloux-Prayer	a5fe84aefb	radeonsi: release saved resources in si_compute_do_clear_or_copy Fixes: `9b331e462e` ("radeonsi: use compute shaders for clear_buffer & copy_buffer") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-10 08:41:40 +01:00
Pierre-Eric Pelloux-Prayer	6912149ee5	radeonsi: release saved resources in si_compute_clear_12bytes_buffer Fixes: `6c901f0675` ("radeonsi: use compute shader for clear 12-byte buffer") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-10 08:41:38 +01:00
Pierre-Eric Pelloux-Prayer	1acf714d57	radeonsi: release saved resources in si_compute_copy_image Fixes: `1b25d340b7` ("radeonsi: use compute for resource_copy_region when possible") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-10 08:41:35 +01:00
Pierre-Eric Pelloux-Prayer	e1e87466ae	radeonsi: release saved resources in si_compute_clear_render_target Fixes: `984fd73515` ("radeonsi: use compute for clear_render_target when possible") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-10 08:41:33 +01:00
Pierre-Eric Pelloux-Prayer	6c019e28ca	radeonsi: release saved resources in si_compute_expand_fmask Fixes: `095a58204d` ("radeonsi: expand FMASK before MSAA image stores are used") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-10 08:41:31 +01:00
Pierre-Eric Pelloux-Prayer	9211cbe07a	radeonsi: release saved resources in si_retile_dcc Fixes: `1f21396431` ("radeonsi: add support for displayable DCC for multi-RB chips") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2330 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-10 08:41:19 +01:00
Samuel Iglesias Gonsálvez	39c1892dd8	main: fix coverity error in _mesa_program_resource_find_name() We did not take into account if name is NULL, so we could dereference a NULL pointer in strncmp() call. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-10 08:40:00 +01:00
Icecream95	f2f1277624	panfrost: Add negative lod bias support Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-10 06:51:42 +00:00
Gurchetan Singh	daf1d5ad4c	virgl/drm: update UAPI This seems to compile. Header copied over from drm-misc-next 7da5492739db. Acked-by: Eric Engestrom <eric@engestrom.ch>	2020-01-10 04:12:40 +00:00
Vasily Khoruzhick	438c677859	lima: drop support for R8G8B8 format We can only sample from the 24-bit packed format and can't render into it, which causes chromium-based browsers to fail when they create an FBO with GL_RGB format. Drop R8G8B8 altogether so mesa can promote it to RGBX format. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2020-01-09 18:46:08 -08:00
Jason Ekstrand	9b71171442	anv: Re-use flush_descriptor_sets in flush_compute_state There's no reason to hand-roll all of the memory re-allocation fall-back code for compute shaders. It's just duplicated complexity. This also makes it more clear in flush_compute_state where the MEDIA_INTERFACE_DESCRIPTOR_LOAD command gets emitted relative to other packets in the command stream. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-09 19:45:00 -06:00
Jason Ekstrand	ae72d1238c	anv: Flag descriptors dirty when gl_NumWorkgroups is used Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-09 19:45:00 -06:00
Jason Ekstrand	ca6b3b11af	anv: Don't add dynamic state base address to push constants on Gen7 Because Gen7 push constants are already relative to dynamic state base address, they aren't really an address. It's deceptive to return an address from the helper function. Instead, let's leave it as a special-case in the gen7-11 helper; we don't need the helper for code de-duplication for Gen7 anyway. Fixes: `67d2cb3e93` "anv: Add get_push_range_address() helper" Closes: #2323 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-09 19:44:06 -06:00
Vasily Khoruzhick	044da65f52	lima: add debug flag to disable tiling Add debug flag to disable tiling. Note that it prevents lima from creating tiled buffers, but it's still able to import them if modifier is specified Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2020-01-10 01:13:47 +00:00
Vasily Khoruzhick	a533d1d4c6	lima: use linear layout for shared buffers if modifier is not specified Use linear layout for shared buffers if modifier is not specified and use linear layout when importing buffers with invalid modifier. Fixes: `01a451b04d` ("lima: handle DRM_FORMAT_MOD_INVALID in resource_from_handle()") Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2020-01-10 01:13:47 +00:00
Timothy Arceri	87e0dd68f5	glsl: call calculate_subroutine_compat() from the nir linker Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-10 00:41:20 +00:00
Timothy Arceri	726e8f24c6	glsl: move calculate_subroutine_compat() to shared linker code We will make use of this in the nir linker in the following patch. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-10 00:41:20 +00:00
Timothy Arceri	c60d0bd92f	glsl: call uniform resource checks from the nir linker Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-10 00:41:20 +00:00
Timothy Arceri	05c1f7a154	glsl: move uniform resource checks into the common linker code We will call this from the nir linker in the following patch. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-10 00:41:20 +00:00
Timothy Arceri	b85985dd51	glsl: call check_subroutine_resources() from the nir linker Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-10 00:41:20 +00:00
Timothy Arceri	a6fd1c7752	glsl: move check_subroutine_resources() into the shared util code We will make use of this in the nir linker in the following patch. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-10 00:41:20 +00:00
Jason Ekstrand	3dec68e682	genxml: Remove a non-existent HW bit	2020-01-09 18:40:20 -06:00
Kristian H. Kristensen	f9d35ea55b	ir3: Set up full/half register conflicts correctly Setting up transitive conflicts between a full register and its two half registers (eg r0.x and hr0.x and hr0.y) will make the half registers conflict. They don't actually conflict and this prevents us from using both at the same time. Add and use a new ra helper that sets up transitive conflicts between a register and its subregisters, except it carefully avoids the subregister conflict. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@chromium.org>	2020-01-09 16:03:25 -08:00
Dave Airlie	85eed5def3	llvmpipe: add ARB_derivative_control support Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2020-01-10 08:43:40 +10:00
Marek Olšák	269953e779	radeonsi/gfx9: force the micro tile mode for MSAA resolve correctly on gfx9 Fixes: `69ea473` "amd/addrlib: update to the latest version" Closes: #2325 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-09 16:28:28 -05:00
Lionel Landwerlin	60e0db3bfb	anv: fix intel perf queries availability writes The availability is not written at the location changed in ee6fbb95a74d... Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `ee6fbb95a7` ("anv: Properly handle host query reset of performance queries") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2020-01-09 20:42:36 +02:00
Dylan Baker	da2fe9c15e	docs: Add release notes for 19.3.2, update calendar and home page	2020-01-09 10:33:49 -08:00
Dylan Baker	2d46a7f26d	docs: add SHA256 sums for 19.3.2	2020-01-09 10:32:18 -08:00
Dylan Baker	d4f237dcce	docs: Add release notes for 19.3.2	2020-01-09 10:32:14 -08:00
Satyajit Sahu	4e3a09db25	radeon/vcn: Handle crop parameters for encoder Set proper cropping parameter if frame cropping is enabled Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com> Reviewed-by: Boyuan Zhang boyuan.zhang@amd.com Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3328> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3328>	2020-01-09 15:43:18 +00:00
Daniel Schürmann	cd31da4587	nir: fix printing of var_decl with more than 4 components. Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Fixes: `a8ec4082a4` ('nir+vtn: vec8+vec16 support') Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3320> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3320>	2020-01-09 10:31:26 +01:00
Samuel Pitoiset	e298e78a01	radv: advertise VK_AMD_shader_image_load_store_lod This extension allows using a LOD with image read/write operations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-09 07:58:34 +01:00
Samuel Pitoiset	4d49a7ac73	aco: handle nir_intrinsic_image_deref_{load,store} with lod Use image_load_mip and image_store_mip respectively if the lod parameter isn't zero. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-09 07:58:33 +01:00
Samuel Pitoiset	e77ff89914	amd/llvm: handle nir_intrinsic_image_deref_{load,store} with lod Use image_load_mip and image_store_mip respectively if the lod parameter isn't zero. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-09 07:58:33 +01:00
Samuel Pitoiset	1b808d208f	spirv,nir: add new lod parameter to image_{load,store} intrinsics SPV_AMD_shader_image_load_store_lod allows using a lod parameter with OpImageRead, OpImageWrite and OpImageSparseRead. According to the specification, this parameter should be a 32-bit integer. It is initialized to 0 when no lod parameter is found during SPIR-V->NIR translation. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-09 07:58:33 +01:00
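
Note on eccac46cdc (ac/llvm: Fix ac_build_reduce in wave32 mode): the message says a cluster_size of 0 used to behave as a cluster of 64 regardless of wave size, and that in wave32 mode it now behaves as a cluster of 32. The rule can be sketched in C as below. This is only an illustration of the behaviour the message describes, not the code in Mesa; the function name effective_cluster_size and its wave_size parameter are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper, not taken from Mesa: pick the cluster size a
 * subgroup reduction should actually use.
 *
 * cluster_size == 0 is treated as "reduce across the whole wave", so the
 * result follows the wave size (32 or 64) instead of a hard-coded 64. */
static unsigned effective_cluster_size(unsigned cluster_size, unsigned wave_size)
{
    /* Assumption for this sketch: only wave32 and wave64 exist. */
    assert(wave_size == 32 || wave_size == 64);

    if (cluster_size == 0)
        return wave_size;

    /* An explicit cluster size is passed through unchanged. */
    return cluster_size;
}
```

With this rule a zero cluster size resolves to 32 in wave32 mode and to 64 in wave64 mode, while an explicit size such as 16 is left unchanged.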

119128 Commits

All Branches