KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Rafael Antognolli	bf1577fe09	i965/gen10: Remove warning message. Gen10 seems pretty stable so far, so there's no reason to keep this message. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: "18.0" mesa-stable@lists.freedesktop.org Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-14 10:09:41 -08:00
Louis-Francis Ratté-Boulianne	aad14cf15a	egl/x11: Fix leak in dri3_create_image_khr_pixmap bp_reply wasn't properly free'd Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-14 11:52:06 +00:00
Iago Toral Quiroga	cb9dbd6dec	i965/compiler: clean up nir_intrinsic_load_input for vertex shaders This code to re-set the type of the source and destination is not necessary since we never manipulate the types. Looks like a left over from a time where we had to retype to float temporarily to handle 64-bit inputs. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-14 12:00:14 +01:00
Iago Toral Quiroga	4917d38321	intel/compiler: fix first_component for 64-bit types on vertex inputs Divide it by two as we do for other stages. This is because the component layout qualifier is always in 32-bit units. Fixes issues in a new CTS test (still WIP): KHR-GL45.enhanced_layouts.varying_double_components Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-14 12:00:14 +01:00
Samuel Pitoiset	ad4b58ea70	ac/nir: rename nir_to_llvm_context to radv_shader_context There is still more to do in that area, but it's a good start. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:16 +01:00
Samuel Pitoiset	141db61509	ac: remove nir_to_llvm_context from ac_nir_translate() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:14 +01:00
Samuel Pitoiset	a541117ff4	ac/nir: remove nir_to_llvm_context::nir link Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:12 +01:00
Samuel Pitoiset	e9f0205ca2	ac: move the outputs array to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:10 +01:00
Samuel Pitoiset	07e4268f36	ac/shader: scan force_persample Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:08 +01:00
Dave Airlie	b9d2ff05a6	r600: fix regression in gl_FragColor drawing This fixes a regression in the broadcast color to all color bufs case. Fixes: `6c691081a` (r600: fixup sparse color exports.) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-14 14:02:41 +10:00
Dave Airlie	9c9a9bee44	r600: fix array spill if temp[0] is before all arrays I found a shader with DCL TEMP[0], LOCAL DCL TEMP[1..256], ARRAY(1), LOCAL DCL TEMP[257..512], ARRAY(2), LOCAL DCL TEMP[513..768], ARRAY(3), LOCAL DCL TEMP[769], LOCAL This would remap badly, as it would add up all the spilled sizes and subtract it from the temp for 0. If the current temp is less than the array start break out. Fixes: `1d871aa6` (r600g: Implement spilling of temp arrays (v2)) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-14 13:37:59 +10:00
Dave Airlie	8f2656c75b	virgl: add ARB_sample_shading support. This enables ARB_sample_shading if the renderer supports it. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-14 13:06:07 +10:00
Dave Airlie	9b95b70719	virgl: add ARB_draw_indirect support. This relies on the renderer code landing first. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-14 13:06:07 +10:00
Roland Scheidegger	f6718baabc	tgsi: Recognize RET in main for tgsi_transform Shaders coming from dx10 state trackers have a RET before the END. And the epilog needs to be placed before the RET (otherwise it will get ignored). Hence figure out if a RET is in main, in this case we'll place the epilog there rather than before the END. (At a closer look, there actually seem to be problems with control flow in general with output redirection, that would need another look. It's enough however to fix draw's aa line emulation in some internal bug - lines tend to be drawn with trivial shaders, moving either a constant color or a vertex color directly to the output). v2: add assert so buggy handling of RET in main is detected Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-02-14 02:06:54 +01:00
Bas Nieuwenhuizen	7461bd5b8f	ac: Use the renumbered const address space for LLVM 7. The LLVM AMDGPU backend decided to renumber the constant address space .... Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-14 01:05:03 +01:00
Dave Airlie	9ddacd9af4	gallium: drop all the guard band float caps. Nobody queries these and nobody sets them to anything useful, the docs say TODO. Drop them until a use appears. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-14 08:50:08 +10:00
Vadym Shovkoplias	a553c54abf	mesa: add glsl version query (v4) Add support for GL_NUM_SHADING_LANGUAGE_VERSIONS and glGetStringi for GL_SHADING_LANGUAGE_VERSION v2: - Combine similar functionality into _mesa_get_shading_language_version() function. - Change GLSL version return mechanism. v3: - Add return of empty string for GLSL ver 1.10. - Move _mesa_get_shading_language_version() function to src/mesa/main/version.c. v4: - Add OpenGL version check. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104915 Signed-off-by: Andriy Khulap <andriy.khulap@globallogic.com> Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 13:24:31 -07:00
Brian Paul	b08d718703	mesa: add missing switch case for EXTRA_VERSION_40 in check_extra() The EXTRA_VERSION_40 predicate is tested as part of extra_gl40_ARB_sample_shading but there was no switch case for it. Fixes: `77b440e42d` ("mesa: Add new functions and enums required by GL_ARB_sample_shading") Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-13 10:35:55 -07:00
Mark Janes	e5809788d6	mesa: fix compile failure Missing header triggered a failure in i965 CI buildtest project. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067 Fixes: `e149a0253c`	2018-02-13 00:22:05 -08:00
Mark Janes	d9de7aaca3	Partially revert "mesa: use GLenum16 in a few more places" This reverts part of commit `ca721b3d89`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067	2018-02-13 00:22:05 -08:00
Mark Janes	3e5758a70a	Revert "mesa: reduce the size of gl_texture_image" This reverts commit `f4ea2b2a9e`. Several members reduced in size by the offending commit are not large enough to store the data needed by the i965 driver. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067	2018-02-13 00:22:05 -08:00
Dave Airlie	db5f422169	i965: fix tessellation regressions with gl_state_index16 Looks like one conversion was missed. Fixes: `e149a0253` (mesa,glsl,nir: reduce gl_state_index size to 2 bytes) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067 Signed-off-by: Dave Airlie <airlied@redhat.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2018-02-12 23:05:16 -08:00
Stéphane Marchesin	5e4a2b394e	virgl: Support v2 caps struct (v2) This struct allows us to report: - accurate max point size/line width. - accurate texel and texture gather offsets - vertex/geometry limits. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-13 14:23:54 +10:00
Timothy Arceri	10457712ed	ac/nir: add nir_intrinsic_{load,store}_shared support Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-13 14:43:05 +11:00
Timothy Arceri	c787cbfa33	ac/nir_to_llvm: add support for nir_intrinsic_shared_atomic_* Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-13 14:43:05 +11:00
Timothy Arceri	b6cf898ec2	radeonsi: make si_declare_compute_memory() more generic and call for nir Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-13 14:43:05 +11:00
Timothy Arceri	94fa090fad	st/glsl: set req_local_mem earlier for compute shaders Without this change it will never be set for backends using nir. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-13 14:43:05 +11:00
Marek Olšák	6b1e26e181	mesa: move STATE_LENGTH to shader_enums.h and use it everywhere Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	f4ea2b2a9e	mesa: reduce the size of gl_texture_image 80 -> 40 bytes. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	4794fbc86e	mesa: reduce the size of gl_program_parameter 40 -> 24 bytes, which includes the gl_state_index16 change. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	e149a0253c	mesa,glsl,nir: reduce gl_state_index size to 2 bytes Let's use the new gl_state_index16 type everywhere and remove the typecasts. This helps reduce the size of gl_program_parameter. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	a7882013d3	mesa: reduce the size of gl_viewport_attrib All drivers convert these to float, so there is no reason to use double. The piglit test that expects double precision from glGet will be adjusted not to require it (there is a piglit patch). gl_context::ViewportArray: 512 -> 384 bytes Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	d7550d783a	mesa: reduce the size of gl_texture_object Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	65ed98839b	mesa: reduce the size of gl_program gl_program: 1456 -> 976 bytes Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	78f1decc95	mesa: reduce the size of gl_image_unit (v2) gl_context::ImageUnits: 6144 -> 4608 bytes v2: use ASSERT_BITFIELD_SIZE Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	ca5c5d96d8	mesa: further reduce the size of ctx->Texture Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	78043a75f6	mesa: decrease the array size of ctx->Texture.FixedFuncUnit to 8 GL allows doing glTexEnv on 192 texture units, while in reality, only MaxTextureCoordUnits units are used by fixed-func shaders. There is a piglit patch that adjusts piglits/texunits to check only MaxTextureCoordUnits units. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	07c10cc59c	mesa: separate legacy stuff from gl_texture_unit into gl_fixedfunc_texture_unit Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	79aca14f5f	mesa: inline init_texture_unit because this is going to be changed Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	ca721b3d89	mesa: use GLenum16 in a few more places Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Jason Ekstrand	4c77e21c81	anv: Move setting current_pipeline to cmd_state_init We were setting current_pipeline to UINT32_MAX and then calling cmd_cmd_state_reset which memsets the entire state struct to 0 which implicitly resets current_pipeline to 3D. I have no idea how this hasn't caused everything to explode. Fixes: `cd3feea745` "anv/cmd_buffer: Rework anv_cmd_state_reset" cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-02-12 15:18:23 -08:00
Jason Ekstrand	f37bd726c7	anv: Don't resolve or ambiguate non-existent layers The previous code was trying to avoid non-existent layers by taking a MAX with anv_image_aux_layers. Unfortunately, it wasn't taking into account that layer_count starts at base_layer which may not be zero. Instead, we need to subtract base_layer from anv_image_aux_layers with a guard against roll-over. Fixes: `de3be61801` "anv/cmd_buffer: Rework aux tracking" Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-12 15:14:57 -08:00
Daniel Stone	c2c4e5bae3	i965: Fix bugs in intel_from_planar This commit fixes two bugs in intel_from_planar. First, if the planar format was non-NULL but only had a single plane, we were falling through to the planar case. If we had a CCS modifier and plane == 1, we would return NULL instead of the CCS plane. Second, if we did end up in the planar_format == NULL case and the modifier was DRM_FORMAT_MOD_INVALID, we would end up segfaulting in isl_drm_modifier_has_aux. Cc: mesa-stable@lists.freedesktop.org Fixes: `8f6e54c929` Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-12 15:14:45 -08:00
Eric Anholt	1aed66dc1e	radv: Fix compiler warning about uninitialized 'set' The compiler doesn't figure out that we only get result == VK_SUCCESS if set got initialized. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 20:48:47 +00:00
Eric Anholt	21670f8208	glsl/tests: Fix strict aliasing warning about int64/double. Fixes: `4bf9862747` ("glsl/tests: Add UINT64 and INT64 types") Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>	2018-02-12 20:48:43 +00:00
Eric Anholt	091bff8317	ac/nir: Fix compiler warning about uninitialized dw_addr. Even switching the def's condition to be the same chip revision check as the use, the compiler doesn't figure it out. Just NULL-init it. Fixes: `ec53e52742` ("ac/nir: Add ES output to LDS for GFX9.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 20:48:29 +00:00
Eric Anholt	7a83be4b28	gallium/llvmpipe: Fix compiler warnings about ddx/ddy/ddmax. My gcc doesn't figure out that dims >= 1 (seems reasonable), and doesn't notice that ddmax is used from the same no_rho_opt as its initialization. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-12 20:48:18 +00:00
Kenneth Graunke	bd87bd178c	anv: Drop I915_EXEC_CONSTANTS_REL_GENERAL from execbuf. The kernel used to have execbuf parameters to program the INSTPM bit for whether 3DSTATE_CONSTANT_* should be relative to dynamic state base address or an absolute address. However, they never worked in the presence of hardware contexts, so I deleted them a while back. It doesn't make sense to set this flag, as it doesn't exist anymore. It also never did anything anyway - the flag is zero, so |'ing it in did nothing. The default is relative anyway. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-12 07:00:41 -08:00
Eric Engestrom	111d4bf1d0	r200: remove left over dead code `0aaa27f291` removed the references to this array without removing the array itself Cc: Ian Romanick <ian.d.romanick@intel.com> Fixes: `0aaa27f291` "mesa: Pass the translated color logic op dd_function_table::LogicOpcode" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-02-12 11:19:44 +00:00
Samuel Pitoiset	f4e85ba93f	ac/nir: remove backlink to nir_to_llvm_context Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:39 +01:00
Samuel Pitoiset	be5f6eb13e	ac/nir: remove nir_to_llvm_context::module Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:36 +01:00
Samuel Pitoiset	90a815ddeb	ac/nir: remove nir_to_llvm_context::builder Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:34 +01:00
Samuel Pitoiset	759acfa180	ac/nir: drop nir_to_llvm_context from glsl_to_llvm_type() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:31 +01:00
Samuel Pitoiset	e7373a6498	ac/nir: drop nir_to_llvm_context from visit_var_atomic() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:29 +01:00
Samuel Pitoiset	485346b05a	ac/nir: drop nir_to_llvm_context from visit_vulkan_resource_reindex() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:27 +01:00
Samuel Pitoiset	cd6dfacda9	ac/nir: drop nir_to_llvm_context from visit_load_push_constant() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:25 +01:00
Samuel Pitoiset	5c9e398c83	ac/nir: drop nir_to_llvm_context from cast_ptr() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:23 +01:00
Samuel Pitoiset	5ef5944848	ac/nir: drop nir_to_llvm_context from visit_load_local_invocation_index() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:21 +01:00
Samuel Pitoiset	da8b0b8264	ac/nir: drop nir_to_llvm_context from emit_f2f16() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:19 +01:00
Samuel Pitoiset	e32f374944	ac: remove unused parameters in abi::load_tess_coord() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:17 +01:00
Samuel Pitoiset	1e69db003d	ac/nir: remove useless bitcast in load_tess_coord() nir_intrinsic_load_tess_coord always returns a v3i32. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:15 +01:00
Samuel Pitoiset	ed179fbdf3	ac: add load_resource() to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:13 +01:00
Samuel Pitoiset	ecf229706f	ac: add load_sample_mask_in() to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:11 +01:00
Samuel Pitoiset	0f48eeea05	ac: move view_index to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:09 +01:00
Samuel Pitoiset	0efbede949	ac: move push_constants to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:07 +01:00
Samuel Pitoiset	460d3ce726	ac: move tg_size to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:04 +01:00
Samuel Pitoiset	054c92190c	ac/nir: remove unused nir_to_llvm_context:{defs,phis} Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:02 +01:00
Eric Anholt	0b97eb02b0	egl/gbm: Fix compiler warning about visual matching. The compiler doesn't know that num_visuals > 0. Fixes: `37a8d907cc` ("egl/gbm: Ensure EGLConfigs match GBM surface format") Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-12 09:16:44 +00:00
Rob Clark	831fb29252	freedreno: small fix for flushing dependent batches Flush a resource's previous write_batch synchronously. Because a resource's associated batches are not updated until after the flush thread submits rendering to the kernel, this was causing a bit of confusion in the following loop. This fixes a bug that appeared with recent stk. Perhaps we need to re-work things a bit to clear out dependent patches in the ctx's thread and use a fence to deal with the period between when a flush is queued and when it is submitted to the kernel. But this will do until time permits a larger refactor. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	c57ed8e01c	freedreno/ir3: intra-block scheduling Because of loops, we can't schedule all of a block's predecessors first. Instead just assume that the result consumed in a block was written far enough away in all paths into a block. And do an intra-block scheduling pass to figure out if there are any cases where we need to insert extra nop's. This works out better than always assuming the worst case (ie. that a value live into a block was written in the last instruction in the predecessor block). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	2a2099a875	freedreno/ir3: "boost" the depth of if/else condition Account for the move to predicate register, to try to avoid needing to insert extra NOPs later. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	ffb00f6841	freedreno/ir3: account for arrays in delayslot calc Normally false-deps are not something to consider, since they mostly exist for delay-slot related reasons: * barriers * ordering writes after read * SSBO/image access ordering The exception is a false-dependency on an array store. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	f54d2b4f10	freedreno/ir3: more clever legalize algorithm Previously we didn't handle flow control in legalize, and instead just set (ss)(sy) on the first instruction in every block. Which isn't very clever. Instead, consider output state of all predecessor blocks, so we only set a sync bit if needed for any possible path leading into a block. Because of loops, we can't require that all successor blocks are legalized before a given block, so instead run in a loop until results converge. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	015afb6a38	freedreno/ir3: track block predecessors Useful in the following patches. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	76440fcca9	freedreno/ir3: clean up dangling false-dep's Maybe there is a better way for this.. where it comes useful is "array" loads, which end up as a false-dep for a later array store. If all the uses of an array load are CP'd into their consumer, it still leaves the dangling array load, leading to funny things like: mov.u32u32 r5.y, r0.y mov.u32u32 r5.y, r0.z Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	aea223741f	freedreno/ir3: handle IMMED for mad 2nd src special case Consider also immediates for swapping the first two srcs, because they can be lowered to constant. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	242a8a1957	freedreno/ir3: remove ir3 phi instruction Now that we convert phi webs to ssa, we can drop all this. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	a7b569d60c	freedreno/ir3: remove lower_if_else pass Now that it is unused. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	268ab05484	freedreno/ir3: add experimental GCM pass Generally seems to do worse on instruction count and register usage, according to shader-db. But shader-db also doesn't do a very good job of weighting loop bodies, so that might not be totally valid. So add an env variable to enable GCM pass for easier experimentation. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	4c15c53d91	freedreno/ir3: change opt passes There are more useful nir passes added since initial conversion to nir. But ir3 was never updated to use them. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	ec8bc54ad2	freedreno/ir3: use peephole select pass Aggressively lowering all if/else to selects in some extreme cases results in much higher register pressure. Using peephole select instead with a modest threshold speeds up alu2 4x! 16 seems like a good limit, low enough to help alu2 but not too low that it penalizes everything else. With a bit better scheduling of the instruction that moves a value into a predicate register, we might be able to lower this limit a bit more in the future, but since we need 6 cycles from the move to predicate register to predicated branch, that puts some sort of lower bound on how far we can lower this threshold. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	a7ea2b4eba	freedreno/ir3: lower phi webs to regs nir's from_ssa pass is much better at avoiding inserting extra moves than our logic is. And lowering phi webs to regs just treats anything involved in a phi web as an array of length=1. Which with previous array related fixes in RA/etc ends up working out quite well. This cuts down on extra instructions and also helps with register pressure. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	0a6ddf964f	freedreno/ir3: separate arrays from groups Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	55f14a1ac4	freedreno/ir3: make block/instruction serialno per-shader Makes it easier to compare values seen in-game (where there are many shaders) to cmdline standalone compiler. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	5a7de94392	freedreno/ir3: add spirv support to cmdline compiler Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	942341bcd0	freedreno/ir3: don't lower fsat Instead, if possible fold (sat) flag into src, otherwise use: (sat)max.f rD, rS, rS Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	b2fc94f074	freedreno/ir3: add encoding/decoding for (sat) bit Seems to be there since a3xx, but we always lowered fsat. But we can shave some instructions, especially in shaders that use lots of clamp(foo, 0.0, 1.0) by not lowering fsat. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	1b658533e1	freedreno/ir3: extend liverange of arrays Use livein state of other blocks to extend liverange of arrays when they are still needed by successor blocks. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	ac459a6f7f	freedreno/ir3: avoid extra mov's for "arrays" Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	2bc3fb6992	freedreno/ir3: a couple more array fixes (Plus a couple TODOs) Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	8ea1ef4191	freedreno/ir3: keep array stores Since these are not in SSA form, add to block's keeps so it doesn't appear unused. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	c60f150d56	freedreno/ir3: propagate barrier information When eliminating movs, the instruction that is now directly using the src of the mov has the same scheduling order constraints as the original mov instruction. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	98702c1010	freedreno/ir3: remove pointless statement Function ends after this if/else ladder, so it was pointless. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	930ca0e038	freedreno/ir3: some more debug prints Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	a84e324847	freedreno/ir3: fix printing of relative branch offsets The number of bits depends on generation. But printing negative values with a5xx encoding (largest size) but compiling for a3xx or a4xx, would result in negative values printed as large positive values. I guess in practice huge negative branch offsets aren't likely (and if that is the case, the shader is probably too big to grok by reading the assembly). So just print using smallest bitfield size. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	a5c28fe07b	freedreno/ir3: be more clever with if/else jumps Try to clean up things like: br !p0.x #2 br p0.x #something to eliminate the first branch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	44dd7dcd2f	freedreno/ir3: avoid some spurious sync bits Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	069c0ac625	freedreno/ir3: print # of sync bits for shaderdb When trying to optimize to reduce stalls, it is nice to see this info. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	7d45e2e39f	freedreno: add debug trace for flush Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Grazvydas Ignotas	9b9a89cd79	intel/compiler: fix 64bit value prints on 32bit Fix the following: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t {aka long long unsigned int}’. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-02-10 17:59:02 +02:00
Timothy Arceri	ff0e3fa1fe	st/glsl_to_nir: remove unused options variable	2018-02-10 11:06:55 +11:00
Timothy Arceri	8f378c116e	st/radeonsi: enable disk cache for nir Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	bc9d9f9b86	st: add nir shader disk cache support v2: include compute shader support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	97efdc0d57	st/glsl_to_tgsi: move nir detection earlier We move the nir check before the shader cache call so that we can call a nir based caching function in a following patch. Also with this change we simply check if vertex shaders support NIR rather than looping over the stages as mixing of shader types is not supported anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	b5e23887fe	radeonsi: stop returning PIPE_SHADER_IR_NATIVE for PIPE_SHADER_CAP_PREFERRED_IR Clover now checks PIPE_SHADER_CAP_SUPPORTED_IRS for native support instead. This change indirectly enables NIR support for compute shaders on radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	73f1d6f0c1	r600: always return PIPE_SHADER_IR_TGSI for PIPE_SHADER_CAP_PREFERRED_IR We now use PIPE_SHADER_CAP_SUPPORTED_IRS to check for native support in clover. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	51f484bb44	clover: use PIPE_SHADER_CAP_SUPPORTED_IRS to discover IR PIPE_SHADER_CAP_PREFERRED_IR was conflicting with PIPE_SHADER_IR_NIR for compute shaders, so we let clover pick the one it wants to use. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	3af4f34e61	r600: add PIPE_SHADER_IR_NATIVE to supported shaders for cs Acked-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	ce836487b8	radeonsi/nir: add depth layout to scan pass Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-10 10:46:28 +11:00
Timothy Arceri	6a8efbe652	radeonsi/nir: add FRAG_RESULT_COLOR to scan pass Fixes a number of draw buffers piglit tests. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-10 10:46:28 +11:00
Timothy Arceri	ef8082baf8	ac: convert nir_op_f2f32 src to a float Fixes the following piglit test: ./bin/arb_vertex_attrib_64bit-check-explicit-location -auto -fbo Where we would end up with the nir such as: vec1 64 ssa_11 = pack_64_2x32_split ssa_9, ssa_10 vec1 32 ssa_12 = f2f32 ssa_2 And our pack_64_2x32_split nir to llvm code always produces a 64bit integer as output. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-10 10:46:28 +11:00
Timothy Arceri	1b1e5f8edf	ac: fix some 64bit unpack asserts Previously the asserts did not take swizzles into account. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-10 10:46:28 +11:00
Mark Janes	9a05c66feb	Revert "i965: prevent potentially null pointer access" This reverts commit `712332ed54`, which caused over 90k failures in Mesa i965 CI. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-09 09:46:07 -08:00
Daniel Stone	37a8d907cc	egl/gbm: Ensure EGLConfigs match GBM surface format When we create an EGL window surface on a GBM surface, ensure that the EGLConfig is compatible with the GBM format, notwithstanding XRGB/ARGB interchange. For example, rendering with an XRGB8888 EGLConfig on to an ARGB8888 gbm_surface (and vice-versa) is acceptable, but rendering with an XRGB2101010 EGLConfig on to an XRGB8888 gbm_surface will now be rejected. This was previously allowed through; when 10bpc formats were enabled, clients which picked a completely random EGL config and hoped/assumed it was XRGB8888 would break. If you have bisected a failure to start a GBM/KMS client to this commit, please look at its EGLConfig selection (e.g. through eglChooseConfig), and add an EGL_NATIVE_VISUAL_ID == gbm_surface format match to the attribs for config selection. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	8174e5b49e	egl/gbm: Remove duplicate format table Now that we have mask/channel information in gbm_dri's format conversion table, we can remove the copy in EGL. As this table contains more formats (notably including R8 and RG8, which can be used for BO but not surface allocation), we now compare the masks of all channels when trying to find a suitable config. Without doing this, an XRGB8888 EGLConfig would match on an R8 format. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	314714ac53	gbm/dri: Expose visuals table through gbm_dri_device Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	2ed344645d	gbm/dri: Add RGBA masks to GBM format table Eventually, we can replace the visuals list inside GBM EGL driver with this one. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	4732094cff	egl/wayland: Use an array for modifiers Each Wayland EGLDisplay currently contains a struct with one vector of modifiers per format, hardcoded in the header. To allow easier support for more formats, turn this into an array of u_vectors which is opaque outside of platform_wayland.c. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	5bc49d4cbf	egl/wayland: Remove has_format enum Instead of the has_format enum, use an index into the visual array. This makes adding new formats less typing. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	d32b23f383	egl/wayland: Add bpp to visual map Both the DRI2 GetBuffersWithFormat interface, and SHM buffer allocation, had their own format -> bpp lookup tables. Replace these with a lookup into the visual map. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	4de98a9c07	egl/wayland: Use visual map for DRIImage<->FourCC map When trying to translate between DRIImage format enums and FourCC codes, use our visual map rather than an open-coded subset. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	68a80c11bd	egl/wayland: Use visual map for format advertisement Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	3323ce72ff	egl/wayland: Use visual map for buffer_from_image When creating a wl_buffer on an upstream Wayland display from an existing EGLImage, use the dri2_wl_visual map rather than another hardcoded list of formats. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	a9cc4edb60	egl/wayland: Use visual map for config->format lookup Having hoisted the format -> config map into common code, we now use it for config -> format lookups. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:15 +00:00
Daniel Stone	1dc013f1ee	egl/wayland: Add format enums to visual map Extend the visual map from only containing names and bitmasks, to also carrying the three format enums we need. These are the DRIImage format tokens for internal allocation, FourCC codes for wl_drm and dmabuf protocol, and wl_shm codes for swrast drivers. We will later use these formats to eliminate a bunch of open-coded conversions. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:15 +00:00
Daniel Stone	66912641df	egl/wayland: Use proper enum type in visual definition No semantic change. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:15 +00:00
Daniel Stone	845c2f6156	egl/wayland: Widen channel masks to bpp Widen the channel masks given in the visual table to the full width of the pixel format, i.e. as many leading zeros as required. No functional change. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:15 +00:00
Daniel Stone	19cbca38e4	egl/wayland: Hoist format <-> EGLConfig definition up Pull the mapping between Wayland formats and EGLConfigs up to the top level, so we can reuse it elsewhere. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:15 +00:00
Daniel Stone	4fbd2d50b1	egl/wayland: Fix ARGB/XRGB transposition in config map When `0b2b719121` moved from an if tree to a struct to map between wl_drm formats and EGLConfigs, it transposed the mapping between XRGB and ARGB. Luckily, everyone exposes both formats, so this is harmless. Signed-off-by: Daniel Stone <daniels@collabora.com> Fixes: `0b2b719121` ("egl/wayland: introduce dri2_wl_add_configs_for_visuals() helper") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:06 +00:00
Marek Olšák	76085f2048	st/mesa: generate blend state according to the number of enabled color buffers Non-MRT cases always translate blend state for 1 color buffer only. MRT cases only check and translate blend state for enabled color buffers. This also avoids an assertion failure in translate_blend for: dEQP-GLES31.functional.draw_buffers_indexed.overwrite_common.common_advanced_blend_eq_buffer_blend_eq Reviewed-by: Eric Anholt <eric@anholt.net>	2018-02-09 15:52:22 +01:00
Marek Olšák	c446dd7927	st/mesa: don't translate blend state when color writes are disabled Reviewed-by: Eric Anholt <eric@anholt.net>	2018-02-09 15:52:22 +01:00
Marek Olšák	3d06c8afb5	st/mesa: don't translate blend state when it's disabled for a colorbuffer Reviewed-by: Eric Anholt <eric@anholt.net>	2018-02-09 15:52:22 +01:00
Lionel Landwerlin	712332ed54	i965: prevent potentially null pointer access Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> CID: 1418110	2018-02-09 14:02:59 +00:00
Mark Thompson	5db29d62ce	st/va: Make the vendor string more descriptive Include the Mesa version and detail about the platform. Signed-off-by: Mark Thompson <sw@jkqxz.net> Reviewed-by: Christian König <christian.koenig@amd.com>	2018-02-09 13:37:43 +01:00
Mark Thompson	768f1487b0	st/va: Enable vaExportSurfaceHandle() It is present from libva 2.1 (VAAPI 1.1.0 or higher). Signed-off-by: Mark Thompson <sw@jkqxz.net> Reviewed-by: Christian König <christian.koenig@amd.com>	2018-02-09 13:37:36 +01:00
Tapani Pälli	41c5bf3836	disk cache: move path creation back to constructor This patch moves disk cache path and index creation back to the constructor which matches previous behavior. We still allow create to succeed without path so that cache can be used with callback functionality. Fixes: c95d3ed091 "disk cache: create cache even if path creation fails" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-09 11:33:25 +02:00
Samuel Pitoiset	3a2bb4db23	ac/nir: compute correct number of user SGPRs on GFX9 For merged shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 10:16:04 +01:00
Michel Dänzer	171076f082	st/mesa: Initialize tex_target in compile_tgsi_instruction Initialize to TGSI_TEXTURE_BUFFER (== 0), same as was done before the variable type was changed to enum tgsi_texture_type. Fixes a bunch of piglit failures with radeonsi, e.g.: gles-3.0-transform-feedback-uniform-buffer-object: ../../../../src/gallium/auxiliary/tgsi/tgsi_util.c:502: tgsi_util_get_texture_coord_dim: Assertion `!"unknown texture target"' failed. Corresponding compiler warning: CXX state_tracker/st_glsl_to_tgsi.lo ../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp: In function ‘pipe_error st_translate_program(gl_context, uint, ureg_program, glsl_to_tgsi_visitor, const gl_program, GLuint, const ubyte, const ubyte, const ubyte, const ubyte, const ubyte, GLuint, const ubyte, const ubyte, const ubyte)’: ../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5992:23: warning: ‘tex_target’ may be used uninitialized in this function [-Wmaybe-uninitialized] ureg_memory_insn(ureg, inst->op, dst, num_dst, src, num_src, ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ inst->buffer_access, ~~~~~~~~~~~~~~~~~~~~ tex_target, inst->image_format); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5866:27: note: ‘tex_target’ was declared here enum tgsi_texture_type tex_target; ^~~~~~~~~~ Fixes: `9f9ce1625f` ("st/mesa: use TGSI enum types in st_glsl_to_tgsi.cpp") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:26:40 +01:00
Alejandro Piñeiro	f32b01ca43	glsl/linker: remove ubo explicit binding handling This is already handled at link_uniform_blocks, specifically at process_block_array_leaf. Additionally, this code was not correctly handling arrays of arrays. When creating the name of the block to set the binding, it only took into account the first level, so any attempt to set an explicit binding on an array of arrays ubo would trigger an assertion. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-09 08:32:42 +01:00
Mathias Fröhlich	77cb2fc0bd	mesa: Only update enabled VAO gl_vertex_array entries. Instead of updating all modified gl_vertex_array_object::_VertexArray entries just update those that are modified and enabled. Also release buffer object from the _VertexArray that belong to disabled attributes. v2: Also set Ptr and Size to zero. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-09 04:26:23 +01:00
Mathias Fröhlich	437cae411e	gallium: Mute arrays for several meta like callbacks. Set the _DrawArray pointer to NULL when calling into the Drivers Bitmap/CopyPixels/DrawAtlasBitmaps/DrawPixels/DrawTex hooks. This fixes an assert that gets uncovered when the following patch gets applied. v2: Mute from within the state tracker instead of generic mesa. v3: Avoid evaluating _DrawArrays from within st_validate_state. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 04:26:13 +01:00
Mathias Fröhlich	2f9eb0aad5	mesa: Fix VAO buffer object tracking. When changing the attribute binding in the VAO we also need to account for getting rid of non vbo bits from VertexAttribBufferMask. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-09 04:21:36 +01:00
Timothy Arceri	d8bca3809d	radeonsi/nir: gather some missing fs info Fixes some early-z arb_shader_image_load_store piglit tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 12:51:27 +11:00
Timothy Arceri	c77078c942	ac: pass struct ac_llvm_context to emit_membar() Fixes segfault in piglit test: ./bin/arb_shader_image_load_store-shader-mem-barrier --quick -auto -fbo Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 12:51:27 +11:00
Marek Olšák	12fd567c78	radeonsi: copy the NIR enablement debug bit to the shader cache flags When NIR is enabled, TGSI must not be used. When NIR is disabled, TGSI Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-09 02:01:45 +01:00
Jason Ekstrand	8f20cf166e	intel/blorp: Use isl_aux_op instead of blorp_hiz_op Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	1e941a0528	intel/blorp: Use isl_aux_op instead of blorp_fast_clear_op Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	1810f965c8	anv: Allow fast-clearing the first slice of a multi-slice image Now that we're tracking aux properly per-slice, we can enable this for applications which actually care. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	de3be61801	anv/cmd_buffer: Rework aux tracking This commit completely reworks aux tracking. This includes a number of somewhat distinct changes: 1) Since we are no longer fast-clearing multiple slices, we only need to track one fast clear color and one fast clear type. 2) We store two bits for fast clear instead of one to let us distinguish between zero and non-zero fast clear colors. This is needed so that we can do full resolves when transitioning to PRESENT_SRC_KHR with gen9 CCS images where we allow zero clear values in all sorts of places we wouldn't normally. 3) We now track compression state as a boolean separate from fast clear type and this is tracked on a per-slice granularity. The previous scheme had some issues when it came to individual slices of a multi-LOD image. In particular, we only tracked "needs resolve" per-LOD but you could do a vkCmdPipelineBarrier that would only resolve a portion of the image and would set "needs resolve" to false anyway. Also, any transition from an undefined layout would reset the clear color for the entire LOD regardless of whether or not there was some clear color on some other slice. As far as full/partial resolves go, the assumptions of the previous scheme held because the one case where we do need a full resolve when CCS_E is enabled is for window-system images. Since we only ever allowed X-tiled window-system images, CCS was entirely disabled on gen9+ and we never got CCS_E. With the advent of Y-tiled window-system buffers, we now need to properly support doing a full resolve of images marked CCS_E. v2 (Jason Ekstrand): - Fix a bug in the compressed flag offset calculation - Treat 3D images as multi-slice for the purposes of resolve tracking v3 (Jason Ekstrand): - Set the compressed flag whenever we fast-clear - Simplify the resolve predicate computation logic Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	2cbfcb205e	anv/cmd_buffer: Move the mi_alu helper higher up Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	2e69045c4d	anv/image: Simplify some verbose comments Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	f0523f70ef	anv: Use blorp_ccs_ambiguate instead of fast-clears Even though the blorp pass looks a bit on the sketchy side, the end result in the Vulkan driver is very nice. Instead of having this weird case where you do a fast clear and then maybe have to resolve, we just do the ambiguate and are done with it. The ambiguate does exactly what we want of setting all the CCS values to 0 which puts it into the pass-through state. This should also improve performance a bit in certain cases. For instance, if we did a transition from UNDEFINED to GENERAL for a surface that doesn't have CCS enabled all the time, we would end up doing a fast-clear and then a full resolve which ends up touching every byte in the main surface as well as the CCS. With the ambiguate pass, that transition only touches the CCS. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	84fd2ebfbc	anv/cmd_buffer: Re-arrange the logic around UNDEFINED fast-clears Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	3ef8c4b2f5	anv/cmd_buffer: Pull the undefined layout condition into the if Now that this isn't a multi-case if and it's just the one case, it's a bit clearer if the condition is just part of the if instead of being pulled out into a boolean variable. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	857b5b5a7f	intel/blorp: Add a CCS ambiguation pass This pass performs an "ambiguate" operation on a CCS-compressed surface by manually writing zeros into the CCS. On gen8+, ISL gives us a fairly detailed notion of how the CCS is laid out so this is fairly simple to do. On gen7, the CCS tiling is quite crazy but that isn't an issue because we can only do CCS on single-slice images so we can just blast over the entire CCS buffer if we want to. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	13b621d6fd	anv: Only fast clear single-slice images The current strategy we use for managing resolves has an issue where we track clear colors and the need for resolves per-LOD but we still allow resolves of only a subset of the slices in any given LOD and doing so sets the "needs resolve" flag for that LOD to false while leaving the remaining layers unresolved. This patch is only the first step and does not, by itself, fix anything. However, it's fairly self-contained and splitting it out means any performance regressions should bisect to this nice obvious commit rather than to the giant "rework aux tracking" commit. Nanley and I did some testing and none of the applications we tested even tried to fast-clear anything other than the first slice of an image. The test was done by adding a printf right before we call blorp_fast_clear if we were ever going to touch any slice other than the first with a fast-clear. Due to the way the original code was structured, this would not have included applications which only cleared a subset of layers. The applications tested were: * All Sascha Willems demos * Aztec Ruins * Dota 2 * The Talos Principle * Mad Max * Warhammer 40,000: Dawn of War III * Serious Sam Fusion 2017: BFE While not the full list of shipping applications, it's a pretty good spread and covers most of the engines we've seen running on our driver. If this is ever shown to be a performance problem in the future, we can reconsider our strategy. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	571ed588ac	anv/cmd_buffer: Add a mark_image_written helper Currently, this helper does nothing but we call it every place where an image is written through the render pipeline. This will allow us to properly mark the aux state so that we can handle resolves correctly. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	9876d6f0ef	anv/blorp: Add src/dst_level helper variables in CmdCopyImage Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	c180c2c868	anv/cmd_buffer: Add an anv_genX_call macro This is copied and pasted from the similar macro we added to ISL. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	ab7543b13d	anv/cmd_buffer: Generalize transition_color_buffer This moves it to being based on layout_to_aux_usage instead of being hard-coded based on bits of a priori knowledge of how transitions interact with layouts. This conceptually simplifies things because we're now using layout_to_aux_usage and layout_supports_fast_clear to make resolve decisions so changes to those functions will do what one expects. There is a potential bug with window system integration on gen9+ where we wouldn't do a resolve when transitioning to the PRESENT_SRC layout because we just assume that everything that handles CCS_E can handle it all the time. When handing a CCS_E image off to the window system, we may need to do a full resolve if the window system does not support the CCS_E modifier. The only reason why this hasn't been a problem yet is because we don't support modifiers in Vulkan WSI and so we always get X tiling which implies no CCS on gen9+. This patch doesn't actually fix that bug yet but it takes us the first step in that direction by making us actually pick the correct resolve op. In order to handle all of the cases, we need more detailed aux tracking. v2 (Jason Ekstrand): - Make a few more things const - Use the anv_fast_clear_support enum v3 (Jason Ekstrand): - Move an assert and add a better comment Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	151771b390	anv/cmd_buffer: Recurse in transition_color_buffer instead of falling through Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	bea7373c92	anv/image: Support color aspects in layout_to_aux_usage Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	b09464db42	anv/image: Add a helper for determining when fast clears are supported v2 (Jason Ekstrand): - Return an enum instead of a boolean v3 (Jason Ekstrand): - Return ANV_FAST_CLEAR_NONE instead of false (Topi) - Rename ANV_FAST_CLEAR_ANY to ANV_FAST_CLEAR_DEFAULT_VALUE - Add documentation for the enum values v4 (Jason Ekstrand): - Remove a dead comment Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	1f7eee6bc1	anv/image: Update a comment This got lost in all of the aspect vs. plane rebasing of YCBCR. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	5c38ab8f07	anv/blorp: Rework HiZ ops to look like MCS and CCS Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	1d473e26f2	anv/blorp: Support ISL_AUX_USAGE_HIZ in surf_for_anv_image If the function gets passed ANV_AUX_USAGE_DEFAULT, it still has the old behavior of setting ISL_AUX_USAGE_NONE for depth/stencil which is what we want for blits/copies. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	42f1668a54	anv/blorp: Rework image clear/resolve helpers This replaces image_fast_clear and ccs_resolve with two new helpers that simply perform an isl_aux_op whatever that may be on CCS or MCS. This is a bit cleaner as it separates performing the aux operation from which blorp helper we have to call to do it. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	482c24783e	intel/isl: Codify AUX operations in an enum Right now, we have different entrypoints and enums in blorp for these different operations. This provides us a central enum which we can begin to transition to. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Gert Wollny	c36172e387	r600/sb: Check whether optimizations would result in reladdr conflict v2: * Check whether the node src and dst registers are NULL before using them. * fix a typo in the commit message. Two cases are handled with this patch: 1. If copy propagation tries to eliminate a move from a relative array access then it could optimize MOV R1, ARRAY[RELADDR_1] MOV R2, ARRAY[RELADDR_2] OP2 R3, R1 R2 into OP2 R3, ARRAY[RELADDR_1], ARRAY[RELADDR_2] which is forbidden, because there is only one address register available. 2. When MULADD(x,a,MUL(x,c)) is handled MUL TMP, R1, ARRAY[RELADDR_1] MULADD R3, R1, ARRAY[RELADDR_2], TMP by folding this into ADD TMP, ARRAY[RELADDR_2], ARRAY[RELADDR_1] MUL R3, R1, TMP which is also forbidden. Test for these cases and reject the optimization if a forbidden combination of relative access would be created. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103142 Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 10:00:38 +10:00
Glenn Kennard	1d871aa626	r600g: Implement spilling of temp arrays (v2) Pessimistically spills arrays if GPR limit is exceeded. v2: fix r600 support [airlied] Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:26 +10:00
Dave Airlie	22fc5eff80	r600/sb: handle scratch mem reads on r600 On r600 we use the scratch mem with read/read_ind, in that case sb should track the rw_gpr as a dst instead of a src. This stops the whole shader being optimised out. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:21 +10:00
Glenn Kennard	cd34deb585	r600g/sb: Add dependency tracking for scratch ops Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:19 +10:00
Glenn Kennard	a100d906b2	r600g/sb: Support scratch ops Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:16 +10:00
Glenn Kennard	6b4303f358	r600g: Implement scratch buffer state management (v2) v2: add Glenn's fixes Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:12 +10:00
Glenn Kennard	9d31596d7a	r600g: Add pending output function Spills have to happen after the VLIW bundle currently processed, so defer emitting the spill op. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:08 +10:00
Glenn Kennard	9c48a139b0	r600g: Support emitting scratch ops Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:52:48 +10:00
Dave Airlie	2a891ed190	r600: fix texture gather swizzling. This fixes: KHR-GL45.texture_gather.swizzle on cayman and redwood. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:32:20 +10:00
Timothy Arceri	12a2350e6d	ac: add 64bit support to ac_find_lsb() v2: use LLVMBuildTrunc() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:42:59 +11:00
Timothy Arceri	a9f6b392c7	ac: move get_elem_bits() to ac_llvm_build.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:42:59 +11:00
Timothy Arceri	19f9839f0b	ac: add 64bit bitCount support v2: use LLVMBuildTrunc() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:42:59 +11:00
Samuel Pitoiset	bb750d265c	ac/nir: clean up handle_fs_outputs_post() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:33 +01:00
Samuel Pitoiset	528bc14fa5	ac/nir: add radv_load_output() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:30 +01:00
Samuel Pitoiset	834d9845ca	ac/shader: scan info about output PS declarations NIR->LLVM should only be a translation pass, and all scan stuff should be done before. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:27 +01:00
Samuel Pitoiset	a8e04e91de	ac/nir: add radv_export_param() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:26 +01:00
Samuel Pitoiset	e3cfd6b805	ac/nir: remove set but unused export_mask Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:24 +01:00
Samuel Pitoiset	724136d590	ac/nir: remove dead code in handle_vs_outputs_post() The memcpy can't be reached because the condition is always false. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:22 +01:00
Samuel Pitoiset	c63d8d0284	ac/nir: remove useless check in si_llvm_init_export_args() values can't be NULL because we use ac_build_export_null() now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:20 +01:00
Samuel Pitoiset	26ab5a4269	ac/nir: use ac_build_export_null() The number of enabled channels should be 0 when exporting null. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:11:44 +01:00
Samuel Pitoiset	bd9f7b7635	ac: add ac_build_export_null() helper Imported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-08 22:11:42 +01:00
Scott D Phillips	1f4d2433e7	meson: Add build option for tools Add a build option to control building some of the misc tools we have. Also set the executables to install; presumably you want that if you're asking for the build. v2: set 'install:' to the with_tools value, not true (Jordan) handle 'all' in the comma list (Dylan) Add freedreno's tools (Dylan) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-08 11:24:42 -08:00
Brian Paul	11e92889aa	gallium/util: silence clang warning in blitter code Silence "warning: comparison of constant 4294967295 with expression of type 'ubyte'". Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-02-08 10:27:31 -07:00
Brian Paul	4b0a45da25	tgsi: s/unsigned/enum tgsi_semantic/ in ureg_DECL_output() So the function matches the prototype. Found with clang. v2: fix copy&paste error Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-02-08 10:27:19 -07:00
Brian Paul	d95c2d86cc	tgsi: use TGSI_INTERPOLATE_x arguments instead of zeros in ureg code TGSI_INTERPOLATE_CONSTANT and TGSI_INTERPOLATE_LOC_CENTER have the value zero so there's no change in behavior. It seems funny to declare these fs input registers with constant interpolation. But it looks like ureg_DECL_input_layout() is not called anywhere and ureg_DECL_input() is only called from util_make_geometry_passthrough_shader(). Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	26948ba761	gallium/util: s/uint/enum tgsi_semantic/ in simple shader code Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	0f40f4ffda	tgsi: s/unsigned/enum pipe_shader_type/ in ureg code And add a default switch case to silence a compiler warning. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	c0dc337ecd	gallium/util: s/uint/enum tgsi_semantic/ in u_blitter.c And put static qualifier on const arrays. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	e55de6e20c	st/mesa: s/unsigned/enum tgsi_semantic/ st_cb_drawpixels.c Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	b9ff185e41	vbo: add a comment on vbo_draw_transform_feedback() Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	93b3d38176	gallium/util: trivial whitespace/formatting fixes in u_blit.c Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	5396f8546a	vbo: improve comments on vbo_draw_func() And rename a parameter name. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	b03ade55b9	cso: add a couple sanity check assertions in cso_draw_vbo() Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	5cf342704d	st/mesa: rename some vars related to indirect draw count 'indirect_params' was a bit vague. Use the names that we use in gallium's pipe_draw_indirect_info. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Marek Olšák	d9e6e0bbe3	st/mesa: remove out_num_textures from update_textures Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-08 16:14:11 +01:00
Marek Olšák	08496c5d52	st/mesa: don't store non-fragment sampler states and views in st_context those are unused. st_context: 10120 -> 3704 bytes Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-08 16:14:11 +01:00
Lionel Landwerlin	e843667733	i965: perf: cleanup detection of kernel support for loadable configs The initial revision of the patch adding loadable configs was testing the feature's availability by adding a new config successfully and then removing it. A second version tested the availability just by exercising the removal. But some unused code remained. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-08 10:52:14 +00:00
Lionel Landwerlin	bd6c0cab60	i965: perf: use drmIoctl() instead of ioctl() ioctl() might be interrupted, use drmIoctl() instead as it'll retry automatically. Fixes: `27ee83eaf7` "i965: perf: add support for userspace configurations" Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2018-02-08 10:51:40 +00:00
Lionel Landwerlin	0f952b778f	i965: perf: add debug messages for loaded configs This helps figuring out potential problems when metrics don't show up on frameretrace for example. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-08 10:51:01 +00:00
Dave Airlie	3f7a7bd897	r600: implement tg4 integer workaround. (v2) This ports the texture gather integer workaround from radeonsi. This fixes: KHR-GL45.texture_gather.plain-gather-uint/int* v2: add rect support, fix 2d array shadow Reviewed-by: Roland Scheidegger <sroland@vmware.com> (on irc) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-08 16:21:40 +10:00
Glenn Kennard	77b1b33724	r600: clean up initial shader register setup This is taken from Glenn Kennard's scratch series, but separated out as a cleanup by me. Reviewed-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-08 16:21:35 +10:00
Roland Scheidegger	b936f4d1ca	r600: partly fix sampleMaskIn value The hw gives us coverage for pixel, not for individual fragment shader invocations, in case execution isn't per pixel (eg, unlike cm, actually cannot do "real" minSampleShading, it's either per-pixel or per-fragment, but it doesn't really make a difference here). Also, with msaa disabled, the hw still gives us a mask corresponding to the number of samples, where GL requires this to be 1. Fix this up by masking the sampleMaskIn bits with the bit corresponding to the sampleID, if we know this shader is always executed at per-sample granularity. (In case of a per-sample frequency shader and msaa disabled, the sampleID will always be 0, so this works just fine there.) Fixing this for the minSampleShading case will need a shader key (radeonsi uses the prolog part for) (for eg, could get away with a single bit, cm would need more bits depending on sample/invocation ratio, or read the bits from a uniform), unless we'd want to always use a sample mask uniform (which is probably not a good idea, as it would make the ordinary common msaa case slower for no good reason). This fixes some parts of piglit arb_sample_shading-samplemask (with fixed test), in particular those which use a sampleID, still failing others as expected. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-08 04:07:52 +01:00
Roland Scheidegger	07d724326a	r600: clean up fragment shader input scan code For some reason, we were iterating through the code twice (first just for instructions needing barycentrics, then for instructions and input dcls). Move things around slightly so this is no longer necessary. There also was an unneeded enabling of the fixed_pt_position_gpr - this is only needed if the per-sample interpolation comes from an input, not from an instruction (just move the assert where it belongs) (since the sample id to sample from comes from a tgsi src in this case, and isn't sampleID). Otherwise there should be no functional change. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-08 04:07:52 +01:00
Roland Scheidegger	6fd3c39590	mesa: (trivial) remove unused ignore_sample_qualifier_parameter This parameter for _mesa_get_min_invocations_per_fragment() was once used by the intel driver, but it's long gone. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Dave Airlie <airlied@vmware.com>	2018-02-08 04:07:52 +01:00
Roland Scheidegger	becc7faae2	r600/cm: (trivial) code cleanup for emitting msaa state No functional change (compile tested only). Reviewed-by: Dave Airlie <airlied@redhate.com>	2018-02-08 04:07:52 +01:00
Brian Paul	b99cb13002	tgsi: use tgsi_semantic enum type in ureg code Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:43:01 -07:00
Brian Paul	174f3a4ab7	st/mesa: use tgsi_semantic enum type Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:43:01 -07:00
Brian Paul	0f7be4fc16	tgsi: use TGSI enum types in ureg code v2: fix enum tgsi_interpolate_mode/loc typo. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:42:39 -07:00
Brian Paul	9f9ce1625f	st/mesa: use TGSI enum types in st_glsl_to_tgsi.cpp Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:38:04 -07:00
Brian Paul	6321b1bd40	gallium/util: replace uint with tgsi enum types Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:38:04 -07:00
Brian Paul	15874338ff	gallium/util: replace unsigned with tgsi enum types Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:38:04 -07:00
Fredrik Höglund	5a38d8f103	radv: implement VK_EXT_external_memory_host Ported from the radeonsi GL_AMD_pinned_memory implementation. Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 00:46:07 +01:00
Dave Airlie	5dd385f378	r600: fix rendering regression on r6/7 gpus Fixes: `2d5b5d267e` (r600: work out target mask at framebuffer bind.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104989 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-08 09:37:09 +10:00
Grazvydas Ignotas	f91aa68ac6	radeonsi: avoid int-to-pointer-cast warnings on 32bit I hope the actual dropping of MSB is ok, but that's what's already happened before this change. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-08 01:13:58 +02:00
Grazvydas Ignotas	13ada91740	gallium/hud: update some query functions It seems these were missed when struct pipe_context * argument was added to hud_graph::query_new_value. Fixes: `3132afdf4c` "gallium/hud: pass pipe_context explicitly to most functions" Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-08 01:12:07 +02:00
Roland Scheidegger	09f49b9e50	Revert "gallium: build ddebug, noop, rbug, trace as part of auxiliary" This reverts commit `6f82b8d8d0`. This broke scons build, and reportedly clover with autotools/meson too.	2018-02-07 23:47:39 +01:00
Marek Olšák	6f82b8d8d0	gallium: build ddebug, noop, rbug, trace as part of auxiliary Building gallium is faster by 7.5 seconds on a 4core/8thread 3GHz CPU. (gallium build time is reduced by 15% when building only radeonsi) Non-recursive makefiles are great!	2018-02-07 22:08:34 +01:00
Roland Scheidegger	def09f8db0	u_blit: (trivial) fix bogus argument order for set_fragment_shader Amazingly this still worked sometimes, albeit I'm not even sure why... This fixes `d7bec6f7a6`.	2018-02-07 22:03:18 +01:00
Andres Rodriguez	83990dd529	mesa: fix incorrect type when allocating arrays The array members have type 'struct gl_buffer_object *' Found by coverity. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-02-07 14:50:21 -05:00
Roland Scheidegger	d7bec6f7a6	u_blit,u_simple_shaders: add shader to convert from xrbias format We need this to handle some oddball dx10 format (DXGI_FORMAT_R10G10B10_XR_BIAS_A2_UNORM). What you can do with this format is very limited, hence we don't want to add it as a gallium format (we could not express the properties of this format as ordinary format properties neither, so like all special formats it would need specific code for handling it in any case). While here, also nuke the array for different shaders for different writemasks, as it was not actually used (always full masks are passed in for generating shaders). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-02-07 17:09:37 +01:00
Roland Scheidegger	afd1e9be17	u_simple_shaders: fix mask handling in util_make_fragment_tex_shader_writemask The writemask handling was busted, since writing defaults to output meant they got overwritten by the tex sampling anyway. Albeit the affected components were undefined, so maybe with some luck it still would have worked with some drivers - if not could as well kill it... (This would have affected u_blitter but not u_blit since the latter always used xyzw mask.) Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-07 17:08:24 +01:00
Bas Nieuwenhuizen	5d754872b5	autotools: Only build libmesa-st-tests-common.a for tests. We don't need the library if we don't build tests, and building it adds a dependency on gtest which adds a dependency on cxxabi.h. Fixes: `6569b33b6e` "mesa/st/tests: unify MockCodeLine* classes" Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2018-02-07 14:04:04 +01:00
Tapani Pälli	9d322fde97	i965: add __DRI2_BLOB support and set cache functions v2: adjust to change that moved cache from ctx to screen Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	ae00ef2702	disk cache: add callback functionality v2: add disk_cache_has_key, disk_cache_put_key support using blob cache (Nicolai, Jordan) v3: rename set_cb as put_cb to match existing naming (Timothy) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	6a651b6b77	disk cache: initialize cache path and index only when used This patch makes disk_cache initialize path and index lazily so that we can utilize disk_cache without a path using callback functionality introduced by next patch. v2: unmap mmap and destroy queue only if index_mmap exists Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	e8495646af	glsl/tests: changes to test_disk_cache_create test Next patch will allow disk_cache instance to be created without path set for it, modify some test cases that assume disk_cache creation to fail with invalid path. Creation should succeed but simple put/get test fail. v2: leave tests as is but check that both cache struct exists and try simple put/get that should fail with invalid path set (Emil) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	83c81b6cce	glsl/tests: move utility functions in cache_test Patch moves functions higher so that we can utilize them from test_disk_cache_create which is modified by next patch. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	6f5b57093b	egl: add support for EGL_ANDROID_blob_cache v2: cleanup, move callbacks to _egl_display struct (Emil Velikov) adapt to earlier ctx->screen changes v3: remove useless checking, add _eglSetFuncName (Emil Velikov) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v2) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Samuel Pitoiset	757d36ee70	ac/nir: use new pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics Ported from RadeonSI. Only one F1 2017 shader is affected, code size decreased from 532 to 488 on both Polaris10 and Vega10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-07 12:42:13 +01:00
Samuel Pitoiset	2f54d7382d	ac/nir: avoid loading unused VS input components Polaris10: Totals from affected shaders: SGPRS: 122840 -> 120984 (-1.51 %) VGPRS: 78812 -> 78440 (-0.47 %) Spilled SGPRs: 177 -> 129 (-27.12 %) Code Size: 2950028 -> 2941276 (-0.30 %) bytes Max Waves: 17899 -> 17976 (0.43 %) Vega10: Totals from affected shaders: SGPRS: 117144 -> 115776 (-1.17 %) VGPRS: 77580 -> 77532 (-0.06 %) Spilled SGPRs: 0 -> 152 (0.00 %) Code Size: 3352656 -> 3347860 (-0.14 %) bytes Max Waves: 19756 -> 19866 (0.56 %) This increases SGPRs spilling a bit with Talos, but I have some other ideas that might reduce it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-07 12:42:09 +01:00
Samuel Pitoiset	1c57a6da5e	ac/shader: scan vertex inputs usage mask Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-07 12:42:07 +01:00
Iago Toral Quiroga	f474b19875	i965: allocate a SGVS element when VertexID or InstanceID are read Although on gen8+ platforms we can in theory use 3DSTATE_VF_SGVS to put these beyond the last vertex element it seems that we still need to allocate the SGVS element, otherwise we have observed cases where we end up reading garbage. Specifically, the CTS test mentioned below was flaky with a fail rate of ~1% on some gen9+ platforms caused by reading garbage for the gl_InstanceID value. The flakiness goes away as soon as we start allocating the SGVS element. v2: - Do this for gen8+, not just gen9+, and pull the boolean outside the #if block (Jason) Fixes flaky test: KHR-GL45.vertex_attrib_64bit.limits_test Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104335 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-07 11:11:16 +01:00
Dylan Baker	c74719cf4a	glapi: fix check_table test for non-shared glapi with meson v2: - Add glapitable_h generated source to requirements Fixes: `3218056e0e` ("meson: Build i965 and dri stack") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)	2018-02-06 15:00:17 -08:00
Dylan Baker	002fbde71e	glapi: Don't search through subdirs from glapitable.h Because meson won't put it in that folder. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-06 15:00:17 -08:00
Dylan Baker	aac3d01178	state_tracker: Don't build st-renumerate-test without shared glapi Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-06 15:00:17 -08:00
Dylan Baker	0316aa432d	glapi: remove APPLE extensions from test Fixes: `7009955281` ("mesa: Remove GL_APPLE_vertex_array_object stubs") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-06 15:00:17 -08:00
Dylan Baker	a4f1fc5dd1	glapi/check_table: Remove 'extern "C"' block Using 'extern "C"' around includes is always incorrect, as the header may contain C++ symbols (as it does in this case), which means it cannot use C linkage. In this case the header has a template in it, which obviously cannot be linked with C linkage rules. Fixes: `a29ad2b421` ("mesa/tests: Add tests for the generated dispatch table") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-06 15:00:17 -08:00
Dylan Baker	105178db8f	meson: fix test source name for static glapi fixes: `43a6e84927` ("meson: build mesa test.") Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-06 15:00:17 -08:00
Dylan Baker	9be7487f30	glapi: don't walk backwards for includes Instead just set the proper -I flags and include it from a more standard path. In this case we'll add -Isrc/mesa (which is common), and #include main/foo.h. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-06 15:00:17 -08:00
Brian Paul	e7a4536e64	mesa: rename gl_vertex_array_object::_VertexAttrib -> _VertexArray Since the type is gl_vertex_array. Update comment to explain that these arrays are only used by the VBO module. Also rename some local variables in _mesa_update_vao_derived_arrays(). Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-06 15:36:47 -07:00
Brian Paul	d9ab39ea65	mesa: minor whitespace fixes, line wrapping in texcompress.c Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-06 15:23:26 -07:00
Brian Paul	b38196b452	mesa: simplify _mesa_get_compressed_formats() Instead of testing for formats==NULL everywhere, just point formats at a dummy array which will be discarded. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-06 15:23:26 -07:00

92373 Commits