KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Rob Clark	4e47c205b9	freedreno/ir3: remove store_output lowered to store_shared_ir3 Fixes crashes that were unnoticed in CI because debug_assert() was not enabled (but become real crashes after the next patch): dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_highp_geometry dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_lowp_geometry dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_mediump_geometry dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_highp_geometry dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_lowp_geometry dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_mediump_geometry Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-04 13:08:52 -08:00
Rafael Antognolli	50f60d69e4	iris: Add restriction to 3DSTATE_CONSTANT_ packets. The following programming note shows up in all 3DSTATE_CONSTANT_* packets: "The sum of all four read length fields must be less than or equal to the size of 64." The backend compiler should guarantee this for us, so let's just add a check here. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	d3e339364f	anv: Use 3DSTATE_CONSTANT_ALL when possible. Use this new instruction introduced in Gen12. The instruction itself is smaller, and it also allows us to emit a single instruction to all stages that have the same push constant buffers (e.g. when they don't have constant buffers). There's one restriction to use this instruction, though: the length field is only 5 bits long, so we need to check whether we can use it, and fall back to the old 3DSTATE_CONSTANT_XS if that field is >= 32. v2: - Rebased on top of the latest changes from Jason. - Added review suggestions by Caio. - Removed struct push_bos and merged some code into anv_nir_compute_push_layout(). v3: - Remove code churn due to gen8+ workaround in anv_nir_compute_push_layout(). This code has been removed in an earlier commit, and implemented in cmd_buffer_emit_push_constant(). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	7d5da53d27	anv: Move code for emitting push constants into its own function. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	67d2cb3e93	anv: Add get_push_range_address() helper. Add a helper function to get the push range address. Once we have a separate function for emitting gen12 push constants, we can use this helper and avoid duplicating code. v3: Do not add range->start to the address in gen7 (Caio). v4: Do not drop range->start from gen7 (Caio, Jason). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	c0225a728e	anv: Move gen8+ push constant packet workaround. Store push_ranges in ascending order, and only "shift" them to the end of the array during state packet emission. We don't need this workaround with the new 3DSTATE_CONSTANT_ALL packet. So instead of applying the workaround here just for GEN < 12 (which requires an extra loop through all the ranges to figure out if we should shift them or not), we simply move the whole logic to the state emission code. At that point, in a later commit, we are already looping through all of the ranges anyway to check which packet we will be using, so we might as well implement the workaround there, where it is going to be used. v3: Move gen8+ workaround to the state emission code (Caio). v4: Add explanation of why we moved the workaround (Caio). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	06438ea7fa	iris: Use 3DSTATE_CONSTANT_ALL when possible. Use this new instruction introduced in Gen12. The instruction itself is smaller, and it also allows us to emit a single instruction to all stages that have the same push constant buffers (e.g. when they don't have constant buffers). There's one restriction to use this instruction, though: the length field is only 5 bits long, so we need to check whether we can use it, and fallback to the old 3DSTATE_CONSTANT_XS if that field is >= 32. v2 (Suggestions from Caio): - use max_length instead of large_buffers. - remove UNUSED and use #if GEN_GEN >= 12 instead. - inline "buffers" and drop BITSET_RANGE() usage. - add assert(n <= max_pointers) - move emit to outside of the loop. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	1ba9a18911	iris: Rework push constants emitting code. Split into a function the logic to gather the push constant buffers, which now stores them in struct push_bos. Another function is added to emit the packet, using data from the push_bos struct. This will be useful when adding a new function for emitting push constants for newer platforms. v2 (Suggestions from Caio): - rename 'n' -> 'buffer_count' - remove large_buffers (for now) - initialize push_bos - remove assert - change for() condition (i <= 3 -> i < 4) v3: - Add comment about size limit. - Rework "shift" logic and 'for' loop. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	9db044792f	intel/blorp: Use 3DSTATE_CONSTANT_ALL to setup push constants. In blorp, all the push constants are disabled, so we only need to emit a single 3DSTATE_CONSTANT_ALL with the bitmask for stage update appropriately set. v2: Update comment (Caio). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	8983622995	intel/aubinator: Decode 3DSTATE_CONSTANT_ALL. Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	2d127614a2	intel/genxml: Add 3DSTATE_CONSTANT_ALL packet. Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Jonathan Marek	1576ff5fbb	turnip: MSAA resolve directly from GMEM Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-04 14:39:06 -05:00
Jonathan Marek	abaaf0b2e7	turnip: don't set unused BLIT_DST_INFO bits for GMEM clear These bits are ignored when clearing so don't bother setting them. Note: MSAA samples when clearing comes from other registers (tu6_emit_msaa) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-04 14:39:06 -05:00
Jonathan Marek	4babdc7381	turnip: implement CmdClearAttachments Passes these deqp tests: dEQP-VK.api.image_clearing.core.attachsingle* Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-04 14:39:06 -05:00
Jonathan Marek	1dfa2e6c99	turnip: don't skip unused attachments when setting up tiling config This makes it easier to find the gmem_offset associated with an attachment. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-04 14:39:06 -05:00
Vasily Khoruzhick	8c12f4e5f2	lima: enable tiling Now that we have tiled format modifier merged into linux we can enable tiling. That should improve overall performance and also workaround broken mipmapping for linear textures since now we prefer tiled textures. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-12-04 08:20:56 -08:00
Tapani Pälli	272ef5d39a	glsl: additional interface redeclaration check for SSO programs Patch adds additional linker check for SSO programs to make sure they are redeclaring built-in blocks as required by the desktop spec. This fixes following Piglit tests: arb_separate_shader_objects/linker/pervertex-* Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-12-04 15:27:41 +00:00
Tapani Pälli	2d26cc077d	gitlab-ci: bump piglit checkout commit Commit also updates the Piglit quick_gl.txt, list modifications happened due to following Piglit commits: c248bf201, cacff58ca, 5603e2e60. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-12-04 15:27:41 +00:00
Rhys Perry	3e67aa2e4e	nir/load_store_vectorize: fix combining stores with aliasing loads between v2: add test Fixes: `ce9205c03b` ('nir: add a load/store vectorization pass') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v1) Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v2)	2019-12-04 12:21:40 +00:00
Timur Kristóf	637c5a1dd9	aco/wave32: Fix reductions. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	21db083504	aco/wave32: Allow setting the subgroup ballot size to 64-bit. Previously, it would only work when the ballot size was set to the lane mask. This patch makes it possible to set the ballot size to either 32-bit or 64-bit for wave32 mode. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	ed815d503e	aco/wave32: Use wave_size for barrier intrinsic. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	b8f2edb452	aco/wave32: Fix load_local_invocation_index to support wave32. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	e0bcefc3a0	aco/wave32: Use lane mask regclass for exec/vcc. Currently all usages of exec and vcc are hardcoded to use s2 regclass. This commit makes it possible to use s1 in wave32 mode and s2 in wave64 mode. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	b4efe179ed	aco/wave32: Add wave size specific opcodes to aco_builder. Several places in ACO we use SOP1 or SOP2 instructions to operate over the exec mask or VCC, and these need to be adapted to the new size in wave32 mode. This commit adds a way to deal with this problem in aco_builder: the caller can specify a wave size specific opcode and the builder will translate that to the correct opcode based on the current wave size. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	c44af6cbc7	aco/wave32: Introduce emit_mbcnt which takes wave size into account. This is relevant because in wave32 mode the v_mbcnt_hi_u32_b32 instruction is superfluous. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	07754a9c9e	aco/wave32: Replace hardcoded numbers in spiller with wave size. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	c0dbf42a03	aco/wave32: Change uniform bool optimization to work with wave32. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	dd9dad731b	aco: Optimize load_subgroup_id to one bit field extract instruction. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	753670e902	aco: Remove lower_linear_bool_phi, it is not needed anymore. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	0d2d672020	aco: Remove superfluous argument from emit_boolean_logic. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	9a43d26b74	aco: Fix operand of s_bcnt1_i32_b64 in emit_boolean_reduce. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Michel Dänzer	5585b8eadd	gitlab-ci: Run piglit glslparser & quick_shader tests separately And only use --process-isolation false for the quick_gl tests. This will hopefully avoid variance in the test results that we've been seeing lately. But even if it doesn't, it should at least help narrow down the cause of the variance. Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-04 10:36:33 +01:00
Lionel Landwerlin	ddacd3d43b	intel/perf: fix improper pointer access This expression was unused by the macro, probably why it didn't register in the compilation. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-04 09:21:15 +00:00
Lionel Landwerlin	8c0b058263	intel/perf: simplify the processing of OA reports This is a more accurate description of what happens in processing the OA reports. Previously we only had a somewhat difficult to parse state machine tracking the context ID. What we really only need to do to decide if the delta between 2 reports (r0 & r1) should be accumulated in the query result is: * whether the r0 is tagged with the context ID relevant to us * if r0 is not tagged with our context ID and r1 is: does r0 have an invalid context ID? If not then we're in a case where i915 has resubmitted the same context for execution through the execlist submission port v2: Update comment (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-04 09:21:15 +00:00
Lionel Landwerlin	b364e920bf	intel/perf: take into account that reports read can be fairly old If we read the OA reports late enough after the query happens, we can get a timestamp in the report that is significantly in the past compared to the start timestamp of the query. The current code must deal with the wraparound of the timestamp value (every ~6 minutes). So consider that if the difference is greater than half that wraparound period, we're probably dealing with an old report and make the caller aware it should read more reports when they're available. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-04 09:21:15 +00:00
Lionel Landwerlin	9d0a5c817c	intel/perf: set read buffer len to 0 to identify empty buffer We always add an empty buffer in the list when creating the query. Let's set the len appropriately so that we can recognize it when we read OA reports up to the end of a query. We were using a 0 timestamp value associated with the empty buffer and incorrectly assuming this was a valid value. In turn that led to not reading enough reports and resulted in deltas added to our counter values which should have been discarded because those would be flagged for a different context. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-04 09:21:15 +00:00
Lionel Landwerlin	acea59dbf8	intel/perf: fix invalid hw_id in query results Accumulation happens between 2 reports, it can be between a start/end report from another context. So only consider updating the hw_id of the results when it's not already valid and that we have a valid value to put in there. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `41b54b5faf` ("i965: move OA accumulation code to intel/perf") Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-04 09:21:15 +00:00
Pierre-Eric Pelloux-Prayer	a7bbebcfb9	radeonsi: display cs blit count for AMD_DEBUG=testdma Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-04 09:08:28 +01:00
Pierre-Eric Pelloux-Prayer	082d1c1686	radeonsi: implement sdma for GFX9 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-04 09:08:28 +01:00
Samuel Pitoiset	4cacba0c86	radv/gfx10: fix the vertex order for triangle strips emitted by a GS My fix wasn't totally correct as pointed out by Marek. Ported from RadeonSI. Fixes: `deafe4cc58` ("radv/gfx10: fix primitive indices orientation for NGG GS") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-04 08:28:57 +01:00
Samuel Pitoiset	dac6bd29ae	radv: simplify a check in radv_fixup_vertex_input_fetches() The number of loaded channels should always be > 0 now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-04 08:04:05 +01:00
Samuel Pitoiset	3b51259f06	radv: remove dead shader input/output variables No pipeline-db changes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-04 08:04:05 +01:00
Jason Ekstrand	0604768ae4	iris: Stop setting up fake params In `d1c4e64a69`, we added a parameter to tell the back-end compiler to ignore the param array and just push however many constants you ask it to push. Iris doesn't want to push anything so it gives a bogus number of parameters and trusts the back-end compiler to dead-code all of them. Now that we can tell the back-end compiler to stop re-arranging things, delete the hack and enable the new simpler code path. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-04 04:52:20 +00:00
Dave Airlie	713636766d	gallium/scons: fix graw-xlib build on OSX. Fixes: `44a6b0107b` (gallivm: add nir->llvm translation (v2)) Tested-by: Vinson Lee <vlee@freedesktop.org>	2019-12-04 13:24:44 +10:00
Dave Airlie	3263c9824e	llvmpipe: enable texcoord semantics To make NIR transitioning easier, move the driver to using texcoord semantics. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-04 12:08:14 +10:00
Jason Ekstrand	178a2946c0	anv: Respect the always_flush_cache driconf option Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-03 17:10:51 -06:00
Krzysztof Raszkowski	07adc47460	gallium/swr: Fix crash when use GL_TDFX_texture_compression_FXT1 format. Reject the new formats in swr to prevent crashes because it doesn't know how to handle the new formats. Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2019-12-03 16:51:24 +00:00
Rob Clark	b31637c453	gitlab-ci: disable junit results for deqp They don't seem to be hugely useful, and seem to be bogging down gitlab. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-03 08:46:39 -08:00
Jason Ekstrand	b1f37688ba	anv: Set up SBE_SWIZ properly for gl_Viewport gl_Viewport is also in the VUE header so we need to whack the read offset to 0 and emit a default (no overrides) SBE_SWIZ entry in that case as well. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-03 16:20:50 +00:00

118321 Commits