KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Connor Abbott	fa17295ebd	ir3: Add simple CSE pass RA currently can't handle a live value that's part of a vector and introduces extra copies. This was especially a problem for bary.f, where the bary coords were being split and repeatedly re-collected. But this could be a problem in other situations as well. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:24:06 -07:00
Connor Abbott	b1a1de76e8	ir3/sched: Consider unused destinations when computing live effect If an instruction's destination is unused, then we shouldn't penalize it. For example, this helps us schedule atomic operations whose results aren't read. This works around RA failures when CSE is enabled in some robustness2 tests. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:24:06 -07:00
Connor Abbott	ba8efeb7fa	ir3/sched: Make collects count against tex/sfu limits In a scenario where there are a lot of texture fetches with constant coordinates, this prevents the scheduler from scheduling all the setup instructions after the first group of textures has been scheduled because they are the only non-syncing thing and scheduling them didn't decrease tex_delay. Collects with immed/const sources will turn into moves of those sources, so we should treat them the same. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:24:06 -07:00
Connor Abbott	8b15c2f30c	ir3/sched: Don't schedule collect early I don't think there was ever a good reason to do this, but when we start folding constants/immediates into collect, this can become actively harmful. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:24:06 -07:00
Connor Abbott	27593cb241	ir3: Remove right and left copy prop restrictions This is leftover from the old RA, and inhibits copy propagation unnecessarily with the new RA. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:24:06 -07:00
Connor Abbott	2f51379d03	ir3/ra: Add a validation pass This helps catch tricky-to-debug bugs in RA, or helps rule them out. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:24:06 -07:00
Connor Abbott	0ffcb19b9d	ir3: Rewrite register allocation Switch to the new SSA-based register allocator. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:24:06 -07:00
Connor Abbott	df9f41cc02	ir3: Expose occupancy calculation functions Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:24:06 -07:00
Connor Abbott	3ac743c333	ir3: Add pass to lower arrays to SSA This will be run right after nir->ir3. Even though we have SSA coming out of NIR, we still need it for NIR registers, even though we keep the original array around to insert false dependencies. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:24:04 -07:00
Connor Abbott	d4b5a550ed	ir3: Add dominance infrastructure Mostly lifted from nir. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:20:38 -07:00
Connor Abbott	1f3546c9e2	ir3: Remove unused check_src_cond() Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:20:38 -07:00
Connor Abbott	c0789395e0	ir3/postsched: Don't use SSA source information This was only used for calculating if a source is a tex or SFU instruction, which is easily replaceable. It's going away with the new RA. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:20:38 -07:00
Connor Abbott	c947475533	ir3/delay: Delete pre-RA repeat handling It looks likely that any implementation of (rptN) in ir3 will have to actually create (rptN) instructions after RA, which means that this can be dropped. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:20:38 -07:00
Connor Abbott	58d82add87	ir3: Rewrite delay calculation The old delay calculation relied on the SSA information staying around, and wouldn't work once we start introducing phi nodes and making "normal" values defined in multiple blocks not array regs anymore. What's worse is that properly inserting phi nodes when splitting live ranges would make that code even more complicated, and this was the last place post-RA that actually needed that information. The new version only compares the physical registers of sources and destinations. It works by going backwards up to a maximum number of cycles, so it might be slightly slower when the definition is closer but should be faster when it is farther away. To avoid complicating the new method, the old method is kept around, but only for pre-RA scheduling and it can therefore be drastically simplified as the array case can be dropped. ir3_delay_calc() is split into a few variants to avoid an explosion of boolean arguments in users, especially now that merged_regs now has to be passed to it. The new method is a little more complicated when it comes to handling (rptN), because both the assigner and consumer may be (rptN). This adds some unit tests for those cases, in addition to dropping the to-SSA code in the test harness since it's no longer needed. Finally, ir3_legalize has to be switched to using physical registers for the branch condition. This was the one place where IR3_REG_SSA remained after RA. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:20:38 -07:00
Connor Abbott	c0823a2d31	ir3: Make branch conditions non-SSA In particular, make sure they have a physreg assigned. This was the last place after RA where SSA registers were created, which won't work with the new post-RA delay calculation that relies on the physreg. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:20:38 -07:00
Connor Abbott	fc7402b4cf	ir3: Add reg_elems(), reg_elem_size(), and reg_size() For working with registers in units of half-regs in the new RA. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:20:38 -07:00
Connor Abbott	890de1a436	ir3/delay: Fix full->half and half->full delay The current compiler never does this, but the new compiler will start to in mergeregs mode. There is an extra penalty for this. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:20:38 -07:00
Connor Abbott	9ad83f51eb	ir3: Add ir3_register::array.base There were two different approaches I saw in the post-RA code for figuring out what register range a relative access touched: 1. Use reg->array.offset and reg->array.size. This is wrong in case reg->array.offset was non-zero before RA, because array.size is the size of the whole array and array.offset has the const offset within the array baked in. 2. Lookup the array from the array ID and use the base + range there. This is correct, but won't work with the new RA, where an array might not always be assigned to the same register. This replaces both methods with a new ir3_register::array.base field, and switches all the users I could find to it. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:20:38 -07:00
Connor Abbott	939ee6966f	ir3: Improve register printing for SSA Print the ssa name for array destinations, and handle printing undef SSA sources. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:20:38 -07:00
Connor Abbott	edf23e15eb	ir3: Prepare for instructions with multiple destinations To simplify the pre-RA merge set code and express the result live-range splitting in RA, we need to add support for parallel copy instructions, and for the merge set code these parallel copies need to be in SSA form. Parallel copies have multiple destinations by necessity, but there was no way to express this in the existing IR. In particular there was no support for marking a register as being a destination, and no support for indicating which destination register out of several an SSA source refers to. This replaces ir3_register::instr with ir3_register::def and re-purposes ir3_register::instr. I haven't propagated this into common helpers, like ssa(), because that would vastly increase the amount of churn and the number of places that produce such instructions should be limited -- only RA will create parallel copies and they will be destroyed right after RA. In the future swz will have multiple destinations too, but it will only be created after RA via parallel copy lowering. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:20:38 -07:00
Connor Abbott	e1d7240576	ir3: Readd support for translating NIR phi nodes This is roughly based on the support removed a while ago, but it handles sources better by associating each source with a predecessor block. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:20:38 -07:00
Connor Abbott	0ef021be4a	ir3: Add ir3_start_block() Name based on nir_start_block(). A number of places were already open-coding this, convert them. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:20:38 -07:00
Connor Abbott	ef4e07a1a2	ir3: Introduce phi and parallelcopy instructions Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:20:38 -07:00
Rob Clark	3f758afe6a	freedreno: Fix fdperf flush We created and initialized the fence, but forgot to pass it to fd_submit_flush(). Fixes: `aafcd8aacb` ("freedreno: Re-work fd_submit fence interface") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11200>	2021-06-09 19:08:53 -07:00
Rob Clark	09f64f74db	freedreno/ir3: Fix use after free If the tex/sfu ssa src is from a different block than the one currently being scheduled, we do not have a valid sched-node. So fallback to previous behavior rather than dereference an invalid ptr. Fixes: `7821e5a3f8` ("ir3/sched: Don't penalize uses of already-waited tex/SFU") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10306>	2021-06-09 00:37:15 +00:00
Caio Marcelo de Oliveira Filho	8af6766062	nir: Move workgroup_size and workgroup_variable_size into common shader_info Move it out the "cs" sub-struct, since these will be used for other shader stages in the future. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11225>	2021-06-08 09:23:55 -07:00
Rhys Perry	1cbcfb8b38	nir, nir/algebraic: add byte/word insertion instructions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>	2021-06-08 08:57:42 +00:00
Caio Marcelo de Oliveira Filho	c8a7bd0dc8	nir: Rename WORK_GROUP (and similar) to WORKGROUP Be consistent with other usages in Vulkan and SPIR-V, and the recently added workgroup_size field. Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>	2021-06-07 22:34:42 +00:00
Caio Marcelo de Oliveira Filho	a71a780598	nir: Rename nir_intrinsic_load_local_group_size to nir_intrinsic_load_workgroup_size Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>	2021-06-07 22:34:42 +00:00
Caio Marcelo de Oliveira Filho	430d2206da	compiler: Rename local_size to workgroup_size Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>	2021-06-07 22:34:42 +00:00
Dmitry Baryshkov	cac88b5f06	freedreno/regs: split old/not used phy registers to separate DB In order to simplify main DSI host database, split away phy register definitions used on DSI v2 hosts to the separate database file. Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11075>	2021-06-05 19:20:50 +00:00
Eric Anholt	95d41a3525	ra: Use struct ra_class in the public API. All these unsigned ints are awful to keep track of. Use pointers so we get some type checking. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9437>	2021-06-04 19:08:57 +00:00
Danylo Piliaiev	20d8324a1b	turnip: implement VK_EXT_provoking_vertex Passes: dEQP-VK.rasterization.provoking_vertex.* Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11112>	2021-06-04 14:37:01 +00:00
Hyunjun Ko	41eaa07823	turnip/kgsl: Fix to build on android. Fixes: `3f229e34` ("turnip: Implement VK_KHR_timeline_semaphore.") Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11153>	2021-06-03 08:55:06 +00:00
Chia-I Wu	3ba3681b58	tu: use vk_default_allocator Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11117>	2021-06-03 08:13:26 +00:00
Emma Anholt	d3e419f9d8	ci/freedreno: Add some more known flakes from recent marge runs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11144>	2021-06-03 03:07:35 +00:00
Danylo Piliaiev	b71e27ea84	turnip: fix register_index calculations of xfb outputs nir_assign_io_var_locations() does not use outputs_written when assigning driver locations. Use driver_location to avoid incorrectly guessing what locations it assigned. Copied from lavapipe `8731a1beb7` Will fix provoking vertex tf tests when VK_EXT_provoking_vertex would be enabled: dEQP-VK.rasterization.provoking_vertex.transform_feedback.* Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11111>	2021-06-02 23:55:00 +00:00
Danylo Piliaiev	551d7fddfb	turnip: emit vb stride dynamic state when it is dirty Due to incorrect condition we never emitted vb stride if state was dynamically set. Fixes vertex explosion with Zink. See https://gitlab.freedesktop.org/mesa/mesa/-/issues/4738 Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11133>	2021-06-02 21:38:19 +00:00
Danylo Piliaiev	74aa09b22c	turnip: reset push descriptor set on command buffer reset Otherwise it will store a pointer to already unmapped memory which could lead to a crash in tu_CmdPushDescriptorSetWithTemplateKHR since it tries to copy data from the old memory. Fixes a crash with Zink's new lazy descriptor manager introduced in `bfdd1d8d` Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11137>	2021-06-02 16:01:40 +00:00
Matt Turner	09935c0dde	freedreno/afuc: Print uintptr_t with PRIxPTR Fixes a compilation error on 32-bit. Fixes: `bba61cef38` ("freedreno/afuc: Add emulator mode to afuc-disasm") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11118>	2021-06-02 03:57:20 +00:00
Tomeu Vizoso	bc50a16103	Revert "ci/freedreno: Skip Portal 2 trace on a630, due to flakiness" This reverts commit `e381bc0e67`. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Corentin Noël <corentin.noel@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11058>	2021-06-01 08:50:45 +02:00
Rob Clark	3dff0c30cf	freedreno/headergen2: Fix compile warnings with CP_DRAW_INDIRECT_MULTI Using stripes to deal with the different packet layout variants resulted in redefining "register" offsets with different values, so use "prefix" to add a suffix to disambiguate. drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h:1066: warning: "REG_A6XX_CP_DRAW_INDIRECT_MULTI_INDIRECT" redefined 1066 \| #define REG_A6XX_CP_DRAW_INDIRECT_MULTI_INDIRECT 0x00000006 \| drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h:1057: note: this is the location of the previous definition 1057 \| #define REG_A6XX_CP_DRAW_INDIRECT_MULTI_INDIRECT 0x00000003 \| (Admittedly it isn't really a "prefix" but that was the field in the schema available to use, and REG_INDEXED_CP_DRAW_INDIRECT_MULTI_STRIDE sounds somewhat more funny.) Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>	2021-05-31 23:34:43 +00:00
Rob Clark	ff5e17f1f8	freedreno/afuc: Use emulator to extract jmptbl This runs through the SQE bootstrap code to extract the packet-table, rather than relying on heuristics. As a bonus, it can detect the start of the LPAC fw in a660+ fw so that we can properly decode the LPAC fw and packet-table. Note that this decodes the jmptable as normal instructions, which is a change in behavior from the previous heuristic based jmptbl extraction. Not sure if that is a good or bad thing. For a5xx, for now the legacy heuristic based jmptable decoding is preserved, at least until enough control regs are figured out. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>	2021-05-31 23:34:43 +00:00
Rob Clark	2beb5b015a	freedreno/ci: Add real packet-table loading for afuc test When we start running the bootstrap code thru the emulator we will need the packet-table loading to actually happen. So add this. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>	2021-05-31 23:34:43 +00:00
Rob Clark	df14af6480	freedreno/afuc: Add emulator support to run bootstrap Run until the packet-table is populated, so the disassembler can use this to know the offsets of various pm4 packet handlers without having to rely on heuristics. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>	2021-05-31 23:34:43 +00:00
Rob Clark	ea2e244198	freedreno/afuc: Split out helpers to parse labels and packet-table Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>	2021-05-31 23:34:43 +00:00
Rob Clark	9a4ca194e8	freedreno/afuc: Extract full gpu-id Some of the a6xx gens will require some control reg initialization, and go into an infinite loop if they don't see the values they expect, so we'll need to extract the compute gpu-id. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>	2021-05-31 23:34:43 +00:00
Rob Clark	c2f8c98d56	freedreno/registers: Add a few a6xx regs and notes A few things I noticed while playing with the emulator. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>	2021-05-31 23:34:43 +00:00
Rob Clark	bba61cef38	freedreno/afuc: Add emulator mode to afuc-disasm This is an (at least somewhat complete) logical emulator of the a6xx SQE that lets us step through firmware execution (bootstrap, cmdstream pkt handling, etc). It lets us poke at various fw visible state and run through pm4 packet(s) to better understand what the fw is doing when it handles various packets. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>	2021-05-31 23:34:43 +00:00
Rob Clark	745dad0446	freedreno/afuc: Add pipe reg name decoding Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>	2021-05-31 23:34:43 +00:00
Rob Clark	184f474574	freedreno/afuc: Clean up special regs Allow for different mnemonics depending on whether they are used as source or destination register, to better reflect what they do. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>	2021-05-31 23:34:43 +00:00
Rob Clark	2876253f28	freedreno/afuc: Split out utils With disasm emulator mode, we'll start wanting some things that are duplicating what the assembler does, so just split out all the rnndb bits into shared utils. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>	2021-05-31 23:34:43 +00:00
Rob Clark	d367d84d87	freedreno/afuc: Split out instruction decode helper Split the giant switch/decode out into a helper function so that we can re-use it for emulator mode. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>	2021-05-31 23:34:43 +00:00
Rob Clark	33b9445a68	freedreno: Move pkt parsing helpers to common I'll be needing these in afuc as well. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>	2021-05-31 23:34:43 +00:00
Rob Clark	62c53d4361	freedreno/tu+drm: Extract out pm4 pkt header helpers I'm going to need these in a 3rd place, so let's deduplicate first. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>	2021-05-31 23:34:43 +00:00
Danylo Piliaiev	8d0c76b143	freedreno: reduce the upper bound of IB size by one Going beyond 0x100000 results in hangs, however I found that the last 0x100000 packet just doesn't get executed. Thus the real limit is 0x0FFFFF. At least this is true for a6xx. This could be tested by appending nops to the cmdstream and placing e.g. CP_INTERRUPT at the end, at any position other than being 0x100000 packet it results in a hang. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10786>	2021-05-31 17:38:26 +00:00
Danylo Piliaiev	f38fd3c577	turnip: place a limit on the growth of BOs There is a limit on IB size, which on freedreno is set to 0x100000. Going beyond it results in hangs, however I found that the last 0x100000 packet just doesn't get executed. Thus the real limit is 0x0FFFFF. This could be tested by appending nops to the cmdstream and placing e.g. CP_INTERRUPT at the end, at any position other than being 0x100000 packet it results in a hang. Fixes: dEQP-VK.api.command_buffers.record_many_draws_secondary_2 dEQP-VK.api.command_buffers.record_many_draws_primary_2 However these tests could trigger hangcheck timeouts. Also this fixes hangs when opening captures of games in RenderDoc. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10786>	2021-05-31 17:38:26 +00:00
Emma Anholt	9d28bac9d0	turnip: Make sure that SNORM blits don't clamp ambiguous -1.0 values. The CTS expects that some paths transfer SNORM data exactly, but the HW will clamp 0x80 values to 0x81 in the process. We can treat snorm as unorm, though, and get working compression without the clamping happening. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10735>	2021-05-27 19:27:40 +00:00
Emma Anholt	69df1e8650	turnip: Reorganize copy_format()'s switch statement. Now that we need FALLTHROUGH macros we weren't saving much by falling through, and things were weirdly ordered anyway with depth intermixed with color formats and the default case tucked in the middle of the switch statement. Replace with pretty obvious ordering of normal color, planar color, depth, then default. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10735>	2021-05-27 19:27:40 +00:00
Tomeu Vizoso	e381bc0e67	ci/freedreno: Skip Portal 2 trace on a630, due to flakiness Sometimes the reticle is missing. Skip it for now to keep the number of pipeline failures due to flakes low. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11033>	2021-05-27 14:07:44 +02:00
Tomeu Vizoso	6cd37e4bf0	Partial revert of "ci: Add a manual job for tracking the performance of Freedreno" This reverts commit `8e470457de`. Drop that job, as it's sometimes causing a gitlab-ci.yml parse error that isn't readily reproducible: Found errors in your .gitlab-ci.yml: 'a630-profile-traces' job needs 'arm_test' job but it was not added to the pipeline 'a630-profile-traces' job needs 'meson-arm64' job but it was not added to the pipeline Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11030>	2021-05-27 10:41:53 +02:00
Emma Anholt	26677008b9	ci/freedreno: Turn off default a530 quick_gl testing, do full quick_shader. The quick_gl set is too unstable -- even when I switched to a consistent set of tests, and added lots of flakes, I keep getting new ones as some test (unclear which, but it's like 7-8 minutes into the run) kills other innocent ones. Until we get per process pagetables, disable this testing by default. However, now that we've freed up a board that was doing quick_gl, we have time to do all of quick_shader so that piglit uprevs don't reshuffle the test list and expose new failures. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11018>	2021-05-26 20:49:47 +00:00
Tomeu Vizoso	26079868a7	ci/freedreno: Add depth32f_stencil8 flakes Started happening after disabling cpufreq, devfreq and runtime PM. At least one of these fail in each run, so it's blocking MRs. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7987>	2021-05-26 18:20:19 +00:00
Antonio Caggiano	8e470457de	ci: Add a manual job for tracking the performance of Freedreno Use Piglit's replay profile to measure and store the time that frames take to render in the GPU. This job won't run automatically in regular pipelines, but will be triggered automatically by a script for every successful pre-merge pipeline. This is because we want to generate performance data for every relevant commit merged in main, but we don't want to keep a device busy during the pre-merge run. Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-By: Rohan Garg <rohan.garg@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7987>	2021-05-26 18:20:19 +00:00
Emma Anholt	ee408df29c	ci/freedreno: Consolidate ssbo.fragment_binding_array flake annotation. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10933>	2021-05-24 16:42:33 +00:00
Emma Anholt	fec60d5bee	ci/freedreno: Drop VK flake annotations not seen in the last ~year. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10933>	2021-05-24 16:42:33 +00:00
Emma Anholt	4caf9b430d	ci/freedreno: Add a link explaining get_display_plane_capabilities Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10933>	2021-05-24 16:42:33 +00:00
Emma Anholt	a8c3783982	ci/freedreno: Drop a630 flake annotation from the go-fast changes. The async fix seems to have fixed it, haven't seen this one since May 3rd. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10933>	2021-05-24 16:42:33 +00:00
Emma Anholt	1dbaaa22f9	ci/freedreno: Clear stale validation failure flake annotation. Haven't seen it in my current set of IRC logs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10933>	2021-05-24 16:42:33 +00:00
Emma Anholt	a40479b6bc	ci/freedreno: Clear compswap flake annotation. These flakes disappeared around 2020-08 and the only sign since then has been some flakes on versions of the ir3 RA rewrite. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10933>	2021-05-24 16:42:33 +00:00
Connor Abbott	9350900fcd	ir3: Only use per-wave pvtmem layout for compute The blob seems to do this since a630, and it fixes spec@glsl-1.30@execution@fs-large-local-array on a650. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10922>	2021-05-21 20:45:07 +00:00
Connor Abbott	0ab01f4215	ir3: Call nir_lower_wrmask() again after lowering scratch I forgot that after rebasing on large_consts support that this is now called after the first time nir_lower_wrmask is called and can generate partial writemasks that need to be lowered. While we're here, also call the main optimization loop if things are lowered to scratch because it generates address arithmetic that may need to be cleaned up. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10922>	2021-05-21 20:45:07 +00:00
Emma Anholt	307139c7f9	vulkan: Avoid stomping array padding in the MemoryProperties wrapper. The deqp test for it expects that the unused array elements are untouched, so make sure they don't get replaced with random stack data. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10737>	2021-05-20 21:41:06 +00:00
Connor Abbott	de9f2170cc	ir3: Use round-to-nearest-even for fquantize2f16 We're supposed to map a floating-point value too large to be represented as fp16 to infinity, however round-to-zero naturally rounds it down to the largest representable fp16 number instead. The blob emits a bunch of fixup code to work around this, but instead we can just do what all the other drivers seem to do and use round-to-nearest-even instead. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10897>	2021-05-20 18:45:59 +00:00
Ian Romanick	7d85dc4f35	nir/algebraic: Equality comparison inversions require sources be numbers v2: Update A630 expected image checksum for minetest.trace. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake total instructions in shared programs: 21036690 -> 21049485 (0.06%) instructions in affected programs: 852085 -> 864880 (1.50%) helped: 240 HURT: 2514 helped stats (abs) min: 1 max: 46 x̄: 2.45 x̃: 2 helped stats (rel) min: 0.15% max: 4.30% x̄: 0.79% x̃: 0.55% HURT stats (abs) min: 1 max: 198 x̄: 5.32 x̃: 2 HURT stats (rel) min: 0.06% max: 10.71% x̄: 1.48% x̃: 1.04% 95% mean confidence interval for instructions value: 4.14 5.15 95% mean confidence interval for instructions %-change: 1.23% 1.34% Instructions are HURT. total cycles in shared programs: 856045255 -> 855816220 (-0.03%) cycles in affected programs: 16743786 -> 16514751 (-1.37%) helped: 790 HURT: 1973 helped stats (abs) min: 1 max: 10766 x̄: 627.97 x̃: 18 helped stats (rel) min: <.01% max: 32.59% x̄: 3.01% x̃: 0.64% HURT stats (abs) min: 1 max: 4078 x̄: 135.36 x̃: 18 HURT stats (rel) min: <.01% max: 54.56% x̄: 2.80% x̃: 0.82% 95% mean confidence interval for cycles value: -131.36 -34.42 95% mean confidence interval for cycles %-change: 0.88% 1.40% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). total spills in shared programs: 9771 -> 9766 (-0.05%) spills in affected programs: 47 -> 42 (-10.64%) helped: 1 HURT: 0 total fills in shared programs: 9451 -> 9430 (-0.22%) fills in affected programs: 91 -> 70 (-23.08%) helped: 1 HURT: 0 LOST: 16 GAINED: 51 All Intel GPUs from Sandybridge through Ice Lake had similar results. (Ice Lake shown) total instructions in shared programs: 20024781 -> 20025568 (<.01%) instructions in affected programs: 103309 -> 104096 (0.76%) helped: 12 HURT: 389 helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1 helped stats (rel) min: 0.20% max: 2.70% x̄: 1.36% x̃: 1.37% HURT stats (abs) min: 1 max: 8 x̄: 2.06 x̃: 1 HURT stats (rel) min: 0.05% max: 7.14% x̄: 1.25% x̃: 0.95% 95% mean confidence interval for instructions value: 1.78 2.15 95% mean confidence interval for instructions %-change: 1.06% 1.28% Instructions are HURT. total cycles in shared programs: 979419070 -> 979439180 (<.01%) cycles in affected programs: 4968711 -> 4988821 (0.40%) helped: 60 HURT: 381 helped stats (abs) min: 1 max: 1296 x̄: 96.92 x̃: 26 helped stats (rel) min: <.01% max: 27.10% x̄: 1.64% x̃: 0.65% HURT stats (abs) min: 1 max: 7320 x̄: 68.04 x̃: 30 HURT stats (rel) min: <.01% max: 19.77% x̄: 1.32% x̃: 0.87% 95% mean confidence interval for cycles value: 10.25 80.95 95% mean confidence interval for cycles %-change: 0.69% 1.15% Cycles are HURT. LOST: 1 GAINED: 2 GM45 and Iron Lake had similar results. (Iron Lake shown) total instructions in shared programs: 8128474 -> 8132527 (0.05%) instructions in affected programs: 642323 -> 646376 (0.63%) helped: 12 HURT: 1972 helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4 helped stats (rel) min: 0.72% max: 1.72% x̄: 1.09% x̃: 0.83% HURT stats (abs) min: 1 max: 16 x̄: 2.07 x̃: 3 HURT stats (rel) min: 0.12% max: 7.14% x̄: 0.77% x̃: 0.70% 95% mean confidence interval for instructions value: 1.99 2.10 95% mean confidence interval for instructions %-change: 0.74% 0.79% Instructions are HURT. total cycles in shared programs: 238280994 -> 238294376 (<.01%) cycles in affected programs: 8841250 -> 8854632 (0.15%) helped: 84 HURT: 1192 helped stats (abs) min: 4 max: 64 x̄: 12.50 x̃: 8 helped stats (rel) min: 0.02% max: 1.61% x̄: 0.28% x̃: 0.17% HURT stats (abs) min: 2 max: 198 x̄: 12.11 x̃: 12 HURT stats (rel) min: 0.02% max: 8.03% x̄: 0.28% x̃: 0.14% 95% mean confidence interval for cycles value: 9.65 11.32 95% mean confidence interval for cycles %-change: 0.22% 0.27% Cycles are HURT. No fossil-db changes on any Intel platform. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
Ian Romanick	4246c2869c	nir/algebraic: Invert comparisons less often This fixes the piglit test range_analysis_fsat_of_nan.shader_test. That test contains some code like o = saturate(X) > 0 ? vec4(1.0, 0.0, 0.0, 1.0) : vec4(0.0, 1.0, 0.0, 1.0); A clever optimizer will convert this to o = vec4(float(saturate(X) > 0), float(!(saturate(X) > 0)), 0, 1); Due to the ordering of optimizations in the compiler, the `saturate` operations are removed. This is safe even in the presence of NaN. o = vec4(float(X > 0), float(!(X > 0)), 0, 1); Since the calculations are not marked precise, an overzealous optimizer may reduce this to o = vec4(float(X > 0), float(X <= 0), 0, 1); When X is NaN, this will result in black being output. The GLSL spec gives quite a bit of leeway with respect to NaN, but that seems too far. The shader author asked for a result of red or green. A result of black is still "undefined behavior," but it's also a little mean. This also enables CSE to do its job better. v2: Update A530 expected image checksum for minetest.trace. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4531 Fixes: `0dbda153aa` ("nir/algebraic: Flag inexact optimizations") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake total instructions in shared programs: 21041563 -> 21041789 (<.01%) instructions in affected programs: 992066 -> 992292 (0.02%) helped: 526 HURT: 548 helped stats (abs) min: 1 max: 16 x̄: 2.48 x̃: 2 helped stats (rel) min: 0.04% max: 5.56% x̄: 0.74% x̃: 0.49% HURT stats (abs) min: 1 max: 27 x̄: 2.80 x̃: 2 HURT stats (rel) min: 0.04% max: 4.55% x̄: 0.59% x̃: 0.38% 95% mean confidence interval for instructions value: -0.00 0.42 95% mean confidence interval for instructions %-change: -0.12% <.01% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 855885569 -> 856118189 (0.03%) cycles in affected programs: 343637248 -> 343869868 (0.07%) helped: 907 HURT: 541 helped stats (abs) min: 1 max: 7724 x̄: 206.45 x̃: 36 helped stats (rel) min: <.01% max: 29.97% x̄: 1.01% x̃: 0.37% HURT stats (abs) min: 1 max: 14177 x̄: 776.09 x̃: 31 HURT stats (rel) min: <.01% max: 29.94% x̄: 1.24% x̃: 0.35% 95% mean confidence interval for cycles value: 84.30 237.00 95% mean confidence interval for cycles %-change: -0.32% -0.01% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). LOST: 3 GAINED: 5 Ice Lake total instructions in shared programs: 20027107 -> 20025352 (<.01%) instructions in affected programs: 1068856 -> 1067101 (-0.16%) helped: 1153 HURT: 273 helped stats (abs) min: 1 max: 14 x̄: 1.83 x̃: 1 helped stats (rel) min: 0.03% max: 5.66% x̄: 0.61% x̃: 0.35% HURT stats (abs) min: 1 max: 15 x̄: 1.29 x̃: 1 HURT stats (rel) min: 0.16% max: 1.30% x̄: 0.58% x̃: 0.60% 95% mean confidence interval for instructions value: -1.33 -1.13 95% mean confidence interval for instructions %-change: -0.43% -0.34% Instructions are helped. total cycles in shared programs: 979499227 -> 979448725 (<.01%) cycles in affected programs: 344261539 -> 344211037 (-0.01%) helped: 1079 HURT: 441 helped stats (abs) min: 1 max: 9384 x̄: 147.78 x̃: 48 helped stats (rel) min: <.01% max: 31.83% x̄: 0.90% x̃: 0.33% HURT stats (abs) min: 1 max: 7220 x̄: 247.07 x̃: 32 HURT stats (rel) min: <.01% max: 31.30% x̄: 1.52% x̃: 0.53% 95% mean confidence interval for cycles value: -70.01 3.56 95% mean confidence interval for cycles %-change: -0.35% -0.05% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 10564 -> 10568 (0.04%) spills in affected programs: 143 -> 147 (2.80%) helped: 0 HURT: 1 total fills in shared programs: 11343 -> 11347 (0.04%) fills in affected programs: 287 -> 291 (1.39%) helped: 0 HURT: 1 LOST: 3 GAINED: 2 Skylake total instructions in shared programs: 18192274 -> 18190128 (-0.01%) instructions in affected programs: 1000188 -> 998042 (-0.21%) helped: 1149 HURT: 55 helped stats (abs) min: 1 max: 14 x̄: 1.92 x̃: 1 helped stats (rel) min: 0.04% max: 6.67% x̄: 0.67% x̃: 0.42% HURT stats (abs) min: 1 max: 2 x̄: 1.05 x̃: 1 HURT stats (rel) min: 0.16% max: 0.55% x̄: 0.27% x̃: 0.26% 95% mean confidence interval for instructions value: -1.87 -1.69 95% mean confidence interval for instructions %-change: -0.67% -0.58% Instructions are helped. total cycles in shared programs: 960856054 -> 960728040 (-0.01%) cycles in affected programs: 340840968 -> 340712954 (-0.04%) helped: 1079 HURT: 233 helped stats (abs) min: 1 max: 7640 x̄: 170.95 x̃: 46 helped stats (rel) min: <.01% max: 30.20% x̄: 0.96% x̃: 0.28% HURT stats (abs) min: 1 max: 6864 x̄: 242.23 x̃: 26 HURT stats (rel) min: <.01% max: 34.64% x̄: 2.10% x̃: 0.22% 95% mean confidence interval for cycles value: -135.62 -59.53 95% mean confidence interval for cycles %-change: -0.59% -0.25% Cycles are helped. LOST: 15 GAINED: 1 Broadwell total instructions in shared programs: 17855624 -> 17853580 (-0.01%) instructions in affected programs: 1012209 -> 1010165 (-0.20%) helped: 1105 HURT: 52 helped stats (abs) min: 1 max: 13 x̄: 1.90 x̃: 1 helped stats (rel) min: 0.03% max: 6.67% x̄: 0.67% x̃: 0.36% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.13% max: 0.52% x̄: 0.26% x̃: 0.25% 95% mean confidence interval for instructions value: -1.86 -1.67 95% mean confidence interval for instructions %-change: -0.68% -0.58% Instructions are helped. 
total cycles in shared programs: 1029905447 -> 1029840699 (<.01%) cycles in affected programs: 347102680 -> 347037932 (-0.02%) helped: 1007 HURT: 211 helped stats (abs) min: 1 max: 1360 x̄: 89.76 x̃: 48 helped stats (rel) min: <.01% max: 16.26% x̄: 0.69% x̃: 0.25% HURT stats (abs) min: 1 max: 1297 x̄: 121.51 x̃: 20 HURT stats (rel) min: <.01% max: 31.31% x̄: 1.21% x̃: 0.20% 95% mean confidence interval for cycles value: -62.39 -43.92 95% mean confidence interval for cycles %-change: -0.47% -0.25% Cycles are helped. total spills in shared programs: 20335 -> 20333 (<.01%) spills in affected programs: 19 -> 17 (-10.53%) helped: 2 HURT: 0 total fills in shared programs: 25905 -> 25899 (-0.02%) fills in affected programs: 23 -> 17 (-26.09%) helped: 2 HURT: 0 LOST: 9 GAINED: 0 Haswell total instructions in shared programs: 16418516 -> 16417293 (<.01%) instructions in affected programs: 223785 -> 222562 (-0.55%) helped: 590 HURT: 67 helped stats (abs) min: 1 max: 15 x̄: 2.19 x̃: 1 helped stats (rel) min: 0.03% max: 6.52% x̄: 0.87% x̃: 0.60% HURT stats (abs) min: 1 max: 2 x̄: 1.04 x̃: 1 HURT stats (rel) min: 0.04% max: 1.85% x̄: 0.44% x̃: 0.25% 95% mean confidence interval for instructions value: -2.01 -1.71 95% mean confidence interval for instructions %-change: -0.80% -0.67% Instructions are helped. total cycles in shared programs: 1037179754 -> 1037084874 (<.01%) cycles in affected programs: 352541071 -> 352446191 (-0.03%) helped: 1093 HURT: 182 helped stats (abs) min: 1 max: 888 x̄: 111.03 x̃: 64 helped stats (rel) min: <.01% max: 27.30% x̄: 0.84% x̃: 0.20% HURT stats (abs) min: 1 max: 6777 x̄: 145.49 x̃: 21 HURT stats (rel) min: <.01% max: 24.10% x̄: 1.99% x̃: 0.29% 95% mean confidence interval for cycles value: -88.10 -60.73 95% mean confidence interval for cycles %-change: -0.58% -0.29% Cycles are helped. 
total spills in shared programs: 17457 -> 17456 (<.01%) spills in affected programs: 12 -> 11 (-8.33%) helped: 1 HURT: 0 total fills in shared programs: 20387 -> 20385 (<.01%) fills in affected programs: 15 -> 13 (-13.33%) helped: 1 HURT: 0 LOST: 6 GAINED: 1 Ivy Bridge and earlier platforms had similar results. (Ivy Bridge shown) total instructions in shared programs: 15515482 -> 15513998 (<.01%) instructions in affected programs: 239739 -> 238255 (-0.62%) helped: 573 HURT: 57 helped stats (abs) min: 1 max: 20 x̄: 2.73 x̃: 2 helped stats (rel) min: 0.03% max: 9.84% x̄: 0.94% x̃: 0.55% HURT stats (abs) min: 1 max: 2 x̄: 1.39 x̃: 1 HURT stats (rel) min: 0.09% max: 1.85% x̄: 0.52% x̃: 0.35% 95% mean confidence interval for instructions value: -2.57 -2.14 95% mean confidence interval for instructions %-change: -0.89% -0.73% Instructions are helped. total cycles in shared programs: 584509880 -> 584463152 (<.01%) cycles in affected programs: 11765280 -> 11718552 (-0.40%) helped: 661 HURT: 152 helped stats (abs) min: 1 max: 3073 x̄: 101.99 x̃: 32 helped stats (rel) min: <.01% max: 34.38% x̄: 1.46% x̃: 0.50% HURT stats (abs) min: 1 max: 6637 x̄: 136.10 x̃: 15 HURT stats (rel) min: <.01% max: 24.19% x̄: 1.75% x̃: 0.25% 95% mean confidence interval for cycles value: -82.79 -32.16 95% mean confidence interval for cycles %-change: -1.11% -0.61% Cycles are helped. 
LOST: 9 GAINED: 0 Tiger Lake Instructions in all programs: 160905127 -> 160900949 (-0.0%) SENDs in all programs: 6812418 -> 6812085 (-0.0%) Loops in all programs: 38225 -> 38225 (+0.0%) Cycles in all programs: 7431911114 -> 7433914697 (+0.0%) Spills in all programs: 192582 -> 192582 (+0.0%) Fills in all programs: 304539 -> 304537 (-0.0%) Ice Lake Instructions in all programs: 145296733 -> 145292370 (-0.0%) SENDs in all programs: 6863818 -> 6863485 (-0.0%) Loops in all programs: 38219 -> 38219 (+0.0%) Cycles in all programs: 8798257570 -> 8800204360 (+0.0%) Spills in all programs: 216880 -> 216880 (+0.0%) Fills in all programs: 334250 -> 334248 (-0.0%) Skylake Instructions in all programs: 135891485 -> 135887357 (-0.0%) SENDs in all programs: 6803031 -> 6802698 (-0.0%) Loops in all programs: 38216 -> 38216 (+0.0%) Cycles in all programs: 8442221881 -> 8444201959 (+0.0%) Spills in all programs: 194839 -> 194839 (+0.0%) Fills in all programs: 301116 -> 301114 (-0.0%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
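The NaN hazard described in "nir/algebraic: Invert comparisons less often" can be reproduced outside a shader. The sketch below mirrors the two forms of the expression from that commit message using Python floats, which follow the same IEEE 754 comparison rules; the function names are made up for illustration.

```python
import math

def shade_original(x):
    # o = vec4(float(X > 0), float(!(X > 0)), 0, 1)
    gt = x > 0.0
    return (float(gt), float(not gt), 0.0, 1.0)

def shade_inverted(x):
    # o = vec4(float(X > 0), float(X <= 0), 0, 1), the over-eager rewrite
    return (float(x > 0.0), float(x <= 0.0), 0.0, 1.0)

nan = math.nan
print(shade_original(nan))  # (0.0, 1.0, 0.0, 1.0) -> green
print(shade_inverted(nan))  # (0.0, 0.0, 0.0, 1.0) -> black
```

Both forms agree for every ordinary number; they only diverge for NaN, where every ordered comparison is false, so `!(X > 0)` and `X <= 0` are no longer interchangeable.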
Connor Abbott	e894e83e47	ir3/cf: Rewrite pass The old pass had a few bugs: - It tried to avoid folding f2f32 into f2f16, but didn't consider conversions that were already folded in. - It didn't prevent folding an f2f16 or f2f32 into a non-floating-point op. In addition it wasn't written in a manner which made handling integer conversions practical. This rewrites the pass to instead calculate the "type" of the conversion source and then check whether folding the conversion is allowed. This allows us to cleanly separate the declarative part where we describe how the HW works from the policy part where we decide whether the transform is allowed, and makes it simple to add support for folding integer conversions. Closes: #3208 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10859>	2021-05-19 20:03:19 +00:00
Danylo Piliaiev	931ad19a18	turnip: make cmdstream bo's read-only to GPU This makes stray GPU writes fault early instead of corrupting the cmdstream. This was already done for Freedreno long ago in: `04aff7e4` "freedreno: make cmdstream bo's read-only to GPU" Since private memory must be GPU-writable, it is now allocated separately instead of being suballocated from the now read-only cmdstream. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10807>	2021-05-17 18:29:09 +00:00
Danylo Piliaiev	413e7c6dc8	turnip: make it possible to create read-only bo with tu_bo_init_new The GPU won't be able to write to such BOs, which would be useful for cmdstream BOs. Move "bool dump" to the new flags along the way. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10807>	2021-05-17 18:29:09 +00:00
Connor Abbott	a40714abf7	nir/lower_phis_to_scalar: Add "lower_all" option We don't want to have to deal with vector phis in freedreno, because vectors are always split/unsplit around vectorized instructions anyways, and the stated reason for not scalarising them (it hurting coalescing) won't apply to us because we won't be using nir_from_ssa. Add this option so that we don't have to do the equivalent thing while translating from NIR. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10809>	2021-05-17 09:59:45 +00:00
Danylo Piliaiev	3a29e45a90	turnip: do not ignore early_fragment_tests Specifying "early_fragment_tests" in fragment shader takes precedence over our internal conditions. Fixes test: dEQP-VK.fragment_operations.early_fragment.early_fragment_tests_stencil Fixes: `b2a60c157e` "turnip: add LRZ early-z support" Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10803>	2021-05-17 11:56:32 +03:00
Rob Clark	a5a86adc23	freedreno/a6xx: Add a few registers Based-on: https://patchwork.freedesktop.org/patch/429745/?series=89269&rev=2 Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10829>	2021-05-16 18:47:55 +00:00
Dmitry Baryshkov	0c30ad402d	freedreno/regs: split DSI PHY registers to separate xml files. In-kernel DSI PHY driver is being more and more split from the main DSI driver. Split PHY registers from main dsi.xml file to ease further split of the drivers. Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10817>	2021-05-16 15:53:14 +00:00
Juan A. Suarez Romero	629e8347ad	ci: Update VK-GL-CTS to 1.2.6.1 Reviewed-by: Emma Anholt <emma@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10800>	2021-05-14 20:35:24 +00:00
Emma Anholt	341ecb2dfc	ci/freedreno: Skip refract on a306 now that it hangchecks sometimes. Not every MR, but several per day since I landed the apitrace switch which increased trace replay resolution to 1080p. Hopefully some day we can tune the hangcheck to be less aggressive. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10793>	2021-05-14 16:59:13 +00:00
Danylo Piliaiev	5a133ef1f2	ci/turnip: drop fail annotation for image.extend_operands_spirv1p4.* They were fixed in `ed20e69b` "vtn: Handle ZeroExtend/SignExtend image operands" Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10783>	2021-05-13 17:18:48 +00:00
Danylo Piliaiev	9a477ccbea	ci/turnip: drop fail annotation for float_control tests These tests are NotSupported and therefore cannot fail. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10783>	2021-05-13 17:18:48 +00:00
Rob Clark	6c530ebf40	freedreno/ir3: Don't force RTNE if rounding mode is undefined Forcing round-to-nearest-even results in loss of opportunities for conversion folding, causing a regression in gfxbench gl_alu2. Fixes: `de195671bd` ("ir3: nir_op_f2f16 should round to even") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10773>	2021-05-12 19:05:27 +00:00
Hyunjun Ko	3f229e34c9	turnip: Implement VK_KHR_timeline_semaphore. Implements non-shareable timelines using legacy syncobjs, inspired by anv/radv implementation. v1. Avoid memcpy in/out_syncobjs and fix some mistakes. v2. - Handle vkQueueWaitIdle. - Add enum tu_semaphore_type. - Fix to handle VK_SEMAPHORE_WAIT_ANY_BIT_KHR correctly. - Fix a crash of dEQP-VK.synchronization.timeline_semaphore.device_host.misc.max_difference_value. v3. Avoid indefinite waiting in vkQueueWaitIdle by calling tu_device_submit_deferred_locked itself. Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10126>	2021-05-12 05:07:44 +00:00
Hyunjun Ko	daefc6e2a4	turnip: prep work for timeline semaphore support Small refactor to classify semaphore types, though currently only binary syncobj is used. v1. Fix a crash of dEQP-VK.api.null_handle.destroy_semaphore Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10126>	2021-05-12 05:07:44 +00:00
Emma Anholt	7520ac54dd	ci: Switch to apitraces for glmark2 This brings in upstream mediump fixes, and should also replay faster than .rdc files. Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Rohan Garg <rohan.garg@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10295>	2021-05-11 20:07:29 +00:00
Emma Anholt	9a5c9ff342	turnip: Drop fail annotation for driver_properties. These subtests weren't run in CI, and the whole set is skipped since dropping to 1.1. Fixes: `7bcda21441` ("turnip: Demote API version to 1.1.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10734>	2021-05-11 14:16:25 +00:00
Emma Anholt	63a3d18ae1	ci/turnip: Add some links to issues and MRs for some test failures. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10734>	2021-05-11 14:16:25 +00:00
Emma Anholt	90b08175b7	ci/turnip: Clean up some stale fail annotations. This test group was fixed in the deqp 1.2.6.0 uprev, but we do a fractional run that didn't include these tests. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10734>	2021-05-11 14:16:25 +00:00
Danylo Piliaiev	811f289c56	turnip: copy all layers specified in vkCmdCopyImage When copying layered images we ignored .layerCount parameter. Fixes mis-rendering of walls in D3D11 game "Company Of Heroes 2". Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10736>	2021-05-11 11:42:12 +00:00
Emma Anholt	d93acf1001	freedreno: Update editorconfig and emacs settings for freedreno reformat. Fixes: `2d439343ea` ("freedreno: Re-indent") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10742>	2021-05-10 23:16:00 +00:00
Eric Anholt	3eee475e39	turnip: Claim 2 discrete queue priorities. The spec requires at least 2, but says "No specific guarantees are made about higher priority queues receiving more processing time or better quality of service than lower priority queues." So, we can just leave the priorities as a stub. Fixes dEQP-VK.info.device_properties Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10470>	2021-05-10 17:21:02 +00:00
Eric Anholt	d8099df65a	turnip: Drop wideLines properties since we don't support wide lines. The blob doesn't expose wideLines either, and dEQP-VK.info.device_properties fails if you claim wide line properties without it. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10470>	2021-05-10 17:21:02 +00:00
Rob Clark	3a772be026	freedreno: Add perfetto renderpass support Add a custom DataSource to provide trace events for render stages. Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9901>	2021-05-10 15:34:07 +00:00
Rob Clark	133a3e4dd3	freedreno/pps: Detect GPU suspend on newer kernels We can avoid re-sending the configuration cmdstream constantly if we know the device has not suspended since the last sampling period. Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9901>	2021-05-10 15:34:07 +00:00
Rob Clark	e63ef520fe	freedreno/drm: Add support to query device suspend count Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9901>	2021-05-10 15:34:07 +00:00
Rob Clark	3e13e45467	freedreno: Add freedreno pps driver Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com> Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9901>	2021-05-10 15:34:07 +00:00
Danylo Piliaiev	daad8f2245	freedreno/a5xx: SP_BLEND_CNTL has per-mrt blend enable bit Blending in SP_BLEND_CNTL is not a binary flag but the same mask as in RB_BLEND_CNTL. It is a per-mrt enable bit for blending. Copied from a6xx; a5xx should behave the same since it seems to have the same structure layout. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10682>	2021-05-07 19:12:17 +00:00
Danylo Piliaiev	14da2444a9	turnip,freedreno/a6xx: SP_BLEND_CNTL has per-mrt blend enable bit Blending in SP_BLEND_CNTL is not a binary flag but the same mask as in RB_BLEND_CNTL. It is a per-mrt enable bit for blending. Example SP_BLEND_CNTL produced by blob on a630 and different MRT blendings: SP_BLEND_CNTL: { UNK8 \| 0x6 } SP_BLEND_CNTL: { ENABLED \| UNK8 \| 0xe } (Decoded before this commit) Fixes mis-rendering with D3D11 game "Spelunky 2". Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10682>	2021-05-07 19:12:17 +00:00
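The difference between a single blend flag and a per-MRT mask, as described in the two SP_BLEND_CNTL commits above, can be sketched as follows. This is an illustration only: `blend_enable_mask` is a hypothetical helper, and placing MRT `i` at bit `i` is an assumption inferred from the decoded values quoted in the commit message (where the old `ENABLED` flag corresponds to the lowest bit), not a statement about the register XML.

```python
def blend_enable_mask(mrt_blend_enabled):
    """Build a per-MRT blend enable mask from a list of booleans.

    Assumption: MRT i is represented by bit i of the mask.
    """
    mask = 0
    for i, enabled in enumerate(mrt_blend_enabled):
        if enabled:
            mask |= 1 << i
    return mask

# Blending on MRT 1 and 2 only: a single binary flag cannot express this.
print(hex(blend_enable_mask([False, True, True])))        # 0x6
# Blending on MRT 0..3: the old decoding showed this as ENABLED | 0xe.
print(hex(blend_enable_mask([True, True, True, True])))   # 0xf
```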
Emma Anholt	0cd63e891d	turnip: Move the extension tables to tu_device.c Following intel's lead in `27d49670`. In the dEQP-VK.info.* tests, this bumps apiVersion from 1.1.128 to 1.1.177. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10635>	2021-05-06 00:14:12 +00:00
Emma Anholt	c5438450ad	turnip: Switch to the shared vulkan ICD generator. One less python script to maintain. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10635>	2021-05-06 00:14:12 +00:00
Eric Anholt	cc5df4398a	freedreno/a5xx: Fix up border color pointers. We were forgetting to increment in the loop, but also it looks from blob dumps on Pixel 2 like all the pointers it emitted were shifted up by 3 compared to our xml, and that's the same shift that a6xx uses for its pointers. None of the tests seem to use more than one border-color-requiring texture, so it's hard to tell. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9904>	2021-05-05 21:28:03 +00:00
Rob Clark	b447db41fc	freedreno/tools: Fix async flush vs fdperf/computerator They need to wait on the ready fence to ensure the submit has been flushed to the kernel. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10626>	2021-05-05 20:32:31 +00:00
Eric Anholt	7bcda21441	turnip: Demote API version to 1.1. We don't support major 1.2 required extensions like timeline semaphores. Fixes many complaints in the dEQP-VK.info.vulkan1p2.* group. We were originally bumped to 1.2 in `75755e0eba` ("turnip: Pretend to support Vulkan 1.2") but hopefully that build issue has been fixed in the entrypoint reworks since then. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10471>	2021-05-05 17:09:09 +00:00
Danylo Piliaiev	d8ab0ec8e4	turnip: implement VK_KHR_vulkan_memory_model No handling of Acquire/Release because at the moment the scheduler works as if any barrier is Acq+Rel. Instead of removing scoped_barrier with scope/mode that for TCS corresponds to a control_barrier or a memory_barrier_tcs_patch in ir3_nir_lower_tess_ctrl - remove them in emit_intrinsic_barrier. And do the same for memory_barrier_tcs_patch and control_barrier. While in any case hw fence/barrier shouldn't be emitted for them, they still affect ordering of stores, and in the future the ir3 backend may want to have that information. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9054>	2021-05-05 10:05:38 +00:00
Danylo Piliaiev	a898828a63	ir3: update bar/fence bits in accordance to blob On a6xx blob uses .l rather differently from a5xx. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9054>	2021-05-05 10:05:38 +00:00
Danylo Piliaiev	cb8a00791c	ir3: memory_barrier also controls shared memory access order nir_intrinsic_memory_barrier has the same semantic as memoryBarrier() in GLSL, which is: GLSL 4.60, 4.10. "Memory Qualifiers": "The built-in function memoryBarrier() can be used if needed to guarantee the completion and relative ordering of memory accesses performed by a single shader invocation." GLSL 4.60, 8.17. "Shader Memory Control Functions": "The built-in functions memoryBarrier() and groupMemoryBarrier() wait for the completion of accesses to all of the above variable types." Fixes tests: dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_local.image.guard_nonlocal.workgroup.comp dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_nonlocal.workgroup.guard_local.image.comp Fixes: `819a613a` ("freedreno/ir3: moar better scheduler") Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9054>	2021-05-05 10:05:38 +00:00
Eric Anholt	89114225b5	turnip: Add support for VK_EXT_separate_stencil_usage. We were implicitly including it in exposing VK 1.2, but we weren't making use of the supplied struct. Actually enabling it gives us a chance to do slightly better at Z/S UBWC, and means we won't lose the separate usage test coverage when switching back to exposing VK 1.1. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10594>	2021-05-04 20:30:50 +00:00
Eric Anholt	7c52a79057	ci/freedreno: Add another db820c flake that's appeared in the last few months. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10597>	2021-05-04 01:03:50 +00:00
Connor Abbott	9fa587ae96	ir3: Don't assume regs[1] exists in ir3_fixup_src_type() It won't exist for phi nodes because they are only partially constructed beforehand. Move it into the switch arguments where we know it's needed. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>	2021-05-03 19:52:31 +00:00
Connor Abbott	3c8a5d7e17	ir3: Rework outputs Instead of using a separate outputs array, make the "end" instruction (or chmask) take the outputs as sources. This works better for the new RA, because it better models the fact that outputs are consumed all at the same time. With the old model, each output collect would be assumed dead after it was processed and subsequent collects could use it when inserting shuffle code, which wouldn't work, and the new RA also deletes collect instructions after lowering them to moves so the information would be gone after RA. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>	2021-05-03 19:52:31 +00:00
Connor Abbott	dd55bd8f68	ir3: Make predecessors an array We need a stable order in order to create phi instructions. In the future we can make this more sophisticated in order to make manipulating the CFG easier, but for now that only happens after RA, so we won't have to worry about it. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>	2021-05-03 19:52:31 +00:00
Connor Abbott	0bd68b8386	ir3: Refactor nir->ir3 block handling Originally I wrote this to support multiple ir3 blocks per NIR block, but this turned out to be more useful for creating a stable ordering to the predecessors. We compute the predecessors ourselves, rather than relying on NIR, so that the array of predecessors we create in the next commit has a stable order we can rely on when creating phi nodes. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>	2021-05-03 19:52:31 +00:00
Connor Abbott	d28b22374c	ir3/cp_postsched: Fixup SSA use pointer for direct reads There's an optimization here to sink direct (i.e. not relative) reads of an array past unrelated direct writes. However, since each write actually reads, modifies, and then writes again to the array, this means that we need to read the latest updated array. The old RA used the array id instead of the SSA information, so it didn't care, but the new RA uses ->instr instead and ignores the array id because arrays are now SSA so it needs to be correct. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>	2021-05-03 19:52:31 +00:00
Connor Abbott	40a1c4ba2d	ir3/postsched: Fix ir3_postsched_node::delay calculation This wasn't using the same calculation that add_reg_dep() was using to get the index into state->regs, so it was using the wrong register. Fix this by folding it into add_reg_dep(). This shouldn't fix anything, because it's just used for scheduler priorities, but it should reduce nop's and syncs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>	2021-05-03 19:52:31 +00:00
Connor Abbott	4b41ffc231	ir3/delay: Remove special case for array deps The case it was trying to handle (array read-after-write dependencies) is already handled by the normal SSA source handling, so this is just useless. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>	2021-05-03 19:52:31 +00:00
Connor Abbott	873e21f4e9	ir3/postsched: Use correct src index Match what ir3_delay_calc() does. Caught by an assert later. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>	2021-05-03 19:52:31 +00:00
Connor Abbott	af7f29a78e	ir3/sched: Use correct src index Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>	2021-05-03 19:52:31 +00:00
Connor Abbott	7df7bab03b	ir3/cp: Clone registers for compare-folding optimization Sharing the same register between instructions happened to work with the old RA, but not with the new RA because they may get different register assignments. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>	2021-05-03 19:52:31 +00:00
Connor Abbott	e597f8b122	ir3/postsched: Fix dependencies for a0.x/p0.x a0.x is written as a half-reg, but just interpreting it as "hr61.x" will result in it overlapping with r30.z in merged mode, which is not what the hardware does at all. This introduced a spurious dependency on a write to r30.z which resulted in an assert tripping. Just pretend it's a full reg instead. This fixes spec@arb_tessellation_shader@execution@variable-indexing@vs-output-array-vec3-index-wr-before-tcs with the new RA. Fixes: `0f78c32` ("freedreno/ir3: post-RA sched pass") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>	2021-05-03 19:52:31 +00:00
Danylo Piliaiev	ea72be8b7c	ir3: do not fold cmps from different blocks with non-null address Scheduling doesn't like the address being in a different block from the instruction. Fixes a crash in the trace of "War Thunder" (DX11) Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10355>	2021-05-03 17:25:05 +00:00
Connor Abbott	3d5c1c4989	tu: Fix SP_GS_PRIM_SIZE for large sizes Based on the previous commit. Fixes: `012773b` ("turnip: Configure VPC for geometry shaders") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10551>	2021-05-03 14:06:24 +00:00
Connor Abbott	0157076982	freedreno/a6xx: Better document SP_GS_PRIM_SIZE Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10551>	2021-05-03 14:06:24 +00:00
Rob Clark	c554757bc2	freedreno/drm: Initialize control->fence Don't rely on getting a zero'd out buffer, we could hit the bo-cache. Fixes: `7dabd62464` ("freedreno/drm: Userspace fences") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10567>	2021-05-02 15:17:25 +00:00
Rob Clark	928453ccb2	freedreno/ci: Mark client_wait_sync_finish as flake This one has shown up a couple times since fd/go-fast, I'm still trying to reproduce/debug. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10530>	2021-05-01 08:47:50 -07:00
Rob Clark	5181f40670	freedreno/drm: Allow FD_BO_PREP_FLUSH without _NOSYNC This provides the upper layer (gallium, etc) a way to ensure that rendering involving the bo has been flushed all the way to the kernel. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10530>	2021-05-01 08:46:27 -07:00
Rob Clark	9fa3312773	freedreno/ci: Isolate dEQP-EGL reset_context tests To reduce flakes, separate out the dEQP-EGL tests that are intentionally triggering GPU hangs. This avoids some kernel side issues with bad handling of ringbuffer-full scenarios, causing innocent tests to flake. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10560>	2021-05-01 02:37:05 +00:00
Eric Anholt	0987df6a3e	ci/freedreno: Mark dEQP-EGL flakes reported on IRC since its introduction. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10552>	2021-04-30 16:44:20 -07:00
Danylo Piliaiev	1201aa9332	ir3: do not move varying inputs that depend on unmovable instrs Not all varying fetches can be pulled into the start block. If there are fetches we couldn't pull, like load_interpolated_input with an offset which depends on a non-reorderable ssbo load or on a phi node, this pass is skipped since it would be hard to find a place to set the (ei) flag (other than at the very end). We also don't have to manually set (ei) in such cases since a5xx and a6xx automatically release varying storage at the end. Earlier gens need further testing; however, they do not support interpolateAt* functions at the moment, so unless we would like to support sample shading on them, they are fine. Fixes crash in GTA V. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10483>	2021-04-30 14:49:18 +00:00
Juan A. Suarez Romero	e532a47f76	util/hash_table: do not leak u64 struct key For non-64-bit devices the key stored in hash_table_u64 is wrapped in a hash_key_u64 structure, which is never freed. This commit fixes this issue by just removing the user-defined `delete_function` parameter in hash_table_u64_{destroy,clear} (which nobody is using) and using instead a delete function to free this structure. Fixes: `608257cf82` ("i965: Fix INTEL_DEBUG=bat") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10480>	2021-04-29 12:58:23 +02:00
Rob Clark	1d19325483	freedreno/ci: Disable counterstrike trace on a306 for now The combination of removing bottlenecks in userspace (userspace fences, etc) and slow GPU results in hitting full ringbuffer on a306. Haven't figured out a reasonable way to work around that in userspace until a kernel fix is in place, so disable this one for now. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	f92f31455a	freedreno/drm: Assume explicit fences if in_fence_fd If we ever see explicit fencing used, then we can disable implicit fencing, even for internal or unfenced batches. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	e9a9ac6f77	freedreno/drm: Async submit support Move the submit ioctl to its own thread to unblock the driver thread and let it move on to the next frame. Note that I did experiment with doing the append_bo() parts synchronously on the theory that we should be more likely to hit the fast path if we did that part of submit merging before the bo was potentially re-used in the next batch/submit. It helped some things by a couple percent, but hurt more things. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	2c9e8db28d	freedreno/drm: pipe should hold reference to device A more direct solution would be for bo's to have a reference to the device. But bo's are ref/unrefd more frequently. This avoids async submits unrefing a bo after the device handle-table is freed. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	ad9654a4c1	freedreno/drm: fd_submit should hold ref to fd_pipe Also, move this into the base class, no reason for it to be in backend. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	cccdc513e3	freedreno/drm/sp: Implement deferred submit merging For submits flushed with (a) no required fence, and (b) no externally visible effects (ie. imported/exported bo), we can defer flushing the submit and merge it into a later submit. This is a bit more work in userspace, but it cuts down the number of submit ioctls. And a common case is that later submits overlap in the bo's used (for example, blit upload to a buffer, which is then used in the following draw pass), so it reduces the net amount of work needed to be done in the kernel to handle the submit ioctl. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/19 Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	c7dc5cf3cb	freedreno/drm/sp: Split submit prep and finish For deferred submits, we still need to do the prep steps immediately, but the ioctl construction can be deferred.. prepare for that by splitting the prep out. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	62a6773d80	freedreno/drm: Add pipe tracking for deferred submits Now that we have some bo state tracking for userspace fences, we can build on this to add a way for the pipe implementation to defer a submit flush in order to merge submits into a single ioctl. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	aafcd8aacb	freedreno: Re-work fd_submit fence interface Move everything into a struct associated with the pipe_fence_handle, so that the drm layer can fill in the seqn/fd fences directly. This will give us a convenient place to insert a util_queue_fence in the next commit. While we're at it, extract the uint32_t fence (previously called 'timestamp' in place, a kgsl legacy) into a struct that encapsulates both the kernel fence and the userspace fence. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	d531c8d22a	freedreno/drm: Reference count submits To merge submits, we'll need drm to internally hold an extra reference to the submit. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	49852ace2a	freedreno/drm: Inline the fence-table In the common case, a bo will have no more than a single fence attached. We can inline the storage to avoid a separate allocation in this case. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	7dabd62464	freedreno/drm: Userspace fences Add a per-fd_pipe fence "timeline" so we can detect cases where we don't need to call into the kernel to determine if a fd_bo is still busy. This reuses table_lock, rather than introducing a per-bo lock to protect fence state updates because (a) the common / hotpath pattern is to update fences on a lot of objects, but checking the fence state of a single object is less common, and (b) because we already hold the table lock in common spots where we need to check the bo's fence state (ie. allocations from the bo-cache). Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	df78934cdf	freedreno/drm: Add locked version fd_{bo,pipe}_del() This will be needed in the next patch, so we can reuse the bo table_lock for fence related locking. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	8f5c89350f	freedreno/drm: Move the growable array helper We'll need this to track userspace fences attached to a fd_bo. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	c98ada6ad1	freedreno/drm: Add FD_BO_PREP_FLUSH There are a couple cases where we want to use _NOSYNC, but at the same time we want to ensure that rendering related to a bo is actually flushed. This doesn't do anything yet, but when we start deferring/merging submits we'll need a way to trigger anything deferred to flush. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	8ab227c373	freedreno/drm: Cleanup bo cpu_prep flags Also add some STATIC_ASSERT() Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	7f0abd9048	freedreno/drm: Cleanup bo allocation flags Most of them were actually unused. The memory type (KMEM vs SMI) only applied to very old a2xx era devices that had a small/fast stacked memory (SMI) vs normal memory (KMEM). And the cache flags are ignored (ie. everything is writecombine), but we can add new cache flags later when they actually do something. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	ef0c5007f2	freedreno/drm: Move submit->primary to base class Gets rid of a bit of duplication between the two current implementations, and will be needed in next patch. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Rob Clark	224dbd77d5	freedreno: Small indent fix Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>	2021-04-28 15:36:42 +00:00
Danylo Piliaiev	addab037f0	tu: do not corrupt unwritten render targets There is no point in having a write to an attachment enabled when there is no corresponding output in the shader. Per the VK spec this is UB; however, a few apps depend on the attachment not being changed if the FS doesn't have a corresponding output. Fixes water in Genshin Impact. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10489>	2021-04-28 08:51:49 +00:00
Eric Anholt	7ae0719117	turnip: Only write the tu_RegisterDeviceEXT() out fence on success. Fixes a double-free in dEQP-VK.wsi.display_control.register_device_event where the fence that we destroyed got destroyed again. Leaving it unset in the error path leaves the test's NULL in place. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10473>	2021-04-27 16:19:26 +00:00
Italo Nicola	8074a040e7	util: add util_sign_extend This code is taken from src/freedreno/isa/decode.c. Since we need a similar function in panfrost, it's probably good to move it to utils. Signed-off-by: Italo Nicola <italonicola@collabora.com> Acked-by: Rob Clark <robclark@freedesktop.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9461>	2021-04-27 07:04:07 +00:00
Connor Abbott	643f2cb8a3	ir3, tu: Cleanup indirect i/o lowering Do all the necessary lowering in one place, during finalization, and stop uselessly calling nir_lower_indirect_derefs in turnip. Splitting i/o to elements should no longer be necessary since we use the i/o semantics instead of variables now. This has the side effect that we no longer generate enormous if-ladders for tess/GS shaders with turnip. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7274>	2021-04-26 17:07:02 +00:00
Connor Abbott	decfea2f4e	ir3: Prevent oob writes to inputs/outputs array Don't setup inputs and outputs if we aren't using load_input/store_output intrinsics. While it's mostly harmless, there may be more outputs than expected which would lead to an oob write of the outputs array when setting the register id to INVALID_REG. Also be more paranoid with asserts to catch this. Fixes: `a6291b1` ("freedreno/ir3: rework setup_{input,output} to make struct varyings work") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7274>	2021-04-26 17:07:02 +00:00
Robert Foss	9967dabe91	freedreno/regs: add 5nm DSI PHY/PLL regs This is for the kernel driver. Signed-off-by: Robert Foss <robert.foss@linaro.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10368>	2021-04-21 15:42:03 +00:00
Danylo Piliaiev	9402d5a6b5	ir3: make possible to specify branchstack up to 64 On a6xx/a5xx there is such dependency between branchstack bitfield and the amount of nested ifs, which could be seen with blob (IFs: BRANCHSTACK): 0: 0, 1: 1, 2: 2, 3: 2, 4: 3, 5: 3, 6: 4, ..., 59: 30, 60: 31, 61: 31, 62: 32, 63: 32, 64: 32. Remove open-coded branchstack for a5xx compute along the way. Fixes tests: dEQP-VK.spirv_assembly.instruction.compute.float16.opvectorshuffle.344 dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.344_vert dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.444_geom dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.244_tessc dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.344_frag Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9859>	2021-04-21 11:57:07 +00:00
Danylo Piliaiev	e7eed45869	ir3: do not double threadsize when exceeding branchstack limit We can't support more than compiler->branchstack_size diverging threads in a wave. Thus, doubling the threadsize is only possible if we don't exceed the branchstack size limit. As of blob version 512.490.0, it doesn't have this heuristic. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9859>	2021-04-21 11:57:07 +00:00
Danylo Piliaiev	1e33b6a32b	turnip: enable shaderInt16 We should have everything to enable it. 16b integer division is lowered by nir_lower_idiv. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10054>	2021-04-20 20:32:20 +00:00
Danylo Piliaiev	d918bbfa1c	ir3: treat 16b imul as mul.s24 Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10054>	2021-04-20 20:32:20 +00:00
Rob Clark	5bf7475460	ir3: handle 16b op_i2b1 Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10054>	2021-04-20 20:32:20 +00:00
Samuel Iglesias Gonsálvez	b2a60c157e	turnip: add LRZ early-z support Imported the logic from Freedreno driver. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7186>	2021-04-20 10:01:58 +00:00
Samuel Iglesias Gonsálvez	af049b6668	turnip: fix setting dynamic state mask for VK_DYNAMIC_STATE_STENCIL_OP_EXT case Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7186>	2021-04-20 10:01:58 +00:00
Samuel Iglesias Gonsálvez	88c7aa0b3e	turnip: group all geometry constant draw states in one Thus, we can free some draw state slots for future use. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7186>	2021-04-20 10:01:58 +00:00
Samuel Iglesias Gonsálvez	2c0c696f16	turnip: update LRZ state based on stencil test state Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7186>	2021-04-20 10:01:58 +00:00
Samuel Iglesias Gonsálvez	ff8e3547b3	turnip: implement LRZ direction There are some LRZ compare op switches that are not supported by the HW, like GREATER* <-> LESS* ones. This patch tracks the direction of the switch and disables LRZ if needed. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7186>	2021-04-20 10:01:58 +00:00
Eric Anholt	8a8e55d6a8	ci/freedreno: Test dEQP-EGL against Xorg. This should help us be able to refactor core EGL code with more confidence, and increase our confidence uprevving Mesa in ChromeOS. Part of #1884 Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10240>	2021-04-19 20:53:27 +00:00
Danylo Piliaiev	64367f2359	turnip: implement VK_KHR_shader_terminate_invocation OpTerminateInvocation provides the behavior required by the GLSL discard statement, which we already implement. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9460>	2021-04-19 17:11:36 +00:00
Danylo Piliaiev	9dd9424a85	turnip: implement VK_EXT_shader_demote_to_helper_invocation The "demote" intrinsic has the semantics of D3D discard, which means it doesn't change the control flow, allowing derivatives to work. On A6xx there is no known way to check whether invocation was demoted, thus we use nir_lower_is_helper_invocation. Add "logical" OPC_DEMOTE which is later translated to "kill". Such separation is necessary to run "kill" specific optimizations which are invalid for "demote". Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9460>	2021-04-19 17:11:36 +00:00
Connor Abbott	08499369d0	ir3: Assemble and disassemble swz/gat/sct Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10291>	2021-04-19 16:10:44 +00:00
Connor Abbott	d48d43039a	ir3: Improve cat1 modifier disassembly Remove bit that shouldn't be part of (rptN), and rewrite the handling of (even) and (pos_infinity) to uncover a missing (neg_infinity) modifier. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10291>	2021-04-19 16:10:44 +00:00
Connor Abbott	4c5b696cc3	ir3/parser: Fix oob write with immediates array immediates_count and immediates_size are supposed to have the same units, but it was only incrementing immediates_count by 1. While we're here, also fix the case where constants are specified out-of-order. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10291>	2021-04-19 16:10:44 +00:00
Rob Clark	c74d93cf01	freedreno/fdl: Re-indent Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>	2021-04-17 15:38:56 +00:00
Rob Clark	6050976232	freedreno/perfcntrs: Re-indent Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>	2021-04-17 15:38:56 +00:00
Rob Clark	d26a224ca9	freedreno/ir2: Re-indent clang-format -fallback-style=none --style=file -i src/freedreno/ir2/*.[ch] Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>	2021-04-17 15:38:56 +00:00
Rob Clark	2dbf09c2b4	freedreno/drm-shim: Re-indent clang-format -fallback-style=none --style=file -i src/freedreno/drm-shim/*.[ch] Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>	2021-04-17 15:38:56 +00:00
Rob Clark	45856c5fbc	freedreno/decode: Re-indent clang-format -fallback-style=none --style=file -i src/freedreno/decode/*.[ch] Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>	2021-04-17 15:38:56 +00:00
Rob Clark	3894bc9664	freedreno/computerator: Re-indent clang-format -fallback-style=none --style=file -i src/freedreno/computerator/*.[ch] Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>	2021-04-17 15:38:56 +00:00
Rob Clark	ccd68b672a	freedreno/common: Re-indent clang-format -fallback-style=none --style=file -i src/freedreno/common/*.[ch] Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>	2021-04-17 15:38:56 +00:00
Rob Clark	f5918f750f	freedreno/afuc: Re-indent clang-format -fallback-style=none --style=file -i src/freedreno/afuc/*.[ch] Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>	2021-04-17 15:38:56 +00:00
Rob Clark	b94db11708	freedreno/drm: Re-indent clang-format -fallback-style=none --style=file -i src/freedreno/drm/*.[ch] Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>	2021-04-17 15:38:56 +00:00
Eric Anholt	23159f1a7a	ci/freedreno: Skip some precision tests on a530. These have flaked as Timeouts in CI in the last month. .precision.* is generally very slow (some in the 15s-30s range), but it's unclear to me why they sometimes spike up to 60 seconds (thermal throttling?). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10274>	2021-04-16 04:34:14 +00:00
Eric Anholt	7d234da6ee	freedreno: Fix YUV sampler regression. We have to keep sampler uniforms around for later YUV lowering, and we only need to remove uniforms that take up storage space. Code comes from radeonsi. Closes: #4644. Fixes: `de17b4aab5` ("freedreno: Remove uniform variables after finalizing NIR.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10246>	2021-04-15 16:20:15 +00:00
Michel Dänzer	d200f45875	Use explicit break instead of fall-through to break-only case clang generates a warning if there's no explicit break or fall-through annotation. The latter would be kind of silly in this case, and not robust against any future changes turning the fall-through invalid. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>	2021-04-15 16:01:22 +00:00
Michel Dänzer	2928c21eb7	Convert most remaining free-form fall-through comments to FALLTHROUGH One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is explicitly disabled. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>	2021-04-15 16:01:22 +00:00
Connor Abbott	cf727e6ba4	tu: Expose VK_EXT_robustness2 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>	2021-04-15 16:05:13 +02:00
Connor Abbott	0fb14420da	tu: Handle null descriptors Writing all 0's, including for the format, seems to work. Actually setting the format seems to break textureSize() (getsize returns 1 for some reason). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>	2021-04-15 16:05:13 +02:00
Connor Abbott	f58ece08da	tu: Handle robust UBO behavior for pushed UBO ranges If we push a UBO range but then find out at draw-time that part of the pushed range is out of range of the UBO descriptor, then we have to fill in the rest of the range with 0's to mimic the bounds-checking that ldc would've done. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>	2021-04-15 16:05:13 +02:00
Connor Abbott	cb02a48f83	tu: Correctly preserve old push descriptor contents We were never setting set->size, so we were always copying 0 bytes. But as we only copy the contents when the layout and therefore the size is the same, we don't have to take the old size into account anyway. This fixes some VK_EXT_robustness2 tests that use push descriptors. Fixes: `6d4f33e` ("turnip: initial implementation of VK_KHR_push_descriptor") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>	2021-04-15 16:05:13 +02:00
Connor Abbott	c68ea960a7	ir3, tu: Add compiler flag for robust UBO behavior This needs to be part of the compiler because it's the only piece that we always have access to in all the places ir3_optimize_loop() is called, and it's only enabled for the whole Vulkan device. Right now it's just used for constraining vectorization, but the next commit adds another use. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>	2021-04-15 16:05:11 +02:00
Connor Abbott	8f54028479	ir3: Reduce max const file indirect offset base to 9 bits This fixes dEQP-VK.robustness.robustness2.bind.notemplate.r32i.dontunroll.nonvolatile.uniform_buffer.no_fmt_qual.len_260.samples_1.1d.frag, which accesses the shader UBO with c<a0.x + 512> due to the constant data UBO coming before it in the const file. The len_256 variant has a smaller constant data UBO, so it uses c<a0.x + 256> instead, and that works, so 512 seems to be the real limit. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>	2021-04-15 16:03:54 +02:00
Connor Abbott	8e11f0560e	ir3: Fix list corruption in legalize_block() We forgot to remove the instruction under consideration from instr_list before inserting it into the block's list, which caused instr_list to become corrupted. This happened to work but caused further corruption in some rare scenarios. Fixes: `adf1659` ("freedreno/ir3: use standard list implementation") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>	2021-04-15 16:03:54 +02:00
Eric Anholt	6d510fd473	ci/freedreno: Merge a630 piglit to a single job. piglit_gl clocked in at 6:12 end-to-end runtime, and piglit_shader spent 2:53 in deqp-runner, so merging them together should be about 9 minutes. Removing a boot should save us a minute or two of runner time per pipeline. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10243>	2021-04-15 10:06:14 +00:00
Samuel Iglesias Gonsálvez	029bc53be6	turnip: fix typo in tu_CmdBeginRenderPass2() Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>	2021-04-15 09:51:25 +02:00
Samuel Iglesias Gonsálvez	d52917f858	turnip/lrz: added support for depth bounds test enable Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>	2021-04-15 09:51:25 +02:00
Samuel Iglesias Gonsálvez	2161aebf8d	turnip: document GRAS_LRZ_CNTL's UNK5 bitfield It is used by the blob to enable depth bounds test for LRZ. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>	2021-04-15 09:51:25 +02:00
Samuel Iglesias Gonsálvez	54cf12774a	turnip/lrz: add support for VK_EXT_extended_dynamic_state When the depth or stencil state changes dynamically, that might affect LRZ state and we need to recalculate it and emit it again. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>	2021-04-15 09:51:20 +02:00
Samuel Iglesias Gonsálvez	6d6cbb7361	turnip: refactor how LRZ state is calculated Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>	2021-04-15 09:50:51 +02:00
Samuel Iglesias Gonsálvez	43ebba4e88	turnip: initialize pipeline->rb_{stencil,depth}_cntl always This change will simplify further changes on LRZ state management. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>	2021-04-15 09:50:51 +02:00
Samuel Iglesias Gonsálvez	1f9fb7677b	turnip: move pipeline gras_su and rb{stencil,depth}_cntl_mask initialization Move them up, so they are initialized even when the dynamic state is not used. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>	2021-04-15 09:50:51 +02:00
Rob Clark	31782330da	freedreno: Add missing foreach macros and update indentation Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10247>	2021-04-14 16:53:26 -07:00
Rob Clark	2fb3984805	freedreno: Add .clang-format Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8883>	2021-04-14 19:52:21 +00:00
Connor Abbott	2deead184c	ir3/sched: Don't schedule too many tex/SFU instructions Consider a simple loop that does a series of texture instructions and then reduces the results: vec4 sum = vec4(0); for (int i = 0; i < N; i++) { sum += texture(...); } Assume that the loop is unrolled and we schedule the resulting basic block. Right now, after we schedule the first texture instruction, the only instructions available to schedule that don't incur a sync are the instructions to setup the second texture instruction. So we keep picking the texture instructions, no matter how large N is, resulting in a pathological schedule for register pressure when N is very large: sum1 = texture(...); sum2 = texture(...); sum3 = texture(...); ... sum = sum1 + sum2 + sum3 + ...; In particular this happens with some CTS tests for VK_EXT_robustness2, where a loop like that with many iterations is marked as [[unroll]], forcing NIR to unroll it. This solution is a balance between the current approach and always scheduling for register pressure (and ignoring sync's). We only allow a certain number of texture fetches to be in flight before considering textures to "sync", even though they don't really, both because they likely will sync in reality (overflowing the internal queue of waiting texture instructions) and because at some point we need the normal algorithm to kick in and start lowering register pressure. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7571>	2021-04-14 17:33:58 +00:00
Connor Abbott	7821e5a3f8	ir3/sched: Don't penalize uses of already-waited tex/SFU Once we insert a use of a given tex or SFU instruction, then we must wait for that tex/SFU instruction (as well as all earlier ones) to complete, so we shouldn't penalize further uses, even if a subsequent tex/SFU instruction gets scheduled after the first use. This especially matters after the next commit when we start forcibly breaking up long sequences of texture instructions, since if we schedule a group of 8 texture instructions then we want to schedule the uses of those instructions in parallel with the next 8 texture instructions to reduce register pressure. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7571>	2021-04-14 17:33:58 +00:00
Michel Dänzer	af0fde955c	ci: Move docker images from Debian buster to bullseye Among other things, this gets us GCC 10 (was 6). Requires some changes to third party components we use: * Install apitrace (& waffle) from Debian; was hitting issues with the local build, and it's the same version 9.0 anyway. * Update Fossilize to a newer commit which builds with GCC 10. * apt.llvm.org repositories are no longer needed. * Use an SPIRV-LLVM-Translator commit which builds with LLVM 11.0.1. * Install XCB packages from Debian, 1.13 fails to build with Python 3.9. * Install wayland-protocols from Debian, 1.12 is too old for libgtk-3-dev in bullseye. LLVM 7/8 packages are no longer available. Also adapt expected test results to Xvfb now exposing multi-sample GLXFBConfigs. v2: * Install clang instead of clang-11. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3124 Reviewed-by: Eric Anholt <eric@anholt.net> # v1 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9833>	2021-04-14 13:05:08 +00:00
Connor Abbott	271c18f48e	tu: Expose VK_KHR_relaxed_block_layout This was absorbed into Vulkan 1.1, but we forgot to expose it separately. It's a subset of what's allowed by VK_EXT_scalar_block_layout. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8695>	2021-04-14 11:48:38 +00:00
Connor Abbott	765c3b85a5	tu: Expose VK_KHR_spirv_1_4 and VK_EXT_scalar_block_layout VK_KHR_spirv_1_4 is trivial because vtn already supports all the added SPIR-V features that aren't gated behind Vulkan extensions. I've observed some robustness2 CTS tests requiring this. However there are a few tests currently failing due to lacking spilling. VK_EXT_scalar_block_layout should also be trivial, since support for "straddling" UBO loads was added recently for other reasons. This is used by every robustness2 CTS test. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8695>	2021-04-14 11:48:38 +00:00
Juan A. Suarez Romero	9e5762c387	ci: Update VK-GL-CTS to 1.2.6.0 v2: - Bump up MESA_ROOTFS_TAG instead of arm_build (Michel) Acked-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10136>	2021-04-14 08:06:55 +00:00
Marek Olšák	fb29cef8dd	nir: add many passes that lower and optimize 16-bit input/outputs and samplers Added: * a pass that renumbers bases of IO intrinsics * a pass that converts mediump IO to 16 bits, optionally using the new packed varying slots * a pass that sets (forces) mediump in IO intrinsics (for testing) * a pass that remaps VARYING_SLOT_VAR[0..15]_16BIT to VARYING_SLOT_VAR[0..31] (if some shader stages don't want packed varyings) * a pass that folds type conversions around texture opcodes into those opcodes (e.g. tex(f2f32(coord), ..) is changed into tex accepting f16) * a pass that changes (legalizes) sampler src and dst types based on specified hw constraints (e.g. derivatives must be the same type as coordinates) Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9050>	2021-04-13 05:07:42 +00:00
Rhys Perry	a2619b97f5	nir/lower_idiv: add options to use fp32 for 8-bit division lowering Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081>	2021-04-12 16:19:46 +00:00
Danylo Piliaiev	16fd5bd996	turnip: support copying both aspects of D32_SFLOAT_S8_UINT We cannot copy both aspects at the same time, so copy them one by one. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10140>	2021-04-12 14:36:30 +00:00
Connor Abbott	ba796d5115	ir3/postsched: Make sure to schedule inputs before kill Before, we would prefer to schedule inputs before kills, which works assuming that the live range of the bary_ij system value doesn't get split and therefore all bary.f are ready at the start of the block. However live range splitting can mess up that assumption and cause a kill to get scheduled before a move that leads to a bary.f. This fixes e.g. dEQP-GLES2.functional.shaders.discard.basic_always on a3xx even before introducing CSE of collect instructions, but even after that it could be a problem theoretically as the register allocator doesn't guarantee that any live ranges aren't split. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10143>	2021-04-09 16:31:29 +00:00
Matt Turner	4251e9cddf	ir3: Don't count (nopX) towards the wrong category Prior to this commit (nop3) mad.f32 r0.y, c0.x, r1.w, c0.y was counted as 4 cat3 instructions (and still 3 cat0/nops) in shader-db results. With this change, it is counted as only 1 cat3 instruction. Probably never going to have better shader-db results than this in my career: total cat2 in shared programs: 1214667 -> 732058 (-39.73%) cat2 in affected programs: 1194729 -> 712120 (-40.39%) helped: 8551 HURT: 0 total cat3 in shared programs: 376448 -> 274745 (-27.02%) cat3 in affected programs: 344918 -> 243215 (-29.49%) helped: 7222 HURT: 0 Reviewed-by: Rob Clark <robdclark@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10116>	2021-04-09 14:26:35 +00:00
Bas Nieuwenhuizen	4ca4de50f7	nir: Remove nir_shader->shared_size. The same info is in shader_info. Dedupe. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10094>	2021-04-08 14:39:28 +00:00
Chad Versace	5e6db19168	anv: Remove vkCreateDmaBufINTEL (v4) Superseded by VK_EXT_image_drm_format_modifier. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v4) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1466>	2021-04-08 14:15:55 +00:00
Chad Versace	0845cabc72	vulkan: Track dependencies of Python imports The meson.build was unaware of transitive dependencies introduced by Python imports. Android still needs fixing. But I did not update the Android files lest I break the build. Ideally, we would fix this by using a Python runner that generates a depfile, similar to how meson creates depfiles for C files by passing flags -MD -MQ -MF to gcc. But this patch gets the job done, without stalling on the ideal general solution, by manually tracking the Python imports in new 'foo_depend_files' variables. CC: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1466>	2021-04-08 14:15:54 +00:00
Connor Abbott	5a70c4d4a0	ir3: Don't copy propagate arrays in ir3_cp We don't check whether there's an intervening write in this pass, which makes it incorrect. ir3_cp_postsched does check correctly, but we were accidentally doing it here anyway for some sources. While we're here, delete some code that was only used in the array case. Fixes: `f370e954` ("freedreno/ir3: handle const/immed/abs/neg in cp") Reviewed-by: Rob Clark <robdclark@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10076>	2021-04-07 14:35:13 +00:00
Connor Abbott	1ad5ee5a04	ir3/cp_postsched: Set address of uses for relative mov's Fixes: `680ca5b` ("freedreno/ir3: add post-scheduler cp pass") Reviewed-by: Rob Clark <robdclark@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10076>	2021-04-07 14:35:13 +00:00
Connor Abbott	dcc26a3945	ir3: Fix valid flags for STIB Disallow immediates for the source. This was hidden by the fact that we didn't copy-propagate trivial collect instructions. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10076>	2021-04-07 14:35:13 +00:00
Connor Abbott	94beaa1d92	ir3/legalize: Fix last input (ss) insertion If there was a mix of ldlv and bary.f and we inserted an (ss) after the last input which was a bary.f, then last_input_needs_ss would get unset, even though it shouldn't. For figuring out whether we need the (ss), we need to know whether there are any pending ldlv's when last_input gets executed, not at the end of the block, which means that the existing code's strategy of inserting it after the whole block has been processed won't work. Rework it to do the last_input processing in the main loop instead. Reviewed-by: Rob Clark <robdclark@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10076>	2021-04-07 14:35:13 +00:00
Connor Abbott	35ffe4fec1	freedreno/a3xx: Fix SP_FS_CTRL_REG1_INITIALOUTSTANDING Unfortunately this didn't fix anything, but I thought I might as well include it. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10076>	2021-04-07 14:35:13 +00:00
Danylo Piliaiev	519eb735a3	turnip: implement variableMultisampleRate If a subpass doesn't have depth/color attachments, the sample count is derived from VkPipelineMultisampleStateCreateInfo::rasterizationSamples. Without variableMultisampleRate enabled, all pipelines in such a subpass should have the same sample count; variableMultisampleRate allows pipelines with different numbers of samples in one subpass, given that it doesn't have depth/color attachments. Blob doesn't have it enabled but there is no known reason for this. Passes: dEQP-VK.pipeline.multisample.variable_rate.* Fixes test: dEQP-VK.pipeline.framebuffer_attachment.no_attachments_ms Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Hyunjun Ko <zzoon@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9556>	2021-04-07 12:04:45 +00:00
Alejandro Piñeiro	1e0a69afa7	vulkan: track number of bindings instead of max binding for CreateDescriptorSetLayout As that handles better, and more clear, the case of bindingCount being zero. For the case of Anvil and Turnip, this avoids allocating a non-needed binding when bindingCount is zero. Inspired on radv, that was what it was doing so far. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4526 Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Hyunjun Ko <zzoon@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9905>	2021-04-05 20:17:53 +00:00
Danylo Piliaiev	0709a6b363	turnip: fix alignment of non-32b types in workgroup memory Fixes tests: dEQP-VK.spirv_assembly.instruction.compute.workgroup_memory.float16 Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10039>	2021-04-05 17:31:11 +00:00
Alyssa Rosenzweig	06ebbde630	vulkan: Deduplicate mesa stage conversion Across every driver... v2: Add casts to appease -fpermissive used on CI. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9477>	2021-04-03 17:34:39 +00:00
Danylo Piliaiev	0ec495e3c9	turnip: handle format list for compressed formats Compressed formats may have compatible formats, however they could only be sampled, so we should not call tu6_format_color with them. tu6_format_texture should have the same behaviour for checking swap so use it for all cases. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10009>	2021-04-02 21:52:05 +00:00
Eric Anholt	f67b6f9c47	ci/freedreno: Fix up the a5xx border color flake annotation. Looks like I put it in the wrong file back when I first caught it. It's a one-or-twice-a-week back flake that seems to happen. The upcoming deqp-runner uprev would have caught this mistake. Fixes: `957132294f` ("ci/a5xx: Increase the gles3/31 coverage.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9806>	2021-04-02 18:42:04 +00:00
Eric Anholt	adf04d1af4	ci/freedreno: Switch to the trimmed glxgears trace. The old one had a ton of frames and took ~5 minutes on a306. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9957>	2021-04-01 21:04:11 +00:00
Eric Anholt	fe5349f70c	freedreno/a6xx: Fix alpha tests. Apparently I inverted the sense of this flag back when we didn't have piglit testing. Fixes terrible rendering in minetest, HL2, CS:Source, and CS. Fixes: `0369dd9077` ("freedreno/a6xx: Add ARB_depth_clamp and separate clamp support.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9957>	2021-04-01 21:04:11 +00:00
Eric Anholt	3043940183	freedreno/a5xx: Fix alpha test vs early Z bugs. Just like with discards, we have to disable early Z writes when alpha test is enabled. Fixes rendering on HL2, CS: Source, counter-strike, and minetest. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9957>	2021-04-01 21:04:11 +00:00
Eric Anholt	c9fd8c2570	ci/freedreno: Add trace testing on a3xx, a5xx. Having compared rendering between a6xx and these, I found several bugs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9957>	2021-04-01 21:04:11 +00:00
Eric Anholt	8e3a1d0dd2	ci/freedreno: Rename a306-test and a530-test to drop "arm64" from the name. We don't have an armhf variant, and probably won't. Now matches a630. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9957>	2021-04-01 21:04:11 +00:00
Eric Anholt	ec54546b2a	ci/freedreno: Add more new traces for a630 (minetest, TDM, pioneer, glyphy). These are all recent traces that have been added. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9957>	2021-04-01 21:04:11 +00:00
Danylo Piliaiev	ce1a381e57	turnip: enable VK_KHR_16bit_storage on A650 A650 can use the same SSBO descriptor for both 32-bit and 16-bit access, which makes it easy to enable this extension. Passes tests that run under: dEQP-VK.spirv_assembly.instruction..16bit_storage. Rebased and modified commit from Jonathan Marek. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>	2021-04-01 17:51:07 +00:00
Jonathan Marek	14acc64c3b	turnip: enable VK_KHR_shader_float16_int8 ir3 supports 16-bit floats, so we can enable this. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>	2021-04-01 17:51:07 +00:00
Danylo Piliaiev	64aaa4afc3	turnip: enable infinities for f16 math and document the register When float16 is enabled this will allow to pass a number of float16 tests. When A6XX_SP_FLOAT_CNTL_F16_NO_INF is set - all operations which generate +-infinity generate +-MAX_HALF_FLOAT. Fixes some tests from: dEQP-VK.spirv_assembly.instruction..float16. dEQP-VK.spirv_assembly.instruction..float_controls.fp16. E.g.: dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_1.sinh_vert dEQP-VK.spirv_assembly.instruction.compute.float16.arithmetic_4.length dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.log_denorm_flush_to_zero_nostorage dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.log2_denorm_flush_to_zero_nostorage dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.inv_sqrt_denorm_flush_to_zero_nostorage Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>	2021-04-01 17:51:07 +00:00
Danylo Piliaiev	14460faa64	ir3: convert shift amount to 16b for 16b shifts NIR has shifts defined as: opcode("*shr", 0, tuint, [0, 0], [tuint, tuint32], False, ... However, in ir3 we have to ensure that both operands of a shift instruction have the same bitness. Let's hope that in future the additional COV for constants would be optimized away. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>	2021-04-01 17:51:07 +00:00
Jonathan Marek	3777ecdf11	turnip: implement VK_KHR_shader_float_controls This matches the blob and doesn't require actually implementing controls since the supported modes are just what the HW does. Passes tests under: dEQP-VK.spirv_assembly..float_controls. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>	2021-04-01 17:51:07 +00:00
Danylo Piliaiev	de195671bd	ir3: nir_op_f2f16 should round to even cat1 instructions round to zero by default. When fp16 is enabled this will fix: dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_conv_from_fp32_nostorage_frag dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_conv_from_fp32_nostorage_vert dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.rounding_rte_conv_from_fp32_nostorage Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>	2021-04-01 17:51:07 +00:00
Michel Dänzer	6652c5018c	ci: Merge ARM testing docker images to a single arm_test one The merged image contains kernels & rootfs for both arm64 & armhf baremetal test jobs, and is smaller than either arm{64,hf}_test image before. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9955>	2021-04-01 16:35:26 +00:00
Michel Dänzer	4b20bd7425	ci: Build ARM baremetal rootfs in native container Doing so in an x86 container via qemu was slow, and started failing recently after updating to a newer qemu version. This also results in smaller arm_test docker images, since we need to install fewer Debian packages in them. As a bonus, this turns some piglit tests from fail to pass (Or maybe they'll turn out to be flakes? They've passed at least 3 times in a row). Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9955>	2021-04-01 16:35:26 +00:00
Eric Anholt	0be9a40225	ci/freedreno: Demote a630-asan to a manual test for now. It's flaky in producing Missing results. I've got an uprev that should avoid the issue (and possibly a followon actual fix), but it's blocked on being able to rebuild the arm containers. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9932>	2021-03-31 17:15:27 +00:00
Danylo Piliaiev	00d6ccebf9	ir3/isa: account for randomly set by blob lowest bit of ibo atomics As far as I could see - blob randomly sets the lowest bit of atomic.b.* instructions. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9841>	2021-03-31 15:03:35 +00:00
Bas Nieuwenhuizen	83c92a48b7	vulkan: Fix descriptor set creation with zero bindings. MAX2(count * struct size, 1) results in 1 for count=0, not the size of a struct. Since this MAX only seems to exist so we can keep using NULL for error reporting, just refactor to return a VkResult. Fixes: `ad241b15a9` ("vk: consolidate dynamic descriptor binding sorting") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4522 Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9880>	2021-03-29 23:32:50 +00:00
Matt Turner	0b35987895	tu: Skip tu_tiling_config_update_tile_layout() if not using gmem Otherwise pass->tile_align_w will be 0, leading to a divide by zero and undefined behavior. In practice, I saw this lead to an infinite loop in tests like dEQP-VK.draw.instanced.draw_indexed_indirect_vk_primitive_topology_line_list_attrib_divisor_0_multiview Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9606>	2021-03-29 21:58:24 +00:00
Eric Anholt	99838513ae	freedreno/a5xx: Add support for clip distances and use them for userclip. A little low-stakes RE effort as I unwind from fighting CI all day. Comes from diffing dEQP-VK.clipping.user_defined.clip_distance.vert.* on the blob and comparing to a6xx behavior. (My blob doesn't do tess, so if there are equivalent tess fields for some of these, I didn't find them) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9870>	2021-03-29 21:24:16 +00:00

2471 Commits