KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Rob Clark	106c2a65db	freedreno/drm: don't pass thru 'DUMP' flag on older kernels "softpin" mode was introduced in the same kernel as the 'DUMP' flag. So if we are using the legacy non-softpin path, clear the dump flag. OTOH the 'DUMP' flag isn't quite so needed on older kernels, since we would get all cmdstream, even SDS stateobjs, dumped regardless, as they would have cmd table entries. Fixes: `b2c23b1e48` ("freedreno: Mark all ringbuffer BOs as to be dumped on crash.") Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5081>	2020-05-18 19:00:47 +00:00
Rob Clark	5e10506834	freedreno/fdperf: add dependency on generated headers To fix an issue reported here: https://bugs.chromium.org/p/chromium/issues/detail?id=1083815 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5088>	2020-05-18 17:53:45 +00:00
Ilia Mirkin	b5accb3ff9	freedreno/a3xx: parameterize ubo optimization A3xx apparently has higher alignment requirements than later gens for indirect const uploads. It also has fewer of them. Add compiler parameters for both settings, and set accordingly for a3xx and a4xx+. This fixes all the ubo test failures caused by this optimization. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5077>	2020-05-17 19:51:40 -04:00
Ilia Mirkin	9048adbd24	freedreno/ir3: avoid applying (sat) on bary.f This causes failures on a3xx resulting in the nonsensical dEQP failures on packUnorm2x16. The same test uses ldlv on a4xx+, so just disallow (sat) on bary.f on all generations. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5074>	2020-05-17 21:17:57 +00:00
Ilia Mirkin	ff4df32fae	freedreno/a3xx: there's no r8i/ui rb format, only rg8i/rg8ui This fixes a number of dEQP tests: dEQP-GLES3.functional.fbo.blit.conversion.r8* dEQP-GLES3.texture.specification.basic_teximage2d.r8* and others. The reason why this enum showed up in traces for R8 is that it was an "upgraded" texture to R8G8. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5073>	2020-05-17 14:39:42 -04:00
Eduardo Lima Mitev	e7458f19e1	freedreno/uuid: Generate meaningful device and driver UUID Device UUID becomes SHA1('freedreno' + gpu_id). Driver UUID becomes SHA1(mesa-version + git-head-sha1). v2: Don't use build_id for driver UUID since it generates different values for vulkan and gl shared objects. (Kristian) Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4847>	2020-05-14 19:05:02 +00:00
Eduardo Lima Mitev	9623debf48	freedreno: Centralize UUID generation into new files freedreno_uuid.c/h The new files are created under a 'common' folder under 'src/freedreno', where shared functionality between GL and Vulkan drivers (that is not registers, layout or compiler) will be placed. Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4847>	2020-05-14 19:05:02 +00:00
Connor Abbott	f293d02dc4	tu: Advertise COLOR_ATTACHMENT_BLEND_BIT for blendable formats Whoops. After fixing dual-source blending, dEQP-VK.pipeline.blend.* all go from skipped to pass, and fixes a bunch of dEQP-VK.api.info.format_properties.* tests where blending is required. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>	2020-05-14 18:15:31 +00:00
Connor Abbott	adbdab3ee8	tu: Implement dual-src blending Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>	2020-05-14 18:15:31 +00:00
Connor Abbott	078aa9df8d	tu: Move RENDER_COMPONENTS setting to pipeline state This needs to be pipeline state because it can change when dual-source blending is active. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>	2020-05-14 18:15:31 +00:00
Connor Abbott	2a9d12d513	ir3: Fixup dual-source blending slot The hardware expects that where MRT0 and MRT1 would normally go are the dual sources for MRT0, whereas GLSL has an extra "index" parameter that indicates which source it is. Remap it when handling FS outputs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>	2020-05-14 18:15:31 +00:00
Connor Abbott	0e0580550e	freedreno/a6xx: Document dual-src blending enable bits Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>	2020-05-14 18:15:31 +00:00
Eric Anholt	b1151cd2ff	freedreno: Avoid duplicate BO relocs in FD_RINGBUFFER_OBJECTs. For the piglit drawoverhead case, 5/18 of the objects' relocs were duplicated. We can dedupe them at object create time (since objects are long-lived) and avoid repeated relocation work at emit time. nohw drawoverhead program statechange throughput 2.34082% +/- 0.645832% (n=10). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5020>	2020-05-14 14:12:15 +00:00
Eric Anholt	a6fe0799fa	freedreno: Fix resource layout dump loop. Apparently I've never dumped a fully populated slices array, so the 0-init always terminated the loop. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5020>	2020-05-14 14:12:15 +00:00
Rob Clark	cf21b76383	freedreno/ir3: use lower_wrmasks pass Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2020-05-13 20:24:53 -07:00
Rob Clark	a506d49fae	nir: add helper to copy const_index[] It seems less brittle to not assume they are in the same order for src and dst instructions. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2020-05-13 20:24:45 -07:00
Rob Clark	ea6b404294	freedreno/ir3: use const_index accessors Cleans up a couple spots that were still open-coding this. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2020-05-13 20:24:38 -07:00
Kristian H. Kristensen	14969aab11	freedreno/ir3: Drop wrmask for ir3 local and global store intrinsics These intrinsics are supposed to map to the underlying hardware instructions, which don't have wrmask. We use them when we lower store_output in the geometry pipeline and since store_output gets lowered to temps, we always see full wrmasks there.	2020-05-13 20:24:33 -07:00
Eric Anholt	112c65825f	freedreno/a6xx: Use LDC for UBO loads. It saves addressing math, but may cause multiple loads to be done and bcseled due to NIR not giving us good address alignment information currently. I don't have any workloads I know of using non-const-uploaded UBOs, so I don't have perf numbers for it. This makes us match the GLES blob's behavior, and turnip (other than being bindful). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>	2020-05-14 00:10:43 +00:00
Eric Anholt	ab93a631b4	freedreno: Trim num_ubos to just the ones we haven't lowered to constbuf. With the upcoming LDC usage in the GL driver, we don't want to be uploading descriptors for every UBO when they aren't actually in use. Trimming NIR's num_ubos will avoid that, and cleans up num_ubo handling elsewhere right now. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>	2020-05-14 00:10:43 +00:00
Eric Anholt	d5176c453e	freedreno/ir3: Move i/o offset lowering after analyze_ubo_ranges. I found that when moving more UBOs to load_ubo_ir3, analyze_ubo_ranges would move things back in a broken way. We can just run this pass later and drop the _ir3 path. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>	2020-05-14 00:10:43 +00:00
Eric Anholt	5387c27140	freedreno/ir3: Leave the cursor alone during ir3_nir_try_propagate_bit_shift. Otherwise, we might end up inserting the nir_intrinsic_load_ubo_ir3() after the non-offset src's definition, leading to nir_validate() failures. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>	2020-05-14 00:10:43 +00:00
Eric Anholt	e0a4d1c4e5	freedreno/ir3: Clean up a silly nir_src_for_ssa(src.ssa). Just copy the src through. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>	2020-05-14 00:10:43 +00:00
Eric Anholt	6670475a44	freedreno/a6xx: Fix UBWC mipmapping height alignment. After fixing the power of two sizing, pitches worked, but 1-pixel high and unaligned height miplevels were off. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>	2020-05-13 19:18:16 +00:00
Eric Anholt	81f21ff4ef	freedreno/a6xx: Fix UBWC mipmap sizing. The HW requires a log2 width/height of the level 0 meta_* size in the descriptors, making it pretty clear that UBWC mipmapping is all power-of-two sized. Fixes a bunch of failures in the upcoming UBWC layout unit tests. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>	2020-05-13 19:18:16 +00:00
Eric Anholt	b5db2a2574	freedreno/a6xx: Fix UBWC blockheight for RG8. Using texturator on a P3A at 1024x1024, RG8 has log2w/h of 6x7 instead of R16I/UI's 6x8. The other blockw/h I verified other than cpp=1 (R8/R8I/R8UI didn't use UBWC) and 32 (would need a bigger type). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>	2020-05-13 19:18:16 +00:00
Eric Anholt	9da4ce9953	freedreno: Pull the tile_alignment lookup for a layout to a helper. The r8g8 case UBWC alignment will be changing in the next commit, so fdl6_get_ubwc_blockwidth needs to start paying attention to r8g8 too. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>	2020-05-13 19:18:16 +00:00
Eric Anholt	dc7ccdb3f5	freedreno/a6xx: Add a testcase for UBWC buffer sharing. These offsets are hand-computed referencing msm_media_info.h, and match our driver's current behavior. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>	2020-05-13 19:18:16 +00:00
Eric Anholt	e32783c644	freedreno/a6xx: Improve layout testcase logging for UBWC fails. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>	2020-05-13 19:18:16 +00:00
Eric Anholt	2e4ddb6353	freedreno/a4xx+: Increase max texture size to 16384. Noticed when poking around with texture layouts and found that my big texture layout from the blob buffer overflowed. Values come from http://vulkan.gpuinfo.org for Adreno 418, 512, 630. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>	2020-05-13 19:18:16 +00:00
Connor Abbott	b408734e5e	tu: Implement fallback linear staging blit for CopyImage Also, rewrite the format decision code so that we correctly decide when the linear fallback is needed, even if UBWC is disabled. As part of that, I also moved around some of the code to handle compressed formats to make sure that copying compressed formats with a linear staging blit works (this is now possible since we started allowing tiled compressed textures). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>	2020-05-13 13:39:04 +00:00
Connor Abbott	40e842c009	tu: Add noubwc debug flag to disable UBWC Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>	2020-05-13 13:39:04 +00:00
Connor Abbott	ed79f805fa	tu: Add a "scratch bo" allocation mechanism This is simpler than a full-blown memory reuse mechanism, but is good enough to make sure that repeatedly doing a copy that requires the linear staging buffer workaround won't use excessive memory or be slowed down due to repeated allocations. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>	2020-05-13 13:39:04 +00:00
Samuel Pitoiset	91c757b796	turnip: use the common code for generating extensions and dispatch tables Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>	2020-05-13 08:45:29 +02:00
Rob Clark	d6706fdc46	freedreno/ir3/sched: try to avoid syncs Similar to what we do in postsched. It is useful for pre-RA sched to be a bit aware of things that would cause syncs. In particular for the tex fetches, since the vecN src/dst tends to limit postsched's ability to re-order them. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	d95a6e3a0c	freedreno/ir3/sched: avoid scheduling outputs If an instruction's only use is as an output, and it increases register pressure, then try to avoid scheduling it until there are no other options. A semi-common pattern is `fragcolN.a = 1.0`, this pushes all these immed loads to the end of the shader. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	488cf208d5	freedreno/ir3/postsched: try to avoid (sy) syncs Similar to avoidance of `(ss)` syncs, it turns out to be helpful to avoid `(sy)` syncs as well. This helps us turn a tex, (sy)alu, tex, (sy)alu sequence into tex, tex, (sy)alu, alu, which is a big win in gfxbench gl_fill2. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	25f4fb346e	freedreno/ir3/postsched: reset sfu_delay on sync Once we schedule an instruction that will require an `(ss)` sync flag, there is no need to delay any further instructions that consume an SFU result (until the next SFU instruction is scheduled). Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	f351e1d137	freedreno/ir3: limit # of tex prefetch by shader size It seems for short frag shaders, too much prefetch can be detrimental. I think what we really want to do is decide after pre-RA sched, when we also know about nop's and what the actual ir3 instruction count is. But that will require re-working how prefetch lowering works. For now this is a super crude heuristic to attempt to approximate a good solution. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	d69f6fd852	freedreno/ir3: fix indirect cb0 load_ubo lowering We can no longer assume that `state->ranges[0]` is block 0. It often is, but when we encounter a "real" ubo that we lower to `load_uniform` before a block 0 `load_ubo`, it could end up another entry in the table. Resulting in the second pass after gathering ubo ranges, not finding a valid range. Which results in a `load_ubo` for a thing that is not actually a ubo making its way into ir3 frontend. Resulting in grabbing what we think is a ubo address out of some unrelated const register, and trying to dereference that. Which as you can imagine, fails in amusing ways. Fixes: `fc850080ee` ("ir3: Rewrite UBO push analysis to support bindless") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>	2020-05-12 23:51:46 +00:00
Rob Clark	c4dc877cb5	freedreno/ir3: don't allow negative const_offset Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>	2020-05-12 23:51:46 +00:00
Brian Ho	a43e974064	turnip: Execute ir3_nir_lower_gs pass again This commit fixes a GS regression introduced in !4562 where ir3's GS lowering pass was moved from common code (ir3_nir) to freedreno-specific code (ir3_shader). For GS support in turnip, we need to add the GS lowering pass back in, this time in tu_shader. As for the nir_gather_info change, the GS lowering pass has always introduced a discard_if intrinsic into the GS. Previously, we simply ran nir_shader_gather_info before GS lowering, but now since we lower the GS before we need to remove the assertion that only a FS can use the discard_if intrinsic. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4892>	2020-05-12 13:42:55 -07:00
Jonathan Marek	d76e722ed6	turnip: enable tiling for compressed formats Now that layout code supports this, we can enable it. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>	2020-05-12 17:25:38 +00:00
Jonathan Marek	f543d87f23	turnip: update "fetchsize" value to match fdl6_layout changes It seems this is actually a "minimum pitch" value. For example TFETCH6_2_BYTE means a minimum pitch of 128 bytes for mipmap levels. This fixes breakage with compressed formats. For example this test: dEQP-VK.pipeline.sampler.view_type.2d.format.eac_r11_snorm_block.mipmap.linear.lod.equal_min_3_max_3 Fixes: `a34b3fa198` ("freedreno/fdl: Align after dividing by block size") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>	2020-05-12 17:25:38 +00:00
Eric Anholt	f789c5975c	freedreno: Fix non-constbuf-upload UBO block indices and count. The nir_analyze_ubo_ranges pass removes all UBO block 0 loads to reverse what nir_lower_uniforms_to_ubo() had done, and we only upload UBO pointers to the HW for UBO block 1-N, so let's just fix up the shader state. Fixes an off by one in const state layout setup, and some really dodgy register addressing trying to deal with dynamic UBO indices when the UBO pointers happen to be at the start of the constbuf. There's no fixes tag, though this fixes a bug from September, because it would require the num_ubos fix in nir_lower_uniforms_to_ubo. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4992>	2020-05-12 17:01:55 +00:00
Eric Anholt	51d7a71bd4	freedreno: Replace OUT_RELOCW with OUT_RELOC. Final cleanup commit now that they're the same. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Eric Anholt	064f395a89	freedreno: Tell the kernel that all BOs are for writing. Using non-write flags is pretty dubious -- it means the kernel tracking an array of read-only consumers of the BO and having exclusive consumers wait on each reader's fence. It allows multiple readers through dma-bufs to do work in parallel, but at the cost of kernel CPU time and memory management of the shared array. Other drivers have dropped this distinction since dma-buf sharing is usually producer-consumer, not producer-two-consumers, and the userspace and kernel space tracking is expensive. For us, this lets us drop the flags passed in for relocs and tracked in the ringbuffer reloc lists. The end result of the flags reduction work is drawoverhead uniforms test throughput 2.37195% +/- 0.365579% (n=15) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Eric Anholt	b2c23b1e48	freedreno: Mark all ringbuffer BOs as to be dumped on crash. We can avoid passing these flags around in the DRM backends by just marking ring BOs up front. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Eric Anholt	554b959df0	freedreno: Replace OUT_RELOCD with permanently flagging shader BOs for it. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Eric Anholt	9d8d936dfc	freedreno: Start moving relocs flags into the BOs. It's silly to have all the reloc emitters passing around FD_RELOC_READ when you have to have it set on all relocs (that don't include WRITE, which implies read) for the kernel to actually track the fences on the BO. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Mauro Rossi	a92a483ff7	freedreno: android: add adreno-pm4-pack.xml.h generation to android build Fixes the following building errors: In file included from external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:40: external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_pack.h:42:10: fatal error: 'adreno-pm4-pack.xml.h' file not found ^~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. In file included from external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_blend.c:36: external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_pack.h:42:10: fatal error: 'adreno-pm4-pack.xml.h' file not found ^~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. In file included from external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_const.c:26: external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_pack.h:42:10: fatal error: 'adreno-pm4-pack.xml.h' file not found ^~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `ee293160` "freedreno/a6xx: add OUT_PKT()" Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4973>	2020-05-09 16:19:14 +00:00
Mauro Rossi	5dc3b22dd0	freedreno/drm: android: add libfreedreno_registers static dependency The dependency is required to get the necessary generated headers Fixes the following building error: In file included from external/mesa/src/freedreno/drm/msm_bo.c:27: In file included from external/mesa/src/freedreno/drm/msm_priv.h:30: In file included from external/mesa/src/freedreno/drm/freedreno_priv.h:51: external/mesa/src/freedreno/drm/freedreno_ringbuffer.h:35:10: fatal error: 'adreno_common.xml.h' file not found #include "adreno_common.xml.h" ^~~~~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `6c688ae8` ("freedreno: Deduplicate ringbuffer macros with computerator/fdperf") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4973>	2020-05-09 16:19:14 +00:00
Eric Anholt	c9e8df61dc	freedreno: Initialize the bo's iova at creation time. Avoids repeated conditionals at reloc time checking if we need to go ask the kernel. No statistically significant difference on the drawoverhead case I'm looking at (n=300). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4957>	2020-05-08 12:35:39 -07:00
Eric Anholt	b3c4e6a597	freedreno: Rename append_bo() in case it doesn't get inlined. In a debugoptimized build, it wasn't inlined and so I wasn't noticing where a bunch of CPU usage was going in the DRM functions. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4957>	2020-05-08 12:35:39 -07:00
Eric Anholt	e1c74f3fac	freedreno: Clean up tests around ORing in the reloc flags. gcc was surprisingly not seeing through this to just do an AND and an OR. Improves drawoverhead's few uniforms / 1 change throughput 1.64141% +/- 0.188152% (n=60). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4957>	2020-05-08 12:35:39 -07:00
Eric Anholt	6c688ae81f	freedreno: Deduplicate ringbuffer macros with computerator/fdperf They're sugar around freedreno_ringbuffer.h, so put them there and reuse them. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4957>	2020-05-08 12:35:38 -07:00
Hyunjun Ko	094c7646a3	freedreno,tu: Don't request fragcoord components not being read. v1. Replace the existing bool type with a new bitfield and edit register files to take a mask instead of duplicating code to do masking. v2. Use fragcoord_compmask != 0 instead of fragcoord_compmask > 0 since it represents a bitfield. Tested with dEQP-VK.glsl.builtin_var.simple.fragcoord_xyz/w dEQP-GLES2.functional.shaders.builtin_variable.fragcoord_xyz/w Closes: #2680 Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4723>	2020-05-08 17:45:03 +00:00
Connor Abbott	6d513eb0db	tu: Support pipelines without a fragment shader Apparently this is allowed, and the CTS started doing this more often recently which resulted in frequent hangs running the entire CTS. I copied the code to create an empty FS from radv. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4928>	2020-05-07 16:05:53 +00:00
Kristian H. Kristensen	a34b3fa198	freedreno/fdl: Align after dividing by block size For compressed formats, we need to align the number of blocks, not the logical number of pixels in the texture. Only compressed formats have block width/height > 1, so we can just unconditionally multiply the alignment by the block width/height. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4868>	2020-05-06 17:11:34 -07:00
Eric Anholt	9a6bbf4c80	freedreno/ir3: Disable sin/cos range reduction for mediump. robclark noted that the blob wasn't doing range reduction in the mediump case, and I confirmed it on dEQP-GLES3.functional.shaders.operator.angle_and_trigonometry.sin.mediump_float_fragment vs dEQP-GLES3.functional.shaders.operator.angle_and_trigonometry.sin.highp_float_fragment. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4893>	2020-05-05 17:23:34 +00:00
Joshua Ashton	785803a2e5	turnip: Remove RANGE_SIZE usage These were removed from the latest Vulkan headers https://github.com/KhronosGroup/Vulkan-Docs/issues/1230 Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4878>	2020-05-05 00:28:00 +00:00
Eric Anholt	5c81f51c3c	freedreno/ir3: Define the bindful uniform/nonuniform desc modes for cat6 a6xx. These come from the disasm tests, and fix our disasm of blob's uniform/nonuniform cat6 operands. We also now include human-readable names for all the modes we know about (though bindless gets distinguished by its .baseN, like Connor's original disasm). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>	2020-05-04 11:15:50 -07:00
Eric Anholt	97b21110b8	freedreno/ir3: Sync some new changes from envytools. With this I also brought in a few new control flow instruction disasm tests that I'd made back when I wrote the disasm test, but which were too far from correct to include until now. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>	2020-05-04 11:14:46 -07:00
Eric Anholt	1e5b0c92c5	freedreno/ir3: Add some more tests of cat6 disasm. I put these together from traces I had while trying to do LDC for GL. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>	2020-05-04 11:14:46 -07:00
Eric Anholt	29f58cfbd0	freedreno/ir3: Set up outputs for multi-slot varyings. Necessary to avoid compiler assertion failures in: dEQP-GLES31.functional.program_interface_query.program_output.type.interface_blocks.out.named_block_explicit_location.struct.mat3x2 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	88dcfaf0ee	freedreno/ir3: Stop initializing regid of so->outputs during setup. It's unused and overwritten by ir3_compile_shader_nir(). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	8c1c218909	freedreno/ir3: Improve shader key normalization. We can remove a bunch of conditional code at key comparison time by computing a bitmask of used key bits at ir3_shader creation time. This also gives us a nice place to put additional key simplification to reduce how many variants we create (like skipping rastflat if we don't read colors in the FS, or skipping vclamp_color if we don't write colors). It does mean walking the whole key to AND it, but the key is just 28 bytes so far so that seems pretty fine. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	6f1e3235f2	freedreno: Emit debug messages when doing draw-time recompiles of shaders. Right now that's "always" unless you have shaderdb set. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	a361567c46	freedreno/ir3: Remove unused half precision shader key flag. The code using it was removed in `4af86bd0b9` ("freedreno/ir3: remove half-precision output") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	05be0659fe	freedreno: Fix assertion failures on GS/tess shaders with shader-db enabled. We weren't filling in the tess mode of the key, or setting has_gs on GS shaders, resulting in assertion failures when NIR intrinsics didn't get lowered. We have to make a guess at prim mode for TCS, but it should be better to have some shader-db coverage than none, and it will avoid these failures happening when we start precompiling shaders. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	f91e49ee29	freedreno/ir3: Skip tess epilogue if the program is missing stores. Some of the negative API tests make shaders for tess stages that don't do all the stores they need to. Once we start precompiling (or doing shader-db of tess), we need to at least not segfault when generating them. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	b420d04e1f	freedreno/ir3: Fix register allocation assertion failures. We were failing to tell the allocator about the restriction that scalar texture instructions (allocated as scalar regs) couldn't be allocated such that the start of the full unwritemasked vector started before r0. There was a patch in select_reg_callback on a6xx that tried to work around that, but you could still end up backed into a corner you shouldn't be because we didn't tell the RA what it needed. Fixes compiler assertion failures on a300-a400's blit_z shader, used for Z32F gmem blits. Looks like as a result we get tighter register allocation but more nops: instructions in affected programs: 757945 -> 760356 (0.32%) nops in affected programs: 317983 -> 320468 (0.78%) non-nops in affected programs: 27525 -> 27451 (-0.27%) mov in affected programs: 3098 -> 3023 (-2.42%) dwords in affected programs: 109664 -> 110656 (0.90%) last-baryf in affected programs: 112701 -> 112847 (0.13%) full in affected programs: 4326 -> 4011 (-7.28%) sstall in affected programs: 120550 -> 120836 (0.24%) (ss) in affected programs: 13939 -> 13918 (-0.15%) (sy) in affected programs: 3006 -> 2786 (-7.32%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Kristian H. Kristensen	73f34e0d46	freedreno/ir3: Drop hack to clean up split vars When the GS lowering was working on store_output intrinsics, we had to clean up the split vars to avoid getting confused. Now that we shadow the output vars instead, there's no confusion and we can drop this hack. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:31 +00:00
Kristian H. Kristensen	dd8d257a30	freedreno/ir3: Lower GS builtins before lowering IO We mostly got away with replacing a store_output with a store_var, but for complex types like structs, that doesn't work. Once the IO has been lowered from vars to intrinsic, we've lost the deref chains and can't properly shadow the outputs. This commit moves the GS lowering up so we do it before the output variables get lowered to store_output. This way the pass works much like nir_lower_io_to_temporaries() and cleanly shadows the outputs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:31 +00:00
Kristian H. Kristensen	79355fd901	freedreno/ir3: Add ir3_nir_lower_to_explicit_input() pass This pass lowers per-vertex input intrinsics to load_shared_ir3. This was open coded in the TCS and GS lowering passes before - this way we can share it. Furthermore, we'll need to run the rest of the GS lowering earlier (before lowering IO) so we need to split off this part that operates on the IO intrinsics first. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:31 +00:00
Kristian H. Kristensen	b7bfccf085	freedreno/ir3: Rename ir3_nir_lower_to_explicit_io We rename it to ir3_nir_lower_to_explicit_output, since it only handles output and we'll add a lowering pass for input next. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:31 +00:00
Kristian H. Kristensen	a16ee14f37	freedreno/ir3: Pass stream output info to ir3_shader_from_nir We need shader->stream_output filled out when we layout the push constants in ir3_setup_const_state(). Otherwise const_state->offsets.tfbo ends up as ~0, which doesn't work. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:31 +00:00
Eric Anholt	07f89126cd	freedreno/ir3: Fix the a3xx TF outputs stores. We were trying to deref the vector-collected outputs[] array before it's been set up, but we want the per-component outputs anyway. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:31 +00:00
Eric Anholt	b0b8011e3e	freedreno/ir3: Set up the block predecessors for a3xx TF Fixes a segfault in ir3_legalize. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:31 +00:00
Eric Anholt	0e51082cfa	freedreno/ir3: Leave bools as 1-bit, storing them in full regs. If we use NIR's 1-bit bool representation, we get exactly the bool behavior the hardware provides: CMPS produces true or false, AND/OR/XOR work as intended without extra absnegs, and we can pass those half values directly to other CMPS. We emit an absneg for b2b1 ("turn a memory load into a 1-bit NIR boolean"), but we would have done so for the ir3_n2b() on the use of that value anyway. The most awkward bit is that inot(a@1) is now a sub(1, a), but we can encode the 1 as an immediate so it's fine. No significant changes to GL_TIME_ELAPSED on my set of traces (n=21). instructions in affected programs: 1570638 -> 1548702 (-1.40%) nops in affected programs: 624053 -> 611381 (-2.03%) non-nops in affected programs: 959061 -> 949797 (-0.97%) mov in affected programs: 5258 -> 5252 (-0.11%) cov in affected programs: 15099 -> 15902 (5.32%) dwords in affected programs: 469600 -> 452768 (-3.58%) last-baryf in affected programs: 162211 -> 154726 (-4.61%) full in affected programs: 4881 -> 4797 (-1.72%) sstall in affected programs: 173953 -> 174545 (0.34%) (ss) in affected programs: 10922 -> 10934 (0.11%) (sy) in affected programs: 728 -> 745 (2.34%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4518>	2020-04-30 23:36:09 +00:00
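The 1-bit bool commit above notes that inot(a@1) becomes sub(1, a). A minimal sketch of that identity, assuming booleans are modeled as the integers 0 and 1; the helper names `inot` and `cmps_lt` are hypothetical and are not the ir3 API:

```python
# Illustrative model only: a 1-bit boolean is represented as the integer 0 or 1.
# The function names are hypothetical stand-ins, not ir3 or NIR functions.

def inot(a: int) -> int:
    """Logical NOT of a 0/1 boolean, expressed as sub(1, a)."""
    return 1 - a


def cmps_lt(x: int, y: int) -> int:
    """Stand-in for a compare that yields 1 for true and 0 for false."""
    return 1 if x < y else 0


# With 0/1 values, plain bitwise AND/OR/XOR already act as the logical ops,
# and a compare result can feed another operation directly.
flag = cmps_lt(1, 2) & inot(cmps_lt(5, 3))
```

Because both compare results are already 0 or 1, no extra normalization step is needed before combining them in this model.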
Eric Anholt	769adc9546	freedreno/ir3: Drop redundant IR3_REG_HALF setup in ALU ops. It's set by ir3_put_dst() immediately after. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4518>	2020-04-30 23:36:09 +00:00
Rob Clark	d56b8c4554	freedreno: sync registers with envytools Pull in the `SP_xS_BRANCH_COND` regs to keep the mesa and envytools copies from getting out of sync. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>	2020-04-30 20:03:17 +00:00
Rob Clark	ee293160d7	freedreno/a6xx: add OUT_PKT() Similar to OUT_REG(), this has the benefits of: 1. No more messing up pkt size 2. Detects errors of mixing up the order of dwords in the packet 3. Optimizes to more efficient code Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>	2020-04-30 20:03:17 +00:00
Rob Clark	e3fc8dd001	freedreno/drm: inline the things The existing structure dates back to when this code was part of libdrm, and we wanted some of this not to be exposed as ABI between libdrm and mesa. Now that this is no longer a constraint, inline things. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>	2020-04-30 20:03:17 +00:00
Rob Clark	75435d5e2a	freedreno/drm: drop atomic refcnts Since we dropped the async flush_queue, we no longer need the refcnts to be atomic. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>	2020-04-30 20:03:17 +00:00
Eric Anholt	4715502975	freedreno/ir3: Initialize the unused dwords of the immediates consts. Avoids having spurious differences (and weird values to look at!) in traces from uninitialized memory. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4824>	2020-04-30 19:23:39 +00:00
Jonathan Marek	3e1b93ec4f	turnip: fix wrong substream size in parse_multisample_and_color_blend Missed updating this when adding tu6_emit_sample_locations Fixes: `a92d2e1109` ("turnip: implement VK_EXT_sample_locations") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4795>	2020-04-29 20:09:54 +00:00
Rob Clark	a9c255d70c	freedreno/a6xx+tu: rename VSC_DATA/VSC_DATA2 These are the draw-stream and primitive-stream, so let's give them more descriptive names. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4750>	2020-04-28 23:31:58 +00:00
Rob Clark	656051d735	freedreno/ir3/ra: only assign array base in first pass In particular, we specifically don't want to let the base change between passes, as it could end up conflicting with registers assigned in the first pass. Mostly-closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2838 Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>	2020-04-28 20:06:49 +00:00
Rob Clark	3d8ec96762	freedreno/ir3/ra: split out helper for array assignment Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>	2020-04-28 20:06:49 +00:00
Rob Clark	6313b8d881	freedreno/ir3/ra: use ir3_debug_print helper Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>	2020-04-28 20:06:49 +00:00
Rob Clark	8b3ac7084a	freedreno/ir3/ra: remove unused variable Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>	2020-04-28 20:06:49 +00:00
Rob Clark	997828e31b	freedreno/computer: add script to test widening/narrowing Just something I hacked together to help figure out which instructions can fold in a widening/narrowing conversion. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>	2020-04-28 20:06:49 +00:00
Eric Anholt	4a42a50585	freedreno/ir3: Add support for disasm of cat2 float32 immediates. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>	2020-04-27 19:35:00 +00:00
Eric Anholt	292231596b	freedreno/ir3: Refactor out print_reg_src(). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>	2020-04-27 19:35:00 +00:00
Eric Anholt	3bcf819b43	freedreno/ir3: Convert remaining disasm src prints to reginfo. More lines of code, but they're much more intelligible. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>	2020-04-27 19:35:00 +00:00
Eric Anholt	1462b00391	freedreno/ir3: Add a unit test for our disassembler. Makes sure that we can maintain consistent output from our disassembly as we refactor. I've only included stuff that matches qcom's disasm so far. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>	2020-04-27 19:35:00 +00:00
Eric Anholt	90984ba853	freedreno/ir3: Print a space after nop counts, like qcom's disasm. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>	2020-04-27 19:35:00 +00:00
Eric Anholt	916629f9d7	freedreno/ir3: Fix the disasm of half-float STG dests. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>	2020-04-27 19:35:00 +00:00
Jonathan Marek	065068c66a	freedreno/ir3: run nir_lower_pack This lowers pack_32_2x16/unpack_32_2x16 into the scalar versions of those instructions. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4738>	2020-04-27 18:40:03 +00:00
Jonathan Marek	42093bb694	nir: add pack_32_2x16_split/unpack_32_2x16_split lowering The new option replaces the two other _split lowering options, since there's no need for separate options. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4738>	2020-04-27 18:40:03 +00:00
Alyssa Rosenzweig	6943eda5c9	ir3: Use shared mediump output lowering Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4716>	2020-04-27 16:32:24 +00:00
Connor Abbott	bf3c9d2770	tu: Don't invert point coords We shouldn't need to invert them, and the Vulkan blob doesn't either. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4733>	2020-04-25 16:15:48 +00:00
Connor Abbott	180f98678f	ir3: Remove VARYING_SLOT_PNTC remapping hack The st now does this for us. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4732>	2020-04-25 15:52:05 +00:00
Connor Abbott	a661d18a39	tu: Implement PrimID passthrough Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4704>	2020-04-25 01:06:21 +00:00
Connor Abbott	1f9839907a	ir3: Skip missing VS outputs in VS out map when linking The hardware is capable of automatically filling in certain values in the VPC without writing them from the last geometry stage, like gl_PointCoord or gl_PrimitiveID when there is no GS. However, we do have to enable these outputs (i.e. set the VPC_VAR_DISABLE bit to 0) as VPC_VAR_DISABLE is really about FS inputs rather than VS outputs. To do this, we move the computation of the enable bits to ir3_link_add(), which is also a nice refactor anyway. In addition we detect the PrimID case specifically so that the driver can program the location. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4704>	2020-04-25 01:06:21 +00:00
Connor Abbott	cc530858c1	freedreno/a6xx: Document PrimID passthrough registers Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4704>	2020-04-25 01:06:21 +00:00
Kristian H. Kristensen	bf542484ea	freedreno/ir3: Print @tex write mask using 0x%x That way we can parse it again with the assembler. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4741>	2020-04-25 00:03:43 +00:00
Kristian H. Kristensen	c801228f0d	freedreno/ir3: Reset lex line number when we start parsing Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4741>	2020-04-25 00:03:43 +00:00
Kristian H. Kristensen	34e7179dfa	freedreno/ir3: Parse, but ignore @in, @out and @tex headers Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4741>	2020-04-25 00:03:43 +00:00
Kristian H. Kristensen	da467817e3	freedreno/ir3: Move ir3 assembler to backend compiler For easier reuse. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4741>	2020-04-25 00:03:43 +00:00
Kristian H. Kristensen	869d86e664	freedreno/computerator: Decouple ir3 assembler Specifically, don't include ir3_asm.h in the parser as that's computerator specific. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4741>	2020-04-25 00:03:43 +00:00
Jonathan Marek	c3ef0275c4	turnip: add adreno 650 Tile alignment is 96, with gmem alignment of 0x6000 Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4608>	2020-04-24 17:42:01 +00:00
Jonathan Marek	aa3624b8ab	turnip: use RESOLVE_TS event This is required on a650 to flush the GMEM store. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4608>	2020-04-24 17:42:01 +00:00
Jonathan Marek	f81e56c9a0	turnip: remove unused RB_UNKNOWN_8E04_blit New blit code doesn't change this value, and different values seem to be related to the driver version and not the GPU version. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4608>	2020-04-24 17:42:01 +00:00
Jonathan Marek	73f7f73ef3	freedreno/ir3: fix incorrect conversion folding Fixes dEQP-VK.glsl.builtin.function.pack_unpack.unpackhalf2x16_compute Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4708>	2020-04-24 13:11:58 +00:00
Jonathan Marek	dd49a40410	freedreno/ir3: set even bit for f2f16_rtne Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4708>	2020-04-24 13:11:58 +00:00
Jonathan Marek	edc35c1f54	freedreno/ir3: fix 16-bit ssbo access Update cat6 instruction type, and shift 1 in lower_offset_for_ssbo. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4708>	2020-04-24 13:11:58 +00:00
Jonathan Marek	e43fc003e0	turnip: divide cube map depth by 6 This matches the GL driver and fixes these tests: dEQP-VK.glsl.texture_functions.query.texturesize.samplercubearray* Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4709>	2020-04-24 10:24:55 +00:00
Jason Ekstrand	f4addfdde3	spirv: Use nir_const_value for spec constants When we originally wrote spirv_to_nir we didn't have a good scalar value union to handily use so we rolled our own thing for spec constants. Now that we have nir_const_value, we can use that and simplify a bunch of the spec constant logic. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4675>	2020-04-24 09:23:59 +00:00
Jason Ekstrand	6211e79ba5	turnip: Properly handle all sizes of specialization constants cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4675>	2020-04-24 09:23:59 +00:00
Eric Anholt	5593d80a2c	freedreno/ir3: Fix sizing of the inputs/outputs array. If you have a struct, the var's base driver location is not the last driver location that will be accessed in that var. We have a shader struct member with this number for us, already. Fixes overflows in: dEQP-GLES31.functional.program_interface_query.program_output.type.interface_blocks.out.named_block_explicit_location.struct.mat3x2 Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4670>	2020-04-23 18:52:46 +00:00
Eric Anholt	ac937bf878	freedreno/ir3: Fix driver_location of the added vertex_flags varying. It was ignoring the sizes of the output variables and assuming single-slot, and failing to update num_outputs. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4670>	2020-04-23 18:52:46 +00:00
Eric Anholt	e9add0c501	drm-shim: Let the driver choose to overwrite the first render node. When I was writing drm-shim, I was focused on the v3d kmsro case -- use my intel device as the kmsro display device and add on a simulator-based v3d device that we could render with. But for the noop backends we use for shader-db, it's a lot more useful to just overwrite the first render node in the system so that you don't have to pass a -d <how many render nodes I already have in my system> argument. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4664>	2020-04-23 17:54:54 +00:00
Eric Anholt	5a8718f01b	freedreno: Make the slice pitch be bytes, not pixels. Back in a2xx, HW pitches were in pixels, so storing that was reasonable. Ever since then, the HW wants pitches in bytes, and we have only one instance of using pitch in pixels in the code (a3xx sysmem path). Flip things around so that only a2xx has to worry about the cpp for looking at pitches. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4558>	2020-04-23 16:37:50 +00:00
Eric Anholt	bd76a24fd1	freedreno: Introduce a "cpp_shift" value for cpp divs/muls. This only converts part of the driver to use it, leaving the rest to the following commit (which inspired this one). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4558>	2020-04-23 16:37:50 +00:00
Hyunjun Ko	227df2a2ba	turnip: Fix crashes when geometry shader constants aren't used Fixes dEQP-VK.transform_feedback.fuzz.2_level_array.float.geometry, for example. Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4693>	2020-04-23 05:19:04 +00:00
Hyunjun Ko	0edff5123c	turnip: Skip unused regs when setting up streamout buffers Fixes: `374406a7c4` Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Reviewed-by: Brian Ho <brian@brkho.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4604>	2020-04-23 01:14:19 +00:00
Hyunjun Ko	e892733b80	turnip: Fix wrong offset calculation for xfb buffer. In Vulkan, offsets are already provided through the API call vkCmdBindTransformFeedbackBuffersEXT, so this is a duplicated calculation. Fixes: `9ff1959ca5` Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Reviewed-by: Brian Ho <brian@brkho.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4604>	2020-04-23 01:14:19 +00:00
Hyunjun Ko	e34b0d65f9	turnip: Implement and enable VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT Tested by dEQP-VK.transform_feedback.simple.query* Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Reviewed-by: Brian Ho <brian@brkho.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4604>	2020-04-23 01:14:19 +00:00
Hyunjun Ko	aff02dd76b	turnip: make the struct slot_value of queries get 2 values A transform feedback query writes two integer values: one for primitives written and another for primitives generated. To handle this, the second member of struct slot_value should be exposed as a real value rather than as padding. In addition, we also need to modify get/copy_result to access both values. This patch is the prep work for the transform feedback query support. Tested with dEQP-VK.pipeline.timestamp.* dEQP-VK.query_pool.occlusion_query.* Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Reviewed-by: Brian Ho <brian@brkho.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4604>	2020-04-23 01:14:19 +00:00
Jonathan Marek	c552b5fd1d	turnip: implement VK_EXT_sampler_filter_minmax Passes dEQP-VK.pipeline.sampler.view_type.* Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4662>	2020-04-22 20:12:14 +00:00
Jonathan Marek	a77e2ac835	turnip: enable cube arrays Passes dEQP-VK.pipeline.sampler.view_type.cube_array.* Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4663>	2020-04-22 19:57:20 +00:00
Jonathan Marek	9daeb50454	turnip: implement VK_EXT_filter_cubic Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4672>	2020-04-22 19:03:58 +00:00
Jonathan Marek	a92d2e1109	turnip: implement VK_EXT_sample_locations Passes tests in: dEQP-VK.pipeline.multisample.sample_locations_ext.* Note that these tests fail because of gl_PrimitiveID not working correctly: dEQP-VK.pipeline.multisample.sample_locations_ext.verify_location.* Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4665>	2020-04-22 18:46:46 +00:00
Jonathan Marek	83b2f1d8cf	turnip: set shader key msaa field Fixes per-sample interpolation. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4665>	2020-04-22 18:46:46 +00:00
Jonathan Marek	a5cce95280	turnip: enable VK_FORMAT_S8_UINT as stencil format Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4588>	2020-04-22 17:45:33 +00:00
Jonathan Marek	44c6c145da	turnip: improve GMEM load/store logic Determine load/store at renderpass creation time. This also fixes behavior with S8_UINT. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4588>	2020-04-22 17:45:33 +00:00
Jonathan Marek	e72201c787	turnip: disable depth test for S8_UINT attachment Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4588>	2020-04-22 17:45:33 +00:00
Connor Abbott	4daa3917a3	ir3: Fix bug with shaders that only exit via discard discard is supposed to be a terminator, killing the thread, so that it's possible to exit main solely by a discard e.g. inside of an infinite loop. However, it currently isn't treated as a terminator in NIR due to workarounds turning it into demote (d3d-style kill) and even if that were fixed, we probably wouldn't want to treat discard_if as a jump since otherwise the scheduler wouldn't be able to schedule things around it. So, add this workaround which inserts jump instructions as necessary to guarantee that the program always terminates. This fixes a hang in dEQP-VK.graphicsfuzz.while-inside-switch, which conditionally does a discard inside an infinite loop. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4658>	2020-04-22 09:49:40 +00:00
Connor Abbott	8cfa60eab8	ir3: Don't double-insert the first block The first block was being added to the list twice, once here and once in emit_block(), leading to list corruption and infinite loops when trying to traverse the list of blocks backwards. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4658>	2020-04-22 09:49:40 +00:00
Eric Anholt	2f4a3c1ca0	freedreno/ir3: Drop handling FRAG_RESULT_DEPTH writing to .z Since we consume NIR, we get FRAG_RESULT_DEPTH in .x. Something must have been working out for this code to not be trying to get an undefined value, but go ahead and drop it now. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4668>	2020-04-21 23:30:53 +00:00
Jonathan Marek	eab73799d1	turnip: fix GMEM resolve in CmdNextSubpass The BLIT scissor must be set correctly for tu_store_gmem_attachment. Fixes this deqp test: dEQP-VK.pipeline.multisample_shader_builtin.sample_id.137_191_1.samples Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4666>	2020-04-21 23:04:34 +00:00
Eric Anholt	c1e7c1f422	freedreno/drm-shim: Add support for faking other adreno chips. I wanted to look at the effect of a core NIR change on a2xx codegen, but I don't have any of those boards. This could also prove useful for quickly sanity-checking the compiler by running shader-db on it -- a2xx fails in a few ways on glmark2, and a3xx-a5xx fails on glmark2 in a debug_assert (which we don't have enabled in our dEQP runs). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4652>	2020-04-21 15:47:39 +00:00
Connor Abbott	ae169f38ce	tu: Fix the advertised maxFragmentInputComponents This appears to be limited by VPC_CNTL_0::NUMNONPOSVAR, which is an 8-bit bitfield with no possibility for expansion. Also, in practice we'll be limited by the vertex shader output maximum, which includes gl_Position, of 128, so that users won't be able to use more than 124 components anyways. Lower it to match the GL blob. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4641>	2020-04-21 10:04:13 +00:00
Connor Abbott	45ec9c0f3d	freedreno/a6xx: Expand various varying-count bitfields The extra bit needs to be used when using the maximum of 128 varying components. I confirmed that PC_PRIMITIVE_CNTL_1 and SP_PRIMITIVE_CNTL are expanded using a trace of the Vulkan blob with the maximum number of varyings, and changed the others by analogy. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4641>	2020-04-21 10:04:13 +00:00
Connor Abbott	94cb129d51	ir3/ra: Fix off-by-one issues with live-range extension The intersects() function assumes that inside each instruction values always die before they are defined, so that if the end of one range is the same instruction as the beginning of the next then they don't intersect. However, this isn't the case for values that become live at the beginning of a basic block, which become live before the first instruction, or values that die at the end of a basic block, which die after the last instruction. For example, imagine that we have two values, A which is defined earlier in the block and B which is defined in the last instruction of the block and both die at the end of the basic block (e.g. are used in the next iteration of a loop). We would compute a range for A of, say, (10, 20) and for B of (20, 20) since each block's end_ip is the same as the ip of the last instruction, and RA would consider them to not interfere. There's a similar problem with values that become live at the beginning. The fix is to offset the block's start_ip and end_ip by one so that they don't correspond to any actual instruction. One way to think about this is that we're adding fake instructions at the beginning and end of a block where values become live & die. We could invert the order, so that values consumed by each instruction are considered dead at the end of the previous instruction, but then values that become dead at the beginning of the basic block would incorrectly have an empty live range, with a similar problem at the end of the basic block if we try to say that values are defined at the beginning of the next instruction. So the extra padding instructions are unavoidable. This fixes an accidental infinite loop in the shader for dEQP-VK.spirv_assembly.type.scalar.u32.switch_vert. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4614>	2020-04-18 17:31:56 +00:00
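The (10, 20) versus (20, 20) example in the live-range commit above can be reproduced with a small model. This is a sketch under assumptions: ranges are (start_ip, end_ip) tuples, `intersects` follows the "die before defined" rule the message describes, and `live_out_ranges` is a hypothetical helper rather than the ir3 code:

```python
# Sketch of the off-by-one described in the commit message, not the ir3 source.
# A live range is a (start_ip, end_ip) tuple.

def intersects(a, b):
    """Ranges that merely touch at one ip are treated as non-overlapping,
    because a value is assumed to die before another is defined there."""
    a_start, a_end = a
    b_start, b_end = b
    return not (a_start >= b_end or b_start >= a_end)


def live_out_ranges(block_end_ip):
    """Hypothetical helper: A is defined at ip 10, B at ip 20 (the last
    instruction), and both stay live until the end of the block."""
    return (10, block_end_ip), (20, block_end_ip)


# Before the fix: end_ip equals the ip of the last instruction (20),
# so B's range is empty and the interference is missed.
a_old, b_old = live_out_ranges(20)
missed = not intersects(a_old, b_old)

# After the fix: end_ip is padded by one (21), so both ranges extend past
# the last real instruction and the interference is detected.
a_new, b_new = live_out_ranges(21)
detected = intersects(a_new, b_new)
```

In this model the padded end_ip acts as the "fake instruction" the message mentions: it gives B a non-empty range without changing the rule used for ordinary instructions.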
Connor Abbott	64e3b8d66b	tu: Use tu_cs_add_entries() with non-render-pass secondaries Even though vkCmdRenderPassBegin() isn't allowed inside a secondary command buffer, vkCmdDispatch() is, and we emit an IB with compute dispatches, which means that if the secondary command buffer records a vkCmdDispatch() then we'll have an IB inside an IB, which is illegal. Fixes hangs in e.g. dEQP-VK.api.command_buffers.record_simul_use_secondary_one_primary. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4605>	2020-04-17 14:11:07 +00:00
Jonathan Marek	2437808671	turnip: image_view rework Instead of exposing various layout functions, move image-related logic into tu_image.c and have the image_view pre-fill relevant register values. This changes the clear/blit code to use image_view. This will make it much easier to deal with aspect masks, in particular for planar formats and D32_S8. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4581>	2020-04-16 14:04:18 +00:00
Jonathan Marek	300d0e2b80	turnip: don't limit framebuffer size to image size Minor cleanup, I couldn't find anything that suggests this should be done, and anv doesn't do it either. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4581>	2020-04-16 14:04:18 +00:00
Jonathan Marek	b6455e9a6a	turnip: compute render_components/srgb_cntl at renderpass creation time Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4581>	2020-04-16 14:04:18 +00:00
Connor Abbott	0c05d46237	tu: Align GMEM resolve blit scissor Even though we normally use the CP_BLIT path with resolves that aren't aligned, there's a special case when we're resolving the entire image and there's enough padding so that we can still use CP_EVENT_WRITE::BLIT when the render area isn't aligned. The hardware seems to not like unaligned scissors when not clearing, and sometimes hangs rather than silently round the scissor. This causes hangs in e.g. dEQP-VK.glsl.derivate.dfdx.texture.msaa4.float_highp. There was some concern that the CP_BLIT path might use this scissor also, but I confirmed that this isn't the case by setting it to 0 before resolving and then noting that CP_BLIT still works (but CP_EVENT_WRITE doesn't). Furthermore, this is actually impossible because of how the 2D engine is set up: it gets its own pair of register banks, which can be switched independently of the 3D register banks, so that 2D events (CP_BLIT) normally aren't synchronized relative to 3D events (CP_EVENT_WRITE, CP_DRAW_*, and CP_EXEC_CS) and therefore they can't share any registers except for non-pipelined registers like RB_CCU_CNTL that don't use the register bank mechanism at all. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4585>	2020-04-16 12:00:22 +00:00
Connor Abbott	3a9e66277a	ir3: Handle load_ubo_ir3 when promoting to constants This restores support for promoting UBO loads to constant loads when using LDC. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4568>	2020-04-15 22:38:20 +00:00
Connor Abbott	abcfb64370	ir3: Fix LDC offset units I had missed that LDC actually uses vec4 units for its offset. This means that we have to create a new instruction, and lower it in ir3_nir_lower_io_offsets, similar to the existing SSBO instructions. Unfortunately we can't assume that loads are always vec4-aligned, so we have to use the alignment information that NIR gives us. Unfortunately, it's currently woefully inadequate, and will have to be fixed to give us good codegen in the future. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4568>	2020-04-15 22:38:20 +00:00
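The LDC commit above says the offset is in vec4 units and that loads are not guaranteed to be vec4-aligned. A minimal sketch of the unit conversion only, assuming 32-bit components (so one vec4 is 16 bytes) and dword-aligned byte offsets; `split_ldc_offset` is a hypothetical name and this is not the lowering done in ir3_nir_lower_io_offsets:

```python
# Unit-conversion sketch only. Assumes 32-bit components, so one vec4 is
# 16 bytes, and that byte offsets are at least dword-aligned.

VEC4_BYTES = 16
DWORD_BYTES = 4


def split_ldc_offset(byte_offset: int):
    """Split a byte offset into (vec4 index, starting component 0..3)."""
    if byte_offset < 0 or byte_offset % DWORD_BYTES != 0:
        raise ValueError("expected a non-negative, dword-aligned offset")
    vec4_index, remainder = divmod(byte_offset, VEC4_BYTES)
    return vec4_index, remainder // DWORD_BYTES


aligned = split_ldc_offset(32)    # starts exactly on a vec4 boundary
unaligned = split_ldc_offset(40)  # starts partway into a vec4
```

A non-zero starting component is the unaligned case the message refers to, where the known alignment of the load determines how much extra work the lowering has to do.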
Brian Ho	13ce637f1b	freedreno/turnip: Update GRAS_LAYER_CNTL to GRAS_MAX_LAYER_INDEX After some experimentation, I believe that GRAS_LAYER_CNTL is actually just a count register storing the number of layers in the render target. While debugging cube_array geometry tests, I noticed that the blob was setting an unknown 0x8 to LAYER_CNTL, so I checked the value of LAYER_CNTL for various layer sizes: 1: LAYER_CNTL=0 2: LAYER_CNTL=1 3: LAYER_CNTL=2 4: LAYER_CNTL=3 9: LAYER_CNTL=8 256: LAYER_CNTL=255 2000: LAYER_CNTL=1999 Seems like this register just stores a count of the largest layer that can be written to via gl_Layer. This commit updates the reg docs, freedreno's gs implementation, and turnip's gs implementation. Fixes dEQP-VK.geometry.layered.cube_array.* Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4541>	2020-04-15 16:19:34 +00:00
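The register values listed in the GRAS_MAX_LAYER_INDEX commit above are consistent with "layer count minus one". A short sketch that checks this reading against the observations quoted in the message; `max_layer_index` is a hypothetical helper, not a turnip or freedreno function:

```python
# Observations copied from the commit message: layer count -> LAYER_CNTL value.
OBSERVED = {1: 0, 2: 1, 3: 2, 4: 3, 9: 8, 256: 255, 2000: 1999}


def max_layer_index(layer_count: int) -> int:
    """Hypothetical helper: index of the last layer writable via gl_Layer."""
    if layer_count < 1:
        raise ValueError("a render target has at least one layer")
    return layer_count - 1


# True when every observed register value matches layer_count - 1.
matches = all(max_layer_index(n) == value for n, value in OBSERVED.items())
```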
Brian Ho	c2399e9574	turnip: Emit geometry shader descriptor consts Without these consts, the geometry shader is unable to read from textures or uniforms. Fixes dEQP-VK.geometry.layered.*.readback Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4541>	2020-04-15 16:19:34 +00:00
Brian Ho	d6d5ee29ab	turnip: Correctly set layer stride for 3D images Previously we were using layout.layer_size for the layer stride, but in Vulkan, you can alias a 3D image as an array of 2D images via the VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT flag. One reason to use this behavior is so the geometry shader can write to a specific depth in a 3D framebuffer with gl_Layer. Since the 3D image is not a true layered image, layer_size is 0. Instead, we can copy what freedreno does and use the slice size. Fixes dEQP-VK.geometry.layered.3d.* Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4541>	2020-04-15 16:19:34 +00:00
Jonathan Marek	23be216071	freedreno/ir3: don't overwrite wrmask in ir3_SAM Fixes (with other patches to allow these tests to run): dEQP-VK.ycbcr.query.size_lod.vertex.* Suggested-by: Rob Clark <robclark@gmail.com> Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4557>	2020-04-14 19:12:47 +00:00
Jonathan Marek	aeb5b9cebf	freedreno/ir3: fix emit_tex_info split_dest Fixes a "free(): invalid next size (fast)" error in: dEQP-VK.glsl.texture_functions.query.texturequerylevels.* Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4557>	2020-04-14 19:12:47 +00:00
Connor Abbott	31988baba4	ir3: Fix txs with bindless I missed that this had a micro-optimization to assume that there was only ever one source, which is no longer valid for the bindless model since we now have a bindless handle source. Remove the optimization to fix assertion failures with turnip. Fixes e.g. dEQP-VK.glsl.texture_functions.query.texturesize.sampler2d_fixed_vertex Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4548>	2020-04-14 16:25:34 +00:00
Rob Clark	4b24b9647d	freedreno/ir3/ra: cleanup some leftovers Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>	2020-04-13 20:47:28 +00:00
Rob Clark	751c11a8c7	freedreno/ir3: rename depth->dce Since DCE is the only remaining function of this pass, after the pre-RA scheduler rewrite. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>	2020-04-13 20:47:28 +00:00
Rob Clark	cf74048fd1	freedreno/ir3: better cleanup when removing unused instructions Do a better job of pruning when removing unused instructions, including cleaning up dangling false-deps. This introduces a new ssa src pointer iterator, which makes it easy to clear links without having to think about whether it is a normal ssa src or a false-dep. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>	2020-04-13 20:47:28 +00:00
Rob Clark	96ff2a4099	freedreno/ir3/ra: handle array case for SFU select_reg opt The src of the SFU instruction could also be array/reg (non-SSA). Handle this case too. The postsched cp pass makes this scenario more likely. Fixes: `cc82521de4` ("freedreno/ir3: round-robin RA") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>	2020-04-13 20:47:28 +00:00
Rob Clark	b787b353d0	freedreno/ir3: add mov/cov stats While not always avoidable, cov instructions are a useful thing to look at to see if we could fold into src. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>	2020-04-13 20:47:28 +00:00
Rob Clark	89a78a07de	freedreno/ir3/postsched: avoid moving tex ahead of kill Add extra dependencies of tex/mem instructions on previous kill instructions to avoid moving them ahead of kills. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>	2020-04-13 20:47:28 +00:00
Rob Clark	017fdab217	freedreno/ir3/postsched: remove some leftovers These aren't used in postsched. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>	2020-04-13 20:47:28 +00:00
Rob Clark	9701008d64	freedreno/ir3/sched: awareness of partial liveness Realize that certain instructions make a vecN live, and account for this, in hopes of scheduling the remaining components of the vecN sooner. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>	2020-04-13 20:47:28 +00:00
Rob Clark	d2f4d332db	freedreno/ir3: new pre-RA scheduler This replaces the depth-first search scheduler with a more traditional ready-list scheduler. It primarily tries to reduce register pressure (number of live values), with the exception of trying to schedule kills as early as possible. (Earlier iterations of this scheduler had a tendency to push kills later, and in particular moving texture fetches which may not be necessary ahead of kills.) Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>	2020-04-13 20:47:28 +00:00
Rob Clark	0f22f85fe7	freedreno/ir3: fix location of inserted mov's If the group pass must insert a mov to resolve conflicts, avoid the mov appearing after the meta:collect whose src it is. The current pre-RA scheduler doesn't really care about the initial instruction order, but the new one will in some cases. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>	2020-04-13 20:47:28 +00:00
Rob Clark	908044ef4b	freedreno/ir3: simplify grouping pass Since `bdf6b7018c` the logic only needs to handle grouping collect srcs, so remove the now unnecessary indirection. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>	2020-04-13 20:47:28 +00:00
Rob Clark	860f5981f0	freedreno/ir3: make falsedep use's optional Add option when collecting uses to control whether they include falsedeps or not. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>	2020-04-13 20:47:28 +00:00
Rob Clark	d09e3afdcc	freedreno/ir3: spiff out disasm a bit for verbose mode, print also the instruction "cycle" (which takes into account (rptN) and (nopN)) in addition to instruction offset. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>	2020-04-13 20:47:28 +00:00
Jonathan Marek	40ccbae622	freedreno/computerator: support bindless sampler instructions Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4526>	2020-04-13 20:15:48 +00:00
Jonathan Marek	bc9a28beed	freedreno/computerator: support nop prefix Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4526>	2020-04-13 20:15:48 +00:00
Eric Anholt	95d4a956c0	freedreno/ir3: CSE the up/downconversion of SEL's cond's size. Not many programs hit this, but if you were, say, selecting between vec4s, you'd convert the cond 4 times. instructions in affected programs: 2957 -> 2717 (-8.12%) nops in affected programs: 989 -> 899 (-9.10%) non-nops in affected programs: 1968 -> 1818 (-7.62%) dwords in affected programs: 3232 -> 2752 (-14.85%) last-baryf in affected programs: 102 -> 90 (-11.76%) full in affected programs: 5 -> 4 (-20.00%) sstall in affected programs: 329 -> 329 (0.00%) (ss) in affected programs: 86 -> 105 (22.09%) (sy) in affected programs: 14 -> 12 (-14.29%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4516>	2020-04-13 19:24:52 +00:00
Eric Anholt	82375ccaa4	freedreno/ir3: Stop doing b2n on the SEL condition. SEL_B32 (and presumably B16) checks for 0 or nonzero in the condition (tested by just stuffing a uniform's value into it), so there's no need to do ir3_b2n() on it, or any preceding ir3_n2b(). instructions in affected programs: 664444 -> 659927 (-0.68%) nops in affected programs: 267898 -> 266312 (-0.59%) non-nops in affected programs: 420260 -> 417329 (-0.70%) dwords in affected programs: 144032 -> 137568 (-4.49%) last-baryf in affected programs: 10801 -> 10321 (-4.44%) full in affected programs: 2003 -> 2002 (-0.05%) sstall in affected programs: 76670 -> 77405 (0.96%) (ss) in affected programs: 4515 -> 4525 (0.22%) (sy) in affected programs: 612 -> 604 (-1.31%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4516>	2020-04-13 19:24:52 +00:00
Eric Anholt	904d5d63b4	freedreno: Fix leak of binning shader variants. The v->binning variant is never added to shader->variants, so just free each one as we free the nonbinning variant. Noticed from drm-shim mode running out of open fds, since each bo ends up with an fd. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4502>	2020-04-10 18:42:20 +00:00
Kristian H. Kristensen	5ec1f264f1	freedreno/ir3: Fix sz vs class confusion Add bounds checking to make sure we don't silently access out of bounds again. Fixes: `90f7d12236` ("freedreno/ir3/ra: pick higher numbered scalars in first pass") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4503>	2020-04-10 10:24:14 -07:00
Connor Abbott	089e1fb287	tu: Implement descriptor set update templates Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>	2020-04-09 15:56:55 +00:00
Connor Abbott	e1595026f6	tu: Add missing code for immutable samplers Actually fill out the samplers, based on the radv implementation. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>	2020-04-09 15:56:55 +00:00
Connor Abbott	a07b55443b	tu: Emit CP_LOAD_STATE6 for descriptors This restores the pre-loading of descriptor state, using the new SS6_BINDLESS method that allows us to pre-load bindless resources. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>	2020-04-09 15:56:55 +00:00
Connor Abbott	d37843fee1	tu: Switch to the bindless descriptor model Under the bindless model, there are 5 "base" registers programmed with a 64-bit address, and sam/ldib/ldc and so on each specify a base register and an offset, in units of 16 dwords. The base registers correspond to descriptor sets in Vulkan. We allocate a buffer at descriptor set creation time, hopefully outside the main rendering loop, and then switching descriptor sets is just a matter of programming the base registers differently. Note, however, that some kinds of descriptors need to be patched at command recording time, in particular dynamic UBO's and SSBO's, which need to be patched at CmdBindDescriptorSets time, and input attachments which need to be patched at draw time based on the pipeline that's bound. We reserve the fifth base register (which seems to be unused by the blob driver) for these, creating a descriptor set on-the-fly and combining all the dynamic descriptors from all the different descriptor sets. This way, we never have to copy the rest of the descriptor set at draw time like the blob seems to do. I mostly chose to do this because the infrastructure was already there in the form of dynamic_descriptors, and other drivers (at least radv) don't cheat either when implementing this. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>	2020-04-09 15:56:55 +00:00
Connor Abbott	fc850080ee	ir3: Rewrite UBO push analysis to support bindless Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>	2020-04-09 15:56:55 +00:00
Connor Abbott	274f3815a5	ir3: Plumb through bindless support Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>	2020-04-09 15:56:55 +00:00
Connor Abbott	7d0bc13fca	ir3: LDC also has a destination Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>	2020-04-09 15:56:55 +00:00
Connor Abbott	1842961e58	ir3: Also don't propagate immediate offset with LDC Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>	2020-04-09 15:56:55 +00:00
Connor Abbott	de7d90ef53	ir3: Plumb through support for a1.x This will need to be used in some cases for the upcoming bindless support, plus ldc.k instructions which push data from a UBO to const registers. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>	2020-04-09 15:56:55 +00:00
Connor Abbott	c8b0f90439	ir3: Add bindless instruction encoding Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>	2020-04-09 15:56:55 +00:00
Connor Abbott	122a900d7d	freedreno/a6xx: Add registers for the bindless model In Vulkan, descriptors for samplers, SSBO's, etc. are collected into descriptor sets, and shaders can use multiple descriptor sets. At command-recording time, users can swap out only some of the descriptor sets, and the driver is supposed to do the minimum amount necessary to update any internal binding tables, knowing that only some of the descriptors have changed. With the old binding model, focused on GL, where there are separate tables for each type of resource, we can do somewhat better than now by preserving descriptors from lower descriptor sets when switching higher descriptor sets. However we still have to copy around descriptors before each draw. At least for a6xx, qualcomm went further, essentially copying the Vulkan binding model as an alternate way to load resources. There's an array of registers (actually an array for compute and one for everything else), where each register holds a pointer to a descriptor set that can contain various different descriptor types. The descriptors are padded out to 16 dwords, so that every instruction can use an index instead of a dword offset. It's called "bindless", I think, because it can also be used to implement the old GL bindless extensions (presumably it allows more samplers and textures than the old model). This commit adds the register and cmdstream parts. Next up will be the instruction encoding. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>	2020-04-09 15:56:55 +00:00
Connor Abbott	e088d82aa6	freedreno/a6xx: Add UBO size field Verified with the vulkan blob, which uses ldc and UBO descriptors, and turnip will too soon. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>	2020-04-09 15:56:55 +00:00
Connor Abbott	d3b7681df2	tu: ir3: Emit push constants directly Carve out some space at the beginning for push constants, and push them directly, rather than remapping them to a UBO and then relying on the UBO pushing code. Remapping to a UBO is easy now, where there's a single table of UBO's, but with the bindless model it'll be a lot harder. I haven't removed all the code to move the remaining UBO's over by 1, though, because it's going to all get rewritten with bindless anyways. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>	2020-04-09 15:56:55 +00:00
Connor Abbott	63c2e8137d	tu: Dump out shader assembly when requested We don't use the ir3 variant machinery, so we have to do this ourselves. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>	2020-04-09 15:56:55 +00:00
Jonathan Marek	2e084c2cb3	turnip: new clear/blit implementation with shader path fallback The shader path is used to implement the following cases: * stencil aspect mask on D24S8 (for image_to_buffer,buffer_to_image) * clear/copy msaa destination (2D engine can't have msaa dest) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>	2020-04-09 14:43:02 +00:00
Jonathan Marek	de6967488a	turnip: add vk_format_is_snorm/is_float Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>	2020-04-09 14:43:02 +00:00
Jonathan Marek	51fe52d2fd	turnip: rework format helpers * Take tile_mode as input directly * tu6_format_gmem to tu6_base_format, use may not be limited to GMEM * Add new helpers that will return the correct tile_mode as for image level as part of the format. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>	2020-04-09 14:43:02 +00:00
Jonathan Marek	009082dcff	turnip: use dirty bits for dynamic viewport/scissor state CmdClearAttachments shader path will overwrite this state, so it needs to be re-emitted with dirty bits in that case. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>	2020-04-09 14:43:02 +00:00
Jonathan Marek	ed83281f0c	turnip: save attachment samples in renderpass state This is needed to be able to know the number of samples during CmdClearAttachments which can be used while the framebuffer is unknown. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>	2020-04-09 14:43:02 +00:00
Jonathan Marek	0637eab678	turnip: disable 8x msaa Not everything supports 8x msaa, and the blob doesn't support it at all. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>	2020-04-09 14:43:02 +00:00
Jonathan Marek	f03e63cd99	turnip: fix nir validate failure from push constant lowering Fixes newly added checks in nir validate failing. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>	2020-04-09 14:43:02 +00:00
Jonathan Marek	86d1a4c907	turnip: split up gmem/tile alignment Note: the x1/y1 align in tu6_emit_blit_scissor was broken Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>	2020-04-09 14:43:02 +00:00
Jonathan Marek	f494799a7f	turnip: RB_CCU_CNTL fixes * Correct bypass value for a618 * Bypass value for blitter * Don't set RB_CCU_CNTL again unnecessarily in tu6_emit_binning_pass Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>	2020-04-09 14:43:02 +00:00
Jonathan Marek	e4c05a5335	freedreno/registers: add RB_CCU_CNTL bitfields Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>	2020-04-09 14:43:02 +00:00
Jonathan Marek	420ca1e4a1	turnip: use buffer size instead of bo size for VFD_FETCH_SIZE Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4224>	2020-04-09 02:05:52 +00:00
Jonathan Marek	e62f8ae15a	turnip: improve vertex input handling Emit vertexBindingDescriptionCount bindings, instead of one per attribute. Verified with dEQP-VK.pipeline.vertex_input.* Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4224>	2020-04-09 02:05:52 +00:00
Jonathan Marek	d6a8591f72	turnip: fix compute shaders crashing after geometry shader change Fixes: `1af71bee73` ("turnip: Set has_gs in ir3_shader_key") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4483>	2020-04-08 01:56:53 +00:00
Kristian H. Kristensen	4399cacaf0	turnip: Drop dep_llvm from dependencies Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4478> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4478>	2020-04-07 18:44:21 +00:00
Kristian H. Kristensen	5789505ab3	turnip: Make Android platform build We still don't have a way to keep this from breaking, but I don't think this ever built. Let's call it progress. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4478>	2020-04-07 18:44:21 +00:00
Kristian H. Kristensen	97578c69e8	turnip: Stub out VK_KHR_external_{fence,semaphore}_fd Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4478>	2020-04-07 18:44:21 +00:00
Kristian H. Kristensen	e99f6f2ea1	turnip: Add missing VKAPI_ATTR annotations Make sure the types match. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4478>	2020-04-07 18:44:21 +00:00
Eric Anholt	1618159772	freedreno/a6xx: Set a level's pitch based on minified level0 pitch, not width0. Found from piglit fbo-generatemipmaps failures, then tracked down with the texturator test. The piece that really revealed things was finding that 1024x1 linear RGBA8 on the older blob drivers would have a pitch of 5120 instead of 4096, and the following levels minified that pitch. Fixes ~124 piglit tests (~8.5% of piglit failures) on cheza. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>	2020-04-07 18:02:56 +00:00
Eric Anholt	4b881d5270	freedreno: Add the outline of a test for a6xx texture layout. Trying to work out texture layout by remembering what things looked like in texturator is hard. Instead, let's use texture layouts from tracing the blob as a source of truth to make sure that we pick the same layouts they do (and don't break known-good ones). More testcases will be added as I fix layout bugs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>	2020-04-07 18:02:56 +00:00
Eric Anholt	9c6bfe8733	freedreno/a6xx: Drop the "alignment" layout temporary. It's just 1 for !3d, which means that the align we're doing in that case is pointless. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>	2020-04-07 18:02:56 +00:00
Eric Anholt	59a2220398	freedreno/a6xx: Remove the "aligned_height" temporary. Now that we're not incrementally minifying height, we can just modify it. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>	2020-04-07 18:02:56 +00:00
Eric Anholt	cdff81fa9a	freedreno/a6xx: Sink the per-level size temps inside the loop. u_minify(n, 1) is no cheaper than u_minify(n, level), and this makes the logic a lot simpler to follow. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>	2020-04-07 18:02:56 +00:00
Jonathan Marek	a1727598a0	turnip: implement timestamp query Passes tests in: dEQP-VK.pipeline.timestamp.* Signed-off-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4027> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4027>	2020-04-07 14:58:47 +00:00
Brian Ho	d64a7d6e69	turnip: Enable geometryShader device feature Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>	2020-04-07 14:13:21 +00:00
Brian Ho	bdf6b481d8	turnip: Enable geometry shaders for CP_DRAWs Enable geometry shading on draw if the pipeline has a geometry stage. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>	2020-04-07 14:13:20 +00:00
Brian Ho	b80dc4f5a6	turnip: Populate tu_pipeline.active_stages This can be used to determine if the pipeline has a specific shader stage (e.g. geometry shader). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>	2020-04-07 14:13:20 +00:00
Brian Ho	8eb0096312	turnip: Update maxGeometryShaderInvocations to match blob Geometry shaders support an invocations parameter up to a limit defined by maxGeometryShaderInvocations. This was set to 127, but executing with invocations > 32 causes a crash. As it turns out, the blob only advertises a max of 32 invocations, so we set that in turnip as well. Fixes dEQP-VK.geometry.instanced.draw_*_instances_{127, 64}_geometry_invocations Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>	2020-04-07 14:13:20 +00:00
Brian Ho	3550e20229	turnip: Selectively configure GRAS_LAYER_CNTL One of the features of geometry shaders is the ability to render to different layers by assigning to the gl_Layer (Layer in SPIR-V) builtin. While we have already plumbed the layer regid to the geometry shader, we also need to configure GRAS_LAYER_CNTL to actually use layered rendering. In addition, gmem does not support layered rendering, so we need to force sysmem. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>	2020-04-07 14:13:20 +00:00
Brian Ho	475fe500bf	turnip: Set up REG_A6XX_SP_GS_CONFIG Updates GS_CONFIG and HLSQ_GS_CNTL registers to match those emitted by the blob and fd. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>	2020-04-07 14:13:20 +00:00
Brian Ho	fceccc411a	turnip: Configure VFD_CONTROL with gsheader and primitiveid This commit updates VFD_CONTROL to use the GS header and primitive ID sysvals if a geometry shader stage is present in the pipeline. Like in the case of VPC, the code here is adapted from fd6_program. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>	2020-04-07 14:13:20 +00:00
Brian Ho	012773be26	turnip: Configure VPC for geometry shaders This commit updates tu6_emit_vpc to selectively emit GS-specific configuration. Most of this is repurposed from fd6_program.c. This also refactors `link_geometry_stages` to ir3_nir_lower_tess.c so it can be shared between fd and tu. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>	2020-04-07 14:13:20 +00:00
Brian Ho	6eabd6bd51	turnip: Emit geometry shader obj and related consts Like with other shader types, we need to emit the geometry shader object and the consts it uses. In addition, we need to emit additional geometry-specific consts that link primitive/vertex stride between the vs and gs. In conjunction with the gsheader, these are used by the vs to determine where to stlw outputs and used by the gs to determine where to ldlw those outputs from. FD emits these consts in the draw call because in GL, you can mix and match shaders in different programs. In Vulkan, however, we compile and link the shaders at pipeline creation, so we can emit these in the pipeline IB instead. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>	2020-04-07 14:13:20 +00:00
Brian Ho	1af71bee73	turnip: Set has_gs in ir3_shader_key The ir3 compiler only lowers the VS and GS for geometry shading if the corresponding has_gs key is set in the shader key. Without it, GS-specific intrinsics like load_per_vertex_input won't get lowered and the GS header will be initialized with invalid values. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>	2020-04-07 14:13:20 +00:00
Rob Clark	629c0cee0a	freedreno/ir3/cf: use ssa-uses Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4423>	2020-04-04 00:07:10 +00:00
Rob Clark	72f6b03aec	freedreno/ir3: add a pass to collect SSA uses We don't really track these as the ir is transformed, but it would be a useful thing for some passes to have. So add a pass to collect this information. It uses instr->data (generic per-pass ptr), with the hashsets hanging under a mem_ctx for easy disposal at the end of the pass. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4423>	2020-04-04 00:07:10 +00:00
Rob Clark	67dbe8088f	freedreno/ir3/cf: skip array load/store Don't fold conversions into array (incl phi lowered to regs/array). These aren't SSA. Avoids crashes in particular in frag shaders with flow control, which would leave a dangling array write disconnected from the original cov src. Possibly this could be slightly relaxed, if there is no other consumer of the src, and it were in the same block. But it would require updating block->keeps, and taking care of barrier state. Which isn't a thing the cf pass does currently. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4423>	2020-04-04 00:07:10 +00:00
Rob Clark	c2d0cc8b8d	freedreno/ir3: fixup cat3 32b vs 16b These should be keyed on src arg type. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4423>	2020-04-04 00:07:10 +00:00
Rob Clark	e73a8a9703	freedreno/ir3/cf: handle widening too We can also fold f16->f32 conversions. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4423>	2020-04-04 00:07:10 +00:00
Lionel Landwerlin	c3e305616c	drm-shim: return device platform as specified v2: Embed the libdrm dependency inside the drm-shim dependency Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Eric Anholt <eric@anholt.net> (v1) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4429>	2020-04-03 21:14:18 +00:00
Brian Ho	6e76453472	ir3: Disable copy prop for immediate ldlw offsets Immediate offsets are currently collapsed for ldlw, but ldlw does not behave correctly with immediate values. For example, `ldlw.u32 r0.x, l[4], 1` actually means to use the value of regid 4 (r1.x) as the offset when we actually want it to use the imm value of 4 as the offset. This commit disables copy prop for ldlw offsets so the same intrinsic gets compiled to: mov.u32u32 r0.y, 0x00000004 ldlw.u32 r0.x, l[r0.y], 1 Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4439> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4439>	2020-04-03 19:44:46 +00:00
Brian Ho	355abfeed5	turnip: Advertise 8 bit subpixel precision Previously, turnip advertised 4-bit subpixel precision when in practice, a6xx seems to render with 8-bit precision. This caused dEQP-VK.renderpass2.suballocation.subpass_dependencies.late_fragment_tests.* to fail because they compare images rendered with turnip against ones rendered via a software reference implementation parameterized by turnip's VkPhysicalDeviceLimits.subPixelPrecisionBits value. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4172> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4172>	2020-04-03 16:27:56 +00:00
Connor Abbott	73e574acb8	freedreno: Rename RB_DONE_TS This makes the various cache_flush implementations make more sense. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4065> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4065>	2020-04-02 16:18:25 +00:00
Connor Abbott	36133a5434	freedreno: Cleanup event names It turns out that every _TS event, i.e. every event which requires a seqno pointer, also allows generating an interrupt in the kernel, at least since a3xx. And furthermore these interrupts are named by the kgsl kernel driver and already in envytools. Therefore it's possible to map out what the _TS events are with 100% certainty, given access to the hardware, by sending a CP_EVENT_WRITE with bit 31 set, unmasking all interrupts in the kernel, and logging which ones get hit. I've done this for a6xx, and I've also looked at the a5xx firmware, and the list of TS interrupts is the same as a6xx, so I have a pretty good idea of what the a5xx events are. I also fixed a few related things along the way: - VIZQUERY_END overlaps with WT_DONE_TS, but VIZQUERY_START was also a mess, with neither VIZQUERY_START nor HLSQ_FLUSH using variants. I added what seems like reasonable variants, based on the existing comment and the fact that HLSQ_FLUSH is only used in Mesa with a3xx and a4xx. - CACHE_FLUSH_AND_INVALIDATE seems to come straight from R600, and I have no idea if it's actually valid with a2xx, but given that RB_DONE_TS exists in the interrupt mask since a3xx, I guessed that RB_DONE_TS hasn't changed position since then and put it down as a3xx+ and limited CACHE_FLUSH_AND_INVALIDATE to a2xx. Someone with the relevant hardware should be able to confirm. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4065>	2020-04-02 16:18:25 +00:00
Eric Anholt	31011c7a39	freedreno/turnip: Use the NIR info to decide if we need helper invocations. We had an approximation that was assuming any ddx or tex instruction needed helper invocations, but that's not true for texelFetch() or textureSize(). It also meant that we were setting PIXLOD on vertex and compute shaders doing texturing, which doesn't really make sense. shader-db (with a hack to log pixlod): total pixlod in shared programs: 582 -> 573 (-1.55%) Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2681 Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4308> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4308>	2020-03-31 22:29:22 +00:00
Rob Clark	127fa5d00c	freedreno/ir3: fix android build Fixes: `e5339fe4a4` ("Move compiler.h and imports.h/c from src/mesa/main into src/util") Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4381> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4381>	2020-03-31 18:46:04 +00:00
Connor Abbott	d63acce5f4	tu: Return the correct alignment for images The alignment field was never initialized, so we were just returning an alignment of 0. Return the alignment from fdl, and while we're here clean up some leftovers in tu_private.h. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4357> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4357>	2020-03-31 08:22:58 +00:00
Connor Abbott	d84c206d85	freedreno/fdl: Add base_align Tell users what the base address of the image needs to be aligned to. These values are based on experimentation via passing an offset to vkBindImageMemory with turnip and seeing if tests still pass. Note that r8g8 is also special in this regard, however it actually has an increased alignment (in bytes). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4357>	2020-03-31 08:22:58 +00:00
Eric Engestrom	79af30768d	meson: inline `inc_common` Let's make it clear what includes are being added everywhere, so that they can be cleaned up. Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>	2020-03-28 21:36:54 +01:00
Rob Clark	f7d53275fb	freedreno/ir3/ra: re-work a6xx merged register file conflicts In particular, set up the full/half conflicts first. This avoids spurious conflicts that were causing RA to place vecN half-regs poorly. Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>	2020-03-27 22:41:36 +00:00
Rob Clark	faf276b4c8	freedreno/ir3/ra: split building regs/classes and conflicts Split out the construction of registers and classes (which is the same on all gens) from setting up conflicts. Prep to re-work how we set up conflicts on a6xx+, which merged the half/full register file. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>	2020-03-27 22:41:36 +00:00
Rob Clark	90f7d12236	freedreno/ir3/ra: pick higher numbered scalars in first pass Since we are re-assigning the scalars anyways in the second pass, assign them to the highest free reg in the first pass (rather than lowest) to allow packing vecN regs as low as possible. Note this required some changes specifically for tex instructions with a single component writemask that is not necessarily .x, as previously these would get assigned in the first RA pass, and since they are still scalar, we'd end up w/ some r47.* and other similarly way-too-high assignments after the 2nd pass. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>	2020-03-27 22:41:36 +00:00
Rob Clark	1da90ca9bf	freedreno/ir3/ra: compute register target from liveranges Using the output of the first pass isn't ideal, as it can bake in the losses from fragmentation which the scalar pass is intended to fill in. This gets worse when we start using "vectorish" instructions, due to higher use of vecN values. Instead, we can just use the outputs of the liveness analysis to get a more accurate # of maximum live values at any point. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>	2020-03-27 22:41:36 +00:00
Rob Clark	d2cc92c747	freedreno/ir3/ra: fix array liveranges Fixes: `1b658533e1` ("freedreno/ir3: extend liverange of arrays") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>	2020-03-27 22:41:36 +00:00
Rob Clark	6347c2ea89	freedreno/ir3/ra: add def/use iterators Decouple the messy logic of figuring out vreg names defined/used by an instruction from the logic of what to do about it by introducing iterators. There is still some array vs ssa special casing in ra_block_compute_live_ranges(), but less than before. And this will avoid introducing a second copy of the def/use logic in a following patch which uses the liveranges to calculate the maximum # of live values (which is the optimal target for max physical register window to round-robin within). Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>	2020-03-27 22:41:36 +00:00
Rob Clark	bf0aa7ed90	freedreno/ir3/ra: drop extending output live-ranges This is no longer needed as we create meta:collect instructions in the end block, which achieves the same result. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>	2020-03-27 22:41:36 +00:00
Rob Clark	0e7d24b532	freedreno/ir3/ra: add helper to map name to array For vreg names that refer to arrays rather than SSA values, this is the counterpart to name_to_instr(). Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>	2020-03-27 22:41:36 +00:00
Rob Clark	d99d358389	freedreno/ir3/ra: fix target register calculation Account for the # of regs an instruction writes, and fix an off-by-one. (We are about to replace this with calculating the register target using the live-ranges, but in debugging that it was useful to assert() if it chose a higher target.) Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>	2020-03-27 22:41:36 +00:00
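
Commit 1da90ca9bf in the list above describes deriving the register target from the live ranges, i.e. from the maximum number of values live at any point, rather than from the output of the first RA pass. The following is only a rough sketch of that idea and not the ir3 implementation: the function name `max_live_values`, the `(start, end, size)` tuple layout, and the half-open range convention are all assumptions made for illustration.

```python
def max_live_values(live_ranges):
    """Return the peak number of simultaneously live register components.

    live_ranges is a list of (start, end, size) tuples (hypothetical layout):
    the value is live over the half-open instruction interval [start, end)
    and occupies `size` components (1 for a scalar, N for a vecN value).
    """
    events = []
    for start, end, size in live_ranges:
        events.append((start, size))   # value becomes live
        events.append((end, -size))    # value dies
    # Sort by position; at the same position, deaths (negative deltas) come
    # first so a register freed at instruction i can be reused at i.
    events.sort(key=lambda e: (e[0], e[1]))

    live = 0
    peak = 0
    for _, delta in events:
        live += delta
        peak = max(peak, live)
    return peak


# A scalar, a vec4, a vec2, and another scalar with overlapping lifetimes.
ranges = [(0, 4, 1), (1, 3, 4), (3, 6, 2), (4, 5, 1)]
print(max_live_values(ranges))
```

In this toy input the peak occurs while the first scalar and the vec4 overlap. A real allocator would additionally have to account for register classes and alignment, which this sketch ignores.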

1317 Commits