KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Eric Engestrom	cae6093266	freedreno/perfcntrs: fix fd leak CoverityID: 1110568, 1458071 Fixes: `5a13507164` ("freedreno/perfcntrs: add fdperf") Signed-off-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3671> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3671>	2020-02-04 19:26:40 +00:00
Kristian H. Kristensen	df6a2a7197	turnip: Be explicit about converting vk compare func to a6xx Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3686> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3686>	2020-02-04 06:03:52 +00:00
Kristian H. Kristensen	67dd51606c	freedreno/fdperf: Cast away some ignored return values This is a developer tool; it can crash and burn if it fails to allocate. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3686>	2020-02-04 06:03:52 +00:00
Rob Clark	982d61e2cd	freedreno/ir3: fix a dirty lie Lies, damn lies, and leftover hacks! We no longer hard-code these two, so fix the disasm to print the correct values. Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	752aeb7b3f	freedreno/ir3: simplify split from collect In some cases we need to split components out from what was already a collect. That was making it hard to DCE unused components of the collect. (Ie. unused components of fragcoord, etc) So just detect this case and skip the chained collect+split. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	8d0e7d9a4c	freedreno/ir3: create fragcoord instructions in input block This was somehow working to create the instructions in a random block, and use the value in other blocks, by dumb luck. But two-pass-RA's better choice of register assignment causes a couple dEQPs to start failing without this fix: dEQP-GLES3.functional.shaders.metamorphic.bubblesort_flag.variant_1 dEQP-GLES3.functional.shaders.metamorphic.bubblesort_flag.variant_2 Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	fb09020ef2	freedreno/ir3: remove unused tex arg harder Just killing the SSA link isn't enough. It confuses RA, legalize, and postsched to see a bogus unused reg. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	2ffe44ec0a	freedreno/ir3: add RA sanity check Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	3e79c4f0ed	freedreno/ir3: two pass register allocation Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	b0293af7a5	freedreno/ir3: don't precolor unused inputs This apparently can happen with gs/tess. And will cause problems with two-pass-ra, so let's just skip them. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	ad2587d3c8	freedreno/ir3: add is_tex_or_prefetch() Some of the aspects of tex prefetch are in common with normal tex instructions, such as having a wrmask to control which components are written. Add a helper for this. This should result in actually using the prefetch wrmask to avoid fetching unneeded components. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	4a7a6c9ef0	freedreno/ir3: number instructions from one ra_block_compute_live_ranges() treats zero as "not yet defined", so probably best to not let this be a valid instruction # Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	0f78c32492	freedreno/ir3: post-RA sched pass After RA, we can schedule to increase parallelism (reduce nop's) without worrying about increasing register pressure. This pass lets us cut down the instruction count ~10%, and prioritize bary.f, kill, etc, which would tend to increase register pressure if we tried to do that before RA. It should be more useful if RA round-robin'd register choices. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	3369406e44	freedreno/ir3: fix kill scheduling kill (and other cat0/flow instructions) do not have a dst register. Which was mostly harmless before, other than RA thinking it would need a free register to write. (But nothing consumed it, so the value would be immediately dead.) But this would cause more problems with postsched which would see a bogus dependency. Also, post-RA sched does need to see the dependency on the predicate register. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	9a9f78f1f9	freedreno/ir3/ra: make use()/def() functions instead of macros Originally these were nested functions, which worked nicely, giving us the function of a local macro that was actual 'c' syntax (ie. not token pasted macro). But these were converted to macros because clang doesn't let us have nice gcc extensions. Extract these back out into functions, before adding more things and making the macros even more cumbersome. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	a5f24f966a	freedreno/ir3: a bit more optmsgs debug Also dump where arrays are allocated. This was useful for debugging. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	300d1181c7	freedreno/ir3: move atomic fixup after RA A post-RA sched pass will move the extra mov's to the wrong place, so rework the fixup so it can run after RA (and therefore after postsched) Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	304b50c9f8	freedreno/ir3: move block-scheduling into legalize We want to do this only once. If we have post-RA sched pass, then we don't want to do it pre-RA. Since legalize is where we resolve the branch/jumps, we might as well move this into legalize. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	093c94456b	freedreno/ir3: move nop padding to legalize This way we can deal with it in one place, after all the blocks have been scheduled. Which will simplify life for a post-RA sched pass. This has the benefit of already taking into account nop's that legalize has to insert for non-delay related reasons. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	c803c662f9	freedreno/ir3: split out delay helpers We're going to want these also for a post-RA sched pass. And also to split nop stuffing out into its own pass. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	54c795f829	freedreno/ir3: fix crash when no non-input instructions This scenario can come up with block-sched and nop-sched moved to after RA. So let's fix it first to keep things bisectable. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	c1194e10b2	freedreno/ir3: cleanup after lower_locals_to_regs Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	f0b792ea06	freedreno/ir3: shuffle a few ir3_register fields It makes life easier for postsched to always be able to rely on wrmask. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	d326d30efe	freedreno/drm: readonly cmdstream Noticed that we weren't consistently making cmdstream buffers gpu-readonly. Fix that and drop the need to pass flags to fd_bo_new_ring(). Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3663> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3663>	2020-01-31 13:01:52 -08:00
Brian Ho	58fd26c433	turnip: Fix vkCmdCopyQueryPoolResults with available flag Previously, calling vkCmdCopyQueryPoolResults with the VK_QUERY_RESULT_WITH_AVAILABILITY_BIT flag set the query result field in the buffer to 0 if unavailable and the query result if available. This was a misunderstanding of the Vulkan spec, and this commit corrects the behavior to emitting a separate available result in addition to the query result. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3560> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3560>	2020-01-30 20:30:46 +00:00
Brian Ho	1a3e2a7fa8	turnip: Fix vkGetQueryPoolResults with available flag Previously, calling vkGetQueryPoolResults with the VK_QUERY_RESULT_WITH_AVAILABILITY_BIT flag set the query result field in *pData to 0 if unavailable and the query result if available. This was a misunderstanding of the Vulkan spec, and this commit corrects the behavior to emitting a separate available result in addition to the query result. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3560>	2020-01-30 20:30:46 +00:00
Brian Ho	1c3319cf81	turnip: Free event->bo on vkDestroyEvent Fixes a leak from freeing event but not event->bo. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3639> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3639>	2020-01-30 18:50:06 +00:00
Jonathan Marek	1c5d84fcae	turnip: hook up cmdbuffer event set/wait Gets some basic tests under "dEQP-VK.synchronization.event" passing Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3123> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3123>	2020-01-29 23:13:43 +00:00
Alejandro Piñeiro	d5c32db076	turnip: remove unused descriptor state dirty It was only used to be initialized to zero. Not even updated as descriptor sets are bound. As far as I understand, setting the bit TU_CMD_DIRTY_DESCRIPTOR_SET on tu_cmd_state.dirty is used instead. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3624> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3624>	2020-01-29 20:52:52 +00:00
Eric Anholt	06b13dfed2	tu: Fix binning address setup after pack macros change. This fixes a regression in "vkcube -m headless" rendering, but upsettingly none of my CTS tests I've been using. Fixes: `59f29fc845` ("turnip: Convert the rest of tu_cmd_buffer.c over to the new pack macros.") Caught-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3609> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3609>	2020-01-29 19:30:09 +00:00
Brian Ho	3d5bdea2cf	turnip: Enable occlusionQueryPrecise This commit enables the occlusionQueryPrecise feature. No additional work is required as occlusion queries are already implemented to track exact sample counts. Also enables a number of extra tests on the Vulkan CTS. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3605> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3605>	2020-01-29 19:05:23 +00:00
Samuel Pitoiset	15d53d8294	compiler: add PERSP to the existing barycentric system values We need the LINEAR versions for AMD_shader_explicit_vertex_parameter. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Eduardo Lima Mitev	e6b531af66	turnip: Fix issues in tu_compute_pipeline_create() that may lead to crash The shader object is destroyed even if its creation failed. It is also not destroyed if its compilation or upload fails, leading to leaks. Finally, tu_compute_pipeline_create() should set output var pPipeline to VK_NULL_HANDLE if it fails. Avoids crash on dEQP-VK.api.object_management.alloc_callback_fail_multiple.compute_pipeline Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3572> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3572>	2020-01-29 09:25:20 +00:00
Eduardo Lima Mitev	0e11e8ba89	turnip: Remove failed command buffer from pool When an error condition occurs during tu_create_cmd_buffer(), the cmd buffer has already been added to a pool, so the cleanup code should remove it. Fixes a crash (assert in tu_device::tu_bo_finish()) in dEQP tests: dEQP-VK.api.object_management.max_concurrent.command_buffer_primary dEQP-VK.api.object_management.max_concurrent.command_buffer_secondary due to pool attempting to destroy an invalid command buffer. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3572>	2020-01-29 09:25:20 +00:00
Rob Clark	63af27bc76	freedreno/drm: fix invalid-cmdstream-size with older kernels A cmdstream of size zero is invalid. But this can appear in various places where we emit a pointer to state. This doesn't show up with newer kernels (newer than v5.0) which use "softpin", but on earlier kernels can result in: [drm:msm_ioctl_gem_submit [msm]] ERROR invalid cmdstream size: 0 Since the pointer value doesn't matter in these cases, the easy solution is just to not emit a cmds table entry in this case. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2805> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2805>	2020-01-28 00:09:34 +00:00
Brian Ho	f55e215b8c	turnip: Implement vkCmdCopyQueryPoolResults for occlusion queries Use CP_COND_EXEC and CP_COND_WRITE to conditionally copy the results of a query to a buffer based off the query's availability. Fixes: #2238 Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>	2020-01-24 18:14:01 +00:00
Brian Ho	9a3656b9fd	turnip: Implement vkCmdResetQueryPool Clears the available bit for each requested query on the GPU. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>	2020-01-24 18:14:01 +00:00
Brian Ho	97fa4cb3dc	turnip: Implement vkGetQueryPoolResults for occlusion queries Implements fetching the results of a query pool with the VK_QUERY_RESULT_WAIT_BIT, VK_QUERY_RESULT_WITH_AVAILABILITY_BIT, and VK_QUERY_RESULT_PARTIAL_BIT flags. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>	2020-01-24 18:14:01 +00:00
Brian Ho	24b95485dc	turnip: Update query availability on render pass end Unlike on an immediate-mode renderer, Turnip only renders tiles on vkCmdEndRenderPass. As such, we need to track all queries that were active in a given render pass and defer setting the available bit on those queries until after all tiles have rendered. This commit adds a draw_epilogue_cs to tu_cmd_buffer that is executed as an IB at the end of tu_CmdEndRenderPass. We then emit packets to this command stream that update the availability bit of a given query in tu_CmdEndQuery. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>	2020-01-24 18:14:01 +00:00
Brian Ho	f750dd2ab8	turnip: Implement vkCmdEndQuery for occlusion queries Mostly a translation of freedreno's implementation of glEndQuery for GL_SAMPLES_PASSED query objects with a slight modification to set the availability bit of the query bo (slot->available) if the query was not ended inside a render pass. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>	2020-01-24 18:14:01 +00:00
Brian Ho	5824a59ee2	turnip: Implement vkCmdBeginQuery for occlusion queries Mostly a translation of freedreno's implementation of glBeginQuery for GL_SAMPLES_PASSED query objects with special logic for handling tiled render passes. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>	2020-01-24 18:14:01 +00:00
Brian Ho	78dea40b1c	turnip: Implement vkCreateQueryPool for occlusion queries General structure is inspired by anv's implementation in genX_query.c. We define a packed struct that tracks sample count at the beginning of the query and at the end; the result of the occlusion query is then slot->end - slot->begin. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>	2020-01-24 18:14:01 +00:00
Brian Ho	a155ab93a3	turnip: Update tu_query_pool with turnip-specific fields tu_query_pool was forked from radv_query_pool, but we will need a different set of fields to implement queries in turnip. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>	2020-01-24 18:14:01 +00:00
Connor Abbott	b103157a0e	freedreno: Document CP_INDIRECT_BUFFER_CHAIN This will let us use batch chaining instead of growing batches on a5xx and a6xx. Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3537> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3537>	2020-01-24 10:03:08 +00:00
Connor Abbott	f58242b56e	freedreno: Document CP_UNK_A6XX_55 Reviewed-by: Rob Clark <robdclark@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3537>	2020-01-24 10:03:08 +00:00
Connor Abbott	3cf1d6b8db	freedreno: Document CP_COND_REG_EXEC more The vulkan blob uses the RENDER_MODE mode to condition a blit on the render mode in traces of a dEQP triangle test. Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3182> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3182>	2020-01-24 09:23:27 +00:00
Eric Anholt	59f29fc845	turnip: Convert the rest of tu_cmd_buffer.c over to the new pack macros. There are only a couple of hard cases left using pkt4, where the register number to write is computed. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455>	2020-01-23 22:46:09 +00:00
Eric Anholt	d67100519e	turnip: Convert renderpass setup to the new register packing macros. This gets a lot of the hard code converted over to the new macros, resulting in (I feel) much more readable code with LESS_SHOUTING_ABOUT_THE_REG(). I decided to consistently put the reg on its own line, so that all the register names line up. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455>	2020-01-23 22:46:09 +00:00
Eric Anholt	08837ea3d2	turnip: Port krh's packing macros from freedreno to tu. This introduces some minor unpacking of the temporary fd_reg_pair structs to code that previously was packing a whole register field. In the pack wrapper in tu_cs.h, I added some explanatory docs, dropped the relocs handling since we don't need it, and removed the extra regs[] in the __ONE_REG() macro (which was causing gcc's optimizer to fall on its face in my release build). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455>	2020-01-23 22:46:09 +00:00
Eric Anholt	d4bc3c93ea	freedreno: Fix OUT_REG() on address regs without a .bo supplied. Sometimes you want to zero out an address by supplying a NULL BO, but without this we would end up only emitting one dword. Increases size of fd6_gmem.o by .8%, though it's not clear to me why (no obvious terrible codegen happening) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455>	2020-01-23 22:46:09 +00:00
Eric Anholt	c1327bc283	freedreno: Add some missing a6xx address declarations. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455>	2020-01-23 22:46:09 +00:00
Eric Anholt	fbd9b4ce08	turnip: Fix execution of secondary cmd bufs with nothing in primary. We want to finish off cmd emission in the primary CS and add its entry to the IB, but regardless of whether there had been anything in the primary CS to emit, we still need a reserved CS entry for the loop below. Fixes crashes in dEQP-VK.binding_model.shader_access.secondary_cmd_buf.* and many more in dEQP-VK.renderpass* Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3524> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3524>	2020-01-23 20:27:26 +00:00
Jonathan Marek	8aa5d96864	turnip: simplify tu_physical_device_get_format_properties Fixes the "bad VkImageTiling" error when tiling is VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3485> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3485>	2020-01-23 18:34:07 +00:00
Jonathan Marek	b7e22b7a35	vulkan/wsi: remove unused image_get_modifier Signed-off-by: Jonathan Marek <jonathan@marek.ca> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3485>	2020-01-23 18:34:07 +00:00
Jonathan Marek	e8afd40758	turnip: set linear tiling for scanout images Fixes: `210e6887` "vulkan/wsi: Use the interface from the real modifiers extension" Signed-off-by: Jonathan Marek <jonathan@marek.ca> Acked-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3485>	2020-01-23 18:34:07 +00:00
Jonathan Marek	11f6fba1c9	turnip: hook up GetImageDrmFormatModifierPropertiesEXT Fixes: `210e6887` "vulkan/wsi: Use the interface from the real modifiers extension" Signed-off-by: Jonathan Marek <jonathan@marek.ca> Acked-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3485>	2020-01-23 18:34:07 +00:00
Guido Günther	c5334d2943	freedreno/drm: Don't miscalculate timeout The current code overflows (s * 1000000000) for s >= 5 but that is e.g. used in msm_bo_cpu_prep. Signed-off-by: Guido Günther <agx@sigxcpu.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3514> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3514>	2020-01-23 18:07:13 +00:00
Eric Anholt	b327501dbf	turnip: Add support for fine derivatives. This does appear to be the required instruction sequence (dsxpp_1 dst src; dsxpp_1.p dst src) as dropping either instruction fails the testsuite. Fixes dEQP-VK.glsl.derivate.* Reviewed-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3494> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3494>	2020-01-23 17:38:29 +00:00
Eric Anholt	876824908d	freedreno/ir3: Plumb the ir3_shader_variant into legalize. legalize is computing a lot of state that goes in the variant, let's just store it directly instead of passing pointers around. This leaves max_bary in place, which is doing some surprising work (overwriting the original total_in in some cases). Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3494>	2020-01-23 17:38:29 +00:00
Anthony Pesch	f77369086c	util/hash_table: update users to use new optimal integer hash functions Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>	2020-01-23 17:06:57 +00:00
Eric Anholt	65e432695d	turnip: Add support for uniform texel buffers. Pretty straightforward: Port texture descriptor code from freedreno, fill in alignment limits from closed vk, and tu_cmd_buffer.c was already uploading the texture descriptor. This doesn't implement storage texel buffers (required in the compute pipeline) yet, since those will need an IBO descriptor for the store path. Still, making the load path be connected to the texture descriptor won't hurt. Part of #2237 Fixes dEQP-VK.binding_model.shader_access.primary_cmd_buf.uniform_texel_buffer.* Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3522> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3522>	2020-01-23 02:40:09 +00:00
Eric Anholt	3abfde13be	turnip: Add support for non-zero (still constant) UBO buffer indices. This was actually all ready to go at this point, and just needed to increment by the value. Fixes dEQP-VK.binding_model.shader_access.primary_cmd_buf.uniform_buffer.* Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3504> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3504>	2020-01-22 02:13:38 +00:00
Jonathan Marek	5f791df0d0	turnip: fix array/matrix varyings Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3109> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3109>	2020-01-21 20:36:08 -05:00
Jonathan Marek	c171765223	turnip: remove tu_sort_variables_by_location nir_assign_io_var_locations already does sorting. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3109>	2020-01-21 20:36:08 -05:00
Jonathan Marek	1736447f27	freedreno/ir3: allow inputs with the same location turnip can have multiple inputs with the same location, and different location_frac. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3109>	2020-01-21 20:36:08 -05:00
Eric Anholt	d1166a3b3a	turnip: Disable UBWC on images used as storage images. The closed GL driver doesn't use UBWC on any storage images. It does tile mostly (skipping tiling on writeonly images, it seems), but for freedreno we've been enabling tiling in all cases and it's fine. We do need to disable UBWC, as tests fail otherwise and just plugging in the equivalent UBWC regs like we were setting up a texture isn't enough. Fixes dEQP-VK.image.atomic_operations.* Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3433> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3433>	2020-01-21 19:29:59 +00:00
Eric Anholt	e5ce365cde	turnip: Add limited support for storage images. So far this doesn't handle the texture state-based storage image access loads, and doesn't support descriptor arrays (same as SSBOs). The texture side is more tricky, since we have another remapping table to work around. This is enough to get some of dEQP-VK.image.atomic_operations.* working. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3433>	2020-01-21 19:29:59 +00:00
Eric Anholt	85e424c591	turnip: Refactor the intrinsic lowering. Too many things in one function, split them out based on the intrinsic. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3433>	2020-01-21 19:29:59 +00:00
Eric Anholt	3ac662e8df	turnip: Fix some whitespace around binary operators. Conforms to mesa style and the rest of turnip. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3433>	2020-01-21 19:29:59 +00:00
Eric Anholt	fb6fca0037	freedreno: Stop scattered remapping of SSBOs/images to IBOs. Just make it be all SSBOs then all storage images. The remapping table was there to make it so that the big gap present from gallium's atomic lowering would get cleaned up, but that's no longer the case. The table has made it very hard to support Vulkan storage images, so it's time for it to go. This does mean that an SSBO/IBO that is only loaded (or size-queried) will now occupy a slot in the table where it wouldn't before. This seems like a minor cost compared to being able to drop this much logic. With the remapping table gone, SSBO array handling for turnip just falls out. Fixes many array cases of dEQP-VK.binding_model.shader_access.primary_cmd_buf.storage_buffer.* Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jonathan Marek <jonathan@marek.ca> (turnip) Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>	2020-01-21 10:06:23 -08:00
Eric Anholt	2dc2055157	turnip: Refactor linkage state setup. As I touch this for descriptor set reworks, I don't want to have to update it twice. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>	2020-01-21 10:06:23 -08:00
Hyunjun Ko	26d93a7495	turnip: fix invalid VK_ERROR_OUT_OF_POOL_MEMORY When VK_DESCRIPTOR_TYPE_SAMPLER is provided, it doesn't need to be counted as a buffer count. Otherwise it leads to mismatch of allocated buffer size, hitting VK_ERROR_OUT_OF_POOL_MEMORY finally. Fixes: `c39afe68f0` Also fixes amber tests: ./tests/cases/address_modes_float.amber ./tests/cases/address_modes_int.amber ./tests/cases/magfilter_linear.amber ./tests/cases/magfilter_nearest.amber Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2020-01-21 10:29:16 +01:00
Jason Ekstrand	210e68874b	vulkan/wsi: Use the interface from the real modifiers extension The anv implementation still isn't quite complete, but we can at least start using the structs from the real extension. v2: Fix circular pNext list (Lionel) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>	2020-01-17 18:27:29 +00:00
Jason Ekstrand	75755e0eba	turnip: Pretend to support Vulkan 1.2 It doesn't really support any Vulkan properly yet so why not claim 1.2? This was an easier way of fixing the build than trying to roll it forward to a later version of ANV's entrypoint generator scripts.	2020-01-15 08:34:57 -06:00
Rob Clark	2629cb627c	freedreno/ir3: rename instructions Turns out this range of opcodes are more general purpose if/else/endif instructions. We should re-work tess to create a basic block and use normal flow control. And possibly (for a6xx+) optimize cases to use if/else/endif when appropriate. Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3398> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3398>	2020-01-15 00:56:24 +00:00
Jason Ekstrand	d3737002ee	nir/lower_atomics_to_ssbo: Also lower barriers This is more correct for a pass which is supposed to completely lower away atomic counters. It also lets us stop supporting atomic counter barriers in most of the drivers. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:47 +00:00
Jason Ekstrand	e40b11bbcb	nir: Rename nir_intrinsic_barrier to control_barrier This is a more explicit name now that we don't want it to be doing any memory barrier stuff for us. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:47 +00:00
Jason Ekstrand	60097cc840	nir: Add a new memory_barrier_tcs_patch intrinsic Right now, it's implemented as a no-op for everyone. For most drivers, it's a switch case in the NIR -> whatever which just breaks. For ir3, they already have code to delete tessellation barriers so we just add a case to also delete memory_barrier_tcs_patch. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:47 +00:00
Lasse Lopperi	3de2774dcb	freedreno/drm: Fix memory leak in softpin implementation Free the memory allocated for cmds/reloc_bos array when destroying the associated ringbuffer. For similar fix for the non-softpin implementation see: `d014af98b7` Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2324 Fixes: `f3cc0d2` ("freedreno: import libdrm_freedreno + redesign submit") Signed-off-by: Lasse Lopperi <lasse.lopperi@ge.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3342> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3342>	2020-01-10 16:21:35 +00:00
Kristian H. Kristensen	f9d35ea55b	ir3: Set up full/half register conflicts correctly Setting up transitive conflicts between a full register and its two half registers (eg r0.x and hr0.x and hr0.y) will make the half registers conflict. They don't actually conflict and this prevents us from using both at the same time. Add and use a new ra helper that sets up transitive conflicts between a register and its subregisters, except it carefully avoids the subregister conflict. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@chromium.org>	2020-01-09 16:03:25 -08:00
Bas Nieuwenhuizen	b72182fcfa	turnip: Use VK_NULL_HANDLE instead of NULL. Only occurrence of implicitly converting pointer->int. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2570>	2020-01-02 11:47:02 +00:00
Rob Clark	0c32063794	freedreno/ir3: fix flat shading again These days `ctx->inputs` is the split scalar input components and `ir->inputs` is the full vecN. This got fixed in the load_input case, but the load_interpolated_input case was missed. Fixes: `bdf6b7018c` ("freedreno/ir3: re-work shader inputs/outputs") Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-24 17:16:31 +00:00
Jonathan Marek	13adce2845	turnip: disable B8G8R8 vertex formats Looks like swap doesn't work as expected on these, disable them. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3170> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3170>	2019-12-19 19:03:02 -05:00
Jonathan Marek	b9d4c10e4b	turnip: minor warning fixes Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3177> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3177>	2019-12-19 23:21:01 +00:00
Jonathan Marek	e9a32af3bf	turnip: implement secondary command buffers Uses a new "tu_cs_add_entries" function because tu_cs_emit_call doesn't work inside draw_cs (which is already called by tu_cs_emit_call). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3075> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3075>	2019-12-19 20:42:08 +00:00
Jonathan Marek	85fff42d08	turnip: compute gmem offsets at renderpass creation time This makes it easier to implement secondary command buffers, since we no longer need to know the render area to set the gmem offsets for input attachments and CmdClearAttachments. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3075>	2019-12-19 20:42:08 +00:00
Jonathan Marek	f81c41a812	turnip: emit_compute_driver_params fixes Offset was wrong, it is in vec4 not dwords. There's a hole between DP_NUM_WORK_GROUPS_Z and DP_LOCAL_GROUP_SIZE_X so use the IR3 enums. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>	2019-12-19 15:13:40 -05:00
Jonathan Marek	bb134c5316	turnip: emit base instance vs driver param Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>	2019-12-19 15:13:40 -05:00
Jonathan Marek	a3a70588c0	freedreno/ir3: support load_base_instance Not supported by hardware, uses same mechanism as base vertex. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>	2019-12-19 15:13:40 -05:00
Jonathan Marek	5c17d9b9ca	freedreno/registers: document vertex/instance id offset bits Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>	2019-12-19 15:13:40 -05:00
Kristian H. Kristensen	e4c2bb6a93	freedreno/a6xx: RB6_R8G8B8 is actually 32 bit RGBX Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>	2019-12-19 09:56:05 -08:00
Jonathan Marek	fe4a8df9a8	freedreno/ir3: fix vertex shader sysvals with pre_assign_inputs The first pre_assign_inputs loop doesn't pre-assign sysvals, so skip the second part for sysvals. The sysvals don't need to be pre-assigned since the state for those isn't shared between binning / nonbinning shaders. Fixes assert failures in cases where the sysvals didn't end up in the same registers for binning / nonbinning. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3168> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3168>	2019-12-19 11:31:12 -05:00
Jonathan Marek	5785bcc8a0	turnip: don't set SP_FS_CTRL_REG0_VARYING if only fragcoord is used Fixes artifacts in the subpasses demo, which has a shader using fragcoord without any varyings. It looks like setting this bit when there are no varyings can cause weirdness in some cases (without this change, if the previous shader had <= 8 varyings it would work, but with 9 varyings it would have artifacts). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3143> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3143>	2019-12-18 19:03:37 -05:00
Jonathan Marek	4a59bc6df2	turnip: add cache invalidate to fix input attachment cases Fixes artifacts in the subpasses demo. Workaround texture cache with input attachments from GMEM by adding a cache invalidate between subpasses. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3143>	2019-12-18 19:03:37 -05:00
Connor Abbott	648cc22afb	freedreno: Fix CP_MEM_TO_REG flag definitions These actually mean something completely different, at least on A5xx and A6xx. The only other usage of the old flags on something older than A6xx was a typo, so I don't know if it was always this way, but at the same time it means that we don't have to worry too much about that. Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>	2019-12-18 23:09:05 +01:00
Connor Abbott	4c5ac156c3	freedreno: Use new macros for CP_WAIT_REG_MEM and CP_WAIT_MEM_GTE Similar to the existing usage for CP_COND_WRITE5, this makes it clear what each of the magic parameters are for. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>	2019-12-18 23:09:00 +01:00
Connor Abbott	cfa1fb895a	a6xx: Add more CP packets And add fields uncovered by looking at the firmware. I think this covers all the memory, register, and scratch manipulation opcodes that exist on A6xx, plus one additional nice find for Vulkan and describing a previously unknown opcode and documenting CP_WAIT_REG_MEM. Note that the bits for the CP_REG_TO_MEM count, as well as the formula for computing the actual count for both CP_REG_TO_MEM and CP_MEM_TO_REG, are changed because the A630 SQE firmware actually does something different. I haven't investigated older microcodes to see whether this extends back to A5xx and A4xx, but the only non-A6xx uses of this field result in the same bit-pattern when using the A6xx bit range and formula, so it should be safe to change the definition universally. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>	2019-12-18 23:08:55 +01:00
Jonathan Marek	072e95e07a	freedreno/ir3: update prefetch input_offset when packing inlocs If the input location changes then prefetch input_offset needs to change. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3141> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3141>	2019-12-17 16:41:13 -05:00
Kristian H. Kristensen	9aaa23fbad	freedreno/a6xx: Document the CP_SET_DRAW_STATE enable bits There are bits for binning, gmem and sysmem. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3131> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3131>	2019-12-17 11:45:20 -08:00
Eric Anholt	2da68c8649	turnip: Fix support for immutable samplers. We were setting up the hardware sampler state when updating a combined image sampler, but never looking at the immutable sampler in the separate case. Fixes failures in dEQP-VK.binding_model.shader_access.primary_cmd_buf.sampler_immutable.fragment.* Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3127> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3127>	2019-12-16 19:51:27 -08:00
Jonathan Marek	edfc4daab8	turnip: don't set LRZ enable at end of renderpass Fixes hanging with cases that use more than one renderpass. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3122> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3122>	2019-12-17 00:59:00 +00:00
Jonathan Marek	c7c5a84cf3	freedreno/ir3: lower pack/unpack ops Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3106> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3106>	2019-12-16 19:20:07 -05:00
Eric Anholt	2d3182b429	turnip: Add support for descriptor arrays. I had a bigger rework I was working on, but this is simple and gets tests passing. Fixes 36 failures in dEQP-VK.binding_model.shader_access.primary_cmd_buf.sampler_mutable.fragment.* (now all passing) Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3124> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3124>	2019-12-16 23:57:22 +00:00
Eric Anholt	02d764b96a	turnip: Drop unused variable. We really need -Werror in CI. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3124>	2019-12-16 23:57:22 +00:00
Jonathan Marek	a3ea4805aa	turnip: remove duplicate A6XX_SP_CS_CONFIG_NIBO Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>	2019-12-16 21:04:42 +00:00
Jonathan Marek	2d3492bc62	turnip: change emit_ibo to be like emit_textures Adds missing alignment and error checking. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>	2019-12-16 21:04:42 +00:00
Jonathan Marek	718bd4f8b4	turnip: fix emit_ibo Based on the GL driver: -Compute needs different opcode (this fixes a GPU hang problem) -REG_A6XX_SP_IBO_LO/REG_A6XX_SP_CS_IBO_LO were swapped Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>	2019-12-16 21:04:42 +00:00
Jonathan Marek	65007d438c	turnip: remove compute emit_border_color Current tu6_emit_border_color doesn't work for compute and there's no example from the GL driver to base it on, so replace it with a finishme. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>	2019-12-16 21:04:42 +00:00
Jonathan Marek	c9b12c71d7	turnip: fix emit_textures for compute shaders Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>	2019-12-16 21:04:42 +00:00
Jonathan Marek	b936143327	freedreno/ir3: lower mul_2x32_64 lower_mul_2x32_64 generates mul_high opcodes, and lower_mul_high is done by nir_lower_alu, so call nir_lower_alu after nir_opt_algebraic. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-16 13:37:09 -05:00
Jonathan Marek	d4676d7a16	turnip: implement CmdFillBuffer/CmdUpdateBuffer Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-16 13:13:53 -05:00
Jonathan Marek	8d893a2071	turnip: don't require src image to be set for clear blits Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-16 13:13:53 -05:00
Jonathan Marek	f78c4251f1	turnip: use common blit path for buffer copy Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-16 13:13:53 -05:00
Jonathan Marek	d6c8aa2b72	turnip: use single substream cs Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-16 13:13:53 -05:00
Eric Anholt	f58ef5d481	turnip: Lower usub_borrow. Fixes dEQP-VK.glsl.builtin.function.integer.usubborrow.uvec2_mediump_fragment. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2986> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2986>	2019-12-16 04:52:09 +00:00
Rob Clark	3b8feefd9c	freedreno/ir3: add iterator macros So many open coded list iterators were getting annoying. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-13 09:25:40 -08:00
Rob Clark	ad92aa36ac	freedreno/ir3: add scheduler traces Add some infrastructure to trace scheduler decisions. The next patch will add some more traces, just splitting this out to reduce clutter. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-13 09:25:40 -08:00
Rob Clark	dd34ccb2c5	freedreno/ir3: add last-baryf shaderdb stat Sometimes sched changes that are a win in terms of instruction count and/or register pressure, are worse in real life, due to keeping varying storage locked for too long. Add a shader-db stat to give this more visibility. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-13 09:25:40 -08:00
Jonathan Marek	828f8f5531	turnip: implement subpass input attachments Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	3b4b5f549f	turnip: CmdClearAttachments fixes Partial depth/stencil clear and skipping unused attachments. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	aac7d6c1dc	turnip: subpass rework A renderpass is a tile load/store cycle. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	4322cf34c4	turnip: add dirty bit for push constants Fixes push constants not updating in some cases. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	27d2174508	turnip: no 8x msaa on 128bpp formats We don't have an entry for cpp 128 in the tile_alignment table, but I don't think the HW supports this at all (blob driver just doesn't have 8x msaa). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	5fd9fd3516	turnip: fix VK_IMAGE_ASPECT_STENCIL_BIT image view Use a special format which allows sampling the stencil and set the correct swizzle. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	e71f79f6c6	turnip: set FRAG_WRITES_SAMPMASK bit GPU hangs if SAMPMASK_REGID is used without this bit. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	99a4f7c79f	turnip: set load_layer_id to zero We don't have layered rendering and ir3 doesn't support this intrinsic, so just set it to zero for now. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	7bbcf7deff	turnip: update tile_align_w/tile_align_h It looks like the actual tile alignment requirement is less than 32x32, but in some cases input attachment texture needs 64 alignment. Reduced the h alignment to 16 to compensate and it seems to work fine. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	402bc111fc	turnip: fix tile layout logic Use DIV_ROUND_UP and stop trying to increase the tile_count width/height once tile_align_w/tile_align_h are reached. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	14cbe2dea5	turnip: fix hw binning render area Fix a mistake in the y2 coordinate. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	029322c100	freedreno/registers: add a6xx texture format for stencil sampler Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	2db03867f6	freedreno/ir3: add GLSL_SAMPLER_DIM_SUBPASS to tex_info Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	ab54aceaa8	turnip: fix incorrectly failing assert pColorBlendState is allowed to be NULL if subpass has >0 color attachments but they are all unused. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:16 -05:00
Kristian H. Kristensen	b6f8c42846	freedreno/a6xx: Silence warning for unused perf counters Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 22:25:47 +00:00
Kristian H. Kristensen	9b05466144	freedreno/registers: Add 64 bit address registers Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 22:25:47 +00:00
Kristian H. Kristensen	bdd98b892f	freedreno: New struct packing macros Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 22:25:47 +00:00
Kristian H. Kristensen	b27b0e8550	freedreno/registers: Remove duplicate register definitions Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 22:25:47 +00:00
Eric Anholt	8bf590b46b	tu: Move UBWC layout into fdl6_layout() and use that function. This gets us shared non-UBWC layout code between gallium and turnip. Until I fix up the rest of gallium to handle UBWC mipmapping, we do the single-level UBWC setup in gallium as a fixup after layout. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 04:24:18 +00:00
Eric Anholt	de619d7503	freedreno: Switch the 16-bit workaround to match what turnip does. Prevents regressions on argb1555 and rgb565 when making turnip use freedreno's layout. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 04:24:18 +00:00
Eric Anholt	d9cf3e76bd	freedreno: Move a6xx's setup_slices() to a shareable helper function. We pass in all the parameters for setting up the layout, though freedreno still sets a few of them up early (since it uses layout helpers in making some decisions about the layout setup parameters that will be cleaned up once krh's blitter work lands).	2019-12-11 04:24:18 +00:00
Eric Anholt	67258a44d2	tu: Move our image layout into a freedreno_layout struct. This lets us start using some of the fdl_* helpers and have more obviously matching code between gallium and turnip. We can't yet use the fdl_* UBWC helpers, since the gallium driver doesn't do UBWC mipmaps (which I'm working on in another branch). Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 04:24:18 +00:00
Eric Anholt	ea7631a9a6	freedreno: Move UBWC layout into a slices array like the non-UBWC slices. This is a little refactor in preparation for UBWC mipmapping support. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 04:24:18 +00:00
Eric Anholt	97be9503bb	freedreno: Drop the extra offset field for mipmap slices. We can just bake the UBWC-goes-first delta into the slices at setup time. I did have to fix up the resource shadowing swap path to swap the slice fields, as it was missing and regressed the format reinterprets otherwise. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 04:24:18 +00:00
Rob Clark	1b4c12d3ee	freedreno/a6xx: fix LRZ logic In particular, we need to invalidate the LRZ state when we cannot be confident in what the Z state would be during rendering: 1) depth test modes not supported by LRZ 2) stencil test, which would require full rasterization and stencil test in the binning pass (whereas LRZ normally just needs to determine the min and max z value in an 8x8 quad) Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-10 22:55:21 +00:00
Eric Anholt	0470a03769	freedreno: Track the set of UBOs to be uploaded in UBO analysis. We were iterating over the entire 32-entry array each time, when we can just use a bitset to know that we're only uploading from the first entry normally. Knocks ir3_emit_user_consts down from ~.5% of CPU to .1% on WebGL fishtank. Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-12-09 14:13:50 -08:00
Rob Clark	dc791d3c68	freedreno/fdperf: use drmOpen() Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-09 13:09:58 -08:00
Jonathan Marek	9d78cf4584	turnip: add hw binning Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-09 08:22:18 -05:00
Jonathan Marek	0796e7e70d	turnip: implement border color Fixes the deqp fails in: dEQP-VK.pipeline.sampler.border (minus 1d array/d24 cases which fail for other reasons) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-05 22:12:30 -05:00
Jonathan Marek	095d35eff8	turnip: improve emit_textures Two things: * Texture/sampler pointers aligned to the size of texture/sampler state * Returning errors instead of crashing on OOM Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-05 22:12:30 -05:00
Jonathan Marek	3ab4f99461	turnip: add function to allocate aligned memory in a substream cs To use with texture states that need alignment (texconst, sampler, border) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-05 22:12:29 -05:00
Eric Anholt	c3efeac4c6	turnip: Add support for compute shaders. Since compute shares the FS state with graphics, we have to re-upload the pipeline state when switching between compute dispatch and graphics draws. We could potentially expose graphics and compute as separate queues and then we wouldn't need pipeline state management, but the closed driver exposes a single queue and consistency with them is probably good. So far I'm emitting texture/ibo state as IBs that we jump to. This is kind of silly when we could just emit it directly in our CS, but that's a refactor we can do later. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Eric Anholt	ccf8230547	turnip: Move pipeline BO list adding to BindPipeline. We only need to do it once when we bind, rather than having to check at every draw call. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Eric Anholt	e26962f756	turnip: Sanity check that we're adding valid BOs to the list. I tripped over this during CS enabling when my program BO wasn't set up. Easier to debug this way than the kernel telling us a 0 handle is invalid. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Eric Anholt	4365e955d8	turnip: Add a helper function for getting tu_buffer iovas. Easier than remembering to add all 3 offsets. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Eric Anholt	70d6428be5	turnip: Refactor the graphics pipeline create implementation. The loop over the pipelines to create (and the failure handling) was noisy, and the stub for compute setup looked nicer to me. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Eric Anholt	e46da7dbea	turnip: Add basic SSBO support. This is enough to pass dEQP-VK.binding_model.shader_access.primary_cmd_buf.storage_buffer.fragment.single_descriptor.* with fragmentStoresAndAtomics set, and thus to be able to start working on compute. I haven't enabled that flag yet, because it also implies image load/store support, which I haven't filled in. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Eric Anholt	1f4e8f3c46	turnip: Reuse tu6_stage2opcode() more. A bit of cleanup for adding more stages later. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Eric Anholt	5b23671f6a	turnip: Drop redefinition of VALIDREG now that it's in ir3.h. Fixes: `937b905569` ("freedreno/ir3: fix neverball assert in case of unused VS inputs") Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Eric Anholt	bb49f19c1b	turnip: Fix unused variable warnings. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Jonathan Marek	ec28714b78	turnip: allow writes to draw_cs outside of render pass This is for state commands like CmdSetViewport that can be used outside of a renderpass. Accumulating those into draw_cs outside of the renderpass should have the desired effect. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-04 17:35:18 -05:00
Rob Clark	937b905569	freedreno/ir3: fix neverball assert in case of unused VS inputs The logic to ensure VS and BS inputs are aligned wasn't accounting for unused inputs in VS. This usually doesn't happen, but it seems it can in the case of ARB programs? Fixes assert: ``` fd6_program_create: Assertion `bs->inputs[i].regid == vs->inputs[i].regid' failed. ``` Fixes: `882d53d8e3` ("freedreno/ir3+a6xx: same VBO state for draw/binning") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-04 13:08:52 -08:00
Rob Clark	4e47c205b9	freedreno/ir3: remove store_output lowered to store_shared_ir3 Fixes crashes that were unnoticed in CI because debug_assert() was not enabled (but become real crashes after the next patch): dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_highp_geometry dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_lowp_geometry dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_mediump_geometry dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_highp_geometry dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_lowp_geometry dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_mediump_geometry Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-04 13:08:52 -08:00
Jonathan Marek	1576ff5fbb	turnip: MSAA resolve directly from GMEM Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-04 14:39:06 -05:00
Jonathan Marek	abaaf0b2e7	turnip: don't set unused BLIT_DST_INFO bits for GMEM clear These bits are ignored when clearing so don't bother setting them. Note: MSAA samples when clearing comes from other registers (tu6_emit_msaa) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-04 14:39:06 -05:00
Jonathan Marek	4babdc7381	turnip: implement CmdClearAttachments Passes these deqp tests: dEQP-VK.api.image_clearing.core.attachsingle* Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-04 14:39:06 -05:00
Jonathan Marek	1dfa2e6c99	turnip: don't skip unused attachments when setting up tiling config This makes it easier to find the gmem_offset associated with an attachment. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-04 14:39:06 -05:00
Jonathan Marek	bebfb17a2b	turnip: fix display wsi fence timing out Fixes: `df9f2adf` ("turnip: add display wsi") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-02 14:29:47 -05:00
Eric Anholt	424d5e4e11	turnip: Disable timestamp queries for now. They're not implemented, and not critical to bring up immediately. Avoids failures in the CTS when nothing gets written to the query. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-27 10:05:59 -08:00
Jonathan Marek	080c92e7d4	freedreno/perfcntrs/fdperf: add missing a2xx case in select_counter Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-11-27 12:11:57 -05:00
Jonathan Marek	98d7125b36	freedreno/perfcntrs/fdperf: add missing a20x compatible Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-11-27 12:11:57 -05:00
Jonathan Marek	24cde37e8d	freedreno/perfcntrs/fdperf: fix u64 print on 32-bit builds Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-11-27 12:11:57 -05:00
Jonathan Marek	baab4017b9	freedreno/perfcntrs: add a2xx MH counters Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-11-27 12:11:57 -05:00
Jonathan Marek	0d0c8a9e82	freedreno/registers: add missing MH perfcounter enum for a2xx Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-11-27 12:11:57 -05:00
Jonathan Marek	62ff90cc5e	turnip: fix integer render targets Add missing required bits. Fixes at least: dEQP-VK.pipeline.render_to_image.dedicated_allocation.1d.small.r16g16_sint_d24_unorm_s8_uint dEQP-VK.pipeline.render_to_image.dedicated_allocation.2d.mipmap.r16g16_sint_d24_unorm_s8_uint dEQP-VK.renderpass.dedicated_allocation.attachment.4.401 dEQP-VK.renderpass2.suballocation.formats.r16_uint.load.draw dEQP-VK.synchronization.op.single_queue.barrier.write_draw_read_copy_image_to_buffer.image_128x128_r16_uint Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-26 16:01:19 -08:00
Eric Anholt	930432577f	freedreno: Introduce a resource layout header. This will be used for sharing resource layout code between freedreno and tu. Mostly copied from a commit by Rob, with a new location and the slice struct renamed for consistency. Acked-by: Rob Clark <robdclark@chromium.org>	2019-11-26 18:46:07 +00:00
Jonathan Marek	773d640efa	turnip: implement UBWC This enables UBWC for everything except 3D textures. It breaks many image_to_image copies but those aren't important and it can be worked around later (image_to_image copy needs to be done in two steps, decode from the source format and then encode to the destination format). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-21 22:21:57 +00:00
Jonathan Marek	91fd83d142	freedreno/regs: update UBWC related bits Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-21 22:21:57 +00:00
Rob Clark	1a8c49d76c	freedreno/perfctrs/fdperf: periodically restore counters When GPU is idle and suspends, the currently selected countables will all reset to the first one. So periodically restore the selected countables. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-21 20:01:03 +00:00
Rob Clark	5a13507164	freedreno/perfcntrs: add fdperf Port from the envytools tree, but converted to use the .c tables for describing the perfcounter groups/countables, rather than using rnndec to get this at runtime from the register xml. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-21 20:01:03 +00:00
Rob Clark	b2338a5b00	freedreno/perfcntrs/a6xx: remove RBBM counters Currently these are getting blocked by the kernel. These counters don't seem to be the most useful ones, and to use them we'd have to somehow probe the kernel by submitting cmdstream to write the selector regs and see if that triggers a GPU fault. So let's just skip them. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-21 20:01:03 +00:00
Rob Clark	6a517b3079	freedreno/perfctrs/a2xx: move CP to be first group fdperf expects this, to find the ALWAYS_COUNT counter Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-21 20:01:03 +00:00
Rob Clark	e35c4e6ad2	freedreno/perfcntrs: add accessor to get per-gen tables Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-21 20:01:03 +00:00
Rob Clark	b21f03ae7e	freedreno/perfcntrs: move to shared location This should eventually be useful for VK_KHR_performance_query as well. And in the more near term, for fdperf. Attempt to not break android build is best-effort and untested. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-21 20:01:03 +00:00
Hyunjun Ko	02f4c39b8d	freedreno/ir3: enable half precision for pre-fs texture fetch Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Hyunjun Ko	407f8c71d3	freedreno/ir3: fixup when changing to mad.f16 Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Hyunjun Ko	d0f38394b1	freedreno/ir3: fix printing output registers of FS. Fixes: `cea39af2fb` ("freedreno/ir3: Generalize ir3_shader_disasm()") Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Neil Roberts	37f5395783	freedreno/ir3: Enabling lowering 16-bit flrp Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Hyunjun Ko	35124b0311	freedreno: support 16b for the sampler opcode Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Neil Roberts	b934716bd8	freedreno/ir3: Implement f2b16 and i2b16 Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Neil Roberts	030b046df8	freedreno/ir3: Add implementation of nir_op_b16csel Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Neil Roberts	f0a046024d	freedreno/ir3: Support 16-bit comparison instructions v2. [Hyunjun Ko (zzoon@igalia.com)] Avoid using too much open code like "instr->regs[n]->flags |= FOO" v3. [Hyunjun Ko (zzoon@igalia.com)] Remove redundant code for both 16b and 32b operations. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Hyunjun Ko	138542499f	freedreno/ir3: cleanup by removing repeated code Prep-work for the corresponding patch. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Eric Anholt	bdf03b738d	turnip: Drop the copy of the formats table. Now that we can (mostly) generate a pipe format for a VkFormat, use that to answer queries about formats. This will let us refactor the freedreno format table surface layout code to be shared between gallium and vulkan. This causes us to expose fewer formats for now (on a 1/100 CTS run I'm doing, skips go from 3671 to 3835 out of 5145 tests). Fails stay about the same (478 -> 434, but the run is pretty flaky and we're doing fewer tests now). v2: Rebase on master, throw a finishme on missing vk-to-pipe formats that tu used to support. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> (v1) Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-11-19 15:35:52 -08:00
Jonathan Marek	d2cf3cad91	turnip: fix sRGB GMEM clear Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-19 21:35:37 +00:00
Jonathan Marek	d68acdb3b9	turnip: implement CmdClearColorImage/CmdClearDepthStencilImage Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-19 21:35:37 +00:00
Jonathan Marek	3cd44839fa	turnip: add x11 wsi Copied from radv Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-18 22:18:05 +00:00
Jonathan Marek	df9f2adfa3	turnip: add display wsi Copied from radv (minus the fence change) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-18 22:18:05 +00:00
Jonathan Marek	75e58d1fae	freedreno/registers: fix a6xx_2d_blit_cntl ROTATE A change from `b7093882` got overwritten by `610c8c93` Fixes: `610c8c93` ("freedreno/registers: Update with GS, HS and DS registers") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-17 17:40:53 +00:00
Jonathan Marek	0f5743429c	freedreno/ir3: disable texture prefetch for 1d array textures Prefetch only supports the basic 2D texture case, checking is_array is needed because 1d array textures pass the coord num_components==2 test. Fixes: `2a0d45ae` ("freedreno/ir3: Add a NIR pass to select tex instructions eligible for pre-fetch") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-17 17:01:18 +00:00
Eric Anholt	882ca6dfb0	util: Move gallium's PIPE_FORMAT utils to /util/format/ To make PIPE_FORMATs usable from non-gallium parts of Mesa, I want to move their helpers out of gallium. Since u_format used util_copy_rect(), I moved that in there, too. I've put it in a separate directory in util/ because it's a big chunk of related code, and it's not clear to me whether we might want it as a separate library from libmesa_util at some point. Closes: #1905 Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-14 10:47:20 -08:00
Rob Clark	0f33c255d3	freedreno/ir3: remove unused parameter Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-12 13:57:52 -08:00
Rob Clark	df7a88dca3	freedreno/ir3: legalize cleanups We can clear the "needs" flags once we emit a flag. And also, don't open-code the opcode name. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:57:52 -08:00
Rob Clark	b22617fb57	freedreno/ir3: fix gpu hang with pre-fs-tex-fetch For pre-fs-dispatch texture fetch, we need to assign bary_ij to r0.x, even if it is not used in the shader (ie. only varying use is for tex coords). But if, for example, gl_FragCoord is used, it could get assigned on top of bary_ij, resulting in a GPU hang. The solution to this is two-fold: (1) the inputs/outputs rework has the benefit of making RA realize bary_ij is a vec2, even if there are no split/collect instructions (due to no varying fetches in the shader itself). And (2) extend the live ranges of meta:input instructions to the first non-input, to prevent RA from assigning the same register to multiple inputs. Backport note: because of (1) above, a better solution for 19.3 would be to revert `f30c256ec0`. Fixes: `f30c256ec0` ("freedreno/ir3: enable pre-fs texture fetch for a6xx") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:57:52 -08:00
Rob Clark	4bb697d938	freedreno/ir3: only tex instructions have wrmask At the ir3 level, we would assume that we could use wrmask to mask off other components of an instruction returning a vecN when they are not used. Which would let RA use components not written for other live values. But this is only true for tex instructions. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:57:52 -08:00
Rob Clark	bdf6b7018c	freedreno/ir3: re-work shader inputs/outputs Allow inputs/outputs to be vecN (ie. whatever their actual size is), and use split to get scalar components of inputs, and collect to gather up scalar components of outputs. The main motivation is to simplify RA, by only having to consider split/ collect to figure out where values need to land in consecutive scalar registers, rather than having to also deal with left/right neighbors. Because of varying packing, and the resulting fractional location (location_frac), to implement load_input/store_output, it is still convenient to have a table of scalar inputs/outputs. We move this to the compile ctx (since it is only needed for nir->ir3). Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:57:52 -08:00
Rob Clark	2aae13f642	freedreno/ir3: simplify creating sysval inputs In almost all places, the add_sysval_input() is paired directly with a create_input(). (The one exception is frag shader ij bary coord, and this exception will go away in a later patch.) So go ahead and clean this up before reworking input/output handling. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Rob Clark	68d2ec5f7e	freedreno/ir3: remove first-vertex sysval This is a driver-param (loaded from uniform), not a sysval (populated by hw into a register). So it has no value to having a sysval slot. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Rob Clark	7b2166785a	freedreno/ir3: helper to print ir if debug enabled Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Rob Clark	7a5f073da3	freedreno/ir3: show input/output wrmask's in disasm Currently it is always 0x1 (scalar), but that will change in a later patch. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Rob Clark	c00a67171c	freedreno/ir3: add input/output iterators We can at least get rid of the if-not-NULL check in a bunch of places. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Rob Clark	b2417801e5	freedreno/ir3: remove impossible condition We keep kill's alive w/ keeps these days, rather than a fake output. This condition was left over from prior to that change. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Rob Clark	611258d578	freedreno/ir3: rename fanin/fanout to collect/split If I'm going to refactor a bit to use these meta instructions to also handle input/output, then might as well cleanup the names first. Nouveau also uses collect/split for names of these meta instructions, and I like those names better. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Rob Clark	4af86bd0b9	freedreno/ir3: remove half-precision output This doesn't really work, we can't necessarily just change the outputs to half-precision like this in anything but simple cases. Keep the shader key entry around though, eventually with proper mediump support we could use this with a nir pass to use lower precision frag shader outputs when the render target format has <= 16b/component. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Rob Clark	089b105396	freedreno/ir3: fix valgrind complaint with STLW The instruction has 3 src regs, so `instr->regs[0..3]` are valid, but `instr->regs[4]` is not. ``` Test case 'dEQP-GLES31.functional.shaders.linkage.es31.tessellation.varying.rules.output_superfluous_declaration'.. ==29239== Invalid read of size 8 ==29239== at 0x5BE9CDC: emit_cat6 (ir3.c:841) ==29239== by 0x5BEA1BF: ir3_assemble (ir3.c:921) ==29239== by 0x5BDF0A7: ir3_shader_assemble (ir3_shader.c:133) ==29239== by 0x5BDF193: assemble_variant (ir3_shader.c:162) ==29239== by 0x5BDF407: create_variant (ir3_shader.c:215) ==29239== by 0x5BDF4DB: shader_variant (ir3_shader.c:241) ==29239== by 0x5BDF553: ir3_shader_get_variant (ir3_shader.c:257) ==29239== by 0x5BA85F7: ir3_shader_variant (ir3_gallium.c:80) ==29239== by 0x5BA7703: ir3_cache_lookup (ir3_cache.c:96) ==29239== by 0x5B8B8B3: fd6_emit_get_prog (fd6_emit.h:119) ==29239== by 0x5B8C137: fd6_draw_vbo (fd6_draw.c:186) ==29239== by 0x5BB1FBB: fd_draw_vbo (freedreno_draw.c:290) ==29239== Address 0xb97f2d0 is 0 bytes after a block of size 240 alloc'd ==29239== at 0x4848D54: malloc (in /usr/lib/aarch64-linux-gnu/valgrind/vgpreload_memcheck-arm64-linux.so) ==29239== by 0x61BD35B: ralloc_size (ralloc.c:119) ==29239== by 0x61BD41B: rzalloc_size (ralloc.c:151) ==29239== by 0x5BE599B: ir3_alloc (ir3.c:45) ==29239== by 0x5BEA583: instr_create (ir3.c:984) ==29239== by 0x5BEA5DF: ir3_instr_create2 (ir3.c:1000) ==29239== by 0x5BEE317: ir3_STLW (ir3.h:1431) ==29239== by 0x5BF12D3: emit_intrinsic_store_shared_ir3 (ir3_compiler_nir.c:903) ==29239== by 0x5BF418B: emit_intrinsic (ir3_compiler_nir.c:1802) ==29239== by 0x5BF5D07: emit_instr (ir3_compiler_nir.c:2339) ==29239== by 0x5BF603F: emit_block (ir3_compiler_nir.c:2426) ==29239== by 0x5BF624B: emit_cf_list (ir3_compiler_nir.c:2474) ==29239== ``` Probably this only triggers in non-optimized builds? 
Fixes: `1f3b52ce50` ("freedreno/a6xx: Add register offset for STG/LDG") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Jonathan Marek	01cae57c80	freedreno: add Adreno 640 ID A640 seems to work without any other changes (glmark and vkcube). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-11 20:46:01 -05:00
Rob Clark	a3dc975ee7	freedreno/ir3: also track # of nops for shader-db The instruction count is (mostly) a measure of what optimization passes can do, while # of nops is more an indication of how effectively the scheduler is balancing register pressure vs instruction count. So track these independently. (There could be opportunities to rematerialize values to reduce register pressure, swapping some nop's with other alu instructions, so nothing is truly independent, but it is still useful to break these stats out.) Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:15 +00:00
Rob Clark	5f45818673	freedreno/ir3: sync disasm changes from envytools Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:15 +00:00
Rob Clark	df229977c3	freedreno/ir3: remove obsolete comment The meta PHI instruction was removed long ago. And fanin/fanout themselves do not contribute actual instructions (at least not by the time you get to sched, they may prevent copy-propagating away a mov) Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:15 +00:00
Rob Clark	e804b42fd7	freedreno/ir3/ra: remove ir print after livein/out The IR hasn't changed at this point, so it isn't really adding any value. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:15 +00:00
Rob Clark	8b92052f10	freedreno/ir3/ra: move regs_count==0 check Fold it in to writes_gpr() (since an instruction that does not reference any registers by definition does not write a register). This lets us avoid having to handle this case in a few other places. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:15 +00:00
Rob Clark	bd21c73d3f	freedreno/ir3: ir3_print tweaks Handle HALF/HIGH flags in all cases, and colorize SSA src notation. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:15 +00:00
Rob Clark	5da10704bb	freedreno/ir3: use SSA flag on dest register too We did this in some places before, but not consistently. But it will be useful for two-pass RA, to identify which registers have already been assigned. While we are cleaning this up, use __ssa_src() and new __ssa_dst() helper more consistently. (If nothing else, this reduces the # of callers of ir3_reg_create() to audit that we didn't miss something) Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:14 +00:00
Rob Clark	8449f6183f	freedreno/ir3: split pre-coloring to its own function Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:14 +00:00
Kristian H. Kristensen	4a4fad7f40	freedreno/ir3: Use regid() helper when setting up precolor regs Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:46:21 -08:00
Kristian H. Kristensen	47e2c19511	freedreno/a6xx: Program state for tessellation stages Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	7272e8a709	freedreno/ir3: Allocate const space for tessellation parameters The tessellation stages need size and stride of the patch layout as well as locations of attributes in the patch. The tessellation stages also use two system memory BOs and need the iovas of those. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	8739ea3ab5	freedreno/ir3: Pre-color TCS header and primitive ID inputs Similar to GS, the registers are shared and not reinitialized between VS and TCS, so we need to make sure to allocate the same registers for the system values between stages. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	b12ebe3e81	freedreno/ir3: Don't assume binning shader is always VS In tessellation mode, the TES is (probably) the binning shader. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	3cedeba7c9	freedreno/ir3: Setup inputs and outputs for tessellation stages Similar to GS, some inputs are reused when the chsh from VS to TCS or TES to GS, so we need to make sure we setup the right inputs and make the shared system values outputs so they don't get clobbered. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	e28fbbd861	freedreno/ir3: Implement TCS synchronization intrinsics We add two new IR3 specific nir intrinsics that map to the new condend and endpatch instructions. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	4915231b8a	freedreno/ir3: Implement tess coord intrinsic Our lowering pass made the z component unused by replacing its uses by 1 - x - y. The intrinsic implementation then just needs to return the x and y components. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:37:08 -08:00
Kristian H. Kristensen	e16e48d00c	freedreno/ir3: End TES with chsh when using GS When we have both TES and GS, the TES needs to chain to the VS with chmask and chsh GS just like the VS does to either TCS or GS. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:37:05 -08:00
Kristian H. Kristensen	581cd59692	freedreno/ir3: Add new synchronization opcodes There are two new opcodes in use in tessellation control shaders: category 0, opcodes 13 and 15. unk13 is a kill type of instruction that terminates threads where !p0.x and is used to narrow down a patch wavefront to just thread 0. Then, once thread 0 has written the tess levels, it issues unk15, which might signal the TE that another patch has been fully written. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:37:02 -08:00
Kristian H. Kristensen	56ed835bff	freedreno/ir3: Extend geometry lowering pass to handle tessellation VS and TCS pass varyings the same way as VS and GS do. TCS then writes entire patch to a system memory BO and TES eventually reads back from the BO once the TE starts generating vertices. TES outputs vertices the same way as VS and GS, except when there's a GS as well, in which case TES passes varyings to GS same way the VS would. In addition, the TCS needs a little bit of control flow massaging so that it only runs for valid invocations, and needs a couple of unknown instructions to synchronize with the TE. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:59 -08:00
Kristian H. Kristensen	8621fbc37b	freedreno/ir3: Add tessellation field to shader key Whether we're tessellating and which primitives the TES outputs affects the entire pipeline so let's add a field to the key to track that. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:56 -08:00
Kristian H. Kristensen	77b96b843e	freedreno/ir3: Use imul24 in offset calculations With the imul24 opcode in place, we can now use it for computing local offsets (ie for ldlw/stlw). Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:53 -08:00
Kristian H. Kristensen	41984c8422	freedreno/ir3: Add ir3 intrinsics for tessellation These provide the iovas for system memory buffers used for tessellation as well as a new HW specific system value. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:50 -08:00
Kristian H. Kristensen	fe450ef4cf	freedreno/ir3: Add load and store intrinsics for global io These intrinsics take an ivec2 for the 64 bit base address and an integer offset. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:44 -08:00
Kristian H. Kristensen	1f3b52ce50	freedreno/a6xx: Add register offset for STG/LDG These instructions take a 64 bit iova as two consecutive registers and an immediate offset. This patch adds support for the offset to be a single register, which is added to the 64 bit iova. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:39 -08:00
Kristian H. Kristensen	3d16ec4a71	freedreno/a6x: Rename z/s formats What we call eRB6_Z24_UNORM_S8_UINT now is actually RB6_Z24_UNORM_S8_UINT_AS_R8G8B8A8 and RB6_X8Z24_UNORM is actually RB6_Z24_UNORM_S8_UINT. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:36 -08:00
Kristian H. Kristensen	50124afe34	freedreno/a6xx: Fix layered texture type enum 2D array textures and 3D textures are different enum values after all. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:33 -08:00
Kristian H. Kristensen	7fed7c2a7d	freedreno/a6xx: Clear sysmem with CP_BLIT Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:28 -08:00
Kristian H. Kristensen	835f8d1ba1	freedreno/registers: Add comments about primitive counters Adding comments about best guess at what the counters count. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:19 -08:00
Kristian H. Kristensen	96968d0ba2	freedreno/registers: Move SP_PRIMITIVE_CNTL and SP_VS_VPC_DST Move these two to be in order with the other VS regs. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:16 -08:00
Kristian H. Kristensen	ba54f7dd03	freedreno/registers: Fix typo Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:35:27 -08:00
Eric Engestrom	2f652e0b36	meson: move the generic symbols check arguments to a common variable Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers>	2019-11-05 20:30:47 +00:00
Eric Engestrom	2c4395e61c	meson: add variable to control the symbols checks Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers>	2019-11-05 20:12:32 +00:00
Dylan Baker	ee4f1bc187	util: rename PIPE_ARCH_*_ENDIAN to UTIL_ARCH_*_ENDIAN As requested by Tim. This was generated with: grep 'PIPE_ARCH_._ENDIAN' -rIl | xargs sed -ie 's@PIPE_ARCH_\(.\)_ENDIAN@UTIL_ARCH_\1_ENDIAN@'g v2: - add this patch Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-05 16:39:55 +00:00
Dylan Baker	f9f60da813	util/u_endian: set PIPE_ARCH_*_ENDIAN to 1 This will allow it to be used as a drop in replacement for _mesa_little_endian in a number of cases. v2: - Always define PIPE_ARCH_LITTLE_ENDIAN and PIPE_ARCH_BIG_ENDIAN, define the one that reflects the host system to 1 and the other to 0 - replace all uses of #ifdef, #ifndef, and #if defined() with #if and #if ! with PIPE_ARCH_*_ENDIAN Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-05 16:39:55 +00:00
Bas Nieuwenhuizen	72f858fc07	turnip: Remove _mesa_locale_init/fini calls. The resulting locale is not used for Vulkan, and it is not reference counted, giving issues when multiple instances are created. CC: 19.2 19.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-31 09:47:56 +00:00
Jonathan Marek	fa3baeab76	freedreno/a2xx: add missing vertex formats (SSCALE/USCALE/FIXED) Mostly for vertex formats, but they are supported as texture formats too (untested however). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-10-30 18:04:17 +00:00
Timothy Arceri	1909bc526d	util: remove LIST_IS_EMPTY macro Just use the inlined function directly. The new function was introduced in `addcf410`. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:39 +00:00
Timothy Arceri	7f106a2b5d	util: rename list_empty() to list_is_empty() This makes it clear that it's a boolean test and not an action (eg. "empty the list"). Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Eric Engestrom	32cff3781a	tu: fix empty-body instruction Fixes: `8d43e2b2de` ("meson: add -Werror=empty-body to disallow `if(x);`") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-27 22:10:31 +00:00
Rob Clark	bc67b892d0	freedreno/ir3: handle the progress case In some cases, in particular when you have things that can be src modifiers ((abs)/(neg)), once eliminating one mov, there is a possibility to remove another. Handle this by re-visiting an instruction after eliminating a copy on one of its srcs. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 13:08:56 -07:00
Rob Clark	97b24efd9f	freedreno/ir3: remove restrictions on const + (abs)/(neg) These date back to relatively early days of ir3, when a lot was still not well understood. But according to CI (and what I've seen blob driver do), these are not actually real restrictions. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 13:08:56 -07:00
Rob Clark	e665e65f96	freedreno/ir3: allow copy-propagate out of fanout Now that we fixed the sharp edges that this was papering over, we can relax the restriction about eliminating a mov coming out of a fanout (for example from result of texture fetch). Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 13:08:56 -07:00
Rob Clark	3ac328875e	freedreno/ir3: treat high vs low reg as conversion This avoids copy-propagating a high register into an instruction which cannot consume it. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 13:08:56 -07:00
Rob Clark	9e211b57b8	freedreno/ir3: propagate dest flags for collect/fanin We did this properly already for split/fanout. But collect was missed. Extract out a helper to share. This way we avoid copy propagating a mov from high or half reg into an instruction which cannot consume a high/half reg. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 13:08:56 -07:00
Rob Clark	49ab94694d	freedreno/ir3: make high regs easier to see in IR dumps Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 13:08:56 -07:00
Rob Clark	0f395f0933	freedreno/ir3: debug cleanup 1) deduplicate IR3_SHADER_DEBUG=disasm versus fs/vs/etc handling 2) standardize shader stage name prints, in particular VERT vs BVERT 3) don't mix stderr and stdout Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 13:08:56 -07:00
Rob Clark	aa8515463e	freedreno/ir3: fixup register footprint fixup Small typo resulted in not converting footprint to vec4, meaning that we could potentially ask for quite a few more registers than required Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-22 17:46:19 +00:00
Rob Clark	4c060235a2	freedreno/ir3: handle scalarized varying inputs If the load_interpolated_input is scalarized, we would be too conservative about deciding the tex instruction wasn't a candidate to pre-fetch: vec1 32 ssa_0 = load_const (0x00000000 /* 0.000000 */) vec2 32 ssa_1 = intrinsic load_barycentric_pixel () (0) /* interp_mode=0 */ vec1 32 ssa_2 = intrinsic load_interpolated_input (ssa_1, ssa_0) (0, 0) /* base=0 */ /* component=0 */ /* packed:v_uv,v_uv1 */ vec1 32 ssa_3 = intrinsic load_interpolated_input (ssa_1, ssa_0) (0, 1) /* base=0 */ /* component=1 */ /* packed:v_uv,v_uv1 */ vec2 32 ssa_8 = vec2 ssa_2, ssa_3 vec4 32 ssa_9 = tex ssa_8 (coord), 0 (texture), 0 (sampler) Really we don't care that the texcoord components come from different load_interpolated_input instructions, just that they have consecutive varying offsets. Reported-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-22 17:46:19 +00:00
Marijn Suijten	0141a4cdc0	freedreno/ir3: Add missing ir3_nir_lower_tex_prefetch.c to Android.mk This file is created in `2a0d45ae6c` but addition to android makefiles was omitted. It breaks the build with missing references which are defined in this file. List the file in ir3_SOURCES to make the build succeed. Signed-off-by: Marijn Suijten <marijns95@gmail.com>	2019-10-21 22:43:00 +00:00
Rhys Perry	8b98d0954e	nir/lower_idiv: add new llvm-based path v2: make variable names snake_case v2: minor cleanups in emit_udiv() v2: fix Panfrost build failure v3: use an enum instead of a boolean flag in nir_lower_idiv()'s signature v4: remove nir_op_urcp v5: drop nv50 path v5: rebase v6: add back nv50 path v6: add comment for nir_lower_idiv_path enum v7: rename _nv50/_llvm to _fast/_precise v8: fix etnaviv build failure Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-21 18:49:46 +00:00
Rob Clark	1cea76274e	freedreno/ir3: handle imad24_ir3 case in UBO lowering Similar to iadd, we can fold an added constant value from an imad24_ir3 into the load_uniform's constant offset. This avoids some cases where the addition of imad24_ir3 could otherwise be a regression in instr count. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	d9424e5821	freedreno/ir3: add imul24 opcode This maps to mul.s24 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	c7b8f16bee	freedreno/ir3: optimize immed 2nd src to mad We can't encode immed sources for cat3 (mad) instructions, but we can use const in first or third src. We handled this case already, but we weren't considering that we could lower immed to const. For manhattan: total instructions in shared programs: 35202 -> 34718 (-1.37%) instructions in affected programs: 14931 -> 14447 (-3.24%) helped: 90 HURT: 0 total full in shared programs: 2451 -> 2359 (-3.75%) full in affected programs: 653 -> 561 (-14.09%) helped: 69 HURT: 2 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 15:08:54 -07:00
Rob Clark	666b6236f7	freedreno/ir3: add rule to generate imad24 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	5e08f070f0	nir: add nir_lower_amul pass Lower amul to either imul or imul24, depending on whether 24b is enough bits to calculate an offset within the thing being dereferenced. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-10-18 15:08:54 -07:00
Eduardo Lima Mitev	bc2ccdc45a	freedreno/ir3: Handle newly added opcode nir_op_imad24_ir3 Simply emit an ir3_MAD_S24 instruction in the backend. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	6ad442acae	freedreno/ir3: rename mul.s/mul.u to mul.s24/mul.u24, to better reflect that these are 24b multiply. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	f30c256ec0	freedreno/ir3: enable pre-fs texture fetch for a6xx Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	72048dd799	turnip: add support for pre-fs texture fetch Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Hyunjun Ko	e9450ad27d	freedreno/ir3: Add support for texture sampling pre-dispatch Signed-off-by: Eduardo Lima Mitev <elima@igalia.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Eduardo Lima Mitev	2a0d45ae6c	freedreno/ir3: Add a NIR pass to select tex instructions eligible for pre-fetch The pass should run once at the end of shader compilation, for a4xx onwards. It iterates texture sampling instructions and marks those eligible for pre-dispatch by changing the tex op from 'tex' to 'tex_prefetch'. An instruction is eligible if: * The coordinate is a vector where all its components come from a shader input. * The order of the components matches exactly that of the input (no swizzles). * The instruction is in the 'main' function, and in the outermost block. The first two restrictions were arrived at empirically, so more testing could tighten or loosen them. The 3rd restriction is there to allow moving the instructions eligible for pre-dispatch to the beginning of the shader, so that we don't block the registers holding the result for too long. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	7d4213fe88	freedreno/ir3: force i/j pixel to r0.x It seems that pre-fs texture fetch only works if ij_pix ends up in r0.x. I've tried unknown zero bits, to no avail, and blob also seems to force r0.x when this feature is used. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	07e9bf564f	freedreno/ir3: add pre-dispatch tex fetch to disasm Useful to see in disassembly listing texture fetches that were moved to pre-dispatch. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	2b93eb9c76	freedreno/ir3: add dummy bary.f(ei) for pre-fs-fetch If the only use of varyings is a pre-shader texture-fetch, we still need to issue a bary.f with the end-input flag, otherwise we'll block further VS invocations, as the hw will think varying storage is still busy. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	392a309a55	freedreno/ir3: fixup register footprint to account for prefetch It is possible that the result of a pre-fs texture fetch is an output (or partially an output) of the FS. Since the meta:tex_prefetch instructions are dropped before the assembler, we need to account for this when we fixup the register footprint. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	482e1b9955	freedreno/ir3: add meta instruction for pre-fs texture fetch Add a placeholder instruction to track texture fetches made prior to FS shader dispatch. These, like meta:input instructions are scheduled before any real instructions, so that RA realizes their result values are live before the first real instruction. And to give legalize a way to track usage of fetched sample requiring (sy) sync flags. There is some related special handling for varying texcoord inputs used for pre-fs-fetch, so that they are not DCE'd and remain in linkage between FS and previous stage. Note that we could almost avoid this special handling by giving meta:tex_prefetch real src arguments, except that in the FS stage, inputs are actual bary.f/ldlv instructions. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	11e467c378	freedreno/ir3: don't DCE ij_pix if used for pre-fs-texture-fetch When we enable pre-dispatch texture fetch, we could have a scenario where the barycentric i/j coord sysval is not used in the shader, but only used for the varying fetch for the pre-dispatch texture fetch. In this case we need to take care not to DCE this sysval. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	af817a44c1	freedreno/ir3: track sysval slot for inputs Will be needed for special handling of SYSTEM_VALUE_BARYCENTRIC_PIXEL (ij_pix) when pre-fs texture fetch is enabled. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	35692fab86	freedreno/ir3: remove unused ir3_instruction::inout Not sure I remember how long this has been unused for. But it's unused now. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Hyunjun Ko	fd14788e1f	freedreno/ir3: Add data structures to support texture pre-fetch Signed-off-by: Eduardo Lima Mitev <elima@igalia.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	766a68cdb9	freedreno: update registers Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Kristian H. Kristensen	622afc8dbd	freedreno/a6xx: Implement PIPE_QUERY_PRIMITIVES_GENERATED for GS When we don't have streamout enabled, we have to read this register to get the number of primitives emitted. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	d6ed39e20e	freedreno/ir3: End VS with CHMASK and CHSH in GS pipelines When used in a GS pipeline, the VS doesn't end with the END instruction. Instead it chains to the GS, which continues running with the same register allocation. The intended use case seems to be that you can compile a regular VS (ie outputs in registers and ending with END) but then tack on link-time generated code past the END to write the outputs using STLW, in case the VS is used with GS. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	4b7312b763	freedreno/ir3: Start GS with (ss) and (sy) We don't know what kind of loads we might have to wait on when coming in from chsh in the VS so set both sync flags. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	c347708bea	freedreno/ir3: Pre-color GS header and primitive ID These sysvals have to be unclobbered by VS and in the same registers in both VS and GS, since the chsh from VS to GS doesn't reload the values. We use the pre-color argument to ir3_ra() to always place these values in r0.x and r0.y. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	ce08fddbbe	freedreno/ir3: Setup ir3 inputs and outputs for GS Inputs are the GS header, which contains vertex ID, local primitive ID and thread ID as well as primitive ID. The setup is a little different from other sysvals, since we always have to receive them in the VS so that it can pass them on into the GS. The vertex flag output from GS is set up as a proper nir output in the lowering pass and doesn't need special handling here. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	0293d14719	freedreno/ir3: Implement primitive layout intrinsics This implements the load_vs_primitive_stride_ir3, load_vs_vertex_stride_ir3 and load_primitive_location_ir3 intrinsics, used for getting the primitive layout strides and locations. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	8e16fb1528	freedreno/ir3: Implement lowering passes for VS and GS This introduces two new lowering passes. One to lower VS to explicit outputs using STLW and one to lower GS to load input using LDLW and implement the GS specific functionality. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	8f39985b01	freedreno/ir3: Add has_gs flag to shader key Since the presence of GS changes how the VS operates we need to track that in the shader key. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	0324706764	freedreno/ir3: Add intrinsics that map to LDLW/STLW These intrinsics will let us do all the offset calculations in nir, which is nicer to work with and lets nir_opt_algebraic eat it all up. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	436d125adf	freedreno/ir3: Add new LDLW/STLW instructions These access memory used for passing data between geometry stages. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	956d319446	freedreno/ir3: Extend RA with mechanism for pre-coloring registers We'll need to pre-color certain input registers between VS and GS shaders. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	0b6625d825	freedreno/ir3: Use third register for offset for LDL and LDLV Before, offset held the offset, which can be either immediate or a register. Use a third register to hold the offset so that we can use a register. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	3a93e60e7b	freedreno/ir3: Add support for CHSH and CHMASK instructions Just add the constructors for now and special case similar to END so we don't remove them. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	610c8c938e	freedreno/registers: Update with GS, HS and DS registers Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Erik Faye-Lund	71c0dcf266	nir: support feeding state to nir_lower_clip_[vg]s Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00

984 Commits