KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Samuel Pitoiset	e36e260c42	radv: add mipmap support for the TC-compat zrange bug Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 15:56:55 +02:00
Bas Nieuwenhuizen	d062bec48d	radv: Hash Wave32 settings in shader key. Can result in different shaders. Fixes: `8a86908e9a` "radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders" Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-12 13:32:18 +00:00
Bas Nieuwenhuizen	3a5950f501	radv: Add device argument for dcc compression check. Because it is about to be generation dependent. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen	8c63ffe54d	radv: Disable compression for compute DCC decompress store. Previously we relied on stores not using DCC but that is going to change, so disable compression explicitly. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen	216a9d8871	radv: Add extra struct to image view creation. For extra args. Unlike image creation, I'm not embedding the vk struct in there, so all the inline structs can be kept. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen	66131ceb8b	radv: Pass through render loop detection to internal layout decisions. And do nothing with it yet. Everything outside a renderpass has no render loop. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen	a171a6663d	radv: Add render loop detection in renderpass. VK spec 7.3: "Applications must ensure that all accesses to memory that backs image subresources used as attachments in a given renderpass instance either happen-before the load operations for those attachments, or happen-after the store operations for those attachments." So the only renderloops we can have is with input attachments. Detect these. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen	a7041f3b4e	radv: Store image view also outside framebuffer. So we can use it with imageless framebuffers. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-02 22:19:16 +02:00
Bas Nieuwenhuizen	49e6c2fb78	radv: Store color/depth surface info in attachment info instead of framebuffer. That way we can use it for imageless framebuffers. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-02 22:18:51 +02:00
Bas Nieuwenhuizen	72e7b7a00b	ac/nir,radv: Optimize bounds check for 64 bit CAS. When the application does not ask for robust buffer access. Only implemented the check in radv. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-02 21:21:55 +02:00
Samuel Pitoiset	e8110e51c6	radv: fix image_has_{cmask,fmask}() helpers The driver should now rely on cmask_offset because CMASK can be disabled by the driver for some reasons (eg. mipmaps). Apply the same change for FMASK, although it should be useless. Fixes: `ad1bc8621d` ("radv: remove radv_get_image_fmask_info()") Fixes: `10d08da52c` ("radv/gfx10: add missing dcc_tile_swizzle tweak") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-02 14:00:50 +02:00
Samuel Pitoiset	ad1bc8621d	radv: remove radv_get_image_fmask_info() It's unnecessary to duplicate fields in another struct. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-02 13:34:46 +02:00
Samuel Pitoiset	9c9745e8dd	radv: remove radv_get_image_cmask_info() It's unnecessary to duplicate fields in another struct. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-02 13:34:41 +02:00
Samuel Pitoiset	8a86908e9a	radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders It can be enabled with RADV_PERFTEST=gewave32. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-02 09:37:36 +02:00
Samuel Pitoiset	953bbacc23	radv/gfx10: add Wave32 support for fragment shaders It can be enabled with RADV_PERFTEST=pswave32. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-02 09:37:34 +02:00
Samuel Pitoiset	ea38565011	radv/gfx10: add Wave32 support for compute shaders It can be enabled with RADV_PERFTEST=cswave32. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-31 09:35:04 +02:00
Daniel Schürmann	45638e14fb	radv: Don't include radv_private.h from radv_shader.h This patch decouples radv_shader.h from any LLVM dependency. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-30 10:29:11 +02:00
Bas Nieuwenhuizen	4058b354c5	radv: Set FLUSH_ON_BINNING_TRANSITION. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-23 21:26:59 +02:00
Dave Airlie	2ac2b98780	radv: fix crash in shader tracing. Enabling tracing, and then having a vmfault, can lead to a segfault before we print out the traces, as if a meta shader is executing and we don't have the NIR for it. Just pass the stage and give back a default. Fixes: `9b9ccee4d6` ("radv: take LDS into account for compute shader occupancy stats") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 11:00:25 +10:00
Samuel Pitoiset	24b1b1f574	radv: add an option for disabling NGG on GFX10 Will be useful for testing the legacy path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-17 15:43:36 +02:00
Samuel Pitoiset	ed53d2c4be	radv/gfx10: disable the TC compat zrange workaround Unnecessary. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-17 08:32:36 +02:00
Samuel Pitoiset	ae4b1fc095	radv/gfx10: always build the GS copy shader but uses it on-demand It should be possible to build it on-demand too but it requires more work. On GFX10, the GS copy shader is required when tess is enabled with extreme geometry. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-17 08:32:30 +02:00
Samuel Pitoiset	4dcdc4cdc5	radv: allow to select DST_SEL with RELEASE_MEM Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-16 11:16:57 +02:00
Samuel Pitoiset	5bbcb3f5bc	radv/gfx10: implement support for GS as NGG Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-11 15:45:53 +02:00
Samuel Pitoiset	ee21bd7440	radv/gfx10: implement NGG support (VS only) This needs to be cleaned up a bit, and it probably contains missing stuff and/or bugs. This doesn't fix the "half of the triangles" issue. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Bas Nieuwenhuizen	d0978427cb	radv/gfx10: Use new uconfig reg index packet for GFX10+. Otherwise the hardware/firmware seems to not set the registers. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	2481ac81d3	radv/gfx10: implement radv_initialise_ds_surface() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	e80f189de0	radv/gfx10: implement radv_initialise_color_surface() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	3dc5ec5d16	radv/gfx10: generate gfx10_format_table.h Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Bas Nieuwenhuizen	726a31df70	radv: Add the concept of radv shader binaries. This simplifies a bunch of stuff by (1) Keeping all the things in a single allocation, making things easier for the cache. (2) creating a shader_variant creation helper. This is immediately put to use by creating rtld shader binaries. This is the main reason for the binaries, as we need to do the linking at upload time, i.e. post caching. We do not enable rtld yet. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-04 10:52:26 +00:00
Samuel Pitoiset	8ea7ee1536	radv: rename and re-document cache flush flags SMEM and VMEM caches are L0 on gfx10. Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-25 18:38:37 +02:00
Samuel Pitoiset	5411f47056	radv: set DISABLE_CONSTANT_ENCODE_REG to 1 for Raven2 Ported from RadeonSI, will be emitted for GFX10 too. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-25 16:45:15 +02:00
Samuel Pitoiset	34bef8a0d7	radv: clear CMASK layers instead of the whole buffer on GFX8 This reduces the size of fill operations needed to clear CMASK for layered color textures. GFX9 unsupported for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-25 16:36:28 +02:00
Samuel Pitoiset	476b907a3b	radv: clear FMASK layers instead of the whole buffer on GFX8 This reduces the size of fill operations needed to clear FMASK for layered color textures. GFX9 unsupported for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-25 16:36:25 +02:00
Samuel Pitoiset	e67fc11c26	radv: pass sample locations for transitions before depth/stencil resolves HTILE decompressions need the user sample locations if specified in the current subpass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-21 14:50:35 +02:00
Samuel Pitoiset	5cf350f565	radv: implement all depth/stencil resolve modes using compute This path supports layers but it requires to decompress HTILE before resolving. The driver also needs to fixup HTILE after the resolve. This path is probably slower than the graphics one. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-21 14:50:19 +02:00
Samuel Pitoiset	cdc6efddf9	radv: implement all depth/stencil resolve modes using graphics When using graphics, the driver doesn't need to decompress HTILE before resolving. This path currently doesn't support layers so we have to fallback to the compute path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-21 14:50:15 +02:00
Samuel Pitoiset	e52ad9f845	radv: record if a render pass has depth/stencil resolve attachments Only supported with vkCreateRenderPass2(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-21 14:50:12 +02:00
Samuel Pitoiset	ac6369a2d0	radv: rename has_resolve to has_color_resolve Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-21 14:50:10 +02:00
Samuel Pitoiset	e91c1ea06c	radv: implement compressed FMASK texture reads with RADV_PERFTEST=tccompatcmask This allows us to disable the FMASK decompress pass when transitioning from CB writes to shader reads. This will likely be improved and enabled by default in the future. No CTS regressions on GFX8 but a few number of multisample CTS failures on GFX9 (they look related to the small hint). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-19 10:06:39 +02:00
Samuel Pitoiset	7971697efe	radv: store the DCC predicate for each mip Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 22:20:53 +02:00
Samuel Pitoiset	38aa386e96	radv: store the FCE predicate for each mip Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 22:20:53 +02:00
Samuel Pitoiset	7295512037	radv: store the fast color clear values for each mip Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 22:20:53 +02:00
Bas Nieuwenhuizen	4107590911	radv: Decompress DCC when the image format is not allowed for buffers. Otherwise the buffer loads/stores in the bufimage meta operations fail. If we decompress DCC then we can use the "canonical" format compatible with the not-supported format. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-17 10:56:50 +00:00
Daniel Schürmann	c58dff753c	radv: enable AMD_shader_ballot with RADV_PERFTEST_SHADER_BALLOT ('shader_ballot') Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Samuel Pitoiset	e7677a697b	radv: handle sample locations during automatic layout transitions From the Vulkan spec 1.1.109: "Some implementations may need to evaluate depth image values while performing image layout transitions. To accommodate this, instances of the VkSampleLocationsInfoEXT structure can be specified for each situation where an explicit or automatic layout transition has to take place. [...] and VkRenderPassSampleLocationsBeginInfoEXT can be chained from VkRenderPassBeginInfo to provide sample locations for layout transitions performed implicitly by a render pass instance." Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:11:11 +02:00
Samuel Pitoiset	d0d41e58c3	radv: determine the first subpass id for every attachments Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:11:08 +02:00
Bas Nieuwenhuizen	9701cb1034	radv: Use bo metadata for imported image tiling on Android. This way we handle linear images etc. correctly. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-04 18:32:45 +00:00
Samuel Pitoiset	da26013eb7	radv: implement VK_EXT_sample_locations and disable it Basically, this extension allows applications to use custom sample locations. It doesn't support variable sample locations during subpass. Note that we don't have to upload the user sample locations because the spec doesn't allow this. The extension is currently disabled because the driver needs to support variable sample locations during layout transitions. The depth decompress needs to know them and that's a bit invasive. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-30 09:52:16 +02:00
Samuel Pitoiset	eaeaad25f7	radv: sync before resetting a pool if there is active pending queries Make sure to sync all previous work if the given command buffer has pending active queries. Otherwise the GPU might write queries data after the reset operation. This fixes a bunch of new dEQP-VK.query_pool.* CTS failures. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-29 08:47:54 +02:00
Samuel Pitoiset	a7763ddcf2	radv: clean up the sample locations codebase Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-22 08:36:35 +02:00
Samuel Pitoiset	135dff8dcf	radv: remove remaining code related to 16 samples The driver only supports up to 8 samples. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-22 08:36:33 +02:00
Marek Olšák	ccfcb9d818	ac: rename SI-CIK-VI to GFX6-GFX7-GFX8 Acked-by: Dave Airlie <airlied@redhat.com> We already use GFX9 and I don't want us to have confusing naming in the driver. GFXn naming is better from the driver perspective, because it's the real version of the gfx portion of the hw. Also, CIK means Bonaire-Kaveri-Kabini, it doesn't mean CI. It shouldn't confuse our SDMA, UVD, VCE etc. code much. Those have nothing to do with GFXn and they have their own version numbers.	2019-05-15 20:54:10 -04:00
Bas Nieuwenhuizen	1619f20883	radv: Clean up signalled and submitted fields from winsys fences. Other types like syncobj do not need it, so lets make things a bit more uniform. Also reduce confusion what the signalled/submitted referred to (especially with imported fences) Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-05-13 20:36:29 +00:00
Bas Nieuwenhuizen	d6dfb2cf50	radv: Add support for icd loader interface v4. Adds support for physical device functions unknown to the loader. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-13 00:41:31 +02:00
Bas Nieuwenhuizen	8139efbbbd	radv: Use given stride for images imported from Android. Handled similarly as radeonsi. I checked the offsets are actually used. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-06 15:36:39 +00:00
Samuel Pitoiset	08be23bfde	radv: set WD_SWITCH_ON_EOP=1 when drawing primitives from a stream output buffer According to RadeonSI, this seems to be required by the hardware to avoid GPU hangs. I think I just forgot to set that bit when I implemented VK_EXT_transform_feedback. This fixes a GPU hang with Space Engineers and DXVK. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110291 Fixes: `b4eb029062` ("radv: implement VK_EXT_transform_feedback") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-02 15:55:46 +02:00
Bas Nieuwenhuizen	5564c38212	radv: Update descriptor sets for multiple planes. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	b2cfa231d0	radv: Add support for image views with multiple planes. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	65c4f612aa	radv: Add ycbcr conversion structs. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	66507cc656	radv: Add single plane image views & meta operations. Copies & clear of multiplane images is not allowed so we do not have to handle that case. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	42d159f276	radv: Add multiple planes to images. No functional changes. This temporarily uses plane 0 for everything. Long term plan is that only single plane images get to use metadata like htile/dcc/cmask/fmask. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	3c2e8267d0	radv: Add support for driconf. This includes 0 options. The cache parsing is located at a position where we can easily add config filtering by VkApplicationInfo. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-23 23:49:39 +00:00
Bas Nieuwenhuizen	8d2654a419	radv: Support VK_EXT_inline_uniform_block. Basically just reserve the memory in the descriptor sets. On the shader side we construct a buffer descriptor, since AFAIU VGPR indexing on 32-bit pointers in LLVM is still broken. This fully supports update after bind and variable descriptor set sizes. However, the limits are somewhat arbitrary and are mostly about finding a reasonable division of a 2 GiB max memory size over the set. v2: - rebased on top of master (Samuel) - remove the loading resources rework (Samuel) - only load UBO descriptors if it's a pointer (Samuel) - use LLVMBuildPtrToInt to avoid IR failures (Samuel) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v2)	2019-04-19 09:21:47 +02:00
Bas Nieuwenhuizen	5f5ac19f13	radv: Implement VK_EXT_pipeline_creation_feedback. Does what it says on the tin. The per stage time is only an approximation due to linking and the Vega merged stages. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-03-20 21:19:46 +00:00
Samuel Pitoiset	a66b186beb	radv: use typed buffer loads for vertex input fetches This drastically reduces the number of SGPRs because the driver now uses descriptors per vertex binding, instead of per vertex attribute format. 29077 shaders in 15096 tests Totals: SGPRS: 1354285 -> 1282109 (-5.33 %) VGPRS: 909896 -> 908800 (-0.12 %) Spilled SGPRs: 24840 -> 24811 (-0.12 %) Code Size: 49221144 -> 48986628 (-0.48 %) bytes Max Waves: 243930 -> 244229 (0.12 %) Totals from affected shaders: SGPRS: 390648 -> 318472 (-18.48 %) VGPRS: 288432 -> 287336 (-0.38 %) Spilled SGPRs: 94 -> 65 (-30.85 %) Code Size: 11548412 -> 11313896 (-2.03 %) bytes Max Waves: 86460 -> 86759 (0.35 %) This gives a really tiny boost. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-13 13:31:11 +01:00
Samuel Pitoiset	0b9a06a1a0	radv: store more vertex attribute infos as pipeline keys They are required for using typed buffer loads. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-13 13:31:08 +01:00
Bas Nieuwenhuizen	7631feaa00	radv: Sync ETC2 whitelisted devices. Fixes: `4bb6c49375` "radv: Allow ETC2 on RAVEN and VEGA10 instead of all GFX9." Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-02-20 02:55:41 +01:00
Samuel Pitoiset	210aec3612	radv: store vertex attribute formats as pipeline keys The formats will be used for reducing the number of loaded channels. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-14 09:10:09 +01:00
Samuel Pitoiset	1b8983c25b	radv: fix using LOAD_CONTEXT_REG with old GFX ME firmwares on GFX8 This fixes a critical issue. Cc: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109575 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-12 17:39:30 +01:00
Samuel Pitoiset	5806d99984	radv: gather more info about push constants This is needed in order to inline some push constants when possible. This also adds a new helper for initializing the pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-12 17:25:34 +01:00
Samuel Pitoiset	6430616e77	radv: track if subpasses have color attachments Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:19:14 +01:00
Samuel Pitoiset	5699ac0078	radv: determine the last subpass id for every attachments Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:17:59 +01:00
Samuel Pitoiset	a20c2e38d8	radv: store the list of attachments for every subpass This reworks how the depth stencil attachment is used for simplicity. This also introduces radv_render_pass_compile() helper that will be used for further optimizations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:17:54 +01:00
Samuel Pitoiset	a7c7d811f1	radv: move subpass image transitions to radv_cmd_buffer_begin_subpass() Instead of doing them in radv_cmd_buffer_set_subpass(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:17:52 +01:00
Samuel Pitoiset	545552c9b9	radv: remove unused radv_render_pass_attachment::view_mask Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:17:42 +01:00
Timothy Arceri	9b9ccee4d6	radv: take LDS into account for compute shader occupancy stats Ported from `d205faeb6c`. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-01 22:25:30 +11:00
Samuel Pitoiset	5f0b17d581	radv: compute the GFX9 fence VA at allocation time Instead of doing every time we emit cache flushes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-23 11:31:12 +01:00
Samuel Pitoiset	bd098884f1	radv: remove old_fence parameter from si_cs_emit_write_event_eop() This parameter is actually useless as the immediate value can always be zero. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-23 11:31:07 +01:00
Rhys Perry	e4c6423c5e	radv: avoid context rolls when binding graphics pipelines It's common in some applications to bind a new graphics pipeline without ending up changing any context registers. This makes a pipeline have two command buffers: one for setting context registers and one for everything else. The context register command buffer is only emitted if it differs from the previous pipeline's. v2: ensure late scissor emission is done when radv_emit_rbplus_state() is called v2: make use of cmd_buffer->state.workaround_scissor_bug v3: rename "workaround_scissor_bug" to "context_roll_without_scissor_emitted" Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 14:37:53 +00:00
Rhys Perry	5564a797f2	radv: add missed situations for scissor bug workaround v2: rename "workaround_scissor_bug" to "context_roll_without_scissor_emitted" Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 14:37:53 +00:00
Samuel Pitoiset	d58b11e709	radv: get rid of bunch of KHR suffixes Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-09 12:26:48 +01:00
Bas Nieuwenhuizen	656c1c488c	radv: Remove device path. unused and gcc complains about strncpy. (from what I can see because strncpy does not leave a 0 byte on truncate. That said we don't use it so this does not fix a real bug). Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-07 23:15:14 +01:00
Samuel Pitoiset	6b976024a8	radv: add support for FMASK expand Original patch by Dave Airlie. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 18:01:17 +01:00
Samuel Pitoiset	fa16da53d8	radv: initialize FMASK for images in fully expanded mode The value depends on the number of samples. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 18:01:15 +01:00
Samuel Pitoiset	3a5adc2879	radv: add a predicate for reflecting DCC decompression state It's somehow similar to the FCE predicate. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-13 09:21:10 +01:00
Samuel Pitoiset	3fbdcd942f	amd: remove support for LLVM 6.0 Users are encouraged to switch to LLVM 7.0, released in September 2018. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-06 14:02:56 +01:00
Samuel Pitoiset	824cfc1ee5	radv: rework the TC-compat HTILE hardware bug with COND_EXEC After investigating on this, it appears that COND_WRITE doesn't work correctly in some situations. I don't know exactly why does it fail to update DB_Z_INFO.ZRANGE_PRECISION, but as AMDVLK also uses COND_EXEC I think there is a reason. Now the driver stores a new metadata value in order to reflect the last fast depth clear state. If a TC-compat HTILE is fast cleared with 0.0f, we have to update ZRANGE_PRECISION to 0 in order to work around that hardware bug. This fixes rendering issues with The Forest and DXVK and doesn't seem to introduce any regressions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108914 Fixes: `68dead112e` ("radv: update the ZRANGE_PRECISION value for the TC-compat bug") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-05 09:26:31 +01:00
Samuel Pitoiset	724107553c	radv: implement fast HTILE clears for depth or stencil only on GFX9 This allows to fast clear the depth part (or the stencil part) of a depth+stencil surface when HTILE is enabled. I didn't test on GFX8, so it's disabled currently. This gives a very nice boost, for example when clearing the depth aspect of a 4096x4096 D32_SFLOAT_S8_UINT image (18x faster). BEFORE: 235 us AFTER: 13 us Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-19 16:32:18 +01:00
Samuel Pitoiset	483a28bfd4	radv: tidy up radv_set_dcc_need_cmask_elim_pred() This is just a small cleanup. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-19 14:05:33 +01:00
Samuel Pitoiset	c571ca7a08	radv: replace si_emit_wait_fence() with radv_cp_wait_mem() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-05 09:48:50 +01:00
Samuel Pitoiset	b1b2dd06a7	radv: add missing TFB queries support to CmdCopyQueryPoolsResults() Cc: 18.3 <mesa-stable@lists.freedesktop.org> Fixes: `b4eb029062` ("radv: implement VK_EXT_transform_feedback") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-05 09:48:43 +01:00
Samuel Pitoiset	b4eb029062	radv: implement VK_EXT_transform_feedback This implementation should work and potential bugs can be fixed during the release candidates window anyway. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-29 17:10:58 +01:00
Samuel Pitoiset	f4fa8de794	radv: gather stream output info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-29 17:09:08 +01:00
Samuel Pitoiset	79bbdf8e45	radv: implement image to image operations for R32G32B32 This should address the remaining failures in Batman Arkham City. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107765 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-26 10:50:08 +02:00
Samuel Pitoiset	593996bc02	radv: implement buffer to image operations for R32G32B32 This should fix rendering issues with Batman Arkham City. We will probably need to implement itob and itoi at some point, but currently nothing hits these paths. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107765 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-16 09:22:38 +02:00
Bas Nieuwenhuizen	6ed0fd24d4	radv: Implement VK_EXT_pci_bus_info. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-10-15 12:27:49 +02:00
Samuel Pitoiset	229803b66a	radv: implement clear operations for R32G32B32 This fixes crashes for some CTS: dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.color..linear__* dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.color.._linear_* Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108113 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-11 14:49:16 +02:00
Samuel Pitoiset	621e70dd40	radv: adjust the CmdUpdateBuffer threshold for optimal performance According to my benchmark results, it appears that we should reduce the threshold to 1024. BEFORE: 1 KB: 68.656000 ms 2 KB: 118.368000 ms AFTER: 1 KB: 31.760000 ms 2 KB: 29.840000 ms Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-28 09:08:44 +02:00
Samuel Pitoiset	3871dd7a92	radv: allow to force anisotropy via RADV_TEX_ANISO Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-18 13:27:58 +02:00
Samuel Pitoiset	c79aad30ae	radv: emit the initial config only once in the preambles It shouldn't be needed to emit the initial graphics or compute state when beginning a new command buffer. Emitting them in the preamble should be enough and this will reduce IB sizes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-14 10:59:52 +02:00
Bas Nieuwenhuizen	fbcd167314	radv: Add on-demand compilation of built-in shaders. In environments where we cannot cache, e.g. Android (no homedir), ChromeOS (readonly rootfs) or sandboxes (cannot open cache), the startup cost of creating a device in radv is rather high, due to compiling all possible built-in pipelines up front. This meant depending on the CPU a 1-4 sec cost of creating a Device. For CTS this cost is unacceptable, and likely for starting random apps too. So if there is no cache, with this patch radv will compile shaders on demand. Once there is a cache from the first run, even if incomplete, the driver knows that it can likely write the cache and precompiles everything. Note that I did not switch the buffer and itob/btoi compute pipelines to on-demand, since you cannot really do anything in Vulkan without them and there are only a few. This reduces the CTS runtime for the no caches scenario on my threadripper from 32 minutes to 8 minutes. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-08-14 10:26:24 +02:00
Bas Nieuwenhuizen	806a792b43	radv: Make fs key exemplars ordered to be a reverse fs_key lookup. While at it, share the exemplars and account for a non-occurring fs key. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-08-14 10:26:06 +02:00
Samuel Pitoiset	0a8127bbfb	radv: make use of radv_subpass_barrier() when resolving subpasses The goal is to use radv_barrier()/radv_subpass_barrier() as much as possible for further optimizations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:11 +02:00
Samuel Pitoiset	e45ba51ea4	radv: add support for VK_EXT_conditional_rendering Inherited command buffers are not supported. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-18 13:44:09 +02:00
Samuel Pitoiset	946cf3f39f	radv: add support for non-inverted conditional rendering By default, our internal rendering commands are discarded only if the predicate is non-zero (ie. DRAW_VISIBLE). But VK_EXT_conditional_rendering also allows to discard commands when the predicate is zero, which means we have to use a different flag. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-18 13:44:06 +02:00
Samuel Pitoiset	5b32926f7e	radv: remove unnecessary verification code around ring_offsets_idx I don't want to waste CPU cycles for nothing. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 11:08:42 +02:00
Samuel Pitoiset	1f616a840e	radv: emit a dummy ZPASS_DONE to prevent GPU hangs on GFX9 A ZPASS_DONE or PIXEL_STAT_DUMP_EVENT (of the DB occlusion counters) must immediately precede every timestamp event to prevent a GPU hang on GFX9. Cc: 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 10:22:36 +02:00
Samuel Pitoiset	fe28978f2a	radv: introduce radv_subpass_attachment data structure Needed for VK_KHR_create_renderpass2. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 10:20:06 +02:00
Samuel Pitoiset	4a67ce886a	radv: make sure to wait for CP DMA when needed This might fix some synchronization issues. I don't know if that will affect performance but it's required for correctness. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-11 12:11:56 +02:00
Samuel Pitoiset	f2a310849e	radv: only flush CB meta in pipeline image barriers when needed If the given image doesn't enable CMASK, FMASK or DCC, it's useless to flush CB metadata. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-05 17:20:16 +02:00
Dave Airlie	7398913a62	ac/radv: move llvm compiler info to struct and init in one place This ports radv to the shared code, however due to a bug in LLVM versions prior to 7, radv cannot add target info at this stage, as it would leak one for every shader compile, however I'd prefer to keep this llvm damage in the shared code, since it isn't the driver at fault here. We just add a flag to denote if the driver can support leaking the target info or not, and the common code does the right thing depending on the llvm version. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 10:29:16 +10:00
Dave Airlie	e1387eaf12	radv: create/destroy passmgr at the higher level. This is prep work for moving this to a per-thread struct Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:31:05 +10:00
Samuel Pitoiset	7a57c82767	radv: use separate bind points for the dynamic buffers The Vulkan spec says: "pipelineBindPoint is a VkPipelineBindPoint indicating whether the descriptors will be used by graphics pipelines or compute pipelines. There is a separate set of bind points for each of graphics and compute, so binding one does not disturb the other." CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-27 09:48:31 +02:00
Samuel Pitoiset	9c09e7d66e	radv: remove unused 'predicated' parameter from some functions It's always false. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-27 09:48:15 +02:00
Samuel Pitoiset	fa42fa1a60	radv: emit PIPELINESTAT_{START,STOP} events for pipeline stats queries Ported from RadeonSI. This appears to fix some random fails with: dEQP-VK.query_pool.statistics_query.* Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-26 18:23:16 +02:00
Keith Packard	1df586be12	radv: add VK_EXT_display_control to radv driver [v5] This extension provides fences and frame count information to direct display contexts. It uses new kernel ioctls to provide 64-bits of vblank sequence and nanosecond resolution. v2: Rework fence integration into the driver so that waiting for any of a mixture of fence types (wsi, driver or syncobjs) causes the driver to poll, while a list of just syncobjs or just driver fences will block. When we get syncobjs for wsi fences, we'll adapt to use them. v3: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v4: Adapt to WSI fence API change. It now returns VkResult and no longer has an option for relative timeouts. v5: wsi_register_display_event and wsi_register_device_event now use the default allocator when NULL is provided, so remove the computation of 'alloc' here. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-23 07:59:00 -07:00
Samuel Pitoiset	65b3fed037	radv: always initialize the clear depth/stencil values to 0 Similar to the clear color values. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-20 13:21:42 +02:00
Samuel Pitoiset	204cf5714a	radv: always initialize the clear color values to 0 Having random data in there is probably not the best. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-20 13:21:42 +02:00
Samuel Pitoiset	20170865db	radv: don't store the number of samples as log2 Needed for the following patch. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-20 13:21:42 +02:00
Keith Packard	451b58a51e	radv: Add KHR_display extension to radv [v5] This adds support for the KHR_display extension to the radv Vulkan driver. The driver now attempts to open the master DRM node when the KHR_display extension is requested so that the common winsys code can perform the necessary operations. v2: * Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to vulkan_wsi_args Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> v3: Adapt to new wsi_device_init API (added display_fd) v4: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v5: Add vkCreateDisplayModeKHR. This doesn't actually create new modes, it only looks to see if the requested parameters matches an existing mode and returns that. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	da997ebec9	vulkan: Add KHR_display extension using DRM [v10] This adds support for the KHR_display extension to the vulkan WSI layer. Driver support will be added separately. v2: * fix double ;; in wsi_common_display.c * Move mode list from wsi_display to wsi_display_connector * Fix scope for wsi_display_mode and wsi_display_connector allocs * Switch all allocations to vk_zalloc instead of vk_alloc. * Fix DRM failure in wsi_display_get_physical_device_display_properties When DRM fails, or when we don't have a master fd (presumably due to application errors), just return 0 properties from this function, which is at least a valid response. * Use vk_outarray for all property queries This is a bit less error-prone than open-coding the same stuff. * Remove VK_COMPOSITE_ALPHA_INHERIT_BIT_KHR from surface caps Until we have multi-plane support, we shouldn't pretend to have any multi-plane semantics, even if undefined. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> * Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to vulkan_wsi_args Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> v3: Add separate 'display_fd' and 'render_fd' arguments to wsi_device_init API. This allows drivers to use different FDs for the different aspects of the device. Use largest mode as display size when no preferred mode. If the display doesn't provide a preferred mode, we'll assume that the largest supported mode is the "physical size" of the device and report that. v4: Make wsi_image_state enumeration values uppercase. Follow more common mesa conventions. Remove 'render_fd' from wsi_device_init API. The wsi_common_display code doesn't use this fd at all, so stop passing it in. This avoids any potential confusion over which fd to use when creating display-relative object handles. Remove call to wsi_create_prime_image which would never have been reached as the necessary condition (use_prime_blit) is never set.
whitespace cleanups in wsi_common_display.c Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Add depth/bpp info to available surface formats. Instead of hard-coding depth 24 bpp 32 in the drmModeAddFB call, use the requested format to find suitable values. Destroy kernel buffers and FBs when swapchain is destroyed. We were leaking both of these kernel objects across swapchain destruction. Note that wsi_display_wait_for_event waits for anything to happen. wsi_display_wait_for_event is simply a yield so that the caller can then check to see if the desired state change has occurred. Record swapchain failures in chain for later return. If some asynchronous swapchain activity fails, we need to tell the application eventually. Record the failure in the swapchain and report it at the next acquire_next_image or queue_present call. Fix error returns from wsi_display_setup_connector. If a malloc failed, then the result should be VK_ERROR_OUT_OF_HOST_MEMORY. Otherwise, the associated ioctl failed and we're either VT switched away, or our lease has been revoked, in which case we should return VK_ERROR_OUT_OF_DATE_KHR. Make sure both sides of if/else brace use matches Note that we assume drmModeSetCrtc is synchronous. Add a comment explaining why we can idle any previous displayed image as soon as the mode set returns. Note that EACCES from drmModePageFlip means VT inactive. When vt switched away drmModePageFlip returns EACCES. Poll once a second waiting until we get some other return value back. Clean up after alloc failure in wsi_display_surface_create_swapchain. Destroy any created images, free the swapchain. Remove physical_device from wsi_display_init_wsi. We never need this value, so remove it from the API and from the internal wsi_display structure. Use drmModeAddFB2 in wsi_display_image_init. This takes a drm format instead of depth/bpp, which provides more control over the format of the data. 
v5: Set the 'currentStackIndex' member of the VkDisplayPlanePropertiesKHR record to zero, instead of indexing across all displays. This value is the stack depth of the plane within an individual display, and as the current code supports only a single plane per display, should be set to zero for all elements Discovered-by: David Mao <David.Mao@amd.com> v6: Remove 'platform_display' bits from the build and use the existing 'platform_drm' instead. v7: Ensure VK_ICD_WSI_PLATFORM_MAX is large enough by setting to VK_ICD_WSI_PLATFORM_DISPLAY + 1 v8: Simplify wsi_device_init failure from wsi_display_init_wsi by using the same pattern as the other wsi layers. Adopt Jason Ekstrand's white space and variable declaration suggestions. Declare variables at first use, eliminate extra whitespace between types and names, add list iterator helpers, switch to lower-case list_ macros. Respond to Jason's April 8 review: * Create a function to convert relative to absolute timeouts to catch overflow issues in one place * use VK_NULL_HANDLE to clear prop->currentDisplay * Get rid of available_present_modes array. * return OUT_OF_DATE_KHR when display_queue_next called after display has been released. * Make errors from mode setting fatal in display_queue_next * Remove duplicate pthread_mutex_init call * Add wsi_init_pthread_cond_monotonic helper function to isolate pthread error handling from wsi_display_init_wsi Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v9: Fix vscan handling by using MAX2(vscan, 1) everywhere. Vscan can be zero anywhere, which is treated the same as 1. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v10: Respond to Vulkan CTS failures. 1. Initialize planeReorderPossible in display_properties code 2. Only report connected displays in get_display_plane_supported_displays 3. Return VK_ERROR_OUT_OF_HOST_MEMORY when pthread cond initialization fails. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> 4. Add vkCreateDisplayModeKHR. 
This doesn't actually create new modes, it only looks to see if the requested parameters matches an existing mode and returns that. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Keith Packard <keithp@keithp.com>	2018-06-19 14:17:46 -07:00
Marek Olšák	6703fec58c	amd,radeonsi: rename radeon_winsys_cs -> radeon_cmdbuf Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-19 13:08:50 -04:00
Samuel Pitoiset	fa8bc821a8	radv: clean up radv_{set,load}_depth_clear_regs() helpers And replace _regs by _metadata because it makes more sense. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:54:04 +02:00
Samuel Pitoiset	be794fa26b	radv: clean up radv_{set,load}_color_clear_regs() helpers And replace _regs by _metadata because it makes more sense. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:53:58 +02:00
Dave Airlie	600d34c822	radv: remove multisample bit from shader key. This wasn't being used anywhere inside the shader from what I can see. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 09:33:20 +10:00
Alex Smith	7ca0167ae9	radv: Consolidate GFX9 merged shader lookup logic This was being handled in a few different places, consolidate it into a single radv_get_shader() function. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: "18.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-01 08:53:31 +01:00
Bas Nieuwenhuizen	b9fb2c266a	radv: Add startup debug option. This adds a RADV_DEBUG=startup option to dump more info about instance creation and device enumeration. A common question end users have is why the driver is not loading for them, and this has two common reasons: 1) They did not install the driver. 2) AMDGPU is not used for the card in the kernel. This adds some info messages so we can easily get some useful output from end users. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-31 11:51:23 +02:00
Bas Nieuwenhuizen	38933c1151	radv: Add option to print errors even in optimized builds. Errors are not that common of a case so we can eat a slight perf hit in having to call a function and do a runtime check. In turn this makes debugging random errors happening for end users easier, because they don't have to have a debug build on hand. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-31 11:51:23 +02:00
Bas Nieuwenhuizen	729f7373de	radv: Make the sem_info allocate/free functions static. They are only used in 1 file. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-31 11:51:23 +02:00
Samuel Pitoiset	21baf33a94	radv: allow radv_emit_shader_pointer_head() to emit more pointers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-29 10:07:16 +02:00
Samuel Pitoiset	288fe7ec71	radv: split radv_emit_shader_pointer() This will allow to emit consecutive shader pointers for reducing the number of emitted SET_SH_REG packets, which is recommended. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-29 10:07:13 +02:00
Samuel Pitoiset	36a4d6d081	radv: add support for 32-bit pointers in user data SGPRs We still use 64-bit GPU pointers for all ring buffers because llvm.amdgcn.implicit.buffer.ptr doesn't seem to support 32-bit GPU pointers for now. This can be improved later anyways. Vega10: Totals from affected shaders: SGPRS: 1008722 -> 1026710 (1.78 %) VGPRS: 706580 -> 707136 (0.08 %) Spilled SGPRs: 22555 -> 22209 (-1.53 %) Spilled VGPRs: 75 -> 75 (0.00 %) Code Size: 34819208 -> 35202140 (1.10 %) bytes Max Waves: 175423 -> 175086 (-0.19 %) Polaris10: Totals from affected shaders: SGPRS: 1029849 -> 1036517 (0.65 %) VGPRS: 709984 -> 708872 (-0.16 %) Spilled SGPRs: 22672 -> 22309 (-1.60 %) Spilled VGPRs: 82 -> 66 (-19.51 %) Scratch size: 76 -> 60 (-21.05 %) dwords per thread Code Size: 34915336 -> 35309752 (1.13 %) bytes Max Waves: 151221 -> 151677 (0.30 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:22 +02:00
Samuel Pitoiset	fcba3934fc	radv: add radv_emit_shader_pointer() helper For future work (support for 32-bit GPU pointers). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 21:28:59 +02:00
Samuel Pitoiset	1e86eaf7d8	radv: remove radv_device::llvm_supports_spill It's always true. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 13:48:21 +02:00
Bas Nieuwenhuizen	3d4d388e39	radv: Fix up 2_10_10_10 alpha sign. Pre-Vega HW always interprets the alpha for this format as unsigned, so we have to implement a fixup to do the sign correctly for signed formats. v2: Improve indexing mess. CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-14 18:58:20 +02:00
Timothy Arceri	ce188813bf	radv: add initial support for VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT When VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT is set we skip NIR linking optimisations and only run over the NIR optimisation loop once similar to the GLSLOptimizeConservatively constant used by some GL drivers. We need to run over the opts at least once to avoid errors in LLVM (e.g. dead vars it can't handle) and also to reduce the time spent compiling the IR in LLVM. With this change the Blacksmith Unity demos compilation times go from 329760 ms -> 299881 ms when using Wine and DXVK. V2: add bit to radv_pipeline_key Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106246	2018-05-13 09:58:33 +10:00
Matthew Nicholls	97d57ef917	radv: fix multisample image copies Previously before `fb077b0728`, the LOD parameter was being used in place of the sample index, which would only copy the first sample to all samples in the destination image. After that multisample image copies wouldn't copy anything from my observations. This fixes some copy_and_blit CTS tests. v3.1: - set lod to 0 for nir_txf_ms (Samuel) v2: - use GLSL_SAMPLER_DIM_MS instead of 2D (Samuel) - updated commit description (Samuel) Fix this properly by copying each sample in a separate radv_CmdDraw and using a pipeline with the correct rasterizationSamples for the destination image. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-02 19:32:00 +02:00
Samuel Pitoiset	5c1233ed62	radv: use a global BO list only for VK_EXT_descriptor_indexing Maintaining two different paths is annoying but this gets rid of the performance regression introduced by the global BO list. We might find a better solution in the future, but for now just keep two paths. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 16:18:18 +02:00
Samuel Pitoiset	7bd5367546	Revert "radv: Don't store buffer references in the descriptor set." In order to reduce a performance regression introduced by `4b13fe55a4` ("radv: Keep a global BO list for VkMemory."), we are going to maintain two different paths. One when VK_EXT_descriptor_indexing is enabled by the application because we need to have a global BO list, and one (the old one) when it's not enabled. With Talos on Polaris, the global BO list reduces performance by 10% which is too much for me. This reverts commit `ab6cadd3ec`. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 16:18:13 +02:00
Samuel Pitoiset	2f63b3dd09	radv: enable DCC for MSAA 2x textures on VI under an option This can be enabled with RADV_PERFTEST=dccmsaa. DCC for MSAA textures is actually not as easy to implement. It looks like there are some corner cases. I will improve support incrementally. Vega support, as well as Polaris improvements, will be added later. No CTS changes on Polaris using RADV_DEBUG=zerovram and RADV_PERFTEST=dccmsaa. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:10:55 +02:00
Bas Nieuwenhuizen	ab6cadd3ec	radv: Don't store buffer references in the descriptor set. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	4b13fe55a4	radv: Keep a global BO list for VkMemory. With update after bind we can't attach bo's to the command buffer from the descriptor set anymore, so we have to have a global BO list. I am somewhat surprised this works really well even though we have implicit synchronization in the WSI based on the bo list associations and with the new behavior every command buffer is associated with every swapchain image. But I could not find slowdowns in games because of it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Samuel Pitoiset	fde7b90ecf	radv: make radv_initialise_cmask() static Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:51 +02:00
Marek Olšák	43d66c8c2d	mesa: include mtypes.h less - remove mtypes.h from most header files - add main/menums.h for often used definitions - remove main/core.h v2: fix radv build Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-12 19:31:30 -04:00
Bas Nieuwenhuizen	6ff98dbf7c	radv: Implement VK_EXT_vertex_attribute_divisor. Pretty straightforward, just pass the divisors through the shader key and then do an LLVM divide. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-12 22:57:23 +02:00
Bas Nieuwenhuizen	ed94638156	radv: Enable RB+ where possible. According to Marek, not enabling it on Stoney has a significant negative performance impact. (And I guess this might impact performance on Raven as well) The register settings are pretty much copied from radeonsi. I did not put this in the pipeline as that would make the pipeline more dependent on the format, which means we would have to have more pipelines for the meta shaders. v2: Don't clear RB+ regs if not enabled as the CLEAR_STATE packet does already. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-11 01:19:10 +02:00
Samuel Pitoiset	b0f8ad189c	radv: add radv_image_is_tc_compat_htile() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:26 +02:00
Samuel Pitoiset	ed41e776d0	radv: clean up radv_vi_dcc_enabled() And rename to radv_dcc_enabled() to be consistent. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:14 +02:00
Samuel Pitoiset	e213f19907	radv: clean up radv_htile_enabled() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:12 +02:00
Samuel Pitoiset	0fc9113ac5	radv: add radv_image_has_{cmask,fmask,dcc,htile}() helpers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:10 +02:00
Samuel Pitoiset	7fe586f6fb	radv: only enable PERFECT_ZPASS_COUNTS for precision occlusion queries This is unnecessary when the precision bit flag is not set, and this might hurt performance. The Vulkan spec explains that not setting VK_QUERY_CONTROL_PRECISE_BIT might be more efficient on some implementations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 09:07:34 +02:00
Samuel Pitoiset	a8a696a38f	radv: use a mask for VBOs and shaders prefetching Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-05 10:03:42 +02:00
Samuel Pitoiset	922cd38172	radv: implement out-of-order rasterization when it's safe on VI+ Disabled by default for now, it can be enabled with RADV_PERFTEST=outoforder. No CTS regressions on Polaris, and all Vulkan games I tested look good as well. Expect small performance improvements for applications where out-of-order rasterization can be enabled by the driver. Loosely based on RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	2a329f4ada	radv: set SAMPLE_RATE to the number of samples of the current fb Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-30 17:32:15 +02:00
Samuel Pitoiset	52fba3f45d	radv: remove unused radv_pipeline::needs_data_cache variable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-22 14:30:37 +01:00
Samuel Pitoiset	d07edf5fdf	radv: add dump_shader to the NIR compiler options Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:20:00 +01:00
Samuel Pitoiset	38f34117dd	radv: fix vkGetDeviceQueue2() when create flags don't match This fixes CTS: dEQP-VK.api.device_init.create_device_queue2_unmatched_flags Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@gmail.com>	2018-03-14 09:53:42 +01:00
Samuel Pitoiset	fbe694562b	ac/nir: move ac_nir_compiler_options and friends to radv folder Also replace ac_ by radv_. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:23 +01:00
Samuel Pitoiset	237229430f	ac: move ac_shader_info to radv folder This is RADV specific code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:21 +01:00
Samuel Pitoiset	2cfba40eea	ac/nir: move ac_shader_variant_info and friends to radv folder Also replace ac_ by radv_. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:16 +01:00
Samuel Pitoiset	b2653007b9	ac/nir: move all RADV related code to radv_nir_to_llvm.c Now the "ac/nir" prefix will really be the shared code between RadeonSI and RADV, that might avoid confusions in the future. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Bas Nieuwenhuizen	997306c031	radv: Increase the number of dynamic uniform buffers. The vulkan API is not ideal as it does not allow us to have a shared limit. Feral needs 15+6 for one of their games, and I'm not a fan of overcommitting the limits, so increase the number of dynamic uniform buffers to 16. CC: <mesa-stable@lists.freedesktop.org> CC: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-12 09:46:22 +01:00
Samuel Pitoiset	c27f5419f6	radv: only emit cache flushes when the pool size is large enough This is an optimization which reduces the number of flushes for small pool buffers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:53:40 +01:00
Samuel Pitoiset	2fe07933bd	radv: keep track of the query pool size Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:53:39 +01:00
Samuel Pitoiset	c956d0f406	radv: make sure to emit cache flushes before starting a query If the query pool has been previously reset using the compute shader path. Fixes: `a41e2e9cf5` ("radv: allow to use a compute shader for resetting the query pool") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105292 Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:14:49 +01:00
Dave Airlie	6bafd4f4dd	radv: remove device pointer from buffer. This is never used. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-28 09:03:26 +10:00
Dave Airlie	1fc19a0f27	radv: merge tess rings into a single bo Inspired by a passing commit to radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-27 00:54:59 +00:00
Bas Nieuwenhuizen	e72ad05c1d	radv: Return NULL for entrypoints when not supported. This implements strict checking for the entrypoint ProcAddr functions. - InstanceProcAddr with instance = NULL, only returns the 3 allowed entrypoints. - DeviceProcAddr does not return any instance entrypoints. - InstanceProcAddr does not return non-supported or disabled instance entrypoints. - DeviceProcAddr does not return non-supported or disabled device entrypoints. - InstanceProcAddr still returns non-supported device entrypoints. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	076f7cfc6b	radv: Track enabled extensions. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	4db78f3a6b	radv: Put supported extensions in a struct. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Fredrik Höglund	5a38d8f103	radv: implement VK_EXT_external_memory_host Ported from the radeonsi GL_AMD_pinned_memory implementation. Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 00:46:07 +01:00
Samuel Pitoiset	4922e7f25c	radv: use separate bindings for graphics and compute descriptors The Vulkan spec says: "pipelineBindPoint is a VkPipelineBindPoint indicating whether the descriptors will be used by graphics pipelines or compute pipelines. There is a separate set of bind points for each of graphics and compute, so binding one does not disturb the other." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104732 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-01 09:37:09 +01:00
Samuel Pitoiset	cf224014dd	radv: store the bind point when creating descriptors with templates Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-01 09:37:07 +01:00
Matthew Nicholls	ef272b161e	radv: remove predication on cache flushes This can lead to a situation where cache flushes could get conditionally disabled while still clearing the flush_bits, and thus flushes due to application pipeline barriers may never get executed. Fixes: `a6c2001ace` (radv: add support for cmd predication.) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-31 13:37:18 +10:00
Bas Nieuwenhuizen	882eff4d20	radv: Merge raster state with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:02:05 +01:00
Bas Nieuwenhuizen	69364f1c34	radv: Move gs state out of pipeline. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:02:01 +01:00
Bas Nieuwenhuizen	e4e060d135	radv: Split out cliprect rule generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:56 +01:00
Bas Nieuwenhuizen	acbaef3005	radv: Merge VGT_GS_MODE computation with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:52 +01:00
Bas Nieuwenhuizen	9062b1c241	radv: Move tessellation state out of pipeline. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:38 +01:00
Bas Nieuwenhuizen	4aa1cb4e90	radv: Move blend state out of pipeline. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:34 +01:00
Bas Nieuwenhuizen	0f72f0eacb	radv: Split out generating VGT_SHADER_STAGES_EN. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:30 +01:00
Bas Nieuwenhuizen	694c34314b	radv: Split out the ia_multi_vgt_param precomputation. Also moved everything in a struct and then return the struct from the helper function, so it is clear in the caller what part of the pipeline gets modified. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:26 +01:00
Bas Nieuwenhuizen	0bea0851aa	radv: Split out db_shader_control computation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:18 +01:00
Bas Nieuwenhuizen	5dce47ae6d	radv: Compute shader_z_format when emitting it. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:13 +01:00
Bas Nieuwenhuizen	df2e7ab0db	radv: Merge depth stencil state with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:06 +01:00
Bas Nieuwenhuizen	d5a0af84ec	radv: Merge ps_input_cntl computation with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:01 +01:00
Bas Nieuwenhuizen	e2bf18030d	radv: Merge vtx_reuse_depth computation with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:55 +01:00
Bas Nieuwenhuizen	c80747b32c	radv: Merge vs state computation with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:50 +01:00
Bas Nieuwenhuizen	c4191cf944	radv: Merge binning state generation with pm4 emission. We don't need the pipeline state struct anymore. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:45 +01:00
Bas Nieuwenhuizen	6f1a3f081e	radv: Constify some pipeline helpers. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:40 +01:00
Bas Nieuwenhuizen	beeab44190	radv: Record a PM4 sequence for graphics pipeline switches. This gives about 2% performance improvement on dota2 for me. This is mostly a mechanical copy and replacement, but at bind time we still do: 1) Some stuff that is only based on num_samples changes. 2) Some command buffer state setting. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:22 +01:00
Bas Nieuwenhuizen	7c366bc152	radv: Determine unneeded dynamic states. Which avoids setting or emitting them. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:17 +01:00
Dave Airlie	298554541d	radv: move spi_baryc_cntl to pipeline We need to enable the pos float location 2 mode anytime we have persample not just when forced by the frag shader. This fixes: dEQP-VK.pipeline.multisample.min_sample_shading* Fixes: `58c97a079` (radv: enable location at sample when persample is forced.) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-25 06:47:28 +10:00
Dave Airlie	766589d89a	radv: fix sample_mask_in loading. (v3.1) This is ported from radeonsi and fixes: dEQP-VK.pipeline.multisample_shader_builtin.sample_mask.bit_* v2: don't call this path for radeonsi, it does it in the epilog. use the radeonsi code path. v3: handle NULL pCreateInfo->pMultisampleState properly (Samuel) v3.1: set ps_iter_samples default to 1 (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `bdcbe7c76` (radv: add sample mask input support) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-24 14:25:11 +10:00
Dave Airlie	316d762186	radv: add fs_key meta format support to resolve passes. Some of the hw resolve passes need the SPI color format setup correctly. This fixes lots of 16-bit and 32-bit format tests in dEQP-VK.renderpass.suballocation.multisample* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-24 08:50:51 +10:00
Bas Nieuwenhuizen	b1444c9ccb	radv: Implement VK_ANDROID_native_buffer. Passes dEQP-VK.api.smoke.* dEQP-VK.wsi.android.* with android-cts-7.1_r12 . Unlike the initial anv implementation this does use syncobjs instead of waiting on the CPU. This is missing meson build coverage for now. One possible todo is that linux 4.15 now has a sycall that allows us to export amdgpu fence to a sync_file, which allows us not to force all fences and semaphores to use syncobjs. However, I had trouble with my kernel crashing regularly with NULL pointers, and I'm not sure how beneficial it is in the first place given that intel uses syncobjs for all fences if available. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-19 01:43:55 +01:00
Bas Nieuwenhuizen	a3e241ed07	radv: Add create image flag to not use DCC/CMASK. If we import an image, we might not have space in the buffer for CMASK, even though it is compatible. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-19 01:43:55 +01:00
Bas Nieuwenhuizen	0b8991c0b6	radv: Implement VK_EXT_debug_report. This is not hooked up to any messages yet, but useful for e.g. renderdoc if you add some messages during development. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-17 11:29:04 +01:00
Dave Airlie	ad11fc3571	radv: don't emit unneeded vertex state. If the number of instances hasn't changed and we've already emitted it, don't emit it again. If the vertex shader is the same and the first_instance, vertex_offset haven't changed don't emit them again. This increases the fps in GL_vs_VK -t 1 -m -api vk from around 40 to around 60 here, it may not impact anything else. Dieter also reported smoketest going from 1060->1200 fps. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-12 00:43:07 +00:00
Bas Nieuwenhuizen	5db0bf9994	radv: Implement VK_EXT_discard_rectangles. Tested with a modified deferred demo and no regressions in a 1.0.2 mustpass run. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-10 13:26:22 +01:00
Bas Nieuwenhuizen	11b9cdd2d7	radv: Add mapping between dynamic state mask and external enum. The EXT values are really large, e.g. VK_DYNAMIC_STATE_DISCARD_RECTANGLE_EXT = 1000099000, so 1 << value is not going to fit into a 32-bit mask. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-10 13:24:31 +01:00
Samuel Pitoiset	b09b3f8834	radv: add has_scissor_bug for Vega10 and Raven Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-08 21:24:56 +01:00
Samuel Pitoiset	a3c2a86757	radv: make shader BOs read-only for the GPU Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-08 21:24:51 +01:00
Samuel Pitoiset	87efa71001	radv: remove unused radv_color_buffer_info::cb_clear_valueX Found by inspection. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-05 17:26:51 +01:00
Bas Nieuwenhuizen	6a36bfc64d	radv: Implement binning on GFX9. Overall it does not really help or hurt. The deferred demo gets 1% improvement and some games a 3% decrease, so I don't think this should be enabled by default. But with the code upstream it is easier to experiment with it. v2: Remove initializing the registers from si_emit_config. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-31 15:07:07 +01:00
Bas Nieuwenhuizen	44fcf58744	radv: Disable DCC for GENERAL layout and compute transfer dest. Apps can use this for render feedback loops, where things are defined if they render each pixel only once. However, DCC fails here, as the level of coherence is a block not a pixel, so disable it. This is also going to help implementing other stuff. Even if we optimize this later to only happen if there actually is a loop (if possible at all ...), then the machinery is still useful to exclude images accessible by the SDMA queue when that is implemented. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-29 12:21:53 +01:00
Bas Nieuwenhuizen	1cfab28e6e	radv: Make color meta operations layout aware. For fast clear eliminate and decompressions, we always use the most compressed format. For clears, the code already creates a renderpass on demand with the exact same layout as specified. Otherwise we start distinguishing between GENERAL and TRANSFER_DST_OPTIMAL. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-29 12:21:44 +01:00
Bas Nieuwenhuizen	3e2a6191c9	radv: Add compute DCC decompress. We do an in place copy where we read compressed and write decompressed. By doing this in sizes that cover entire DCC blocks and waiting for all reads in the block before starting to write we avoid corruption. In the end we clear the DCC metadata to 0xffffffff. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-29 12:21:40 +01:00
Bas Nieuwenhuizen	e5feeec140	radv: Add GFX DCC decompress. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-29 12:21:31 +01:00
Dave Airlie	420627e6e7	radv/gfx9: fix buffer to image for 3d images on compute queues This fixes some of the broken: dEQP-VK.synchronization.op.multi_queue.64x64x8 tests. Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-29 09:37:09 +10:00
Dave Airlie	09612a62e1	radv/gfx9: fix 3d image clears on compute queues This fixes some of the broken: dEQP-VK.synchronization.op.multi_queue.64x64x8 tests. Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-29 09:37:05 +10:00
Dave Airlie	d08f267814	radv/gfx9: fix 3d image to image transfers on compute queues. This fixes some of the broken: dEQP-VK.synchronization.op.multi_queue.64x64x8 tests. Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-29 09:37:00 +10:00
Dave Airlie	fbac9f86aa	radv/meta: fix blit paths for depth/stencil (v2.1) This fixes the layout issue for the blit path as well. This fixes: dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint* v2: use compatible render passes. v2.1: use enum Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-22 14:11:02 +10:00
Dave Airlie	821b5379f0	radv: handle depth/stencil image copy with layouts better. (v3.1) If we are doing a general->general transfer with HIZ enabled, we want to hit the tile surface disable bits in radv_emit_fb_ds_state, however we never get the current layout to know we are in general and meta hardcoded the transfer layout which is always tile enabled. This fixes: dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint.optimal_general dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint.general_general v2: refactor some shared helpers for blit patches v3: we only need multiple render passes as they should be compatible. v3.1: use enum (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-22 14:10:04 +10:00
Dave Airlie	9f675bf934	radv/gfx9: add support for 3d images to blit 2d paths This add support for a 3D image reading path to the blit 2d paths, like I did for the clear paths. Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Alex Smith <asmith@feralinteractive.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-22 14:09:28 +10:00
Dave Airlie	a99fa7e8a2	radv/gfx9: add 3d sampler image->buffer copy shader. (v3) On GFX9 we must access 3D textures with 3D samplers AFAICS. This fixes: dEQP-VK.api.image_clearing.core.clear_color_image.3d.single_layer on GFX9 for me. v1.1: fix tex->sampler_dim to dim v2: send layer in from outside v3: don't regress on pre-gfx9 Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Alex Smith <asmith@feralinteractive.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-22 14:08:48 +10:00
Samuel Pitoiset	3595a11648	radv: create pipeline layout objects for all meta operations They are dummy objects but the spec requires layout to not be NULL, this just makes sure we are creating valid pipeline layout objects. This will allow us to remove some useless checks. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-19 21:22:06 +01:00
Samuel Pitoiset	8d00e63ca8	radv: remove useless radv_cmask_info::base_address_reg Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-18 11:51:11 +01:00
Bas Nieuwenhuizen	969421b7da	radv: Implement fences based on syncobjs. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-18 09:31:12 +01:00
Samuel Pitoiset	9fdc1437ba	radv: store the dispatch initiator into the device Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:20:55 +01:00
Samuel Pitoiset	c7c7b00889	radv: only re-mit the index type when it changes dota2 binds a ton of index buffers but the type is always 16-bit. Note that we have to invalidate the type when switching from indexed draws to normal draws. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-08 11:21:36 +01:00
Samuel Pitoiset	a380bc7ecf	radv: track different status of a command buffer RADV_CMD_BUFFER_STATUS_INVALID is not used for now, but I think it makes sense to declare it. Could be used later with better command buffer error handling. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-08 11:21:21 +01:00
Alex Smith	8fda98c4f1	radv: Add LLVM version to the device name string Allows apps to determine the LLVM version so that they can decide whether or not to enable workarounds for LLVM issues. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-07 08:58:34 +00:00
Dave Airlie	69365d72de	radv/wsi: drop allocate memory special case Just check if image has scanout flag set v2 (Jason Ekstrand): - Rebase - Also drop the now unused radv_mem_flag_bits enum Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Samuel Pitoiset	319f56e675	radv: remove set but unnecessary radv_color_buffer_info::micro_tile_mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-30 21:38:00 +01:00
Samuel Pitoiset	4eab78b03c	radv: do not store gfx9_epitch in radv_color_buffer_info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-30 21:37:58 +01:00
Samuel Pitoiset	3a32858fc3	radv: use a 16 bytes array for the sampled/storage image descriptors This allows to update them with only one memcpy(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-20 11:18:22 +01:00
Samuel Pitoiset	c665879455	radv: replace vb_dirty with RADV_CMD_DIRTY_VERTEX_BUFFER Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-15 09:01:05 +01:00
Samuel Pitoiset	8fd213277f	radv: drop radv_cmd_dirty_mask_t typedef I don't think we will need a 64-bit unsigned integer for the dirty flags in the future, and there is still 20 bits left. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-15 09:01:01 +01:00
Samuel Pitoiset	f697365058	radv: use an unsigned 32-bit integer for radv_queue::family_index VkDeviceQueueCreateInfo::queueFamilyIndex is an unsigned 32-bit integer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-15 09:00:59 +01:00
Samuel Pitoiset	4e16c6a41e	radv: make radv_emit_framebuffer_state() static Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-13 11:04:25 +01:00
Samuel Pitoiset	f87c58dde3	radv: prefetch VBO descriptors at the right place Just after the vertex shader. This seems to give a minor boost for, at least, Serious Sam Fusion 2017 and Dawn of War 3. I don't see any real impacts with The Talos Principle. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-13 11:03:16 +01:00
Dave Airlie	031e591923	radv: move calculating vs out info regs into pipeline. This moves some calculations of register values into the pipeline construction, it saves looking at outinfo in the cmd buffer emit. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-13 07:16:53 +00:00
Dave Airlie	3bf8be41b8	radv: pre-calculate user_data_0 registers and store in pipeline There's no point recalculating these the whole time on descriptor emission, just store them at pipeline creation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 21:44:49 +00:00
Dave Airlie	60a9705e00	radv: move descriptor sets out of cmd_state. Instead of storing all the pointers and zeroing them all out, just store a valid bitmask in the state. This also moves the CmdBindPipeline path down the cpu usage path for the multithreading demo as it no longer has to traverse MAX_SETS to find the active descriptor sets. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 01:11:03 +00:00
Dave Airlie	3a0d098252	radv: add helper for setting a descriptor. This is just a simple refactor. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 01:11:00 +00:00
Dave Airlie	b48063a2f2	radv: move vertex binding out of cmd state. This isn't required to be cleared, since buffers are only linked by vertex elements, so if elements are clear then no buffers should be referenced. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 01:10:56 +00:00
Dave Airlie	7365626d78	radv: reorder cmd_state to remove a hole. This just removes a hole in the cmd_state and packs some bools together. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 01:10:53 +00:00
Bas Nieuwenhuizen	cecbcf4b2d	radv: Use an array to store descriptor sets. The vram_list linked list resulted in lots of pointer chasing. Replacing this with an array instead improves descriptor set allocation CPU usage by 3x at least (when also considering the free), because it had to iterate through 300-400 sets on average. Not a huge improvement as the pre-improvement CPU usage was only about 2.3% in the busiest thread. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-04 20:18:17 +01:00
Bas Nieuwenhuizen	806721429a	radv: Don't expose heaps with 0 memory. It confuses CTS. This pregenerates the heap info into the physical device, so we can use it for translating contiguous indices into our "standard" ones. This also makes the WSI a bit smarter in case the first preferred heap does not exist. Reviewed-by: Dave Airlie <airlied@redhat.com> CC: <mesa-stable@lists.freedesktop.org>	2017-11-02 20:28:19 +01:00
Samuel Pitoiset	c39f39106d	radv: make radv_bind_descriptor_set() static Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-02 09:36:14 +01:00
Dave Airlie	799ef80059	radv: make sure we set buffers as shareable properly. This should make sure we don't treat exports buffers as local bos. Fixes: `a639d40f13` (radv: add support for local bos. (v3)) Tested-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-02 01:01:29 +00:00
Samuel Pitoiset	11fdc2cd34	radv: bail out when binding the same index buffer DOW3 appears to hit this path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-31 10:16:35 +01:00
Alex Smith	de88979413	radv: Implement VK_AMD_shader_info This allows an app to query shader statistics and get a disassembly of a shader. RenderDoc git has support for it, so this allows you to view shader disassembly from a capture. When this extension is enabled on a device (or when tracing), we now disable pipeline caching, since we don't get the shader debug info when we retrieve cached shaders. v2: Improvements to resource usage reporting v3: Disassembly string must be null terminated (string_buffer's length does not include the terminator) v4: Fixed LDS reporting. (Bas) Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-29 00:28:45 +02:00
Samuel Pitoiset	0d61109bb7	radv: make radv_fill_buffer() return the needed flush bits Only needed when the CS path is used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-27 13:47:03 +02:00
Samuel Pitoiset	b1e31c1911	radv: store the dynamic state mask into radv_dynamic_state Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-26 09:37:03 +02:00
Bas Nieuwenhuizen	49d035122e	radv: Add single pipeline cache key. To decouple the key used for info gathering and the cache from whatever we pass to the compiler. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-10-26 00:28:40 +02:00
Dave Airlie	a5499b639c	radv: only emit dfsm packets if dfsm is allowed. radeonsi only emits these when dfsm is enabled, so for now just hinge them on a flag we never set. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-24 23:00:57 +01:00
Andres Rodriguez	eff2bdbd82	radv: factor out radv_alloc_memory This allows us to pass extra parameters to the memory allocation operation that are not defined in the vulkan spec. This is useful for internal usage. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:15:49 +02:00

646 Commits