KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Samuel Pitoiset	b21a4efb55	radv/winsys: allow local BOs on APUs Ported from RadeonSI. Local BOs ignore BO priorities, and we don't need those on APUs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 16:18:24 +02:00
Samuel Pitoiset	5c1233ed62	radv: use a global BO list only for VK_EXT_descriptor_indexing Maintaining two different paths is annoying but this gets rid of the performance regression introduced by the global BO list. We might find a better solution in the future, but for now just keeps two paths. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 16:18:18 +02:00
Samuel Pitoiset	7bd5367546	Revert "radv: Don't store buffer references in the descriptor set." In order to reduce a performance regression introduced by `4b13fe55a4` ("radv: Keep a global BO list for VkMemory."), we are going to maintain two different paths. One when VK_EXT_descriptor_indexing is enabled by the application because we need to have a global BO list, and one (the old one) when it's not enabled. With Talos on Polaris, the global BO list reduces performance by 10% which is too much for me. This reverts commit `ab6cadd3ec`. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 16:18:13 +02:00
Nicolai Hähnle	24fb3e6aa1	ac/nir: use ac_build_image_opcode for image intrinsics So that we'll use the dimension-aware intrinsics in the future. Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-04-20 09:30:07 +02:00
Nicolai Hähnle	74063431f1	radeonsi: generate image load/store/atomic ops using ac_build_image_opcode In preparation of dimension-aware LLVM image intrinsics. Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-04-20 09:29:57 +02:00
Nicolai Hähnle	625dcbbc45	amd/common: pass address components individually to ac_build_image_intrinsic This is in preparation for the new image intrinsics. Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-04-20 09:23:52 +02:00
Nicolai Hähnle	f931583828	amd/common: pass new enum ac_image_dim to ac_build_image_opcode This is in preparation for the new, dimension-aware LLVM image intrinsics. Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-04-20 09:23:40 +02:00
Nicolai Hähnle	a807a9b215	ac/nir: fix atomic compare-and-swap The LLVM instruction returns { i32, i1 }, where the i1 indicates success. We're only interested in the first part, which is the loaded value. Fixes dEQP-GLES31.functional.compute.shared_var.atomic.compswap.* Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-20 09:21:40 +02:00
Bas Nieuwenhuizen	dffdef6737	radv: Add Vega M support. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-19 16:36:21 +02:00
Bas Nieuwenhuizen	d1ce31d36c	radv: Add bound checking workaround for dynamic buffers. I have seen a few applications and games do the dynamic buffer bounds incorrectly, this make it easier to work around, e.g. for debugging. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-19 16:13:25 +02:00
Samuel Pitoiset	2f63b3dd09	radv: enable DCC for MSAA 2x textures on VI under an option This can be enabled with RADV_PERFTEST=dccmsaa. DCC for MSAA textures is actually not as easy to implement. It looks like there is some corner cases. I will improve support incrementally. Vega support, as well as Polaris improvements, will be added later. No CTS changes on Polaris using RADV_DEBUG=zerovram and RADV_PERFTEST=dccmsaa. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:10:55 +02:00
Samuel Pitoiset	dc3d39771f	radv: decompress DCC for multisampled source images before resolving Multisampled source images (ie. color attachments) can be now DCC compressed, so the driver needs to perform a DCC decompression pass before resolving Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:10:52 +02:00
Samuel Pitoiset	1aefb62f1e	radv: add a workaround for fast clears with DCC and MSAA textures This should be fixed at some point in order to improve performance. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:10:50 +02:00
Samuel Pitoiset	373fa0b599	radv: allocate CMASK for DCC fast clear with MSAA CMASK is required because it should be cleared to 0xCCCCCCCC for MSAA textures. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:10:48 +02:00
Samuel Pitoiset	255506c4e0	radv: implement fast color clear for DCC with MSAA When DCC is enabled with MSAA textures, CMASK should be cleared to 0xCCCCCCCC. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:10:45 +02:00
Samuel Pitoiset	796b6f4aab	radv: make sure to sync after resolving using the compute path This fixes some random CTS failures: dEQP-VK.renderpass.multisample.*. Performing a fast-clear eliminate is still useless, but it seems that we need to sync. Found while running CTS with RADV_DEBUG=zerovram. Fixes: `56a171a499` ("radv: don't fast-clear eliminate after resolving a subpass with compute") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:09:55 +02:00
Samuel Pitoiset	4a698660ae	radv: dump the SHA1 of SPIRV in the hang report Might be useful for debugging purposes, especially when we want to replace a shader on the fly. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:09:52 +02:00
Bas Nieuwenhuizen	0e10790558	radv: Enable VK_EXT_descriptor_indexing. This adds everything except non-uniform indexing, which needs a bit more work and testing. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	b5e04e9217	radv: Support allocating variable size descriptor sets. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	78c54acbe8	radv: Add support for variable descriptor set layouts. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	082c11e8a5	radv: Fix GetDescriptorSetLayoutSupport. The continue means we do alignment differently than during creation, making the buffer smaller than expected. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	d02bbde1a8	radv: Use sorted bindings for set layout creation. Previously we did not care about havin the set storage in order, but for variable descriptor count we want the highest binding at the end of the storage. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	ab6cadd3ec	radv: Don't store buffer references in the descriptor set. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	4b13fe55a4	radv: Keep a global BO list for VkMemory. With update after bind we can't attach bo's to the command buffer from the descriptor set anymore, so we have to have a global BO list. I am somewhat surprised this works really well even though we have implicit synchronization in the WSI based on the bo list associations and with the new behavior every command buffer is associated with every swapchain image. But I could not find slowdowns in games because of it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Marek Olšák	c6f1d36019	radeonsi: add support for VegaM Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-18 14:45:33 -04:00
Marek Olšák	d6a66bc8db	amd/addrlib: add support for VegaM Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-18 14:45:32 -04:00
Samuel Pitoiset	893e19efb7	radv: fix scissor computation when using half-pixel viewport offset 'scale[i]' can be non-integer. Original patch by Philip Rebohle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106074 Fixes: `0f3de89a56` ("radv: Use the guard band.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-17 22:12:14 +02:00
Jason Ekstrand	72ab499c9f	anv,radv: Drop XML workarounds for VK_ANDROID_native_buffer Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-16 07:59:25 -07:00
Samuel Pitoiset	62510846b6	radv: clean up radv_decompress_resolve_subpass_src() To handle the source color image transitions in the same place. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:21:05 +02:00
Samuel Pitoiset	56a171a499	radv: don't fast-clear eliminate after resolving a subpass with compute That looks useless, and I think radv_handle_image_transition() will do a fast-clear eliminate because it's called after the resolve. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:21:02 +02:00
Samuel Pitoiset	7e84d69861	radv: handle CMASK/FMASK transitions only if DCC is disabled DCC implies a fast-clear eliminate, so I think this sounds reasonable. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:59 +02:00
Samuel Pitoiset	584d1f2711	radv: merge radv_handle_{dcc,cmask}_image_transition() functions Into radv_handle_color_image_transition(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:56 +02:00
Samuel Pitoiset	d5812b900b	radv: add radv_init_color_image_metadata() helper In order to separate initialization from decompression. In the future, that will allow us to init DCC/FMASK/CMASK in one shot. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:54 +02:00
Samuel Pitoiset	fde7b90ecf	radv: make radv_initialise_cmask() static Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:51 +02:00
Samuel Pitoiset	790f6e4718	radv: clean up radv_handle_image_transition() a bit Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:49 +02:00
Samuel Pitoiset	6967d32beb	radv: add radv_handle_color_image_transition() helper To handle CMASK, FMASK and DCC transitions in the same place. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:45 +02:00
Samuel Pitoiset	c6b1f1c97a	radv: handle DCC image transitions before CMASK/FMASK transitions Mostly because DCC implies a fast-clear eliminate and we should be able to skip some DCC decompressions by setting a predicate like for CMASK and FMASK. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:42 +02:00
Samuel Pitoiset	79c87a45b6	radv: disable prediction only if it has been enabled When decompressing DCC we don't enable it, so it's useless to disable it. This reduces the number of prediction packets sent to the GPU when performing color decompression passes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:39 +02:00
Bas Nieuwenhuizen	b0e3a9b19f	ac/nir: Make the GFX9 buffer size fix apply to image loads/atomics too. No clue how I missed those ... Fixes: `4503ff760c` "ac/nir: Add workaround for GFX9 buffer views." CC: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105320 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-16 11:55:48 +02:00
Daniel Schürmann	f2c6a55061	radv: enable subgroup capabilities Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 01:03:15 +02:00
Daniel Schürmann	4b0616e533	ac: handle subgroup intrinsics Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 01:03:15 +02:00
Daniel Schürmann	d5f7ebda3e	ac: add LLVM build functions for subgroup instrinsics Co-authored-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 01:03:09 +02:00
Daniel Schürmann	d19f20e793	ac: make ballot and umsb capable of 64bit inputs Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 00:52:22 +02:00
Marek Olšák	a3b785be4d	winsys/amdgpu: allow local BOs on APUs Local BOs ignore BO priorities, and we don't need those on APUs. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	43d66c8c2d	mesa: include mtypes.h less - remove mtypes.h from most header files - add main/menums.h for often used definitions - remove main/core.h v2: fix radv build Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-12 19:31:30 -04:00
Bas Nieuwenhuizen	6ff98dbf7c	radv: Implement VK_EXT_vertex_attribute_divisor. Pretty straight forward, just pass the divisors through the shader key and then do a LLVM divide. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-12 22:57:23 +02:00
Bas Nieuwenhuizen	7eff8d7d35	ac/surface: Allow S swizzle for displayable surfaces. For dcn1 && < 64 bpp displayable surfaces, addrlib only accepts S swizzles. At the same time addrlib prefers D swizzles is allowed, so we can just allow S swizzles as fallback. Fixes: `b64b712558` "ac/surface/gfx9: request desired micro tile mode explicitly" Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-12 21:24:55 +02:00
Samuel Pitoiset	9eac49246c	radv: fix radv_layout_dcc_compressed() when image doesn't have DCC num_dcc_levels means that DCC is supported, but this doesn't mean that it's enabled by the driver. Instead, we should rely on radv_image_has_dcc(). This fixes some multisample regressions since `0babc8e5d6` ("radv: fix picking the method for resolve subpass") on Vega. This is because the resolve method changed from HW to FS, but those fails are totally unexpected, so there might some differences between Polaris and Vega here. Fixes: `44fcf58744` ("radv: Disable DCC for GENERAL layout and compute transfer dest.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-12 09:58:46 +02:00
Samuel Pitoiset	ab0e625a67	radv: add radv_decompress_resolve_{subpass}_src() helpers This helper shares common code before resolving using either a fragment or a compute shader. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-12 09:58:44 +02:00
Samuel Pitoiset	ed93d90a67	radv: add radv_init_dcc_control_reg() helper And add some comments. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-12 09:58:41 +02:00
Bas Nieuwenhuizen	bd95397d65	radv: Enable RB+ on Raven. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-11 18:46:55 +02:00
Tapani Pälli	9f29b1a4c8	vulkan: fix build issue on android (both anv/radv) Fixes linking errors against: anv_GetPhysicalDeviceImageFormatProperties2KHR radv_GetPhysicalDeviceImageFormatProperties2KHR Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-11 13:55:49 +03:00
Nicolai Hähnle	0630e52c9e	radeonsi: pass -O halt_waves to umr for hang debugging This will give us meaningful wave information in the case of a hang where shaders are still running in an infinite loop. Note that we call umr multiple times for different sections of the ddebug hang dump, and so the wave information will not necessarily match up between sections. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-11 12:44:24 +02:00
Jason Ekstrand	69f447553c	vulkan: Drop vk_android_native_buffer.xml All the information in vk_android_native_buffer.xml is now in vk.xml. The only exception is the extension type attribute which we can work around in the generators while we wait for the XML to be fixed. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-04-10 19:29:49 -07:00
Bas Nieuwenhuizen	ed94638156	radv: Enable RB+ where possible. According to Marek, not enabling it on Stoney has a significant negative performance impact. (And I guess this might impact performance on Raven as well) The register settings are pretty much copied from radeonsi. I did not put this in the pipeline as that would make the pipeline more dependent on the format which mean we would have to have more pipelines for the meta shaders. v2: Don't clear RB+ regs if not enabled as the CLEAR_STATE packet does already. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-11 01:19:10 +02:00
Samuel Pitoiset	0babc8e5d6	radv: fix picking the method for resolve subpass The source and destination image parameters were swapped. No CTS changes on Polaris10, but I suspect this might fix something. Fixes: `2a04f5481d` ("radv/meta: select resolve paths") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-10 21:55:28 +02:00
Samuel Pitoiset	9f6a28eb27	radv: add shader BOs to the list at pipeline bind time Otherwise, the shader BOs are not added to the list on SI because prefetching isn't supported. Calling radv_cs_add_buffer() in the prefetch codepath was a bad idea. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105952 Fixes: `4ad7595f35` ("radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Turo Lamminen <turo@alternativegames.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-10 21:55:28 +02:00
Marek Olšák	e29facff31	ac/surface: don't set the display flag for obviously unsupported cases (v2) This enables the tile swizzle for some cases of the displayable micro mode, and it also fixes an addrlib assertion failure on Vega. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-04-10 13:06:03 -04:00
Marek Olšák	b64b712558	ac/surface/gfx9: request desired micro tile mode explicitly Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-10 12:44:41 -04:00
Bas Nieuwenhuizen	4381be4648	ac/nir: Use an array instead of hashtable for SSA defs. Saves about 2% of compile time for F1 2017, as well as reduce code size of an optimized libvulkan_radeon.so by about 1 KiB. This still keeps the hashtable, as we also stored blocks in there. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-10 09:53:16 +02:00
Bas Nieuwenhuizen	41fbcc7901	radv: Always reset draw user SGPRs after secondary command buffer. As we sometimes reset them to -1, -1 does not mean that they are not written by the secondary command buffer. Fixes: `ad11fc3571` "radv: don't emit unneeded vertex state." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-09 23:04:42 +02:00
Bas Nieuwenhuizen	74b0b869dd	radv: Don't set instance count using predication. The packet can sometimes be skipped, but we still think the change takes effect. This just makes the packet always take effect. Fixes: `ad11fc3571` "radv: don't emit unneeded vertex state." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105942 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-09 23:04:35 +02:00
Samuel Pitoiset	04e609f1f8	radv: fix prefetching of vertex shader and VBOs on SI Forgot one check... Too many mistakes for a simple change. Fixes: `f1d7c16e85` ("radv: fix prefetching compute shaders on CIK and older chips") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 16:14:12 +02:00
Samuel Pitoiset	56a4d03b0c	radv: implement VK_AMD_shader_core_properties Simple extension that only returns information for AMD hw. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 14:28:13 +02:00
Samuel Pitoiset	466aba9fa2	radv: add RADV_NUM_PHYSICAL_VGPRS constant Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 14:28:13 +02:00
Samuel Pitoiset	2f7bb93146	radv: add radv_get_num_physical_sgprs() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 14:28:13 +02:00
Samuel Pitoiset	b0f8ad189c	radv: add radv_image_is_tc_compat_htile() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:26 +02:00
Samuel Pitoiset	95d5ad80e9	radv: add radv_use_dcc_for_image() helper And add some TODOs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:24 +02:00
Samuel Pitoiset	fab5fe4284	radv: rename radv_image_is_tc_compat_htile() ... to radv_use_tc_compat_htile_for_image(). This function name makes more sense to me because we want to know if and only if TC-compat HTILE should be used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:21 +02:00
Samuel Pitoiset	2692736cee	radv: simplify a check in radv_initialise_color_surface() If the image has FMASK metadata, the number of samples is > 1 because radv_image_can_enable_fmask() handles that already. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:16 +02:00
Samuel Pitoiset	ed41e776d0	radv: clean up radv_vi_dcc_enabled() And rename to radv_dcc_enabled() to be consistent. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:14 +02:00
Samuel Pitoiset	e213f19907	radv: clean up radv_htile_enabled() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:12 +02:00
Samuel Pitoiset	0fc9113ac5	radv: add radv_image_has_{cmask,fmask,dcc,htile}() helpers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:10 +02:00
Samuel Pitoiset	32f5174ce8	radv: add radv_get_cmask_fast_clear_value() helper DCC for MSAA textures are currently unsupported but that will be used later on. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:08 +02:00
Samuel Pitoiset	f882c62218	radv: add radv_clear_{cmask,dcc} helpers They will help for DCC MSAA textures and if we support mipmaps in the future. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:05 +02:00
Samuel Pitoiset	8f9f62c2db	radv: don't pass the pipeline to radv_flush_constants() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 19:46:27 +02:00
Samuel Pitoiset	2bd50cceff	radv: rename radv_cmd_buffer_update_vertex_descriptors() ... to radv_flush_vertex_descriptors(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-06 19:46:23 +02:00
Samuel Pitoiset	e829a0cc1e	radv: do not try to skip draw calls when VBOs upload failed This is unnecessary because we record an error which should be returned by vkEndCommandBuffer(), and the app shouldn't submit a command buffer when this happens. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 19:46:21 +02:00
Samuel Pitoiset	f1d7c16e85	radv: fix prefetching compute shaders on CIK and older chips Because the check was moved to radv_emit_prefetch_L2(). Fixes: `4ad7595f35` ("radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2()") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 19:46:18 +02:00
Samuel Pitoiset	7fe586f6fb	radv: only enable PERFECT_ZPASS_COUNTS for precision occlusion queries This unnecessary when the precision bit flag is not set, and this might hurt performance. The Vulkan explains that not setting VK_QUERY_CONTROL_PRECISE_BIT might be more efficient on some implementations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 09:07:34 +02:00
Samuel Pitoiset	d53dff3bfc	radv: enable the Polaris small primitive filter control Enable it directly in the preamble, but do not enable line on Polaris10/11/12 because there is a hw bug. There is possibly an issue when MSAA is off, but this doesn't regress any CTS and AMDVLK doesn't have a workaround as well. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 09:07:31 +02:00
Samuel Pitoiset	942fdfe357	radv: implement a fast prefetch path for the vertex stage This allows to start draws as soon as possible. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-05 10:03:48 +02:00
Samuel Pitoiset	4ad7595f35	radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-05 10:03:45 +02:00
Samuel Pitoiset	a8a696a38f	radv: use a mask for VBOs and shaders prefetching Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-05 10:03:42 +02:00
Samuel Pitoiset	922cd38172	radv: implement out-of-order rasterization when it's safe on VI+ Disabled by default for now, it can be enabled with RADV_PERFTEST=outoforder. No CTS regressions on Polaris, and all Vulkan games I tested look good as well. Expect small performance improvements for applications where out-of-order rasterization can be enabled by the driver. Loosely based on RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	d6709c91a6	radv: change blend_enable field to use four bits per CB Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	a8818d1af2	radv: scan which color blend attachments are enabled With cb_target_enabled_4bit in order to have four bits per CB. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	ac456d0d1b	radv: put more fields in radv_blend_state Some will be used for further optimizations (ie. out-of-order rast). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	e4976ca33b	radv: do not always disable dual quad mode when chip has RbPlus For GFX9+ only, RadeonSI does this too. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	b8c06a961c	radv: don't use the SPI barrier management bug workaround Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	ab147cba77	radv: mask out high VM address bits in registers where needed Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	acf60abc54	radv: enable VK_EXT_shader_viewport_index_layer The driver already supports exporting the Layer and ViewportIndex built-ins from vertex or tessellation shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-03 14:05:46 +02:00
Mike Lothian	7e144ace95	ac/nir: Fix include for LLVMAddPromoteMemoryToRegisterPass Include llvm-c/Transforms/Utils.h with the newest LLVM 7 Signed-of-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:27:29 -04:00
Marek Olšák	dc04e4bba2	radeonsi: move FMASK shader logic to shared code We'll need it for FBFETCH in both TGSI and NIR paths. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-02 13:55:22 -04:00
Marek Olšák	5d91c2ccea	ac/gpu_info: print GB_ADDR_CONFIG	2018-04-02 13:10:37 -04:00
Marek Olšák	b1f33086ec	ac/gpu_info: reorder the fields and print them nicely	2018-04-02 13:10:37 -04:00
Marek Olšák	a0a96819e1	ac/gpu_info: rename has_virtual_memory -> r600_has_virtual_memory	2018-04-02 13:10:37 -04:00
Marek Olšák	32b3932de1	ac/gpu_info: don't print irrelevant fields	2018-04-02 13:10:37 -04:00
Samuel Pitoiset	2a329f4ada	radv: set SAMPLE_RATE to the number of samples of the current fb Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-30 17:32:15 +02:00
Ian Romanick	4925347ec5	util: Include bitscan.h directly Previously bitset.h would include u_math.h to get bitscan.h. u_math.h lives in src/gallium/auxiliary/util while both bitset.h and bitscan.h live in src/util. Having the one file directly include another file that lives in the same directory makes much more sense. As a side-effect, several files need to directly include standard header files that were previously indirectly included. v2: Fix build break in src/amd/common/ac_nir_to_llvm.c. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2018-03-29 14:09:30 -07:00
Ian Romanick	d76c204d05	util: Move util_is_power_of_two to bitscan.h and rename to util_is_power_of_two_or_zero The new name make the zero-input behavior more obvious. The next patch adds a new function with different zero-input behavior. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-03-29 14:09:23 -07:00
Samuel Pitoiset	e45fe0ed66	radv: fix scanning output_usage_mask with structs To fix a regression in: dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.struct And the following regressions (Polaris only): dEQP-VK.glsl.indexing.varying_array.* Fixes: `f3275ca01c` ("ac/nir: only enable used channels when exporting parameters") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-29 10:22:10 +02:00
Daniel Schürmann	b91cd5dba4	radv: enable VK_AMD_shader_trinary_minmax extension Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 01:29:39 +02:00
Daniel Schürmann	d00fb7ce54	ac: add support for trinary_minmax instructions v2: Add missing break (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 01:29:35 +02:00
Bas Nieuwenhuizen	4503ff760c	ac/nir: Add workaround for GFX9 buffer views. On GFX9 whether the buffer size is interpreted as elements or bytes depends on whether IDXEN is enabled in the instruction. If the index is a constant zero, LLVM optimizes IDXEN to 0. Now the size in elements is interpreted in bytes which of course results in out of bounds accesses. The correct fix is most likely to disable the LLVM optimization, but we need something to work with LLVM <= 6.0. radeonsi does the max between stride and element count on the CPU but that results in the size intrinsics returning the wrong size for the buffer. This would cause CTS errors for radv. v2: Also include the store changes. Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 00:03:03 +02:00
Marek Olšák	4f96747530	ac/surface: set AddrSurfInfoIn.format = ADDR_FMT_8 for stencil, add assertions Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105738 Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 17:23:41 -04:00
Samuel Pitoiset	1c4fdcf444	radv: enable VK_EXT_sampler_filter_minmax Only enable for CIK+ because it's buggy on SI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 22:55:48 +02:00
Samuel Pitoiset	413d77e7f9	radv: add support for VK_EXT_sampler_filter_minmax The driver only supports the required formats for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 22:55:48 +02:00
Samuel Pitoiset	99b52aa1da	radv: rename VEGA10 device name Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 20:15:17 +02:00
Samuel Pitoiset	4d2c46dda3	radv: add support for Vega12 Based on RadeonSI. Untested. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 20:15:14 +02:00
Marek Olšák	20eb44ad65	radeonsi: add support for Vega12 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-28 11:37:43 -04:00
Marek Olšák	5425d32fcf	amd/addrlib: update to the latest version for Vega12 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-28 11:37:43 -04:00
Timothy Arceri	92fa89a08d	ac/radeonsi: pass bindless bool to load_sampler_desc() We also fix the base_index for bindless by using the driver location. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 12:56:16 +11:00
Timothy Arceri	51f175028d	ac/nir_to_llvm: fix component packing for double outputs We need to wait until after the writemask is widened before we adjust it for component packing. Together with the previous patch this fixes a number of arb_enhanced_layouts component layout piglit tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 09:59:37 +11:00
Marek Olšák	769603564e	radeonsi: don't reallocate on DMABUF export if local BOs are disabled	2018-03-26 19:22:12 -04:00
Samuel Pitoiset	ccc64f3133	radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8 The hardware only supports 32-bit depth surfaces, but we can enable TC-compat HTILE for 16-bit depth surfaces if no Z planes are compressed. The main benefit is to reduce the number of depth decompression passes. Also, we don't need to implement DB->CB copies which is fine. This improves Serious Sam 2017 by +4%. Talos and F12017 are also affected but I don't see a performance difference. This also improves the shadowmapping Vulkan demo by 10-15% (FPS is now similar to AMDVLK). No CTS regressions on Polaris10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-23 10:05:57 +01:00
Samuel Pitoiset	5ae9772245	radv: add radv_calc_decompress_on_z_planes() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-23 10:05:55 +01:00
Samuel Pitoiset	9b8e75bee3	radv: add radv_image_is_tc_compat_htile() helper Instead of that huge conditional that's going to be crazy. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-23 10:05:54 +01:00
Jason Ekstrand	884d27bcf6	nir: Rename image intrinsics to image_var Generated with git grep -l nir_intrinsic_image \| xargs \ sed -i 's/nir_intrinsic_image/nir_intrinsic_image_var/g' and some manual fixing in nir_intrinsics.h Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-23 13:48:11 +11:00
Juan A. Suarez Romero	0bf1274883	radv: autotools: add radv_extensions.h in the generated VULKAN list Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	13459c637a	anv/radv: autotools: include vulkan_*.h headers Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Samuel Pitoiset	52fba3f45d	radv: remove unused radv_pipeline::needs_data_cache variable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-22 14:30:37 +01:00
Timothy Arceri	c135316555	ac/nir_to_llvm: add frexp support Fixes CTS tests: KHR-GL40.gpu_shader_fp64.builtin.frexp_double KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec2 KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec3 KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec4 And piglit test: tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-22 12:42:34 +11:00
Marek Olšák	f7ffa504a0	ac/surface: compute tile swizzle for GFX9 Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-03-21 13:40:06 -04:00
Samuel Pitoiset	f0211155f1	radv: add support for VK_EXT_depth_range_unrestricted This extension removes the restrictions on minDepth/maxDepth, minDepthBounds/maxDepthBounds and VkClearDepthStencilValue::depth. The following CTS tests now pass: dEQP-VK.glsl.builtin_var.fragdepth.line_list_d32_sfloat_large_depth dEQP-VK.glsl.builtin_var.fragdepth.point_list_d32_sfloat_large_depth dEQP-VK.glsl.builtin_var.fragdepth.triangle_list_d32_sfloat_large_depth dEQP-VK.draw.inverted_depth_ranges.nodepthclamp_depth_range_unrestricted dEQP-VK.draw.inverted_depth_ranges.depthclamp_depth_range_unrestricted Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-20 21:55:41 +01:00
Samuel Pitoiset	4e9b0b39b5	radv: only enable one channel when exporting prim id It's a 32-bit integer like the layer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-20 21:54:48 +01:00
Timothy Arceri	9a243eccae	radv: don't lower indirects until after opts have run Noticed while passing by. Not sure if it impacts anything, but likely to impact GFX9 more than anything else since we lower inputs, outputs and locals there. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-20 15:01:44 +11:00
Dave Airlie	32791a0502	radv: don't export NULL layer. We have some cases where in subpass we want the layer but having it be 0 and loaded in the frag shader without the vertex shader exporting it is fine. So don't export the layer if we don't have a value to put in it. Fixes: `d4c74aed7a` (radv/multiview: mark layer_input if we have input attachments.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 21:36:48 +00:00
Dave Airlie	e8d9b7ab02	radv: lower constant initializers on output variables earlier If a shader only writes to an output via a constant initializer we need to lower it before we call nir_remove_dead_variables so that this pass sees the stores from the initializer and doesn't kill the output. Fixes test failures in new work-in-progress CTS tests: dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.float This is ported from anv: `99b57daf4a` anv/pipeline: lower constant initializers on output variables earlier from Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:40 +00:00
Dave Airlie	032014ac01	radv/query: handle multiview timestamp queries. For each view bit we need to emit a timestamp query. Fixes: dEQP-VK.multiview.queries* Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:14 +00:00
Dave Airlie	32b4f3c38d	radv/query: handle multiview queries properly. (v3) For multiview we need to emit a number of sequential queries depending on the view mask. This avoids dEQP-VK.multiview.queries.15 waiting forever on the CPU for query results that are never coming. We only really want to emit one query, and the rest should be blank (amdvlk does the same), so we emit begin/end pairs for all the others except the first query. v2: fix tests v3: split out patch. Fixes: dEQP-VK.multiview.queries* Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:09 +00:00
Dave Airlie	4034dc5c72	radv/query: split out begin/end query emission This just splits out the begin/end query hw emissions, it makes it easier to add multiview support for queries. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:05 +00:00
Dave Airlie	d4c74aed7a	radv/multiview: mark layer_input if we have input attachments. This fixes: dEQP-VK.multiview.input_attachments* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:26:39 +00:00
Dave Airlie	8f052a3e25	radv: handle exporting view index to fragment shader. (v1.1) The fragment shader was trying to read this, but nothing was exporting it from the vertex shader. This handles it like the prim id export. Fixes: dEQP-VK.multiview.secondary_cmd_buffer.* dEQP-VK.multiview.index.fragment_shader.* v1.1: updated to use 0x1 (Samuel) Fixes: `e3265c10c8` (radv: Implement multiview draws.) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-19 01:20:00 +00:00
Grazvydas Ignotas	e1b2e5667c	radv: make vk_format_description structures static No need to bother the linker about them. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-17 18:53:21 +02:00
Grazvydas Ignotas	331141e87e	radv: fix stale comment in generated vk_format_table.c It seems to be a leftover from u_format_table.py. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-17 18:53:21 +02:00
Samuel Pitoiset	e96a1d27dc	radv: run nir_opt_move_load_ubo Polaris10: SGPRS: 108560 -> 107856 (-0.65 %) VGPRS: 74576 -> 74520 (-0.08 %) Spilled SGPRs: 7375 -> 7113 (-3.55 %) Code Size: 4273464 -> 4274364 (0.02 %) bytes Max Waves: 9434 -> 9446 (0.13 %) Vega10: Totals from affected shaders: SGPRS: 108264 -> 107576 (-0.64 %) VGPRS: 69068 -> 69000 (-0.10 %) Spilled SGPRs: 7221 -> 6959 (-3.63 %) Code Size: 3800796 -> 3801496 (0.02 %) bytes Max Waves: 10687 -> 10709 (0.21 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-16 09:58:19 +01:00
Dave Airlie	9d0d806332	radv: drop geometry stride user sgpr. This removes the other geometry specific user sgpr. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:21 +00:00
Dave Airlie	6f051549c3	radv: get rid of geometry user sgpr for num entries. This drops one of the geometry specific user sgprs, we can work this out at compile time. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:17 +00:00
Dave Airlie	9188bd78d7	radv: migrate lds size calculations to shader gen. This moves the lds_size calcs into the shader so we have all the size stuff in one file. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:12 +00:00
Dave Airlie	384aced65e	radv: drop scanning the tess shader in the nir code. This drops the now unneeded scanning and results in favour of the ones in the info. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:08 +00:00
Dave Airlie	f50d520acf	radv: use num_patches output from tcs shader. Instead of recalculating the value, use the shader calculated value. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:05 +00:00
Dave Airlie	bf9a0ea853	radv/tess: remove last chunk of tess sgprs This removes the last TES-specifc user sgpr. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:01 +00:00
Dave Airlie	6db44d6a8c	radv: pass num_patches to tes from tcs TES needs num_patches to do some of the calculations. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:58 +00:00
Dave Airlie	010d055aae	radv: drop tess offchip layout for tcs. This removes the last TCS specific user sgpr. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:54 +00:00
Dave Airlie	ee31cff856	radv: drop tcs_out_offsets Move all calculations to shader generation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:47 +00:00
Dave Airlie	b0460bbf1c	radv: drop tcs_out_layout Move all calculations to shader generation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:43 +00:00
Dave Airlie	6adf99165c	radv/tess: drop tcs_in_layout setting completely. Inline all calcs at shader creation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:37 +00:00
Dave Airlie	f343d11ae7	radv: drop ls_out_layout const. We can precalculate input_vertex_size at compile time. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:32 +00:00
Dave Airlie	d89b16b7b9	radv/shader_info: start gathering tess output info (v2) This gathers the ls outputs written by the vertex shader, and the tcs outputs, these are needed to calculate certain tcs parameters. These have to be separate for combined gfx9 shaders. This is a bit pessimistic compared to the nir pass, as we don't work out the individual slots for tcs outputs, but I actually thing it should be fine to just mark the whole thing used here. v2: move to radv, handle clip dist (Samuel), handle compacts and patchs properly. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:23 +00:00

1 2 3 4 5 ...

2371 Commits