KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Ian Romanick	d76c204d05	util: Move util_is_power_of_two to bitscan.h and rename to util_is_power_of_two_or_zero The new name makes the zero-input behavior more obvious. The next patch adds a new function with different zero-input behavior. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-03-29 14:09:23 -07:00
Samuel Pitoiset	e45fe0ed66	radv: fix scanning output_usage_mask with structs To fix a regression in: dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.struct And the following regressions (Polaris only): dEQP-VK.glsl.indexing.varying_array.* Fixes: `f3275ca01c` ("ac/nir: only enable used channels when exporting parameters") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-29 10:22:10 +02:00
Daniel Schürmann	b91cd5dba4	radv: enable VK_AMD_shader_trinary_minmax extension Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 01:29:39 +02:00
Daniel Schürmann	d00fb7ce54	ac: add support for trinary_minmax instructions v2: Add missing break (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 01:29:35 +02:00
Bas Nieuwenhuizen	4503ff760c	ac/nir: Add workaround for GFX9 buffer views. On GFX9 whether the buffer size is interpreted as elements or bytes depends on whether IDXEN is enabled in the instruction. If the index is a constant zero, LLVM optimizes IDXEN to 0. Now the size in elements is interpreted in bytes which of course results in out of bounds accesses. The correct fix is most likely to disable the LLVM optimization, but we need something to work with LLVM <= 6.0. radeonsi does the max between stride and element count on the CPU but that results in the size intrinsics returning the wrong size for the buffer. This would cause CTS errors for radv. v2: Also include the store changes. Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 00:03:03 +02:00
Marek Olšák	4f96747530	ac/surface: set AddrSurfInfoIn.format = ADDR_FMT_8 for stencil, add assertions Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105738 Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 17:23:41 -04:00
Samuel Pitoiset	1c4fdcf444	radv: enable VK_EXT_sampler_filter_minmax Only enable for CIK+ because it's buggy on SI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 22:55:48 +02:00
Samuel Pitoiset	413d77e7f9	radv: add support for VK_EXT_sampler_filter_minmax The driver only supports the required formats for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 22:55:48 +02:00
Samuel Pitoiset	99b52aa1da	radv: rename VEGA10 device name Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 20:15:17 +02:00
Samuel Pitoiset	4d2c46dda3	radv: add support for Vega12 Based on RadeonSI. Untested. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 20:15:14 +02:00
Marek Olšák	20eb44ad65	radeonsi: add support for Vega12 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-28 11:37:43 -04:00
Marek Olšák	5425d32fcf	amd/addrlib: update to the latest version for Vega12 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-28 11:37:43 -04:00
Timothy Arceri	92fa89a08d	ac/radeonsi: pass bindless bool to load_sampler_desc() We also fix the base_index for bindless by using the driver location. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 12:56:16 +11:00
Timothy Arceri	51f175028d	ac/nir_to_llvm: fix component packing for double outputs We need to wait until after the writemask is widened before we adjust it for component packing. Together with the previous patch this fixes a number of arb_enhanced_layouts component layout piglit tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 09:59:37 +11:00
Marek Olšák	769603564e	radeonsi: don't reallocate on DMABUF export if local BOs are disabled	2018-03-26 19:22:12 -04:00
Samuel Pitoiset	ccc64f3133	radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8 The hardware only supports 32-bit depth surfaces, but we can enable TC-compat HTILE for 16-bit depth surfaces if no Z planes are compressed. The main benefit is to reduce the number of depth decompression passes. Also, we don't need to implement DB->CB copies which is fine. This improves Serious Sam 2017 by +4%. Talos and F12017 are also affected but I don't see a performance difference. This also improves the shadowmapping Vulkan demo by 10-15% (FPS is now similar to AMDVLK). No CTS regressions on Polaris10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-23 10:05:57 +01:00
Samuel Pitoiset	5ae9772245	radv: add radv_calc_decompress_on_z_planes() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-23 10:05:55 +01:00
Samuel Pitoiset	9b8e75bee3	radv: add radv_image_is_tc_compat_htile() helper Instead of that huge conditional that's going to be crazy. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-23 10:05:54 +01:00
Jason Ekstrand	884d27bcf6	nir: Rename image intrinsics to image_var Generated with git grep -l nir_intrinsic_image | xargs sed -i 's/nir_intrinsic_image/nir_intrinsic_image_var/g' and some manual fixing in nir_intrinsics.h Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-23 13:48:11 +11:00
Juan A. Suarez Romero	0bf1274883	radv: autotools: add radv_extensions.h in the generated VULKAN list Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	13459c637a	anv/radv: autotools: include vulkan_*.h headers Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Samuel Pitoiset	52fba3f45d	radv: remove unused radv_pipeline::needs_data_cache variable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-22 14:30:37 +01:00
Timothy Arceri	c135316555	ac/nir_to_llvm: add frexp support Fixes CTS tests: KHR-GL40.gpu_shader_fp64.builtin.frexp_double KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec2 KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec3 KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec4 And piglit test: tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-22 12:42:34 +11:00
Marek Olšák	f7ffa504a0	ac/surface: compute tile swizzle for GFX9 Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-03-21 13:40:06 -04:00
Samuel Pitoiset	f0211155f1	radv: add support for VK_EXT_depth_range_unrestricted This extension removes the restrictions on minDepth/maxDepth, minDepthBounds/maxDepthBounds and VkClearDepthStencilValue::depth. The following CTS tests now pass: dEQP-VK.glsl.builtin_var.fragdepth.line_list_d32_sfloat_large_depth dEQP-VK.glsl.builtin_var.fragdepth.point_list_d32_sfloat_large_depth dEQP-VK.glsl.builtin_var.fragdepth.triangle_list_d32_sfloat_large_depth dEQP-VK.draw.inverted_depth_ranges.nodepthclamp_depth_range_unrestricted dEQP-VK.draw.inverted_depth_ranges.depthclamp_depth_range_unrestricted Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-20 21:55:41 +01:00
Samuel Pitoiset	4e9b0b39b5	radv: only enable one channel when exporting prim id It's a 32-bit integer like the layer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-20 21:54:48 +01:00
Timothy Arceri	9a243eccae	radv: don't lower indirects until after opts have run Noticed while passing by. Not sure if it impacts anything, but likely to impact GFX9 more than anything else since we lower inputs, outputs and locals there. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-20 15:01:44 +11:00
Dave Airlie	32791a0502	radv: don't export NULL layer. We have some cases where in subpass we want the layer but having it be 0 and loaded in the frag shader without the vertex shader exporting it is fine. So don't export the layer if we don't have a value to put in it. Fixes: `d4c74aed7a` (radv/multiview: mark layer_input if we have input attachments.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 21:36:48 +00:00
Dave Airlie	e8d9b7ab02	radv: lower constant initializers on output variables earlier If a shader only writes to an output via a constant initializer we need to lower it before we call nir_remove_dead_variables so that this pass sees the stores from the initializer and doesn't kill the output. Fixes test failures in new work-in-progress CTS tests: dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.float This is ported from anv: `99b57daf4a` anv/pipeline: lower constant initializers on output variables earlier from Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:40 +00:00
Dave Airlie	032014ac01	radv/query: handle multiview timestamp queries. For each view bit we need to emit a timestamp query. Fixes: dEQP-VK.multiview.queries* Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:14 +00:00
Dave Airlie	32b4f3c38d	radv/query: handle multiview queries properly. (v3) For multiview we need to emit a number of sequential queries depending on the view mask. This avoids dEQP-VK.multiview.queries.15 waiting forever on the CPU for query results that are never coming. We only really want to emit one query, and the rest should be blank (amdvlk does the same), so we emit begin/end pairs for all the others except the first query. v2: fix tests v3: split out patch. Fixes: dEQP-VK.multiview.queries* Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:09 +00:00
Dave Airlie	4034dc5c72	radv/query: split out begin/end query emission This just splits out the begin/end query hw emissions, it makes it easier to add multiview support for queries. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:05 +00:00
Dave Airlie	d4c74aed7a	radv/multiview: mark layer_input if we have input attachments. This fixes: dEQP-VK.multiview.input_attachments* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:26:39 +00:00
Dave Airlie	8f052a3e25	radv: handle exporting view index to fragment shader. (v1.1) The fragment shader was trying to read this, but nothing was exporting it from the vertex shader. This handles it like the prim id export. Fixes: dEQP-VK.multiview.secondary_cmd_buffer.* dEQP-VK.multiview.index.fragment_shader.* v1.1: updated to use 0x1 (Samuel) Fixes: `e3265c10c8` (radv: Implement multiview draws.) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-19 01:20:00 +00:00
Grazvydas Ignotas	e1b2e5667c	radv: make vk_format_description structures static No need to bother the linker about them. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-17 18:53:21 +02:00
Grazvydas Ignotas	331141e87e	radv: fix stale comment in generated vk_format_table.c It seems to be a leftover from u_format_table.py. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-17 18:53:21 +02:00
Samuel Pitoiset	e96a1d27dc	radv: run nir_opt_move_load_ubo Polaris10: SGPRS: 108560 -> 107856 (-0.65 %) VGPRS: 74576 -> 74520 (-0.08 %) Spilled SGPRs: 7375 -> 7113 (-3.55 %) Code Size: 4273464 -> 4274364 (0.02 %) bytes Max Waves: 9434 -> 9446 (0.13 %) Vega10: Totals from affected shaders: SGPRS: 108264 -> 107576 (-0.64 %) VGPRS: 69068 -> 69000 (-0.10 %) Spilled SGPRs: 7221 -> 6959 (-3.63 %) Code Size: 3800796 -> 3801496 (0.02 %) bytes Max Waves: 10687 -> 10709 (0.21 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-16 09:58:19 +01:00
Dave Airlie	9d0d806332	radv: drop geometry stride user sgpr. This removes the other geometry specific user sgpr. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:21 +00:00
Dave Airlie	6f051549c3	radv: get rid of geometry user sgpr for num entries. This drops one of the geometry specific user sgprs, we can work this out at compile time. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:17 +00:00
Dave Airlie	9188bd78d7	radv: migrate lds size calculations to shader gen. This moves the lds_size calcs into the shader so we have all the size stuff in one file. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:12 +00:00
Dave Airlie	384aced65e	radv: drop scanning the tess shader in the nir code. This drops the now unneeded scanning and results in favour of the ones in the info. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:08 +00:00
Dave Airlie	f50d520acf	radv: use num_patches output from tcs shader. Instead of recalculating the value, use the shader calculated value. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:05 +00:00
Dave Airlie	bf9a0ea853	radv/tess: remove last chunk of tess sgprs This removes the last TES-specific user sgpr. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:01 +00:00
Dave Airlie	6db44d6a8c	radv: pass num_patches to tes from tcs TES needs num_patches to do some of the calculations. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:58 +00:00
Dave Airlie	010d055aae	radv: drop tess offchip layout for tcs. This removes the last TCS specific user sgpr. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:54 +00:00
Dave Airlie	ee31cff856	radv: drop tcs_out_offsets Move all calculations to shader generation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:47 +00:00
Dave Airlie	b0460bbf1c	radv: drop tcs_out_layout Move all calculations to shader generation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:43 +00:00
Dave Airlie	6adf99165c	radv/tess: drop tcs_in_layout setting completely. Inline all calcs at shader creation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:37 +00:00
Dave Airlie	f343d11ae7	radv: drop ls_out_layout const. We can precalculate input_vertex_size at compile time. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:32 +00:00
Dave Airlie	d89b16b7b9	radv/shader_info: start gathering tess output info (v2) This gathers the ls outputs written by the vertex shader, and the tcs outputs, these are needed to calculate certain tcs parameters. These have to be separate for combined gfx9 shaders. This is a bit pessimistic compared to the nir pass, as we don't work out the individual slots for tcs outputs, but I actually think it should be fine to just mark the whole thing used here. v2: move to radv, handle clip dist (Samuel), handle compacts and patches properly. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:23 +00:00
Dave Airlie	2012dae19a	radv: migrate unique index info shader info (v2) This just moves this function to an inline so the shader_info pass can use it. v2: use inline (Samuel) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:19 +00:00
Samuel Pitoiset	16ecf037f9	radv: dump LLVM IR when a hang is detected Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:20:07 +01:00
Samuel Pitoiset	81818662a5	radv: record LLVM IR when debugging shaders If AMD_shader_info or RADV_TRACE_FILE is used we might need to keep trace of LLVM IR. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:20:03 +01:00
Samuel Pitoiset	d07edf5fdf	radv: add dump_shader to the NIR compiler options Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:20:00 +01:00
Samuel Pitoiset	50fcca328c	radv: pass the NIR compiler options to ac_compile_llvm_module() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:19:58 +01:00
Samuel Pitoiset	14c27c2511	radv: print some information when RADV_TRACE_FILE is set Just to be sure all options are enabled when trying to generate a hang report. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:19:54 +01:00
Samuel Pitoiset	5be2757c35	radv: only display options that are enabled Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:19:52 +01:00
Alejandro Piñeiro	50767214a7	spirv/radv: add AMD_gcn_shader capability, remove current extensions So now, during spirv_to_nir, it uses the capability instead of the extension. Note that what we are really doing here is treating SPV_AMD_gcn_shader as other supported extensions. SPV_AMD_gcn_shader is not the first SPV extension supported. For example, the capability draw_parameters infers if the extension SPV_KHR_shader_draw_parameters is supported or not. This could be seen as counter-intuitive, and that it would be easier to define which extensions are supported, and base our checks on that, but we need to take into account that some capabilities are optional from core, and others came from new extensions. Also this commit would make the implementation of ARB_spirv_extensions easier. v2: AMD_gcn_shader capability renamed to gcn_shader (Daniel Schürmann) Reviewed-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 12:08:25 +01:00
Alex Smith	fcf267ba08	radv: Fix CmdCopyImage between uncompressed and compressed images From the spec: "When copying between compressed and uncompressed formats the extent members represent the texel dimensions of the source image and not the destination." However, as per `7b890a36`, we must still use the destination image type when clamping the extent so that we copy the correct number of layers for 2D to 3D copies. Fixes: `7b890a36` "radv: Fix vkCmdCopyImage for 2d slices into 3d Images" Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-14 09:59:21 +00:00
Samuel Pitoiset	38f34117dd	radv: fix vkGetDeviceQueue2() when create flags don't match This fixes CTS: dEQP-VK.api.device_init.create_device_queue2_unmatched_flags Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@gmail.com>	2018-03-14 09:53:42 +01:00
Dave Airlie	3b0f2081b5	radv: drop assert on bindingDescriptorCount > 0 The spec is pretty clear that this can be 0, and that it operates as a reserved binding. Fixes: dEQP-VK.binding_model.descriptor_update.empty_descriptor.uniform_buffer Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-14 16:54:52 +10:00
Dave Airlie	27a5e5366e	radv: mark all tess output for an indirect access. If a shader does a tcs store with an indirect access, we were only marking the first spot as used. For indirect access we always now mark all slots used by the variable. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 Fixes: `94f9591995` (radv/ac: add support for TCS/TES inputs/outputs.) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-14 11:18:54 +10:00
Dave Airlie	4f0c89d66c	ac/nir: pass the nir variable through tcs loading. I was going to have to add another parameter to this monster, so we should just pass the nir_variable in, I can't find any reason this would be a bad idea. This is needed for the next fix. Fixes: `94f9591995` (radv/ac: add support for TCS/TES inputs/outputs.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-14 11:18:54 +10:00
Dave Airlie	f9de2d409b	radv: get correct offset into LDS for indexed vars. This seems more correct to me, since if we have an array of floats they'll be vec4 aligned, and if we do af[2], we want the const index to increase by 2 slots in the non compact case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 Fixes: `94f9591995` (radv/ac: add support for TCS/TES inputs/outputs.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-14 11:18:54 +10:00
Jason Ekstrand	85000b812d	ac/nir: Use lower_vote_eq_to_ballot instead of ac_nir_lower_subgroups Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 13:25:27 -07:00
Samuel Pitoiset	7c83430672	ac/nir: rename radeon_llvm_reg_index_soa() to ac_llvm_reg_index_soa() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:28 +01:00
Samuel Pitoiset	b128fd773f	ac/nir: remove some unnecessary includes and declarations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:27 +01:00
Samuel Pitoiset	cd4e823341	ac/nir: drop radv prefix from radv_lower_gather4_integer() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:25 +01:00
Samuel Pitoiset	fbe694562b	ac/nir: move ac_nir_compiler_options and friends to radv folder Also replace ac_ by radv_. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:23 +01:00
Samuel Pitoiset	237229430f	ac: move ac_shader_info to radv folder This is RADV specific code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:21 +01:00
Samuel Pitoiset	2cfba40eea	ac/nir: move ac_shader_variant_info and friends to radv folder Also replace ac_ by radv_. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:16 +01:00
Samuel Pitoiset	b2653007b9	ac/nir: move all RADV related code to radv_nir_to_llvm.c Now the "ac/nir" prefix will really be the shared code between RadeonSI and RADV, that might avoid confusions in the future. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	8e15824b9d	ac/nir: make emit_barrier() non-static Required in order to move all RADV specific code outside of ac/nir. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	4e3117b718	ac/nir: move radeon_llvm_reg_index_soa() to ac_nir_to_llvm.h Required in order to move all RADV specific code outside of ac/nir. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	3a30b89353	ac/nir: make handle_shader_output_decl() non-static Required in order to move all RADV specific code outside of ac/nir. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	3fe47b1290	ac/nir: change prototype of handle_shader_output_decl() This allows to remove the ac_nir_context dependency. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	61a91ca3f5	ac/nir: move unpack_param() to ac_llvm_build.c Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	28bb6873ec	ac/nir: move trim_vector to ac_llvm_build.c Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	895632baef	ac/nir: move cast_ptr() to ac_llvm_build.c Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	bf6368297b	ac/nir: move ac_build_alloca() to ac_llvm_build.c As well as si_build_alloca_undef() and drop the si prefix. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Bas Nieuwenhuizen	997306c031	radv: Increase the number of dynamic uniform buffers. The vulkan API is not ideal as it does not allow us to have a shared limit. Feral needs 15+6 for one of their games, and I'm not a fan of overcommitting the limits, so increase the number of dynamic uniform buffers to 16. CC: <mesa-stable@lists.freedesktop.org> CC: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-12 09:46:22 +01:00
Marek Olšák	e99212e970	ac/gpu_info: print ib_start_alignment, add assertion	2018-03-09 16:28:29 -05:00
Bas Nieuwenhuizen	a793e7899f	radv: Fix the autotools build take 2. Forgot to remove a word.... Fixes: `04ffabf17a` "radv: Fix autotools build."	2018-03-09 14:10:24 +01:00
Bas Nieuwenhuizen	04ffabf17a	radv: Fix autotools build. Forgot it again .... Fixes: `b6347807a9` "radv: Generate icd files." Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-09 09:36:19 +01:00
Samuel Pitoiset	365850fd68	ac/nir: set number of channels for packed mrt exports Bit 0 enables VSRC0 (R in low bits, G high) and bit 2 enables VSRC1 (B in low bits, A high). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-09 09:28:20 +01:00
Bas Nieuwenhuizen	68201ab2da	radv: Update version to 1.1.70. Turns out they did not reset the patch number on release. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-09 07:53:39 +01:00
Bas Nieuwenhuizen	b6347807a9	radv: Generate icd files. If the api version is too low, the loader clamps the application requested version to the advertized version, which messes with which extensions are enabled. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-09 07:53:39 +01:00
Marek Olšák	78ef16e2f9	winsys/amdgpu: query GDS info Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-08 14:58:16 -05:00
Marek Olšák	75c5d25f0f	radeonsi: align command buffer starting address to fix some Raven hangs Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-08 14:58:16 -05:00
Samuel Pitoiset	4e3c1ace65	ac/nir: do not emit unnecessary null exports in fragment shaders Null exports should only be needed when no other exports are emitted. This removes a bunch of 'exp null off, off, off, off done vm'. Affected games are Dota 2 and Wolfenstein 2, not sure if that really helps, but code size is decreasing there. Polaris10: Totals from affected shaders: SGPRS: 8216 -> 8216 (0.00 %) VGPRS: 7072 -> 7072 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 454968 -> 453896 (-0.24 %) bytes Max Waves: 772 -> 772 (0.00 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-08 11:56:05 +01:00
Timothy Arceri	0c90264da4	ac/radeonsi: add emit_kill to the abi This should fix a regression with Rocket League grass rendering on the NIR backend. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104717	2018-03-08 11:28:37 +11:00
Timothy Arceri	99cdc019bf	ac: make use of if/loop build helpers These helpers insert the basic block in the same order as they appear in NIR making it easier to follow LLVM IR dumps. The helpers also insert more useful labels onto the blocks. TGSI uses the line number of the corresponding opcode in the TGSI dump as the label id, here we use the corresponding block index from NIR. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-08 10:12:34 +11:00
Timothy Arceri	42627dabb4	ac: add if/loop build helpers These have been ported over from radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-08 10:12:34 +11:00
Daniel Schürmann	ffbf75cde4	radv: enable AMD_gcn_shader extension Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-07 23:09:58 +01:00
Daniel Schürmann	18c7f1e041	ac: implement AMD_gcn_shader extended instructions Co-authored-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-07 23:09:58 +01:00
Bas Nieuwenhuizen	034cce96b4	radv: Don't emit a warning on VI-GFX9. We are conformant: https://www.khronos.org/conformance/adopters/conformant-products#submission_308 v2: Actually not emit it on gfx9. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	04d65d2b76	radv: Enable vulkan 1.1.0 for configurations that can support it. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	0168eaaa42	radv: Disable sampler ycbcr conversion. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	cce62f4065	radv: Expose that we don't support any VK_KHR_16_bit_storage parts. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	b99b9cc864	radv: Implement vkEnumerateInstanceVersion. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	5240fddb9d	radv: Add trivial device group implementation. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	84e877aa77	radv: Implement vkCmdDispatchBase. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	de5e25898c	radv: Implement VkGetDeviceQueue2. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	b137e25277	radv: Support VkPhysicalDeviceProtectedMemoryFeatures. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	4bcf4d1678	radv: Support VkPhysicalDeviceShaderDrawParameterFeatures. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	41d958d073	radv: Implement VK_KHR_maintenance3. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	8f9af587a2	radv: Add minimal subgroup support. Deliberately not implementing workgroup scopes as that is not needed for core vulkan. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	89651fba9b	radv: Change client version check. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:34 +01:00
Bas Nieuwenhuizen	5b3979704d	radv: Update MAX_API_VERSION to 1.1.0 v2: Don't bump supported version. v3: Update json files. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:34 +01:00
Bas Nieuwenhuizen	97f10934ed	ac/nir: Add vote_ieq/vote_feq lowering pass. The old vote_eq implementation supported only booleans, but now we have to support arbitrary values, so use the read_first_invocation intrinsic + ballot. I took this as an opportunity to figure out how easy it was to do this in nir instead of in the nir_to_llvm pass, and it actually turned out pretty okay IMO. Only creating the pass is some extra code. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:32 +01:00
Jason Ekstrand	44681e4795	nir: Generalize nir_intrinsic_vote_eq The SPIR-V extension wants us to be able to do an AllEqual on any vector or scalar type. This has two implications: 1) We need to be able to handle vectors so we switch the vote_eq intrinsics to be vectorized intrinsics. 2) We need to handle floats which have different behavior with respect to +-0, NaN, etc. than the integer variant so we need two variants. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	3960d0e332	vulkan: Rename multiview from KHX to KHR Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 12:13:47 -08:00
Marek Olšák	2c3f3651c4	radeonsi: fix passing address32_hi to LLVM for high values The old function treats high values as negative, which LLVM interprets as 0.	2018-03-07 13:55:49 -05:00
Bas Nieuwenhuizen	94c9096c83	radv: Add entrypoints generation with the new vk.xml A lot of it is based on intel again. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 15:50:19 +01:00
Dave Airlie	fb077b0728	ac/nir: don't put lod into args if it's zero. If it's zero but put it in args we still end up consuming a register for it. This fixes some spilling in the NIR paths in Dirt Rally that isn't seen with TGSI. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-07 03:34:59 +00:00
Samuel Pitoiset	e96e6f60f7	radv: report the scratch private memory size with shader stats Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:38:42 +01:00
Samuel Pitoiset	7f6b91c9c3	ac/nir: count the scratch private memory size Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:38:40 +01:00
Samuel Pitoiset	3b8e7459f2	ac: add ac_count_scratch_private_memory() Imported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:38:38 +01:00
Samuel Pitoiset	f3275ca01c	ac/nir: only enable used channels when exporting parameters This allows us to generate, for example, "exp param0 v0, off, off, off" if only the first channel is needed. Not sure if this improves performance but it's worth trying. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:38:35 +01:00
Samuel Pitoiset	675dde13b2	ac: update enabled channels mask when optimizing PARAM exports When the mask is not 0xf we need to update the number of enabled channels, otherwise the hardware won't emit the components that are combined. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:37:52 +01:00
Samuel Pitoiset	c24abae9dc	ac/nir: pass the number of enabled channels to si_llvm_init_export_args() Currently, it's always 0xf but an upcoming patch will reduce the number of channels for parameters export. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:37:50 +01:00
Samuel Pitoiset	5cd34f03c0	ac/shader: scan output usage mask for VS and TES Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:37:47 +01:00
Tapani Pälli	237c9caa78	vulkan: do not expose surface/swapchain extensions on Android On Android surface/swapchain extensions are implemented by the loader. Patch modifies both anv and radv extension scripts disabling currently exposed ones. See also earlier commit `9f763c1f9b`. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-06 08:02:59 +02:00
Timothy Arceri	20bd0f6a2b	ac: pass the unmodified number of components to load gs inputs Currently both users of this would overflow an array when the input was a dual slot double as they expected the number of components to be a max of 4. Since we pass the type we can just let the functions handle doubles in a way they choose. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 11:44:06 +11:00
Samuel Pitoiset	322a51b549	ac: add ac_build_fsign() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-05 11:04:36 +01:00
Samuel Pitoiset	e8bdde2289	ac: add ac_build_isign() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-05 11:04:32 +01:00
Samuel Pitoiset	459e33900f	ac: add ac_build_fract() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-05 11:04:30 +01:00
Timothy Arceri	0f2c7341e8	ac/radv: move lower_indirect_derefs() to ac_nir_to_llvm.c Until llvm handles indirects better we will need to use these workarounds in the radeonsi backend also. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-05 14:09:23 +11:00
Bas Nieuwenhuizen	eea20d59ab	radv: Fix copying from 3D images starting at non-zero depth. Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-05 01:04:54 +01:00
Samuel Pitoiset	c133a3411b	radv: do not set pending_reset_query in BeginCommandBuffer() This is just useless for two reasons: 1) flush_bits is not set accordingly, so nothing will be flushed in BeginQuery(). 2) we always flush caches in EndCommandBuffer(), so if a reset is done in a previous command buffer we are safe. Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-02 09:44:12 +01:00
Timothy Arceri	f5305c1b44	ac: fix nir_intrinsic_shared_atomic_comp_swap handling Following on from `49879f3778` this makes sure we use the correct src index. Fixes cts test: KHR-GL46.compute_shader.atomic-case3 Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-02 09:11:20 +11:00
Samuel Pitoiset	c27f5419f6	radv: only emit cache flushes when the pool size is large enough This is an optimization which reduces the number of flushes for small pool buffers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:53:40 +01:00
Samuel Pitoiset	2fe07933bd	radv: keep track of the query pool size Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:53:39 +01:00
Samuel Pitoiset	c956d0f406	radv: make sure to emit cache flushes before starting a query If the query pool has been previously reset using the compute shader path. Fixes: `a41e2e9cf5` ("radv: allow to use a compute shader for resetting the query pool") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105292 Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:14:49 +01:00
Bas Nieuwenhuizen	f9898b211e	radv: Use the syncobj wait ioctl to wait on fences if possible. Handles the !waitAll and signal after the start of the wait cases correctly. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen	34bd5e2e2e	radv: Implement more efficient !waitAll fence waiting. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen	6968d782d3	radv: Implement waiting on non-submitted fences. Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen	2a404c6f92	radv: Implement WaitForFences with !waitAll. Nothing to do except using a busy wait loop. At least for old kernels. A better implementation for newer kernels to come later. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105255 Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-01 01:07:18 +01:00
Dave Airlie	49879f3778	ac/nir: fix shared atomic operations. The nir->llvm conversion was using the wrong srcs. Fixes: tests/spec/arb_compute_shader/execution/shared-atomics.shader_test Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-01 10:06:06 +10:00
Dave Airlie	69495b30a3	ac/nir: don't apply slice rounding on txf_ms This matches the tgsi code. Fixes arb_texture_multisample texelFetch piglit tests. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `f4e499ec79` (radv: add initial non-conformant radv vulkan driver) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-01 10:04:34 +10:00
Samuel Pitoiset	639c4f2b54	ac/shader: move scanning some info about input PS declarations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-28 10:14:26 +01:00
Dave Airlie	c7b25005a1	ac/radv: move load base vertex abi setup to vertex shader. This was segfaulting: dEQP-VK.memory.pipeline_barrier.host_write_index_buffer.1024 Fixes: `8de6f79707` (ac/radeonsi: add load_base_vertex() to the abi) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-28 09:58:12 +10:00
Dave Airlie	3401b028df	ac/shader: fix vertex input with components. This fixes: dEQP-VK.glsl.440.linkage.varying.component.* Fixes: `1c57a6da5e` (ac/shader: scan vertex inputs usage mask) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-28 09:04:46 +10:00
Dave Airlie	6bafd4f4dd	radv: remove device pointer from buffer. This is never used. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-28 09:03:26 +10:00
Timothy Arceri	08fa84bb9a	ac: implement nir_op_ldexp Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	9790921ff5	ac: fix nir_op_fdd{x,y} handling radeonsi, i965 and anv all treat fdd{x,y} opcodes the same as fdd{x,y}_coarse by default. The SPIR-V spec lets the implementation decide how it should be handled and radv was previously going for the higher quality option. Here we change the shared amd code to match how nir_op_fdd{x,y} is expected to be handled by the other NIR drivers. Fixes piglit test: ./bin/arb_shader_texture_lod-texgrad -auto Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	8de6f79707	ac/radeonsi: add load_base_vertex() to the abi Fixes the following piglit tests: ./bin/arb_shader_draw_parameters-basevertex basevertex -auto -fbo ./bin/arb_shader_draw_parameters-basevertex basevertex-baseinstance -auto -fbo Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	5504bebfc4	ac: add support for handling nir_intrinsic_load_vertex_id This will be used by radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	3a0b4187dd	ac: fix f2b and i2b for doubles Without this llvm was asserting in debug builds. V2: use LLVMConstNull() Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-28 09:23:49 +11:00
Samuel Pitoiset	a549da877b	ac/nir: clean up a hack about rounding 2nd coord component It's basically just the opposite, and it only makes sense to round the layer for 2D texture arrays. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-27 10:09:27 +01:00
Dave Airlie	250468f6b7	radv: expose async compute on SI It looks like we had all the pieces in place for this, just never tested it and turned it on. I don't see any CTS regressions and the computeshader demo runs. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-27 00:54:59 +00:00
Dave Airlie	1fc19a0f27	radv: merge tess rings into a single bo Inspired by a passing commit to radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-27 00:54:59 +00:00
Samuel Pitoiset	e05507a427	ac/nir: use ordered float comparisons except for not equal Original patch from Timothy Arceri, I have just fixed the not equal case locally. This fixes one important rendering issue in Wolfenstein 2 (the cutscene transition issue). RadeonSI uses the same ordered comparisons, so I guess that's what we should do as well. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104302 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2018-02-26 13:59:04 +01:00
Timothy Arceri	9873bd9dcd	ac: make use of ac_get_llvm_num_components() helper Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-26 11:43:47 +11:00
James Legg	afd8fd0656	radv: Really use correct HTILE expanded words. When transitioning to an htile compressed depth format, set the full depth range, so later rasterization can pass HiZ. Previously, for depth only formats, the depth range was set to 0 to 0. This caused unwanted HiZ rejections with a VK_FORMAT_D16_UNORM depth buffer (VK_FORMAT_D32_SFLOAT was not affected somehow). These values are derived from PAL [0], since I can't find the specification describing the htile values. [0] `5cba4ecbda/src/core/hw/gfxip/gfx9/gfx9MaskRam.cpp (L1500)` CC: Dave Airlie <airlied@redhat.com> CC: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> CC: mesa-stable@lists.freedesktop.org Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Fixes: `5158603182` "radv: Use correct HTILE expanded words."	2018-02-24 02:16:22 +01:00
Mauro Rossi	8eed942136	radv/extensions: fix c_vk_version for patch == None Similar to `cb0d1ba156` ("anv/extensions: Fix VkVersion::c_vk_version for patch == None") fixes the following building errors: out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/radv_entrypoints.c:1161:48: error: use of undeclared identifier 'None'; did you mean 'long'? return instance && VK_MAKE_VERSION(1, 0, None) <= core_version; ^~~~ long external/mesa/include/vulkan/vulkan.h:34:43: note: expanded from macro 'VK_MAKE_VERSION' (((major) << 22) \| ((minor) << 12) \| (patch)) ^ ... fatal error: too many errors emitted, stopping now [-ferror-limit=] 20 errors generated. Fixes: `e72ad05c1d` ("radv: Return NULL for entrypoints when not supported.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-24 00:31:31 +01:00
Bas Nieuwenhuizen	032870beda	radv: Fix autotools build. Somewhere along the way the Makefile changes got lost ... Fixes: `4db78f3a6b` "radv: Put supported extensions in a struct." Acked-by: Dave Airlie <airlied@redhat.com>	2018-02-23 01:54:12 +01:00
Bas Nieuwenhuizen	e72ad05c1d	radv: Return NULL for entrypoints when not supported. This implements strict checking for the entrypoint ProcAddr functions. - InstanceProcAddr with instance = NULL, only returns the 3 allowed entrypoints. - DeviceProcAddr does not return any instance entrypoints. - InstanceProcAddr does not return non-supported or disabled instance entrypoints. - DeviceProcAddr does not return non-supported or disabled device entrypoints. - InstanceProcAddr still returns non-supported device entrypoints. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	414f5e0e14	radv: Reword radv_entrypoints_gen.py With a big inspiration from anv as always ... Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	076f7cfc6b	radv: Track enabled extensions. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	4db78f3a6b	radv: Put supported extensions in a struct. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Samuel Pitoiset	d6b7539206	ac/nir: remove emission of nir_op_fpow fpow is now lowered at NIR level. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:44:46 +01:00
Samuel Pitoiset	7aa008d1d7	radv: enable lowering of fpow to fexp2 and flog2 There is no fpow in hardware, so it's always lowered somewhere, but it appears that lowering at NIR level is better. Figured while comparing compute shaders between RadeonSI and RADV. Polaris10: Totals from affected shaders: SGPRS: 18936 -> 18904 (-0.17 %) VGPRS: 12240 -> 12220 (-0.16 %) Spilled SGPRs: 2809 -> 2809 (0.00 %) Code Size: 718116 -> 719848 (0.24 %) bytes Max Waves: 1409 -> 1410 (0.07 %) Vega10: Totals from affected shaders: SGPRS: 18392 -> 18392 (0.00 %) VGPRS: 12008 -> 11920 (-0.73 %) Spilled SGPRs: 3001 -> 2981 (-0.67 %) Code Size: 777444 -> 778788 (0.17 %) bytes Max Waves: 1503 -> 1504 (0.07 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:40:47 +01:00
Samuel Pitoiset	a01e9996b5	ac/nir: set GLC=1 for load/store of coherent/volatile images This disables persistence across wavefronts. F1 2017 and Wolfenstein 2 appear to use some coherent images but this patch doesn't seem to change anything. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:39:55 +01:00
Timothy Arceri	6d338d757f	ac/radeonsi: pass type to load_tess_varyings() We need this to be able to load 64bit varyings. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
James Zhu	f0ad908e79	amd/common:add uvd hevc enc support check in hw query Based on amdgpu hardware query information to check if UVD hevc enc support Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-21 13:53:38 -05:00
Samuel Pitoiset	a6accad68f	ac/nir: add glsl_is_array_image() helper For consistency. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-21 09:41:51 +01:00
Samuel Pitoiset	ff83dfb364	ac/nir: set the DA field when performing atomics on 3D images This doesn't fix anything known but it should definitely be set. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-21 09:41:49 +01:00
Dave Airlie	baa0feb73d	radv: don't send num_tcs_input_cp to sgprs. We never use it in the shaders. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:36 +00:00
Dave Airlie	952222ddd4	radv/tess: don't need to look in constant for vertices_per_patch This just avoids passing this value via user sgprs. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:28 +00:00
Dave Airlie	77fd1b9187	ac/radv: cleanup some tcs output values access Just consolidates some code to make it easier to change. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:23 +00:00
Dave Airlie	0e6f0d400b	ac/radv: remove total_vertices variable This just removes an unneeded variable. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:19 +00:00
Dave Airlie	e9b9fb3616	ac/radv: don't mark tess inner as used if we don't use it. This just avoids marking it as a used output if we don't actually use it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:15 +00:00
Dave Airlie	d5b2d7ed67	ac/nir: to integer the args to bcsel. dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw was hitting an llvm assert due to one value being an int and the other a float. This just casts both values to integer and fixes the test. Fixes: dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-20 23:15:18 +00:00
Samuel Pitoiset	1ac741d690	ac/nir: move ac_declare_lds_as_pointer() outside of the switch Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-20 10:44:59 +01:00
Samuel Pitoiset	b5d111ae76	radv: allow to force family using RADV_FORCE_FAMILY Useful for pipeline-db. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-20 10:44:47 +01:00
Samuel Pitoiset	549c7f3724	radv: compact varyings after removing unused ones It makes no sense to compact before, and the description of nir_compact_varyings() confirms that. Polaris10: Totals from affected shaders: SGPRS: 108528 -> 108128 (-0.37 %) VGPRS: 74548 -> 74500 (-0.06 %) Spilled SGPRs: 844 -> 814 (-3.55 %) Code Size: 3007328 -> 2992932 (-0.48 %) bytes Max Waves: 16019 -> 16009 (-0.06 %) Vega10: Totals from affected shaders: SGPRS: 106088 -> 106232 (0.14 %) VGPRS: 74652 -> 74700 (0.06 %) Spilled SGPRs: 692 -> 658 (-4.91 %) Code Size: 2967708 -> 2953028 (-0.49 %) bytes Max Waves: 18178 -> 18162 (-0.09 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-19 12:19:17 +01:00
Marek Olšák	931ec80eeb	radeonsi: implement 32-bit pointers in user data SGPRs (v2) User SGPRs changes: VS: 14 -> 9 TCS: 14 -> 10 TES: 10 -> 6 GS: 8 -> 4 GSCOPY: 2 -> 1 PS: 9 -> 5 Merged VS-TCS: 24 -> 16 Merged VS-GS: 18 -> 11 Merged TES-GS: 18 -> 11 SGPRS: 2170102 -> 2158430 (-0.54 %) VGPRS: 1645656 -> 1641516 (-0.25 %) Spilled SGPRs: 9078 -> 8810 (-2.95 %) Spilled VGPRs: 130 -> 114 (-12.31 %) Scratch size: 1508 -> 1492 (-1.06 %) dwords per thread Code Size: 52094872 -> 52692540 (1.15 %) bytes Max Waves: 371848 -> 372723 (0.24 %) v2: - the shader cache needs to take address32_hi into account - set amdgpu-32bit-address-high-bits Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)	2018-02-17 04:52:17 +01:00
Marek Olšák	0977b7f7b3	ac: query high bits of 32-bit address space	2018-02-17 04:51:58 +01:00
Bas Nieuwenhuizen	05d84ed68a	radv: Always lower indirect derefs after nir_lower_global_vars_to_local. Otherwise new local variables can cause hangs on vega. CC: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105098 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-15 23:45:59 +01:00
Samuel Pitoiset	579b33c1fd	ac/nir: do not reserve user SGPRs for unused descriptor sets In theory this might lead to corruption if we bind a descriptor set which is unused, because LLVM is smart and it can re-use unused user SGPRs. In practice, this doesn't seem to fix anything. As a side effect, this will reduce the number of emitted SH_REG packets. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-15 14:53:30 +01:00
Samuel Pitoiset	309854148c	ac/shader: fix gathering of desc_set_used_mask This was quite wrong. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-15 14:53:30 +01:00
Samuel Pitoiset	61a4fc3ecc	ac/shader: be a little smarter when scanning vertex buffers Although meta shaders don't use any vertex buffers, there is no behaviour change but I think it's better to do this. Though, this saves two user SGPRs for push constants inlining or something else. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-15 14:53:30 +01:00
Timothy Arceri	9740c8a8aa	ac: implement nir_intrinsic_image_samples Fixes cts test: KHR-GL45.shader_texture_image_samples_tests.image_functional_test Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-15 09:02:41 +11:00
Timothy Arceri	3ad52501dc	ac/nir_to_llvm: fix image size for arrays of arrays Fixes cts test: KHR-GL44.shader_image_size.advanced-changeSize Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-15 09:02:41 +11:00
Samuel Pitoiset	ad4b58ea70	ac/nir: rename nir_to_llvm_context to radv_shader_context There is still more to do in that area, but it's a good start. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:16 +01:00
Samuel Pitoiset	141db61509	ac: remove nir_to_llvm_context from ac_nir_translate() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:14 +01:00
Samuel Pitoiset	a541117ff4	ac/nir: remove nir_to_llvm_context::nir link Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:12 +01:00
Samuel Pitoiset	e9f0205ca2	ac: move the outputs array to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:10 +01:00
Samuel Pitoiset	07e4268f36	ac/shader: scan force_persample Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:08 +01:00
Bas Nieuwenhuizen	7461bd5b8f	ac: Use the renumbered const address space for LLVM 7. The LLVM AMDGPU backend decided to renumber the constant address space .... Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-14 01:05:03 +01:00
Timothy Arceri	10457712ed	ac/nir: add nir_intrinsic_{load,store}_shared support Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-13 14:43:05 +11:00
Timothy Arceri	c787cbfa33	ac/nir_to_llvm: add support for nir_intrinsic_shared_atomic_* Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-13 14:43:05 +11:00
Eric Anholt	1aed66dc1e	radv: Fix compiler warning about uninitialized 'set' The compiler doesn't figure out that we only get result == VK_SUCCESS if set got initialized. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 20:48:47 +00:00
Eric Anholt	091bff8317	ac/nir: Fix compiler warning about uninitialized dw_addr. Even switching the def's condition to be the same chip revision check as the use, the compiler doesn't figure it out. Just NULL-init it. Fixes: `ec53e52742` ("ac/nir: Add ES output to LDS for GFX9.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 20:48:29 +00:00
Samuel Pitoiset	f4e85ba93f	ac/nir: remove backlink to nir_to_llvm_context Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:39 +01:00
Samuel Pitoiset	be5f6eb13e	ac/nir: remove nir_to_llvm_context::module Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:36 +01:00
Samuel Pitoiset	90a815ddeb	ac/nir: remove nir_to_llvm_context::builder Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:34 +01:00
Samuel Pitoiset	759acfa180	ac/nir: drop nir_to_llvm_context from glsl_to_llvm_type() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:31 +01:00
Samuel Pitoiset	e7373a6498	ac/nir: drop nir_to_llvm_context from visit_var_atomic() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:29 +01:00
Samuel Pitoiset	485346b05a	ac/nir: drop nir_to_llvm_context from visit_vulkan_resource_reindex() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:27 +01:00
Samuel Pitoiset	cd6dfacda9	ac/nir: drop nir_to_llvm_context from visit_load_push_constant() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:25 +01:00
Samuel Pitoiset	5c9e398c83	ac/nir: drop nir_to_llvm_context from cast_ptr() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:23 +01:00
Samuel Pitoiset	5ef5944848	ac/nir: drop nir_to_llvm_context from visit_load_local_invocation_index() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:21 +01:00
Samuel Pitoiset	da8b0b8264	ac/nir: drop nir_to_llvm_context from emit_f2f16() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:19 +01:00
Samuel Pitoiset	e32f374944	ac: remove unused parameters in abi::load_tess_coord() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:17 +01:00
Samuel Pitoiset	1e69db003d	ac/nir: remove useless bitcast in load_tess_coord() nir_intrinsic_load_tess_coord always returns a v3i32. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:15 +01:00
Samuel Pitoiset	ed179fbdf3	ac: add load_resource() to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:13 +01:00
Samuel Pitoiset	ecf229706f	ac: add load_sample_mask_in() to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:11 +01:00
Samuel Pitoiset	0f48eeea05	ac: move view_index to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:09 +01:00
Samuel Pitoiset	0efbede949	ac: move push_constants to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:07 +01:00
Samuel Pitoiset	460d3ce726	ac: move tg_size to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:04 +01:00
Samuel Pitoiset	054c92190c	ac/nir: remove unused nir_to_llvm_context:{defs,phis} Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:02 +01:00
Timothy Arceri	ef8082baf8	ac: convert nir_op_f2f32 src to a float Fixes the following piglit test: ./bin/arb_vertex_attrib_64bit-check-explicit-location -auto -fbo Where we would end up with the nir such as: vec1 64 ssa_11 = pack_64_2x32_split ssa_9, ssa_10 vec1 32 ssa_12 = f2f32 ssa_2 And our pack_64_2x32_split nir to llvm code always produces a 64bit integer as output. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-10 10:46:28 +11:00
Timothy Arceri	1b1e5f8edf	ac: fix some 64bit unpack asserts Previously the asserts did not take swizzles into account. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-10 10:46:28 +11:00
Samuel Pitoiset	3a2bb4db23	ac/nir: compute correct number of user SGPRs on GFX9 For merged shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 10:16:04 +01:00
Timothy Arceri	c77078c942	ac: pass struct ac_llvm_context to emit_membar() Fixes segfault in piglit test: ./bin/arb_shader_image_load_store-shader-mem-barrier --quick -auto -fbo Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 12:51:27 +11:00
Timothy Arceri	12a2350e6d	ac: add 64bit support to ac_find_lsb() v2: use LLVMBuildTrunc() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:42:59 +11:00
Timothy Arceri	a9f6b392c7	ac: move get_elem_bits() to ac_llvm_build.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:42:59 +11:00
Timothy Arceri	19f9839f0b	ac: add 64bit bitCount support v2: use LLVMBuildTrunc() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:42:59 +11:00
Samuel Pitoiset	bb750d265c	ac/nir: clean up handle_fs_outputs_post() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:33 +01:00
Samuel Pitoiset	528bc14fa5	ac/nir: add radv_load_output() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:30 +01:00
Samuel Pitoiset	834d9845ca	ac/shader: scan info about output PS declarations NIR->LLVM should only be a translation pass, and all scan stuff should be done before. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:27 +01:00
Samuel Pitoiset	a8e04e91de	ac/nir: add radv_export_param() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:26 +01:00
Samuel Pitoiset	e3cfd6b805	ac/nir: remove set but unused export_mask Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:24 +01:00
Samuel Pitoiset	724136d590	ac/nir: remove dead code in handle_vs_outputs_post() The memcpy can't be reached because the condition is always false. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:22 +01:00
Samuel Pitoiset	c63d8d0284	ac/nir: remove useless check in si_llvm_init_export_args() values can't be NULL because we use ac_build_export_null() now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:20 +01:00
Samuel Pitoiset	26ab5a4269	ac/nir: use ac_build_export_null() The number of enabled channels should be 0 when exporting null. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:11:44 +01:00
Samuel Pitoiset	bd9f7b7635	ac: add ac_build_export_null() helper Imported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-08 22:11:42 +01:00
Fredrik Höglund	5a38d8f103	radv: implement VK_EXT_external_memory_host Ported from the radeonsi GL_AMD_pinned_memory implementation. Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 00:46:07 +01:00
Samuel Pitoiset	757d36ee70	ac/nir: use new pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics Ported from RadeonSI. Only one F1 2017 shader is affected, code size decreased from 532 to 488 on both Polaris10 and Vega10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-07 12:42:13 +01:00
Samuel Pitoiset	2f54d7382d	ac/nir: avoid loading unused VS input components Polaris10: Totals from affected shaders: SGPRS: 122840 -> 120984 (-1.51 %) VGPRS: 78812 -> 78440 (-0.47 %) Spilled SGPRs: 177 -> 129 (-27.12 %) Code Size: 2950028 -> 2941276 (-0.30 %) bytes Max Waves: 17899 -> 17976 (0.43 %) Vega10: Totals from affected shaders: SGPRS: 117144 -> 115776 (-1.17 %) VGPRS: 77580 -> 77532 (-0.06 %) Spilled SGPRs: 0 -> 152 (0.00 %) Code Size: 3352656 -> 3347860 (-0.14 %) bytes Max Waves: 19756 -> 19866 (0.56 %) This increases SGPRs spilling a bit with Talos, but I have some other ideas that might reduce it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-07 12:42:09 +01:00
Samuel Pitoiset	1c57a6da5e	ac/shader: scan vertex inputs usage mask Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-07 12:42:07 +01:00
Samuel Pitoiset	3488a3f033	radv: run nir_opt_shrink_load LLVM can't shrink loads. Polaris10: Totals from affected shaders: SGPRS: 62528 -> 59955 (-4.11 %) VGPRS: 44708 -> 44616 (-0.21 %) Spilled SGPRs: 16 -> 8 (-50.00 %) Code Size: 1355504 -> 1355172 (-0.02 %) bytes Max Waves: 11710 -> 11670 (-0.34 %) Vega10: Totals from affected shaders: SGPRS: 51448 -> 50371 (-2.09 %) VGPRS: 39140 -> 39048 (-0.24 %) Spilled SGPRs: 16 -> 16 (0.00 %) Code Size: 1307188 -> 1304296 (-0.22 %) bytes Max Waves: 11312 -> 11292 (-0.18 %) This reduces SGPRs spilling in MadMax, and it also reduces number of SGPRs in DOW3 and F12017. The number of waves slightly decreases in F1 but I don't see any performance changes after benchmarking it. Talos and Serious Sam are not affected because they don't use any push constants. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-06 23:08:44 +01:00
Timothy Arceri	9c52902c76	ac/radeonsi: add num_work_groups to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	f12e2f9c12	ac: implement nir_intrinsic_shader_clock Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	b7b89bbddb	ac/radeonsi: create ac_build_shader_clock() helper Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	d116af383f	ac/radeonsi: add load_local_group_size() to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	e3ebffdbb0	ac: don't call emit_outputs() for compute Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	c8066cdfa7	ac/radeonsi: add local_invocation_ids to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	fa5239c153	ac/radeonsi: add workgroup_ids to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Bas Nieuwenhuizen	c7d640fbbf	ac/nir: fix GS load input type. Fixes: `df1d5174fc` "ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-06 21:52:38 +01:00
Dave Airlie	e7e81f362d	radv: don't support tc-compat on multisample d32s8 at all. RX550 fails dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_2 So increase the range of the workaround. Fixes: `f4c534ef6` (radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-06 19:56:00 +00:00
Samuel Pitoiset	0170ae1e23	ac/nir: remove emission of nir_op_fdiv RadeonSI and RADV lower fdiv. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-05 23:09:34 +01:00
Samuel Pitoiset	a1d568c830	ac/nir: fix a crash in load_gs_input() on pre-GFX9 chips Fixes: `df1d5174fc` ("ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-05 11:05:52 +01:00
Marek Olšák	3bf1e036e8	amd: remove support for LLVM 3.9 Only these are supported: - LLVM 4.0 - LLVM 5.0 - LLVM 6.0 - master (7.0) Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-02 23:47:40 +01:00
Marek Olšák	847d0a393d	radeonsi: use pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-02 16:46:22 +01:00
Samuel Pitoiset	df1d5174fc	ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load The old one generates useless instructions in there, found while comparing geometry shaders between RadeonSI and RADV. This improves all Vulkan demos that use geometry shaders, +4% for deferredshadows, +9% for viewportarray, +7% for geometryshader on Polaris10. This seems to also improve DOW3 a little bit (+1%). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-02 12:32:21 +01:00
Bas Nieuwenhuizen	2ffe395cba	radv: Don't expose VK_KHX_multiview on android. deqp does not allow any KHX extensions, and since deqp is included in android-cts, android does not allow any khx extensions. So disable VK_KHX_multiview on android. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> CC: 18.0 <mesa-stable@lists.freedesktop.org>	2018-02-01 23:32:48 +01:00
Marek Olšák	b0a6053a99	ac/nir: use ac_build_buffer_load_format for image buffer loads Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-01 16:20:19 +01:00

2371 Commits