KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Emma Anholt	d95f9d189a	nir: Introduce a nir_vec_scalars() helper using nir_ssa_scalar. Many users of nir_vec() do so by nir_channel()-ing a new ssa defs as movs from other vectors to put the new vector together, which then just have to get copy-propagated into the ALU srcs and DCEed away the temporary movs. If they instead take nir_ssa_scalar, we can avoid that extra work. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14865>	2022-03-02 22:28:58 +00:00
Marek Olšák	1dcd1eac6a	nir: pass nir_shader into nir_recompute_io_bases instead of func_impl Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>	2022-03-01 21:59:55 +00:00
Marek Olšák	606811bded	nir: add nir_print_xfb_info Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>	2022-03-01 21:59:55 +00:00
Marek Olšák	ad68a1ee5a	nir: add nir_gather_xfb_info_from_intrinsics for lowered IO Drivers will use this. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>	2022-03-01 21:59:55 +00:00
Marek Olšák	d4c051b047	nir: add nir_lower_io_passes() with new transform feedback moved from radeonsi without the vectorization, which won't be needed for now. We will lower IO in st/mesa instead of radeonsi to get the transform feedback info into store instructions. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>	2022-03-01 21:59:55 +00:00
Marek Olšák	3528dcdfa1	nir: add nir_io_semantics::no_varying, no_sysval_output, and helpers This is for drivers that have separate store instructions for varyings, system value outputs (such as clip distances), and transform feedback. The flags tell the driver not to store the output to those locations. This will be used by radeonsi initially, and then maybe by a new linker. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>	2022-03-01 21:59:55 +00:00
Marek Olšák	548b2d47b2	nir: scalarize transform feedback info in nir_lower_io_to_scalar Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>	2022-03-01 21:59:55 +00:00
Marek Olšák	4636fa7f38	nir: add transform feedback info into nir_intrinsic_store_output This will allow compaction of transform feedback varyings because they are no longer tied to varying slots with this information. It will also make transform feedback info available to all NIR passes after IO is lowered. It's meant to replace pipe_stream_output_info. Other intrinsics are not used with transform feedback. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>	2022-03-01 21:59:55 +00:00
Marek Olšák	2c6e41bfe1	nir: fix nir_io_semantics::gs_streams in nir_lower_io_to_scalar gs_streams is relative to the component. Also clear the high bits. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>	2022-03-01 21:59:55 +00:00
Marek Olšák	73ef225fc2	nir: validate write_mask for all intrinsics that have it Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>	2022-03-01 21:59:55 +00:00
Mike Blumenkrantz	b28cff9f4a	nir/lower_psiz_mov: stop clobbering existing exports for this pass to work with xfb, the original value in the shader must be preserved when xfb is active, and the driver must export only the newly created output Acked-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15184>	2022-02-28 15:42:19 +00:00
Mike Blumenkrantz	3267417c22	nir/lower_psiz: create the store instruction more accurately creating this at the start of the shader means it will get optimized out when the pass is used to overwrite existing psiz values, and creating it at the end means it will get optimized out in geometry shaders, so instead just walk the instructions and create another store right after the existing one Acked-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15184>	2022-02-28 15:42:19 +00:00
Emma Anholt	b1f349dff4	nir: Allow the _replicates opcodes to have num_components != 4. This required relaxing a core NIR assertion which I don't think is doing any important validation. The shader-db effects here are small, but they're important for avoiding a regression when we start doing per-component DCE in opt_shrink_vectors (https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468) softpipe shader-db: total instructions in shared programs: 2859777 -> 2859454 (-0.01%) instructions in affected programs: 18881 -> 18558 (-1.71%) total temps in shared programs: 293994 -> 293914 (-0.03%) temps in affected programs: 418 -> 338 (-19.14%) i915g: total instructions in shared programs: 407562 -> 407544 (<.01%) instructions in affected programs: 570 -> 552 (-3.16%) r300: total instructions in shared programs: 1414450 -> 1414459 (<.01%) instructions in affected programs: 44494 -> 44503 (0.02%) total vinst in shared programs: 473782 -> 473727 (-0.01%) vinst in affected programs: 1102 -> 1047 (-4.99%) total sinst in shared programs: 231224 -> 231216 (<.01%) sinst in affected programs: 432 -> 424 (-1.85%) total temps in shared programs: 197605 -> 197607 (<.01%) temps in affected programs: 103 -> 105 (1.94%) crocus hsw: total instructions in shared programs: 8158185 -> 8158134 (<.01%) instructions in affected programs: 10927 -> 10876 (-0.47%) Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15178>	2022-02-25 12:31:48 -08:00
Timur Kristóf	f629fbd778	nir: Add new variable mode for task/mesh payload. Task shader outputs work differently than other shaders, so they need special consideration. Essentially, they have two kinds of outputs: 1. Number of mesh shader workgroups to launch. Will be still represented by a shader output. 2. Optional payload of up to (at least) 16K bytes. These payload variables behave similarly to shared memory, but the spec doesn't actually define them as shared memory (also, they may be implemented differently by each backend), so we need to add a new NIR variable mode for them. These payload variables can't be represented by shader outputs because the 16K bytes don't fit the 32x vec4 model that NIR uses for its output variables. This patch adds a new NIR variable mode: nir_var_mem_task_payload and corresponding explicit I/O intrinsics, as well as support for this new mode in nir_lower_io. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14930>	2022-02-25 06:52:07 +00:00
Iago Toral Quiroga	f1d20ec67c	nir/nir_opt_move: handle non-SSA defs We just skip register defs and avoid moving register reads across them. This allows us to run this pass in non-SSA form. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15056>	2022-02-24 11:36:00 +00:00
Iago Toral Quiroga	fe2249eac5	nir: add a nir_instr_def_is_register helper This returns true if the instruction has a dest that is not an SSA value. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15056>	2022-02-24 11:36:00 +00:00
Iago Toral Quiroga	0a04468704	nir/nir_opt_move: allow to move uniform loads Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15056>	2022-02-24 11:36:00 +00:00
Rhys Perry	a9ac270c5f	nir/validate: don't add instrs not present in shader to shader_gc_list This makes the set smaller and GC list validation faster. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13547>	2022-02-21 11:57:22 +00:00
Rhys Perry	925c5f817d	nir/validate: don't validate the GC list by default This seems really slow. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13547>	2022-02-21 11:57:18 +00:00
Connor Abbott	6761550357	nir/serialize: Don't access blob->data directly It won't work if the blob is fixed-size and we overrun the size, which will be the case with the Vulkan pipeline cache. This gets a bit tricky for the repeated-header optimization, because we can't read the header from the blob. Instead we have to store the header itself. Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15028>	2022-02-19 01:25:46 +00:00
Alyssa Rosenzweig	2d6233d04f	nir: Check all sizes in nir_alu_instr_is_comparison nir_alu_instr_is_comparison needs to consider all comparison opcodes regardless of size. Otherwise, they will be missed by nir_opt_move/sink. Without this change, lowering booleans to integers regresses register pressure (and spills/fills) significantly in certain shaders on Panfrost, like android/com.miHoYo.GenshinImpact/1420.shader_test. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15073>	2022-02-18 19:22:01 +00:00
Jose Maria Casanova Crespo	90f966e05f	v3dv/v3d: Fix copyright holder to Raspberry Pi Ltd Acked-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15057>	2022-02-18 11:50:07 +01:00
Alyssa Rosenzweig	7ec1d96e5e	nir: Set internal=true in nir_builder_init_simple_shader Matches the expected use by callers. We do need to fix up a few callers which use this call for external shaders. v2: Fix up a radv call site (Rhys). Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> [v1] Acked-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14936>	2022-02-17 23:30:46 +00:00
Ian Romanick	a01b262990	nir: Add missing dependency on nir_opcodes.py Commit `38800b38` changed nir_opcodes.py, but that doesn't seem to have triggered nir_opt_algebraic.py. The change in `75ef5991` depends on opt_algebraic lowering 16-bit versions of slt, but if opt_algebraic is not rebuilt, this may not happen. This resulted in some people seeing assertion failures in, for example, dEQP-VK.spirv_assembly.instruction.compute.float16.arithmetic_3.step, due to the backend seeing nir_op_slt that it didn't know how to handle. v2: Add nir_opcodes.py to nir_algebraic_py so that all the per-driver algebraic passes pick up the dependency too. Rename it to nir_algebraic_depends. Suggested by Emma. Closes: #6047 Fixes: `d1992255bb` ("meson: Add build Intel "anv" vulkan driver") Reviewed-by: Emma Anholt <emma@anholt.net> Acked-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15050>	2022-02-17 22:57:33 +00:00
Lionel Landwerlin	768930a73a	nir: fix lower_memcpy memcpy is divided into chunks that are vec4 sized max. The problem here happens with a structure of 24 bytes : struct { float3 a; float3 b; } If you memcpy that struct, the lowering will emit 2 load/store, one of sized 8, next one sized 16. But both end up located at offset 0, so we effectively drop 2 floats. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `a3177cca99` ("nir: Add a lowering pass to lower memcpy") Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15049>	2022-02-17 15:12:45 +00:00
Emma Anholt	3f4bfecee6	nir: Add some notes about const/uniform array access rules in GL. I was doing some RE on freedreno and we had some questions about when the hardware might need non-uniform or non-constant array access for various descriptor types, so let's leave some notes for the next person. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13621>	2022-02-16 20:06:21 +00:00
Samuel Pitoiset	74b932f8d3	nir: add nir_intrinsic_load_vrs_rates_amd This intrinsic specific to RADV will be used to load VRS rates from an user SGPR when RADV_FORCE_VRS is enabled by the application. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14713>	2022-02-16 08:11:12 +01:00
Daniel Schürmann	2a92452a0e	nir/opt_shrink_vectors: Remove shrinking of store intrinsics data source This is done via nir_opt_shrink_stores. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14480>	2022-02-11 11:50:47 +01:00
Daniel Schürmann	2dba7e6056	nir: split nir_opt_shrink_stores from nir_opt_shrink_vectors This patch moves the shrinking of store sources into a separate pass. The reasoning behind this is that this pass usually only needs to be called once while nir_shrink_vectors might better be called several times. This allows to move the pass(es) out of the optimization loops. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14480>	2022-02-11 11:34:01 +01:00
Ian Romanick	1cb3d1a6ae	nir: Produce correct results for atan with NaN Properly handling NaN adversely affects several hundred shaders in shader-db (lots of Skia and a few others from various synthetic benchmarks) and fossil-db (mostly Talos and some Doom 2016). Only apply the NaN handling work-around when the shader demands it. v2: Add comment explaining the 1.0*y_over_x. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Fixes: `2098ae16c8` ("nir/builder: Move nir_atan and nir_atan2 from SPIR-V translator") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13999>	2022-02-10 18:15:39 +00:00
Ian Romanick	7d0d9b9fbc	nir: Properly handle various exceptional values in frexp frexp_sig of ±0, ±Inf, or NaN should just return the input unmodified. frexp_exp of ±Inf or NaN is undefined, and frexp_exp of ±0 should return the input unmodified. This seems to already work. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Fixes: `23d30f4099` ("spirv,nir: lower frexp_exp/frexp_sig inside a new NIR pass") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13999>	2022-02-10 18:15:39 +00:00
Ian Romanick	38800b385c	nir: All set-on-comparison opcodes can take all float types Extend `4195a9450b` so that the next poor fool doesn't come along and say, "sge does the right thing for 16-bit sources, but slt gives a NIR validation failure. What the deuce?" NOTE: This commit is necessary to prevent regressions in GLSLstd450Step tests of 16-bit sources at "spriv: Produce correct result for GLSLstd450Step with NaN". Fixes: `4195a9450b` ("nir: sge operation is defined for floating-point types") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13999>	2022-02-10 18:15:39 +00:00
Ian Romanick	97ce3a56bd	nir/search: Constify instr parameter to nir_search_expression::cond Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13999>	2022-02-10 18:15:39 +00:00
Ian Romanick	4dd4135551	nir: Constify def parameter to nir_ssa_def_bits_used Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13999>	2022-02-10 18:15:39 +00:00
Otavio Pontes	510d248299	nir: Use proper macro to set bits of variable correctly When slots is 64 only the first bit was being set, instead of setting all 64 bits of the variable, so for that case the function get_variable_io_mask() always returned 0. This behaviour caused variables that are being used both on producer and consumer to be considered unused and thus being removed on nir_remove_unused_io_vars(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14955>	2022-02-10 17:19:54 +00:00
Georg Lehmann	c2168f845e	nir/lower_mediump: Treat u2u16 like i2i16. There is a comment in nir_fold_16bit_sampler_conversions saying that these are the same, but the code only checks for i2i16. Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14893>	2022-02-10 16:13:54 +00:00
Emma Anholt	20469009c7	nir: Delete the per-instr SSA liveness impl. It was introduced for nir-to-tgsi, and I found that it was the wrong approach. There's a reason nobody else does RA this way. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14404>	2022-02-10 00:36:57 +00:00
Emma Anholt	f4ee7146f9	nir: Split the flag for lowering of fabs and fneg to source modifiers. i915 and r300 have fneg source modifier but not fabs, and doing it in NIR can save us some backend pain. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14938>	2022-02-08 18:50:01 -08:00
Lionel Landwerlin	c78be5da30	intel/fs: lower ray query intrinsics v2: Add helper for acceleration->root_node computation (Caio) v3: Update comment on "done" bit (Caio) Remove progress bool value for impl function (Caio) Don't use nir_shader_instructions_pass to search the shader (Caio) v4: Rename variable for if/else block (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>	2022-02-08 12:55:25 +00:00
Lionel Landwerlin	e06f9d49bc	nir/lower_shader_calls: consider relocated constants as rematerializable After all they're constants. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>	2022-02-08 12:55:25 +00:00
Lionel Landwerlin	6d9ae6ec1e	intel: add a new intrinsic to get the shader stage from bindless shaders We'll use this to apply ray tracing operations in our trivial return shader based on the stage we're in. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>	2022-02-08 12:55:25 +00:00
Lionel Landwerlin	b8f087b0e6	nir/builder: add nir_ior_imm() helper Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>	2022-02-08 12:55:25 +00:00
Lionel Landwerlin	bb40e999d1	intel/nir: use a single intel intrinsic to deal with ray traversal In the future we'll want to reuse this intrinsic to deal with ray queries. Ray queries will use a different global pointer and programmatically change the control/level arguments of the trace send instruction. v2: Comment on barrier after sync trace instruction (Caio) Generalize lsc helper (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>	2022-02-08 12:55:25 +00:00
Lionel Landwerlin	4deb8e86df	nir: change intel dss_id intrinsic to topology_id This will allow to reuse the same intrinsic for various topology based ID. v2: fix intrinsic comment (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>	2022-02-08 12:55:24 +00:00
Alyssa Rosenzweig	aaf00a1b4d	nir,zink: Make lower_discard_if a common pass This pass (lowering discard_if to control flow and unconditional discard) was originally written for Zink, but is useful for hardware that lacks conditional discard instructions like AGX. In theory AGX could implement a conditional discard with CSEL, but the vendor compiler uses a lowering like this one. Since I like not writing code, I'd like to use the pass that's already in tree. v2: Don't preserve dominance (Jason). Assert we don't see demotes or terminates (Jason). Add Mike's ack. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14217>	2022-02-04 19:09:23 +00:00
Lionel Landwerlin	e227bb9fd5	nir/builder: add ishl_imm helper v2: add (y >= x->bit_size) condition (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13739>	2022-02-02 17:09:46 +00:00
Connor Abbott	503a5bae59	nir: Add support for lowering shuffle to a waterfall loop Qualcomm doesn't natively support shuffle, but it does natively support relative shuffles where the delta is a constant. Therefore we'll expose emulated support for both. Add support for this emulation of subgroupShuffle() to NIR. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14412>	2022-02-01 16:27:45 +00:00
Connor Abbott	913bec10c4	nir/lower_subgroups: Rename lower_shuffle to lower_relative_shuffle This option only applies to relative shuffles (up/down/xor), and in a moment we're going to add an option to lower normal shuffles, so rename it. While we're here, rename lower_shuffle() to lower_to_shuffle() for similar reasons. Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14412>	2022-02-01 16:27:45 +00:00
Marcin Ślusarz	24fef8f33d	intel/compiler: Use Task/Mesh InlineData for the first few push constants Replace load_mesh_global_arg_addr_intel with a more general intrinsic load_mesh_inline_data_intel, since inline data now hold both a pointer descriptor information and the first few push constants. Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14788>	2022-01-29 06:32:19 +00:00
Christian Gmeiner	e5f9cdac1f	nir/nir_lower_tex_shadow: support tex_instr without deref src Use texture_index if there is no deref src. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14308>	2022-01-28 18:40:53 +00:00
Christian Gmeiner	e67bca3fe7	nir: make lower_sample_tex_compare a common pass This pass was originally written for d3d12, but is useful for hardware that lacks sample compare support like some etnaviv GPU models. Also rename the lowering pass and some surrounding code to nir_lower_tex_shadow as suggested by Emma. I'd like to use the pass that's already in tree. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14308>	2022-01-28 18:40:53 +00:00
Emma Anholt	61400f8a2d	nir/lower_locals_to_regs: Do an ad-hoc copy propagate on our generated MOV. I noticed the inefficiency in NIR-to-TGSI output while trying to debug a failure handling some arrays in r600. While this makes reading CTS shaders easier, the effect in the real world is pretty limited. From softpipe shader-db: total instructions in shared programs: 2929840 -> 2929836 (<.01%) instructions in affected programs: 118 -> 114 (-3.39%) Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14321>	2022-01-25 06:01:13 +00:00
Bas Nieuwenhuizen	d1530a3f3b	Revert "nir/algebraic: distribute fmul(fadd(a, b), c) when b and c are constants" This reverts commit `a1af902531`. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5423 Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14532>	2022-01-21 16:58:11 +00:00
Rhys Perry	af51efe195	nir/builder: assume scalar alignment if not provided Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14455>	2022-01-21 13:45:33 +00:00
Rhys Perry	e9e1a44872	nir/builder: set write mask if not provided Zero isn't really a valid write mask. If it's provided, use a full write mask. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14455>	2022-01-21 13:45:33 +00:00
Rhys Perry	495debebad	nir/algebraic: optimize expressions using fmulz/ffmaz Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436>	2022-01-20 22:54:42 +00:00
Rhys Perry	14b8227083	nir: add some missing nir_alu_type_get_base_type Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436>	2022-01-20 22:54:42 +00:00
Rhys Perry	f2fbba7920	nir/algebraic: optimize open-coded fmulz/ffmaz This pattern will be found in future versions of D3D9 DXVK. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436>	2022-01-20 22:54:42 +00:00
Rhys Perry	312a284980	nir/algebraic: add ignore_exact() wrapper Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436>	2022-01-20 22:54:42 +00:00
Rhys Perry	7f05ea3793	nir: add nir_op_fmulz and nir_op_ffmaz Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436>	2022-01-20 22:54:42 +00:00
Emma Anholt	cac6f633b2	nir/opt_offsets: Use nir_ssa_scalar to chase offset additions. For nir_to_tgsi, I want to be able to fold into the base from a vector load_const, which the ad-hoc scalar chasing couldn't handle. r300: total instructions in shared programs: 1278731 -> 1256502 (-1.74%) instructions in affected programs: 457909 -> 435680 (-4.85%) total flowcontrol in shared programs: 8316 -> 8313 (-0.04%) flowcontrol in affected programs: 5 -> 2 (-60.00%) total temps in shared programs: 213687 -> 213774 (0.04%) temps in affected programs: 13140 -> 13227 (0.66%) total consts in shared programs: 952850 -> 949929 (-0.31%) consts in affected programs: 386352 -> 383431 (-0.76%) Fixes: #5781 Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14309>	2022-01-19 22:28:34 +00:00
Emma Anholt	645ca56425	nir/opt_offsets: Also apply the max offset to top-level constant folding. nir_to_tgsi wants this for disabling folding into shared var accesses at all. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14309>	2022-01-19 22:28:34 +00:00
Emma Anholt	ec4b9909f0	nir/opt_offsets: Disable unsigned wrap checks on non-native-integers HW. Since we don't have 32-bit ints, these checks for 32-bit unsigned wrapping don't help and just reduce optimization opportunities (particularly for DX9 addressing math). Doesn't affect any current consumers. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14309>	2022-01-19 22:28:34 +00:00
Emma Anholt	700d2fbd0a	nir: Add a .base field to nir_load_ubo_vec4. This lets nir-to-tgsi fold the constant offset of addressing calculations into the CONST[] reference, which is important for D3D9-era compatibility: HW of that age has limited uniform space, and if we do the addressing math as math in the shader for dynamic indexing, the nir_load_consts end up taking up uniforms we don't have available. r300: total instructions in shared programs: 1279699 -> 1279167 (-0.04%) instructions in affected programs: 134796 -> 134264 (-0.39%) total instructions in shared programs: 1279699 -> 1279167 (-0.04%) instructions in affected programs: 134796 -> 134264 (-0.39%) total temps in shared programs: 213912 -> 213736 (-0.08%) temps in affected programs: 2166 -> 1990 (-8.13%) total consts in shared programs: 953237 -> 952973 (-0.03%) consts in affected programs: 45980 -> 45716 (-0.57%) Acked-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14309>	2022-01-19 22:28:34 +00:00
Dave Airlie	ccbf700d6c	nir: remove gl.h include from nir headers. This saves a lot of pointless gl.h includes across the board, it moves the one place that needs GLenum into a separate file only used in those passes that require it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>	2022-01-19 21:54:58 +00:00
Dave Airlie	1352e0ba0c	mesa/*: add a shader primitive type to get away from GL types. This creates an internal shader_prim enum, I've fixed up most users to use it instead of GL types. don't store the enum in shader_info as it changes size, and confuses other things. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>	2022-01-19 21:54:58 +00:00
Connor Abbott	9c9e8c3349	nir: Reorder ffma and fsub combining It's relatively common to do something like "a * b - c", which on most GPUs can be implemented in a single instruction. Before opt_algebraic_late this will be something like "fadd(fmul(a, b), fneg(c))", and we want to turn it info "ffma(a, b, fneg(c))". But because the fsub pattern was first we instead turned it into "fsub(fmul(a, b), c)". Fix this by reordering them. Selected shader-db results on freedreno: total instructions in shared programs: 1561330 -> 1551619 (-0.62%) instructions in affected programs: 780272 -> 770561 (-1.24%) helped: 1941 HURT: 491 helped stats (abs) min: 1 max: 147 x̄: 7.98 x̃: 4 helped stats (rel) min: 0.07% max: 30.77% x̄: 4.36% x̃: 3.17% HURT stats (abs) min: 1 max: 307 x̄: 11.76 x̃: 5 HURT stats (rel) min: 0.09% max: 18.71% x̄: 2.26% x̃: 1.38% 95% mean confidence interval for instructions value: -4.57 -3.41 95% mean confidence interval for instructions %-change: -3.21% -2.84% Instructions are helped. total nops in shared programs: 358926 -> 356263 (-0.74%) nops in affected programs: 167116 -> 164453 (-1.59%) helped: 1395 HURT: 859 helped stats (abs) min: 1 max: 108 x̄: 6.80 x̃: 3 helped stats (rel) min: 0.17% max: 100.00% x̄: 19.15% x̃: 10.57% HURT stats (abs) min: 1 max: 307 x̄: 7.95 x̃: 3 HURT stats (rel) min: 0.00% max: 381.82% x̄: 20.04% x̃: 10.00% 95% mean confidence interval for nops value: -1.77 -0.59 95% mean confidence interval for nops %-change: -5.55% -2.87% Nops are helped. total non-nops in shared programs: 1202404 -> 1195356 (-0.59%) non-nops in affected programs: 496682 -> 489634 (-1.42%) helped: 1951 HURT: 265 helped stats (abs) min: 1 max: 39 x̄: 4.02 x̃: 3 helped stats (rel) min: 0.07% max: 15.38% x̄: 2.97% x̃: 1.96% HURT stats (abs) min: 1 max: 22 x̄: 2.97 x̃: 2 HURT stats (rel) min: 0.05% max: 10.00% x̄: 1.14% x̃: 0.75% 95% mean confidence interval for non-nops value: -3.38 -2.99 95% mean confidence interval for non-nops %-change: -2.60% -2.36% Non-nops are helped. total systall in shared programs: 288317 -> 292975 (1.62%) systall in affected programs: 87876 -> 92534 (5.30%) helped: 388 HURT: 431 helped stats (abs) min: 1 max: 214 x̄: 14.39 x̃: 8 helped stats (rel) min: 0.25% max: 100.00% x̄: 22.12% x̃: 11.96% HURT stats (abs) min: 1 max: 232 x̄: 23.77 x̃: 7 HURT stats (rel) min: 0.00% max: 1300.00% x̄: 51.71% x̃: 17.30% 95% mean confidence interval for systall value: 3.07 8.30 95% mean confidence interval for systall %-change: 9.49% 23.97% Systall are HURT. (The systall hurt is probably just due to having having fewer instructions to hide latency with.) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14554>	2022-01-18 17:44:50 +00:00
Qiang Yu	2cee73f0f7	nir: fix nir_tex_instr hash not count is_sparse field This fixes nir_opt_cse miss replace a non-sparse tex instruction with a sparse tex instruction and fail the nir_validate_shader(). Fixes: `3a7972f72a` ("nir,spirv: add sparse texture fetches") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362>	2022-01-18 16:10:35 +08:00
Rhys Perry	d95a0b52e4	nir/unsigned_upper_bound: don't follow 64-bit f2u32() Fixes Doom Eternal crash. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Fixes: `72ac3f6026` ("nir: add nir_unsigned_upper_bound and nir_addition_might_overflow") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14555>	2022-01-17 10:59:21 +00:00
Emma Anholt	f6ffefba3e	nir: Apply nir_opt_offsets to nir_intrinsic_load_uniform as well. Doing this for ir3 required adding a struct for limits of how much base to fold in (which NTT wants as well for its case of shared vars), otherwise the later work to lower to the 1<<9 word limit would emit more instructions. The shader-db results are that sometimes the reduction in NIR instruction count results in the fewer sampler prefetches due to the shader being estimated to be shorter (dota2, nexuiz): total instructions in shared programs: 8996651 -> 8996776 (<.01%) total cat5 in shared programs: 86561 -> 86577 (0.02%) Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14023>	2022-01-16 19:11:29 +00:00
Emma Anholt	b024102d7c	freedreno/ir3: Use nir_opt_offset for removing constant adds for shared vars. Saves some work in carchase and manhattan31: instructions in affected programs: 2842 -> 2818 (-0.84%) nops in affected programs: 1131 -> 1105 (-2.30%) non-nops in affected programs: 1236 -> 1238 (0.16%) mov in affected programs: 57 -> 61 (7.02%) dwords in affected programs: 2144 -> 2150 (0.28%) cat0 in affected programs: 1195 -> 1169 (-2.18%) cat1 in affected programs: 151 -> 155 (2.65%) cat2 in affected programs: 142 -> 140 (-1.41%) sstall in affected programs: 190 -> 178 (-6.32%) (ss) in affected programs: 63 -> 63 (0.00%) systall in affected programs: 532 -> 511 (-3.95%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14023>	2022-01-16 19:11:29 +00:00
Daniel Schürmann	79a987ad2a	nir/opt_if: also merge break statements with ones after the branch This optimizations turns loop { ... if (cond1) { if (cond2) { do_work_1(); break; } else { do_work_2(); } do_work_3(); break; } else { ... } } into: loop { ... if (cond1) { if (cond2) { do_work_1(); } else { do_work_2(); do_work_3(); } break; } else { ... } } As this optimizations moves code into the NIF statement, it re-iterates on the branch legs in case of success. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7587>	2022-01-13 02:30:32 +00:00
Daniel Schürmann	dad609d152	nir/opt_if: merge two break statements from both branch legs This optimization turns loop { ... if (cond) { do_work_1(); break; } else { do_work_2(); break; } } into: loop { ... if (cond) { do_work_1(); } else { do_work_2(); } break; } Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7587>	2022-01-13 02:30:32 +00:00
Daniel Schürmann	8a78706643	nir: refactor nir_opt_move This patch is a rewrite of nir_opt_move. Differently from the previous version, each instruction is checked if it can be moved downwards and then inserted before the first user of the definition. The advantage is that less insert operations are performed, the original order is kept if two movable instructions have the same first user, and instructions without user in the same block are moved towards the end. v2: Only return true if an instruction really changed the position. Don't care for discards, this will be handled by another MR. v3: fix self-referring phis and update according to nir_can_move_instr(). v4: use nir_can_move_instr() and nir_instr_ssa_def() v5: deduplicate some code Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3657>	2022-01-12 13:41:54 +00:00
Marcin Ślusarz	f286ecf906	nir: handle per-view clip/cull distances Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263>	2022-01-11 22:45:23 +00:00
Marcin Ślusarz	0d6f83cbf1	nir: remove invalid assert affecting per-view variables per-view variables can have arbitrary (but > 0) number of array levels Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263>	2022-01-11 22:45:23 +00:00
Marcin Ślusarz	4fed440724	nir: add load_mesh_view_count and load_mesh_view_indices intrinsics Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263>	2022-01-11 22:45:23 +00:00
Rhys Perry	67fc7a1763	nir/uniform_atomics: fix is_atomic_already_optimized without workgroups dims_needed would have been zero, so this would always returned true for non-compute stages. Also fix this for variable workgroup sizes. Improves Shadow of the Tomb Raider RX 6800 performance by 10.6%, 11.5% and 4.5% (day_of_dead, jungle and paititi scenes). radv_perf before and after: {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '62.913333333333334', 'min_fps': '62.81', 'max_fps': '62.98', 'interations': '3'} {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '64.02666666666666', 'min_fps': '63.93', 'max_fps': '64.11', 'interations': '3'} {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '74.81666666666666', 'min_fps': '74.72', 'max_fps': '74.88', 'interations': '3'} {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '69.57', 'min_fps': '69.52', 'max_fps': '69.63', 'interations': '3'} {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '71.41000000000001', 'min_fps': '71.31', 'max_fps': '71.5', 'interations': '3'} {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '78.16666666666667', 'min_fps': '78.07', 'max_fps': '78.23', 'interations': '3'} Performance now seems slightly better than AMDVLK 2021.Q4.3: {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '68.02666666666666', 'min_fps': '67.95', 'max_fps': '68.16', 'interations': '3'} {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '70.24666666666667', 'min_fps': '69.83', 'max_fps': '70.51', 'interations': '3'} {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '77.19', 'min_fps': '77.18', 'max_fps': '77.2', 'interations': '3'} fossil-db (Sienna Cichlid): Totals from 40 (0.03% of 134621) affected shaders: CodeSize: 62676 -> 65996 (+5.30%) Instrs: 11372 -> 12111 (+6.50%) Latency: 144122 -> 142848 (-0.88%); split: -1.09%, +0.21% InvThroughput: 19686 -> 19847 (+0.82%); split: -0.06%, +0.87% VClause: 304 -> 306 (+0.66%) SClause: 603 -> 604 (+0.17%); split: -0.83%, +1.00% Copies: 780 -> 858 (+10.00%) Branches: 235 -> 329 (+40.00%) PreSGPRs: 1072 -> 1083 (+1.03%); split: -0.37%, +1.40% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14407>	2022-01-10 19:57:38 +00:00
Rhys Perry	b00138090e	nir/lower_shader_calls: fix store_scratch write_mask Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14447>	2022-01-10 19:01:04 +00:00
Danylo Piliaiev	b8d486f298	nir/algebraic: Separate has_dot_4x8 into has_sdot_4x8 and has_udot_4x8 Adreno GPUs has native instruction for unsigned and mixed dot_4x8 but not signed dot product. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>	2022-01-10 13:20:39 +02:00
Gert Wollny	6f348d9c99	nir_lower_io: propagate the "invariant" flag to outputs Ultimately this is consumed by nir-to-tgsi and needed by virglrenderer to correctly declare output variables. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14423>	2022-01-07 16:35:43 +00:00
Emma Anholt	558a600629	nir_to_tgsi: Enable fdot_replicates flag. That's how the TGSI math opcodes work. This lets lower_vec_to_regs coalesce the DP output into the .yzw channels, giving an impressive shader-db win on softpipe: total instructions in shared programs: 2929840 -> 2794036 (-4.64%) instructions in affected programs: 1651438 -> 1515634 (-8.22%) total temps in shared programs: 372730 -> 332744 (-10.73%) temps in affected programs: 118151 -> 78165 (-33.84%) and a minor one on r300: total instructions in shared programs: 51238 -> 51149 (-0.17%) instructions in affected programs: 2621 -> 2532 (-3.40%) total vinst in shared programs: 15655 -> 15618 (-0.24%) vinst in affected programs: 468 -> 431 (-7.91%) total temps in shared programs: 9838 -> 9828 (-0.10%) temps in affected programs: 59 -> 49 (-16.95%) and a bigger one on i915g: total instructions in shared programs: 398064 -> 395901 (-0.54%) instructions in affected programs: 29271 -> 27108 (-7.39%) total tex_indirect in shared programs: 12261 -> 12233 (-0.23%) tex_indirect in affected programs: 98 -> 70 (-28.57%) LOST: 0 GAINED: 5 The r300 change is less impressive because it does some backend copy-prop, but also because intermediate storage of DPs now takes a vec4 instead of a scalar. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14200>	2022-01-07 09:58:24 +00:00
Jesse Natalie	c09c0b351f	nir_opt_dead_cf: Remove dead ifs An if that looks like: if (x) { } else { } That has no phis following it is dead. Currently these are only removed by peephole select, but that means that 'x' is considered used until that pass is run, which can make it difficult to apply sane lowering in the case where loading 'x' requires complex or expensive transformations, but 'x' is really unused. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14400>	2022-01-07 05:15:48 +00:00
Alyssa Rosenzweig	24ea7cbb06	nir: Extend store_combined_output_pan Extend store_combined_output_pan to take a dual source blend input in addition to colour, depth, and stencil inputs. Use the last source for this, and represent the type with the DEST_TYPE index. This is a hack but there is no SRC2_TYPE and NIR doesn't seem to mind as long as we know what we mean. This allows the backend to emit a combined "blend render target #0" instruction taking two sources. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13714>	2022-01-02 01:12:05 +00:00
Alyssa Rosenzweig	5c168f09eb	nir: Eliminate store_combined_output_pan BASE It's meaningless for this intrinsic and is just adding noise to the lowering pass. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13714>	2022-01-02 01:12:05 +00:00
Daniel Schürmann	17ecd0b31a	nir/opt_algebraic: lower fneg_hi/lo to fmul This pattern, found in the FSR upscaling shader, helps the vectorization efforts by keeping the chain of vectorized instructions intact. Radeon can optimize it to per-component fneg modifiers. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13688>	2021-12-21 13:23:37 +01:00
Caio Oliveira	729df14e45	nir: Handle volatile semantics for loading HelperInvocation builtin SPV_EXT_demote_to_helper_invocation added OpDemoteToHelperInvocation operation to turn an invocation into a helper invocation, but the value of HelperInvocation (a builtin from Input storage class) couldn't be modified dynamically without breaking compatibility. For the extension the operation OpIsHelperInvocation was added to get the dynamic value. For SPIR-V 1.6, the demote operation was promoted, but now to get the dynamic value the shader must issue a load to HelperInvocation with Volatile memory access semantics. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14209>	2021-12-17 16:37:14 -08:00
Rhys Perry	a65285f54b	nir/opt_access: infer CAN_REORDER for global access fossil-db (Sienna Cichlid): Totals from 352 (0.26% of 134621) affected shaders: VGPRs: 17240 -> 17272 (+0.19%) CodeSize: 1753640 -> 1755744 (+0.12%); split: -0.04%, +0.16% Instrs: 323190 -> 323801 (+0.19%); split: -0.03%, +0.22% Latency: 3241205 -> 3241293 (+0.00%); split: -0.10%, +0.10% InvThroughput: 568927 -> 568067 (-0.15%); split: -0.16%, +0.00% SClause: 12109 -> 10444 (-13.75%); split: -13.76%, +0.01% Copies: 27802 -> 27717 (-0.31%); split: -0.56%, +0.26% PreSGPRs: 14699 -> 14690 (-0.06%) PreVGPRs: 15793 -> 15799 (+0.04%) fossil-db (Polaris10): Totals from 348 (0.26% of 135668) affected shaders: SGPRs: 21446 -> 21574 (+0.60%); split: -0.15%, +0.75% VGPRs: 17004 -> 16996 (-0.05%); split: -0.09%, +0.05% CodeSize: 1782796 -> 1783060 (+0.01%); split: -0.03%, +0.05% Instrs: 337828 -> 337921 (+0.03%); split: -0.03%, +0.06% Latency: 3726328 -> 3726721 (+0.01%); split: -0.09%, +0.10% InvThroughput: 1307917 -> 1299841 (-0.62%); split: -0.62%, +0.00% VClause: 4327 -> 4337 (+0.23%); split: -0.09%, +0.32% SClause: 12178 -> 10529 (-13.54%); split: -13.55%, +0.01% Copies: 40227 -> 40244 (+0.04%); split: -0.19%, +0.24% PreSGPRs: 14946 -> 14937 (-0.06%) PreVGPRs: 15637 -> 15643 (+0.04%) fossil-db (Pitcairn): Totals from 351 (0.26% of 135668) affected shaders: SGPRs: 20382 -> 20619 (+1.16%); split: -0.79%, +1.95% CodeSize: 1789732 -> 1789836 (+0.01%); split: -0.04%, +0.04% MaxWaves: 1947 -> 1949 (+0.10%) Instrs: 352274 -> 352318 (+0.01%); split: -0.04%, +0.06% Latency: 4057829 -> 4058226 (+0.01%); split: -0.08%, +0.09% InvThroughput: 1332245 -> 1317578 (-1.10%); split: -1.11%, +0.01% VClause: 8581 -> 8583 (+0.02%); split: -0.13%, +0.15% SClause: 12187 -> 10552 (-13.42%); split: -13.43%, +0.02% Copies: 44906 -> 44915 (+0.02%); split: -0.24%, +0.26% PreSGPRs: 16571 -> 16562 (-0.05%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14227>	2021-12-17 18:51:24 +00:00
Rhys Perry	403ae3b48e	nir/algebraic: optimize more 64-bit imul with constant source Two 64-bit shifts and an addition are usually faster than the several multiplications nir_lower_int64 creates. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14227>	2021-12-17 18:51:24 +00:00
Rhys Perry	c56cf157c5	nir/opt_load_store_vectorize: improve ssbo/global alias analysis If either the global access or the ssbo access is restrict, they shouldn't alias. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14227>	2021-12-17 18:51:24 +00:00
Jason Ekstrand	deec7a590b	anv,nir: Use sample_pos_or_center in lower_wpos_center Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14198>	2021-12-17 16:02:16 +00:00
Jason Ekstrand	e8acc5a7ea	nir: Add a new sample_pos_or_center system value Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14198>	2021-12-17 16:02:16 +00:00
Marcin Ślusarz	504e5cb4e8	nir/print: print const value near each use of const ssa variable Without/with NIR_DEBUG=print,print_const: -vec4 32 ssa_60 = fadd ssa_59, ssa_58 +vec4 32 ssa_60 = fadd ssa_59 /(0xbf800000, 0x3e800000, 0x00000000, 0x3f800000) = (-1.000000, 0.250000, 0.000000, 1.000000)/, ssa_58 Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13880>	2021-12-17 10:04:50 +00:00
Marcin Ślusarz	23f8f836e0	nir/print: group hex and float vectors together Vectors are much easier to follow in this format, because developer cares either about hex or float values, never both. Before/after: -vec4 32 ssa_222 = load_const (0x00000000 /* 0.000000 /, 0x00000000 / 0.000000 /, 0x3f800000 / 1.000000 /, 0x3f800000 / 1.000000 /) +vec4 32 ssa_222 = load_const (0x00000000, 0x00000000, 0x3f800000, 0x3f800000) = (0.000000, 0.000000, 1.000000, 1.000000) -vec1 32 ssa_174 = load_const (0xbf800000 / -1.000000 */) +vec1 32 ssa_174 = load_const (0xbf800000 = -1.000000) Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13880>	2021-12-17 10:04:50 +00:00
Marcin Ślusarz	d2b4051ea9	nir/print: move print_load_const_instr up ... to avoid forward declarations in future commit Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13880>	2021-12-17 10:04:50 +00:00
Marcin Ślusarz	f7e63ec5d8	nir/print: compact printing of intrinsic indices Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14222>	2021-12-16 09:43:13 +00:00
Marcin Ślusarz	d8fa625bb3	nir/print: expand printing of io semantics.gs_streams gs_streams can be set for at least 2 other intrinsics. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14222>	2021-12-16 09:43:13 +00:00
Marcin Ślusarz	be25db9f0f	nir/print: simplify printing of IO semantics Some of the tested flags are set for other intrinsics and they are printed only when set, so there's no point in checking exact intrinsic name or shader stage. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14222>	2021-12-16 09:43:13 +00:00
Caio Oliveira	b1156f23a2	Revert "nir: disable a NIR test due to undebuggable & locally unreproducible CI failures" This reverts commit `6eb3fe2d4f`. The root cause was a bug in Meson when using the new gtest protocol and a test failed before producing the XML file expected by it. This was fixed in later versions of Meson, so we've bumped the required meson version to use that feature. The failure should now be properly identified, so re-enabling the NIR test. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14204>	2021-12-15 23:28:09 +00:00
Caio Oliveira	dcc7b19cae	nir: Initialize nir_register::divergent Fixes: `c7fc44f9eb` ("nir/from_ssa: Respect and populate divergence information") Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14205>	2021-12-15 22:39:06 +00:00
Juan A. Suarez Romero	b8f6685bb5	nir: use call_once() to init debug variable For data-race safety, let's use this function to ensure NIR debug is initialized only once. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14057>	2021-12-14 08:01:17 +00:00
Juan A. Suarez Romero	18c039b2e1	tgsi-to-nir: initialize NIR_DEBUG envvar This envvar is initialized when creating a NIR shader, but it needs to be used before. So initialize it here. v2 (Juan): - Use static variable for first initialization. Fixes: `f77ccdfb4a` ("nir: add NIR_DEBUG envvar") Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14057>	2021-12-14 08:01:17 +00:00
Jordan Justen	211e0606c7	nir/lower_tex: Add filter for tex offset lowering Rework: * Add callback_data (s-b Jason) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14142>	2021-12-13 16:56:23 -08:00
Samuel Pitoiset	be53b3d1bf	nir/lower_tex: add lower_lod_zero_width On AMD, the hardware will return 0 for the raw LOD if the sum of the absolute values of derivatives is 0 but Vulkan expects the value to be in the [-inf, -22.0f] range. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14147>	2021-12-13 10:00:07 +00:00
Marcin Ślusarz	87f03b1662	nir: limit lower_clip_cull_distance_arrays input to traditional stages Compute, task, mesh & raytracing stages don't support ClipDistance/CullDistance as input. This change is not needed for correctness. Just something I stumbled on. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14149>	2021-12-13 08:32:23 +00:00
Marek Olšák	2785141c16	nir: add nir_has_divergent_loop function Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13966>	2021-12-11 20:07:35 +00:00
Marek Olšák	26b522eae5	nir: serialize divergent fields Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13966>	2021-12-11 20:07:35 +00:00
Marek Olšák	6eb3fe2d4f	nir: disable a NIR test due to undebuggable & locally unreproducible CI failures debian-vulkan but not any other CI pipeline consistently fails with: FileNotFoundError: [Errno 2] No such file or directory: 'nir_tests.xml' I have to assume that either debian-vulkan is broken, or the NIR test infrastructure is broken. That's not all. I got the same failure when I wanted to add a new test, which means the CI is preventing us from adding new NIR tests, which is a very serious problem with the CI or NIR tests. The python error doesn't imply that it's a test failure, so something else is broken. If you don't want such commits to happen again, print better error messages. See also the discussion in the MR. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13966>	2021-12-11 20:07:35 +00:00
Marek Olšák	2ab310b78b	nir: handle more intrinsics in divergence analysis Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13966>	2021-12-11 20:07:35 +00:00
Emma Anholt	d199d65c3a	nir/nir_opt_move,sink: Include load_ubo_vec4 as a load_ubo instr. We weren't doing much motion in nir-to-tgsi because we considered all our lowered load-ubos as unmovable. softpipe shader-db: total temps in shared programs: 563942 -> 563136 (-0.14%) temps in affected programs: 9833 -> 9027 (-8.20%) r300 shader-db: instructions in affected programs: 22858 -> 23575 (3.14%) temps in affected programs: 2039 -> 1813 (-11.08%) (NIR had given r300 -19% instrs for +40% temps, so this feels like a worthwhile trade back). Reivewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14138>	2021-12-11 02:12:27 +00:00
Emma Anholt	de33205f88	nir/algebraic: Move all the individual transforms to a common table. Cuts 28% of the remaining relocations in libvulkan_intel.so, shrinks binary size by 290kb. Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13987>	2021-12-07 07:09:00 +00:00
Emma Anholt	a29b54f014	nir/algebraic: Mark the automaton's filter tables as const. Moves it to .rodata instead of .data. Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13987>	2021-12-07 07:09:00 +00:00
Emma Anholt	45a8d11b6e	nir/algebraic: Pack various bitfields in the nir_search_value_union. This gets our union's size down to 22 bytes (now smaller than any of the union's types were before we made the union!). Cuts another 48kb off of the drivers. Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13987>	2021-12-07 07:09:00 +00:00
Emma Anholt	53f49b7066	nir/algebraic: Move relocations for variable conds to a table. This helps concentrate the dirty pages from the relocations, reduces how many relocations there are, and reduces the size of each variable assuming variables mostly don't have conditions or the conditions are mostly reused). Reduces libvulkan_intel.so size by 49kb. Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13987>	2021-12-07 07:09:00 +00:00
Emma Anholt	8485a78977	nir/algebraic: Move relocations for expression conds to a table. This helps concentrate the dirty pages from the relocations, reduces how many relocations there are, and reduces the size of each expression (assuming expressions mostly don't have conditions or the conditions are mostly reused). Reduces libvulkan_intel.so size by 8.7kb. Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13987>	2021-12-07 07:09:00 +00:00
Emma Anholt	7635379dc7	nir/algebraic: Remove array-of-cond code You can't have an array of them after removing many-comm-expr, there's no space in the struct. Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13987>	2021-12-07 07:09:00 +00:00
Emma Anholt	5d82c61a30	nir/algebraic: Replace relocations for nir_search values with a table. Even with packing all 3 types into a 40-byte union (nir_search_constant being 24 bytes and nir_search_expression having formerly been 32), and having a single array of them, this cuts 1.7MB from each of libvulkan_intel.so and libgallium_dri.so. Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13987>	2021-12-07 07:09:00 +00:00
Emma Anholt	e7d8717375	nir/algebraic: Drop the check for cache == None. The cache is always set. Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13987>	2021-12-07 07:08:59 +00:00
Emma Anholt	a263474d3b	nir/algebraic: Move some generated-code algebraic opt args into a struct. I'm going to be adding some more tables to reduce relocations in the generated code, so move the current tables to a struct for arg-passing sanity. Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13987>	2021-12-07 07:08:59 +00:00
Ian Romanick	b88202b0e4	nir/constant_folding: Optimize txb with bias of constant zero to tex v2: Fail gracefully when bias_idx < 0. See comment in the code for the rationale. See also issue #5722. All Haswell and newer Intel GPUs had similar results. (Ice Lake shown) total instructions in shared programs: 19757733 -> 19753431 (-0.02%) instructions in affected programs: 277248 -> 272946 (-1.55%) helped: 1644 HURT: 1 helped stats (abs) min: 1 max: 16 x̄: 2.62 x̃: 2 helped stats (rel) min: 0.05% max: 11.11% x̄: 2.11% x̃: 1.61% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.35% max: 0.35% x̄: 0.35% x̃: 0.35% 95% mean confidence interval for instructions value: -2.72 -2.51 95% mean confidence interval for instructions %-change: -2.19% -2.03% Instructions are helped. total cycles in shared programs: 938517439 -> 938384079 (-0.01%) cycles in affected programs: 19548849 -> 19415489 (-0.68%) helped: 1358 HURT: 269 helped stats (abs) min: 1 max: 2328 x̄: 133.01 x̃: 16 helped stats (rel) min: <.01% max: 41.12% x̄: 1.40% x̃: 0.48% HURT stats (abs) min: 1 max: 1302 x̄: 175.70 x̃: 30 HURT stats (rel) min: <.01% max: 69.03% x̄: 6.24% x̃: 1.04% 95% mean confidence interval for cycles value: -99.14 -64.79 95% mean confidence interval for cycles %-change: -0.47% 0.19% Inconclusive result (%-change mean confidence interval includes 0). LOST: 21 GAINED: 32 All Ivy Bridge and older Intel GPUs had similar results. (Ivy Bridge shown) total instructions in shared programs: 15302017 -> 15301485 (<.01%) instructions in affected programs: 22565 -> 22033 (-2.36%) helped: 168 HURT: 0 helped stats (abs) min: 1 max: 7 x̄: 3.17 x̃: 3 helped stats (rel) min: 0.04% max: 4.39% x̄: 3.05% x̃: 3.27% 95% mean confidence interval for instructions value: -3.45 -2.89 95% mean confidence interval for instructions %-change: -3.19% -2.91% Instructions are helped. total cycles in shared programs: 550119761 -> 549989147 (-0.02%) cycles in affected programs: 12834251 -> 12703637 (-1.02%) helped: 164 HURT: 0 helped stats (abs) min: 20 max: 4547 x̄: 796.43 x̃: 294 helped stats (rel) min: 0.23% max: 53.84% x̄: 2.05% x̃: 0.37% 95% mean confidence interval for cycles value: -942.62 -650.24 95% mean confidence interval for cycles %-change: -3.17% -0.94% Cycles are helped. fossil-db results: Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown) Instructions in all programs: 142073649 -> 141307526 (-0.5%) SENDs in all programs: 6876848 -> 6876778 (-0.0%) Loops in all programs: 38283 -> 38283 (+0.0%) Cycles in all programs: 8410049681 -> 8402902960 (-0.1%) Spills in all programs: 190623 -> 190599 (-0.0%) Fills in all programs: 297780 -> 297756 (-0.0%) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v1] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14025>	2021-12-06 19:50:42 +00:00
Lionel Landwerlin	0cbcc15afe	nir: add a ray query optimization pass Just remove queries that are never used or proceeded with. The latter case leading to undefined values. v2: Don't use nir_shader_instructions_pass() to find variables (Caio) Simplify replacement (Caio) v3: Don't track all the queries intrinsic effects (Caio) Rename things to represent only read queries (Caio) Use set instead of hash_table (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13718>	2021-12-04 20:46:35 +00:00
Lionel Landwerlin	5a9cdab170	nir: track variables representing ray queries v2: Fix missing ray_query variable check (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13718>	2021-12-04 20:46:35 +00:00
Lionel Landwerlin	0d6f050b46	nir: add intrinsics for ray queries Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13718>	2021-12-04 20:46:35 +00:00
Lionel Landwerlin	0800ec2c77	nir: add a new access flag to allow access in helper invocations v2: Add nir_print support Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13718>	2021-12-04 20:46:35 +00:00
Lionel Landwerlin	54489b3c09	nir/print: printout ACCESS_STREAM_CACHE_POLICY Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13718>	2021-12-04 20:46:35 +00:00
Lionel Landwerlin	f98984ad13	nir/lower_io: include the variable access in the lowered intrinsic Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13718>	2021-12-04 20:46:35 +00:00
Marcin Ślusarz	b717872e08	intel/compiler: Get mesh_global_addr from the Inline Parameter for Task/Mesh Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>	2021-12-04 00:41:46 +00:00
Timur Kristóf	f28adc711f	nir: Print task and mesh shader I/O variable names. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14007>	2021-12-03 21:34:45 +00:00
Timur Kristóf	7e66da89f8	nir: Fix sorting per-primitive outputs. Fixes: `59860d4873` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14006>	2021-12-03 17:06:47 +00:00
Rhys Perry	a2d8c5b26d	nir/algebraic: optimize a*#b & -4 fossil-db (Sienna Cichlid): Totals from 611 (0.47% of 128647) affected shaders: CodeSize: 3096680 -> 3090976 (-0.18%) Instrs: 570494 -> 569249 (-0.22%) Latency: 5765865 -> 5759619 (-0.11%) InvThroughput: 969840 -> 967608 (-0.23%) VClause: 9690 -> 9688 (-0.02%) Copies: 42884 -> 42894 (+0.02%); split: -0.01%, +0.03% PreVGPRs: 28290 -> 28288 (-0.01%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13752>	2021-12-03 13:41:07 +00:00
Rhys Perry	2368c36427	nir/opt_offsets: remove need to loop try_extract_const_addition fossil-db (Sienna Cichlid): Totals from 1 (0.00% of 134572) affected shaders: no stat changes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14009>	2021-12-03 11:51:49 +00:00
Rhys Perry	5c0fe11072	nir/opt_offsets: fix try_extract_const_addition recursion This initially looks like a miscompilation bug, but I don't think it's actually possible for it to create incorrect code. fossil-db (Sienna Cichlid): Totals from 32 (0.02% of 134572) affected shaders: VGPRs: 1336 -> 1320 (-1.20%) CodeSize: 90552 -> 89468 (-1.20%) Instrs: 17007 -> 16852 (-0.91%); split: -0.92%, +0.01% Latency: 429040 -> 428136 (-0.21%); split: -0.21%, +0.00% InvThroughput: 84966 -> 84572 (-0.46%); split: -0.47%, +0.00% Copies: 1458 -> 1468 (+0.69%); split: -0.07%, +0.75% Branches: 382 -> 384 (+0.52%) PreSGPRs: 970 -> 968 (-0.21%) PreVGPRs: 1029 -> 1011 (-1.75%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14009>	2021-12-03 11:51:49 +00:00
Juan A. Suarez Romero	f77ccdfb4a	nir: add NIR_DEBUG envvar Move all the NIR related debug environmental variables in a single NIR_DEBUG one. Use NIR_DEBUG=help to print all the available options. v2: - Use a macro to simplify (Marcin, Jason) - Remove wrong changes (Marcin) v3 (Marcin): - Remove rendundant NIR mentioning in option descriptions. - Unwrap option descriptions. - Ensure the constant is unsigned. - Use extern array to remove switch. v4: - Add missing kernel shader (Jason). - Add unlikely() (Marcin). Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13840>	2021-12-03 11:15:29 +00:00
Emma Anholt	06fe04b4d7	nir: Make nir_build_alu() variants per 1-4 arg count. This saves a bunch of generated code to pack up the extra NULLs to get to 4 args, and saves executing the conditions in nir_build_alu() to then skip those NULLs. Saves another 27kb on disk. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13916>	2021-12-01 22:12:19 +00:00
Emma Anholt	e770ec1182	nir: Uninline a bunch of nir.h functions. I aimed for "things that look like big switch statements, or cases where the compiler is unlikely to be able to constant-propagate an argument into something useful." Saves another 80kb on disk. No perf difference on iris shader-db, n=23. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13916>	2021-12-01 22:12:19 +00:00
Lionel Landwerlin	8e568d3f00	nir/opt_deref: don't try to cast empty structures Found while running valgrind : ==3583454== Invalid read of size 4 ==3583454== at 0xF48336: glsl_get_struct_field_offset (nir_types.cpp:84) ==3583454== by 0xC7CD0D: opt_replace_struct_wrapper_cast (nir_deref.c:1068) ==3583454== by 0xC7CDD9: opt_deref_cast (nir_deref.c:1087) ==3583454== by 0xC7DD8E: nir_opt_deref_impl (nir_deref.c:1369) ==3583454== by 0xC7DF4E: nir_opt_deref (nir_deref.c:1428) ==3583454== by 0xA63F3C: brw_kernel_from_spirv (brw_kernel.c:325) ==3583454== by 0xA3BC2C: main (intel_clc.c:481) ==3583454== Address 0xe4f7e88 is 24 bytes after a block of size 48 in arena "client" Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13952>	2021-12-01 08:24:39 +00:00
Mykhailo Skorokhodov	391569e911	nir: Fix read depth for predecessors In some non-trivial cases (the amber script file in the merge request description) phi instruction has more than 32 elements in predecessors tree and that isn't recursion, just large tree. In that case, phis not fully converted into a register or mov, but successfully removed. The fix removes the counter and adds container of visited blocks. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3690 Cc: mesa-stable Signed-off-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13710>	2021-11-30 00:12:48 +00:00
Rhys Perry	32a8b391e3	nir/tests: add DCE test for loops following a jump Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10284>	2021-11-29 22:22:24 +00:00
Rhys Perry	cc5dd15417	nir/cf: fix insertion of loops/ifs after jumps Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10284>	2021-11-29 22:22:24 +00:00
Rhys Perry	2fe13aa2ad	nir/dce: fix DCE of loops with a halt or return instruction in the pre-header If there is a halt or return instruction right before a loop with a single continue, we would have taken the fast path intended for loops without continues. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `71a985d80b` ("nir/dce: perform DCE for unlooped instructions in a single pass") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10284>	2021-11-29 22:22:24 +00:00
Ilia Mirkin	b7f423006a	nir/lower_clip: support clipdist array + no vars This runs after the "to io" lowering on freedreno. Support this case. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13917>	2021-11-28 04:44:56 +00:00
Ilia Mirkin	7efb1c4b29	nir/lower_clip: increment num_inputs/outputs by appropriate amount The inputs/outputs are meant to be in vec4 units. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13917>	2021-11-28 04:44:56 +00:00
Ilia Mirkin	3bf47700e2	nir/lower_clip: location offset goes into offset, not base Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13917>	2021-11-28 04:44:56 +00:00
Ilia Mirkin	a8930e6302	nir/lower_clip: replace bogus comment about gl_ClipDistance reading in GL gl_ClipDistance most definitely can be read in fragment shaders since GLSL 1.30. This is also accessible in ES with EXT_clip_cull_distance. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13917>	2021-11-28 04:44:56 +00:00
Marek Olšák	e54264c84f	nir: add shader_info::source_sha1, its initialization and printing Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13869>	2021-11-26 11:58:27 +00:00
Rhys Perry	34510ce3cc	nir/lower_subgroups: fix left shift of -1 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5365 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12901>	2021-11-24 16:45:05 +00:00
Rhys Perry	811a7a2d31	nir/lower_tex: don't calculate texture_mask for texture_index>=32 With Vulkan, texture_index can be 32 or larger, which creates a shift exponent larger than 31 (undefined behaviour). Since we don't use texture_mask with Vulkan, just initialize it to 0. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5365 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12901>	2021-11-24 16:45:04 +00:00
Rhys Perry	b425100781	spirv: run nir_copy_prop before nir_rematerialize_derefs_in_use_blocks_impl spirv_to_nir sometimes wraps derefs in vec2 or mov instructions as part of its texture handling. These get in the way of nir_rematerialize_derefs_in_use_blocks_impl. Running copy propagation should get rid of the extra move instructions and get us back to intact deref chains for everything except variable pointer use-cases. fossil-db (Sienna Cichlid): Totals from 6 (0.00% of 134572) affected shaders: CodeSize: 92656 -> 93088 (+0.47%) Instrs: 17060 -> 17138 (+0.46%) Latency: 224408 -> 227539 (+1.40%) InvThroughput: 37402 -> 37924 (+1.40%) VClause: 408 -> 402 (-1.47%) Copies: 1065 -> 1107 (+3.94%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5668 Fixes: `14a12b771d` ("spirv: Rework our handling of images and samplers") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13924>	2021-11-24 15:43:51 +00:00
Danylo Piliaiev	99388f0c27	freedreno/ir3: handle global atomics Only for a6xx since we don't know the instructions for global atomics on previous gens. Per Qualcomm's docs in OpenCL atomics are only supported since a5xx together with Generic memory space. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8717>	2021-11-23 18:26:37 +00:00
Emma Anholt	7603187aec	nir: Un-inline more of nir_builder.h. Cuts another 470KB of libnir.a in my release build. Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13889>	2021-11-22 20:40:47 +00:00
Emma Anholt	d9bfcf5f5b	nir: Un-inline nir_builder_alu_instr_finish_and_insert() This function is big and I don't think it will won't get meaningfully constant-propagated during inlining without LTO. Move it to a .c file so we just have one copy, saving 2.8MB from libnir.a on an amd64 release build. text data bss total filename before: 18953406 7768312 687260 27408978 build-release/driver-symlinks/iris_dri.so 9734366 5542453 481692 15758511 build-release/lib/libvulkan_intel.so 28687772 13310765 1168952 43167489 (TOTALS) after: 15478350 7767864 687260 23933474 build-release/driver-symlinks/iris_dri.so 6810366 5541685 481692 12833743 build-release/lib/libvulkan_intel.so 22288716 13309549 1168952 36767217 (TOTALS) No statistically significant performance difference on iris shader-db, n=8. Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13889>	2021-11-22 20:40:47 +00:00
Ilia Mirkin	3b5b4b5d45	nir: apply interpolated input intrinsics setting when lowering clipdist For drivers that use this in fragment shaders, load_input is going to produce incorrect results (flat-shaded values). Fixes clipping tests on a4xx. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13900>	2021-11-22 20:11:19 +00:00
Ilia Mirkin	df934873e1	nir: always keep the clip distance array size updated Drivers expect to know the number of clip distances irrespective of whether compact arrays are used or not. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13900>	2021-11-22 20:11:19 +00:00
Connor Abbott	508f917d8c	util/dag: Make edge data a uintptr_t Nobody was actually using it as a pointer, and I'm going to introduce a shared function which relies on it not being a pointer so let's fix this once and for all. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>	2021-11-17 13:41:47 +00:00
Samuel Pitoiset	011ea32585	nir: fix constant expression of ibitfield_extract This fixes dEQP-VK.graphicsfuzz.cov-condition-bitfield-extract-integer. For example, nir_ibitfield_extract(3, 1, 2) should return 1. Cc: 21.3 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13791>	2021-11-16 17:32:21 +00:00
Timur Kristóf	59860d4873	nir: Group per-primitive outputs at the end for driver location assign. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13466>	2021-11-16 07:46:55 +00:00
Timur Kristóf	f23f7ef316	nir: Don't compact per-vertex and per-primitive outputs together. Prevent nir_compact_varyings from putting per-vertex and per-primitive output components in the same slot. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13466>	2021-11-16 07:46:55 +00:00
Timur Kristóf	e1e461d11c	nir: Lower cull and clip distance arrays for mesh shaders. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13466>	2021-11-16 07:46:55 +00:00
Timur Kristóf	6a502a0a2c	nir: Add new option to lower invocation ID from invocation index. Add this as an option to nir_lower_compute_system_values_options instead of just relying on the shader's options. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13466>	2021-11-16 07:46:55 +00:00
Timur Kristóf	7562e34463	nir, spirv: Don't mark NV_mesh_shader primitive indices as per-primitive. They are not per-primitive in NV_mesh_shader, but a flat array. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13466>	2021-11-16 07:46:55 +00:00
Timur Kristóf	d79d9a7a06	nir: Fix nir_lower_io with per primitive outputs. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13466>	2021-11-16 07:46:55 +00:00
Timur Kristóf	9cf4124be0	nir: Print Mesh Shader specific info. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13466>	2021-11-16 07:46:55 +00:00
Timur Kristóf	5aa39253cb	nir: Rename nir_get_io_vertex_index_src and include per-primitive I/O. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13466>	2021-11-16 07:46:55 +00:00
Ilia Mirkin	185826a400	nir: remove double-validation of src component counts The nir_tex_instr_src_size helper already sorts this out correctly, no need to do it twice, and validate_src takes care of it. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13781>	2021-11-16 01:23:41 +00:00
Daniel Schürmann	1e4c6e059e	nir/fold_16bit_sampler_conversions: skip sparse residency tex instructions The residency return value mismatches between NIR and Radeon. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13592>	2021-11-15 18:28:20 +00:00
Rhys Perry	719b48f85d	nir/lower_system_values: replace local_invocation_id components with zero fossil-db (Sienna Cichlid): Totals from 360 (0.28% of 128647) affected shaders: VGPRs: 7912 -> 7272 (-8.09%); split: -8.59%, +0.51% CodeSize: 542456 -> 544688 (+0.41%); split: -0.32%, +0.73% MaxWaves: 10866 -> 10952 (+0.79%) Instrs: 95973 -> 96010 (+0.04%); split: -0.34%, +0.38% Latency: 4366023 -> 4344664 (-0.49%); split: -0.90%, +0.41% InvThroughput: 19656659 -> 18297185 (-6.92%); split: -6.92%, +0.00% VClause: 3242 -> 3116 (-3.89%); split: -4.04%, +0.15% SClause: 3422 -> 3504 (+2.40%); split: -0.20%, +2.60% Copies: 8854 -> 9376 (+5.90%); split: -0.89%, +6.79% Branches: 2329 -> 2326 (-0.13%); split: -0.39%, +0.26% PreSGPRs: 7620 -> 7841 (+2.90%); split: -0.43%, +3.33% PreVGPRs: 5765 -> 5504 (-4.53%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel-schuermann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13757>	2021-11-12 18:59:51 +00:00
Alyssa Rosenzweig	e257344a82	nir/lower_pntc_ytransform: Support PointCoordIsSysval Pattern match the point coord sysval and support lowering it as well. This is required to handle flipped framebuffers on Bifrost. However, what this pass normalizes to is the opposite of the hardware mode we used on Bifrost before, so we need to swap modes at the same time to prevent regressions. Fixes Piglit glsl-fs-pointcoord and glsl-fs-pointcoord_gles2 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13073>	2021-11-12 12:34:14 +00:00
Marek Olšák	33b4eb149e	nir: add new SSA instruction scheduler grouping loads into indirection groups Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13604>	2021-11-08 21:20:11 +00:00
Filip Gawin	f32dcb6fe1	nir: assert that variables in optimize_atomic are initialized If you gonna view context of function parse_atomic_op, then you gonna know that index for array (data_src) can be unitialized. Imho this approach is cleaner than doing stuff inside parse_atomic_op. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12995>	2021-11-08 15:10:07 +00:00
Rhys Perry	12294026d5	nir/algebraic: optimize Cyberpunk 2077's open-coded bitfieldReverse() fossil-db (Sienna Cichlid): Totals from 9 (0.01% of 128647) affected shaders: CodeSize: 29900 -> 28640 (-4.21%) Instrs: 5677 -> 5443 (-4.12%) Latency: 96561 -> 95025 (-1.59%) Copies: 571 -> 544 (-4.73%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13673>	2021-11-05 09:31:04 +00:00
Mike Blumenkrantz	16f838576c	nir/lower_io_to_scalar: add support for bo and shared io Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13485>	2021-10-27 16:46:01 +00:00
Alyssa Rosenzweig	d8b1afdc85	nir/lower_blend: Use correct clamp for SNORM nir_lower_blend was written against the OpenGL ES 3.2 specification, which does not support blending SNORM render targets. The ES spec says that non-floating point buffers get clamped to [0, 1] before blending. The story is not so simple: SNORM buffers are blendable in OpenGL and must clamped to [-1, 1] rather than [0, 1]. Handle this case. NIR does have the fsat_signed_mali instruction to clamp to [-1, 1], but it is only implemented in Panfrost, and this pass is in common code. Open code it instead. Panfrost optimizes the open coded version, so this is good enough. Fixes SNORM subtests of Piglit arb_texture_view-rendering-formats. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13499>	2021-10-26 19:16:36 +00:00
Danylo Piliaiev	b7c7abded7	nir/serialize: Make more space for intrinsic_op allowing 1024 ops We are close to the limit of 512 intrinsics, make more space to be able to support up to 1024 intrinsics. Take one bit from packed_const_indices, they shouldn't suffer in a common case. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13456>	2021-10-25 16:17:09 +00:00
Danylo Piliaiev	1eee1fda11	nir/lower_amul: do not lower 64bit amul to imul24 Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13300>	2021-10-21 18:59:57 +00:00
Caio Marcelo de Oliveira Filho	662fbc0120	nir: Use a single binary for gtests Less artifacts and less time running linker. The load_store_vectorizer test is still split since we need to update gitlab-ci scripts to skip certain tests in certain builds. Added a TODO with the concrete suggestion. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13414>	2021-10-20 18:26:31 +00:00
Jason Ekstrand	b62b2fa4b9	compiler/types: Add a wrap_in_arrays helper This has been copied+pasted 3 times now. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13389>	2021-10-16 05:49:34 +00:00
Jason Ekstrand	5818d47ae6	spirv: Use texture types for sampled images Instead of using gsamplerND types for sampled images, use the new gtextureND types for sampled images and reserve gsamplerND for combined image+samplers. Combined image+sampler bindings still get a gsamplerND type. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13389>	2021-10-16 05:49:34 +00:00
Jason Ekstrand	b8a0bf2343	nir/deref: Also optimize samplerND -> textureND casts Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13389>	2021-10-16 05:49:34 +00:00
Jason Ekstrand	2ab5546a96	nir: Allow texture types Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13389>	2021-10-16 05:49:34 +00:00
Jason Ekstrand	3ace6b968b	compiler/types: Add a texture type This is separate from images and samplers. It's a texture (not a storage image) without a sampler. We also add C-visible helpers to convert between sampler and image types. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13389>	2021-10-16 05:49:34 +00:00
Jason Ekstrand	d343aef942	nir/serialize: Pack deref modes better With nir_var_image, we've now run out of bits in our packed blob for deref instructions. We could revert to an unpacked blob or we could be a bit more clever about how we encode deref modes and pack them into 5 bits. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13386>	2021-10-16 03:47:10 +00:00
Jason Ekstrand	9272a952c9	nir: Re-arrange the variable modes Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13386>	2021-10-16 03:47:10 +00:00
Jason Ekstrand	956199e870	nir: s/nir_var_mem_image/nir_var_image/g We typically use nir_var_mem_* for stuff that has an explicit byte-based memory layout. Images are opaque. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13386>	2021-10-16 03:47:10 +00:00
Dylan Baker	e73096bd6d	meson: use gtest protocol for gtest based tests when possible With the `gtest` protocol meson will add some extra arguments to the test to generate better junit results, which may be useful. This protocol is only available in meson 0.55.0+, so keep using the default `exitcode` protocol for meson older than that. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8484>	2021-10-16 03:22:24 +00:00
Jason Ekstrand	58f605e4d4	nir: Drop our attempt at typed-based image mode validation This is broken for bindless images declared as local variables. It turns out nir_variable::data::bindless is only used for uniforms and we already assume anything in nir_var_function_temp or similar is bindless. We could try to make a tricky assert but now that we have everything else passing but now that we've got everyone converted the extra validation probably isn't necessary. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13384>	2021-10-15 22:35:59 +00:00
Jason Ekstrand	4c5a88d735	nir: Validate image variable modes We can also significantly simplify the foreach_image_variable helper. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4743>	2021-10-15 14:58:56 +00:00
Jason Ekstrand	6818811fc4	nir/lower_readonly_images_to_tex: Also rewrite variable modes Storage images will start using nir_var_mem_image but sampled images still use nir_var_uniform. If we're going to rewrite types, we need to rewrite the modes as well. Otherwise, nir_validate will get grumpy and drivers might get confused. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4743>	2021-10-15 14:58:56 +00:00
Jason Ekstrand	2a53c33fbe	nir: Add a nir_foreach_image_variable() iterator Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4743>	2021-10-15 14:58:55 +00:00
Caio Marcelo de Oliveira Filho	de3705edb0	nir: Add nir_var_mem_image Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4743>	2021-10-15 14:58:55 +00:00
Caio Marcelo de Oliveira Filho	872750bb96	nir/schedule: Handle nir_intrisic_scoped_barrier Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4743>	2021-10-15 14:58:55 +00:00
Mike Blumenkrantz	f769f34680	nir/print: print bindless info as applicable this is useful to know Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13204>	2021-10-14 15:11:38 +00:00
Ian Romanick	ae99ea6f4d	nir/loop_unroll: Always unroll loops that iterate at most once Two carchase compute shaders (shader-db) and two Fallout 4 fragment shaders (fossil-db) were helped. Based on the NIR of the shaders, all four had structures like for (i = 0; i < 1; i++) { ... for (...) { ... } } All HSW+ platforms had similar results. (Ice Lake shown) total loops in shared programs: 6033 -> 6031 (-0.03%) loops in affected programs: 4 -> 2 (-50.00%) helped: 2 HURT: 0 All Intel platforms had similar results. (Ice Lake shown) Instructions in all programs: 143692018 -> 143692006 (-0.0%) SENDs in all programs: 6947154 -> 6947154 (+0.0%) Loops in all programs: 38285 -> 38283 (-0.0%) Cycles in all programs: 8434822225 -> 8434476815 (-0.0%) Spills in all programs: 191665 -> 191665 (+0.0%) Fills in all programs: 298822 -> 298822 (+0.0%) In the presense of loop unrolling like this, the change in cycles is not accurate. v2: Rearrange the logic in the if-condition to read a little better. Suggested by Tim. Closes: #5089 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13323>	2021-10-13 20:11:13 -07:00
Qiang Yu	50c0451424	nir/linker: rename replace_constant_input to replace_varying_input_by_constant_load To align with replace_varying_input_by_uniform_load and better describe what it does. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12613>	2021-10-13 04:45:15 +00:00
Qiang Yu	2604625043	nir/linker: support uniform when optimizing varying Varying assigned from uniform won't change after interpolation, so move uniform load to fragment shader to eliminate the varying. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12613>	2021-10-13 04:45:15 +00:00
Filip Gawin	28a6e45a0f	nir: avoiding reading unitialized memory when using nir_dest_copy Deeper in chain of calls, function "src_has_indirect" is used (which reads "is_ssa" and "reg.indirect"). Fixes: `d1eae6f36b` ("nir: Properly clean up nir_src/dest indirects") Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13317>	2021-10-13 02:21:20 +00:00
Connor Abbott	b516208a55	nir/lower_ubo_vec4: Fix align_mul=8 special case In order for the load to never straddle the load can't extend past 8 bytes, not 16. For example a vec2 load with align_mul = 8 and align_offset = 4 can straddle. Fixes assertion failures when we stop pushing UBOs in the preamble on a6xx. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13142>	2021-10-12 11:30:52 +00:00
Jason Ekstrand	878d8d96c7	nir/lower_discard_or_demote: Fix metadata Passes generally shouldn't use nir_metadata_all unless they don't change the program in any significant way. Some of these passes insert new instructions so they should definitely not be preserving most of it. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13261>	2021-10-08 23:24:49 +00:00
Chia-I Wu	8cce6281e6	util/vector: make util_vector_init harder to misuse Make u_vector_init a wrapper to u_vector_init_pot. Let both take (element_count, element_size) as parameters. Motivated by `eed0fc4caf` ("vulkan/wsi/wayland: fix an invalid u_vector_init call") v2: rename u_vector_init_pot to u_vector_init_pow2 Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Simon Ser <contact@emersion.fr> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13201>	2021-10-08 00:15:11 +00:00
Boris Brezillon	56251f924d	nir: Add a nir_sysvals_to_varyings() helper Allow backends to turn some sysvals into input varyings so the frontend (in our case spirv_to_nir()) doesn't have to bother selecting which one is expected. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13017>	2021-10-07 19:45:35 +00:00
Jason Ekstrand	b71bdc3404	nir/algebraic: Add some opts for comparisons of comparisons Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13167>	2021-10-07 18:21:11 +00:00
Jason Ekstrand	7abf3955ca	nir/algebraic: Add some boolean optimizations Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13167>	2021-10-07 18:21:11 +00:00
Jason Ekstrand	c8b2be0b95	nir/algebraic: Lower fisfinite Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13167>	2021-10-07 18:21:11 +00:00
Rhys Perry	f3723822a4	nir/lower_tex: add lower_to_fragment_fetch_amd Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12214>	2021-10-07 15:36:39 +00:00
Rhys Perry	225fe37c14	nir: add _amd suffix to fragment_mask_fetch and fragment_fetch texops Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12214>	2021-10-07 15:36:39 +00:00
Marcin Ślusarz	3a18963b08	nir/print: pad 64-bit constants with zeroes ... just like other-size constants are. Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13223>	2021-10-07 10:49:15 +00:00
Emma Anholt	7dde279db5	nir-to-tgsi: Avoid emitting TXL just for lod 0 on non-vertex shaders. Prompted by comparing virgl fails and finding that it has issues with immediate args to TXL/TXB, at least. Acked-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>	2021-10-06 03:44:17 +00:00
Ian Romanick	cb28361642	nir/algebraic: Small optimizations for SpvOpFOrdNotEqual and SpvOpFUnordEqual No shader-db changes on any Intel platform. Fossil-db results: All Intel platforms had similar results. (Ice Lake shown) Instructions in all programs: 144380118 -> 143692823 (-0.5%) SENDs in all programs: 6920822 -> 6920822 (+0.0%) Loops in all programs: 38299 -> 38299 (+0.0%) Cycles in all programs: 8434782176 -> 8423078994 (-0.1%) Spills in all programs: 206830 -> 204469 (-1.1%) Fills in all programs: 318737 -> 313660 (-1.6%) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12320>	2021-10-06 01:53:47 +00:00
Alyssa Rosenzweig	3e8f540753	nir: Add Mali-specific derivative opcodes Add derivative opcodes fddx_must_abs_mali/fddy_must_abs_mali satisfying: fabs(fdd_must_abs_mali(v)) = fabs(fdd(v)) The sign of their result is undefined. On Bifrost and Valhall, these unsigned derivatives can be implemented more efficiently than the correctly-signed counterparts, since the sign fixup requires extra ALU instructions. On backends where this is the case, it is useful to optimize fabs(fdd(v)) to fabs(fdd_must_abs_mali(v)). This pattern comes up with the GLSL builtin `fwidth`. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12332>	2021-10-06 00:40:57 +00:00
Lionel Landwerlin	d0a3a11258	nir/lower_io: preserve all metadata when no progress Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13168>	2021-10-05 11:23:23 +00:00
Marcin Ślusarz	e26328582a	nir: preserve all metadata when nir_opt_vectorize doesn't make progress Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13189>	2021-10-05 10:02:54 +00:00
Marcin Ślusarz	54df09c8d4	nir: preserve all metadata when nir_propagate_invariant doesn't make progress Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13189>	2021-10-05 10:02:54 +00:00
Marcin Ślusarz	804c56f1a2	nir: preserve all metadata when nir_lower_int_to_float doesn't make progress Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13189>	2021-10-05 10:02:54 +00:00
Boris Brezillon	7cd402c9c8	nir/lower_blend: Shrink blended result if needed Make sure the new and old sources have the same number of components, otherwise the NIR validation pass complains. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13060>	2021-09-30 16:54:42 +02:00
Boris Brezillon	3e07b8d4f8	nir/lower_blend: Make sure we're not passed scaled formats SCALED formats are interpreted as floats, but not in the usual [0, 1] or [-1, 1] range, meaning that the blend lowering logic can't directly apply to those. Assert that the format being passed is not a scaled format. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13060>	2021-09-30 16:54:42 +02:00
Boris Brezillon	15b4cab4d5	nir/lower_blend: Don't lower RTs whose format is set to NONE The caller doesn't necessarily want to lower blend operations on all render targets since some of them might be supported natively (panvk will be in that case). Let's just skip RTs that have a format set to PIPE_FORMAT_NONE to allow that. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13060>	2021-09-30 16:54:42 +02:00
Boris Brezillon	637cd5ac00	nir/lower_blend: Pad src to a 4-component vector nir_ssa_for_src() is not supposed to pad the src vector if dst->num_components > src->num_components. Let's pad things explicitly with nir_pad_vector(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13060>	2021-09-30 16:54:42 +02:00
Boris Brezillon	641bed3103	nir: Make sure src->num_components < dst->num_components in nir_ssa_for_src() The NIR validation complains if the swizzle accesses a component that's not present in the source. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13060>	2021-09-30 16:54:42 +02:00
Lionel Landwerlin	daa8a81d99	nir: fix opt_memcpy src/dst mixup Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `f6667cb0ce` ("nir: Add a memcpy optimization pass") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13079>	2021-09-28 16:36:08 +00:00
Rhys Perry	e43007af56	nir/opt_if: add opt_if_rewrite_uniform_uses Turns: if (a == (b=readfirstlane(a))) use(a) into: if (a == (b=readfirstlane(a))) use(b) Improves divergence analysis and lets us scalarize use(a). Improves Cyberpunk 2077 performance. fossil-db (Sienna Cichlid, Cyberpunk 2077): Totals from 57 (10.56% of 540) affected shaders: VGPRs: 4904 -> 4040 (-17.62%) CodeSize: 624360 -> 626828 (+0.40%); split: -0.06%, +0.46% MaxWaves: 656 -> 824 (+25.61%) Instrs: 119770 -> 119447 (-0.27%); split: -0.49%, +0.22% Latency: 1950256 -> 1633110 (-16.26%); split: -16.26%, +0.00% InvThroughput: 364852 -> 292089 (-19.94%) VClause: 1512 -> 1008 (-33.33%) SClause: 2693 -> 3196 (+18.68%) Copies: 10050 -> 9955 (-0.95%); split: -3.34%, +2.40% Branches: 3476 -> 3547 (+2.04%) PreSGPRs: 4003 -> 5076 (+26.80%) PreVGPRs: 4709 -> 3810 (-19.09%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12472>	2021-09-24 18:41:18 +00:00
Rhys Perry	69f9a96af1	nir: add nir_src_components_read() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12472>	2021-09-24 18:41:18 +00:00
Caio Marcelo de Oliveira Filho	240e60ba76	nir/lower_io_to_vector: Allow Task/Mesh to load from outputs Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12951>	2021-09-24 14:35:15 +00:00
Bas Nieuwenhuizen	0d8bd8518d	nir: Support ray launch size in divergence analysis. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12592>	2021-09-21 01:53:39 +00:00
Bas Nieuwenhuizen	56b06c09b4	nir: Add AMD rt intrinsics. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12592>	2021-09-21 01:53:39 +00:00
Bas Nieuwenhuizen	b6be96a2bd	radv: Modify load_sbt_amd intrinsic to get the descriptor. That way we can get the address to the entry, which is needed for some nir builtins because extra data in the entry can be used as shader input. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12592>	2021-09-21 01:53:39 +00:00
Timur Kristóf	872d21820f	nir: Exclude non-generic patch variables from get_variable_io_mask. These are I/O variables which are not going to be removed anyway. However, get_variable_io_mask handles their location incorrectly. Found using the GCC undefined behavior sanitizer. Fixes the following error: runtime error: shift exponent 4294967258 is too large for 64-bit type 'long unsigned int' Closes: #5319 Fixes: `cf5f8f55c3` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12719>	2021-09-20 18:08:16 +00:00
Ian Romanick	d7ba52cce9	nir/edgeflags: Add a flag to indicate the edge flag input is needed Most modern hardware needs the edge flag added as a hidden vertex input and needs code added to the vertex shader to copy the input to an output. Intel hardware is a little different. Gfx4 and Gfx5 hardware works in the previously described mannter. Gfx6+ hardware needs the edge flag as a specific vertex shader input, and that input is magically processed by fixed-function hardware without need for extra shader code. This flag signals only that the vertex shader input is needed. It would be nice if we could decouple adding the vertex shader input from generating the copy-to-output code, but that has proven to be challenging. Not having that code causes other passes to want to eliminate that shader input. v2: Convert conditional to assertion. This pass is only called for vertex shaders. Suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12858>	2021-09-17 16:36:08 -07:00
Rhys Perry	a1af902531	nir/algebraic: distribute fmul(fadd(a, b), c) when b and c are constants This allows for more MAD/FMA instructions to be created. fossil-db (Sienna Cichlid): Totals from 50134 (33.46% of 149839) affected shaders: VGPRs: 2436536 -> 2436000 (-0.02%); split: -0.05%, +0.03% SpillSGPRs: 13136 -> 13135 (-0.01%); split: -0.02%, +0.02% CodeSize: 206621424 -> 206278292 (-0.17%); split: -0.23%, +0.07% MaxWaves: 1116804 -> 1117448 (+0.06%); split: +0.07%, -0.01% Instrs: 38977460 -> 38862886 (-0.29%); split: -0.33%, +0.04% Latency: 832425389 -> 827432260 (-0.60%); split: -0.63%, +0.03% InvThroughput: 184193457 -> 183563350 (-0.34%); split: -0.37%, +0.03% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7458>	2021-09-17 17:28:26 +00:00
Jason Ekstrand	6c7d23e6ca	nir: Stop sweeping indirects They're no longer ralloc'd. Fixes: `879a569884` "nir: Switch from ralloc to malloc for NIR instructions." Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12884>	2021-09-16 11:28:36 +00:00
Jason Ekstrand	d1eae6f36b	nir: Properly clean up nir_src/dest indirects Now that they're no longer ralloc'd, we have to be much more careful about indirects. We have to make sure every time a source or destination is overwritten, its indirect (if any) is freed. We also have to choose a memory ownership convention for the rewrite functions. Assuming that they will be called with the source from some other instruction, we choose to always make a copy of the indirect (if any). It's the responsibility of the caller to ensure its copy of the indirect is freed. Unfortunately, all this extra logic is going to make nir_instr_rewrite/move_src/dest more expensive because they now have all the logic of nir_src/dest_copy instead of a simple struct assignment. Fortunately, the vast majority of rewrite calls are done by nir_ssa_def_rewrite_uses which is an SSA-only fast-path. Fixes: `879a569884` "nir: Switch from ralloc to malloc for NIR instructions." Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12884>	2021-09-16 11:28:36 +00:00
Emma Anholt	aed4c0b5a9	nir: Drop the unused instr arg for src/dest copy functions. Now that we don't use ralloc, we don't need this arg to get at the right ralloc ctx. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:06 +00:00
Emma Anholt	879a569884	nir: Switch from ralloc to malloc for NIR instructions. By replacing the 48-byte ralloc header with our exec_node gc_node (16 bytes), runtime of shader-db on my system across this series drops -4.21738% +/- 1.47757% (n=5). Inspired by discussion on #5034. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:06 +00:00
Emma Anholt	feee5e6974	nir/tests: Fix transmuting an SSA dest to be non-SSA With the de-ralloc changes, having the register dest not have its .reg properly initialized caused crashes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:06 +00:00
Emma Anholt	1edff520e2	nir/lower_phis_to_scalar: Use nir_instr_free() to free instrs. Preparation for de-rallocing instrs. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:06 +00:00
Emma Anholt	d1a2870f78	nir: Add all allocated instructions to a GC list. Right now we're using ralloc to GC our NIR instructions, but ralloc has significant overhead for its recursive nature so it would be nice to use a simpler mechanism for GCing instructions. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:06 +00:00
Emma Anholt	22788d68eb	nir: Consistently pass the instr to nir_src_copy(). The arg says it's supposed to be the instr, not the shader. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:05 +00:00
Emma Anholt	5e37cfb7fe	nir: Consistently pass the shader to the shader arg of instr creation. We were using the ralloc parent in some places, which should work out to be the shader I think, but to de-ralloc the instrs we should just pass the existing shader pointer in. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:05 +00:00
Emma Anholt	7a4bbe60c1	nir/from_ssa: Use nir_instr_free() to free instrs instead of ralloc. This code was being tricky with passing a mem_ctx instead of the shader, then freeing the mem_ctx when the pass was done and all the parallel copies had been removed from the shader. Use the right type for instr creation and do a bit of manual list management to prepare the way for non-ralloc NIR instrs. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:05 +00:00
Emma Anholt	b99efb8af0	nir: Pull the instr list free function out to a helper. With the de-rallocing, we're going to have some more places that free a list of instrs. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:05 +00:00
Emma Anholt	36d9bdca0b	nir: Add a nir_instr_free() to replace ralloc_free(instr). This will gain another step shortly. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:05 +00:00
Ian Romanick	7956a701d8	nir/lower_gs_intrinsics: Make nir_lower_gs_intrinsics be idempotent Calling this lower pass twice in a row would cause spurious set_vertex_and_primitive_count(0, undef) intrinsics after the proper set_vertex_and_primitive_count intrinsic. This pretty much turns any geometry shader into garbage. Fix this by treating nir_intrinsic_emit_vertex_with_counter and nir_intrinsic_end_primitive_with_counter just like the non-_with_counter versions. If no blocks would need set_vertex_and_primitive_count intrinsics added, exit the pass before doing any work. This prevents the need for DCE to do extra clean up later. Since this pass is potentially called multiple times via multiple invocations of a finalize_nir callback, it is (hypothetically?) possible that control flow could be changed to add new blocks that need this intrinsic. The check implemented in this commit should be robust against that possibility. v2: Add a_block_needs_set_vertex_and_primitive_count. Suggested by Timur. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12802>	2021-09-14 09:13:07 -07:00
Ian Romanick	edf357b233	nir/lower_gs_intrinsics: Return progress if append_set_vertex_and_primitive_count makes progress Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Fixes: `542d40d698` ("nir: Add new GS intrinsics that maintain a count of emitted vertices.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12802>	2021-09-14 09:12:47 -07:00
Bas Nieuwenhuizen	b05cd10b8e	nir: Avoid visiting instructions multiple times in nir_instr_free_and_dce. Sadly need to poke a bit in the src internals to avoid using yet another heap allocated datastructure. Fixes: `5251548572` ("nir: Add a nir_instr_remove that recursively removes dead code.") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5323 Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12726>	2021-09-09 21:35:03 +00:00
Rhys Perry	c1f724b2b9	nir: fix serialization of loop/if control Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Fixes: `e76ae39ae2` ("nir: add support for user defined select control") Fixes: `b56451f82c` ("nir: add support for user defined loop control") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12778>	2021-09-09 10:32:30 +00:00
Qiang Yu	7054c1b7fd	nir/linker: pack varyings with different interpolation qualifier Driver like radeonsi load varying in a scalar manner, so prefer to pack varying with different interpolation qualifier into same slot to save space. But driver like panfrost/bifrost can load varying in vector manner, so prefer to pack varying with same interpolation qualifier. Driver can add interpolation qualifiers which are able to be packed into same varying slot to pack_varying_options nir option. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12537>	2021-09-09 06:00:58 +00:00
Qiang Yu	5a24aed1ac	nir/lower_io_to_vector: check centroid & sample when merge variable These qualifiers should be respected for different varying load code generation. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12537>	2021-09-09 06:00:58 +00:00
Rob Clark	b8b475ad4e	nir/lower_amul: Fix usage of nir_foreach_src() nir_foreach_src() bails after cb returns false for any src. Which isn't the behavior we were looking for. Move progress flag to state struct instead, so we don't skip visiting some sources. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12732>	2021-09-06 15:58:05 +00:00
Rob Clark	5800fde1bb	nir/lower_amul: Handle load/store_global These need more than 24b. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12732>	2021-09-06 15:58:05 +00:00
Enrico Galli	9461fe5cf1	nir: Add CAN_REORDER to load_ubo_dxil Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12707>	2021-09-03 16:21:03 +00:00
Rhys Perry	41ecef7855	nir: add sdot_2x16 and udot_2x16 opcodes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>	2021-09-03 13:21:27 +00:00
Rhys Perry	ae00f5af61	nir: separate lower_add_sat Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>	2021-09-03 13:21:27 +00:00

... 3 4 5 6 7 ...

3759 Commits