KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Jason Ekstrand	8993e0973f	intel/nir: Drop an unneeded lower_constant_initializers call Even though this is technically a step in the function inlining process as laid out in nir_inline_functions.c, it's not really needed. We already have constant initializers lowered here and no new ones are added by appending the softfp64 functions. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 17:24:57 +00:00
Jason Ekstrand	fa4824c1db	intel/debug: Add a debug flag to force software fp64 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 17:24:57 +00:00
Ian Romanick	55e6454d5e	intel/fs: Fix extract_u8 of an odd byte from a 64-bit integer In the old code, we would generate the exact same instruction for extract_u8(some_u64, 0) and extract_u8(some_u64, 1). The mask-a-word trick only works for even numbered bytes. This fixes the (new) piglit test tests/spec/arb_gpu_shader_int64/execution/fs-ushr-and-mask.shader_test. v2: Use a SHR instead of an AND. This saves an instruction compared to using two moves. Suggested by Jason. Fixes: `6ac2d16901` ("i965/fs: Fix extract_i8/u8 to a 64-bit destination") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-06 08:35:45 -08:00
Ian Romanick	4aaf139ea4	intel/fs: nir_op_extract_i8 extracts a byte, not a word Fixes: `6ac2d16901` ("i965/fs: Fix extract_i8/u8 to a 64-bit destination") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-06 08:35:42 -08:00
Ian Romanick	bbf20a1ca3	intel/compiler: Silence unused parameter warning in brw_interpolation_map.c The parameter is never used, and it's not part of a common interface idiom. Remove it. src/intel/compiler/brw_interpolation_map.c: In function ‘brw_setup_vue_interpolation’: src/intel/compiler/brw_interpolation_map.c:62:59: warning: unused parameter ‘devinfo’ [-Wunused-parameter] const struct gen_device_info *devinfo) ^~~~~~~ Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-06 08:35:36 -08:00
Ian Romanick	dea19138dd	intel/compiler: Silence many unused parameter warnings in brw_eu.h In file included from src/intel/compiler/brw_eu_util.c:34:0: src/intel/compiler/brw_eu.h: In function ‘brw_message_desc_header_present’: src/intel/compiler/brw_eu.h:288:63: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_message_desc_header_present(const struct gen_device_info devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_message_ex_desc’: src/intel/compiler/brw_eu.h:296:51: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_message_ex_desc(const struct gen_device_info devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_message_ex_desc_ex_mlen’: src/intel/compiler/brw_eu.h:303:59: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_message_ex_desc_ex_mlen(const struct gen_device_info devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_binding_table_index’: src/intel/compiler/brw_eu.h:337:68: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_sampler_desc_binding_table_index(const struct gen_device_info devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_sampler’: src/intel/compiler/brw_eu.h:344:56: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_sampler_desc_sampler(const struct gen_device_info devinfo, uint32_t desc) ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_return_format’: src/intel/compiler/brw_eu.h:371:62: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_sampler_desc_return_format(const struct gen_device_info devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_dp_desc_binding_table_index’: src/intel/compiler/brw_eu.h:405:63: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_dp_desc_binding_table_index(const struct gen_device_info devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_dp_a64_untyped_atomic_desc’: src/intel/compiler/brw_eu.h:754:41: warning: unused parameter ‘exec_size’ [-Wunused-parameter] unsigned exec_size, /< 0 for SIMD4x2 / ^~~~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_dp_a64_untyped_atomic_float_desc’: src/intel/compiler/brw_eu.h:775:47: warning: unused parameter ‘exec_size’ [-Wunused-parameter] unsigned exec_size, ^~~~~~~~~ Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-06 08:35:31 -08:00
Timothy Arceri	81ee2cd8ba	glsl: rename is_record() -> is_struct() Replace was done using: find ./src -type f -exec sed -i -- \ 's/is_record(/is_struct(/g' {} \; Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 13:10:02 +11:00
Caio Marcelo de Oliveira Filho	69cc6272fb	anv: Implement VK_EXT_external_memory_host v2: Ignore the import if handleType == 0. (Jason) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-05 12:59:50 -08:00
Jason Ekstrand	43f40dc7cb	anv: Implement VK_EXT_inline_uniform_block Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	61e009d2c4	spirv: Use the same types for resource indices as pointers We need more space than just a 32-bit scalar and we have to burn all that space anyway so we may as well expose it to the driver. This also fixes a subtle bug when UBOs and SSBOs have different pointer types. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	c520f4dec9	anv: Add a concept of a descriptor buffer This buffer goes along side the CPU data structure and may contain pointers, bindless handles, or any other descriptor information. Currently, all descriptors are size zero and nothing goes in the buffer but this commit sets up the framework we will need later. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	5c30fffeec	anv: Take references to push descriptor set layouts Technically, descriptor set layouts aren't required to survive past the function they're passed into so we need to reference them. Cc: "19.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	8ab95b849e	anv: Refactor descriptor pushing a bit Pull the common code out of the two entrypoints into the helper which fetches the push descriptor set for us. Now that it does more than just get a thing, call it anv_cmd_buffer_push_descriptor_set. Cc: "19.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	cab064bc10	anv: drop add_var_binding from anv_nir_apply_pipeline_layout.c It has exactly one caller. Just inline it. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	49cf61c6aa	anv: Clean up descriptor set layouts The descriptor set layout code in our driver has undergone many changes over the years. Some of the fields which were once essential are now useless or nearly so. The has_dynamic_offsets field was completely unused accept for the code to set and hash it. The per-stage indices were only being used to determine if a particular binding had images, samplers, etc. The fact that it's per-stage also doesn't matter because that binding should never be accessed by a shader of the wrong stage. This commit deletes a pile of cruft and replaces it all with a descriptive bitfield which states what a particular descriptor contains. This merely describes the data available and doesn't necessarily dictate how it will be lowered in anv_nir_apply_pipeline_layout. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	4c50b7c92c	anv: Count image param entries rather than images This is what we're actually storing in the descriptor set and consuming when we bind surface states. This commit renames image_count to image_param_count a few places and moves the decision to not count image params on gen9+ into anv_descriptor_set.c when we build the layout. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	3822c7495a	anv: Stop allocating buffer views for dynamic buffers We emit the surface states for those on-the-fly so we don't need the buffer view. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	8c6d410a50	anv: Rework arguments to anv_descriptor_set_write_* Make them all take a device followed by a set. This is consistent with how the actual Vulkan entrypoint parameters are laid out. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	5b7a9e7398	anv/descriptor_set: Refactor alloc/free of descriptor sets This commit just puts the free list code together as part of the pool instead of having it inlined into the descriptor set create code. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Eric Engestrom	3d4238d26c	anv: use the platform defines in vk.xml instead of hard-coding them Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-05 11:57:10 +00:00
Lionel Landwerlin	e21c201c96	anv: update supported patch version Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-03-05 10:39:17 +00:00
Tapani Pälli	3bb8768b9d	anv: toggle on support for VK_EXT_ycbcr_image_arrays We already propagate coord_components correctly and did not have layer restrictions for ycbcr formats. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:39:17 +00:00
Tapani Pälli	33bf3d510c	anv: retain the is_array state in create_plane_tex_instr_implicit This does not seem to fix anything ATM but is the right thing todo. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Fixes: `f3e91e78a3` ("anv: add nir lowering pass for ycbcr textures") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:38:31 +00:00
Jason Ekstrand	0010d0348a	anv/pipeline: Drop anv_fill_binding_table We zero out the prog data anyway and, now that bias is always zero, this function is accomplishing nothing. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-04 23:56:40 +00:00
Jason Ekstrand	65ee5cc0da	anv: Use an actual binding for gl_NumWorkgroups This commit moves our handling of gl_NumWorkgroups over to work like our handling of other special bindings in the Vulkan driver. We give it a magic descriptor set number and teach emit_binding_tables to handle it. This is better than the bias mechanism we were using because it allows us to do proper accounting through the bind map mechanism. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-04 23:56:40 +00:00
Jason Ekstrand	5c96120b5c	intel,nir: Lower TXD with min_lod when the sampler index is not < 16 When we have a larger sampler index, we get into the "high sampler" scenario and need an instruction header. Even in SIMD8, this pushes the instruction over the sampler message size maximum of 11 registers. Instead, we have to lower TXD to TXL. Fixes: `cb98e0755f` "intel/fs: Support min_lod parameters on texture..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-04 23:56:39 +00:00
Jason Ekstrand	5049fbddb4	anv: Count surfaces for non-YCbCr images in GetDescriptorSetLayoutSupport We were accidentally not counting those surfaces Fixes: `ddc4069122` "anv: Implement VK_KHR_maintenance3" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-04 23:56:39 +00:00
Sagar Ghuge	e551040c60	nir/glsl: Add another way of doing lower_imul64 for gen8+ On Gen 8 and 9, "mul" instruction supports 64 bit destination type. We can reduce our 64x64 int multiplication from 4 instructions to 3. Also instead of emitting two mul instructions, we can emit single mul instuction and extract low/high 32 bits from 64 bit result for [i/u]mulExtended v2: 1) Allow lower_mul_high64 to use new opcode (Jason Ekstrand) 2) Add lower_mul_2x32_64 flag (Matt Turner) 3) Remove associative property as bit size is different (Connor Abbott) v3: Fix indentation and variable naming convention (Jason Ekstrand) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-04 15:50:25 -08:00
Mauro Rossi	ec0f465bc5	android: anv: fix libexpat shared dependency Fixes undefined reference building errors for XML_* functions Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: "19.0" <mesa-stable@lists.freedesktop.org>	2019-03-04 20:53:59 +01:00
Mauro Rossi	14e7e26a09	android: anv: fix generated files depedencies (v2) Fix anv_extrypoints.{c,h} and anv_extensions.{c,h} missing dependencies Rename the variable labels according to targets and python scripts Align the building rules as per Automake for simplification Fixes building errors during rebuils due to missing dependencies (v2) Fixed a missing $(VULKAN_API_XML) reference Fixes: `9a508b7` ("android: anv/extensions: fix generated sources build") Fixes: `dd088d4bec` ("anv/extensions: Generate a header file with extension tables") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Cc: "19.0" <mesa-stable@lists.freedesktop.org>	2019-03-04 20:53:51 +01:00
Jordan Justen	10c5579921	intel/compiler: Move int64/doubles lowering options Instead of calculating the int64 and doubles lowering options each time a shader is preprocessed, save and use the values in nir_shader_compiler_options. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-02 14:33:44 -08:00
Ian Romanick	d1d56f5f9a	intel/fs: Don't assert on b2f with a saturate modifier This ran afoul of Iris's use of nir_lower_clamp_color_outputs which applies fsat() before writes to vertex shader color outpus. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `7725d60938` ("intel/fs: Emit better code for b2f(inot(a)) and b2i(inot(a))")	2019-03-02 13:58:50 -08:00
Lionel Landwerlin	32ffd90002	anv: add support for INTEL_DEBUG=bat As requested by Ken ;) v2: Also decode simple batches (Caio) Fix u_vector usage issues (Lionel) v3: Make binding/instruction/state/surface available (Lionel) v4: Going through device pools for simple batches (Lionel) Centralize search BO callbacks into anv_device.c (Lionel) v5: Clear decoded batch buffer var after use (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-02 12:53:21 +00:00
Matt Turner	e0148bbcfd	intel/compiler: Add commas on final values of compaction table arrays Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-03-01 13:56:25 -08:00
Ian Romanick	1edf67fc3f	intel/fs: Generate if instructions with inverted conditions Per-platform results were all over the place, so I have included all the results here. There is an important note at the bottom of the commit message. Skylake total instructions in shared programs: 15184683 -> 15184679 (<.01%) instructions in affected programs: 2786 -> 2782 (-0.14%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.05% max: 0.84% x̄: 0.44% x̃: 0.44% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.96% 0.07% Inconclusive result (%-change mean confidence interval includes 0). total cycles in shared programs: 370961367 -> 370961173 (<.01%) cycles in affected programs: 205867 -> 205673 (-0.09%) helped: 5 HURT: 1 helped stats (abs) min: 1 max: 149 x̄: 39.60 x̃: 16 helped stats (rel) min: 0.02% max: 1.05% x̄: 0.45% x̃: 0.55% HURT stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -93.01 28.34 95% mean confidence interval for cycles %-change: -0.82% 0.08% Inconclusive result (value mean confidence interval includes 0). Broadwell total instructions in shared programs: 15465366 -> 15465362 (<.01%) instructions in affected programs: 2799 -> 2795 (-0.14%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.04% max: 0.84% x̄: 0.44% x̃: 0.44% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.96% 0.07% Inconclusive result (%-change mean confidence interval includes 0). total cycles in shared programs: 410938419 -> 410938531 (<.01%) cycles in affected programs: 566028 -> 566140 (0.02%) helped: 18 HURT: 17 helped stats (abs) min: 1 max: 16 x̄: 3.50 x̃: 1 helped stats (rel) min: <.01% max: 1.05% x̄: 0.13% x̃: <.01% HURT stats (abs) min: 1 max: 12 x̄: 10.29 x̃: 12 HURT stats (rel) min: <.01% max: 0.16% x̄: 0.08% x̃: 0.09% 95% mean confidence interval for cycles value: 0.31 6.09 95% mean confidence interval for cycles %-change: -0.10% 0.05% Inconclusive result (%-change mean confidence interval includes 0). Haswell total instructions in shared programs: 13749760 -> 13749759 (<.01%) instructions in affected programs: 2241 -> 2240 (-0.04%) helped: 1 HURT: 0 total cycles in shared programs: 385398913 -> 385398363 (<.01%) cycles in affected programs: 554914 -> 554364 (-0.10%) helped: 31 HURT: 1 helped stats (abs) min: 1 max: 453 x̄: 18.00 x̃: 6 helped stats (rel) min: <.01% max: 0.25% x̄: 0.03% x̃: 0.05% HURT stats (abs) min: 8 max: 8 x̄: 8.00 x̃: 8 HURT stats (rel) min: 0.06% max: 0.06% x̄: 0.06% x̃: 0.06% 95% mean confidence interval for cycles value: -45.88 11.51 95% mean confidence interval for cycles %-change: -0.05% -0.02% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total cycles in shared programs: 180663626 -> 180663881 (<.01%) cycles in affected programs: 472350 -> 472605 (0.05%) helped: 15 HURT: 30 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01% HURT stats (abs) min: 8 max: 10 x̄: 9.00 x̃: 9 HURT stats (rel) min: 0.06% max: 0.14% x̄: 0.10% x̃: 0.10% 95% mean confidence interval for cycles value: 4.21 7.12 95% mean confidence interval for cycles %-change: 0.05% 0.08% Cycles are HURT. Sandy Bridge total cycles in shared programs: 154568664 -> 154569225 (<.01%) cycles in affected programs: 356486 -> 357047 (0.16%) helped: 1 HURT: 31 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.02% max: 0.02% x̄: 0.02% x̃: 0.02% HURT stats (abs) min: 4 max: 33 x̄: 18.16 x̃: 8 HURT stats (rel) min: 0.05% max: 0.23% x̄: 0.14% x̃: 0.10% 95% mean confidence interval for cycles value: 12.19 22.87 95% mean confidence interval for cycles %-change: 0.10% 0.16% Cycles are HURT. Iron Lake total instructions in shared programs: 8206589 -> 8206565 (<.01%) instructions in affected programs: 3024 -> 3000 (-0.79%) helped: 12 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.75% max: 0.83% x̄: 0.80% x̃: 0.80% 95% mean confidence interval for instructions value: -2.00 -2.00 95% mean confidence interval for instructions %-change: -0.82% -0.77% Instructions are helped. total cycles in shared programs: 187657428 -> 187656228 (<.01%) cycles in affected programs: 95748 -> 94548 (-1.25%) helped: 12 HURT: 0 helped stats (abs) min: 80 max: 120 x̄: 100.00 x̃: 100 helped stats (rel) min: 1.00% max: 1.66% x̄: 1.27% x̃: 1.21% 95% mean confidence interval for cycles value: -113.27 -86.73 95% mean confidence interval for cycles %-change: -1.43% -1.11% Cycles are helped. GM45 total instructions in shared programs: 5037569 -> 5037557 (<.01%) instructions in affected programs: 1521 -> 1509 (-0.79%) helped: 6 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.75% max: 0.83% x̄: 0.79% x̃: 0.79% 95% mean confidence interval for instructions value: -2.00 -2.00 95% mean confidence interval for instructions %-change: -0.83% -0.75% Instructions are helped. total cycles in shared programs: 128101478 -> 128100758 (<.01%) cycles in affected programs: 52746 -> 52026 (-1.37%) helped: 6 HURT: 0 helped stats (abs) min: 120 max: 120 x̄: 120.00 x̃: 120 helped stats (rel) min: 1.16% max: 1.66% x̄: 1.41% x̃: 1.41% 95% mean confidence interval for cycles value: -120.00 -120.00 95% mean confidence interval for cycles %-change: -1.70% -1.12% Cycles are helped. This change has almost no effect right now. However, removing this patch (but leaving the patch "nir/algebraic: Replace a bcsel of a b2f with a b2f(!(a \|\| b))") after adding a patch that removes !(a < b) -> (a >= b) optimizations (like https://patchwork.freedesktop.org/patch/264787/) has the following results on Skylake: Skylake total instructions in shared programs: 15071022 -> 15089710 (0.12%) instructions in affected programs: 1022219 -> 1040907 (1.83%) helped: 1 HURT: 3937 helped stats (abs) min: 41 max: 41 x̄: 41.00 x̃: 41 helped stats (rel) min: 1.01% max: 1.01% x̄: 1.01% x̃: 1.01% HURT stats (abs) min: 1 max: 256 x̄: 4.76 x̃: 4 HURT stats (rel) min: 0.05% max: 11.18% x̄: 2.59% x̃: 2.60% 95% mean confidence interval for instructions value: 4.56 4.93 95% mean confidence interval for instructions %-change: 2.54% 2.64% Instructions are HURT. total cycles in shared programs: 369777134 -> 370092923 (0.09%) cycles in affected programs: 17516573 -> 17832362 (1.80%) helped: 115 HURT: 3624 helped stats (abs) min: 1 max: 1721 x̄: 81.18 x̃: 28 helped stats (rel) min: <.01% max: 10.74% x̄: 1.24% x̃: 0.65% HURT stats (abs) min: 1 max: 12640 x̄: 89.71 x̃: 54 HURT stats (rel) min: <.01% max: 28.24% x̄: 4.72% x̃: 4.52% 95% mean confidence interval for cycles value: 75.21 93.71 95% mean confidence interval for cycles %-change: 4.43% 4.64% Cycles are HURT. total spills in shared programs: 9450 -> 9442 (-0.08%) spills in affected programs: 166 -> 158 (-4.82%) helped: 2 HURT: 0 total fills in shared programs: 21115 -> 21094 (-0.10%) fills in affected programs: 438 -> 417 (-4.79%) helped: 2 HURT: 0 LOST: 1 GAINED: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	7725d60938	intel/fs: Emit better code for b2f(inot(a)) and b2i(inot(a)) Since Boolean values are either -1 (true) or 0 (false), b2f(inot(a)) maps -1 => 0.0 and 0 => 1.0. This is equivalent to 1.0 + float(boolBitsToInt(a)). On Intel GPUs, ADD is one of the few instructions that can type-convert during write to destination, so we can achieve this in a single instruction: add g47F, g26D, 1D v2: Fix swizzles. v3: Fix typos in comments. Noticed by Ken. All Gen6+ platforms had similar results. (Skylake shown) Skylake total instructions in shared programs: 15185583 -> 15184683 (<.01%) instructions in affected programs: 239389 -> 238489 (-0.38%) helped: 899 HURT: 1 helped stats (abs) min: 1 max: 2 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.15% max: 1.85% x̄: 0.49% x̃: 0.44% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.09% max: 0.09% x̄: 0.09% x̃: 0.09% 95% mean confidence interval for instructions value: -1.01 -0.99 95% mean confidence interval for instructions %-change: -0.51% -0.48% Instructions are helped. total cycles in shared programs: 370964249 -> 370961508 (<.01%) cycles in affected programs: 1487586 -> 1484845 (-0.18%) helped: 420 HURT: 268 helped stats (abs) min: 1 max: 232 x̄: 22.41 x̃: 6 helped stats (rel) min: 0.05% max: 22.60% x̄: 1.30% x̃: 0.41% HURT stats (abs) min: 1 max: 230 x̄: 24.90 x̃: 10 HURT stats (rel) min: <.01% max: 21.60% x̄: 1.45% x̃: 0.52% 95% mean confidence interval for cycles value: -7.61 -0.36 95% mean confidence interval for cycles %-change: -0.44% -0.02% Cycles are helped. No changes on Iron Lake or GM45. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	cb3e21cd19	intel/fs: Use De Morgan's laws to avoid logical-not of a logic result on Gen8+ Instead of emitting ~(a & b), emit (~a \| ~b) since logical-not of operands is free on Gen8+. v2: Fix swizzles. Fix types for cmod propagation. v3: Simplify logic for inverting source of inot(ixor(a, b)). Suggested by Ken. Skylake and Broadwell had similar results. (Skylake shown) Skylake total instructions in shared programs: 15185593 -> 15185583 (<.01%) instructions in affected programs: 5673 -> 5663 (-0.18%) helped: 12 HURT: 1 helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1 helped stats (rel) min: 0.30% max: 5.88% x̄: 1.50% x̃: 0.70% HURT stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 HURT stats (rel) min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12% 95% mean confidence interval for instructions value: -1.66 0.13 95% mean confidence interval for instructions %-change: -2.60% -0.15% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 370977726 -> 370964249 (<.01%) cycles in affected programs: 869987 -> 856510 (-1.55%) helped: 15 HURT: 2 helped stats (abs) min: 2 max: 6640 x̄: 902.20 x̃: 16 helped stats (rel) min: <.01% max: 4.92% x̄: 1.71% x̃: 1.53% HURT stats (abs) min: 14 max: 42 x̄: 28.00 x̃: 28 HURT stats (rel) min: 1.08% max: 3.18% x̄: 2.13% x̃: 2.13% 95% mean confidence interval for cycles value: -1654.87 69.34 95% mean confidence interval for cycles %-change: -2.29% -0.23% Inconclusive result (value mean confidence interval includes 0). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	8eb36c9129	intel/fs: Emit logical-not of operands on Gen8+ On Gen8+ specifying negation of a logical operation such as AND actually performs a logical-not. Take advantage of this to generate fewer instructions. v2: Major rebase. Use nir_src_as_alu_instr. Fix swizzle handling. No changes on any pre-Gen8 platform. Skylake and Broadwell had similar results. (Broadwell shown) total instructions in shared programs: 15466902 -> 15466274 (<.01%) instructions in affected programs: 1262953 -> 1262325 (-0.05%) helped: 682 HURT: 4 helped stats (abs) min: 1 max: 5 x̄: 1.02 x̃: 1 helped stats (rel) min: 0.03% max: 2.40% x̄: 0.18% x̃: 0.04% HURT stats (abs) min: 1 max: 62 x̄: 17.50 x̃: 3 HURT stats (rel) min: 0.03% max: 1.89% x̄: 0.53% x̃: 0.10% 95% mean confidence interval for instructions value: -1.10 -0.73 95% mean confidence interval for instructions %-change: -0.19% -0.15% Instructions are helped. total cycles in shared programs: 410996093 -> 410950440 (-0.01%) cycles in affected programs: 144389048 -> 144343395 (-0.03%) helped: 519 HURT: 51 helped stats (abs) min: 1 max: 1060 x̄: 104.46 x̃: 140 helped stats (rel) min: 0.01% max: 10.98% x̄: 0.34% x̃: 0.03% HURT stats (abs) min: 1 max: 4060 x̄: 167.90 x̃: 22 HURT stats (rel) min: <.01% max: 8.20% x̄: 0.96% x̃: 0.25% 95% mean confidence interval for cycles value: -97.16 -63.02 95% mean confidence interval for cycles %-change: -0.32% -0.13% Cycles are helped. total spills in shared programs: 95311 -> 95329 (0.02%) spills in affected programs: 881 -> 899 (2.04%) helped: 0 HURT: 4 total fills in shared programs: 93629 -> 93634 (<.01%) fills in affected programs: 794 -> 799 (0.63%) helped: 1 HURT: 2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	06eaaf2de9	intel/fs: Refactor ALU source and destination handling to a separate function Other places will need to do this soon to properly handle source swizzles. The patch looks a little odd, but the change is pretty straight forward. All of the swizzle and mask handling is moved out, but the code for handling move instructions and vecN instructions remains in nir_emit_alu. I'm not terribly pleased with the "need_dest" parameter, but get_nir_dest is (somewhat surprisingly) destructive. I am open to suggestions of alternatives. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	fb3ca9109c	intel/fs: Handle OR source modifiers in algebraic optimization Found by inspection. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	c9d5bd050c	intel/fs: Relax type matching rules in cmod propagation from MOV instructions To allow cmod propagation from a MOV in a sequence like: and(16) g31<1>UD g20<8,8,1>UD g22<8,8,1>UD mov.nz.f0(16) null<1>F g31<8,8,1>D A similar change to the vec4 backend had no effect. Somewhere between `c1ec582059` and `40fc4b5acd` (1,094 commits) the effectiveness of this patch diminished, and as of commit `d7e0d47b9d` (nir: Add a bunch of b2[if] optimizations) this optimization no longer has any effect on any platform. A later patch "intel/fs: Use De Morgan's laws to avoid logical-not of a logic result on Gen8+," generates some instruction sequences that require this change in order for cmod propagation to make progress. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	d2056ab993	intel/vec4: Emit constants for some ALU sources as immediate values In some cases of flow control, the constant propagation is not able to determine that the source of an instruction must be a constant value. When we still have NIR SSA values, we can easily determine this. Emit the immediate value during code generation to possible avoid spurious loads of constants into registers. I wrote this patch to prevent a couple trivial regressions in vec4 shaders caused by "nir/algebraic: Replace i2b used by bcsel or if-statement with comparison". The final result was quite a bit better than that... No shader-db changes on any Gen8+ platform. v2: Assert that we never get a negation source modifier on Gen8+. Suggested by Ken. This should never happen because we don't normally use vec4 for Gen8+ (requires and environment variable to force it), and there's no code to generate these negations. Still, erring on the side of caution is better. Haswell total instructions in shared programs: 13776218 -> 13764783 (-0.08%) instructions in affected programs: 663931 -> 652496 (-1.72%) helped: 3495 HURT: 1 helped stats (abs) min: 1 max: 30 x̄: 3.28 x̃: 2 helped stats (rel) min: 0.21% max: 10.00% x̄: 1.79% x̃: 1.49% HURT stats (abs) min: 24 max: 24 x̄: 24.00 x̃: 24 HURT stats (rel) min: 12.24% max: 12.24% x̄: 12.24% x̃: 12.24% 95% mean confidence interval for instructions value: -3.39 -3.15 95% mean confidence interval for instructions %-change: -1.84% -1.75% Instructions are helped. total cycles in shared programs: 386818984 -> 386511910 (-0.08%) cycles in affected programs: 20379636 -> 20072562 (-1.51%) helped: 3052 HURT: 476 helped stats (abs) min: 2 max: 12516 x̄: 110.40 x̃: 6 helped stats (rel) min: 0.05% max: 24.68% x̄: 1.58% x̃: 0.69% HURT stats (abs) min: 2 max: 416 x̄: 62.76 x̃: 24 HURT stats (rel) min: 0.10% max: 10.75% x̄: 4.03% x̃: 2.18% 95% mean confidence interval for cycles value: -115.57 -58.51 95% mean confidence interval for cycles %-change: -0.93% -0.73% Cycles are helped. total spills in shared programs: 100482 -> 100480 (<.01%) spills in affected programs: 79 -> 77 (-2.53%) helped: 3 HURT: 1 total fills in shared programs: 96883 -> 96877 (<.01%) fills in affected programs: 85 -> 79 (-7.06%) helped: 4 HURT: 0 Ivy Bridge total instructions in shared programs: 12000562 -> 11990113 (-0.09%) instructions in affected programs: 572581 -> 562132 (-1.82%) helped: 3106 HURT: 0 helped stats (abs) min: 1 max: 30 x̄: 3.36 x̃: 2 helped stats (rel) min: 0.21% max: 10.00% x̄: 1.86% x̃: 1.49% 95% mean confidence interval for instructions value: -3.49 -3.23 95% mean confidence interval for instructions %-change: -1.91% -1.81% Instructions are helped. total cycles in shared programs: 180958504 -> 180664500 (-0.16%) cycles in affected programs: 19991810 -> 19697806 (-1.47%) helped: 2654 HURT: 486 helped stats (abs) min: 2 max: 12516 x̄: 121.61 x̃: 6 helped stats (rel) min: 0.05% max: 20.66% x̄: 1.48% x̃: 0.68% HURT stats (abs) min: 2 max: 396 x̄: 59.18 x̃: 24 HURT stats (rel) min: 0.05% max: 9.62% x̄: 3.82% x̃: 2.16% 95% mean confidence interval for cycles value: -125.62 -61.64 95% mean confidence interval for cycles %-change: -0.76% -0.56% Cycles are helped. Sandy Bridge total instructions in shared programs: 10842336 -> 10835438 (-0.06%) instructions in affected programs: 395340 -> 388442 (-1.74%) helped: 1926 HURT: 0 helped stats (abs) min: 1 max: 22 x̄: 3.58 x̃: 2 helped stats (rel) min: 0.10% max: 9.68% x̄: 1.78% x̃: 1.42% 95% mean confidence interval for instructions value: -3.73 -3.43 95% mean confidence interval for instructions %-change: -1.84% -1.72% Instructions are helped. total cycles in shared programs: 154590074 -> 154569050 (-0.01%) cycles in affected programs: 8159932 -> 8138908 (-0.26%) helped: 1670 HURT: 228 helped stats (abs) min: 2 max: 260 x̄: 18.13 x̃: 6 helped stats (rel) min: 0.02% max: 8.70% x̄: 0.74% x̃: 0.28% HURT stats (abs) min: 2 max: 1798 x̄: 40.58 x̃: 14 HURT stats (rel) min: 0.03% max: 12.97% x̄: 1.04% x̃: 0.31% 95% mean confidence interval for cycles value: -13.51 -8.64 95% mean confidence interval for cycles %-change: -0.60% -0.46% Cycles are helped. Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8212357 -> 8206587 (-0.07%) instructions in affected programs: 323664 -> 317894 (-1.78%) helped: 1457 HURT: 0 helped stats (abs) min: 1 max: 12 x̄: 3.96 x̃: 3 helped stats (rel) min: 0.33% max: 11.49% x̄: 1.86% x̃: 1.44% 95% mean confidence interval for instructions value: -4.14 -3.78 95% mean confidence interval for instructions %-change: -1.93% -1.78% Instructions are helped. total cycles in shared programs: 187668016 -> 187657422 (<.01%) cycles in affected programs: 14856234 -> 14845640 (-0.07%) helped: 1372 HURT: 83 helped stats (abs) min: 2 max: 24 x̄: 7.92 x̃: 6 helped stats (rel) min: 0.02% max: 1.14% x̄: 0.12% x̃: 0.08% HURT stats (abs) min: 2 max: 14 x̄: 3.20 x̃: 2 HURT stats (rel) min: 0.03% max: 0.60% x̄: 0.12% x̃: 0.12% 95% mean confidence interval for cycles value: -7.65 -6.91 95% mean confidence interval for cycles %-change: -0.11% -0.10% Cycles are helped. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:41:46 -08:00
Eric Engestrom	2793417ec6	anv: fix typo Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-01 11:20:28 +00:00
Eric Engestrom	258e463db5	anv: remove spaces around kwargs assignment pylint complains: > C0326: No space allowed around keyword argument assignment Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-01 11:20:28 +00:00
Eric Engestrom	7b704fd2fd	anv: drop unused parameter I'm guessing a previous version of this script used an index-based map of entrypoints, but that's not the case anymore. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-01 11:20:28 +00:00
Eric Engestrom	b503d4e458	anv: simplify chained comparison Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-01 11:20:28 +00:00
Jason Ekstrand	e8f863e718	intel/compiler: Re-prefix non-logical surface opcodes with VEC4 The scalar back-end uses SHADER_OPCODE_SEND for all surface messages so we no longer need the non-logical opcodes there. Prefix them VEC4 so it's clear that they're only used by the vec4 back-end. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	95ae400abc	intel/schedule_instructions: Move some comments Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	aeaba24fcb	intel/compiler: Drop unused surface opcodes The unused typed surface read/write support in the vec4 back-end has been dropped and the fs back-end now uses SHADER_OPCODE_SEND for all image and buffer ops. There's no reason to keep these opcodes around anymore. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	a04c737215	intel/fs: Get rid of the IMAGE_SIZE opcode Since switching to SHADER_OPCODE_SEND for image operations, we no longer need the non-logical opcode. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	10b7d14c31	intel/vec4: Drop dead code for handling typed surface messages Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	9d437f9482	intel/fs: Drop the fs_surface_builder All of the actual abstraction (except possibly setting size_written) happens as part of the logical opcodes. The only thing that the surface builder is providing at this point is extra levels of functions to call through. I'm going to be adding bindless image support soon and all the extra abstraction here is just getting in the way. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	494a0543e6	intel/fs: Re-order logical surface arguments It makes more sense to start at the surface then move on to the address and then the data. Also, this is a really good test of whether or not we got all the places that use the sources by explicit integer number. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	94f8fd9a0c	intel/fs: Add an enum type for logical sampler inst sources Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Lionel Landwerlin	6e184147dd	intel/compiler: use correct swizzle for replacement The optimization in `4cd1a0be76` introduced a replacement of : cmp(8).z.f0.0 vgrf11.y:D, vgrf10.xxxx:D, vgrf2.xyyy:D ... cmp(8).nz.f0.0 null.x:D, vgrf11.yyyy:D, 0D By : cmp(8).z.f0.0 vgrf15.x:D, vgrf10.xxxx:D, vgrf2.yyyy:D ... mov(8) vgrf11.y:D, vgrf15.yyyy:D The first cmp instruction is storing in x while the second mov is sourcing from y. We need to take into account where the replacement on the scan_inst destination is going to store thing so that the replacement mov can source things from the correct location. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `4cd1a0be76` ("i965/vec4: Propagate conditional modifiers from more compares to other compares") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109759 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-02-27 20:06:42 +00:00
Tapani Pälli	a3c366c4b2	android: add liblog to libmesa_intel_common build Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-27 08:53:09 +02:00
Kasireddy, Vivek	7cab8d3661	i965: Add support for sampling from XYUV images Add support to the i965 DRI driver to sample from XYUV8888 buffers. Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-26 13:08:52 +00:00
Jason Ekstrand	c4fb6b0c81	intel/eu: Add an EOT parameter to send_indirect_[split]_message For split indirect sends we have to put the EOT parameter in the extended descriptor as well as the instruction itself so just calling brw_inst_set_eot is insufficient. Moving the EOT handling handling into the send_indirect_[split]_message helper lets us handle it properly.	2019-02-25 11:35:12 -06:00
Lionel Landwerlin	30828f4646	intel/aub_viewer: silence more compiler warnings format not a string literal and no format arguments. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-25 13:11:16 +00:00
Lionel Landwerlin	91df8b1780	intel/aub_viewer: silence compiler warning buffer_addr may be used uninitialized. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-25 13:11:13 +00:00
Lionel Landwerlin	f1da10e0c5	intel/aub_viewer: printout 48bits addresses Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-25 13:11:05 +00:00
Lionel Landwerlin	239b0d8570	Revert "anv: add support for INTEL_DEBUG=bat" This reverts commit `e4d88396d2`. Apologies, I pushed the wrong commit.	2019-02-24 01:06:39 +00:00
Lionel Landwerlin	e4d88396d2	anv: add support for INTEL_DEBUG=bat As requested by Ken ;) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-23 23:29:04 +00:00
Juan A. Suarez Romero	4f917e6a61	anv: advertise 8 subpixel precision bits On one side, when emitting 3DSTATE_SF, VertexSubPixelPrecisionSelect is used to select between 8 bit subpixel precision (value 0) or 4 bit subpixel precision (value 1). As this value is not set, means it is taking the value 0, so 8 bit are used. On the other side, in the Vulkan CTS tests, if the reference rasterizer, which uses 8 bit precision, as it is used to check what should be the expected value for the tests, is changed to use 4 bit as ANV was advertising so far, some of the tests will fail. So it seems ANV is actually using 8 bits. v2: explicitly set 3DSTATE_SF::VertexSubPixelPrecisionSelect (Jason) v3: use _8Bit definition as value (Jason) v4: (by Jason) anv: Explicitly set 3DSTATE_CLIP::VertexSubPixelPrecisionSelect This field was added on gen8 even though there's an identically defined one in 3DSTATE_SF. CC: Jason Ekstrand <jason@jlekstrand.net> CC: Kenneth Graunke <kenneth@whitecape.org> CC: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-22 17:53:55 +01:00
Juan A. Suarez Romero	3b423eeb2d	genxml: add missing field values for 3DSTATE_SF Fill out "Vertex Sub Pixel Precision Select" possible values. CC: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-22 17:53:45 +01:00
Lionel Landwerlin	1d626fc028	intel: fix urb size for CFL GT1 Same 192Kb amount as SKL/KBL GT1 applies. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Fixes: `de7ed0ba55` ("i965/CFL: Add PCI Ids for Coffee Lake.")	2019-02-22 11:53:49 +00:00
Samuel Iglesias Gonsálvez	bd2c5a8203	isl: the display engine requires 64B alignment for linear surfaces v2: Add PRM quote (Lionel) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-22 11:45:45 +00:00
Mauro Rossi	338dacc341	android: intel/isl: remove redundant building rules Fixes the following building error: including ./external/mesa/Android.mk ... build/core/base_rules.mk:183: * external/mesa/src/intel: MODULE.TARGET.STATIC_LIBRARIES.libmesa_isl_tiled_memcpy already defined by external/mesa/src/intel. make: * [build/core/ninja.mk:164: out/build-android_x86_64.ninja] Error 1 ISL_TILED_MEMCPY_FILES is isl/isl_tiled_memcpy_normal.c and that source file includes isl_tiled_memcpy.c source Fixes: `96bb328` ("iris: add Android build") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-02-22 07:56:11 +02:00
Francisco Jerez	7272fe9c08	intel/fs: Rely on undocumented unrestricted regioning for 32x16-bit integer multiply. Even though the hardware spec claims that any "integer DWord multiply" operation is affected by the regioning restrictions of CHV/BXT/GLK, this is inconsistent with the behavior of the simulator and with empirical evidence -- Return false from has_dst_aligned_region_restriction() for such instructions as a micro-optimization. Tested-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-21 14:07:25 -08:00
Francisco Jerez	e03be78252	intel/fs: Implement extended strides greater than 4 for IR source regions. Strides up to 32B can be implemented for the source regions of most instructions by leveraging either the vertical or the horizontal stride of the hardware Align1 region. The main motivation for this is that currently the lower_integer_multiplication() pass will happily double the stride of one of the 32-bit sources, which can blow up if the stride of the original source was already the maximum value allowed by the hardware. An alternative would be to use the regioning legalization pass in order to lower such strides into the composition of multiple legal strides, but that would be somewhat less efficient. This showed up as a regression from my commit `cbea91eb57` in Vulkan 1.1 CTS tests on CHV/BXT platforms, however it was really a pre-existing problem that had affected conformance on other platforms without native support for integer multiplication. CHV/BXT were getting around it because the code I removed in that commit had the "fortunate" side effect of emitting narrower regions that didn't hit the hardware stride limit after lowering. Beyond fixing the regression this fixes ~90 additional Vulkan 1.1 subgroup CTS tests on ICL (that's why this patch is marked for inclusion in mesa-stable even though the original regressing patch was not). According to Jason, a nearly equivalent change had been committed previously as `e8c9e65185` and then (mistakenly?) reverted as `a31d038208`. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109328 Reported-by: Mark Janes <mark.a.janes@intel.com> Tested-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-21 14:07:25 -08:00
Francisco Jerez	7f9f6263c1	intel/fs: Cap dst-aligned region stride to maximum representable hstride value. This is required in combination with the following commit, because otherwise if a source region with an extended 8+ stride is present in the instruction (which we're about to declare legal) we'll end up emitting code that attempts to write to such a region, even though strides greater than four are still illegal for the destination. Tested-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-21 14:07:25 -08:00
Francisco Jerez	e2f475ddff	intel/fs: Lower integer multiply correctly when destination stride equals 4. Because the "low" temporary needs to be accessed with word type and twice the original stride, attempting to preserve the alignment of the original destination can potentially lead to instructions with illegal destination stride greater than four. Because the CHV/BXT alignment restrictions are now being enforced by the regioning lowering pass run after lower_integer_multiplication(), there is no real need to preserve the original strides anymore. Note that this bug can be reproduced on stable branches, but back-porting would be non-trivial, because the fix relies on the regioning lowering pass recently introduced. Tested-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-21 14:07:25 -08:00
Francisco Jerez	c3c27762f7	intel/fs: Exclude control sources from execution type and region alignment calculations. Currently the execution type calculation will return a bogus value in cases like: mov_indirect(8) vgrf0:w, vgrf1:w, vgrf2:ud, 32u Which will be considered to have a 32-bit integer execution type even though the actual indirect move operation will be carried out with 16-bit precision. Similarly there's no need to apply the CHV/BXT double-precision region alignment restrictions to such control sources, since they aren't directly involved in the double-precision arithmetic operations emitted by these virtual instructions. Applying the CHV/BXT restrictions to control sources was expected to be harmless if mildly inefficient, but unfortunately it exposed problems at codegen level for virtual instructions (namely the SHUFFLE instruction used for the Vulkan 1.1 subgroup feature) that weren't prepared to accept control sources with an arbitrary strided region. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109328 Reported-by: Mark Janes <mark.a.janes@intel.com> Fixes: `efa4e4bc5f` "intel/fs: Introduce regioning lowering pass." Tested-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-21 14:07:25 -08:00
Jordan Justen	cd0ac3a6af	genxml: Remove extra space in gen4/45/5 field name Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-21 13:17:10 -08:00
Jordan Justen	a9b0b72a78	genxml/gen_bits_header.py: Use regex to strip no alphanum chars Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-21 13:15:59 -08:00
Kenneth Graunke	286b8b8f99	iris: handle PatchVerticesIn as a system value.	2019-02-21 10:26:10 -08:00
Tapani Pälli	96bb328e9b	iris: add Android build Note that at least following additional libs/components require changes since they refer to BOARD_GPU_DRIVERS variable which is used to select the driver: - mixins - minigbm - libdrm - drm_gralloc v2: (feedback by Gustaw Smolarczyk) Fix trailing \ in a few cases Signed-off-by: Tapani Pälli <tapani.palli@intel.com>	2019-02-21 10:26:10 -08:00
Lionel Landwerlin	89f03d1872	imgui: make sure our copy of imgui doesn't clash with others in the same process Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> +1-by: Mike Lothian <mike@fireburn.co.uk> +1-by: Tapani Pälli <tapani.palli@intel.com> +1-by: Eric Engestrom <eric.engestrom@intel.com> +1-by: Yurii Kolesnykov <root@yurikoles.com> +1-by: myfreeweb <greg@unrelenting.technology> +1-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-21 18:06:05 +00:00
Lionel Landwerlin	51047cd2e8	build: move imgui out of src/intel/tools to be reused Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> +1-by: Mike Lothian <mike@fireburn.co.uk> +1-by: Tapani Pälli <tapani.palli@intel.com> +1-by: Eric Engestrom <eric.engestrom@intel.com> +1-by: Yurii Kolesnykov <root@yurikoles.com> +1-by: myfreeweb <greg@unrelenting.technology> +1-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-21 18:06:05 +00:00
Alejandro Piñeiro	0629b2a462	nir, glsl: move pixel_center_integer/origin_upper_left to shader_info.fs On GLSL that info is set as a layout qualifier when redeclaring gl_FragCoord, so somehow tied to a specific variable. But in practice, they behave as a global of the shader. On ARB programs they are set using a global OPTION (defined at ARB_fragment_coord_conventions), and on SPIR-V using ExecutionModes, that are also not tied specifically to the builtin. This patch moves that info from nir variable and ir variable to nir shader and gl_program shader_info respectively, so the map is more similar to SPIR-V, and ARB programs, instead of more similar to GLSL. FWIW, shader_info.fs already had pixel_center_integer, so this change also removes some redundancy. Also, as struct gl_program also includes a shader_info, we removed gl_program::OriginUpperLeft and PixelCenterInteger, as it would be superfluous. This change was needed because recently spirv_to_nir changed the order in which execution modes and variables are handled, so the variables didn't get the correct values. Now the info is set on the shader itself, and we don't need to go back to the builtin variable to set it. Fixes: `e68871f6a` ("spirv: Handle constants and types before execution modes") v2: (Jason) * glsl_to_nir: get the info before glsl_to_nir, while all the rest of the info gathering is happening * prog_to_nir: gather the info on a general info-gathering pass, not on variable setup. v3: (Jason) * Squash with the patch that removes that info from ir variable * anv: assert that OriginUpperLeft is true. It should be already set by spirv_to_nir. * blorp: set origin_upper_left on its core "compile fragment shader", not just on some specific places (for this we added an helper on a previous patch). * prog_to_nir: no need to gather specifically this fragcoord modes as the full gl_program shader_info is copied. * spirv_to_nir: assert that we are a fragment shader when handling this execution modes. v4: (reported by failing gitlab pipeline #18750) * state_tracker: update too due changes on ir.h/gl_program v5: * blorp: minor change after change on previous patch * radeonsi: update due this change. v6: (Timothy Arceri) * prog_to_nir: remove extra whitespace * shader_info: don't use :1 on origin_upper_left * glsl: program.fs.origin_upper_left/pixel_center_integer can be move out of the shader list loop	2019-02-21 11:47:59 +01:00
Alejandro Piñeiro	675eabb560	blorp: introduce helper method blorp_nir_init_shader This initializes the nir shader that will be used by blorp. Right now it doesn't do too much beyond calling nir_builder_init_simple_shader, and setting a name. More stuff will be added on following patches. v2: there is a case were it is used a VERTEX_SHADER (Alejandro)	2019-02-21 11:47:51 +01:00
Eric Engestrom	a16c398668	anv: use anv_shader_bin_write_to_blob()'s return value Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-20 16:40:13 +00:00
Eric Engestrom	d3115f34a6	anv: drop unused imports Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-20 14:28:55 +00:00
Eric Engestrom	8cbfcab425	anv: make sure the extensions stay sorted Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-20 14:28:55 +00:00
Eric Engestrom	bc76ce1033	anv: sort vendors extensions after KHR and EXT Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-20 14:28:55 +00:00
Eric Engestrom	427aa9d154	anv: sort extensions alphabetically Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-20 14:28:55 +00:00
Tapani Pälli	886cee1f96	anv: anv: refactor error handling in anv_shader_bin_write_to_blob() v2: blob manages error state internally, just return true if errors did not occur (Jason) CID: 1442546 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-20 15:39:19 +02:00
Lionel Landwerlin	f509213675	anv: implement VK_EXT_depth_clip_enable A new extension allowing the user to explictly specify the clipping behavior. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-20 09:57:58 +00:00
Samuel Iglesias Gonsálvez	63a919a3ce	isl: remove the cache line size alignment requirement The cacheline size was a requirement for using the BLT engine, which we don't use anymore except for a few things on old HW, so we drop it. Fixes CTS's CL#3500 test: dEQP-VK.api.image_clearing.core.clear_color_image.2d.linear.single_layer.r8g8b8_unorm Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-20 08:28:31 +01:00
Matt Turner	ac21dd4aee	intel/compiler/test: Add unit test for mismatched signedness comparison v2 (idr): Move adding the test to after adding the fix. Reordering the two commits prevents possible headaches for git-bisect with scripts that always do 'ninja check'. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109404 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-02-15 11:11:02 -08:00
Matt Turner	2dff9a66b6	intel/compiler: Avoid propagating inequality cmods if types are different v2: Fix silly bug in logic. s/\|\|/&&/ All but one of the affected shaders is in an Unreal4 demo. The other is in Tomb Raider. All of the cases that Ian investigated appear to be sequences like the following if (int(uint(some_float)) < 0) /* other relations too */ ... At least in Tomb Raider, it's not obvious that this sequence came from the original shader. In some of the Unreal demos, the shader contains code like if (int(uint(textureLod(...))) > 0) ... which explicitly generates the offending sequence. All Gen6+ platforms had similar results (Skylake shown): total instructions in shared programs: 15437170 -> 15437187 (<.01%) instructions in affected programs: 4492 -> 4509 (0.38%) helped: 0 HURT: 17 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.05% max: 0.73% x̄: 0.66% x̃: 0.73% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %-change: 0.57% 0.75% Instructions are HURT. total cycles in shared programs: 383007996 -> 383007992 (<.01%) cycles in affected programs: 20542 -> 20538 (-0.02%) helped: 6 HURT: 7 helped stats (abs) min: 2 max: 6 x̄: 5.33 x̃: 6 helped stats (rel) min: 0.11% max: 0.36% x̄: 0.32% x̃: 0.36% HURT stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 HURT stats (rel) min: 0.27% max: 0.27% x̄: 0.27% x̃: 0.27% 95% mean confidence interval for cycles value: -3.30 2.69 95% mean confidence interval for cycles %-change: -0.19% 0.19% Inconclusive result (value mean confidence interval includes 0). No changes on Iron Lake or GM45. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109404 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: nagrigoriadis@gmail.com Tested-by: Danylo Piliaiev <danylo.piliaiev@gmail.com>	2019-02-15 11:11:02 -08:00
Matt Turner	e50db60d16	intel/compiler/test: Set devinfo->gen = 7 We emit an FBL instruction which only exists since Gen7. This prevents the test from segfaulting when run with TEST_DEBUG=1. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-02-15 11:11:02 -08:00
Jason Ekstrand	367b0ede4d	intel/fs: Bail in optimize_extract_to_float if we have modifiers This fixes a bug in runscape where we were optimizing x >> 16 to an extract and then negating and converting to float. The NIR to fs pass was dropping the negate on the floor breaking a geometry shader and causing it to render nothing. Fixes: `1f862e923c` "i965/fs: Optimize float conversions of byte/word..." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109601 Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-02-14 23:02:44 -06:00
Jason Ekstrand	5064464931	intel/fs: Silence a compiler warning Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-14 16:04:47 -06:00
Jason Ekstrand	9b202239ba	anv: Silence some compiler warnings in release builds Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-14 16:04:45 -06:00
Jason Ekstrand	cd60c995a6	anv/blorp: Delete a pointless assert Just a little higher up in the function we assert that the aspect masks are actually equal so there's no reason for the weaker check. Also, the temporary variables were causing compiler warnings in release builds. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-14 16:04:42 -06:00
Kenneth Graunke	39aee57523	anv: Put MOCS in the correct location My patch to switch from struct-based MOCS to numeric MOCS accidentally divided all MOCS entries by 2 in the Vulkan driver. MOCS on Gen9+ is just an array index into a table. But in the hardware packets, the index starts at bit 1. So we need to shift it. Fixes: `0b44644ca6` (genxml: Consistently use a numeric "MOCS" field) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-14 11:03:28 -08:00
Eric Engestrom	f7c56475d2	anv/tests: compile to something sensible in release builds assert()-based tests make no sense without asserts, so make sure asserts are compiled in, even if the rest of the code has asserts turned off. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-14 12:52:34 +00:00
Eric Engestrom	f1374805a8	drm-uapi: use local files, not system libdrm There was an issue recently caused by the system header being included by mistake, so let's just get rid of this include path and always explicitly #include "drm-uapi/FOO.h" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-14 11:20:00 +00:00
Eric Anholt	8f3694e1ab	intel: Use the NIR lowering for isign. Drops one instruction from fs-sign-int.shader_test. No change in shader-db due to it having 0 instances of sign(genIType). This may hurt isign64 if algebraic runs before int64 lowering, but I wasn't sure how to mark the algebraic opt as "every bit size but 64". v2: Update commit message about shader-db. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)	2019-02-14 00:32:30 +00:00
Dylan Baker	279060cd32	meson: Add dependency on genxml to anvil Currently the Intel "anvil" driver races with the generation of genxml files, while i965 has an explicit dependency. This patch adds the same dependency to anvil. Fixes: `d1992255bb` ("meson: Add build Intel "anv" vulkan driver") Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-13 22:01:00 +00:00
Juan A. Suarez Romero	1ad26f9417	anv/cmd_buffer: check for NULL framebuffer This can happen when we record a VkCmdDraw in a secondary buffer that was created inheriting from the primary buffer, but with the framebuffer set to NULL in the VkCommandBufferInheritanceInfo. Vulkan 1.1.81 spec says that "the application must ensure (using scissor if neccesary) that all rendering is contained in the render area [...] [which] must be contained within the framebuffer dimesions". While this should be done by the application, commit `465e5a86` added the clamp to the framebuffer size, in case of application does not do it. But this requires to know the framebuffer dimensions. If we do not have a framebuffer at that moment, the best compromise we can do is to just apply the scissor as it is, and let the application to ensure the rendering is contained in the render area. v2: do not clamp to framebuffer if there isn't a framebuffer v3 (Jason): - clamp earlier in the conditional - clamp to render area if command buffer is primary v4: clamp also x and y to render area (Jason) v5: rename used variables (Jason) Fixes: `465e5a86` ("anv: Clamp scissors to the framebuffer boundary") CC: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-12 19:19:13 +01:00
Tapani Pälli	3da858a6b9	intel/compiler: add scale_factors to sampler_prog_key_data Patch propagates given scale_factors to lowering options. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-12 08:42:25 +02:00
Francisco Jerez	374eb3cd6f	intel/dump_gpu: Disambiguate between BOs from different GEM handle spaces. This fixes a rather astonishing problem that came up while debugging an issue in the Vulkan CTS. Apparently the Vulkan CTS framework has the tendency to create multiple VkDevices, each one with a separate DRM device FD and therefore a disjoint GEM buffer object handle space. Because the intel_dump_gpu tool wasn't making any distinction between buffers from the different handle spaces, it was confusing the instruction state pools from both devices, which happened to have the exact same GEM handle and PPGTT virtual address, but completely different shader contents. This was causing the simulator to believe that the vertex pipeline was executing a fragment shader, which didn't end up well. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-11 12:27:22 -08:00
Jason Ekstrand	fd77606b5b	intel/fs: Use enumerated array assignments in fb read TXF setup It's more clear and means we don't have to update the array every time we add an optional texture instruction argument Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-11 10:57:09 -06:00
Caio Marcelo de Oliveira Filho	ee670d09af	intel/compiler: use 0 as sampler in emit_mcs_fetch The sampler will be ignored since the underlying 'ld_mcs' operation won't use it, so just fill the field with 0 instead of the texture to make it clearer that's the case. This will also avoid is_high_sampler() to kick in unnecessarily, in case we are using the operation for a texture with index >= 16. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-08 14:51:56 -08:00
Ian Romanick	28ef5bb74c	intel/compiler: Silence warning about value that may be used uninitialized For some reason, this warning only occurs for me in release builds. In file included from src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c:25:0: src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c: In function ‘brw_nir_lower_mem_access_bit_sizes’: src/compiler/nir/nir_builder.h:501:26: warning: ‘src_swiz[2]’ may be used uninitialized in this function [-Wmaybe-uninitialized] alu_src.swizzle[i] = swiz[i]; ~~~~~~~~~~~~~~~~~~~^~~~~~~~~ src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c:225:16: note: ‘src_swiz[2]’ was declared here unsigned src_swiz[4]; ^~~~~~~~ Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-08 10:37:06 -08:00
Lionel Landwerlin	a934a3d124	anv: assert that color attachment are valid This reverts commit `d76e777988`. Let's make this obvious that there is an application issue if it tries to access an attachment that doesn't exist in the current pass. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `d76e777988` ("anv: Handle VK_ATTACHMENT_UNUSED in colorAttachment") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-08 00:18:16 +00:00
Emil Velikov	8943eb8f03	anv: wire up the state_pool_padding test Cc: Jason Ekstrand <jason@jlekstrand.net> Fixes: `927ba12b53` ("anv/tests: Adding test for the state_pool padding.") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com><Paste> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-02-05 11:39:36 -08:00
Caio Marcelo de Oliveira Filho	8c7c543936	isl: assert that Gen8+ don't have bit6_swizzling v2: Rewrite the condition to more clearly match the comment. (Jordan) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-04 20:44:41 -08:00
Caio Marcelo de Oliveira Filho	5299c9cbcc	anv: skip bit6 swizzle detection in Gen8+ It is always false on Gen8+. Also, move the variable definition near its use. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-04 20:44:41 -08:00
Danylo Piliaiev	64d3b148fe	anv: Fix VK_EXT_transform_feedback working with varyings packed in PSIZ Transform feedback did not set correct SO_DECL.ComponentMask for varyings packed in VARYING_SLOT_PSIZ: gl_Layer - VARYING_SLOT_LAYER in VARYING_SLOT_PSIZ.y gl_ViewportIndex - VARYING_SLOT_VIEWPORT in VARYING_SLOT_PSIZ.z gl_PointSize - VARYING_SLOT_PSIZ in VARYING_SLOT_PSIZ.w Fixes: `36ee2fd61c` "anv: Implement the basic form of VK_EXT_transform_feedback" Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-04 15:30:43 +00:00
Danylo Piliaiev	d76e777988	anv: Handle VK_ATTACHMENT_UNUSED in colorAttachment From the Vulkan 1.0.98 spec for vkCmdClearAttachments: "If the aspectMask member of any element of pAttachments contains VK_IMAGE_ASPECT_COLOR_BIT, then the colorAttachment member of that element must either refer to a color attachment which is VK_ATTACHMENT_UNUSED, or must be a valid color attachment." Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-04 14:49:50 +02:00
Jason Ekstrand	48ed2a7bb0	anv: Implement VK_EXT_buffer_device_address Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-01 17:09:42 -06:00
Jason Ekstrand	e644ed468f	intel/fs: Implement nir_intrinsic_global_atomic_* eviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 16:11:00 -06:00
Jason Ekstrand	a91f392073	intel/fs: Use SENDS for A64 writes on gen9+ eviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 16:11:00 -06:00
Jason Ekstrand	1c25bf4373	intel/fs: Implement load/store_global with A64 untyped messages eviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 16:11:00 -06:00
Jason Ekstrand	b4f0d062cd	intel/fs: Do the grf127 hack on SIMD8 instructions in SIMD16 mode Previously, we only applied the fix to shaders with a dispatch mode of SIMD8 but the code it relies on for SIMD16 mode only applies to SIMD16 instructions. If you have a SIMD8 instruction in a SIMD16 shader, neither would trigger and the restriction could still be hit. Fixes: `232ed89802` "i965/fs: Register allocator shoudn't use grf127..." Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 16:11:00 -06:00
Jason Ekstrand	79724a0756	intel/fs: Properly handle 64-bit types in LOAD_PAYLOAD By just assigning dst.type to src[i].type, we ensure that the offset at the end of the loop actually offsets it by the right number of registers. Otherwise, we'll get into a case where we copy with a Q type and then offset with a D type and things get out of sync. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 16:10:57 -06:00
Jason Ekstrand	f02914a991	intel/fs/cse: Split create_copy_instr into three cases Previously, we tried to combine all cases where the instruction being CSE'd writes to more than one MOV worth of registers into one case with a bit of special casing for LOAD_PAYLOAD. This commit splits things so that LOAD_PAYLOAD is entirely it's own case. This makes tweaking the LOAD_PAYLOAD case simpler in the next commit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 16:10:40 -06:00
Jason Ekstrand	f409a08e5f	intel/nir: Add global support to lower_mem_access_bit_sizes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 16:08:29 -06:00
Oscar Blumberg	fea5b8e5ad	intel/fs: Fix memory corruption when compiling a CS Missing check for shader stage in the fs_visitor would corrupt the cs_prog_data.push information and trigger crashes / corruption later when uploading the CS state. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 10:53:33 -08:00
Dylan Baker	4052142de7	meson: remove -std=c++11 from intel/tools for meson all C++ code is already compiled as C++11, so it's unnecessary. It's also the wrong way to do this, if we really needed this the correct way is to set: ```meson executable( ... override_options : ['cpp_std=c++11'], ) ``` Which ensures not only that the correct syntax for the current compiler is used, but also that meson doesn't create arguments like `-std=c++14 ... -std=c++11` Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-01-31 21:42:16 +00:00
Dylan Baker	8e49b32f63	meson: fix style in intel/tools The `:` in options should always have one space before and after `foo : bar`, and lists do not get spaces around the braces: `[foo]` not `[ foo ]` Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-01-31 21:42:16 +00:00
Dylan Baker	d93d53fa72	meson: remove build_by_default : true Which is and has always been the default. This is largely an artifact of how the building of these tools was controlled when the meson build was originally created. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-01-31 21:42:16 +00:00
Jason Ekstrand	a920979d4f	intel/fs: Use split sends for surface writes on gen9+ Surface reads don't need them because they just have the one address payload. With surface writes, on the other hand, we can put the address and the data in the different halves and avoid building the payload all together. The decrease in register pressure and added freedom in register allocation resulting from this change reduces spilling enough to improve the performance of one customer benchmark by about 2x. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	014edff0d2	intel/fs: Add interference between SENDS sources Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	eab1c55590	intel/fs: Support SENDS in SHADER_OPCODE_SEND Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	cca199fd85	intel/disasm: Properly disassemble split sends Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	8babaa84e8	intel/eu: Add support for the SENDS[C] messages Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	d6a6e10390	intel/inst: Indent some code We're about to add some more if cases so let's have the giant re-indent in it's own patch to make review easier. Acked-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	d96969120d	intel/inst: Fix the ia16_addr_imm helpers These have clearly never seen any use.... On gen8, the bottom 4 bits are missing so we need to shift them off before we call set_bits and shift again when we get the bits. Found by inspection. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	e46fb33143	intel/disasm: Rework SEND decoding to use descriptors Instead of fetching the information out of the instruction directly, fetch the descriptor and then pluck the information out of the descriptor. The current scheme works ok for SEND but with SENDS, it all falls to pieces because the descriptor is completely shuffled around. This commit doesn't actually convert everything. One notable exception is URB messages which don't even use descriptors in emit_urb_WRITE yet. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	13a6fabc62	intel/eu: Add more message descriptor helpers We want to be able to extract data from descriptors as well as unify a bit of the descriptor construction. One of the unifications we do is to unify the read/write and dataport descriptors. On gen4-5, read/write are substantially different and the read descriptors change between gen4 and gen4.x. On gen6, they unified layouts between read, write, and dataport. Then, on gen8, they added one bit to the message type field but left it reserved MBZ for read/write messages. This commit chooses to treat that as if they expanded the field everywhere and just didn't have enough enum values for read/write to bother with the extra bit. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	c3aa436bfe	intel/eu/validate: SEND restrictions also apply to SENDC Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	fee6bd8d8e	intel/eu: Use GET_BITS in brw_inst_set_send_ex_desc It's a bit more readable Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	b284d222db	intel/fs: Use SHADER_OPCODE_SEND for varying UBO pulls on gen7+ Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	8514eba693	intel/fs: Use SHADER_OPCODE_SEND for texturing on gen7+ Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	f547cebbe0	intel/fs: Use a logical opcode for IMAGE_SIZE Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	d2d3e04501	intel/fs: Use SHADER_OPCODE_SEND for surface messages Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	7f1cf046cd	intel/fs: Add a generic SEND opcode Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	ba3c5300f9	intel/eu: Rework surface descriptor helpers This commit pulls the surface descriptor helpers out into brw_eu.h and makes them no longer depend on the codegen infrastructure. This should allow us to use them directly from the IR code instead of the generator. This change is unfortunately less mechanical than perhaps one would like but it should be fairly straightforward. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	5b17379631	intel/eu: Add has_simd4x2 bools to surface_write functions Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	2ce93b88c0	intel/fs: Take an explicit exec size in brw_surface_payload_size() Instead of magically falling back to SIMD8 for atomics and typed messages on Ivy Bridge, explicitly figure out the exec size and pass that into brw_surface_payload_size. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	cf42b0f9e2	intel/fs: Handle IMAGE_SIZE in size_read() and is_send_from_grf() Like all the other sends, it's just mlen * REG_SIZE. Fixes: `3cbc02e469` "intel: Use TXS for image_size when we have..." Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	009c0bd840	intel/defines: Explicitly cast to uint32_t in SET_FIELD and SET_BITS If you pass a bool in as the value to set, the C standard says that it gets converted to an int prior to shifting. If you try to set a bool to bit 31, this lands you in undefined behavior. It's better just to add the explicit cast and let the compiler delete it for us. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	077b9557a4	intel/fs: Get rid of fs_inst::equals There are piles of fields that it doesn't check so using it is a lie. The only reason why it's not causing problem is because it has exactly one user which only uses it for MOV instructions (which aren't very interesting) and only on Sandy Bridge and earlier hardware. Just get rid of it and inline it in the one place that it's actually used. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Matt Turner	18b467c066	intel/compiler: Add a file-level description of brw_eu_validate.c Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-01-26 10:33:22 -08:00
Tapani Pälli	540939ecee	android: fix build issues with libmesa_anv_gen* libraries We need this include path to find nir/nir_xfb_info.h. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-01-25 15:21:06 +02:00
Andrii Simiklit	4759bb2fcf	intel/batch-decoder: fix a vb end address calculation According to the loop implementation (in 'ctx_print_buffer' function), which advances dword by dword over vertex buffer(vb), the vb size should be aligned by 4 bytes too. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109449 Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-25 15:12:30 +02:00

1 2 3 4 5 ...

4023 Commits