Commit Graph

8177 Commits

Author SHA1 Message Date
Lionel Landwerlin 09caa8902c anv: move internal RT shaders to the internal cache
Those shaders are just like the blorp ones.

v2: Use a single internal cache for blorp/RT (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7f1e82306c ("anv: Switch to the new common pipeline cache")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16741>
2022-05-28 10:14:03 +00:00
Jason Ekstrand 5d0b09be5b anv: Use the base vk_buffer struct
This mostly gets us the vk_buffer_range() helper but may be useful in
the future.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16607>
2022-05-27 18:38:57 -05:00
Jason Ekstrand dfedeccc13 intel: Only set VectorMaskEnable when needed
For cases with lots of very small primitives, this may improve
performance because we're not executing those dead channels all the
time.

Shader-db reports no instruction or cycle-count changes.  However, by
hacking up the driver to report when this optimization triggers, it
appears to affect about 10% of shader-db.

v2 (Kenneth Graunke): Always enable VMask prior to XeHP for now,
because using VMask on those platforms allows us to perform the
eliminate_find_live_channel() optimization.  However, XeHP doesn't
seem to have packed fragment shader dispatch, so we lose that
optimization regardless, and there's no reason not to avoid vmask.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1054>
2022-05-27 21:52:48 +00:00
Jason Ekstrand 0d28de212a anv: Don't disable the fragment shader if XFB is enabled
It turns out that we need a fragment shader for streamout.  Whh?  From
Lionel's reading of simulator sources, it seems the streamout unit is
looking at enabled next stages.  It'll generate output to the clipper in
the following cases :

 - 3DSTATE_STREAMOUT::ForceRendering = ON
 - PS enabled
 - Stencil test enabled
 - depth test enabled
 - depth write enabled
 - some other depth/hiz clear condition

Forcing rendering without a PS seems like a recipe for hangs so it's
probably better to just enable the PS in this case.

Fixes: 36ee2fd61c ("anv: Implement the basic form of VK_EXT_transform_feedback")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16506>
2022-05-27 14:33:53 +00:00
Jason Ekstrand 73b3efcd59 anv: Handle the null FS optimization after compiling shaders
Actually compile and cache the no-op fragment shader but remove it from
the pipeline if we determine it's a no-op.  This way we always have it
even if it's not strictly needed.

Fixes: 36ee2fd61c ("anv: Implement the basic form of VK_EXT_transform_feedback")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16506>
2022-05-27 14:33:53 +00:00
Jason Ekstrand 9fe6caf4e7 anv: Drop alpha_to_coverage from the NULL FS optimization
Starting with Ivy Bridge, we implement alpha-to-coverage by writting
gl_SampleMask with a pattern based on alpha.  This will show up in
wm_prog_data::uses_omask so we don't need to look at the key.

Fixes: 36ee2fd61c ("anv: Implement the basic form of VK_EXT_transform_feedback")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16506>
2022-05-27 14:33:53 +00:00
Jason Ekstrand 1b9248e761 intel/fs: Copy color_outputs_valid into wm_prog_data
Fixes: 36ee2fd61c ("anv: Implement the basic form of VK_EXT_transform_feedback")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16506>
2022-05-27 14:33:53 +00:00
Jason Ekstrand 8379993223 intel/fs: Drop fs_visitor::emit_alpha_to_coverage_workaround()
It no longer exists.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16506>
2022-05-27 14:33:53 +00:00
David Heidelberg b19c858f3d ci/intel: add RoR and Nheko traces and reenable most of Valve traces
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16633>
2022-05-27 06:51:38 +00:00
Lionel Landwerlin e666089082 intel/disasm: add missing handling of <1;1,0>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7cd9adeb41 ("intel/compiler: In XeHP prefer <1;1,0> regions before compacting")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16704>
2022-05-26 06:42:16 +00:00
Kenneth Graunke 9886615958 intel/compiler: Move spill/fill tracking to the register allocator
Originally, we had virtual opcodes for scratch access, and let the
generator count spills/fills separately from other sends.  Later, we
started using the generic SHADER_OPCODE_SEND for spills/fills on some
generations of hardware, and simply detected stateless messages there.

But then we started using stateless messages for other things:
- anv uses stateless messages for the buffer device address feature.
- nir_opt_large_constants generates stateless messages.
- XeHP curbe setup can generate stateless messages.

So counting stateless messages is not accurate.  Instead, we move the
spill/fill accounting to the register allocator, as it generates such
things, as well as the load/store_scratch intrinsic handling, as those
are basically spill/fills, just at a higher level.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16691>
2022-05-25 06:56:01 +00:00
Michael Skorokhodov 10b6d9230c anv: Update line range
This commit increases the maximum line width to 8.0 for SLK+
and to 7.9921875 for BDW and earlier.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6234
Fixes: fce0027d ("anv: Unbreak wide lines on HSW/BDW")
Signed-off-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15879>
2022-05-24 23:09:26 +00:00
Kenneth Graunke 59bfc9c6cb intel: Fix analysis invalidation in eliminate_find_live_channel
If we saw a HALT instruction, we would forget to invalidate our analysis
pass information before returning progress.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16677>
2022-05-24 22:36:39 +00:00
Marcin Ślusarz 21d3630cbc intel/tools: fix 32-bit build
Fixes: 0aac3b1009 ("intel/tools/aubinator: add support for 2 "new" subopcodes")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6553
Reviewed-by: Mark Janes <markjanes@swizzler.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16695>
2022-05-24 18:27:32 +00:00
Viktoriia Palianytsia e39a5f2b9f anv: Add workaround for sample mask with multisampling
The game Batman: Arkham Knight expects OpenGL behavior
with sample mask and multisampling which is different
from the Vulkan one.
This workaround fix changes key->ignore_sample_mask_out
value that is used for
prog_data->uses_omask definition in brv_fs.cpp(9740)
In that way prog_data->uses_omask also changes it value
and the cloak stops flickering.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6078

Signed-off-by: Viktoriia Palianytsia <v.palianytsia@globallogic.com>

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16551>
2022-05-24 14:43:57 +00:00
Marcin Ślusarz 8187716b55 intel/tools: add macros for gfx12+ variant of VCSUNIT0
Not used for now.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16664>
2022-05-24 08:03:45 +00:00
Marcin Ślusarz ba80c36708 intel/tools/aubinator: list all platforms in help message
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16664>
2022-05-24 08:03:45 +00:00
Marcin Ślusarz 0aac3b1009 intel/tools/aubinator: add support for 2 "new" subopcodes
... and add macros for subopcodes we haven't seen yet

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16664>
2022-05-24 08:03:44 +00:00
Marcin Ślusarz 43ad5fd9b7 intel/tools: drop wrappers around mmio regs macros
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16664>
2022-05-24 08:03:44 +00:00
Marcin Ślusarz b916b30f58 intel/tools: clean up mmio regs definitions
Each unit has the same regs at the same offsets.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16664>
2022-05-24 08:03:44 +00:00
Marcin Ślusarz 3910736f29 intel/tools: add support for GEM_CREATE_EXT in intel_dump_gpu
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16664>
2022-05-24 08:03:44 +00:00
Jason Ekstrand c24aa449d0 vulkan,anv,turnip: Add a common CmdBindVertexBuffers wrapper
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16611>
2022-05-20 02:12:37 +00:00
Kenneth Graunke 27314718a3 intel: Drop Wa_1409226450 (stall before instruction cache invalidation)
Production Tigerlake and DG1 hardware shouldn't need this workaround.
It was only needed on the very first steppings which never went public.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16575>
2022-05-19 21:31:45 +00:00
Lionel Landwerlin 1c077ca9c0 u_trace/anv/iris: drop cs argument for recording traces
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16605>
2022-05-19 19:04:28 +00:00
Lionel Landwerlin 5398c9183e intel/ds: fix compilation
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6518
Fixes: efc2782f97 ("intel/perf: store a copy of devinfo")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16601>
2022-05-19 16:42:41 +00:00
Lionel Landwerlin 9d0db8d4c4 intel/perf: deal with OA reports timestamp values on DG2
OA reports on XeHP have their timestamp shifted to the left by 1. To
get that back in the same time domain as the REG_READ you need to
shift it back to the right and you're loosing the top bit.

v2: use ull for 64bit constant (Ian)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16144>
2022-05-17 19:55:10 +00:00
Lionel Landwerlin 773f41e3e4 intel/perf: disable sseu setting on Gfx12.5+
This is rejected by i915.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16144>
2022-05-17 19:55:10 +00:00
Lionel Landwerlin d2834dd626 intel/perf: add new layout for Gfx12.5 products
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16144>
2022-05-17 19:55:10 +00:00
Lionel Landwerlin 66045acdf9 intel/perf: add max vfuncs
New counters will use those from inside their read function to
generate percentage numbers.

v2: Forgot to update Iris (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16144>
2022-05-17 19:55:10 +00:00
Lionel Landwerlin c740ca6000 intel/perf: add support new variable counting the number of EUs in slice0-3
v2: MIN2(4, max_slices) (Marcin)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16144>
2022-05-17 19:55:10 +00:00
Lionel Landwerlin 6f63bc38e7 intel/perf: add OA A counter type
On Gfx12.5 products, we'll need to capture a couple of A counters that
are not captured in MI_RPC reports. Those are actually global,
previously all A counters were per context.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16144>
2022-05-17 19:55:10 +00:00
Lionel Landwerlin 376e420abb intel/perf: stop overriding oa_format
This already set in the intel_perf_setup.h file at metric set
creation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16144>
2022-05-17 19:55:10 +00:00
Lionel Landwerlin aa04b47c6e intel/perf: add support for GtSlice/GtSliceXDualsubsliceY variables
For those, we'll fish the information out of the devinfo.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16144>
2022-05-17 19:55:10 +00:00
Lionel Landwerlin d134a62345 intel/perf: add support for dualsubslice count variable
This is the same as the subslice count.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16144>
2022-05-17 19:55:10 +00:00
Lionel Landwerlin efc2782f97 intel/perf: store a copy of devinfo
In the future we'll pull more information off devinfo.

v2: Constify pointers (Ian)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16144>
2022-05-17 19:55:10 +00:00
Lionel Landwerlin 0df4b96062 intel/perf: add support for new opcodes in code generation
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16144>
2022-05-17 19:55:10 +00:00
Jason Ekstrand fc8d2543fc vulkan,v3dv: Add a driver_internal flag to vk_image_view_init/create
We already had a little workaround for v3dv where, for some if its meta
ops, it had to bind a depth/stenicil image as color.  Instead of
special-casing binding depth/stencil as color, let's flip on the
drier_internal flag and get rid of most of the checks in that case.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16376>
2022-05-17 18:14:55 +00:00
Kenneth Graunke b637f6c3db intel/decoder: Fix binding table pointer decoding with large offsets
XeHP supports a 20:5 pointer format, so the offset can legitimately
be more than UINT16_MAX.  Likewise, with 256B binding table mode on
Icelake/Tigerlake, we might have 18:8 pointers that exceed UINT16_MAX.

Thanks to Felix DeGrood for catching this!

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16538>
2022-05-17 08:52:00 +00:00
David Heidelberg d22eeb5ae0 ci/iris: skqp: remove flaking atlastext for TGL
Example:
 - https://gitlab.freedesktop.org/mesa/mesa/-/jobs/22380389#L4349
 - https://mesa.pages.freedesktop.org/-/mesa/-/jobs/22380389/artifacts///results/gles/report.html

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6460
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16407>
2022-05-17 06:57:19 +00:00
David Heidelberg 317496ba8a ci/iris: skqp: add default GLES rendertests for TGL
Import the intact whole rendertest file from skqp (branch
android-cts-12.1_r1) to be able remove the offending test line in the
following commit.

Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16407>
2022-05-17 06:57:19 +00:00
Timothy Arceri d7a071a28f gallium/drivers: set force_indirect_unrolling_sampler for all required drivers
This is set to true for all drivers that have a GLSL level
of support lower than 4.00. This matches the rule for setting the
GLSL IR option EmitNoIndirectSampler.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16543>
2022-05-17 02:12:21 +00:00
Lionel Landwerlin 17fc7b20b1 anv: fix primitives generated queries values
Numbers in some situations are incorrect because we don't stall
properly before capturing the register value.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6412
Fixes: a468f26ca5 ("anv: implement VK_EXT_primitives_generated_query")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16505>
2022-05-14 10:47:29 +00:00
Marcin Ślusarz 1542ab70eb anv: handle primitive shading rate for mesh
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16030>
2022-05-13 13:05:51 +00:00
Marcin Ślusarz 9acb30c8c4 intel/compiler: implement primitive shading rate for mesh
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16030>
2022-05-13 13:05:51 +00:00
Marcin Ślusarz aa1c128b54 anv: disable streamout before emitting mesh shading state
Fixes tests which use secondary command buffers.

Fixes: ef04caea9b ("anv: Implement Mesh Shading pipeline")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16493>
2022-05-13 09:43:02 +00:00
Marcin Ślusarz 29a778fa6b intel/compiler: print name of the unhandled intrinsic
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16493>
2022-05-13 09:43:02 +00:00
Marcin Ślusarz f083df8710 anv: update task/mesh distribution with the recommended values
Fixes: ef04caea9b ("anv: Implement Mesh Shading pipeline")

Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16493>
2022-05-13 09:43:02 +00:00
Marcin Ślusarz 65ff6932dc intel/compiler: handle gl_Viewport and gl_Layer in FS URB setup
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16493>
2022-05-13 09:43:02 +00:00
Marcin Ślusarz 040062df41 intel/compiler: handle VARYING_SLOT_CULL_PRIMITIVE in mesh
It's needed for gl_MeshPerPrimitiveNV[].gl_ViewportMask

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16493>
2022-05-13 09:43:02 +00:00
Vadym Shovkoplias 55c71217ec driconf: Add a limit_trig_input_range option
With this option enabled range of input values for fsin and fcos is
limited to [-2*pi : 2*pi] by calculating the reminder after 2*pi modulo
division. This helps to improve calculation precision for large input
arguments on Intel.

-v2: Add limit_trig_input_range option to prog_key to update shader
     cache (Lionel)

Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16388>
2022-05-13 06:47:53 +00:00
Kenneth Graunke ad537edc7c anv: Fix INTEL_DEBUG=bat on XeHP
We no longer emit STATE_BASE_ADDRESS in every batch on XeHP, so the
decoder might not know what the various base addresses are if it's only
looking at a single batch.  Fortunately, they also never change, so we
can just emit them once here.

On earlier platforms, initializing them here should be harmless.  We'll
emit STATE_BASE_ADDRESS if we change them, which will update these.

Thanks to Iván Briano for catching this.

Fixes: 8831cb38aa ("anv: Stop updating STATE_BASE_ADDRESS on XeHP")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16287>
2022-05-12 11:10:25 -07:00
Jordan Justen ad565f6b70 intel/dev: Enable first set of DG2 PCI IDs
Mostly Matt Roper's kernel patch commit message:

The IDs added here are the subset reserved for 'motherboard down'
designs of DG2. We have all the necessary support upstream to enable
these now.

The remaining DG2 IDs for add-in cards will be enabled in a future
patch once some additional required functionality has fully landed.

Ref: https://patchwork.freedesktop.org/patch/msgid/20220425211251.77154-3-matthew.d.roper@intel.com
Cc: 22.1 <mesa-stable>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16449>
2022-05-12 03:03:57 -07:00
Jordan Justen 4456209ce5 intel/dev: Add INTEL_PLATFORM_DG2_G12
Cc: 22.1 <mesa-stable>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16449>
2022-05-12 03:03:57 -07:00
Jason Ekstrand 352e32e5ba nir/builder: Add a nir_trim_vector helper
This pattern pops up a bunch and the semantics of nir_channels() aren't
very convenient much of the time.  Let's add a nir_trim_vector() which
matches nir_pad_vector().

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16309>
2022-05-11 14:47:33 +00:00
Iván Briano 2e46f38902 anv: re-alloc push constants after secondary command buffers
If the secondary command buffer executed used push constants on a
different set of stages than the primary is using, we may end up not
reallocating them for the primary, getting misrender artifacts at best,
or a nice GPU hang at worst.

Fixes the tests from a CTS from the future:
dEQP-VK.dynamic_rendering.random.*

Cc: mesa-stable

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16439>
2022-05-10 21:56:49 +00:00
Karol Herbst 9c5fd100cc nir: add a nir_remove_non_entrypoints helper
This code just got duplicated a lot. There is still more, but the
remaining instances do a bit more than just removing other functions.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16348>
2022-05-10 03:37:44 +00:00
Emma Anholt af76f0bcfc ci/iris: Cut the glk-deqp test coverage in half.
It's taking 13-14 minutes of deqp-runner time, not counting booting, or
the LAVA-side job getting being queued behind other jobs.  Well past our
10-minute runtime target, and we saw load on these boards causing the
queue to get quite long
(https://gitlab.freedesktop.org/mesa/mesa/-/issues/6409#note_1368750)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16359>
2022-05-10 02:16:04 +00:00
Chia-I Wu b2b810ebff anv: advertise rectangularLines only for Gen10+
We use the non-strict algorithm (with parallelograms) prior to Gen10 for
wide lines.  We can not advertise rectangularLines.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Fixes: f6e7de41d7 ("anv: Implement VK_EXT_line_rasterization")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15432>
2022-05-06 18:22:19 +00:00
Lionel Landwerlin 969512d696 intel: fix stall debug option
Missing the parsing bit.

Fixes: 317512e038 ("anv/intel: add a new debug flag for stalling after every draw/dispatch")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16338>
2022-05-06 08:27:47 +00:00
Emma Anholt 3a42e92a4f glsl: Drop the dead MOD_TO_FLOOR path.
It's now called lower_fmod in NIR.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8044>
2022-05-05 22:25:03 +00:00
Emma Anholt 72dba615be ci/iris: Add a bunch of APL and KBL flakes recently.
I got hit by one of them trying to merge !8044.  Just update the list.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8044>
2022-05-05 22:25:03 +00:00
Emma Anholt 3c0e4be89b ci/iris: Demote APL deqp to manual-only for now.
it's been flaking with "2022-05-05 16:29:49.055151: [0m[31mERROR - Failure
getting run results: parsing results: Reading from dEQP: timed out waiting
for fd to be ready (See \"//results/c32.r1.log\")" and a pile of missings
since the brief "whoops, HW CI failed to listen to the test exit code"
regression.

The only ways I know of to hit this case would be:

1) The deqp binary abruptly wedges on its own.  This happens with NFS
failures sometimes, but the rest of the run went fine and we never got the
kernel complaining about NFS, so that seems unlikely.

2) The stderr pipe filled up before stdout was completed, and deqp got
wedged trying to output stderr (happens sometimes when you do like
NIR_DEBUG=print in your run).

Both of these seem unlikely, given that we've got a big .qpa file that
made it all the way to writing out test case durations at the end of the
run before abruptly terminating.  Why didn't we have at least some of the
test results parsed?

The next deqp-runner release we integrate will solve #2, and cleans up
these error paths a bunch, so I'm hoping we get more information soon.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16350>
2022-05-05 18:20:12 +00:00
Lionel Landwerlin 797a8850b9 anv: remove static_state_mask
This is now unnecessary. Either an instruction is never dynamic and
it's emitted in genX_pipeline.c or it can be and it's emitted in
genX_cmd_buffer.c/gfx8_cmd_buffer/gfx7_cmd_buffer.c

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16220>
2022-05-03 17:12:45 +00:00
Lionel Landwerlin 74a27a6ccb anv: don't emit 3DSTATE_VF_TOPOLOGY in pipeline batch
v2: drop primitive_topology = 0xffffffff (Tapani)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16220>
2022-05-03 17:12:45 +00:00
Lionel Landwerlin 48229d11ba anv: don't emit 3DSTATE_DEPTH_BOUNDS in pipeline batch
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16220>
2022-05-03 17:12:45 +00:00
Lionel Landwerlin 76e735d09c anv: don't emit 3DSTATE_BLEND_STATE_POINTERS in pipeline batch
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16220>
2022-05-03 17:12:45 +00:00
Lionel Landwerlin e9d000a831 anv: don't emit 3DSTATE_WM in pipeline batch
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16220>
2022-05-03 17:12:44 +00:00
Lionel Landwerlin 065242d623 anv: don't emit 3DSTATE_STREAMOUT in pipeline batch
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16220>
2022-05-03 17:12:44 +00:00
Lionel Landwerlin ce8bb29342 anv: never emit 3DSTATE_CPS in the pipeline batch
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16220>
2022-05-03 17:12:44 +00:00
Lionel Landwerlin 168b13364f anv: rework sample location
On Gfx7 we can only give the sample location for a given multisample
number. This means everytime the multisampling value changes, we have
to re-emit the locations. It's fine because it's also where
(3DSTATE_MULTISAMPLE) the number of samples is stored.

On Gfx8+ though, 3DSTATE_MULTISAMPLE only holds the number of samples
and all the sample locations for all number of samples are located in
3DSTATE_SAMPLE_PATTERN. So to be more effecient there, we need to
track the locations for all sample numbers and compare new values with
the relevant sample count when touching the dynamic state.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16220>
2022-05-03 17:12:44 +00:00
Lionel Landwerlin 810518fda7 Revert "anv: fix dynamic state emission"
This reverts commit f348103fce. The
change was causing performance regressions.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16220>
2022-05-03 17:12:44 +00:00
Lionel Landwerlin 69e6417e19 anv: add missing logic op set in pipeline dyn state
v2: add ANV_CMD_DIRTY_DYNAMIC_LOGIC_OP check (Tapani)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 75ad0e4b08 ("anv: support blending logic op dynamic state")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16220>
2022-05-03 17:12:44 +00:00
Lionel Landwerlin 5048f15737 anv: reset all dynamic state after secondary execution
We don't know in what state the secondary buffer will leave the HW
when it ends. It's easier to consider everything needs to be reemitted
for now.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16220>
2022-05-03 17:12:44 +00:00
Lionel Landwerlin 4efc997472 anv: fix invalid utrace memcpy l3 config on gfx < 11
device->l3_config is only valid on Gfx11+

This only fixes using GPU_TRACE=1

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 02a4d622ed ("anv: expose a couple of emit helper to build utrace buffer copies")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16291>
2022-05-03 13:18:48 +00:00
Rob Clark c4b5ebe1fc drm-shim: Better mmap offsets
Using the bo pointer address as the offset doesn't go over well when
someone is fuzzing you.  But we already have the mem_addr, we can simply
use that instead.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16250>
2022-05-02 19:50:33 +00:00
Caio Oliveira 7cd9adeb41 intel/compiler: In XeHP prefer <1;1,0> regions before compacting
Ken performed some tests with shader-db to evaluate the effects

```
Across all 145,848 shaders generated, the results were:

Total bytes compacted before: 3,326,224
Total bytes compacted after: 60,963,280
```

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15399>
2022-05-02 18:03:01 +00:00
Lionel Landwerlin 0be9cac742 anv: limit clflush usage
Discrete platforms don't have LLC, but on those, we mmap our buffers
with WC. So we shouldn't need to clflush there.

Anv already had a boolean field on the physical device to know whether
we need to use clflush(), based off the memory heaps available. So use
that instead.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15780>
2022-05-02 12:07:01 +00:00
Lionel Landwerlin 44e93b4c6f anv: fix clflush usage on utrace copy batch
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: cc5843a573 ("anv: implement u_trace support")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15780>
2022-05-02 12:07:01 +00:00
Francisco Jerez 14cad38b19 intel/dev: Compute pixel pipe information based on geometry topology DRM query.
This changes the intel_device_info calculation to call an additional
DRM query requesting the geometry topology from the kernel, which may
differ from the result of the current topology query on XeHP+
platforms with compute-only and 3D-only DSSes.  This seems more
reliable than the current guesswork done in intel_device_info.c trying
to figure out which DSSes are available for the render CS.

Cc: 22.1 <mesa-stable>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14143>
2022-04-30 00:00:58 +00:00
Jordan Justen de99a11172 intel_dev_info: Add --hwconfig command line parameter
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14511>
2022-04-28 21:56:32 +00:00
Jordan Justen d9ff9ea9c3 intel/dev: Read hwconfig from i915
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14511>
2022-04-28 21:56:32 +00:00
Emma Anholt 536c8ee96d nir/lower_tex: Make the adding a 0 LOD to nir_op_tex in the VS optional.
This controls the whole lowering of "make tex ops with implicit
derivatives on non-implicit-derivative stages be tex ops with an explicit
lod of 0 instead", but it's really hard to describe that in a git commit
summary.

All existing callers get it added except:
- nir_to_tgsi which didn't want it.
- nouveau, which didn't want it (fixes regressions in shadowcube and
  shadow2darray with NIR, since the shading languages don't expose txl of
  those sampler types and thus it's not supported in HW)
- optional lowering passes in mesa/st (lower_rect, YUV lowering, etc)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16156>
2022-04-28 21:26:08 +00:00
Nanley Chery b023f18bad isl,iris: Add DG2 CCS modifier support for XeHP
Cc: 22.1 <mesa-stable>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14521>
2022-04-28 20:02:14 +00:00
Nanley Chery a53abeb7fb intel/isl: Add a score for I915_FORMAT_MOD_4_TILED
Enables the modifier in anv.

Cc: 22.1 <mesa-stable>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14521>
2022-04-28 20:02:14 +00:00
Anuj Phogat ac441d0953 isl,iris: Add I915_FORMAT_MOD_4_TILED support for XeHP
This patch adds Tile 4 modifier support to Mesa and allows Mesa to
use Tile 4 on gen12-hp with GBM.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: 22.1 <mesa-stable>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14521>
2022-04-28 20:02:14 +00:00
Tapani Pälli d3ef3657b2 isl: disable mcs (and mcs+ccs) for color msaa on DG2
Fixes lots of various test failures in:
   dEQP-VK.pipeline.multisample.min_sample_shading_disabled.*
   dEQP-GLES3.functional*multisample.*
   KHR-GL*sample_variables.*

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13991>
2022-04-28 05:31:52 +00:00
Lionel Landwerlin f4f350a06c anv: reemit 3DSTATE_STREAMOUT after memcpy
This doesn't fix anything because memcpy is only used before secondary
buffer execution and we dirty everything after that.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16189>
2022-04-27 18:43:00 +00:00
David Heidelberg 657b0ff861 ci/iris: Enable SKQP on Tiger Lake boards
- SKQP gets included now in all amd64 LAVA builds.
 - add test job for Tiger Lake (tgl)
 - add manual test job for Whiskey Lake (whl), because all runners are
   already used
 - document that we have 13 tgl machines

Tests failed (on tgl):
 - gl_simpleaaclip_aaclip, 1 pixel off : https://okias.pages.freedesktop.org/-/mesa/-/jobs/21790629/artifacts///results/gl/report.html

Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16048>
2022-04-27 12:35:13 +00:00
David Heidelberg c1e59bea05 ci: intel: Merge anv and iris into src/intel/ci
This commit make simple adding tests which use both GL(ES) and VK.

Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16048>
2022-04-27 12:35:13 +00:00
Erik Faye-Lund 3620e7e71c vulkan: drop empty vulkan_wsi_args
This is always empty, so let's just get rid of it.

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16162>
2022-04-27 11:51:26 +00:00
Sviatoslav Peleshko 28ca5636f6 anv: workaround apps that assume full subgroups without specifying it
Without this we might choose 8 or 16 width, while the app assumes 32.
With subgroup operations it may cause wrong calculations and thus bugs.

Examples of such games are Aperture Desk Job and DOOM Eternal.

v2: Make it a driconf option instead of applying unconditionally, move
    from brw_required_dispatch_width to brw_compile_cs
v3: Rename allow_assuming_full_subgroups -> assume_full_subgroups.
    Include assume_full_subgroups value in anv_pipeline_hash_compute().
v4: Move actual workaround code from brw_fs.c -> anv_pipeline.c.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6171
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15708>
2022-04-26 13:21:43 +00:00
Lionel Landwerlin fe413962b4 anv: skip acceleration structure in binding table emission
With mutable descriptor types, we can end up in a situation where a
binding can be, for instance, both a UBO and an acceleration
structure.

While we can promote the UBO to a binding table entry and the shader
can use it, this isn't true of acceleration structures that have no
surface state. In that case just skip the entry. The shader is already
compiled to use the descriptor entry.

In the non mutable case, the entry will not be created by
anv_nir_apply_pipeline_layout.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 63e91148b7 ("anv: Enable VK_VALVE_mutable_descriptor_type")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15969>
2022-04-25 13:19:28 +00:00
Lionel Landwerlin b7828f56ba anv: fix acceleration structure descriptor template writes
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d258b0bf0e ("anv: Add support for binding acceleration structures")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16058>
2022-04-25 11:01:56 +00:00
Lionel Landwerlin ace22edd30 anv: remove unused enum
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16058>
2022-04-25 11:01:56 +00:00
Lionel Landwerlin 107acf5a4a intel: fixup number of threads per EU on XeHP
Computations for indexing in-memory data structures for ray queries
depend on this.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4f9141607f ("intel: Add device info for DG2")
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15925>
2022-04-25 10:06:02 +00:00
Lionel Landwerlin 5a52cfd88b anv: fix INTEL_DEBUG=sync
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3684012770 ("anv: implement DEBUG_SYNC")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16101>
2022-04-22 21:59:50 +00:00
Jason Ekstrand 2d3b3b757a anv: Clean up pipeline cache helpers a bit
Instead of having two different helpers, delete the pipeline_cache ones.
Also, instead of manually handling the cache == NULL case in every
vkCreateFooPipelines call, handle it inside the helpers.  This means
that BLORP can use them too by passing cache=NULL.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13184>
2022-04-22 19:38:52 +00:00
Jason Ekstrand 7f1e82306c anv: Switch to the new common pipeline cache
This patch is intended to be somewhat minimal.  There's a lot of cleanup
work that can be done but we'll leave that to later patches.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13184>
2022-04-22 19:38:52 +00:00
Jason Ekstrand c551f6c4df anv: Rename a fail label in CreateDevice
The rest of them are labeled with the thing they need to destroy first,
not the thing that failed.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13184>
2022-04-22 19:38:52 +00:00
Vadym Shovkoplias 785b6579ae anv: Fix geometry flickering issue when compute and 3D passes are combined
Call flush_pipeline_select_3d in CmdBeginRendering() to emit a dummy MEDIA_VFE_STATE
before switching from GPGPU to 3D.

Original commit with the fix:
bc612536 ("anv: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D")

Fixes: 3501a3f9ed ("anv: Convert to 100% dynamic rendering")
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6201
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15954>
2022-04-21 11:00:07 +00:00
Jordan Justen d257494ec4 intel/dev: Add device info for RPL-P
Cc: mesa-stable
Ref: https://patchwork.freedesktop.org/series/102701/
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16054>
2022-04-21 01:32:53 -07:00
Lionel Landwerlin a468f26ca5 anv: implement VK_EXT_primitives_generated_query
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15638>
2022-04-20 10:37:24 +03:00
Lionel Landwerlin 8ef8e72aac intel/fs: tidy up lower of ray queries
We already expect a single function.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15946>
2022-04-19 12:56:06 +00:00
Marcin Ślusarz 5dace41c10 intel/compiler: invalidate metadata in brw_nir_initialize_mue
New "if" blocks may have been inserted.

Fixes: bc4f8c073a ("intel/compiler: inject MUE initialization")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15924>
2022-04-19 11:43:55 +00:00
Marcin Ślusarz 4fddef33d5 intel/compiler: invalidate all metadata in brw_nir_lower_intersection_shader
New "if" blocks were inserted.

Fixes: 303378e1dd ("intel/rt: Add lowering for combined intersection/any-hit shaders")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15924>
2022-04-19 11:43:55 +00:00
Marcin Ślusarz 5bd3ba5b67 anv: invalidate all metadata in anv_nir_lower_ubo_loads
lower_ubo_load_instr may insert "if" blocks.

Fixes: 61749b5a15 ("anv: Add a pass for lowering A64 UBO access")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15924>
2022-04-19 11:43:55 +00:00
Lionel Landwerlin 184084e21c anv: allow getting the address of the beginning of the batch
There is no reason not to be able to get it.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 34a0ce58c7 ("anv: add a new execution mode for secondary command buffers")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15968>
2022-04-19 10:43:29 +00:00
Alexey Bozhenko 2d7d907ad1 intel/compiler: fix singleton pointer coverity warning
fix brw_kernel::stats member that was declared as a variable
but used as a pointer to array of 3 elements

CID: 1503279

Signed-off-by: Bozhenko Alexey <oleksii.bozhenko@globallogic.com>

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15975>
2022-04-19 12:36:10 +03:00
Lionel Landwerlin 3684012770 anv: implement DEBUG_SYNC
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15950>
2022-04-19 07:32:01 +00:00
Lionel Landwerlin 317512e038 anv/intel: add a new debug flag for stalling after every draw/dispatch
Useful for hang debugging. Previously Anv incorrectly used DEBUG_SYNC
for this.

v2: Update documentations for sync/stall (Jordan)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15950>
2022-04-19 07:32:01 +00:00
Lionel Landwerlin a1969fa777 anv: improve INTEL_DEBUG for submit
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15950>
2022-04-19 07:32:01 +00:00
illiliti 67af7e2b40 Use proper types for meson objects
Fix invalid usage of meson objects which violates official meson
specification and thus breaks muon, an implementation of meson
written in C.

Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15715>
2022-04-18 13:03:08 +03:00
Lionel Landwerlin 04bd007757 intel/fs: require memory fence commit bit on Gfx9
Fixes a hang on Gfx9 GT1 : dEQP-VK.compute.zero_initialize_workgroup_memory.max_workgroup_memory.128

Tested-by: Mark Janes <markjanes@swizzler.org>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15596>
2022-04-17 21:24:17 +00:00
Lionel Landwerlin b07c215c35 intel: fix URB programming for GT1s
We're missing a programming restriction.

Hopefully fixing
dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_1.* on
Gfx9atoms

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6216
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>.
Tested-by: Mark Janes <markjanes@swizzler.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15596>
2022-04-17 21:24:17 +00:00
Jason Ekstrand ad0dc8e4ab intel/compiler: Set lower_fisnormal
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15985>
2022-04-16 00:26:43 +00:00
Lionel Landwerlin 7be6632f7d anv: use shadow surface for stencil input attachment on gfx7
This fixes a number of tests like :
  dEQP-VK.renderpass*.suballocation.multisample.s8_uint.*
  dEQP-VK.renderpass*.suballocation.multisample.separate_stencil_usage.d24_unorm_s8_uint.*.test_stencil
  dEQP-VK.renderpass*.suballocation.multisample.d24_unorm_s8_uint.*
  dEQP-VK.renderpass*.suballocation.multisample.d32_sfloat_s8_uint.*

Because the driver asserts when generating RENDER_SURFACE_STATE with a
8 Valign value for stencil buffer (only 2 & 4 are supported).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12670>
2022-04-15 09:46:40 +03:00
Lionel Landwerlin 08f3950d6b anv: stop using old entrypoint/struct/enum names for 1.3
v2: More replacements

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15920>
2022-04-13 21:13:56 +00:00
Lionel Landwerlin e11bedb9f5 intel/fs: add a note on possible optimization of root node address
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15910>
2022-04-13 11:24:49 +00:00
Lionel Landwerlin 9c0805ef91 intel/fs: fix metadata preserve on trace_ray intrinsic
c78be5da30 ("intel/fs: lower ray query intrinsics") introduced a
helper function using nir_(push|pop)_if which invalidated dominance &
block_index for the replacement of nir_intrinsic_rt_trace_ray.

We can still keep dominance/block_index metadata for the lowering of
nir_intrinsic_rt_execute_callable though.

This change uses 2 different lowering function with correct metadata
preservation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c78be5da30 ("intel/fs: lower ray query intrinsics")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15910>
2022-04-13 11:24:49 +00:00
Jason Ekstrand 69b5424ea4 intel/nir: Lower 8 and 16-bit bitwise unops
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15829>
2022-04-12 23:19:38 +00:00
Jason Ekstrand a482877c70 intel/fs: Implement 16-bit [ui]mul_high
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15829>
2022-04-12 23:19:38 +00:00
Mykhailo Skorokhodov 9c7e750ffe intel/fs: Enable b2f(inot(a)) and b2i(inot(a)) optimization for Gfx12+
The commit enables the optimization for Intel Gfx12+ graphics.

Tigerlake
```
total instructions in shared programs: 1289326 -> 1289015 (-0.02%)
instructions in affected programs: 37841 -> 37530 (-0.82%)
helped: 78
HURT: 9
helped stats (abs) min: 1 max: 26 x̄: 4.69 x̃: 3
helped stats (rel) min: 0.10% max: 12.50% x̄: 2.07% x̃: 1.21%
HURT stats (abs)   min: 1 max: 18 x̄: 6.11 x̃: 4
HURT stats (rel)   min: 0.16% max: 1.95% x̄: 0.94% x̃: 0.61%
95% mean confidence interval for instructions value: -4.95 -2.20
95% mean confidence interval for instructions %-change: -2.34% -1.18%
Instructions are helped.

total cycles in shared programs: 105606388 -> 105606442 (<.01%)
cycles in affected programs: 620119 -> 620173 (<.01%)
helped: 49
HURT: 28
helped stats (abs) min: 2 max: 3618 x̄: 228.63 x̃: 12
helped stats (rel) min: 0.02% max: 23.31% x̄: 4.60% x̃: 1.11%
HURT stats (abs)   min: 1 max: 2142 x̄: 402.04 x̃: 29
HURT stats (rel)   min: 0.01% max: 36.42% x̄: 5.01% x̃: 0.46%
95% mean confidence interval for cycles value: -151.80 153.20
95% mean confidence interval for cycles %-change: -3.00% 0.79%
Inconclusive result (value mean confidence interval includes 0).
```

Related-to: 7725d60938
Signed-off-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14017>
2022-04-12 10:55:05 +00:00
Marcin Ślusarz 65600a34c2 anv: initialize 3DMESH_1D.ExtendedParameter0 when ExtendedParameter0Present
When IndirectParameterEnable==true it's not actually used by the hardware,
but if it's not initialized and INTEL_DEBUG=bat is set, then Valgrind complains.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15850>
2022-04-12 09:10:31 +00:00
Marcin Ślusarz f844ce66c8 anv: fix push constant lowering for task/mesh
Fixes: a6031cd9bd ("anv: fix push constant lowering with bindless shaders")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15850>
2022-04-12 09:10:31 +00:00
Francisco Jerez e858da39e5 intel/perf: Fix OA report accumulation on Gfx12+.
The intel_perf_query path used for performance queries on GL was
passing a bogus "end" pointer to intel_perf_query_result_accumulate(),
causing it to accumulate garbage values.  This was causing the values
of many performance counters to be corrupted.

The "end" pointer was incorrect because the current code was assuming
that different OA reports were located TOTAL_QUERY_DATA_SIZE bytes
apart, which is a hard-coded preprocessor define.  However recent
(Gfx12+) hardware generations use a variable query size determined by
the query layout.  Use the size derived from it instead, and remove
the stale define.

Fixes: 3c51325025 ("intel/perf: switch query code to use query layout")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15783>
2022-04-12 00:11:47 +00:00
Kenneth Graunke b05ac36f01 intel/genxml: Add SAMPLER_MODE bits for enabling Small PL on Icelake
This enables a lower power mode in the sampler hardware in certain
common scenarios.  On Tigerlake, SAMPLER_MODE is not programmable by
userspace but the kernel already sets this bit for us.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15628>
2022-04-11 19:17:07 +00:00
Kenneth Graunke e3defe7ae7 intel/genxml: Delete SAMPLER_MODE register definition on Gfx12+
While this register still exists, it's no longer a per-context register.
Instead, on Gfx12+, SAMPLER_MODE exists per dual-subslice and is
accessed as a "multicast" register, where you write control which
version is accessed by the "steering control register".

At any rate, userspace cannot write it any longer, and so there's not
much point to it existing in our genxml (which was missing most of the
fields anyway).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15628>
2022-04-11 19:17:07 +00:00
Kenneth Graunke 8092704705 intel/genxml: Add new "Low Quality Filter" field on Gfx12+.
This allows the sampler to perform faster filtering of 8-bit UNORM
textures by filtering them at a different precision.  The filtering
is intended to still be OpenGL and DirectX spec compliant.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15628>
2022-04-11 19:17:07 +00:00
Kenneth Graunke 9a70385e2b intel/genxml: Add SAMPLER_STATE::Allow Low Quality LOD Calculation field
This allows the hardware to perform a faster LOD calculation in many
simple cases.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15628>
2022-04-11 19:17:07 +00:00
Vitalii.Lomaka 1407a4db69 intel/batch-decoder: Fix uninitialized scalar variables
CID: 1498516
CID: 1498560

Signed-off-by: Vitalii Lomaka <vitalii.lomaka@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15685>
2022-04-08 18:35:34 +00:00
Benjamin Cheng 0666b7fecc anv: drop from_wsi bit from anv_image
It was originally introduced in ca791f5c but it was never actually set
anywhere. It doesn't serve any purpose other than some sanity checking
so let's clean it up for now.

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15799>
2022-04-07 18:46:50 +00:00
Ian Romanick b5fa43952a intel/fs: Better handle constant sources of FS_OPCODE_PACK_HALF_2x16_SPLIT
I noticed that a *LOT* of fragment shaders in Shadow of the Tomb Raider,
for instance, end up with a sequence of NIR like:

    vec1 32 ssa_2 = load_const (0x00000000 = 0.000000)
    ...
    vec1 32 ssa_191 = pack_half_2x16_split ssa_188, ssa_2
    vec1 32 ssa_192 = pack_half_2x16_split ssa_189, ssa_2
    vec1 32 ssa_193 = pack_half_2x16_split ssa_190, ssa_2

This results in an assembly sequence like:

    mov(8)          g28<1>UD        0x00000000UD
    mov(8)          g21<2>HF        g28<8,8,1>F
    shl(8)          g21<1>UD        g21<8,8,1>UD    0x00000010UD
    mov(8)          g21<2>HF        g25<8,8,1>F
    mov(8)          g19<2>HF        g28<8,8,1>F
    shl(8)          g19<1>UD        g19<8,8,1>UD    0x00000010UD
    mov(8)          g19<2>HF        g23<8,8,1>F
    mov(8)          g20<2>HF        g28<8,8,1>F
    shl(8)          g20<1>UD        g20<8,8,1>UD    0x00000010UD
    mov(8)          g20<2>HF        g24<8,8,1>F

After this commit, the generated assembly is:

    mov(8)          g21<1>UD        0x00000000UD
    mov(8)          g21<2>HF        g23<8,8,1>F
    mov(8)          g19<1>UD        0x00000000UD
    mov(8)          g19<2>HF        g17<8,8,1>F
    mov(8)          g20<1>UD        0x00000000UD
    mov(8)          g20<2>HF        g18<8,8,1>F

Tiger Lake, Ice Lake, Skylake, and Haswell had similar results. (Ice Lake shown)
total instructions in shared programs: 20119086 -> 20119034 (<.01%)
instructions in affected programs: 9056 -> 9004 (-0.57%)
helped: 8
HURT: 0
helped stats (abs) min: 2 max: 16 x̄: 6.50 x̃: 4
helped stats (rel) min: 0.29% max: 1.75% x̄: 1.00% x̃: 0.98%
95% mean confidence interval for instructions value: -11.01 -1.99
95% mean confidence interval for instructions %-change: -1.56% -0.44%
Instructions are helped.

total cycles in shared programs: 861019414 -> 861021044 (<.01%)
cycles in affected programs: 279862 -> 281492 (0.58%)
helped: 4
HURT: 2
helped stats (abs) min: 6 max: 936 x̄: 239.00 x̃: 7
helped stats (rel) min: 0.03% max: 8.13% x̄: 2.09% x̃: 0.09%
HURT stats (abs)   min: 18 max: 2568 x̄: 1293.00 x̃: 1293
HURT stats (rel)   min: 0.36% max: 1.14% x̄: 0.75% x̃: 0.75%
95% mean confidence interval for cycles value: -972.56 1515.89
95% mean confidence interval for cycles %-change: -4.77% 2.49%
Inconclusive result (value mean confidence interval includes 0).

Broadwell
total instructions in shared programs: 17812327 -> 17812263 (<.01%)
instructions in affected programs: 9867 -> 9803 (-0.65%)
helped: 8
HURT: 0
helped stats (abs) min: 2 max: 28 x̄: 8.00 x̃: 4
helped stats (rel) min: 0.32% max: 1.80% x̄: 1.00% x̃: 0.95%
95% mean confidence interval for instructions value: -15.46 -0.54
95% mean confidence interval for instructions %-change: -1.54% -0.47%
Instructions are helped.

total cycles in shared programs: 904768620 -> 904773291 (<.01%)
cycles in affected programs: 454799 -> 459470 (1.03%)
helped: 4
HURT: 4
helped stats (abs) min: 36 max: 586 x̄: 344.50 x̃: 378
helped stats (rel) min: 0.47% max: 4.04% x̄: 2.01% x̃: 1.77%
HURT stats (abs)   min: 1 max: 5572 x̄: 1512.25 x̃: 238
HURT stats (rel)   min: <.01% max: 2.77% x̄: 1.46% x̃: 1.53%
95% mean confidence interval for cycles value: -1122.40 2290.15
95% mean confidence interval for cycles %-change: -2.26% 1.71%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 18581 -> 18579 (-0.01%)
spills in affected programs: 323 -> 321 (-0.62%)
helped: 1
HURT: 0

total fills in shared programs: 24985 -> 24981 (-0.02%)
fills in affected programs: 1348 -> 1344 (-0.30%)
helped: 1
HURT: 0

Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown)
Instructions in all programs: 143585431 -> 143513657 (-0.0%)
Instructions helped: 14403

Cycles in all programs: 8439312778 -> 8439371578 (+0.0%)
Cycles helped: 10570
Cycles hurt: 3290

Gained: 146
Lost: 74

All of the lost and gained fossil-db shaders are SIMD32 fragment
shaders.  14,247 of the affected shaders are from Shadow of the Tomb
Raider.  154 are from Batman Arkham Origins, and the remaining two are
from Octopath Traveler.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15089>
2022-04-07 18:26:23 +00:00
Ian Romanick c08302670b intel/compiler: Fix sample_d messages on DG2
DG2 can only do sample_d and sample_d_c on 1D and 2D surfaces.  The
maximum number of gradient components and coordinate components should
be 2.  In spite of this limitation, the Bspec lists a mysterious R
component before the min_lod, so the maximum coordinate components is 3.

Fixes the following Vulkan CTS failures on DG2:

    dEQP-VK.glsl.texture_functions.texturegradclamp.isampler1d_fragment
    dEQP-VK.glsl.texture_functions.texturegradclamp.isampler2d_fragment
    dEQP-VK.glsl.texture_functions.texturegradclamp.sampler1d_fixed_fragment
    dEQP-VK.glsl.texture_functions.texturegradclamp.sampler1d_float_fragment
    dEQP-VK.glsl.texture_functions.texturegradclamp.sampler2d_fixed_fragment
    dEQP-VK.glsl.texture_functions.texturegradclamp.sampler2d_float_fragment
    dEQP-VK.glsl.texture_functions.texturegradclamp.usampler1d_fragment
    dEQP-VK.glsl.texture_functions.texturegradclamp.usampler2d_fragment

The Fixes: tag below is a bit misleading. This commit fixes some test
cases similar to ones fixed by the Fixes: commit.  I just want to make
sure this commit gets applied everywhere that commit was also applied.

Fixes: 635ed58e52 ("intel/compiler: Lower txd for 3D samplers on XeHP.")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15781>
2022-04-07 17:09:28 +00:00
Jason Ekstrand 13fc698cef anv/formats: Relax usage checks if EXTENDED_USAGE_BIT is set
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14153>
2022-04-07 15:56:33 +00:00
Lionel Landwerlin b5031bd6f7 intel/nir: don't report progress on rayqueries if no queries
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c78be5da30 ("intel/fs: lower ray query intrinsics")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15769>
2022-04-07 08:24:19 +00:00
Lionel Landwerlin 56ef501e3a blorp: disable depth bounds
Otherwise the driver setting interacts with it.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 939ddccb7a ("anv: Add support for depth bounds testing.")
Fixes: 1df871f8ff ("iris: Add support for depth bounds testing.")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15763>
2022-04-06 19:00:50 +00:00
Lionel Landwerlin 3069337144 anv: remove unused 3DSTATE_DEPTH_BOUNDS fields
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15763>
2022-04-06 19:00:50 +00:00
Lionel Landwerlin 88f77aa811 anv: disable preemption on 3DPRIMITIVE on gfx12
To workaround a push constant corruption issue.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5963
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5662
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15753>
2022-04-06 12:51:15 +00:00
Vadym Shovkoplias 04a6693871 anv: fix EXT_depth_clip_control
This fixes arb_clip_control-clip-control and depth_clamp piglit
tests on zink.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6186
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15561>
2022-04-06 13:26:52 +03:00
Jason Ekstrand 29b8097408 anv: Enable VK_EXT_debug_utils
It's implemented in common code as long as you use vk_command_buffer.

Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15560>
2022-04-06 01:18:23 +00:00
Mike Blumenkrantz 6fd344ff98 anv: expose VK_EXT_image_2d_view_of_3d
sampling only available on gen9+

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15754>
2022-04-05 20:30:31 +00:00
Omar Akkila 4208895175 ci: bump VK-GL-CTS to 1.3.1.1
Signed-off-by: Omar Akkila <omar.akkila@collabora.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15668>
2022-04-04 23:04:33 +00:00
Jason Ekstrand 94ce812497 anv: Advertise two more formats
These both require swizzling so border colors won't work.  However,
they're conveniently in the list of formats for which custom border
colors require you to specify a format in the sampler.  That list
constists of:

    - VK_FORMAT_B4G4R4A4_UNORM_PACK16
    - VK_FORMAT_B5G6R5_UNORM_PACK16
    - VK_FORMAT_B5G5R5A1_UNORM_PACK16

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6226
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15624>
2022-04-04 21:42:23 +00:00
Jason Ekstrand e32b9e5c3f anv: Generalize border color swizzles
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15624>
2022-04-04 21:42:23 +00:00
Jason Ekstrand 54509d27d9 anv: Disallow blending on swizzled formats
Fixes: c20f78dc5d ("anv: Support swizzled formats.")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15624>
2022-04-04 21:42:23 +00:00
Jason Ekstrand 257a20f40d intel/isl: Add a helper for swizzling color values
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15624>
2022-04-04 21:42:23 +00:00
Ian Romanick 7fd1955412 nir: intel/compiler: Lower TXD on array surfaces on DG2+
DG2 can only do sample_d and sample_d_c on 1D and 2D surfaces.  Cube
maps and 3D surfaces were already handled, but 1D array and 2D array
surfaces were not.

Fixes the following Vulkan CTS failures on DG2:

    dEQP-VK.glsl.texture_functions.texturegradclamp.isampler1darray_fragment
    dEQP-VK.glsl.texture_functions.texturegradclamp.isampler2darray_fragment
    dEQP-VK.glsl.texture_functions.texturegradclamp.sampler1darray_fixed_fragment
    dEQP-VK.glsl.texture_functions.texturegradclamp.sampler1darray_float_fragment
    dEQP-VK.glsl.texture_functions.texturegradclamp.sampler2darray_fixed_fragment
    dEQP-VK.glsl.texture_functions.texturegradclamp.sampler2darray_float_fragment
    dEQP-VK.glsl.texture_functions.texturegradclamp.usampler1darray_fragment
    dEQP-VK.glsl.texture_functions.texturegradclamp.usampler2darray_fragment

The Fixes: tag below is a bit misleading. This commit adds another
lowering, similar to the one in the Fixes: commit, that probably should
have been added at the same time.  I just want to make sure this commit
gets applied everywhere that commit was also applied.

Fixes: 635ed58e52 ("intel/compiler: Lower txd for 3D samplers on XeHP.")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15681>
2022-03-31 12:59:18 -07:00
Rohan Garg d876abeaa8 anv: Drop dead code in anv_UpdateDescriptorSets
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15666>
2022-03-30 15:19:47 +02:00
Lionel Landwerlin 684a4ea30c intel/clc: fix missing pointer write
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 346a7f14fb ("intel/compiler: Add code for compiling CL-style SPIR-V kernels")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15611>
2022-03-30 07:56:25 +00:00
Mike Blumenkrantz 65ec846f77 intel/isl: fix 2d view of 3d textures
according to KHR_gl_texture_3D_image:

    If <target> is EGL_GL_TEXTURE_3D_KHR, <buffer> must be the name of a
    complete, nonzero, GL_TEXTURE_3D (or equivalent in GL extensions) target
    texture object, cast
    into the type EGLClientBuffer.  <attr_list> should specify the mipmap
    level (EGL_GL_TEXTURE_LEVEL_KHR) and z-offset (EGL_GL_TEXTURE_ZOFFSET_KHR)
    which will be used as the EGLImage source; the specified mipmap level must
    be part of <buffer>, and the specified z-offset must be smaller than the
    depth of the specified mipmap level.

thus a 2d view of a 3d surface is not only legal, it's part of the spec and
must be supported when available

cc: mesa-stable

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15584>
2022-03-29 21:44:51 +00:00
Kenneth Graunke 8831cb38aa anv: Stop updating STATE_BASE_ADDRESS on XeHP
Now that we're using 3DSTATE_BINDING_TABLE_POOL_ALLOC to set the base
address for the binding table pool separately from surface states, we
don't actually need to update surface state base address anymore.

Instead, we can just set STATE_BASE_ADDRESS once at context creation,
and never bother updating it again, saving some heavyweight flushes
and freeing us from the need for address offsetting trickery.

This patch was originally written by Jason Ekstrand, with fixes from
Lionel Landwerlin, but was targeting Icelake.  Doing it there requires
additional changes (15:5 -> 18:8 binding table pointer formats) which
also involve some trade-offs, whereas the XeHP change is purely a win,
so we'll do it here first.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15616>
2022-03-29 20:45:59 +00:00
Kenneth Graunke 1967fd3b10 intel/compiler: Call inst->resize_sources before setting the sources
You should probably resize the sources array before accessing entries
that might be out of bounds.  inst->resize_sources() always allocates
enough space for at least 3 sources, so this is really only an issue
when there are 4+ sources.

Fixes: a920979d4f ("intel/fs: Use split sends for surface writes on gen9+")
Fixes: 4f86a70599 ("intel/fs: Lower DW untyped r/w messages to LSC when available")
Fixes: d372abe397 ("intel/fs: Add surface OWORD BLOCK opcodes")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15632>
2022-03-29 13:06:17 -07:00
Kenneth Graunke 9bc97e4fc1 intel/decoder: Fix decoder handling of binding table pool alloc on XeHP
3DSTATE_BINDING_TABLE_POOL_ALLOC no longer has a "Binding Table Pool
Enable" bit.  It is always enabled.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15625>
2022-03-29 02:35:54 -07:00
Georg Lehmann 922916bf64 nir: Move lower_usub_sat64 to nir_lower_int64_options.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15421>
2022-03-28 20:02:52 +00:00
Kenneth Graunke 823745dc27 intel/compiler: Use nir_opt_uniform_atomics()
In general, an atomic intrinsic may perform separate atomics for every
enabled SIMD channel, as each channel may operate on different memory.

However, an extremely common case is for all channels to access the same
memory location.  In this case, we can simply perform a reduction/scan
across the subgroup, and perform one atomic for the whole subgroup,
rather than one per channel.  For example, if an intrinsic says to take
the minimum value of the existing memory and the value in each channel,
we can do a thread-local minimum of all enabled channels, then do a
single atomic to take the minimum of that and the existing memory.

Our hardware doesn't optimize the case where multiple channels ask for
atomics on the same memory location; it assumes the compiler will do so.

nir_opt_uniform_atomics() uses divergence analysis to detect this case,
adds the necessary subgroup operations, and moves the atomic inside a
conditional that disables all but a single invocation.  It even detects
cases where the shader code already performs this kind of optimization,
and avoids doing it a second time.

This may not be the optimal solution for us.  In the backend, we could
detect this case and emit send(1) instructions with NoMask, rather than
generating if...send(16)...endif, and a lot of unnecessary ALU ops.  But
it's simple to do, reuses the same path as ACO, and still provides most
of the benefit by cutting up to 16x atomics down to a single atomic,
which is more merciful to the memory bus.

Improves performance of Shadow of the Tomb Raider by 5.5% on XeHP.

Improves performance of a customer-internal benchmark on XeHP at
3840x2160 and low settings by approximately 30%.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>
2022-03-26 00:28:19 +00:00
Kenneth Graunke 49ef23f4a6 intel/compiler: Convert to LCSSA and use divergence analysis.
We'll use this more shortly.  For now, enable it to separately in case
anything bisects to this.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>
2022-03-26 00:28:19 +00:00
Kenneth Graunke b3942beecf intel/compiler: Set divergence analysis options
Although we don't use divergence analysis yet, we've had several
work-in-progress series that make use of it.  We may as well set
our options so that those series can assume they're in place.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>
2022-03-26 00:28:19 +00:00
Kenneth Graunke 6fa66ac228 intel/compiler: Implement nir_intrinsic_last_invocation
We haven't exposed this intrinsic as it doesn't directly correspond to
anything in SPIR-V.  However, it's used internally by some NIR passes,
namely nir_opt_uniform_atomics().

We reuse most of the infrastructure in brw_find_live_channel, but with
LZD/ADD instead of FBL.  A new SHADER_OPCODE_FIND_LAST_LIVE_CHANNEL is
like SHADER_OPCODE_FIND_LIVE_CHANNEL but from the other side.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>
2022-03-26 00:28:19 +00:00
Caio Oliveira c32d386ce2 intel/compiler: Inline TUE map computation into TUE Input lowering
Refactor since the TUE compute function is simpler now and the
comments make sense being near the lowering.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15022>
2022-03-25 23:29:19 +00:00
Caio Oliveira c36ae42e4c intel/compiler: Use nir_var_mem_task_payload
Instead of reusing the in/out slot mechanism, use a separated NIR
variable mode.  This will make easier later to implement staging the
output in shared memory (and storing all at the end to the URB).

Note to get 64-bit type support we currently rely on the
brw_nir_lower_mem_access_bit_sizes() pass.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15022>
2022-03-25 23:29:19 +00:00
Boris Brezillon 49c8b93288 anv: Stop using VK_OUTARRAY_MAKE()
We're trying to replace VK_OUTARRAY_MAKE() by VK_OUTARRAY_MAKE_TYPED()
so people don't get tempted to use it and make things incompatible with
MSVC (which doesn't support typeof()).

Suggested-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15522>
2022-03-25 11:00:03 +00:00
Caio Oliveira f82731d0d7 intel/fs: Fix IsHelperInvocation for the case no discard/demote are used
Use emit_predicate_on_sample_mask() helper that does check where to
get the correct mask depending on whether discard/demote was used or
not.

Fixes: 45f5db5a84 ("intel/fs: Implement "demote to helper invocation"")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15400>
2022-03-25 08:20:27 +00:00
Caio Oliveira bb311c22df intel/fs: Initialize the sample mask in flags register when using demote
Without this change, a check for "is helper invocation" could read
uninitialized values.

Fixes: 45f5db5a84 ("intel/fs: Implement "demote to helper invocation"")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15400>
2022-03-25 08:20:27 +00:00
Lionel Landwerlin 8cdd5647c6 anv: don't store sample location sample count
This information should match the current pipeline sample count.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15310>
2022-03-24 10:49:07 +00:00
Lionel Landwerlin 6f5f817c0f anv: fix dynamic sample locations on Gen7/7.5
3DSTATE_MULTISAMPLE should be baked into the pipeline if not dynamic.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 27ee40f4c9 ("anv: Add support for sample locations")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15310>
2022-03-24 10:49:07 +00:00
Lionel Landwerlin 8ad78671b3 anv: use local dynamic pointer more
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15310>
2022-03-24 10:49:07 +00:00
Lionel Landwerlin 1d250b7b95 anv: fix color write enable interaction with color mask
Color writes & color masks occupy the same fields in the BLEND_STATE
structure. So we need to store color mask (which are not dynamic) on
the pipeline to merge that information with color writes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b15bfe92f7 ("anv: implement VK_EXT_color_write_enable")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6111
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15310>
2022-03-24 10:49:07 +00:00
Lionel Landwerlin a4f502de32 anv: fix VK_DYNAMIC_STATE_COLOR_WRITE_ENABLE_EXT state
First, there is a problem if you do the following

 vkCmdSetColorWriteEnableEXT(attachmentCount = 8)
 vkCmdBindPipeline(GFX, with attachmentCount = 4)
 vkCmdDraw()
 vkCmdBindPipeline(GFX, with attachmentCount = 8)
 vkCmdDraw()

Because in the dynamic state emission code we rely on the first
pipeline to figure the number of BLEND_STATE entries to prepare. This
is wrong, we should fill all entries so that the dynamic state works
regardless of the number of attachments in the pipeline. With regard
to the dynamic values, we should retain enable/disable values that do
not concern the current pipeline.

Second, 3DSTATE_WM was not always reemitted when the pipeline changed.
But since it is not emitted as part of the pipeline, this results in
inconsistent state being programmed.

Third, we end up disabling the fragment stage completely in some
cases. And that is programming the pipeline inconsistently and
triggering a hang on TGL.

v2: Fix comment (Tapani)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b15bfe92f7 ("anv: implement VK_EXT_color_write_enable")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15310>
2022-03-24 10:49:07 +00:00
Lionel Landwerlin f348103fce anv: fix dynamic state emission
The problem is that we missed looking at pipeline changes. Pipelines
hold bits of dynamic states and when it changes we might need to
reemit a packet.

v2: fix comment (Tapani)
    Add missing anv_cmd_buffer_needs_dynamic_state() use (Tapani)

Cc: mesa-stable
Fixes: 505d176a8e ("anv: disable baked in pipeline bits from dynamic emission path")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15310>
2022-03-24 10:49:07 +00:00
Lionel Landwerlin 1cd7d6ce37 anv: allow baking of 3DSTATE_DEPTH_BOUNDS in pipeline batch
If it's not dynamic.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15310>
2022-03-24 10:49:07 +00:00
Mark Janes 85e314db5d Revert "intel/fs: handle interpolation modes for at_sample and at_offset too"
This reverts commit 5afbb0e730.

Closes: #6198
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15534>
2022-03-23 11:03:47 -07:00
Daniel Schürmann 832d67e99d nir: rename nir_src_is_dynamically_uniform to nir_src_is_always_uniform
As this function doesn't check for any control-flow
dependence, it only returns true for statically
(or globally) uniform values.
The same holds true for is_binding_dynamically_uniform()
in nir_opt_gcm().
Rename to better reflect that property.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14994>
2022-03-23 14:02:08 +00:00
Lionel Landwerlin df059c6781 intel/clc: deal with SPIRV-Tools linker new behavior
We're now required to provide all modules to link at the same SPIRV
version.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15486>
2022-03-23 10:24:31 +00:00
Lionel Landwerlin 21aa1d3de1 intel/clc: fixup shared memory offsets
We're running the io lowering twice so need to reset some fields so
the offset don't go over what is really needed.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15486>
2022-03-23 10:24:31 +00:00
Lionel Landwerlin de9c2312ea intel/clc: compile fix
Fixes: c15bf88f01 ("intel: Add a little OpenCL C compiler binary")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15486>
2022-03-23 10:24:31 +00:00
Lionel Landwerlin a7f264f33a intel/clc: add option to printout kernel prog_data
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15486>
2022-03-23 10:24:31 +00:00
Lionel Landwerlin 451f907d16 intel/kernel: enable linkage cap
Linkage should have happened before this in intel_clc. This just
silence a parser warning.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15486>
2022-03-23 10:24:31 +00:00
Lionel Landwerlin bb4ff3e6e2 intel/kernel: enable groups caps
This is roughly the same as SpvCapabilityGroupNonUniform
(subgroup_basic).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15486>
2022-03-23 10:24:31 +00:00
Lionel Landwerlin 218db59b25 intel/dev: default to B stepping on DG2 for offline compiler
Most people won't have A0 stepping now.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15486>
2022-03-23 10:24:31 +00:00
Lionel Landwerlin dc8c77cc8f anv: implement EXT_tooling_info
This is required by 1.3. Fixes CTS with newer loader :

   dEQP-VK.api.tooling_info.validate_getter

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: df8ac77af8 ("anv: Advertise Vulkan 1.3")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15491>
2022-03-23 09:51:57 +00:00
Iván Briano 5afbb0e730 intel/fs: handle interpolation modes for at_sample and at_offset too
Cc: mesa-stable

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15424>
2022-03-22 19:05:05 +00:00
Kenneth Graunke 49dd707ca2 intel: Add INTEL_DEBUG=noccs alias for INTEL_DEBUG=norbc
When CCS compression first came out on Skylake, we referred to it as
"renderbuffer compression", or RBC for short.  However, that name has
long since fallen out of favor, and we refer to it as CCS nearly
everywhere.

This patch renames DEBUG_NO_RBC to DEBUG_NO_CCS inside the codebase
for clarity, and adds INTEL_DEBUG=noccs.  The legacy INTEL_DEBUG=norbc
name continues to work, because it's one line of code and having both
names makes our lives easier in the interim.

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15447>
2022-03-22 06:23:10 +00:00
Nanley Chery da82358a52 ci/anv: Changes from enabling 8/16bpp CCS more
- Fixes in dEQP-VK.dynamic_rendering.suballocation.multisample_resolve.*
- Fails in dEQP-VK.drm_format_modifiers.export_import.*

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15420>
2022-03-21 17:36:10 +00:00
Nanley Chery e2f0c859c2 Revert "anv: Disable CCS_E for some 8/16bpp copies on TGL+"
This reverts commit d68b2db89c.

With this change, no regressions have been observed with the
dEQP-VK.synchronization* test group. There are regressions with
dEQP-VK.drm_format_modifiers.export_import.*, but those have been
root-caused to be test issues (see 3575).

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6125
Fixes: 57445adc89 ("anv: Re-enable CCS_E on TGL+")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15420>
2022-03-21 17:36:10 +00:00
Lionel Landwerlin 9ca29c687b intel/clc: disable tool prior to Gfx12.5 platforms
This tool is currently only aimed at Gfx version 12.5+ with
COMPUTE_WALKER. We could make it work on earlier platforms but they
require pushing gl_SubgroupInvocation and the CLC code is missing the
back-end compiler set-up bits for that.

v2: Commit description by Jason

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>
2022-03-21 11:26:44 +00:00
Lionel Landwerlin c735c4ca85 intel/clc: specify supported extensions
Having everything ever known to man is confusing our SPIRV parser.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>
2022-03-21 11:26:44 +00:00
Lionel Landwerlin a29b1d5716 intel/clc: allow producing SPIRV files
Useful to debug the parser.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>
2022-03-21 11:26:44 +00:00
Lionel Landwerlin 77e929a527 intel/clc: allow multiple CL files to be compiled together
v2: use util_dynarray_append() (Jason)
    identation fixes (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>
2022-03-21 11:26:44 +00:00
Jason Ekstrand c15bf88f01 intel: Add a little OpenCL C compiler binary
v2: Fix up indentation (Marcin)
    s/gen/gfx/ (Marcin)
    Deal with fd closing in success/fail cases (Marin)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>
2022-03-21 11:26:44 +00:00
Lionel Landwerlin ec6e247a40 intel/fs: handle inline data on OpenCL style kernels
This is for Gfx12.5 with the COMPUTE_WALKER::Inline Data payload. We
do this in a similar way to the compute kernels.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>
2022-03-21 11:26:44 +00:00
Jason Ekstrand 4d8e788663 intel/kernel: Implement some Intel built-in functions
v2: Document mangled function names (Marcin)
    Fixup progress & metadata (Marcin)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>
2022-03-21 11:26:44 +00:00
Jason Ekstrand 346a7f14fb intel/compiler: Add code for compiling CL-style SPIR-V kernels
v2: simplify INTEL_DEBUG expressions (Marcin)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>
2022-03-21 11:26:44 +00:00
Jason Ekstrand 8c11912582 intel/debug: Dump KERNEL source when INTEL_DEBUG=cs
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>
2022-03-21 11:26:44 +00:00
Jason Ekstrand d1bddfba6b intel/nir: Add optimizations to help OpenCL-style kernels
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>
2022-03-21 11:26:44 +00:00
Lionel Landwerlin 4ec5da7270 intel/nir/fs: replace COMPUTE || KERNEL by gl_shader_stage_is_compute()
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>
2022-03-21 11:26:44 +00:00
Jason Ekstrand 8d7cbe026e anv: Drop GetPhysicalDeviceQueueFamilyProperties
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15459>
2022-03-18 10:10:37 -05:00
Jason Ekstrand cdaa3a899c anv: Use layerCount for clears and transitions in BeginRendering
The Vulkan spec was recently clerified to say that transitions only
happen to the bound layers:

    "Automatic layout transitions apply to the entire image subresource
    attached to the framebuffer. If multiview is not enabled and the
    attachment is a view of a 1D or 2D image, the automatic layout
    transitions apply to the number of layers specified by
    VkFramebufferCreateInfo::layers. If multiview is enabled and the
    attachment is a view of a 1D or 2D image, the automatic layout
    transitions apply to the layers corresponding to views which are
    used by some subpass in the render pass, even if that subpass does
    not reference the given attachment."

This is in the context of render passes but it applies to dynamic
rendering because the implicit layout transition stuff is a Mesa pseudo-
extension and inherits those rules.

For clears, the Vulkan spec says:

    "renderArea is the render area that is affected by the render pass
    instance. The effects of attachment load, store and multisample
    resolve operations are restricted to the pixels whose x and y
    coordinates fall within the render area on all attachments. The
    render area extends to all layers of framebuffer."

Again, this is in the context of render passes but the same principals
apply to dynamic rendering where the layerCount and renderArea are
specified as part of the vkCmdBeginRendering() call.

Fixes: 3501a3f9ed ("anv: Convert to 100% dynamic rendering")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15441>
2022-03-18 09:27:15 -05:00
Lionel Landwerlin 8b71118aa0 anv: flush tile cache with query copy command
This fixes the test_resolve_non_issued_query_data vkd3d-proton test.

This change is required on TGL+ (maybe ICL?) because on all platforms
3D pipeline writes are not coherent with CS. On previous platform we
fixed this by flushing the render cache to make sure data is visble to
CS before it writes to memory. But on more recently platforms,
flushing the render cache leaves the data in the tile cache which is
still not coherent with CS, so we need to flush that one too.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14552>
2022-03-18 10:02:33 +00:00
Lionel Landwerlin 4e30da7874 anv: emit timestamp & availability using the same part of CS
We've run into issues before where PIPE_CONTROL races MI_STORE_*
commands. So make sure we emit the availability using the same type of
CS so that memory writes are properly ordered.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14552>
2022-03-18 10:02:33 +00:00
Ian Romanick 19330eeb1d intel/fs: Force destination types on DP4A instructions
Most of the time, this doesn't matter.  On the versions with _sat, if
the destination type is incorrect, the clamping will not happen
correctly.

Fixes the following CTS tests:

dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_packed_ss_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_packed_su_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_packed_us_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_packed_uu_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_ss_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_su_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_us_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_uu_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_packed_ss_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_packed_su_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_packed_us_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_packed_uu_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_ss_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_su_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_us_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_uu_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_packed_ss_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_packed_su_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_packed_us_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_packed_uu_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_ss_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_su_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_us_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_uu_v4i8_out32

v2: Update anv-tgl-fails.txt.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Fixes: 0f809dbf40 ("intel/compiler: Basic support for DP4A instruction")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15417>
2022-03-17 22:39:04 +00:00
Felix DeGrood 3bd9b25060 intel: change INTEL_MEASURE output to microseconds
Change time event durations from ns -> us. Microseconds are easier
to work with.

Reviewed-by: Mark Janes <markjanes@swizzler.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15348>
2022-03-17 22:14:42 +00:00
Felix DeGrood 2e6d14cc7b intel: increase INTEL_MEASURE batch/buffer sizes
Increase default batch_size and buffer_size from 16 -> 64. These
are sized to be big enough to service most games. As games have
become more demanding, larger sizes become necessary.

Reviewed-by: Mark Janes <markjanes@swizzler.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15348>
2022-03-17 22:14:42 +00:00
Felix DeGrood e0c9032db8 anv: add indirect draw to INTEL_MEASURE
Reviewed-by: Mark Janes <markjanes@swizzler.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15348>
2022-03-17 22:14:42 +00:00
Lionel Landwerlin d68b9f0e6b anv: zero-out anv_batch_bo
anv_batch_bo has a length field that we use to flush cachelines. Not
having that field initialized properly leads us to access out of bound
memory.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15425>
2022-03-17 15:56:14 +00:00
Lionel Landwerlin 78acae3865 anv: fix variable shadowing
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 83fee30e85 ("anv: allow multiple command buffers in anv_queue_submit")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15425>
2022-03-17 15:56:14 +00:00
Sagar Ghuge 2e336c602d intel/fs: Add Wa_14014435656
For any fence greater than local scope, always set flush type to at
least invalidate so that fence goes on properly.

v2: Fixup condition to trigger workaround (Lionel)

v3: Simplify workaround (Curro)

v4: Don't drop the existing WA (Curro)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: 22.0 <mesa-stable>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14947>
2022-03-17 14:18:02 +00:00
Sagar Ghuge 6031ad4bf6 intel/fs: Add Wa_22013689345
v2: Use a simpler framework (Lionel)

v3: Rebase, add task/mesh (Lionel)

v4: Fixup fence exec size (SIMDX -> SIMD1)

v5: Fix invalidate_analysis, add finishme comment (Curro)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: 22.0 <mesa-stable>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14947>
2022-03-17 14:18:02 +00:00
Anuj Phogat 5cc4075f95 anv, iris: Add Wa_16011411144 for DG2
v2: Use CS_STALL instead of FLUSH_ENABLE in Iris (Lionel)
    Add missing CS_STALL after SO_BUFFER change in Anv (Lionel)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: 22.0 <mesa-stable>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14947>
2022-03-17 14:18:02 +00:00
Jason Ekstrand 893fa30afe anv: Include scissors in viewport calculations
It's tricky to always get the render area to the viewport code.  In
particular, it's not provided to secondary command buffers as part of
the inheritance info so we have to bend over backwards and look for a
framebuffer.  With VK_KHR_dynamic_rendering, there is no framebuffer and
this approach won't work and we'll need something better if we want
competent guardbands in secondary command buffers.

The good news is that any client that's sloppily rendering and trusting
the clipper to keep things inside the render area will set a scissor and
that's something they have to set inside the secondary.  We can dig
through the scissor state and also include the corresponding scissor (if
any) and use that for our render area.  This should give us the same
secondary command buffer performance with VK_KHR_dynamic_rendering.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14961>
2022-03-16 13:13:45 -05:00
Jason Ekstrand b4e38e174f anv: Move viewport/scissor emit to genX_cmd_buffer.c
There's never been a particularly good reason to stick these in gfx7/8.
We mostly did it to deduplicate the binary a bit but this shouldn't emit
all that much code.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14961>
2022-03-16 13:13:45 -05:00
Jason Ekstrand 2c04373c45 anv: Calculate the real guardband based on render area
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14961>
2022-03-16 13:13:45 -05:00
Jason Ekstrand 12d815bcac intel/guardband: Take min/max instead of total size
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14961>
2022-03-16 13:13:45 -05:00
Jason Ekstrand 3501a3f9ed anv: Convert to 100% dynamic rendering
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14961>
2022-03-16 13:13:36 -05:00
Jason Ekstrand 8112e6d601 anv: Drop pipeline pass/subpass in favor of rendering_info
This is about the only "small" change we can make in the process of
converting from render-pass-based to dynamic-rendering-based.  Make
everything in pipeline creation work in terms of dynamic rendering and
create the dynamic rendering structs from the render pass as-needed.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14961>
2022-03-16 12:51:16 -05:00
Jason Ekstrand ee9c068043 anv/pipeline: Stop pretending we're the validator
This was ill-conceived at best.  Yes, it checks for a few error
conditions but it doesn't check much and what checks it has are very far
away from the code that relies on those invariants.  If we care about
these invariants, we should add asserts near the code that makes those
assumptions rather than pretending to be the validation layers.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14961>
2022-03-16 12:51:16 -05:00
Jason Ekstrand 2da152b5e6 anv: Stop treating color input attachments specially
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14961>
2022-03-16 12:51:15 -05:00
Jason Ekstrand 1ad0f1b004 anv/pass: Make unused color attachments VK_ATTACHMENT_UNUSED
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14961>
2022-03-16 12:51:15 -05:00
Jason Ekstrand 9bbecbed7a anv: Better null surface state size for dynamic rendering
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14961>
2022-03-16 12:51:15 -05:00
Jason Ekstrand fff3f8bfe5 anv: Fix handling of null depth/stencil attachments with dynamic rendering
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14961>
2022-03-16 12:51:15 -05:00
Jason Ekstrand 83101429bf anv: Convert to vk_framebuffer
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14961>
2022-03-16 12:51:15 -05:00
Vadym Shovkoplias 550f48a826 anv: implement EXT_depth_clip_control
A new extension allowing the application to use the OpenGL depth
range in NDC, i.e. with depth in range [-1, 1], as opposed to
Vulkan’s default of [0, 1].

v2:
- call gfx8_cmd_buffer_emit_viewport on ANV_CMD_DIRTY_PIPELINE (Jason)
- remove redundant !! operator since negativeOneToOne must be true or
  false (Tapani)
- coding style changes (Lionel)

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6070
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15304>
2022-03-16 11:22:24 +00:00
Lionel Landwerlin a54f5e8e00 anv: silence compiler warnings
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6146
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15387>
2022-03-16 01:02:05 +00:00
Jason Ekstrand 0b4a80b4c4 anv: Use vk_shader_module_to_nir()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15305>
2022-03-15 23:13:16 +00:00
Ernst Sjöstrand e5f3689cff intel/compiler: Fix non-trivial designated initializer
Not supported by GCC 7.

src/compiler/nir/nir_builder_opcodes.h:14156:118:
sorry, unimplemented: non-trivial designated initializers not supported
src/intel/compiler/brw_mesh.cpp:515:7:
note: in expansion of macro ‘nir_store_per_primitive_output’

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Fixes: bc4f8c073a ("intel/compiler: inject MUE initialization")
Signed-off-by: Ernst Sjöstrand <ernstp@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15360>
2022-03-14 09:56:04 +00:00
Dave Airlie 7edda218fd intel: add some missing debug recompile info.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15315>
2022-03-12 08:12:41 +00:00
Jason Ekstrand d65dbe8018 anv: Allow MSAA resolve with different numbers of planes
The Vulkan spec for VK_KHR_depth_stencil_resolve allows a format
mismatch between the primary attachment and the resolve attachment
within certain limits.  In particular,

    VUID-VkSubpassDescriptionDepthStencilResolve-pDepthStencilResolveAttachment-03181

    If pDepthStencilResolveAttachment is not NULL and does not have the
    value VK_ATTACHMENT_UNUSED and VkFormat of
    pDepthStencilResolveAttachment has a depth component, then the
    VkFormat of pDepthStencilAttachment must have a depth component with
    the same number of bits and numerical type

    VUID-VkSubpassDescriptionDepthStencilResolve-pDepthStencilResolveAttachment-03182

    If pDepthStencilResolveAttachment is not NULL and does not have the
    value VK_ATTACHMENT_UNUSED, and VkFormat of
    pDepthStencilResolveAttachment has a stencil component, then the
    VkFormat of pDepthStencilAttachment must have a stencil component
    with the same number of bits and numerical type

So you can resolve from a depth/stencil format to a depth-only or
stencil-only format so long as the number of bits matches.
Unfortunately, this has never been tested because the CTS tests which
purport to test this are broken and actually test with a destination
combined depth/stencil format.

Fixes: 5e4f9ea363 ("anv: Implement VK_KHR_depth_stencil_resolve")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15333>
2022-03-11 22:25:42 +00:00
Lionel Landwerlin 6cea8a43fa anv: silence compiler warning
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15241>
2022-03-11 08:47:15 +00:00
Lionel Landwerlin 90000aea9b anv: make a couple of descriptor function private
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15241>
2022-03-11 08:47:15 +00:00
Lionel Landwerlin e12698724e anv: rename host only descriptor internal flag
We add an assert to verify that those are not bound.

v2: Drop != 0 (Tapani)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15241>
2022-03-11 08:47:15 +00:00
Lionel Landwerlin 87f59b18cf anv: don't lazy allocate surface states in descriptor sets
In 4001d9ce1a we started lazily allocating surface states in the
descriptor sets rather than upfront in the descriptor pool. This was
to workaround vkd3d-proton allocating more than we could handle at the
HW level.

The issue introduced in that change is that we didn't protect the
descriptor pool free list as well as the anv_state_stream which are
now potentially used from different threads through the descriptor set
write functions.

This reverts the lazy allocation part of that change. Host only
descriptor sets changes remain.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4001d9ce1a ("anv: Handle VK_DESCRIPTOR_POOL_CREATE_HOST_ONLY_BIT_VALVE for descriptor sets")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15241>
2022-03-11 08:47:15 +00:00
Lionel Landwerlin 71cd6a7b84 anv: fix acceleration structure descriptor copies
We're not supposed to have a
VkWriteDescriptorSetAccelerationStructureKHR when doing a copy. We
should instead get the acceleration structure object from the source
descriptor.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 03e1e19246 ("anv: Refactor descriptor copy")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15241>
2022-03-11 08:47:15 +00:00
Mike Blumenkrantz 5ab0e3f0bb anv: fix some dynamic rasterization discard cases in pipeline construction
cc: mesa-stable

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15280>
2022-03-11 04:02:02 +00:00
Mike Blumenkrantz 1e3e7b3a4d anv: fix CmdSetColorWriteEnableEXT for maximum rts
Fixes: b15bfe92f7 ("anv: implement VK_EXT_color_write_enable")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15280>
2022-03-11 04:02:02 +00:00
Mike Blumenkrantz 52f6978484 anv: fix xfb usage with rasterizer discard
in the initial implementation, a stream like:

* CmdBeginTransformFeedbackEXT
* CmdSetRasterizerDiscardEnableEXT
* CmdDraw
* CmdEndTransformFeedbackEXT
* CmdBeginTransformFeedbackEXT
* CmdDraw
* CmdEndTransformFeedbackEXT

would never enable transform feedback, as it only checked for the change
in rasterizer_discard state

Fixes: 4d531c67df ("anv: support rasterizer discard dynamic state")

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15269>
2022-03-11 03:37:17 +00:00
Marcin Ślusarz 823cffbe1c anv: include Primitive Header in mesh shader per-primitive output
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15303>
2022-03-09 16:52:59 +00:00
Marcin Ślusarz f410c1142f anv: set number of viewports in clip state (mesh)
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15303>
2022-03-09 16:52:59 +00:00
Marcin Ślusarz 81df66bfff intel/compiler: mark some variables as per-primitive in FS if they come from MS
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15303>
2022-03-09 16:52:59 +00:00
Marcin Ślusarz 8c16ce53a9 intel/compiler: handle ViewportIndex, PrimitiveID and Layer in MUE setup
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15303>
2022-03-09 16:52:59 +00:00
Marcin Ślusarz bc4f8c073a intel/compiler: inject MUE initialization
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15303>
2022-03-09 16:52:59 +00:00
Marcin Ślusarz 333a490e32 intel/compiler: shift mesh urb read/write window when offset is too large
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15303>
2022-03-09 16:52:59 +00:00
Kenneth Graunke e3a0e97300 intel: Limit Wa_1607854226 to Gfx12.0 only
This workaround is needed on all Gfx12.0 parts, but doesn't appear to be
necessary on XeHP.  The other drivers do not appear to be applying this
workaround on those parts.  As further evidence, we accidentally added
the 3DSTATE_BINDING_TABLE_POOL_ALLOC commands after switching back to
GPGPU mode, which would be an incorrect way to implement the workaround,
and things seem to be working.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
2022-03-09 09:18:59 +00:00
Jason Ekstrand a83c91a261 blorp: Add a binding_table_offset_to_pointer helper
On Gen11+, we have a feature that requires us to shift binding table
offsets by 3.  This adds a helper which gives the driver a hook to do
this if it so chooses.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
2022-03-09 09:18:59 +00:00
Matt Turner 8860ff3310 intel/perf: Destination array calculation into function
Cuts 119 KiB from iris_dri.so and libvulkan_intel.so.

   text    data     bss     dec     hex filename
 917511       0       0  917511   e0007 meson-generated_.._intel_perf_metrics.c.o (before)
 796986       0       0  796986   c293a meson-generated_.._intel_perf_metrics.c.o (after)

   text    data     bss     dec     hex filename
14130948 365708  210004 14706660 e067e4 iris_dri.so (before)
14009332 365708  210004 14585044 de8cd4 iris_dri.so (after)

   text    data     bss     dec     hex filename
8124225  214264   22820 8361309  7f955d libvulkan_intel.so (before)
8002609  214264   22820 8239693  7dba4d libvulkan_intel.so (after)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>
2022-03-07 21:09:54 +00:00
Matt Turner d80d3c6760 intel/perf: Fix mistake in description string
Along with fixing the grammar, this allows it to be deduplicated since
the properly worded description exists in later generations' XMLs.

Cuts 96 B from iris_dri.so and libvulkan_intel.so.

   text	   data	    bss	    dec	    hex	filename
 917613	      0	      0	 917613	  e006d	meson-generated_.._intel_perf_metrics.c.o (before)
 917511	      0	      0	 917511	  e0007	meson-generated_.._intel_perf_metrics.c.o (after)

   text	   data	    bss	    dec	    hex	filename
14131044 365708	 210004	14706756 e06844	iris_dri.so (before)
14130948 365708	 210004	14706660 e067e4	iris_dri.so (after)

   text	   data	    bss	    dec	    hex	filename
8124321	 214264	  22820	8361405	 7f95bd	libvulkan_intel.so (before)
8124225	 214264	  22820	8361309	 7f955d	libvulkan_intel.so (after)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>
2022-03-07 21:09:54 +00:00
Matt Turner 7024b8e0eb intel/perf: Mark intel_perf_counter_* enums as PACKED
Reduces their sizes from 4 bytes to 1. Cuts 6 KiB from iris_dri.so and
libvulkan_intel.so.

   text    data     bss     dec     hex filename
 924401       0       0  924401   e1af1 meson-generated_.._intel_perf_metrics.c.o (before)
 917613       0       0  917613   e006d meson-generated_.._intel_perf_metrics.c.o (after)

   text    data     bss     dec     hex filename
14137732 365708  210004 14713444 e08264 iris_dri.so (before)
14131044 365708  210004 14706756 e06844 iris_dri.so (after)

   text    data     bss     dec     hex filename
8131009  214264   22820 8368093  7fafdd libvulkan_intel.so (before)
8124321  214264   22820 8361405  7f95bd libvulkan_intel.so (after)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>
2022-03-07 21:09:54 +00:00
Matt Turner 6c0246dcf4 intel/perf: Store indices to strings rather than pointers
The compiler does a good job of deduplicating strings already, but we
can eliminate the pointers to each string by combining the strings into
a single char array and storing only an index into that array.

The longest of the char arrays is the descriptions array, which is a
little over 45 KiB, so still under MSVC's 64 KiB string literal limit
[0]. Because the string length is under 64 KiB we can use uint16_t as
the index type, which roughly doubles our savings as compared to an int.

This cuts 77 KiB from iris_dri.so (0.5%) and libvulkan_intel.so (0.9%).

   text    data     bss     dec     hex filename
 926811   25920       0  952731   e899b meson-generated_.._intel_perf_metrics.c.o (before)
 924401       0       0  924401   e1af1 meson-generated_.._intel_perf_metrics.c.o (after)

   text    data     bss     dec     hex filename
14190852 391628  210004 14792484 e1b724 iris_dri.so (before)
14137732 365708  210004 14713444 e08264 iris_dri.so (after)

   text    data     bss     dec     hex filename
8184097  240184   22820 8447101  80e47d libvulkan_intel.so (before)
8131009  214264   22820 8368093  7fafdd libvulkan_intel.so (after)

relinfo:
iris_dri.so (before): 17765 relocations, 17545 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users
iris_dri.so (after) : 15605 relocations, 15385 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users

libvulkan_intel.so (before): 10720 relocations, 6989 relative (65%), 355 PLT entries, 1 for local syms (0%), 0 users
libvulkan_intel.so (after) :  8560 relocations, 4829 relative (56%), 355 PLT entries, 1 for local syms (0%), 0 users

[0] https://docs.microsoft.com/en-us/cpp/cpp/string-and-character-literals-cpp?view=msvc-170&viewFallbackFrom=vs-2019

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>
2022-03-07 21:09:54 +00:00
Matt Turner df5e743c80 intel/perf: Use slimmer intel_perf_query_counter_data struct
intel_perf_query_counter contains fields for things we can't or don't
want to store in our static data (like runtime-determined max values) or
oa_read_counter function pointers which are dependent on the GPU gen and
would make deduplication very ineffective.

Cuts 16 KiB from iris_dri.so and libvulkan_intel.so.

   text    data     bss     dec     hex filename
 926811   43200       0  970011   ecd1b meson-generated_.._intel_perf_metrics.c.o (before)
 926811   25920       0  952731   e899b meson-generated_.._intel_perf_metrics.c.o (after)

   text    data     bss     dec     hex filename
14190852 408908  210004 14809764 e1faa4 iris_dri.so (before)
14190852 391628  210004 14792484 e1b724 iris_dri.so (after)

   text    data     bss     dec     hex filename
8184097  257464   22820 8464381  8127fd libvulkan_intel.so (before)
8184097  240184   22820 8447101  80e47d libvulkan_intel.so (after)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>
2022-03-07 21:09:54 +00:00
Matt Turner bbbbb0325b intel/perf: Use a function to initialize perf counters
And specifically mark it with ATTRIBUTE_NOINLINE. Otherwise it will be
inlined and actually slightly increase code size.

Cuts 505 KiB from iris_dri.so and libvulkan_intel.so.

   text    data     bss     dec     hex filename
1538720       0       0 1538720  177aa0 meson-generated_.._intel_perf_metrics.c.o (before)
 926811   43200       0  970011   ecd1b meson-generated_.._intel_perf_metrics.c.o (after)

   text    data     bss     dec     hex filename
14751700 365708  210004 15327412 e9e0b4 iris_dri.so (before)
14190852 408908  210004 14809764 e1faa4 iris_dri.so (after)

   text    data     bss     dec     hex filename
8744913  214264   22820 8981997  890ded libvulkan_intel.so (before)
8184097  257464   22820 8464381  8127fd libvulkan_intel.so (after)

Relocations increase because the counter initializations are moved from
code (in .text) to pointers (in .text) to .rodata, which require
relocations.

relinfo:
iris_dri.so (before): 15605 relocations, 15385 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users
iris_dri.so (after) : 17765 relocations, 17545 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users

libvulkan_intel.so (before):  8560 relocations, 4829 relative (56%), 355 PLT entries, 1 for local syms (0%), 0 users
libvulkan_intel.so (after) : 10720 relocations, 6989 relative (65%), 355 PLT entries, 1 for local syms (0%), 0 users

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>
2022-03-07 21:09:54 +00:00
Matt Turner 5e6c7a572e intel/perf: Deduplicate perf counters
No changes in resulting code (yes, seriously!). GCC constant propagates
the static const arrays into the code, yielding bit for bit identical
results. This will however enable further cleanups.

Before this patch, we emit 11916 different initializations of
intel_perf_query_counter. With this patch we emit an array of 539 and
initialize the intel_perf_query_counters in terms of those.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>
2022-03-07 21:09:54 +00:00
Matt Turner 3172b5bbb8 intel/perf: Don't print leading space from desc_units()
Just an annoyance I noticed when I needed to generate the description
string in two different places.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>
2022-03-07 21:09:54 +00:00