Commit Graph

119495 Commits

Author SHA1 Message Date
Francisco Jerez 0dd18d70ae intel/fs/gen6: Generalize aligned_pairs_class to SIMD16 aligned barycentrics.
This is mainly meant to avoid shader-db regressions on SNB as we start
using VGRFs for barycentrics more frequently.  Currently the
aligned_pairs_class is only useful in SIMD8 mode, because in SIMD16
mode barycentric vectors are typically 4 GRFs.  This is not a problem
on Gen4-5, because on those platforms all VGRF allocations are
pair-aligned in SIMD16 mode.  However on Gen6 we end up using either
the fast or the slow path of LINTERP rather non-deterministically
based on the behavior of the register allocator.

Fix it by repurposing aligned_pairs_class to hold PLN-aligned
registers of whatever the natural size of a barycentric vector is in
the current dispatch width.

On SNB this prevents the following shader-db regressions (including
SIMD32 programs) in combination with the interpolation rework part of
this series:

   total instructions in shared programs: 13983257 -> 14527274 (3.89%)
   instructions in affected programs: 1766255 -> 2310272 (30.80%)
   helped: 0
   HURT: 11608

   LOST:   26
   GAINED: 13

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:22:34 -08:00
Francisco Jerez 0db4455c1f intel/fs/gen6: Constrain barycentric source of LINTERP during bank conflict mitigation.
This avoids regressions on SNB due to the bank conflict mitigation
pass moving a VGRF-allocated barycentric vector to a misaligned
location, which would prevent the PLN instruction from being used.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:22:29 -08:00
Francisco Jerez 369aef851d intel/fs/gen4-6: Allocate registers from aligned_pairs_class based on LINTERP use.
Previously we would hardcode fs_visitor::delta_xy barycentrics to be
allocated from aligned_pairs_class on hardware with PLN source
alignment restrictions (pre-Gen7).  Instead allocate any registers
consumed by LINTERP from aligned_pairs_class, even if some barycentric
vector had ended up in a temporary.

On SNB this prevents the following shader-db regressions (including
SIMD32 programs) in combination with the interpolation rework part of
this series:

   total instructions in shared programs: 13983257 -> 14527274 (3.89%)
   instructions in affected programs: 1766255 -> 2310272 (30.80%)
   helped: 0
   HURT: 11608

   LOST:   26
   GAINED: 13

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:22:20 -08:00
Francisco Jerez 54b1b71e73 intel/fs: Allow limited copy propagation of a LOAD_PAYLOAD into another.
This is particularly useful in cases where register coalaesce is
unlikely to succeed because the LOAD_PAYLOAD isn't a plain copy --
E.g. when a LOAD_PAYLOAD is shuffling the contents of a barycentric
vector in order to transform it into the PLN layout.

This prevents the following shader-db regressions (including SIMD32
programs) in combination with the interpolation rework part of this
series.  On SKL:

   total instructions in shared programs: 18596672 -> 18976097 (2.04%)
   instructions in affected programs: 7937041 -> 8316466 (4.78%)
   helped: 39
   HURT: 67427

   LOST:   466
   GAINED: 220

On SNB:

   total instructions in shared programs: 13993866 -> 14202963 (1.49%)
   instructions in affected programs: 7611309 -> 7820406 (2.75%)
   helped: 624
   HURT: 52943

   LOST:   6
   GAINED: 18

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:22:09 -08:00
Francisco Jerez 8eb4f2092a intel/fs: Add support for copy-propagating a block of multiple FIXED_GRFs.
In cases where a LOAD_PAYLOAD instruction copies a single block of
sequential GRF registers into the destination (see
is_identity_payload()), splitting the block copy into a number of ACP
entries (one for each LOAD_PAYLOAD source) is undesirable, because
that prevents copy propagation into any instructions which read
multiple components at once with the same source (the barycentric
source of the LINTERP instruction is going to be the overwhelmingly
most common example).

Technically it would also be possible to do this for VGRF sources, but
there is little benefit from that since register coalesce already
covers many of those cases -- There is no way for a block of
FIXED_GRFs to be coalesced into a VGRF though.

This prevents the following shader-db regressions (including SIMD32
programs) in combination with the interpolation rework part of this
series.  On SKL:

   total instructions in shared programs: 18595160 -> 18828562 (1.26%)
   instructions in affected programs: 13374946 -> 13608348 (1.75%)
   helped: 7
   HURT: 108977

   total spills in shared programs: 9116 -> 9106 (-0.11%)
   spills in affected programs: 404 -> 394 (-2.48%)
   helped: 7
   HURT: 9

   total fills in shared programs: 8994 -> 9176 (2.02%)
   fills in affected programs: 898 -> 1080 (20.27%)
   helped: 7
   HURT: 9

   LOST:   469
   GAINED: 220

On SNB:

   total instructions in shared programs: 13996898 -> 14096222 (0.71%)
   instructions in affected programs: 8088546 -> 8187870 (1.23%)
   helped: 2
   HURT: 66520

   total spills in shared programs: 2985 -> 2961 (-0.80%)
   spills in affected programs: 632 -> 608 (-3.80%)
   helped: 2
   HURT: 0

   total fills in shared programs: 3144 -> 3128 (-0.51%)
   fills in affected programs: 1515 -> 1499 (-1.06%)
   helped: 2
   HURT: 0

   LOST:   0
   GAINED: 4

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:21:41 -08:00
Francisco Jerez e328fbd9f8 intel/fs: Add partial support for copy-propagating FIXED_GRFs.
This will be useful for eliminating redundant copies from the FS
thread payload, particularly in SIMD32 programs.  For the moment we
only allow FIXED_GRFs with identity strides in order to avoid dealing
with composing the arbitrary bidimensional strides that FIXED_GRF
regions potentially have, which are rarely used at the IR level
anyway.

This enables the following commit allowing block-propagation of
FIXED_GRF LOAD_PAYLOAD copies, and prevents the following shader-db
regressions (including SIMD32 programs) in combination with the
interpolation rework part of this series.  On ICL:

   total instructions in shared programs: 20484665 -> 20529650 (0.22%)
   instructions in affected programs: 6031235 -> 6076220 (0.75%)
   helped: 5
   HURT: 42073

   total spills in shared programs: 8748 -> 8925 (2.02%)
   spills in affected programs: 186 -> 363 (95.16%)
   helped: 5
   HURT: 9

   total fills in shared programs: 8663 -> 8960 (3.43%)
   fills in affected programs: 647 -> 944 (45.90%)
   helped: 5
   HURT: 9

On SKL:

   total instructions in shared programs: 18937442 -> 19128162 (1.01%)
   instructions in affected programs: 8378187 -> 8568907 (2.28%)
   helped: 39
   HURT: 68176

   LOST:   1
   GAINED: 4

On SNB:

   total instructions in shared programs: 14094685 -> 14243499 (1.06%)
   instructions in affected programs: 7751062 -> 7899876 (1.92%)
   helped: 623
   HURT: 53586

   LOST:   7
   GAINED: 25

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:21:33 -08:00
Francisco Jerez 5153d06d92 intel/fs: Extend copy propagation dataflow analysis to copies with FIXED_GRF source.
This involves indexing the ACP tables used internally by
fs_copy_prop_dataflow::setup_initial_values() by reg_space() instead
of register number.  Both are nearly equivalent for virtual GRFs
(barring the single bit of entropy lost in the hash), and this makes
handling FIXED_GRFs straightforward.

Because we're only going to support FIXED_GRFs for the source of a
copy, this change is only strictly necessary during the second pass
that checks for source interference, but we also apply the same change
to the first pass for consistency.

Note that this shouldn't change the behavior of the copy propagation
pass until we start inserting FIXED_GRF entries into the ACP.  Even
then FIXED_GRF writes are extremely rare so this change will hardly
ever have an effect, but they aren't completely non-existing so we
need to handle them for correctness.

No functional nor shader-db changes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:21:27 -08:00
Francisco Jerez ab0d1b3b3d intel/fs: Rework fs_inst::is_copy_payload() into multiple classification helpers.
This reworks the current fs_inst::is_copy_payload() method into a
number of classification helpers with well-defined semantics.  This
will be useful later on in order to optimize LOAD_PAYLOAD instructions
more aggressively in cases where we can determine it's safe to do so.

The closest equivalent of the present fs_inst::is_copy_payload()
method is the is_coalescing_payload() helper introduced here.

No functional nor shader-db changes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:21:19 -08:00
Francisco Jerez 1873202f44 intel/fs: Generalize fs_reg::is_contiguous() to register files other than VGRF.
No functional nor shader-db changes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:20:59 -08:00
Francisco Jerez d9a57c85cc intel/fs: Try to vectorize header setup in lower_load_payload().
In cases where LOAD_PAYLOAD is provided a pair of contiguous registers
as header sources, try to use a single SIMD16 instruction in order to
initialize them.  This is unlikely to affect the overall cycle count
of the shader, since the compressed instruction has twice the issue
time, except due to the reduced pressure on the instruction cache.

Main motivation is avoiding instruction-count regressions in
combination with the following copy propagation improvements, which
will allow the SIMD16 g0-1 header setup emitted for framebuffer writes
to be copy-propagated into its LOAD_PAYLOAD, leading to the emission
of two SIMD8 MOV instructions instead of a single SIMD16 MOV.

Reverting this commit on top of the copy propagation changes would
lead to the following shader-db regressions on SKL and other
platforms:

 total instructions in shared programs: 14926738 -> 14935415 (0.06%)
 instructions in affected programs: 1892445 -> 1901122 (0.46%)
 helped: 0
 HURT: 8676

Without the following copy propagation changes this doesn't have any
effect on shader-db on Gen7+, because we would typically set up the FB
write header with a separate SIMD16 MOV that isn't currently
copy-propagated into the LOAD_PAYLOAD, so the individual SIMD8 MOVs
result of LOAD_PAYLOAD lowering would get register-coalesced away
under normal circumstances.  However that wasn't the case for MRF
LOAD_PAYLOAD destinations on Gen6 and earlier, because register
coalesce only kicks in for GRFs, leaving a number of redundant SIMD8
MOVs lying around.  On SNB this leads to the following shader-db
improvements:

 total instructions in shared programs: 10770538 -> 10734681 (-0.33%)
 instructions in affected programs: 2700655 -> 2664798 (-1.33%)
 helped: 17791
 HURT: 0

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:20:46 -08:00
Marek Olšák 3ba16d36c9 st/dri: do FLUSH_VERTICES before calling flush_resource 2020-01-17 15:04:35 -05:00
Marek Olšák bec9c90b5e gallium: add st_context_iface::flush_resource to call FLUSH_VERTICES 2020-01-17 15:04:35 -05:00
Lionel Landwerlin ddb80f9276 anv: enable VK_KHR_swapchain_mutable_format
Enable new tests in dEQP-VK.image.swapchain_mutable.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
2020-01-17 18:27:29 +00:00
Jason Ekstrand 4bdf8547f4 vulkan/wsi: Implement VK_KHR_swapchain_mutable_format
This is only the core WSI code for the extension.  It adds the image
format list and the flags to vkCreateImage as well as handling things
properly in the modifier queries.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
2020-01-17 18:27:29 +00:00
Jason Ekstrand a218f13278 vulkan/wsi: Filter modifiers with ImageFormatProperties
Just because a modifier is returned for the given format, that doesn't
mean it works with all usages and flags.  We need to filter the list by
calling vkGetPhysicalDeviceImageFormatProperties2.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
2020-01-17 18:27:29 +00:00
Jason Ekstrand 210e68874b vulkan/wsi: Use the interface from the real modifiers extension
The anv implementation still isn't quite complete, but we can at least
start using the structs from the real extension.

v2: Fix circular pNext list (Lionel)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
2020-01-17 18:27:29 +00:00
Jason Ekstrand c78926b84d vulkan/wsi: Move the ImageCreateInfo higher up
Future changes will be easier if we can modify it based on whether or
not we're using modifiers.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
2020-01-17 18:27:29 +00:00
Jason Ekstrand 6790397346 anv: Support modifiers in GetImageFormatProperties2
Images with modifiers come with restrictions:

 1. They have to be simple 2D images right now

 2. They need to have a sensible format (not compressed, multi-plane, or
    non-power-of-two)

 3. If a CCS modifier is being requested, they have to actually support
    CCS_E and be CCS-compatible with any other formats the client may
    wish to use for image views.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
2020-01-17 18:27:29 +00:00
Jason Ekstrand 44f5a92c0b anv: Drop some VK_IMAGE_TILING_OPTIMAL checks
The DRM format modifiers extension adds a TILING_DRM_FORMAT_MODIFIER
which will be used for modifiers so we can no longer use OPTIMAL to
indicate tiled inside the driver.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
2020-01-17 18:27:29 +00:00
Samuel Pitoiset 0099f85232 aco: print assembly with CLRXdisasm for GFX6-GFX7 if found on the system
LLVM only supports GFX8+. Using CLRXdisasm works most of the time,
so it's useful to add support for it.

Original patch by Daniel Schürmann.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3439>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3439>
2020-01-17 17:41:32 +00:00
Andres Rodriguez 51de5d5ac6 vulkan/wsi: disable the hardware cursor
Ensure the hardware cursor is disabled when we set the mode for a
VkDisplayKHR object. The extension doesn't expose any mechanisms to
program the hardware cursor, so we need to ensure it is hidden.

Currently, it seems like X is responsible for disabling the cursor
before handing over the lease. But that seems a little frail, and we
should be disabling the cursor ourselves so it works correctly
independently of how the lease was prepared for us.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1922>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1922>
2020-01-17 17:15:52 +00:00
Krzysztof Raszkowski ad820d5aca gallium/swr: Disable showing detected arch message.
When swr driver is in use it print detected architecture
message to std::err. It can be harmfull when swr is using
in multinodes environments.
It can be enabled setting env var SWR_PRINT_INFO to 1.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2020-01-17 16:41:53 +00:00
Samuel Pitoiset b9b393f0ce aco: fix emitting slc for MUBUF instructions on GFX6-GFX7
Same as GFX10, only GFX8/GFX9 moved that bit near the opcode.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3437>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3437>
2020-01-17 16:56:04 +01:00
Boris Brezillon 6af63c939b panfrost/midgard: Fix swizzle for store instructions
The current logic considers that the nir_intrinsic_component(store_intr)
encodes the source components start, but it actually encodes the
destination one. Source component offset adjustment is taken care of in
install_registers_instr(), when offset_swizzle() is called.

This fixes dEQP-GLES2.functional.shaders.random.all_features.fragment.45
when PAN_MESA_DEBUG=deqp (looks like exposing GLES3 features has an
impact on the varyings layout).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3429>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3429>
2020-01-17 12:54:31 +00:00
Erik Faye-Lund be95c816a7 docs: do not double-close link tag
Fixes: f8148d0cc1 "docs: remove mailing list as way of submitting patches"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:19:16 +01:00
Erik Faye-Lund b009a7644b docs: remove double-closed definition-list
Fixes: bc17ac5866 "docs: add documentation for building with meson"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:19:10 +01:00
Erik Faye-Lund b387f68f49 docs: move paragraph closing tag
The pre-tag right before is a block-level tag, which means it implicitly
terminates the paragraph. So there's no paragraph to close after this.
Instead, move the paragraph-closing before the pre-tag, to explicitly
close the paragraph.

Fixes: 41b3eb08d9 "docs: update meson docs for windows"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:19:03 +01:00
Erik Faye-Lund a370cfd96e docs: use code-tags instead of pre-tags
Similar to the previous two commits, it seems more appropriate to use
code-tags here than pre-tag.

Fixes: 9af6c38def "docs: Add use of Closes: tag for closing gitlab issues"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:18:52 +01:00
Erik Faye-Lund 1de361e56b docs: use code-tags instead of pre-tags
Similar to the previous commit, code-tags seems more appropriate than
pre-tags here. So let's change it.

Fixes: ca0c1e69ca "docs: update releasing process to use new scripts and gitlab"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:18:48 +01:00
Erik Faye-Lund 36e0275275 docs: use code-tag instead of pre-tag
It's unlikely the author meant to use <pre>-here, as that starts a whole
new block. Instead, the inline code-tag seems more appropriate here.

Fixes: 41b3eb08d9 "docs: update meson docs for windows"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:18:42 +01:00
Erik Faye-Lund f0677086a1 docs: open paragraph before closing it
Fixes: 44c5e634a5 "docs: update meson docs for windows"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:18:36 +01:00
Erik Faye-Lund a0d25c4d87 docs: fix paragraphs
Paragraphs are terminated by pre-tags, so the latter one closes a new,
empty one. Let's split the paragraph in two around the pre-tag instead.

Fixes: c0dfe8c6df "docs: do not use div for line-breaking"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:18:25 +01:00
Erik Faye-Lund 750d664226 docs: fix typo in html tag name
Fixes: 5d11a828e1 "docs: update install docs for meson"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:18:06 +01:00
Pierre-Eric Pelloux-Prayer 5b1c4e1b75 util: call bind_sampler_states before setting sampler_views
Fixes the following valgrind error:

    Invalid read of size 16
       at 0x28F458A1: si_set_sampler_view_desc (in radeonsi_drv_video.so)
       by 0x28F4657E: si_set_sampler_views (in radeonsi_drv_video.so)
       by 0x28D62BF5: util_compute_blit (in radeonsi_drv_video.so)
       by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so)
       by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so)
       by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0)
     Address 0x18142a10 is 0 bytes inside a block of size 48 free'd
       at 0x48369AB: free (vg_replace_malloc.c:540)
       by 0x28D62D51: util_compute_blit (in radeonsi_drv_video.so)
       by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so)
       by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so)
       by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0)
     Block was alloc'd at
       at 0x4837B65: calloc (vg_replace_malloc.c:762)
       by 0x28EFB2EC: si_create_sampler_state (in radeonsi_drv_video.so)
       by 0x28D62C30: util_compute_blit (in radeonsi_drv_video.so)
       by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so)
       by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so)
       by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0)

Fixes: 69430d7e59 ("va: use a compute shader for the blit")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2321
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3428>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3428>
2020-01-17 10:14:57 +01:00
Eric Anholt d55573aac6 nir: Fix printing of ~0 .locations.
I kept wondering what "429" meant in variable declarations, when it was
just a truncated ~0 snprintf.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3423>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3423>
2020-01-16 23:29:10 +00:00
Eric Engestrom 65641e0c7a meson: use github URL for wraps instead of completely unreliable wrapdb
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3391>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3391>
2020-01-16 23:06:43 +00:00
Dylan Baker d7cef7c67b docs: Update release calendar for 20.0
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3417>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3417>
2020-01-16 22:41:55 +00:00
Andreas Baierl 2ebfc6db16 lima: Fix alpha blending
Introduce separate helper functions to set the blendfactor bits.

Lima uses bits 0-2 for the type, bit 3 sets the inverted function
and bit 4 is set if alpha is used.
alpha_src_factor and alpha_dst_factor don't need the alpha bit, so
they are masked with 0xf. There is only place for 4 bits anyway.
If alpha_src_factor is PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE, we need
to change it to PIPE_BLENDFACTOR_ONE first.
This is exactly what the blob does and we pass all
dEQP-GLES2.functional.fragment_ops.blend.* tests now.
Better than the blob btw...

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3411>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3411>
2020-01-16 16:43:41 +00:00
Daniel Schürmann 3bca0af25d aco: ignore parallelcopies to the same register on jump threading
The more conservative lowering to CSSA inserts unnecessary parallelcopies
which might get coalesced and can be ignored on jump threading.

v2: outline is_empty_block() check.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385>
2020-01-16 16:01:59 +01:00
Daniel Schürmann 427e5eeb02 aco: handle phi affinities transitively through parallelcopies
This can coalesce most unnecessarily inserted parallelcopies
from lowering to CSSA.

v2: refactor loop a bit to make it more efficient and readable.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385>
2020-01-16 16:01:59 +01:00
Daniel Schürmann d098024c40 aco: rework lower_to_cssa()
This patch changes lower_to_cssa to be much more conservative
about assumptions which phi operands might interfere.
Previously, this pass wasn't exhaustive and could miss some corner cases.

v2: remove optimizations to find better insertion points as it's hard
to guarantee that they are always correct and have overall no benefit.

Fixes: 0b8216b2cd ('aco: Lower to CSSA')

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385>
2020-01-16 16:01:59 +01:00
Samuel Pitoiset 300f8dec76 aco: implement stream output with vec3 on GFX6
GFX6 doesn't support vec3.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>
2020-01-16 14:06:06 +00:00
Samuel Pitoiset a445cb35bd aco: do not combine additions of DS instructions on GFX6
The offset field doesn't work as expected on GFX6.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>
2020-01-16 14:06:06 +00:00
Samuel Pitoiset 923005bf54 aco: do not select 96-bit/128-bit variants for ds_read/ds_write on GFX6
Only GFX7 and later support large ds_read/ds_write.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>
2020-01-16 14:06:06 +00:00
Lionel Landwerlin 44ffeb4fee intel/perf: report query split for mdapi
Also forgotten in the initial implementation.

v2: Report begin timestamp scaled by the timestamp frequency (Windows
    behavior)

v3: Rename split to disjoint to match GL terminology (Tapani)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>
2020-01-16 15:29:40 +02:00
Lionel Landwerlin 3bb8a4bfec intel/perf: expose timestamp begin for mdapi
This was forgotten in the initial implementation.

v2: ensure the value is written for both GL & Vulkan queries

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>
2020-01-16 15:29:28 +02:00
Tapani Pälli 630cbb45ac anv: set depth stall enabled when depth flush enabled on gen12
This implements HW workaround #1409600907 for anv driver.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3378>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3378>
2020-01-16 14:05:54 +02:00
Tapani Pälli 3cec148455 iris: set depth stall enabled when depth flush enabled on gen12
This implements HW workaround #1409600907 for iris driver.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3378>
2020-01-16 14:05:54 +02:00
Lionel Landwerlin 308efbf2f3 anv: implement another workaround for non pipelined states
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>
2020-01-16 11:51:30 +02:00
Lionel Landwerlin 9eca823cce iris: implement another workaround for non pipelined states
v2: add comment (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>
2020-01-16 11:51:22 +02:00