Commit Graph

119803 Commits

Author SHA1 Message Date
Jason Ekstrand 4d03e53127 intel/isl: Allow CCS_E on more formats
Now that BLORP supports copies on everything except R11G11B10_FLOAT,
we should be able to support CCS_E those formats.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3554>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3554>
2020-01-25 17:48:54 +00:00
Jason Ekstrand f132e0fddf intel/blorp: Add support for CCS_E copies with UNORM formats
Some of the smaller bit-size formats which support CCS_E don't have a
UINT representative in their compression class.  However, we should be
able to use UNORM just fine and still get bit-exact copies.  We just
have to do a conversion to/from UNORM when we bitcast.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3554>
2020-01-25 17:48:54 +00:00
Erico Nunes ae0b8ba5d5 lima/ppir: fix src read mask swizzling
The src mask can't be calculated from the dest write_mask.
Instead, it must be calculated from the swizzled operators of the src.
Otherwise, liveness calculation may report incorrect live components for
non-ssa registers.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>
2020-01-25 14:48:55 +01:00
Erico Nunes ab36523ae7 lima/ppir: split ppir_op_undef into undef and dummy again
Those were renamed/merged some time ago but it turns out that
ppir_op_undef can't be shared.
It was being used for undefined ssa operations and for read-before-write
operations that may happen to e.g. uninitialized registers (non-ssa)
inside a loop.
We really don't want to reserve a register for the undef ssa case, but
we must reserve and allocate register for the unitialized register case
because when it happens inside a loop it may need to hold its value
across iterations.

This dummy node might be eliminated with a code refactor in ppir in case
we are able to emit the write and allocate the ppir_reg before we emit
the read. But a major refactor we need this to keep this code to avoid
apparent regressions with the new liveness analysis implementation.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>
2020-01-25 14:48:55 +01:00
Erico Nunes 4ca3de06ec lima/ppir: fix ssa undef emit
The ssa doesn't need to be manually added to block->comp->reg_list.
Doing so actually causes other registers to be marked as undef=true
later.

This patch alone fixes a few deqp tests that have undefs.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>
2020-01-25 14:48:55 +01:00
Erico Nunes d6b1917c01 lima/ppir: handle write to dead registers in ppir
nir can output writes to dead registers when expanding vec4 operations
to non-ssa registers. In that case, some components of the vec4 may be
assigned but never read. These are also not currently removed by a nir
dead code elimination pass as they are not ssa.
In order to prevent regalloc from allocating a live register for this
operation, an interference must be assigned to it during liveness
analysis.

This workaround may be removed in the future if the assignments to dead
components can be removed earlier in ppir or nir.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>
2020-01-25 14:48:02 +01:00
Marek Olšák eb7cd575da radeonsi: fix a regression since the addition of si_shader_llvm_vs.c
Fixes: cd5b99c541 - radeonsi: move VS shader code into si_shader_llvm_vs.c
Closes: #2416
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3561>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3561>
2020-01-25 05:59:24 +00:00
Marek Olšák 688d2901b8 radeonsi: make screen available to shader part compilation
to fix a crash in is_multi_part_shader.

Fixes: 1a0890dcf3 - radeonsi: change prototypes of si_is_multi_part_shader & si_is_merged_shader
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3561>
2020-01-25 05:59:24 +00:00
Jason Ekstrand 07a441d53f anv: Rework CCS memory handling on TGL-LP
The previous way we were attempting to handle AUX tables on TGL-LP was
very GL-like.  We used the same aux table management code that's shared
with iris and we updated the table on image create/destroy.  The problem
with this is that Vulkan allows multiple VkImage objects to be bound to
the same memory location simultaneously and the app can ping-pong back
and forth between them in the same command buffer.  Because the AUX
table contains format-specific data, we cannot support this ping-pong
behavior with only CPU updates of the AUX table.

The new mechanism switches things around a bit and instead makes the aux
data part of the BO.  At BO creation time, a bit of space is appended to
the end of the BO for AUX data and the AUX table is updated in bulk for
the entire BO.  The problem here, of course, is that we can't insert the
format-specific data into the AUX table at BO create time.

Fortunately, Vulkan has a requirement that every TILING_OPTIMAL image
must be initialized prior to use by transitioning the image from
VK_IMAGE_LAYOUT_UNDEFINED to something else.  When doing the above
described ping-pong behavior, the app has to do such an initialization
transition every time it corrupts the underlying memory of the VkImage
by using it as something else.  We can hook into this initialization and
use it to update the AUX-TT entries from the command streamer.  This way
the AUX table gets its format information, apps get aliasing support,
and everyone is happy.

One side-effect of this is that we disallow CCS on shared buffers.
We'll need to fix this for modifiers on the scanout path but that's a
task for another patch.  We should be able to do it with dedicated
allocations.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Jason Ekstrand b29cf7daf3 anv: Make anv_vma_alloc/free a lot dumber
All they do now is take a size, align, and flags and figure out which
heap to allocate in.  All of the actual code to deal with the BO is in
anv_allocator.c.  We want to leave anv_vma_alloc/free in anv_device.c
because it deals with API-exposed heaps so it still makes sense to have
it there.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Jason Ekstrand fd0f9d1196 anv: Make AUX table invalidate a PIPE_* bit
This commit moves it in with all the other cache invalidation operations
as if it were done by PIPE_CONTROL even though it's a pair of register
writes.  This means we only have to write the GFX_AUX_TABLE_BASE_ADDR
register once at device initialization instead of every invalidate.
Invalidates are now a single LRI instead of two.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Jason Ekstrand 658dc9ca50 anv: Add another align_down helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Jason Ekstrand 64ca8a3272 isl: Add a helper for calculating subimage memory ranges
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Jason Ekstrand 4793116036 anv: Delete a redundant calculation
We compute the same thing with the same variable name at the top of the
function.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Jason Ekstrand a1e9adc9ce intel/aux-map: Factor out some useful helpers
This breaks add_mapping() into three pieces:

    1. get_aux_entry() adds AUX-TT pages as needed and returns the
       L1 entry index, L1 entry address, and L1 entry map.

    2. gen_aux_map_format_bits_for_isl_surf() computes the format-
       specific information that goes in the AUX-TT entry.

    3. add_mapping() is a lot dumber function that now just adds the
       requested mapping with the requested format bits.

This lets us break out some additional helpers in the API which we want
to use for more direct AUX-TT management in ANV.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Jason Ekstrand bea62ea566 intel/aux-map: Add some #defines
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Marek Olšák 0366c8c5b7 radeonsi: expose shader cache stats to the HUD
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
2020-01-24 20:29:29 -05:00
Marek Olšák c046551e60 radeonsi: print shader cache stats with AMD_DEBUG=cache_stats
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
2020-01-24 20:29:29 -05:00
Marek Olšák 2fd3bb23ab radeonsi: restructure si_shader_cache_load_shader
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
2020-01-24 20:29:29 -05:00
Marek Olšák 0db74f479b radeonsi: use the live shader cache
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
2020-01-24 20:29:29 -05:00
Marek Olšák 4bb919b0b8 gallium/util: add a cache of live shaders for shader CSO deduplication
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
2020-01-24 20:29:29 -05:00
Marek Olšák f36f85d958 util/simple_mtx: add a missing include to get ASSERTED
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
2020-01-24 20:29:29 -05:00
Caio Marcelo de Oliveira Filho 6a0dda63dd intel/compiler: Add names for SHADER_OPCODE_[IU]SUB_SAT
Fixes: 58907568ec ("intel/fs: Add SHADER_OPCODE_[IU]SUB_SAT pseudo-ops")
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3558>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3558>
2020-01-24 23:52:30 +00:00
Caio Marcelo de Oliveira Filho c1a2ac2abe anv: Always initialize target_stencil_layout
Pass down stencil data from the subpass attachment like we do
elsewhere.  Only stencil attachments will make use of it.

Fixes warnings like

    ../src/intel/vulkan/genX_cmd_buffer.c: In function ‘cmd_buffer_begin_subpass’:
    ../src/intel/vulkan/genX_cmd_buffer.c:4656:41: warning: ‘target_stencil_layout’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     4656 |       att_state->current_stencil_layout = target_stencil_layout;
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3557>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3557>
2020-01-24 14:01:38 -08:00
Jason Ekstrand 41bffe0913 anv: Replace aux_surface.isl.size_B checks with aux_usage checks
Now that aux_usage has a unified meaning, aux_usage == NONE if and only
if aux_surface.isl.size_B > 0.  In most of these cases, the question
we're asking is "does have compression?" and not "have we allocated an
aux surface for compression?".

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>
2020-01-24 21:07:26 +00:00
Jason Ekstrand e693a57232 anv: Rework the meaning of anv_image::planes[]::aux_usage
Previously, we set aux_usage=ISL_AUX_USAGE_NONE when we really meant
CCS_D.  This sort-of made sense before we had anv_layout_to_aux_usage
but now that we have that helper.  However, in our more modern aux
tracking model, all aux usage goes through anv_layout_to_* and we're
better off making the meaning of anv_image::planes[]::aux_usage be
AUX_USAGE_NONE if and only if there is no compression.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>
2020-01-24 21:07:26 +00:00
Samuel Pitoiset de64719024 radv: print NIR shaders after lowering FS inputs/outputs
This is confusing otherwise.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3553>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3553>
2020-01-24 19:54:58 +00:00
Jason Ekstrand 17e225ee1e intel/isl: Add a hack for the Gen12 A0 texture buffer bug
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
2020-01-24 19:18:27 +00:00
Jason Ekstrand 4cd23420bd intel/isl: Plumb devinfo into isl_genX(buffer_fill_state_s)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
2020-01-24 19:18:27 +00:00
Jason Ekstrand 98aab272a8 intel/disasm: Properly disassemble indirect SENDs
Instead of emitting g[a0]UD for the indirect descriptor, emit a0<0>UD.
This is more correct because there is no GRF involved.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
2020-01-24 19:18:27 +00:00
Jason Ekstrand 3b2eafbea9 intel/fs: Don't unnecessarily fall back to indirect sends on Gen12
The instruction encoding for SENDS changed on Gen12 and it now supports
embedding the entire extended message descriptor in the instruction if
it's an immediate.  Stop falling back to doing an indirect SEND just
because we had something in [15:12] of ex_desc.ud.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
2020-01-24 19:18:27 +00:00
Jason Ekstrand c70a786c77 anv: Improve BTI change cache flushing
This commit makes two changes:

 1. We set pending_pipe_bits instead of emitting PIPE_CONTROL directly
    for the flush at the end of cmd_buffer_begin_subpass.

 2. Because BLORP ops such as vkCmdClearAttachments may come in the
    middle of a render pass, we have to also flag the need for a cache
    flush after the blorp op.

Fixes: 185630c6bc "anv/blorp: Do the gen11 BTI flush"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
2020-01-24 19:18:26 +00:00
Alyssa Rosenzweig e39c52787e panfrost: Fix 32-bit warning for `indices`
../src/gallium/drivers/panfrost/pan_context.c: In function ‘panfrost_draw_vbo’:
../src/gallium/drivers/panfrost/pan_context.c:1551:70: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
                 ctx->payloads[PIPE_SHADER_FRAGMENT].prefix.indices = (u64) NULL;
                                                                      ^

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Icecream95 <ixn@keemail.me>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
2020-01-24 18:53:31 +00:00
Alyssa Rosenzweig 58aa2b8cfc pan/decode: Remove SHORT_SLIDE indirection
../src/panfrost/pandecode/decode.c: In function ‘pandecode_compute_fbd’:
../src/panfrost/pandecode/decode.c:789:35: warning: taking address of packed member of ‘struct mali_compute_fbd’ may result in an unaligned pointer value [-Waddress-of-packed-member]
  789 |         pandecode_u32_slide(num, s->unknown ## num, ARRAY_SIZE(s->unknown ## num))
      |                                  ~^~~~~~~~~
../src/panfrost/pandecode/decode.c:800:9: note: in expansion of macro ‘SHORT_SLIDE’
  800 |         SHORT_SLIDE(1);
      |         ^~~~~~~~~~~

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
2020-01-24 18:53:31 +00:00
Alyssa Rosenzweig 7d52b3a18b pan/midgard: Remove pack_color define
Unused at the moment.

../src/panfrost/midgard/midgard_compile.c:124:29: warning: ‘m_pack_colour’ defined but not used [-Wunused-function]
  124 |  static midgard_instruction m_##name(unsigned ssa, unsigned address) { \
      |                             ^~
../src/panfrost/midgard/midgard_compile.c:145:22: note: in expansion of macro ‘M_LOAD_STORE’
  145 | #define M_LOAD(name) M_LOAD_STORE(name, false)
      |                      ^~~~~~~~~~~~
../src/panfrost/midgard/midgard_compile.c:213:1: note: in expansion of macro ‘M_LOAD’
  213 | M_LOAD(pack_colour);
      | ^~~~~~

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
2020-01-24 18:53:31 +00:00
Alyssa Rosenzweig 6c95ea6bd7 pan/decode: Remove last_size
Fixes ../src/panfrost/pandecode/decode.c: In function ‘pandecode_jc’:
../src/panfrost/pandecode/decode.c:2859:14: warning: variable ‘last_size’ set but not used [-Wunused-but-set-variable]
 2859 |         bool last_size;

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
2020-01-24 18:53:31 +00:00
Alyssa Rosenzweig d126515a16 panfrost: Don't use implicit mali_exception_status enum
Fixes ../src/panfrost/pandecode/public.h:53:33: warning: ‘enum mali_exception_access’ declared inside parameter list will not be visible outside of this definition or declaration

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
2020-01-24 18:53:31 +00:00
Samuel Pitoiset 4a553212fa radv: enable ACO support for GFX6
CTS should pass, as well as Crucible and the few number of Piglit tests.

List of game benchmarks tested:
- Dawn of War 3
- Serious Sam 2017
- Shadow of The Tomb Raider
- The Talos Principle
- Thrones of Britannia
- Total Warhammer 2
- Total War: Three Kingdoms

Note that F12017 hangs with or without ACO on GFX6 at the moment.

My whole pipelinedb (~30 games) doesn't trigger any compiler crashes.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2401
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Samuel Pitoiset d4b4f40595 aco: copy the literal offset of SMEM instructions to a temporary
GFX6 only supports up to 8-bit for the literal offset, so make sure
it's copied to a temporary SGPR before emitting a SMEM instruction.
The optimizer will propagate the literal offset if possible anyways.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Samuel Pitoiset 1ac49ba908 aco: fix a hazard with v_interp_* and v_{read,readfirst}lane_* on GFX6
It's required to insert 1 wait state if the dst VGPR of any v_interp_*
is followed by a read with v_readfirstlane or v_readlane to fix GPU
hangs on GFX6. Note that v_writelane_* is apparently not affected.
This hazard isn't documented anywhere but AMD confirmed it.

This fixes a GPU hang with the texturemipmapgen Sascha demo on GFX6.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Samuel Pitoiset b9cc50fbce aco: fix a hardware bug for MRTZ exports on GFX6
GFX6 (except OLAND and HAINAN) has a bug that it only looks at
the X writemask component.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Brian Ho f55e215b8c turnip: Implement vkCmdCopyQueryPoolResults for occlusion queries
Use CP_COND_EXEC and CP_COND_WRITE to conditionally copy the results
of a query to a buffer based off the query's availability.

Fixes: #2238
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho 9a3656b9fd turnip: Implement vkCmdResetQueryPool
Clears the available bit for each requested query on the GPU.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho 97fa4cb3dc turnip: Implement vkGetQueryPoolResults for occlusion queries
Implements fetching the results of a query pool with the
VK_QUERY_RESULT_WAIT_BIT, VK_QUERY_RESULT_WITH_AVAILABILITY_BIT,
and VK_QUERY_RESULT_PARTIAL_BIT flags.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho 24b95485dc turnip: Update query availability on render pass end
Unlike on an immidiate-mode renderer, Turnip only renders tiles on
vkCmdEndRenderPass. As such, we need to track all queries that were
active in a given render pass and defer setting the available bit
on those queries until after all tiles have rendered.

This commit adds a draw_epilogue_cs to tu_cmd_buffer that is
executed as an IB at the end of tu_CmdEndRenderPass. We then emit
packets to this command stream that update the availability bit of a
given query in tu_CmdEndQuery.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho f750dd2ab8 turnip: Implement vkCmdEndQuery for occlusion queries
Mostly a translation of freedreno's implementation of glEndQuery for
GL_SAMPLES_PASSED query objects with a slight modification to set the
availability bit of the query bo (slot->available) if the query was
not ended inside a render pass.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho 5824a59ee2 turnip: Implement vkCmdBeginQuery for occlusion queries
Mostly a translation of freedreno's implementation of glBeginQuery for
GL_SAMPLES_PASSED query objects with special logic for handling tiled
render passes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho 78dea40b1c turnip: Implement vkCreateQueryPool for occlusion queries
General structure is inspired by anv's implementation in genX_query.c.
We define a packed struct that tracks sample count at the beginning of
the query and at the end; the result of the occlusion query is then
slot->end - slot->begin.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho a155ab93a3 turnip: Update tu_query_pool with turnip-specific fields
tu_query_pool was forked from radv_query_pool, but we will need a
different set of fields to implement queries in turnip.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Jason Ekstrand 0aa13245c1 anv: Allow HiZ in read-only depth layouts
This improves the performance of Aztec Ruins by 5% on ICL.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
2020-01-24 17:42:36 +00:00