Commit Graph

119633 Commits

Author SHA1 Message Date
Marek Olšák 4bb919b0b8 gallium/util: add a cache of live shaders for shader CSO deduplication
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
2020-01-24 20:29:29 -05:00
Marek Olšák f36f85d958 util/simple_mtx: add a missing include to get ASSERTED
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
2020-01-24 20:29:29 -05:00
Caio Marcelo de Oliveira Filho 6a0dda63dd intel/compiler: Add names for SHADER_OPCODE_[IU]SUB_SAT
Fixes: 58907568ec ("intel/fs: Add SHADER_OPCODE_[IU]SUB_SAT pseudo-ops")
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3558>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3558>
2020-01-24 23:52:30 +00:00
Caio Marcelo de Oliveira Filho c1a2ac2abe anv: Always initialize target_stencil_layout
Pass down stencil data from the subpass attachment like we do
elsewhere.  Only stencil attachments will make use of it.

Fixes warnings like

    ../src/intel/vulkan/genX_cmd_buffer.c: In function ‘cmd_buffer_begin_subpass’:
    ../src/intel/vulkan/genX_cmd_buffer.c:4656:41: warning: ‘target_stencil_layout’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     4656 |       att_state->current_stencil_layout = target_stencil_layout;
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3557>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3557>
2020-01-24 14:01:38 -08:00
Jason Ekstrand 41bffe0913 anv: Replace aux_surface.isl.size_B checks with aux_usage checks
Now that aux_usage has a unified meaning, aux_usage == NONE if and only
if aux_surface.isl.size_B > 0.  In most of these cases, the question
we're asking is "does have compression?" and not "have we allocated an
aux surface for compression?".

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>
2020-01-24 21:07:26 +00:00
Jason Ekstrand e693a57232 anv: Rework the meaning of anv_image::planes[]::aux_usage
Previously, we set aux_usage=ISL_AUX_USAGE_NONE when we really meant
CCS_D.  This sort-of made sense before we had anv_layout_to_aux_usage
but now that we have that helper.  However, in our more modern aux
tracking model, all aux usage goes through anv_layout_to_* and we're
better off making the meaning of anv_image::planes[]::aux_usage be
AUX_USAGE_NONE if and only if there is no compression.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>
2020-01-24 21:07:26 +00:00
Samuel Pitoiset de64719024 radv: print NIR shaders after lowering FS inputs/outputs
This is confusing otherwise.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3553>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3553>
2020-01-24 19:54:58 +00:00
Jason Ekstrand 17e225ee1e intel/isl: Add a hack for the Gen12 A0 texture buffer bug
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
2020-01-24 19:18:27 +00:00
Jason Ekstrand 4cd23420bd intel/isl: Plumb devinfo into isl_genX(buffer_fill_state_s)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
2020-01-24 19:18:27 +00:00
Jason Ekstrand 98aab272a8 intel/disasm: Properly disassemble indirect SENDs
Instead of emitting g[a0]UD for the indirect descriptor, emit a0<0>UD.
This is more correct because there is no GRF involved.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
2020-01-24 19:18:27 +00:00
Jason Ekstrand 3b2eafbea9 intel/fs: Don't unnecessarily fall back to indirect sends on Gen12
The instruction encoding for SENDS changed on Gen12 and it now supports
embedding the entire extended message descriptor in the instruction if
it's an immediate.  Stop falling back to doing an indirect SEND just
because we had something in [15:12] of ex_desc.ud.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
2020-01-24 19:18:27 +00:00
Jason Ekstrand c70a786c77 anv: Improve BTI change cache flushing
This commit makes two changes:

 1. We set pending_pipe_bits instead of emitting PIPE_CONTROL directly
    for the flush at the end of cmd_buffer_begin_subpass.

 2. Because BLORP ops such as vkCmdClearAttachments may come in the
    middle of a render pass, we have to also flag the need for a cache
    flush after the blorp op.

Fixes: 185630c6bc "anv/blorp: Do the gen11 BTI flush"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
2020-01-24 19:18:26 +00:00
Alyssa Rosenzweig e39c52787e panfrost: Fix 32-bit warning for `indices`
../src/gallium/drivers/panfrost/pan_context.c: In function ‘panfrost_draw_vbo’:
../src/gallium/drivers/panfrost/pan_context.c:1551:70: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
                 ctx->payloads[PIPE_SHADER_FRAGMENT].prefix.indices = (u64) NULL;
                                                                      ^

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Icecream95 <ixn@keemail.me>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
2020-01-24 18:53:31 +00:00
Alyssa Rosenzweig 58aa2b8cfc pan/decode: Remove SHORT_SLIDE indirection
../src/panfrost/pandecode/decode.c: In function ‘pandecode_compute_fbd’:
../src/panfrost/pandecode/decode.c:789:35: warning: taking address of packed member of ‘struct mali_compute_fbd’ may result in an unaligned pointer value [-Waddress-of-packed-member]
  789 |         pandecode_u32_slide(num, s->unknown ## num, ARRAY_SIZE(s->unknown ## num))
      |                                  ~^~~~~~~~~
../src/panfrost/pandecode/decode.c:800:9: note: in expansion of macro ‘SHORT_SLIDE’
  800 |         SHORT_SLIDE(1);
      |         ^~~~~~~~~~~

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
2020-01-24 18:53:31 +00:00
Alyssa Rosenzweig 7d52b3a18b pan/midgard: Remove pack_color define
Unused at the moment.

../src/panfrost/midgard/midgard_compile.c:124:29: warning: ‘m_pack_colour’ defined but not used [-Wunused-function]
  124 |  static midgard_instruction m_##name(unsigned ssa, unsigned address) { \
      |                             ^~
../src/panfrost/midgard/midgard_compile.c:145:22: note: in expansion of macro ‘M_LOAD_STORE’
  145 | #define M_LOAD(name) M_LOAD_STORE(name, false)
      |                      ^~~~~~~~~~~~
../src/panfrost/midgard/midgard_compile.c:213:1: note: in expansion of macro ‘M_LOAD’
  213 | M_LOAD(pack_colour);
      | ^~~~~~

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
2020-01-24 18:53:31 +00:00
Alyssa Rosenzweig 6c95ea6bd7 pan/decode: Remove last_size
Fixes ../src/panfrost/pandecode/decode.c: In function ‘pandecode_jc’:
../src/panfrost/pandecode/decode.c:2859:14: warning: variable ‘last_size’ set but not used [-Wunused-but-set-variable]
 2859 |         bool last_size;

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
2020-01-24 18:53:31 +00:00
Alyssa Rosenzweig d126515a16 panfrost: Don't use implicit mali_exception_status enum
Fixes ../src/panfrost/pandecode/public.h:53:33: warning: ‘enum mali_exception_access’ declared inside parameter list will not be visible outside of this definition or declaration

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
2020-01-24 18:53:31 +00:00
Samuel Pitoiset 4a553212fa radv: enable ACO support for GFX6
CTS should pass, as well as Crucible and the few number of Piglit tests.

List of game benchmarks tested:
- Dawn of War 3
- Serious Sam 2017
- Shadow of The Tomb Raider
- The Talos Principle
- Thrones of Britannia
- Total Warhammer 2
- Total War: Three Kingdoms

Note that F12017 hangs with or without ACO on GFX6 at the moment.

My whole pipelinedb (~30 games) doesn't trigger any compiler crashes.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2401
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Samuel Pitoiset d4b4f40595 aco: copy the literal offset of SMEM instructions to a temporary
GFX6 only supports up to 8-bit for the literal offset, so make sure
it's copied to a temporary SGPR before emitting a SMEM instruction.
The optimizer will propagate the literal offset if possible anyways.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Samuel Pitoiset 1ac49ba908 aco: fix a hazard with v_interp_* and v_{read,readfirst}lane_* on GFX6
It's required to insert 1 wait state if the dst VGPR of any v_interp_*
is followed by a read with v_readfirstlane or v_readlane to fix GPU
hangs on GFX6. Note that v_writelane_* is apparently not affected.
This hazard isn't documented anywhere but AMD confirmed it.

This fixes a GPU hang with the texturemipmapgen Sascha demo on GFX6.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Samuel Pitoiset b9cc50fbce aco: fix a hardware bug for MRTZ exports on GFX6
GFX6 (except OLAND and HAINAN) has a bug that it only looks at
the X writemask component.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Brian Ho f55e215b8c turnip: Implement vkCmdCopyQueryPoolResults for occlusion queries
Use CP_COND_EXEC and CP_COND_WRITE to conditionally copy the results
of a query to a buffer based off the query's availability.

Fixes: #2238
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho 9a3656b9fd turnip: Implement vkCmdResetQueryPool
Clears the available bit for each requested query on the GPU.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho 97fa4cb3dc turnip: Implement vkGetQueryPoolResults for occlusion queries
Implements fetching the results of a query pool with the
VK_QUERY_RESULT_WAIT_BIT, VK_QUERY_RESULT_WITH_AVAILABILITY_BIT,
and VK_QUERY_RESULT_PARTIAL_BIT flags.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho 24b95485dc turnip: Update query availability on render pass end
Unlike on an immidiate-mode renderer, Turnip only renders tiles on
vkCmdEndRenderPass. As such, we need to track all queries that were
active in a given render pass and defer setting the available bit
on those queries until after all tiles have rendered.

This commit adds a draw_epilogue_cs to tu_cmd_buffer that is
executed as an IB at the end of tu_CmdEndRenderPass. We then emit
packets to this command stream that update the availability bit of a
given query in tu_CmdEndQuery.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho f750dd2ab8 turnip: Implement vkCmdEndQuery for occlusion queries
Mostly a translation of freedreno's implementation of glEndQuery for
GL_SAMPLES_PASSED query objects with a slight modification to set the
availability bit of the query bo (slot->available) if the query was
not ended inside a render pass.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho 5824a59ee2 turnip: Implement vkCmdBeginQuery for occlusion queries
Mostly a translation of freedreno's implementation of glBeginQuery for
GL_SAMPLES_PASSED query objects with special logic for handling tiled
render passes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho 78dea40b1c turnip: Implement vkCreateQueryPool for occlusion queries
General structure is inspired by anv's implementation in genX_query.c.
We define a packed struct that tracks sample count at the beginning of
the query and at the end; the result of the occlusion query is then
slot->end - slot->begin.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho a155ab93a3 turnip: Update tu_query_pool with turnip-specific fields
tu_query_pool was forked from radv_query_pool, but we will need a
different set of fields to implement queries in turnip.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Jason Ekstrand 0aa13245c1 anv: Allow HiZ in read-only depth layouts
This improves the performance of Aztec Ruins by 5% on ICL.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
2020-01-24 17:42:36 +00:00
Jason Ekstrand bf3a262a80 anv: Add a usage parameter to anv_layout_to_aux_usage
Most places we actually know the usage and can provide it.  There are
two exceptions to this:

 1. We pass 0 into get_blorp_surf_for_anv_image when we use
    ANV_IMAGE_LAYOUT_EXPLICIT_AUX because anv_layout_to_aux_usage is
    never actually called so it doesn't matter.

 2. We pass 0 into anv_layout_to_aux_usage in transition_color_buffer.
    However, the coming commits which will begin using the usage
    parameter only care about depth.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
2020-01-24 17:42:36 +00:00
Jason Ekstrand f8a4de6316 anv: Use isl_aux_state for HiZ resolves
Rather than looking at the aux usage, we look at the isl_aux_state which
provides us with more detailed information.  This commit adds a couple
helpers to isl which let us quickly determine if we have valid depth/hiz
on the initial layout and if we need valid depth/hiz for the final
layout.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
2020-01-24 17:42:36 +00:00
Jason Ekstrand 9a1232a745 anv: Add a layout_to_aux_state helper
This new helper maps VkImageLayout enums to isl_aux_state enums which
are the hardware's concept of image layouts.  We can then use the aux
state to get the fast clear type and the aux usage.  This should yield
no functional change in driver behavior.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
2020-01-24 17:42:36 +00:00
Jason Ekstrand 769d6ba200 anv: Use TRANSFER_SRC_OPTIMAL for depth/stencil MSAA resolves
As of 52ad1712ed, TRANSFER_SRC_OPTIMAL and SHADER_READ_ONLY_OPTIMAL
are now identical for depth buffers so there's no reason why we need to
use the "wrong" layout.  Technically, according to Vulkan, blits and
MSAA resolves are transfer ops so we should use the transfer layout now
that we can.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
2020-01-24 17:42:36 +00:00
Jason Ekstrand 71c0f9e76d intel/blorp: resize src and dst surfaces separately
When copying to an RGB surface, we treat it as an R only one of three
times the width, which may end up being larger than the maximum size
supported by the hardware and so it hits the shrink path. This forced
both source and destination surfaces to be shrunk, even though it's not
necessary for the former, and may even hit some assertions in some
cases, such as the surface being compressed.

Fixes several tests under dEQP-VK.api.copy_and_blit.core.image_to_image.dimensions.*

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3422>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3422>
2020-01-24 17:02:40 +00:00
Samuel Pitoiset 918f00eef8 aco: combine MRTZ (depth, stencil, sample mask) exports
Instead of emitting up to 3 for each different components (depth,
stencil and sample mask). This is needed to fix a hw bug on GFX6.

Totals from affected shaders:
SGPRS: 34728 -> 35056 (0.94 %)
VGPRS: 26440 -> 26476 (0.14 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1346088 -> 1344180 (-0.14 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 3922 -> 3915 (-0.18 %)
Wait states: 0 -> 0 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3538>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3538>
2020-01-24 16:42:15 +00:00
Timur Kristóf c787b8d2a1 aco/gfx10: Fix VcmpxExecWARHazard mitigation.
The SOPP instruction shouldn't have a definition, and its block
should be set to -1 in order to prevent it from being recognized
as a branch.
Also fix a typo in the readme.

Fixes: d6dfce02d0
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3552>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3552>
2020-01-24 16:21:08 +00:00
Timur Kristóf 8a32f57fff aco: Transform uniform bitwise instructions to 32-bit if possible.
This allows removing superfluous s_cselect instructions
that come from turning booleans into 64-bit vector condition.

v2 by Daniel Schürmann:
- Make the code massively simpler
v3 by Timur Kristóf:
- Fix regressions, make it work in wave32 mode
- Eliminate extra moves by not always using the SCC definition
- Use s_absdiff_i32 for uniform XOR
- Skip the transformation for uncommon or invalid instructions

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3450>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3450>
2020-01-24 14:40:45 +00:00
Martin Fuzzey d1925fec53 etnaviv: update Android build files
etnaviv no longer builds on Android, fix this.

Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3447>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3447>
2020-01-24 14:03:28 +00:00
Rhys Perry b046f55086 aco: use nir_move_copies
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry 72e9a23443 radv/aco: use ACO for GS copy shaders
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry f8f7712666 aco: implement GS copy shaders
v5: rebase on float_controls changes
v7: rebase after shader args MR and load/store vectorizer MR

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry de4ce66f5c aco: remove needs_instance_id
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry e192e268de aco: explicitly mark end blocks for exports
For GS copy shaders, whether we want to do exports is conditional. By
explicitly marking the end blocks, we can mark an IF's then branch as an
export block and ensure that's where the assembler inserts null exports.

v6: only fixup exports in the end block, like before
v8: simplify some code

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry d46a54ecff radv/aco: allow ACO for GS
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry 8bad100f83 aco: implement GS on GFX7-8
GS is the same on GFX6, but GFX6 isn't fully supported yet.

v4: fix regclass
v7: rebase after shader args MR

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry 40bb81c9dd radv/aco,aco: implement GS on GFX9+
v2: implement GFX10
v3: rebase
v7: rebase after shader args MR
v8: fix gs_vtx_offset usage on GFX9/GFX10
v8: use unreachable() instead of printing intrinsic
v8: rename output_state to ge_output_state
v8: fix formatting around nir_foreach_variable()
v8: rename some helpers in the scheduler
v8: rename p_memory_barrier_all to p_memory_barrier_common
v8: fix assertion comparing ctx.stage against vertex_geometry_gs

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry 70f63c1988 aco: improve support for s_sendmsg
In particular, the messages needed for GS.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry 0da7b3b18b radv: move gs copy shader creation before other variants
ACO lowers output derefs which breaks the shader_info pass used by gs copy
shader creation.

v3: rebase

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Timur Kristóf 23edcf6490 aco: Make a better guess at which instructions need the VCC hint.
Previously, bool_to_vector_condition would always set the VCC hint
on its result. This commit improves it by having the optimizer set
the VCC hint only when the result really needs to be in the VCC.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451>
2020-01-24 13:14:23 +00:00