Commit Graph

130009 Commits

Author SHA1 Message Date
Boris Brezillon f04e5ef7ff panfrost: Add missing tile-buffer formats to the format enum
Some tile-buffer formats are missing, add them.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Boris Brezillon 685d45ff93 pan/bi: Special-case load_input for blend shaders
Blend shaders are passed blend inputs through r0-r3. Let's emit a MOV
from those register when we see a load_input intrinsic.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Boris Brezillon 0d40460757 pan/bi: Reserve r0-r3 in blend shaders
Blend shaders are passed the source color through r0-r3. Let's avoid
allocating those. The is definitely not the right solution but is good
enough for now.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Boris Brezillon 3432d0a3e5 pan/bi: Special-case BLEND instruction emission for blend shaders
Blend shaders shouldn't use the blend descriptors stored in the FAU RAM
since this is what triggered the blend shader call in the first place.
The descriptor is instead extracted from the compiler inputs and passed
as a constant to the blend instruction.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Boris Brezillon 2f3f5da91d pan/bi: Collect return addresses of blend calls
We will need that for blend shaders so they can be passed a return
address and jump back to the fragment shader when they're done.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Boris Brezillon 8da0a1d5fd pan/bi: Add load_output support
This is mapped to the LD_TILE instruction. Note that multi-sample RTs
are not supported yet.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Boris Brezillon c7748968ba panfrost: Flag blend shader function as an entry point
Some lowering functions used by bifrost are searching for an entry point.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Boris Brezillon 6d3fce5680 panfrost: Scalarize nir_load_blend_const_color_rgba
Bifrost is a scalar architecture, which means we can't load all
components of the blend constant at once. We could add a lowering pass
to scalarize nir_load_blend_const_color_rgba, but it's easier to handle
that at when lowering the blend equations.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Boris Brezillon 8d707cd918 panfrost: Add a "Bifrost Internal Blend" descriptor
This descriptor can be passed directly as a constant to the bifrost
BLEND instruction and we'll need to pass this information to blend
shaders. Let's extract the "Bifrost Internal Blend" descriptor from the
"Bifrost Blend Overlay" definition.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Boris Brezillon e6186c2042 pan/bi: Support indirect jumps
We need that for blend shaders which are passed the return address
through r48.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Boris Brezillon 1a1d9cce46 pan/bi: Add support for load_blend_const_color_{r,g,b,a}_float
Needed for blend shaders.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Boris Brezillon 16179c89d1 pan/bi: Rework blend descriptor access handling
The current logic assumes blend descriptors are always retrieved from
the blend descriptor slots present in the FAU RAM, but this assumption
no longer stands when we add blend shaders to the mix. In that case we
need to use an 'opaque blend' whose descriptor is passed through
embedded constants.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Boris Brezillon 6dd2a76126 pan/bi: Get rid of the regs argument in bi_assign_fau_idx()
Regs are already part of the bundle struct, let's just pass a pointer
to this bundle object instead of passing both the bundle and regs.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Boris Brezillon f25850bf5f pan/bi: Use canonical name for FAU RAM sources
The uniform_constant field and BIFROST_SRC_CONST_{LO,HI} definitions
seem to imply that those only deal with embedded constants. Let's
rename them to reflect the fact that they actually encode accesses to
the Fast-Access-Uniform RAM.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Boris Brezillon 111cf7f0e8 pan/bi: Copy blend shader info from compile_inputs
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Boris Brezillon 6c61f0b8e5 panfrost: Extend compile_inputs to pass a blend descriptor
This is needed for BLEND instructions used from a blend shader so we can
store the result of the shader-based blending back to the tile buffer.
We let the gallium driver build this blend descriptor for us in order
to keep the compiler cmdstream-agnostic.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Boris Brezillon d8326ceafb panfrost: Fix fixed-function blend on bifrost
The conversion from a 32b float to a 16b fixed-point number was wrong.

Fixes: 8389976b7c ("panfrost: XML-ify the blend descriptors")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>
2020-10-15 08:05:23 +02:00
Iago Toral Quiroga 442f48f27b v3d/compiler: implement load interpolated input intrinsics
We will lower GLSL interpolateAt functions to these.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7155>
2020-10-15 02:04:04 +02:00
Iago Toral Quiroga 3ec165bce9 broadcom/compiler: track partially interpolated fragment inputs
We will need these to implement GLSL's interpolateAt*() functions where
we are required to perform interpolation in the shader at arbitrary
offsets.

Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7155>
2020-10-15 02:04:04 +02:00
Kenneth Graunke 71ed8c5aa6 iris: Fix doubling of shared local memory (SLM) sizes.
Commit 67ee9c5f55 added support for
using the `pipe_compute_state::req_local_mem` field, because Clover
can have a run-time specified size that isn't baked into the shaders.

However, it started adding the static size from the shader to the
dynamic state-supplied size.  The Mesa state tracker fills out
req_local_mem to prog->Base.info.cs.shared_size, which is exactly
what we fill out prog_data->total_shared to be.  Effectively, this
meant that we double-counted the same SLM requirements, doubling
our space requirements.

Fixes a 10% performance regression in Synmark2's OglCSDof test.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7152>
2020-10-14 23:13:41 +00:00
Kenneth Graunke 341f5bffb7 intel/compiler, anv: Delete cs_prog_data->slm_size
cs_prog_data->slm_size is basically redundant with
prog_data->total_shared, which is the field that we actually use for
controlling the shared local memory size in all drivers.  We were
still using it in one place for VK_EXT_pipeline_executable_properties,
but we should just fix that and delete the field.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7152>
2020-10-14 23:13:41 +00:00
Arcady Goldmints-Orlov e881290979 broadcom/compiler: use nir io semantics
This allows to clean up some code.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6721>
2020-10-14 22:54:58 +00:00
Alejandro Piñeiro 9b01598fe5 nir/lower_io_to_scalar: update io semantics on per-component inst
When we replace the original instruction with per-channel operations,
the new instruction should inherint the semantics of the original
instruction.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6721>
2020-10-14 22:54:58 +00:00
Arcady Goldmints-Orlov ac5f0ee19c broadcom/compiler: support varyings with struct types
This adds support for using structs as outputs from vertex shaders and
inputs to fragment shaders.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6721>
2020-10-14 22:54:58 +00:00
Eric Engestrom ebd5b555c1 docs/release-calendar: plan 20.3 release
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6616>
2020-10-14 21:43:35 +00:00
Jason Ekstrand f8117f7051 intel/fs: Allow constant-propagation into SAMPLEINFO and IMAGE_SIZE
Without this, we end up with indirect sampler messages all the time
because we don't propagate the texture/image BTI.  This makes debugging
shaders with imageSize or textureSamples in them a pain.

Shader-db results on Ice Lake:

    total instructions in shared programs: 19720612 -> 19720564 (<.01%)
    instructions in affected programs: 4998 -> 4950 (-0.96%)
    helped: 12
    HURT: 0

All affected shaders were compute shaders in Deus Ex: Mankind Divided.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6794>
2020-10-14 21:35:30 +00:00
Eric Engestrom 438a409290 docs: update calendar and link releases notes for 20.1.10
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7147>
2020-10-14 21:28:23 +02:00
Eric Engestrom 713b666f29 docs: add release notes for 20.1.10
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7147>
2020-10-14 21:26:26 +02:00
Nanley Chery 0d9216a7cb isl: Allow CCS for 8bpp surfaces with 3+ miplevels
I can't find a restriction for enabling CCS on these surfaces in recent
versions of the Bspec. Since I didn't cite my source, I'm not even sure
such a restriction existed in the first place.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7085>
2020-10-14 18:44:08 +00:00
Nanley Chery f94ba6b6f5 iris: Add fast-clear restriction for 8bpp surfaces
For 8bpp surfaces on TGL, prevent LOD1+ from being fast-cleared. This
will be relevant once ISL starts allowing CCS for 8bpp surfaces with
more than 2 miplevels. I verified the problem behind this restriction
with a modified version of the fbo-clearmipmap piglit test.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7085>
2020-10-14 18:44:08 +00:00
Dylan Baker 1affcea37a docs: update calendar and link releases notes for 20.2.1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7141>
2020-10-14 17:51:00 +00:00
Dylan Baker dee2fdb3da docs: add SHA256 sums for 20.2.1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7141>
2020-10-14 17:51:00 +00:00
Dylan Baker 3c89e7b422 docs: add release notes for 20.2.1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7141>
2020-10-14 17:51:00 +00:00
Samuel Pitoiset bb00a6860e radv: fix optimizing needed states if some are marked as dynamic
From the Vulkan spec 1.2.157:

    "VK_DYNAMIC_STATE_STENCIL_TEST_ENABLE_EXT specifies that the
     stencilTestEnable state in VkPipelineDepthStencilStateCreateInfo
     will be ignored and must be set dynamically with
     vkCmdSetStencilTestEnableEXT before any draw call."

So, stencilTestEnable should be ignored if dynamic. While we are
at it, fix depthBoundsTestEnable too.

Cc: 20.2
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3633
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7112>
2020-10-14 17:13:29 +00:00
Eric Anholt 68daac28df docs: Document how to replicate a CI build locally.
Who hasn't needed to do this at some point?  Turns out it's not too hard
to do, and was useful for me in iterating on the Android build.

Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700>
2020-10-14 16:54:59 +00:00
Eric Anholt 0767af3ffe ci/android: Switch to using the Android NDK.
To support Android drivers, we're going to want to be tracking that Mesa's
build succeeds on a real android toolchain.  This still uses the android
stubs since these libs aren't in the NDK.

Note that I had to drop the Intel and AMD drivers currently: we don't have
LLVM cross-compiled for Android in this container, and I'm honestly hoping
ACO saves us from that.  Intel has dependencies on libexpat, which AOSP
really doesn't want to bring in, and it looks to me like those dependencies
could be optional.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700>
2020-10-14 16:54:59 +00:00
Eric Anholt ad6189920b symbols-check: Add __cxa_guard_* to the list of approved symbols.
These are introduced by the compiler during static local initialization in
c++ for thread safety.  This seems to end up being public in the driver
with --static-libc++ on android.

Reviewed-by: <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700>
2020-10-14 16:54:59 +00:00
Eric Anholt 4722491124 glsl/tests: Make the tests skip on Android binary execution failures.
We don't have a suitable exe wrapper for running them, and the missing
linker is throwing return code 255 instead of an ENOEXEC.  Catch it and
return skip from the tests.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700>
2020-10-14 16:54:59 +00:00
Eric Anholt f51ce21e4e meson: Drop adding -Wl,--gc-sections to project c/cpp arguments.
We already have the targets we care about doing this using
ld_args_gc_sections, and by adding it to project arguments we caused
warnings spam in the android clang build about the compile stage not using
the argument.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700>
2020-10-14 16:54:59 +00:00
Tony Wasserka d5a72319d6 aco/isel: Remove now unused VS-related code from create_null_export
Also replaced a hardcoded constant with the appropriate register macro.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7102>
2020-10-14 16:22:51 +00:00
Tony Wasserka c22c702f35 aco/isel: Remove some dead code
exported_pos was always initialized to true (due to the is_pos argument
of the first export_vs_varying call being true), so none of this code has
any effect.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7102>
2020-10-14 16:22:51 +00:00
Tony Wasserka bf51b11c04 aco/isel: Always export position data from VS/NGG
AMD ISA docs explicitly require this for VS, and this likely extends to
NGG too.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3615
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7102>
2020-10-14 16:22:51 +00:00
Daniel Schürmann f29c81f863 aco: use VOP2 for v_cvt_pkrtz_f16_f32 if possible
This patch also does a slight rework of export_fs_mrt_color()
to avoid setting of enabled channels which are not used.

Totals from 52404 (38.38% of 136546) affected shaders (NAVI):
SGPRs: 3097443 -> 3097435 (-0.00%)
CodeSize: 189151600 -> 188546200 (-0.32%)
Instrs: 36445061 -> 36445104 (+0.00%); split: -0.00%, +0.00%
Cycles: 1739388020 -> 1739388192 (+0.00%); split: -0.00%, +0.00%
VMEM: 21071501 -> 21071665 (+0.00%); split: +0.00%, -0.00%
SMEM: 3470983 -> 3470982 (-0.00%); split: +0.00%, -0.00%
PreSGPRs: 2058965 -> 2058962 (-0.00%)
PreVGPRs: 1860294 -> 1860295 (+0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>
2020-10-14 15:31:38 +00:00
Daniel Schürmann 7240edec2a aco: use VOP2 version of v_cvt_pkrtz_f16_f32 on GFX_6_7_10
Totals from 767 (0.56% of 136546) affected shaders (NAVI):
CodeSize: 2862208 -> 2850036 (-0.43%)
Instrs: 561572 -> 561574 (+0.00%)
Cycles: 6455420 -> 6455428 (+0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>
2020-10-14 15:31:38 +00:00
Daniel Schürmann 2f125908b3 radv,aco: lower_pack_half_2x16
This patch also optimizes pack_half_2x16(a, 0.0).

Totals from 1949 (1.43% of 136546) affected shaders (RAVEN):
SGPRs: 83376 -> 83336 (-0.05%)
CodeSize: 3532144 -> 3512352 (-0.56%)
Instrs: 660746 -> 660682 (-0.01%); split: -0.01%, +0.00%
Cycles: 6780716 -> 6780472 (-0.00%); split: -0.00%, +0.00%
VMEM: 990886 -> 990883 (-0.00%); split: +0.00%, -0.00%
SMEM: 150506 -> 150538 (+0.02%); split: +0.05%, -0.03%
SClause: 30595 -> 30594 (-0.00%); split: -0.01%, +0.00%
Copies: 40801 -> 40729 (-0.18%)
PreSGPRs: 52335 -> 52341 (+0.01%); split: -0.03%, +0.04%
PreVGPRs: 45104 -> 45097 (-0.02%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>
2020-10-14 15:31:38 +00:00
Daniel Schürmann dae1e6f756 aco: use v_cvt_pkrtz_f16_f32 for pack_half_2x16
Apparently, we forgot to remove some debug code.
This patch also fixes the round mode check to consider
the destination bit width.

Totals from 2218 (1.62% of 136546) affected shaders (RAVEN):
SGPRs: 100848 -> 100280 (-0.56%)
VGPRs: 68536 -> 66044 (-3.64%); split: -3.68%, +0.05%
CodeSize: 4882296 -> 4837220 (-0.92%); split: -0.94%, +0.01%
MaxWaves: 18990 -> 19019 (+0.15%); split: +0.19%, -0.04%
Instrs: 938150 -> 930388 (-0.83%); split: -0.83%, +0.00%
Cycles: 8699824 -> 8667648 (-0.37%); split: -0.38%, +0.01%
VMEM: 1144502 -> 1059680 (-7.41%); split: +0.06%, -7.48%
SMEM: 170076 -> 167999 (-1.22%); split: +0.22%, -1.44%
VClause: 18428 -> 18422 (-0.03%)
SClause: 41375 -> 41353 (-0.05%); split: -0.06%, +0.00%
Copies: 60008 -> 60054 (+0.08%); split: -0.31%, +0.39%
PreVGPRs: 56163 -> 56142 (-0.04%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>
2020-10-14 15:31:38 +00:00
Daniel Schürmann 9185b7c069 aco: add validation rules for p_split_vector
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>
2020-10-14 15:31:38 +00:00
Daniel Schürmann aec872cda0 aco: use p_split_vector for nir_op_unpack_half_*
This enables the use of SDWA if possible

Totals from 9933 (7.27% of 136546) affected shaders (RAVEN):
VGPRs: 731764 -> 731772 (+0.00%); split: -0.00%, +0.00%
CodeSize: 90944852 -> 90671472 (-0.30%); split: -0.30%, +0.00%
Instrs: 17881885 -> 17867831 (-0.08%); split: -0.08%, +0.00%
Cycles: 1597904072 -> 1597771260 (-0.01%); split: -0.01%, +0.00%
VMEM: 1702328 -> 1697383 (-0.29%); split: +0.13%, -0.42%
SMEM: 659583 -> 659049 (-0.08%); split: +0.01%, -0.09%
VClause: 318024 -> 318025 (+0.00%); split: -0.00%, +0.00%
SClause: 631670 -> 631707 (+0.01%); split: -0.01%, +0.01%
Copies: 1504107 -> 1504626 (+0.03%); split: -0.01%, +0.04%
PreVGPRs: 683153 -> 683180 (+0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>
2020-10-14 15:31:38 +00:00
Daniel Schürmann f503699e10 nir/opt_algebraic: optimize unpack_half_2x16_split_x(ushr, a, 16)
Same as extract_u16(a, 1)

Totals from 2021 (1.48% of 136546) affected shaders (RAVEN):
VGPRs: 129516 -> 129524 (+0.01%); split: -0.00%, +0.01%
CodeSize: 12485704 -> 12486600 (+0.01%); split: -0.00%, +0.01%
Instrs: 2435041 -> 2434999 (-0.00%); split: -0.00%, +0.00%
Cycles: 20952552 -> 20952624 (+0.00%); split: -0.00%, +0.00%
VMEM: 374492 -> 374212 (-0.07%); split: +0.01%, -0.08%
SMEM: 123309 -> 123291 (-0.01%); split: +0.00%, -0.02%
VClause: 64156 -> 64164 (+0.01%)
Copies: 191620 -> 191616 (-0.00%); split: -0.03%, +0.03%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>
2020-10-14 15:31:38 +00:00
Daniel Schürmann a38a497b86 aco: use p_create_vector for nir_op_pack_half_2x16
This enables the use of SDWA if possible

Totals from 2218 (1.62% of 136546) affected shaders (RAVEN):
VGPRs: 68508 -> 68516 (+0.01%)
CodeSize: 4897024 -> 4881068 (-0.33%); split: -0.33%, +0.00%
MaxWaves: 18992 -> 18990 (-0.01%)
Instrs: 946942 -> 939161 (-0.82%); split: -0.82%, +0.00%
Cycles: 8737668 -> 8705704 (-0.37%); split: -0.37%, +0.00%
VMEM: 1155362 -> 1145245 (-0.88%); split: +0.00%, -0.88%
SMEM: 170435 -> 170165 (-0.16%); split: +0.01%, -0.16%
VClause: 18426 -> 18425 (-0.01%)
SClause: 41376 -> 41375 (-0.00%)
Copies: 59813 -> 59787 (-0.04%); split: -0.15%, +0.10%
PreVGPRs: 56126 -> 56136 (+0.02%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>
2020-10-14 15:31:38 +00:00