Now that we have access to the interior switch statement not going through
the txs special case for coord_components, we can just use it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3728>
This lowers mediump FS outputs to fp16 in the ir3 backend. For now
this is a modest improvement, which mostly helps us whittle down the
full mediump work. Once the GLSL level support lands, then right hand
side of the store output intrinsics will be fp16 expressions and we'll
cancel out the fp16 -> fp32 -> fp 16 round trip here.
We've had different attempts at implementing this: rewriting stores in
the GLSL IR, lowering GLSL IR outputs to temporaries and inserting
conversions when writing the temporaries to the outputs. In the end,
GLSL ends up getting in the way a lot and doing it at the nir level is
easier and still possible since we have the output var precisions.
This part of the fp16 work is more of a step on the way towards full
fp16 support and will add a few extra conversion instructions:
total instructions in shared programs: 8151 -> 8163 (0.15%)
instructions in affected programs: 1187 -> 1199 (1.01%)
helped: 4
HURT: 10
total nops in shared programs: 3146 -> 3152 (0.19%)
nops in affected programs: 563 -> 569 (1.07%)
helped: 5
HURT: 10
total non-nops in shared programs: 5005 -> 5011 (0.12%)
non-nops in affected programs: 92 -> 98 (6.52%)
helped: 0
HURT: 3
total dwords in shared programs: 12832 -> 12800 (-0.25%)
dwords in affected programs: 96 -> 64 (-33.33%)
helped: 1
HURT: 0
total last-baryf in shared programs: 118 -> 115 (-2.54%)
last-baryf in affected programs: 21 -> 18 (-14.29%)
helped: 1
HURT: 0
total full in shared programs: 424 -> 417 (-1.65%)
full in affected programs: 15 -> 8 (-46.67%)
helped: 7
HURT: 0
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3822>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3822>
So far we only handle full regs of arrays during pre-allocation.
This patch is to handle half regs of arrays and also consider the size
of half regs when finding out conflicts.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3822>
This pass tries to fold f2f16 conversion into alu instructions.
This will be useful to help reduce the number of instructions once
mesa starts supporting precision lowering. For example:
add.f r0.w, r0.w, c0.x
cov.f32f16 hr2.x, r0.w
to
add.f hr2.x, r0.w, c0.x
Additionally this pass also tries to fold f2f16 conversion into load_input
instruction:
bary.f r0.x, 3, r0.w
cov.f32f16 hr0.x, r0.x
to
bary.f hr1.x, 3, r0.x
v2: Edit to not fold OPC_MAX_F and OPC_MIN_F, since that's not valid.
v3: Add OPC_ABSNEG_F to the blacklist as well.
v4: Don't remove dead cov instructions, DCE will do that later; don't
iterate through sources when a cov only has one; remove special
handling of IR3_REG_ARRAY and IR3_REG_RELATIV.
v5: Handle folding into u32.u32 movs of floats correctly, don't bail
out on IR3_REG_RELATIV or IR3_REG_ARRAY movs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3822>
Nothing used by mesa, but crashdec tool uses a few of these. And since
the practice is these days to sync mesa->envytools, adding these on the
mesa side first.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3833>
This is a builtin type (treated as uint, but with special type-aware
decoding) in envytools/cffdump. Lets teach gen_header.py about it and
drop the enum hack in the xml so I don't have to keep deleting the enum
when I sync the xml back to the freedreno envytools tree.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3833>
Similar to vkCmdClearAttachments(), we use CP_COND_REG_EXEC to
conditionally execute both the gmem and sysmem paths, except for after
the last subpass where it's known whether we're using sysmem rendering
or not.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3713>
This has only lightly been tested. It passes dEQP-VK.api.smoke.triangle,
so at least we're able to show a triangle. For now, it's just enabled
under a debug flag. In the future we'll probably want some heuristics
like what freedreno has and another debug flag to disable it except when
it's forced.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3713>
We may need shader workarounds for some formats, but for now this seems
to work at least as well as the gmem path for clearing multisample
attachments. And soon we'll start calling this even on the gmem path,
since we leave the final decision of whether to use sysmem or not up
till the end, so we can't have it assert or otherwise working tests
would assert.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3713>
I merely ported a freedreno patch to turnip which
updates some magic regsiter values.
commit ff6e148a3d
Author: Rob Clark <robdclark@chromium.org>
CommitDate: Tue Oct 29 09:19:34 2019 -0700
Subject: freedreno/a6xx: add a618 support
That's all that Rob did for gallium for a618, so I assume that's we need
for turnip also.
Tested manually with:
dEQP-VK.api.image_clearing.core.clear_color_image.2d.linear.single_layer.*
pass 300/555
fail 0/555
skip 255/555
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3743>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3743>
The value of some magic regsiters differ across chipsets. fd6_context
manages the differences by initializing them at runtime. Let's do the
same.
Add to tu_physical_device a subset of those found in fd6_context:
RB_UNKNOWN_8E04_blit
RB_CCU_CNTL_gmem
PC_UNKNOWN_9805
SP_UNKNOWN_A0F8
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3743>
Loses some information about which formats can be used in which cases, but
we encode that information in the format table anyway.
Important notes:
* RB6_R10G10B10A2_UNORM becomes FMT6_R10G10B10A2_UNORM_DEST
* TFMT6_8_8_8_UNORM becomes FMT6_8_8_8_X8_UNORM (not FMT6_8_8_8_UNORM)
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3798>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3798>
The previous commit leads to match immed values unexpectedly.
This makes constlen for each shader including bvert wrong.
Also fixes atan2 for mediump deqp tests.
Fixes: cbd1f47433 ("freedreno/ir3: convert back to 32-bit values for half constant registers.")
v2: Move conversion up above fabs/fneg modifier handling as well.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3737>
This lets is_same_type_reg() recognize that the dst and src of the
immediate MOV are the same and unblocks fp16 constant propagation.
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3737>
The freedreno gen_header.py script now only works under python3.
It contains a "print()" call which prints a blank line under python3
but prints "()" under python2.7.
However the Android build currently uses python2.
This leads to incorrect code generation and a later build error.
.../STATIC_LIBRARIES/libfreedreno_registers_intermediates/registers/adreno_common.xml.h:163:2: error: expected identifier or '('
()
Fix this by adding MESA_PYTHON3 and using it for the freedreno scripts.
Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3736>
This means you can directly use format utils on it without having to have
your own GL enum to number-of-components switch statement (or whatever) in
your vulkan backend.
Thanks to imirkin for fixing up the nouveau driver (and a couple of core
details).
This fixes the computed qualifiers for EXT_shader_image_load_store's
non-integer sizeNxM qualifiers, which we don't have tests for.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v3d)
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3355>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3355>
In addition to preparing us for dynamically resizing them, which has to
be controlled by the device, this greatly reduces the memory usage when
allocating large numbers of command buffers, making
dEQP-VK.api.object_management.max_concurrent.command_buffer_primary go
from crash -> pass.
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3621>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3621>
Document the first DWORD, which at least for the Vulkan blob on a640
isn't always 2.
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3600>