Changes in this patch include:
- Rename all files in src/intel/common path
- Update the filenames used in source and build files
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>
Starting with d0d039a4d3, we emit writes to the push constant chunk
of the payload to stomp out-of-bounds data to zero for Vulkan. Then, in
369eab9420, we started emitting shader preamble code for emulated
push constants on Gen12.5 parts. In either of these cases, we can run
into issues if we don't have a proper live range for some of the payload
registers where they get used for something and then smashed by our push
handling code. We've not seen many issues with this yet because it only
happens when you have dead push constants.
Fixes: d0d039a4d3 "anv: Emit pushed UBO bounds checking code..."
Fixes: 369eab9420 "intel/fs: Emit code for Gen12-HP indirect..."
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9501>
It's easier to compare with the HW docs than a pile of hex.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9501>
In cases where the alpha coverage is enabled but the color attachment is
either unused or absent there should be a dummy mrt to make the draw behave
correctly.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Yannik Marek <yannik@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8952>
If a VkRenderPassInputAttachmentAspectCreateInfo is provided, we use the
aspects specified there. Otherwise, we default to every aspect in the
format. For attachments which are not input attachments, aspectMask is
left zero.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8857>
The Android ones we put in anv_android.c. Maybe one day we'll want a
vk_android.h to put some common Android stuff but, for now, let's keep
it contained to ANV's android code.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8857>
The variable-length stack allocations are causing issues with ubsan when
the array size is zero. Also, a heap allocation is probably safer.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8857>
- no 3D and cube textures
- no mipmapping
- no border color
- image_sample is the only supported opcode with a sampler (behaves like _lz)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9389>
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9389>
The maximum value which OPC_GETSIZE could return for one dimension
is 0x007ff0, however sampler buffer could be much bigger.
Blob uses OPC_GETBUF for them.
Fixes tests:
dEQP-VK.memory.pipeline_barrier.transfer_dst_uniform_texel_buffer.1048576
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9391>
Otherwise, we won't be able to use OPC_GETBUF to get their size.
After this change we also could get rid of the hack for OPC_GETSIZE
which scaled the size for texture buffers.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9391>
It appears that storage for varyings in a wave has an upper
limit of wavesize * max_a831 where max_a831 is 64.
Exceeding the limit seam to force gpu to reduce primitives
processed per wave, at least calculations make sense with
such interpretation.
With blob SP_HS_WAVE_INPUT_SIZE never exceeds 64 and setting
it to 65 in freedreno leads to a hang.
Copied from the commit to freedreno e5499ca2
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8187>
This fixes the assembly for many scenarios where you want to use shader
replacement.
Note: unfortunately this leaks the identifier string created while
lexing, but I couldn't find a way to avoid leaking it except for
bringing in ralloc or something (which would be way more complicated).
The only other place doing something similar in mesa is the glsl parser,
which is using ralloc (actually a linear context).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9463>
Trying to specify a floating-point value in a @const line would result
in it getting interpreted as a FLUT value and failing parsing. Fix this
by making the various FLUT tokens include the surrounding parentheses.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9463>
Mesa fixed pipeline texture loading on programmable pipeline hardware emits
a generic fragment shader program which contains gl_TexCoord.xyzw as a vec4
and then expects to configure the varying assignments to the shader in the
pipeline command stream, to select what is wired to the XYZW fragment shader
inputs.
This gl_TexCoord.xyzw is turned into texture load with projection (TGSI TXP
opcode, similar for NIR). Texture load with projection does not exist in the
Vivante GPU as a dedicated opcode and is emulated. The shader program first
divides texture coordinates XYZ by projector W and then applies regular TEX
opcode to load the texture (i.e. TEX(gl_TexCoord.xyzw/gl_TexCoord.wwww)).
For point sprites, XY are the point coordinates from VS, Z=0 and W=1, always.
The Vivante GPU can only configure varying to be either of -- point coord X,
point coord Y, used, unused -- which covers XYZ, but not W. Z is fine because
unused means 0.
W used to be 0 too before this patch and that led to division by 0 in shader.
The only known way to solve this is to set Z=0, W=1 in the shader program
itself if the point sprites are enabled. This means we have to generate a
special shader variant which does extra SET to set the W=1 in case the point
sprites are enabled.
In case of TGSI, emitting the SET.TRUE opcode permits setting W=1 without
allocating additional constants. With NIR, use nir_lower_texcoord_replace()
to lower TEXn to PNTC, which sets Z=0, W=1, and let NIR optimize the shader.
Note that nir_lower_texcoord_replace() must be called before input linking
is set up, as it might add new FS input.
Also note that it should be possible to simply drop PIPE_CAP_POINT_SPRITE
in the long run, ST would then apply the same optimization pass, but that
option is so far misbehaving. And for etnaviv TGSI this is not applicable
yet.
This fixes neverball point sprites (exit cylinder stars) and eglretrace of
gl4es pointsprite test:
https://github.com/ptitSeb/gl4es/blob/master/traces/pointsprite.tgz
Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8618>
Fixes this assert in debug builds:
in __GI___assert_fail (assertion=0x7ffff731f66b "util_cpu_caps.nr_cpus >= 1", file=0x7ffff731f650 "../src/util/u_cpu_detect.h", line=116,
function=0x7ffff7323280 <__PRETTY_FUNCTION__.11654> "util_get_cpu_caps") at assert.c:101
in util_get_cpu_caps () at ../src/util/u_cpu_detect.h:116
in _mesa_float_to_float16_rtz (val=0) at ../src/util/half_float.h:93
in util_format_r16g16b16a16_float_pack_rgba_float (dst_row=0x7fffffffbdc0 "", dst_stride=0, src_row=0x7fffffffbf90, src_stride=0, width=1, height=1)
at src/util/format/u_format_table.c:13459
in util_format_pack_rgba (format=PIPE_FORMAT_R16G16B16A16_FLOAT, dst=0x7fffffffbdc0, src=0x7fffffffbf90, w=1) at ../src/util/format/u_format.h:1525
in util_pack_color (rgba=0x7fffffffbf90, format=PIPE_FORMAT_R16G16B16A16_FLOAT, uc=0x7fffffffbdc0) at ../src/gallium/auxiliary/util/u_pack_color.h:432
in v3dv_get_hw_clear_color (color=0x7fffffffbf90, internal_type=6, internal_size=8, hw_color=0x7fffffffbf10) at ../src/broadcom/vulkan/v3dv_cmd_buffer.c:1241
v2: move call from physical device to instance init.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9408>
This restores many of the hurt shaders from the previous patch at the
expense of re-adding ldvary tracking in the scheduler.
total instructions in shared programs: 13760415 -> 13755738 (-0.03%)
instructions in affected programs: 1207560 -> 1202883 (-0.39%)
helped: 5080
HURT: 1731
Instructions are helped.
total max-temps in shared programs: 2322991 -> 2322828 (<.01%)
max-temps in affected programs: 5063 -> 4900 (-3.22%)
helped: 229
HURT: 108
Max-temps are helped.
total sfu-stalls in shared programs: 31827 -> 31545 (-0.89%)
sfu-stalls in affected programs: 478 -> 196 (-59.00%)
helped: 304
HURT: 21
Sfu-stalls are helped.
total inst-and-stalls in shared programs: 13792242 -> 13787283 (-0.04%)
inst-and-stalls in affected programs: 1220856 -> 1215897 (-0.41%)
helped: 5162
HURT: 1697
Inst-and-stalls are helped.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9471>
We get optimal ldvary pipelining by doing the following:
1) Carefully merge a paired ldvary into the previous instruction when
possible.
2) When the above succeeds, flag the ldvary as scheduled immediately so
we can merge one of its children into the current instruction.
3) When scheduling ldvary sequences, only pick up instructions that are
part of the sequence to avoid picking up something that prevents
successful pipelining.
This patch skips 3) assuming some hurt shaders in exchange for better
scheduling flexibility during ldvary sequences. Besides eliminating most
of the code dedicated to special handling ldvary sequences, this also
usually allows us to produce better code by merging instructions that are
unrelated to ldvary sequences into the ldvary sequences, which is
particularly effective to fill up the gaps produced when scheduling the
first and last ldvary sequences as well as the gaps produced by flat
and noperspective varyings sequences that don't have both mul and add
instructions.
Notice that there are some hurt shaders, because some times the extra
scheduler flexibility can lead to picking up instructions that will
break a sequence without compensating for that, typically an ldunif
that prevents us from doing the fixup for a follow-up ldvary. We will
try to correct some of these cases with the next patch.
total instructions in shared programs: 13786037 -> 13760415 (-0.19%)
instructions in affected programs: 3201387 -> 3175765 (-0.80%)
helped: 16155
HURT: 4146
Instructions are helped.
total max-temps in shared programs: 2324834 -> 2322991 (-0.08%)
max-temps in affected programs: 22160 -> 20317 (-8.32%)
helped: 1340
HURT: 103
Max-temps are helped.
total sfu-stalls in shared programs: 30685 -> 31827 (3.72%)
sfu-stalls in affected programs: 782 -> 1924 (146.04%)
helped: 253
HURT: 1416
Inconclusive result.
total inst-and-stalls in shared programs: 13816722 -> 13792242 (-0.18%)
inst-and-stalls in affected programs: 3171642 -> 3147162 (-0.77%)
helped: 15331
HURT: 4179
Inst-and-stalls are helped.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9471>
These checks depend on prev_inst being set, so move them down below
with all the other checks with the same requirement.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9471>
In case a new gl_PointCoord shader input is created, increment shader
input count and set valid driver_location to the new input variable,
otherwise the input gets aliased to input 0 and shows up in NIR_PRINT
output as whatever shader input 0 is instead of gl_PointCoord. Also
set the input as used, otherwise it might get removed.
Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9214>