Save the number of slice in AV1 slice parameter, so that the
underlying driver (such as virgl) can handle the slice parameters
better.
Signed-off-by: Feng Jiang <jiangfeng@kylinos.cn>
Suggested-by: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23386>
And split them into UBO and SSBO
v2 (Lionel):
- Get rid of robustness fields in anv_shader_bin
v3 (Lionel):
- Do not pass unused parameters around
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17545>
We cannot find anything in the Vulkan spec requiring this. D3D12 [1]
says it's undefined as long as it doesn't crash the OS :
"Out of bounds indexing of any descriptor table from the shader
results in a largely undefined memory access, including the
possibility of reading arbitrary in-process memory as if it is a
hardware state descriptor and living with the consequence of what
the hardware does with that. This could produce a device reset, but
will not crash Windows."
[1] : https://learn.microsoft.com/en-us/windows/win32/direct3d12/advanced-use-of-descriptor-tables#out-of-bounds-indexing
Found 2 titles affected by this change
Some pretty good results on Cyberpunk 2077 :
Totals from 10285 (100.00% of 10285) affected shaders:
Instrs: 7638709 -> 7517360 (-1.59%); split: -1.64%, +0.05%
Cycles: 148047414 -> 148470916 (+0.29%); split: -0.83%, +1.12%
Subgroup size: 112544 -> 112576 (+0.03%); split: +0.04%, -0.01%
Spill count: 98 -> 90 (-8.16%)
Fill count: 90 -> 82 (-8.89%)
Max live registers: 495274 -> 479502 (-3.18%); split: -3.21%, +0.03%
Max dispatch width: 87824 -> 91168 (+3.81%); split: +4.10%, -0.29%
Gaining 297 shaders in SIMD16/32, loosing 16 SIMD32 shaders
Some not so good results on Strange Brigade :
Totals from 4027 (100.00% of 4027) affected shaders:
Instrs: 2080355 -> 2013880 (-3.20%); split: -3.20%, +0.01%
Cycles: 25405149 -> 25170579 (-0.92%); split: -1.37%, +0.45%
Max live registers: 167303 -> 168958 (+0.99%)
Max dispatch width: 33264 -> 32496 (-2.31%)
Loosing 96 SIMD16 shaders.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17545>
This replicates the same fix we did for Anv and null descriptors with
A64 messages from commit efcda1c530 ("anv: fix null descriptor
handling with A64 messages").
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17545>
Currently we put the upload and cmd bos into GART, however in the
future this might change, but for conditional rendering we must have
a GART space to read the value from. This creates a separate buffer
allocations that are gart forced. This will be used to provide
cond render with a gart location.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24520>
Structs are not allowed to contain an image in regular glsl. The only time
they are intended to be allowed to be declared in a struct is when
they are bindless.
Unfortunately the bindless spec does not meantion this behaviour
explicitly so there is no spec quote to reference but you can see in
the original commit to allow them in mesa that spec clarification was
provided 48b7882200
The spec also states that certain uses are implicitly bindless as per
the following spec quote:
"When used as shader inputs, outputs, uniform block members,
or temporaries, the value of the sampler is a 64-bit unsigned
integer handle and never refers to a texture image unit."
Given images are not allowed in regular glsl for the above types
similair to being forbidden in structs, we can also assume
declarations in structs are implicitly bindless.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24269>
- bresenham and smooth lines
These two need to override multisample rasterization to get correct
results on CTS tests.
- stippled lines
The stipple factor needs to be remapped from [1, 256] to [0, 255].
-rectangular and strict lines
Rectangular lines need multisample rasterization rules to get correctly
rasterized even for one sample. That way we get strict lines too for
VK_LINE_RASTERIZATION_MODE_DEFAULT_EXT.
As per the DX rasterization rules:
Rasterization rules for primitives are, in general, unchanged by multisample antialiasing, except:
- For a triangle, a coverage test is performed for each sample location (not for a pixel center).
If more than one sample location is covered, a pixel shader runs once with attributes interpolated at the pixel center.
The result is stored (replicated) for each covered sample location in the pixel that passes the depth/stencil test.
- A line is treated as a rectangle made up of two triangles, with a line width of 1.4.
- For a point, a coverage test is performed for each sample location (not for a pixel center).
For single sample rasterization we get the same results for the
triangles and points, but for lines we get the rectangular form instead.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24517>
dEQP-GLES31.functional.image_load_store.buffer.image_size.writeonly_7
produces:
t7 opcode: CP_COMPUTE_CHECKPOINT (6e) (8 dwords)
{ ADDR_0_LO = 0x15000 }
{ ADDR_0_HI = 0x5 }
0x18
{ ADDR_1_LEN = 3 }
0xf
{ ADDR_1_LO = 0x2e010 }
{ ADDR_1_HI = 0x5 }
and it was asserting due to sizedwords==7. Without the assert, we were
dereffing a len past the end of the packet. This len value we were
loading is also suspiciously not the location of the ADDR_1_LEN field in
the packet's XML. But then, the command stream at ADDR_1 was clearly 0xf
long, and that puts ADDR_1_LEN at the spot we would expect compared to
SET_RENDER_MODE's ADDR_1.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24358>
A bunch of our piglit fails were due to failing to compile shaders due to
a lack of spilling support. I used a simple shader with a large local
array with tunable size to determine the MEMSIZEPERITEM increment and the
location of HWSTACKOFFSET (matching a3xx locations). Unfortunately
fibers_per_sp I had to guess by taking a big spilling shader and cranking
it up until it rendered correctly. The value I found made HWSTACKOFFSET's
shift value match a6xx's, as a bit of confirmation.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24358>