The spec seems clear that this is not allowed, but the Nvidia binary
driver forces apps to add layout qualifiers, so this works around the
issue for No Mans Sky until the CTS can be sorted out.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The spec is quite clear this is not allowed:
From Section 4.4. (Layout Qualifiers) of the GLSL 4.60 spec:
"Layout qualifiers can appear in several forms of declaration.
They can appear as part of an interface block definition or
block member, as shown in the grammar in the previous section.
They can also appear with just an interface-qualifier to establish
layouts of other declarations made with that qualifier:
layout-qualifier interface-qualifier ;
Or, they can appear with an individual variable declared with
an interface qualifier:
layout-qualifier interface-qualifier declaration ;"
From Section 4.10 (Memory Qualifiers) of the GLSL 4.60 spec:
"Layout qualifiers cannot be used on formal function parameters,
and layout qualification is not included in parameter matching."
However, on the Nvidia binary driver, shaders actually fail to compile
if image function params don't have a layout qualifier. This results
in applications such as No Mans Sky using layout qualifiers on params.
I've submitted a CTS test to expose this problem in the Nvidia driver
but until that is resolved this patch will help Mesa drivers work
around the issue.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This fixes compilation of some "No Mans Sky" shaders where the stringification
happens in branches intended for DX12.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This hijacks the top 16 bits of swizzle to pass in the swizzle
for the second channel.
This fixes handling .yx swizzles of 64-bit values.
This should fix up radeonsi and llvmpipe.
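As a rough illustration of the packing idea only (the helper names and the
assumption that each swizzle fits in 16 bits are hypothetical, not taken
from the patch), two swizzles can share one 32-bit word like this:

```python
# Hypothetical sketch: the low 16 bits hold the swizzle for the first
# channel, the top 16 bits hold the swizzle for the second channel.
SWIZZLE_BITS = 16
SWIZZLE_MASK = (1 << SWIZZLE_BITS) - 1


def pack_swizzles(first, second):
    """Combine two 16-bit swizzle values into one 32-bit word."""
    if not (0 <= first <= SWIZZLE_MASK and 0 <= second <= SWIZZLE_MASK):
        raise ValueError("each swizzle must fit in 16 bits")
    return (second << SWIZZLE_BITS) | first


def unpack_swizzles(packed):
    """Return (first, second) as stored by pack_swizzles()."""
    return packed & SWIZZLE_MASK, (packed >> SWIZZLE_BITS) & SWIZZLE_MASK
```

Existing callers that only ever pass a 16-bit swizzle are unaffected in
this sketch, because the top half simply stays zero.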
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107524
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
We could enable it for lower versions of GL but this allows us
to just use the existing version/extension checks that are already
used by the core profile.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Now that the drivers are lowering to surface indices themselves, we no
longer need to push the surface index into the shader.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Previously, the back-end compiler turned image access into magic uniform
reads and there was a complex contract between back-end compiler and
driver about setting up and filling out those params. As of this
commit, both drivers now lower image_deref_load_param_intel intrinsics
to load_uniform intrinsics controlled by the driver and lower the other
image_deref_* intrinsics to image_* intrinsics which take an actual
binding table index. There are still "magic" uniforms but they are now
added and controlled entirely by the driver and that contract no longer
spans components.
This also has the side-effect of making most image use compile-time
binding table indices. Previously, all image access pulled the binding
table index from a uniform. Part of the reason for this was that the
magic uniforms made it difficult to decouple binding table indices from
the uniforms and, since they are indexed completely differently
(especially in Vulkan), it was hard to pull them apart. Now that the
driver is handling both, it's trivial to decouple the two and provide
actual binding table indices.
Shader-db results on Kaby Lake:
total instructions in shared programs: 15166872 -> 15164293 (-0.02%)
instructions in affected programs: 115834 -> 113255 (-2.23%)
helped: 191
HURT: 0
total cycles in shared programs: 571311495 -> 571196465 (-0.02%)
cycles in affected programs: 4757115 -> 4642085 (-2.42%)
helped: 73
HURT: 67
total spills in shared programs: 10951 -> 10926 (-0.23%)
spills in affected programs: 742 -> 717 (-3.37%)
helped: 7
HURT: 0
total fills in shared programs: 22226 -> 22201 (-0.11%)
fills in affected programs: 1146 -> 1121 (-2.18%)
helped: 7
HURT: 0
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This commit expands the current memory access enum to contain the extra
two bits provided for images. We choose to follow the SPIR-V convention
of NonReadable and NonWriteable because readonly implies that you *can*
read so readonly + writeonly doesn't make as much sense as NonReadable +
NonWriteable.
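A minimal sketch of why the negative form composes better, using
illustrative flag names and bit values that are not copied from the Mesa
headers:

```python
from enum import IntFlag


class Access(IntFlag):
    # Illustrative names and bit positions only.
    COHERENT = 1 << 0
    RESTRICT = 1 << 1
    VOLATILE = 1 << 2
    NON_READABLE = 1 << 3
    NON_WRITEABLE = 1 << 4


def can_read(access):
    """Reads are allowed unless NON_READABLE is set."""
    return not (access & Access.NON_READABLE)


def can_write(access):
    """Writes are allowed unless NON_WRITEABLE is set."""
    return not (access & Access.NON_WRITEABLE)
```

With this shape, setting both bits simply means "neither reads nor
writes", and no bit ever implies a permission that another bit takes away.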
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The GLSL spec allows you to set both the "readonly" and "writeonly"
qualifiers on images to indicate that it can only be used with
imageSize. However, we had no way of representing this in the linked
shader and flagged it as GL_READ_ONLY. This is good from a "does it use
this buffer?" perspective but not from a format and access lowering
perspective. By using GL_NONE when "readonly" and "writeonly" are
both set, we can detect this case in the driver and handle it correctly.
Nothing currently relies on the type of surface in the "readonly" +
"writeonly" case but that's about to change. i965 is the only driver
which uses the ImageAccess field and gl_bindless_image::access is
currently unused.
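The mapping described above can be sketched as follows. The GL enum values
are the standard ones; the function name is hypothetical and this is not
the linker code itself:

```python
GL_NONE = 0x0000
GL_READ_ONLY = 0x88B8
GL_WRITE_ONLY = 0x88B9
GL_READ_WRITE = 0x88BA


def image_access(readonly, writeonly):
    """Map GLSL image qualifiers to the access value stored for the image."""
    if readonly and writeonly:
        # Only imageSize() is legal, so report neither read nor write.
        return GL_NONE
    if readonly:
        return GL_READ_ONLY
    if writeonly:
        return GL_WRITE_ONLY
    return GL_READ_WRITE
```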
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Having the array length component stored in .z was a small convenience
for the ISL image param filling code and an annoyance in the NIR
lowering code. The only convenience of treating 1D arrays like 2D
arrays in the lowering code is in the address calculation code so let's
put all the complexity there as well.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This commit moves our storage image format conversion codegen into NIR
instead of doing it in the back-end. This has the advantage of letting
us run it through NIR's optimizer which is pretty effective at shrinking
things down. In the common case of rgba8, the number of instructions
emitted after NIR is done with it is half of what it was with the
lowering happening in the back-end. On the downside, the back-end's
lowering is able to directly use predicates and the NIR lowering has to
use IFs.
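For readers unfamiliar with what an rgba8 conversion involves, here is a
rough model of the arithmetic, assuming a UNORM format with red in the low
byte. It illustrates the math only and is not the NIR pass:

```python
def unpack_rgba8_unorm(word):
    """Split a 32-bit texel into four floats in [0, 1], red in the low byte."""
    return tuple(((word >> (8 * i)) & 0xFF) / 255.0 for i in range(4))


def pack_rgba8_unorm(rgba):
    """Clamp, scale and round four floats back into one 32-bit texel."""
    word = 0
    for i, channel in enumerate(rgba):
        clamped = min(max(channel, 0.0), 1.0)
        word |= int(clamped * 255.0 + 0.5) << (8 * i)
    return word
```

Each channel costs a shift, a mask and a multiply in this form, which is
the kind of straight-line code an optimizer can shrink.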
Shader-db results on Kaby Lake:
total instructions in shared programs: 15166910 -> 15166872 (<.01%)
instructions in affected programs: 5895 -> 5857 (-0.64%)
helped: 15
HURT: 0
Clearly, we don't have that much image_load_store happening in the
shaders in shader-db....
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Dead code will get rid of them eventually but it's better if they're
just gone so we guarantee they won't trip up later passes.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Instead of requiring 4 components, this allows them to potentially use
fewer. Both the SPIR-V and GLSL paths still generate vec4 intrinsics so
drivers which assume 4 components should be safe. However, we want to
be able to shrink them for i965.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We have a name for that, it's called a uvec. This just makes the
function name a bit shorter. While we're here, we also add an assert
for one of the assumptions this function makes.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
There is nothing inherent about these opcodes that requires them to only
take scalars. It's very convenient if we let them take vectors as well.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Found by inspection. This doesn't help much now but we'll see this
pattern with images if you load UNORM and then store UNORM.
Shader-db results on Kaby Lake:
total instructions in shared programs: 15166916 -> 15166910 (<.01%)
instructions in affected programs: 761 -> 755 (-0.79%)
helped: 6
HURT: 0
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This adds the "(a << N) >> M" family of mask or sign-extensions. Not a
huge win right now but this pattern will soon be generated by NIR format
lowering code.
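To show what the shift pairs compute, here is a small 32-bit model of the
simplest case, N == M. The helper names are illustrative, and the actual
rules may also cover N != M:

```python
MASK32 = 0xFFFFFFFF


def ishl(a, n):
    """32-bit left shift."""
    return (a << n) & MASK32


def ushr(a, n):
    """32-bit logical (zero-filling) right shift."""
    return (a & MASK32) >> n


def ishr(a, n):
    """32-bit arithmetic (sign-filling) right shift."""
    a &= MASK32
    if a & 0x80000000:
        a -= 1 << 32
    return (a >> n) & MASK32


def mask_low_bits(a, n):
    """(a << n) >> n with a logical shift keeps the low 32 - n bits."""
    return ushr(ishl(a, n), n)


def sign_extend_low_bits(a, n):
    """(a << n) >> n with an arithmetic shift sign-extends the low 32 - n bits."""
    return ishr(ishl(a, n), n)
```

So the logical form is equivalent to an AND with a constant mask, and the
arithmetic form is a sign-extension of a narrower field.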
Shader-db results on Kaby Lake:
total instructions in shared programs: 15166918 -> 15166916 (<.01%)
instructions in affected programs: 36 -> 34 (-5.56%)
helped: 2
HURT: 0
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
If it's not the right bit-size, it may not actually be the correct
extraction. For now, we'll only worry about 32-bit versions.
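A small model of why the bit size matters. The shift-by-24 form is an
assumed example of an open-coded byte extract and is not necessarily the
exact pattern the fixed commits match:

```python
def extract_u8(a, byte):
    """Return byte number `byte` of a, zero-extended."""
    return (a >> (8 * byte)) & 0xFF


def ushr(a, n, bit_size):
    """Logical right shift of an unsigned value of the given width."""
    return (a & ((1 << bit_size) - 1)) >> n


a32 = 0xAABBCCDD
a64 = 0x11223344AABBCCDD

# At 32 bits, shifting right by 24 leaves exactly the top byte.
same_at_32 = ushr(a32, 24, 32) == extract_u8(a32, 3)

# At 64 bits, the same shift leaves five bytes, not one.
same_at_64 = ushr(a64, 24, 64) == extract_u8(a64, 3)
```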
Fixes: 905ff86198 "nir: Recognize open-coded extract_u16"
Fixes: 76289fbfa8 "nir: Recognize open-coded extract_u8"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Blending isn't valid for integer formats. Rather than having drivers
worry about this, just disable blending in this case. This hopefully
will increase hits in the CSO cache as well, by eliminating most of the
meaningless fields in this case.
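One way to picture the cache effect, with hypothetical field names standing
in for the real blend state:

```python
def effective_blend_state(requested, format_is_integer):
    """Return the blend state actually used to build the CSO key."""
    if format_is_integer:
        # Blending is not valid here, so every request collapses to "off".
        return {"blend_enable": False}
    return dict(requested)


# Two requests that differ only in fields that are meaningless for
# integer formats (field names are made up for this sketch).
additive = {"blend_enable": True, "rgb_func": "add"}
subtractive = {"blend_enable": True, "rgb_func": "subtract"}
```

Under this sketch both requests produce the same key for an integer render
target, so the second lookup would hit the cache.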
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This doesn't seem to make any difference in testing, but it fixes a
failed assertion when dumping sm3 shaders.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Setting GL_POINT_SPRITE_COORD_ORIGIN to GL_LOWER_LEFT did not work for
vgpu9. We can use the rasterizer sprite_coord_enable bitfield as-is.
We need to index into it using the TGSI semantic index, not the
register index.
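The difference between the two indices can be sketched like this. The
function name and the example shader input are hypothetical:

```python
def is_sprite_coord(sprite_coord_enable, semantic_index):
    """Test the rasterizer bitfield using the input's semantic index."""
    return bool(sprite_coord_enable & (1 << semantic_index))


# Example input: a generic varying with semantic index 2 that happens to
# live in register 0, with sprite coords enabled for semantic index 2.
sprite_coord_enable = 0b100
register_index = 0
semantic_index = 2
```

Indexing with the register index (0) would miss the enabled bit in this
example, while the semantic index (2) finds it.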
This fixes the Piglit fbo-gl_pointcoord and glsl-fs-pointcoord tests.
Testing done: Piglit, Mesa sprite demos
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
The flag_rect and flag_buffer fields didn't sufficiently capture
the state changes needed for those resource types. For example,
if a texture binding was changed from a 500x500 rect texture to a
400x400 rect texture we didn't set SVGA_NEW_TEXTURE_CONSTS. But
we need to do that to emit the new texcoord scale factors to the
constant buffers. Rather than track the sizes of all bound
resources, just set the flag if the resource is a rect. Same
story with texture buffers.
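A simplified picture of the problem, assuming the scale factors are the
reciprocals of the texture size and using a placeholder bit value for the
flag:

```python
SVGA_NEW_TEXTURE_CONSTS = 1 << 0  # placeholder value for this sketch


def texcoord_scale(width, height):
    """Assumed scale factors for normalizing rect texture coordinates."""
    return (1.0 / width, 1.0 / height)


def dirty_flags_for_bind(new_target):
    """Flag the constants whenever a rect or buffer texture is bound."""
    if new_target in ("rect", "buffer"):
        return SVGA_NEW_TEXTURE_CONSTS
    return 0
```

Since a 500x500 and a 400x400 rect texture need different constants,
rebinding must mark them dirty even though the target type did not change.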
Also, since rect/buffer textures are usable with VS/GS shaders,
add SVGA_NEW_TEXTURE_CONSTS to the flags we check for emitting
VS/GS constants.
This seems to help with XFCE / xfwm4 desktop scaling.
VMware issue 2156696.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Add const qualifiers. Add 'f' suffix on floats to avoid double
promotion.
Remove unneeded shader type assertion since the switch statement
handled it already.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Move this now-unused function into the existing comment block, which was its only prior use.
../../../../../src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp:2645:1: warning:
unused function 'partitionLoadStore' [-Wunused-function]
partitionLoadStore(uint8_t comp[2], uint8_t size[2], uint8_t mask)
Fixes: 86e4440361 ("nouveau: codegen: Disable more old resource handling code")
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
During intra-stage linking, some out variables can be dropped because
they are not used in the shader containing the main function. But these
out vars can be referenced in later stages, which can lead to further
linking errors.
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105731
Newer blit tests enable depth and stencil blits. We currently don't
support them, but we can by iterating over the aspect masks (copying some
logic from the CopyImage function).
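Iterating over an aspect mask amounts to visiting each set bit in turn.
This sketch uses the Vulkan aspect bit values but a hypothetical helper
name, and it is not the driver code:

```python
VK_IMAGE_ASPECT_COLOR_BIT = 0x1
VK_IMAGE_ASPECT_DEPTH_BIT = 0x2
VK_IMAGE_ASPECT_STENCIL_BIT = 0x4


def each_aspect(mask):
    """Yield every set aspect bit in the mask, lowest bit first."""
    while mask:
        yield mask & -mask  # isolate the lowest set bit
        mask &= mask - 1    # clear that bit


depth_stencil = VK_IMAGE_ASPECT_DEPTH_BIT | VK_IMAGE_ASPECT_STENCIL_BIT
aspects = list(each_aspect(depth_stencil))
```

A combined depth/stencil blit then becomes one blit per yielded aspect.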
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9f44745eca ("anv: Use blorp to implement VkBlitImage")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
OpenGL ES spec states:
"For normalized fixed-point rendering surfaces, the combination format
RGBA and type UNSIGNED_BYTE is accepted."
This fixes the following failing VK-GL-CTS tests:
KHR-GLES3.packed_pixels.pbo_rectangle.rgba8_snorm
KHR-GLES3.packed_pixels.rectangle.rgba8_snorm
KHR-GLES3.packed_pixels.varied_rectangle.rgba8_snorm
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107658
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Andres Gomez <agomez@igalia.com>