Now that all consumers of GLSL use NIR, make the remaining drivers take
the path that relies on NIR to really do optimization.
nouveau steam shader-db runtime -6.69631% +/- 1.29235% (n=12).
No change on shader-db there.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16364>
The only interesting ones here were LOWER_IF_THRESHOLD (which previously
had connected to some lowering in GLSL that was broken in the face of side
effects), and FMA (which turned GLSL IR's fma() into TGSI_OPCODE_FMA
instead of MAD).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8044>
This way we can make allow_draw_out_of_order true by default for all
apps, iff the driver allows it.
And allow_draw_out_of_order=false can still be used in drirc, for
apps that need this optim to be turned off.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16139>
pipe_context shouldn't have functions that return values because this
prevent multithreading.
Move this hook to pipe_screen.
Fixes: 606e42027e ("gallium: add hook on getting canonical format")
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16078>
GL spec states that the stride for indirect multidraws:
* cannot be negative
* can be zero
* must be a multiple of 4
some drivers can't support strides which are not a multiple of the
size of the indirect struct being used, however, so rewrite those to
direct draws
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15963>
On swizzled copies canonical formats are used to reduce the formats to a
simpler subset.
Nevertheless, it is possible that some of the canonical formats defined
in Gallium are actually not supported by the drivers themselves.
This provides a driver-defined hook that can be used to provide an
alternative canonical format in case the canonical one defined by
Gallium is not supported by the driver.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15693>
it's used by 3 different drivers, so it shouldn't be in radeonsi
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15907>
After removal inline function `_mm_shuffle_epi8`, the
macro `PIPE_ARCH_SSSE3` have no need anymore, remove it too.
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14077>
This cap is no longer TGSI specific, so let's rename it to reflect
reality.
Because the name got a bit vague when removing the TGSI-bits, let's add
some more details to the name.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>
This isn't specific to TGSI, so let's update the name to reflect
reality.
Because the name of the opcode was TGSI specific, let's pick a new one,
based on the naming of the PIPE_CAP_TEXTURE_QUERY_LOD cap.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>
This cap is no longer TGSI-specific, so let's update the name to reflect
reality.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>
These aren't spiecic to TGSI any more, so let's rename them to reflect
reality.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>
These aren't specific to TGSI, so let's rename them to reflect the
reality.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>
Similar to the previous commits, these aren't TGSI specific, so let's
drop TGSI from their name.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>
This cap is no longer specific to TGSI, so let's rename it and update
the documentation to reflect that.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>
This cap no longer has anything to do with TGSI, as the lowering happens
on GLSL IR, and applies just as much to NIR drivers. So let's rename
this cap and update the docs to reflect the current situation.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>
this draw mode in particular requires driver-specific conversions
for queries (e.g., number of vertices), so pass that info through
the only limitation is that it doesn't work for dlists,
but I have yet to see a real use case of a statistics query being used with dlists
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15326>
This is to match the vulkan partial bit so lavapipe can work
with new llvmpipe changes.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>
Currently this just has wait, but in order to get the right answer
for vulkan partial, lavapipe/llvmpipe need to pass a partial flag
through here in the future.
This just changes the API so that's possible.
v2: use an enum (zmike)
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15009>
for drivers where separate cull distance variables are required, this
lets them avoid having to write yet another pass to undo gallium's mangling
of shader info
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14878>
The reference buffer address is used as the indication in h264 DPB
Tier2, when reference buffer was reallocated, h264 DPB would lose
track of that reference picture. Adding a pointer obsolete_buf in
vlVaSurface data structure for tracking this released buffer, also
in h264_picture_desc adding a private field, which contains
past_ref[16] for tracking previously released buffer vs current
buffer for reference frames.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5868
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14646>
Instead of using actual sample count as parameter, we only use a bool
to indicate if the target is multi sample. This is because we don't
know the sample count when glGetInternalformativ() case.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362>
Softpipe was the only driver still using this feature. I had enabled it
in ba22f014f9 ("softpipe: Enable PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS;") for
an instr count win, but it's really not important to that driver and it's
not worth keeping the knob around just for that.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14360>
Add nr_sparse_levels in pipe_resource for updating the
gl_texture_object->NumSparseLevels.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14223>
For ARB_sparse_texture extension to query driver supported page
size for each texture format.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14223>
This allows us to force all shaders to offer shader features only
provided to compatibility shaders.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14040>
This capability is enabled for drivers supporting formatless image
writing in shader.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13409>
Allow drivers to register a callback for when a shader program is linked.
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13674>
this reworks PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER into an
enum as PIPE_CAP_TEXTURE_TRANSFER_MODES, enabling drivers to choose
a (sometimes) faster, compute-based download mechanism based on a new
pipe_screen hook
compute pbo download is implemented using shaders with a prolog to convert
the input format to generic rgb float values, then an epilog to convert
to the output value. the prolog and epilog are determined based on a vec4
of packed ubo data which is dynamically updated based on the API usage
currently, the only known limitations are:
* GL_ARB_texture_cube_map_array is broken somehow (and disabled)
* AMD hardware somehow can't do depth readback?
otherwise it should work for every possible case
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11984>
this can be used to query whether a driver expects a given texture
copy to be faster as a compute shader or using cpu/gfx transfers
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11984>
Driver should enable this cap if it prefers varyings to be aligned
to power of two in a slot, i.e. vec4 in .xyzw, vec3 in .xyz, vec2 in .xy
or .zw
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13151>
turnip was playing fast and loose with the name, using the R8_G8B8 format
name but actually setting the descriptors up to read G8_B8R8 like Vulkan
(sensibly) wants. This caused trouble when trying to make freedreno and
turnip share code. By having both orderings as format names, we can share
the descriptor code and also confuse readers less.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13443>
The only user, turnip, was actually treating it as this layout, matching
vulkan's specification of how the planes map to RGB values. (Y=G means
that Cb=B and Cr=R).
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13443>
This allows specific per-application override.
The existing MESA_EXTENSION_OVERRIDE env variable is kept.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13364>
vulkan requires that vertex attribute access be aligned to the size of
a component for the attribute, but GL has no such requirements
the existing alignment caps are unnecessarily restrictive for applying
this limitation, so this cap now pre-calculates the masks for elements
and vertex buffers in vbuf to enable rewriting misaligned buffers
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13556>
The GLSL spec says it's an error if a shader statically writes to these
2 variables.
Until this commit, Mesa refused to link a shader if it had an unused
function writing to one of these variables while another (used) function
wrote to the other.
This commit adds an option to perform dead function elimination after
the intra-stage linking step but before performing these checks.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12897>
Use createImageFromFds2 together with __DRI_IMAGE_PRIME_LINEAR_BUFFER, so
the driver's resource_from_handle hook will be aware that this specific
image is the linear buffer.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13362>
This adds 60 bytes to both structures. It eliminates "False Sharing"
for atomic operations (see wikipedia).
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11618>
This indicates driver that a given blit is coming from the DRI
frontend.
This information can then be used to pick an appropriate blitting
method.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12763>
Enable the import of memory via opaque fd handles, which
are based upon memory-fds. The extension is necessary for sharing
images and buffers from Vulkan.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Heinrich Fink <hfink@snap.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12345>
Add utility functions to allocate aligned memory backed by
mem_fd objects. Add interface to Gallium for same allocation.
It will be used in later commits for external memory support
in Vulkan/OpenGL.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Heinrich Fink <hfink@snap.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12345>
Turnip will use this for mapping the to the corresponding hardware format,
and we could also extend mesa/st to avoid adding extra samplers for
external image sampling of YV12 on freedreno.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13046>
The main motivation is to improve the score of viewperf13/snx.
This new interface is designed to be optimal for display lists as implemented
by the vbo module. It has much lower CPU overhead in the frontend, threaded
context, and the driver.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13050>
This is basically the same workaround as in 9b577f2a88 (driconf, glsl: Add a
vs_position_always_invariant option) commit but for tesselation evaluation
shaders. Some applications do not mark outputs as precise in tesselation
evaluation shaders which can lead to different results in case some
optimizations were applied.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Fixes: 09705747d7 ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13027>
These two can be vertex formats with vulkan so lavapipe can
expose them.
Fixes
dEQP-VK.pipeline.vertex_input.single_attribute.uvec4.as_a2r10g10b10_uint_pack32_rate_vertex
Fixes: e002f5a086 ("gallium: change pipe_vertex_element::src_format to uint8_t")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12987>
This is required for d3d10+, which has depth_clamp always enabled
regardless of depth_clip (in contrast to OpenGL, where enabling
depth_clamp disables depth_clip). There doesn't seem to be a GL
extension for it, but it will be used for lavapipe to implement
VK_EXT_depth_clip_enable.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12260>
For being used by mesa state tracker.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12336>
GLES2 drivers are allowed to reject some GLSL constructs, like dynamic
loop bounds (which neither i915g nor vc4 can fully support), but gallium
hasn't had any way to trigger a link failure. Add a return msg to the
finalize_nir hook, which is called at the end of GLSL linking, and use
that. This means that some other callers of finalize need to do something
with the msg, and we (for now) just throw it away.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12218>
We would like draw-only display lists to have immutable draw info and
this is the only GL non-draw state in pipe_draw_info (not counting
view_mask).
It also allows removing some code from draw_vbo for tessellation.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12351>
As part of adding support for inline uniforms in Iris, I was going to
add a finalize_nir hook. I went looking to see how other drivers use
the "optimize" parameter, and I discovered that *nobody* uses it at all.
v2: Fix typo in commit message. Noticed by Mike.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12317>
When this flag is set, u_threaded_context will try not to map it directly
for better buffer placement. It's set by drivers when visible VRAM is too
small.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12257>
This will give the driver a chance to set a device name separate from the
driver name, using info probed during screen creation. All drivers
querying driconf in screen creation now have to call parsing on their own,
but other drivers get fallback parsing after screen creation.
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12135>
This just adds the new field. It will be used to lower 64-bit attribs
in drivers (via a helper).
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11370>
This is the distinction between formats:
- PIPE_FORMAT_R64_FLOAT is the legacy "convert-to-float" vertex format.
- PIPE_FORMAT_R64_UINT is the raw double format.
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11370>
This removes the bitfield packing/unpacking.
pipe_format entries are reordered to have vertex formats first because
vertex formats must be <= 255.
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11370>