In certain cases variant->opaque could be set to true, which
reset command list for tiles fully covered by a triangle
with this shader. This is obviously wrong in presence of
framebuffer fetch.
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13252>
This is overkill, but hey 4-bits per channel is hardly something to
care about.
(Suggestions welcome for a better version).
Fixes:
dEQP-GLES2.functional.fbo.render.*rgba4*
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12001>
These currently always had one sampler state embedded, but got messy
when images was 1 and samplers was 0.
This should fix some undefined reads seen
Fixes: e639e311a1 ("llvmpipe/cs: overhaul cs variant key state.")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13012>
This is required for d3d10+, which has depth_clamp always enabled
regardless of depth_clip (in contrast to OpenGL, where enabling
depth_clamp disables depth_clip). There doesn't seem to be a GL
extension for it, but it will be used for lavapipe to implement
VK_EXT_depth_clip_enable.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12260>
The data in the user buffer is only valid for a short period of time, and
we could use-after-free it if rendering hadn't been flushed by shader
deletion time.
Fixes: #5254
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12724>
The state wasn't storing the shader depth/stencil outputs
per-sample, so only the last sample emitted was being used
for the late depth test and stencil ref.
Noticed while trying to fix some vulkan depth stencil resolve
issues
Fixes: a0195240c4 ("llvmpipe: handle multisample early depth test/late depth write")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12504>
This change adds:
- an alternative rasterizer, which rasterizes bins in a left->right &
top->bottom linear fashion;
- triangle -> rectangle detection;
- 1:1 blit detection;
- a special TGSI -> LLVM IR code generation that uses 8-bit SSE integers
in AoS fashion (as opposed to 32bits floats.)
Altogether these changes yield a 2x to 3x performance improvement for 2D
workloads. It was designed to render Windows 7 Aero and other Windows
built-in 3D applications (like Windows Media Player, Internet Explorer
11, UWP applications) with minimum CPU utilization, but it should be
generally applicable to other 2D-on-3D applications, like desktop
compositors, HTML browsers, 3D based UI toolkits, etc.
This was mostly the brainchild of Keith Whitwell back in 2010. I wrote
TGSI -> AoS translation. And many others added bug-fixes and
enhancements over the years: Roland Scheidegger, Brian Paul, and James
Benton.
Known issues:
- piglit spec@!opengl 1.1@quad-invariance will warn that "left and right
half should match" due to rounding error difference
- These optimized paths to kick in is that depth-buffer must not be
used, so some applications which want to benefit from these improvements
might need to be modified to ensure they use painter's algorithm instead
of depth-buffers.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Keith Whitwell <keithw@vmware.com>
v2: Incorporate Dave Airlie feedback: cleanup LP_DEBUG_xx; shrink 3+
empty lines.
v3: silence unused var warning, adapt to new upstream code (point setup)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11969>
Don't depend moving between samples on key->multisample
Big CI wins
Reported-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Fixes: 210d714f46 ("llvmpipe: handle multisample color stores.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10780>
Don't be smarter than state tracker here, of d3d10 wants to do
something the state tracker should hard code that. Since lavapipe
wants to use clip_halfz and depth clipping independently.
This fixes some issues blitting Z that zink was seeing
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10068>
When depth clamp is disabled the viewport values aren't meaningful,
however the value is about to be converted to a unorm so needs
to still be clamped to 0/1.
This might not be the best place for this, maybe it should be in
the write swizzled code.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10068>
item->base will be freed for the NULL reference write
so just use a temporary to avoid it.
This was found with asan and lavapipe:
dEQP-VK.api.copy_and_blit.core.blit_image*
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8912>
Instead of calling this function again to unbind trailing slots,
extend it to do it when images are being set. This reduces CPU overhead.
Only st/mesa benefits.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>
We often do this:
pipe->set_constant_buffer(pipe, shader, slot, &cb);
pipe_resource_reference(&cb->buffer, NULL);
That results in atomic increment in set_constant_buffer followed by
atomic decrement after set_constant_buffer. This new interface
eliminates those atomics.
For the case above, this should be used instead:
pipe->set_constant_buffer(pipe, shader, slot, true, &cb);
cb->buffer = NULL; // if cb is not a local variable, else do nothing
AMD Zen benefits from this. The perf improvement is ~3% for Viewperf13/Catia.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>
pipe_alpha_state and pipe_depth_state will be packed together
because they have only a few bitfields each. This will eventually
remove 4 bytes of padding in pipe_depth_stencil_alpha_state.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7940>
Need to update the z value after updating the pos at pixel
center, and later reupdate it again, so we can avoid some
LLVM IR values not being dominant issues.
Fixes:
dEQP-VK.renderpass.suballocation.multisample.s8_uint.samples_4
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6381>
ARB_framebuffer_no_attachments + multisampling means blend
can have an effect even outside of colorbufs
Fixes:
dEQP-VK.pipeline.multisample.alpha_to_coverage_no_color_attachment.samples_4.alpha_opaque
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6381>
Currently llvmpipe calls finish on the context when a shader
variant has to be destroyed just in case the variant is currently
in use by the setup engine.
Fix this by reference counting the shaders, and reference counting
the shader variants.
Whenever a shader is used in the rasteriser backend, it is added
to a reference list and removed when the rasterizer is finished with it.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6341>
Currently the test crashes with LLVM errors
Stored value type does not match pointer operand type!
store <8 x i32> %s_dst, <8 x i8>* %261
Change the stored type for 8-bit stencil formats.
Fixes:
GTF-GL45.gtf44.GL31Tests.texture_stencil8.texture_stencil8_gl44
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5926>
Serialize and check if the object is in the cache, it there is
a cached object skip compilation code once we've constructed
the function interface.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
When we use an object cache for the MCJIT we can have identical
cache entries from the same shader variant in different shaders,
but the JIT objcache uses the function name to relink things,
so it has to be consistent. Just drop the variants from the
function names.
Note the modules still have the variant info.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
Move the color storage before the late Z test as for sample
shading it needs to be inside a loop with the fragment shader.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4122>