Remove the logic that we don't need. In a future commit, it will be
extended to optimize aspects of buffer clears and copies that need to be
optimized.
Changes:
- remove the logic that generated multiple loads/stores per thread,
only 1 load and store can occur in the shader now, allowing clearing/
copying max 4 dwords per thread
- put the src buffer in SSBO slot 0, and the dst buffer in SSBO slot 1
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29053>
- always use L2_LRU (never use ACCESS_NON_TEMPORAL) - for better perf
- never use ACCESS_COHERENT because the address might not be aligned to
a cache line
- assume the wave size is always 64
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29053>
Shaders loaded from the shader cache should be printable. Before this,
sometimes only "(null)" was printed.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29053>
Effects:
- multi-component subdword handling removed because it's lowered
- 3-dword loads selected correctly instead of 4-dword loads
- the failure of dEQP-GLES3.functional.buffer.copy.subrange.large_to_small
due to LLVM exposed by a future commit is mysteriously fixed by this
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29053>
Toggle on dithering if it has been enabled on device and is set on the
rendering flags used.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19414>
After dropping obsolete skips, the piglit jobs take too long to run, so
switch to 1-in-3 fractional runs.
TODO we need nightly full runs
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28323>
Don't bother even trying to test legacy gl features that the hw does not
support and cannot be emulated (or at least not reasonably).
Some of this could in theory be emulated with GS shader on a6xx+, at the
cost of extra draw time overhead. But unless someone shows up with a
real world need for these features (ie. something other than piglit/cts)
I don't think we should penalize every other gl app with the extra
overhead. If you want a slow-but-implements-gl-rusty-sharp-edges driver,
zink already exists to fill that niche. (And if someone does find some
app that needs these features, the right answer is probably driconf to
trigger a transparent fallback to zink.)
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8850
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28323>
These should have been in xfails, so we'd have noticed to remove them
when corresponding fixes and piglit fixes (uprevs) landed.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28323>
We originally skipped these before they were supported, but forgot to
remove them from the skips list once gl46 and related extensions were
added.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28323>
These opcodes were removed.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29162>
The latency is from LLVM's SISchedule.td
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29162>
Similar to the prior fix for msm. On the dmabuf import path, tu_bo_init
can be outside of the vma lock, but left inside for code simplicity.
Fixes: f17c5297d7 ("tu: Add virtgpu support")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29093>
There's a success path on found dmabuf while the iova won't be cleaned
up. This change defers iova alloc till lookup miss and also to prepare
for later racy dmabuf re-import fix.
Also documented a potential leak on error path due to unable to tell
whether a gem handle should be closed or not without refcounting.
Fixes: f17c5297d7 ("tu: Add virtgpu support")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29093>