If dual object compile fails (as seems to happen with virgl a
fair bit, and does piglit even have any tests for it?), we end up
not restarting the pull params, so we call
vec4_visitor::move_uniform_array_access_to_pull_constant
a second time and it runs over the ends of the alloc.
Fixes: tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test
running inside virgl on ivybridge.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Implement the state tracker manager drawable interface flush_swapbuffer
method by plumbing it through to dri3 if available.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Add a state tracker interface method to flush outstanding swapbuffers, and
add a call to it from the mesa state tracker during glFinish().
This doesn't strictly mean the outstanding swapbuffers have actually finished
executing but is sufficient for glFinish()
to be able to be used as a replacement for glXWaitGL().
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Provide a dri3 implementation for the image loader extension method.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
This method may be used by dri drivers to make sure all outstanding
buffer swaps have been flushed to hardware.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
This can be used to guard support for EXT_memory_object and related
extensions.
v2: update gallium docs
v3 (Timothy Arceri):
- add cap to nv50
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Remember to add the offset to the start of the buffer in the relocation
or else we write 0xff into random bytes elsewhere.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
This reduces the number of BOs that we need for the BO lists during
a submission.
Currently uses a fairly simple linear search for finding free space,
that could eventually be improved to a binary tree, which with some
per-node info could make a check for space O(1) and finding it O(log n),
in the number of buffers in that slab.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
As of 4.11, the kernel isn't bothering to set the subslice hashing mode
on Apollolake, leaving it at the default of 8x8. (It initializes it to
16x4 on most platforms.)
Performance data for GPUTest Triangle on Apollolake at 1024x640:
X-tiled RT:
-----------
8x8 -> 16x4: 2.4325% +/- 0.383683% (n=107)
8x8 -> 8x4: -3.75105% +/- 0.592491% (n=40)
8x8 -> 16x16: 6.17238% +/- 0.67157% (n=30)
Y-tiled RT:
-----------
8x8 -> 16x4: 1.30307% +/- 0.297292% (n=205)
8x8 -> 8x4: -0.769282% +/- 0.729557% (n=35)
8x8 -> 16x16: 3.00254% +/- 0.715503% (n=40)
8x MSAA RT (INTEL_FORCE_MSAA=8):
--------------------------------
8x8 -> 16x4: 1.38889% +/- 0.93729% (n=7)
8x8 -> 8x4: -2.10643% +/- 1.15153% (n=3)
8x8 -> 16x16: 3.87183% +/- 1.08851% (n=5)
Based on this, we choose 16x16 for Apollolake.
Skylake GT2 with X-tiled buffers appears to be a toss-up between 16x4
and 16x16, and with Y-tiled buffers it doesn't seem to really matter.
So we'll leave Skylake alone for now.
The hashing mode doesn't seem to make a measurable impact on more
complex benchmarks.
Acked-by: Matt Turner <mattst88@gmail.com>
This isn't used in any of these drivers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
One could have vX+1 which introduces another entrypoint without
implementing older ones.
v2: Rebase, while keeping loaderPrivate
Fixes: 1bf703e4ea ("dri_interface,egl,gallium: only expose RGBA visuals
on Android")
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The cacheline alignment restriction is on the base address; the pitch
can be anything.
Fixes assertion failures when using primus (say, on glxgears, which
creates a 300x300 linear BGRX surface with a pitch of 1200):
intel_blit.c:190: get_blit_intratile_offset_el: Assertion `mt->surf.row_pitch % 64 == 0' failed.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Move AVX512BW specific intrinics to be Core-only.
Move some AVX512F intrinsics back to common implementation file.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Switch to a 1:1 mapping template:generated for future maintenance.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Copy/paste error was duplicating a gen_knobs.cpp rule.
Fixes: 5079c277b5 ("swr: [scons] Fix windows build")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Disable an optimization which implemented sse/avx operations on avx512
using avx512 intrinsics (to avoid switching between lane widths).
Compile with SIMD_OPT_128_AVX512 / SIMD_OPT_256_AVX512 defined to enable
these optimizations.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Fix problems found when enabling USE_SIMD16_FRONTEND, mostly related to
vMask / movemask_ps(pd).
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
This is more lines of code but the python is far easier to read than the
sed expressions we were using before. Also, this allows us to pull the
API version from anv_entrypoints.py so it never gets out-of-sync.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
The VkVersion class is probably overkill but it makes it really easy to
compare versions in a way that's safe without the caller having to think
about patch vs. no patch.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This way we can use "from anv_extensions import *" in the entrypoint
generator without worrying too much about pollution
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
When building sandboxed, we may encounter additional errors. Ignore the errors,
as we are in a constrained environment.
This can be observed when building latest git with OBS.
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
I believe this should be enough for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
This implements a wait for glXWaitGL, glXCopySubBuffer, dri flush_front and
creation of fake front until all pending SwapBuffers have been committed to
hardware. Among other things this fixes piglit glx-copy-sub-buffers on dri3.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: <mesa-stable@lists.freedesktop.org>