The QBO workaround compute grid launch emits the render condition atom
when dirty, so install the render condition in the context only after
launching the compute grid. This avoids a redundant SET_PREDICATION.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
There is a firmware regression that causes failures. Work around it by
using the compute shader for query_buffer_objects to summarize the query
results.
v2: rename to PREDICATION_OP_BOOL64 (consistent with sid.h)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The predication bits are "visible or no overflow" and "not visible or
overflow", so we need to invert the check relative to the GL and Gallium
interface semantics.
Also, predication by the other streamout-related queries is not allowed.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The issue here is that the immediate is treated as a 64-bit value,
and fetching it does not work reliably with swizzles that are different
from xy and zw.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This is mostly mechanical search-and-replace, plus touching up the
macros in u_dump_defines.c manually a bit.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
v2: rename cap to PIPE_CAP_QUERY_SO_OVERFLOW and be a bit more explicit
in the documentation
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Fixes following build issues:
In file included from vendor/intel/external/android_ia/mesa/src/mesa/drivers/dri/common/dri_util.c:45:
vendor/intel/external/android_ia/mesa/src/util/xmlpool.h:103:10: fatal error: 'xmlpool/options.h' file not found
...
In file included from vendor/intel/external/android_ia/mesa/src/mesa/drivers/dri/i965/intel_screen.c:44:
vendor/intel/external/android_ia/mesa/src/util/xmlpool.h:103:10: fatal error: 'xmlpool/options.h' file not found
Fixes: 601093f9 (xmlconfig: move into src/util)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
this change reverts commit 4f695731, we want to be able to build
with -DDEBUG and gen_decoder on Android.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Makes it possible to build Mesa on Android with -DDEBUG with
the next patch that reverts 4f695731.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
LLVM complained about passing an i32 to a float clamp.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: 0f9e32519b "ac/nir: clamp shadow texture comparison value on VI"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Since make_surface() can fail, if the format isn't support by hw or
simlar error, we need to check the result before dereferencing it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reported by valgrind at:
glsl_to_tgsi_visitor::visit(ir_expression*) (st_glsl_to_tgsi.cpp:1560)
When compiling the Deus Ex shaders.
Fixes: 28a5e7104 ("st/glsl_to_tgsi: handle precise modifier")
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This looks like it's supported since llvm 3.9 at least,
so switch over radeonsi and radv to using it, -pro also
uses this. We can now drop creating lds for these operations
as the ds_swizzle operation doesn't actually write to lds at all.
Acked-by: Marek Olšák <marek.olsak@amd.com>
(stable requested due to fixing radv CIK conformance tests)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
This makes it match radeonsi. The LLVM backend itself will emit the
correct instruction, but LLVM might do incorrect optimizations since it
thinks the output is undefined when the input is 0, even though it's not
supposed to be. We really need a new intrinsic, or for the backend to
become smarter and recognize this pattern.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <basni@google.com>
The optimizations are only valid for 32-bit integers. They were
mistakenly firing for 64-bit integers as well.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
As time goes on, extension advertising is going to get more complex.
Today, we either implement an extension or we don't. However, in the
future, whether or not we advertise an extension will depend on kernel
or hardware features. This commit introduces a python codegen framework
that generates the anv_EnumerateFooExtensionProperties functions as well
as a pair of anv_foo_extension_supported functions for querying for the
support of a given extension string. Each extension has an "enable"
predicate that is any valid C expression. For device extensions, the
physical device is available as "device" so the expression could be
something such as "device->has_kernel_feature". For instance
extensions, the only option is VK_USE_PLATFORM defines.
This mechanism also means that we have a single one-line-per-entry table
for all extension declarations instead of the two tables we had in
anv_device.c and the one we had in anv_entrypoints_gen.py. The Python
code is smart and uses the XML to determine whether an extension is an
instance extension or device extension.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This will allow us to keep everything in one place when it comes to
declaring what extensions are supported.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
It fixes misused x and y variables on the calculation of the memory copy regions.
Cc: Giovanni Campagna <gcampagna@src.gnome.org>
Fixes: 8430af5ebe "Add support for swrast to the DRM EGL platform"
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
[Eric: use gbm_bo_get_bpp() instead of local function, split clamp patch]
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
It fixes misused x and y variables on the calculation of the memory copy regions.
Cc: Giovanni Campagna <gcampagna@src.gnome.org>
Fixes: 8430af5ebe "Add support for swrast to the DRM EGL platform"
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
[Eric: use gbm_bo_get_bpp() instead of local function, split clamp patch]
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
GLES/gl.h has historically provided some typedefs that are not
used in the API itself. Restore these typedefs that were lost to
avoid breaking applications.
These seem to be the only typedefs removed in the update.
Fixes: 7fd0817 "Update Khronos-supplied headers"
[Eric: added a big warning to revert this patch when pulling the updated header]
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Turn comments into actual code, that the compiler can check for us :)
(Speaking of, one of the comments had a typo. Challenge: find it)
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
As of last commit, no invalid swap interval can be stored, so there's
no need to sanitize the values when reading them anymore.
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
When we have an interface block like:
layout (xfb_buffer = 0, xfb_offset = 0) out Block {
vec4 var1;
layout (xfb_stride = 48) vec4 var2;
vec4 var3;
};
According to ARB_enhanced_layouts spec:
"The *xfb_stride* qualifier specifies how many bytes are consumed by
each captured vertex. It applies to the transform feedback buffer
for that declaration, whether it is inherited or explicitly
declared. It can be applied to variables, blocks, block members, or
just the qualifier out. [ ...] While *xfb_stride* can be declared
multiple times for the same buffer, it is a compile-time or
link-time error to have different values specified for the stride
for the same buffer."
This means xfb_stride actually applies to the buffer, and not to the
individual components.
In the above example, it means that var2 consumes 16 bytes, and var3 is
at offset 32.
This has been confirmed also by John Kessenich, the main contact for the
ARB_enhanced_layouts specs, and also because this commit fixes:
GL45.enhanced_layouts.xfb_block_member_stride
This commit is in practice a revert of 598790e856 (glsl: apply
xfb_stride to implicit offsets for ifc block members).
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
I don't know the condition for the flush, but we better turn this off.
The sL1 flush is used when CE dumps stuff into a ring buffer and the ring
buffer wraps.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Fixes: 064550238e ("radeonsi: use CLEAR_STATE to initialize some
registers")
Bugzilla: https://bugs.freedesktop.org/101969
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Python is the scripting language we've been using for scripts that need
to run across all supported platforms.
Shell is *not* a portable language for scripts.
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Fixes: 601093f95d ("xmlconfig: move into src/util")
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Roland Scheidegger <sroland@vmware.com>
It's a single atomic add, so it makes sense to inline it.
Improves performance in Piglit's drawoverhead microbenchmark's
"DrawArrays ( 1 VBO, 0 UBO, 0 ) w/ no state change" subtest by
0.400922% +/- 0.310389% (n=350) on my i7-7700HQ.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>