Compute shaders require reconfiguring the L3 for shared local memory
support. We have to be able to write the L3 registers to do that.
This effectively turns off compute shaders prior to Kernel 4.2.
(Previously, the extension enable was in an API_OPENGL_CORE conditional.
However, that isn't necessary - core Mesa extension handling already
restricts it properly. I've moved it out in this patch.)
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
This fixes some piglit subtests for ARB_program_interface_query.
V3: remove some of the unnecessary parentheses
V2: fix alignment
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
There is a function dedicated to demoting unused varyings lets
trust it to do its job.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
After lowering the matching flag is_unmatched_generic_inout is lost so
we need to move this validation before lowering.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
An SSO program can have multiple stages and we only want to add the externally
facing varyings. The current code was adding both the packed inputs and outputs
for the first and last stage of each program.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Conditions modified allow skl+ to use blitter:
- for all tiling formats
- to write data to YF/YS tiled surfaces
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
This allows the fallback paths to handle it correctly.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Overlapping blits are anyway undefined in OpenGL. So no need
of overlap check here.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Fast copy blit is currently enabled for use only with Yf/Ys tiling.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
It's very rare that a GL app calls glVertex3dv(), but one in particular
calls it lot, always with Z = 0. Check for that condition and convert
the call into glVertex2f. This reduces VBO memory used and reduces
the number of times we have to switch between float[2] and float[3]
vertex formats in the svga driver. This results in a small but
measurable performance improvement.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
We only want to set the SVGA_NEW_STIPPLE dirty flag when the polygon
stipple state changes. Before, we only set the flag when we were
enabling stipple, but not disabling.
We don't really have to add SVGA_NEW_STIPPLE to the dirty FS state
set since it's a subset of SVGA_NEW_RAST, but let's be explicit.
This doesn't fix any known bugs.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
and svga_set_sampler_views(). If there's no change, return early
and don't set a SVGA_NEW_x dirty state flag.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
gcc 4.9.3 shows the following error:
brw_vue_map.c:260:20: warning: array subscript is above array bounds
[-Warray-bounds]
return brw_names[slot - VARYING_SLOT_MAX];
This is because BRW_VARYING_SLOT_COUNT is a valid value for the enum
type. Adding an assert will generate no additional code but will teach
the compiler to not complain.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
It allows to call nouveau_vp3_bsp_next multiple times
between one begin/end.
It is required to support st/va.
https://bugs.freedesktop.org/show_bug.cgi?id=89969
Signed-off-by: Julien Isorce <j.isorce@samsung.com>
[imirkin: create strparm_bsp function, simplified w0 calculation]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
The counter was not set but used by the nouveau driver.
It is required otherwise visual output is garbage.
Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
This fixes the same tests that commit 8cf2e892f was attempting to fix:
ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeOffset
ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeSize
as confirmed by Samuel.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
This reverts commit 8cf2e892fc. It's
entirely bogus to attempt to store anything about the binding in the
buffer object itself, which might be bound any number of times.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Instead, keep track of GL_DEBUG_OUTPUT and (un)install the pipe_debug_callback
accordingly. Hardware drivers can still use the absence of the callback to
skip more expensive operations in the normal case, and users can no longer be
surprised by the need to set the debug flag at context creation time.
v2:
- re-add the proper initialization of debug contexts (Ilia Mirkin)
- silence a potential warning (Ilia Mirkin)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Experimentally, 4M causes corruption and slowness, try to ramp it up
with size instead.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
H264 doesn't have a bitplane bo. We just need a device reference, so use
the one from the client.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Commit 5bb5eeea fixes a bug indicating that the surfaces should have the
API buffer size. Hovewer it picked the wrong value.
This patch adds a new variable, which takes into account
glBindBufferRange() values. This patch fixes the following CTS
regressions:
ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeOffset
ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeSize
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
It's the same behavior that we use for later LLVM.
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reported by Tom^ on IRC. The original intent was to mark the pointer
constant as well as the data being pointed to, so move the *.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Note these are a bit uglier, due to avoidance of GNU C extensions. But
drivers which do not need to be built with compilers that don't support
the extension can wrap these macros with their own.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Found during NIR_TEST_CLONE=1 piglit run. We were using block->index
but forgetting to require it. Causing things to not work with a cloned
shader which didn't preserve block_index.
Signed-off-by: Rob Clark <robclark@freedesktop.org>