This just tells the state tracker to turn on the GL_ARB_shader_texture_lod
extension. This simply allows the GLSL compiler to emit TXL and TXD
instructions for both vertex and fragment shaders. We already support
these opcodes in the svga driver. Though, the shadow2DGrad() Piglit
tests are failing.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Improves performance of RoboHornet's 2D Canvas toDataURL benchmark
[http://www.robohornet.org/#e=canvastodataurl] by approximately 5x
on Baytrail on ChromiumOS.
Elapsed time drops by -81.4861% +/- 1.22619% (n=3 s=14.9105, confidence=95%).
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Uses SSE 4.1's MOVNTDQA instruction (streaming load) to read from
uncached memory without polluting the cache.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
This patch make changes to correctly set up the Dispatch GRF Start
Register in case of 'SIMD16 only' FS dispatch.
This fixes an issue of incorrect rendering on dolphin emulator with
GL_SAMPLE_SHADING enabled.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This theoretically works on earlier hardware as well, but the extension
requires at least GL3.0.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
V2: fix interaction with VertexAttribFormat, since that landed after
this was originally written
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
With this patch, the llvmpipe and draw modules will calculate the depth bias
according to floating point depth buffer semantics described in the
arb_depth_buffer_float specification, when the driver has a z buffer bound
with a format type of UTIL_FORMAT_TYPE_FLOAT.
By default, the driver will use the existing UNORM calculation for depth bias.
A new function, draw_set_zs_format, was added to calculate the Minimum
Resolvable Depth value and floating point depth sense for the draw module.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
This brings over the batch-wrap-prevention and aperture space checking
code from the normal brw_draw.c path, so that we don't need to flush the
batch every time.
There's a risk here if the intel_emit_post_sync_nonzero_flush() call isn't
high enough up in the state emit sequences -- before, we implicitly had
one at the batch flush before any state was emitted, so Mesa's workaround
emits didn't really matter. Since the SNB fixes by Ken, I didn't see any
regressions after 3 piglit runs.
Improves cairo-gl performance by 13.7733% +/- 1.74876% (n=30/32)
Improves minecraft apitrace performance by 1.03183% +/- 0.482297% (n=90).
Reduces low-resolution GLB 2.7 performance by 1.17553% +/- 0.432263% (n=88)
Reduces Lightsmark performance by 3.70246% +/- 0.322432% (n=126)
No statistically significant performance difference on unigine tropics
(n=10)
No statistically significant performance difference on openarena (n=755)
The two apps that are hurt happen to include stalls on busy buffer
objects, so I think this is an effect of missing out on an opportune
flush.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
update_array() and update_array_format() are changed to update the new
attrib and binding states, and the client arrays become derived state.
Reviewed-by: Eric Anholt <eric@anholt.net>
...and rename it to _mesa_bind_buffer_gen().
This is so the function can be called from _mesa_BindVertexBuffer().
This patch also adds a caller parameter so we can report the right
entry point in error messages.
Based on a patch by Eric Anholt.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This will become derived state as part of the ARB_vertex_attrib_binding
support.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Split out the code for updating the array format into a new function
called update_array_format(). This function will be called by both
update_array() and the new glVertexAttrib*Format() entry points in
ARB_vertex_attrib_binding.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This will be used by the ARB_vertex_attrib_binding implementation.
This reverts commit db38e9a0e1.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Currently, has_surface_tile_offset is equivalent to gen == 4 && !is_g4x.
We already use it for related checks in brw_wm_surface_state.c, so it
makes sense to use it here too. It's simpler and more future-proof.
Broadwell also lacks surface tile offsets. With this patch, I won't
need to update any generation checking; I can simply not set the flag.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Hardware docs say we can only use SIMD8 dispatch in this condition.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
We weren't adding the soa offsets when constructing the indices
for the gather functions. That meant that we were always returning
the data in the first element.
(Copied straight from the same fix for temps.)
While here fix up a couple of broken comments in the fetch functions,
plus don't name a straight float type float4 which is just confusing.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
Otherwise OutputSurface interop has funny results sometimes.
This fixes interop with the mpv media player.
v2 (chk): add proper locking
Signed-off-by: Christian König <christian.koenig@amd.com>
V2: Add comment explaining what emit_alpha_test() is for;
fix spurious temp and bogus whitespace.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
The same setup is required here as when the user-provided shader
explicitly uses KIL or discard.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>