Commit Graph

87122 Commits

Author SHA1 Message Date
Tobias Droste 44a672ef0e configure.ac: Use short names for r600 und r300
There are no non gallium r300 and r600 drivers anymore.
No need to explicilty mention gallium here.
Just cosmetics, no functional change.

Signed-off-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-05 14:43:46 +00:00
Tobias Droste 1ca0486147 configure.ac: Remove useless oCL LLVM check
This is handled by "llvm_check_version_for" for openCL.

Signed-off-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-05 14:43:46 +00:00
Tobias Droste 8c98e27074 configure.ac: Move llvm-config searching outside the function
There's no harm in always searching llvm-config.
This way it's available as soon as possible for all functions.

Signed-off-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-05 14:43:46 +00:00
Tobias Droste 0cc4ffd67b configure.ac: Move LLVM functions to the top
This just moves code around so that all LLVM related stuff is at the
top of the file in the correct order.
No functional change.

Signed-off-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-05 14:43:46 +00:00
Tobias Droste bf4b0fc33b configure.ac: Move LLVM version check to the top
A function with the LLVM version checked is moved to the top.
The function is called where the old code was.
No functional change.

Signed-off-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
[Emil Velikov: s/ipos/ipo/, drop "yes" argument from llvm_add_component]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-05 14:43:46 +00:00
Tobias Droste 9a3bccc75e configure.ac: Use new helper function for LLVM
Use the new helper function to add LLVM targets and components.
The components are added one by one to later find out which component
is missing in case there is one.

Signed-off-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
[Emil Velikov: s/ipos/ipo/, drop "yes" argument from llvm_add_component]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-05 14:43:46 +00:00
Tobias Droste d434633b76 configure.ac: Use new llvm_add_default_components
Signed-off-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-05 14:43:46 +00:00
Tobias Droste 352831c5d9 configure.ac: Add helper function for targets/components
Add functions to add and check targets/components.
Not used in this patch.

Signed-off-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-05 14:43:46 +00:00
Tobias Droste 2350387d24 configure.ac: Don't search llvm-config if it's known
This way LLVM_CONFIG can bet set from an env variable if it's outside
the $llvm_prefix.

This is not a must, but it helps testing.

Signed-off-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-05 14:43:46 +00:00
Boyuan Zhang 3949d7c6ea st/va: fix gop size for rate control
The gop_size in rate control is the budget window for internal rate
control calculation, and shouldn't always equal to idr period. Define
a coefficient to let budget window contains a number of idr period for
proper rate control calculation. Adjust the number of i/p frame remaining
accordingly.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98005

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2016-12-05 09:23:38 -05:00
Boyuan Zhang 8206882392 st/va: force to submit two consecutive single jobs
The gop_size in rate control is the budget window for internal rate
control calculation, and shouldn't always equal to idr period. Define
a coefficient to let budget window contains a number of idr period for
proper rate control calculation. Adjust the number of i/p frame remaining
accordingly.

v2: fixed regression issues introduced by previous version

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98005

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2016-12-05 09:23:38 -05:00
Nayan Deshmukh 7b811c362a st/vdpau: fix compiler warning in vlVdpVideoMixerRender
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-12-05 11:20:55 +01:00
Topi Pohjolainen 5b27405eff i965: Release aux buffer when disabling ccs
Otherwise subsequent render cycles keep on using compression
and/or fast clear.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-12-05 09:20:05 +02:00
Bas Nieuwenhuizen 92d7563fba ac/nir: Only use the first component for SSBO atomics.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-12-05 01:40:54 +01:00
Dave Airlie 8033f78f94 radv: fix another regression since shadow fixes.
This fixes:
dEQP-VK.glsl.texture_gather.basic.2d.depth32f.*

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-12-05 10:14:37 +10:00
Iago Toral Quiroga 66e7effc85 spirv: Builtin Layer is an input for fragment shaders
This change makes it so we emit a load_input intrinsic when Layer
is read in a fragment shader.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-12-03 20:50:57 +01:00
Bruce Cherniak a7b510f656 swr: Fix active_queries count
The active_query count was incorrect for query types that don't require
a begin_query.  Removed the unnecessary assert.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-12-02 14:36:28 -06:00
George Kyriazis 2085088033 swr: Fix type to match parameters of std::max()
Include propagation of comparisons further down.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-12-02 14:36:28 -06:00
Tim Rowley f1ca377ab1 swr: [rasterizer jitter] include cstdarg in builder_misc.cpp
Fixes build problem with llvm-svn.

v2: use cstdarg instead of stdarg.h

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-12-02 14:36:28 -06:00
Jason Ekstrand 19a541f496 nir: Get rid of nir_constant_data
This has bothered me for about as long as NIR has been around.  Why do we
have two different unions for constants?  No good reason other than one of
them is a direct port from GLSL IR.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-12-02 10:53:32 -08:00
Timothy Arceri c45d84ad83 Revert "st/mesa: get Version from gl_program rather than gl_shader_program"
This reverts commit 6bf63b0119.

A patch that adds a reference to gl_shader_program_data to gl_program
needs to land befor this one.
2016-12-02 16:44:44 +11:00
Timothy Arceri 6bf63b0119 st/mesa: get Version from gl_program rather than gl_shader_program
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-12-02 13:54:54 +11:00
Timothy Arceri ab8c01386a st/mesa/glsl: move Version to gl_shader_program_data
This is mostly just used during linking however the st uses it
when updating textures.

In order to store gl_program in the CurrentProgram array
rather than gl_shader_program we need to move this field to
the shared gl_shader_program_data struct.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-12-02 13:54:47 +11:00
Rob Clark 534917495d freedreno: no-op render when we need a fence
If app tries to create a fence but there is no rendering to submit, we
need a dummy/no-op submit.  Use a string-marker for the purpose.. mostly
since it avoids needing to realize that the packet format changes in
later gen's (so one less place to fixup for a5xx).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-12-01 20:24:59 -05:00
Rob Clark 0b98e84e9b freedreno: native fence fd support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-12-01 20:24:46 -05:00
Rob Clark 16f6ceaca9 freedreno: some fence cleanup
Prep-work for next patch, mostly move to tracking last_fence as a
pipe_fence_handle (created now only in fd_gmem_render_tiles()), and a
bit of superficial renaming.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-12-01 20:16:31 -05:00
Rob Clark 026a7223a6 gallium: support for native fence fd's
This enables gallium support for EGL_ANDROID_native_fence_sync, for
drivers which support PIPE_CAP_NATIVE_FENCE_FD.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-12-01 20:16:31 -05:00
Rob Clark 72cc1ca58d gallium: wire up server_wait_sync
This will be needed for explicit synchronization with devices outside
the gpu, ie. EGL_ANDROID_native_fence_sync.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-12-01 20:16:31 -05:00
Rob Clark 0201f01dc4 egl: add EGL_ANDROID_native_fence_sync
With fixes from Chad squashed in, plus fixes for issues that Rafael
found while writing piglit tests.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
2016-12-01 10:57:35 -08:00
Rob Clark 21b1acfcfe dri: extend fence extension to support native fd fences
Required to implement EGL_ANDROID_native_fence_sync.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
2016-12-01 10:57:35 -08:00
Rob Clark 2ba4c7e154 egl: un-fallthrough sync attr parsing
Doesn't work so well when you start having more than one possible
attrib.  Prep-work for next patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
2016-12-01 10:57:24 -08:00
Rob Clark cce04a4630 egl: initialize SyncCondition after attr parsing
Reduce the noise in the next patch.  For EGL_SYNC_NATIVE_FENCE_ANDROID
the sync condition is conditional on EGL_SYNC_NATIVE_FENCE_FD_ANDROID
attribute.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
2016-12-01 10:52:55 -08:00
Tim Rowley 05f35a868c tgsi: store writes_primid when scanning tgsi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-12-01 11:33:01 -06:00
Ilia Mirkin 7c16552f8d mesa: only verify that enabled arrays have backing buffers
We were previously also verifying that no backing buffers were available
when an array wasn't enabled. This is has no basis in the spec, and it
causes GLupeN64 to fail as a result.

Fixes: c2e146f487 ("mesa: error out in indirect draw when vertex bindings mismatch")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2016-12-01 06:35:13 -05:00
Eric Anholt 51244859e3 vc4: Avoid false scheduling dependencies for LOAD_IMMs.
Noticed in shaders with branching, where we ended up scheduling delay
slots near the start of a block for the uniforms reset setup.

total instructions in shared programs: 93970 -> 93951 (-0.02%)
instructions in affected programs:     3117 -> 3098 (-0.61%)

3DMMES performance +0.423087% +/- 0.133521% (n=9,10)
2016-11-30 19:58:09 -08:00
Eric Anholt 6c34084d8e vc4: Try to schedule QIR instructions between writing to and reading math.
This helps us get the delay slots between SFU writes and reads filled.

total instructions in shared programs: 94494 -> 93970 (-0.55%)
instructions in affected programs:     59206 -> 58682 (-0.89%)

3DMMES performance +1.89967% +/- 0.157611% (n=10,9)
2016-11-30 19:58:09 -08:00
Eric Anholt d182740ac8 vc4: Improve interleaving of texture coordinates vs results.
The latency_between was trying to handle the delay between the coordinate
write ("before") and the corresponding sample read ("after"), but we were
handing in the two instructions swapped.

This meant that we tried to fit things between a tex_s and its *preceding*
tex_result.  This made us only interleave normal texture coordinates by
accident, and pessimized UBO reads by pushing the tex_result collection
earlier until there was nothing but it (and then its preceding coordinate
setup) left.

In addition to latency reduction, things end up packing better (probably
due to reduced live ranges of the texture results):

total instructions in shared programs: 98121 -> 94775 (-3.41%)
instructions in affected programs:     91196 -> 87850 (-3.67%)

3DMMES performance +1.15569% +/- 0.124714% (n=8,10)
2016-11-30 19:58:09 -08:00
Eric Anholt 1f9daf7cd1 vc4: Fix stray "." on no-op MUL packs.
This happened when the PM bit was set for R4 unpacks, where the MUL pack
was NOP.
2016-11-30 19:58:09 -08:00
Eric Anholt 98d7e87488 vc4: Allow merging instructions with SF set where the other writes NOP.
I'm not sure how I managed to write the SF merge code
(7d8b79f398) without allowing merges with
NOPs.  *Everything* we try to merge with will have a NOP on one or the
other side of the instruction, and that's why that commit showed no
benefit.

total instructions in shared programs: 99347 -> 95128 (-4.25%)
instructions in affected programs:     91906 -> 87687 (-4.59%)

3DMMES performance +2.57105% +/- 0.135276% (n=6,8)
2016-11-30 19:58:09 -08:00
Eric Anholt 8e5ec33f11 vc4: In a loop break/continue, jump if everyone has taken the path.
This should be a win for most loops, which tend to have uniform control
flow.

More importantly, it exposes important information to live variables: that
the break/continue here means that our jump target may have access to
values that were live on our input.  Previously, we were just setting the
exec mask and letting control flow fall through, so an intervening def
between the break and the end of the loop would appear to live variables
as if it screened off the variable, when it didn't actually.

Fixes a regression in glsl-vs-loop-redundant-condition.shader_test when a
perturbing of register allocation caused a live variable to get stomped.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
2016-11-30 19:58:09 -08:00
Ilia Mirkin fda1d0187d anv: expose support for VK_KHR_sampler_mirror_clamp_to_edge
This is already supported in genX_state.c, expose the extension string.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-11-30 20:49:04 -05:00
Jason Ekstrand 27433b26b1 anv/cmd_buffer: Actually use the stencil dimension
In an attempt to fix 3DSTATE_DEPTH_BUFFER for stencil-only cases, I
accidentally kept setting the SurfaceType to 2D in the stencil-only case
thanks to a copy+paste error.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-11-30 17:42:42 -08:00
Ilia Mirkin ef59cb0820 swr: add streamout buffer offset into pBuffer pointer
The buffer_size does not take the offset into account. Just add the
offset into the pointer which lines up the structures much better.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-30 20:36:03 -05:00
Ilia Mirkin 3d837a8871 swr: fix assertion for max number of so targets
The number has to be less than or equal to the max, not just less than.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-30 20:36:00 -05:00
Ilia Mirkin 02b2efa5eb swr: properly report max number of SO components
The components count the number of individual values, not the number of
slots.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-30 20:35:56 -05:00
Ilia Mirkin ab3bbe06ed swr: turn off queries around blits
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-30 20:35:53 -05:00
Ilia Mirkin d8ce8acdfa swr: don't advertise stream pause/resume
There is no support for resuming streamout. Furthermore, this also
controls glDrawTransformFeedback functionality which requires the same
ability to query how many primitives were sent out of TF.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-30 20:35:43 -05:00
Ilia Mirkin 632c11e857 swr: fix range computation for instanced client-side arrays
We need to take the instance divisor and number of instances into
account for instanced client-side arrays, rather than the vertex
parameters.

Loosely based on the comparable nvc0 logic.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-11-30 20:35:33 -05:00
Ilia Mirkin 3b736acf1b swr: [rasterizer memory] assert when trying to convert an unknown format
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-30 20:35:16 -05:00
Ilia Mirkin 763c015ce5 swr: remove warning about multi-layer surfaces
We now support clearing these, and actually rendering to multiple layers
would require GS support, which will fail in much more spectacular ways
for now. Once that is hooked up, there won't be anything else to do
here.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-30 20:35:06 -05:00