Commit Graph

78305 Commits

Author SHA1 Message Date
Timothy Arceri 3fd4280759 glsl: validate arrays of arrays on empty type delclarations
Fixes:
dEQP-GLES31.functional.shaders.arrays_of_arrays.invalid.empty_declaration_without_var_name_fragment
dEQP-GLES31.functional.shaders.arrays_of_arrays.invalid.empty_declaration_without_var_name_vertex

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-02-09 13:52:52 +11:00
Kenneth Graunke 74f956c416 i965: Use nir_lower_load_const_to_scalar().
I don't know why, but we never hooked up this pass Eric wrote.
Otherwise, you can end up with stupid scalarized code such as:

   vec4 ssa_7 = load_const (0.0, 0.0, 0.0, 0.0)
   vec4 ssa_8 = ...
   vec1 ssa_9 = feq ssa_8, ssa_7
   vec1 ssa_10 = feq ssa_8.y, ssa_7.y
   vec1 ssa_11 = feq ssa_8, ssa_7.z
   vec1 ssa_12 = feq ssa_8.y, ssa_7.w

ssa_8.xyxy == <0, 0, 0, 0> should only take two feq instructions.

shader-db on Skylake:

total instructions in shared programs: 9121153 -> 9120749 (-0.00%)
instructions in affected programs: 32421 -> 32017 (-1.25%)
helped: 277
HURT: 69

total cycles in shared programs: 69003364 -> 69000912 (-0.00%)
cycles in affected programs: 899186 -> 896734 (-0.27%)
helped: 313
HURT: 403

This also prevents regressions when disabling channel expressions.

v2: Don't call opt_cse afterwards (requested by Matt).  It should
    happen in the optimization loop below anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-02-08 18:10:34 -08:00
Timothy Arceri 184afd8fd9 mesa: remove now unused sampler index handing code
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-02-09 12:03:02 +11:00
Timothy Arceri edc108765e mesa: compute sampler index in ir_to_mesa rather than using UniformHash
The aim of this is to work towards removing UniformHash from the program
struct so that we don't need to hold onto it in memory and pass it around
outside the linker.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-02-09 12:02:58 +11:00
Kenneth Graunke d0e1d6b7e2 i965: Don't add barrier deps for FB write messages.
There are never render target reads, so there are no scheduling hazards.

Giving the extra flexibility to the scheduler makes it possible to do
FB writes as soon as their sources are available, reducing register
pressure.  It also makes it possible to do the payload setup for more
than one FB write message at a time, which could better hide latency.

shader-db results on Skylake:

total instructions in shared programs: 9110254 -> 9110211 (-0.00%)
instructions in affected programs: 2898 -> 2855 (-1.48%)
helped: 3
HURT:   0
LOST:   0
GAINED: 1

A reduction in instruction counts is surprising, but legitimate:
the three shaders helped were spilling, and reducing register
pressure allowed us to issue fewer spills/fills.

total cycles in shared programs: 69035108 -> 68928820 (-0.15%)
cycles in affected programs: 4412402 -> 4306114 (-2.41%)
helped: 4457
HURT: 213

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-02-08 16:59:35 -08:00
Dave Airlie 6502b3f60e st/mesa: enable AoA for gallium drivers reporting GLSL 1.30
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-02-09 10:52:09 +10:00
Dave Airlie b74e8c89a6 st/mesa: add atomic AoA support
reuse the sampler deref handling code to do the same
thing for atomics.

Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-02-09 10:52:09 +10:00
Dave Airlie 90bbe3d781 mesa: drop unused nonconst sampler functions.
Since we fixed the glsl->tgsi conversion we no longer need
this function.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-02-09 10:52:08 +10:00
Dave Airlie bb8bbe34e3 st/mesa: handle indirect samplers in arrays/structs properly (v4.1)
The state tracker never handled this properly, and it finally
annoyed me for the second time so I decided to fix it properly.

This is inspired by the NIR sampler lowering code and I only realised
NIR seems to do its deref ordering different to GLSL at the last
minute, once I got that things got much easier.

it fixes a bunch of tests in
tests/spec/arb_gpu_shader5/execution/sampler_array_indexing/

v2: fix AoA tests when forced on.
I was right I didn't need all that code, fixing the AoA code
meant cleaning up a chunk of code I didn't like in the array
handling.

v3: start generalising the code a bit more for atomics.
v3.1: use UniformRemapTable

v4: handle uniforms differently using the param_index,
and go back to UniformStorage
fix issues identified by Timothy with deref handling.
v4.1: squash const fix and move handling 1D const out
of recursive function.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-02-09 10:52:08 +10:00
Dave Airlie 52801766a0 glsl/ir: add param index to variable.
We have a requirement to store the index into the mesa parameterlist
for uniforms. Up until now we've overwritten var->data.location with
this info. However this then stops us accessing UniformStorage,
which is needed to do proper dereferencing.

Add a new variable to ir_variable to store this value in, and change
the two uses to use it correctly.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-02-09 10:52:08 +10:00
Francisco Jerez 53739fddc6 i965: Rename define for the PIPE_CONTROL DC flush bit.
Its previous name was somewhat misleading, this really behaves like a
RW cache flush rather than an invalidation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-02-08 15:48:00 -08:00
Francisco Jerez 10d84ba9f0 i965: Invalidate state cache before L3 partitioning set-up.
The state cache is also L3-backed so it seems sensible to make sure
it's clean as we do for other RO caches before repartitioning the L3.
This wasn't part of my original L3 partitioning code because I was
able to reproduce hangs on Gen7 hardware when the state cache
invalidation happened asynchronously with previous 3D rendering, which
should no longer be possible after the previous change.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-02-08 15:47:21 -08:00
Francisco Jerez 0aa4f99f56 i965: Fix cache pollution race during L3 partitioning set-up.
We need to split the stalling flush from the RO cache invalidation
into a different PIPE_CONTROL command to make sure that the top of the
pipe invalidation happens after any previous rendering is complete.
Otherwise it's possible for previous rendering to pollute the L3 cache
in the short window of time between RO invalidation and the completion
of the stalling flush.  Fixes rendering artifacts on Unigine Heaven,
Metro Last Light Redux and Metro 2033 Redux.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93540
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93599
Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Tested-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-02-08 15:45:44 -08:00
Francisco Jerez 1817e3c07a i965/fs: Don't emit unnecessary SEL instruction from emit_image_atomic().
The SEL instruction with predication mode NONE emitted when the atomic
operation doesn't need to be predicated is a no-op and might rely on
undocumented hardware behaviour.  Noticed by chance while looking at
the assembly output.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-02-08 15:43:05 -08:00
Matt Turner c300559fbf i965/vec4: Update vec4 unit tests for commit 01dacc83ff.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94050
2016-02-08 15:32:12 -08:00
Francisco Jerez cec6fe2ad8 vtn: Clean up acos implementation.
Parameterize build_asin() on the fit coefficients so the
implementation can be shared while still using different polynomials
for asin and acos.  Also switch back to implementing acos in terms of
asin -- The improvement obtained from cancelling out the pi/2 terms
was negligible compared to the approximation error.
2016-02-08 15:23:43 -08:00
Francisco Jerez f50a651726 nir/spirv: Create integer types of correct signedness.
vtn_handle_type() creates a signed type regardless of the value of the
signedness flag, which usually doesn't make much of a difference
except when the type is used as base sampled type of an image type,
what will cause the base type of the NIR image variable to be
inconsistent with its format and cause an assertion failure in the
back-end (most likely only reproducible on Gen7), and may change the
semantics of the image intrinsic subtly (e.g. UMIN may become IMIN).
2016-02-08 15:23:35 -08:00
Brian Paul 01dacc83ff dri/common: include debug_output.h to silence warning 2016-02-08 10:52:02 -07:00
Brian Paul 59251610ed tgsi: minor whitespace fixes in tgsi_scan.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-02-08 09:29:38 -07:00
Brian Paul 42246ab1f5 tgsi: s/true/TRUE/ in tgsi_scan.c
Just to be consistent.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-02-08 09:29:38 -07:00
Brian Paul da6e879a6c tgsi: use switches instead of big if/else ifs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-02-08 09:29:38 -07:00
Brian Paul 37eb3f0400 tgsi: break gigantic tgsi_scan_shader() function into pieces
New functions for examining instructions, declarations, etc.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-02-08 09:29:38 -07:00
Brian Paul 3c3ef69696 st/mesa: minor formatting fixes in st_cb_bitmap.c 2016-02-08 09:29:38 -07:00
Brian Paul 5fdbfb8d6f mesa: move GL_ARB_debug_output code into new debug_output.c file
The errors.c file had grown quite large so split off this extension
code into its own file.  This involved making a handful of functions
non-static.

Acked-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-02-08 09:29:38 -07:00
Brian Paul 6691ba1fe8 gallium/util: whitespace, formatting fixes in u_debug_stack.c 2016-02-08 09:29:38 -07:00
Brian Paul 5d2539cb49 gallium/util: whitespace, formatting fixes in u_staging.[ch] files
Still some nonsensical comments.
2016-02-08 09:29:38 -07:00
Brian Paul c84a8911fc gallium/util: switch over to new u_debug_image.[ch] code
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-08 09:29:38 -07:00
Brian Paul 3917c8f3f9 gallium/util: put image dumping functions into separate file
To try to reduce the clutter in u_debug.[ch]

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-08 09:29:38 -07:00
Brian Paul 6c7d4a7173 gallium/util: whitespace, formatting fixes in u_debug.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-08 09:29:38 -07:00
Samuel Pitoiset efe5829578 trace: add missing pipe_context::clear_texture()
This fixes a crash with bin/arb_clear_texture-base-formats and
probably some other tests which use clear_texture().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-08 00:06:32 +01:00
Samuel Pitoiset 1dacbb7b46 trace: remove useless MALLOC() in trace_context_draw_vbo()
There is no need to allocate memory when unwrapping the indirect buf.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-08 00:06:22 +01:00
Vinson Lee ccaf734275 mesa/extensions: Fix NVX_gpu_memory_info lexicographical order.
Fixes MesaExtensionsTest.AlphabeticallySorted.

Fixes: 1d79b99580 ("mesa: implement GL_NVX_gpu_memory_info (v2)")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94016
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-07 14:42:00 -08:00
Ilia Mirkin 88519c6087 glsl: return cloned signature, not the builtin one
The builtin data can get released with a glReleaseShaderCompiler call.
We're careful everywhere to clone everything that comes out of builtins
except here, where we accidentally return the signature belonging to the
builtin version, rather than the locally-cloned one.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Tested-by: Rob Herring <robh@kernel.org>
Cc: mesa-stable@lists.freedesktop.org
2016-02-07 17:23:58 -05:00
Ilia Mirkin ac57577e29 glsl: make sure builtins are initialized before getting the shader
The builtin function shader is part of the builtin state, released
when glReleaseShaderCompiler is called. We must ensure that the
builtins have been (re)initialized before attempting to link with the
builtin shader.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Tested-by: Rob Herring <robh@kernel.org>
Cc: mesa-stable@lists.freedesktop.org
2016-02-07 17:23:57 -05:00
Samuel Pitoiset 04c2ca5038 tgsi: use TGSI_WRITEMASK_XYZW instead of hardcoding the mask
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
2016-02-06 20:24:41 +01:00
Kristian Høgsberg Kristensen 6c4c04690f anv: Deduplicate dispatch calls
This can all be shared between gen8+ and pre-gen8.
2016-02-05 22:36:53 -08:00
Timothy Arceri ea7f64f74d glsl: don't generate transform feedback candidate when not required
If we are not even looking for one don't bother generating a candidate
list.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-02-06 14:34:43 +11:00
Timothy Arceri c1bbaff1e8 glsl: replace unreachable code with an assert()
All interface blocks will have been lowered by this point so just
use an assert. Returning false would have caused all sorts of
problems if they were not lowered yet and there is an assert to
catch this later anyway.

We also update the tests to reflect this change.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-02-06 14:34:35 +11:00
Jan Vesely e377037bef r600, compute: Do not overwrite pipe_resource.screen
found by inspection.

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-05 21:17:15 -05:00
Kristian Høgsberg Kristensen bdefaae2b9 anv: Deduplicate anv_CmdDraw calls
These were all duplicated between gen7_cmd_buffer.c and
gen8_cmd_buffer.c. This commit consolidates both copies in
genX_cmd_buffer.c.
2016-02-05 16:41:56 -08:00
Kristian Høgsberg Kristensen 6cdada0360 anv: Move invariant state to small initial batch
We use the simple batch helper to submit a batch at driver startup time
which holds all the state that never changes.  We don't have a whole lot
and once we enable tesselation there'll be even less. Even so, it's a
simple mechanism and reduces our steady state batch sizes a bit.
2016-02-05 16:13:53 -08:00
Kristian Høgsberg Kristensen c9c3344c4f anv: Split out batch submit helper from anv_DeviceWaitIdle
We'll reuse this mechanism in the next commit.
2016-02-05 16:13:52 -08:00
Kristian Høgsberg Kristensen 381d85545a anv: Share scratch_space helper between gen7 and gen8+
The gen7 pipeline has a useful helper function for this, let's use it in
gen8_pipeline.c too.  The gen7 function has an off-by-one bug though: we
have to compute log2(size / 1024) - 1, but we divide by 2048 instead so
as to avoid the case where size is less than 1024 and we'd return -1.
2016-02-05 16:13:52 -08:00
Kristian Høgsberg Kristensen d1617dbec3 anv: Share URB setup between gen7 and gen8+ 2016-02-05 16:13:52 -08:00
Jason Ekstrand 9401516113 Merge remote-tracking branch 'mesa-public/master' into vulkan 2016-02-05 15:21:11 -08:00
Jason Ekstrand 741744f691 Merge commit mesa-public/master into vulkan
This pulls in the patches that move all of the compiler stuff around
2016-02-05 15:03:44 -08:00
Jason Ekstrand 9645b8eb1f Merge branch mesa-public/master into vulkan 2016-02-05 14:21:13 -08:00
Jan Vesely 5b51b2e000 r600g: Ignore format for PIPE_BUFFER targets
Fixes compute since 7dd31b81fe
gallium/radeon: support PIPE_CAP_SURFACE_REINTERPRET_BLOCKS

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-02-05 20:23:56 +01:00
Marek Olšák d8e4908b63 mesa/get: fix a breakage after rebase
trivial.
2016-02-05 19:39:13 +01:00
Matt Turner 9f2e22bf34 i965/vec4: don't copy ATTR into 3src instructions with complex swizzles
The vec4 backend, at the end, does this:

    if (inst->is_3src()) {
       for (int i = 0; i < 3; i++) {
          if (inst->src[i].vstride == BRW_VERTICAL_STRIDE_0)
             assert(brw_is_single_value_swizzle(inst->src[i].swizzle));

So make sure that we use the same conditions when trying to
copy-propagate. UNIFORMs will be converted to vstride 0 in
convert_to_hw_regs, but so will ATTRs when interleaved (as will happen
in a GS with multiple attributes). Since the vstride is not set at
copy-prop time, infer it by inspecting dispatch_mode and reject ATTRs if
they have non-scalar swizzles and are interleaved.

Fixes assertion errors in dolphin-generated geometry shaders (or
misrendering on opt builds) on Sandybridge or on IVB/HSW with
INTEL_DEBUG=nodualobj.

Co-authored-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93418
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
2016-02-05 09:33:19 -08:00