Commit Graph

107617 Commits

Author SHA1 Message Date
Rhys Perry 6790b3a8db ac/nir: make ac_build_isign work on all bit sizes
v2: don't use ac_get_zero(), ac_get_one() and ac_int_of_size()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:04:20 +00:00
Rhys Perry bbbfdef683 ac/nir: make ac_build_clamp work on all bit sizes
v2: don't use ac_get_zerof() and ac_get_onef()
v3: rename "intr" to "name"

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:58 +00:00
Rhys Perry 7e5004e30a ac/nir: fix 64-bit nir_op_f2f16_rtz
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:44 +00:00
Rhys Perry c4ea20c0a0 ac/nir: implement 8-bit nir_load_const_instr
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:33 +00:00
Rhys Perry 0ca550e01a radv: ensure export arguments are always float
So that the signature is correct and consistent, the inputs to a export
intrinsic should always be 32-bit floats.

This and the previous commit fixes a large amount crashes from
dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.input_output_int_*
tests

Fixes: b722b29f10 ('radv: add support for 16bit input/output')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:22 +00:00
Rhys Perry 64065aa504 radv: bitcast 16-bit outputs to integers
16-bit outputs are stored as 16-bit floats in the outputs array, so they
have to be bitcast.

Fixes: b722b29f10 ('radv: add support for 16bit input/output')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:18 +00:00
Eric Engestrom 23b485c920 gitlab-ci: use ccache to speed up builds
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-02-19 10:09:51 +00:00
Eric Anholt dbe3af67a4 v3d: Move i2b and f2b support into emit_comparison.
This lets us save a resolve to NIR true/false for ifs and discard_if.  No
change in shader-db.
2019-02-18 18:18:37 -08:00
Eric Anholt 0bba9c8489 v3d: Emit a simpler negate for the iabs implementation.
One program affected in my shader-db.

instructions in affected programs: 110 -> 108 (-1.82%)
2019-02-18 18:13:09 -08:00
Eric Anholt 1a775d43c9 v3d: Delay emitting ldvpm on V3D 4.x until it's actually used.
For V3D 3.x, we emitted the ldvpms all at the top so that we didn't need
to do VPM setup when the load_inputs are out of order.  For V3D 4.x, we
can reduce register pressure by delaying our loads until they're actually
needed.  This also avoids a bunch of silly MOVs in the pre-opt VIR dump.

total instructions in shared programs: 6421415 -> 6419933 (-0.02%)
total uniforms in shared programs: 2393139 -> 2393140 (<.01%)
total threads in shared programs: 153864 -> 153906 (0.03%)
2019-02-18 18:09:07 -08:00
Eric Anholt 5a84d46896 v3d: Stop tracking num_inputs for VPM loads.
It's unused in the VS (since we need vattr_sizes[] anyway), so move it to
FS prog data.
2019-02-18 18:09:07 -08:00
Eric Anholt 581eba072d v3d: Add a function to describe what the c->execute.file check means.
This is what pointed out that we were misusing the check for last_thrsw in
the previous commit.
2019-02-18 18:09:07 -08:00
Eric Anholt 441294962c v3d: Fix the check for "is the last thrsw inside control flow"
The execute.file check used to be good enough, until I stopped setting up
the execute mask for uniform ifs.

No known tests fixed, noticed while doing a refactor.

Fixes: 0805060573 ("v3d: Handle dynamically uniform IF statements with uniform control flow.")
2019-02-18 18:09:07 -08:00
Eric Anholt 07d5b5a972 v3d: Fix f2b32 behavior.
Now that we don't have the vir_PF() magic, it's obvious that we were doing
the wrong thing for f2b32 by allowing -0.0 to produce true instead of
false.
2019-02-18 18:09:07 -08:00
Eric Anholt 3022b4bd82 v3d: Kill off vir_PF(), which is hard to use right.
You were allowed to pass in any old temp so that you could hopefully fold
the PF up into the def of the temp.  If we couldn't find one, it
implicitly generated a MOV(nop, reg).  However, that PF could have
different behavior depending on whether the def being folded into was a
float or int opcode, which the caller doesn't necessarily control.

Due to the fragility of the function, just switch all callers over to
vir_set_pf().  This also encourages the callers to use a _dest call for
the inst they're putting the PF on, eliminating a bunch of temps in the
pre-optimization VIR.

shader-db says the change is in the noise:

total instructions in shared programs: 6226247 -> 6227184 (0.02%)
instructions in affected programs: 851068 -> 852005 (0.11%)
2019-02-18 18:09:06 -08:00
Eric Anholt 6186a8d44e v3d: Do bool-to-cond for discard_if as well.
Turns this minimal conditional discard (glsl-fs-discard-01.shader_test):

0x3de0b086c5fe9000 fcmp.pushn  -, r1, r5; mov  r2, 0
0x3dec3086bbfc001f nop                  ; mov.ifa  r2, -1
0x3c047186bbe80000 nop                  ; mov.pushz  -, r2
0x3dea3186ba837000 setmsf.ifna  -, 0    ; nop

into:

0x3c00b186c582a000 fcmp.pushn  -, r2, r5; nop
0x3de83186ba837000 setmsf.ifa  -, 0     ; nop

total instructions in shared programs: 6229820 -> 6226247 (-0.06%)
2019-02-18 18:09:06 -08:00
Eric Anholt 718eef62cb v3d: Refactor bcsel and if condition handling.
Both were doing the same thing to try to get a condition to predicate on.
Noticed when I wanted to do this for discard_if as well.

No change in shader-db.
2019-02-18 18:09:06 -08:00
Eric Anholt 4586f9f902 v3d: Add a helper function for getting a nop register.
Just a little refactor to explain what's going on with QFILE_NULL.
2019-02-18 18:09:06 -08:00
Eric Anholt 339155122b v3d: Drop our hand-lowered nir_op_ffract.
The NIR lowering works fine, though it causes some slight noise due to
what looks like choices about propagating constants up multiply chains
changing.

total instructions in shared programs: 6229671 -> 6229820 (<.01%)
total uniforms in shared programs: 2312171 -> 2312324 (<.01%)
2019-02-18 18:09:06 -08:00
Eric Anholt 16f5085490 v3d: Drop a perf note about merging unpack_half_*, which has been implemented.
This is handled with copy-propagation now.
2019-02-18 18:09:06 -08:00
Eric Anholt 146e432b49 v3d: Fix incorrect flagging of ldtmu as writing r4 on v3d 4.x.
Fixes some stalls in 3DMMES's main vertex shader.

total instructions in shared programs: 6280751 -> 6211270 (-1.11%)
instructions in affected programs: 2935050 -> 2865569 (-2.37%)
2019-02-18 18:09:06 -08:00
Eric Anholt cd5e0b2729 v3d: Use the early_fragment_tests flag for the shader's disable-EZ field.
Apparently we need disable-EZ flagged, not just "does Z writes".

Fixes
dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo
on 7278, even though it passed in simulation.

Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: 051a41d3d5 ("v3d: Add support for the early_fragment_tests flag.")
2019-02-18 18:09:06 -08:00
Eric Anholt 332b969c4e v3d: Sync indirect draws on the last rendering.
Fixes intermittent fails in
dEQP-GLES31.functional.draw_indirect.compute_interop.separate.drawelements_compute_cmd_and_data_and_indices
and others (particularly when run as part of a CTS run)
2019-02-18 18:09:06 -08:00
Eric Anholt 32f16b0b1e v3d: Clear the GMP on initialization of the simulator.
Otherwise, we might have pages accessible that shouldn't be and miss out
on errors.  This is unlikely for most tests since v3d_hw_get_mem() is big
enough that it'll be a freshly zeroed mmap, but if screens are destroyed
and recreated then we'd be reusing the old v3d_hw_get_mem() contents.
2019-02-18 18:09:06 -08:00
Emil Velikov ba652394a3 docs: update calendar, add news item and link release notes for 18.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-18 18:38:14 +00:00
Emil Velikov d7108dac73 docs: add sha256 checksums for 18.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit bfb5bdaa97)
2019-02-18 18:36:23 +00:00
Emil Velikov a1ccff4aaf docs: add release notes for 18.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b26488dead)
2019-02-18 18:36:21 +00:00
Ilia Mirkin 57441af8bf i965: always enable EXT_float_blend
From the table in isl_format.c, it appears that all generations
support blending on 32-bit float surfaces.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-18 12:13:54 -05:00
Ilia Mirkin 9fec653093 st/mesa: enable GL_EXT_float_blend when possible
If the driver supports PIPE_BIND_BLENABLE on RGBA32F, flip
EXT_float_blend on (which will affect ES3 contexts).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-18 12:13:54 -05:00
Ilia Mirkin 070a5e5d92 mesa: add explicit enable for EXT_float_blend, and error condition
If EXT_float_blend is not supported, error out on blending of FP32
attachments in an ES2 context.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-18 12:13:54 -05:00
Samuel Pitoiset 47616810ed radv: fix writing the alpha channel of MRT0 when alpha coverage is enabled
This version is better and safer.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 18:06:07 +01:00
Rob Clark d6c43cceff freedreno/ir3: handle quirky atomic dst for a6xx
The new encoding returns a value via the 2nd src.  The legalize pass
needs to be aware of this to set the correct needs_sy flag, otherwise we
can, in cases where the atomic dst is not used, overwrite the register
that hardware will asynchronously load result into without (sy) flag, so
it gets clobbered by the atomic result.

This fixes a whole lot of rando ssbo+atomic fails, like
dEQP-GLES31.functional.ssbo.layout.single_basic_type.packed.highp_vec4.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-18 12:01:36 -05:00
Rob Clark 28fc6733cd freedreno/a6xx: fix helper_invocation (sampler mask/id)
Since gl_HelperInvocation is lowered to:

  !((1 << sample_id) & sample_mask_in))

Not setting these enable bits was causing it be broken.  (And probably a
bunch of other stuff too.)

Fixes dEQP-GLES31.functional.shaders.helper_invocation.*

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-18 10:37:54 -05:00
Samuel Pitoiset 32ab7a59bb radv: remove unused variable in gather_push_constant_info()
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-18 13:30:16 +01:00
Lionel Landwerlin 8c87d029bc i965: scale factor changes should trigger recompile
Found by inspection.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3da858a6b9 ("intel/compiler: add scale_factors to sampler_prog_key_data")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-18 12:18:13 +00:00
Samuel Pitoiset 0d8f096293 radv: write the alpha channel of MRT0 when alpha coverage is enabled
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109597
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 12:14:22 +01:00
Samuel Pitoiset 2cf5433b99 ac: use new LLVM 8 intrinsic when loading 16-bit values
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 12:14:20 +01:00
Samuel Pitoiset f0223143a8 ac: add ac_build_llvm8_tbuffer_load() helper
It uses the new LLVM intrinsics.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 12:14:17 +01:00
Tapani Pälli 9762a9f893 mesa: return NULL if we exceed MaxColorAttachments in get_fb_attachment
This fixes invalid access to Attachment array which would occur if caller
would exceed MaxColorAttachments. In practice this should not ever happen
because DiscardFramebufferEXT specifies only GL_COLOR_ATTACHMENT0 to be
valid and InvalidateFramebuffer will error out before but this should
make coverity happy.

v2: const, remove _EXT (Ian)

CID: 1442559
Fixes: 0c42b5f3cb "mesa: wire up InvalidateFramebuffer"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-18 07:51:55 +02:00
Alyssa Rosenzweig 2c6a7fbeb7 panfrost: Fix clipping region
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:13:50 +00:00
Alyssa Rosenzweig fa1b36ddc2 panfrost: Preserve w sign in perspective division
This fixes issues where polygons that should be culled (due to negative
w, for instance) may not be.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:13:34 +00:00
Alyssa Rosenzweig 49985cebea panfrost: Cleanup mali_viewport (clipping) code
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:13:03 +00:00
Alyssa Rosenzweig a94463732a panfrost: Swap order of tiled texture (de)alloc
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:10:33 +00:00
Alyssa Rosenzweig 4a4ed53c01 panfrost: Free imported BOs
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:10:06 +00:00
Alyssa Rosenzweig b5a01296f4 panfrost: Fix various leaks unmapping resources
v2: Don't check for NULL before free()

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:09:41 +00:00
Kenneth Graunke 535251487b nir: Don't reassociate add/mul chains containing only constants
The idea here is to reassociate a * (b * c) into (a * c) * b, when
b is a non-constant value, but a and c are constants, allowing them
to be combined.

But nothing was enforcing that 'b' must be non-constant, which meant
that running opt_algebraic in a loop would never terminate if the IR
contained non-folded constant expressions like 256 * 0.5 * 2.  Normally,
we call constant folding in such a loop too, but IMO it's better for
nir_opt_algebraic to be robust and not rely on that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109581
Fixes: 32e266a9a5 i965: Compile fp64 funcs only if we do not have 64-bit hardware support

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-16 23:36:14 -08:00
Chris Wilson e9882b879b i965: Assert the execobject handles match for this device
Object handles are local to the device fd, so double check we are not
mixing together objects from multiple screens on execbuf submission.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-16 23:35:29 -08:00
Rob Clark 99b90ecd35 freedreno/a6xx: cache flush harder
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark 1af0c5d320 freedreno/a6xx: compute support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark 5118dcf8c3 freedreno/a6xx: image/ssbo state emit
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00