Commit Graph

109835 Commits

Author SHA1 Message Date
Lionel Landwerlin 0d618bb635 i965: perf: sklgt2: update compute metrics config
This unifies some of the programming between pre-production stepping
and production ones.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-31 10:35:16 +01:00
Lionel Landwerlin 4edaa6f003 i965: perf: sklgt2: update a priority for register programming
This makes no difference in term of programming, it's just a cleanup.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-31 10:35:16 +01:00
Alyssa Rosenzweig e4e6a3deaf panfrost: Implement FIXED formats
Fixes crash in dEQP-GLES2.functional.draw.random.9

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-31 04:42:37 +00:00
Alyssa Rosenzweig ed160a1160 panfrost: Fix index calculation types and asserts
Fixes crash in
dEQP-GLES2.functional.draw.draw_elements.points.single_attribute.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-31 04:42:22 +00:00
Alyssa Rosenzweig 0e4c321c15 panfrost: Clean index state between indexed draws
Fixes subsequent tests in
dEQP-GLES2.functional.draw.draw_elements.indices.*

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-31 04:41:54 +00:00
Alyssa Rosenzweig 4fcd3189ae panfrost/decode: Print negative_start
This property slipped through..

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-31 04:41:06 +00:00
Alyssa Rosenzweig 9237204400 panfrost: Implement missing texture formats
- Implements RGB565/RGBA5551 formats
 - Don't advertise support for flipped RGBA5551 and ETC

Fixes remaining tests in dEQP-GLES2.functional.texture.format.* which is
now at 36/36.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-31 02:36:38 +00:00
Alyssa Rosenzweig 01fce794dc panfrost: Extend tiling for cubemaps
transfer_unmap now tiles for any tiled resource, not just TEXTURE_2D,
which should more than just cubemaps!

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-31 02:36:38 +00:00
Alyssa Rosenzweig c87f3ce97f panfrost: Implement command stream for linear cubemaps
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-31 02:36:38 +00:00
Alyssa Rosenzweig 70b3e5db7d panfrost/midgard: Emit cubemap coordinates
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-31 02:36:38 +00:00
Alyssa Rosenzweig b5f02bdd99 panfrost: Include all cubemap faces in bitmap list
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-31 02:36:38 +00:00
Alyssa Rosenzweig 3197b30c6e panfrost/decode: Decode all cubemap faces
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-31 02:36:38 +00:00
Alyssa Rosenzweig e658f7225d panfrost: Preliminary work for cubemaps
Again, not yet functional, but this sets up the memory management for
cube maps.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-31 02:36:37 +00:00
Alyssa Rosenzweig 499f31aab8 panfrost/midgard: Add L/S op for writing cubemap coordinates
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-31 02:36:37 +00:00
Alyssa Rosenzweig f67616ce60 panfrost/midgard: Disassemble `cube` texture op
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-31 02:36:37 +00:00
Alyssa Rosenzweig 28b234a092 panfrost: Fix vertex buffer corruption
Fixes crash in dEQP-GLES2.functional.buffer.*

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-31 02:36:37 +00:00
Rob Clark b2d651b862 iris: fix set_sampler_view
Update to match docs.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-30 13:05:31 -04:00
Rob Clark e167e8f8a2 gallium/docs: clarify set_sampler_views (v2)
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-30 13:04:00 -04:00
Rob Clark 7ff6705b8d freedreno/ir3: convert to "new style" frag inputs
Add support for load_barycentric_pixel, load_interpolated_input, and
friends.  For now, this retains support for old-style inputs, which can
probably be dropped with some ttn work.

Prep work for sample-shading support.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-30 12:56:01 -04:00
Rob Clark fc865de777 freedreno/ir3: add pass to move varying loads
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-30 12:56:01 -04:00
Rob Clark 831f1a05c0 freedreno/ir3: rework varying packing
Originally we kept track of a table of inputs.  But with new-style frag
inputs this becomes awkward.  Re-work it so that initially we assigned
un-packed varying locations, and then after the shader is compiled scan
to find actual used inputs, and re-pack.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-30 12:56:01 -04:00
Rob Clark 91a1354cd6 freedreno/ir3: re-indent comment
Make it more clear that it applies to the following 'case' statements,
rather than the previous one.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-30 12:56:01 -04:00
Rob Clark 1ae0c030cb nir: add lower_all_io_to_elements
I need this part of lower_all_io_to_temps but without the actual
lowering to temps part.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-30 12:56:01 -04:00
Rob Clark e5e67228f5 nir: print var name for load_interpolated_input too
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
2019-03-30 12:55:47 -04:00
Sergii Romantsov 72a921e12a i965,iris/blorp: do not blit 0-sizes
Seems there is no sense in blitting 0-sized sources
or destinations.
Additionaly it may cause segfaults for i965.

v2: Function call replaced with inline check

v3: Added check to avoid devision by zero (L. Landwerlin)

v4: Added simillar check for Iris (L. Landwerlin)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110239
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-03-30 11:50:40 +00:00
Vinson Lee e757a2481f gallium: Fix autotools build with libxatracker.la.
CXXLD    libxatracker.la
/usr/bin/ld: ../../../../src/gallium/auxiliary/.libs/libgallium.a(tgsi_to_nir.o): in function `ttn_finalize_nir':
src/gallium/auxiliary/nir/tgsi_to_nir.c:2111: undefined reference to `gl_nir_lower_samplers_as_deref'
/usr/bin/ld: src/gallium/auxiliary/nir/tgsi_to_nir.c:2113: undefined reference to `gl_nir_lower_samplers'

Fixes: 9a834447d6 ("tgsi_to_nir: Produce optimized NIR for a given pipe_screen.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109929
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2019-03-29 23:24:05 -07:00
Timur Kristóf 356ec7a219 gallium: fix autotools build of pipe_msm.la
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Fixes: 9a834447d6 ("tgsi_to_nir: Produce optimized NIR for a given pipe_screen.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109929
2019-03-29 23:12:40 -07:00
Jason Ekstrand 7dbd934e26 nir: Lock around validation fail shader dumping
This prevents getting mixed-up results if a multi-threaded app has two
validation errors in different threads.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-03-29 21:57:51 -05:00
Brian Paul b8e077daee util: no-op __builtin_types_compatible_p() for non-GCC compilers
__builtin_types_compatible_p() is GCC-specific and breaks the
MSVC build.

This intrinsic has been in u_vector_foreach() for a long time, but
that macro has only recently been used in code
(nir/nir_opt_comparison_pre.c) that's built with MSVC.

Fixes: 2cf59861a ("nir: Add partial redundancy elimination for compares")

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-03-29 15:33:43 -06:00
Caio Marcelo de Oliveira Filho 3b20ca34ae iris: Clean up compiler warnings about unused
Removed a few unused variables and iris_getparam_boolean().
Kept 'name' around since there's a commented debug that make use of it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-29 12:07:26 -07:00
Eric Engestrom 8d9c2044a4 egl: hide entrypoints that shouldn't be exported when using glvnd
From GLVND author:
> From a functional standpoint, exporting additional symbols doesn't
> really matter, since libglvnd will load the vendor libraries with
> RTLD_LOCAL.

Suggested-by: Kyle Brenneman <kbrenneman@nvidia.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kyle Brenneman <kbrenneman@nvidia.com>
2019-03-29 16:54:08 +00:00
Karol Herbst fea0caea2b nir/validate: validate that tex deref sources are actually derefs
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-29 16:03:22 +01:00
Karol Herbst 6ffc72472c nir/print: fix printing the image_array intrinsic index
Fixes: 0de003be03 ("nir: Add handle/index-based image intrinsics")

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-29 16:03:22 +01:00
Timothy Arceri 4478c5374b Revert "ac/nir: use new LLVM 8 intrinsics for SSBO atomic operations"
This reverts commit 29132af234.

It seems the new intrinsic causes a hang on radeonsi (VEGA) when running the
piglit test:

tests/spec/arb_shader_storage_buffer_object/execution/ssbo-atomicCompSwap-int.shader_test
2019-03-29 21:04:01 +11:00
Samuel Pitoiset cc752dea61 ac: fix return type for llvm.amdgcn.frexp.exp.i32.64
This fixes the following piglit with RadeonSI
tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-03-29 09:18:24 +01:00
Gert Wollny a0edceb00d virgl: Add a caps feature check version
When we add new feature checks on the host side that is used to
enable a cap conditionally that was enabled unconditionally before
we might end up with a feature regression when a new mesa version
is used with an old virglrenderer version that doesn't check for
that cap.

To work around this problem add a version id to the caps that corresponds
to the features that are actually checked on the host and check that
version too when enabling the cap.

Fixes: 2ee197d6e8
    virgl: Enable mixed color FBO attachemnets only when the host supports it

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Pohsien Wang <pwang@chromium.org>
2019-03-29 07:55:31 +00:00
Samuel Pitoiset 62a9d757e6 radv: do not always initialize HTILE in compressed state
Especially when performing a transtion from UNDEFINED->GENERAL,
the driver shouldn't initialize HTILE metadata in compressed
state because it doesn't decompress when the src layout is
GENERAL.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110259
Fixes: 3a2e93147f ("radv: always initialize HTILE when the src layout is UNDEFINED")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-29 08:28:18 +01:00
Kenneth Graunke 3fee3d1319 iris: Print the memzone name when allocating BOs with INTEL_DEBUG=buf
This gives me an idea of what kinds of buffers are being allocated on
the fly which could help inform our cache decisions.
2019-03-28 23:37:32 -07:00
Brian Paul 4ee057eaef nir: use {0} initializer instead of {} to fix MSVC build
Trivial change.

Fixes: c6ee46a75 ("nir: Add nir_alu_srcs_negative_equal")
2019-03-28 20:34:23 -06:00
Ian Romanick 7832fb7889 intel/compiler: Use partial redundancy elimination for compares
Almost all of the hurt shaders are repeated instances of the same shader
in synmark's compilation speed tests.

shader-db results:

All Gen6+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 15256840 -> 15256389 (<.01%)
instructions in affected programs: 54137 -> 53686 (-0.83%)
helped: 288
HURT: 0
helped stats (abs) min: 1 max: 15 x̄: 1.57 x̃: 1
helped stats (rel) min: 0.06% max: 26.67% x̄: 1.99% x̃: 0.74%
95% mean confidence interval for instructions value: -1.76 -1.38
95% mean confidence interval for instructions %-change: -2.47% -1.50%
Instructions are helped.

total cycles in shared programs: 372286583 -> 372283851 (<.01%)
cycles in affected programs: 833829 -> 831097 (-0.33%)
helped: 265
HURT: 16
helped stats (abs) min: 2 max: 74 x̄: 11.81 x̃: 4
helped stats (rel) min: 0.04% max: 9.07% x̄: 0.99% x̃: 0.35%
HURT stats (abs)   min: 2 max: 130 x̄: 24.88 x̃: 8
HURT stats (rel)   min: <.01% max: 12.31% x̄: 1.44% x̃: 0.27%
95% mean confidence interval for cycles value: -12.30 -7.15
95% mean confidence interval for cycles %-change: -1.06% -0.64%
Cycles are helped.

Iron Lake and GM45 had similar results. (GM45 shown)
total instructions in shared programs: 5038653 -> 5038495 (<.01%)
instructions in affected programs: 13939 -> 13781 (-1.13%)
helped: 50
HURT: 1
helped stats (abs) min: 1 max: 15 x̄: 3.18 x̃: 4
helped stats (rel) min: 0.33% max: 13.33% x̄: 2.24% x̃: 1.09%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.83% max: 0.83% x̄: 0.83% x̃: 0.83%
95% mean confidence interval for instructions value: -3.73 -2.47
95% mean confidence interval for instructions %-change: -3.16% -1.21%
Instructions are helped.

total cycles in shared programs: 128118922 -> 128118228 (<.01%)
cycles in affected programs: 134906 -> 134212 (-0.51%)
helped: 50
HURT: 0
helped stats (abs) min: 2 max: 60 x̄: 13.88 x̃: 18
helped stats (rel) min: 0.06% max: 3.19% x̄: 0.74% x̃: 0.70%
95% mean confidence interval for cycles value: -16.54 -11.22
95% mean confidence interval for cycles %-change: -0.95% -0.53%
Cycles are helped.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-28 15:35:53 -07:00
Ian Romanick 2cf59861a8 nir: Add partial redundancy elimination for compares
This pass attempts to dectect code sequences like

    if (x < y) {
        z = y - x;
	...
    }

and replace them with sequences like

    t = x - y;
    if (t < 0) {
        z = -t;
	...
    }

On architectures where the subtract can generate the flags used by the
if-statement, this saves an instruction.  It's also possible that moving
an instruction out of the if-statement will allow
nir_opt_peephole_select to convert the whole thing to a bcsel.

Currently only floating point compares and adds are supported.  Adding
support for integer will be a challenge due to integer overflow.  There
are a couple possible solutions, but they may not apply to all
architectures.

v2: Fix a typo in the commit message and a couple typos in comments.
Fix possible NULL pointer deref from result of push_block().  Add
missing (-A + B) case.  Suggested by Caio.

v3: Fix is_not_const_zero to work correctly with types other than
nir_type_float32.  Suggested by Ken.

v4: Add some comments explaining how this works.  Suggested by Ken.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-28 15:35:53 -07:00
Ian Romanick c6ee46a753 nir: Add nir_alu_srcs_negative_equal
v2: Move bug fix in get_neg_instr from the next patch to this patch
(where it was intended to be in the first place).  Noticed by Caio.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-28 15:35:52 -07:00
Ian Romanick be1cc3552b nir: Add nir_const_value_negative_equal
v2: Rebase on 1-bit Boolean changes.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-28 15:35:52 -07:00
Ian Romanick ae21b52e1d nir/algebraic: Add missing 16-bit extract_[iu]8 patterns
No shader-db changes on any Intel platform.

v2: Use a loop to generate patterns.  Suggested by Jason.

v3: Fix a copy-and-paste bug in the extract_[ui] of ishl loop that would
replace an extract_i8 with and extract_u8.  This broke ~180 tests.  This
bug was introduced in v2.

Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Reviewed-by: Dylan Baker <dylan@pnwbakers.com> [v2]
Acked-by: Jason Ekstrand <jason@jlekstrand.net> [v2]
2019-03-28 15:35:52 -07:00
Ian Romanick cbad201c2b nir/algebraic: Add missing 64-bit extract_[iu]8 patterns
No shader-db changes on any Intel platform.

v2: Use a loop to generate patterns.  Suggested by Jason.

v3: Fix a copy-and-paste bug in the extract_[ui] of ishl loop that would
replace an extract_i8 with and extract_u8.  This broke ~180 tests.  This
bug was introduced in v2.

Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Reviewed-by: Dylan Baker <dylan@pnwbakers.com> [v2]
Acked-by: Jason Ekstrand <jason@jlekstrand.net> [v2]
2019-03-28 15:35:52 -07:00
Ian Romanick bc17f5a2a3 nir/algebraic: Remove redundant extract_[iu]8 patterns
No shader-db changes on any Intel platform.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-28 15:35:52 -07:00
Ian Romanick c152672e68 nir/algebraic: Fix up extract_[iu]8 after loop unrolling
Skylake, Broadwell, and Haswell had similar results. (Skylake shown)
total instructions in shared programs: 15256840 -> 15256837 (<.01%)
instructions in affected programs: 4713 -> 4710 (-0.06%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.06% max: 0.08% x̄: 0.06% x̃: 0.06%

total cycles in shared programs: 372286583 -> 372286583 (0.00%)
cycles in affected programs: 198516 -> 198516 (0.00%)
helped: 1
HURT: 1
helped stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10
helped stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01%
HURT stats (abs)   min: 10 max: 10 x̄: 10.00 x̃: 10
HURT stats (rel)   min: 0.01% max: 0.01% x̄: 0.01% x̃: 0.01%

No changes on any other Intel platform.

v2: Use a loop to generate patterns.  Suggested by Jason.

v3: Fix a copy-and-paste bug in the extract_[ui] of ishl loop that would
replace an extract_i8 with and extract_u8.  This broke ~180 tests.  This
bug was introduced in v2.

Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Reviewed-by: Dylan Baker <dylan@pnwbakers.com> [v2]
Acked-by: Jason Ekstrand <jason@jlekstrand.net> [v2]
2019-03-28 15:35:52 -07:00
Dave Airlie b779baa9bf nir/deref: fix struct wrapper casts. (v3)
llvm/spir-v spits out some struct a { struct b {} }, but it
doesn't deref, it casts (struct a) to (struct b), reconstruct
struct derefs instead of casts for these.

v2: use ssa_def_rewrite uses, rework the type restrictions (Jason)
v3: squish more stuff into one function, drop unused temp (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-29 08:10:50 +10:00
Rafael Antognolli 8e0469f629 i965/blorp: Remove unused parameter from blorp_surf_for_miptree.
It seems pretty useless nowadays.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-28 14:38:23 -07:00
Anuj Phogat 9c421d6b47 iris/icl: Add WA_2204188704 to disable pixel shader panic dispatch
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-28 19:59:59 +00:00