Commit Graph

118321 Commits

Author SHA1 Message Date
Pierre-Eric Pelloux-Prayer f5c1cb2383 radeonsi: dcc dirty flag
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-10 09:25:28 +01:00
Pierre-Eric Pelloux-Prayer e3e91cebcd radeonsi: fix multi plane buffers creation
When using 3 planes, the sequence produces this chain:
  plane0 -> plane2
This commit fixes this to produce:
  plane0 -> plane1 -> plane2

Fixes: 86e60bc265 ("radeonsi: remove si_vid_join_surfaces and use combined planar allocations")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2193
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-10 08:52:16 +01:00
Pierre-Eric Pelloux-Prayer ff0f108666 radeonsi: use gfx9.surf_offset to compute texture offset
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2177
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-10 08:52:07 +01:00
Sonny Jiang 6c901f0675 radeonsi: use compute shader for clear 12-byte buffer
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-09 23:25:57 -05:00
Marek Olšák 38e9eb9561 st/mesa: release the draw shader properly to fix driver crashes (iris)
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 22:41:41 -05:00
Marek Olšák 41118246c6 draw, st/mesa: generate TGSI for ffvp/ARB_vp if draw lacks LLVM
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-09 21:09:28 -05:00
Marek Olšák a3de63fbb3 st/mesa: don't generate VS TGSI if NIR is enabled
it's no longer needed

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák a90f4453fe st/mesa: remove struct st_vp_variant in favor of st_common_variant
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák 6299b90fd4 st/mesa: remove st_vp_variant::num_inputs
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák bc99b22a30 st/mesa: use a separate VS variant for the draw module
instead of keeping the IR indefinitely in st_vp_variant.

This trivially fixes Selection/Feedback/RasterPos for NIR.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák 17e8839a2f st/mesa: support shader images for Selection/Feedback/RasterPos
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák b7393f1115 st/mesa: support SSBOs for Selection/Feedback/RasterPos
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák e91b044bd8 st/mesa: support samplers for Selection/Feedback/RasterPos
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák 2891c4b2e2 st/mesa: save currently bound vertex samplers and sampler views in st_context
for st_draw_feedback.c

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák 226e7aee70 st/mesa: support UBOs for Selection/Feedback/RasterPos
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák 60db75cb77 gallivm: implement LOAD with CONSTBUF but don't enable it for llvmpipe
This is already used in st_draw_feedback.c, because it uses shaders
generated for drivers.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-09 21:09:28 -05:00
Marek Olšák 525c8b90c7 llvmpipe: implement TEX_LZ and TXF_LZ opcodes
gallivm receives these opcodes anyway because st_draw_feedback.c uses
shaders that were assembled for drivers, not llvmpipe.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-09 21:09:28 -05:00
Gurchetan Singh 3c8ddc8f4b drirc: set allow_higher_compat_version for Faster Than Light
With 781a78 ("mesa: enable ARB_direct_state_access in compat for
GL3.1+"), it's possible to have DSA with GL3.1+.

FTL creates a GL3.1 compat context, but fails the
_mesa_has_geometry_shaders(..) check in frame_buffer_texture.

Bump the compat version to pass the check.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-09 15:27:02 -08:00
Roland Scheidegger 23f1b78e8f util/atomic: Fix p_atomic_add for unlocked and msvc paths
Braces mismatch (flagged by CI, untested).

Fixes: 385d13f26d "util/atomic: Add a _return variant of p_atomic_add"

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-09 15:02:58 -08:00
Eric Anholt 0470a03769 freedreno: Track the set of UBOs to be uploaded in UBO analysis.
We were iterating over the entire 32-entry array each time, when we
can just use a bitset to know that we're only uploading from the first
entry normally.

Knocks ir3_emit_user_consts down from ~.5% of CPU to .1% on WebGL
fishtank.

Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-12-09 14:13:50 -08:00
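
The bitset idea in 0470a03769 can be illustrated with a minimal sketch. This is not the freedreno code: `ubo_upload_state`, `mark_ubo`, `lowest_set_bit` and `for_each_enabled_ubo` are hypothetical names chosen for illustration, and the only detail taken from the commit message is the 32-entry limit. The point is that the loop runs once per set bit instead of once per array slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_UBOS 32 /* the 32-entry array mentioned in the commit message */

/* Hypothetical state: one bit per UBO slot that needs uploading. */
struct ubo_upload_state {
    uint32_t enabled_mask;
};

/* Record that the UBO at `index` must be uploaded. */
static void mark_ubo(struct ubo_upload_state *s, unsigned index)
{
    assert(index < MAX_UBOS);
    s->enabled_mask |= UINT32_C(1) << index;
}

/* Index of the lowest set bit; the caller guarantees mask != 0. */
static unsigned lowest_set_bit(uint32_t mask)
{
    unsigned i = 0;
    while (!(mask & 1u)) {
        mask >>= 1;
        i++;
    }
    return i;
}

/* Call `visit` (if non-NULL) for each enabled slot only, in ascending
 * order, and return how many slots were visited. */
static unsigned for_each_enabled_ubo(const struct ubo_upload_state *s,
                                     void (*visit)(unsigned index, void *data),
                                     void *data)
{
    uint32_t mask = s->enabled_mask;
    unsigned visited = 0;

    while (mask) {
        unsigned index = lowest_set_bit(mask);
        mask &= mask - 1; /* clear the bit just handled */
        if (visit)
            visit(index, data);
        visited++;
    }
    return visited;
}
```

In the common case described in the commit, where only the first entry is in use, the loop body runs once rather than 32 times.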
Eric Anholt 10da0a9d18 freedreno: Stop forcing ALLOW_MAPPED_BUFFERS_DURING_EXEC off.
The default is to not throw GL errors when drawing with mapped
buffers, but we were forcing it on for unclear reasons.  Internally we
keep all our buffers mapped anyway, so it should be a no-op other than
reducing CPU overhead (.23% in a perf report for WebGL fishtank)

Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-12-09 14:13:47 -08:00
Rob Clark dc791d3c68 freedreno/fdperf: use drmOpen()
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-09 13:09:58 -08:00
Alyssa Rosenzweig a37822f5f7 gallium/util: Support POLYGON in u_stream_outputs_for_vertices
u_decomposed_prims_for_vertices cannot support POLYGON, but POLYGON is
trivial to support as a special case directly (since we have the number
of vertices directly).

Fixes aborts in Panfrost in apps using GL_POLYGON.

Fixes: e881aa8c12 ("gallium/util: Add u_stream_outputs_for_vertices helper")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-09 21:09:05 +00:00
Anuj Phogat 1a32fbd48c intel: Add pci-ids for Jasper Lake
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-09 12:22:57 -08:00
Anuj Phogat 11fdd5f52c intel: Add device info for 1x4x6 Jasper Lake
Also removing the FIXME comments after matching the numbers with
updated documentation.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-09 12:22:56 -08:00
Vasily Khoruzhick 9f5fa496cb lima: expose tiled format modifier in query_dmabuf_modifiers()
Fixes: 8c12f4e5f2 ("lima: enable tiling")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-12-09 15:21:55 +00:00
Vasily Khoruzhick 01a451b04d lima: handle DRM_FORMAT_MOD_INVALID in resource_from_handle()
Assume that resource is tiled if we get DRM_FORMAT_MOD_INVALID
in resource_from_handle() and we don't have RO.

Fixes: 8c12f4e5f2 ("lima: enable tiling")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-12-09 15:21:55 +00:00
Jonathan Marek 9d78cf4584 turnip: add hw binning
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-09 08:22:18 -05:00
Samuel Pitoiset 86dfe92bd0 radv: do not use VK_TRUE/VK_FALSE
For consistency.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-09 09:21:26 +01:00
Dave Airlie d7dc14628a gallivm: add bitfield reverse and ufind_msb
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
2019-12-09 06:05:02 +10:00
Roland Scheidegger 1c7693e3bd gallium/scons: fix graw_gdi build
Fixes: 44a6b0107b (gallivm: add nir->llvm translation (v2))
Reviewed-by: Dave Airlie <Airlied@redhat.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-12-07 17:50:53 +01:00
Daniel Schürmann 8259c97b2d aco: propagate temporaries into expanded vectors
Gives a very slight decrease in code size:
Totals from affected shaders:
Code Size: 1708488 -> 1702768 (-0.33 %) bytes
Max Waves: 2858 -> 2855 (-0.10 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann df3e674fb3 aco: improve readfirstlane after uniform ssbo loads on GFX7
pipeline-db changes for GFX7:

80310 shaders in 40472 tests
Totals:
SGPRS: 3655900 -> 3643916 (-0.33 %)
VGPRS: 2678324 -> 2686324 (0.30 %)
Spilled SGPRs: 1730 -> 1634 (-5.55 %)
Spilled VGPRs: 14 -> 21 (50.00 %)
Scratch size: 15540 -> 15536 (-0.03 %) dwords per thread
Code Size: 136106120 -> 135457616 (-0.48 %) bytes
LDS: 1259 -> 1259 (0.00 %) blocks
Max Waves: 601014 -> 600206 (-0.13 %)

Totals from affected shaders:
SGPRS: 307832 -> 295848 (-3.89 %)
VGPRS: 267864 -> 275864 (2.99 %)
Spilled SGPRs: 770 -> 674 (-12.47 %)
Spilled VGPRs: 14 -> 21 (50.00 %)
Scratch size: 16 -> 12 (-25.00 %) dwords per thread
Code Size: 22007488 -> 21358984 (-2.95 %) bytes
LDS: 65 -> 65 (0.00 %) blocks
Max Waves: 28668 -> 27860 (-2.82 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann 0837471463 aco: use soffset for MUBUF instructions on SI/CI
pipeline-db changes for GFX7:

80310 shaders in 40472 tests
Totals:
SGPRS: 3655300 -> 3655900 (0.02 %)
VGPRS: 2677732 -> 2678324 (0.02 %)
Spilled SGPRs: 1730 -> 1730 (0.00 %)
Spilled VGPRs: 14 -> 14 (0.00 %)
Scratch size: 15540 -> 15540 (0.00 %) dwords per thread
Code Size: 136488364 -> 136106120 (-0.28 %) bytes
LDS: 1259 -> 1259 (0.00 %) blocks
Max Waves: 601039 -> 601014 (-0.00 %)

Totals from affected shaders:
SGPRS: 316312 -> 316912 (0.19 %)
VGPRS: 273844 -> 274436 (0.22 %)
Spilled SGPRs: 770 -> 770 (0.00 %)
Spilled VGPRs: 14 -> 14 (0.00 %)
Scratch size: 16 -> 16 (0.00 %) dwords per thread
Code Size: 22724904 -> 22342660 (-1.68 %) bytes
LDS: 114 -> 114 (0.00 %) blocks
Max Waves: 30861 -> 30836 (-0.08 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann 7b38d95b32 radv: Enable ACO on GFX7 (Sea Islands)
This patch also disables AMD_shader_ballot on GFX7 by default if ACO is used.
Note that shader_ballot works correctly, but performance seems inferior.
To enable shader_ballot use RADV_PERFTEST=shader_ballot.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann 28c95cc402 aco: return to loop_active mask at continue_or_break blocks
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann 0f9447ccb0 radv: disable Youngblood app profile if ACO is used
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann 746165e540 aco: implement exclusive scan for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann 7ae227effd aco: implement inclusive_scan for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann f895a8b1df aco: implement (clustered) reductions for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann 9254fb4fc7 aco: don't use a scalar temporary for reductions on GFX10
This patch also adds the scalar temporary for scans on SI/CI

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann 8ad43d8838 aco: flush denorms after fmin/fmax on pre-GFX9
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann 21f67a3bdc radv: only flush scalar cache for SSBO writes with ACO on GFX8+
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann 79ce6c1b33 aco: disable disassembly for SI/CI due to lack of support by LLVM
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann 1c4afe38f2 aco: implement 64bit ine/ieq for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann 1e1356b2ad aco: implement 64bit i2b for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann da7ff58835 aco: make 1/2*PI a literal constant on SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann 90fad7360d aco: implement 64bit VGPR shifts for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann 6a586a6006 aco: split read/writelane opcode into VOP2/VOP3 version for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann 23319add93 aco: fix disassembly of writelane instructions.
ACO writes an unused 3rd operand for internal usage
which makes LLVM recognize it as an illegal instruction.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00