Commit Graph

117485 Commits

Author SHA1 Message Date
Kristian H. Kristensen 53782571ae freedreno/a6xx: Only use merged regs and four quads for VS+FS
When other geometry stages are present, we chose two quads and no
merged regs.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 07aedc367c freedreno/blitter: Save tessellation state
We have tessellation state now.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen d2d0c8186d freedreno/a6xx: Only set emit.hs/ds when we're drawing patches
At least the gallium blitter helper will call us to draw with
tessellation shaders set but a non-patch primitive.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen e584790885 freedreno: Use bypass rendering for tessellation
It seems like tiling could work in the Adreno architecture, but we've
only ever seen bypass rendering with tessellation.  For now, let's do
that too.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 47e2c19511 freedreno/a6xx: Program state for tessellation stages
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 03a30e7c3d freedreno/a6xx: Emit constant parameters for tessellation stages
Assemble the information the stages need and emit the constants.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 5dd51d2da7 freedreno/a6xx: Allocate and program tessellation buffer
Tessellation needs a couple of buffers that should hold the entire
output from a full VS+TCS draw call.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen f0ef3e9697 freedreno/a6xx: Build the right draw command for tessellation
We need to select the right primitive type, set a bit to turn on
tessellation and or in the TES output primitive type.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 7272e8a709 freedreno/ir3: Allocate const space for tessellation parameters
The tessellation stages need size and stride or the patch layout as
well as locations of attributes in the patch.  The tesselation stages
also use two system memory BOs and need the iovas of those.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 8739ea3ab5 freedreno/ir3: Pre-color TCS header and primitive ID inputs
Similar to GS, the registers are shared and not reinitialized betewen
VS and TCS, so we need to make sure to allocate the same registers for
the system values between stages.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen b12ebe3e81 freedreno/ir3: Don't assume binning shader is always VS
In tessellation mode, the TES is (probably) the binning shader.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 3cedeba7c9 freedreno/ir3: Setup inputs and outputs for tessellation stages
Similar to GS, some inputs are reused when the chsh from VS to TCS or
TES to GS, so we need to make sure we setup the right inputs and make
the shared system values outputs so they don't get clobbered.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen e28fbbd861 freedreno/ir3: Implement TCS synchronization intrinsics
We add two new IR3 specific nir intrinsics that map to the new condend
and endpatch instructions.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 4915231b8a freedreno/ir3: Implement tess coord intrinsic
Our lowering pass made the z component unused by replacing its uses
by 1 - x - y.  The intrinsic implementation then just need to return
the x and y components.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:37:08 -08:00
Kristian H. Kristensen e16e48d00c freedreno/ir3: End TES with chsh when using GS
When we have both TES and GS, the TES needs to chain to the VS with
chmask and chsh GS just like the VS does to either TCS or GS.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:37:05 -08:00
Kristian H. Kristensen 581cd59692 freedreno/ir3: Add new synchronization opcodes
There are two new opcodes in use in tesselation control shaders:
category 0, opcodes 13 and 15.  unk13 is a kill type of instruction
that terminates threads where !p0.x and it used to narrow down a patch
wavefront to just thread 0.  Then, once thread 0 has written the tess
levels, it issues unk15, which might signal the TE that another patch
has been fully written.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:37:02 -08:00
Kristian H. Kristensen 56ed835bff freedreno/ir3: Extend geometry lowering pass to handle tessellation
VS and TCS pass varyings the same way as VS and GS does. TCS then
writes entire patch to a system memory BO and TES eventually reads
back from the BO once the TE starts generating vertices.  TES outputs
vertices the same way as VS and GS, except when there's a GS as well,
in which case TES passes varyings to GS same way the VS would.

In addition, the TCS needs a little bit of control flow massaging so
that it only runs for valid invocations needs a couple of unknown
instructions to synchronize with the TE.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:59 -08:00
Kristian H. Kristensen 8621fbc37b freedreno/ir3: Add tessellation field to shader key
Whether we're tessellating and which primitives the TES outputs
affects the entire pipeline so let's add a field to the key to track
that.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:56 -08:00
Kristian H. Kristensen 77b96b843e freedreno/ir3: Use imul24 in offset calculations
With the imul24 opcode in place, we can now use it for computing local
offsets (ie for ldlw/stlw).

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:53 -08:00
Kristian H. Kristensen 41984c8422 freedreno/ir3: Add ir3 intrinsics for tessellation
These provide the iovas for system memory buffers used for
tessellation as well as a new HW specific system value.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:50 -08:00
Kristian H. Kristensen d6209a50bb freedreno: Don't count primitives for patches
The gallium helper doesn't like patches and we can't determine how
many primitives it gets tessellated into anyway.  On gens where we
have tessellation, we get the prim count from a HW counter so just
skip counting on the CPU.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:47 -08:00
Kristian H. Kristensen fe450ef4cf freedreno/ir3: Add load and store intrinsics for global io
These intrinsics take a ivec2 for the 64 bit base address and a
integer offset.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:44 -08:00
Kristian H. Kristensen 5d67da13a3 freedreno/ir3: Emit link map as byte or dwords offsets as needed
Stages that load inputs with ldlw (TCS, GS) need byte offsets, stages
that load with ldg (TES) need dwords offsets.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:42 -08:00
Kristian H. Kristensen 1f3b52ce50 freedreno/a6xx: Add register offset for STG/LDG
These instructions take a 64 bit iova as two conescutive registers and
a immediate offset.  This patch adds support for the offset to be a
single register, which is added to the 64 bit iova.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:39 -08:00
Kristian H. Kristensen 3d16ec4a71 freedreno/a6x: Rename z/s formats
What we call eRB6_Z24_UNORM_S8_UINT now is actually
RB6_Z24_UNORM_S8_UINT_AS_R8G8B8A8 and RB6_X8Z24_UNORM is actually
RB6_Z24_UNORM_S8_UINT.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:36 -08:00
Kristian H. Kristensen 50124afe34 freedreno/a6xx: Fix layered texture type enum
2D array textures and 3D textures are different enum values after all.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:33 -08:00
Kristian H. Kristensen 0276d0766d freedreno: Add nogmem debug option to force bypass rendering
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:31 -08:00
Kristian H. Kristensen 7fed7c2a7d freedreno/a6xx: Clear sysmem with CP_BLIT
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:28 -08:00
Kristian H. Kristensen b0b443dcab freedreno/a6xx: Fix primitive counters again
We use one mechanism for (REG_A6XX_RBBM_PRIMCTR_8_LO)
PIPE_QUERY_PRIMITIVES_GENERATED, which counts all primitives that exit
the geometry pipeline, whether or not xfb is on.  Then for
PIPE_QUERY_PRIMITIVES_EMITTED, we use the CP_EVENT_WRITE subfunction
that writes out per-stream counts for generated and emitted, but only
when xfb is enabled.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:22 -08:00
Kristian H. Kristensen 835f8d1ba1 freedreno/registers: Add comments about primitive counters
Adding comments about best guess at what the counters count.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:19 -08:00
Kristian H. Kristensen 96968d0ba2 freedreno/registers: Move SP_PRIMITIVE_CNTL and SP_VS_VPC_DST
Move these two to be in order with the other VS regs.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:16 -08:00
Kristian H. Kristensen ba54f7dd03 freedreno/registers: Fix typo
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:35:27 -08:00
Rhys Perry 78e3ea9a0f aco: add Instruction::usesModifiers() and add more checks in the optimizer
No pipeline-db changes.

v2: use early-exit for VOP3

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v1)
2019-11-08 00:14:06 +00:00
Rhys Perry 76544f632d radv: adjust loop unrolling heuristics for int64
In particular, increase the cost of 64-bit integer division.

Fixes huge shaders with dEQP-VK.spirv_assembly.type.scalar.i64.mod_geom
, with ACO used for GS this creates shaders requiring a branch with
>32767 dword offset.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-07 23:29:12 +00:00
Erico Nunes 9817bff4da lima: fix bo submit memory leak
Fix memory leak on allocation for lima submit, reported by valgrind.

128 bytes in 1 blocks are definitely lost in loss record 38 of 84
   at 0x484A6E8: realloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
   by 0x58689C7: util_dynarray_ensure_cap (u_dynarray.h:91)
   by 0x5868BBB: util_dynarray_grow_bytes (u_dynarray.h:139)
   by 0x5868BBB: lima_submit_add_bo (lima_submit.c:113)
   by 0x585D7D3: lima_ctx_buff_va (lima_context.c:57)
   by 0x586378F: lima_pack_plbu_cmd (lima_draw.c:802)
   by 0x586378F: lima_draw_vbo (lima_draw.c:1351)
   by 0x5406A2F: u_vbuf_draw_vbo (u_vbuf.c:1184)
   by 0x55D0A57: st_draw_vbo (st_draw.c:268)
   by 0x55576CB: _mesa_draw_arrays (draw.c:374)
   by 0x55576CB: _mesa_draw_arrays (draw.c:351)
   by 0x43610B: Mesh::render_vbo() (mesh.cpp:583)
   by 0x415DBB: SceneBuild::draw() (scene-build.cpp:242)
   by 0x41131B: MainLoop::draw() (main-loop.cpp:133)
   by 0x411947: MainLoop::step() (main-loop.cpp:108)

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-11-07 23:03:01 +00:00
Erico Nunes d939f5d463 lima: fix nir shader memory leak
Fix memory leak on allocation for nir shader, reported by valgrind.

3,502 (480 direct, 3,022 indirect) bytes in 1 blocks are definitely lost in loss record 77 of 84
   at 0x48483F8: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
   by 0x5750817: ralloc_size (ralloc.c:119)
   by 0x5750977: rzalloc_size (ralloc.c:151)
   by 0x575C173: nir_shader_create (nir.c:45)
   by 0x5763ACB: nir_shader_clone (nir_clone.c:728)
   by 0x55D5003: st_create_fp_variant (st_program.c:1242)
   by 0x55D789F: st_get_fp_variant (st_program.c:1522)
   by 0x55D789F: st_get_fp_variant (st_program.c:1507)
   by 0x56400C3: st_update_fp (st_atom_shader.c:163)
   by 0x563D333: st_validate_state (st_atom.c:261)
   by 0x55D07CB: prepare_draw (st_draw.c:132)
   by 0x55D08DF: st_draw_vbo (st_draw.c:184)
   by 0x55576CB: _mesa_draw_arrays (draw.c:374)
   by 0x55576CB: _mesa_draw_arrays (draw.c:351)

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-11-07 23:03:01 +00:00
Prodea Alexandru-Liviu 1a05811936 Meson: Remove lib prefix from graw and osmesa when building with Mingw.
Also remove version sufix from osmesa swrast on Windows.

v2: Make sure we don't remove lib prefix on *nix platforms.

Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>

Cc: "19.3" <mesa-stable@lists.freedesktop.org>
2019-11-07 22:04:50 +00:00
Marek Olšák 0b3111ed84 mesa: expose SPIR-V extensions in the Compatibility profile too
We would like to have GL 4.6 Compatibility too.

The extensions don't support compatibility features, so no other changes
are needed.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-11-07 16:04:30 -05:00
Drew DeVault 299c55df88 st_get_external_sampler_key: improve error message
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 15:57:23 -05:00
Eric Anholt 9d2c8df3eb mesa/st: Make st_pipe_format_to_mesa_format an effective no-op.
All callers other than the unit test just wanted to convert back from
a known-mesa-equivalent format, which is now a no-op.

v2: Fix assertion failure in iris GL startup with BGR565 by continuing
    to return MESA_FORMAT_NONE for non-Mesa formats.

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2019-11-07 19:43:41 +00:00
Eric Anholt 75921a0912 mesa/st: Gut most of st_mesa_format_to_pipe_format().
Now that MESA_FORMAT_x is just a PIPE_FORMAT_x define, we can strip
this function down to just the compression fallbacks.

v2: Restore the SRGB format for ASTC SRGB fallback case.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt 807a800d8c mesa: Redefine MESA_FORMAT_* in terms of PIPE_FORMAT_*.
There are various places in Mesa where we would like to be able to
have a shared format enum between Mesa and gallium (NIR compiler's
image formats, for example, or mapping from gallium's formats to
mesa's and vice versa in st_format.c).  Rewriting all MESA_FORMAT to
PIPE_FORMAT would be disruptive and possibly more work than it's worth
(And I actually prefer MESA_FORMAT's name scheme), so for now just
make it so that there's one shared set of enum values.

The #defines here were generated by printing out from the
tests/st_format.c round-tripping loop, with the exception of 8888
formats where I hand-edited the #defines to point at the corresponding
gallium packed format define.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt d27dda907a mesa: Prepare for the MESA_FORMAT_* enum to be sparse.
To redefine MESA_FORMAT in terms of PIPE_FORMAT enums, we need to fix
places where we iterated up to MESA_FORMAT_COUNT.  I use
_mesa_get_format_name(f) == NULL as the signal that it's not an enum
value with a MESA_FORMAT.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt 6b1c250245 mesa/st: Test round-tripping of all compressed formats.
We checked round-tripping of formats without fallbacks, but weren't
setting the compression support flags in the mock context and thus
needed to skip testing those.  Just set all the flags and assert that
no fallbacks are triggered, so we get full test coverage.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt 80a8021d6c mesa: Stop defining a full separate format for RGBA_UINT8.
We have packed formats for RGBA and ABGR already, so we can just
pack/unpack code.

v2: Rebase on endianness macro rename

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2019-11-07 19:43:41 +00:00
Eric Anholt b28eb044cd gallium: Add equivalents of packed MESA_FORMAT_*UINT formats.
These are the last formats that MESA_FORMAT had and PIPE_FORMAT
didn't.  The .csv entries channel sizes and swizzles all came from the
corresponding UNORM format.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt 6fab4a7b59 gallium: Add an equivalent of MESA_FORMAT_BGR_UNORM8.
This is the last unorm format that MESA_FORMAT had and PIPE_FORMAT
didn't.  Note that it's an array format on gallium's side as well,
since it's a NPOT pixel size.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt 4bbaac3782 gallium: Add some more channel orderings of packed formats.
This covers everything that MESA_FORMAT had for packed unorm.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt 6196259d95 gallium: Add defines for FXT1 texture compression.
This texture compression is exposed by 830 and 915, and to make
MESA_FORMAT match PIPE_FORMAT defines I need a corresponding
PIPE_FORMAT.

v2: Set is_hand_written so we don't try to generate pack/unpack code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt cb9fefe1db mesa/st: Add mapping of MESA_FORMAT_RGB_SNORM16 to gallium.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00