Commit Graph

117507 Commits

Author SHA1 Message Date
Rob Clark bd21c73d3f freedreno/ir3: ir3_print tweaks
Handle HALF/HIGH flags in all cases, and colorize SSA src notation.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark 5da10704bb freedreno/ir3: use SSA flag on dest register too
We did this in some places before, but not consistantly.  But it will be
useful for two-pass RA, to identify which registers have already been
assigned.

While we are cleaning this up, use __ssa_src() and new __ssa_dst()
helper more consistently.  (If nothing else, this reduces the # of
callers of ir3_reg_create() to audit that we didn't miss something)

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:14 +00:00
Rob Clark 8449f6183f freedreno/ir3: split pre-coloring to it's own function
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:14 +00:00
Caio Marcelo de Oliveira Filho 087ecd9ca5 spirv: Don't leak GS initialization to other stages
The stage specific fields of shader_info are in an union.  We've
likely been lucky that this value was either overwritten or ignored by
other stages.  The recent change in shader_info layout in commit
84a1a2578d ("compiler: pack shader_info from 160 bytes to 96 bytes")
made this issue visible.

Fixes: cf2257069c ("nir/spirv: Set a default number of invocations for geometry shaders")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-08 16:28:21 -08:00
Marek Olšák 84a1a2578d compiler: pack shader_info from 160 bytes to 96 bytes
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-08 16:54:08 -05:00
Marek Olšák 9950523368 glsl/linker: pass shader_info to analyze_clip_cull_usage directly
This will be needed by the next commit.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-08 16:54:06 -05:00
Marek Olšák 3ef50b023e radeonsi/nir: fix compute shader crash due to nir_binary == NULL
This partially reverts 8b30114dda.

Fixes: 8b30114dda "radeonsi/nir: call nir_serialize only once per shader"
2019-11-08 16:47:59 -05:00
Marek Olšák 8b30114dda radeonsi/nir: call nir_serialize only once per shader
We were calling it twice.

First serialize it, then use it to compute the cache key.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-08 15:30:28 -05:00
Marek Olšák ad56022b0d util: add blob_finish_get_buffer
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-08 15:30:28 -05:00
Eric Anholt b1f38aed84 u_format: Fix swizzle of A1R5G5B5.
Found once I started using the generated unpack code from the Mesa side.

Fixes: 4bbaac3782 ("gallium: Add some more channel orderings of packed formats.")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-11-08 11:56:02 -08:00
David Stevens 0466239aae virgl: support emulating planar image sampling
Mesa emulates planar format sampling with per-plane samplers. Virgl now
supports this by allowing the plane index to be passed when creating a
sampler view from a planar image. With this change, mesa now passes that
information to virgl.

Signed-off-by: David Stevens <stevensd@chromium.org>
Reviewed-by: Lepton Wu <lepton@chromium.org>
2019-11-08 17:06:56 +00:00
Krzysztof Raszkowski 084431ce45 gallium/swr: Enable some ARB_gpu_shader5 extensions
Enable / add to features.txt:
- Enhanced textureGather.
- Geometry shader instancing.
- Geometry shader multiple streams.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-11-08 16:04:47 +00:00
Krzysztof Raszkowski e5ed9a1b91 gallium/swr: Fix GS invocation issues
- Fixed proper setting gl_InvocationID.
- Fixed GS vertices output memory overflow.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-11-08 14:52:16 +00:00
Timur Kristóf 911a826141 ac: Handle invalid GFX10 format correctly in ac_get_tbuffer_format.
It happens that some games try to access a vertex buffer without
a valid format. This case was incorrectly handled by
ac_get_tbuffer_format which made ACO emit an invalid instruction.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-08 13:30:30 +01:00
Boris Brezillon ee82f9f07e panfrost: Try to evict unused BOs from the cache
The panfrost BO cache can only grow since all newly allocated BOs are
returned to the cache (unless they've been exported).

With the MADVISE ioctl that's not a big issue because the kernel can
come and reclaim this memory, but MADVISE will only be available on 5.4
kernels. This means an app can currently allocate a lot memory without
ever releasing it, leading to some situations where the OOM-killer kicks
in and kills the app (or even worse, kills another process consuming
more memory than the GL app) to get some of this memory back.

Let's try to limit the amount of BOs we keep in the cache by evicting
entries that have not been used for more than one second (if the app
stopped allocating BOs of this size, it's likely to not allocate
similar BOs in a near future).

This solution is based on the VC4/V3D implementation.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 11:26:47 +01:00
Boris Brezillon 25059cc41f panfrost: Move BO cache related fields to a sub-struct
We will soon introduce an LRU list to evict BOs that have been unused
for more than 1 second. Let's first move all BO cache fields to a
sub-struct to clarify which fields are used by the BO caching logic.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 11:26:47 +01:00
Alyssa Rosenzweig 5f768eda43 pan/midgard: Switch base for vertex texturing on T720
There aren't texture pipeline registers anymore; instead, space is
shared with work and ldst registers for output and input respectively.
We need to shift the base registers to represent this correctly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 06:45:03 +00:00
Alyssa Rosenzweig ac14facf7a pan/midgard: Pass shader stage to disassembler
Vertex texturing behaves differently from fragment texturing on some
GPUs.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 06:45:03 +00:00
Alyssa Rosenzweig 515941202d pan/midgard: Disassemble half-steps correctly
The meaning of some bits shifts; we need to account for this to print
swizzles sanely.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 06:45:03 +00:00
Alyssa Rosenzweig ec2af6bc97 pan/midgard: Fix printing of half-registers in texture ops
We were using old style half-registers; let's update that to be
consistent, preparing us for more disassmbler changes in this area.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 06:45:03 +00:00
Kristian H. Kristensen 4a4fad7f40 freedreno/ir3: Use regid() helper when setting up precolor regs
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:46:21 -08:00
Kristian H. Kristensen 3699a74a43 freedreno/a6xx: Turn on tessellation shaders
Wow. Very triangle. So shader.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 53782571ae freedreno/a6xx: Only use merged regs and four quads for VS+FS
When other geometry stages are present, we chose two quads and no
merged regs.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 07aedc367c freedreno/blitter: Save tessellation state
We have tessellation state now.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen d2d0c8186d freedreno/a6xx: Only set emit.hs/ds when we're drawing patches
At least the gallium blitter helper will call us to draw with
tessellation shaders set but a non-patch primitive.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen e584790885 freedreno: Use bypass rendering for tessellation
It seems like tiling could work in the Adreno architecture, but we've
only ever seen bypass rendering with tessellation.  For now, let's do
that too.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 47e2c19511 freedreno/a6xx: Program state for tessellation stages
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 03a30e7c3d freedreno/a6xx: Emit constant parameters for tessellation stages
Assemble the information the stages need and emit the constants.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 5dd51d2da7 freedreno/a6xx: Allocate and program tessellation buffer
Tessellation needs a couple of buffers that should hold the entire
output from a full VS+TCS draw call.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen f0ef3e9697 freedreno/a6xx: Build the right draw command for tessellation
We need to select the right primitive type, set a bit to turn on
tessellation and or in the TES output primitive type.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 7272e8a709 freedreno/ir3: Allocate const space for tessellation parameters
The tessellation stages need size and stride or the patch layout as
well as locations of attributes in the patch.  The tesselation stages
also use two system memory BOs and need the iovas of those.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 8739ea3ab5 freedreno/ir3: Pre-color TCS header and primitive ID inputs
Similar to GS, the registers are shared and not reinitialized betewen
VS and TCS, so we need to make sure to allocate the same registers for
the system values between stages.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen b12ebe3e81 freedreno/ir3: Don't assume binning shader is always VS
In tessellation mode, the TES is (probably) the binning shader.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 3cedeba7c9 freedreno/ir3: Setup inputs and outputs for tessellation stages
Similar to GS, some inputs are reused when the chsh from VS to TCS or
TES to GS, so we need to make sure we setup the right inputs and make
the shared system values outputs so they don't get clobbered.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen e28fbbd861 freedreno/ir3: Implement TCS synchronization intrinsics
We add two new IR3 specific nir intrinsics that map to the new condend
and endpatch instructions.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen 4915231b8a freedreno/ir3: Implement tess coord intrinsic
Our lowering pass made the z component unused by replacing its uses
by 1 - x - y.  The intrinsic implementation then just need to return
the x and y components.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:37:08 -08:00
Kristian H. Kristensen e16e48d00c freedreno/ir3: End TES with chsh when using GS
When we have both TES and GS, the TES needs to chain to the VS with
chmask and chsh GS just like the VS does to either TCS or GS.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:37:05 -08:00
Kristian H. Kristensen 581cd59692 freedreno/ir3: Add new synchronization opcodes
There are two new opcodes in use in tesselation control shaders:
category 0, opcodes 13 and 15.  unk13 is a kill type of instruction
that terminates threads where !p0.x and it used to narrow down a patch
wavefront to just thread 0.  Then, once thread 0 has written the tess
levels, it issues unk15, which might signal the TE that another patch
has been fully written.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:37:02 -08:00
Kristian H. Kristensen 56ed835bff freedreno/ir3: Extend geometry lowering pass to handle tessellation
VS and TCS pass varyings the same way as VS and GS does. TCS then
writes entire patch to a system memory BO and TES eventually reads
back from the BO once the TE starts generating vertices.  TES outputs
vertices the same way as VS and GS, except when there's a GS as well,
in which case TES passes varyings to GS same way the VS would.

In addition, the TCS needs a little bit of control flow massaging so
that it only runs for valid invocations needs a couple of unknown
instructions to synchronize with the TE.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:59 -08:00
Kristian H. Kristensen 8621fbc37b freedreno/ir3: Add tessellation field to shader key
Whether we're tessellating and which primitives the TES outputs
affects the entire pipeline so let's add a field to the key to track
that.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:56 -08:00
Kristian H. Kristensen 77b96b843e freedreno/ir3: Use imul24 in offset calculations
With the imul24 opcode in place, we can now use it for computing local
offsets (ie for ldlw/stlw).

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:53 -08:00
Kristian H. Kristensen 41984c8422 freedreno/ir3: Add ir3 intrinsics for tessellation
These provide the iovas for system memory buffers used for
tessellation as well as a new HW specific system value.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:50 -08:00
Kristian H. Kristensen d6209a50bb freedreno: Don't count primitives for patches
The gallium helper doesn't like patches and we can't determine how
many primitives it gets tessellated into anyway.  On gens where we
have tessellation, we get the prim count from a HW counter so just
skip counting on the CPU.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:47 -08:00
Kristian H. Kristensen fe450ef4cf freedreno/ir3: Add load and store intrinsics for global io
These intrinsics take a ivec2 for the 64 bit base address and a
integer offset.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:44 -08:00
Kristian H. Kristensen 5d67da13a3 freedreno/ir3: Emit link map as byte or dwords offsets as needed
Stages that load inputs with ldlw (TCS, GS) need byte offsets, stages
that load with ldg (TES) need dwords offsets.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:42 -08:00
Kristian H. Kristensen 1f3b52ce50 freedreno/a6xx: Add register offset for STG/LDG
These instructions take a 64 bit iova as two conescutive registers and
a immediate offset.  This patch adds support for the offset to be a
single register, which is added to the 64 bit iova.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:39 -08:00
Kristian H. Kristensen 3d16ec4a71 freedreno/a6x: Rename z/s formats
What we call eRB6_Z24_UNORM_S8_UINT now is actually
RB6_Z24_UNORM_S8_UINT_AS_R8G8B8A8 and RB6_X8Z24_UNORM is actually
RB6_Z24_UNORM_S8_UINT.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:36 -08:00
Kristian H. Kristensen 50124afe34 freedreno/a6xx: Fix layered texture type enum
2D array textures and 3D textures are different enum values after all.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:33 -08:00
Kristian H. Kristensen 0276d0766d freedreno: Add nogmem debug option to force bypass rendering
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:31 -08:00
Kristian H. Kristensen 7fed7c2a7d freedreno/a6xx: Clear sysmem with CP_BLIT
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:28 -08:00