163379 Commits

Author SHA1 Message Date
Gert Wollny be570cd322 r600/sfn: sort FS color outputs before all other outputs
The color outputs must be checked against the number of available
color buffers, therefore it is best to sort the color outputs to be
on the driver locations before the other FS outputs.

Fixes: 79ca456b48
   r600/sfn: rewrite NIR backend

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7530

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19804>
2022-11-19 16:59:26 +00:00
Gert Wollny 85e140aa5c r600: Print RAT instruction names in disassembly
Also print the swizzle of the address to indicate what
values may be used.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19804>
2022-11-19 16:59:26 +00:00
Gert Wollny 684e90b15c r600: Update scratch buffer late
For some reason the setup that comes after the scratch buffer
setup calls clobber the PS output configuration. Emitting the
scratch buffer setup as last action before the actual draw commands
seems to fix this.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19804>
2022-11-19 16:59:26 +00:00
Rob Clark 394d8e4122 freedreno/drm/virtio: Defer flush on BO free
Freeing BOs tends to be bursty (i.e. when a submit is retired, or
when expiring entries from the BO cache). Sending lots of small
SET_IOVA messages to the host can quickly eat up the available
virtqueue slots, eventually starving the guest while it waits for
free virtqueue space. By batching, we can avoid this and handle
things more efficiently on the host (i.e. in a single wakeup rather
than many).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19832>
2022-11-19 16:32:25 +00:00
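
The batching idea described in 394d8e4122 and b4a54824e5 can be sketched roughly as follows. This is an illustrative model only, not the freedreno code: the `DeferredFreeQueue` class, the `send` callback and the `max_batch` threshold are all hypothetical names chosen for the sketch.

```python
class DeferredFreeQueue:
    """Collects freed handles and reports them to the host in batches.

    Hypothetical model: `send` stands in for whatever transport delivers
    one message to the host; it is not a real freedreno/virtio API.
    """

    def __init__(self, send, max_batch=64):
        self._send = send            # callable taking a list of handles
        self._max_batch = max_batch  # assumed threshold, for illustration
        self._pending = []

    def free(self, handle):
        # Queue the handle instead of sending one message per free.
        self._pending.append(handle)
        if len(self._pending) >= self._max_batch:
            self.flush()

    def flush(self):
        # Send everything queued so far as a single message.
        if self._pending:
            self._send(list(self._pending))
            self._pending.clear()


messages = []
queue = DeferredFreeQueue(messages.append, max_batch=4)
for handle in range(10):   # a burst of ten frees
    queue.free(handle)
queue.flush()              # deferred flush for the remainder

# Ten frees produced three host messages instead of ten.
print(len(messages))
```

The point of the sketch is only the shape of the trade-off: a burst of frees costs a handful of queue slots rather than one slot per BO.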
Rob Clark b4a54824e5 freedreno/drm: Support for batched frees
Batch up handles before closing them to give the drm backend a chance to
batch up any extra handling needed (i.e. virtio batching up messages to
the host to release IOVA).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19832>
2022-11-19 16:32:25 +00:00
Rob Clark e5a60e1df2 freedreno/drm: Add optimized path for freeing many BOs
Submits tend to hold references to a lot of BOs, which get unref'd when
the submit is destroyed/retired.  For now, all this does is reduce lock
acquire/release, but the next commit will build on it.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19832>
2022-11-19 16:32:25 +00:00
Alyssa Rosenzweig d7511ad784 asahi: Add batch tracking logic
We already have the notion of an agx_batch, which encapsulates a render
pass. Extend the logic to allow multiple in-flight batches per context, avoiding
a flush in set_framebuffer_state and improving performance for certain
applications designed for IMRs that ping-pong unnecessarily between FBOs. I
don't have such an application immediately in mind, but I wanted to get this
flag-day out of the way while the driver is still small and flexible.

The driver was written from day 1 with batch tracking in mind, so this is a
relatively small change to actually wire it up, but there are lots of little
details to get right.

The code itself is mostly a copy/paste of panfrost, which in turn draws
inspiration from freedreno and v3d.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19865>
2022-11-19 15:33:16 +00:00
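
As a rough illustration of why several in-flight batches help an application that ping-pongs between FBOs (d7511ad784), here is a toy model. It is a sketch under stated assumptions, not the asahi implementation: `Context`, `get_batch`, `flush`, the string framebuffer keys and the `max_batches` limit are hypothetical.

```python
class Context:
    """Toy model of a context holding several in-flight batches.

    Each batch is keyed by its framebuffer state, so switching
    framebuffers looks up (or creates) a batch instead of flushing.
    """

    def __init__(self, max_batches=4):
        self.batches = {}      # framebuffer key -> list of recorded draws
        self.submitted = []    # (key, draws) tuples, in submission order
        self.max_batches = max_batches  # assumed limit, for illustration

    def get_batch(self, fb_key):
        if fb_key not in self.batches:
            if len(self.batches) >= self.max_batches:
                # Out of slots: submit the oldest batch to make room.
                self.flush(next(iter(self.batches)))
            self.batches[fb_key] = []
        return self.batches[fb_key]

    def flush(self, fb_key):
        self.submitted.append((fb_key, self.batches.pop(fb_key)))


ctx = Context()
for fb_key in ["A", "B", "A", "B"]:   # ping-pong between two FBOs
    ctx.get_batch(fb_key).append("draw")

in_flight = len(ctx.batches)          # both batches are still open
for fb_key in list(ctx.batches):
    ctx.flush(fb_key)

# Two submissions carrying two draws each, rather than four submissions.
print(len(ctx.submitted))
```

With a single batch per context, each framebuffer switch in the loop would have forced a flush; here the draws accumulate per framebuffer.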
Alyssa Rosenzweig de1eb9400f asahi: Use the batch for submission
So we can submit background batches.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19865>
2022-11-19 15:33:16 +00:00
Alyssa Rosenzweig 0d3b4ff2aa asahi: Use batch_reads for sysvals
Required for proper resource tracking.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19865>
2022-11-19 15:33:16 +00:00
Alyssa Rosenzweig 84f623ae7b asahi: Use a pipe_framebuffer_state batch key
More convenient for batch tracking.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19865>
2022-11-19 15:33:16 +00:00
Alyssa Rosenzweig d36c911b7b asahi: Use batch instead of ctx for pipelines
So we can support multiple batches later.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19865>
2022-11-19 15:33:16 +00:00
Alyssa Rosenzweig fb7257af4e asahi: Hide ctx->batch
This will make it easier to support multiple batches.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19865>
2022-11-19 15:33:16 +00:00
Alyssa Rosenzweig 3104b1aaaf asahi: Factor out prepare_for_map
This will be expanded, let's expand in the direction of less spaghetti.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19865>
2022-11-19 15:33:16 +00:00
Lionel Landwerlin 9c1c1888d9 intel/fs: put scratch surface in the surface state heap
In 4ceaed7839 we made scratch surface state allocations part of the
internal heap (mapped to STATE_BASE_ADDRESS::SurfaceStateBaseAddress)
so that it doesn't use slots in the application's expected 1M
descriptors (especially with vkd3d-proton).

But all our compiler code relies on BSS
(STATE_BASE_ADDRESS::BindlessSurfaceStateBaseAddress).

The additional issue is that there are only 26 bits of surface offset
available in CS instructions (CFE_STATE, 3DSTATE_VS, etc...) for
scratch surfaces. So we need the drivers to put the scratch surfaces
in the first chunk of STATE_BASE_ADDRESS::SurfaceStateBaseAddress
(hence all the driver changes).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4ceaed7839 ("anv: split internal surface states from descriptors")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7687
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19727>
2022-11-19 14:58:58 +00:00
Lionel Landwerlin daab161535 iris: move bindless surface state heap inside the surface state heap
We're about to make scratch surface states part of the surface state
heap. Because those are required to be in the low 26-bit part of the
surface state heap (we're limited in the bits handed to the CFE_STATE,
3DSTATE_VS, etc... instructions), this change splits the 32-bit surface
state heap as follows:

   - 8Mb of surface states for scratch
   - 1Gb - 8Mb of binding tables
   - 3Gb of surface states

That way all of the surfaces are located within a 4Gb region visible
from STATE_BASE_ADDRESS::SurfaceStateBaseAddress

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19727>
2022-11-19 14:58:57 +00:00
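
The split listed in daab161535 can be checked with a little arithmetic. The sketch below reads the commit's sizes as MiB and GiB and treats the 26-bit limit as a byte offset; both readings are assumptions, and the region names are just labels for this example.

```python
MiB = 1 << 20
GiB = 1 << 30

# Region sizes as listed in the commit message, in heap order.
regions = [
    ("scratch surface states", 8 * MiB),
    ("binding tables", 1 * GiB - 8 * MiB),
    ("surface states", 3 * GiB),
]

# Assign consecutive [start, end) ranges to each region.
layout = []
offset = 0
for name, size in regions:
    layout.append((name, offset, offset + size))
    offset += size
total = offset

for name, start, end in layout:
    print(f"{name:24s} 0x{start:09x} .. 0x{end:09x}")

# The three regions tile a 32-bit (4 GiB) range exactly, and the scratch
# region ends at 2**23, comfortably below a 2**26 offset limit.
print(total == 1 << 32, layout[0][2] <= 1 << 26)
```

Under those assumptions the scratch states occupy the first 8 MiB, the binding tables fill the rest of the first 1 GiB, and the remaining 3 GiB hold the other surface states.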
Lionel Landwerlin 64f1ae4bc5 iris: prevent crash in decoder
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19727>
2022-11-19 14:58:57 +00:00
Bas Nieuwenhuizen 1b5dc33caa radv: Convert instance bvh address to node in bvh build.
So we don't have to do it in the traversal loop. Should save 2 and
instructions and a 64-bit shift, so 4/8 cycles per instance node
visit.

Totals from 7 (0.01% of 134913) affected shaders:

CodeSize: 208460 -> 208292 (-0.08%)
Instrs: 38276 -> 38248 (-0.07%)
Latency: 803181 -> 803142 (-0.00%)
InvThroughput: 165384 -> 165376 (-0.00%)
Copies: 4912 -> 4905 (-0.14%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19706>
2022-11-19 14:24:36 +00:00
Bas Nieuwenhuizen d09ed23b9a radv: Fiddle with opaque flag positions to reduce instructions.
Totals from 7 (0.01% of 134913) affected shaders:

CodeSize: 209076 -> 208460 (-0.29%)
Instrs: 38374 -> 38276 (-0.26%)
Latency: 803899 -> 803181 (-0.09%)
InvThroughput: 165530 -> 165384 (-0.09%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19706>
2022-11-19 14:24:36 +00:00
Bas Nieuwenhuizen 3884210902 radv: Skip and for node_to_addr with bvh_base.
Because the BVH base is always 64-byte aligned.

Totals from 7 (0.01% of 134913) affected shaders:

CodeSize: 209216 -> 209076 (-0.07%)
Instrs: 38402 -> 38374 (-0.07%)
Latency: 804537 -> 803899 (-0.08%)
InvThroughput: 165663 -> 165530 (-0.08%)
Copies: 4919 -> 4912 (-0.14%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19706>
2022-11-19 14:24:36 +00:00
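
The reasoning behind 3884210902 is that a mask clearing low bits is a no-op on a value whose low bits are already zero. A minimal sketch of just that fact follows; the `clear_low_bits` helper and the sample addresses are hypothetical and do not reproduce RADV's actual node encoding.

```python
ALIGNMENT = 64
LOW_BITS_MASK = ALIGNMENT - 1   # 0x3f, the six low bits


def clear_low_bits(value):
    """Clear the bits below the 64-byte alignment boundary."""
    return value & ~LOW_BITS_MASK


# Sample 64-byte-aligned addresses (made up for the example).
aligned_bases = [0, 64, 4096, 0x1_0000_0040]

# For aligned values the mask changes nothing, so the AND can be skipped.
mask_is_noop = all(clear_low_bits(base) == base for base in aligned_bases)
print(mask_is_noop)

# For an unaligned value the mask does matter.
print(clear_low_bits(65))
```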
Bas Nieuwenhuizen 0a26975840 radv: Move ray flag compares out of the loop.
To save on and+cmp combos with VALU instructions.

Totals from 7 (0.01% of 134913) affected shaders:

CodeSize: 208476 -> 209216 (+0.35%)
Instrs: 38384 -> 38402 (+0.05%)
Latency: 805725 -> 804537 (-0.15%)
InvThroughput: 165906 -> 165663 (-0.15%)
Copies: 4936 -> 4919 (-0.34%)
PreSGPRs: 393 -> 430 (+9.41%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19706>
2022-11-19 14:24:36 +00:00
Lionel Landwerlin e2dadda35f Revert "nir/lower_shader_calls: put inserted instructions into a dummy block"
This reverts commit 35d82ecf1e.

Cc: mesa-stable
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19820>
2022-11-19 10:53:18 +00:00
Lionel Landwerlin 3686d5a312 nir/lower_shader_calls: wrap only jumps rather than entire code blocks
Moving entire chunks of code into a dummy if block is causing issues
in some situations. The issue we tried to fix in 35d82ecf1e
("nir/lower_shader_calls: put inserted instructions into a dummy
block") is that we cannot cut and paste a block of instructions that
ends with a jump if there are more instructions after the point where
we're going to paste. To work around it, we can instead just wrap the
jumps into dummy if blocks.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19820>
2022-11-19 10:53:18 +00:00
Lionel Landwerlin 96d84e2a77 nir/lower_shader_calls: update metadata before validation
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19820>
2022-11-19 10:53:18 +00:00
Konstantin Seurer 6f45c98b58 radv/bvh: Adjust sah cost based on depth
Adds a cost field to radv_ir_node and uses it to model the cost of tree
depth. This improves framerates by 2% if my benchmarking is correct.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19756>
2022-11-19 10:18:50 +00:00
Ian Romanick 2ba55ec504 nir/range_analysis: Set higher default maximum for max_workgroup_count
Fixes: c2a81ebe19 ("nir: Add default unsigned upper bound configuration.")
Closes: #7676
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19835>
2022-11-19 05:40:42 +00:00
Caio Oliveira d989746e55 iris: Pass devinfo directly in iris_setup_uniforms
Instead of reaching through brw_compiler.  This will make future
changes on the brw_compiler side easier.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19836>
2022-11-19 05:15:15 +00:00
Michael Skorokhodov a9602134a3 intel/compiler: Require C++17
Fixes: 6c194ddd18 ("intel/compiler: Prepare SIMD selection helpers to handle different prog_datas")

Signed-off-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19833>
2022-11-19 04:37:51 +00:00
Alyssa Rosenzweig 11a607dbc8 asahi: Don't support 16-bit vertex attributes
Currently broken, let vbuf deal with it. "Fixes" sysprof.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:11 +00:00
Alyssa Rosenzweig 9dddbfeaef asahi: Fix logic ops
Need to set colour mask correctly. Fixes spec@!opengl 1.0@gl-1.0-logicop@GL_AND,
at least the non-MSAA portion.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:11 +00:00
Alyssa Rosenzweig a22ed99906 asahi: Restrict rendering to what we support
Noticed with Kodi that tries to use rgb10a2.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:11 +00:00
Alyssa Rosenzweig 37617ab09e asahi: Don't validate WSI (twiddled) strides
These are made up and won't necessarily be aligned.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:11 +00:00
Alyssa Rosenzweig f328207475 asahi: Split out agx_usc.h into a common file
So the tilebuffer helpers can build the "shared" USC word. Also because Ella
will probably want to use these O:)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:11 +00:00
Alyssa Rosenzweig 8be506039d asahi: Note some magic bits used with memoryless RTs
Obviously there can't *actually* be memoryless render targets, because
how would partial renders work? The control stream with memoryless looks
like everything would if it went to memory (e.g. full 2D MSAA
attachments for the partial loads/stores even if only a resolved
2D image for the final store). Except the memoryless attachments all
load from the same address 0xeeee0000. Clearly that's not actually what
happens, so what gives? Unclear... but I see the magic bits mentioned
here set, and I assume there are some firmware (or kernel) shenanigans
used to JIT allocate the backing storage for partial renders.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:11 +00:00
Alyssa Rosenzweig 3fa87e47d5 asahi: Identify "Sample mask after depth/stencil" bit
Corresponds to Metal [[sample_mask,post_depth_coverage]].

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:11 +00:00
Alyssa Rosenzweig ff616099ce asahi: Identify the pass type enum
Via PowerVR.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:10 +00:00
Alyssa Rosenzweig 2e6369f5f6 asahi: Identify PBE sample count
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:10 +00:00
Alyssa Rosenzweig 1f0edc0158 asahi: Identify Dimension for Render Target
Metal uses this when rendering to multisampled 2D.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:10 +00:00
Alyssa Rosenzweig 016a699fa9 asahi: Fix agx_set_framebuffer_state for MRT
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:10 +00:00
Alyssa Rosenzweig 7e662320aa asahi: Set data_valid for the correct level
By inspection.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:10 +00:00
Alyssa Rosenzweig 9c52001a1d asahi: Implement stencil texturing
Stencil texturing is easy: S8_UINT is textured like R8_UINT (with a
little swizzle fixup), and stencil is always S8_UINT thanks to
u_transfer_helper. So we just need to do some fixups to make
u_transfer_helper's separate_stencil work and everything will work out.

Passes dEQP-GLES31.functional.stencil_texturing.*

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:10 +00:00
Alyssa Rosenzweig 1ffbd53aa2 asahi: Add internal formats for RGB10A2
We need to use I16 as the interchange format here. Fixes:

   dEQP-GLES3.functional.fragment_out.basic.uint.rgb10_a2ui*

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:10 +00:00
Alyssa Rosenzweig efb5aef935 asahi: Implement perf_debug
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:10 +00:00
Alyssa Rosenzweig c8e520985b asahi: Free the scanout resource
Fixes memory leaks with renderonly.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:10 +00:00
Alyssa Rosenzweig 6a12d793d8 agx: Handle collects in backwards isel
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:10 +00:00
Alyssa Rosenzweig 3b9d271646 agx: Assert more invariants in RA
Was helpful for debugging.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:10 +00:00
Alyssa Rosenzweig c2159ce9e4 agx: Validate part of SSA form
To debug backend pass problems.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:10 +00:00
Alyssa Rosenzweig 1110fcccc2 agx: Split off NIR preprocessing from compiling
So we can specialize after lowering I/O.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:10 +00:00
Alyssa Rosenzweig 972354b5fd agx: Handle scalar texture destinations
Fixes dEQP-GLES3.functional.shaders.texture_functions.texturelod.sampler2dshadow_fragment.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:10 +00:00
Alyssa Rosenzweig a92fb4f38c agx: Don't depend on GenXML
Separation of concerns, unused #include.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:10 +00:00
Alyssa Rosenzweig 3789dba5f6 agx: Lower packs/unpacks and bitfields
Needed for GLES3. These could be optimized.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19811>
2022-11-19 04:27:10 +00:00