Commit Graph

448 Commits

Author SHA1 Message Date
Nicolai Hähnle 40b12c0f5a radeonsi/gfx10: implement si_update_shaders
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle 5726ec0d24 radeonsi/gfx10: implement si_build_vgt_shader_config
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle b45c3debe8 radeonsi/gfx10: keep track of whether NGG is used
We always use NGG by default, except when tessellation is enabled with
extreme geometry shader amplification.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle 74a26af913 amd/common/gfx10: add register JSON
A small number of fields now need new disambiguation.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle 610e1a81f7 radeonsi: refactor si_update_vgt_shader_config
We'll have to extend this at some point, and using a bitfield union in
this way makes it easier to get the right index without excessive
branching.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-24 21:04:10 -04:00
Nicolai Hähnle b519ddc35c radeonsi/gfx9: declare LDS ESGS ring as an explicit symbol on LLVM >= 9
This will make it easier to use LDS for other purposes in geometry
shaders in the future.

The lifetime of the esgs_ring variable is as follows:
- declared as [0 x i32] while compiling shader parts or monolithic shaders
- just before uploading, gfx9_get_gs_info computes (among other things)
  the final ESGS ring size (this depends on both the ES and the GS shader)
- during upload, the "esgs_ring" symbol is given to ac_rtld as a shared
  LDS symbol, which will lead to correctly laying out the LDS including
  other LDS objects that may be defined in the future
- si_shader_gs uses shader->config.lds_size as the LDS size

This change depends on the LLVM changes for emitting LDS symbols into
the ELF file.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 20:28:23 -04:00
Nicolai Hähnle f8315ae04b amd/rtld: layout and relocate LDS symbols
Upcoming changes to LLVM will emit LDS objects as symbols in the ELF
symbol table, with relocations that will be resolved with this change.

Callers will also be able to define LDS symbols that are shared between
shader parts. This will be used by radeonsi for the ESGS ring in gfx9+
merged shaders.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 20:28:23 -04:00
Nicolai Hähnle bf8a1ca902 radeonsi: use the new run-time linker for shaders
v2:
- fix a memory leak

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 20:28:23 -04:00
Nicolai Hähnle bf11c594dd radeonsi: return bool from si_shader_binary_upload
We didn't really use error codes anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 20:28:23 -04:00
Nicolai Hähnle 8b1343ca79 radeonsi: let si_shader_create return a boolean
We didn't really use error codes anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 20:28:23 -04:00
Marek Olšák 993bf52977 radeonsi: always interpolate PrimID as flat 2019-06-11 20:05:21 -04:00
Nicolai Hähnle f480b8aaa4 amd/common: use generated register header 2019-06-03 20:05:20 -04:00
Marek Olšák c9b7a37b8f radeonsi: cull primitives with async compute for large draw calls
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:13:34 -04:00
Marek Olšák b206f007de radeonsi: make some functions non-static
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák 301344008f radeonsi: allow si_shader_select_with_key to return an optimized shader or fail
If a prim discard compute shader hasn't finished compilation, we don't want
to any shader.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák ccfcb9d818 ac: rename SI-CIK-VI to GFX6-GFX7-GFX8
Acked-by: Dave Airlie <airlied@redhat.com>

We already use GFX9 and I don't want us to have confusing naming
in the driver. GFXn naming is better from the driver perspective,
because it's the real version of the gfx portion of the hw. Also,
CIK means Bonaire-Kaveri-Kabini, it doesn't mean CI.

It shouldn't confuse our SDMA, UVD, VCE etc. code much. Those have
nothing to do with GFXn and they have their own version numbers.
2019-05-15 20:54:10 -04:00
Nicolai Hähnle d814c21b1b radeonsi: overhaul the vertex fetch fixup mechanism
The overall goal is to support unaligned loads from vertex buffers
natively on SI.

In the unaligned case, we fall back to the general case implementation in
ac_build_opencoded_load_format. Since this function is fully general,
we will also use it going forward for cases requiring fully manual format
conversions of dwords anyway.

This requires a different encoding of the fix_fetch array, which will now
contain the entire format information if a fixup is required.

Having to check the alignment of vertex buffers is awkward. To keep the
impact on the fast path minimal, the si_context will keep track of which
vertex buffers are (not) at least dword-aligned, while the
si_vertex_elements will note which vertex buffers have some (at most dword)
alignment requirement. Vertex buffers should be dword-aligned most of the
time, which allows a fast early-out in almost all cases.

Add the radeonsi_vs_fetch_always_opencode configuration variable for
testing purposes. Note that it can only be used reliably on LLVM >= 9,
because support for byte and short load is required.

v2:
- add a missing check to si_bind_vertex_elements

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-05-13 17:07:23 +02:00
Nicolai Hähnle 8a951c3d2f radeonsi: store sctx->vertex_elements in a local in si_shader_selector_key_vs
Purely as a shorthand in the remainder of the function.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-05-13 17:07:23 +02:00
Timothy Arceri 90f3bf7437 radeonsi/nir: call radeonsi nir opts before the scan pass
Some of the opts are not called in the general optimastion loop
in the state trackers glsl -> nir conversion. We need to call
the radeonsi specific optimisation once before scanning over
the nir otherwise we can end up gathering info on code that
is later removed.

Fixes an assert in the piglit test:

./bin/varying-struct-centroid_gles3

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-05-01 09:41:07 +10:00
Marek Olšák 440135e5a0 radeonsi/gfx9: rework the gfx9 scissor bug workaround (v2)
Needed to track context rolls caused by streamout and ACQUIRE_MEM.
ACQUIRE_MEM can occur outside of draw calls.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110355

v2: squashed patches and done more rework

Cc: 19.0 <mesa-stable@lists.freedesktop.org>
2019-04-25 11:49:38 -04:00
Nicolai Hähnle 9445a4ab43 radeonsi: add radeonsi_sync_compile option
Force the driver thread to sync immediately with a compiler thread (but
compilation still happens in a separate thread).

This can be useful to simplify debugging compiler issues.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-25 12:35:29 +02:00
Timothy Arceri e907337fad radeonsi/nir: move si_lower_nir() call into compiler thread
This helps improve compile times. For example the shader-db dolphin
shader shaders/dolphin/ubershaders/120.shader_test goes from
~1.69 -> ~1.57 seconds on my machine with this change.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-02-28 11:54:06 +11:00
Marek Olšák 501ff90a95 radeonsi: rename r600_resource -> si_resource
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 13:32:18 -05:00
Marek Olšák caa2dcd730 radeonsi: fix a u_blitter crash after a shader with FBFETCH
This fixes an assertion failure with GL CTS when cts-runner is used.
(not a specific test)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108877
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
2019-01-22 11:59:27 -05:00
Timothy Arceri 2817a4ec0b radeonsi: remove unrequired param in si_nir_scan_tess_ctrl()
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-02 10:01:15 +11:00
Nicolai Hähnle 6e67e79de4 radeonsi: const-ify si_set_tesseval_regs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:42 +01:00
Samuel Pitoiset 3fbdcd942f amd: remove support for LLVM 6.0
User are encouraged to switch to LLVM 7.0 released in September 2018.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-06 14:02:56 +01:00
Marek Olšák fcc70e4855 radeonsi: track context rolls better for the Vega scissor bug workaround
We should get fewer context rolls with the SET_CONTEXT_REG optimization,
but it would have been for nothing if the scissor state rolled the context
anyway. Don't emit the scissor state if there is no context roll.
2018-10-16 17:23:25 -04:00
Sonny Jiang 084cf3b966 radeonsi:optimizing SET_CONTEXT_REG for shaders vgt_vertex_reuse
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-10-05 19:04:13 -04:00
Sonny Jiang ce1d72609d radeonsi:optimizing SET_CONTEXT_REG for shaders Tessellation
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-10-05 19:04:13 -04:00
Sonny Jiang 4de328da07 radeonsi:optimizing SET_CONTEXT_REG for shaders PS
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-10-05 19:04:13 -04:00
Sonny Jiang f243980f2c radeonsi:optimizing SET_CONTEXT_REG for shaders VS
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-10-05 19:04:13 -04:00
Sonny Jiang 4052624398 radeonsi:optimizing SET_CONTEXT_REG for shaders GS
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-10-05 19:04:13 -04:00
Sonny Jiang eeb9170599 radeonsi: optimizing SET_CONTEXT_REG for shaders ES
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-10-05 17:53:52 -04:00
Marek Olšák 0b062f0419 radeonsi: don't set the VS prolog key for the blit VS 2018-10-02 12:21:49 -04:00
Marek Olšák 5693ca865d radeonsi: bump MAX_GS_INVOCATIONS
same as the closed driver

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák ac72a6bd0b radeonsi: move internal TGSI shaders into si_shaderlib_tgsi.c
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:20:31 -04:00
Marek Olšák 86b52d4236 radeonsi: reduce LDS stalls by 40% for tessellation
40% is the decrease in the LGKM counter (which includes SMEM too)
for the GFX9 LSHS stage.

This will make the LDS size slightly larger, but I wasn't able to increase
the patch stride without corruption, so I'm increasing the vertex stride.
2018-07-23 20:23:52 -04:00
Sonny Jiang c6737756ad radeonsi: emit_spi_map packets optimization
v2: marek: remove an empty line before break;
    rename reg_val_seq -> spi_ps_input_cntl
    "type * x" -> "type *x"

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-07-20 13:50:26 -04:00
Dave Airlie 0eb65b4944 radeonsi: rename si_compiler -> ac_llvm_compiler
As precursor to moving init to common code, just rename the struct
and move it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 05:31:32 +10:00
Marek Olšák 1542169a4a radeonsi: enable shader caching for compute shaders
Compute shaders were not using the shader cache.
2018-06-28 22:27:25 -04:00
Marek Olšák d13f240269 radeonsi: unify duplicated code for initial shader compilation 2018-06-28 22:27:25 -04:00
Marek Olšák f154555733 radeonsi: clean up passing the is_monolithic flag for compilation
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák 6703fec58c amd,radeonsi: rename radeon_winsys_cs -> radeon_cmdbuf
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-19 13:08:50 -04:00
Marek Olšák 22e994bb75 radeonsi: assume that rasterizer state is non-NULL in draw_vbo
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:36 -04:00
Marek Olšák f3b3ee6974 radeonsi: micro-optimize prim checking and fix guardband with lines+adjacency
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:34 -04:00
Marek Olšák 28ee825e19 radeonsi: move VGT_GS_OUT_PRIM_TYPE into si_shader_gs
same as amdvlk.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:23 -04:00
Marek Olšák 99e0ba6868 radeonsi: record CLIPVERTEX output usage properly for compatibility profiles
This was missed when adding CLIPVERTEX support into GS & tess.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:20 -04:00
Marek Olšák 2f65c67043 radeonsi: fix passing gl_ClipVertex for GS and tess
Also add the fprintf call.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-25 16:46:00 -04:00
Marek Olšák a7d61c0753 radeonsi: fix color inputs/outputs for GS and tess
GS is tested, tessellation is untested.

Have outputs_written_before_ps for HW VS and outputs_written for other
stages. The reason is that COLOR and BCOLOR alias for HW VS, which
drives elimination of VS outputs based on PS inputs.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-25 16:46:00 -04:00