Commit Graph

282 Commits

Author SHA1 Message Date
Marek Olšák 194d9b27cc radeonsi/gfx9: allow the scratch buffer in HS and GS
It works now.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák 8ac4923a67 radeonsi: prevent race conditions when doing scratch patching
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák 9dfc030b48 radeonsi: separate scratch state patching code into its own function
Picked from a different branch. When we stop using the scratch patching,
this function will not be called.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák 1b01014cbf radeonsi/gfx9: also apply scratch relocations to the 1st shader of merged shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák a47289f8fc radeonsi: remove unused parameters from si_shader_apply_scratch_relocs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák f466683cb0 radeonsi/gfx9: fix gl_ViewportIndex
v2: remove unnecessary LLVMBuildAnd calls

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-03 22:58:27 +02:00
Marek Olšák 4e50062028 radeonsi: pass tessellation ring addresses via user SGPRs
This removes s_load_dword latency for tess rings.

We need just 1 SGPR for the address if we use 64K alignment. The final asm
for recreating the descriptor is:

    // s2 is (address >> 16)
    s_mov_b32 s3, 0
    s_lshl_b64 s[4:5], s[2:3], 16
    s_mov_b32 s6, -1
    s_mov_b32 s7, 0x27fac

v2: bitcast the descriptor type from v2i64 to v4i32

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 9fd9a7d0ba radeonsi: remove VS epilog code, compile VS with PrimID export on demand
The use of PrimID in the pixel shader is too rare to deserve such
a sizable support code.

The initial idea of the VS epilog was to move the clipping code there and
remove it based on states, but optimized variants are now used to do that
and are easier to support, so the VS epilog has turned out to be not so
useful.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 3b2e93e472 radeonsi: get InstanceID from VGPR1 (or VGPR2 for tess) instead of VGPR3
VGPR1 = InstanceID / StepRate0; // StepRate0 can be set to 1

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 678d568c7b radeonsi: don't load PrimID in TES if it's not used
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák ed9a51cd3b radeonsi/gfx9: 2nd shader of merged shaders should hold a reference of the 1st
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák ef40937854 radeonsi: add reference counting for shader selectors
The 2nd shader of merged shaders should take a reference of the 1st shader.
The next commit will do that.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 6c15e15af4 radeonsi/gfx9: set VGT_VERTEX_REUSE for ES in ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 887ef1de34 radeonsi/gfx9: set TES registers for merged ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 49cd0cbfd5 radeonsi/gfx9: disallow scratch buffer for LS-HS and ES-GS
not implemented yet

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 2857b14bba radeonsi/gfx9: always compile monolithic ES-GS (asynchronously)
In addition to the non-monolithic variant.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 6a9c20fdd5 radeonsi/gfx9: make sure the 1st shader's main part exists for merged shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 8b220877ad radeonsi/gfx9: set registers and shader key for merged ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák ab197ad8d1 radeonsi/gfx9: add GS user SGPRs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 067dacd1b1 radeonsi/gfx9: define and set LS-HS user SGPRs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 0588146cb0 radeonsi/gfx9: set up shader registers for merged LS-HS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 62abdb17bb radeonsi/gfx9: add initial code generation for non-monolithic merged LS-HS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák b1ed3ffc56 radeonsi: separate out VS prolog key generation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Dave Airlie e2659176ce radeonsi/ac: move vertex export remove to common code.
This code can be shared by radv, we bump the max to
VARYING_SLOT_MAX here, but that shouldn't have too
much fallout.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-27 05:17:47 +01:00
Marek Olšák 96b0cfc82e radeonsi: turn si_shader_key::mono into a non-union
A merged LS-HS shader needs both fix_fetch and inputs_to_copy
for compilation.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 13:08:05 +02:00
Marek Olšák 3f2a0649ab radeonsi: adjust ESGS ring buffer size computation on VI
Cc: 17.0 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 13:08:05 +02:00
Marek Olšák 60a20e6879 radeonsi/gfx9: set MAX_PRIMGRP_IN_WAVE in the correct register
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 13:08:05 +02:00
Marek Olšák bd2cde0c25 radeonsi: add si_shader_selector::vs_needs_prolog
cleanup

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-17 01:22:11 +02:00
Marek Olšák 777f305840 radeonsi: don't set VGT_GS_MODE as part of the GS state
The VS state sets it.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-17 01:22:11 +02:00
Nicolai Hähnle d6588d9962 radeonsi: cope with missing disassembly
For robustness and testing purposes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-14 22:51:07 +02:00
Nicolai Hähnle 472c84d1ad radeonsi: provide VS_STATE input to all VS variants
v2: fix incorrect change in get_tcs_out_patch_stride

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-13 17:30:20 +02:00
Marek Olšák 283c31afa1 radeonsi: unify HS max_offchip_buffers workarounds
Vulkan doesn't set more than 508.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 21:41:57 +02:00
Marek Olšák 172b05a37e radeonsi/gfx9: don't generate LS and ES states
these shaders don't exist on GFX9

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák 5271d12a6e radeonsi/gfx9: trivial shader and ring changes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák 6d21fd51b6 radeonsi/gfx9: disable RB+ on Vega10
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák c9b004af58 radeonsi/gfx9: handle GFX9 in a few places
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák 518d834162 radeonsi: don't hang on shader compile failure
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-24 18:25:05 +01:00
Grazvydas Ignotas 529a767041 util/disk_cache: use a helper to compute cache keys
This will allow to hash additional data into the cache keys or even
change the hashing algorithm easily, should we decide to do so.

v2: don't try to compute key (and crash) if cache is disabled

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-21 11:15:52 +11:00
Marek Olšák e9c6953ddb radeonsi: require that compiler threads are enabled
threaded gallium can't use pipe_context's LLVM target machine, because
create_shader_selector can be called from a non-driver thread.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Timothy Arceri 628e84a58f gallium/util: replace pipe_mutex_unlock() with mtx_unlock()
pipe_mutex_unlock() was made unnecessary with fd33a6bcd7.

Replaced using:
find ./src -type f -exec sed -i -- \
's:pipe_mutex_unlock(\([^)]*\)):mtx_unlock(\&\1):g' {} \;

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:53:05 +11:00
Timothy Arceri ba72554f3e gallium/util: replace pipe_mutex_lock() with mtx_lock()
replace pipe_mutex_lock() was made unnecessary with fd33a6bcd7.

Replaced using:
find ./src -type f -exec sed -i -- \
's:pipe_mutex_lock(\([^)]*\)):mtx_lock(\&\1):g' {} \;

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:52:38 +11:00
Timothy Arceri be188289e1 gallium/util: replace pipe_mutex_destroy() with mtx_destroy()
pipe_mutex_destroy() was made unnecessary with fd33a6bcd7.

Replace was done with:
find ./src -type f -exec sed -i -- \
's:pipe_mutex_destroy(\([^)]*\)):mtx_destroy(\&\1):g' {} \;

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:52:16 +11:00
Timothy Arceri 75b47dda0c gallium/util: replace pipe_mutex_init() with mtx_init()
pipe_mutex_init() was made unnecessary with fd33a6bcd7.

Replace was done using:
find ./src -type f -exec sed -i -- \
's:pipe_mutex_init(\([^)]*\)):(void) mtx_init(\&\1, mtx_plain):g' {} \;

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:52:07 +11:00
Timothy Arceri 6084855528 radeonsi: add support for an on-disk shader cache
V2:
- when loading from disk cache also binary insert into memory cache.
- check that the binary loaded from disk is the correct size. If not
  delete the cache item and skip loading from cache.

V3:
- remove unrequired variable

Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-03 12:09:08 +11:00
Marek Olšák 35915af6c9 radeonsi: fix broken tessellation on Carrizo and Stoney
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99850

Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-02-25 00:03:09 +01:00
Marek Olšák 24847dd1b5 gallium/u_queue: isolate util_queue_fence implementation
it's cleaner this way.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-22 20:26:39 +01:00
Marek Olšák 63c462226e radeonsi: fix issues with monolithic shaders
R600_DEBUG=mono has had no effect since:

    commit 1fabb29717
    Author: Marek Olšák <marek.olsak@amd.com>
    Date:   Tue Feb 14 22:08:32 2017 +0100

    radeonsi: have separate LS and ES main shader parts in the shader selector

Also, this assertion was failing:
    si_state_shaders.c:1307: si_shader_select_with_key: Assertion
    `!shader->is_optimized' failed.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-21 21:27:23 +01:00
Marek Olšák 84e72f2962 radeonsi: skip TESSINNER/OUTER offchip stores if TES doesn't read them
We were unconditionally storing these outputs, sometimes even one component
at a time, but apps never read them in TES.

Move the TESSINNER/OUTER buffer stores into the TCS epilog where we can
easily disable them on demand.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-21 21:27:23 +01:00
Nicolai Hähnle 066a117be7 radeonsi: fix UINT/SINT clamping for 10-bit formats on <= CIK
The same PS epilog workaround as for 8-bit integer formats is required,
since the CB doesn't do clamping.

Fixes GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels*.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-21 10:45:13 +01:00
Marek Olšák 45240ce598 radeonsi: use R600_RESOURCE_FLAG_UNMAPPABLE where it's desirable
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00