Commit Graph

336 Commits

Author SHA1 Message Date
Nicolai Hähnle 7091fe887b radeonsi: use SI_MAX_IO_GENERIC instead of magic values
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-12 10:46:04 +02:00
Nicolai Hähnle cb2ac69628 radeonsi: split per-patch from per-vertex indices
Make it a bit clearer that the index spaces are logically seperate by
having them defined in different functions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-08 17:42:17 +02:00
Nicolai Hähnle b84b631c63 radeonsi: load patch_id for TES-as-ES when exporting for PS
For some reason, this change is only necessary on SI.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-08 17:42:17 +02:00
Nicolai Hähnle 0549ea15ec radeonsi: fix primitive ID in fragment shader when using tessellation
In a VS->TCS->TES->PS pipeline, the primitive ID is read from TES exports,
so it is as if TES were using the primitive ID.

Specifically, this fixes a bug where the primitive ID is not reset at
the start of a new instance.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-08 17:42:17 +02:00
Marek Olšák 194d9b27cc radeonsi/gfx9: allow the scratch buffer in HS and GS
It works now.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák 8ac4923a67 radeonsi: prevent race conditions when doing scratch patching
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák 9dfc030b48 radeonsi: separate scratch state patching code into its own function
Picked from a different branch. When we stop using the scratch patching,
this function will not be called.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák 1b01014cbf radeonsi/gfx9: also apply scratch relocations to the 1st shader of merged shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák a47289f8fc radeonsi: remove unused parameters from si_shader_apply_scratch_relocs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák f466683cb0 radeonsi/gfx9: fix gl_ViewportIndex
v2: remove unnecessary LLVMBuildAnd calls

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-03 22:58:27 +02:00
Marek Olšák 4e50062028 radeonsi: pass tessellation ring addresses via user SGPRs
This removes s_load_dword latency for tess rings.

We need just 1 SGPR for the address if we use 64K alignment. The final asm
for recreating the descriptor is:

    // s2 is (address >> 16)
    s_mov_b32 s3, 0
    s_lshl_b64 s[4:5], s[2:3], 16
    s_mov_b32 s6, -1
    s_mov_b32 s7, 0x27fac

v2: bitcast the descriptor type from v2i64 to v4i32

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 9fd9a7d0ba radeonsi: remove VS epilog code, compile VS with PrimID export on demand
The use of PrimID in the pixel shader is too rare to deserve such
a sizable support code.

The initial idea of the VS epilog was to move the clipping code there and
remove it based on states, but optimized variants are now used to do that
and are easier to support, so the VS epilog has turned out to be not so
useful.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 3b2e93e472 radeonsi: get InstanceID from VGPR1 (or VGPR2 for tess) instead of VGPR3
VGPR1 = InstanceID / StepRate0; // StepRate0 can be set to 1

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 678d568c7b radeonsi: don't load PrimID in TES if it's not used
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák ed9a51cd3b radeonsi/gfx9: 2nd shader of merged shaders should hold a reference of the 1st
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák ef40937854 radeonsi: add reference counting for shader selectors
The 2nd shader of merged shaders should take a reference of the 1st shader.
The next commit will do that.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 6c15e15af4 radeonsi/gfx9: set VGT_VERTEX_REUSE for ES in ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 887ef1de34 radeonsi/gfx9: set TES registers for merged ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 49cd0cbfd5 radeonsi/gfx9: disallow scratch buffer for LS-HS and ES-GS
not implemented yet

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 2857b14bba radeonsi/gfx9: always compile monolithic ES-GS (asynchronously)
In addition to the non-monolithic variant.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 6a9c20fdd5 radeonsi/gfx9: make sure the 1st shader's main part exists for merged shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 8b220877ad radeonsi/gfx9: set registers and shader key for merged ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák ab197ad8d1 radeonsi/gfx9: add GS user SGPRs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 067dacd1b1 radeonsi/gfx9: define and set LS-HS user SGPRs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 0588146cb0 radeonsi/gfx9: set up shader registers for merged LS-HS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák 62abdb17bb radeonsi/gfx9: add initial code generation for non-monolithic merged LS-HS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák b1ed3ffc56 radeonsi: separate out VS prolog key generation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Dave Airlie e2659176ce radeonsi/ac: move vertex export remove to common code.
This code can be shared by radv, we bump the max to
VARYING_SLOT_MAX here, but that shouldn't have too
much fallout.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-27 05:17:47 +01:00
Marek Olšák 96b0cfc82e radeonsi: turn si_shader_key::mono into a non-union
A merged LS-HS shader needs both fix_fetch and inputs_to_copy
for compilation.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 13:08:05 +02:00
Marek Olšák 3f2a0649ab radeonsi: adjust ESGS ring buffer size computation on VI
Cc: 17.0 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 13:08:05 +02:00
Marek Olšák 60a20e6879 radeonsi/gfx9: set MAX_PRIMGRP_IN_WAVE in the correct register
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 13:08:05 +02:00
Marek Olšák bd2cde0c25 radeonsi: add si_shader_selector::vs_needs_prolog
cleanup

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-17 01:22:11 +02:00
Marek Olšák 777f305840 radeonsi: don't set VGT_GS_MODE as part of the GS state
The VS state sets it.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-17 01:22:11 +02:00
Nicolai Hähnle d6588d9962 radeonsi: cope with missing disassembly
For robustness and testing purposes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-14 22:51:07 +02:00
Nicolai Hähnle 472c84d1ad radeonsi: provide VS_STATE input to all VS variants
v2: fix incorrect change in get_tcs_out_patch_stride

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-13 17:30:20 +02:00
Marek Olšák 283c31afa1 radeonsi: unify HS max_offchip_buffers workarounds
Vulkan doesn't set more than 508.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 21:41:57 +02:00
Marek Olšák 172b05a37e radeonsi/gfx9: don't generate LS and ES states
these shaders don't exist on GFX9

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák 5271d12a6e radeonsi/gfx9: trivial shader and ring changes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák 6d21fd51b6 radeonsi/gfx9: disable RB+ on Vega10
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák c9b004af58 radeonsi/gfx9: handle GFX9 in a few places
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák 518d834162 radeonsi: don't hang on shader compile failure
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-24 18:25:05 +01:00
Grazvydas Ignotas 529a767041 util/disk_cache: use a helper to compute cache keys
This will allow to hash additional data into the cache keys or even
change the hashing algorithm easily, should we decide to do so.

v2: don't try to compute key (and crash) if cache is disabled

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-21 11:15:52 +11:00
Marek Olšák e9c6953ddb radeonsi: require that compiler threads are enabled
threaded gallium can't use pipe_context's LLVM target machine, because
create_shader_selector can be called from a non-driver thread.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Timothy Arceri 628e84a58f gallium/util: replace pipe_mutex_unlock() with mtx_unlock()
pipe_mutex_unlock() was made unnecessary with fd33a6bcd7.

Replaced using:
find ./src -type f -exec sed -i -- \
's:pipe_mutex_unlock(\([^)]*\)):mtx_unlock(\&\1):g' {} \;

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:53:05 +11:00
Timothy Arceri ba72554f3e gallium/util: replace pipe_mutex_lock() with mtx_lock()
replace pipe_mutex_lock() was made unnecessary with fd33a6bcd7.

Replaced using:
find ./src -type f -exec sed -i -- \
's:pipe_mutex_lock(\([^)]*\)):mtx_lock(\&\1):g' {} \;

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:52:38 +11:00
Timothy Arceri be188289e1 gallium/util: replace pipe_mutex_destroy() with mtx_destroy()
pipe_mutex_destroy() was made unnecessary with fd33a6bcd7.

Replace was done with:
find ./src -type f -exec sed -i -- \
's:pipe_mutex_destroy(\([^)]*\)):mtx_destroy(\&\1):g' {} \;

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:52:16 +11:00
Timothy Arceri 75b47dda0c gallium/util: replace pipe_mutex_init() with mtx_init()
pipe_mutex_init() was made unnecessary with fd33a6bcd7.

Replace was done using:
find ./src -type f -exec sed -i -- \
's:pipe_mutex_init(\([^)]*\)):(void) mtx_init(\&\1, mtx_plain):g' {} \;

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:52:07 +11:00
Timothy Arceri 6084855528 radeonsi: add support for an on-disk shader cache
V2:
- when loading from disk cache also binary insert into memory cache.
- check that the binary loaded from disk is the correct size. If not
  delete the cache item and skip loading from cache.

V3:
- remove unrequired variable

Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-03 12:09:08 +11:00
Marek Olšák 35915af6c9 radeonsi: fix broken tessellation on Carrizo and Stoney
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99850

Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-02-25 00:03:09 +01:00
Marek Olšák 24847dd1b5 gallium/u_queue: isolate util_queue_fence implementation
it's cleaner this way.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-22 20:26:39 +01:00
Marek Olšák 63c462226e radeonsi: fix issues with monolithic shaders
R600_DEBUG=mono has had no effect since:

    commit 1fabb29717
    Author: Marek Olšák <marek.olsak@amd.com>
    Date:   Tue Feb 14 22:08:32 2017 +0100

    radeonsi: have separate LS and ES main shader parts in the shader selector

Also, this assertion was failing:
    si_state_shaders.c:1307: si_shader_select_with_key: Assertion
    `!shader->is_optimized' failed.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-21 21:27:23 +01:00
Marek Olšák 84e72f2962 radeonsi: skip TESSINNER/OUTER offchip stores if TES doesn't read them
We were unconditionally storing these outputs, sometimes even one component
at a time, but apps never read them in TES.

Move the TESSINNER/OUTER buffer stores into the TCS epilog where we can
easily disable them on demand.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-21 21:27:23 +01:00
Nicolai Hähnle 066a117be7 radeonsi: fix UINT/SINT clamping for 10-bit formats on <= CIK
The same PS epilog workaround as for 8-bit integer formats is required,
since the CB doesn't do clamping.

Fixes GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels*.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-21 10:45:13 +01:00
Marek Olšák 45240ce598 radeonsi: use R600_RESOURCE_FLAG_UNMAPPABLE where it's desirable
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 1fabb29717 radeonsi: have separate LS and ES main shader parts in the shader selector
This might reduce the on-demand compilation if the initial VS/LS/ES
determination is wrong.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák a02117ba6e radeonsi: don't compile pure monolithic shaders asynchronously
there is no point, we have to wait anyway.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 41a2157a68 radeonsi: make fix_fetch an array of uint8_t
so that we can add 3-component fallbacks.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 408f9a1584 radeonsi: atomize the scratch buffer state
The update frequency is very low.

Difference: Only account for the size when allocating a new one and when
            starting a new IB, and check for NULL. (v3)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 17:29:36 +01:00
Marek Olšák 5f99c49008 radeonsi: precompute IA_MULTI_VGT_PARAM values into a table
The perf difference is very small: 0.99% -> 0.40% for the time spent
in si_get_ia_multi_vgt_param when si_draw_vbo is 20%. Pretty much nothing.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák c78177fc64 radeonsi: move VGT_VERTEX_REUSE_BLOCK_CNTL into shader states for Polaris
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák 802fcdc0d2 radeonsi: atomize L2 prefetches
to move the big conditional statement out of draw_vbo

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák c99ba3eb47 radeonsi: unbind disabled shader stages to prevent useless L2 prefetches
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák 35cd7551a4 radeonsi: use the correct target machine when building shader variants
If the shader selector is created with a different context than
the shader variant, we should use the calling context's target machine
for the shader variant.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99419

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-18 19:51:31 +01:00
Marek Olšák 3ae3be6dd4 radeonsi: move shader pipe context state into a separate structure
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-18 19:51:31 +01:00
Nicolai Hähnle 5e94e5bb9b radeonsi: fix R600_DEBUG=nooptvariant
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Vedran Miletić <vedran@miletic.net>
2017-01-16 20:16:18 +01:00
Marek Olšák 44e9b67229 radeonsi: make fix_fetch 64-bit
v2: add u_bit_consecutive64

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-16 18:07:08 +01:00
Marek Olšák 6f356d15be radeonsi: cleanly communicate whether si_shader_dump should check R600_DEBUG
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-09 12:01:30 +01:00
Marek Olšák 4b93ba542c radeonsi: assume that a TES without POSITION precedes GS
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-06 21:05:48 +01:00
Marek Olšák 3753dc896d radeonsi: update clip_regs if clip_disable changes to fix a hang
This seems to fix the GPU hangs caused by:

commit ed3190b3f3
Author: Marek Olšák <marek.olsak@amd.com>
Date:   Sun Nov 13 18:41:43 2016 +0100

    radeonsi: don't export ClipVertex and ClipDistance[] if clipping is disabled

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99219

Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-01-05 14:01:18 +01:00
Nicolai Hähnle ec0a0a60cc radeonsi: shrink the GSVS ring to account for the reduced item sizes
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-12-12 09:05:17 +01:00
Nicolai Hähnle 6fdef7d265 radeonsi: shrink each vertex stream to the actually required size
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-12-12 09:05:13 +01:00
Nicolai Hähnle 2f2e941e2d radeonsi: use a single descriptor for the GSVS ring
We can hardcode all of the fields for swizzling in the geometry shader.

The advantage is that we use fewer descriptor slots and we no longer have to
update any of the (ring) descriptors when the geometry shader changes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-12-12 09:05:05 +01:00
Nicolai Hähnle 7b5b3d63c5 radeonsi: update all GSVS ring descriptors for new buffer allocations
Fixes GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_geometry_instanced.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-12-12 09:04:06 +01:00
Marek Olšák e5302ad936 radeonsi: add a debug flag that disables optimized shader variants
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-23 18:49:10 +01:00
Marek Olšák 86514d84e0 util: import CRC32 implementation from gallium
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-11-22 18:05:51 +01:00
Marek Olšák bf75ef3f92 radeonsi: remove all varyings for depth-only rendering or rasterization off
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-21 21:44:35 +01:00
Marek Olšák ef6c84b301 radeonsi: eliminate VS outputs that aren't used by PS at runtime
A past commit added the ability to compile "optimized" shader variants
asynchronously (not stalling the app).

This commit builds upon that and adds what is basically a runtime shader
linker. If a VS output isn't used by the currently-bound PS, a new VS
compilation is started without that output. The new shader variant
is used when it's ready.

All apps using separate shader objects I've seen had unused VS outputs.

Eliminating unused/useless VS outputs also eliminates the corresponding
vertex attribute loads.

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-21 21:44:35 +01:00
Marek Olšák 7e76f9a7a8 radeonsi: record information about all written and read varyings
It's just tgsi_shader_info with DEFAULT_VAL varyings removed.

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-21 21:44:35 +01:00
Marek Olšák c7f3e5c647 radeonsi: make si_shader_io_get_unique_index stricter
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-21 21:44:35 +01:00
Marek Olšák ed3190b3f3 radeonsi: don't export ClipVertex and ClipDistance[] if clipping is disabled
This is the first user of optimized monolithic shader variants.

Cull distances can't be disabled by states.

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-21 21:44:35 +01:00
Marek Olšák d984a324bf radeonsi: add infrastr. for compiling optimized shader variants asynchronously
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-21 21:44:35 +01:00
Marek Olšák d2a56985d7 radeonsi: don't set vs.epilog.export_prim_id if TES is bound
there is no VS epilog in this case

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-21 21:44:35 +01:00
Marek Olšák fee71fec25 radeonsi: simplify checking for monolithic compilation
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-21 21:44:35 +01:00
Marek Olšák 6d5c2a8b5c radeonsi: split the shader key into 3 logical parts
key->part.*: prolog and epilog flags only
key->as_{ls,es}: special flags
key->mono.*: flags for monolithic compilation only

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-21 21:44:35 +01:00
Marek Olšák e59389d738 radeonsi: assume that a VS without POSITION is LS
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-21 21:44:35 +01:00
Nicolai Hähnle 2c875158e2 radeonsi: fix vertex fetches for 2_10_10_10 formats
The hardware always treats the alpha channel as unsigned, so add a shader
workaround. This is rare enough that we'll just build a monolithic vertex
shader.

The SINT case cannot actually happen in OpenGL, but I've included it for
completeness since it's just a mix of the other cases.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-11-04 21:30:18 +01:00
Nicolai Hähnle 908f92ad1f radeonsi: generate GS prolog to (partially) fix triangle strip adjacency rotation
Fixes GL45-CTS.geometry_shader.adjacency.adjacency_indiced_triangle_strip and
others.

This leaves the case of triangle strips with adjacency and primitive restarts
open. It seems that the only thing that cares about that is a piglit test.
Fixing this efficiently would be really involved, and I don't want to use the
hammer of degrading to software handling of indices because there may well
be software that uses this draw mode (without caring about the precise
rotation of triangles).

v2:
- skip the GS prolog entirely if workaround is not needed
- only check for TES (TES is always non-null when tessellation is used)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-11-03 10:11:24 +01:00
Nicolai Hähnle 3b2516721b radeonsi: make the GS copy shader owned by the GS selector
The copy shader only depends on the selector. This change avoids creating
separate code paths for monolithic vs. non-monolithic geometry shaders.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-11-03 10:07:50 +01:00
Nicolai Hähnle 9c6f7d66dc radeonsi: si_shader_vs only depends on the GS selector
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-11-03 10:07:48 +01:00
Nicolai Hähnle 693435d846 radeonsi: si_vgt_gs_mode only depends on the selector
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-11-03 10:07:45 +01:00
Marek Olšák d268b7f95e radeonsi: add a driver query for shader cache hits
This is an 8-month old patch.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01 22:33:13 +01:00
Marek Olšák ad45dce4a2 radeonsi: remove si_resource_create_custom
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-26 13:02:58 +02:00
Marek Olšák 29144d0f34 gallium/radeon: stop using PIPE_BIND_CUSTOM
it has no effect whatsoever

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-26 13:02:58 +02:00
Marek Olšák 3ec9975555 radeonsi: eliminate trivial constant VS outputs
These constant value VS PARAM exports:
- 0,0,0,0
- 0,0,0,1
- 1,1,1,0
- 1,1,1,1
can be loaded into PS inputs using the DEFAULT_VAL field, and the VS exports
can be removed from the IR to save export & parameter memory.

After LLVM optimizations, analyze the IR to see which exports are equal to
the ones listed above (or undef) and remove them if they are.

Targeted use cases:
- All DX9 eON ports always clear 10 VS outputs to 0.0 even if most of them
  are unused by PS (such as Witcher 2 below).
- VS output arrays with unused elements that the GLSL compiler can't
  eliminate (such as Batman below).

The shader-db deltas are quite interesting:
(not from upstream si-report.py, it won't be upstreamed)

PERCENTAGE DELTAS    Shaders PARAM exports (affected only)
batman_arkham_origins    589  -67.17 %
bioshock-infinite       1769   -0.47 %
dirt-showdown            548   -2.68 %
dota2                   1747   -3.36 %
f1-2015                  776   -4.94 %
left_4_dead_2           1762   -0.07 %
metro_2033_redux        2670   -0.43 %
portal                   474   -0.22 %
talos_principle          324   -3.63 %
warsow                   176   -2.20 %
witcher2                1040  -73.78 %
----------------------------------------
All affected             991  -65.37 %  ... 9681 -> 3353
----------------------------------------
Total                  26725  -10.82 %  ... 58490 -> 52162

v2: treat Undef as both 0 and 1

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1)
2016-10-19 22:21:46 +02:00
Marek Olšák a2ea653a49 radeonsi: remove cb0_is_integer handling
st/mesa does this for us.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-19 19:26:30 +02:00
Marek Olšák 7dddf0b7ab radeonsi: adjust and clean up Z_ORDER and EXEC_ON_x settings
The table was copied from the Vulkan driver. The comment lines are as long
as the table for cosmetic reasons.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-13 19:00:51 +02:00
Marek Olšák e12c1cab5d radeonsi: disable ReZ
This is a serious performance fix. Discovered by luck.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94354

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-13 19:00:51 +02:00
Marek Olšák e4bbab9022 radeonsi: fix R600_DEBUG=precompile for shader-db
radeonsi no longer supports pixel shaders without interpolation optimizations,
which led to assertion failures in si_shader_ps when running shader-db.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-12 18:29:40 +02:00
Marek Olšák 300a8221e9 radeonsi: add assertions to validate interpolation flags
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-05 21:03:23 +02:00
Marek Olšák 8c6ea5a6ff radeonsi: remove unnecessary #includes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:12:07 +02:00