Commit Graph

348 Commits

Author SHA1 Message Date
Marek Olšák 1378487fb4 radeonsi: rename and rearrange RW buffer slots
- use an enum
- use a unique slot number regardless of the shader stage
  (the per-stage slots will go away for RW buffers)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-22 01:14:13 +02:00
Bas Nieuwenhuizen 38f4cee3ff radeonsi: Add config parameter to si_shader_apply_scratch_relocs.
shader->config is not updated for compute kernels.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2016-04-21 19:36:19 +02:00
Marek Olšák ffe44d0283 radeonsi: fold num_user_sgprs where it is possible
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-14 17:00:14 +02:00
Marek Olšák 51c4034f9b radeonsi: fix SGPRS calculation once more
This fixes GS piglit failures after adding SI_PARAM_SHADER_BUFFERS,
which bumped NUM_USER_SGPRS and uncovered this bug on SI.

If this was fixed in LLVM, these workarounds wouldn't be needed.

LLVM would have to look at the calling convention to know how many SGPR
inputs are declared, and add VCC and the scratch wave offset (which is
enabled even if we spill SGPRs but not VGPRs, oh well).

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-14 17:00:14 +02:00
Marek Olšák 2ca5566ed7 radeonsi: move scissor and viewport states into gallium/radeon
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 17:13:24 +02:00
Nicolai Hähnle 6f942ac5ee radeonsi: disable early Z if the fragment shader writes to memory
Empirically, both the EXEC_ON_* flags and LATE_Z are necessary.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:25 -05:00
Marek Olšák a73a657def radeonsi: process TGSI property NEXT_SHADER
This allows compiling the main shader part as ES or LS.

If we get the correct hint, non-separable GLSL shaders no longer have to be
compiled as VS first, followed by LS or ES compiled on demand.

The result is that fewer shaders are compiled by piglit, but it doesn't
improve piglit running time.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-19 23:20:01 +01:00
Nicolai Hähnle 4de25fa7b0 radeonsi: set DEPTH_BEFORE_SHADER based on FS_EARLY_DEPTH_STENCIL
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:59 -05:00
Marek Olšák d0f3b524cd radeonsi: use re-Z
This can increase perf for shaders that kill pixels (kill, alpha-test,
alpha-to-coverage).

v2: add comments

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-01 00:18:19 +01:00
Marek Olšák ff360a52e6 radeonsi: implement binary shaders & shader cache in memory (v2)
v2: handle _mesa_hash_table_insert failure
    other cosmetic changes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-21 21:08:58 +01:00
Marek Olšák 1fe73d55e3 radeonsi: move some struct si_shader members to new struct si_shader_info
This will be part of shader binaries.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-21 21:08:58 +01:00
Marek Olšák 3c98e0b369 radeonsi: compile non-GS middle parts of shaders immediately if enabled
Still disabled.

Only prologs & epilogs are compiled in draw calls, but each variant of those
is compiled only once per process.

VS is always compiled as hw VS.
TES is always compiled as hw VS.

LS and ES stages are always compiled on demand.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-21 21:08:58 +01:00
Marek Olšák 4636d9be4a radeonsi: add PS prolog
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-21 21:08:58 +01:00
Marek Olšák dc27456194 radeonsi: separate out shader key bits for prologs & epilogs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-21 21:08:57 +01:00
Marek Olšák 329181ae33 radeonsi: enable denorms for 64-bit and 16-bit floats
This fixes FP16 conversion instructions for VI, which has 16-bit floats,
but not SI & CI, which can't disable denorms for those instructions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák de2e28366a radeonsi: compile geometry shaders immediately
they have only 1 variant

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák f7a8b6fff5 radeonsi: split out code for deleting si_shader
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák b6d5666fbf radeonsi: remove useless code that handles dx10_clamp_mode
"enable-no-nans-fp-math" is a wrong string and there was a disagreement
about fixing it.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák 5a53628f45 radeonsi: read SPI_PS_INPUT_ADDR from LLVM if it returns it
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák b9126dcda8 radeonsi: implement forcing per-sample_interpolation using the shader key only
It was partly a state and partly emulated by shader code, but since we want
to do this in a fragment shader prolog, we need to put it into the shader
key, which will be used to generate the prolog.

This also removes the spi_ps_input states and moves the registers
to the PS state.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák 6dda2455c8 radeonsi: move BCOLOR PS input locations after all other inputs
BCOLOR inputs were immediately after COLOR inputs. Thus, all following inputs
were offset by 1 if color_two_side was enabled, and not offset if it was not
enabled, which is a variation that's problematic if we want to have 1 variant
per shader and the variant doesn't care about color_two_side (that should be
handled by other bytecode attached at the beginning).

Instead, move BCOLOR inputs after all other inputs, so BCOLOR0 is at location
"num_inputs" if it's present. BCOLOR1 is next.

This also allows removing si_shader::nparam and
si_shader::ps_input_param_offset, which are useless now.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák 606e4185f3 radeonsi: move SPI_PS_INPUT_CNTL value computation to a separate function
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák 90cbbe1c12 radeonsi: generate a color_two_side variant only if the shader reads colors
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák 0d68b91220 radeonsi: rework RB+ for Stoney
This fixes it.

States which also need to be taken into account:
- SPI color formats - each down-conversion format supports only a limited set
  of SPI formats
- whether MSAA resolving and logic op are enabled

These need special handling:
- blending
- disabled channels

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-02 21:03:19 +01:00
Marek Olšák 066d76c2f4 radeonsi: rename cb_target_mask state to cb_render_state
and rename a variable in the function.

SX_PS_DOWNCONVERT will be emitted here.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-02 21:03:19 +01:00
Marek Olšák af57507e4f radeonsi: fix shader precompilation for shader-db
The addition of spi_shader_col_format killed all color outputs
in precompiled shaders.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)

v2: also set the alpha func (trivial)
2016-01-26 18:49:50 +01:00
Nicolai Hähnle a7754ffd31 radeonsi: replace use of is_gs_copy_shader in si_shader_vs
We now have an explicit parameter that contains the same information, and
this will allow us to get rid of is_gs_copy_shader in the si_shader struct.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-25 10:15:55 -05:00
Nicolai Hähnle 004fcd4230 radeonsi: ensure that VGT_GS_MODE is sent when necessary
Specifically, when the API switches from using a GS to not using a GS and then
back to using the same GS again, we do not have to re-send all the GS state,
but we do have to send VGT_GS_MODE. So make VGT_GS_MODE consistently be a part
of the VS state.

This fixes a rendering bug in Dolphin, but surely other applications are
affected as well.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93648
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-25 10:15:31 -05:00
Nicolai Hähnle 9f89bd69df radeonsi: extract the VGT_GS_MODE calculation into its own function
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-25 10:15:08 -05:00
Nicolai Hähnle d76bd85c35 Revert "radeonsi: fix discard-only fragment shaders (v2)"
This reverts commit 843855bbf0.

It became redundant due to Marek's earlier pushed 8667a1ae which achieves
the same thing.
2016-01-22 12:40:26 -05:00
Nicolai Hähnle 843855bbf0 radeonsi: fix discard-only fragment shaders (v2)
When a fragment shader is used that has no outputs but does conditional
discard (KILL_IF), all fragments are killed without this patch.

By comparing various register settings, my conclusion is that the exec mask
is either not properly forwarded to the DB by NULL exports or ends up being
unused, at least when there is _only_ a NULL export (the ISA documentation
claims that NULL exports can be used to override a previously exported exec
mask).

Of the various approaches I have tried to work around the problem, this one
seems to be the least invasive one.

v2: take discard by alpha test into account as well

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93761
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-22 11:59:50 -05:00
Marek Olšák 99dfeb01bd radeonsi: disable SPI color outputs the shader doesn't write
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák f6360de8c0 radeonsi: use all SPI color formats
because not using SPI_SHADER_32_ABGR doubles fill rate.

We should also get optimal performance if alpha isn't needed or blending
isn't enabled.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák 933e3c4145 radeonsi: use 32_AR for alpha-to-coverage without a color buffer
This avoids the fp16 packing instructions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák e28b8530b9 radeonsi: set CB_SHADER_MASK according to SPI color formats
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák 8667a1aea2 radeonsi: use SPI_SHADER_COL_FORMAT fields instead of export_16bpc
This does change the behavior slightly:
  If a shader writes COLOR[i] and that color buffer isn't bound,
  the shader will export MRT_NULL instead and discard the IR tree that
  calculates the output. The only exception is alpha-to-coverage, which
  requires an alpha export.

v2: - update a comment about 16BPC
    - account for MRTZ when when fixing alpha-test/kill

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák dc96a18d24 radeonsi: don't miss changes to SPI_TMPRING_SIZE
I'm not sure about the consequences of this bug, but it's definitely
dangerous.

This applies to SI, CIK, VI.

Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-14 19:55:41 +01:00
Marek Olšák 4ea0febcb0 radeonsi: move POSITION and FACE fragment shader inputs to system values
And FACE becomes integer instead of float.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-01-13 12:27:28 +01:00
Marek Olšák caf3c2abea radeonsi: simplify gl_FragCoord behavior
It will become a system value, not an input.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-01-13 12:27:28 +01:00
Marek Olšák 20b9b5d7f5 radeonsi: add struct si_shader_config
There will be 1 config per variant, which will be a union of configs
from {prolog, main, epilog}. For now, just add the structure.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:06 +01:00
Marek Olšák 0ffe3d3772 radeonsi: use EXP_NULL for pixel shaders without outputs
This never happens currently.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:06 +01:00
Marek Olšák 5f3e6b5b0f radeonsi: simplify setting the DONE bit for PS exports
First find out what the last export is and simply set the DONE bit there.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:06 +01:00
Marek Olšák e00f3f23b1 radeonsi: set SPI color formats and CB_SHADER_MASK outside of compilation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:06 +01:00
Marek Olšák 4e597c25c7 radeonsi: write all MRTs only if there is exactly one output
This doesn't fix a known bug, but better safe than sorry.

Also, simplify the expression in si_shader.c.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:06 +01:00
Marek Olšák 746a7a7498 radeonsi: determine SPI_SHADER_Z_FORMAT outside of shader compilation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:05 +01:00
Marek Olšák 2cb8bf90cd radeonsi: determine DB_SHADER_CONTROL outside of shader compilation
because the API pixel shader binary will not emulate alpha test one day,
so the KILL_ENABLE bit must be determined elsewhere.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:05 +01:00
Nicolai Hähnle 4bb1c8dfec radeonsi: pass pipe_debug_callback down into si_shader_binary_read (v2)
This will allow us to send shader debug info.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-02 16:47:23 -05:00
Nicolai Hähnle 7d1fc2cf51 radeonsi: count compilations in si_compile_llvm
This changes the count slightly (because of si_generate_gs_copy_shader), but
this is only relevant for the driver-specific num-compilations query. It sets
the stage for the next commit.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-12-29 09:07:01 -05:00
Marek Olšák 51603af390 radeonsi: use tgsi_shader_info::colors_written
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-11 15:25:11 +01:00
Edward O'Callaghan 13eb5f596b gallium/drivers: Sanitize NULL checks into canonical form
Use NULL tests of the form `if (ptr)' or `if (!ptr)'.
They do not depend on the definition of the symbol NULL.
Further, they provide the opportunity for the accidental
assignment, are clear and succinct.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-12-06 17:10:23 +01:00
Tom Stellard 95e0510916 radeonsi: Rename si_shader::ls_rsrc{1,2} to si_shader::rsrc{1,2}
In the future, these will be used by other shaders types.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-25 11:03:05 -05:00
Marek Olšák 3694d58e6c radeonsi: remove dead code after ES-GS linkage change
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-11-13 19:54:42 +01:00
Marek Olšák d79a3449a7 radeonsi: link ES-GS just like LS-HS
This reduces the shader key for ES.

Use a fixed attrib location based on (semantic name,  index).

The ESGS item size is determined by the physical index of the highest ES
output, so it's almost always larger than before, but I think that
shouldn't matter as long as the ESGS ring buffer is large enough.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-11-13 19:54:42 +01:00
Marek Olšák b1c5f3faa9 radeonsi: calculate optimal GS ring sizes to fix GS hangs on Tonga
I discovered that increasing the ESGS ring size fixes GS hangs on Tonga,
so let's do it properly.

There is now a separate init_config_gs_rings state that is not immutable,
because GS rings are resized when needed.

This also saves some memory. Most apps won't need more than 1MB
per ring per shader engine.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2015-11-13 19:54:42 +01:00
Marek Olšák 2f5d911ba2 radeonsi: rename si_update_gs_rings
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2015-11-13 19:54:42 +01:00
Marek Olšák 4acd856088 radeonsi: calculate ESGS_RING_ITEMSIZE in create_shader
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2015-11-13 19:54:42 +01:00
Marek Olšák a0cf589961 radeonsi: move maximum gs stream calculation into create_shader
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2015-11-13 19:54:42 +01:00
Marek Olšák 3ab0c49f04 radeonsi: clean up small duplication in si_shader_gs
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2015-11-13 19:54:42 +01:00
Marek Olšák 6cc8f6c6a7 gallium/radeon: inline the r600_rings structure
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2015-11-13 19:54:41 +01:00
Marek Olšák 3a157e6e68 radeonsi: allow unbinding vertex shaders
Draw calls without a vertex shader are skipped.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-24 00:01:20 +02:00
Marek Olšák 07b3cc6ecf radeonsi: allow unbinding pixel shaders and remove the dummy shader
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-24 00:01:20 +02:00
Marek Olšák ed95cb3a31 radeonsi: add checks for a NULL pixel shader
This will allow removing the dummy PS.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-24 00:01:20 +02:00
Marek Olšák 8339585b12 radeonsi: enable BC_OPTIMIZE if centroid isn't used
This solution was recommended by a Catalyst developer.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-20 12:56:46 +02:00
Marek Olšák 9b54ce3362 radeonsi: support thread-safe shaders shared by multiple contexts
The "current" shader pointer is moved from the CSO to the context, so that
the CSO is mostly immutable.

The only drawback is that the "current" pointer isn't saved when unbinding
a shader and it must be looked up when the shader is bound again.

This is also a prerequisite for multithreaded shader compilation.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-20 12:51:51 +02:00
Marek Olšák 5bc871a4ca radeonsi: implement vertex color clamping
This is only supported in the compatibility profile (without GS and tess).

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-17 21:40:03 +02:00
Marek Olšák 208d1ed38d radeonsi: implement fragment color clamping
using the shader key for now.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-17 21:40:03 +02:00
Marek Olšák acc6a07874 radeonsi: clean up other scratch buffer functions
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-17 21:40:03 +02:00
Marek Olšák 9098d7e9bd radeonsi: clean up copy-pasted scratch buffer updates
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-17 21:40:03 +02:00
Marek Olšák 938a1bee34 radeonsi: unify shader create functions
The shader specifies the processor type, so use that instead.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-17 21:40:03 +02:00
Marek Olšák b0167809f1 radeonsi: unify shader delete functions
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-17 21:40:03 +02:00
Marek Olšák c4f086f399 radeonsi: remove an unused ctx parameter in si_shader_destroy
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-17 21:40:03 +02:00
Marek Olšák b11edf8872 radeonsi: disable NaNs for LS and HS
They're disabled for all other shaders except compute, but I forgot
to do this for tess stages.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-17 21:40:03 +02:00
Marek Olšák 13e69805ea radeonsi: fix a GS hang on VI
Broken by one of the cleanups: 0d46c3bc9d
Not applicable to stable.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-10-07 19:18:50 +02:00
Marek Olšák b3c55fc669 radeonsi: do force_persample_interp in shaders for non-trivial cases
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:09 +02:00
Marek Olšák 9652bfcf2d radeonsi: implement the simple case of force_persample_interp
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:09 +02:00
Marek Olšák 214de2d815 radeonsi: move SPI_PS_INPUT_ENA/ADDR registers to a separate state
This will be a derived state used for changing center->sample and
centroid->sample at runtime.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:09 +02:00
Marek Olšák 2edb060639 gallium/radeon: tell the winsys the exact resource binding types
Use the priority flags and expand them.
This information will be used for debugging.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:07 +02:00
Marek Olšák 9932142192 radeonsi: add scratch buffer to the buffer list when it's re-allocated
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2015-09-26 01:51:05 +02:00
Marek Olšák b737d9c1dc radeonsi: don't forget to update scratch relocations for LS, HS, ES shaders
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:43 +02:00
Marek Olšák d556346b35 radeonsi: skip drawing if updating the scratch buffer fails
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:43 +02:00
Marek Olšák 1f99b0be7e radeonsi: skip drawing if PS fails to compile or upload
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:43 +02:00
Marek Olšák 237d7cccce radeonsi: skip drawing if VS, TCS, TES, GS fail to compile or upload
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:43 +02:00
Marek Olšák 9b6d9dd7d8 radeonsi: handle fixed-func TCS shader create failure
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:43 +02:00
Marek Olšák 5dbadb0257 radeonsi: handle shader precompile failures
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:43 +02:00
Marek Olšák 263f5a2cf9 radeonsi: skip drawing if GS ring allocations fail
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:43 +02:00
Marek Olšák 22d3ccf5a8 radeonsi: skip drawing if the tess factor ring allocation fails
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:42 +02:00
Marek Olšák 5c219ab552 radeonsi: add malloc fail paths to si_create_shader_state
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:42 +02:00
Marek Olšák 2d8f7d3c15 radeonsi: use an indirect buffer for init_config
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-09-01 21:51:15 +02:00
Marek Olšák aad43f0768 radeonsi: don't set number of IB dwords for states
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-09-01 21:51:15 +02:00
Marek Olšák fc95058add radeonsi: convert SPI state to an atom
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-09-01 21:51:15 +02:00
Marek Olšák 45e549fcbc radeonsi: convert CB_TARGET_MASK setup to an atom
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-09-01 21:51:14 +02:00
Marek Olšák c2a42d1f9f radeonsi: don't rebind GSVS ring buffers every draw call using GS
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-09-01 21:51:14 +02:00
Marek Olšák c9a3196b14 radeonsi: don't clear the tessellation factor ring buffer
Leftover from the bring-up.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-09-01 21:51:14 +02:00
Marek Olšák a2c6ae07b4 radeonsi: remove the tf_ring state, add the registers to init_config
One less state to worry about.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-09-01 21:51:14 +02:00
Marek Olšák 0d46c3bc9d radeonsi: remove the gs_rings state, add the registers to init_config
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-09-01 21:51:14 +02:00
Marek Olšák 8a97528b3a radeonsi: optimize viewport states
same as scissors

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-09-01 21:51:13 +02:00
Marek Olšák f6a10f60b7 radeonsi: optimize scissor states
- convert 16 states to 1 atom
- only emit 1 scissor if VIEWPORT_INDEX isn't written
- use only one packet when emitting consecutive scissors

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-09-01 21:51:13 +02:00
Marek Olšák 9b510a9652 radeonsi: fix a Unigine Heaven hang when drirc is missing
Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-09-01 21:51:13 +02:00
Grazvydas Ignotas f8b01ae47c radeonsi: mark unreachable paths to avoid warnings
Otherwise we get:
warning: 'num_user_sgprs' may be used uninitialized in this function
...

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-08-26 15:42:26 +02:00
Marek Olšák 2d1952e2a5 radeonsi: add VI hardware support 2015-08-14 15:02:29 +02:00
Marek Olšák e7a52a5cb8 radeonsi: add support for gl_PrimitiveID in the fragment shader
It must be obtained from the VS.

The GS scenario A must be enabled for PrimID to be generated for the VS.

+ 4 piglits

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-08-13 01:25:26 +02:00
Marek Olšák 8e11be0ddb radeonsi: move VGT_GS_MODE to the VS state
The VS will want to select GS scenario A here (VS with PrimitiveID).

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-08-13 01:25:26 +02:00
Grazvydas Ignotas 3206d4ed44 gallium/radeon: use helper functions to mark atoms dirty
This is analogous to r300_mark_atom_dirty() used by r300, and will
be used by later patches. For common radeon code, appropriate helper
is called through a function pointer.

No functional changes.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-08-11 14:46:53 +02:00
Marek Olšák 30a7e0c021 radeonsi: add a HUD query showing the number of shaders created
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-08-06 20:44:37 +02:00
Marek Olšák 70f5e49ba5 radeonsi: add a HUD query showing the number of compiler invocations
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-08-06 20:44:37 +02:00
Dave Airlie 3c73c41871 radeonsi: add GS multiple streams support (v2)
This is the final piece for ARB_gpu_shader5,

The code is based on the r600 code from Glenn Kennard,
and myself.

While developing this, I'm not 100% sure of all the calculations
made in the GS registers, this is why the max_stream is worked
out there and used to limit the changes in registers. Otherwise
my initial attempts either regressed GS texelFetch tests
or primitive-id-restart. The current code has no regressions
in piglit.

This commit doesn't enable ARB_gpu_shader5, since that just
bumps the glsl level to 4.00, so I'll just do a separate patch
for 4.10.

v1.1: fix bug introduced in rebase.
v2: Address Marek's review comments,
remove my llvm stream code for simpler C,
move gsvs_ring and gs_next_vertex to arrays.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-07-30 09:00:17 +01:00
Dave Airlie 2294ba9565 radeon: add support for streams to the common streamout code. (v2)
This adds to the common radeon streamout code, support
for multiple streams.

It updates radeonsi/r600 to set the enabled mask up.

v2: update for changes in previous patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-07-29 10:48:47 +01:00
Marek Olšák a193c4978b radeonsi: add scratch buffer support for tessellation shaders
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-07-23 00:59:33 +02:00
Marek Olšák 74c1001d13 radeonsi: add derived tessellation state
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-07-23 00:59:33 +02:00
Marek Olšák db267a04ce radeonsi: implement a fixed-function tessellation control shader and its state
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-07-23 00:59:32 +02:00
Marek Olšák b6f4fdf6a9 radeonsi: set up a ring buffer for tessellation factors
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-07-23 00:59:32 +02:00
Marek Olšák ebfd9e0071 radeonsi: add tessellation shader states
ls_rsrc# will be emitted as part of the derived tessellation state

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-07-23 00:59:32 +02:00
Marek Olšák fff16e4ad2 radeonsi: add shader code generation for tessellation
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-07-23 00:59:32 +02:00
Marek Olšák 59b3556f4c radeonsi: program VGT_SHADER_STAGES_EN for tessellation
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-07-23 00:59:32 +02:00
Marek Olšák d1f43a7e5b radeonsi: add code for creating, binding and destroying tessellation shaders
This doesn't do anything yet.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-07-23 00:59:31 +02:00
Marek Olšák 3ce91c727f radeonsi: rework how shader pointers to descriptors are set
This is mainly needed for tessellation where a VS can be bound as VS, ES,
or LS, and TES (tess. evaluationshader) can be bound as VS or ES or neither.
Therefore we need the ability to move pointers to descriptors between
shaders arbitrarily.

The idea is that the context has a mapping from PIPE_SHADER_x to
SPI_SHADER_USER_DATA_x. After a shader is enabled or disabled,
si_shader_change_notify should be called to update this mapping accordingly.

There is a dirty flag for each shader pointer, but only one emit function
for all pointers in the whole context, whose code and logic is separated
from descriptors.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-07-23 00:59:31 +02:00
Marek Olšák 50a957c5de radeonsi: upload shader rodata after updating scratch relocations
Cc: 10.5 10.6 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-07-23 00:59:24 +02:00
Ilia Mirkin a2a1a5805f gallium: replace INLINE with inline
Generated by running:
git grep -l INLINE src/gallium/ | xargs sed -i 's/\bINLINE\b/inline/g'
git grep -l INLINE src/mesa/state_tracker/ | xargs sed -i 's/\bINLINE\b/inline/g'
git checkout src/gallium/state_trackers/clover/Doxyfile

and manual edits to
src/gallium/include/pipe/p_compiler.h
src/gallium/README.portability

to remove mentions of the inline define.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2015-07-21 17:52:16 -04:00
Michel Dänzer 248b26429f radeonsi: Use param export count from si_llvm_export_vs in si_shader_vs
This eliminates the error prone logic in si_shader_vs recalculating this
value.

It also fixes TGSI_SEMANTIC_CLIPDIST outputs incorrectly not being
counted for VS exports. They need to be counted because they are passed
to the pixel shader as parameters as well.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91193
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-07-07 12:35:35 +09:00
Dave Airlie 556dd4af76 radeonsi: add support for geometry shader invocations.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-06-27 00:24:30 +01:00
Dave Airlie 7e5064360c radeonsi: add support for viewport array (v3)
This isn't pretty and I'd suggest it the pm4 interface builder
could be tweaked to do this more efficently, but I'd need
guidance on how that would look.

This seems to pass the few piglit tests I threw at it.

v2: handle passing layer/viewport index to fragment shader.
fix crash in blit changes,
add support to io_get_unique_index for layer/viewport index
update docs.
v3: avoid looking up viewport index and layer in es (Marek).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-06-27 00:24:07 +01:00
Marek Olšák 224a77cc60 radeonsi: use a switch statement in si_delete_shader_selector
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-05-26 12:42:37 +02:00
Marek Olšák 0c5a309cee radeonsi: use a switch statement in si_shader_selector_key
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-05-26 12:42:37 +02:00
Marek Olšák fa7f606e89 radeonsi: fix scratch buffer setup for geometry shaders
Cc: 10.6 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-05-26 12:42:37 +02:00
Marek Olšák af4b9c7c2e radeonsi: don't count special outputs for the VS export count
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-05-26 12:42:36 +02:00
Marek Olšák e4339bc988 radeonsi: add support for PIPE_CAP_TGSI_TEXCOORD
Without it, texcoords are mapped to GENERIC[0..7], PointCoord is mapped to
GENERIC[8], and user-defined varyings start from GENERIC[9]. Since texcoords
can only be used between VS and PS, and PointCoord is PS-only, it's silly to
always start from GENERIC[9] in all other shaders (such as LS, HS, ES, GS).

This adds support for TEXCOORD and PCOORD semantics. As a result, st/mesa
will use GENERIC[0] as a base for user-defined varyings, which should make
linking ES and GS as well as tessellation shaders at runtime easier.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-05-26 12:42:31 +02:00
Marek Olšák b79c620663 radeonsi: add a debug option to compile shaders when they're created
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2015-04-16 18:36:29 +02:00
Michel Dänzer d64adc3a79 radeonsi: Cache LLVMTargetMachineRef in context instead of in screen
Fixes a crash in genymotion with several threads compiling shaders
concurrently.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89746

Cc: 10.5 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-03-30 15:15:10 +09:00
Marek Olšák 98a2398222 radeonsi: implement line and polygon smoothing
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák 1921fa4304 radeonsi: small cleanup in si_shader_selector_key
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:18 +01:00
Marek Olšák 5349437154 radeonsi: only preload VertexID for the GS copy shader
The copy shader doesn't use any other preloaded VGPRs.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-24 21:21:04 +01:00
Marek Olšák 050bf75c8b radeonsi: fix a warning caused by previous commit
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
2015-02-23 11:45:00 +01:00
Marek Olšák 7820a11e3d radeonsi: fix point sprites
Broken by a27b74819a.

This fix is critical and should be ported to stable ASAP.

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
2015-02-23 11:40:55 +01:00
Marek Olšák a27b74819a radeonsi: small fix in SPI state
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák 6c5af1dc4e radeonsi: implement polygon stippling
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Tom Stellard 2397a72129 radeonsi: Enable VGPR spilling for all shader types v5
v2:
  - Only emit write SPI_TMPRING_SIZE once per packet.
  - Use context global scratch buffer.

v3:
  - Patch shaders using WRITE_DATA packet instead of map/unmap.
  - Emit ICACHE_FLUSH, CS_PARTIAL_FLUSH, PS_PARTIAL_FLUSH, and
    VS_PARTIAL_FLUSH when patching shaders.

v4:
  - Code cleanups.
  - Remove unnecessary multiplies.

v5:
  - Patch shaders in system memory and re-upload to vram.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-28 21:03:47 +00:00
Marek Olšák 5935edd47c radeonsi: Avoid leaking memory when rebuilding shader states
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-28 21:03:46 +00:00
Michel Dänzer 82b7ee62fc Revert "radeonsi: only set BC_OPTIMIZE_DISABLE when necessary"
This reverts commit 0543630d0b.

It caused flickering artifacts in Steam games such as Team Fortress 2 or
Left 4 Dead 2.

We could probably only enable this optimization by also making sure the
shader code only uses either SI_PARAM_LINEAR_CENTROID or
SI_PARAM_LINEAR_CENTER, not both. This would probably require a shader
variant.

Sorry I didn't remember this when reviewing the reverted change.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-01-15 15:09:48 +09:00
Marek Olšák 1829f9c928 radeonsi: enable LLVM optimizations that assume no NaNs for non-compute shaders
v2: complete rewrite

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-01-07 18:27:54 +01:00
Marek Olšák 2bfe9d4538 radeonsi: rename flush flags, split the TC flag into L1 and L2
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák 0543630d0b radeonsi: only set BC_OPTIMIZE_DISABLE when necessary
SPI_PS_IN_CONTROL is moved into the SPI mapping state.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák 5d8e838dae radeonsi: do not define FACE as an ordinary PS input
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák 15a7fff69a radeonsi: remove flatshade from the shader key
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák 2150db4d5d radeonsi: force NaNs to 0
This fixes incorrect rendering in Unreal Engine demos.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83510

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-21 20:34:38 +01:00
Marek Olšák 3291eedfe6 radeonsi: only emit line stippling and provoking vertex state when it changes
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák acda2e113a radeonsi: fix SPI state dependency on sprite_coord_enable
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák b472709090 radeonsi: emit clip registers only if VS, GS, or rasterizer is changed
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák 20e570d115 radeonsi: move all shader-related functions to a new file si_state_shaders.c
This huge amount of code deserves its own file.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00