Commit Graph

149 Commits

Author SHA1 Message Date
Marek Olšák 3138a28ff2 radeonsi: move default tess level constant buffer to RW buffers
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-22 01:14:14 +02:00
Marek Olšák 1378487fb4 radeonsi: rename and rearrange RW buffer slots
- use an enum
- use a unique slot number regardless of the shader stage
  (the per-stage slots will go away for RW buffers)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-22 01:14:13 +02:00
Bas Nieuwenhuizen 38f4cee3ff radeonsi: Add config parameter to si_shader_apply_scratch_relocs.
shader->config is not updated for compute kernels.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2016-04-21 19:36:19 +02:00
Marek Olšák ffe44d0283 radeonsi: fold num_user_sgprs where it is possible
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-14 17:00:14 +02:00
Marek Olšák 51c4034f9b radeonsi: fix SGPRS calculation once more
This fixes GS piglit failures after adding SI_PARAM_SHADER_BUFFERS,
which bumped NUM_USER_SGPRS and uncovered this bug on SI.

If this was fixed in LLVM, these workarounds wouldn't be needed.

LLVM would have to look at the calling convention to know how many SGPR
inputs are declared, and add VCC and the scratch wave offset (which is
enabled even if we spill SGPRs but not VGPRs, oh well).

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-14 17:00:14 +02:00
Marek Olšák 2ca5566ed7 radeonsi: move scissor and viewport states into gallium/radeon
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 17:13:24 +02:00
Nicolai Hähnle 6f942ac5ee radeonsi: disable early Z if the fragment shader writes to memory
Empirically, both the EXEC_ON_* flags and LATE_Z are necessary.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:25 -05:00
Marek Olšák a73a657def radeonsi: process TGSI property NEXT_SHADER
This allows compiling the main shader part as ES or LS.

If we get the correct hint, non-separable GLSL shaders no longer have to be
compiled as VS first, followed by LS or ES compiled on demand.

The result is that fewer shaders are compiled by piglit, but it doesn't
improve piglit running time.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-19 23:20:01 +01:00
Nicolai Hähnle 4de25fa7b0 radeonsi: set DEPTH_BEFORE_SHADER based on FS_EARLY_DEPTH_STENCIL
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:59 -05:00
Marek Olšák d0f3b524cd radeonsi: use re-Z
This can increase perf for shaders that kill pixels (kill, alpha-test,
alpha-to-coverage).

v2: add comments

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-01 00:18:19 +01:00
Marek Olšák ff360a52e6 radeonsi: implement binary shaders & shader cache in memory (v2)
v2: handle _mesa_hash_table_insert failure
    other cosmetic changes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-21 21:08:58 +01:00
Marek Olšák 1fe73d55e3 radeonsi: move some struct si_shader members to new struct si_shader_info
This will be part of shader binaries.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-21 21:08:58 +01:00
Marek Olšák 3c98e0b369 radeonsi: compile non-GS middle parts of shaders immediately if enabled
Still disabled.

Only prologs & epilogs are compiled in draw calls, but each variant of those
is compiled only once per process.

VS is always compiled as hw VS.
TES is always compiled as hw VS.

LS and ES stages are always compiled on demand.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-21 21:08:58 +01:00
Marek Olšák 4636d9be4a radeonsi: add PS prolog
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-21 21:08:58 +01:00
Marek Olšák dc27456194 radeonsi: separate out shader key bits for prologs & epilogs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-21 21:08:57 +01:00
Marek Olšák 329181ae33 radeonsi: enable denorms for 64-bit and 16-bit floats
This fixes FP16 conversion instructions for VI, which has 16-bit floats,
but not SI & CI, which can't disable denorms for those instructions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák de2e28366a radeonsi: compile geometry shaders immediately
they have only 1 variant

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák f7a8b6fff5 radeonsi: split out code for deleting si_shader
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák b6d5666fbf radeonsi: remove useless code that handles dx10_clamp_mode
"enable-no-nans-fp-math" is a wrong string and there was a disagreement
about fixing it.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák 5a53628f45 radeonsi: read SPI_PS_INPUT_ADDR from LLVM if it returns it
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák b9126dcda8 radeonsi: implement forcing per-sample_interpolation using the shader key only
It was partly a state and partly emulated by shader code, but since we want
to do this in a fragment shader prolog, we need to put it into the shader
key, which will be used to generate the prolog.

This also removes the spi_ps_input states and moves the registers
to the PS state.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák 6dda2455c8 radeonsi: move BCOLOR PS input locations after all other inputs
BCOLOR inputs were immediately after COLOR inputs. Thus, all following inputs
were offset by 1 if color_two_side was enabled, and not offset if it was not
enabled, which is a variation that's problematic if we want to have 1 variant
per shader and the variant doesn't care about color_two_side (that should be
handled by other bytecode attached at the beginning).

Instead, move BCOLOR inputs after all other inputs, so BCOLOR0 is at location
"num_inputs" if it's present. BCOLOR1 is next.

This also allows removing si_shader::nparam and
si_shader::ps_input_param_offset, which are useless now.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák 606e4185f3 radeonsi: move SPI_PS_INPUT_CNTL value computation to a separate function
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák 90cbbe1c12 radeonsi: generate a color_two_side variant only if the shader reads colors
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-09 21:19:51 +01:00
Marek Olšák 0d68b91220 radeonsi: rework RB+ for Stoney
This fixes it.

States which also need to be taken into account:
- SPI color formats - each down-conversion format supports only a limited set
  of SPI formats
- whether MSAA resolving and logic op are enabled

These need special handling:
- blending
- disabled channels

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-02 21:03:19 +01:00
Marek Olšák 066d76c2f4 radeonsi: rename cb_target_mask state to cb_render_state
and rename a variable in the function.

SX_PS_DOWNCONVERT will be emitted here.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-02 21:03:19 +01:00
Marek Olšák af57507e4f radeonsi: fix shader precompilation for shader-db
The addition of spi_shader_col_format killed all color outputs
in precompiled shaders.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)

v2: also set the alpha func (trivial)
2016-01-26 18:49:50 +01:00
Nicolai Hähnle a7754ffd31 radeonsi: replace use of is_gs_copy_shader in si_shader_vs
We now have an explicit parameter that contains the same information, and
this will allow us to get rid of is_gs_copy_shader in the si_shader struct.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-25 10:15:55 -05:00
Nicolai Hähnle 004fcd4230 radeonsi: ensure that VGT_GS_MODE is sent when necessary
Specifically, when the API switches from using a GS to not using a GS and then
back to using the same GS again, we do not have to re-send all the GS state,
but we do have to send VGT_GS_MODE. So make VGT_GS_MODE consistently be a part
of the VS state.

This fixes a rendering bug in Dolphin, but surely other applications are
affected as well.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93648
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-25 10:15:31 -05:00
Nicolai Hähnle 9f89bd69df radeonsi: extract the VGT_GS_MODE calculation into its own function
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-25 10:15:08 -05:00
Nicolai Hähnle d76bd85c35 Revert "radeonsi: fix discard-only fragment shaders (v2)"
This reverts commit 843855bbf0.

It became redundant due to Marek's earlier pushed 8667a1ae which achieves
the same thing.
2016-01-22 12:40:26 -05:00
Nicolai Hähnle 843855bbf0 radeonsi: fix discard-only fragment shaders (v2)
When a fragment shader is used that has no outputs but does conditional
discard (KILL_IF), all fragments are killed without this patch.

By comparing various register settings, my conclusion is that the exec mask
is either not properly forwarded to the DB by NULL exports or ends up being
unused, at least when there is _only_ a NULL export (the ISA documentation
claims that NULL exports can be used to override a previously exported exec
mask).

Of the various approaches I have tried to work around the problem, this one
seems to be the least invasive one.

v2: take discard by alpha test into account as well

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93761
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-22 11:59:50 -05:00
Marek Olšák 99dfeb01bd radeonsi: disable SPI color outputs the shader doesn't write
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák f6360de8c0 radeonsi: use all SPI color formats
because not using SPI_SHADER_32_ABGR doubles fill rate.

We should also get optimal performance if alpha isn't needed or blending
isn't enabled.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák 933e3c4145 radeonsi: use 32_AR for alpha-to-coverage without a color buffer
This avoids the fp16 packing instructions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák e28b8530b9 radeonsi: set CB_SHADER_MASK according to SPI color formats
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák 8667a1aea2 radeonsi: use SPI_SHADER_COL_FORMAT fields instead of export_16bpc
This does change the behavior slightly:
  If a shader writes COLOR[i] and that color buffer isn't bound,
  the shader will export MRT_NULL instead and discard the IR tree that
  calculates the output. The only exception is alpha-to-coverage, which
  requires an alpha export.

v2: - update a comment about 16BPC
    - account for MRTZ when when fixing alpha-test/kill

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák dc96a18d24 radeonsi: don't miss changes to SPI_TMPRING_SIZE
I'm not sure about the consequences of this bug, but it's definitely
dangerous.

This applies to SI, CIK, VI.

Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-14 19:55:41 +01:00
Marek Olšák 4ea0febcb0 radeonsi: move POSITION and FACE fragment shader inputs to system values
And FACE becomes integer instead of float.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-01-13 12:27:28 +01:00
Marek Olšák caf3c2abea radeonsi: simplify gl_FragCoord behavior
It will become a system value, not an input.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-01-13 12:27:28 +01:00
Marek Olšák 20b9b5d7f5 radeonsi: add struct si_shader_config
There will be 1 config per variant, which will be a union of configs
from {prolog, main, epilog}. For now, just add the structure.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:06 +01:00
Marek Olšák 0ffe3d3772 radeonsi: use EXP_NULL for pixel shaders without outputs
This never happens currently.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:06 +01:00
Marek Olšák 5f3e6b5b0f radeonsi: simplify setting the DONE bit for PS exports
First find out what the last export is and simply set the DONE bit there.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:06 +01:00
Marek Olšák e00f3f23b1 radeonsi: set SPI color formats and CB_SHADER_MASK outside of compilation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:06 +01:00
Marek Olšák 4e597c25c7 radeonsi: write all MRTs only if there is exactly one output
This doesn't fix a known bug, but better safe than sorry.

Also, simplify the expression in si_shader.c.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:06 +01:00
Marek Olšák 746a7a7498 radeonsi: determine SPI_SHADER_Z_FORMAT outside of shader compilation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:05 +01:00
Marek Olšák 2cb8bf90cd radeonsi: determine DB_SHADER_CONTROL outside of shader compilation
because the API pixel shader binary will not emulate alpha test one day,
so the KILL_ENABLE bit must be determined elsewhere.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:05 +01:00
Nicolai Hähnle 4bb1c8dfec radeonsi: pass pipe_debug_callback down into si_shader_binary_read (v2)
This will allow us to send shader debug info.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-02 16:47:23 -05:00
Nicolai Hähnle 7d1fc2cf51 radeonsi: count compilations in si_compile_llvm
This changes the count slightly (because of si_generate_gs_copy_shader), but
this is only relevant for the driver-specific num-compilations query. It sets
the stage for the next commit.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-12-29 09:07:01 -05:00
Marek Olšák 51603af390 radeonsi: use tgsi_shader_info::colors_written
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-11 15:25:11 +01:00