Nicolai Hähnle
dfa8e758c2
radeonsi/gfx10: disable clear state
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
0dd57f0fc0
radeonsi/gfx10: disable DPBB
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
815fd77a47
radeonsi/gfx10: disable SDMA
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
f66ee5af2f
radeonsi: determine the rasterization primitive type accurately (v2)
...
v2: reworked version to fix bugs and make it more efficient
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
a4b3eea325
radeonsi/gfx10: consolidate & improve input_prim determination for NGG
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
969e5176c2
ac: rework ac_build_waitcnt for gfx10
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
214ddfb688
radeonsi/gfx10: implement si_shader_vs
...
Only used with tessellation + GS instancing.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
6cf2fb1fc4
radeonsi/gfx10: unpack GS invocation ID
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
32694456f7
radeonsi/gfx10: jump over the shader query atomic if the queries are disabled
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
244a8e6798
radeonsi/gfx10: cosmetic changes
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
09a905d930
radeonsi/gfx10: set cache control registers
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
b680f723f8
radeonsi/gfx10: export correct PrimitiveID from NGG vertex shaders
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
3203a74dcb
radeonsi/gfx10: set PA_SC_TILE_STEERING_OVERRIDE
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
07aacdbfd5
radeonsi/gfx10: add a workaround for stencil HTILE with mipmapping
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
51db950419
radeonsi/gfx10: disable DCC with MSAA
...
It was only enabled for 2x MSAA anyway.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
6920f09f4b
radeonsi/gfx10: fix GL_LINE polygon mode for decomposed primitives
...
We need to tell PA to accept edge flags generated by the input assembler,
because decomposed primitives shouldn't draw inner edges.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
e39d4594da
radeonsi/gfx10: fix NGG GS color clamping
...
Just need to pass the input from ES to GS. Everything else is done.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
40e7c65590
radeonsi/gfx10: fix vertex color clamping for TES
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
cc7875150a
radeonsi/gfx10: unbind NGG shaders when destroyed
...
This fixes glsl-max-varyings, which creates shaders, draws, and then
destroys them.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
b90ddff477
radeonsi/gfx10: don't use the GS workaround for triangle strips w/ adjacency
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
c3ac22a620
radeonsi/gfx10: don't do the query buffer atomic for blit shaders
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
adbec817d3
radeonsi/gfx10: update spi_map if API VS (as NGG) changes and PS doesn't
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
1e39c21c23
radeonsi/gfx10: fix a possible hang with exp pos0 with done=0 and exec=0
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
683cf11b81
radeonsi/gfx10: prefetch HW GS when NGG is used
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
76898a8062
amd/common/gfx10: set DLC for llvm.amdgcn.s.buffer.load
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
7f71579064
radeonsi/gfx10: fix PS exports for SPI_SHADER_32_AR
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
4bdf44724f
radeonsi/gfx10: set DLC for loads when GLC is set
...
This fixes L1 shader array cache coherency.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
f81aa6b0c8
radeonsi/gfx10: fix shader images
...
Don't promote 2D image instructions to 3D, and don't set z=BASE_ARRAY.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
7c805a7c67
radeonsi/gfx10: set the DCC constant encoding flag
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
6eb219e963
radeonsi/gfx10: fix intensity formats
...
Move the ALPHA_IS_ON_MSB fixup into vi_alpha_is_on_msb.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
6944f99176
radeonsi/gfx10: allocate GDS BOs for streamout
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
395185912d
radeonsi/gfx10: make sure GDS is idle between IBs
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
5ff3aff0d6
radeonsi/gfx10: implement streamout
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
792a638b03
radeonsi/gfx10: implement streamout-related queries
...
The NGG hardware pipeline doesn't track these statistics automatically,
and in fact *cannot* track them automatically when API geometry shaders
are involved, so we accumulate statistics in the shader using atomic
adds.
This implementation accumulates statistics via the memory system and
the RW buffer descriptor setup. We could use GDS, but since these
atomics aren't latency-sensitive, that basically just trades off
L2$ bandwidth vs. export bus bandwidth. One single memory transaction
per shader workgroup doesn't seem too bad. The result ring buffer in
memory is needed either way to avoid pipeline stalls.
The shader code contains the atomic unconditionally, though the
GFX10_GS_QUERY_BUF is a null buffer when no queries are active. The
atomic is simply discarded by the shader hardware in that case.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
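Note on 792a638b03: a minimal CPU-side sketch of the accumulation scheme the message describes, where each workgroup issues one atomic add into a result slot and the add is dropped when no query buffer is bound. It assumes C11 atomics as a stand-in for the shader's buffer atomic. The names `query_slot`, `prims_generated` and `accumulate_workgroup` are hypothetical and do not come from the driver.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result slot; the real ring buffer layout is not shown
 * in the commit message. */
struct query_slot {
    _Atomic uint64_t prims_generated;
};

/* Models one shader workgroup reporting its primitive count with a
 * single atomic add. A NULL slot stands in for the null query buffer:
 * the add is skipped, mirroring the hardware discarding the atomic. */
static void accumulate_workgroup(struct query_slot *slot, uint32_t wg_prims)
{
    if (slot == NULL)
        return;
    /* Relaxed ordering is enough for a pure counter in this sketch. */
    atomic_fetch_add_explicit(&slot->prims_generated, wg_prims,
                              memory_order_relaxed);
}
```

The point of the sketch is only that the cost is one memory transaction per workgroup, independent of how many primitives that workgroup produced.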
Nicolai Hähnle
bcd2d2e194
radeonsi/gfx10: enable the workaround for unaligned vertex fetch
...
Yes, really. Note that non-format buffer loads are unaffected and work
just fine with unaligned pointers (as long as SH_MEM_CONFIG is set up
correctly, which amdgpu ensures).
Fixes e.g. KHR-GL45.vertex_attrib_64bit.vao
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
22b85bfc02
radeonsi/gfx10: re-order the initialization order in si_compile_tgsi_main
...
It's useful to be able to access gs_ngg_scratch before creating the
main wrapping branch.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
3aa622aab1
radeonsi/gfx10: apply DCC MSAA blend workaround
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
bc25ccfe22
radeonsi/gfx10: implement si_emit_global_shader_pointers
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
6bcc273de8
radeonsi/gfx10: implement si_init_tess_factor_ring
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
2492cfde66
radeonsi/gfx10: initialize EXEC for TES-as-NGG (without geometry shader)
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
591537c7fa
radeonsi/gfx10: use correct VGPR for instance ID in LS shader
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
f3b9a37278
radeonsi/gfx10: implement si_shader_hs
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
e4d6b4daae
radeonsi/gfx10: implement si_create_sampler_state
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
0bf3e6fae7
radeonsi/gfx10: double the number of tessellation offchip buffers per SE
...
Each gfx10 shader engine corresponds to two gfx9 shader engines, so scale
the number of offchip buffers accordingly.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
2afd3c421d
radeonsi/gfx10: implement get_tess_ring_descriptor
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
d028440f57
radeonsi/gfx10: mask DCC tile swizzle by alignment
...
DCC alignment can be less than the alignment of the main surface. In that
case, the DCC tile swizzle needs to be masked accordingly. Should have no
impact on pre-gfx10.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
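Note on d028440f57: a minimal sketch of the masking described in the message, assuming the tile swizzle has already been converted to a byte offset and that the DCC alignment is a nonzero power of two. The helper name `mask_dcc_tile_swizzle` and its parameters are hypothetical; this is not the driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: keep only the swizzle bits that lie below the
 * DCC alignment, so the swizzled address never exceeds what the DCC
 * allocation's alignment guarantees. */
static uint64_t mask_dcc_tile_swizzle(uint64_t swizzle_bytes,
                                      uint64_t dcc_alignment)
{
    /* Assumption of this sketch: alignment is a nonzero power of two. */
    assert(dcc_alignment != 0 &&
           (dcc_alignment & (dcc_alignment - 1)) == 0);
    return swizzle_bytes & (dcc_alignment - 1);
}
```

When the DCC alignment is at least as large as the main surface alignment, the mask leaves the swizzle unchanged, which matches the message's remark that pre-gfx10 should see no impact.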
Nicolai Hähnle
1666ee183e
radeonsi/gfx10: implement hardware MSAA resolve
...
MSAA is only supported for 64KB_{R,Z}_X modes, so the micro tile
optimization that we use on gfx9 and earlier does not work.
Be very explicit about how the swizzle mode of the temporary surface is
selected.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
69c41fb8ff
radeonsi/gfx10: fix binding on si_update_scratch_relocs
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
fd8758366b
radeonsi/gfx10: set llvm_has_working_vgpr_indexing
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
48810ad02d
radeonsi/gfx10: implement load_const_buffer_desc_fast_path
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00