This radically simplifies the code to decrease CPU overhead in si_draw_vbo.
The generic CP DMA copy function is too complicated.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8548>
This is a great candidate for a template. There are a lot of conditions
that are already templated in si_draw_vbo.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8548>
Enable vrs2x2 coarse shading if flat shading as per
idea and guidance given by Marek.
is_flat_shading variable in struct si_shader_info is set
based on the data from gather_intrinsic_info() function
and struct si_state_rasterizer. If is_flat_shading_variable
is set, then in function si_emit_db_render_state() vrs2x2
shading is enabled in hardware.
v2: Fix review comments from Pierre-Eric. Code optimizations.
v3: Fix indentation style issue.
v4: Fix review comments from Marek. Fixed logical issue pointed
by Marek where info->is_flat_shading variable can be corrupted
and other code cleanup.
v5: Make the code compact as suggested by Pierre-Eric.
v6: Fix new review comments from Marek.
v7: use info->uses_interp_color variable fix from Marek.
v8: Fix coding style comment from Marek.
v9: Add uses_fbfetch_output check as suggested by Marek.
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8161>
The problem was that the shader constants were based on the framebuffer
sample count and ignored the multisample enable state and the line/polygon
smoothing state, which uses MSAA rasterization that only sets SampleMaskIn
to get the coverage for alpha-blended smoothing (the PS epilog computes
the alpha channel from SampleMaskIn and blending generates the AA results).
- This is a complete rework that adds a new state for NGG cull constants.
- It fixes the same thing for the prim discard compute shader.
- It documents how VS_STATE.SMALL_PRIM_PRECISION is encoded.
It fixes blue corruption in Unigine Heaven with MSAA and Medium details
or better.
Fixes: 7648060dc0 - radeonsi: enable NGG culling by default on gfx10.3 dGPUs
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8022>
There are many issues with SDMA across many generations of hardware.
A recent example is that gfx10.3 suffers from random GPU hangs if
userspace uses SDMA.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7908>
It's straightforward except that the amdgpu winsys had to be cleaned up
to allow this.
radeon_cmdbuf is inlined and optionally the winsys can save the pointer
to it. radeon_cmdbuf::priv points to the winsys cs structure.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7907>
invalidate_draw_sh_constants should invalidate only SGPRs.
invalidate_draw_constants invalidates SGPRs and NUM_INSTANCES.
u_blitter called invalidate_draw_sh_constants, which previously
invalidated NUM_INSTANCES as well. This commit fixes that.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7721>
This aligns the offsets to match the memory layout of the query buffer
defined by gfx10_sh_query_buffer_mem and calls si_launch_grid_internal
to flush caches and wait for completion of shaders prior to retrieving
results.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7181>
Instead of:
if (VS) {
VS;
}
if (TCS) {
TCS;
}
Do this if the number of threads is the same in VS and TCS:
exec = enabled_threads;
VS;
TCS;
Skipping declare_vb_descriptor_input_sgprs is needed to match the VS return
values.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
Flushes non-explicit shared textures that need retiling on
* glFlush
* glSync
* glSignalSemaphoreEXT
* DRI fences.
* The first time we create a non-explicit handle for it.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6176>
This improves performance for uber shaders.
It must be enabled using the new driconf option.
The driver compiles the specialized shaders in another thread without stalls,
same as all other optimizations.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7057>
Add a vertex count threshold into si_shader_selector to simplify
the draw_vbo code.
The new option is supposed to be used in 00-mesa-defaults.conf and should be
tweaked for best performance unlike the AMD_DEBUG experimental options.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6948>
So far, the callback to create a resource from a memory object had code
for importing textures only. Modified it to allow importing buffers too.
Fixes the following piglit tests:
- ext_external_objects/vk-buf-exchange
- ext_external_objects/vk-pix-buf-update-errors
- ext_external_objects/vk-vert-buf-update-errors
- ext_external_objects/vk-vert-buf-reuse
v2: Used si_alloc_buffer_struct instead of CALLOC
v3: Fixed indentation issue, removed free in case of unsuccessful
allocation, joined two if conditions together
Signed-off-by: Eleni Maria Stea <estea@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6364>
tess_rings must be encrypted when used in a secure job so this commit
introduces a tess_rings_tmz resource.
The cs_preamble_state doesn't contain the tess_rings address anymore since
it can change. The tess_rings related registers go in a separate preamble.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049>
This commit makes TMZ always allowed instead of being either off or forced-on
with AMD_DEBUG=tmz.
With this change:
- secure job can be used as soon as the application made a tmz allocation. Driver
internal allocations are not enough to enable secure jobs (if tmz is supported
and enabled by the kernel)
- AMD_DEBUG=tmz forces all scanout/depth/stencil buffers to be allocated as TMZ.
This is useful to test app thats don't explicitely support protected content.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049>
Tag allocations as driver internal.
Some of these allocations will need to be doubled to handle TMZ (one secure bo,
one normal bo) but these allocations shouldn't switch the winsys in "the app
is using TMZ".
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049>
The retile maps are a software mechanism and hence very suceptible
to change. As such I'd like to avoid making it part of the cross
driver ABI.
Ideally we'd just use the cached tile info + a shader to avoid these
buffers altogether.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6783>
This reverts commit 430d384c31.
BIT_PAGE can't be set for GTT and we don't know if a buffer has been
evicted to GTT.
Fixes: 430d384c31
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6722>
so that they can be different depending on the GPU (for 16-bit support)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6284>