The old path copies state parameters into the parameter list, and then
the driver copies them into a buffer.
The optional new path loads state parameters into a buffer directly.
This increases performance by 5% in one subtest of viewperf.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6946>
Drivers should upload only UniformBytes of uniforms and constants,
and then use _mesa_upload_state_parameters to upload state parameters.
This allows removing one copy of state parameters.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6946>
Disabled because of CI failures.
Invoke fetch_state only once for all lights in the best case instead
of 8*8 times.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6946>
Disabled because of CI failures.
Instead of separate state vars for each row and invoking fetch_state 4x:
state.matrix.modelview.row[0]
state.matrix.modelview.row[1]
state.matrix.modelview.row[2]
state.matrix.modelview.row[3]
The rows are now merged and fetch_state is invoked once:
state.matrix.modelview.row[0..3]
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6946>
This moves state vars to the end of the parameter list, so that state vars
can be loaded directly into a buffer instead of loaded into the parameter list.
Also, state vars don't need to be searched in the parameter list anymore,
because we will know their index range. (this will make gallium faster)
This commit just wraps a for loop around the existing code.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6946>
This moves state vars to the end of the parameter list, so that state vars
can be loaded directly into a buffer instead of loaded into the parameter list.
Also, state vars don't need to be searched in the parameter list anymore,
because we will know their index range. (this will make gallium faster)
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6946>
The order of enums is a preparation for a future commit, but if you are
guessing that I just want to copy all light params into a constant buffer
with one memcpy, you are right.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6946>
Instead of having $matrix and $modifier as separate enums, combine them
to 1 enum $matrix_$modifier.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6946>
There is no reason for it. This removes a pointer indirection.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6946>
The destination memory can be uncached, e.g. a buffer object.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6946>
Viewperf does a lot of redundant uniform updates - 60-80% in some tests.
Those are sometimes the only state changes between draw calls.
This improves performance by 33% in one viewperf subtest.
If you are worried about CPU overhead in the non-redundant case,
glthread is the solution.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6946>
This aligns the offsets to match the memory layout of the query buffer
defined by gfx10_sh_query_buffer_mem and calls si_launch_grid_internal
to flush caches and wait for completion of shaders prior to retrieving
results.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7181>
The filter is only relevant to handle blits that invole scaling, but
our TFU path doesn't handle any kind of scaling so we can safely
ignore it.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7845>
The TFU path only activates for blits that are really copies
(no linear filtering, no scaling, same pixel format, etc.), and
we do it slice by slice, so we can easily handle mirroring of the
Z coordinate for 3D images by reversing the order of the layers
as we copy them.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7845>
Same as with other TFU paths, we only handle exact copies without
conversion, so we can rewrite the format to use a compatible TFU format
based on its texel size, which allows us to use this path with more
formats.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7845>
I can't think of any reason for ever requesting this here,
maybe in some future with parallel threads optimised compiles
and where vulkan drivers respect this bit.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7840>