Now that dxil_spirv_nir.h exposes an helper to run the
DXIL-SPIRV specific passes, we can handle the varying linking
on our side and tell nir_to_dxil() we don't want automatic
varying index/register assignment, which should fix a bunch
of compiler errors.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16221>
Letting the compiler decide which slot should be used for varyings when
it doesn't know about the varyings written/read by the previous/next
stage doesn't work well. So let's the caller decide when it wants
automatic index/register assignment through a dedicated parameter,
instead of assuming Vulkan users always want that.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16221>
Linking should be done in reverse order, starting from the last
pipeline stage and going backward, so we can eliminate outputs from the
previous stage that are never used by the next stage, and possibly
kill some instructions and input variables too.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16221>
No need to keep them around if they're unused. Moreover, this should
allow the linking step to get rid of outputs when the next stage
doesn't use them.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16221>
Right now, nir_to_dxil() assumes driver_location on inputs will be
contiguous, which is true for GL, and also true for Vulkan shaders
with the current implementation. But we are trying to delegate
the varying linking step to Dozen, and that means the driver will
assign the driver_location field.
For everything except vertex shaders this works fine, because we
are in control of the ID we assign to each variable, and can make
sure no holes exists in this assignment, but vertex inputs expect
the index value (which is directly extracted from the
driver_location field) to match the input index defined at pipeline
creation time. The compiler has a hack to treat Vulkan differently
and extract the index from the var->data.location field instead,
but that's a bit confusing.
Moreover, the input_mappings[] array is already indexed with
the var->data.driver_location field in the input load emission
path, so it makes sense to index it with the same field when
emitting signatures.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16221>
Extract NIR passes out of spirv_to_dxil() so we can re-use them
without separately and do the varying linking in Dozen. This way
we will also be able to use vk_shader_module_to_nir() which
takes care of the SPIRV -> NIR translation, plus a bunch of
common lowering passes.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16221>
We need the fix adding a Block decoration to the BuiltIn struct in
SpvModuleScopeVarParserTest_BuiltinPosition_BuiltIn_Position_Initializer.spvasm.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16221>
Tell the state tracker our point coordinates have a lower left origin
instead of an upper left origin, and remove our point coordinate
flipping code. Saves an instruction in any shader that reads
gl_PointCoord.y
Note: the OpenGL blob also emits an "fadd $y', ^y.neg, 1.0" to flip
point coordinates, so this isn't just a Metal weirdness.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16829>
When lower_wpos_pntc is used, the state tracker inserts code to
transform gl_PointCoord.y according to a uniform, to account for
API-requested point coordinate origin and framebuffer orientation. With
the transformation, driver-supplied point coordinates are expected to
have an upper left origin.
If the hardware point coordinate supports (only) a lower left origin,
the backend has to use lower_wpos_pntc and then lower *again* to flip
back. This ends up transforming twice, which is wasteful:
a = load point coord Y with lower left origin
a' = 1.0 - a
a'' = uniform_transform(a')
However, lower_wpos_pntc is quite capable of transforming for a lower
left origin too, it just needs to flip the transformation. Add a CAP
specifying the point coordinate origin convention, rather than assuming
upper-left. This simplifies the Asahi code greatly.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16829>
The performance counter layout depends on the number of L2 blocks and the number
of shader cores. It doesn't make a ton of sense to hardcode these into the XML
files. Instead, let's make the coutner offsets in the XML files relative to the
categories (blocks), so we can calculate the offsets of the categories
themselves at runtime based on the computed layout. This fixes performance
counters on Mali-G57 as implemented on MT8192.
There is little code change here, mainly churn from changing the XML definition.
Postprocessing for the XML to make it suitable for Mesa uses Antonio Caggiano's
https://gitlab.freedesktop.org/panfrost/hwc-helper tool.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16803>
The number of L2 performance counter blocks equals the number of L2 slices, so
add a query to get this. This information isn't needed by the Mesa driver, so
don't get it in the default device initialization path.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16803>
The input is specified in two identical structs, tear that apart.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16916>
Much simpler than creating a UBO and relying on it getting optimized to a push
constant, with possible reordering.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16916>
Otherwise, load_push_constant won't work properly. This could probably be made
to work if we tried hard enough, but we still don't want reordering for internal
(meta) shaders which are layed out deliberately.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16916>
Bifrost supports "fast access uniforms" loaded from a single contiguous buffer.
This maps directly to Vulkan push constants, with some caveats:
* No indirect access. Indirects need to be lowered to a UBO pull.
* Strict alignment requirements. These will be met in practice.
Implement the NIR intrinsic and map it to the native hardware construct.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16916>
Inverted condition -- indirect dispatch gets disabled when WLS is in use, not
the other way around. Not sure how this worked before...
Fixes: fd7b44882c ("panfrost: Use direct dispatch with shared memory")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16916>
To be able to sum drawcall cost and render pass cost, the units of costs
are changed to bytes. With that, tu_autotune_use_bypass can make
decisions by comparing the costs of sysmem rendering and gmem rendering.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16733>
Allow 1/4 of the max heap size, but maximum of 512 MB on 32-bit
architectures.
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16901>
This is generally PEP-484 stuff, but there is one functional change.
The base class Node needed to have an add() method to allow typed
dynamic dispatch. This could have been decorated @abstractmethod, but
that would require an error-raising implementation on all leaf-type
nodes. Instead, I added a base implementation that just errors out with
information from the subclass instance.
As a simple optimization, subclass implementations of add() (instead of
raising the same (or similar) error) now call super().add() in the
case of invalid child nodes.
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16884>
This allowed for nested groups rendered as arrays. Support for this had
mostly been removed already; this removes the additional value to make
typing easier.
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16884>