Not sure where the 16k comes from, but pretty sure 2k is the max.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
These aren't used yet but we will want to use them when we
implement a separate compute queue.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This is required for having a separate compute queue, we
probably can't use this on GFX queue due to DCC.
v2: Set coord_components = 2 for itoi texture fetch. (Bas)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This implements the reverse of the current buffer->image path
and can be used when we need to do image transfer on compute queues
This just adds the code turned off as we don't support separate
computes queues yet, and we don't want to use this path on the GFX
queues for DCC reasons.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Needed when accessing a comrpessed texture as R32G32B32A32 from a shader. This
was not encountered previously, as we used the CB for the reinterpretation, which
does not use this pitch.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
A bit of a hack, but we need to do this until we can do tiled zs in
sysmem (and associated tile/until blits for transfer_map).
Fixes xonotic and glmark2 "refract", when reorder wasn't enabled.
(reorder would paper over the issue by avoiding the extra round-
trip to system memory and back to gmem.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Refactor out into a common helper, since this is the same across
generations when we need equiv z/s gmem restore format.
Next patch needs this in a5xx, rather than creating yet another
helper push this into core.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Fixes some issues at least with GMEM bypass mode, where we'd sometimes
end up with some FS quads not hitting memory.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Swap/component-order doesn't seem to be quite what that is. At least
blob was always setting it to XYZW ('11') but we weren't. Causing
problems w/ formats like sint16.. Hard-coding this instead at least
seems to get glamor working.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Might not be 100% accurate, mostly just copy from a4xx to get started.
We are defn lying about occlusion query at this point (not implemented
yet) but need it to expose anything higher than gl1.4 (glamor needs
gl2.1)
Signed-off-by: Rob Clark <robdclark@gmail.com>
Not sure what this event is, but blob writes it.. and it seems to solve
random write faults at mystery address that would sometimes happen on
first BYPASS draw.
Signed-off-by: Rob Clark <robdclark@gmail.com>
The spec says we have to try to create all, and only set failed
pipelines to VK_NULL_HANDLE. If one of them fails, we have to return
an error, but as far as I can see, the spec does not care which of
the suberrors.
Fixes
dEQP-VK.api.object_management.alloc_callback_fail_multiple.compute_pipeline
dEQP-VK.api.object_management.alloc_callback_fail_multiple.graphics_pipeline
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
The strategy is to do the same thing that the GLSL lower_offset_arrays
pass does - create 4 separate texture gather ops, one per offset, and
read in the results from each gather's w component to recreate the
desired result.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This fixes a regression in a bunch of image store vulkan CTS tests
from commit ad38ba1134, which started
using OWORD block read messages to implement UBO loads. The reason
for the failure is that we were giving bogus buffer alignment limits
to the application (1B), so the CTS would happily come back with
descriptor sets pointing at not even word-aligned uniform buffer
addresses.
Surprisingly the sampler messages used to fetch pull constants before
that commit were able to cope with the non-texel aligned addresses,
but the dataport messages used to fetch pull constants after that
commit and the ones used to access storage buffers (before and after
the same commit) aren't as permissive with unaligned addresses.
Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99097
Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We do nothing but initialize it, add to it, and delete it.
This is a fallout from removing constant initializer support.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Work can now be added to fences and triggered by fence completion. This
allows for deferred resource deletion, and other asynchronous tasks.
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
Reverting the previous attempt at this a5502a721f resulted in
the following Vulkan test failing.
dEQP-VK.glsl.return.return_in_dynamic_loop_dynamic_vertex
This time we use the num_components from the alu dest rather than
num_inputs to the op to determine the size of the undef.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99100
No functional change, just rewriting it in an easier-to-understand way.
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
We should make the dest in the textureLod() operation have the same number
of components as the destination in the original textureGrad()
Fixes regression in ES3-CTS.gtf.GL3Tests.shadow
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99072
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This reverts commit 6aa730000f.
This was changing the size of the undef to always be 1 (the number of inputs
to imov and fmov) which is wrong, we could be moving a vec4 for example.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
We want to perform the URB read to a vec4 temporary, with no writemask,
then issue a MOV to swizzle the data and store it to the actual
destination, using the final writemask.
We were doing this wrong. For example, let's say we wanted to read
a vec2 stored in components 2-3 of a vec4. We would generate a URB
read message of:
SEND <actual destination>.XY <header with mask set to XY>
MOV <actual destination>.XY <actual destination>.ZW
This doesn't work, because the URB message reads the .XY components
of the vec4, rather than the ZW. It writes to the right place, but
with the wrong data. Then the MOV comes along and overwrites it
with data that didn't even come from the URB at all.
Instead we want to do:
SEND <temporary> <header with mask set to ZW>
MOV <actual destination>.XY <temporary>.ZW
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>