Sparse ir_texture will set is_sparse and use struct type to
hold both residency code and sampled texel.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362>
This fixes nir_opt_cse miss replace a non-sparse tex instruction
with a sparse tex instruction and fail the nir_validate_shader().
Fixes: 3a7972f72a ("nir,spirv: add sparse texture fetches")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362>
If an application calls eglMakeCurrent with the same context and the same
draw and read surfaces as the current ones, there is no need to go
through all the work of unbinding/flushing the old context and binding
the new one.
While the EGL spec is a bit ambiguous here, it seems that the implicit
flush of the outgoing context only needs to be done when the context is
actually changed.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14379>
The flush of the outgoing GL context, as required by the EGL spec for
eglMakeCurrent and extended by KHR_context_flush_control, is already
performed in the unbindContext call. There is no need to pierce through
the layers and unconditionally call glFlush() here.
Getting the out-fence FD for explicit fencing needs to move behind the
unbindContext, to make sure we are getting the fence for the most
recently flushed commands.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14379>
Doing this for ir3 required adding a struct for limits of how much base to
fold in (which NTT wants as well for its case of shared vars), otherwise
the later work to lower to the 1<<9 word limit would emit more
instructions.
The shader-db results are that sometimes the reduction in NIR instruction
count results in the fewer sampler prefetches due to the shader being
estimated to be shorter (dota2, nexuiz):
total instructions in shared programs: 8996651 -> 8996776 (<.01%)
total cat5 in shared programs: 86561 -> 86577 (0.02%)
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14023>
Lower to `sample_mask = 0`. Actually that implements a demote... doing
discard correctly probably requires rewriting the shader control flow to
insert a return where necessary...
Also, possibly we should be lowering this in NIR to play nice with
gl_SampleMask writes but that's a problem for when we understand the
hardware better.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14219>
Sets the output sample mask to a given 8-bit immediate or 16-bit
register. Also used to implement discards, which is my ES2 interest.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14219>
Enough bits of this packet are known that open-coding hex bytes for it
is annoying. Add some XML correpsonding to what we know.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14219>
Issue: objects[i].size is returned as '0' for all
Root cause: The value of objects.size is hard coded to '0' in
vlVaExportSurfaceHandle()
Fix: Assigning the value by multiplying height and width of the surface
Signed-off-by: Krunal Patel <krunalkumarmukeshkumar.patel@amd.corp-partner.google.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14313>
Issue: VADRMPRIMESurfaceDescriptor width and height are rounded up to
macroblock size (16).
Rootcause: surf->buffer's width/height are rounded up to macroblock size (16),
so they shouldn't be used here.
Fix: Using surf->templ's width/height instead fixes incorrect surface
dimensions being sent via VADRMPRIMESurfaceDescriptor.
Signed-off-by: Krunal Patel <krunalkumarmukeshkumar.patel@amd.corp-partner.google.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14313>
The documentation[1] for the csv module specifies that we should specify
newline='' when opening the output file. Without that, the module
garbles the newlines, writing them as \r\r\n on Windows instead of \r\n.
So let's do what the documentation says, and specify newline=''
[1]: https://docs.python.org/3/library/csv.html#id3
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12405>
According to RFC 4189 CSV files should be encoded using CRLF newlines,
not LF. This helps compatibility with tools, like python's csv module,
who always uses CRLF.
While we're at it, normalize the one CSV that was CRLF in-repo to LF,
and let git do the newline-normalization when needed instead.
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12405>
By software ABI, a blend shader is permitted to clobber registers
R0-R15. The scheduler needs to be aware of this, to avoid moving a write
to one of these registers past the BLEND itself. Otherwise the schedule
is invalid.
This bug affects GLES3.0, but is rare enough in practice that we had
missed it. It requires a fragment shader to write to multiple render
targets with attached blend shaders, and have temporaries register
allocated to R0-R15 that are not read by the blend shader, but are sunk
past the BLEND instruction by the scheduler. Prevents a regression when
switching boolean representations on:
dEQP-GLES31.functional.shaders.builtin_functions.integer.uaddcarry.uvec4_lowp_fragment
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14577>
Although Bifrost clause packing and register assignment is tricky, the
relevant code is by now extensively tested, and there's no remaining
reverse-engineering here. So disassembling verbosely just adds tons of
noise to pandecode without increasing the useful information.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-by: Icecream95 <ixn@disroot.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14543>
It's a pointer to a thread storage descriptor, not a framebuffer
descriptor. Unlike Midgard, these don't have to alias. The FBD pointer
was unused anyway, so remove it to reduce noise in pandecode dumps.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14543>
Add a helper to dump all mapped GPU memory. This is a blunt, seldom
useful instrument ... but when it /is/ useful it's your only option.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14543>
Rather than emulating page tables, poorly, with a hash table, use a
red-black tree and store the intervals directly. This is deterministic
instead of probabilistic, attaining O(log n) performance for n mapped
intervals which is good enough. Unlike the hash table approach, this
allows us to iterate intervals easily.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14543>