Now that LLVM 9 will be released soon, we will only support
LLVM 8, 9 and master (10).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Fixes errors seen with eglSetBlobCacheFuncsANDROID on Android when
running dEQP that terminates and reinitializes a display.
Fixes: 6f5b57093b "egl: add support for EGL_ANDROID_blob_cache"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
We were always resolving the buffer as if we were accessing it via
CPU maps, which don't understand any auxiliary surfaces. But we often
copy to a temporary using BLORP, which understands compression just
fine. So we can avoid the resolve, and accelerate the copy as well.
Fixes: 9d1334d2a0 ("iris: Use copy_region and staging resources to avoid transfer stalls")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
This doesn't work for compressed formats, as the source texture and
temporary texture would have different block sizes. (Forcing the driver
to always take the GPU path would expose the bug.) Instead, just use
the source format for the temporary, and let blorp_copy deal with
overrides.
The one case where we can't do this is ASTC, because isl won't let us
create a linear ASTC surface. Fall back to the CPU paths there for now.
Fixes: 9d1334d2a0 ("iris: Use copy_region and staging resources to avoid transfer stalls")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Gen11 stores the fast clear color in an "indirect clear buffer", as
a packed pixel value. Gen9 hardware stores it as a float or integer
value, which is interpreted via the format. We were trying to store
that in a buffer, for similarity with Icelake, and MI_COPY_MEM_MEM
it from there to the actual SURFACE_STATE bytes where it's stored.
This unfortunately doesn't work for blorp_copy(), which does bit-for-bit
copies, and overrides the format to a CCS-compatible UINT format. This
causes the clear color to be interpreted in the overridden format.
Normally, we provide the clear color on the CPU, and blorp_blit.c:2611
converts it to a packed pixel value in the original format, then unpacks
it in the overridden format, so the clear color we use expands to the
bits we originally desired.
However, BLORP doesn't support this pack/unpack with an indirect clear
buffer, as it would need to do the math on the GPU. On Gen11+, it isn't
necessary, as the hardware does the right thing.
This patch changes Gen9 to stop using an indirect clear buffer and
simply do PIPE_CONTROLs with post-sync write immediate operations
to store the new color over the surface states for regular drawing.
BLORP continues streaming out surface states, and handles fast clear
colors on the CPU.
Fixes: 53c484ba8a ("iris: blorp using resolve hooks")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
For renderable surfaces, we allocate SURFACE_STATEs for each bit in
res->aux.possible_usages. Sampler views use res->aux.sampler_usages.
When pinning buffers, we call surf_state_offset_for_aux() to calculate
the offset to the desired surface state. surf_state_offset_for_aux()
took an aux_modes parameter, which should be one of those two fields.
However...it was not using that parameter. It always used the broader
res->aux.possible_usages field directly.
One of the callers, update_clear_value(), was passing incorrect masks
for this parameter. It iterated through the bits in order, using
u_bit_scan(), which destructively modifies the mask. So each time we
called it, the count of bits before our selected mode was 0, which would
cause us to always update the SURFACE_STATE for ISL_AUX_USAGE_NONE,
rather than updating each in turn. This was hidden by the earlier bug
where surf_state_offset_for_aux() ignored the parameter.
Fixes: 7339660e80 ("iris: Add aux.sampler_usages.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
This is genxml, we can compile out this code.
Fixes: 2660667284 ("iris/gen8: Re-emit the SURFACE_STATE if the clear color changed.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Rather than passing through the transformed gl_Position, we can use the
hardware-level varying for this, which will correctly handle
gl_FragCoord.w
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
The offset is added to the base address, so we need to subtract it from
the size to maintain the same end address and thus prevent a buffer
overflow:
end_address = start_address + size
start_address' = start_address + offset
size' = size - offset
end_address' = start_address' + size'
= (start_address + offset) + (size - offset)
= (start_address + size) + (offset - offset)
= start_address + size
= end_address
QED.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We need a special path for special varyings so we parse them correctly
instead of throwing an error when they inevitably point to bad memory.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
The hardware doesn't care, and a lot of Panfrost code relies on an
oversized buffer. The important part is that (stride *
padded_num_vertices) is no greater than size, which we'll need to check
once we validate instancing.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We don't need to dump the contents necessary, but having the stub with
the address is useful.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
I don't know who thought this mask was a good idea but unfortunately it
must have been me.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
If we permit more $whatever through than the shader needs, that's a bit
of a waste, but it isn't an error.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We don't actually care about the *contents* of the index buffer, but we
would rather like to ensure it is present and of the correct size.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We can infer these stats in many cases from the disassembly, so we
should try to sanity check where we can. We may need to be fuzzy about
analysis, since analysis gives us a bound but we don't mind if it's not
used fully by the shader.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We could do better by forcing the checks to *equal* zero (right now, an
indeterminate answer will pass the checks), but this is a start to guard
against some egregious cases.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
There are a number of conditions we need to test for to statically check
for TILE_RANGE_FAULTs, but once these checks are in order, we can print
as-is.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
These tags need to match up with what's actually described by the MFBD,
so check this. Once this is checked, since the type and contents of the
FBD are obvious from printing above, there's no need to explicitly mark
off the framebuffer line.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
For shaders using exclusively direct attribute/varyings, we can work
this out statically. For shaders with indirect access, we just set an
upper bound of 16 (the max attributes/varyings we support) and the
actual count will be reported regardless.
We proceed similarly for textures/samplers, as well as for UBOs. While
UBOs can be *indexed* indirectly, the *UBO itself* -- which is what we
count in the shader descriptor (rather than the UBO descriptors) -- is
statically determinable.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
This one is a little tricky, but the idea is that:
r16-r23 are always uniforms
r8-r15 are sometimes work, sometimes uniforms...
...but as work, they are always written before use
...and as uniforms, they are never written before use
So we use that heuristic to determine the count to feed the machine.
We'll record work register use in the next commit.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
There are no remaining users in-tree.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Panfrost is the only user of the macro; we are better off expanding than
having random stuff in nir.h.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Right now it always returns zero, but as of:
commit a48a6b8a40
Author: Adam Jackson <ajax@redhat.com>
Date: Tue Nov 14 15:13:05 2017 -0500
glx: Prepare driFetchDrawable for no-config contexts
We were hoping it would return true if the drawable could actually be
looked up. It wasn't, so that didn't go very well. With the most recent
update to <GL/glxext.h> glXQueryGLXPbufferSGIX (correctly) returns void,
so there's no longer anything else besides driFetchDrawable that depends
on the return value from __glXGetDrawableAttribute.
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Minor fixups required to keep the prototypes matching and to remove
mention of retired enums.
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
A bunch of remaining issues including some that affect users.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111248
Fixes: ee21bd7440 "radv/gfx10: implement NGG support (VS only)"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
We added this utility for vulkan where all timeouts are given as
uint64_t values. We can switch from signed to unsigned as this is the
only user and if we ever deal with signed integers somewhere else
we'll have to be careful to use the corresponding
timespec_(add|sub)_msec and always pass absolute values.
v2: Forgot to drop the test calling add_nsec() with a negative number
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Fixes: d2d70c3bb5 ("util: add a timespec helper")
Acked-by: Daniel Stone <daniels@collabora.com>
This fixes a regression introduced with scan&reduce operations
on GFX10. Note that some subgroups CTS still fail on GFX10 but
I assume it's a different issue.
This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*.
Fixes: 227c29a80d "amd/common/gfx10: implement scan & reduce operations"
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Commit fixes current crashes with Vulkan applications on Android.
Fixes: c0376a1234 "util: add anon_file.h for all memfd/temp file usage"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
v2: Pass through to oscreen rather than faking it (review from Marek).
Fixes: 0346b70083 ("gallium/screen: Add pipe_screen::resource_get_param")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>