Two advantages:
* When using NIR_DEBUG=nir_print_xx, will print outcome only if
there is a change
* We can use NIR_PASS(_, ...) instead of NIR_PASS_V, that has
slightly more validation checks.
This includes:
* v3d_nir_lower_image_load_store
* v3d_nir_lower_io
* v3d_nir_lower_line_smooth
* v3d_nir_lower_load_store_bitsize
* v3d_nir_lower_robust_buffer_access
* v3d_nir_lower_scratch
* v3d_nir_lower_txf_ms
As we are here we also simplify some of them by using the
nir_shader_instructions_pass helper.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17609>
The trigger for this commit was when we found that we were not calling
nir_metadata_preserve when lowering the layout code. But then I found
that it would be better to just update the code to use
nir_shader_instructions_pass, so we can avoid to manually:
* Initialize the nir_builder
* Call nir_foreach functions (we pass the callback)
* Call nir_metadata_preserve functions (that as mentioned we were not calling)
We also get a nice cleanup of several functions by reducing the number
of parameters (we pass a state struct).
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17609>
This represents an offset from the actual start of the image data,
not from the start of the memory allocation bound to the image.
Fixes:
dEQP-VK.image.subresource_layout.*
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17648>
From the Vulkan spec:
"If poolSizeCount is not 0, pPoolSizes must be a valid pointer to an
array of poolSizeCount valid VkDescriptorPoolSize structures"
So 0 is actually allowed and there is a CTS to check it is handled gracefully.
Fixes:
dEQP-VK.api.descriptor_pool.zero_pool_size_count
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17648>
This feature allows shaders to use pointers to buffers which may
not be bound via descriptor sets. Access to these buffers is done
via global intrinsics.
Because the buffers are not accessed through descriptor sets, any
live buffer flagged with VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT_KHR
can be accessed by any shader using global intrinsics, so the driver
needs to make sure all these buffers are mapped by the kernel when
it submits the job for execution.
We handle this by tracking if any draw call or compute dispatch in
a job uses a pipeline that has any such shaders. If so, the job is
flagged as using buffer device address and the kernel submission
for that job will add all live BOs bound to buffers flagged with the
buffer device address usage flag.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17275>
If we have 2 pipelines that consume the same push constant data
but where one of them only uses direct access and the other has
indirect access, a draw with the first pipeline would clear the
dirty flag without updating the UBO and by the time we bind and
draw with the second pipeline we won't upload the constants either
because the first draw cleared the dirty flag.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17536>
Since we allocate this ourselves we can immediately add it to the
job at the time we allocate it.
This also fixes a bug we introduced when we implemented inline
uniforms because since that commit, if we had an inline uniform
buffer at index 1 which happend to have indirect access we would
track it in slot 0 instead of slot 1, potentially overwriting
the push constant buffer reference.
Fixes: ea3223e7a4 ('v3dv: implement VK_EXT_inline_uniform_block')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17536>
We have code in there to allocate various segments of MAX_PUSH_CONSTANTS_SIZE
to handle the case of various draw calls in the same command buffer requiring
different push constants, so we are implicitly expecting it to be larger than
this. In fact, this only works now because when we allocate a BO we are always
at least allocating a full page, so the least we ever allocate is 4096 bytes,
so be explicit about it to avoid confusion.
Also, since we were always mapping MAX_PUSH_CONSTANTS_SIZE and the mapping
always starts at the beginning of the BO, it looks like after the first copy
when the resource offset is not zero, we would be writing outside the mapped
range. Always map the full size of the BO instead to ensure this doesn't
happen.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17536>
We have been always uploading MAX_PUSH_CONSTANTS_SIZE but now that
we track the actual size of the push constant buffer we can use
this instead.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17536>
If the command buffer didn't have any push constants or the meta
operation didn't write any new constants we don't need to restore
the state.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17536>
driFetchDrawable is only ever called from the MakeCurrent path, which
means it has to handle the case of pre-GLX-1.3 Windows being named as
the drawable. When it finds the drawable in the hash, it increments its
refcount before returning it, so for a GLXWindow it would be 2 on first
return, one from glXCreateWindow and one from glXMakeCurrent. But when
it does not find the drawable and creates one for the naked Window, the
reference count on first return would only be 1. As a result, if this
context was then ever bound to a different drawable, the old Window's
DRI drawable state (like the back buffer) would be destroyed.
Fixes piglit's glx-multi-window-single-context and glx-make-current for
a variety of drivers.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6713
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17479>
The compiler has improved significantly since we found this issue
and this is no longer required.
Notice that because we are increasing the number of samplers
supported beyond what we can loop unroll (currently capped at 16),
some piglit tests that test the maximum number of samplers supported
start to fail because they use indirect indexing on a sampler array
and we don't support that (previously the indirect indexing was
removed by loop unrolling). This is a bug in tests which the
GLSL linker detects, failing to compile the shaders.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17509>
Also, remove the FIXME to pre-compute this in images. We only use
this helper from copy/clear operations where we may be working
with a compatible framebuffer format instead of the original image.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17509>
It was checking "mesa's theoretical max attributes" rather than "the
driver's max attributes."
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17449>
This intrinsic is only produced when the compiler is instructed
to handle layer id as a system value, which we don't use. Also,
we have been supporting layered rendering for a while and passing
all the relevant tests which would've failed if we were hitting
this lowering.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17483>
When using the texel buffer copy path to copy a buffer we need to
sample from the buffer and for that we need a texture shader state
record where we specify the base offset of the texture (the buffer).
If the copy operation has a start offset we can't add that offset
to the base address of the buffer because the texture state record
requires the base pointer to be 64-byte aligned, so it would only
work for offsets that are multiple of 64B. Instead, we pass the
offset (in elements) to the shader and we use that to shift the
indices into the buffer when selecting the source texel to copy.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17482>
Over-estimating latency can cause us to delay the critical paths of
the shader unnecessarily, producing larger QPU programs that take more
time to execute as a result (and it also adds register pressure) so
striking a balance is important. The thread switching model in V3D
is quite effective at hiding latency and usuallly we just need to
hint it to delay TMU instructions a little bit to find the best
compromise for performance.
The new latency numbers have been chosen empirically by testing
V3DV with Sponza and a few UE4 samples.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17451>
Based on empirical testing with Sponza and a few UE4 samples this is
consistently slightly benefitial for performance.
The most likely reason why this helps is that thrsw is probably
already quite effective at hiding latency and we are already trying
to hide latency at NIR scheduling and also via TMU pipelining, so
piling up on this when scheduling QPU typically ends up providing no
benefit at all for latency and is instead possibly preventing us to
unblock critical paths in the shader that depend on the TMU result,
requiring us to execute more cycles to complete the program.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17451>
The hardware doesn't support 3D textures. We had been lying about 3D
texture level support in the past so that we got GL 2.1, but now reporting
levels==0 doesn't disable GL 2.1 (since we don't check for GL2 extensions
any more). But, by not lying, we now fix the majority of the remaining
GLES2 deqp failures.
This regresses a few desktop GL piglits which get GL errors that they
notice instead of what would be silent rendering failures on 3D texturing
operations.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17350>
We can produce slightly better code for these in the backend, so
do that. For this we need to:
1. Fix our implementation of uadd_carry (which wasn't used) to return
an integer instead of a boolean value.
2. Add an implementation of usub_borrow.
Notice these are only used in Vulkan. In GL these instructions are
always unconditionally lowered by the state tracker in GLSL IR so
we never get to see them in the backend.
Shader-db stats from a collection of Vulkan samples:
total instructions in shared programs: 122351 -> 122345 (<.01%)
instructions in affected programs: 196 -> 190 (-3.06%)
helped: 2
HURT: 0
total uniforms in shared programs: 18670 -> 18672 (0.01%)
uniforms in affected programs: 59 -> 61 (3.39%)
helped: 0
HURT: 2
total max-temps in shared programs: 13145 -> 13147 (0.02%)
max-temps in affected programs: 27 -> 29 (7.41%)
helped: 0
HURT: 2
total inst-and-stalls in shared programs: 123052 -> 123046 (<.01%)
inst-and-stalls in affected programs: 197 -> 191 (-3.05%)
helped: 2
HURT: 0
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17372>