From the GL_OES_texture_buffer spec:
"If no buffer object is bound to the buffer texture, the results
of the texel access are undefined."
this can be interpreted as allowing any result to come back, but not
terminate the program.
The current solution is not entirely complete, as it could still try to
get a wrong address for the shader state address.
This can be checked with piglit test
arb_texture_buffer_object-render-no-bo; the test is skip because it
requires OpenGL 3.1, but if overriding the version then it will crash.
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13409>
This commit handles the support for texture buffer objects. In general
it is mostly about using the buffer info from the pipe_image_view
instead of the texture info.
v2:
- Rework some assertions (Iago)
- Remove needless comment (Alejandro)
- Fix comment typos (Iago)
v3:
- Fix typos (Iago)
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13409>
We follow the same approach that v3d. We compile the files that
depends on the version several times, passing a different version each
time. We link all those per-version libs on the main library.
Note that right now we only support version == 42, so the array of
supported versions is one-sized.
Also note that although we were doing a previous work to split
hw-version dependant code from general code, this is the first commit
that only inject the current V3D_VERSION on the former.
We have two cases where we hardcode the V3D_VERSION (as a full
wrapping would be an overkill) that we need to include here to avoid
warnings/errors if we do that before or after.
Having some exceptions also happens on v3d. As we are here we add some
comment on v3d clarifying that.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11310>
As the driver was renamed in the past from VC5 to V3D, let's rename also
the definitions and enumerations to keep it consistent across the code.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10402>
We had an unnecessary case in our uniforms upload switch statement, since
we no longer advertise the cap.
Fixes: 8ad931808e ("v3d: do not report alpha-test as supported")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8601>
pipe_alpha_state and pipe_depth_state will be packed together
because they have only a few bitfields each. This will eventually
remove 4 bytes of padding in pipe_depth_stencil_alpha_state.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7940>
When line smoothing is enabled, the driver now increases the width of
the line so that it can add some semi-transparent pixels to either side
of the line. A lowering pass is added which modifies the alpha component
of every write to fragment output 0 so that if the fragment is outside
the width of the line then the alpha is reduced. It additionally
discards fragments that are completely invisible. It might seem bad to
use discard on a tiled renderer but the assumption is that any bad
effects from using discard will also happen anyway because of enabling
alpha blending.
v2: Disable the line smoothing pass entirely when the framebuffer
contains an integer colour output or one with no alpha channel.
Calculate the coverage once upfront and store in a global variable
instead of calculating each time an output write is modified. Also
do the conditional discard once upfront.
v3: Don’t check whether the output buffer has an alpha channel. Only
look at output 0. Use aa_line_width intrinsic instead of calculating
the real line width in the shader. Clamp the coverage as part of the
global variable, not per output write.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5624>
When geometry shaders write a value to gl_Layer that doesn't correspond to
an existing layer in the target framebuffer the rendering behavior is
undefined according to the spec, however, there are CTS tests that trigger
this scenario on purpose, probably to ensure that nothing terrible happens.
For V3D, this situation is problematic because the binner uses the layer
index to select the offset to write into the tile state data, and we only
allocate tile state for MAX2(num_layers, 1), so we want to make sure we
don't produce values that would lead to out of bounds writes. The simulator
has an assert to catch this, although we haven't observed issues in actual
hardware it is probably best to play safe.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
This is good enough to get basic GS workloads working, later patches will
improve this by adding instancing support, proper SIMD configuration, etc.
Notice that most of the TESSELLATION_GEOMETRY_SHADER_PARAMS fields are only
relevant when tessellation shaders are present. We do not support tessellation
yet, but we still need to fill in these tessellation state with default values
since our packing functions require some of these to have non-zero values.
v2:
- Add a comment in the code explaining why we fill in
tessellation fields (Alejandro)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
We were always ensuring a minimum size of 4 bytes for uniforms
for the case where we don't have any, to account for hardware pre-fetching
of the uniform stream, however, pre-fetching could also lead to to out
of bounds reads when have read the last uniform in the stream, so we
probably want to have the extra 4 bytes to prevent the kernel from
observing invalid memory accesses when the uniform stream sits right at
the end of a page.
This seems to fix MMU exceptions reported with a Linux 5.4 kernel.
Credit goes to Phil Elwell for identifying the problem and narrowing
it down to memory accesses in the uniform stream.
Reported-by: Phil Elwell <phil@raspberrypi.org>
Tested-by: Phil Elwell <phil@raspberrypi.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Now that the UAPI has landed, add the pipe_context function for
dispatching compute shaders. This is the last major feature for GLES 3.1,
though it's not enabled quite yet.
As introduced in "v3d: flag dirty state when binding new sampler states"
we need to add support for compute states. New flag VC5_DIRTY_COMPTEX and
VC5_DIRTY_UNCOMPILED_CS are introduced.
Reaching 33 flags at the dirty field forces us to change the type to
uint_64. Flags are reordered and empty continuous bits are available
for future pipeline stages.
v2: Update flag conditions to compile cs shader. (Eric Antholt)
Now dirty flags use uint_64t and flags are reordered.
Added VC5_DIRTY_UNCOMPILED_CS flag.
Reviewed-by: Eric Anholt <eric@anholt.net>
While waiting for the CSD UABI to get reviewed, I keep having to rebase
the CS patch. Just land the compiler side for now to keep it from
diverging.
For now this covers just GLES 3.1 compute shaders, not CL kernels.
The idea was that we could skip uploading the constant-indexed uniform
data and just upload the uniforms that are variably-indexed. However,
since the VS bin and render shaders may have a different set of uniforms
used, this meant that we had to upload the UBO for each of them. The
first case is generally a fairly small impact (usually the uniform array
is the most space, other than a couple of FSes in shader-db), while the
second is a larger impact: 3DMMES2 was uploading 38k/frame of uniforms
instead of 18k.
Given that the optimization is of dubious value, has a big downside, and
is quite a bit of code, just drop it. No change in shader-db. No change
on 3DMMES2 (n=15).
We'd end up with the constant offset in the uniform stream anyway, since
they're bigger than small immediates. Avoids the extra uniforms and adds
in the shader in favor of just adding once on the CPU.
shader-db:
total instructions in shared programs: 6496865 -> 6494851 (-0.03%)
total uniforms in shared programs: 2119511 -> 2117243 (-0.11%)
I want to reuse this for encoding small constant UBO/SSBO offsets into the
uniform stream to reduce the extra uniform loads and adds for the small
constant offsets.
The sampler border color is encoded in the TMU's blending format (half
floats, 32-bit floats, or integers) and must be clamped to the format's
range unorm/snorm/int ranges by the driver. Additionally, the TMU doesn't
know about how we're abusing the swizzle to support BGRA, A, and LA, so we
have to pre-swizzle the border color for those.
We don't really want to spend half a kb on sampler states in most cases,
so skip generating the variants when the border color is unused or is
0,0,0,0.
This is only exposed on V3D 4.1+, because we didn't have the TMU write
operations for images on 3.3 (To do GLES 3.1 there, you have to lower it
to SSBO load/stores, which is a problem to solve later).
Follows 3954331aff ("vc4: Pull uinfo->data[i] dereference out to the top
of the loop.") which showed a large performance win for vc4, but also
cleans up the code a decent bit.
Just like vc4, we have to support linear shared BOs for X11 on arbitrary
displays. When we're faced with a request to texture from one of those,
make a shadow image that we copy using the TFU at the start of the draw
call.