They indicate the operation does not cause overflow or underflow.
This is motivated by SPIR-V decorations NoSignedWrap and
NoUnsignedWrap.
Change the storage of `exact` to be a single bit, so they pack
together.
v2: Handle no_wrap in nir_instr_set. (Karol)
v3: Use two separate flags, since the NIR SSA values and certain
instructions are typeless, so just no_wrap would be insufficient
to know which one was referred to. (Connor)
v4: Unlike `exact`, don't use nir_instr_set to propagate the flags;
instead, consider the instructions different if the flags have
different values. Fix hashing/comparing. (Jason)
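
A minimal sketch of the idea, not the actual NIR code: the struct and
function names below are hypothetical stand-ins. It shows the three
flags packed as single bits, with the two no-wrap flags taking part in
hashing and comparison while `exact` is left out of both.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an ALU instruction; not the real NIR struct. */
struct alu_flags_sketch {
   unsigned op;
   /* Single-bit fields so the three flags pack together. */
   bool exact : 1;
   bool no_signed_wrap : 1;
   bool no_unsigned_wrap : 1;
};

/* The no-wrap flags feed the hash; `exact` deliberately does not. */
static uint32_t
hash_alu_sketch(const struct alu_flags_sketch *alu)
{
   assert(alu);
   uint32_t flags = (uint32_t)alu->no_signed_wrap |
                    ((uint32_t)alu->no_unsigned_wrap << 1);
   return (alu->op * 31u) ^ flags;
}

/* Instructions whose no-wrap flags differ are treated as different. */
static bool
alu_equal_sketch(const struct alu_flags_sketch *a,
                 const struct alu_flags_sketch *b)
{
   assert(a && b);
   return a->op == b->op &&
          a->no_signed_wrap == b->no_signed_wrap &&
          a->no_unsigned_wrap == b->no_unsigned_wrap;
}
```

Two instructions that differ only in `exact` compare equal here, while a
difference in either no-wrap flag keeps them distinct.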
Reviewed-by: Karol Herbst <kherbst@redhat.com> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
There doesn't seem to be any reason to keep these opcodes around:
* fnot/fxor are not used at all.
* fand/for are only used in lower_alu_to_scalar, but easily replaced
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
We can vectorize instructions with different constant sources by creating
a new load_const and using that.
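
As a rough illustration only (the types and helper below are
hypothetical and not part of NIR), combining the constant sources of
two instructions into one new, wider constant could look like this:

```c
#include <assert.h>

#define MAX_COMPS 4

/* Hypothetical constant value with up to four float components. */
struct const_vec_sketch {
   unsigned num_components;
   float value[MAX_COMPS];
};

/* Build a new constant holding the components of a followed by those of b.
 * The vectorized instruction would then read this new constant instead of
 * the two original ones. */
static struct const_vec_sketch
merge_consts_sketch(const struct const_vec_sketch *a,
                    const struct const_vec_sketch *b)
{
   assert(a->num_components + b->num_components <= MAX_COMPS);

   struct const_vec_sketch out = { 0 };
   for (unsigned i = 0; i < a->num_components; ++i)
      out.value[out.num_components++] = a->value[i];
   for (unsigned i = 0; i < b->num_components; ++i)
      out.value[out.num_components++] = b->value[i];
   return out;
}
```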
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
In Midgard, a bundle consists of a few ALU instructions. Within the
bundle, there is room for an optional 128-bit constant; this constant is
shared across all instructions in the bundle.
Unfortunately, many instructions want a 128-bit constant all to
themselves (how selfish!). If we run out of space for constants in a
bundle, the bundle has to be broken up, incurring a performance and
space penalty.
As an optimization, the scheduler now analyzes the constants coming in
per-instruction and attempts to merge shared components, adjusting the
swizzle accessing the bundle's constants appropriately. Concretely,
given the GLSL:
(a * vec4(1.5, 0.5, 0.5, 1.0)) + vec4(1.0, 2.3, 2.3, 0.5)
instead of compiling to the naive two bundles:
vmul.fmul [temp], [a], r26
fconstants 1.5, 0.5, 0.5, 1.0
vadd.fadd [out], [temp], r26
fconstants 1.0, 2.3, 2.3, 0.5
The scheduler can now fuse into a single (pipelined!) bundle:
vmul.fmul [temp], [a], r26.xyyz
vadd.fadd [out], [temp], r26.zwwy
fconstants 1.5, 0.5, 1.0, 2.3
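
The merge can be sketched roughly as follows. This is an illustration
under simplifying assumptions, not the scheduler's actual code: all
names are hypothetical, the bundle constant is modelled as four float
slots, and components are matched with a plain float comparison.

```c
#include <assert.h>
#include <stdbool.h>

#define BUNDLE_SLOTS 4

/* Hypothetical model of a bundle's shared 128-bit constant. */
struct bundle_consts_sketch {
   unsigned count;             /* slots currently in use */
   float value[BUNDLE_SLOTS];
};

/* Try to place an instruction's four constant components into the bundle,
 * reusing slots that already hold an equal value. On success, swizzle[i] is
 * the slot index for component i (0 = x, 1 = y, 2 = z, 3 = w) and the bundle
 * is updated. On failure the bundle is left unchanged; swizzle contents are
 * then unspecified. */
static bool
merge_into_bundle_sketch(struct bundle_consts_sketch *bundle,
                         const float consts[4], unsigned swizzle[4])
{
   assert(bundle->count <= BUNDLE_SLOTS);
   struct bundle_consts_sketch tmp = *bundle;

   for (unsigned i = 0; i < 4; ++i) {
      unsigned slot;
      for (slot = 0; slot < tmp.count; ++slot) {
         if (tmp.value[slot] == consts[i])
            break;
      }
      if (slot == tmp.count) {
         if (tmp.count == BUNDLE_SLOTS)
            return false;   /* out of room: the bundle must be split */
         tmp.value[tmp.count++] = consts[i];
      }
      swizzle[i] = slot;
   }

   *bundle = tmp;
   return true;
}
```

Feeding in the two constants from the GLSL above, in order, fills the
slots with 1.5, 0.5, 1.0, 2.3 and yields the swizzles xyyz and zwwy.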
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
In the past, each query object had its own BO. Checking if the batch
referenced that BO was an easy way to check if commands were still
queued to compute the query value. If so, we needed to flush.
More recently (c24a574e6c), we started using a u_upload_mgr for query
objects, placing multiple queries in the same BO. One side-effect is
that iris_batch_references is no longer a reasonable way to check if
commands are still queued for our query. Ours might be done, but a
later query that happens to be in the same BO might be queued. We don't
want to flush in that case.
Instead, check if the current batch's signalling syncpt is the one we
referenced when ending the query. We know the syncpt can't have been
reused because our query is holding a reference, so a simple pointer
comparison should suffice.
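
Roughly, the check amounts to the following. The structs and the
function name here are hypothetical placeholders rather than the
driver's real types:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical placeholder types; the real driver structures differ. */
struct syncpt_sketch { int refcount; };
struct batch_sketch { struct syncpt_sketch *signal_syncpt; };
struct query_sketch { struct syncpt_sketch *syncpt; /* referenced at end */ };

/* A flush is only needed while the batch that will signal the query's
 * syncpt has not been submitted yet, i.e. while the current batch's
 * signalling syncpt is still the one the query holds a reference to.
 * Because the query keeps that reference, the pointer cannot be reused
 * for another syncpt, so comparing pointers is enough. */
static bool
query_needs_flush_sketch(const struct batch_sketch *batch,
                         const struct query_sketch *q)
{
   assert(batch && q);
   return q->syncpt != NULL && q->syncpt == batch->signal_syncpt;
}
```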
Removes all batch flushing caused by query objects in Shadow of Mordor.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
This returns a pointer to the signalling syncpt, without incrementing
the reference count. This can be useful for comparisons.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
The ss local var is guaranteed to be != NULL. Get rid of this useless
check.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
"Collabora, Ltd." should be listed in lieu of simply "Collabora"
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Suggested-by: Daniel Stone <daniels@collabora.com>
This support requires the driver to be a NIR driver as we use the
NIR lowering pass to do the clamping.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
A ton of tests were fixed by this series. A few were incorrectly passing
before (QualityError, for instance) and now are explicitly failing. There
are a few legitimate regressions, but the result is overwhelmingly positive.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes cube map tests due to disagreements between Mesa, dEQP, and the
spec...
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
u_blitter gets "special treatment" and uses this mechanism to cast
cube maps to 2D textures in order to texelFetch them.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
In a vertex shader, a tex op should map to txl, as there *must* be a LOD
given to the hardware (implicitly or explicitly).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Identify the seamless cubemap bit and pass through the Gallium state
rather than setting it unconditionally.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
This is similar to the AFBC merge; now all (non-imported) buffers use a
common backing buffer. Reenables checksumming, eliminating a performance
regression.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
I thought I already fixed this. Maybe that was a dream...? Then again, I
might be dreaming now.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
The main ctx->blitter instance should be reserved for blits originated
from Gallium (like mipmap generation). Since wallpapering is
conceptually different -- wallpaper blits can be triggered by Gallium
blits -- the blitter pipes must be separate to avoid potential u_blitter
recursion.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Rather than tracking AFBC memory "specially", just use the same codepath
as linear and tiled. Fewer things to mess up, I figure. This allows us to
use the standard setup_slices() call with AFBC resources, allowing
mipmapped AFBC resources.
Unfortunately, we do have to disable AFBC (and checksumming) in the
meantime to avoid functional regressions, as we don't know _a priori_ if
we'll need to access a resource from software (which is not yet hooked
up with AFBC) and we don't yet have routines to switch the layout of a
BO at runtime.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
As far as we know, Utgard-style tiling only works for color render
targets, not depth/stencil, so ensure we don't try to tile it (rather
than compress or plain old linear) and drive ourselves into a corner.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Now that the autogeneration of mipmaps is working (via u_blitter), we can
finally enable mipmaps!
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Now that all the prerequisites breaking u_blitter are fixed, we can
finally hook up panfrost_blit.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
txf instructions can result from blits, so handle them rather than
crash. Only works for 2D textures (not even 2D array textures) due to a
register allocation constraint that may not be sorted for a while.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>