This is way better than the stupid string approach, especially since you
could overflow the string. Again, I thought I had something better at one
point, but it obviously got lost.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
It was returning true if the function types have different lengths rather
than false. This was new with the SPIR-V to NIR pass and I thought I'd
fixed it a while ago but it may have gotten lost in rebasing somewhere.
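
A minimal sketch of the corrected comparison, for illustration only. The
struct and function names here are hypothetical and are not taken from the
SPIR-V to NIR pass; types are reduced to plain integer ids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified function type: a return type id plus a list
 * of parameter type ids. */
struct func_type {
   int return_type;
   size_t num_params;
   const int *params;
};

/* Returns true only if both signatures match exactly. */
static bool
func_types_equal(const struct func_type *a, const struct func_type *b)
{
   assert(a != NULL && b != NULL);

   /* Different lengths can never match, so this must be false.
    * Returning true here is the kind of bug described above. */
   if (a->num_params != b->num_params)
      return false;

   if (a->return_type != b->return_type)
      return false;

   for (size_t i = 0; i < a->num_params; i++) {
      if (a->params[i] != b->params[i])
         return false;
   }

   return true;
}
```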
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Replace the previously hardcoded values with the newly defined parameters.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Assign previously hardcoded values for OMX to newly defined
structure. As a result, OMX behaviour will not change at all.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Allow specifying more parameters in the encoding interface that were
previously just hardcoded in the encoder.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Otherwise we leak the resources created for the DMA-buf descriptors.
Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Tested-and-Reviewed by: Leo Liu <leo.liu@amd.com>
Ack-by: Tom St Denis <tom.stdenis@amd.com>
If a block might be entered from multiple locations, then the uniform
stream will (probably) be at different points, and we need to make sure
that it's pointing where we expect it to be. The kernel also enforces
that any block reading a uniform resets uniforms, to prevent reading
outside of the uniform stream by using looping.
With control flow, we can't be sure that we'll see the uses of a variable
before its def as we walk backwards. Given that NIR is eliminating our
long chains of dead code, a simple solution for now seems fine.
This slightly changes the order of some optimizations, and so an opt_vpm
happens before opt_dce, causing 3 dead MOVs to be turned into dead FMAXes
in Minecraft:
instructions in affected programs: 52 -> 54 (3.85%)
Previously, we could assume that a MOV from a temp was always an available
copy, because all temps were SSA in NIR, and their non-SSA state in QIR
was just due to the fact that they were from a bcsel or pack_unorm_4x8, so
we could use the current value of the temp after that series of QIR
instructions to define it.
However, this is no longer the case with control flow. Instead, we track
a new array of MOVs defined within the block that haven't had their source
or dest killed yet, and use that primarily. We fall back to looking
through the QIR defs array to handle across-block MOVs, but now require
that copies from the SSA defs have an SSA src as well.
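
A rough sketch of the per-block tracking idea, assuming temps are small
integer indices. All names below are hypothetical and the fallback to the
QIR defs array is not modeled; this only shows how a tracked MOV stops
being usable once its source or dest is written again.

```c
#include <assert.h>

#define MAX_TEMPS 16
#define NO_COPY (-1)

/* Hypothetical per-block state: copy_of[t] is the temp that t was last
 * MOVed from within this block, or NO_COPY if no live copy is known. */
struct block_copies {
   int copy_of[MAX_TEMPS];
};

/* Called at the start of each block: nothing is known yet. */
static void
block_copies_reset(struct block_copies *bc)
{
   for (int i = 0; i < MAX_TEMPS; i++)
      bc->copy_of[i] = NO_COPY;
}

/* A write to temp t kills every tracked MOV that has t as its dest
 * or as its source. */
static void
block_copies_kill(struct block_copies *bc, int t)
{
   assert(t >= 0 && t < MAX_TEMPS);
   bc->copy_of[t] = NO_COPY;
   for (int i = 0; i < MAX_TEMPS; i++) {
      if (bc->copy_of[i] == t)
         bc->copy_of[i] = NO_COPY;
   }
}

/* Record "MOV dst, src" seen in the current block. */
static void
block_copies_record_mov(struct block_copies *bc, int dst, int src)
{
   assert(src >= 0 && src < MAX_TEMPS);
   block_copies_kill(bc, dst);
   if (dst != src)
      bc->copy_of[dst] = src;
}

/* Returns the temp to read instead of t, or t itself if no copy is live. */
static int
block_copies_lookup(const struct block_copies *bc, int t)
{
   assert(t >= 0 && t < MAX_TEMPS);
   return bc->copy_of[t] != NO_COPY ? bc->copy_of[t] : t;
}
```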
v2 (Matt):
- Use brw_imm_df() as source argument of DIM instruction.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
According to HSW's PRM, vol02b, the DIM instruction has the following
restriction:
"Restriction : src0 must be immediate. src0 must specify the :f (F, Float)
type encoding but is an immediate 64-bit DF (Double Float) value. dst
must have type DF."
This commit allows uploading the immediate 64-bit DF value to the source
of a DIM instruction even when it has the float type encoding.
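
For illustration, a small sketch of getting the raw 64-bit pattern of a
double so it can be stored in an immediate field regardless of the type
encoding attached to it. The helper name is hypothetical and is not the
driver's API; it assumes IEEE 754 doubles.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the bit pattern of a double as a 64-bit
 * integer. memcpy avoids the aliasing problems of a pointer cast. */
static uint64_t
df_imm_bits(double value)
{
   uint64_t bits;

   assert(sizeof(value) == sizeof(bits));
   memcpy(&bits, &value, sizeof(bits));
   return bits;
}
```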
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
v2 (Matt):
- Take a DF source argument for the DIM instruction emission
in the visitors.
- Indentation.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This reverts commit d8d6091a84.
Heap allocations may be only 8-byte aligned on 32-bit systems, so having
members with 16-byte alignment (such as in the case where pipe_blend_color
is embedded in radeonsi's si_context) is undefined behavior, which indeed
causes crashes when compiled with gcc -O3.
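
A sketch of the hazard, using hypothetical stand-in structs rather than
the real pipe_blend_color and si_context. A plain malloc() only promises
the platform's default alignment, so an over-aligned type has to be
allocated with an explicit alignment (C11 aligned_alloc here) for the
compiler's assumption to hold. This is shown only to illustrate the
problem; it is not what this commit does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a struct that demands 16-byte alignment. */
struct blend_color16 {
   _Alignas(16) float color[4];
};

/* Hypothetical context embedding the over-aligned member. */
struct context {
   int id;
   struct blend_color16 blend;
};

static int
is_aligned(const void *ptr, size_t alignment)
{
   return ((uintptr_t)ptr % alignment) == 0;
}

/* Allocate with the alignment the type requires instead of relying on
 * whatever malloc() happens to return. sizeof() of a type is always a
 * multiple of its alignment, as aligned_alloc() expects. */
static struct context *
context_create(void)
{
   struct context *ctx = aligned_alloc(_Alignof(struct context),
                                       sizeof(struct context));
   if (ctx == NULL)
      return NULL;

   ctx->id = 0;
   assert(is_aligned(&ctx->blend, 16));
   return ctx;
}
```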
Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96835
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
Acked-by: Chuck Atkins <chuck.atkins@kitware.com>
When we initially dropped bpb in favor of bs, we accidentally didn't change
this one line properly. This brings it back to what it should be.
Reviewed-by: Chad Versace <chad.versace@intel.com>
A while ago we got rid of the bits-per-block because we thought we didn't
need it. We're about to introduce some very useful 1 and 2-bit formats so
we really should be able to handle them again.
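
To illustrate why a size in bits is needed: with a size in whole bytes, a
1-bit or 2-bit element rounds down to zero. A minimal sketch with a
hypothetical helper name, not the ISL code itself:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for one row of width_el elements, each bits_per_el bits
 * wide, rounded up to a whole number of bytes. */
static uint32_t
row_size_bytes(uint32_t width_el, uint32_t bits_per_el)
{
   assert(bits_per_el > 0);

   /* Do the multiply in 64 bits so wide rows cannot overflow. */
   uint64_t bits = (uint64_t)width_el * bits_per_el;
   return (uint32_t)((bits + 7) / 8);
}
```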
Reviewed-by: Chad Versace <chad.versace@intel.com>
This is based on a very long set of discussions between Chad and myself
about how we should properly represent HiZ and CCS buffers. The end result
of that discussion was that a tiling actually has two different sizes, a
logical size in elements, and a physical size in bytes and rows. This
commit reworks ISL's pitch and size calculations to work in terms of these
two sizes.
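
A sketch of the two-size idea, with hypothetical struct and function
names. The example numbers in the usage note are illustrative: a 4 KiB
tile of 128 bytes by 32 rows that holds 32x32 four-byte elements.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a tiling: the same tile measured twice. */
struct tile_info {
   struct { uint32_t w, h; } logical_el; /* logical size in elements */
   struct { uint32_t w, h; } phys;       /* physical size: bytes x rows */
};

/* Row pitch in bytes for a surface width_el elements wide, padded out
 * to a whole number of tiles. */
static uint32_t
tiled_row_pitch_bytes(const struct tile_info *t, uint32_t width_el)
{
   assert(t->logical_el.w > 0);
   uint32_t tiles = (width_el + t->logical_el.w - 1) / t->logical_el.w;
   return tiles * t->phys.w;
}

/* Number of physical rows for a surface height_el elements tall. */
static uint32_t
tiled_rows(const struct tile_info *t, uint32_t height_el)
{
   assert(t->logical_el.h > 0);
   uint32_t tiles = (height_el + t->logical_el.h - 1) / t->logical_el.h;
   return tiles * t->phys.h;
}
```

With that example tile, a 100x50 element surface needs 4 tiles across and
2 tiles down, so the pitch is 512 bytes and there are 64 rows.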
Reviewed-by: Chad Versace <chad.versace@intel.com>
We helpfully inserted a PRM quotation about how we need to use
ARRAY_PITCH_SPAN_FULL and then set it to COMPACT. Oops...
Reviewed-by: Chad Versace <chad.versace@intel.com>
The row pitch already specifies the size of a row of elements.
Multiplying by the block height simply causes us to allocate as much as 12
times more memory than needed for compressed textures.
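
A sketch of the intended arithmetic, with a hypothetical helper name. The
row pitch is taken to be the size in bytes of one row of blocks, so the
total is the pitch times the number of block rows, with no further
multiplication by the block height.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a compressed image: one row pitch per row of blocks.
 * height_px is in pixels, block_h is the block height in pixels. */
static uint64_t
compressed_size_bytes(uint32_t row_pitch_bytes, uint32_t height_px,
                      uint32_t block_h)
{
   assert(block_h > 0);

   /* Round up so a partial last row of blocks is still counted. */
   uint32_t block_rows = (height_px + block_h - 1) / block_h;
   return (uint64_t)row_pitch_bytes * block_rows;
}
```

For a 24-pixel-tall image with 12-pixel-tall blocks and a 256-byte pitch
this gives 512 bytes; also multiplying by the block height would give
6144 bytes, 12 times too much.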
Reviewed-by: Chad Versace <chad.versace@intel.com>
I have no idea why we were multiplying by 4 before. The offsets we get
from SPIR-V are in bytes and so is nir->num_uniforms so there's no need to
do any adjustment whatsoever.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
v2: use abort(), describe which LLVM version is affected
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Not sure if this is the right way to do it, but it seems to work.
v2: make it a no-op on LLVM <= 3.5
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>