We get a warning here for assigning a const char * pointer to
char *swizzle in struct ir2_src_register. The constructor strdups a 4
byte string here, so just memcpy to that instead.
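A minimal sketch of the fix described above, assuming a simplified stand-in for struct ir2_src_register (the struct and function names below are hypothetical, and malloc plus memcpy stands in for the strdup call):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct ir2_src_register; only the field
 * discussed above is modelled. */
struct src_register {
    char *swizzle;   /* owned, writable 4-character string plus NUL */
};

/* Constructor: allocates a writable copy of a 4-character swizzle,
 * which is what the strdup in the real constructor achieves. */
static int src_register_init(struct src_register *reg)
{
    reg->swizzle = malloc(5);
    if (!reg->swizzle)
        return -1;
    memcpy(reg->swizzle, "xyzw", 5);
    return 0;
}

/* Copy the new swizzle into the existing buffer instead of pointing
 * reg->swizzle at a const string, which is what triggers the warning. */
static void src_register_set_swizzle(struct src_register *reg, const char *swz)
{
    memcpy(reg->swizzle, swz, 4);
}
```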
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
In the pursuit of lowering driver overhead, it became clear that some
amount of redesign of how libdrm_freedreno constructs the submit ioctl
would be needed. In particular, as the gallium driver is starting to
make heavier use of CP_SET_DRAW_STATE state groups/objects, the over-
head of tracking cmd buffers and relocs becomes too much. And for
"streaming" state, which isn't ever reused (like uniform uploads) the
overhead of allocating/freeing ringbuffer[1] objects is too high.
This redesign makes two main changes:
1) Introduces a fd_submit object for tracking bos and cmds table
for the submit ioctl, making ringbuffer objects more light-
weight. This was previously done in the ringbuffer. But we
have many ringbuffer instances involved in a submit (gmem +
draw + potentially 1000's of state-group rbs), and only need
a single bos and cmds table. (Reloc table is still per-rb)
The submit is also a convenient place for a slab allocator for
ringbuffer objects. Other options would have required locking
because, while we can guarantee allocations will only happen on
a single thread, free's could happen either on the application
thread or the flush_queue thread. With the slab allocator in
the submit object, any frees that happen on the flush_queue
thread happen after we know that the application thread is done
with the submit.
2) Introduce a new "softpin" msm_ringbuffer_sp implementation that
does not use relocs and only has cmds table entries for IB1 (ie.
the cmdstream buffers that kernel needs to CP_INDIRECT_BUFFER
to from the RB). To do this properly will require some updates
on the kernel side, so whether you get the softpin or legacy
submit/ringbuffer implementation at runtime depends on your
kernel version.
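
The first change can be illustrated with a rough sketch. Everything below is hypothetical and heavily simplified (fixed-size arrays instead of a real slab allocator and hashtable, invented names); it only shows the ownership idea: one bos table per submit, and ringbuffer objects that are allocated from, and released together with, the submit.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_BOS   64
#define MAX_RINGS 128

struct submit;

/* Lightweight ring object: it holds no bos table of its own. */
struct ring {
    struct submit *submit;
    uint32_t ndwords;
};

struct submit {
    uint32_t bo_handles[MAX_BOS];  /* single bos table for the whole submit */
    unsigned nr_bos;
    struct ring rings[MAX_RINGS];  /* fixed pool standing in for a slab */
    unsigned nr_rings;
};

static struct submit *submit_new(void)
{
    return calloc(1, sizeof(struct submit));
}

/* Allocation happens only on the thread building the submit: no locking. */
static struct ring *submit_new_ring(struct submit *s)
{
    if (s->nr_rings == MAX_RINGS)
        return NULL;
    struct ring *r = &s->rings[s->nr_rings++];
    r->submit = s;
    r->ndwords = 0;
    return r;
}

/* Returns the index of the handle in the shared table, adding it if new. */
static int ring_ref_bo(struct ring *r, uint32_t handle)
{
    struct submit *s = r->submit;
    for (unsigned i = 0; i < s->nr_bos; i++)
        if (s->bo_handles[i] == handle)
            return (int)i;
    if (s->nr_bos == MAX_BOS)
        return -1;
    s->bo_handles[s->nr_bos] = handle;
    return (int)s->nr_bos++;
}

/* Every ring is released at once, after the submit is finished. */
static void submit_del(struct submit *s)
{
    free(s);
}
```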
To make all these changes in libdrm would basically require adding a
libdrm_freedreno2, so this is a good point to just pull the libdrm code
into mesa. Plus it allows for using mesa's hashtable, slab allocator,
etc. And it lets us have asserts enabled for debug mesa builds but
omitted for release builds. And it makes life easier if further API
changes become necessary.
At this point I haven't tried to pull in the kgsl backend. Although
I left the level of vfunc indirection which would make it possible
to have other backends. (And this was convenient to keep to allow
for the "softpin" ringbuffer to coexist.)
NOTE: if bisecting a build error takes you here, try a clean build.
There are a bunch of ways things can go wrong if you still have
libdrm_freedreno cflags.
[1] "ringbuffer" is probably a bad name, the only level of cmdstream
buffer that is actually a ring is RB managed by kernel. User-
space cmdstream is all IB1/IB2 and state-groups.
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
This reverts commit 0fa9e6d7b3. The real
issue appears to have been that HiZ ops don't like having WM thread
dispatch force-enabled. The previous commit fixes that problem so we
can go back to using the ForceThreadDispatchEnable bit even on SKL+.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Usually when a window is resized, the app calls d3d to resize the back
buffer to the window size. In some cases this is not done,
and the app expects the output to be resized to the window size, even if
the back buffer size is unchanged.
This patch introduces the behaviour when a presentation buffer
is used.
ID3DPresent_GetWindowInfo is a function available with
D3DPresent v1.0, and thus we don't need to check if the
function is available.
The function had been introduced to implement this very
feature.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Windows drivers don't set this flag (which affects ff) to more than 8.
Do the same in case some games check for 8.
v2: Remove any dependence on MaxSimultaneousTextures. For non-ff
the number of textures is 16 when the device is capable of vs/ps3.
Add this requirement of 16 textures to the driver requirements.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
We didn't implement shadow textures for ps 1.X,
assuming the case couldn't happen...
Well it does.
Fixes: https://github.com/iXit/Mesa-3D/issues/261
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
A lot of these states are used only for the context,
and are unused for stateblocks (which just uses the
changed.* fields instead for a lot of them).
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
If NINE_STATE_FF_MATERIAL is set, the stateblock will upload
its recorded materials matrix.
If NINE_STATE_FF_LIGHTING is set, the lighting set is uploaded.
These flags could be set by a NineDevice9_SetTransform call
or by setting some states related to ff, but that shouldn't trigger
these stateblock behaviours.
We don't need to follow the context states dirtied by render states.
NINE_STATE_FF_VSTRANSF is exactly the state controlling stateblock
updates of transformation matrices, NINE_STATE_FF is too broad.
These two changes avoid setting the two mentioned states when we
shouldn't.
Fixes: https://github.com/iXit/Mesa-3D/issues/320
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
The device state changed.* fields are never used.
These fields are used only for stateblocks.
Avoid setting them at all for clarity.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
We avoid allocating space for matrices that are never used.
However we must behave as if we had captured them.
Thus when a D3DSBT_ALL stateblock being applied has fewer matrices
than the device state, allocate the default matrices for the stateblock
before applying.
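
A minimal sketch of that allocation step, under assumptions: the types and function names are hypothetical, and the default matrix is taken to be the identity (the D3D9 default for transforms).

```c
#include <stdlib.h>

struct mat4 { float m[4][4]; };

/* Hypothetical, simplified fixed-function state of a stateblock. */
struct ff_state {
    struct mat4 *transform;
    unsigned num_transforms;
};

static void mat4_identity(struct mat4 *mat)
{
    for (int r = 0; r < 4; r++)
        for (int c = 0; c < 4; c++)
            mat->m[r][c] = (r == c) ? 1.0f : 0.0f;
}

/* Grow the array to at least n matrices, filling every new slot with the
 * default (identity) matrix. Returns 0 on success, -1 on allocation failure. */
static int ff_state_reserve(struct ff_state *st, unsigned n)
{
    if (n <= st->num_transforms)
        return 0;
    struct mat4 *grown = realloc(st->transform, n * sizeof(*grown));
    if (!grown)
        return -1;
    for (unsigned i = st->num_transforms; i < n; i++)
        mat4_identity(&grown[i]);
    st->transform = grown;
    st->num_transforms = n;
    return 0;
}
```

Calling ff_state_reserve(stateblock, device->num_transforms) before the apply would make the stateblock look as if it had captured the defaults.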
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
D3DSBT_ALL stateblocks capture the transform matrices.
Fixes some d3d test programs not displaying properly.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
While we have to accurately track all 256 world
matrices for the application (including in
stateblocks), hw vertex processing lets us limit,
through the advertised caps, the number of world
matrices the hardware can access,
which is 8 for nine.
Thus don't bother in the stateblock code to send
the updated values for the unreachable matrices.
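
As a sketch only (hypothetical names, with a bool array standing in for the real upload path), the filtering amounts to:

```c
#include <stdbool.h>

#define NUM_WORLD_MATRICES 256  /* tracked for the application */
#define HW_WORLD_MATRICES    8  /* reachable by hw, per the advertised caps */

/* Marks which changed matrices get forwarded and returns how many.
 * Indices >= HW_WORLD_MATRICES are never looked at. */
static unsigned forward_world_matrices(const bool changed[NUM_WORLD_MATRICES],
                                       bool forwarded[HW_WORLD_MATRICES])
{
    unsigned count = 0;
    for (unsigned i = 0; i < HW_WORLD_MATRICES; i++) {
        forwarded[i] = changed[i];
        if (changed[i])
            count++;
    }
    return count;
}
```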
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
At some point the plan was to adapt the
commented-out version to csmt.
The csmt rework made it possible to fix some state aliasing
issues between stateblocks and internal state updates.
The commented-out version needs a lot of work to work with that.
Just drop it.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
This lets us get rid of a bunch of duplicated error messages.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
This reverts commit a5fd54f8bf.
The whole point was to add a way to pass -DVMX86_STATS to the build,
but we can do that with a command line argument when we invoke scons.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Follow the restriction of making sure the clear value is between the min
and max values defined in CC_VIEWPORT. Avoids a simulator warning for
some piglit tests, one of them being:
./bin/depthstencil-render-miplevels 146 d=z32f_s8
Jason found this to fix incorrect clearing on SKL.
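
The restriction boils down to a clamp. A minimal sketch, assuming the CC_VIEWPORT min and max depth are available as plain floats (the function name is hypothetical):

```c
/* Clamp a depth clear value into the [min_depth, max_depth] range taken
 * from CC_VIEWPORT. Assumes min_depth <= max_depth. */
static float clamp_depth_clear(float value, float min_depth, float max_depth)
{
    if (value < min_depth)
        return min_depth;
    if (value > max_depth)
        return max_depth;
    return value;
}
```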
Fixes: 09948151ab
("intel/blorp: Add the BDW+ optimized HZ_OP sequence to BLORP")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Jason Ekstrand <jason@jlekstrand.net>
MESA_GIT_SHA1 resolves to either an empty "" string if not built from git,
or " (git-DEADBEEF)" if it is. No need to wrap it in additional "()".
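
For illustration, a small sketch of the concatenation, with the sha passed as a parameter so both cases can be shown (the function name and the version string used with it are hypothetical):

```c
#include <stdio.h>

/* git_sha1 is expected to be either "" or " (git-XXXXXXXX)", so it is
 * appended directly with no extra parentheses or separator. */
static int format_driver_info(char *buf, size_t size,
                              const char *version, const char *git_sha1)
{
    return snprintf(buf, size, "Mesa %s%s", version, git_sha1);
}
```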
Fixes: 9d40ec2cf6 "radv: Add support for VK_KHR_driver_properties."
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Use utility function for converting h264 pipe video profile to profile idc,
instead of using array.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
Use utility function for converting h264 pipe video profile to profile idc,
instead of using array.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
Adding a function for converting h264 pipe video profile to profile idc
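
A sketch of what such a conversion can look like. The enum and function names are hypothetical, not the ones added by this commit; the profile_idc values are the ones defined by the H.264 specification.

```c
/* Hypothetical profile enum; the real pipe video profile enum differs. */
enum h264_profile {
    H264_PROFILE_BASELINE,
    H264_PROFILE_MAIN,
    H264_PROFILE_EXTENDED,
    H264_PROFILE_HIGH,
    H264_PROFILE_HIGH10,
    H264_PROFILE_HIGH422,
    H264_PROFILE_HIGH444
};

/* Map a profile to its H.264 profile_idc; a switch avoids relying on the
 * enum values matching the indices of a lookup array. */
static unsigned h264_profile_to_idc(enum h264_profile profile)
{
    switch (profile) {
    case H264_PROFILE_BASELINE: return 66;
    case H264_PROFILE_MAIN:     return 77;
    case H264_PROFILE_EXTENDED: return 88;
    case H264_PROFILE_HIGH:     return 100;
    case H264_PROFILE_HIGH10:   return 110;
    case H264_PROFILE_HIGH422:  return 122;
    case H264_PROFILE_HIGH444:  return 244;
    }
    return 66; /* fall back to baseline for unknown values */
}
```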
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
Previously, we would always pull the bit size from the destination which
is wrong for opcodes like nir_ilt where the sources are variable-sized
but the destination is a fixed size. We were getting lucky before
because nir_op_ilt returns a 32-bit value and basically everyone who
uses spec constants uses 32-bit ones.
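
To see why the bit size matters, here is a toy signed less-than fold (hypothetical helpers, not the NIR code) where the caller picks the bit size used to interpret the sources:

```c
#include <stdbool.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of v (bits in 1..64). */
static int64_t sext(uint64_t v, unsigned bits)
{
    if (bits == 64)
        return (int64_t)v;
    uint64_t sign = 1ull << (bits - 1);
    v &= (1ull << bits) - 1;
    return (int64_t)((v ^ sign) - sign);
}

/* Fold a signed less-than, interpreting both sources at src_bits. */
static bool fold_ilt(uint64_t a, uint64_t b, unsigned src_bits)
{
    return sext(a, src_bits) < sext(b, src_bits);
}
```

With 64-bit sources a = 0x100000000 and b = 1, folding at the source size (64) gives false, while folding at a 32-bit destination size truncates a to 0 and gives true.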
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
We have a helper that does exactly what the bany_inequal was doing. It
emits the same code but is a bit higher level and is designed to operate
on a bvec4.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
They do the same thing in the end but i2b is a bit simpler. Also, let's
clean up the mess of code for SSBO handling with one line of builder.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This isn't a great solution for bit-sizes but we don't have a
particularly convenient way to get a bit size from the system value enum
and this keeps the lowering pass from changing it.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Instead of doing our own constant folding, we just emit instructions and
let constant folding happen. This is substantially simpler and lets us
use the nir_imm_bool helper instead of dealing with the const_value's
ourselves.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This requires that we rework the interface a bit to use nir_builder but
that's a nice little modernization anyway.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Missed one while converting to the nir_src_as_* helpers.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
When depth testing is disabled, we shouldn't pay attention to the
specified depthCompareOp, and just treat it as always passing. Before,
if the depth test is disabled, but depthCompareOp is VK_COMPARE_OP_NEVER
(e.g. from the app having zero-initialized the structure), then
sanitize_stencil_face() would have incorrectly changed passOp to
VK_STENCIL_OP_KEEP.
v2: Roll the depthTestEnable check into the ds_aspect check below since
they now both do the same thing.
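
A simplified sketch of the logic, using hypothetical enums and helpers rather than the Vulkan types or the actual sanitize_stencil_face() code:

```c
#include <stdbool.h>

enum compare_op { COMPARE_OP_NEVER, COMPARE_OP_LESS, COMPARE_OP_ALWAYS };
enum stencil_op { STENCIL_OP_KEEP, STENCIL_OP_REPLACE };

/* With the depth test disabled, the specified compare op is ignored and
 * the test is treated as always passing. */
static enum compare_op effective_depth_compare(bool depth_test_enable,
                                               enum compare_op op)
{
    return depth_test_enable ? op : COMPARE_OP_ALWAYS;
}

/* If the depth test can never pass, passOp is unreachable and can be
 * reduced to KEEP. This is only valid on the effective compare op. */
static enum stencil_op sanitize_pass_op(enum stencil_op pass_op,
                                        enum compare_op depth_op)
{
    return depth_op == COMPARE_OP_NEVER ? STENCIL_OP_KEEP : pass_op;
}
```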
Fixes: 028e1137e6 "anv/pipeline: Be smarter about depth/stencil state"
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This should address the remaining failures in Batman Arkham City.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107765
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
While disassembling a send(c) instruction, print the message descriptor
as an immediate source operand along with the message descriptor. This
allows the assembler to read the immediate source operand and set the
bits accordingly.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>