We can't rely on what we get from the assembler if we have indirect
addressing of constant file, since the assembler doesn't know the array
index. This got lost in the transition to NIR.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
A clear will do a partial validate, which will in turn reference all the
buffers in the bufctx again. However the fragprog last validated might
have already been deleted. So reset the bufctx when updating state.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Work-group size should always be aligned to subgroup size; this is a
basic requirement, otherwise some work-items will be no-operation.
It might make sense to refine the value according to a kernel's
resource usage, but that's a possible optimization for the future.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Still appears to have issues with negative indices less than -1M, but
that's a corner case of a corner case.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
The HUD doesn't check if query_create() fails and it calls other
pipe_query functions with NULL pointer instead of a valid query object.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The dup'ed fd owned by the nouveau_screen for a device node
must also be used as key for the winsys hash table, instead
of using the original fd passed in for a screen, to make
multi-x-screen ZaphodHeads configurations work on nouveau.
The original fd's lifetime differs from that of the nouveau_screen stored
in the hash. The hash key is the fd, and in order to compare hash entries
we fstat them, so the fd must be around for as long as the screen is.
This is an extension of the fix in commit a59f2bb1 (nouveau: dup fd
before passing it to device).
Cc: "10.3 10.4 10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
This isn't pretty and I'd suggest it the pm4 interface builder
could be tweaked to do this more efficently, but I'd need
guidance on how that would look.
This seems to pass the few piglit tests I threw at it.
v2: handle passing layer/viewport index to fragment shader.
fix crash in blit changes,
add support to io_get_unique_index for layer/viewport index
update docs.
v3: avoid looking up viewport index and layer in es (Marek).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
An immediate has to be the second arg of an ADD operation. However we
were mistakenly propagating the modifier of the non-folded value to the
folded immediate argument.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91117
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Since the addition of ilo_vma, it was used only to pad a bo for sampling
engine surfaces. Replace it entirely with these functions
ilo_state_surface_buffer_size()
ilo_state_vertex_buffer_size()
ilo_state_index_buffer_size()
ilo_state_sol_buffer_size()
1000 ms is an extreme value for typical interactive loads. A large
cache has some disadvantages. Search for reusable BOs can take a long
time and memory might get exhausted.
Let's be rather conservative and use half of the old value,
500ms. This is beneficial to some loads on my test system and there
are no regressions.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This is the basic granularity for BO allocations. The alignment also
helps with BO reuse by the cached bufmgr.
This results in a huge 45% speedup in Metro 2033 Redux on my test
system. The game relies on buffer orphaning with very small buffers
(hundreds of bytes in size) and that did not work efficiently
before. This change may also affect other applications and games.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
But only when doing so is safe according to the
RADEON_INFO_VA_UNMAP_WORKING kernel query.
This avoids kernel GPU VM address range conflicts when the BO has other
references than the GEM handle being closed, e.g. when the BO is shared.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90537
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90873
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
It returns a new value for each sample in the TLB. We've already avoided
trying to get the same index's color multiple times at the vc4_program.c
level, so we're not losing anything by doing this.
Without first running the bo through pushbuf_refn, the nouveau drm
library will have uninitialized structures regarding this bo, and will
insert incorrect data.
This fixes supertuxkart 0.9 crash on start (where it ends up doing a lot
of indirect draws).
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Since we clear the TFB bufctx binding point above, we need to put all of
the active tfb's back in, even if they haven't changed since last time.
Otherwise the tfb may get moved into sysmem and the underlying mapping
will generate write errors.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
If the shader doesn't specify number of invocations, assume one.
This fixes geometry shaders on state trackers other than Mesa (and
probably graw tests too.)
Trivial.
This extends the draw code to add support for invocations.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is just for softpipe, llvmpipe won't work without
some changes.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is required for ARB_gpu_shader5 support in softpipe.
v2: add support to txd/txf/txq paths.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>