This represents a float vec4 constant color, as passed to glBlendColor.
While the existing 4 shader sysvals are retained to minimize code churn,
a single vectorized intrinsic is required for efficient blending on
vector architectures. (This may also apply to archictectures like
Bifrost where ALU is scalar but load/store is vector; it largely depends
on how blending is implemented per-driver.)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Complementing the new API-agnostic shader_enum blending style, we add
helpers to translate between the two forms. Ideally, we could just use
PIPE blending directly, but that makes Vulkan support challenging.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Anholt <eric@anholt.net>
We add enums corresponding to (GLES) blend state to shader_enums.h,
complementing the existing advanced blending enums in the file. This
allows us to represent blending state in a driver-agnostic, API-agnostic
way to permit lowering.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
This can be used by both etnaviv and freedreno/a2xx as they are both vec4
architectures with some instructions being scalar-only.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
In order to set up KILL sets, the dataflow code was walking the entire
array of ACPs for every instruction. If you assume the number of ACPs
increases roughly with the number of instructions, this is O(n^2). As
it turns out, regions_overlap() is not nearly as cheap as one would like
and shows up as a significant chunk on perf traces.
This commit changes things around and instead first builds an array of
exec_lists which it uses like a hash table (keyed off ACP source or
destination) similar to what's done in the rest of the copy-prop code.
By first walking the list of ACPs and populating the table and then
walking instructions and only looking at ACPs which probably have the
same VGRF number, we can reduce the complexity to O(n). This takes the
execution time of the piglit vs-isnan-dvec test from about 56.4 seconds
on an unoptimized debug build (what we run in CI) with NIR_VALIDATE=0 to
about 38.7 seconds.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
If the destination of an ACP entry exists only within this block, then
there's no need to keep it for dataflow analysis. We can delete it from
the out_acp table and avoid growing the bitsets any bigger than we
absolutely have to. This reduces the maximum number of global ACP
entries in the vs-isnan-dvec with software fp64 on Kaby Lake from 8630
to 3942 and takes the execution time of the piglit vs-isnan-dvec test
from about 1:16.2 on an unoptimized debug build (what we run in CI) with
NIR_VALIDATE=0 to about 56.4 seconds.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
While the number of ACPs is generally not huge compared to the number of
blocks, 16 does seem a bit small. Bumping it to 64 takes the execution
time of the piglit vs-isnan-dvec test from about 1:18.1 on an unoptimized
debug build (what we run in CI) with NIR_VALIDATE=0 to about 1:16.2.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
There is no user fence for JPEG, the bug triggering
kernel WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT)
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: mesa-stable@lists.freedesktop.org
When width=4096 and shift_w=0, block_w=0x100 which overflow
the PLBU_CMD 8 bits for it.
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Just do what everybody else but Nouveau does and return 0.0f.
This prevents the repeated logging of these messages on startup:
Unexpected PIPE_CAPF 6 query
Unexpected PIPE_CAPF 7 query
Unexpected PIPE_CAPF 8 query
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
As the functions operate on 16-byte blocks.
Fixes this Valgrind error:
Invalid read of size 4
at 0x5857568: swizzle_bpp1_align16 (pan_swizzle.c:85)
by 0x585780F: panfrost_texture_swizzle (pan_swizzle.c:171)
by 0x584F587: panfrost_tile_texture (pan_resource.c:489)
by 0x584F641: panfrost_transfer_unmap (pan_resource.c:525)
by 0x587718D: u_transfer_helper_transfer_unmap (u_transfer_helper.c:516)
by 0x5875D85: pipe_transfer_unmap (u_inlines.h:515)
by 0x5875F13: u_default_texture_subdata (u_transfer.c:80)
by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480)
by 0x54005BB: st_TexImage (st_cb_texture.c:1709)
by 0x5391353: teximage (teximage.c:3105)
by 0x5391353: teximage_err (teximage.c:3132)
by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170)
by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833)
Address 0x1e94f1e8 is 0 bytes after a block of size 16 alloc'd
at 0x483F5C8: malloc (vg_replace_malloc.c:299)
by 0x584F47D: panfrost_transfer_map (pan_resource.c:467)
by 0x587694D: u_transfer_helper_transfer_map (u_transfer_helper.c:243)
by 0x5875EA7: u_default_texture_subdata (u_transfer.c:59)
by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480)
by 0x54005BB: st_TexImage (st_cb_texture.c:1709)
by 0x5391353: teximage (teximage.c:3105)
by 0x5391353: teximage_err (teximage.c:3132)
by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170)
by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833)
by 0x4DA8AB: glu::CallLogWrapper::glTexImage2D(unsigned int, int, int, int, int, int, unsigned int, unsigned int, void const*) (in /home/tomeu/deqp-build/modules/gles2/deqp-gles2)
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Valgrind was complaining of those.
NIR_PASS only sets progress to TRUE if there was progress.
nir_const_load_to_arr() only sets as many constants as components has
the instruction.
This was causing some dEQP tests to flip-flop, such as:
dEQP-GLES2.functional.fragment_ops.blend.equation_src_func_dst_func.add_src_color_constant_color
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Fixes: 14531d676b ("nir: make nir_const_value scalar")
These tests add too much time to the total run time, and some of them
even hang the DUTs, even if I haven't been able to reproduce it locally.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
There doesn't seem to actually be any noticeably memory leaks on Weston
when running dEQP. We do seem to leak quiet a bit in the client, so we
still have to run the dEQP runner in batches.
This removes the risk of Weston not restarting properly and introducing
spurious failures.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
This matches the current state of things on both RK3288 and RK3399.
Hopefully, from now on we'll only remove stuff from this list.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Make sure we have only test case names in the list, excluding names of
test groups.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
To improve robustness, check that we got the expected number of results.
Right now we hard-code the expected number of tests run, but with some
effort we may be able to infer it.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Build artifacts for armhf and schedule them on a Veyron Chromebook with
RK3288.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Buffer needs to be reloaded every time unless explicit clear() was
called.
Fixes rendering issues with wayland compositors.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
The key reason for that mechanism is gone: all the extra optional data
that could be in the anv_push_constants was moved elsewhere. At this
point, just put anv_push_constants directly in anv_cmd_state (part of
anv_cmd_buffer).
v2: Remove a NULL check we don't need anymore in
anv_cmd_buffer_push_constants(). (Lionel)
Fix size we consider for valid push params. (Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This provides a way for the application to query whether any resets have
happened, which lets us expose "robust" contexts. This also enables the
KHR_robust_buffer_access_behavior tests.
This mechanism lets the driver inform the state tracker about GPU
resets, say for destroying a robust API context and reporting a "device
lost" error to the application, making it take action to deal with this.
The iris batch module now tries to detect that the kernel has banned
our GEM context, creates a new non-banned context, and informs the
iris context module that all assumptions about state are now invalid
and it needs to reinitialize the relevant state.
Based on Chris Wilson's work, but significantly rewritten by me.
Adapted from Chris Wilson's patch. The comment is largely his.
Currently, when iris hangs the GPU, it will continue sending batches
which incrementally update the state, assuming it's preserved across
batches. However, the kernel's GPU reset support reinitializes the
guilty context to the default GPU state (reasonably not wanting to
trust the current state). This ends up resetting critical things
like STATE_BASE_ADDRESS, causing memory accesses in all subsequent
batches to be garbage, and almost certainly result in more hangs
until we're banned or we kill the machine.
We now ask the kernel to ban our render context immediately, so we
notice we've gone off the rails as fast as possible. Eventually, we'll
attempt to recover and continue. For now, we just avoid torching the
GPU over and over.
Ofc legacy gl features that are broken don't trigger fails in deqp. I
should remember to test glxgears more often.
Fixes: 7ff6705b8d freedreno/ir3: convert to "new style" frag inputs
Signed-off-by: Rob Clark <robdclark@chromium.org>
We didn't notice this issue much because the 2 struct share a similar
layout, expect for the additional fields...
We run into that issue in Anv :
==15236== Invalid write of size 8
==15236== at 0x8CF3939C: anv_state_table_expand_range (anv_allocator.c:211)
==15236== by 0x8CF394D5: anv_state_table_grow (anv_allocator.c:264)
==15236== by 0x8CF3967E: anv_state_table_add (anv_allocator.c:312)
==15236== by 0x8CF3B13C: anv_state_pool_alloc_no_vg (anv_allocator.c:1167)
==15236== by 0x8CF3B2B0: anv_state_pool_alloc (anv_allocator.c:1190)
==15236== by 0x8CF60871: alloc_surface_state (anv_image.c:1122)
==15236== by 0x8CF61FF9: anv_CreateImageView (anv_image.c:1519)
==15236== by 0x8BCBD2ED: vkCreateImageView (trampoline.c:1358)
==15236== Address 0x8994ef10 is 0 bytes after a block of size 128 alloc'd
==15236== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15236== by 0x8D2578E6: u_vector_init (u_vector.c:47)
==15236== by 0x8CF3929A: anv_state_table_init (anv_allocator.c:168)
==15236== by 0x8CF3A99A: anv_state_pool_init (anv_allocator.c:921)
==15236== by 0x8CF56517: anv_CreateDevice (anv_device.c:1909)
==15236== by 0x8BCB4FBA: terminator_CreateDevice (loader.c:6073)
==15236== by 0x8DD2CB3D: ??? (in /home/djdeath/.steam/ubuntu12_64/libVkLayer_steam_fossilize.so)
==15236== by 0x8DF4D241: vkCreateDevice (in /home/djdeath/.steam/ubuntu12_64/steamoverlayvulkanlayer.so)
==15236== by 0x8BCB35C6: loader_create_device_chain (loader.c:5449)
==15236== by 0x8BCBC230: vkCreateDevice (trampoline.c:838)
v2: Rename mmap_cleanups to avoid confusion (Caio)
v3: s/fail_mmap_cleanups/fail_cleanups/ (Caio)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110648
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
When first implemented in fefd03e16c Mesa's behavior was aligned on behavior
of Nvidia's driver. This caused a failing test in piglit but was ok since the
specification is unclear on this subject.
Nvidia's driver behavior has been modified because using version 410.104, the
problematic test (program_binary_retrievable_hint) now passes.
This commit defers BinaryRetrievableHint update until the next linking so the
test passes on Mesa as well.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
I don't know why I thought NIR_PASS always set the progress variable.
Derp.
Fixes: d41cdef2a5 ("nir: Use the flrp lowering pass instead of nir_opt_algebraic")
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Coverity CID: 1444996
Coverity CID: 1444995
Coverity CID: 1444994
Coverity CID: 1444993
Coverity CID: 1444991
Coverity CID: 1444989
We need to know the number of rectangles.
This fixes new CTS dEQP-VK.draw.discard_rectangles.dynamic_*.
Fixes: 5db0bf9994 ("radv: Implement VK_EXT_discard_rectangles.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Propagate the failure from GEM_EXECBUFFER2, cleanup then report failure
if need be. We retain the current behaviour to abort() at the first sign
of trouble -- for a non-robustness context, arguably this is the right
thing to do as the client cannot recover, and the system state is lost.
How to properly integrate with KHR_robustness and reset-strategy is
left as a future exercise.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Pull i915_drm.h to include
kernel commit ba4fda620a5f7db521aa9e0262cf49854c1b1d9c
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Feb 18 10:58:21 2019 +0000
drm/i915: Optionally disable automatic recovery after a GPU reset
for improved resilience in handling GPU hangs.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
[ Michel Dänzer: Take changes affecting the docker image from !299,
plus remove the unzip package again before generating the image ]
And consolidate it all into a single job.
It doesn't take much longer than a single version, thanks to ccache.
Overall, this single job might be faster or at least use fewer CPU
cycles than the two jobs before, while covering thrice as many versions
of LLVM.
v2:
* Move "rm -rf _build" to meson-build.sh.
* Set GALLIUM_DRIVERS the same way both times in the meson-clover job,
for symmetry.
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> # v1
No functional change intended (except for no longer running meson
--version separately, as the version appears early in meson's output
anyway).
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>