This was still used in the linear branch, since it works all a little
differently there (in particular, when using guard band we have to
intersect the draw regions with the viewport, since draw won't clip
for us). However, we should always intersect with draw_regions
(regardless if that includes the intersection with vp or not), since
the viewport can be larger than the fb size, and we don't want to
draw outside the fb (usually harmless, but important for occlusion
queries and shader image/buffer writes).
This fixes various dEQP-GLES31.functional.fbo.no_attachments failures
(which uses oversized viewport with occlusion queries).
The other ci changes aren't really bugs (the humus/Portals image
looks the same, we cannot expect bit-identical results, and
for the piglit quad-invariance test, I think we merely passed it
by accident since our interpolation may give different results
depending on where on the screen a tri is regardless of linear
rasterizer).
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11969>
This change adds:
- an alternative rasterizer, which rasterizes bins in a left->right &
top->bottom linear fashion;
- triangle -> rectangle detection;
- 1:1 blit detection;
- a special TGSI -> LLVM IR code generation that uses 8-bit SSE integers
in AoS fashion (as opposed to 32bits floats.)
Altogether these changes yield a 2x to 3x performance improvement for 2D
workloads. It was designed to render Windows 7 Aero and other Windows
built-in 3D applications (like Windows Media Player, Internet Explorer
11, UWP applications) with minimum CPU utilization, but it should be
generally applicable to other 2D-on-3D applications, like desktop
compositors, HTML browsers, 3D based UI toolkits, etc.
This was mostly the brainchild of Keith Whitwell back in 2010. I wrote
TGSI -> AoS translation. And many others added bug-fixes and
enhancements over the years: Roland Scheidegger, Brian Paul, and James
Benton.
Known issues:
- piglit spec@!opengl 1.1@quad-invariance will warn that "left and right
half should match" due to rounding error difference
- These optimized paths to kick in is that depth-buffer must not be
used, so some applications which want to benefit from these improvements
might need to be modified to ensure they use painter's algorithm instead
of depth-buffers.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Keith Whitwell <keithw@vmware.com>
v2: Incorporate Dave Airlie feedback: cleanup LP_DEBUG_xx; shrink 3+
empty lines.
v3: silence unused var warning, adapt to new upstream code (point setup)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11969>
This is required for Zink where the API ballot type is a uint64_t and
the "hardware" ballot type is uvec4.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11989>
We've had a physical machine death, and the restore/transfer is achingly
slow at the moment. Some of the devices are still fine, but
conservatively just kill the lot until it's all recovered.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11997>
This changes the pass to extract pinned instructions and not just unpinned
instructions when rescheduling instructions. This stops pinned instructions
from being bunched together when instructions are reinserted into the blocks
which can result in regressions with regards to cycles and instruction
counts on i965 and register use/Max Waves on AMD hardware.
In order to do this we also throw away the post-order depth-first
search linearization algorithm used to re-insert the instructions, which
itself causes possible regressions when instructions are reinserted into
a less than ideal new order (of which the bunched together pinned
instructions is one example). Instead we simply insert instructions in the
reverse order they were extracted. This will simply place instructions
that were scheduled earlier onto the end of their new block and
instructions that were scheduled later to the start of their new block.
With this everything should remain in order without the need to run
over uses.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/597>
With this pass enabled in Intel drivers, running shader-db on
shaders/unity/38.shader_test resulted in
Program received signal SIGSEGV, Segmentation fault.
gcm_schedule_early_src (src=0x555555d45348, void_state=0x7fffffffba40) at ../../SOURCE/master/src/compiler/nir/nir_opt_gcm.c:297
297 if (info->early_block->index < src_info->early_block->index)
(gdb) print src_info->early_block
$1 = (nir_block *) 0x0
I tracked this down to an early exit from gcm_schedule_early_instr on
the parent instruction because instr->pass_flags was 0x1c. That
should be an impossible value for this pass, so I inferred that
pass_flags must have dirt left from some previous pass.
Fixes: 8dfe6f672f ("nir/GCM: Use pass_flags instead of bitsets for tracking visited/pinned")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/597>
Without this patch the egl symbols check test fail on mips platform:
72/87 mesa:egl / egl-symbols-check FAIL 0.20s (exit status 1)
src/egl/libEGL.so.1.0.0: unknown symbol exported: _fbss
src/egl/libEGL.so.1.0.0: unknown symbol exported: _fdata
src/egl/libEGL.so.1.0.0: unknown symbol exported: _ftext
See Mips Run say thoes special symbols are automatically defined by the
linker to allow programs to discover the start and end of their various
section. They are descended from conventions that grew up in UNIX-like OSs,
and are peculiar to the MIPS environment.
_fbss : Start of uninitialized data segment
_fdata : Start of initialized data segment
_ftext : Start of text segment
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: suijingfeng <suijingfeng@loongson.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11955>
ls3a4000 and ls2k1000 cpu is mips64r5 compatible with MSA SIMD
instruction set implemented, while ls3a3000 is mips64r2 compatible only.
Due to lacking llvm support for loongson CPU, llvm::sys::getHostCPUName().
return "generic" on all loongson mips CPU.
So we override the MCPU to mips64r5 if MSA is implemented, feedback to
mips64r2 for all other ordinaries.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: suijingfeng <suijingfeng@loongson.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11955>
Make sure that the host is using llvmpipe while the guest is using virgl as driver.
Note that the neverball/neverball.trace trace actually regressed in a way that the
foreground is missing.
Fixes: f1b952fa ("ci: Run tests inside Crosvm")
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11986>
This is a port of the softpipe anisotropic filtering
to llvmpipe. It should produce pretty similiar results.
This contains the proposed fix to the softpipe calculating
dq after scaling.
It also contains a number of other fixes around vector lengths
etc caught during test.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8804>
The anisotropic filtering needs access to a table of weights,
to make the calculations easier. This routes the table
through from shader parameter and makes it available for the
sampler code.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8804>
The rules here are the same as for texture instructions. The bits on
the intrinsic are the ground truth and are allowed to vary from the
deref a bit as-needed. If the intrinsic says PIPE_FORMAT_NONE, then we
can look at the variable, if visible, to get format information. This
means that we need to be careful when we rewrite intrinsics based on the
deref to only override the format from the _deref intrinsic from the
image variable unless the intrinsic is PIPE_FORMAT_NONE.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11849>
Semantically, -1 means "Unknown; don't validate" but it's really only
used for derefs because they often need to be flexible. We don't really
need that flexibility for image intrinsics but this makes it more
consistent. More immediately useful is that this gives us the ability
to tell _deref forms of these intrinsics apart from the lowered ones.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11849>
Some drivers doesn't support PIPE_SHADER_CAP_INTEGERS.
This leads to using load_ubo_vec4 which throws llvmpipe off the guard since
it doesn't expect load_ubo_vec4 in shader. Use nir_to_tgsi utility in
such a case.
This fixes crash seen with conform's mustpass.c, select.c and feedback.c.
Also, few gl-select related piglit tests exhibit same crash. Found in vmware's
internal testing
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
v2: incorporated Emma's comments. Added check for PIPE_SHADER_CAP_INTEGERS and
remove PIPE_SHADER_IR_TGSI check
v3: As per Emma's comment, removed expected crashes for i915 piglit
v4: update expetcted passes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11911>