It looks like there's also a standalone version and a 32-bit version.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5574>
Apply the destination swizzle on GLES games based on HL2 engine.
Also add Portal 2 since some people are experiencing issues with
that.
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5481>
Adapted from wayland's wl_os_dupfd_cloexec().
Suggested-by: Kristian H. Kristensen <hoegsberg@google.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5369>
The meson-windows-vs2019 job was going down the `!defined(WIN32)` path,
leading to a broken build once that path contained non-windows code.
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5369>
To fix game artifacts. It's always sad to have to fix game bugs
inside drivers ...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5359>
This uses a meson builtin to handle -fvisibility=hidden. This is nice
because we don't need to track which languages are used, if C++ is
suddenly added meson just does the right thing.
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4740>
Equivalent to clamp(x, 0.0, 1.0) or fsat in NIR. Useful for format
packing, among other uses given the variety of substituions in-tree.
v2: Drop brackets (Eric).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5100>
Silence the warning about this always-true comparison.
src/util/softfloat.c:214:42: warning: comparison of constant 32768
with expression of type 'int16_t' (aka 'short') is always false
[-Wtautological-constant-out-of-range-compare]
} else if ((e > 0x1d) || (0x8000 <= m)) {
~~~~~~ ^ ~
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5174>
xxhash is faster than fnv1a in almost all circumstances, so we're
switching to it globally.
Signed-off-by: Dmytro Nester <dmytro.nester@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4020>
Any system that provides `/dev/urandom` should be allowed to try to use it.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2316>
When the caller asked for a randomised_seed, we should fall back to the
time-based seed instead of the fixed seed.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2316>
This function has been added in glibc 2.25, and the related syscall in
Linux 3.17, in order to avoid requiring the /dev/urandom to exist, and
doing the open()/read()/close() dance on it.
We pass GRND_NONBLOCK so that it doesn’t block if not enough entropy has
been gathered to initialise the /dev/urandom source, and fallback to the
next source in any error case.
Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2026>
The vma_heap allocator was originally designed to prefer high addresses
in order to find bugs in ANV's high address handling. However, there
are cases where you might want the allocator to prefer lower addresses
for some reason. This provides a configure bit for exactly this
purpose.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5019>
When the LLVM version is too old or missing, SotTR applies shader
workarounds and that reduces performance by 2-5% with ACO.
SotTR workarounds are applied with LLVM 8 and older, so reporting
LLVM 9.0.1 should be fine.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4984>
SCHED_IDLE on linux can lead to extraordinarily long periods of no scheduling
leading to starvation of minimum priority threads for such an extended period
that it can eventually lead to GUI stalls. Switch to renicing the threads to
the lowest priority and use the SCHED_BATCH scheduling policy which is a hint
to the scheduler that this is latency insensitive thread instead. This change
has been confirmed to address unexpected GUI related stalls in mesa
applications across a range of different linux kernels.
Signed-off-by: Con Kolivas <kernel@kolivas.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4912>
This is currently broken hard, because this code is being used in more
places that it used to be, and fixing that is prohibitively hard right
now.
This is far from ideal, as it leaves the same inconsistency in the
EMBEDDED_DEVICE code-path. But that only used by VMWare, so it's
probably better if they fix it, as they know their requirements better
than we do.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2911
Fixes: 76f79db3f5 ("util: stop including files from mesa/main")
Acked-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4919>
BITSET_FOR_EACH_SET can walk a sparse set (such as a register class's set
of registers) much faster than just iterating over individual bits.
Improves freedreno startup time (as measured by shader-db ./run
shaders/closed/gputest/triangle on my x86 system) by -4.12679% +/-
1.99006% (n=151)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4537>
This make the code significantly more readable, I think (along with
shorter). Also, using util_dynarray_delete_unordered() saves us a move of
the rest of the list when removing adjacency on a node.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4537>
This moves the fi_types to a new mesa_private.h and removes the
imports.c file. The vast majority of this patch is just removing
pound includes of imports.h and fixing up the recursive includes.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
MSVC 2015 and newer has perfectly valid snprintf and vsnprintf
implementations, let's just use those.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
Mesa has one of these in imports.h, so u_memory needs one as well. This
is the version from mesa ported.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
This makes more sense for it, it's only used in the glsl compiler
currently, so we could probably move it there, but this seems fine for a
header only #define.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
This adds two new util functions to rounding.h, _mesa_iroundf and
mesa_lround, which are just wrappers around roundf and round, that cast
to int and long int respectively. This is possible since mesa recently
dropped support for VC2013, since 2015 and 2017 support roundf.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
Which has the same behavior as long as you don't change the FPU rounding
mode. Other code in mesa makes the same assumption so it should be safe
to make that assumption more generally.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
which are exactly the same function with exactly the same implementation
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
This is copied from the one in src/mesa/main/imports.h, which is the
same otherwise.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
The implementation is somewhat different, although if you go back in
time far enough they're the same, but the one in u_math was changed a
long time back to be faster.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
The 64 bit variant in imports.h isn't even used.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
Mostly this uses util_is_power_of_two_or_zero, which has the same
behavior as _mesa_is_pow_two when the input is zero. In cases where the
value is known to be != 0 ahead of time I used the _nonzero variant as
it may be faster on some platforms.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
Probably doesn't fix anything but those should be accessed in an
atomic way just like the head pointer.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e4f01eca3b ("util: Add a free list structure for use with util_sparse_array")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4613>
By temporarily storing the new_head by a uint32_t, we wipe out the
counter section of the head pointer.
Fixes: e4f01eca ("util: Add a free list structure for use with util_sparse_array")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4612>
This just silences a compiler-warning about a potentially uninitialized
variable. It's not uninitialized, but it's a bit hard for the compiler
to see. So let's just initialize it to zero.
Reviewed-by: Brian Paul <brianp@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4577>
These are less mesa specific than the rest of macros.h, and would be
nice to use outside of mesa. Prep for next patch.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4381>
Let's make it clear what includes are being added everywhere, so that
they can be cleaned up.
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>
Windows provides MAX_PATH rather than PATH_MAX for the maximum allowable
path length. This is not a limit on the length of filename which can
exist on the filesystem, but a length on the length of path which can be
passed to Win32 API calls.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Fixes: f8f1413070 ("util/u_process: add util_get_process_exec_path")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4304>
This increases performance by 11-13% in Viewperf11/Catia - first scene.
Set allow_draw_out_of_order=true to enable this.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4152>
This implementation was removed by 8b8af6d3 ("gallium/util: Switch
util_float_to_half to _mesa_float_to_half()'s impl.")
It was not actually broken, but _mesa_float_to_half() implements
round-to-nearest-even, whereas util_float_to_half() implemented
round-to-zero. So rename it appropriately.
GL actually never cares about rounding (except a broken piglit test),
however d3d10 very much does and requires RTZ for float to half
conversion. Moreover, apparently at least radeon gpus actually always
do RTZ when doing RT writes (and I'd suspect for shader image writes
as well). Hence it seems appropriate to hook up this rtz function to
the format instead. This will cause llvmpipe and softpipe to use rtz
rounding for clears with half float formats, and softpipe would use rtz
behavior for rt writes as well (llvmpipe has that hardcoded), not sure
if "real" hw drivers hit this function for much.
(For shader opcodes would still need to figure out what rounding to use
appropriately, but this is a question for another day.)
Note should probably unify with _mesa_float_to_float16_rtz. Unclear at
this point which one is better, so just restore previous function here.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4312>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4312>
This is useful to enable workarounds for applications with a generic name.
For instance all games made with the YoYo game engine have the same executable
name "runner".
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4181>
This reworks the data structure a bit and, in my view, simplifies it.
Instead of each node having a header which has the node level in it, we
use the bottom 6 bits of the pointer for that. This requires us to
allocate with the os_malloc/free_aligned helpers (which call into
posix_memalign on Linux) but cache-line aligning our allocations is
actually probably a good thing given that we're doing atomics on them.
The primary advantages to doing this is that it changes the number of
memory accesses per tree level from 2 to 1 when walking the tree because
we no longer have to look at node->level.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4228>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4228>
As soon as I switch to using the allocation helpers in os_memory.h,
these tests start blowing up on the Windows build in GitLab CI. As far
as I can tell, the issue is something with the combination of the debug
allocator in u_debug_memory.c and the mutex implementation in the
version of Wine running in CI. The tests don't fail on real windows nor
do they fail with newer versions of Wine.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4228>
We use this value several times. It's probably best to encourage the
compiler to only read it once. I have no proof that this actually makes
any performance improvement whatsoever.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4228>
Overwrite function for this type was missing and I needed it for my project.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Mark Menzynski <mmenzyns@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3903>
In the select_reg callback, I want to be able to determine if a given
node is already assigned, and if so what physical register has been
assigned.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
Add a parameter so the callback can know which node it is selecting a
register for. And remove the graph parameter, as it is unused by
existing users, and somewhat unnecessary (ie. the callback data could
be used instead).
And add a comment so $future_me remembers how this works.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
This is where the other debug_memory_* functions are declared, so let's
move it here for symmetry.
This allows us to drop an include of u_debug_gallium.h, which makes us
depend on gallium-headers in non-gallium code.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3901>
This allows communicating that it wasn't possible to determine whether
the two file descriptors reference the same file description. When
that's the case, log a warning in the amdgpu winsys.
In turn, remove the corresponding debugging output from the fallback
os_same_file_description implementation. It depends on the caller if
false negatives are problematic or not.
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3879>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3879>
When os_memory_debug.h was promoted to src/util, this source-file on
which it depends on when the debug-flag is set on windows was left
out. So let's move this also.
It doesn't seem there's any way of triggering this issue right now, but
it seems better to correct this to avoid this from biting us in the ass
in the future.
Fixes: 88c4680b5a ("util: promote u_memory to src/util")
Reviewed-by: Dylan Baker <dylan@pnwbakers>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3844>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3844>
../src/util/blob.c:166:7: runtime error: null pointer passed as argument 2, which is declared to never be null
#0 0x7fe51bc315df in blob_write_bytes ../src/util/blob.c:166
#1 0x7fe51c7a7b9a in iris_disk_cache_store ../src/gallium/drivers/iris/iris_disk_cache.c:115
#2 0x7fe51c7f444d in iris_compile_fs ../src/gallium/drivers/iris/iris_program.c:1693
#3 0x7fe51c7fdcd9 in iris_create_fs_state ../src/gallium/drivers/iris/iris_program.c:2331
#4 0x7fe519e871a3 in st_create_fp_variant ../src/mesa/state_tracker/st_program.c:1275
#5 0x7fe519e89dd0 in st_get_fp_variant ../src/mesa/state_tracker/st_program.c:1435
#6 0x7fe519ed51e1 in st_update_fp ../src/mesa/state_tracker/st_atom_shader.c:163
#7 0x7fe519eb5d73 in st_validate_state ../src/mesa/state_tracker/st_atom.c:261
#8 0x7fe519e4e0bf in prepare_draw ../src/mesa/state_tracker/st_draw.c:132
#9 0x7fe519e4e76e in st_draw_vbo ../src/mesa/state_tracker/st_draw.c:184
#10 0x7fe51aca5245 in vbo_save_playback_vertex_list ../src/mesa/vbo/vbo_save_draw.c:215
#11 0x7fe51a25b1cc in ext_opcode_execute ../src/mesa/main/dlist.c:1126
#12 0x7fe51a2f8d58 in execute_list ../src/mesa/main/dlist.c:11830
#13 0x7fe51a34b2d0 in _mesa_CallList ../src/mesa/main/dlist.c:14267
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3825>
This reverts the functional part of commit
d17ff2f7f1, leaving the unit test for
mesa/pipe agreement on what's an array.
The issue is that the util_channel_desc.shift values on array formats are
not used for bit addressing in memory, they're bit addressing within a
word treating a pixel of the format as a native type, as seen by
llvmpipe's use of the values to do shifts (see
lp_build_unpack_arith_rgba_aos() for example). This means the values are
nonsensical for 3-byte RGB, but then llvmpipe doesn't expose those formats
so it works out.
I still want to clean up our big-endian format handling at some point, but
let's fix the s390x regression first, sort out our format unit tests in
CI, then be able to refactor with confidence.
Fixes: d17ff2f7f1 ("gallium: Fix big-endian addressing of non-bitmask array formats.")
Closes: #2472
Acked-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3721>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3721>
Using LLVM 8 for ppc64el and 7 for s390x (which hits some coroutine
related issues with LLVM 8).
There are some test failures we need to ignore for now. Also, the
timeout needs to be bumped from the default 30s for some tests, because
they can take longer under emulation.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3643>
Almost all users of the unpack functions don't have strides to plug in
(and many are only doing one pixel!), and this will help simplify
them.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2744>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2464
Fixes: e62c3cf350 ("util/os_socket: Include unistd.h to fix build error")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
Fixes
In file included from ../src/util/os_socket.c:8:
../src/util/os_socket.h:26:1: error: unknown type name ‘ssize_t’; did you mean ‘size_t’?
ssize_t os_socket_recv(int socket, void *buffer, size_t length, int flags);
seen with gcc version 8.3.0 (Buildroot 2019.11) and uClibc 1.0.32.
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Fixes: ef5266ebd5 ("util/os_socket: Add socket related functions.")
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3659>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3659>
A few hash_table users roll their own integer hash functions which
call _mesa_hash_data to perform the hashing which ultimately calls
into XXH32 with a dynamic key length. When using small keys with a
constant size the hash rate can be greatly improved by inlining
XXH32 and providing it a constant key length, see:
https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html
Additionally, this patch removes calls to _mesa_key_hash_string and
makes them instead call _mesa_has_string directly, matching the new
integer hash functions.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
For most key sizes, xxhash outperforms fnv1a's hash rate substantially (bug
2153). In particular, the V3D driver hashes multiple ~200 byte keys as part
of the shader cache lookup which can easily eat up 10-20% of the runtime on
the Raspberry Pi. Swapping over to xxhash drops this to ~1% of the runtime.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
__size, in particular, makes this macro rather confusing to understand
how to use. Hopefully this comment saves future users the headache.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499>
We currently initialize this float-array with double-literals. Some
compilers generate warnings for this, so let's switch these to
float-literals instead.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Setting up transitive conflicts between a full register and its two
half registers (eg r0.x and hr0.x and hr0.y) will make the half
registers conflict. They don't actually conflict and this prevents us
from using both at the same time.
Add and use a new ra helper that sets up transitive conflicts between
a register and its subregisters, except it carefully avoids the
subregister conflict.
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Two competing rules for defining u_format_table.c exists,
which is an error.
Additionally the more general rule lacks the inclusion of
format/u_format.csv.
Fixes: 882ca6dfb0 ("util: Move gallium's PIPE_FORMAT utils to /util/format/")
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
This would've caught 8829f9ccb0 ("u_format: add ETC2 to
util_format_srgb/util_format_linear").
Suggested-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes: 3c45c4bc44 ("util: Cope with the fact that formats in u_format.csv are not ordered.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Add some missing vulkan formats to util/format, this solves all the missing
pipe format cases for the formats that turnip supports.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3170>
macOS does not have pthread_getcpuclockid.
src/util/u_thread.h:156:4: error: implicit declaration of function 'pthread_getcpuclockid' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
pthread_getcpuclockid(thread, &cid);
^
Fixes: 4913215d14 ("util/u_thread: don't restrict u_thread_get_time_nano() to __linux__")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2171
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Eric Engestrom <eric@engestrom.ch>
This makes simple_mtx_destroy set the counter to an invalid canary
value and then makes lock/unlock assert that the value is legal.
That way, calling lock/unlock after destroy will assert fail,
rather than deadlocking or potentially even working.
This has caught real deadlocks in dEQP multithreaded tests (in st/mesa
shader variant zombie list handling), which have since been fixed.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
With 781a78 ("mesa: enable ARB_direct_state_access in compat for
GL3.1+), it's possible to have DSA with GL3.1+.
FTL creates a GL3.1 compat context, but fails the
_mesa_has_geometry_shaders(..) check in frame_buffer_texture.
Bump the compat version to pass the check.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Braces mismatch (flagged by CI, untested).
Fixes: 385d13f26d "util/atomic: Add a _return variant of p_atomic_add"
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This new function lets you request to remove a specific address range
from the allocator. It returns true on success and leaves the allocator
unmodified and returns false on failure. It doesn't need to return an
offset because, if it succeeds, the offset passed in is the allocated
offset.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
pthread_getcpuclockid() and clock_gettime() are also available on at least
OpenBSD, FreeBSD, NetBSD, DragonFly, Cygwin.
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Right now there are two copies of mm:
* mesa/main/mm.[ch]
* gallium/auxiliary/util/u_mm.[ch]
At some point they splitted, and from the commit message it was not
clear why it was not possible to have only one copy at a common place.
Taking into account that was several years ago, Im assuming that it
was not possible then.
This change would allow to have one copy of the same code, and also
being able to use that code out of mesa/main or gallium, if needed.
This commit moves u_mm and removes mm, as u_mm has slightly more
changes.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Eric recently added PIPE_FORMAT_FXT1_RGB[A] as part of his format
unification work. This was really most of the work of implementing
the extension. We just need to handle it in a couple of places and
expose the extension.
v2: Reject the new formats in llvmpipe_is_format_supported to prevent
crashes because it doesn't know how to handle the new formats.
Reviewed-by: Marek Olšák <marek.olsak@amd.com> [v1]
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
When drawing the main character in Shadow of Mordor, the game appears
to draw Talion with one vertex shader, and the Wraith with another.
If the compiler optimizes those in different ways which lead to slight
imprecisions, then the resulting positions may not line up, leading to
Z-fighting occurring as the game decides which of the two are in front.
brw_nir_opt_peephole_ffma looks at usages of multiply adds across the
entire shader, and may make different decisions between the two, leading
to such imprecisions and Z-fighting. This started happening recently
after a NIR change to eliminate unnecessary MOVs (7025dbe7), but that
change simply exposed the existing problem.
Improves performance on Skylake GT4e by 1.22945% +/- 0.398672% (n=3),
likely due to the fixed rendering.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1985
Fixes: 7025dbe794 ("nir: Skip emitting no-op movs from the builder.")
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Many applications use multi-pass rendering and require their vertex
shader position to be computed the same way each time. Optimizations
may consider, say, fusing a multiply-add based on global usage of an
expression in a shader. But a second shader with the same expression
may have different code, causing that optimization to make the other
choice the second time around.
The correct solution is for applications to mark their VS outputs
'invariant', indicating they need multiple shaders to compute that
output in the same manner. However, most applications fail to do so.
So, we add a new driconf option - vs_position_always_invariant - which
forces the gl_Position output in vertex shaders to be marked invariant.
Fixes: 7025dbe794 ("nir: Skip emitting no-op movs from the builder.")
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
GiMark benchmark from GpuTest has such code in VS:
out vec4 lightDir0;
out vec4 lightDir1;
...
lightDir0.xyz = lp0 - vVertex.xyz;
lightDir1.xyz = lp1 - vVertex.xyz;
In FS:
float distSqr = dot(lightDir0, lightDir0);
So due to the usage of uninitialized .w channel in the dot product,
distSqr may become undefined which results in many black dots
in the test on Iris.
In https://www.geeks3d.com/forums/index.php/topic,6242.0.html
developer stated that this benchmark most likely won't be updated.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1919
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Dolphin: 75 fps -> 88 fps - Super Mario Galaxy
Citra: 81 fps -> 91 fps - A Link Between Worlds
Yuzu: 21 fps -> 27 fps - Super Mario Odyssey
Dolphin still has many syncs because of glFenceSync and glClientWaitSync.
Moving them to the dispatcher thread might yield another speedup.
Yuzu uses a compatible profile by default. This benchmark used the variable
MESA_GL_VERSION_OVERRIDE=4.5FC to overwrite this behavior.
This profilation was done on a mobile i7-8550U CPU with i965.
Signed-off-by: Markus Wick <markus@selfnet.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
To avoid following building error:
out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_util_intermediates/format/u_format_table.c:30:10:
fatal error: 'u_format.h' file not found
^~~~~~~~~~~~
1 error generated.
Fixes: 882ca6d ("util: Move gallium's PIPE_FORMAT utils to /util/format/")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
To make PIPE_FORMATs usable from non-gallium parts of Mesa, I want to
move their helpers out of gallium. Since u_format used
util_copy_rect(), I moved that in there, too.
I've put it in a separate directory in util/ because it's a big chunk
of related code, and it's not clear to me whether we might want it as
a separate library from libmesa_util at some point.
Closes: #1905
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
They use the "sample" keyword as a variable name.
Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This allows ZSTD instead of ZLIB to be used for compressing the shader
cache.
On a 72 core system emulating skl with a full shader-db (with i965):
ZSTD:
1915.10s user 229.27s system 5150% cpu 41.632 total (cold cache)
225.40s user 10.87s system 3810% cpu 6.201 total (warm cache)
154M (235M on disk)
ZLIB:
2231.33s user 194.24s system 1899% cpu 2:07.72 total (cold cache)
229.15s user 10.63s system 3906% cpu 6.139 total (warm cache)
163M (244M on disk)
Tim Arceri sees (8 core ryzen and a full shader-db):
ZSTD:
2505.22 user 40.50 system 3:18.73 elapsed 1280% CPU (cold cache)
418.71 user 14.93 system 0:46.53 elapsed 931% CPU (warm cache)
454.3 MB (681.7 MB on disk)
ZLIB:
3069.83 user 40.02 system 4:20.13 elapsed 1195% CPU (cold cache)
425.50 user 15.17 system 0:46.80 elapsed 941% CPU (warm cache)
470.3 MB (701.4 MB on disk)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Eric Anholt <eric@anholt.net>
As requested by Tim.
This was generated with:
grep 'PIPE_ARCH_.*_ENDIAN' -rIl | xargs sed -ie 's@PIPE_ARCH_\(.*\)_ENDIAN@UTIL_ARCH_\1_ENDIAN@'g
v2: - add this patch
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
This will allow it to be used as a drop in replacement for
_mesa_little_endian in a number of cases.
v2: - Always define PIPE_ARCH_LITTLE_ENDIAN and PIPE_ARCH_BIG_ENDIAN,
define the one that reflects the host system to 1 and the other to 0
- replace all uses of #ifdef, #ifndef, and #if defined() with #if
and #if ! with PIPE_ARCH_*_ENDIAN
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
_WIN32 is defined by basically all windows compilers (MSVC, ICL, MinGW),
wereas _MSC_VER is not defined by MinGW. Without this change MinGW falls
through and doesn't define PIPE_ARCH at all, and is caught by some extra
code in gallium.
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
That's where `xmlpool_options_h` is defined, and this way we can make sure
nobody starts making use of it in the future :)
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
A bunch of components need the former but not the latter.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Flagged by UBSan:
../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:233:14: runtime error: negation of -2147483648 cannot be represented in type 'int'; cast to an unsigned type to negate this value to itself
#0 0x55b4c1a2a428 in rand_sint ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:233
#1 0x55b4c1a2ad3a in random_sdiv_test ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:308
#2 0x55b4c1a2b837 in fast_idiv_by_const_int32_Test::TestBody() ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:410
#3 0x55b4c1abc13f in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
#4 0x55b4c1aa7a4d in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
#5 0x55b4c1a4ce57 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
#6 0x55b4c1a4f530 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
#7 0x55b4c1a51cbe in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
#8 0x55b4c1a6d698 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
#9 0x55b4c1abfd58 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
#10 0x55b4c1aab425 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
#11 0x55b4c1a64cba in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
#12 0x55b4c1ae4b73 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
#13 0x55b4c1ae4a33 in main ../src/gtest/src/gtest_main.cc:37
#14 0x7ff172d1dbba in __libc_start_main ../csu/libc-start.c:308
#15 0x55b4c1a28dc9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test+0x96dc9)
../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:309:52: runtime error: negation of -9223372036854775808 cannot be represented in type 'long int'; cast to an unsigned type to negate this value to itself
#0 0x563b24dafd2d in random_sdiv_test ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:309
#1 0x563b24db0f0f in fast_idiv_by_const_int64_Test::TestBody() ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:473
#2 0x563b24e41111 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
#3 0x563b24e2ca1f in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
#4 0x563b24dd1e29 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
#5 0x563b24dd4502 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
#6 0x563b24dd6c90 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
#7 0x563b24df266a in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
#8 0x563b24e44d2a in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
#9 0x563b24e303f7 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
#10 0x563b24de9c8c in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
#11 0x563b24e69b45 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
#12 0x563b24e69a05 in main ../src/gtest/src/gtest_main.cc:37
#13 0x7f9a90330bba in __libc_start_main ../csu/libc-start.c:308
#14 0x563b24daddc9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test+0x96dc9)
v2:
* Use INT64_MIN instead of LLONG_MIN (Jason Ekstrand)
* Simpler test for INT64_MIN result from rand_sint (Jason Ekstrand)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Shifting int64_t values left into the sign bit has undefined behaviour:
../src/util/fast_idiv_by_const.c:175:14: runtime error: left shift of 131 by 56 places cannot be represented in type 'long int'
#0 0x561337ed10c1 in sign_extend ../src/util/fast_idiv_by_const.c:175
#1 0x561337ed1335 in util_compute_fast_sdiv_info ../src/util/fast_idiv_by_const.c:239
#2 0x561337e17519 in fast_idiv_by_const_int8_Test::TestBody() ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:357
#3 0x561337ea815d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
#4 0x561337e93a6b in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
#5 0x561337e38e75 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
#6 0x561337e3b54e in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
#7 0x561337e3dcdc in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
#8 0x561337e596b6 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
#9 0x561337eabd76 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
#10 0x561337e97443 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
#11 0x561337e50cd8 in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
#12 0x561337ed0b91 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
#13 0x561337ed0a51 in main ../src/gtest/src/gtest_main.cc:37
#14 0x7f85ba483bba in __libc_start_main ../csu/libc-start.c:308
#15 0x561337e14dc9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test+0x96dc9)
../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:51:14: runtime error: left shift of negative value -63
#0 0x55fc3c0e67cc in strunc ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:51
#1 0x55fc3c0e6d93 in smul_high ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:140
#2 0x55fc3c0e7067 in fast_sdiv ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:181
#3 0x55fc3c0e858b in fast_idiv_by_const_int8_Test::TestBody() ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:358
#4 0x55fc3c17915d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
#5 0x55fc3c164a6b in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
#6 0x55fc3c109e75 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
#7 0x55fc3c10c54e in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
#8 0x55fc3c10ecdc in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
#9 0x55fc3c12a6b6 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
#10 0x55fc3c17cd76 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
#11 0x55fc3c168443 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
#12 0x55fc3c121cd8 in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
#13 0x55fc3c1a1b91 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
#14 0x55fc3c1a1a51 in main ../src/gtest/src/gtest_main.cc:37
#15 0x7fd224759bba in __libc_start_main ../csu/libc-start.c:308
#16 0x55fc3c0e5dc9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test+0x96dc9)
v2:
* Use two casts instead of changing the argument type (Jason Ekstrand)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>
This fixes a deadlock in pthread_barrier_destroy.
Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
If there are queued shaders to be written to disk, wait for that.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We're trying to cast the return type to the type of the var, but instead
we were casting `sizeof(*v)`.
Fixes: 6df72e970c ("util: Make u_atomic.h typeless.")
Fixes: 0a7f17cf5b ("util/u_atomic: add p_atomic_xchg")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Fixes: dcf9d91a ("util: Handle differences in pthread_setname_np")
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
v2: Replace autoconf check for flock() with meson check
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
gcc is very particular about where you place the (void) cast
The previous placement made it error out with:
In file included from disk_cache.c:40:0:
../../src/util/u_atomic.h:203:29: error: void value not ignored as it ought to be
#define p_atomic_add(v, i) ((void) \
^
disk_cache.c:658:4: note: in expansion of macro ‘p_atomic_add’
p_atomic_add(cache->size, size);
^
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
It crashes hard (pop-up window and all).
v2: - Change comment to FIXME
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
MinGW defines only _WIN32, but doesn't have fcntl, so we need to use the
windows path.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
<sys/param.h> is required for NetBSD version detection,
and __NetBSD__ must be used to detect even on older releases.
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
DOOM fails to handle more images than expected when the adaptative
sync mode is enabled.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1902
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The old version of the iterators relies on a &iter->field != NULL check
which works fine on older GCC but newer GCC versions and clang have
optimizations that break if you do pointer math on a null pointer. The
correct solution to this is to do the null comparisons before we do any
sort of &iter->field or use rb_node_data to do the reverse operation.
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
The new order matches that of the comparison functions accepted by the C
standard library qsort() functions. Being consistent with qsort will
hopefully help avoid developer confusion.
The only current user of the red-black tree is aub_mem.c which is pretty
easy to fix up.
Reviewed-by: Lionel Landwerlin <lionel.g.lndwerlin@intel.com>
When I wrote the red-black tree implementation, I wrote tests for it but
they never got imported into mesa.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
There's nothing whatsoever compiler-specific about it other than that's
currently where it's used.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This makes use of the total job size limiting feature added in the
previous patch.
The idea is to avoid an excessive build up in memory use due to the
use of both the UTIL_QUEUE_INIT_RESIZE_IF_FULL and
UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY flags.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
When both UTIL_QUEUE_INIT_RESIZE_IF_FULL and
UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY are set, we can get into a
situation where the queue never executes and grows to a huge size
due to all other threads being busy.
This is the case with the shader cache when attempting to compile a
huge number of shaders up front. If all threads are busy compiling
shaders the cache queues memory use can climb into the many GBs
very fast.
The use of these two flags with the shader cache is intended to
allow shaders compiled at runtime to be compiled as fast as possible.
To avoid huge memory use but still allow the queue to perform
optimally in the run time compilation case, we now add the ability
to track memory consumed by the jobs in the queue and limit it to
a hardcoded 256MB which should be more than enough.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Since we set the UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY flag this should
have little impact on low core systems. However just about all modern
CPUs currently available that run Mesa have *at least* 4 cores. For
these CPUs allowing more threads can result in the queue being
processed faster and avoid excessive memory use due to a backlog of
cache entrys building up in the queue.
This change helps avoid a huge build up of cache entrys in the queue
due to using both the UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY and
UTIL_QUEUE_INIT_RESIZE_IF_FULL flags.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
In order to be coherent with the pre-existent API for half floats,
this new API for double is the one meant to be used when doing double
to float conversions. It is no more than a wrapper for the softfloat.h
API but we meant to keep that one private.
v2:
- Fix bug in _mesa_double_to_float_rtz() in the inf/nan detection
using the exponent value.
v3:
- Replace custom f64 -> f32 implementations with the softfloat
one (Andres).
v4:
- Added API usage clarifying comments (Caio).
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
In order to be coherent with the pre-existent functions, this new API
is the one meant to be used when doing half float to float
conversions. It is no more than a wrapper for the softfloat.h API but
we meant to keep that one private.
v2:
- Replace custom f32 -> f16 RTZ implementation with the softfloat
one (Andres).
v3:
- Added API usage clarifying comments (Caio).
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Implemented fadd, fsub, fmul and ffma for doubles and ffma for floats,
rounding to zero, using a modified implementation from Berkely
Softfloat 3e Library.
Their implementation correctness has been checked with the Berkeley
TestFloat Release 3e tool for x86_64.
v2:
- Reuse util_last_bit64() in _mesa_count_leading_zeros64()
implementation (Connor).
v3:
- Add a specific ffma for floats version (Connor).
- Implement the ffma for doubles version (Andres).
- Lots of fixes in fadd, fsub and fmul (Andres).
- Improved documentation (Andres).
v4:
- Added f64 -> f32 conversion function (Andres).
- Added f32 -> f16 RTZ conversion function (Andres).
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Tested-by: Andres Gomez <agomez@igalia.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Since struct timespec's tv_sec member is of type time_t, adjust the
expected value to allow for the truncation which will occur with 32-bit
time_t.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
This was meant to include up to version 23.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0616b7ac90 ("vulkan: add vk_x11_strict_image_count option")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111522
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
This is embarrasing...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 04dc6074cf ("driconfig: add a new engine name/version parameter")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
This option strictly allocate the minImageCount given by the
application at swapchain creation.
This works around application that do not deal with the fact that the
implementation allocates more images than the minimum specified.
v2: Add values in default drirc (Bas)
v3: specify engine name/version (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111522
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Vulkan applications can register with the following structure :
typedef struct VkApplicationInfo {
VkStructureType sType;
const void* pNext;
const char* pApplicationName;
uint32_t applicationVersion;
const char* pEngineName;
uint32_t engineVersion;
uint32_t apiVersion;
} VkApplicationInfo;
This enables the Vulkan implementations to apply workarounds based off
matching this description.
Here we add a new parameter for matching the driconfig options with
the following :
<device driver="anv">
<application engine_name_match="MyOwnEngine.*" engine_versions="10:12,40:42">
<option name="blaaah" value="true" />
</application>
</device>
v2: switch engine name match to use regexps
v3: Verify that the regexec returns REG_NOMATCH for match failure (Eric)
v4: Add missing bit that went to the following commit (Eric)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
u_endian.h needs to be included, otherwise PIPE_ARCH_BIG_ENDIAN might not
be defined on big-endian architectures and the endian conversion macros
will be incorrect.
I don't think anything is broken because of this, I just noticed this when
looking at the file.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
In particular, it would be nice for failed debug_assert() msgs to show
up in logcat.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 955c63d364 ("util/os_file: resize buffer to what was actually needed")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
While the documentation for _BitScanReverse64 on MSDN says that it's
available on ARM, this isn't true. It's only available on ARM64. So
let's match reality.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Matt Turner <mattst88@gmail.com>
This code generates CVTSD2SI, which requires SSE2. So let's fix the
required SSE-version.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 5de29ae (util: try to use SSE instructions with MSVC and 32-bit gcc)
Reviewed-by: Matt Turner <mattst88@gmail.com>
strchrnul is not available on macOS.
pipe_loader.c:141:14: error: implicit declaration of function 'strchrnul' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
next = strchrnul(library_paths, ':');
^
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
MinGW headers already define it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
With MinGW cross compilation.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
This always returns a int64_t, translating to _mesa_lroundevenf on
systems where long is 64-bit, and llrintf where "long long" is needed.
Fixes: 594fc0f859 ("mesa: Replace F_TO_I() with _mesa_lroundevenf().")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
At least on Linux, we can use the ELF auxiliary vector to
detect the presence of AltiVec, VSX and other CPU features
without having to go through handling SIGILL, which has
various problems of its own.
A similar thing is already being done for ARM to detect NEON.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Daniel Kolesa <daniel@octaforge.org>
timespec_get() is not available on macos, we need to pull in the
include/c11/threads_posix.h helper.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103674
Fixes: e2d761de03 ("util: drop final reference to p_compiler.h")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
We added this utility for vulkan where all timeouts are given as
uint64_t values. We can switch from signed to unsigned as this is the
only user and if we ever deal with signed integers somewhere else
we'll have to be careful to use the corresponding
timespec_(add|sub)_msec and always pass absolute values.
v2: Forgot to drop the test calling add_nsec() with a negative number
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Fixes: d2d70c3bb5 ("util: add a timespec helper")
Acked-by: Daniel Stone <daniels@collabora.com>
Commit fixes current crashes with Vulkan applications on Android.
Fixes: c0376a1234 "util: add anon_file.h for all memfd/temp file usage"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Expose configs when allow_fp16_configs has been enabled and
DRI_LOADER_CAP_FP16 is set in the loader.
Also, define a new dri configuration option so users can disable exposure of
fp16 formats. Make fp16 opt-in for i965.
Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>