Some filesystems don't fill in d_type in dirent. Resulting in
DT_UNKNOWN. Pass this entry through to the next step and use stat on the
full filepath to determine if it is a file.
sshfs is known to not fill d_type.
This resolves an issue where driconf living on an sshfs path wasn't
working.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13697>
Looks like the __attribute__ version doesn't work for C++ in the
Android build. Only found now because we don't enable
-Wimplicit-fallthrough by default project wide for C++. Only
ACO enables it.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13164>
This allows specific per-application override.
The existing MESA_EXTENSION_OVERRIDE env variable is kept.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13364>
Some applications may request an indirect context but this feature is
disabled by default on Xorg and thus context creation will fail.
This commit adds a drirc setting to force the creation of direct glx
context, regardless of what the app is requesting.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13246>
In the menu of CS:GO R8_SRGB textures are uploaded and read back, and
since R8_SRGB can't be read back on GLES, because it is not a rendertarget
format and glGetTexImage and siblings don't exists, we can't default to
enabling reading back this format. This leads to an emulation of the
glGetTexImage calls issued by CS:GO, and this slows down the menus a lot
(below 1 fps on Intel XE hosts).
So add this driconf tweak and enable it for CS:GO to work around the issue.
It can be done safely, because in this case we actually can use the data
that is stored on the host in the backing IOV.
This tweak lets the CS:GO menu run at around 60 FPS when run with virgl
on a Intel XE host when it would run with less than 1 FPS without the tweak.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: John Bates <jbates@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13572>
To allow using non-constant sampler array index. This prevent
Mari render fail for any scene.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13304>
For matching executable with variable names like Mari which
append version string to its executable like: Mari4.5v2, Mari4.7v4.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13304>
sysctl is the correct way of getting the current executable's path.
procfs is not mounted by default.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1598>
FreeBSD 5.0 was released in 2003.
We really do not need to check that we're on >= 4.4.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1598>
The GLSL spec says it's an error if a shader statically writes to these
2 variables.
Until this commit, Mesa refused to link a shader if it had an unused
function writing to one of these variables while another (used) function
wrote to the other.
This commit adds an option to perform dead function elimination after
the intra-stage linking step but before performing these checks.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12897>
The original implementation in os_memory_fd.c always uses memfds.
Replace this by using the already existing os_create_anonymous_file in
order to support older systems or systems without memfd.
Fixes: 1166ee9caf ("gallium: add utility and interface for memory fd allocations")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13331>
It's easy to make a mistake of using VK_ATTACHMENT_LOAD_OP_DONT_CARE
when LOAD_OP_LOAD should be used since all desktop GPUs do the same
thing (nothing) for both of them.
However tiler GPUs do actually care about them.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13367>
With the `gtest` protocol meson will add some extra arguments to the
test to generate better junit results, which may be useful. This
protocol is only available in meson 0.55.0+, so keep using the default
`exitcode` protocol for meson older than that.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8484>
There's no danger of accidentally using these, the default pixel format
is integer and if you want float you need to have explicitly asked for
it in eglChooseConfig.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13182>
This adds 60 bytes to both structures. It eliminates "False Sharing"
for atomic operations (see wikipedia).
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11618>
Make u_vector_init a wrapper to u_vector_init_pot. Let both take
(element_count, element_size) as parameters.
Motivated by eed0fc4caf ("vulkan/wsi/wayland: fix an invalid
u_vector_init call")
v2: rename u_vector_init_pot to u_vector_init_pow2
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13201>
Add utility functions to allocate aligned memory backed by
mem_fd objects. Add interface to Gallium for same allocation.
It will be used in later commits for external memory support
in Vulkan/OpenGL.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Heinrich Fink <hfink@snap.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12345>
Turnip will use this for mapping the to the corresponding hardware format,
and we could also extend mesa/st to avoid adding extra samplers for
external image sampling of YV12 on freedreno.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13046>
This is basically the same workaround as in 9b577f2a88 (driconf, glsl: Add a
vs_position_always_invariant option) commit but for tesselation evaluation
shaders. Some applications do not mark outputs as precise in tesselation
evaluation shaders which can lead to different results in case some
optimizations were applied.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Fixes: 09705747d7 ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13027>
To make sure we are not just using the in-memory cache index for
the single file cache, we test adding and retriving cache items
between two different cache instances.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12925>
This simple test validates that it is possible to set bits
across word bounary.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11321>
By default, Mesa's X11 Vulkan WSI will wait for buffers to be ready
before submitting them to Xwayland when the swapchain is created
with the IMMEDIATE mode.
This is undesirable when the Wayland compositor already monitors
fences. A Wayland compositor may want to know the delay between
the buffer submition and the end of the GPU work, this is impossible
to measure if the WSI waits for the buffer to be ready before
submission.
Since most compositors don't monitor fences, let's introduce a driconf
option for this for now. We can reconsider once more compositors
have better support for fences.
Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11290>
Re-usable command buffer could be resubmitted any number of times,
but tracepoints are written once. u_trace_clone_append allows
copying tracepoints and copying timestamps if GPU doesn't support
writing timestamps into indirect address.
The case of tiling GPUs is similar, command stream for draws is
resubmitted for each tile.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10969>
Add ability to auto-generate:
- printing of args for "GPU_TRACE=1", still could be overriden with
tp_print.
- population of extra data for perfetto event.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10969>
With little modifications u_trace could be usable for Vulkan drivers.
Beside removing dependencies on gallium, the other notable change is
the passing of opaque flush_data pointer via u_trace_flush. There
is data which becomes available only at this point which other drivers
may want to pass.
For example Vulkan drivers would want to pass at least submission id
(for perfetto) and a sync object to wait on in u_trace_read_ts.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10969>
This partially reverts commit 4d9acfa533.
The original patch said:
"Python 3 handles unicode strings by default, so we can drop all that."
But this breaks building on RHEL 7 (or similiar) since python3 support
on is much more limited than newer distros. Backporting all the needed
python 3 libraries to EL7 is a pretty big task, and isn't very easy to
maintain.
For workstation purposes, we need the AMD MM/GL driver building on RHEL
7, so only src/util/driconf_static.py needs to be reverted.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12549>
Doesn't fix, but improves the situation in issue #5163. The
VK.spirv_assembly.instruction.graphics.spirv_ids_abuse.lots_ids_* tests
emit about 160k instructions. ir3_sched blows out the stack after
dag_traverse_bottom_up_node reaches a depth of about 130k frames.
This patch replaces the recursively-implemented post-order traversal
with an iterative implementation using a stack, allowing us to process
DAGs as large as memory can hold.
Definitely makes you appreciate the elegance of recursion...
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12232>
Since
2b5178ee util: Switch the non-block formats to unpacking rgba rows instead of rects,
compressed formats define unpack_rgba_8unorm_rect instead
of unpack_rgba_8unorm.
Fixes the u_format_translate check to take this into account.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5201
Fixes: 2b5178ee ("util: Switch the non-block formats to unpacking rgba rows instead of rects")
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12315>
A non-static inline function body is only actually emitted by GCC during optimization passes,
so running -O0 ends up never emitting the body, producing linker errors.
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12158>
The flock is per-fd, not per thread, and we do it outside of the main mutex. This was
done to avoid having to wait in the mutex, but we can get a case where one ends up running
the body with the flock unlocked.
Fix this by adding a mutex that doesn't need to be locked for reads.
Fixes: 4f0f8133a3 "util/fossilize_db: Do not lock the fossilize db permanently."
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12266>
Don't anticipate seeing any partial written headers but just in case we
should probably wait on the lock to make sure whatever header was being
written is finished being written.
Fixes: 4f0f8133a3 "util/fossilize_db: Do not lock the fossilize db permanently."
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12204>
If things went perfectly parsed_offset was never updated for the
final entry and we'd seek_set to the start of the last entry. Is
fun when appending to the file next.
Fixes: 2ec1bff0f3 "util/fossilize_db: Split out reading the index."
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12204>
Many places need to know the maximum or minimum possible value for a
given size integer... so everyone just open-codes their favorite
version. There is some potential to hit either undefined or
implementation-defined behavior, so having one version that Just Works
seems beneficial.
v2: Fix copy-and-pasted bug (INT64_MAX instead of INT64_MIN) in
u_intmin. Noticed by CI. Lol. Rename functions
`s/u_(uint|int)(min|max)/u_\1N_\2/g`. Suggested by Jason. Add some
unit tests that would have caught the copy-and-paste bug before wasting
CI time. Change the implementation of u_intN_min to use the same
pattern as stdint.h. This avoids the integer division. Noticed by
Jason.
v3: Add changes to convert_clear_color
(src/gallium/drivers/iris/iris_clear.c). Suggested by Nanley.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12177>
A number of android games are so far, sadly, unaware of open source
drivers. And when they see an unknown driver they lump it in the lowest
performance tier, artificially limiting framerate and/or gfx settings.
So until the games catch up, use driconf to override vendor/renderer
settings for moar fps and nicer gfx.
Furthermore, some games seem to be limiting *too* conservatively when
we otherwise have plenty of headroom even if we claim to be a bigger
adreno. Possibly a concession to battery life or tighter thermal
constraints in a phone, as compared to something like a chromebook.
Or maybe the flagship gaming phone thing is a scam ;-)
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12135>
Add support for driconf overrides on a per-device level, for cases
where we don't want to override behavior for all devices supported
by a particular driver.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12135>
This is the distinction between formats:
- PIPE_FORMAT_R64_FLOAT is the legacy "convert-to-float" vertex format.
- PIPE_FORMAT_R64_UINT is the raw double format.
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11370>
This simplifies the error exit paths for drivers that use these queues.
v2: Move allocation of queue->jobs after initializing the mutxes and
condition variables. Noticed by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11229>
Unfortunately I contacted the dev about this issue years ago and he
made a fix, but it has never been released after all these years.
This stops the screen from being completely black in game.
CC: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11941>
ls3a4000 and ls2k1000 cpu is mips64r5 compatible with MSA SIMD
instruction set implemented, while ls3a3000 is mips64r2 compatible only.
Due to lacking llvm support for loongson CPU, llvm::sys::getHostCPUName().
return "generic" on all loongson mips CPU.
So we override the MCPU to mips64r5 if MSA is implemented, feedback to
mips64r2 for all other ordinaries.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: suijingfeng <suijingfeng@loongson.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11955>
On APUs, we fake heaps to simulate a dGPU setup because it seems to
have the maximum compatibility. Though, some applications like RDR2
still only looks at GTT if the driver reports an iGPU which means it
will only use 1/3rd of total memory available.
This is currently behind a drirc option because it might have
implications for other apps but we might want to extend this later
if everything is fine.
Cc: 21.2 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11891>
Enable radeonsi_no_infinite_interp for Teardown to fix hangs.
Based on comments from #3714.
Tested-by: Joshua Ashton <joshua@froggi.es>
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Acked-by: Martin Peres <martin.peres@mupuf.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11814>
There are several dependencies on headers from
/gallium/include/pipe/
which currently mean that dependencies on util
must include gallium to compile.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11811>
application_name_match is a regex... and DCC was also disabled for
DOOM Eternal (because DOOMEternal matches DOOM). Fun.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11805>
Android.mk files haven't really been supported by Mesa devs for a long
time. Most of us have been willing to update Makefile.sources if we
remember and sometimes we try to blind code some Android.mk for a new
generator. However, the reality is that it breaks regularly and ends up
being maintained by the Android community. To address this problem
another approach was implemented in !10183 utilizing the maintained
meson build system. The old Android.mk files are no longer required.
This commit was created with the following commands:
git rm **/Android.mk
git rm **/Android.*.mk
git rm **/Makefile.sources
git rm CleanSpec.mk
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4487
Acked-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9728>
Both games perform two image layout transitions with the same image
in the same pipeline barrier with UNDEFINED and this re-initializes
DCC to the uncompressed state. No ideal solution sadly. Note that
both games declare all images as CONCURRENT.
This fixes rendering issues on GFX10+ because DCC for stores is
supported and this implicitly enables DCC for concurrent.
Fixes: da166f648f ("radv: enable DCC for concurrent images on GFX10")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4927
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4607
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11656>
Currently the cache just deletes enough items when the cache is
full to make room for the new item being stored. This hasn't
been too much of a problem in practice but for things like running
piglit where we have thousands of unique shaders and all threads
being utilised we end up with a pretty big bottle neck.
With this change rather than just brute forcing our way to having
enough room for the new item, we instead grab 10% of the least
recently used items in the random directory we chose and delete
them all. This should only be around 0.04% of total cache items
but should hopefully releave some of the pressure on system calls
like fstatat().
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11523>
Statement (void*)debug_name when FreeBSD is defined has no use. Removed
it to fix compiler warnings.
Signed-off-by: Eleni Maria Stea <elene.mst@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11203>
On Linux ENODATA is defined but on BSD, and MacOSX ENOATTR is used
instead. Defined ENODATA to be ENOATTR when the system is not Linux.
v2: Replaced ENODATA and ENOATTR with -EFAULT that is exists everywhere
and added a comment (Ian Romanick)
Signed-off-by: Eleni Maria Stea <elene.mst@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11203>
Apparently the quantization math isn't cheap.
This further reduces overhead by 2% for drawoverhead/8 textures.
The improvement is measured by looking at the sysprof percentage delta and
multiplying by 2 (because we have the frontend and gallium threads with
equal overhead, so the benefit is doubled compared to 1 thread).
Both per-sampler and per-unit lod bias values are quantized.
The difference in behavior is that both values are quantized separately
and then added up, instead of first added up and then quantized.
The worst case error is +- 1/256 in the reduced precision, i.e. off by one
in a fixed-point representation, which should be fine.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11339>
MSVC's qsort_s behaves similarly to sort_r. Unfortunately, qsort_s's
compare function has the "context"/"args" as its first argument. BSD's
qsort_r has a different order than GNU's qsort_r. Finally, C11 added
qsort_s's which look like GNU's gsort_r.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10989>
Android and MSVC don't have qsort_r() so let's provide a util wrapper
that uses the old qsort and thread-local storage. We use C++ for this
because thread_local is built into C++11 and we can't rely on C11
everywhere.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10989>
This thing is entirely opt-in wrt caring about it when writing to
a file anyway. Since we also lock the two at the same time and they
have an 1-1 relation we can just lock one of the two files. Saves
some syscalls.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11485>
This avoids all locks for reads and using lock only while actually
writing.
This is enabled by doing two things:
1) Reading the index incrementally. This way we get new entries
written by other processes and do not write duplicate entries.
2) Taking the lock only during writes, and applying the incremental
read while holding the lock so we always append to the actual end of the file.
Fixes: eca6bb9540 ("util/fossilize_db: add basic fossilize db util to read/write shader caches")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11485>
Often disassemblers and things in our drivers want to be able to
incrementally printf together a line, but that gets in the way of
Android's logging that wants to see a whole line all at once. Make a
little wrapper to do the ralloc_asprintf_rewrite_tail() and flushing lines
as they appear.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9262>
Instead of spawning 4 threads when the cache is created,
spawn 1 and let u_queue grow the number of threads if
needed.
I wrote this patch because when running piglit's quick_shader
profile I had lots of samples in disk cache threads - mostly
in native_queued_spin_lock_slowpath kernel function.
Since these tests shouldn't really stress the cache, I assumed
it was caused only by thread creations.
After writing the patch and redoing the measurement, I got an
improvement but I still more hits in the same function for
shader_runner:$disk0 thread so something was wrong.
After digging more, I found out that my shader cache index was
corrupted: the on-disk size was 29MB but the index reported it
was way more than 1GB. So each disk cache thread was spending
a lot of time trying to evict files. Given that my cache had
a really low count of files, the LRU method based on randomly
generating subfolder names failed, so evicting was very slow.
Now that my cache index is fixed, the disk cache threads are
mostly idle but I still think it makes sense to grow the
number of threads instead of spawning 4 at the program start.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11296>
This flag allow to create a single thread initially, but set
max_thread to the request thread count.
If the queue is full and num_threads is lower than max_threads,
we spawn a new thread to help process the queue faster.
This avoid creating N threads at queue creation time.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11296>
this better enables object-specific (e.g., context) queues where the owner
of the queue will always be needed and various pointers will be passed in
for tasks
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11312>
A similar path can be used on at least FreeBSD using cpuset_getaffinity.
This is how Ninja determines the number of available CPUs on that
platform. See the GetProcessorCount function in util.cc:
https://github.com/ninja-build/ninja/blob/master/src/util.cc
v2: Fix counting the number of available CPUs. The CPU_COUNT API does
not work the way I thought it did. :face_palm: Noticed by Marek.
Reviewed-by: Adam Jackson <ajax@redhat.com> [v1]
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> [v1]
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11228>
This prevents problems when some CPUs are offline. In a four CPU
system, if CPUs 1 and 2 are offline, the cache topology code would
only examine CPUs 0 and 1... giving incorrect information.
The types are changed to int16_t so that the offset of num_L3_caches
does not change. This triggered a STATIC_ASSERT failure:
STATIC_ASSERT(offsetof(struct util_cpu_caps_t, num_L3_caches) == 5 * sizeof(uint32_t));
I'm assuming there's some assembly code or something that depends on
this offset, and I don't feel like messing with it.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11228>
In the current code, this prevents a very unlikely corner case. More
importantly, it should prevent the next commit from breaking the
universe.
Imagine a system with 64 CPUs configured, but first 32 CPUs are offline.
_SC_NPROCESSORS_CONF will return 32. All of the surrounding code will
interpret this as meaning CPUs 0 through 31, but all of those CPUs are
offline. Nothing good can happen then.
The problem cases require systems with more than 32 CPUs because
util_cpu_caps.num_cpu_mask_bits is always rounded up to a multiple of
32.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11228>
memset operates in bytes, and there are 8-bits in a byte. This is a
very easy to miss typo. :(
Fixes: 9758b1d416 ("util: add util_set_thread_affinity helpers including Windows support")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11228>
Debugging fd mix-ups (ie. where, possibly via close()ing the original
fd, etc, you end up with something that is a valid fd but not a valid
*fence* fd) can be difficult. Fortunately we can use the FILE_INFO
ioctl, which will return an error if the fd is not a fence fd.
For android, we instead use the libsync API, which does a similar thing
on modern kernels, but has a fallback path for older android kernels.
Note that the FILE_INFO ioctl has existed upstream since at least prior
to destaging of sync_file.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11202>
Now that we have an idea of how many regs the conflicting allocation uses,
we can just skip to the next one and save repeated tests to find the same
conflicting neighbor again.
shadowrun-returns shader-db time on skl -1.62821% +/- 1.58079% (n=679),
now there's no statistically significant change from the start of the series
(n=420)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9437>
In the fully general case of register classes, to expose an allocation
class of unaligned 2-contiguous-regs allocations, for example, you'd have
your base individual regs (128 on intel), and another set of 127 regs that
each conflicted with the corresponding pair of the base regs. Single-reg
nodes would allocate in the 128, and double-reg nodes would allocate in
the 127 and the user would remap from the 127 down to the base regs with
some irritating table.
If you need many different contiguous allocation sizes (16 is a pretty
common number across drivers), your number of regs explodes, wasting
memory and making the q computation expensive at startup.
If all the user has is contiguous-reg classes, we can easily compute the q
value up front (as found in the intel driver and nouveau, for example),
and we only have to change a couple of places in the conflict-checking
logic so the contiguous-reg classes can use the base registers.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9437>
This was bumped in 7e584a70c4 ("gallium: increase table size for fast
log/pow functions") presumably to fix conformance of tgsi_exec, but we
don't need that much accuracy in the only place it's used in the tree any
more: softpipe texture sampling.
softpipe glmark2 -b texture:texture-filter=linear FPS +0.335748% +/-
0.220111% (n=20)
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11173>
It's disabled due to non-conformance with no configuration knob to turn it
on, and if you care about swrast performance you're on llvmpipe anyway.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11173>
DXVK 1.8.1 marks position as always invariant but it's disabled for
SotTR because it introduces rendering issues on NV. The DX12 version
also likely needs that.
Fixes a similar foliage issue initially found with the native version.
Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11006>
No more error-prone encoding of swizzles in the .csv for non-planar
formats!
No change to generated u_format_table.c
Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10505>
Once you read enough of them, there's an obvious pattern that we can just
write a little code for instead of making every dev write it out each time.
Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10505>
Just check against the CSV (which has its codegen now tested with
u_format_test in CI) for now, so we know that our computed channels are
correct.
Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10505>
It notably didn't fit the pattern of RGB5_A1_UNORM, and violated the
general pattern for bitmask format BE channels (channels are ordered
right-to-left in the BE columns in the CSV due to the parser walking them
in that order for historical reasons).
Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10505>
These tests passed for LE, and the BE channel ordering specified obviously
didn't fit the pattern of the other BE formats (channels are listed
right-to-left in the BE columns for historical reasons).
Note that we can't write pure-integer format tests in u_format_tests.c
currently.
Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10505>
Z32_FLOAT_S8X24_UINT and X32_S8X24_UINT are in fact the only non-bitmask
formats that have BE swizzles specified, but sorting out those two is
harder.
Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10505>
I wanted to do the next set BE changes here where I have Format's helper
functions available.
No changes in generated u_format_table.c.
Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10505>
My editor likes to enforce pep8, here's some low hanging fruit so I don't
have to do too much add -p.
Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10505>
Android 29 introduced general-dynamic TLS variable support ("quick
function call to look up the location of the dynamically allocated
storage"), while Mesa on normal Linux has used initial-exec ("use some of
the startup-time fixed slots, hope for the best!"). Both would be better
options than falling all the way back to pthread_getspecific(), which is
the alternative we have pre-29.
Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10389>
I'm going to add another case for Android shortly, and then we can keep
the logic all in one spot.
Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10389>
DXVK 1.8.1 marks position as always invariant but the DX12 version
of the game has the same issue and it's not yet fixed on the
vkd3d-proton side.
Fixes some Z-fighting on GFX10.3.
Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11029>
Various places around mesa which might want to register a data-source,
etc, should call util_perfetto_init() first to ensure we connect to the
tracing service.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9901>
gcc 11 warns:
../src/util/format/u_format_fxt1.c:940:22: warning: ‘fxt1_variance.constprop’ accessing 128 bytes in a region of size 64 [-Wstringop-overflow=]
940 | int32_t maxVarR = fxt1_variance(NULL, &input[N_TEXELS / 2], n_comp);
But, suspiciously, if you inline fxt1_variance the warning goes away.
What's happening is that the 2nd arg is uint8_t[N_TEXELS][MAX_COMP], so
it looks like we're passing too small of an array in since gcc knows
that `input` is also [N_TEXELS][MAX_COMP]. Fair enough. Fix the
signature to reflect what's actually going on, and remove some unused
arguments while we're at it.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10671>
Expressible as a prefix sum but that's a bit unnatural, so add a
convenience helper.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mike Blumenkrants <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10581>
For non 64bit devices the key stored in hash_table_u64 is wrapped in
hash_key_u64 structure, which is never free.
This commit fixes this issue by just removing the user-defined
`delete_function` parameter in hash_table_u64_{destroy,clear} (which
nobody is using) and using instead a delete function to free this
structure.
Fixes: 608257cf82 ("i965: Fix INTEL_DEBUG=bat")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10480>
The s8x24 is a packed group in the format, and for packing we properly
write our s8 in the low bits of the 32, but for unpacking we tried
treating it as a byte array.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7776>
Before we can fix big endian, we kind of have to agree on how pixel data
should be laid out in memory.
For this we have to use packed data for !is_array declared formats and per
element values for is_array ones.
Also we need proper PACKED_1xn definitions for BE.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7776>
This is similar to the transcode_etc flag in that it changes the ASTC
fallback (when present) to use DXT5 instead of RGBA8888. This reduces
the memory footprint of the app at the expense of a bit of correctness.
Because it's not quite correct, it's hidden behind a driconf option.
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10476>
This code is taken from src/freedreno/isa/decode.c.
Since we need a similar function in panfrost, it's probably good to move
it to utils.
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Acked-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9461>
channel datatypes in Mesa are the host's endianness. For example,
PIPE_FORMAT_R32_UINT doesn't do a bswap in and out in u_format_table.c's
pack/unpack functions. So, z32_unorm shouldn't be byte swapping either,
and neither should z24s8 which is also a packed format, and once you've
got those it becomes clear that all of the swaps in this file were
mistaken.
Things would mostly work out because it's unusual to read/write Z/S data
through the GL API, and even for drivers like softpipe as long as the pack
and unpack both swap it could work anyway. However, the bug would be
visible in glReadPixels() with the matching datatype which would hit the
memcpy fastpath without doing another swap.
Caught by a mesa/main unit test on transitioning to using these
pack/unpack functions.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10336>
This reverts commit cd12fcff96.
The terrain looks fine now that TRUNC_COORD=0 for textureGather().
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10036>
In looking at the profile of dEQP, GLES3 was spending 5-10% of its time in
ReadPixels, and almost all of that is b8g8r8a8_unorm8. It's really slow
because we're getting about 47MB/s by doing uncached reads 32 bits at a
time in the code-generated unpack. If we use NEON to generate larger bus
transactions, we can speed things up to 136MB/s. In comparison, raw
ldr/str read/writes with no byte swapping can hit a max of 216MB/sec.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10014>
We have only a few callers of unpack that do rects, so add a helper that
iterates over y adding the strides. This saves us 36kb of generated code
and means that adding cpu-specific variants for RGBA format unpack will be
much simpler.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10014>
After the pixel block width and height, a third field is used to
store the pixel block depth. Document this field.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10134>
With tearfree_discard=false, we tear when rendering
fullscreen apps with vsync off.
This is a feature in the sense it's the same as the native
implementation. This also means lower input lag.
However I think most users will prefer to have no tearing,
and don't care about sub refresh-rate input lag.
Thus it's better to default tearfree_discard to true.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10160>
512 MB is too ambitious as default value.
128 MB seems safer, and users can increase the limit
manually for the few games that would benefit from it.
Also fixes a typo in the description
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10160>
Until the problem described in
https://gitlab.freedesktop.org/mesa/mesa/-/issues/4489
is fixed, the advantages of using a sw renderer for the
sw rendering in nine are too small compared to the
disadvantages.
Add an option to control whether we use a sw renderer,
and make it so by default we don't.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10160>
Added:
* a pass that renumbers bases of IO intrinsics
* a pass that converts mediump IO to 16 bits, optionally using the new
packed varying slots
* a pass that sets (forces) mediump in IO intrinsics (for testing)
* a pass that remaps VARYING_SLOT_VAR[0..15]_16BIT to VARYING_SLOT_VAR[0..31]
(if some shader stages don't want packed varyings)
* a pass that folds type conversions around texture opcodes into those
opcodes (e.g. tex(f2f32(coord), ..) is changed into tex accepting f16)
* a pass that changes (legalizes) sampler src and dst types based on specified
hw constraints (e.g. derivatives must be the same type as coordinates)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9050>
None of the call sites pass a string here, which produces warnings
for MSVC, for not passing an argument to a macro which requires it.
Looks like GCC/clang stringize an unpassed argument to ""
Acked-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10157>
a common usage for hash tables is for tracking exactly one instance of a pointer
for a given period of time, after which the table's entries are purged and it
is reused
this macro enables the purge phase of such usage to reset the table to a
pristine state, avoiding future rehashing due to ballooning of deleted entries
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8498>
a common usage for sets is for tracking exactly one instance of a pointer
for a given period of time, after which the set's entries are purged and it
is reused
this macro enables the purge phase of such usage to reset the table to a
pristine state, avoiding future rehashing due to ballooning of deleted entries
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8498>
I missed a corner case here: when the next range ends right at the end
of the bitset, we need to return immediately to avoid trying to search
after the bitset. And when finding the next end, we similarly need to
bail if the range is size 1 at the very end of the range. In practice
this probably would'nt have been noticed, because it would break out of
the loop anyway, but I happened to be running something using this under
Valgrind and it complained.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10076>
This changes how the L3 cache affinity code works out the affinity
masks. It works better with multi-CPU systems and should also be
capable of handling big/little type situations if they appear in
the future.
It now iterates over all CPU cores, gets the core count for each
CPU, and works out the L3_ID from the physical CPU ID, and
the current cores L3 cache. It then tracks how many L3 caches
it has seen and reallocate the affinity masks for each one.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4496
Fixes: d8ea509965 ("util: completely rewrite and do AMD Zen L3 cache pinning correctly")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9782>
This adds a HAVE_COMPRESSION macro, which is undefined if neither zlib
nor zstd are present, and is used to no-op compress.h/c. This also has
a side effect of fixing SCons, since it won't define this macro.
Fixes: d7ecbd5bf8 ("util: create some standalone compression helpers")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9689>
MSVC warns (loudly) about a mismatch between the generated functions in
u_format_table.c and the definition of util_format_[un]pack_description,
specifically having 'restrict' in the function but not in the pointer
type in the struct.
So, add 'restrict' everywhere - to the function types, and to the rest
of the implementations that are assigned to those structs.
v2: util_format_unpack_description is used in gallium/auxiliary/translate
Fixes: 5785fdac ("u_format: Mark the generated pack/unpack src/dst args as restrict.")
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9670>
This refactors the cache code so that we create the full cache
item in memory before writing it to disk. The result is slightly
cleaner code and the ability to share this code between the single
and multi file cache implementations.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9443>
This will allow us to share validation between the single file cache
and the multifile cache.
It also reduces the number of malloc() calls.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9443>
We now handle compression in the shared cache item creation code.
Compressing the cache item header with the already compressed blob
doesn't help much so lets just remove it.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9593>
This makes compression use more consistent between the zstd and
zlib libraries. It also reduces the amount of code required for
zlib use.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9593>
This will be used by the following patch. It allows us to detangle
compression from the disk cache code, and abstract the underlying
compression libraries we use.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9593>
Calling code to pack/unpack with overlap would be already be undefined.
Cuts 50k of text on x86_64 release builds from the compiler having more
freedom in the src/dst loads knowing that they don't interfere with each
other.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9500>