mesa/src/util/u_queue.c:242:15: error: address of array 'queue->name'
will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
Fixes: b238e33bc9 "kutil/queue: add a process name into a thread name"
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
On Python 2, the default JSON separators are ', ' for items and ': ' for
dicts.
On Python 3, the default is the same when no indent is specified, but if
one is (and we do specify one) then the default items separator becomes
',' (the dict separator remains unchanged).
This change explicitly specifies the Python 3 default, which helps
ensuring that the output is identical, whether it was generated by
Python 2 or 3.
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
In Python, dictionaries and sets are unordered, and as a result their
is no guarantee that running this script twice will produce the same
output.
Using ordered dicts and explicitly sorting items makes the build more
reproducible, and will make it possible to verify that we're not
breaking anything when we move the build scripts to Python 3.
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
We already embed the headers, no need to redefine defines/structs.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
For gen8+, write out PPGTT tables in aub files so that full 48-bit
addresses can be serialized.
v2: Fix handling of `end` index in map_ppgtt
v3: Correctly mark GGTT entry as present (Rafael)
Signed-off-by: Scott D Phillips <scott.d.phillips@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Scott added new stuff in IGT.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
With PPGTT mappings, our aubinator implementation can be quite slow if
we request a buffer that doesn't exist. Instead of doing a PPGTT walk
for invalid addresses (0 lengths), wait until we're sure we want to
decode the data.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
v2: by Lionel
Fix memfd_create compilation issue
Fix pml4 address stored on 32 instead of 64bits
Return no buffer if first ppgtt page is not mapped
v3: Drop additional memfd_create() (Rafael)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
We use memfd to store physical pages as they get read/written to and
the GGTT entries translating virtual address to physical pages.
Based on a commit by Scott Phillips.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
This is a simple, invasive, liberally licensed red-black tree
implementation. It's an invasive data structure similar to the
Linux kernel linked-list where the intention is that you embed a
rb_node struct the data structure you intend to put into the
tree.
The implementation is mostly based on the one in "Introduction to
Algorithms", third edition, by Cormen, Leiserson, Rivest, and
Stein. There were a few other key design points:
* It's an invasive data structure similar to the [Linux kernel
linked list].
* It uses NULL for leaves instead of a sentinel. This means a few
algorithms differ a small bit from the ones in "Introduction to
Algorithms".
* All search operations are inlined so that the compiler can
optimize away the function pointer call.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Now that we're softpinning the address of our BOs in anv & i965, the
addresses selected start at the top of the addressing space. This is a
problem for the current implementation of aubinator which uses only a
40bit mmapped address space.
This change keeps track of all the memory writes from the aub file and
fetch them on request by the batch decoder. As a result we can get rid
of the 1<<40 mmapped address space and only rely on the mmap aub file
\o/
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
On a follow up commit in this series, we stop copying the data from
the mmap'ed file into our big gtt mmap, and start referencing data in
it directly. So reallocating the read buffer and adding more data from
stdin wouldn't work. For that reason, let's stop supporting stdin
process.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
These memory offsets are stored in the gen_batch_decode_ctx.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Commit f69bc797e1 did the following:
- if format.layout in ('bptc', 'astc'):
+ if format.layout in ('astc'):
The intention was to go from matching either 'bptc' or 'astc' to
matching only 'astc'.
But the new code doesn't respect this intention any more, because in
Python `('astc')` is not a tuple containing a string, it is just the
string. (the parentheses are simply ignored)
That means we now match any substring of 'astc', for example 'a'.
This commit fixes the test to respect the original intention.
Fixes: f69bc797e1 "gallium/auxiliary: Add helper support for
bptc format compress/decompress"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Always emitting a bottom-of-pipe event is quite dumb. Instead,
start to optimize these functions by syncing PFP for the
top-of-pipe and syncing ME for the post-index-fetch event.
This can still be improved by emitting EOS events for
syncing PS and CS stages.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This introduces radv_barrier() (same as the draw/dispatch codepath).
This helper is used for merging the code from CmdWaitEvents() and
CmdPipelineBarrier because it's quite similar.
We do ignore the source stage mask for CmdWaitEvents because
it's irrelevant when event objects are used.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Structures might be padded by the compiler and these padding bytes remain
un-initialized which in turn makes memcmp return a difference where from
the logical point of view there is none.
Fixes valgrind:
Conditional jump or move depends on uninitialised value(s)
at 0x4C32CBA: __memcmp_sse4_1 (vg_replace_strmem.c:1099)
by 0xB8D2537: r600_set_vertex_buffers (r600_state_common.c:573)
by 0xB71D44A: u_vbuf_set_driver_vertex_buffers (u_vbuf.c:1129)
by 0xB71F7BB: u_vbuf_draw_vbo (u_vbuf.c:1153)
by 0xB3B92CB: st_draw_vbo (st_draw.c:235)
by 0xB36B1AE: vbo_draw_arrays (vbo_exec_array.c:391)
by 0xB36BB0D: vbo_exec_DrawArrays (vbo_exec_array.c:550)
by 0x10A989: piglit_display (textureSize.c:157)
by 0x4F8F174: run_test (piglit_fbo_framework.c:52)
by 0x4F7BA12: piglit_gl_test_run (piglit-framework-gl.c:229)
by 0x10A60A: main (textureSize.c:71)
Uninitialised value was created by a stack allocation
at 0xB3948FD: st_update_array (st_atom_array.c:388)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Below tests would fail with an error message
"Vertex format (R4G4B4A4|R5G5B5A1) not supported."
Add the formate to the translation routine to enable these formats.
Fixes:
dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgba4_2d
dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgba4_cube
dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb5_a1_2d
dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb5_a1_cube
dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgba4_2d
dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgba4_cube
dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb5_a1_2d
dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb5_a1_cube
dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgba4_2d_array
dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgba4_3d
dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb5_a1_2d_array
dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb5_a1_3d
dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgba4_2d_array
dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgba4_3d
dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb5_a1_2d_array
dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb5_a1_3d
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
For a texture that has only one LOD defined, but for which
GL_TEXTURE_MAX_LEVEL is the default (1000) and
GL_TEXTURE_MIN_LOD != GL_TEXTURE_MAX_LOD the reading from the texture does
not properly resolve the LOD level and texture lookup might fail. Hence,
when no mipmap filter is given (indicating that no mip-mapping takes place),
force the LOD range to contain only value.
Fixes:
dEQP-GLES3.functional.shaders.texture_functions.texture*.(i|u)sampler2d*
dEQP-GLES3.functional.texture.format.sized.cube.rgb*
out of VK_GL_CTS/android/cts/master/gles3-master.txt
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
restart_index is later always used in a comparison, so it should be
initialized properly.
Fixes valgrind warning:
Conditional jump or move depends on uninitialised value(s)
at 0xB8D682F: r600_draw_vbo (r600_state_common.c:2153)
by 0xB71F743: u_vbuf_draw_vbo (u_vbuf.c:1156)
by 0xB3B92DB: st_draw_vbo (st_draw.c:235)
by 0xB36B1AE: vbo_draw_arrays (vbo_exec_array.c:391)
by 0xB36BB0D: vbo_exec_DrawArrays (vbo_exec_array.c:550)
by 0x10A989: piglit_display (textureSize.c:157)
by 0x4F8F174: run_test (piglit_fbo_framework.c:52)
by 0x4F7BA12: piglit_gl_test_run (piglit-framework-gl.c:229)
by 0x10A60A: main (textureSize.c:71)
Uninitialised value was created by a stack allocation
at 0xB3B90B0: st_draw_vbo (st_draw.c:143)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Its unlikely anyone will add proper ARB_direct_state_access compat
support before we branch 18.2. Enabling the extension in 4.5 at
least allows users to make use of MESA_GL_VERSION_OVERRIDE=4.5COMPAT
for games like No Mans Sky.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The game forgets to enable multiple extensions in its shaders, one
of those extesions is EXT_texture_array. But enabling this config
entry fixes at least one other rendering issue that enabling
EXT_texture_array on its own doesn't fix.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
There is a 15-character limit for thread names shared by the queue name
and process name. Shorten the thread name to make space for the process
name.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Compile times of simple shaders are reduced by ~20%.
Compile times of prologs and epilogs are reduced by up to 40%.
Reviewed-by: Dave Airlie <airlied@redhat.com>
This is basically LLVMTargetMachineEmitToMemoryBuffer inlined and reworked.
struct ac_compiler_passes (opaque type) contains the main pass manager.
ac_create_llvm_passes -- the result can go to thread local storage
ac_destroy_llvm_passes -- can be called by a destructor in TLS
ac_compile_module_to_binary -- from LLVMModuleRef to ac_shader_binary
The motivation is to do the expensive call addPassesToEmitFile once
per context or thread.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Changes in v2:
- make loadSuInfo32() protected without making the rest protected
- move NVC0_SU_INFO_* into nv50_ir_lowering_nvc0.h instead of duplicating
NVC0_SU_INFO_MS
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
In 6f5abf3146 this code was fixed to calculate the maximum size of
an attribute in a seperate pass and then allocate the registers to
that size. However this wasn’t taking into account ranges that overlap
but don’t have the same starting location. For example:
layout(location = 0, component = 0) out float a[4];
layout(location = 2, component = 1) out float b[4];
Previously, if ‘a’ was processed first then it would allocate a
register of size 4 for location 0 and it wouldn’t allocate another
register for location 2 because it would already be covered by the
range of 0. Then if something tries to write to b[2] it would try to
write past the end of the register allocated for ‘a’ and it would hit
an assert.
This patch changes it to scan for any overlapping ranges that start
within each range to calculate the maximum extent and allocate that
instead.
Fixed Piglit’s arb_enhanced_layouts/execution/component-layout/
vs-fs-array-interleave-range.shader_test
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 6f5abf3146 "i965: Fix output register sizes when multiple variables
share a slot."
The current code causes:
/usr/include/c++/8/debug/safe_iterator.h:207:
Error: attempt to copy from a singular iterator.
This is due to the iterators getting invalidated, fix the
reverse iterator to use the return value from erase, and
cast it properly.
(used Mathias suggestion)
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
This ports radv to the shared code, however due to a bug in LLVM
version prior to 7, radv cannot add target info at this stage,
as it would leak one for every shader compile, however I'd prefer
to keep this llvm damage in the shared code, since it isn't the
driver at fault here. We just add a flag to denote if the driver
can support leaking the target info or not, and the common code
does the right thing depending on the llvm version.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
We want to share this code with radv in the future, so port
it out of radeonsi.
Add a return value as radv will want that to know if this
succeeds
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
As precursor to moving init to common code, just rename the struct
and move it.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This is prep work for moving this to a per-thread struct
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>