Map all formats to a valid ifmt. FMT6_10_10_10_2_UNORM_DEST still
doesn't work on the blitter so keep that one on the u_blitter path.
Fixes
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.*
with FD_MESA_DEBUG=nogmem.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5717>
in some cases (e.g., gl_ClipDistance) the nir_variable type doesn't match
the needed destination type, so we can simplify this code to just use
the destination type
fixes spec@glsl-1.10@execution@interpolation@interpolation-none-gl_backcolor-smooth-vertex
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5852>
as in the comment, while we may want to try verifying that discard will be
the last instruction in a block, it's a bit problematic given that other nir
passes we're doing may insert instructions after a discard as part of e.g.,
nir_opt_dead_cf in the process of removing another block
fixes shaders@glsl-fs-discard-04
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5852>
if the last instruction in a loop's body terminates a block, e.g., from
a nested loop with a jump as its final instruction, then no block will
have been started when returning to the original loop, and there's no need
to emit a branch
fixes shaders@glsl-vs-continue-in-switch-in-do-while
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5852>
VK_QUERY_PIPELINE_STATISTIC_INPUT_ASSEMBLY_PRIMITIVES_BIT includes
primitives which won't get drawn due to e.g., not enough vertices emitted
by geometry shader
fixes spec@glsl-1.50@gs-emits-too-few-verts
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5533>
reset can't be performed during a renderpass, so we need to defer that until a
time when we're definitely not in a renderpass, such as when we're starting a
new query or resuming a query
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5533>
inline a query result value to each query object so we can stash the partial
result just before we do a pool reset, which will always happen during the
suspend/resume query mechanism that swaps active queries from a flushed batch
to the next batch
once (or if) the "real" call to fetch query results is called, we can dump the
inlined value into the fetch value and return the full results
fixesmesa/mesa#3000
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5533>
xfb queries allocate vk buffer objects in the underlying driver which
can be deallocated while an xfb query is in-flight if we attempt to
defer it due to the way that gl xfb is translated to vk, so we need to
continue forcing this behavior in that case
for other query types, we can safely defer here until the current batch has
finished rather than block
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5533>
this hooks up query objects to the batches that they're actively running on
(and the related fence) in order to manage the lifetimes of queries more
efficiently by calling vkCmdResetQueryPool only on init and when the query
pool has been completely used up. additionally, this resolves some vk spec
issues related to destroying pools with active queries
note that any time a query pool is completely used up, results are lost,
which is a very slight improvement on the previous abort() that was triggered
in that scenario
ref mesa/mesa#3000
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5533>
It's just whether it writes or not, which is already implied by the
presence/absence of a writer. So no need to track explicitly.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5859>
Always checked together and really signal the same property from
different perspectives.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5859>
When overwriting the writer, we need to release the old reference.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 2dad9fde50 ("panfrost: Start tracking inter-batch dependencies")
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5859>
Native loads are the same for any format, so we can use the same
shader variant for all framebuffer formats with a native load.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5755>
The pass is useful for EXT_shader_framebuffer_fetch, not just blend
shaders, so we should do it with the other lowering passes in
midgard_compile_shader_nir.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5755>
../src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp:69:8: warning: type ‘struct opProperties’ violates the C++ One Definition Rule [-Wodr]
69 | struct opProperties
| ^
../src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp:88:8: note: a different type is defined in another translation unit
88 | struct opProperties
| ^
../src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp:77:17: note: the first difference of corresponding definitions is field ‘fShared’
77 | unsigned int fShared : 3;
| ^
../src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp:96:17: note: a field with different name is defined in another translation unit
96 | unsigned int fImmd : 4; // last bit indicates if full immediate is suppoted
nvc0 code also has the same thing.
v2: rename both paths (Karol)
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5873>
We don't use linked texture/samplers. Fixes a bunch of CTS issues which
also seem to fail a bit randomly depending on what tests ran before and
such, so the list is incomplete.
Fixes:
KHR-GL46.texture_gather.*
KHR-GL46.compute_shader.resource-texture
KHR-GL46.multi_bind.dispatch_bind_samplers
KHR-GL46.multi_bind.dispatch_bind_textures
KHR-GL46.shading_language_420pack.binding_sampler_array
KHR-GL46.shading_language_420pack.binding_sampler_single
KHR-GL46.shading_language_420pack.binding_samplers
KHR-GL46.stencil_texturing.functional
KHR-GL46.texture_gather.incomplete-texture-last-comp
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5874>
This is required when the shader uses local memory for arrays or spills.
I really dislike how it's done right now, but oh well, it's the same for
other gens.
Fixes CTS tests:
KHR-GL46.shading_language_420pack.binding_image_array
KHR-GL46.shading_language_420pack.length_of_compute_result
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5840>
The pitchalign value was being left to 0 and then wrapping around when
the base offset was subtracted in texture state.
Fixes: 979e7e3680 ("freedreno/layout: layout simplifications and pitch from level 0 pitch")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5864>
The compute shader dirtying is a bit wrong here, since we don't
have a second stage like for fragment shaders, so dirty the compute
shader whenever a sampler or image changes, (ssbo/contexts don't
needs this).
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5835>
The fragment shader needs to be regenerated here,
so flag the same as for sampler views.
This causes correct flushing:
KHR-GL46.shader_image_load_store.non-layered_binding
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5835>
These variables are either not used in the code, only assigned but
never accessed, or only used inside codegen. Another reason is that this
patch will be preceding shader cache, and these variables are useless to
cache. Removing/moving them should make it clearer by removing the case something
from the structure is not cached.
Shader cache patch: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4264
Signed-off-by: Mark Menzynski <mmenzyns@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5697>
The current fall-through doesn't cause a difference in code flow, but
I think we want a break here.
Fixes: 2305ab6938 ("iris: Refactor modifier_is_supported for gen12")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5618>
Removes random global state flying about which doesn't really work for
common code. We cleanup some debug messages while we're at it because
the mostly-unused DBG macro relies on magic state.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5794>
Pass pools instead of batches, and rename in terms of pools instead of
transient memory for consistency while we're find-and-replacing.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5794>
This updates a3xx/a4xx/a5xx to fix the fetchsize to "PITCHALIGN" (called
"MINLINEOFFSET" by the a3xx docs), and some simplifications to make things
more like a6xx. Also similar simplifications for a2xx layout code.
The pitch can always be determined using a simple calculation from the base
level pitch, so don't pre-calculate a pitch for each mipmap level.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5796>
Two problems:
* Multiply has higher priority than shift
* rsc->layout.format isn't initialized for a2xx
Fixes: 5a8718f01b ("freedreno: Make the slice pitch be bytes, not pixels.")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5796>
according to EXT_multiview_draw_buffers, gl_FragColor outputs to all available
render targets when used, so we need to translate this to gl_FragData[PIPE_MAX_COLOR_BUFS]
in order to correctly handle more than one color buffer attachment
this fixes the rest of spec@arb_framebuffer_object tests in piglit
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5687>
gcc is not smart enough to see that
enum pipe_format dst_fmt;
...
switch (data_size) {
case 16:
dst_fmt = PIPE_FORMAT_R32G32B32A32_UINT;
...
break;
case 12:
/* RGB32 is not a valid RT format. This will be handled by the pushbuf
* uploader.
*/
break;
case 8:
dst_fmt = PIPE_FORMAT_R32G32_UINT;
...
break;
case 4:
dst_fmt = PIPE_FORMAT_R32_UINT;
...
break;
case 2:
dst_fmt = PIPE_FORMAT_R16_UINT;
...
break;
case 1:
dst_fmt = PIPE_FORMAT_R8_UINT;
break;
default:
assert(!"Unsupported element size");
return;
}
...
if (data_size == 12) {
...
return;
}
Does not result in dst_fmt being uninitialized when it is used so
lets just initialise it to silence the warning.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5766>
When executing for a single primitive, the mask has only one active
lane, however the vertex emit emits for all the lanes, pass in
the active mask and write the excess lanes to the overflow slot.
Fixes:
glsl-1.50-gs-max-output -scan 1 20
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5555>
This set of checks matched the "access" list in u_format_table.py that
controls initializing this this function pointer, so just use the function
pointer.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5728>
There's no need to go to all this trouble of setting up 16-byte vectors to
pack/unpack our 32-bit values, memcpy is really good at moving 4 bytes
around.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5728>
If we clear depth only texture via glClearTex(Sub)Image it may cause:
../src/intel/blorp/blorp_genX_exec.h:1554: blorp_emit_surface_states: Assertion `params->depth.enabled || params->stencil.enabled' failed.
due to clear_depth_stencil calling blorp_clear_depth_stencil when
depth is already fast-cleared and there is no stencil.
Fixes piglit test: arb_clear_texture-depth
Fixes: 51638cf18a
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5770>
In some places in SWR cod objects are initialized using
memset/memcpy. This is usually done to enable
allocating those objects in aligned memory.
It generates compilation warnings though,
which are worked around by casting the pointers to void*
before calling memset/memcpy.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5777>
For example only some UCPs may be used by the shader, triggering asserts
that too many consts are being uploaded.
While we're at it, also fix the const size when loading UCPs, since
otherwise it doesn't correspond to what the shader is actually using.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5752>
V3D has a bit to set the line caps to be perpendicular to the line
rather than aligned to the edges of the framebuffer. I don’t know what
the disadvantages are of enabling this, but I noticed by experimentation
that enabling line smoothing on the Intel driver also enables nicer line
caps, so it seems nice to enable it here too.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5624>
When line smoothing is enabled, the driver now increases the width of
the line so that it can add some semi-transparent pixels to either side
of the line. A lowering pass is added which modifies the alpha component
of every write to fragment output 0 so that if the fragment is outside
the width of the line then the alpha is reduced. It additionally
discards fragments that are completely invisible. It might seem bad to
use discard on a tiled renderer but the assumption is that any bad
effects from using discard will also happen anyway because of enabling
alpha blending.
v2: Disable the line smoothing pass entirely when the framebuffer
contains an integer colour output or one with no alpha channel.
Calculate the coverage once upfront and store in a global variable
instead of calculating each time an output write is modified. Also
do the conditional discard once upfront.
v3: Don’t check whether the output buffer has an alpha channel. Only
look at output 0. Use aa_line_width intrinsic instead of calculating
the real line width in the shader. Clamp the coverage as part of the
global variable, not per output write.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5624>
perf_cfg is enough - it already contains almost all necessary
information and is constructed in a more optimal way (O(n) vs O(n^2)
- it uses hash table to build the unique counter list).
"Almost all", because it doesn't contain OA raw counters, but
we should have not exposed them anyway. Quoting Mark Janes:
"I see no reason to include the OA raw counters in the list that
are provided to the user. They are unusable.
The MDAPI library can be used to configure raw counters in a way
that provides esoteric metrics, but that library is written against
INTEL_performance_query."
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5399>
This is necessary now that the compiler respects centroid interpolation,
even in non-MSAA mode. Otherwise the interpolation doesn't work. Fixes a
bunch of dEQP centroid transform feedback tests.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5778>
Gallium drivers should never see nir_var_uniform because gallium lowers
regular uniforms to a UBO. No GL driver should ever see either
nir_var_mem_shared because that's lowered in GLSL IR.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>
Gallium drivers should never see nir_var_uniform because gallium lowers
regular uniforms to a UBO. No GL driver should ever see either
nir_var_mem_shared because that's lowered in GLSL IR.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>
Gallium drivers should never see nir_var_uniform because gallium lowers
regular uniforms to a UBO. No GL driver should ever see either
nir_var_mem_shared because that's lowered in GLSL IR.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>
Gallium drivers should never see nir_var_uniform because gallium lowers
regular uniforms to a UBO. No GL driver should ever see either
nir_var_mem_shared because that's lowered in GLSL IR.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>
Gallium drivers should never see nir_var_uniform because gallium lowers
regular uniforms to a UBO. No GL driver should ever see either
nir_var_mem_shared because that's lowered in GLSL IR.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>
As we do not support stream output buffers we only count the primitives
processed by the pipeline. Use the correct query type.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5754>
For some reason, in order to get all tests to pass, pretty much all
hardware (across vendors) has to program in offset_units * 2. This fixes
dEQP-GLES3.functional.polygon_offset.float32_displacement_with_units.
While we're at it, add polygon offset clamp support.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5763>
The previous version assumes tess level outputs will only be written once
in the shader, however its not possible to guarantee that.
It also assumes all invocations will write all the levels, which is also
not guaranteed.
This is required to fix the "tesselation" and "terraintessellation" demos
with turnip.
The comment about nir_lower_io_to_temporaries in lower_tess_ctrl_block is
removed because nir_lower_io_to_temporaries specifically skips TESS_CTRL
shaders so the comment doesn't make sense.
The split load for tess levels workaround is removed, the new version only
has scalar access unless if ever gets vectorized.
This sets NIR_COMPACT_ARRAYS cap to avoid the glsl tess vec lowering with
gallium. It seems this will also disable "LowerCombinedClipCullDistance",
which I'm not sure was needed or not.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5744>
glGetPerfMonitorCounterInfoAMD(..., ..., GL_COUNTER_RANGE_AMD, ...)
returned NAN (binary representation of uint64_t(-1) as float) as
a max value.
Fixes: 0fd4359733 ("iris/perf: implement routines to return counter info")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5473>
Atm. it is not possible to move the enums to a header file
as they do not use an identifier but directly define an
object.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Acked-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5690>
this is a simple extension that enables using uint8-sized index buffers,
which lets us avoid having those go through primconvert
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5712>
This just refactors the image code, so that outdata is passed
explicitly, and refactors the internal handling of NONE formats.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3778>
This isn't fully free of bugs, but it's good to get CI working,
so fixing those bugs doesn't break anything.
The main buggy areas are missing indirect texture size,
and transform feedback geometry streams.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3778>
This isn't needed for the basic indirect code, but it is needed for
texture size/image size unfortunately. They could be done with a super
switch, but it seems simple to query them.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3778>
In FreeBSD x86 and aarch64 __u64 is typedef to unsigned long and
is the same size as unsigned long long.
Since we are explicitly specifying the format, cast the value
to the proper type.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emmanuel <manu@FreeBSD.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3559>
Note:
* a3xx change based on available register documentation
* a4xx guesses (RB_RENDER_CONTROL2 bits especially)
* a5xx change based on a6xx, these registers seem identical
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5582>
Casting to u8 arrays and picking the lowest byte is fairly LE specific
grab the other byte.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5679>
This allows doing occlusion queries in one frame and getting the
results in the next frame without having to flush.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5676>
Commit e369b8931c ("freedreno: Use explicit *_NONE enum for undefined
formats") only partially converts ~0 to *_NONE enum. It breaks texture
support, and glmark2 texture scene gives a black screen.
Adding the missing conversion of ~0 to *_NONE enum fixes the issue.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5693>
We weren't initializing the VCM bits in the !gs path, but v33 doesn't have
GS so we can just mark it unreachable.
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2952>
So it could be used by both the OpenGL and the Vulkan driver.
In addition to the move, some small changes were needed to be made on
the API. For example, the simulator was receiving v3d_screen on
initialization, and that code setted v3d_screen->sim_file. Now it
returns the new sim_file created.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5666>
Chris Wilson noted that u_default_texture_subdata's transfer path
sometimes results in wasteful double copies. This patch is based
on an earlier path he wrote, but updated now that we have staging
blits for busy or compressed textures.
Consider the case of idle, non-CCS-compressed, tiled images:
The transfer-based CPU path has to return a "linear" mapping, so upon
map, it mallocs a temporary buffer. u_default_texture_subdata then
copies the client memory to this malloc'd buffer, and transfer unmap
performs a tiled_memcpy to copy it back into the texture. By writing
a direct texture_subdata() implementation, we're able to directly do
a tiled_memcpy from the client memory into the destination texture,
resulting in only one copy.
For linear buffers, there is no advantage to doing things directly, so
we simply fall back to u_default_texture_subdata()'s transfer path to
avoid replicating those cases.
We still may want to use GPU staging buffers for busy destinations
(to avoid stalls) or CCS-compressed images (to compress the data),
at which point we also fall back to the existing path. We thought
to try and use a tiled temporary, but this didn't appear to help.
Improves performance in x11perf -shmput500 by 1.96x on my Icelake.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2500
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3818>
Adds a shader disk-cache for ir3 shader variants. Note that builds with
`-Dshader-cache=false` have no-op stubs with `disk_cache_create()` that
returns NULL.
Binning pass variants are serialized together with their draw-pass
counterparts, due to shared const-state.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
For shader-cache, we are going to want to serialize them together.
Which is awkward if the two related variants are not compiled together.
This also decouples allocation and compile, which will simplify adding
shader-cache (which still needs to allocate, but can skip compile).
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
Currently we always do sysmem if there is tess. And for GS, the binning
pass VS ends up identical to the draw pass VS, so no point in compiling
it twice. (For GS what we should do someday is generate a binning pass
GS, and possibly if we can do cross-stage linking opts, an optimized
binning pass VS, but the required outputs would somehow have to end up
in the shader variant key.)
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
Use ir3_compiler_destroy() rather than open-coding ralloc_free(). This
will give us a place to add more compiler related cleanup code in the
following patches.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
The next step is to hook this into pscreen->finalize_nir() so it can
come before the state tracker's shader-caching.
Unfortunately we still need to do lower_io after mesa/st, so that is
split out into a post-finalize pass.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
This was already done correctly for the indirect variants, and turnip
was setting the correct value, but it seems freedreno missed the change
in the non-indirect variant. Also, fix a misspelling of "indices" and
add a type to INDX_SIZE.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5644>
Fixes a performance regression in alacritty, and rendering is still
fine in GLQuake ports.
Fixes: 361fb38662 ("panfrost: Copy resources when mapping to avoid waiting for readers")
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5642>
last_seqnos is used in atomic operations. Specially on 32 bit platorms,
it tends to be slower if it's not aligned to 64 bits (see
cdc331c6f9). This fixes a small regression
on Bioshock.
Fixes: aba3aed96e ("iris: fix export of GEM handles")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5637>
Produced by comparing the traces of:
dEQP-VK.rasterization.culling.front_triangles
dEQP-VK.rasterization.culling.front_triangles_point
dEQP-VK.rasterization.culling.front_triangles_line
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5650>
Translate PIPE_BIND_SCANOUT as ISL_SURF_USAGE_DISPLAY_BIT,
instead of PIPE_BIND_DISPLAY_TARGET.
PIPE_BIND_DISPLAY_TARGET isn't used for dri images and seem to
be set only for fake winsys buffers (which aren't displayed).
The trouble is that a fake buffer could be multisampled and we
cannot have multisampled surface with display bit.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2313
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4784>
In rare cases, we can get into situations where the tracking of read/
written resources triggers a flush of the current batch.
To handle that, (1) take a reference to the current batch, so it doesn't
disappear under us, and (2) check after resource tracking whether the
current batch was flushed. If it is, we have to re-do the resource
tracking, but since we have a fresh batch, it should not get flushed the
second time around.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3160
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5634>
We don't really need the varying remapping, and it seems to somehow
happen twice when shader-cache comes into the picture. But we can
just choose not to have this problem.
Now that everything is using the ir3_point_sprite() helper, we can
flip this pipe cap without it being a massive flag-day.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5595>
The correct varying semantic to use depends on PIPE_CAP_TGSI_TEXCOORD.
To handle this transition switch it over to ureg builder, and query the
pipe-cap to choose the appropriate semantic.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5595>
When we flip the texcoord patch, we'll setup PNTC input slot in the
pre-built interp stateobj, rather than this being a draw-time (slow-
path) built stateobj. But rather than duplicate more of the slow-
path logic, refactor it out into a helper that is reused in both
cases.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5595>
This will simplify a bit the logic for setting up vinterp/vprepl in the
driver backend, and also avoid it being a flag-day when we switch the
texcoord pipe cap.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5595>
Simplification and alignment with meson's sources generation rules
Changelog:
- move rules from src/gallium/drivers/freedreno/Android.gen.mk to Android.ir3.mk
- simplify LOCAL_GENERATED_SOURCES based on $(ir3_GENERATED_FILES)
- remove includes of src/gallium/drivers/freedreno/Android.gen.mk
- remove src/gallium/drivers/freedreno/Android.gen.mk
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5580>
Changelog:
- Makefile.sources: add ir3_lexer.c and ir3_parser.{c,h} generated sources
- Android.ir3.mk: add the necessary generated sources rules
- Android.ir3.mk: add the necessary include paths
- src/gallium/drivers/freedreno/Android.gen.mk: generate only ir3_nir_{imul,trig}.c for the moment
Fixes the following building error:
target C: libfreedreno_ir3 <= external/mesa/src/freedreno/ir3/ir3_assembler.c
FAILED: out/target/product/x86_64/obj/STATIC_LIBRARIES/libfreedreno_ir3_intermediates/ir3/ir3_assembler.o
...
external/mesa/src/freedreno/ir3/ir3_assembler.c:28:10: fatal error: 'ir3_parser.h' file not found
^~~~~~~~~~~~~~
1 error generated.
Fixes: 1e8808a4a0 ("freedreno/ir3: refactor out helper to compile shader from asm")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5580>
this is required by spec, so we can generally assume that any time it's 0 here
this is the result of us being lazy elsewhere in the zink driver when we're
manually creating this sort of buffer
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5614>
layout_resource_for_modifier() needs to handle DRM_FORMAT_MOD_INVALID
as well, since src/gallium/frontends/dri/dri2.c uses this to indicate
"no modifier" when it's called through the older non-modifier entry
points.
This is similar to 334788d4 ("freedreno: allow INVALID modifier") but
for the generic implementation.
Fixes: 98910626 ("freedreno/a6xx: Implement layout for DRM_FORMAT_MOD_QCOM_COMPRESSED")
Closes: #3154
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5611>
When I was bringing up the driver, I had BLORP use #ifdefs for
softpin-mode vs. relocation-mode. That all got reworked during
review, but apparently this #define is still kicking around,
even though nothing uses it.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5610>
If there are holes between color outputs (e.g. a shader exports MRT1, but
not MRT0), we can remove the holes by moving higher MRTs lower.
The hardware will remap the MRTs to their correct locations if we remove
holes in SPI_SHADER_COL_FORMAT but not CB_SHADER_MASK.
This is a performance optimization, but MRTs with holes are pretty rare,
so there is most likely no effect on any app.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5535>
We are starting to see platforms that don't support the get/set tiling
uAPI. (For example, DG1.)
Additionally on DG1 we shouldn't be using the map_gtt anymore.
Let's add some asserts and make sure we don't take those paths
accidentally.
Rework:
* Jordan: Only apply for DG1, not all gen12
* Rafael: Use has_tiling_uapi
* Jordan: Copy has_tiling_uapi from devinfo
* Jordan: merge in "iris: Rework iris_bo_import_dmabuf() a little."
* Jordan: Continue to call get/set_tiling on modifier path
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4956>
Reworks:
* Jordan: Check for cfg == NULL rather than is_dg1
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4956>
The hardware can only support the PIPE_CAP_PRIMITIVE_RESTART_FIXED_INDEX
subset. This will make it stop advertising the NV_primitive_restart
extension without breaking GLES 3.0 support.
v2: Update features.txt
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
Reviewed by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5559>
Adds PIPE_CAP_PRIMITIVE_RESTART_FIXED_INDEX which is a subset of the
primitive restart cap for when the hardware can only support the fixed
indices specified in GLES.
The switch statements were automatically modified with this command:
find \( \( -name \*.cpp -o -name \*.c \) \! -type l \) \
-exec sed -i -r \
's/^(\s*case\s+PIPE_CAP_PRIMITIVE_RESTART)\s*:.*$/\0\n\1_FIXED_INDEX:/' \
{} \;
v2: Add a note in screen.rst
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
Reviewed by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5559>
When doing RA we end up with adding ValueDef references to Values across
all over the shader. This is all fine until we remove the Instruction
defining those Values, which happens when spilling values.
Instead of manipulating the values directly we should just track all
merged in defs in a seperate structure and remove stale references when
an instruction gets deleted in the spiller.
fixes following libasan report:
=================================================================
==612087==ERROR: AddressSanitizer: heap-use-after-free on address 0x6150003ea380 at pc 0x7f1d12142fe9 bp 0x7fffca6fd120 sp 0x7fffca6fd110
READ of size 8 at 0x6150003ea380 thread T0
#0 0x7f1d12142fe8 in nv50_ir::ValueDef::get() const ../src/gallium/drivers/nouveau/codegen/nv50_ir.h:648
#1 0x7f1d12143c02 in nv50_ir::Value::getUniqueInsn() const ../src/gallium/drivers/nouveau/codegen/nv50_ir_inlines.h:229
#2 0x7f1d1221530d in nv50_ir::RegAlloc::BuildIntervalsPass::addLiveRange(nv50_ir::Value*, nv50_ir::BasicBlock const*, int) ../src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp:333
#3 0x7f1d1221872e in nv50_ir::RegAlloc::BuildIntervalsPass::visit(nv50_ir::BasicBlock*) ../src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp:686
#4 0x7f1d1215676c in nv50_ir::Pass::doRun(nv50_ir::Function*, bool, bool) ../src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp:495
#5 0x7f1d121563ed in nv50_ir::Pass::run(nv50_ir::Function*, bool, bool) ../src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp:477
#6 0x7f1d122262b8 in nv50_ir::RegAlloc::execFunc() ../src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp:1910
#7 0x7f1d122256b0 in nv50_ir::RegAlloc::exec() ../src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp:1849
#8 0x7f1d12226f1e in nv50_ir::Program::registerAllocation() ../src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp:1970
#9 0x7f1d1214092a in nv50_ir_generate_code ../src/gallium/drivers/nouveau/codegen/nv50_ir.cpp:1275
#10 0x7f1d1227461b in nvc0_program_translate ../src/gallium/drivers/nouveau/nvc0/nvc0_program.c:634
#11 0x7f1d12294b21 in nvc0_sp_state_create ../src/gallium/drivers/nouveau/nvc0/nvc0_state.c:620
#12 0x7f1d12294d90 in nvc0_fp_state_create ../src/gallium/drivers/nouveau/nvc0/nvc0_state.c:661
#13 0x7f1d12ad4912 in st_create_fp_variant ../src/mesa/state_tracker/st_program.c:1498
#14 0x7f1d12ad4cd5 in st_get_fp_variant ../src/mesa/state_tracker/st_program.c:1525
#15 0x7f1d12ad8252 in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:2053
#16 0x7f1d12c9c851 in st_program_string_notify ../src/mesa/state_tracker/st_cb_program.c:185
#17 0x7f1d12d17731 in st_link_tgsi ../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:7441
#18 0x7f1d12cabaf0 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_ir.cpp:175
#19 0x7f1d127c85ca in _mesa_glsl_link_shader ../src/mesa/program/ir_to_mesa.cpp:3186
#20 0x7f1d1252a9f7 in link_program ../src/mesa/main/shaderapi.c:1285
#21 0x7f1d1252a9f7 in link_program_error ../src/mesa/main/shaderapi.c:1384
#22 0x7f1d1252deb3 in _mesa_LinkProgram ../src/mesa/main/shaderapi.c:1876
#23 0x403e13 in main._omp_fn.0 /home/kherbst/git/shader-db/run.c:926
#24 0x7f1d17b8b4b5 in GOMP_parallel (/lib64/libgomp.so.1+0x124b5)
#25 0x4029e4 in main /home/kherbst/git/shader-db/run.c:765
#26 0x7f1d179b51a2 in __libc_start_main ../csu/libc-start.c:308
#27 0x402d1d in _start (/home/kherbst/git/shader-db/run+0x402d1d)
0x6150003ea380 is located 0 bytes inside of 504-byte region [0x6150003ea380,0x6150003ea578)
freed by thread T0 here:
#0 0x7f1d17e5d96f in operator delete(void*) (/usr/lib64/libasan.so.5.0.0+0x11096f)
#1 0x7f1d1214ec0f in __gnu_cxx::new_allocator<nv50_ir::ValueDef>::deallocate(nv50_ir::ValueDef*, unsigned long) /usr/include/c++/9/ext/new_allocator.h:128
#2 0x7f1d1214dc00 in std::allocator_traits<std::allocator<nv50_ir::ValueDef> >::deallocate(std::allocator<nv50_ir::ValueDef>&, nv50_ir::ValueDef*, unsigned long) /usr/include/c++/9/bits/alloc_traits.h:470
#3 0x7f1d1214c5fb in std::_Deque_base<nv50_ir::ValueDef, std::allocator<nv50_ir::ValueDef> >::_M_deallocate_node(nv50_ir::ValueDef*) /usr/include/c++/9/bits/stl_deque.h:624
#4 0x7f1d121498c4 in std::_Deque_base<nv50_ir::ValueDef, std::allocator<nv50_ir::ValueDef> >::_M_destroy_nodes(nv50_ir::ValueDef**, nv50_ir::ValueDef**) /usr/include/c++/9/bits/stl_deque.h:758
#5 0x7f1d1214704d in std::_Deque_base<nv50_ir::ValueDef, std::allocator<nv50_ir::ValueDef> >::~_Deque_base() /usr/include/c++/9/bits/stl_deque.h:680
#6 0x7f1d12145371 in std::deque<nv50_ir::ValueDef, std::allocator<nv50_ir::ValueDef> >::~deque() /usr/include/c++/9/bits/stl_deque.h:1069
#7 0x7f1d1213bc5b in nv50_ir::Instruction::~Instruction() ../src/gallium/drivers/nouveau/codegen/nv50_ir.cpp:615
#8 0x7f1d1213fb2f in nv50_ir::Program::releaseInstruction(nv50_ir::Instruction*) ../src/gallium/drivers/nouveau/codegen/nv50_ir.cpp:1148
#9 0x7f1d122250fb in nv50_ir::SpillCodeInserter::run(std::__cxx11::list<std::pair<nv50_ir::Value*, nv50_ir::Value*>, std::allocator<std::pair<nv50_ir::Value*, nv50_ir::Value*> > > const&) ../src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp:1830
#10 0x7f1d12221445 in nv50_ir::GCRA::allocateRegisters(nv50_ir::ArrayList&) ../src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp:1541
#11 0x7f1d122262e9 in nv50_ir::RegAlloc::execFunc() ../src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp:1913
#12 0x7f1d122256b0 in nv50_ir::RegAlloc::exec() ../src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp:1849
#13 0x7f1d12226f1e in nv50_ir::Program::registerAllocation() ../src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp:1970
#14 0x7f1d1214092a in nv50_ir_generate_code ../src/gallium/drivers/nouveau/codegen/nv50_ir.cpp:1275
#15 0x7f1d1227461b in nvc0_program_translate ../src/gallium/drivers/nouveau/nvc0/nvc0_program.c:634
#16 0x7f1d12294b21 in nvc0_sp_state_create ../src/gallium/drivers/nouveau/nvc0/nvc0_state.c:620
#17 0x7f1d12294d90 in nvc0_fp_state_create ../src/gallium/drivers/nouveau/nvc0/nvc0_state.c:661
#18 0x7f1d12ad4912 in st_create_fp_variant ../src/mesa/state_tracker/st_program.c:1498
#19 0x7f1d12ad4cd5 in st_get_fp_variant ../src/mesa/state_tracker/st_program.c:1525
#20 0x7f1d12ad8252 in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:2053
#21 0x7f1d12c9c851 in st_program_string_notify ../src/mesa/state_tracker/st_cb_program.c:185
#22 0x7f1d12d17731 in st_link_tgsi ../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:7441
#23 0x7f1d12cabaf0 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_ir.cpp:175
#24 0x7f1d127c85ca in _mesa_glsl_link_shader ../src/mesa/program/ir_to_mesa.cpp:3186
#25 0x7f1d1252a9f7 in link_program ../src/mesa/main/shaderapi.c:1285
#26 0x7f1d1252a9f7 in link_program_error ../src/mesa/main/shaderapi.c:1384
#27 0x7f1d1252deb3 in _mesa_LinkProgram ../src/mesa/main/shaderapi.c:1876
#28 0x403e13 in main._omp_fn.0 /home/kherbst/git/shader-db/run.c:926
previously allocated by thread T0 here:
#0 0x7f1d17e5c9d7 in operator new(unsigned long) (/usr/lib64/libasan.so.5.0.0+0x10f9d7)
#1 0x7f1d1215046f in __gnu_cxx::new_allocator<nv50_ir::ValueDef>::allocate(unsigned long, void const*) /usr/include/c++/9/ext/new_allocator.h:114
#2 0x7f1d1214ebec in std::allocator_traits<std::allocator<nv50_ir::ValueDef> >::allocate(std::allocator<nv50_ir::ValueDef>&, unsigned long) /usr/include/c++/9/bits/alloc_traits.h:444
#3 0x7f1d1214dbd3 in std::_Deque_base<nv50_ir::ValueDef, std::allocator<nv50_ir::ValueDef> >::_M_allocate_node() /usr/include/c++/9/bits/stl_deque.h:617
#4 0x7f1d1214c464 in std::_Deque_base<nv50_ir::ValueDef, std::allocator<nv50_ir::ValueDef> >::_M_create_nodes(nv50_ir::ValueDef**, nv50_ir::ValueDef**) (/home/kherbst/local/lib64/dri//nouveau_dri.so+0x829464)
#5 0x7f1d121495cd in std::_Deque_base<nv50_ir::ValueDef, std::allocator<nv50_ir::ValueDef> >::_M_initialize_map(unsigned long) /usr/include/c++/9/bits/stl_deque.h:716
#6 0x7f1d12146f7d in std::_Deque_base<nv50_ir::ValueDef, std::allocator<nv50_ir::ValueDef> >::_Deque_base() /usr/include/c++/9/bits/stl_deque.h:507
#7 0x7f1d1214518d in std::deque<nv50_ir::ValueDef, std::allocator<nv50_ir::ValueDef> >::deque() /usr/include/c++/9/bits/stl_deque.h:912
#8 0x7f1d1213b9c9 in nv50_ir::Instruction::Instruction(nv50_ir::Function*, nv50_ir::operation, nv50_ir::DataType) ../src/gallium/drivers/nouveau/codegen/nv50_ir.cpp:605
#9 0x7f1d1224dd44 in nv50_ir::Function::convertToSSA() ../src/gallium/drivers/nouveau/codegen/nv50_ir_ssa.cpp:385
#10 0x7f1d1224d381 in nv50_ir::Program::convertToSSA() ../src/gallium/drivers/nouveau/codegen/nv50_ir_ssa.cpp:310
#11 0x7f1d121407c0 in nv50_ir_generate_code ../src/gallium/drivers/nouveau/codegen/nv50_ir.cpp:1264
#12 0x7f1d1227461b in nvc0_program_translate ../src/gallium/drivers/nouveau/nvc0/nvc0_program.c:634
#13 0x7f1d12294b21 in nvc0_sp_state_create ../src/gallium/drivers/nouveau/nvc0/nvc0_state.c:620
#14 0x7f1d12294d90 in nvc0_fp_state_create ../src/gallium/drivers/nouveau/nvc0/nvc0_state.c:661
#15 0x7f1d12ad4912 in st_create_fp_variant ../src/mesa/state_tracker/st_program.c:1498
#16 0x7f1d12ad4cd5 in st_get_fp_variant ../src/mesa/state_tracker/st_program.c:1525
#17 0x7f1d12ad8252 in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:2053
#18 0x7f1d12c9c851 in st_program_string_notify ../src/mesa/state_tracker/st_cb_program.c:185
#19 0x7f1d12d17731 in st_link_tgsi ../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:7441
#20 0x7f1d12cabaf0 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_ir.cpp:175
#21 0x7f1d127c85ca in _mesa_glsl_link_shader ../src/mesa/program/ir_to_mesa.cpp:3186
#22 0x7f1d1252a9f7 in link_program ../src/mesa/main/shaderapi.c:1285
#23 0x7f1d1252a9f7 in link_program_error ../src/mesa/main/shaderapi.c:1384
#24 0x7f1d1252deb3 in _mesa_LinkProgram ../src/mesa/main/shaderapi.c:1876
#25 0x403e13 in main._omp_fn.0 /home/kherbst/git/shader-db/run.c:926
SUMMARY: AddressSanitizer: heap-use-after-free ../src/gallium/drivers/nouveau/codegen/nv50_ir.h:648 in nv50_ir::ValueDef::get() const
Shadow bytes around the buggy address:
0x0c2a80075420: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c2a80075430: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c2a80075440: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c2a80075450: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
0x0c2a80075460: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c2a80075470:[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c2a80075480: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c2a80075490: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c2a800754a0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
0x0c2a800754b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c2a800754c0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==612087==ABORTING
v2: full rework
v3: manage a full copy instead of recreating new lists on every access
Closes: #3066
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5277>
It is often faster to copy the whole resource and modify that than
to flush and wait for readers of the BO.
Helps anything which updates textures after already using them in a
frame, such as most GLQuake ports.
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5573>
The BO reallocation path in panfrost_transfer_map caused textures and
sampler views to get out of sync.
v2: Use the GPU address of the BO in case two BOs get allocated at the
same address.
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5573>
Remove all convergency handling as it was broken and get the code to be a
bit closer to TGSI. Also removes pointless asserts.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Tested-by: Ben Skeggs <bskeggs@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5512>