New functions for examining instructions, declarations, etc.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
The errors.c file had grown quite large so split off this extension
code into its own file. This involved making a handful of functions
non-static.
Acked-by: Timothy Arceri <timothy.arceri@collabora.com>
This fixes a crash with bin/arb_clear_texture-base-formats and
probably some other tests which use clear_texture().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
There is no need to allocate memory when unwrapping the indirect buf.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
The builtin data can get released with a glReleaseShaderCompiler call.
We're careful everywhere to clone everything that comes out of builtins
except here, where we accidentally return the signature belonging to the
builtin version, rather than the locally-cloned one.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Tested-by: Rob Herring <robh@kernel.org>
Cc: mesa-stable@lists.freedesktop.org
The builtin function shader is part of the builtin state, released
when glReleaseShaderCompiler is called. We must ensure that the
builtins have been (re)initialized before attempting to link with the
builtin shader.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Tested-by: Rob Herring <robh@kernel.org>
Cc: mesa-stable@lists.freedesktop.org
All interface blocks will have been lowered by this point so just
use an assert. Returning false would have caused all sorts of
problems if they were not lowered yet and there is an assert to
catch this later anyway.
We also update the tests to reflect this change.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We use the simple batch helper to submit a batch at driver startup time
which holds all the state that never changes. We don't have a whole lot
and once we enable tesselation there'll be even less. Even so, it's a
simple mechanism and reduces our steady state batch sizes a bit.
The gen7 pipeline has a useful helper function for this, let's use it in
gen8_pipeline.c too. The gen7 function has an off-by-one bug though: we
have to compute log2(size / 1024) - 1, but we divide by 2048 instead so
as to avoid the case where size is less than 1024 and we'd return -1.
Fixes compute since 7dd31b81fe
gallium/radeon: support PIPE_CAP_SURFACE_REINTERPRET_BLOCKS
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
The vec4 backend, at the end, does this:
if (inst->is_3src()) {
for (int i = 0; i < 3; i++) {
if (inst->src[i].vstride == BRW_VERTICAL_STRIDE_0)
assert(brw_is_single_value_swizzle(inst->src[i].swizzle));
So make sure that we use the same conditions when trying to
copy-propagate. UNIFORMs will be converted to vstride 0 in
convert_to_hw_regs, but so will ATTRs when interleaved (as will happen
in a GS with multiple attributes). Since the vstride is not set at
copy-prop time, infer it by inspecting dispatch_mode and reject ATTRs if
they have non-scalar swizzles and are interleaved.
Fixes assertion errors in dolphin-generated geometry shaders (or
misrendering on opt builds) on Sandybridge or on IVB/HSW with
INTEL_DEBUG=nodualobj.
Co-authored-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93418
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
v2: assert and return if query_memory_info is not set
rebase
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
If you're worried about the duplication of some CAPs, we can remove them
later.
v2: add fields for memory eviction stats
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
GLsync objects had a race condition when used from multiple threads
(which is the main point of the extension, really); it could be
validated as a sync object at the beginning of the function, and then
deleted by another thread before use, causing crashes. Fix this by
changing all casts from GLsync to struct gl_sync_object to a new
function _mesa_get_and_ref_sync() that validates and increases
the refcount.
In a similar vein, validation itself uses _mesa_set_search(), which
requires synchronization -- it was called without a mutex held, causing
spurious error returns and other issues. Since _mesa_get_and_ref_sync()
now takes the shared context mutex, this problem is also resolved.
Fixes bug #92757, found while developing Nageru, my live video mixer
(due for release at FOSDEM 2016).
v2: Marek: silence warnings, fix declaration after code
Signed-off-by: Steinar H. Gunderson <sesse@google.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Yet another change motivated by AMD GPUPerfStudio compatibility. These groups
are not directly accessible from userspace, and AMD GPUPerfStudio does not
actually query them - it just requires them to be there. Hence, adding
a placeholder for now.
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
This is yet another change motivated by appeasing AMD GPUPerfStudio's
hardcoding of performance counter group numbers.
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
As documented in the comment, AMD GPUPerfStudio unfortunately hardcodes the
order of performance counter groups. Let's do the pragmatic thing and present
the same order as Catalyst/Crimson.
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
This group was used by older versions of AMD GPUPerfStudio (via
AMD_performance_monitor) to identify the GPU family, and GPUPerfStudio
still complains when it isn't available.
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Set R600_DEBUG=preoptir to dump the LLVM IR before optimization passes,
to allow diagnosing problems caused by optimization passes.
Note that in order to compile the resulting IR with llc, you will first
have to run at least the mem2reg pass, e.g.
opt -mem2reg -S < shader.ll | llc -march=amdgcn -mcpu=bonaire
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> (original patch)
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (w/ debug flag)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>