Commit Graph

101087 Commits

Author SHA1 Message Date
Brian Paul 08d97aadd1 st/mesa: fix texture deletion context mix-up issues (v2)
When we destroy a context, we need to temporarily make that context
the current one for the thread.

That's because during context tear-down we make many calls to
_mesa_reference_texobj(&texObj, NULL).  Note there's no context
parameter.  If the texture's refcount goes to zero and we need to
delete it, we use the thread's current context.  But if that context
isn't the context we're tearing down, we get into trouble when
deallocating sampler views.  See patch 593e36f956 ("st/mesa:
implement "zombie" sampler views (v2)") for background information.

Also, we need to release any sampler views attached to the fallback
textures.

Fixes a crash on exit with a glretrace of the Nobel Clinician
application.

v2: at end of st_destroy_context(), check if save_ctx == ctx and
unbind the context if so.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-03-25 06:57:57 -06:00
Brian Paul d13167cd21 nir: fix a few signed/unsigned comparison warnings
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-25 06:51:31 -06:00
Kishore Kadiyala e1d8057160 android: static link with libexpat with Android O+
In Android O, MESA needs to statically link libexpat so that
it's in same VNDK namespace.

v2: apply change also to anv driver (Tapani)
v3: use += in anv change (Eric Engestrom)

Change-Id: I82b0be5c817c21e734dfdf5bfb6a9aa1d414ab33
Signed-off-by: Kishore Kadiyala <kishore.kadiyala@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-03-25 10:11:57 +02:00
Samuel Iglesias Gonsálvez 01cf390035 radv: write availability status vkGetQueryPoolResults() when the data is not available
If VK_QUERY_RESULT_WITH_AVAILABILY_BIT is set and
VK_QUERY_RESULT_WAIT_BIT and VK_QUERY_RESULT_PARTIAL_BIT are both not
set, we need return to VK_NOT_READY only and set the availability
status field for each query.

From Vulkan spec:

"If VK_QUERY_RESULT_WAIT_BIT and VK_QUERY_RESULT_PARTIAL_BIT are both
not set then no result values are written to pData for queries that are
in the unavailable state at the time of the call, and
vkGetQueryPoolResults returns VK_NOT_READY. However, availability state
is still written to pData for those queries if
VK_QUERY_RESULT_WITH_AVAILABILITY_BIT is set."

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-03-25 08:21:22 +01:00
Samuel Iglesias Gonsálvez cb3ea50ec2 radv: don't overwrite results in VkGetQueryPoolResults() when queries are not available
If the query is not available and VK_QUERY_RESULT_WAIT_BIT and
VK_QUERY_RESULT_PARTIAL_BIT are both not set, the spec doesn't
allow to modify its result.

From Vulkan spec:

"If VK_QUERY_RESULT_WAIT_BIT and VK_QUERY_RESULT_PARTIAL_BIT are
both not set then no result values are written to pData for queries
that are in the unavailable state at the time of the call, and
vkGetQueryPoolResults returns VK_NOT_READY. However, availability state
is still written to pData for those queries
if VK_QUERY_RESULT_WITH_AVAILABILITY_BIT is set."

v2:
- Move VK_NOT_READY change to next patch (Samuel Pitoiset)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-03-25 08:21:22 +01:00
Tapani Pälli 2c240a5216 st/mesa: fix warnings about implicit conversion on enumeration type
These enums match but compiler warns about implicit conversion.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-03-25 07:44:27 +02:00
Tapani Pälli ec12316489 st/mesa: fix compilation warning on storage_flags_to_buffer_flags
(warning: 'const' type qualifier on return type has no effect)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-03-25 07:44:05 +02:00
Dave Airlie 9417793fb1 nir/split_vars: fixup some more explicit_stride related issues.
With vkpipelinedb Samuel discovered a regression since we stopped
stripping types at the spir-v level.

This adds a check to the var splitting for the case where it
asserts the type hasn't changed, when it has just created a bare
type, and it's different than the original type which has an explicit
stride.

This also removes a pointless assert that also triggers.

Fixes: 3b3653c4cf (nir/spirv: don't use bare types, remove assert in split vars for testing)

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-25 13:57:16 +10:00
Caio Marcelo de Oliveira Filho 9d0ae777dd spirv: Use interface type for block and buffer block
Also handle GLSL_TYPE_INTERFACE the same way we do GLSL_TYPE_STRUCT in
various places.  Motivated by ARB_gl_spirv work, that will take
advantage of the interface types when handling NIR coming from SPIR-V.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-23 10:22:39 -07:00
Caio Marcelo de Oliveira Filho fb024f5e72 intel/compiler: handle GLSL_TYPE_INTERFACE as GLSL_TYPE_STRUCT
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-23 10:22:39 -07:00
Caio Marcelo de Oliveira Filho 15012077bc spirv: Add an execution environment to the options
Also updates gl_spirv to pick the right one.  At the moment nothing
uses it, but upcoming functionality part of ARB_gl_spirv will use it,
and we also later can be more assertful when handling certain features
for each of the execution environments.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
2019-03-23 09:29:21 -07:00
Eric Anholt dacb11a585 egl: Add a 565 pbuffer-only EGL config under X11.
The CTS requires a 565-no-depth-no-stencil (meaning d/s not-required, not
not-present) config for ES 3.0, but at depth 24 of X11 we wouldn't do so.
We can satisfy that bad requirement using a pbuffer-only visual with
whatever other buffers the driver happens to have given us.

I've tried to raise this as an absurd requirement with Khronos and made no
progress.

v2: Make sure it's single sample, no depth, no stencil.  Comment typo fix

Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-03-22 15:22:40 -07:00
Caio Marcelo de Oliveira Filho e5830e1132 nir: Handle array-deref-of-vector case in loop analysis
SPIR-V can produce those for SSBO and UBO access.  Found when testing
the ARB_gl_spirv series.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-03-22 13:50:39 -07:00
Rob Clark 6fd5a7ff8c freedreno: add ESSL cap
Report 320 for a6xx, which isn't *quite* true (no geom/tess, in
particular), but other caps keep the reported GL and GLSL versions
correct (3.1 / 3.10 es).  But reporting 320 will switch on
EXT_gpu_shader5, which is the goal.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-22 16:39:14 -04:00
Rob Clark 6cd9876047 mesa/st: use ESSL cap top enable gpu_shader5
For GLES2+ contexts, enable EXT_gpu_shader5 if the driver exposes a
sufficiently high ESSL feature level, even if the GLSL feature level
isn't high enough.

This allows drivers to support EXT_gpu_shader5 in GLES contexts before
they support all the additional features of ARB_gpu_shader5 in GL
contexts.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-03-22 16:39:13 -04:00
Rob Clark de481947d9 gallium: add PIPE_CAP_ESSL_FEATURE_LEVEL
Adds a new cap to allow drivers to expose higher shading language
versions in GLES contexts, to avoid having to report an artificially
low version for the benefit of GL contexts.

The motivation is to expose EXT_gpu_shader5 even though a driver may
not support all the features needed for the corresponding GL extension
(ARB_gpu_shader5).

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-03-22 16:39:13 -04:00
Vinson Lee 93c81ca336 swr: Fix build with llvm-9.0.
Fix build error after llvm-9.0svn r352827 ("[opaque pointer types] Add a
FunctionCallee wrapper type, and use it.").

In file included from ./rasterizer/jitter/builder.h:158:0,
                 from swr_shader.cpp:35:
./rasterizer/jitter/gen_builder_meta.hpp: In member function ‘llvm::Value* SwrJit::Builder::VGATHERPD(llvm::Value*, llvm::Value*, llvm::Value*, llvm::Value*, llvm::Value*, const llvm:
:Twine&)’:
./rasterizer/jitter/gen_builder_meta.hpp:51:117: error: no matching function for call to ‘cast(llvm::FunctionCallee)’
     Function* pFunc = cast<Function>(JM()->mpCurrentModule->getOrInsertFunction("meta.intrinsic.VGATHERPD", pFuncTy));
                                                                                                                     ^

Suggested-by: Philip Meulengracht <the_meulengracht@hotmail.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-03-22 13:13:51 -07:00
Samuel Pitoiset 23d30f4099 spirv,nir: lower frexp_exp/frexp_sig inside a new NIR pass
This lowering isn't needed for RADV because AMDGCN has two
instructions. It will be disabled for RADV in an upcoming series.

While we are at it, factorize a little bit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-22 19:41:46 +01:00
Samuel Pitoiset 6ae5797243 nir: use generic float types for frexp_exp and frexp_sig
Only the exponent needs to be 32-bit signed integer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-22 19:41:44 +01:00
Vinson Lee 77aa11ca32 nir: Fix anonymous union initialization with older GCC.
Fix this build error with GCC 4.4.7.

  CC     nir/nir_opt_copy_prop_vars.lo
nir/nir_opt_copy_prop_vars.c: In function ‘load_element_from_ssa_entry_value’:
nir/nir_opt_copy_prop_vars.c:454: error: unknown field ‘ssa’ specified in initializer
nir/nir_opt_copy_prop_vars.c:455: error: unknown field ‘def’ specified in initializer
nir/nir_opt_copy_prop_vars.c:456: error: unknown field ‘component’ specified in initializer
nir/nir_opt_copy_prop_vars.c:456: error: extra brace group at end of initializer
nir/nir_opt_copy_prop_vars.c:456: error: (near initialization for ‘(anonymous).<anonymous>’)
nir/nir_opt_copy_prop_vars.c:456: warning: excess elements in union initializer
nir/nir_opt_copy_prop_vars.c:456: warning: (near initialization for ‘(anonymous).<anonymous>’)

Fixes: 96c32d7776 ("nir/copy_prop_vars: handle load/store of vector elements")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109810
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-03-22 10:43:41 -07:00
Chris Wilson db99d02fce iris: Push heavy memchecker code to DEBUG
Invoking VALGRIND_CHECK_MEM_IS_DEFINED pulls in enough code to convince
gcc to not inline __gen_uint and results in a lot of packing code ending
up out-of-line with lots of stack copying. To ameliorate this, only
insert the check inside the packer if DEBUG is defined and instead
perform the validation checking before submitting the batch to the
kernel. This should give accurate results if --trace-origins=yes is
used, and failing that we can recompile in full debug mode to check on
insertion.

Improve drawoverhead baseline by 25% with a default build with
valgrind-dev installed (with effectively no loss of vg coverage).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-22 10:38:03 -07:00
Kenneth Graunke 87f865aab3 iris: Fix batch chaining map_next increment.
Caught by Chris Wilson; split out from his valgrind patch.
2019-03-22 09:31:15 -07:00
Rob Clark bf5a92811d freedreno/ir3: disable early-z for SSBO/image writes
Fixes:

dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth
dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_stencil
dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo
dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_stencil_fbo

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-22 08:53:28 -04:00
Rob Clark dbac1a80d1 freedreno/ir3: rename has_kill to no_earlyz
There are other cases where we need to disable early-z, like image
writes.  So rename to something more generic.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-22 08:53:28 -04:00
Rhys Perry f736250ab4 ac/nir: implement 16-bit pack/unpack opcodes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-03-22 12:50:16 +01:00
Lionel Landwerlin 87dadbce5b vulkan/overlay: improve error reporting
We can show the actual command & line where the failure happened

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-03-22 11:26:04 +00:00
Lionel Landwerlin 9f3727351d vulkan/overlay: check return value of swapchain get images
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-03-22 11:26:01 +00:00
Lionel Landwerlin 1fbf355597 vulkan/overlay: silence validation layer warnings
v2: Drop call to FreeDescriptorSet

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-03-22 11:25:58 +00:00
Lionel Landwerlin de14107741 vulkan/overlay: properly register layer object with loader
This is required by the validation layers if we want to validate the
commands inserted by the overlay layer.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-03-22 11:25:55 +00:00
Józef Kucia c077d5d7de radv: Fix driverUUID
Fixes: 14cad8786a ("radv: generate the same driver UUID as radeonsi")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-03-22 08:57:16 +01:00
Danylo Piliaiev ea9bde151f
glsl: Cross validate variable's invariance by explicit invariance only
'invariant' qualifier is propagated on variables which are used
to calculate other invariant variables, however when we are matching
variable's declarations we should take into account only explicitly
declared invariance because invariance propagation is an implementation
specific detail.

Thus new flag is added to ir_variable_data which indicates 'invariant'
qualifier being explicitly set in the shader.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100316
Fixes: 89b60492 ('glsl: Add a pass to propagate the "invariant" and
  "precise" qualifiers')

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-03-21 23:28:08 -07:00
Józef Kucia 1d996ef714 mesa: Fix GL_NUM_DEVICE_UUIDS_EXT
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-03-22 07:37:14 +02:00
Kenneth Graunke 66c100a8d6 iris: Skip resolves and flushes altogether if unnecessary
Improves drawoverhead baseline scores by 1.17x.
2019-03-21 20:28:17 -07:00
Kenneth Graunke 365886ebe1 iris: Skip framebuffer resolve tracking if framebuffer isn't dirty
Improves drawoverhead baseline score by 1.86x.
2019-03-21 20:28:17 -07:00
Kenneth Graunke 1d05d24b1d iris: Skip input resolve handling if bindings haven't changed
This brings the drawoverhead 16 Tex w/ no state change score from
22% of baseline to 97% of baseline.
2019-03-21 20:28:17 -07:00
Kenneth Graunke a342f2deb1 iris: Fix util_vma_heap_init size for IRIS_MEMZONE_SHADER
Fixes assertions when disabling bucket allocators.
2019-03-21 19:07:17 -07:00
Dave Airlie 9dd92d08a5 softpipe: fix integer texture swizzling for 1 vs 1.0f
The swizzling was putting float one in not integer 1.

This fixes a lot of arb_texture_view-rendering-formats cases.

Reviewed-by: Brian Paul <brianp@vmware.com>
2019-03-22 09:30:35 +10:00
Dave Airlie aae5ba72ab softpipe: remove shadow_ref assert.
I don't think this really buys us anything and TG4 with cubemap arrays
falls over because sampler == 2, but otherwise works fine.

Fixes:
./bin/textureGather fs shadow r  CubeArray repeat

on softpipe with ARB_gpu_shader5 enabled.

Reviewed-by: Brian Paul <brianp@vmware.com>
2019-03-22 09:30:29 +10:00
Dave Airlie 8dc8b1361a softpipe: handle 32-bit bitfield inserts
Fixes piglits if ARB_gpu_shader5 is enabled

Reviewed-by: Brian Paul <brianp@vmware.com>
2019-03-22 09:30:26 +10:00
Dave Airlie 7b7cb1bc35 softpipe: fix 32-bit bitfield extract
These didn't deal with the width == 32 case that TGSI is defined with.

Fixes piglit tests if ARB_gpu_shader5 is enabled.

Reviewed-by: Brian Paul <brianp@vmware.com>
2019-03-22 09:30:21 +10:00
Timothy Arceri a1bd9dd5bc nir: fix opt_if_loop_last_continue()
Rather than skipping code that looked like this:

     loop {
        ...
        if (cond) {
           do_work_1();
           continue;
        } else {
           break;
        }
        do_work_2();
     }

Previously we would turn this into:

     loop {
        ...
        if (cond) {
           do_work_1();
           continue;
        } else {
           do_work_2();
           break;
        }
     }

This was clearly wrong. This change checks for this case and makes
sure we now leave it for nir_opt_dead_cf() to clean up.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-03-22 09:58:18 +11:00
Gurchetan Singh 620df57dbb anv: fix build on Nougat
AHardwareBuffer is only available on O and above.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-03-21 15:36:39 -07:00
Gurchetan Singh 139f908d8f anv: move anv_GetMemoryAndroidHardwareBufferANDROID up a bit
No functional change, just makes the next patch a little easier.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-03-21 15:36:39 -07:00
Eric Anholt bfed0a7099 v3d: Remove some dead members of struct v3d_compile.
These are more vc4 leftovers.
2019-03-21 14:20:50 -07:00
Eric Anholt 16f2770eb4 v3d: Upload all of UBO[0] if any indirect load occurs.
The idea was that we could skip uploading the constant-indexed uniform
data and just upload the uniforms that are variably-indexed.  However,
since the VS bin and render shaders may have a different set of uniforms
used, this meant that we had to upload the UBO for each of them.  The
first case is generally a fairly small impact (usually the uniform array
is the most space, other than a couple of FSes in shader-db), while the
second is a larger impact: 3DMMES2 was uploading 38k/frame of uniforms
instead of 18k.

Given that the optimization is of dubious value, has a big downside, and
is quite a bit of code, just drop it.  No change in shader-db.  No change
on 3DMMES2 (n=15).
2019-03-21 14:20:50 -07:00
Eric Anholt 320e96bace v3d: Move constant offsets to UBO addresses into the main uniform stream.
We'd end up with the constant offset in the uniform stream anyway, since
they're bigger than small immediates.  Avoids the extra uniforms and adds
in the shader in favor of just adding once on the CPU.

shader-db:
total instructions in shared programs: 6496865 -> 6494851 (-0.03%)
total uniforms in shared programs: 2119511 -> 2117243 (-0.11%)
2019-03-21 14:20:50 -07:00
Eric Anholt c36d2793ec v3d: Rename v3d_tmu_config_data to v3d_unit_data.
I want to reuse this for encoding small constant UBO/SSBO offsets into the
uniform stream to reduce the extra uniform loads and adds for the small
constant offsets.
2019-03-21 14:20:50 -07:00
Benjamin Gordon b30aad552c configure.ac/meson.build: Add options for library suffixes
When building the Chrome OS Android container, we need to build copies
of mesa that don't conflict with the Android system-supplied libraries.
This adds options to create suffixed versions of EGL and GLES libraries:

libEGL.so -> libEGL${egl-lib-suffix}.so
libGLESv1_CM.so -> libGLESv1_CM${gles-lib-suffix}.so
libGLESv2.so -> libGLES${gles-lib-suffix}.so

This is similar to what happens when --enable-libglvnd is specified, but
without the side effects of linking against libglvnd.  To avoid
unexpected clashes with the suffixed appended by libglvnd, make it an
error to specify both --enable-libglvnd and --with-egl-lib-suffix.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-03-21 10:18:31 -07:00
Kenneth Graunke e426c3a6cb nir: Record non-vector/scalar varyings as unmovable when compacting
In some cases, we can end up with varying structs that aren't split to
their member variables.  nir_compact_varyings attempted to record these
as unmovable, so it would leave them be.  Unfortunately, it didn't do
it right for non-vector/scalar types.  It set the mask to:

   ((1 << (elements * dmul)) - 1) << var->data.location_frac

where elements is the number of vector elements.  For structures and
other non-vector/scalars, elements is 0...so the whole mask became 0.

This caused nir_compact_varyings to assign other varyings on top of
the structure varying's location (as it appeared to take up no space).

To combat this, we just set elements to 4 for non-vector/scalar types,
so that the entire slot gets marked as unmovable.

Fixes KHR-GL45.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_in on iris.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-03-21 16:03:58 +00:00
Rob Clark 6e781a01b9 freedreno/ir3: dynamic UBO indexing vs 64b pointers
Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.ubo.uniform_fragment
and similar things with multiple UBOs

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark 2e01c534f4 freedreno/ir3: fix bit_count
Seems like it can only work 16b at a time.  Fixes
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitcount.*

TODO need to check if this limitation applies to a3xx as well.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark 3d8349048b freedreno/ir3: additional lowering
For some things that show up when we expose higher glsl

TODO check blob traces to see if we have instructions for some of this?
I guess we don't but worth a check..

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark bcd81d2387 freedreno/ir3: optimize sam.s2en to sam
Detect when sampler/texture idx are immediate and switch to non s2en
encoding.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark 1443694ee5 freedreno/ir3: enable indirect tex/samp (sam.s2en)
For now it uses indirect for everything.  The next step is for the
ir3_cp pass to detect the case that tex and samp idx are immediate
and convert the sam instruction back to the non .s2en variant.  But
doing that in a following patch so we can shake out the bugs with
.s2en more easily.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark 1088b788d8 freedreno/ir3: find # of samplers from uniform vars
When we have indirect samplers, we cannot tell the max sampler
referenced.  Instead just refer to the number of sampler uniforms.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark d4cbc94685 nir: move gls_type_get_{sampler,image}_count()
I need at least the sampler variant in ir3..

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-03-21 09:13:05 -04:00
Rob Clark 8eb16ae8bf freedreno/ir3: fix regmask for merged regs
On a6xx+ with half-regs conflicting with full-regs, the legalize pass
needs to set appropriate sync bits, such as (sy), on writes to full regs
that conflict with half regs, and visa-versa.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark 1dffb089f9 freedreno/ir3: fix sam.s2en encoding
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark 45b7a581b4 freedreno/ir3: fix sam.s2en decoding
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark 2d31cf9d3b freedreno/ir3/ra: fix half-class conflicts
On a6xx, half-regs conflict with full-regs.  But we were only setting up
conflicts for the first class (ie. scalar, but not hvec2/hvec3/hvec4),
resulting in higher half-reg classes getting assigned to regs that
overwrite full-regs.

Noticed while trying to enable indirect-sampler (sam.s2en) which uses an
hvec2 argument to pass the sampler/tex index.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark cc5ca9391c freedreno/ir3 better cat6 encoding detection
These two bits seem to be a better way to detect which encoding we are
looking at.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Samuel Pitoiset 00327f827f ac: fix incorrect argument type for tbuffer.{load,store} with LLVM 7
GLC/SLC are boolean.

This fixes the following LLVM error when checkir is set:
Intrinsic has incorrect argument type!
void (i32, <4 x i32>, i32, i32, i32, i32, i32, i32, i32, i32)* @llvm.amdgcn.tbuffer.store.i32

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl
2019-03-21 14:02:00 +01:00
Samuel Pitoiset 20cac1f498 ac: fix 16-bit shifts
This fixes the following LLVM error when ckeckir is set:
Type too small for ZExt

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl
2019-03-21 14:01:58 +01:00
Samuel Pitoiset 2ac5c5c1b5 ac: add 16-bit support to fract
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 12:13:09 +01:00
Samuel Pitoiset 0eb1478ac2 ac: add 16-bit support fo fsign
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 12:13:07 +01:00
Samuel Pitoiset ff11c9dcc7 ac: add f16_0 and f16_1 constants
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 12:13:05 +01:00
Timothy Arceri 427a6fee43 nir: only override previous alu during loop analysis if supported
Users of this function expect alu to be a supported comparision
if the induction variable is not NULL. Since we attempt to
override the return values if the first limit is not a const, we
must make sure we are dealing with a valid comparision before
overriding the alu instruction.

Fixes an unreachable in inverse_comparison() with the game
Assasins Creed Odyssey.

Fixes: 3235a942c1 ("nir: find induction/limit vars in iand instructions")

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110216
2019-03-21 21:51:21 +11:00
Samuel Pitoiset db07f0554a radv: add missing initializations since VK_EXT_pipeline_creation_feedback
This fixes the world.

Fixes: 5f5ac19f13 ("radv: Implement VK_EXT_pipeline_creation_feedback.")"
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:42:31 +01:00
Rhys Perry 037f11d42e radv: enable VK_KHR_8bit_storage
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:27 +01:00
Rhys Perry 3cc72a88d8 ac/nir: implement 8-bit conversions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:25 +01:00
Rhys Perry c73f8b6576 ac/nir: add 8-bit types to glsl_base_to_llvm_type
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:22 +01:00
Rhys Perry 9c5067acf1 ac/nir: implement 8-bit ssbo stores
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:20 +01:00
Samuel Pitoiset b235d77e18 ac: add ac_build_tbuffer_store_byte() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:18 +01:00
Rhys Perry b12e074b89 ac/nir: implement 8-bit push constant, ssbo and ubo loads
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:16 +01:00
Samuel Pitoiset 104dbc64a5 ac: add ac_build_tbuffer_load_byte() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:14 +01:00
Samuel Pitoiset 6e632eb24b ac: add various int8 definitions
Original patch by Rhys Perry.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:10 +01:00
Tapani Pälli 4e1bbb000c anv/radv: release memory allocated by glsl types during spirv_to_nir
Fixes leaks for each glsl_type generated:

   ==32470== 384 bytes in 3 blocks are possibly lost in loss record 18 of 18
   ==32470==    at 0x483880B: malloc (vg_replace_malloc.c:309)
   ==32470==    by 0x4C43F4A: ralloc_size (ralloc.c:119)
   ==32470==    by 0x4C44014: rzalloc_size (ralloc.c:151)
   ==32470==    by 0x4C44258: rzalloc_array_size (ralloc.c:215)
   ==32470==    by 0x4D38957: glsl_type::glsl_type(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:114)
   ==32470==    by 0x4D3BEED: glsl_type::get_struct_instance(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:1146)
   ==32470==    by 0x4D42ECC: glsl_struct_type (nir_types.cpp:501)
   ==32470==    by 0x4CDB5A1: vtn_handle_type (spirv_to_nir.c:1269)
   ==32470==    by 0x4CE53DD: vtn_handle_variable_or_type_instruction (spirv_to_nir.c:4018)
   ==32470==    by 0x4CD8CFF: vtn_foreach_instruction (spirv_to_nir.c:365)
   ==32470==    by 0x4CE5E6B: spirv_to_nir (spirv_to_nir.c:4490)
   ==32470==    by 0x497AF10: anv_shader_compile_to_nir (anv_pipeline.c:173)

v2: move release call to vkDestroyInstance
v3: apply fix also to radv driver

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-21 08:30:22 +02:00
Jason Ekstrand 6e19348ad1 spirv: Drop inline tg4 lowering
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-03-21 02:58:41 +00:00
Jason Ekstrand 08f804ec0c anv,radv,turnip: Lower TG4 offsets with nir_lower_tex
v2: turn on for turnip as well (Karol Herbst)

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-03-21 02:58:41 +00:00
Karol Herbst d8a0658d8b nir/lower_tex: Add support for tg4 offsets lowering
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2019-03-21 02:58:41 +00:00
Karol Herbst 99f202432b nv50/ir/nir: support gather offsets
v2: only emit offsets if those are !0

Signed-off-by: Karol Herbst <kherbst@redhat.com>
2019-03-21 02:58:41 +00:00
Karol Herbst 71c66c254b nir: add support for gather offsets
Values inside the offsets parameter of textureGatherOffsets are required to be
constants in the range of [GL_MIN_PROGRAM_TEXTURE_GATHER_OFFSET,
GL_MAX_PROGRAM_TEXTURE_GATHER_OFFSET].

As this range is never outside [-32, 31] for all existing drivers inside mesa,
we can simply store the offsets as a int8_t[4][2] array inside nir_tex_instr.

Right now only Nvidia hardware supports this in hardware, so we can turn this
on inside Nouveau for the NIR path as it is already enabled with the TGSI one.

v2: use memcpy instead of for loops
    add missing bits to nir_instr_set
    don't show offsets if they are all 0
v3: default offsets aren't all 0
v4: rename offsets -> tg4_offsets
    rename nir_tex_instr_has_explicit_offsets -> nir_tex_instr_has_explicit_tg4_offsets

Signed-off-by: Karol Herbst <kherbst@redhat.com>
2019-03-21 02:58:41 +00:00
Dave Airlie b95b33a5c7 nir/deref: remove casts of casts which are likely redundant (v3)
Not sure how ptr_stride should be taken into account if at all here

v2: reorder check to avoid src walking (Jason)
v3: remove is_cast_cast checks, keep going afterwards (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-21 10:58:06 +10:00
Dave Airlie 3b3653c4cf nir/spirv: don't use bare types, remove assert in split vars for testing
For OpenCL we never want to strip the info from the types, and it makes
type comparisons easier in later stages. We might later need a nir pass to
strip this for GLSL, but so far the only regression is the assert and Jason
said removing that is fine.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-03-21 10:25:40 +10:00
Rafael Antognolli e7c8402163 iris: Let blorp update the clear color for us.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:26 -07:00
Rafael Antognolli 93123417dd iris: Track fast clear color.
v2: Update tracked clear color when we update the surface state.
v3: Update all aux surface states when updating the clear color.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:26 -07:00
Rafael Antognolli 5658c661de iris: Stall on the CPU and resolve predication during fast clears.
Only if the clear color/depth is changing. In those cases, it's hard to
keep track of the current clear color, and aux state of some layers,
when predication is enabled. So simplify everything by stalling on the
few cases where we would have a fast clear color change with
predication.

v2:
 - fix comment (Ken)
 - explicitly check for predicate state after resolving it (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:26 -07:00
Rafael Antognolli ce830a364e iris: Add iris_resolve_conditional_render().
This function can be used to stall on the CPU and resolve the predicate
for the conditional render. It will convert ice->state.predicate from
IRIS_PREDICATE_STATE_USE_BIT to either IRIS_PREDICATE_STATE_RENDER or
IRIS_PREDICATE_STATE_DONT_RENDER, depending on the result of the query.

v2:
 - return void (Ken)
 - update the stored condition (Ken)
 - simplify the code leading to resolve the predicate (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli 131b42f0aa iris: Implement fast clear color.
If all the restrictions are satisfied, do a fast clear instead of
regular clear.

v2:
 - add perf_debug() when we can't fast clear (Ken)
 - improve comment: s/miptree/resource/ (Ken)
 - use swizzle_color_value from blorp (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli bd6f51ec21 intel/blorp: Make swizzle_color_value public.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli d97eddff25 intel/isl: Add isl_format_has_color_component() function.
v2: Get luminance bits from luminance component (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli 7f6344a726 iris: Bring back check for srgb and fast clear color.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli a8b5ea8ef0 iris: Add function to update clear color in surface state.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli 32c8fa6411 iris: Add helper to convert fast clear color.
It needs to be converted to a value that can be used by ISL (and our
hardware SURFACE_STATE structure).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli 51638cf18a iris: Fast clear depth buffers.
Check and do a fast clear instead of a regular clear on depth buffers.

v3:
 - remove swith with some cases that we shouldn't wory about (Ken)
 - more parens into the has_hiz check (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli 34d00b4410 iris: Use the clear depth when emitting 3DSTATE_CLEAR_PARAMS.
Take the clear depth into account when IRIS_DIRTY_DEPTH_BUFFER is marked
as dirty.

Also update the blorp surface clear color.

v2: Use a single if (zres && zres->aux.bo) (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli 37f2692591 iris: Allocate buffer space for the fast clear color.
Also store clear color in the iris_resource.

Always allocate clear color state buffer.

v2:
 - Make clear_color_offset be 64 bits (Ken).
 - Simplify the logic to decide when to memset the aux buffer (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Bas Nieuwenhuizen 5f5ac19f13 radv: Implement VK_EXT_pipeline_creation_feedback.
Does what it says on the tin.

The per stage time is only an approximation due to linking and
the Vega merged stages.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-03-20 21:19:46 +00:00
Samuel Pitoiset 72e366b4c2 ac: use new LLVM 8 intrinsics in ac_build_buffer_store_dword()
New buffer intrinsics have a separate soffset parameter.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:19 +01:00
Samuel Pitoiset 9d960c17a8 ac: use new LLVM 8 intrinsic when storing 16-bit values
vindex is always 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:14 +01:00
Samuel Pitoiset 2a9d331898 ac: add ac_build_{struct,raw}_tbuffer_store() helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:12 +01:00
Samuel Pitoiset 30c2aca67f ac: use new LLVM 8 intrinsics in ac_build_buffer_load()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:08 +01:00
Samuel Pitoiset da46dbb1be ac/nir: use ac_build_buffer_store_dword() for SSBO store operations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:06 +01:00
Samuel Pitoiset 6b573c00c9 ac/nir: use ac_build_buffer_load() for SSBO load operations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:02 +01:00
Samuel Pitoiset 29132af234 ac/nir: use new LLVM 8 intrinsics for SSBO atomic operations
Use the raw version (ie. IDXEN=0) because vindex is unused.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:18:56 +01:00
Samuel Pitoiset b39844457f ac/nir: remove one useless check in visit_store_ssbo()
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:18:54 +01:00
Samuel Pitoiset a2073f49f1 ac: add ac_build_buffer_store_format() helper
Similar to ac_build_buffer_load_format().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:18:50 +01:00
Samuel Pitoiset 4debe49d44 ac/nir: set attrib flags for SSBO and image store operations
For consistency regarding other store operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:18:37 +01:00
Samuel Pitoiset 1b553dd47f ac: make use of ac_get_store_intr_attribs() where possible
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:18:35 +01:00
Juan A. Suarez Romero efcf9c9f9f nir: deref only for OpTypePointer
Fixes dEQP-VK.binding_model.buffer_device_address.* and
dEQP-VK.ssbo.phys.layout* Vulkan CTS tests.

v2: set val->type->stride in the section below (Jason)

v3: restore val->type->type to original place (Jason)

Fixes: d0ba326f23 ("nir/spirv: support physical pointers")
CC: Karol Herbst <kherbst@redhat.com>
CC: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-20 19:26:32 +00:00
Dave Airlie 04189565a0 softpipe: fix texture view crashes
I noticed we crashed piglit arb_texture_view-rendering-formats
when run on softpipe.

This fixes the clear tiles to use the surface format not the
underlying storage format.

This fixes a bunch of srgb piglits as well.

Fixes: 396ac41fc2 (softpipe: add integer support)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-03-21 05:06:07 +10:00
Kenneth Graunke 3c3f250456 nvc0: Skip new update barrier bits
I added new barrier bits in 220c1dce1e
and made most drivers skip them.  I thought nvc0 was already skipping
those but missed the else case here, which does something.  So make it
explicitly skip like I did everywhere else.

Thanks to Ilia for catching this.

Fixes: 220c1dce1e gallium: Add PIPE_BARRIER_UPDATE_BUFFER and UPDATE_TEXTURE bits.
2019-03-20 10:30:32 -07:00
Lionel Landwerlin 6601e5d6fc anv: implement VK_EXT_pipeline_creation_feedback
An extension reporting cache hit in the user supplied pipeline cache
as well as timing information for creating the pipelines & stages.

v2: Don't consider no cache for cache hits (Jason)
    Rework duration accumulation (Jason)

v3: Fold feedback creation writing into pipeline compile functions (Jason/Lionel)

v4: Get cache hit information from anv_device_search_for_kernel() (Jason)
    Only set cache hit from the whole pipeline if all stages also have that bit (Lionel)

v5: Always user_cache_hit in anv_device_search_for_kernel() (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-20 16:18:35 +00:00
Rob Clark 70904eb99a freedreno/ir3/a6xx: fix ssbo comp_swap
One line left out of the conversion to ir3 ssbo intrinsics on a6xx.

Fixes: 2e4525883f ir3/compiler: Enable lower_io_offsets pass and handle new SSBO intrinsics
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-20 11:48:13 -04:00
Jason Ekstrand 0b7e5bdbd4 nir: Constant values are per-column not per-component
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-03-20 09:26:56 -05:00
Jason Ekstrand 9a129510f5 anv: Bump maxComputeWorkgroupInvocations
We initially set this lower because we didn't have SIMD32 support yet
but we've supported SIMD32 for quite some time now.  We should bump it
up to the real limit.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-03-20 09:26:56 -05:00
Samuel Pitoiset 4fa61273a8 radv: fix binding transform feedback buffers
The mask should be accumulated if two calls are used for
binding two buffers at different indexes. Otherwise, the
driver only accounts for the last one.

Noticed while glancing at this code.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 09:06:40 +01:00
Samuel Pitoiset f4f0e3a395 ac: use llvm.amdgcn.fract intrinsic for nir_op_ffract
Noticed with a Doom shader.

29077 shaders in 15096 tests
Totals:
SGPRS: 1282125 -> 1282133 (0.00 %)
VGPRS: 908716 -> 908616 (-0.01 %)
Spilled SGPRs: 24811 -> 24779 (-0.13 %)
Code Size: 49048176 -> 48936488 (-0.23 %) bytes
Max Waves: 244232 -> 244226 (-0.00 %)

Totals from affected shaders:
SGPRS: 229584 -> 229592 (0.00 %)
VGPRS: 163268 -> 163168 (-0.06 %)
Spilled SGPRs: 8682 -> 8650 (-0.37 %)
Code Size: 12819572 -> 12707884 (-0.87 %) bytes
Max Waves: 24398 -> 24392 (-0.02 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 09:06:35 +01:00
Kenneth Graunke 220c1dce1e gallium: Add PIPE_BARRIER_UPDATE_BUFFER and UPDATE_TEXTURE bits.
The glMemoryBarrier() function makes shader memory stores ordered with
respect to things specified by the given bits.  Until now, st/mesa has
ignored GL_TEXTURE_UPDATE_BARRIER_BIT and GL_BUFFER_UPDATE_BARRIER_BIT,
saying that drivers should implicitly perform the needed flushing.

This seems like a pretty big assumption to make.  Instead, this commit
opts to translate them to new PIPE_BARRIER bits, and adjusts existing
drivers to continue ignoring them (preserving the current behavior).

The i965 driver performs actions on these memory barriers.  Shader
memory stores go through a "data cache" which is separate from the
render cache and other read caches (like the texture cache).  All
memory barriers need to flush the data cache (to ensure shader memory
stores are visible), and possibly invalidate read caches (to ensure
stale data is no longer visible).  The driver implicitly flushes for
most caches, but not for data cache, since ARB_shader_image_load_store
introduced MemoryBarrier() precisely to order these explicitly.

I would like to follow i965's approach in iris, flushing the data cache
on any MemoryBarrier() call, so I need st/mesa to actually call the
pipe->memory_barrier() callback.

Fixes KHR-GL45.shader_image_load_store.advanced-sync-textureUpdate
and Piglit's spec/arb_shader_image_load_store/host-mem-barrier on
the iris driver.

Roland said this looks reasonable to him.
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-03-19 23:43:33 -07:00
Tapani Pälli 3e534489ec iris: mark switch case fallthrough
CID: 1444103
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 08:21:50 +02:00
Tapani Pälli 03cbfbd913 iris: initialize num_cbufs
Currently initialized only if 'ish' is non-NULL.

CID: 1444106
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 08:20:09 +02:00
Daniel Stone d258b787fa panfrost: Properly align stride
Handle buffers whose width is not aligned to 16px by padding the stride
and storing it accordingly.

This does not reject imports for images whose stride is not sufficiently
aligned.

v2: make sure bo->stride is set on imported buffers, and add missing
variable definition. (Tomeu)

Tested-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-20 04:20:42 +00:00
Anuj Phogat 2be60e0c73 anv/icl: Add WA_2204188704 to disable pixel shader panic dispatch
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-03-19 14:42:19 -07:00
Anuj Phogat 85ecd14ef6 i965/icl: Add WA_2204188704 to disable pixel shader panic dispatch
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-03-19 14:42:02 -07:00
Eric Anholt 17115da6ad v3d: Expose the dma-buf modifiers query.
This allows DRI3 to pick between UIF and raster according to whether we're
pageflipping or not and whether the pageflipping display can do UIF,
avoiding copies for the windowed/composited case that previously was
forced to linear.

Improves windowed glmark2 -b build:use-vbo=false performance by 30.7783%
+/- 13.1719% (n=3)
2019-03-19 08:59:01 -07:00
Eric Anholt bf6973199d v3d: Allow the UIF modifier with renderonly.
We ask the other side to make a buffer with the right number of pages, and
then just store the UIF in it.  This avoids an extra silent copy of the
buffer from linear to UIF if it gets used for texturing (X11 copy-based
swapbuffers, GL compositors).
2019-03-19 08:54:46 -07:00
Eric Anholt eb5903a908 v3d: Always lay out shared tiled buffers with UIF_TOP set.
The samplers are already ready for this, we just needed to make sure that
layout chose UIF for level 0.
2019-03-19 08:54:46 -07:00
Andres Gomez ab28dca033 Revert "glsl: relax input->output validation for SSO programs"
This reverts commit 1aa5738e66.

This patch incorrectly asumed that for SSOs no inner interface
matching check was needed.

From the ARB_separate_shader_objects spec v.25:

  " With separable program objects, interfaces between shader stages
    may involve the outputs from one program object and the inputs
    from a second program object.  For such interfaces, it is not
    possible to detect mismatches at link time, because the programs
    are linked separately.  When each such program is linked, all
    inputs or outputs interfacing with another program stage are
    treated as active.  The linker will generate an executable that
    assumes the presence of a compatible program on the other side of
    the interface.  If a mismatch between programs occurs, no GL error
    will be generated, but some or all of the inputs on the interface
    will be undefined."

This completes the fix from commit:
3be05dd267 ("glsl/linker: don't fail non static used inputs without matching outputs")

Fixes: 1aa5738e66 ("glsl: relax input->output validation for SSO programs")
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-03-19 17:36:20 +02:00
Andres Gomez 422882e78f glsl/linker: simplify xfb_offset vs xfb_stride overflow check
Current implementation uses a complicated calculation which relies in
an implicit conversion to check the integral part of 2 division
results.

However, the calculation actually checks that the xfb_offset is
smaller or a multiplier of the xfb_stride. For example, while this is
expected to fail, it actually succeeds:

  "

    ...

    layout(xfb_buffer = 2, xfb_stride = 12) out block3 {
      layout(xfb_offset = 0) vec3 c;
      layout(xfb_offset = 12) vec3 d; // ERROR, requires stride of 24
    };

    ...

  "

Fixes: 2fab85aaea ("glsl: add xfb_stride link time validation")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-03-19 17:23:27 +02:00
Andres Gomez 3be05dd267 glsl/linker: don't fail non static used inputs without matching outputs
If there is no Static Use of an input variable, the linker shouldn't
fail whenever there is no defined matching output variable in the
previous stage.

From page 47 (page 51 of the PDF) of the GLSL 4.60 v.5 spec:

  " Only the input variables that are statically read need to be
    written by the previous stage; it is allowed to have superfluous
    declarations of input variables."

Now, we complete this exception whenever the input variable has an
explicit location. Previously, 18004c338f ("glsl: fail when a
shader's input var has not an equivalent out var in previous") took
care of the cases in which the input variable didn't have an explicit
location.

v2: do the location based interface matching check regardless on
    whether it is a separable program or not (Ilia).

Fixes: 1aa5738e66 ("glsl: relax input->output validation for SSO programs")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-03-19 17:23:27 +02:00
Andres Gomez de1bc2d19a glsl/linker: always validate explicit location among inputs
Outputs are always validated when having explicit locations and we
were trusting its outcome to catch similar problems with the inputs
since, in case of having undefined outputs for existing inputs, we
would be already reporting a linker error.

However, consider this case:

  " Shader stage n:
    ---------------

    ...

    layout(location = 0) out float a;

    ...

    Shader stage n+1:
    -----------------

    ...

    layout(location = 0) in float b;
    layout(location = 0) in float c;

    ...
  "

Currently, this won't report a linker error even though location
aliasing is happening for the inputs.

Therefore, we also need to validate the inputs independently from the
outcome of the outputs validation.

Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-03-19 17:23:27 +02:00
Andres Gomez a96093136b glsl: correctly validate component layout qualifier for dvec{3,4}
From page 62 (page 68 of the PDF) of the GLSL 4.50 v.7 spec:

  " A dvec3 or dvec4 can only be declared without specifying a
    component."

Therefore, using the "component" qualifier with a dvec3 or dvec4
should result in a compiling error.

v2: enhance the error message (Timothy).

Fixes: 94438578d2 ("glsl: validate and store component layout qualifier in GLSL IR")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-03-19 17:23:27 +02:00
Jason Ekstrand cbfe31ccbe Revert "nir: const `nir_call_instr::callee`"
This reverts commit db57db5317.  When
building IR, nothing is really immutable and, since C has no concept of
constness propagating beyond the first pointer, we have to be vary
careful with how we use it.  To just throw const into a function like
this is a lie.

Instead, we should just drop the unneeded const in spirv_to_nir which
this commit does along with the revert.
2019-03-19 10:19:42 -05:00
Eric Engestrom db57db5317 nir: const `nir_call_instr::callee`
Fixes: c95afe56a8 "nir/spirv: handle kernel function parameters"
Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
2019-03-19 12:51:53 +00:00
Rafael Antognolli 76f9ca6cf9 iris: Make intel_hiz_exec public.
Need to use it for fast clearing depth buffers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-18 22:27:02 -07:00
Rafael Antognolli 9c63ec26ea iris: Enable HiZ for multisampled depth surfaces.
Fix this check so that we can get a HiZ aux buffer for multisampled
surfaces as well. Also make sure we don't try to emit a sampler view
surface state for multisampled depth sufaces with HiZ enabled, as
the sampler can't HiZ for multisampled buffers and isl would assert.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-18 22:21:30 -07:00
Karol Herbst d0ba326f23 nir/spirv: support physical pointers
v2: add load_kernel_input

Signed-off-by: Karol Herbst <kherbst@redhat.com>

squash! nir/spirv: support physical pointers
2019-03-19 04:08:07 +00:00
Karol Herbst c95afe56a8 nir/spirv: handle kernel function parameters
the idea here is to generate an entry point stub function wrapping around the
actual kernel function and turn all parameters into shader inputs with byte
addressing instead of vec4.

This gives us several advantages:
1. calling kernel functions doesn't differ from calling any other function
2. CL inputs match uniforms in most ways and we can just take advantage of most
   of nir_lower_io

v2: move code into a seperate function
v3: verify the entry point got a name
    fix minor typo
v4: make vtn_emit_kernel_entry_point_wrapper take the old entry point as an arg

Signed-off-by: Karol Herbst <kherbst@redhat.com>
2019-03-19 04:08:07 +00:00
Karol Herbst 0ccdf23a57 nir/lower_locals_to_regs: cast array index to 32 bit
local memory is too small to require 64 bit pointers, so cast the array index
to a 32 bit value to save up on 64 bit operations.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
2019-03-19 04:08:07 +00:00
Karol Herbst 44d32e62fb glsl: add cl_size and cl_alignment
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-03-19 04:08:07 +00:00
Karol Herbst 659f333b3a glsl: add packed for struct types
We need this for OpenCL kernels because we have to apply C rules for alignment
and padding inside structs and for this we also have to know if a struct is
packed or not.

v2: fix for kernel params

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-03-19 04:08:07 +00:00
Alyssa Rosenzweig b98955e128 panfrost: Rewrite varying assembly
There are two stages to varying assembly in the command stream: creating
the varying buffers in the command stream, and creating the varying meta
descriptors (also in the command stream) linked to the aforementioned
buffers. The previous code for this was ad hoc and brittle, making some
invalid assumptions causing unmaintainable workarounds to pile up across
the driver (both compiler and command stream side).

This patch completely rewrites the varying assembly code. There's a
trivial performance penalty (we now memcpy the varying meta to the
command stream on draw, rather than on compile). That said, the
improvement in flexibility and clarity is well-worth it.

The motivator for these changes was support for gl_PointCoord (and
eventually point sprites for legacy GL), which was impossible to
implement with the old varying assembly code.  With the new refactor,
it's super easy; support for gl_PointCoord is included with this patch.

All in all, I'm quite happy with how this turned out.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-19 03:55:10 +00:00
Alyssa Rosenzweig 5e6d33a7b6 panfrost: Replay more varying buffers
This is required for gl_PointCoord to show up on decodes.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-19 03:53:56 +00:00
Alyssa Rosenzweig b517e36842 panfrost/decode: Respect primitive size pointers
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-19 03:53:48 +00:00
Alyssa Rosenzweig 4f89e4437c panfrost: Disable PIPE_CAP_TGSI_TEXCOORD
I don't know why this was on to begin with...?

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-19 03:52:43 +00:00
Alyssa Rosenzweig 7c02c4f114 panfrost: Fix primconvert check
In addition to fixing actual primconvert bugs, this prevents an infinite
loop when trying to draw POINTS.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-19 03:52:20 +00:00
Alyssa Rosenzweig 60d5b85261 panfrost: Workaround buffer overrun with mip level
Mipmaps are still broken, but at least this way we don't crash on some
apps using mipmaps.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-03-19 03:50:59 +00:00
Bas Nieuwenhuizen a777c3d7cb radv: Use correct image view comparison for fast clears.
The if is actually returning true on success, enabling fast clears, so we
need to have the test succeed when the iview dimensions are right.

Fixes: d5400a5ec2 "radv: provide a helper for comparing an image extents."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-03-19 00:39:47 +01:00
Jason Ekstrand 493b3ada9b anv,radv: Implement VK_KHR_surface_capability_protected
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-03-18 17:02:10 +00:00
Danylo Piliaiev ecb98c6898 anv: Treat zero size XFB buffer as disabled
Vulkan spec doesn't explicitly forbid zero size transform
feedback buffers.
Having zero size xfb caused SurfaceSize overflow and
triggered assert in debug build.

The only way to have zero size SO_BUFFER is to disable
SO_BUFFER as stated in hardware spec.

From SKL PRM, Vol 2a, "3DSTATE_SO_BUFFER":
  "If set, stream output to SO Buffer is enabled,
  if 3DSTATE_STREAMOUT::SO Function ENABLE is also enabled.
  If clear, the SO Buffer is considered "not bound" and effectively
  treated as a zero- length buffer for the purposes of SO output and
  overflow detection. If an enabled stream's Stream to Buffer Selects
  includes this buffer it is by definition an overflow condition.
  That stream will cause no writes to occur,
  and only SO_PRIM_STORAGE_NEEDED[<stream>] will increment."

Fixes: 36ee2fd61c "anv: Implement the basic form of VK_EXT_transform_feedback"

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-18 16:09:42 +00:00