Commit Graph

114824 Commits

Author SHA1 Message Date
Alyssa Rosenzweig 89c5370118 pan/decode: Mark tripped zeroes with XXX
This normalizes the printed format. It also makes it easier for the
future when we may introduce semantic _warn and _error handlers.

A tripped zero is essentially a hazard to check for.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
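
A minimal sketch of the "tripped zero" idea from 89c5370118: a field that is expected to be zero is reported with an XXX prefix only when it is not. The helper name format_tripped_zero and the exact message text are hypothetical and are not taken from the pandecode sources.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write an "XXX"-prefixed message into buf when a
 * field that should be zero is not. Returns the number of characters
 * snprintf would write, or 0 when the field is zero and nothing is
 * reported. */
static int
format_tripped_zero(char *buf, size_t len, const char *name, uint32_t value)
{
    if (value == 0) {
        if (len > 0)
            buf[0] = '\0';
        return 0;
    }

    return snprintf(buf, len, "XXX: %s tripped: 0x%x\n", name,
                    (unsigned)value);
}
```

Routing every such report through one helper keeps the printed format uniform, which is what would make a later switch to semantic warning and error handlers straightforward.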
Alyssa Rosenzweig e49204c878 pan/decode: Check for MFBD preload chicken bit
If this bit is clear, MFBD preload will be enabled, and you.. don't want
that. (At least, when the bit is clear, the old contents of the
framebuffer will be preserved. I'm assuming this is what "MFBD preload"
refers to in kbase.)

Validate that this bit is always set.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig c9b6233558 pan/decode: Validate AFBC fields are zero when AFBC is disabled
There is no "chunknown" structure; that part of the union is an artefact
from falsely believing vertex/tiler MFBDs could have render targets
attached (they can't). These are just plain old AFBC fields, and if
there is no AFBC, it's an error to set these fields.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig 4aeb694462 pan/decode: Do not print uniform/buffers explicitly
For our purposes of driver debugging, the contents of uniform buffers
are rarely interesting; we're more concerned about the metadata setting
them up.

We do need to be careful to validate the sizes of both uniforms and
uniform buffers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig 4391c65f10 pan/decode: Add static bounds checking utility
Many structures in the command stream have a GPU address and size
determined statically. We should check that the pointers we are passed
are valid and the buffers they point to are big enough for the given
size. If they're not, an MMU fault would be raised.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
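
The check described in 4391c65f10 can be sketched as follows: given a known mapping, confirm that a GPU address and a statically known size fall entirely inside it. The struct mapped_region and the function region_contains are illustrative names only, assumed for this sketch, and are not the actual pandecode utility.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one mapped buffer; not the real pandecode type. */
struct mapped_region {
    uint64_t gpu_va; /* GPU address where the buffer starts */
    size_t length;   /* size of the mapping in bytes */
};

/* Return true if [addr, addr + size) lies entirely inside the region. */
static bool
region_contains(const struct mapped_region *r, uint64_t addr, size_t size)
{
    if (addr < r->gpu_va)
        return false;

    uint64_t offset = addr - r->gpu_va;

    /* Compare against the remaining space so that offset + size is never
     * computed and cannot overflow. */
    return offset <= r->length && size <= r->length - offset;
}
```

An access that fails this test is the kind that would raise an MMU fault on hardware, so a decoder can flag it before the job is ever submitted.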
Alyssa Rosenzweig 9dfbc8dc03 pan/decode: Don't print unreferenced attribute memory
This is a source of uninitialized memory leaking into the traces.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig 897110a566 pan/decode: Check for a number of potential issues
Verify sizes / masks / etc against something logical to cull down the
trace space and automatically guard against a number of potential
hazards.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig f5c293425f panfrost: Correct polygon size computations
While the algorithm for computing the header size has been correct for a
while, we used a major hack to conservatively guess the body size. Let's
scrap that and figure out the algorithm we actually need to use to be
bit-identical with what the hardware expects.

We do have to be careful to add the header size to the total computed BO
size.

It's not clear how big the polygon list needs to be in practice -- but
it has to be somewhat bigger than the polygon list itself. This needs
more investigation. If we size the polygon list exactly based on the
polygon_list_size field, we get faults like:

[ 1224.219886] panfrost ff9a0000.gpu: Unhandled Page fault in AS0 at VA 0x000000001BDE8000
               Reason: TODO
               raw fault status: 0x660003C3
               decoded fault status: SLAVE FAULT
               exception type 0xC3: TRANSLATION_FAULT_LEVEL3
               access type 0x3: WRITE
               source id 0x6600

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig f6e41f30d0 panfrost: Remove DRY_RUN
Nobody uses this anymore anyway.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig b4a214207c pan/decode: Print "just right" count of texture pointers
The other commented lines just add noise/entropy we don't want, and can
in fact crash the trace due to asserts failing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig a8bd3ad470 pan/decode: Verify and omit polygon size
The polygon sizes are computed from the width/height/flags, so we can
reverse the computation and use our computation to verify the two
computation algorithms are bit-identical. If they are, we can omit the
computed fields.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig b45eb2775e panfrost: Move pan_tiler.c outside of Gallium
The routines in this file may be shared with Vulkan.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig 13d07978ff pan/decode: Bounds check polygon list and tiler heap
We have the BOs available; ensure that the bounds specified in the
command stream are actually the correct bounds.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig b072d0357b pan/decode: Allow updating mmaps
This allows the caller to call track_mmap multiple times for the same
gpu_va for the purpose of updating the mmap. This is used to trace
invisible BOs with kbase and doesn't apply to native traces.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:52 -07:00
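
A rough sketch of the behaviour described in b072d0357b: registering the same gpu_va a second time updates the existing entry instead of adding a duplicate. The table layout, the MAX_TRACKED limit, and the name track_mmap_sketch are assumptions made for illustration; the real track_mmap signature is not shown in the commit message.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED 64 /* arbitrary capacity for this sketch */

/* Hypothetical bookkeeping entry for one tracked mapping. */
struct tracked_mmap {
    uint64_t gpu_va; /* GPU address used as the lookup key */
    void *cpu_addr;  /* where the buffer is mapped on the CPU */
    size_t length;   /* size of the mapping in bytes */
};

static struct tracked_mmap table[MAX_TRACKED];
static size_t table_count;

/* Insert a mapping, or update it in place if gpu_va is already tracked.
 * Returns 0 on success and -1 if the table is full. */
static int
track_mmap_sketch(uint64_t gpu_va, void *cpu_addr, size_t length)
{
    for (size_t i = 0; i < table_count; i++) {
        if (table[i].gpu_va == gpu_va) {
            table[i].cpu_addr = cpu_addr;
            table[i].length = length;
            return 0;
        }
    }

    if (table_count == MAX_TRACKED)
        return -1;

    table[table_count++] = (struct tracked_mmap){ gpu_va, cpu_addr, length };
    return 0;
}
```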
Alyssa Rosenzweig 52101e48f8 pan/decode: Express tiler structures as offsets
This allows us to catch a class of errors (for negative offsets, etc)
automatically.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:52 -07:00
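
To illustrate why 52101e48f8 prints offsets rather than raw pointers, here is a small sketch: once an address is expressed relative to its buffer, a negative or oversized value is immediately recognisable. Both function names are hypothetical, and the sketch assumes the usual two's complement conversion from uint64_t to int64_t.

```c
#include <stdint.h>

/* Signed distance of an address from the start of its buffer. An address
 * below the base yields a negative offset (assuming two's complement
 * conversion, which holds on common platforms). */
static int64_t
offset_from_base(uint64_t base, uint64_t address)
{
    return (int64_t)(address - base);
}

/* Return 1 if the offset points inside a buffer of `size` bytes. */
static int
offset_is_valid(int64_t offset, uint64_t size)
{
    return offset >= 0 && (uint64_t)offset < size;
}
```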
Alyssa Rosenzweig e918dd8a6c pan/decode: Don't print zero exception_status
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig 2a8d776884 pan/decode: Fix missing NULL terminator
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig 6c67bd05a6 pan/decode: Silence workgroups_x_shift_2
Since we're bit-identical we can compare the computed value.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig 3752566584 panfrost: Implement workgroups_x_shift_2 quirk
I'm not sure why this is done this way, but let's follow the blob.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:51 -07:00
Alyssa Rosenzweig 25ed930c4a pan/decode: Don't print canonical workgroup encoding
The on-the-wire representation of workgroups is not 1:1 to the decoded
Gallium-level workgroups (there are multiple valid encodings; see the
previous commit). Nevertheless, since we're now bit-identical in packing
vs the blob, we can check for a canonical form and only print the
verbose trace if we fail the canonical form.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:51 -07:00
Alyssa Rosenzweig fb56a162a9 panfrost: Set workgroups z to 32 for non-instanced graphics
This is a blob quirk; insofar as I know, the hardware doesn't care.
But we're trying to be bit-identical to take as much entropy out of
traces as possible, so let's introduce the quirk.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:51 -07:00
Alyssa Rosenzweig 39b226cfb3 panfrost: Move pan_invocation to shared panfrost/
The routines in this file have no dependency on Gallium. Let's share
them so they can be used for a theoretical future Vulkan driver or, more
immediately, consulted when tracing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:51 -07:00
Alyssa Rosenzweig d9f33951df pan/decode: Don't print MALI_DRAW_NONE
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:51 -07:00
Alyssa Rosenzweig 740f86c9ee pan/decode: Eliminate DYN_MEMORY_PROP
It's obvious that it's linked by virtue of us printing the struct it
links against. No need to repeat ourselves.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:51 -07:00
Alejandro Piñeiro 41549a18e6 i965: Enable OpenGL 4.6 for Gen8+
The last remaining stuff was ARB_gl_spirv and ARB_spirv_extensions.

Note that it is very likely that we can enable it for some Gen7 (as
4.5 was), but this has not been tested yet.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-21 17:29:42 +02:00
Alejandro Piñeiro 7dab76014a mesa/version: uncomment SPIR-V extensions
They are implemented on i965, so we can expose 4.6.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-21 17:29:42 +02:00
Alejandro Piñeiro 2e8565bead i965: enable ARB_gl_spirv extension and ARB_spirv_extensions for gen7+
v2: squashed the two enable patches with the docs one (Jason)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-21 17:29:42 +02:00
Tomeu Vizoso 4109a2f612 panfrost/ci: Print load stats
To help make sure we are running tests with the ideal number of threads,
print load stats to make it obvious when there's a problem with
utilization.

This will be especially useful when we run tests on a wider variety of
devices.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 16:41:56 +02:00
Tomeu Vizoso 3794652385 panfrost/ci: Install qemu-arm-static into chroot
Some runners may be configured such that the qemu binary might not be
available by the time we need to start running commands within the
chroot.

So make sure that it's there to avoid surprising problems in that case.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 16:41:56 +02:00
Tomeu Vizoso 8496045adc panfrost/ci: Build kernel with CONFIG_DETECT_HUNG_TASK
There's lots of locking changes going into the Panfrost kernel driver,
so better be prepared.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 16:41:56 +02:00
Tomeu Vizoso a074513dc2 panfrost/ci: Print bootstrap log
A number of things can go wrong when building the rootfs from within a
non-native chroot, so make sure to print the bootstrap.log so we can
tell what's going on.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 16:41:56 +02:00
Tomeu Vizoso 76af465e57 panfrost/ci: Use Volt-based runner for dEQP tests
It's able to run tests in parallel, fully utilizing the HW and
considerably shortening the time it takes.

Needed to disable tests in RK3288 for now because Volt doesn't support
armhf yet, though this should be fixed soon.

Tests are now run with --deqp-gl-config-name=rgba8888d24s8ms0, so we are
hitting a few more failures in tests that previously were being skipped.

The time to run the tests decreases from around 8 minutes to 1:45
minutes, allowing for extending coverage without increasing CI times too
much.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 16:41:56 +02:00
Samuel Pitoiset 29834fe8a2 radv: implement VK_AMD_shader_core_properties2
Trivial extension that matches PAL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-21 15:14:29 +02:00
Samuel Pitoiset a6ad9e8ccf radv: force enable VK_AMD_shader_ballot for Wolfenstein Youngblood
This gives a nice boost, +20% at this time on my Vega 56. Shader
ballot should be enabled by default at some point but it reduces
performance a bit (-6%) with Wolfenstein II. Enable it only for
Youngblood at the moment, like what we did for Talos in the past.

As a bonus, it gets rid of some minor artifacts that only
happen when ballot is disabled for some reason.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-21 15:14:29 +02:00
Samuel Pitoiset f202ac27a9 radv: add a new debug option called RADV_DEBUG=noshaderballot
Shader ballot will be enabled by default for Wolfenstein
Youngblood. This follows what we did for sisched.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-21 15:14:29 +02:00
Samuel Pitoiset e73d863a66 radv: allow to enable VK_AMD_shader_ballot only on GFX8+
Scans aren't implemented on SI/CIK.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-21 15:14:29 +02:00
Danylo Piliaiev e71fc7f238 nir/loop_analyze: Treat do{}while(false) loops as 0 iterations
Loops like:

block block_0:
vec1 32 ssa_2 = load_const (0x00000020)
vec1 32 ssa_3 = load_const (0x00000001)
loop {
    vec1 32 ssa_7 = phi block_0: ssa_3, block_4: ssa_9
    vec1 1 ssa_8 = ige ssa_2, ssa_7
    if ssa_8 {
        break
    } else {
    }
    vec1 32 ssa_9 = iadd ssa_7, ssa_1
}

were treated as having more than 1 iteration and produced wrong results
after unrolling. However, such a loop exits during the first iteration
if it is not unrolled.

So we check whether the loop will actually loop.

Fixes tests/shaders/glsl-fs-loop-while-false-02.shader_test

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-08-21 11:01:15 +00:00
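
The loop quoted in e71fc7f238 can be mirrored in C to see why it never takes its back edge. This is only an illustration of the control flow, not the nir loop analysis itself: count_back_edges is a hypothetical helper, the step value stands in for ssa_1 (which is not defined in the quoted snippet), and the cap exists solely to keep the sketch bounded.

```c
#include <stdint.h>

/* Mirror of the quoted loop: the exit test sits at the top of the body,
 * then the induction variable is incremented. Returns how many times the
 * back edge is taken, stopping after `cap` iterations. */
static unsigned
count_back_edges(int32_t limit, int32_t init, int32_t step, unsigned cap)
{
    unsigned back_edges = 0;
    int32_t i = init; /* phi: init on entry, incremented value afterwards */

    while (back_edges < cap) {
        if (limit >= i) /* ige ssa_2, ssa_7 */
            break;
        i += step;      /* iadd ssa_7, ssa_1 */
        back_edges++;
    }

    return back_edges;
}
```

With the constants from the commit message (limit 0x20, initial value 1), the exit test is already true on entry, so count_back_edges(0x20, 1, 1, 8) returns 0: the body runs once and the loop never repeats.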
Danylo Piliaiev 84b3ef6a96 nir/loop_unroll: Prepare loop for unrolling in wrapper_unroll
Without loop_prepare_for_unroll loops are losing phis.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111411
Fixes: 5db98195 "nir: add loop unroll support for wrapper loops"
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-08-21 10:43:27 +00:00
Danylo Piliaiev 8869f44e9a nir/loop_unroll: Update the comments for loop_prepare_for_unroll
The comments say that we should remove a continue if it is the last
instruction in a loop; however, we remove any kind of jump.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-08-21 10:43:27 +00:00
Bas Nieuwenhuizen e04761d0f9 radv: Emit VGT_GS_ONCHIP_CNTL for tess on GFX10.
Otherwise hangs are possible. This register was already set for
GS and NGG.

Fixes: 5eaed7ecfc "radv/gfx10: enable support for NAVI10, NAVI12 and NAVI14"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-21 09:51:47 +00:00
Bas Nieuwenhuizen 2e763f7c87 radv: Use correct vgpr_comp_cnt for VS if both prim_id and instance_id are needed.
Should take the max of the 2.

Fixes: ea337c8b7e "radv/gfx10: fix VS input VGPRs with the legacy path"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-21 09:38:46 +00:00
Daniel Schürmann 7fa1740035 nir/algebraic: some subtraction optimizations
Changes with RADV/ACO:
Totals from affected shaders:
SGPRS: 444087 -> 455543 (2.58 %)
VGPRS: 436468 -> 436768 (0.07 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 13448928 -> 13353520 (-0.71 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 68060 -> 67979 (-0.12 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-21 08:51:49 +00:00
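
The percentages in the statistics of 7fa1740035 are consistent with a plain relative change, (after - before) / before. A small sketch of that arithmetic follows; percent_change is a hypothetical helper, and returning 0 when the "before" value is 0 is an assumption chosen to match the "0 -> 0 (0.00 %)" rows.

```c
#include <stdint.h>

/* Relative change from `before` to `after`, in percent. A zero `before`
 * is reported as 0 by convention in this sketch. */
static double
percent_change(int64_t before, int64_t after)
{
    if (before == 0)
        return 0.0;

    return 100.0 * (double)(after - before) / (double)before;
}
```

For example, SGPRS going from 444087 to 455543 is a difference of 11456, or about 2.58 %, and Code Size going from 13448928 to 13353520 is a difference of -95408, or about -0.71 %.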
Lionel Landwerlin 8a2465e3f3 radeonsi: take reference glsl types for compile threads
An application quitting before destroying its GL context and
binding a NULL context might still have a radeonsi compiler thread
running and potentially still accessing the types.

Therefore take a reference for the duration of the threads' lifetime.

v2: Only ref the glsl types, the builtins should be used by the time
    shader data gets to a gallium driver.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-21 09:44:10 +02:00
Lionel Landwerlin e4da8b9c33 mesa/compiler: rework tear down of builtin/types
The issue we're running into when running CTS is that glsl types are
deleted while builtins depending on them are not.

This happens because on one hand we have glsl types ref counted, but
builtins are not. Instead builtins are destroyed when unloading libGL
or explicitly calling glReleaseShaderCompiler().

This change removes almost entirely any dealing with glsl types
ref/unref by letting the builtins deal with it instead. In turn we
introduce a builtin ref count mechanism. Each GL context takes a
reference on the builtins when compiling a shader for the first time.
It releases the reference when the context is destroyed. It can also
explicitly release those when glReleaseShaderCompiler() is called.

Finally we also take a reference on the glsl types when loading libGL
to avoid recreating glsl types too often.

v2: Ensure we take a reference if we don't have one in link step (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110796
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-21 09:44:10 +02:00
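
The reference counting scheme described in e4da8b9c33 can be sketched roughly as below: the first user creates the shared builtins and the last user tears them down. All names here are hypothetical, the "builtins" are reduced to a single flag, and locking is deliberately omitted, so this is a single-threaded illustration rather than the Mesa implementation.

```c
#include <stdbool.h>

/* Number of users (for example GL contexts) currently holding a reference. */
static unsigned builtin_users;

/* Stand-in for the shared builtin state; true while it exists. */
static bool builtins_ready;

/* Take a reference; the first reference creates the shared state. */
static void
builtins_ref(void)
{
    if (builtin_users++ == 0)
        builtins_ready = true; /* real code would build the builtins here */
}

/* Drop a reference; the last one destroys the shared state. */
static void
builtins_unref(void)
{
    if (builtin_users == 0)
        return; /* unbalanced unref: nothing to release */

    if (--builtin_users == 0)
        builtins_ready = false; /* real code would free the builtins here */
}
```

In the scheme the commit describes, a context would take its reference when it first compiles a shader and release it when the context is destroyed or when glReleaseShaderCompiler() is called. A multi-threaded version would need to guard the counter with a lock.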
Lionel Landwerlin 9f37bc419c compiler: ensure glsl types are not created without a reference
We want to detect invalid refcounting so assert we have at least one
use before creating types.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-21 09:44:10 +02:00
Lionel Landwerlin 8b913bd1ce nir/tests: take reference on glsl types
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-21 09:44:10 +02:00
Lionel Landwerlin 3ade8f0040 glsl/tests: take refs on glsl types
Much like each driver, tests as standalone entities must take
references on the glsl types.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-21 09:44:10 +02:00
Samuel Pitoiset 41d9873459 radv/gfx10: hardcode some depth+stencil formats in the format table
The script doesn't handle them correctly and D16_UNORM_S8_UINT
isn't supported by the hardware, mark it as invalid.

This fixes a warning when generating gfx10_format_table.h.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111393
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-21 08:17:40 +02:00
Samuel Pitoiset 1650e747c6 radv/gfx10: tidy up gfx10_format_table.py
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-21 08:17:38 +02:00
Ilia Mirkin 958390a9bf gallium/vl: use compute preference for all multimedia, not just blit
The compute paths in vl are a bit AMD-specific. For example, on
nouveau they try to use a BGRX8 image format, which is not supported.
Fixing all this is probably possible, but since the compute paths aren't
in any way better, it's difficult to care.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Fixes: 9364d66cb7 (gallium/auxiliary/vl: Add video compositor compute shader render)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-20 23:51:39 -04:00