Commit Graph

3143 Commits

Author SHA1 Message Date
Emma Anholt f460fb3f91 turnip: Store the computed iova in the tu_buffer.
We recently had a bug of forgetting to add the buf->bo_offset.  Just make
the easiest field to get be the bo->iova + buf->bo_offset already.  Plus,
a little less work at emit time.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14816>
2022-02-01 15:30:12 +00:00
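
As a rough illustration of the idea in f460fb3f91 above, the sketch below caches the final GPU address when a buffer is bound to its backing object. The struct and function names are hypothetical stand-ins, not the actual turnip definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the driver's BO and buffer objects. */
struct example_bo {
   uint64_t iova; /* GPU virtual address of the start of the BO */
   uint64_t size; /* size of the BO in bytes */
};

struct example_buffer {
   struct example_bo *bo;
   uint64_t bo_offset; /* offset of the buffer within the BO */
   uint64_t iova;      /* cached bo->iova + bo_offset */
};

/* Compute the final address once, at bind time, so that emit-time code can
 * read buf->iova directly and cannot forget to add bo_offset. */
void
example_buffer_bind(struct example_buffer *buf, struct example_bo *bo,
                    uint64_t offset)
{
   assert(offset <= bo->size);
   buf->bo = bo;
   buf->bo_offset = offset;
   buf->iova = bo->iova + offset;
}
```

With this layout, a caller that only needs the address never touches bo->iova, which is the class of bug fixed in 9eb1592e57.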
Chia-I Wu 9eb1592e57 turnip: respect buf->bo_offset in transform feedback
buf->bo->iova should always be offset by buf->bo_offset.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14786>
2022-01-31 18:31:54 +00:00
Danylo Piliaiev f14cae43ac ci/freedreno: properly test sysmem and gmem paths
After the autotuner's introduction, most CTS tests run in
sysmem mode. Now we have to force a gmem run and add a small
forced sysmem run, since it's not guaranteed that the autotuner
will select gmem in the future.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12128>
2022-01-31 15:26:35 +00:00
Danylo Piliaiev 803055ccb4 tu: add debug option to force gmem
With the autotuner we now want to be able to force gmem rendering.
It will still respect existing constraints, though.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12128>
2022-01-31 15:26:35 +00:00
Danylo Piliaiev a4f9c54444 freedreno: Update gmem/sysmem debug options to be in line with turnip
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12128>
2022-01-31 15:26:35 +00:00
Danylo Piliaiev dbae9fa7d8 tu: implement sysmem vs gmem autotuner
The implementation is separate from Freedreno due to multithreading
support.

In Vulkan, an application may fill command buffers from many threads
and expect no locking to occur. We do introduce the possibility of
locking at renderpass end; however, assuming that the application
doesn't have a huge number of slightly different renderpasses,
there should be little to no contention.

Other assumptions are:
- Application does submit command buffers soon after their creation.

Breaking the above may lead to some decrease in performance or
autotuner turning itself off.

The heuristic is too simplistic at the moment. To find a proper
one, we should run a bunch of traces with sysmem and gmem, and
build a better heuristic from the gathered data.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12128>
2022-01-31 15:26:35 +00:00
Connor Abbott 360f7c5d64 tu: Initial link-time optimizations
This is mostly taken from radv, and cleaned up a bit: don't explicitly
list every stage at the beginning, and name the shaders "producer" and
"consumer" to reduce confusion. I also stripped out a lot of other stuff
to get to the bare minimum of calling nir_link_opt_varyings,
nir_remove_unused_varyings, and nir_compact_varyings and then cleaning
up the fallout. In the future we may want to temporarily scalarize I/O
like radv does, and add back a few things like the psize optimization.
In the meantime this already provides a lot of benefit.

Results from the radv fossil_db with some apps not compilable by turnip
removed:

Totals:
MaxWaves: 1637288 -> 1668200 (+1.89%); split: +1.89%, -0.00%
Instrs: 54620287 -> 54114442 (-0.93%); split: -0.98%, +0.05%
CodeSize: 92235646 -> 91277584 (-1.04%); split: -1.07%, +0.03%
NOPs: 11176775 -> 11185206 (+0.08%); split: -0.63%, +0.71%
Full: 1689271 -> 1657175 (-1.90%); split: -1.92%, +0.02%
(ss): 1318763 -> 1317757 (-0.08%); split: -1.40%, +1.32%
(sy): 618795 -> 617724 (-0.17%); split: -0.70%, +0.53%
(ss)-stall: 3496370 -> 3470116 (-0.75%); split: -1.37%, +0.62%
(sy)-stall: 23512954 -> 23511164 (-0.01%); split: -1.04%, +1.03%
STPs: 27557 -> 27461 (-0.35%)
LDPs: 22948 -> 22804 (-0.63%)
Cat0: 11823765 -> 11829681 (+0.05%); split: -0.62%, +0.67%
Cat1: 3120042 -> 2991831 (-4.11%); split: -4.43%, +0.32%
Cat2: 28605309 -> 28324829 (-0.98%); split: -0.98%, +0.00%
Cat3: 7334628 -> 7252342 (-1.12%); split: -1.12%, +0.00%
Cat4: 1216514 -> 1204894 (-0.96%)
Cat5: 863976 -> 861926 (-0.24%)
Cat6: 1648571 -> 1641457 (-0.43%)

Totals from 23575 (16.16% of 145856) affected shaders:
MaxWaves: 258806 -> 289718 (+11.94%); split: +11.94%, -0.00%
Instrs: 7571190 -> 7065345 (-6.68%); split: -7.04%, +0.36%
CodeSize: 13864308 -> 12906246 (-6.91%); split: -7.09%, +0.18%
NOPs: 959185 -> 967616 (+0.88%); split: -7.35%, +8.23%
Full: 313335 -> 281239 (-10.24%); split: -10.36%, +0.11%
(ss): 154628 -> 153622 (-0.65%); split: -11.90%, +11.25%
(sy): 69758 -> 68687 (-1.54%); split: -6.21%, +4.67%
(ss)-stall: 322002 -> 295748 (-8.15%); split: -14.92%, +6.76%
(sy)-stall: 3270366 -> 3268576 (-0.05%); split: -7.45%, +7.40%

STPs: 3624 -> 3528 (-2.65%)
LDPs: 1074 -> 930 (-13.41%)
Cat0: 1022684 -> 1028600 (+0.58%); split: -7.13%, +7.71%
Cat1: 531102 -> 402891 (-24.14%); split: -26.04%, +1.90%
Cat2: 4090309 -> 3809829 (-6.86%); split: -6.86%, +0.00%
Cat3: 1449686 -> 1367400 (-5.68%); split: -5.69%, +0.01%
Cat4: 103543 -> 91923 (-11.22%)
Cat5: 57441 -> 55391 (-3.57%)
Cat6: 316096 -> 308982 (-2.25%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14767>
2022-01-31 12:19:55 +00:00
Emma Anholt b5e41c8c2d ci/freedreno: Switch 2 default a630 VK jobs to being GLES and VK ASan jobs.
The automatic VK coverage we care about is happening on a618, which is the
HW we're shipping.  Having the old 630 runners make sure we don't leak
memory is a great use for them.  Still, keep one default A630 VK job to
make sure we don't totally trash it.

Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14235>
2022-01-27 23:47:46 +00:00
Emma Anholt 8457667be9 ci: Use a dlclose-disabling preload library for leak checking in Vulkan.
For GL, we disable the dlclose() call on the driver in asan builds so that
leak reports get proper backtraces.  For Vulkan, the dlclose() happens
from libvulkan so you need a bigger hammer to keep our drivers loaded.

Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14235>
2022-01-27 23:47:46 +00:00
Danylo Piliaiev da7a475138 turnip: Drop references to layout of all sets on pool reset/destruction
We dropped the references only for non-host_memory_base pools.
Create a list of live descriptor sets to account for all of them.

Fixes: 1b513f49 ("tu: add reference counting for descriptor set layouts")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14235>
2022-01-27 23:47:46 +00:00
Danylo Piliaiev 24144f6f5c turnip/trace: Delete unused start/end_resolve tracepoints
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14391>
2022-01-27 18:59:43 +00:00
Danylo Piliaiev 1989e1e6d8 turnip/perfetto: handle gpu timestamps being non-monotonic
Perfetto requires time in clock snapshots to be monotonic, otherwise
the clock would be excluded.
GPU timestamps start from zero after every suspend-resume cycle
which makes them non-monotonic.

As a solution on msm we check whether GPU was just resumed and
remember previous highest timestamp to then add it to the next
timestamps.

If the functionality to detect whether the GPU was resumed is unavailable
or doesn't work, we fall back to checking for a discontinuity
in timestamps. For kgsl we always use the fallback.

Fixes renderstage timeline disappearing in AGI.

Or you could avoid the issue altogether by preventing GPU from going to
sleep by increasing auto suspend delay e.g.:

  echo 5000 > /sys/devices/platform/soc\@0/3d00000.gpu/power/autosuspend_delay_ms

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14391>
2022-01-27 18:59:43 +00:00
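
The discontinuity fallback described in 1989e1e6d8 above can be sketched roughly as follows. This is a minimal illustration that assumes a drop in the raw counter always means a suspend-resume reset; the names are hypothetical and not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical state for turning raw GPU timestamps into monotonic ones. */
struct example_ts_state {
   uint64_t offset;   /* sum of the highest raw values seen before each reset */
   uint64_t last_raw; /* most recent raw timestamp */
};

/* Fallback path: a raw timestamp lower than the previous one is treated as a
 * sign that the counter restarted from zero, so the previous highest value is
 * folded into the offset applied to all later timestamps. */
uint64_t
example_fixup_timestamp(struct example_ts_state *s, uint64_t raw)
{
   if (raw < s->last_raw)
      s->offset += s->last_raw;
   s->last_raw = raw;
   return s->offset + raw;
}
```

Starting from a zeroed state, feeding it 100, 250, 10 yields 100, 250, 260, so the sequence handed to Perfetto never goes backwards.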
Danylo Piliaiev ba7faa6f43 turnip/trace: process u_trace chunks on queue submission
tu_QueuePresentKHR was not the best place since application
isn't required to call it.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14391>
2022-01-27 18:59:43 +00:00
Danylo Piliaiev a6482a3a6e turnip: rename tu_drm_get_timestamp into tu_device_get_gpu_timestamp
It is not drm specific and will be implemented in kgsl.

Change parameter to tu_device along the way.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14391>
2022-01-27 18:59:43 +00:00
Danylo Piliaiev f2c53c2a9b turnip/trace: refactor creation and usage of trace flush data
Fixes the case when last cmd buffer in submission doesn't have
tracepoints leading to flush data not being freed.

Added a few comments, renamed things, refactored allocations - now
the data flow should be a bit more clean.

Extracted submission data creation into tu_u_trace_submission_data_create
which will later be used in tu_kgsl.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14391>
2022-01-27 18:59:43 +00:00
Danylo Piliaiev 95896dee93 turnip/perfetto: Optimize timestamp synchronization
We shouldn't do ioctl to get timestamp if perfetto isn't connected.
Also it's better to sync timestamps after submission since the
call could block until GPU is resumed.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14391>
2022-01-27 18:59:43 +00:00
Emma Anholt 7380d8e285 ci/freedreno: Update hashes for closed traces.
These two had different pixel results from last time someone updated them,
but things still look fine.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14757>
2022-01-27 17:46:52 +00:00
Connor Abbott 065785e689 tu: Report code size in pipeline statistics
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14754>
2022-01-27 17:16:18 +00:00
Emma Anholt ccbf16124e ci/traces: Rename the piglit/run.sh script to piglit-traces.sh.
That's the only use of this script that's left.

Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14604>
2022-01-27 04:37:16 +00:00
Emma Anholt d041630a37 ci/llvmpipe,softpipe: Switch piglit testing to piglit-runner.
The new runner reduces the runtime by about 1/3 thanks to using rust
instead of python, and includes automatic flake handling so you don't just
have to skip flaky tests.  The wrapper script also includes IRC flake
reporting (so one can track and update the flakes list to improve CI
reliability), always uploading results to CI for review (so you can
diagnose flakes and look at timings), has a prettier regressions report
and a helpful timing report, and is the same as what's used by all the HW
runners as well.

The downside is that by dropping the massive list of skips, you no longer
get flagged if Mesa refactors end up accidentally disabling extensions and
thus making tests skip.  For that, I've started on
https://gitlab.freedesktop.org/anholt/deqp-runner/-/merge_requests/33 so
that hardware drivers get extension checking coverage too.

Thanks to the perf improvement, we get to drop one of the jobs for
llvmpipe.

xfail lists were mostly sed-jobs from the prior expectations lists.  The
exceptions to that you'll find in the form of whitespace around the
affected test group (usually changes of capitalization or
special-characters), or an explanation for the more interesting changes
(which thankfully we can now record in the xfails lists!).

Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14604>
2022-01-27 04:37:16 +00:00
Emma Anholt 6ce7a6e725 Revert "ci: freedreno: Update a530 dEQP fail expectation list"
This reverts commit a35c5540e4.  Another
patch doing so had already landed, so 530 was now failing all its (manual)
tests.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14604>
2022-01-27 04:37:16 +00:00
Hyunjun Ko 8a5b949a3e turnip: fix leaks of submit requests.
Fixes: 479a1c40 ("turnip: Porting to common vulkan implementation for synchronization.")

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14727>
2022-01-26 22:22:33 +00:00
Yiwei Zhang 96acd0933e tu: VkExternalImageFormatProperties is optional
..even if external image info has valid external handles.

Fixes: 26380b3a9f ("turnip: Add driver skeleton (v2)")

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14730>
2022-01-26 16:35:10 +00:00
Danylo Piliaiev 1b513f4958 tu: add reference counting for descriptor set layouts
The spec states that descriptor set layouts can be destroyed almost
at any time:

   "VkDescriptorSetLayout objects may be accessed by commands that operate
    on descriptor sets allocated using that layout, and those descriptor
    sets must not be updated with vkUpdateDescriptorSets after the descriptor
    set layout has been destroyed. Otherwise, a VkDescriptorSetLayout object
    passed as a parameter to create another object is not further accessed
    by that object after the duration of the command it is passed into."

Copied mostly from ANV.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5893

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14622>
2022-01-25 12:17:41 +00:00
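
A minimal sketch of the reference-counting pattern that 1b513f4958 above refers to. The names are hypothetical, and the counter is a plain integer for brevity; a driver that allows concurrent access would need atomic operations instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a descriptor set layout object. */
struct example_set_layout {
   uint32_t ref_cnt;
   /* ... binding descriptions would live here ... */
};

struct example_set_layout *
example_layout_create(void)
{
   struct example_set_layout *layout = calloc(1, sizeof(*layout));
   if (layout)
      layout->ref_cnt = 1; /* reference held by the application's handle */
   return layout;
}

/* Called when a descriptor set allocated from this layout is created. */
void
example_layout_ref(struct example_set_layout *layout)
{
   assert(layout->ref_cnt >= 1);
   layout->ref_cnt++;
}

/* Called on layout destruction and when a descriptor set is freed.
 * Returns 1 if this call released the last reference and freed the layout. */
int
example_layout_unref(struct example_set_layout *layout)
{
   assert(layout->ref_cnt >= 1);
   if (--layout->ref_cnt == 0) {
      free(layout);
      return 1;
   }
   return 0;
}
```

Destroying the layout while a set still references it only drops the count to one; the memory goes away when the last set releases its reference.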
Chia-I Wu ef325d4650 freedreno/drm, turnip: set DRM_RDWR for exported dma-bufs
This allows the exported fds to be mapped for writing.  My use case is
for virtio-gpu blob resources where the fds are mapped rw and mappings
are added to the guests using KVM_SET_USER_MEMORY_REGION.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14699>
2022-01-25 05:32:38 +00:00
Jordan Crouse 3608bce137 turnip: Update the msm_kgsl.h header with the sanitized 4.19 version
The current msm_kgsl.h header in the tree isn't sanitized and the kernel
specific macros will confuse a compiler. Copy in the sanitized version of
the header from the 4.19 kernel tree which also adds a few new API bits
that are currently unused but may be useful some day.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14651>
2022-01-21 21:25:07 +00:00
Connor Abbott ab5176ec40 tu/blit: Don't set CLAMPENABLE in sampler for 3d path
This was copied from the blob before we understood what it did, and it
has questionable utility: there's nothing in the GL, Vulkan, or D3D11
specs that require the result be clamped to the underlying range to
account for imprecision. And it doesn't make sense at all for cubic
filtering, because the result can legitimately be outside the range in
some scenarios. Just remove it.

This fixes a bunch of tests added in vulkan CTS 1.2.8 to test blitting
from compressed textures, which use random inputs and therefore are more
likely to hit the out-of-range condition. For example,
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.color.2d.etc2_r8g8b8a8_unorm_block.r8g8b8a8_snorm.general_general_cubic.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14613>
2022-01-21 20:55:46 +00:00
Connor Abbott bb41d47f2e freedreno/a6xx: Name texture descriptor bit
This appears to do the same thing as CLAMPENABLE on a3xx. That is, it
clamps the result to [0, 1] for unorm formats and [-1, 1] for snorm
formats *after* filtering. In particular it's now more easily observable
with cubic filtering, because cubic filtering can produce values outside
the original range. Presumably this only matters with linear filtering
due to rounding errors when computing the weighted average.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14613>
2022-01-21 20:55:46 +00:00
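
To illustrate why the clamp discussed in bb41d47f2e above is observable with cubic filtering, the sketch below evaluates a cubic kernel halfway between two texels and then clamps the result. Catmull-Rom weights are used here only as a familiar cubic kernel; whether the hardware uses exactly these weights is an assumption, and the function names are hypothetical.

```c
/* Catmull-Rom interpolation evaluated halfway between p1 and p2.
 * The outer taps carry negative weights, so the result can overshoot. */
float
example_cubic_midpoint(float p0, float p1, float p2, float p3)
{
   return -0.0625f * p0 + 0.5625f * p1 + 0.5625f * p2 - 0.0625f * p3;
}

/* Clamp a filtered value to [0, 1] for unorm or [-1, 1] for snorm formats. */
float
example_clamp_filtered(float value, int is_snorm)
{
   const float lo = is_snorm ? -1.0f : 0.0f;
   const float hi = 1.0f;

   if (value < lo)
      return lo;
   if (value > hi)
      return hi;
   return value;
}
```

For the unorm texels 0, 1, 1, 0 the filtered value is 1.125, which the clamp would reduce to 1.0.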
Bas Nieuwenhuizen d1530a3f3b Revert "nir/algebraic: distribute fmul(fadd(a, b), c) when b and c are constants"
This reverts commit a1af902531.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5423
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14532>
2022-01-21 16:58:11 +00:00
Danylo Piliaiev cadcbed258 tu: expose VK_KHR_copy_commands2
Relevant CTS tests:
dEQP-VK.api.copy_and_blit.copy_commands2.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14623>
2022-01-20 10:43:31 +00:00
Charles Giessen 4e0604279d freedreno, tu: Update LoaderICDInterfaceVersion to v5
With the proper version checking in the common vulkan instance code
(commit 88b9b68) it is now possible to bring the reported interface
version up to v5.

Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14563>
2022-01-20 07:25:07 +00:00
Emma Anholt 700d2fbd0a nir: Add a .base field to nir_load_ubo_vec4.
This lets nir-to-tgsi fold the constant offset of addressing calculations
into the CONST[] reference, which is important for D3D9-era compatibility:
HW of that age has limited uniform space, and if we do the addressing math
as math in the shader for dynamic indexing, the nir_load_consts end up
taking up uniforms we don't have available.

r300:
total instructions in shared programs: 1279699 -> 1279167 (-0.04%)
instructions in affected programs: 134796 -> 134264 (-0.39%)
total temps in shared programs: 213912 -> 213736 (-0.08%)
temps in affected programs: 2166 -> 1990 (-8.13%)
total consts in shared programs: 953237 -> 952973 (-0.03%)
consts in affected programs: 45980 -> 45716 (-0.57%)

Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14309>
2022-01-19 22:28:34 +00:00
Dave Airlie 1352e0ba0c mesa/*: add a shader primitive type to get away from GL types.
This creates an internal shader_prim enum, I've fixed up most
users to use it instead of GL types.

don't store the enum in shader_info as it changes size, and confuses
other things.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>
2022-01-19 21:54:58 +00:00
Dave Airlie d54c07b4c4 mesa/*: use an internal enum for tessellation primitive types.
To avoid dragging gl.h into places it has no business being,
define the tessellation primitive mode as an enum.

This has a lot of fallout all over the place.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>
2022-01-19 21:54:58 +00:00
Guilherme Gallo a35c5540e4 ci: freedreno: Update a530 dEQP fail expectation list
The test
`KHR-GLES31.core.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs`
was failing even before the kernel uprev

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14548>
2022-01-19 20:14:43 +00:00
Marcin Ślusarz ed0edcc338 freedreno/rnn: normalize line endings in rules-ng.xsd
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11779>
2022-01-19 15:17:17 +00:00
Cristian Ciocaltea 279cc37ac0 freedreno/ci: Fix dEQP tests expectations on A530
Add a new entry to the 'fails' list.

Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14413>
2022-01-18 18:42:05 +00:00
Danylo Piliaiev e4c582ee71 tu: support VK_EXT_primitive_topology_list_restart
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14556>
2022-01-17 15:21:03 +00:00
Emma Anholt f6ffefba3e nir: Apply nir_opt_offsets to nir_intrinsic_load_uniform as well.
Doing this for ir3 required adding a struct for limits of how much base to
fold in (which NTT wants as well for its case of shared vars), otherwise
the later work to lower to the 1<<9 word limit would emit more
instructions.

The shader-db results are that sometimes the reduction in NIR instruction
count results in fewer sampler prefetches due to the shader being
estimated to be shorter (dota2, nexuiz):

total instructions in shared programs: 8996651 -> 8996776 (<.01%)
total cat5 in shared programs: 86561 -> 86577 (0.02%)

Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14023>
2022-01-16 19:11:29 +00:00
Emma Anholt b024102d7c freedreno/ir3: Use nir_opt_offset for removing constant adds for shared vars.
Saves some work in carchase and manhattan31:

instructions in affected programs: 2842 -> 2818 (-0.84%)
nops in affected programs: 1131 -> 1105 (-2.30%)
non-nops in affected programs: 1236 -> 1238 (0.16%)
mov in affected programs: 57 -> 61 (7.02%)
dwords in affected programs: 2144 -> 2150 (0.28%)
cat0 in affected programs: 1195 -> 1169 (-2.18%)
cat1 in affected programs: 151 -> 155 (2.65%)
cat2 in affected programs: 142 -> 140 (-1.41%)
sstall in affected programs: 190 -> 178 (-6.32%)
(ss) in affected programs: 63 -> 63 (0.00%)
systall in affected programs: 532 -> 511 (-3.95%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14023>
2022-01-16 19:11:29 +00:00
Rob Clark fcb3b87553 freedreno/decode: Handle chip-id
For cmdstream traces from newer devices, we need to identify the gpu
based on chip-id.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14564>
2022-01-14 23:17:03 +00:00
Danylo Piliaiev 3e7f6c9aeb tu: implement wsi hook to decide if we can present directly on device
This prevents the driver from taking the prime blit path for presentation
in scenarios where it can be avoided.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11091>
2022-01-14 12:19:57 +00:00
Charles Giessen dbd3935b04 freedreno, tu: Export vk_icdGetPhysicalDeviceProcAddr
Support Loader ICD Interface Version 4 by exporting the function
vk_icdGetPhysicalDeviceProcAddr.

Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14299>
2022-01-14 10:26:13 +01:00
Connor Abbott 8f18c72f9a freedreno/fdl: Fix reinterpreting "size-compatible" formats
It's allowed to reinterpret compressed formats as one of a few
non-compressed formats with the same pixel size as the blocksize of the
compressed format, and vice-versa. If we did this we'd wind up with an
incorrect width/height. Fix that.

Fixes dEQP-VK.image.sample_texture.*.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14520>
2022-01-13 13:44:14 +00:00
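
The width/height problem mentioned in 8f18c72f9a above comes down to block arithmetic. The sketch below shows the general conversion under the assumption that the number of blocks stays the same when the view format changes; it is not the code from the commit, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Convert an extent measured in texels of a format with src_block-wide
 * blocks into texels of a format with dst_block-wide blocks, keeping the
 * number of blocks unchanged. Uncompressed formats have a block size of 1. */
uint32_t
example_reinterpret_extent(uint32_t extent, uint32_t src_block,
                           uint32_t dst_block)
{
   assert(src_block > 0 && dst_block > 0);
   uint32_t blocks = (extent + src_block - 1) / src_block; /* round up */
   return blocks * dst_block;
}
```

A 10-texel-wide image with 4x4 blocks viewed through an uncompressed format becomes 3 texels wide, and a 3-texel-wide uncompressed image viewed through a 4x4 block format becomes 12 texels wide.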
Rob Clark 4dc406c748 freedreno: Update chip-ids
Counterpoint to https://patchwork.freedesktop.org/series/98772/

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14506>
2022-01-13 05:26:11 +00:00
Rob Clark 785a324deb freedreno: Handle wildcard fuse-id in device matching
A future kernel update will add fuse-id in the upper bits of the
chip_id.  To avoid breaking device matching, add a way to include
a wildcard/fallback fuse-id.  (Note that this only affects
unreleased devices.)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14506>
2022-01-13 05:26:11 +00:00
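
A possible shape for the wildcard matching described in 785a324deb above. The bit position of the fuse-id field and the all-ones wildcard value are assumptions made for this sketch, as are all names; the commit message only states that the fuse-id lives in the upper bits of chip_id.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: a 16-bit fuse-id above a 32-bit chip identifier. */
#define EXAMPLE_FUSE_SHIFT    32
#define EXAMPLE_FUSE_MASK     (UINT64_C(0xffff) << EXAMPLE_FUSE_SHIFT)
#define EXAMPLE_FUSE_WILDCARD EXAMPLE_FUSE_MASK /* all fuse bits set */

/* ref_id comes from the driver's device table, dev_id from the kernel. */
bool
example_chip_id_matches(uint64_t ref_id, uint64_t dev_id)
{
   /* An exact match always wins. */
   if (ref_id == dev_id)
      return true;

   /* A table entry with the wildcard fuse-id matches any fuse-id, as long
    * as the remaining bits agree. */
   if ((ref_id & EXAMPLE_FUSE_MASK) == EXAMPLE_FUSE_WILDCARD)
      return (ref_id & ~EXAMPLE_FUSE_MASK) == (dev_id & ~EXAMPLE_FUSE_MASK);

   return false;
}
```

Keeping the exact comparison first means existing table entries behave as before, and only entries that opt in to the wildcard ignore the new upper bits.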
Rob Clark 6b8e3aeeb7 freedreno: Rearrange dev_id_compare() logic
We're going to need to add a couple more cases.  Let's split up the
existing two cases first, rather than piling on more logic to a single
expression.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14506>
2022-01-13 05:26:11 +00:00
Rob Clark 9176e27dd2 freedreno: Small dev_id_compare() cleanup
We don't really treat the two arguments identically, so rename them to
make it clear which one is the device id coming from kernel, and which
one is the reference id from the fd_dev_recs table.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14506>
2022-01-13 05:26:11 +00:00
Hyunjun Ko 0a82a26a18 turnip: Porting to common implementation for timeline semaphore
Define struct tu_timeline_sync for emulated timeline support in common
implementation that is on top of drm syncobj as a binary sync.

Also implement init/finish/reset/wait_many methods for the struct.

v1. Does not set MSM_SUBMIT_SYNCOBJ_RESET for waiting syncobjs since
it's being managed in the common implementation already.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14105>
2022-01-13 04:01:44 +00:00
Hyunjun Ko 479a1c405e turnip: Porting to common vulkan implementation for synchronization.
This patch ports to common code for VkSemaphore, VkFence and relevant
APIs like vkCreate(Destroy)Semaphore/Fence, vkGetSemaphoreFdKHR, etc.

Accordingly, start using the common vkQueueSubmit by implementing a
driver-specific hook.

Also remove all timeline semaphore code so that we can use the common
code in the following patches. This way it is easy to see what is
modified in the following patch.

Note that kgsl is not ported in this patch.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14105>
2022-01-13 04:01:44 +00:00
Hyunjun Ko f976f71fb0 turnip: Use the new common device lost tracking
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14105>
2022-01-13 04:01:44 +00:00
Emma Anholt c638d6f3bf ci: Add paraview traces to several drivers.
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14501>
2022-01-13 00:22:54 +00:00
Thomas H.P. Andersen d71c6eebe2 freedreno: silence sometimes-uninitialized warning
Clang does not see that this is unreachable and thus
thinks that opc will be used uninitialized later.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14500>
2022-01-12 21:20:23 +00:00
Jason Ekstrand c8d364cb9d turnip: Use vk_common_QueueSignalReleaseImageANDROID for DRM
It's identical to the one turnip copy+pasted from RADV.  For KGSL, we
still need to hand-roll because of all the emulated stuff.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14411>
2022-01-11 17:25:22 +00:00
Jason Ekstrand 5b8b6315e4 turnip: Use vk_common_AcquireImageANDROID
It's got some bug fixes that turnip never picked up.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14411>
2022-01-11 17:25:22 +00:00
Christian Gmeiner 6e08d8fc3d ci: Uprev piglit to af1785f31
Brings in these changes:

af1785f31 occlusion_query_conform: skip GetQueryCounterBits test if needed
dad078717 occlusion_query_conform: convert to piglit subtests
b52c1c761 glsl-1.30: test nested preprocessor concat
6c4da153b texture-storage: Fix subtest result handling of skips.
4343f19db fbo-integer: Remove the invalid DrawPixels test.
e3842f2fe arb_dsa: exclude stencil8 textures from test sets.
ce8649be7 spec/ext_external_objects: Fix build on Debian systems
4e553838f glsl: add basic tests for desktop GLSL invariant qualifier linking
7e61e5199 Tests for variable in and out of loop scope
f855ad1c8 fbo-mrt-alphatest: Only require GLSL 1.20
9be2fe999 glx: add glx-multi-display-single-pbuffer test
bfe290725 glx: add glx-swap-pbuffer test
efa64335e framework: Fix build on Windows when using waffle

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14468>
2022-01-10 21:52:42 +00:00
Konstantin Seurer 651bec0971 turnip: Fixed maxFragmentCombinedOutputResources
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>
2022-01-10 19:28:17 +00:00
Danylo Piliaiev d77bfc117c tu,ir3: Implement VK_KHR_shader_integer_dot_product
- gen4 - has dp4acc and dp2acc, dp4acc is used to implement
  4x8 dot product.
- gen3 - has dp2acc, in OpenCL blob uses dp2acc for dot product
  on both gen3 and gen4.
- gen2 - unknown, lower everything.
- gen1 - no dp2acc, lower everything. OpenCL blob doesn't advertise
  cl_qcom_dot_product8 but still generates code for it.
  The assembly is more verbose and uses yet to be documented
  mad32.u16 instruction.

Passes:
 dEQP-VK.spirv_assembly.instruction.compute.opsdotkhr.*
 dEQP-VK.spirv_assembly.instruction.compute.opudotkhr.*
 dEQP-VK.spirv_assembly.instruction.compute.opsudotkhr.*
 dEQP-VK.spirv_assembly.instruction.compute.opsdotaccsatkhr.*
 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.*
 dEQP-VK.spirv_assembly.instruction.compute.opsudotaccsatkhr.*

Only packed 4x8 unsigned and mixed versions are accelerated.
However in theory we should be able to do better for signed version
than current NIR lowering.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>
2022-01-10 13:21:24 +02:00
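
For reference, the packed 4x8 dot products that d77bfc117c above accelerates can be written out in scalar C as below. This is only a software model of the arithmetic (unsigned and mixed-sign, with a wrapping 32-bit accumulator), not the NIR lowering or the hardware instruction; the function names are hypothetical.

```c
#include <stdint.h>

/* Unsigned 4x8 dot product with accumulate: a and b each hold four u8
 * lanes; the lane products are summed and added to acc modulo 2^32. */
uint32_t
example_udot_4x8_acc(uint32_t a, uint32_t b, uint32_t acc)
{
   for (unsigned i = 0; i < 4; i++) {
      uint32_t ai = (a >> (8 * i)) & 0xff;
      uint32_t bi = (b >> (8 * i)) & 0xff;
      acc += ai * bi;
   }
   return acc;
}

/* Mixed-sign variant: lanes of a are signed, lanes of b are unsigned.
 * The result is returned as the two's complement bit pattern. */
uint32_t
example_sudot_4x8_acc(uint32_t a, uint32_t b, uint32_t acc)
{
   for (unsigned i = 0; i < 4; i++) {
      int32_t ai = (int32_t)((a >> (8 * i)) & 0xff);
      int32_t bi = (int32_t)((b >> (8 * i)) & 0xff);
      if (ai >= 128)
         ai -= 256; /* sign-extend the 8-bit lane */
      acc += (uint32_t)(ai * bi);
   }
   return acc;
}
```

The accumulator is kept unsigned so that overflow wraps instead of invoking undefined behaviour in C.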
Danylo Piliaiev e1f89a1da2 ir3: Make nir compiler options a part of ir3_compiler
This would allow for sub-gens to have different options.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>
2022-01-10 13:20:39 +02:00
Danylo Piliaiev c1d5c318bc ir3: New cat3 instructions
* shrm - (src2 >> src1) & src3
* shlm - (src2 << src1) & src3
* shrg - (src2 >> src1) | src3
* shlg - (src2 << src1) | src3
* andg - (src2 & src1) | src3
* dp2acc - dot product of two {i,u}8vec2 packed into
  SRC1 and SRC2, added to 32b SRC3
* dp4acc - dot product of two {i,u}8vec4 packed into
  SRC1 and SRC2, added to 32b SRC3
* wmm - vec4(x_1, x_2, x_3, x_4) * (y_1 + y_2 + y_3 + y_4), which is
  duplicated (1 << (SRC3 / 32)) times starting from DST register
* wmm.accu - same as wmm but result is added to DST registers, however
  the first reg in each vec4 result is overwritten instead of
  accumulating.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>
2022-01-10 13:20:39 +02:00
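
The shift-and-mask formulas listed in c1d5c318bc above translate directly into C. The sketch below is a software model only; masking the shift amount to 5 bits is an assumption made to keep the C well-defined and may not match what the hardware does for shifts of 32 or more. The function names are hypothetical.

```c
#include <stdint.h>

/* shrm: (src2 >> src1) & src3 */
uint32_t
example_shrm(uint32_t src1, uint32_t src2, uint32_t src3)
{
   return (src2 >> (src1 & 31)) & src3;
}

/* shlm: (src2 << src1) & src3 */
uint32_t
example_shlm(uint32_t src1, uint32_t src2, uint32_t src3)
{
   return (src2 << (src1 & 31)) & src3;
}

/* shrg: (src2 >> src1) | src3 */
uint32_t
example_shrg(uint32_t src1, uint32_t src2, uint32_t src3)
{
   return (src2 >> (src1 & 31)) | src3;
}

/* shlg: (src2 << src1) | src3 */
uint32_t
example_shlg(uint32_t src1, uint32_t src2, uint32_t src3)
{
   return (src2 << (src1 & 31)) | src3;
}

/* andg: (src2 & src1) | src3 */
uint32_t
example_andg(uint32_t src1, uint32_t src2, uint32_t src3)
{
   return (src2 & src1) | src3;
}
```

Each of these fuses what would otherwise be two ALU operations, for example a bitfield extract (shift right, then mask) in the case of shrm.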
Connor Abbott c45c6e36eb tu: Implement VK_EXT_subgroup_size_control
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>
2022-01-10 10:58:28 +00:00
Connor Abbott 1a1e25dcce tu, ir3: Support runtime gl_SubgroupSize in FS
We already supported it in the CS for computing the subgroup ID, but
soon we'll need it in the FS too. Vertex stages will always have it
lowered.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>
2022-01-10 10:58:28 +00:00
Connor Abbott e6e34883a9 ir3: Add wavesize control
This allows the wavesize to be controlled per-shader. This will be used
by VK_EXT_subgroup_size_control, and freedreno will also need it if
legacy ARB_shader_ballot is to be supported (since it forces a wavesize
of 64 or less).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>
2022-01-10 10:58:28 +00:00
Connor Abbott 30237b3d9c ir3: Pass shader to ir3_nir_post_finalize()
We'll need to add shader-specific lowering for gl_SubgroupSize.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>
2022-01-10 10:58:28 +00:00
Connor Abbott 9ebc48005c ir3, freedreno: Add options struct for ir3_shader_from_nir()
We'll expand this in a moment.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>
2022-01-10 10:58:28 +00:00
Danylo Piliaiev fe9c9ec83f tu: fix workaround for depth bounds test without depth test
Fixes: bb4db22ff4 ("turnip: apply workaround for depth bounds test without depth test")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14390>
2022-01-10 09:36:59 +00:00
Danylo Piliaiev 3792fbfcf6 ir3: Assert that we cannot have enough concurrent waves for CS with barrier
If a compute shader has a big workgroup, a barrier, and a branchstack
that limits max_waves, we may end up unable to run all waves of the
workgroup concurrently, which would lead to a hang.

The blob just explodes in such a case.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14110>
2022-01-07 18:40:15 +00:00
Danylo Piliaiev 9ed4d49c97 ir3: Be able to reduce register limit for RA when CS has barriers
If barriers are used, it must be possible for all waves in the workgroup
to execute concurrently. Thus we may have to reduce the registers limit.

Fixes a hang in "Digital Combat Simulator".
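
The constraint can be illustrated with a deliberately simplified model. The names and the "register file divided evenly among resident waves" assumption below are hypothetical and do not reflect ir3's actual accounting; the sketch only shows why a larger workgroup lowers the per-wave register budget once a barrier requires every wave to be resident.

```c
/* Number of waves needed to hold every thread of the workgroup. */
static inline unsigned
waves_per_workgroup(unsigned workgroup_threads, unsigned wave_size)
{
   return (workgroup_threads + wave_size - 1) / wave_size;
}

/* Hypothetical model: largest per-wave register budget that still lets
 * all waves of the workgroup be resident at the same time. */
static inline unsigned
barrier_reg_limit(unsigned reg_file_size, unsigned workgroup_threads,
                  unsigned wave_size)
{
   return reg_file_size / waves_per_workgroup(workgroup_threads, wave_size);
}
```

In this model, a 1024-thread workgroup with 128-wide waves needs 8 resident waves, so each wave gets an eighth of the register file.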

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14110>
2022-01-07 18:40:15 +00:00
Connor Abbott cb45120556 ir3: Use (ss) for instructions writing shared regs
The blob uses *both* nops and (ss). It turns out that in some rare cases
the hardware does take more than 6 cycles, at least for movmsk, but
adding nops is unnecessary. I believe the extra nops are only there due
to the immaturity of the blob's implementation of subgroup ops, so we
don't have to copy them - just handle shared reg producers the same as
SFU instructions.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
2022-01-07 14:26:08 +00:00
Connor Abbott d45678cac4 ir3/postsched: Rename tex/sfu to sy/ss
Analogous to the previous commit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
2022-01-07 14:26:08 +00:00
Connor Abbott e6b35d606d ir3/sched: Rename tex/sfu to sy/ss
This now covers e.g. cat6 instructions as well, and ss will cover
instructions writing shared regs as well. This is split out from the
previous change to avoid too much churn and shouldn't cause any
functional changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
2022-01-07 14:26:08 +00:00
Connor Abbott 0cc4aca345 ir3: Use new (sy)/(ss) stall helpers in the compiler
This fixes a few bad assumptions in the pre-RA and post-RA scheduler,
for example that (sy) is only for texture instructions, that (ss) is only
for SFU instructions, and that (sy) and (ss) producers will always take the
same number of cycles. This means we now start doing latency hiding for
cat6 instructions like ldib and ldc. It also should make us hide latency
more aggressively, since the number used for (sy) stall cycles was way
lower than the real numbers for everything except ldc. Finally it
unifies the various places (ss) soft nops were calculated.

selected shader-db results:

total nops in shared programs: 345278 -> 358959 (3.96%)
nops in affected programs: 215622 -> 229303 (6.34%)
helped: 690
HURT: 2430
helped stats (abs) min: 1 max: 125 x̄: 11.40 x̃: 5
helped stats (rel) min: 0.53% max: 100.00% x̄: 24.19% x̃: 18.52%
HURT stats (abs)   min: 1 max: 501 x̄: 8.87 x̃: 5
HURT stats (rel)   min: 0.00% max: 9900.00% x̄: 52.36% x̃: 14.29%
95% mean confidence interval for nops value: 3.78 4.99
95% mean confidence interval for nops %-change: 28.21% 42.66%
Nops are HURT.

total mov in shared programs: 75049 -> 74110 (-1.25%)
mov in affected programs: 15754 -> 14815 (-5.96%)
helped: 566
HURT: 455
helped stats (abs) min: 1 max: 36 x̄: 4.52 x̃: 3
helped stats (rel) min: 0.83% max: 100.00% x̄: 35.85% x̃: 30.00%
HURT stats (abs)   min: 1 max: 35 x̄: 3.55 x̃: 3
HURT stats (rel)   min: 0.00% max: 1100.00% x̄: 63.60% x̃: 25.00%
95% mean confidence interval for mov value: -1.25 -0.58
95% mean confidence interval for mov %-change: 2.92% 14.02%
Inconclusive result (value mean confidence interval and %-change mean
confidence interval disagree).

total last-baryf in shared programs: 80468 -> 67670 (-15.90%)
last-baryf in affected programs: 63676 -> 50878 (-20.10%)
helped: 309
HURT: 147
helped stats (abs) min: 1 max: 260 x̄: 49.20 x̃: 24
helped stats (rel) min: 0.60% max: 98.81% x̄: 37.92% x̃: 40.91%
HURT stats (abs)   min: 1 max: 115 x̄: 16.35 x̃: 12
HURT stats (rel)   min: 0.96% max: 1933.33% x̄: 45.55% x̃: 7.89%
95% mean confidence interval for last-baryf value: -33.03 -23.10
95% mean confidence interval for last-baryf %-change: -21.52% -0.50%
Last-baryf are helped.

total sstall in shared programs: 133997 -> 126398 (-5.67%)
sstall in affected programs: 86866 -> 79267 (-8.75%)
helped: 1893
HURT: 598
helped stats (abs) min: 1 max: 77 x̄: 6.06 x̃: 4
helped stats (rel) min: 0.71% max: 100.00% x̄: 32.82% x̃: 16.67%
HURT stats (abs)   min: 1 max: 65 x̄: 6.47 x̃: 6
HURT stats (rel)   min: 0.00% max: 900.00% x̄: 65.51% x̃: 25.00%
95% mean confidence interval for sstall value: -3.39 -2.71
95% mean confidence interval for sstall %-change: -12.19% -6.24%
Sstall are helped.

total systall in shared programs: 350304 -> 288234 (-17.72%)
systall in affected programs: 234855 -> 172785 (-26.43%)
helped: 1456
HURT: 260
helped stats (abs) min: 1 max: 574 x̄: 46.42 x̃: 27
helped stats (rel) min: 0.19% max: 100.00% x̄: 39.43% x̃: 36.06%
HURT stats (abs)   min: 1 max: 757 x̄: 21.20 x̃: 8
HURT stats (rel)   min: 0.00% max: 180.95% x̄: 24.82% x̃: 12.50%
95% mean confidence interval for systall value: -39.31 -33.03
95% mean confidence interval for systall %-change: -31.49% -27.90%
Systall are helped.

total waves in shared programs: 236732 -> 235142 (-0.67%)
waves in affected programs: 6142 -> 4552 (-25.89%)
helped: 535
HURT: 17
helped stats (abs) min: 2 max: 8 x̄: 3.08 x̃: 2
helped stats (rel) min: 12.50% max: 75.00% x̄: 28.78% x̃: 25.00%
HURT stats (abs)   min: 2 max: 6 x̄: 3.53 x̃: 4
HURT stats (rel)   min: 16.67% max: 75.00% x̄: 37.35% x̃: 33.33%
95% mean confidence interval for waves value: -3.04 -2.72
95% mean confidence interval for waves %-change: -28.10% -25.39%
Waves are helped.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
2022-01-07 14:26:08 +00:00
Connor Abbott 7e60978d30 ir3: Introduce systall metric and new helper functions
Add new centralized functions which will replace the various places we
hardcode 10 for the number of (ss) nops, add numbers for soft (sy) nops
based on similar computerator experiments with ldc, sam, and ldib (the
most common (sy) producers), and add a "systall" metric which is
analogous to sstall. This also fixes some cases where we'd erroneously
count ldl* as (sy) producers instead of (ss) producers when calculating
sstall.

This only switches over the metric reporting to the new functions, so
there is no behavior change. The following commit will switch over
the rest of the compiler.

While we're at it, remove max_sun as it's never set.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
2022-01-07 14:26:08 +00:00
Connor Abbott 603791bdeb ir3: Bump type mismatch penalty to 3
After some experimentation with computerator, it seems on a618 that
writing a full register and then reading half of it as a half register
requires a delay of 6, the same as the delay for cat5/cat6 sources. The
other direction only has a delay of 5, but just bump it unconditionally
out of an abundance of caution.

Fixes: 890de1a436 ("ir3/delay: Fix full->half and half->full delay")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
2022-01-07 14:26:08 +00:00
Connor Abbott d371d807eb ir3/ra: Fix logic bug in compress_regs_left
If we're allocating a source then we force is_killed to false, not to
true. Fixes a regression in
dEQP-GLES31.functional.synchronization.in_invocation.image_atomic_write_read
later.

Fixes: 0ffcb19b9d ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
2022-01-07 14:26:08 +00:00
Guilherme Gallo a6d05e6863 ci: Add a630_skqp jobs
Start Xorg during skqp job, since it is needed to make rendered tests
work.

There is 1 new job, namely `a630_skqp` which runs GL and GLES backends
and then the skqp GPU unittests.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5580

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14146>
2022-01-05 20:15:04 +00:00
Guilherme Gallo 8992cf5ab8 ci: Build skqp on ARM64 images
This commit makes `kernel+rootfs_arm64` job build and install skqp on
ARM64 devices rootfs.

The Skia repository has a tool to prepare skqp models located at
`tools/skqp/cut-release`, which gets files from [Skia
Gold](https://skia.org/docs/dev/testing/skiagold/) and generates
files.checksum, rendertests.txt and unittests.txt. One gives a range of
commits to let `cut-release` find the right resources to prepare skqp
for the user. However, it currently fails when trying to get image
packages for a range of commits via HTTPS from the host
https://public-gold.skia.org, which responds with error 404 every time.
I tried a range of a thousand commits, yet it still gave no
results. The workaround employed was to recover the most recent
`files.checksum` and `rendertests.txt` files from the git history and
generate `unittests.txt` from `list_gpu_unit_tests` binary.

`skqp` runs two lists of tests, `rendertests.txt` and `unittests.txt`.
Both must be located inside the `skqp` assets folder.  The first list
uses GL and GLES to test rendering scenarios. The second runs some unit
tests that do not render an image per se.

In order to make the first `a630_skqp` run green, the crashing tests
were removed from the test lists and the expectations of the failing
ones were updated.

It is worth noting that `rendertests.txt` can bring some detail about
each test expectation, so each test can have a max pixel error count, to
tell `skqp` that it is OK to have at most that number of errors for that
test. See also:
https://github.com/google/skia/blob/main/tools/skqp/README_ALGORITHM.md

As each render backend has a different error count, two different
`rendertests.txt` files were created,
`src/freedreno/ci/freedreno-a630-skqp-gl_rendertests.txt` and
`src/freedreno/ci/freedreno-a630-skqp-gles_rendertests.txt`,
which refer to the GL and GLES tests respectively.
The unit tests file for a630 is located at
`src/freedreno/ci/freedreno-a630-skqp_unittests.txt`

```
aaclip
domain
formats
highcontrastfilter
rectangle_texture
yuv_make_color_space
```

```
ProcessorOptimizationValidationTest
VkProtectedContext_CreateNonprotectedContext
VkYCbcrSampler_DrawImageWithYcbcrSampler
VkYCbcrSampler_NoYcbcrSurface
```

Each test was updated with the max_error count equal to the first run result.

```
analytic_antialias_inverse
async_rescale_and_read_dog_down
async_rescale_and_read_dog_up
async_rescale_and_read_rose
async_rescale_and_read_text_down
async_rescale_and_read_text_up
async_rescale_and_read_text_up_large
async_rescale_and_read_yuv420_rose
complexclip2_path_bw
encode-platform
imageblur_large
lcdtextsize
onebadarc
onefailarc
scale-pixels
surfaceprops
textfilter_color
textfilter_image
```

All of the following test results are considered wrong.

```
async_rescale_and_read_no_bleed
backdrop_imagefilter_croprect_persp
complexclip2
imageblurrepeatmode
mixerCF
overdrawcolorfilter
patch_alpha
patch_primitive
rrect_clip_bw
scaledemoji_rendering
yuv_splitter
```

v2:
  a) add link to HTML report on job log
  b) remove extraneous spaces diff
  c) remove unnecessary conditions from build-skqp.sh
  d) use fixed skqp source commit SHA

v3:
  a) Use only main skia repository to fetch models and build skqp
  b) Use list_gpu_unit_tests binary to create a base unittests.txt file
  c) Remove crashing tests
  d) Set failing tests expectations for the first skqp run

v4:
  a) Remove clang dependency
  b) Separate each skqp backend result into its folder
  c) Regroup a630_skqp in one job

v5:
  a) Separate tests files per driver

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5580
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14146>
2022-01-05 20:15:04 +00:00
Thomas H.P. Andersen ff7aee2ac9 tu/clear_blit: use || when working with bools
Fixes a warning with clang

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14315>
2021-12-28 03:13:38 +00:00
Vinson Lee 1d6f6f9102 ir3: Make shift operand 64-bit.
Fix defect reported by Coverity Scan.

Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)
overflow_before_widen: Potentially overflowing expression 2 << W
with type int (32 bits, signed) is evaluated using 32-bit
arithmetic, and then used in a context that expects an expression
of type uint64_t (64 bits, unsigned).
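
A minimal sketch of the defect class and its usual fix, using a hypothetical helper rather than the actual ir3 code: the literal `2` has type `int`, so the shift is evaluated in 32 bits before the result is widened. Making the operand 64-bit first moves the whole shift into 64-bit arithmetic.

```c
#include <stdint.h>

/* Form flagged by Coverity: the shift happens in 32-bit int arithmetic
 * and only the (possibly overflowed) result is converted to uint64_t:
 *
 *    uint64_t v = 2 << w;
 */

/* Fixed form (hypothetical helper): widen the operand before shifting.
 * w must be below 64. */
static inline uint64_t
shift_2_wide(unsigned w)
{
   return UINT64_C(2) << w;
}
```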

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14258>
2021-12-22 01:19:46 +00:00
Rob Clark 8a21b2fda0 freedreno/ir3: Dump const state with shader disasm
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14231>
2021-12-20 19:47:35 +00:00
Rob Clark 9766a5721d freedreno/computerator: Mark shader bo for dumping
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14231>
2021-12-20 19:47:35 +00:00
Rob Clark d1edc6d9a1 freedreno/computerator: Fix @buf header
Order is important in the grammar, the more specific match needs to go
first.

Fixes: ba1c989348 ("freedreno/computerator: pass iova of buffer to const register")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14231>
2021-12-20 19:47:35 +00:00
Rob Clark 78c53f4888 freedreno/ir3: Handle instr->address when cloning
Without this, a cloned instruction that takes full regs will trigger an
ir3_validate assert.  This can happen, for ex, if an instruction that
writes p0.x and has a relative src gets cloned in ir3_sched.

Fixes an assert in Genshin Impact with a debug build.

Fixes: 9af795d9b9 ("ir3: Make ir3_instruction::address a normal register")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14231>
2021-12-20 19:47:35 +00:00
Emma Anholt 9c722a06ed ci/freedreno: Add known flakes from the last month.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14239>
2021-12-16 22:37:53 +00:00
Pierre-Eric Pelloux-Prayer 1cb5c1775b glx: fix querying GLX_FBCONFIG_ID for Window
This commit fixes apps using the following sequence:
1. XCreateWindow(dpy) -> win
2. glXCreateContextAttribsARB(dpy, ...) -> ctx
3. glXMakeCurrent(dpy, win, ctx)
4. glXQueryDrawable(dpy, win, GLX_FBCONFIG_ID, ...)

glXQueryDrawable returned 0 (while correctly returning a valid
GLX_FBCONFIG_ID for other types of drawables).

This commit adds the same dance as driInferDrawableConfig to get
the GLX visual from the Window, and then the GLX_FBCONFIG_ID of
this visual.

This fixes:
* piglit: glx-query-drawable --attr=GLX_FBCONFIG_ID --type=WINDOW
* Maya which uses the config ID from step 4 as an input to
glXChooseFBConfig.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14174>
2021-12-16 01:21:36 +00:00
Danylo Piliaiev c82d7e3617 turnip: Fix operator precedence in address calculation macros for queries
Fixes crash in Oblivion, Skyrim, Crysis running through DXVK on 32b
systems.
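
The general hazard can be sketched with hypothetical macros; these are not the macros touched by this commit. An unparenthesized macro argument such as `i + 1` is spliced into the expansion and binds with the wrong operator, and doing the multiply in a narrower type before adding it to a 64-bit address is a related trap.

```c
#include <stdint.h>

/* Hypothetical macros for illustration only. */

/* Unparenthesized: arguments are pasted as-is into the expression. */
#define SLOT_IOVA_BAD(base, stride, idx) base + stride * idx

/* Parenthesized, with the multiply widened to 64 bits. */
#define SLOT_IOVA_OK(base, stride, idx) \
   ((base) + (uint64_t)(stride) * (idx))

static inline uint64_t
next_slot_bad(uint64_t base, uint32_t stride, uint32_t i)
{
   /* Expands to: base + stride * i + 1 */
   return SLOT_IOVA_BAD(base, stride, i + 1);
}

static inline uint64_t
next_slot_ok(uint64_t base, uint32_t stride, uint32_t i)
{
   /* Expands to: ((base) + (uint64_t)(stride) * (i + 1)) */
   return SLOT_IOVA_OK(base, stride, i + 1);
}
```

With `base = 0x1000`, `stride = 16` and `i = 2`, the first form gives 0x1021 while the second gives the intended 0x1030.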

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5723
Fixes: 937dd76426 "turnip: Implement VK_KHR_performance_query"

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14148>
2021-12-10 17:45:02 +00:00
Ilia Mirkin 0db2e78788 freedreno/ci/a306: increase concurrency
No harm from using more threads, but not enough benefit to reduce
parallelism unfortunately.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14067>
2021-12-08 00:50:25 +00:00
Ilia Mirkin 3db30ea877 freedreno/ci/a306: add more skips
These come up with increased concurrency.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14067>
2021-12-08 00:50:25 +00:00
Danylo Piliaiev c749da6135 ir3,turnip: Add support for GL_KHR_shader_subgroup_quad
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13817>
2021-12-07 20:45:53 +00:00
Danylo Piliaiev 3dfd4230bb ir3,turnip: Enable subgroup ops support in all stages on gen4
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13817>
2021-12-07 20:45:53 +00:00
Danylo Piliaiev ded51fd39e ir3: Use getfiberid for SubgroupInvocationID on gen4
Since it requires (ss), categorize it as is_sfu() and not is_mem().

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13817>
2021-12-07 20:45:53 +00:00
Danylo Piliaiev d1c49901df ir3: Add gen4 new subgroup instructions
* getlast.w8 #4 - Perform jump for the first (CLUSTER_SIZE-1)
   fibers in a subgroup
* brcst.active.w8 - necessary to implement arithmetic subgroup
   operations with prefix sum.
* quad_shuffle.brcst - subgroupQuadBroadcast
* quad_shuffle.horiz - subgroupQuadSwapHorizontal
* quad_shuffle.vert - subgroupQuadSwapVertical
* quad_shuffle.diag - subgroupQuadSwapDiagonal
* getfiberid - gl_SubgroupInvocationID
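
For reference, the GLSL quad operations named above (from GL_KHR_shader_subgroup_quad) are defined in terms of an invocation's position within its quad of four consecutive invocations. A small C sketch of which lane each invocation reads from, with hypothetical helper names:

```c
/* lane: invocation index within the subgroup; a quad is 4 consecutive lanes. */

/* subgroupQuadBroadcast: every lane reads lane `id` (0..3) of its quad. */
static inline unsigned
quad_broadcast_src(unsigned lane, unsigned id)
{
   return (lane & ~3u) | (id & 3u);
}

/* subgroupQuadSwapHorizontal: 0 <-> 1, 2 <-> 3 */
static inline unsigned
quad_swap_horizontal_src(unsigned lane)
{
   return lane ^ 1u;
}

/* subgroupQuadSwapVertical: 0 <-> 2, 1 <-> 3 */
static inline unsigned
quad_swap_vertical_src(unsigned lane)
{
   return lane ^ 2u;
}

/* subgroupQuadSwapDiagonal: 0 <-> 3, 1 <-> 2 */
static inline unsigned
quad_swap_diagonal_src(unsigned lane)
{
   return lane ^ 3u;
}
```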

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13817>
2021-12-07 20:45:53 +00:00
Danylo Piliaiev e63ffc2f04 freedreno,tu: Limit the amount of instructions preloaded into icache
Inferring from blob's cmdstream the size of shader instruction
cache for:
- a630 is 64
- a650 is 128
- a660 is 128

On a650 and a660 the GPU could hang if we exceed the limit, though
this is not reproducible with computerator or a single amber
test. Also, while the blob limits the size to 128, Turnip still
hangs with it but does not hang with a limit of 127.

On a630 there seems to be no hang when the limit is exceeded.

Fixes the hang of compute shader in Alien Isolation on a650/a660.
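
The fix amounts to a clamp. A minimal sketch, assuming the shader length and the cache limit are expressed in the same units (which this message does not spell out); the helper name is hypothetical, and 127 is simply the a650/a660 value reported above as not hanging:

```c
/* Hypothetical helper: never ask the hardware to preload more of the
 * shader than the per-GPU instruction cache limit allows. */
static inline unsigned
icache_preload_size(unsigned shader_size, unsigned icache_limit)
{
   return shader_size < icache_limit ? shader_size : icache_limit;
}
```

A shader of size 300 would thus be preloaded as `icache_preload_size(300, 127)`, i.e. 127, while a shader of size 40 is preloaded in full.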

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14044>
2021-12-07 13:48:35 +00:00
Ilia Mirkin a7180bd4a6 freedreno/a5xx: enable OES_gpu_shader5
This extension is controlled by the ESSL feature level. Bump it up since
all parts of OES_gpu_shader5 should be supported.

This also avoids lowering all of the "advanced" functions (which should
probably not be lowered in the first place since they're part of ES
3.1...)

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14035>
2021-12-03 20:04:17 +00:00
Guilherme Gallo dabc068e6c ci: Use ci-fairy minio login via token file
For every CI job, put JWT content into a file and unset CI_JOB_JWT
environment var
=======

* virgl jobs:
	- Share JWT token file to crosvm instance
	- Keep using `export -p` due to high complexity in the scripts
	  of these jobs. At least, the CI_JOB_JWT will not be leaked,
	  since it is being unset at the `before_script` phase of each
	  Mesa CI job.

* iris jobs: Update lava_job_submitter to take token file as argument
	- generate-env with CI_JOB_JWT_TOKEN_FILE
	- create token file during baremetal init stage

* baremetal jobs: Copy token file to bare-metal NFS

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14004>
2021-12-02 18:01:29 +00:00
Guilherme Gallo cdf8a14bff ci: Uprev piglit
Bring up the piglit replay jwt-file argument feature.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14004>
2021-12-02 18:01:29 +00:00
Ilia Mirkin fc2cc39a0f freedreno/ci/a306: split off snorm blending failures
The hardware doesn't support this.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13990>
2021-12-02 03:39:28 +00:00
Ilia Mirkin bbe5b745dc freedreno/ci/a306: split off the f32 blend / texturing failures
The hardware doesn't support this.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13990>
2021-12-02 03:39:28 +00:00
Ilia Mirkin 1f79c36dae freedreno/ci/a306: separate msaa fails
The driver does not implement MSAA. Once it does, these can be split
up further.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13990>
2021-12-02 03:39:28 +00:00
Ilia Mirkin 58aad3f403 freedreno/a3xx: add some legacy formats
These can be used in "legacy" buffer textures.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13989>
2021-12-02 02:29:50 +00:00
Ilia Mirkin 41aa583edf freedreno/ci/a306: add additional skip which hangchecks
I was having trouble getting a run to complete without this. Was working
earlier, not sure what changed.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13989>
2021-12-02 02:29:50 +00:00
Emma Anholt d7226e9a9e freedreno/a6xx: Allocate a fixed-size tess factor BO.
Saves per-batch allocations, avoids reallocation for various vertex
counts, and avoids needing the indirect tess addrs constobj so that we
could emit the relocs to the tess BO after we'd emitted all the draws.

Also apparently it fixes one of our CTS fails.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13851>
2021-12-02 01:47:38 +00:00
Rob Clark 145b0711fc freedreno/crashdec: Basic GMU log decoding
Looks like each entry is four dwords, with the second dword being a
timestamp.
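
A sketch of the layout described above, assuming the log is available as a raw array of dwords. The macro and function names are hypothetical and are not crashdec's; the meaning of the other three dwords is left open, as in the message.

```c
#include <stdint.h>

#define GMU_LOG_ENTRY_DWORDS 4 /* per the observation above */

/* Hypothetical accessor: timestamp (second dword) of the n-th entry. */
static inline uint32_t
gmu_log_entry_timestamp(const uint32_t *dwords, unsigned n)
{
   return dwords[n * GMU_LOG_ENTRY_DWORDS + 1];
}
```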

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13937>
2021-12-01 17:53:21 +00:00
Rob Clark 8c654d02a3 freedreno/crashdec: Fallback to chip_id for GPU id
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13937>
2021-12-01 17:53:21 +00:00
Rob Clark f33d5256dd freedreno/crashdec: HFI queue decoding
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13937>
2021-12-01 17:53:21 +00:00
Rob Clark 2133d34b11 freedreno/crashdec: Split out mempool decoding
Before we start adding GMU HFI decoding, let's split the other big
section-specific decoding (mempool) out into its own file.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13937>
2021-12-01 17:53:21 +00:00
Emma Anholt b234c538e8 turnip: Move CP_SET_SUBDRAW_SIZE to vkCmdBindPipeline() time.
Now that the subdraw size is constant for a pipeline, this lets tess draws
avoid the slow path in vkCmdDraw*().

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6089>
2021-12-01 16:57:30 +00:00
Jonathan Marek fd11d99254 turnip: use SUBDRAW_SIZE and constant sized tess bos
This fixes the problem of large indirect draws, and at the same time avoids
allocating too large buffers for tessellation.

Reworked by @anholt to use a separate tess factor BO so we can skip the
WFIs to set the TESSFACTOR_ADDR.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6089>
2021-12-01 16:57:30 +00:00
Emma Anholt 3748b8afce freedreno/ir3: Make a shared helper for the tess factor stride.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6089>
2021-12-01 16:57:30 +00:00
Lionel Landwerlin 698343edc5 util/u_trace/perfetto: add new env variable to enable perfetto
When using the Vulkan API, command buffers can be recorded way before
perfetto is enabled. This can be problematic if you want already
recorded command buffers to produce traces.

This new environment variable makes perfetto enabled internally so
that command buffers are recorded with timestamps, even though no
perfetto recording happens.

v2: rename to GPU_TRACE_INSTRUMENT (Rob)

v3: Move instrumentation check to generated headers (Danylo)
    Decouple instrumentation enabling from tracing (Danylo)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13911>
2021-12-01 15:14:05 +00:00
Lionel Landwerlin 65697d6141 util/u_trace: add end_of_pipe property to tracepoints
In order to capture the timestamp when things actually end on Intel
GPU HW, we need to know whether the timestamp should be captured at the
top or end of the pipeline.

v2: use one line python if/else (Danylo)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13911>
2021-12-01 15:14:05 +00:00
Ilia Mirkin c868bff36a freedreno/ci: add piglit runs for a306
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13920>
2021-11-30 20:06:07 -05:00
Qiang Yu fcc062235c ci: remove egl-copy-buffers from fail list
The egl-copy-buffers test has been fixed for dri3, so remove
it from the broadcom and freedreno CI fail lists to prevent the
GitLab CI test failure:

  spec@egl 1.4@egl-copy-buffers,UnexpectedPass

Also remove it from the radeonsi CI fail list, since I verified the fix
on radeonsi.

Acked-by: Daniel Stone <daniels@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13868>
2021-11-30 01:58:42 +00:00
Ilia Mirkin e31d08d307 ci: move windowoverlap exclusion to all-skips
The test is just plain not built by our containers. Skip it everywhere.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13919>
2021-11-29 18:08:49 -05:00
Ilia Mirkin f533d7a446 freedreno/ir3: get the post-lowering clip/cull mask
The variant may include a lowered gl_Clip/CullDistance array. So we have
to use the variant's info (which is not available). However we save off
the clip/cull masks already, so just reuse those.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13891>
2021-11-28 02:55:58 -05:00
Ilia Mirkin 13fb587b8a freedreno/ir3: indicate that clipdist arrays are in use
We expose the compact array cap, which means that we get compact
clipdist arrays. Indicate this to the lowering pass so that it works for
gl_ClipDistance from fs, among others.

Fixes, among others, on a420,

tests/spec/glsl-1.30/execution/clipping/fs-clip-distance-interpolated.shader_test

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13891>
2021-11-28 02:55:58 -05:00
Danylo Piliaiev a78c36ecc6 ir3/cp: Prevent setting an address on subgroup macros
These macros expand to a mov in an if statement, which breaks the
assumption that the instruction producing an address and the
instruction consuming it are in the same block.

Fixes test:
 dEQP-VK.subgroups.ballot_broadcast.framebuffer.subgroupbroadcast_intvertex

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13931>
2021-11-25 12:18:48 +00:00
Connor Abbott 969369e962 ir3/lower_subgroups: Fix potential infinite loop
I was trying to be clever here, skipping ahead to the newly-created
block and processing the remaining instructions after the split in the
same loop. But if the last instruction in a block was lowered, the saved
next instruction would be the head of the block before the split, not
the new block, and we would compare it to the new block so we wouldn't
stop like we were supposed to. Stop being so clever, and just restart
processing with the new block after lowering an instruction.

Because we're wrapping the actual transform in yet another loop, and the
restarting logic is a bit tricky, refactor the actual lowering into a
separate lower_instr function. Otherwise we'd be mixing the two and
indenting the actual logic even more.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13928>
2021-11-25 10:16:48 +00:00
Danylo Piliaiev d5757c965a turnip: implement VK_KHR_buffer_device_address
We don't advertise the bufferDeviceAddressCaptureReplay capability, and
neither does the blob, because at the moment there is no way to allocate
a bo with a predefined iova.

There is no support for any arithmetic on addresses since shaderInt64
is not enabled. However, we could enable int64 support whenever we want.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8717>
2021-11-23 18:26:37 +00:00
Danylo Piliaiev 99388f0c27 freedreno/ir3: handle global atomics
Only for a6xx since we don't know the instructions for global
atomics on previous gens. Per Qualcomm's docs, in OpenCL atomics
are only supported since a5xx, together with the Generic memory space.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8717>
2021-11-23 18:26:37 +00:00
Danylo Piliaiev 5d5b1fc472 freedreno/ir3: add a6xx global atomics and separate atomic opcodes
Separating atomic opcodes makes it possible to express a6xx global
atomics which take iova in SRC1. They would be needed by
VK_KHR_buffer_device_address.
The change also makes it easier to distinguish atomics in conditions.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8717>
2021-11-23 18:26:37 +00:00
Ilia Mirkin be048ec112 freedreno/ir3: remove unused actual_in counting
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13918>
2021-11-23 17:20:32 +00:00
Ilia Mirkin bb6fb6065f freedreno/a[345]xx: fix unorm/snorm blend factors when they're "over"
The float value may be out of range, so must be clamped to the allowed
range. Unclear if a3xx also has a SNORM factor that we're just missing
there, but that will be a separate investigation.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13903>
2021-11-22 18:09:44 +00:00
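As a rough illustration of the clamping described in bb6fb6065f, the sketch
below limits a float blend factor to the range a UNORM or SNORM target can
represent. The function name and the [0, 1] and [-1, 1] ranges are assumptions
made for this example; it is not the driver's code.

```python
def clamp_blend_factor(value, snorm=False):
    """Clamp a float blend factor to the range of a normalized format.

    Assumed ranges: [0, 1] for UNORM, [-1, 1] for SNORM.
    """
    lowest = -1.0 if snorm else 0.0
    return min(max(value, lowest), 1.0)
```

A negative factor is kept for SNORM but clamped to zero for UNORM, which is
the distinction the commit title points at.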
Ilia Mirkin 43f94ee9f1 freedreno/a5xx: add missing L8A8_UNORM format to support TBOs
Fixes arb_texture_buffer_object-formats test.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13906>
2021-11-22 17:44:59 +00:00
Danylo Piliaiev ed16eedb2d ir3: print half-dst/src for ldib.b/stib.b
So it would print:
 ldib.b.untyped.1d.u16.1.imm.base0 hr0.z, r0.x, 0
instead of:
 ldib.b.untyped.1d.u16.1.imm.base0 r0.z, r0.x, 0

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13876>
2021-11-22 12:32:15 +00:00
Lionel Landwerlin 8657fa6b86 pps: allow drivers to report timestamps in their own time domain
For this, each driver must:

  - report its clock_id (if no particular clock just default to cpu
    boottime one)

  - be able to sample its clock (gpu_timestamp())

The PPSDataSource will then emit timestamp correlation events in the
trace ensuring perfetto is able to display GPU & CPU events
appropriately on its timeline.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13831>
2021-11-22 11:52:46 +00:00
Emma Anholt b8ffd7a888 freedreno/a5xx: Emit MSAA state for sysmem rendering, too.
This looked obviously wrong, we want to set the sample counts for sysmem
too just like we do on 6xx.  Turns out it fixes some piglits.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13867>
2021-11-19 17:24:11 +00:00
Emma Anholt 5071d39cb2 freedreno/a5xx: Document the sRGB bit on RB_2D_SRC/DST info.
Noticed while looking through my set of traces for where the average bit
might be.  Same spot as on a6xx.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13867>
2021-11-19 17:24:11 +00:00
Emma Anholt 1ef6465665 freedreno/a5xx: Define a5xx_2d_surf_info like a6xx has.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13867>
2021-11-19 17:24:11 +00:00
Emma Anholt cad0b6e2e5 freedreno/a6xx: Disable sample averaging on non-ubwc z24s8 MSAA blits.
The fallback path averages unorm textures, but if we don't have ubwc on
either then we can just cast them to uint which then just takes sample 0.

The proper UBWC format I think ends up averaging, though.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13867>
2021-11-19 17:24:11 +00:00
Emma Anholt 93eb697a8d freedreno/a6xx: Disable sample averaging on z/s or integer blits.
We can't generally force fd_blitter_blit() to not average in our fallback
blits, but this should at least help some cases.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13867>
2021-11-19 17:24:11 +00:00
Connor Abbott c98adc56f4 ir3/lower_pcopy: Fix bug with "illegal" copies and swaps
If the source and destination were within the same full register, like
hr90.x and hr90.y (which both map to r45.x), then we'd perform the
swap/copy with the wrong register. This broke
dEQP-VK.ssbo.phys.layout.random.16bit.scalar.35 once BDA is enabled.

Fixes: 0ffcb19b9d ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13818>
2021-11-19 16:59:54 +00:00
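The aliasing mentioned in c98adc56f4 (hr90.x and hr90.y both mapping to r45.x)
can be sketched as below. The mapping assumes that two consecutive half-register
components share one full-register component, which is consistent with the
example in the commit message; the helper name is hypothetical.

```python
COMPONENTS = "xyzw"

def half_to_full(reg, comp):
    """Return the (register, component) of the full reg that a half reg overlaps.

    Assumption: half component index h overlaps full component index h // 2.
    """
    half_index = reg * 4 + COMPONENTS.index(comp)
    full_index = half_index // 2
    return full_index // 4, COMPONENTS[full_index % 4]

def same_full_reg(a, b):
    """True if two half regs, given as (reg, comp), alias one full component."""
    return half_to_full(*a) == half_to_full(*b)
```

With this mapping, same_full_reg((90, "x"), (90, "y")) holds, which is the
case in which a swap or copy routed through the full register needs special
care.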
Connor Abbott 65da866ad9 ir3/lower_pcopy: Fix shr.b illegal copy lowering
The immediate shouldn't be half-reg because the other source isn't.
Fixes an assertion failure with
dEQP-VK.ssbo.phys.layout.random.16bit.scalar.35.

Fixes: 0ffcb19b9d ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13818>
2021-11-19 16:59:54 +00:00
Connor Abbott 9912c61362 ir3/spill: Support larger spill slot offset
This is required by
dEQP-VK.ssbo.phys.layout.random.all_shared_buffer.47, where we need to
spill a lot of pointers due to NIR CSE being a little too aggressive and
creating a large register pressure across basic blocks, too large to fit
within the boundaries of ldp/stp offsets.

Note that this will be a lot more difficult with support for "real
functions" because the base register will become unknown at compile
time. However this hack gets things working for the time being.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13818>
2021-11-19 16:59:54 +00:00
Connor Abbott 29d3889bbb ir3/ra: Add missing asserts to ra_push_interval()
This would've caught the previous issue earlier. We checked that the
physreg made sense when inserting via ra_file_insert() but not
ra_push_interval() which is used for live-range splitting.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13818>
2021-11-19 16:59:54 +00:00
Connor Abbott 9d88b98b08 ir3/ra: Consider reg file size when swapping killed sources
Don't swap a 2-component vector of half-regs with a full reg if that
would result in the half regs going outside of the allowable half-reg
space.

Fixes: d4b5d2a020 ("ir3/ra: Use killed sources in register eviction")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13818>
2021-11-19 16:59:54 +00:00
Alejandro Piñeiro ff89dc3523 vulkan: move common format helpers to vk_format
v3dv, radv, and turnip are using several copy-and-pasted format helpers
(most of them wrappers over util_format_description based helpers).

This commit moves the common helpers to the already existing common
vk_format.h. For the case of v3dv we were able to remove the vk_format
header. For turnip and radv, a local vk_format.h header remains, with
methods that are only used for those drivers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13858>
2021-11-19 12:23:19 +01:00
Dylan Baker a854cbc7b5 turnip: don't use mesa/macros.h to get utils/rounding.h
For hopefully obvious reasons.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13853>
2021-11-18 10:46:51 +00:00
Connor Abbott 23a5f1a5ac ir3: Stop inserting nops during scheduling
Not necessary since nothing uses it anymore. This might have a slight
effect on spilling with multiple blocks, but no shader-db difference
because nothing spills.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
2021-11-17 13:41:47 +00:00
Connor Abbott e0eeba6cbb ir3/postsched: Only prefer tex/sfu if they are soft-ready
Otherwise we schedule an SFU depending on a tex as soon as the tex is
scheduled, which is very much not what we want.

Note that sstall is helped more than nops are hurt, and the shaders with
the largest nop regressions also have sstall helped. However (sy) is
also very much helped.

total nops in shared programs: 345482 -> 345986 (0.15%)
nops in affected programs: 5731 -> 6235 (8.79%)
helped: 15
HURT: 81
helped stats (abs) min: 1 max: 9 x̄: 3.27 x̃: 3
helped stats (rel) min: 0.50% max: 16.00% x̄: 8.55% x̃: 10.26%
HURT stats (abs)   min: 1 max: 72 x̄: 6.83 x̃: 4
HURT stats (rel)   min: 0.57% max: 400.00% x̄: 32.50% x̃: 13.16%
95% mean confidence interval for nops value: 3.34 7.16
95% mean confidence interval for nops %-change: 13.07% 39.10%
Nops are HURT.

total sstall in shared programs: 133804 -> 132381 (-1.06%)
sstall in affected programs: 4743 -> 3320 (-30.00%)
helped: 68
HURT: 24
helped stats (abs) min: 1 max: 153 x̄: 21.88 x̃: 8
helped stats (rel) min: 1.79% max: 100.00% x̄: 33.20% x̃: 28.00%
HURT stats (abs)   min: 1 max: 11 x̄: 2.71 x̃: 2
HURT stats (rel)   min: 1.02% max: 200.00% x̄: 17.73% x̃: 5.59%
95% mean confidence interval for sstall value: -22.05 -8.89
95% mean confidence interval for sstall %-change: -27.60% -12.22%
Sstall are helped.

total (ss) in shared programs: 35471 -> 35481 (0.03%)
(ss) in affected programs: 462 -> 472 (2.16%)
helped: 9
HURT: 15
helped stats (abs) min: 1 max: 2 x̄: 1.11 x̃: 1
helped stats (rel) min: 4.17% max: 33.33% x̄: 14.00% x̃: 7.69%
HURT stats (abs)   min: 1 max: 3 x̄: 1.33 x̃: 1
HURT stats (rel)   min: 1.19% max: 50.00% x̄: 12.27% x̃: 8.33%
95% mean confidence interval for (ss) value: -0.14 0.97
95% mean confidence interval for (ss) %-change: -5.11% 9.94%
Inconclusive result (value mean confidence interval includes 0).

total (sy) in shared programs: 13522 -> 13288 (-1.73%)
(sy) in affected programs: 422 -> 188 (-55.45%)
helped: 22
HURT: 1
helped stats (abs) min: 1 max: 21 x̄: 10.68 x̃: 10
helped stats (rel) min: 8.00% max: 94.44% x̄: 56.58% x̃: 56.94%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 25.00% max: 25.00% x̄: 25.00% x̃: 25.00%
95% mean confidence interval for (sy) value: -13.18 -7.17
95% mean confidence interval for (sy) %-change: -65.48% -40.59%
(sy) are helped.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
2021-11-17 13:41:47 +00:00
Connor Abbott 6f5c0d209c ir3/postsched: Rewrite delay handling
Analogous to the pre-RA scheduler. Unfortunately this time it's a bit
more involved because we have to correctly handle (rptN), which is
already relevant for swz. This means we need the index of the
destination register that conflicts with the source register, to handle
swz, and we need to expose that part of ir3_delay. But once that's done,
we can delete ir3_delay_calc_postra.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
2021-11-17 13:41:47 +00:00
Connor Abbott 140e117f2b ir3/delay: Ignore earlier definitions to the same register
We have a situation in some skia shaders like:

add.f r0.x, ...
(rpt2)nop
mul.f ..., r0.x
sam (xyzw) r0.x, ...
rcp ..., r0.x

Notice that rcp uses the result of the sam instruction, not the add.f,
but we didn't keep track of which instructions kill the sources in
ir3_delay, so we'd add an extra nop, resulting in a disagreement between
ir3_delay and the scheduling graph. Since postsched is correct, fix
ir3_delay. This only results in some very slight shader-db changes but
keeps the next commit from changing things.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
2021-11-17 13:41:47 +00:00
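The idea in 140e117f2b, that only the most recent write to a register matters
for a later read, can be sketched as a backwards walk that stops at the first
matching definition. The tuple representation and the function name are
assumptions for illustration; ir3_delay itself works on ir3 instructions.

```python
# (opcode, destination register or None), mirroring the skia example above.
SKIA_EXAMPLE = [
    ("add.f", "r0.x"),
    ("nop", None),
    ("mul.f", None),
    ("sam", "r0.x"),
    ("rcp", None),
]

def last_writer(instrs, index, reg):
    """Index of the closest earlier instruction writing `reg`, or None."""
    for i in range(index - 1, -1, -1):
        if instrs[i][1] == reg:
            return i  # earlier definitions of `reg` are killed by this one
    return None
```

For the rcp at index 4, last_writer(SKIA_EXAMPLE, 4, "r0.x") selects the sam
rather than the add.f, so no delay is charged against the add.f.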
Connor Abbott a54e7baa65 ir3/postsched: Handle sync dependencies better
We want to model soft dependencies, but because of how there's only a
single bit to wait on all of them, there may be unnecessary delays
inserted when a (sy)-consumer follows an unrelated (sy)-producer.
Previously there was some code to try to work around this, but we can
just model it directly using the sfu_delay and tex_delay cycle counts
that we have to maintain anyway and delete it.

This also gets rid of the calls to ir3_delay_postra with soft=true which
would be more complicated to handle in the next commit.

There is a functional change here: the idea of preferring fewer nops
over critical path length (max_delay) up to 3 nops is kept (and we
delete the TODO which is already sort-of resolved by it), but delays due
to (ss)/(sy) and nops are now treated equally, rather than always
preferring nops over syncs. So if our estimate indicates that scheduling
an (ss) consumer will result in a wait of one cycle and there's another
instruction that will require one nop, we will treat them otherwise
equal and choose based on max_delay instead. This results in more
sstall, but the decrease in nops is much greater.

total nops in shared programs: 376613 -> 345482 (-8.27%)
nops in affected programs: 275483 -> 244352 (-11.30%)
helped: 3226
HURT: 110
helped stats (abs) min: 1 max: 78 x̄: 9.73 x̃: 7
helped stats (rel) min: 0.19% max: 100.00% x̄: 19.48% x̃: 13.68%
HURT stats (abs)   min: 1 max: 16 x̄: 2.43 x̃: 2
HURT stats (rel)   min: 0.00% max: 150.00% x̄: 13.34% x̃: 4.36%
95% mean confidence interval for nops value: -9.61 -9.06
95% mean confidence interval for nops %-change: -19.01% -17.78%
Nops are helped.

total sstall in shared programs: 126195 -> 133806 (6.03%)
sstall in affected programs: 79440 -> 87051 (9.58%)
helped: 300
HURT: 1922
helped stats (abs) min: 1 max: 15 x̄: 4.72 x̃: 4
helped stats (rel) min: 1.05% max: 100.00% x̄: 17.15% x̃: 14.55%
HURT stats (abs)   min: 1 max: 29 x̄: 4.70 x̃: 4
HURT stats (rel)   min: 0.00% max: 900.00% x̄: 25.38% x̃: 10.53%
95% mean confidence interval for sstall value: 3.22 3.63
95% mean confidence interval for sstall %-change: 17.50% 21.78%
Sstall are HURT.

total (ss) in shared programs: 35190 -> 35472 (0.80%)
(ss) in affected programs: 6433 -> 6715 (4.38%)
helped: 163
HURT: 401
helped stats (abs) min: 1 max: 2 x̄: 1.06 x̃: 1
helped stats (rel) min: 1.92% max: 33.33% x̄: 11.53% x̃: 10.00%
HURT stats (abs)   min: 1 max: 3 x̄: 1.13 x̃: 1
HURT stats (rel)   min: 1.56% max: 100.00% x̄: 15.33% x̃: 12.50%
95% mean confidence interval for (ss) value: 0.41 0.59
95% mean confidence interval for (ss) %-change: 6.22% 8.93%
(ss) are HURT.

total (sy) in shared programs: 13476 -> 13521 (0.33%)
(sy) in affected programs: 669 -> 714 (6.73%)
helped: 30
HURT: 78
helped stats (abs) min: 1 max: 2 x̄: 1.13 x̃: 1
helped stats (rel) min: 4.00% max: 50.00% x̄: 21.22% x̃: 21.11%
HURT stats (abs)   min: 1 max: 2 x̄: 1.01 x̃: 1
HURT stats (rel)   min: 3.45% max: 100.00% x̄: 31.93% x̃: 25.00%
95% mean confidence interval for (sy) value: 0.23 0.60
95% mean confidence interval for (sy) %-change: 11.19% 23.15%
(sy) are HURT.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
2021-11-17 13:41:47 +00:00
Connor Abbott b9f61d7287 ir3/postsched: Fix copy-paste mistake
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
2021-11-17 13:41:47 +00:00
Connor Abbott d9a91318b1 ir3/sched: Rewrite delay handling
The old code walked the instructions between each ready instruction and
each of its parents for every instruction, which can quickly become
accidentally quadratic. Instead we keep track of the current
"instruction pointer" of the to-be-scheduled instruction, and for each
ready instruction calculate an "earliest possible IP" which is the IP
that needs to be reached before we can schedule it. Because this stays
constant as soon as an instruction becomes ready, we never have to
recompute it and each call to ir3_delay_calc_prera() becomes a simple
comparison and subtract. We only need to iterate over the children and
update their earliest_ip when scheduling an instruction, and we already
do that in util_dag_prune_head() so it should be cheap.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
2021-11-17 13:41:47 +00:00
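A minimal sketch of the "earliest possible IP" bookkeeping described in
d9a91318b1 follows. The class, the function names, and the convention that an
edge's delay counts the slots required between producer and consumer are all
assumptions made for this example, not the scheduler's actual data structures.

```python
from dataclasses import dataclass, field

@dataclass
class Instr:
    name: str
    # (child, delay) pairs: `delay` slots must separate this instr and child.
    children: list = field(default_factory=list)
    earliest_ip: int = 0

def delay_for(instr, ip):
    """Nops needed to schedule `instr` at `ip`: one compare and one subtract."""
    return max(0, instr.earliest_ip - ip)

def schedule(instr, ip):
    """Place `instr` at the first legal ip, update children, return next ip."""
    ip += delay_for(instr, ip)  # pad with nops until instr may issue
    for child, delay in instr.children:
        # Only the children are touched; nothing is recomputed by walking back.
        child.earliest_ip = max(child.earliest_ip, ip + 1 + delay)
    return ip + 1
```

Because earliest_ip is only ever raised when a parent is scheduled, a ready
instruction's value stays fixed, and the per-candidate cost is constant
instead of a walk over preceding instructions.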
Connor Abbott 508f917d8c util/dag: Make edge data a uintptr_t
Nobody was actually using it as a pointer, and I'm going to introduce a
shared function which relies on it not being a pointer so let's fix this
once and for all.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
2021-11-17 13:41:47 +00:00
Ilia Mirkin aa93896156 freedreno/ir3: adjust condition for when to use ldib
We have to use it any time that the image is writable. Otherwise writes
from the same invocation won't have posted into the texture cache.

See: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5629
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13807>
2021-11-16 18:22:29 +00:00
Ilia Mirkin a95a9f0cc6 freedreno/a4xx: include guesses from a3xx for some of the constid's
The ones that are untested are left as comments. The ones that rename
values were tested manually.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13806>
2021-11-16 05:08:26 +00:00
Ilia Mirkin 45606b51cc freedreno/a4xx: indicate whether outputs are uint/sint
Unclear whether this fixes anything, but the blob does seem to set
these. (Discovered while trying to determine if value clamping was
missing for non-32-bit integer formats, which fail in some tests.)

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13806>
2021-11-16 05:08:26 +00:00
Ilia Mirkin 20e8e11d64 freedreno/a6xx: re-express buffer textures more logically
Same as a5xx, move one bit into the tex type, one as a separate named
BUFFER bit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13805>
2021-11-16 04:44:23 +00:00
Ilia Mirkin 8c041f4bf3 freedreno/a5xx: re-express buffer textures more logically
Instead of treating it as 2 bits to enable, make BUFFER a type (and
extend the bitfield width), and then add a separate BUFFER bit
(ostensibly to perform the width/height concatenation but who knows).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13805>
2021-11-16 04:44:23 +00:00
Ilia Mirkin 6566eae933 freedreno/a4xx: add proper buffer texture support
Rather than faking it as a 1d texture, add the buffer texture type, and
allow a full range of sizes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13805>
2021-11-16 04:44:23 +00:00
Ilia Mirkin 8c9a86cb57 freedreno/ir3: fix image-to-tex flags, remove 3d -> array hack
The function would return both the 3d and array flags set for 2d array,
and would return just 3d for cubes. Fix the flags so that they are
appropriate for images.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13804>
2021-11-16 00:33:31 +00:00
Emma Anholt 42753be1e7 freedreno/a6xx: Fix a bunch of 3D texture layout to match blob behavior.
This doesn't get all of the texelfetch sampler3d testcases working, but
it's sure a lot more.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13733>
2021-11-15 22:25:08 +00:00
Emma Anholt a3717c1496 freedreno/cffdump: Handle the TILE_ALL flag in unit test generation.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13733>
2021-11-15 22:25:08 +00:00
Emma Anholt e42450a255 freedreno/cffdump: Fix up formatting of texturator unit test script output.
Now I don't need to re-clang-format as I generate testcases.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13733>
2021-11-15 22:25:08 +00:00
Emma Anholt 7a6fc25daa freedreno/fdl: Add support for unit testing 3D texture array strides.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13733>
2021-11-15 22:25:08 +00:00
Emma Anholt 0d7c6eedc7 freedreno/cffdump: Fix 64-bit reg decode in script mode.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13733>
2021-11-15 22:25:08 +00:00
Emma Anholt f63fd3425d freedreno: Fix the texturator unit test script.
We no longer have reg defs for the HI fields, so all we can access from
lua is the low 32 bits.  LUA has only double-precision floats for numbers,
so we can't fix that.  However, the high bits are almost always the same,
so it's not that big of a deal to be ignoring them for this script.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13733>
2021-11-15 22:25:08 +00:00
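The limitation noted in f63fd3425d can be illustrated outside Lua: a
double-precision float has a 53-bit significand, so a full 64-bit register
value does not survive the conversion, while each 32-bit half does. The value
below is hypothetical and not taken from a real trace.

```python
# Hypothetical 64-bit register value, chosen so the low bits are non-zero.
value = 0x0123_4567_89AB_CDEF

# Round trip through a double, as a doubles-only scripting language would.
round_tripped = int(float(value))
lossy = round_tripped != value

# Each 32-bit half fits exactly in a double.
lo = value & 0xFFFF_FFFF
hi = value >> 32
lo_exact = int(float(lo)) == lo
hi_exact = int(float(hi)) == hi
```

This is why reading only the low 32 bits from the script is safe, and why the
high half cannot simply be merged back into one number there.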
Emma Anholt 3ddefb4ae3 freedreno/fdl: Dump the generated layout when a layout test fails.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13733>
2021-11-15 22:25:08 +00:00
Ilia Mirkin 31d6cd224a a5xx: remove astc srgb workaround logic
This was copied from a4xx, which only needs it on one chip model (A420).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13782>
2021-11-15 17:31:53 +00:00
Emma Anholt 01d36149cd ci/freedreno: Add a link to the issue for color_depth_attachments.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13747>
2021-11-12 20:26:22 +00:00
Emma Anholt 1847700d3c ci/freedreno: Add notes explaining the KHR-GL* failures.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13747>
2021-11-12 20:26:22 +00:00
Emma Anholt 943449fb8e ci/freedreno: Enable the tes-input/tcs-input tests.
They seem to be mostly passing these days.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13747>
2021-11-12 20:26:22 +00:00
Emma Anholt 2ce44a0298 freedreno/ir3: Fix an off-by-one in so->outputs_count safety assert.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13747>
2021-11-12 20:26:22 +00:00
Emma Anholt 02079cbb77 freedreno/a6xx: Add some notes about piglit failures.
Hopefully this helps others save time looking at piglit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13747>
2021-11-12 20:26:22 +00:00
Connor Abbott a9b4a507fe tu: Expose Vulkan 1.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13756>
2021-11-12 18:14:34 +00:00
Connor Abbott c6216c941c tu: Add VK_KHR_buffer_device_address stubs
dEQP-VK.api.version_check.entry_points requires us to return a function
pointer, even though the feature is optional in Vulkan 1.2.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13756>
2021-11-12 18:14:34 +00:00
Connor Abbott 952ab4f64f tu: Enable subgroupBroadcastDynamicId
It's a Vulkan 1.2 only feature, but it's trivially supported.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13756>
2021-11-12 18:14:34 +00:00
Hyunjun Ko ddb3d30d47 turnip: Enable VK_KHR_separate_depth_stencil_layouts
We now start handling depth/stencil layouts separately when
adding implicit subpass dependencies.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13057>
2021-11-12 13:16:23 +00:00
Christian Gmeiner a0634a3c85 ci/bare-metal: switch to common .baremetal-test-arm64
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13751>
2021-11-12 08:22:29 +00:00
Emma Anholt a68a0c9e1c mesa/st: Disable NV_copy_depth_to_color on non-doubles-capable HW.
The previous doubles check
(https://gitlab.freedesktop.org/mesa/mesa/-/issues/3459) checked that you
didn't have full doubles emulation turned on, but we also need to check
that you have doubles at all (emulated or not) or non-GL4 drivers will
fail.

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13743>
2021-11-11 16:38:58 +00:00
Emma Anholt 94e4cd4d83 freedreno/fdl6: Skip redundant setting of TILE_ALL for NV12.
We already respect the tile_all flag above, and it should be set in tu.
Fixes a mismatch between fdl6_view_init() and gallium.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13443>
2021-11-11 00:10:57 +00:00
Emma Anholt 2e6810a06a util/format: Add G8_B8R8_420_UNORM to match Vulkan.
turnip was playing fast and loose with the name, using the R8_G8B8 format
name but actually setting the descriptors up to read G8_B8R8 like Vulkan
(sensibly) wants.  This caused trouble when trying to make freedreno and
turnip share code.  By having both orderings as format names, we can share
the descriptor code and also confuse readers less.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13443>
2021-11-11 00:10:57 +00:00
Emma Anholt 271b6cb981 util: Rename PIPE_FORMAT_G8_B8_R8_420_UNORM.
The only user, turnip, was actually treating it as this layout, matching
vulkan's specification of how the planes map to RGB values.  (Y=G means
that Cb=B and Cr=R).

Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13443>
2021-11-11 00:10:57 +00:00
Emma Anholt 549924d53e freedreno: Fix constant-index assumptions in IBO loads.
The encoder already sets up our IBO accesses as potentially nonuniform, so
we just need to be careful to not try to force the IBO index into an
immediate.

Fixes assertion failures in piglit arb_shader_image_load_store-invalid
(intermittent due to
https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/597), which
had some interesting actual failures hidden behind it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13601>
2021-11-10 17:48:59 +00:00
Emma Anholt 9e04f97d8e freedreno: Fix the uniform/nonuniform handling for cat5 bindful modes.
We can see from the dynamically_uniform (compiler doesn't know if you're
uniform or not) vs uniform (compiler can see it's uniform) case in the
blob which is which.  Now that we have the right names, also use the
nonunif flag for encoding the actual non-uniform mode (previously, we were
always setting it in a way that meant uniform).

I verified this behavior back to a418 with samplers.  The a3xx blob I have
only does GLES3, so we don't have the opaque_type_indexing tests to see.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13601>
2021-11-10 17:48:59 +00:00
Hyunjun Ko 5d0712b185 turnip: expose VK_KHR_driver_properties
Now that we have a conformance version to advertise, we can expose the
extension.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6251>
2021-11-09 03:43:54 +00:00
Emma Anholt 1e850f23b1 turnip: Claim 1.2.7.1 CTS conformance.
I submitted a conformance package for A618 today, so let's stop doing all
this warning about non-conformance.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6251>
2021-11-09 03:43:54 +00:00
Connor Abbott 38f0b36f1a ir3/spill: Initial implementation of rematerialization
This only handles moves from immediates/constants. The next step would be
to rematerialize ALU instructions whose sources are available.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13650>
2021-11-08 23:51:37 +00:00
Connor Abbott db566904ba ir3/spill: Mark root as non-spillable after inserting
We have to mark the root as non-spillable in case the interval is the
child of some other interval, but we can't know whether it's the child
of some other interval until it's been inserted. Move the setting of
cant_spill below the insertion. This prevents us from using a bogus
parent value.

Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13650>
2021-11-08 23:51:37 +00:00
Emma Anholt 34739cb6e2 freedreno/ir3: Fix off-by-one in prefetch safety assert.
This looks like just a typo, we allow up to == 0xf in the lowering pass.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13668>
2021-11-04 22:49:29 +00:00
Emma Anholt 35f56ad856 freedreno/a5xx: Diff reduction in fd5_layout to fd6_layout.
This should be exactly equivalent code, except for the is_3d "level <= 1"
which doesn't bring over 6c19d37331 ("freedreno/a6xx: fix 3d tex
layout") due to it failing our unit tests where we compare to the blob's
behavior.  The layer_stride setup is pulling in what freedreno_resource.c
was doing after the layout setup, so we match fd6 and so that it could
potentially be checked in unit testing.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13668>
2021-11-04 22:49:29 +00:00
Emma Anholt 1e869e3fb4 freedreno/a5xx+: Fix missing LA formats.
GL_ARB_texture_buffer_object uses these formats, and we expose it.  Since
we didn't have the formats in the table, we were using bad HW
texture/color formats for them.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13666>
2021-11-04 19:07:54 +00:00
Emma Anholt 0e4fcda7e0 freedreno/a6xx: Don't try to generate mipmaps for SNORM with our blitter.
Since we're casting to unorm, the linear filtering will give bad results.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13666>
2021-11-04 19:07:54 +00:00
Emma Anholt 0913ac33a9 freedreno/a618: Mark a flaky test that triggers hangcheck.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13659>
2021-11-04 03:47:54 +00:00
Emma Anholt d1801d43f8 freedreno/a5xx: Use the defined names for 2D_BLIT_CNTL regs.
We have definitions for them above, no need to be UNKNOWN about it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13659>
2021-11-04 03:47:54 +00:00
Emma Anholt f0f5b8d47c freedreno/a6xx: Fix partial z/s clears with sysmem.
We have to set 8c01 to say "leave these channels alone" when
clearing/storing just Z or S of z24s8.  Fixes the bypass path for
KHR-GLES3.packed_depth_stencil.verify_read_pixels.depth24_stencil8.

Cc: mesa-stable
Fixes: #5592
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13659>
2021-11-04 03:47:54 +00:00
Matt Turner cc29b94041 freedreno/ir3: Use immediate for flat.b's src1
According to Jonathan Marek:

  Only one immediate can be decoded in a cat2 instruction (if both srcs
  are immediates, they will use the value of either the first or
  second one, I don't remember which) - using 2 immediates in a cat2
  instruction is only "correct" if they are both equal.

  The (i,j) in the second src of flat.b is not unused, but behaves as 0
  for any (small) integer because it is a float src. The hack I
  suggested is to set the second src equal to (immediate) first src,
  which seems to work.

This allows us to remove a couple of mov instructions or a bit of extra
constfile usage.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13558>
2021-11-04 02:59:28 +00:00
Matt Turner 2ab0cf2b54 freedreno/ir3: Use flat.b to load flat varyings on a6xx
The flat.b/bary.f cat2 instruction should be faster than an ldlv cat6
instruction, even with a couple of additional moves (which will be
removed in the next patch).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13558>
2021-11-04 02:59:28 +00:00
Matt Turner 2ee1b5a526 freedreno/ir3: Add infrastructure for flat.b
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13558>
2021-11-04 02:59:28 +00:00
Matt Turner a150e31910 ir3: Add support for (dis)assembling flat.b
flat.b is a variant of the bary.f instruction that does not perform
interpolation of the varying input.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13558>
2021-11-04 02:59:28 +00:00
Emma Anholt 14fca01b32 freedreno: Fix layered rendering to just Z/S and not color.
We would try to take the gmem path which can't do layered rendering.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13653>
2021-11-03 21:13:45 +00:00
Emma Anholt 3050e20283 freedreno/fdl6: Add support for texture swizzles of A/L/I/LA/RGBx.
To convert freedreno over, we need to support these formats where we remap
R or RG formats to GL compat ones, or RGBA to RGBx.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13635>
2021-11-03 19:38:48 +00:00
Emma Anholt 669caded51 turnip: Remove buffer-view cross-check code.
Now that I've tested storage.*buffer, I'm confident I've moved the buffer
views correctly.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13635>
2021-11-03 19:38:48 +00:00
Emma Anholt ef1fb25787 turnip: Use the new shared buffer-view descriptor creation function.
This cross-checks that our descriptors match as I move the code.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13635>
2021-11-03 19:38:48 +00:00
Emma Anholt aa3074e5be freedreno/fdl6: Add an interface for setting up buffer descriptors.
Buffers don't need all the layout stuff that image views do, so it's
easier to have a separate interface for generating them.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13635>
2021-11-03 19:38:48 +00:00
Emma Anholt 7b578c1249 freedreno/a6xx: Emit a null descriptor for unoccupied IBO slots.
Fixes a crash in some desktop GL testcases in piglit.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13635>
2021-11-03 19:38:48 +00:00
Emma Anholt 29093bc42d freedreno: Fix gmem invalidating the depth or stencil of packed d/s.
The gmem store stores both depth and stencil for z24s8.  So, if we're
doing a write (clear or draw) to one or the other of the channels, we need
the other one restored as well.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13649>
2021-11-03 18:56:23 +00:00
Danylo Piliaiev 3afdc3ab2c freedreno/computerator: Support A660 gpu
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13640>
2021-11-03 16:32:19 +00:00