Commit Graph

38 Commits

Author SHA1 Message Date
Vasily Khoruzhick fa86a2a94d lima: add support for 3D textures
It looks like MBS format used by blob doesn't distinguish sampler2D from
sampler3D, so load texture instruction is the same for 2D and 3D
textures.

So all we need to RE is texture descriptor for 3D textures, but blob
doesn't implement it, so we need to do some guesswork:

- unknown_3_1 looks like a depth since it sits after height/width and
  always set to 1
- unknown_2_2 is exactly 3 bits and it follows wrap_t, so it must be
  wrap_r
- missing part is texture type for 3D textures. By trial and error it
  seems to be 4. First bit is only set for cubemap, so it's likely a
  separate flag, and rest 2 bits look like number of tex dimensions akin
  to midgard and later (thanks, panfrost!) with 0 for 1D, 1 for 2D
  and 2 for 3D.

Put it all together and we have working 3D textures on lima!

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13213>
2021-11-16 22:58:12 +00:00
Vasily Khoruzhick 764760314d lima: add native txp support
Currently lima uses generic TXP lowering that results in downgrading
coords precision to FP16 since we have to do some calculations with
coords instead of loading them directly from varying.

Mali4x0 has native TXP support, however coords and projector have to
come from a single source.

Add NIR lowering pass that combines coords and projector into a single
backend-specific source and use it instead of generic lowering.

Unfortunately this change regresses one test, but it also fails in blob and
disassembly is now identical.

shader-db diff:

total instructions in shared programs: 15623 -> 15603 (-0.13%)
instructions in affected programs: 877 -> 857 (-2.28%)
helped: 7
HURT: 0
helped stats (abs) min: 2 max: 8 x̄: 2.86 x̃: 2
helped stats (rel) min: 0.87% max: 10.53% x̄: 4.93% x̃: 1.85%
95% mean confidence interval for instructions value: -4.95 -0.76
95% mean confidence interval for instructions %-change: -9.31% -0.55%
Instructions are helped.

total loops in shared programs: 3 -> 3 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 136 -> 137 (0.74%)
spills in affected programs: 0 -> 1
helped: 0
HURT: 1

total fills in shared programs: 598 -> 602 (0.67%)
fills in affected programs: 0 -> 4
helped: 0
HURT: 1

Tested-by: Denis Pauk <pauk.denis@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13111>
2021-11-16 19:13:42 +00:00
Andreas Baierl 1e9f18008f lima/parser: add shader disassembly to dump
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13138>
2021-10-04 08:37:50 +02:00
Michel Dänzer d200f45875 Use explicit break instead of fall-through to break-only case
clang generates a warning if there's no explicit break or fall-through
annotation. The latter would be kind of silly in this case, and not
robust against any future changes turning the fall-through invalid.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>
2021-04-15 16:01:22 +00:00
Vasily Khoruzhick bf09ba5385 lima: implement shader disk cache
Wire up disk cache routines and change fs and vs keys to use nir_sha1
instead of pointer to uncompiled shader to be able to reuse them for
disk cache.

Tested-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9672>
2021-03-25 06:31:41 +00:00
Marek Olšák f5f0c012ad gallium/util: remove empty file u_half.h
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6987>
2020-10-06 21:07:11 -04:00
Marek Olšák b42c6ff6f6 util: remove util_float_to_half and util_half_to_float wrappers
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6987>
2020-10-06 21:07:07 -04:00
Matt Turner 1aac47db69 Revert F16C series (MR 6774)
This reverts commit 4fb2eddfdf.
This reverts commit 7a1deb16f8.
This reverts commit 2b6a172343.
This reverts commit 5af81393e4.
This reverts commit 87900afe5b.

A couple of problems were discovered after this series was merged that
cause breakage in different configurations:

   (1) It seems that using -mf16c also enables AVX, leading to SIGILL on
   platforms that do not support AVX.
   (2) Since clang only warns about unknown flags, and as I understand
   it Meson's handling in cc.has_argument() is broken, the F16C code is
   wrongly enabled when clang is used, even for example on ARM, leading
   to a compilation error.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3583
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6969>
2020-10-01 21:08:12 +00:00
Marek Olšák 4fb2eddfdf gallium/util: remove empty file u_half.h
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6774>
2020-09-30 16:28:24 +00:00
Marek Olšák 2b6a172343 util: remove util_float_to_half and util_half_to_float wrappers
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6774>
2020-09-30 16:28:24 +00:00
Erico Nunes e622e010fd lima/ppir: rework select conditions
This is yet another simple optimization that attemts to save the
insertion of an unnecessary mov for a large number of cases.
If the node outputting the condition for select satisfies a few
requirements (which are common in the case of comparison conditions),
it can just be changed to pipeline output and used directly.
In case of difficult corner cases, just fall back to the mov as before.

The sel_cond op is removed as the scheduler can be smart enough to place
nodes that output to ^fmul in the ALU_SCL_MUL slot, and as there can be
alu ops other than just mov.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4632>
2020-05-09 14:40:40 +02:00
Erico Nunes 8c47640731 lima/ppir: rework store output
In many cases, it is possible to avoid creating a mov for the store
output node.
Additionally, nodes other than alu, such as load varying, can be valid
store output nodes too.

This is another small optimization, but helps a vast majority of
programs by 1 instruction.
Shaders with discard easily become complicated to handle properly.
Some example issues: ppir has to rely on instruction ordering; or a
node with ssa output could be required both before a discard_if (as a
condition) and after it (as the instruction with the 'stop' bit set).
So don't try to handle them here.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4632>
2020-05-09 14:40:34 +02:00
Erico Nunes 741aa3439d lima/ppir: fix lod bias register codegen
The lod bias register is correctly run through the entire compilation
process, but in the end its allocated register value was never being
added to the instruction.
It seems that most programs were lucky enough that lod bias was assigned
register 0.x so that things worked anyway.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4535>
2020-05-09 11:30:07 +00:00
Vasily Khoruzhick 1b49534df2 lima: add support for R and RG formats
Unfortunately these are not supported natively for sampling
so we have to lower them.

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4241>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4241>
2020-03-20 17:00:10 +00:00
Erico Nunes d56710ab82 lima/ppir: fix lod bias src
ppir has some code that operates on all ppir_src variables, and for that
uses ppir_node_get_src.
lod bias support introduced a separate ppir_src that is inaccessible by
that function, causing it to be missed by the compiler in some routines.
Ultimately this caused, in some cases, a bug in const lowering:

  .../pp/lower.c:42: ppir_lower_const: Assertion `src != NULL' failed.

This fix moves the ppir_srcs in ppir_load_texture_node together so they
don't get missed.

Fixes: 721d82cf06 lima/ppir: add lod-bias support

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3185>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3185>
2019-12-20 19:39:55 +00:00
Arno Messiaen 721d82cf06 lima/ppir: add lod-bias support
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-11-20 22:24:00 +00:00
Arno Messiaen a9391a1a01 lima: add cubemap support
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-10-31 06:29:31 +00:00
Arno Messiaen 9890590fba lima: introduce ppir_op_load_coords_reg to differentiate between loading texture coordinates straight from a varying vs loading them from a register
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-10-31 06:29:31 +00:00
Timothy Arceri 7f106a2b5d util: rename list_empty() to list_is_empty()
This makes it clear that it's a boolean test and not an action
(eg. "empty the list").

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Vasily Khoruzhick 678ebda8b7 lima/ppir: add support for indirect load of uniforms and varyings
Utgard PP supports indirect load of uniforms and varyings, so let's
enable it.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-24 20:33:27 -07:00
Vasily Khoruzhick e23fd2c375 lima/ppir: don't assume that load coords gets value from register
It can load value from varying directly as well. Also load_regs is the
only op that has a source, so add src_num field to load node and set it
accordingly.

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-04 00:02:13 +00:00
Vasily Khoruzhick 28d4b456a5 lima/ppir: add control flow support
This commit adds support for nir_jump_instr, if and loop
nir_cf_nodes.

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-24 08:17:31 -07:00
Vasily Khoruzhick 8dd195e865 lima/ppir: turn store_color into ALU node
We don't have a special OP to store color in PP, all we need to do is to
store gl_FragColor into reg0, thus it's just a mov and therefore ALU node.

Yet we still need to indicate that it's store_color op so regalloc ignores
its destination.

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-23 18:19:47 -07:00
Vasily Khoruzhick fd129817f0 lima/ppir: add support for unconditional branches and condition negation
We need 'negate' modifier for branch condition to minimize branching. Idea
is to generate following:

current_block: { ...; if (!statement) branch else_block; }
then_block: { ...; branch after_block; }
else_block: { ... }
after_block: { ... }

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-23 18:19:46 -07:00
Andreas Baierl 1c45541c7f lima/ppir: Add fddx and fddy
Lower fddx and fddy and set the right bits in codegen.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-08-12 23:20:04 +02:00
Erico Nunes fd29c4d6c5 lima/ppir: simplify select op lowering and scheduling
The select operation relies on the select condition coming from the
result of the the alu scalar mult slot, in the same instruction.
The current implementation creates a mov node to be the predecessor of
select, and then relies on an exception during scheduling to ensure that
both ops are inserted in the same instruction.

Now that the ppir scheduler supports pipeline register dependencies,
this can be simplified by making the mov explicitly output to the fmul
pipeline register, and the scheduler can place it without an exception.

Since the select condition can only be placed in the scalar mult slot,
differently than a regular mov, define a separate op for it.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-08-04 13:38:18 +02:00
Andreas Baierl 5254e53deb lima/ppir: Add gl_FrontFace handling
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-08-03 08:04:12 +00:00
Erico Nunes 65e6c42d27 lima/ppir: fix branch codegen register encode
The branch instruction has 6 bits per register operand which allows it
to specify a component in the register.
Fix codegen so that it outputs the right component, otherwise it always
outputs the x component.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-07-23 08:49:19 +00:00
Andreas Baierl 4627a0c4eb lima/ppir: Add gl_PointCoord handling
Treat gl_PointCoord as a system value and
add the necessary bits for correct codegen.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-18 13:20:39 +00:00
Vinson Lee d1a55d9559 lima/ppir: Fix assert condition in ppir_codegen_encode_branch.
Fixes: af0de6b91c ("lima/ppir: implement discard and discard_if")
Reported-by: Coverity Scan
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-07-15 23:48:34 +00:00
Vasily Khoruzhick eb862c2365 lima/ppir: Fix branch codegen
"unknown_2" field is actually a size of instruction that branch
points to. If it's set to a smaller size than actual instruction
branch behavior is not defined (and it usually wedges the GPU).

Fix it by setting this field correctly.

Fixes: af0de6b91c ("lima/ppir: implement discard and discard_if")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-07-14 19:49:14 -07:00
Vasily Khoruzhick 8f0160ca24 lima/ppir: Fix assert condition in ppir_codegen_encode_discard
Fixes: af0de6b91c ("lima/ppir: implement discard and discard_if")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-07-14 19:48:55 -07:00
Mateusz Krzak 60009aefdb lima/ppir: change offset type to int
Offset doesn't need to be 64-bit. This fixes compilation error
with 64-bit off_t.

Fixes: af0de6b9 lima/ppir: implement discard and discard_if

Suggested-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Mateusz Krzak <kszaquitto@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
2019-06-13 07:43:24 +02:00
Vasily Khoruzhick b412e05751 lima/ppir: add missing handling of min/max ops for vec4 add slot
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-06-06 04:30:36 +00:00
Vasily Khoruzhick af0de6b91c lima/ppir: implement discard and discard_if
This commit also adds codegen for branch since we need it
for discard_if.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-05-27 07:39:03 -07:00
Andreas Baierl c960323a81 lima/ppir: Add gl_FragCoord handling
Treat gl_FragCoord variable as a system value and lower the w component
with a nir pass.
Add the necessary bits for correct codegen.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-04-29 02:46:44 +00:00
Erico Nunes 56230f0428 lima/ppir: support ppir_op_ceil
Add a few missing ppir_op_ceil enum handling entries to implement
nir_op_fceil in lima ppir.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-04-19 10:22:03 +00:00
Qiang Yu 92d7ca4b1c gallium: add lima driver
v2:
- use renamed util_dynarray_grow_cap
- use DEBUG_GET_ONCE_FLAGS_OPTION for debug flags
- remove DRM_FORMAT_MOD_ARM_AGTB_MODE0 usage
- compute min/max index in driver

v3:
- fix plbu framebuffer state calculation
- fix color_16pc assemble
- use nir_lower_all_source_mods for lowering neg/abs/sat
- use float arrary for static GPU data
- add disassemble comment for static shader code
- use drm_find_modifier

v4:
- use lima_nir_lower_uniform_to_scalar

v5:
- remove nir_opt_global_to_local when rebase

Cc: Rob Clark <robdclark@gmail.com>
Cc: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Koen Kooi <koen@dominion.thruhere.net>
Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: marmeladema <xademax@gmail.com>
Signed-off-by: Paweł Chmiel <pawel.mikolaj.chmiel@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rohan Garg <rohan@garg.io>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
2019-04-11 09:57:53 +08:00