mirror of https://gitlab.freedesktop.org/mesa/mesa
OpenCL can generate large loads and stores that we can't support, so we need to lower them. We can load/store up to 128 bits in a single go. We currently only handle up to 32-bit components in the load and no more than vec4, so we split up accordingly. It's not clear to me what the requirements are for alignment on Valhall, so we conservatively generate aligned accesses; at worst there's a performance penalty in those cases. I think unaligned access is supported, but likely with a performance penalty of its own. So in the absence of hard data to the contrary, let's just use natural alignment. Oddly, this shaves off a tiny bit of ALU in a few compute shaders on Valhall, all in gfxbench. Seems to just be noise from the RA lottery. total instructions in shared programs: 2686768 -> 2686756 (<.01%) instructions in affected programs: 584 -> 572 (-2.05%) helped: 6 HURT: 0 Instructions are helped. total cvt in shared programs: 14644.33 -> 14644.14 (<.01%) cvt in affected programs: 5.77 -> 5.58 (-3.25%) helped: 6 HURT: 0 total quadwords in shared programs: 1455320 -> 1455312 (<.01%) quadwords in affected programs: 56 -> 48 (-14.29%) helped: 1 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22228> |
||
---|---|---|
.. | ||
bifrost | ||
test | ||
valhall | ||
ISA.xml | ||
Notes.txt | ||
README.md | ||
bi_builder.h.py | ||
bi_helper_invocations.c | ||
bi_layout.c | ||
bi_liveness.c | ||
bi_lower_divergent_indirects.c | ||
bi_lower_swizzle.c | ||
bi_opcodes.c.py | ||
bi_opcodes.h.py | ||
bi_opt_constant_fold.c | ||
bi_opt_copy_prop.c | ||
bi_opt_cse.c | ||
bi_opt_dce.c | ||
bi_opt_dual_tex.c | ||
bi_opt_mod_props.c | ||
bi_opt_push_ubo.c | ||
bi_packer.c.py | ||
bi_pressure_schedule.c | ||
bi_print.c | ||
bi_print_common.c | ||
bi_print_common.h | ||
bi_printer.c.py | ||
bi_quirks.h | ||
bi_ra.c | ||
bi_test.h | ||
bi_validate.c | ||
bifrost.h | ||
bifrost_compile.c | ||
bifrost_compile.h | ||
bifrost_isa.py | ||
bifrost_nir.h | ||
bifrost_nir_algebraic.py | ||
bir.c | ||
cmdline.c | ||
compiler.h | ||
gen_disasm.py | ||
meson.build | ||
nodearray.h |
README.md
Bifrost compiler
Register file
Defined partially in software, partially in hardware.
Blend shaders
R0 - R3: input (color #0)
R4 - R7: input (color #1)
R8 - R15: general purpose
R48: return address
Fragment
Anything live during BLEND must respect blend shader registers.
R0 - R3: preloaded (message #0)
R4 - R7: preloaded (message #1)
R57 - R63: preloaded (various)

R0 - R15: general purpose (full threads)
R48 - R63: general purpose (full threads)
R32 - R47: general purpose (half threads, or v6)
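
The register ranges listed so far can be written down as 64-bit masks, one bit per register, which makes the BLEND constraint easy to check. The following is only an illustrative sketch, not code from the compiler: every name in it (`reg_range`, `blend_regs`, `frag_gprs`, `safe_across_blend`) is hypothetical. It also assumes that "respect blend shader registers" means a value live across BLEND must not sit in any register the blend shader uses, and that R32 - R47 are available in addition to the full-thread ranges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mask with bits lo..hi (inclusive) set. */
static inline uint64_t
reg_range(unsigned lo, unsigned hi)
{
   assert(lo <= hi && hi <= 63);
   uint64_t upto_hi = (hi == 63) ? ~UINT64_C(0)
                                 : ((UINT64_C(1) << (hi + 1)) - 1);
   uint64_t below_lo = (UINT64_C(1) << lo) - 1;
   return upto_hi & ~below_lo;
}

/* Blend shader: R0-R7 inputs, R8-R15 general purpose, R48 return address. */
static inline uint64_t
blend_regs(void)
{
   return reg_range(0, 7) | reg_range(8, 15) | reg_range(48, 48);
}

/* Fragment shader preloads: R0-R7 (messages) and R57-R63 (various). */
static inline uint64_t
frag_preloaded(void)
{
   return reg_range(0, 7) | reg_range(57, 63);
}

/* Fragment general-purpose registers. Assumption: R32-R47 are an
 * extra range on top of the full-thread ranges (half threads or v6). */
static inline uint64_t
frag_gprs(bool half_threads_or_v6)
{
   uint64_t mask = reg_range(0, 15) | reg_range(48, 63);
   if (half_threads_or_v6)
      mask |= reg_range(32, 47);
   return mask;
}

/* Assumption: a value live across BLEND must avoid every register the
 * blend shader reads or writes. */
static inline bool
safe_across_blend(unsigned reg)
{
   assert(reg <= 63);
   return ((blend_regs() >> reg) & 1) == 0;
}
```

Under these assumptions, `safe_across_blend(49)` holds while `safe_across_blend(48)` and `safe_across_blend(8)` do not, since R48 and R8 belong to the blend shader.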