3434 Commits

Author SHA1 Message Date
Jonathan Marek 501c6e70d4 freedreno: update a2xx registers
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-28 18:21:16 -05:00
Emil Velikov 9cc8e12505 freedreno: automake: ship ir3_nir_trig.py in the tarball
Fixes: aa0fed10d3 ("freedreno: move ir3 to common location")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-24 18:13:24 +00:00
Karol Herbst 8bb46de08b mesa: add MESA_SHADER_KERNEL
used for CL kernels

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 20:36:41 +01:00
Karol Herbst 9b24028426 nir: rename nir_var_function to nir_var_function_temp
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Rob Clark 04aff7e42b freedreno: make cmdstream bo's read-only to GPU
If nothing else, this will make problems with cmdstream getting blit
over with pixels easier to track down (ie. faults when it first happens
rather than strange failures later from corrupted cmdstream when a
stateobj is later reused).

(NOTE this somewhat depends on the kernel supporting the flag, and the
iommu implementation.  But the worst case is just that the cmdstream
ends up writeable as before.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-01-10 14:21:39 -05:00
Bas Nieuwenhuizen 3fcec4a550 freedreno: Move register constant files to src/freedreno.
This way they can be shared. Build tested with meson, but not too sure
on the autotools stuff though.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Rob Clark <robdclark@gmail.com>
2019-01-08 21:46:14 +01:00
Chia-I Wu 3cb65cf8aa freedreno/drm: sync uapi again
"pad" was missing in Mesa's msm_drm.h.  sizeof(drm_msm_gem_info)
remains the same, but now the compiler initializes the field to
zero.

Buffer allocation results in EINVAL without this for me.

Cc: Rob Clark <robdclark@gmail.com>
Cc: Kristian Høgsberg <hoegsberg@gmail.com>
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
2019-01-08 19:55:28 +00:00
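
The zero-initialization effect described in 3cb65cf8aa can be sketched in C. The struct below is a hypothetical stand-in and not the real drm_msm_gem_info layout; every name other than "pad" is an assumption made for illustration. The point is that once "pad" is a named member, a designated initializer that omits it still sets it to zero, whereas the contents of implicit padding bytes are not guaranteed.

```c
#include <stdint.h>

/* Hypothetical request struct, NOT the real drm_msm_gem_info layout. */
struct example_gem_info {
    uint32_t handle;
    uint32_t pad;     /* explicit member where implicit padding used to be */
    uint64_t offset;
};

/* Only .handle is named here; C zero-initializes the remaining named
 * members, so .pad and .offset are both 0 in the returned value. */
static struct example_gem_info example_make_request(uint32_t handle)
{
    struct example_gem_info req = {
        .handle = handle,
    };
    return req;
}
```

The struct size is the same with or without the explicit member, which matches the note in the commit message that sizeof() does not change.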
Karol Herbst d0c6ef2793 nir: rename global/local to private/function memory
the naming is a bit confusing no matter how you look at it. Within SPIR-V
"global" memory is memory accessible from all threads. glsl "global" memory
normally refers to shader thread private memory declared at global scope. As
we already use "shared" for memory shared across all threads of a work group
the solution that everybody could be happy with is to rename "global" to
"private" and use "global" later for memory usually stored within system
accessible memory (be it VRAM or system RAM if keeping SVM in mind).
glsl "local" memory is memory only accessible within a function, while SPIR-V
"local" memory is memory accessible within the same workgroup.

v2: rename local to function as well
v3: rename vtn_variable_mode_local as well

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-08 18:51:46 +01:00
Rob Clark 6667dde098 freedreno/ir3: don't treat all inputs/outputs as vec4
This was a hold-over from the early TGSI days, and mostly not needed
with NIR.  This avoids burning an entire 4 consecutive scalar regs
for vec3 outputs, for example.  Which fixes a few places that we were
doing worse than we should on register usage.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-22 15:27:21 -05:00
Rob Clark 3453814622 freedreno/ir3: fix fallout of extra assert
Fixes the following crash that happened after d6110d4d

The problem happens if we first compile a "vanilla" shader with nothing
lowered in NIR, which performs the final lowering passes on
so->shader->nir (including nir_lower_locals_to_regs()), and then later
compile a shader with some lowering.  The second time through we would
have already done nir_lower_locals_to_regs().

Arguably this was already a bug, just one we hadn't noticed yet.

Fixes: d6110d4d54 intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-21 19:04:22 -05:00
Eduardo Lima Mitev c2ebc38052 freedreno/ir3: Handle GL_NONE in get_num_components_for_glformat()
An earlier patch that introduced the function failed to handle the case
where an image format layout qualifier is not specified, which is allowed
on desktop GL profiles. In these cases, nir_variable's image format is
GL_NONE, and we don't need to print a debug message for those.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-12-19 22:49:05 +01:00
Eduardo Lima Mitev 5820e63418 freedreno/ir3: Make imageStore use num components from image format
emit_intrinsic_store_image() is always using 4 components when
collecting registers for the value. When the image has fewer than
4 components (e.g., r32f, rg32i, etc.) this results in extra mov
instructions.

This patch uses the actual number of components from the image format.

For example, in a shader like:

layout (r32f, binding=0) writeonly uniform imageBuffer u_image;
...
void main(void) {
   ...
   imageStore (u_image, some_offset, vec4(1.0));
   ...
}

instruction count is reduced by at least 3 instructions (note image
format is r32f, 1 component only).

This obviously reduces register pressure as well.

v2: - Added support for image formats from NV_image_format extension
    (Ilia Mirkin).
    - Return 4 components by default instead of asserting. (Rob Clark).

v3: Added more missing formats (Ilia Mirkin).

v4: Added a debug message for unknown image formats (Rob Clark).

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-12-18 21:15:20 +01:00
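
A minimal sketch of the idea in 5820e63418, assuming a lookup from image format to component count. The enum and function names are hypothetical; the actual function named in the log is get_num_components_for_glformat(), and it works on GL format enums rather than the illustrative tags used here. The default of 4 follows the v2 note about returning 4 components instead of asserting.

```c
/* Hypothetical format tags for illustration only. */
enum example_image_format {
    EXAMPLE_FORMAT_NONE,     /* no layout qualifier given */
    EXAMPLE_FORMAT_R32F,
    EXAMPLE_FORMAT_RG32I,
    EXAMPLE_FORMAT_RGBA32F,
    EXAMPLE_FORMAT_UNKNOWN,
};

/* Number of value components an image store needs to collect. */
static unsigned example_num_components(enum example_image_format fmt)
{
    switch (fmt) {
    case EXAMPLE_FORMAT_R32F:
        return 1;
    case EXAMPLE_FORMAT_RG32I:
        return 2;
    case EXAMPLE_FORMAT_RGBA32F:
        return 4;
    default:
        /* Unspecified or unknown format: fall back to a full vec4. */
        return 4;
    }
}
```

With an r32f image, only one register has to be collected for the stored value, so the extra mov instructions for the three unused components are not needed.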
Ian Romanick 378f996771 nir/opt_peephole_select: Don't peephole_select expensive math instructions
On some GPUs, especially older Intel GPUs, some math instructions are
very expensive.  On those architectures, don't reduce flow control to a
csel if one of the branches contains one of these expensive math
instructions.

This prevents a bunch of cycle count regressions on pre-Gen6 platforms
with a later patch (intel/compiler: More peephole select for pre-Gen6).

v2: Remove stray #if block.  Noticed by Thomas.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00
Ian Romanick 09b7e1d8e4 nir/opt_peephole_select: Don't try to remove flow control around indirect loads
That flow control may be trying to avoid invalid loads.  On at least
some platforms, those loads can also be expensive.

No shader-db changes on any Intel platform (even with the later patch
"intel/compiler: More peephole select").

v2: Add an 'indirect_load_ok' flag to nir_opt_peephole_select.  Suggested
by Rob.  See also the big comment in src/intel/compiler/brw_nir.c.

v3: Use nir_deref_instr_has_indirect instead of deref_has_indirect (from
nir_lower_io_arrays_to_elements.c).

v4: Fix inverted condition in brw_nir.c.  Noticed by Lionel.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00
Jason Ekstrand 11dc130779 nir: Add a bool to int32 lowering pass
We also enable it in all of the NIR drivers.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand 80e8dfe9de nir: Rename Boolean-related opcodes to include 32 in the name
This is a squash of a bunch of individual changes:

    nir/builder: Generate 32-bit bool opcodes transparently

    nir/algebraic: Remap Boolean opcodes to the 32-bit variant

    Use 32-bit opcodes in the NIR producers and optimizations

        Generated with a little hand-editing and the following sed commands:

        sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c
        sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c
        sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c
        sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c
        sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c
        sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c

    Use 32-bit opcodes in the NIR back-ends

        Generated with a little hand-editing and the following sed commands:

        sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c
        sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c
        sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c
        sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c
        sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c
        sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Rob Clark cca1e9606c freedreno/ir3: don't remove unused input components
Fixes: 0d240c2214 freedreno/ir3: don't fetch unused tex components
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark c19c4bf488 freedreno/ir3: fix crash
Fixes a crash in dEQP-GLES3.functional.shaders.fragdepth.compare.fragcoord_z

Fixes: 0d240c2214 freedreno/ir3: don't fetch unused tex components
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark 4cd016b5d6 freedreno: debug GEM obj names
With a recent enough kernel, set debug names for GEM BOs, which will
show up in $debugfs/gem

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark 7ef722861b freedreno/drm: sync uapi and enable softpin
Pull in updated UAPI and use kernel API version to enable softpin.
Since MSM_SUBMIT_BO_DUMP flag was added at same time, use that to
signal to kernel that cmdstream buffers are useful to dump for
debugging/cmdstream-traces.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Neil Roberts 8600aa35bd freedreno: Add .dir-locals to the common directory
The commit aa0fed10d3 moved a bunch of Freedreno code to a common
directory. The previous directory had a .dir-locals file for Emacs.
This patch copies it to the new directory as well.

Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-12-11 13:14:08 -08:00
Kristian H. Kristensen 9578dde1c8 freedreno: Fix the Makefile.am fix
Commit b028ce29f0 fixed a typo in
src/freedreno/Makefile.am, but ended up breaking the build for
freedreno.  The typo inadvertently made things work, as we were not
supposed to link with libnir or libmesautil to begin with.  Those come
in through libmesagallium and the typo prevented the duplicated
linkage.

Fixes: b028ce29f ("freedreno: add the missing _la in libfreedreno_ir3_la")
Cc: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 14:28:09 -08:00
Emil Velikov b028ce29f0 freedreno: add the missing _la in libfreedreno_ir3_la
Fixes: aa0fed10d3 ("freedreno: move ir3 to common location")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 16:35:01 +00:00
Emil Velikov b30e37ec64 freedreno: drop duplicate MKDIR_GEN declaration
Fixes: aa0fed10d3 ("freedreno: move ir3 to common location")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-10 16:35:01 +00:00
Rob Clark d014af98b7 freedreno/drm: fix memory leak
Fix an embarrassing memory leak with the non-softpin submit/rb
implementation.

Fixes: f3cc0d2747 freedreno: import libdrm_freedreno + redesign submit
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 14:12:12 -05:00
Rob Clark 5c2c1f0a2d freedreno/ir3: track max flow control depth for a5xx/a6xx
Rather than just hard-coding BRANCHSTACK size.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
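
One way to picture what 5c2c1f0a2d tracks is a recursive walk that returns the deepest nesting level of control flow. This is only a sketch under stated assumptions: the node type and function below are hypothetical, and the real compiler walks NIR control flow rather than a generic tree like this one.

```c
#include <stddef.h>

/* Hypothetical control-flow node: each child is a nested if/loop body. */
struct example_cf_node {
    const struct example_cf_node *children;
    size_t num_children;
};

/* Depth of the deepest nesting chain starting at (and counting) node. */
static unsigned example_max_depth(const struct example_cf_node *node)
{
    unsigned deepest = 0;

    for (size_t i = 0; i < node->num_children; i++) {
        unsigned d = example_max_depth(&node->children[i]);
        if (d > deepest)
            deepest = d;
    }
    return 1 + deepest;
}
```

A value computed this way can size the branch stack per shader instead of relying on a fixed constant.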
Rob Clark 9517037bdc freedreno/ir3: code-motion
Split up ir3_compiler_nir.c a bit before starting to add new stuff for
a6xx SSBO/image instructions.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark e37351fa57 freedreno/ir3: sync instr/disasm
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Rob Clark 0d240c2214 freedreno/ir3: don't fetch unused tex components
Detect when a component of (for example) a texture fetch is unused and
propagate the updated wrmask back to the parent instruction.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
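
The write mask mentioned in 0d240c2214 can be thought of as one bit per result component. The helper below is a hypothetical sketch of building such a mask from per-component usage; it is not the ir3 implementation, and the function name and the fixed size of 4 are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Set bit i of the mask when component i of the result is actually read. */
static uint8_t example_used_wrmask(const bool used[4])
{
    uint8_t mask = 0;

    for (unsigned i = 0; i < 4; i++) {
        if (used[i])
            mask |= (uint8_t)(1u << i);
    }
    return mask;
}
```

If only .x and .w of a fetched texel are read, the mask comes out as 0x9, and the parent instruction then only needs to produce those two components.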
Rob Clark e779725f0b freedreno/drm: fix relocs in nested stateobjs
If we have a reloc from stateobjA to stateobjB, we would previously
leave stateobjB's bos out of the submit's bos table.  Handle this case
by copying into stateobjA's reloc_bos table.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-07 13:49:21 -05:00
Jason Ekstrand dca6cd9ce6 nir: Make boolean conversions sized just like the others
Instead of a single i2b and b2i, we now have i2b32 and b2iN where N is
one of 8, 16, 32, or 64.  This leads to having a few more opcodes but
now everything is consistent and booleans aren't a weird special case
anymore.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:03:07 -06:00
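
The sized conversions named in dca6cd9ce6 can be illustrated with plain C functions. This is a sketch, not NIR code: the function names are hypothetical, and it assumes that a 32-bit boolean is represented as 0 for false and all-ones for true.

```c
#include <stdint.h>

/* Assumption: 32-bit booleans are 0 (false) or all-ones (true). */
#define EXAMPLE_TRUE32  UINT32_C(0xffffffff)
#define EXAMPLE_FALSE32 UINT32_C(0)

/* i2b32: any non-zero integer becomes a 32-bit true. */
static uint32_t example_i2b32(int32_t value)
{
    return value != 0 ? EXAMPLE_TRUE32 : EXAMPLE_FALSE32;
}

/* b2i8 and b2i64: the same boolean converted to integers of two sizes. */
static int8_t example_b2i8(uint32_t b)
{
    return b != 0 ? 1 : 0;
}

static int64_t example_b2i64(uint32_t b)
{
    return b != 0 ? 1 : 0;
}
```

Having one conversion per destination size means the result width is stated by the opcode itself rather than implied by context.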
Jonathan Marek e68cd91251 freedreno: use MSM_BO_SCANOUT with scanout buffers
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2018-11-27 15:44:03 -05:00
Rob Clark aa0fed10d3 freedreno: move ir3 to common location
Move (most of) the ir3 compiler to src/freedreno/ir3 so that it can be
re-used by some future vulkan driver.  The parts that are gallium
specific have been refactored out and remain in the gallium driver.

Getting the move done now so that it can happen before further
refactoring to support a6xx specific instructions.

NOTE also removes ir3_cmdline compiler tool from autotools build since
that was easier than fixing it and I normally use meson build.  Waiting
patiently for the day that we can remove *everything* from the autotools
build.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00
Rob Clark b4476138d5 freedreno: move drm to common location
So that we can re-use at least parts of it for vulkan driver, and so
that we can move ir3 to a common location (which uses fd_bo to allocate
storage for shaders)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-11-27 15:44:02 -05:00