Commit Graph

31152 Commits

Author SHA1 Message Date
Marek Olšák 2beb31bd7c radeonsi/gfx9: compile shaders with +xnack
so that LLVM doesn't allocate SGPRs where XNACK is.

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-22 19:23:39 +02:00
Rhys Kidd 499f45163a vc4: Remove dead code in vc4_dump_surface_msaa()
Coverity caught the use of dead code copy-paste for
found_colors[] and num_found_colors.

CID: 1341850
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-22 09:50:22 -07:00
John Brooks bf4d7671f4 driconf: Add allow_glsl_builtin_variable_redeclaration option
This option will allow GLSL builtins to be redeclared verbatim (e.g.
redeclaring "in int gl_VertexID" in a vertex shader). This is not strictly
valid and would normally fail to compile, but some applications (such as
newer Techland ports) do it and need more leniency.

v2 (Samuel Pitoiset):
    - Rename allow_glsl_builtin_redeclaration ->
      allow_glsl_builtin_variable_redeclaration

Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-20 17:29:55 +02:00
Ilia Mirkin 61d8f3387d nv50,nvc0: clear index buffer bufctx bin unconditionally
The previous condition was to clear it out if it had previously been
set, not what's in the current draw. That information is gone now, so
just clear it unconditionally.

Fixes: 330d0607e ("gallium: remove pipe_index_buffer and set_index_buffer")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-05-20 04:20:11 -04:00
Ilia Mirkin 85d2186326 nv50: fix vtxbuf cleanup
Use a user-buffer-aware cleanup function.

Fixes: c24c3b94ed ("gallium: decrease the size of pipe_vertex_buffer - 24 -> 16 bytes")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-05-20 04:20:11 -04:00
Ilia Mirkin 82e77d4e44 nvc0/ir: SHLADD's middle source must be an immediate
The instruction encodings only allow for immediates. Don't try to
replace a zero (which is dumb to have in that op in any case) with RZ.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-05-20 03:12:40 -04:00
Emil Velikov 5233eaf9ee automake: add SWR LLVM gen_builder.hpp workaround
As gen_builder.hpp file is generated, it contains information that is
specific to the LLVM version it originates from.

As suggested by Tim, the file seems to be forwards compatible. So in
order to produce ship a file which will work everywhere we should be
using earlies supported LLVM - 3.9.

With this we're back on track and can build all of mesa without
python/mako/flex and friends.

In the long term we might want to see if the python generators can be
updated to produce LLVM version agnostic files. At least within the
range supported by SWR.

Cc: <mesa-stable@lists.freedesktop.org>
Cc: Chuck Atkins <chuck.atkins@kitware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-05-20 00:12:56 +01:00
Emil Velikov 912f24fd32 st/xvmc: add DRI3 support
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-05-19 19:46:54 +01:00
Emil Velikov fdc90e1286 st/omx: add DRI3 support
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>
2017-05-19 19:46:54 +01:00
Emil Velikov fcbedce310 gallium/targets: link against XCB only as needed
OMX and VA can optionally use the X11 DRI2/DRI3, thus we should link
only as required.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-19 19:46:54 +01:00
Emil Velikov 115cb729d8 st/omx: fix building against X11-less setups
The vl_*_screen_create API properly falls back to a NOP when we're
building without specific platforms. So the only thing we need is to
handle the lack of X11/Xlib.h and provide a dummy Display define.

Cc: <mesa-stable@lists.freedesktop.org>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-19 19:46:49 +01:00
Emil Velikov d71ce62e84 st/omx: remove unneeded X11 include
En route to a X11-less builds

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-19 19:46:48 +01:00
Emil Velikov 8b9868ad4c st/omx: remove unused drm_driver.h includes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-19 19:46:47 +01:00
Emil Velikov 28703d605d st/va: check if vl_*_screen_create has failed only once
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-19 19:46:46 +01:00
Emil Velikov aaea53c2c0 st/va: fix misplaced closing bracket
It's been like this since the code was introduced.

Fixes: 86eb4131a9 (st/va: add headless support, i.e. VA_DISPLAY_DRM)
Cc: <mesa-stable@lists.freedesktop.org>
Cc: Julien Isorce <julien.isorce@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-19 19:46:46 +01:00
Emil Velikov c34a008891 st/va: move variable declaration to where its used
... and make it const, since we shouldn't tinker with it.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-19 19:46:46 +01:00
Emil Velikov 369e5dd939 auxiliary/vl: use vl_*_screen_create stubs when building w/o platform
Provide a dummy stub when the user has opted w/o said platform, thus
we can build the binaries without unnecessarily requiring X11/other
headers.

In order to avoid build and link-time issues, we remove the HAVE_DRI3
guards in the VA and VDPAU state-trackers.

With this change st/va will return VA_STATUS_ERROR_ALLOCATION_FAILED
instead of VA_STATUS_ERROR_UNIMPLEMENTED. That is fine since upstream
users of libva such as vlc and mpv do little error checking, let
alone distinguish between the two.

Cc: Leo Liu <leo.liu@amd.com>
Cc: Guttula, Suresh <Suresh.Guttula@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-19 19:46:41 +01:00
Emil Velikov acf3d2afab configure: check once for DRI3 dependencies
Currently we are having the XCB_DRI3 dependencies duplicated,
partially.

Just do a once-off check and add all of the respective CFLAGS/LIBS
where needed.

As a nice side effect this helps us solve a couple of FIXMEs.

DRI3 is not a thing w/o X11 so disable it in such cases.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-05-19 19:44:15 +01:00
Rob Herring de6f3cce8c Android: r600: fix build when LLVM is disabled
There's still an error after my recent clean-up if LLVM is not patched to
enable AMDGPU target:

external/mesa3d/src/amd/common/ac_llvm_util.c:38:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetInfo' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        LLVMInitializeAMDGPUTargetInfo();
        ^
external/mesa3d/src/amd/common/ac_llvm_util.c:39:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTarget' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        LLVMInitializeAMDGPUTarget();
        ^
external/mesa3d/src/amd/common/ac_llvm_util.c:40:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetMC' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        LLVMInitializeAMDGPUTargetMC();
        ^
external/mesa3d/src/amd/common/ac_llvm_util.c:41:2: error: implicit declaration of function 'LLVMInitializeAMDGPUAsmPrinter' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        LLVMInitializeAMDGPUAsmPrinter();
        ^

We need to drop libmesa_amd_common when LLVM is disabled, however there's
still a dependency on include paths for ac_binary.h. So explicitly add the
include path when LLVM is disabled.

Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-19 19:03:08 +01:00
Rob Herring 5771ecc90e virgl: fix virgl_bo_transfer_{put, get} box struct copy
Commit 3dfe61ed6e ("gallium: decrease the size of pipe_box - 24 -> 16
bytes") changed the size of pipe_box, but the virgl code was relying on
pipe_box and drm_virtgpu_3d_box structs having the same size/layout doing
a struct copy. Copy the fields one by one instead.

Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Fixes: 3dfe61ed6e ("gallium: decrease the size of pipe_box - 24 -> 16 bytes")
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-19 19:02:32 +01:00
Marek Olšák 807e1d2577 radeonsi/gfx9: use CE RAM optimally
On GFX9 with only 4K CE RAM, define the range of slots that will be
allocated in CE RAM. All other slots will be uploaded directly. This will
switch dynamically according to which slots are used by current shaders.

GFX9 CE usage should now be similar to VI instead of being often disabled.

Tested on VI by taking the GFX9 CE allocation codepath and setting
num_ce_slots = 2 everywhere to get frequent switches between both modes.
CE is still disabled on GFX9.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák 1cde473ec0 radeonsi: remove CE offset alignment restriction
This was only needed by LOAD_CONST_RAM, which is now only used to load
whole CE.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák a7f098fb76 radeonsi: only upload (dump to L2) those descriptors that are used by shaders
This decreases the size of CE RAM dumps to L2, or the size of descriptor
uploads without CE.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák 53c2ef36da radeonsi: record which descriptor slots are used by shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák 38828094e9 radeonsi: update si_ce_needed_cs_space
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák edb59ef2dc radeonsi: do only 1 big CE dump at end of IBs and one reload in the preamble
A later commit will only upload descriptors used by shaders, so we won't do
full dumps anymore, so the only way to have a complete mirror of CE RAM
in memory is to do a separate dump after the last draw call.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák 06690e63f7 radeonsi: remove early return in si_upload_descriptors
All updates of descriptors_dirty also set dirty_mask, so the return is
unnecessary. The next commit will want this function to be executed
even if dirty_mask == 0.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák b8f8d9e46c radeonsi: clamp indirect index to the number of declared shader resources
We'll do partial uploads of descriptor arrays, so we need to clamp
against what shaders declare.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák f07c15ef80 radeonsi: merge sampler and image descriptor lists into one
Sampler slots: slot[8], .. slot[39] (ascending)
Image slots: slot[7], .. slot[0] (descending)

Each image occupies 1/2 of each slot, so there are 16 images in total,
therefore the layout is: slot[15], .. slot[0]. (in 1/2 slot increments)

Updating image slot 2n+i (i <= 1) also dirties and re-uploads slot 2n+!i.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák 5df24c3fa6 radeonsi: merge constant and shader buffers descriptor lists into one
Constant buffers: slot[16], .. slot[31] (ascending)
Shader buffers: slot[15], .. slot[0] (descending)

The idea is that if we have 4 constant buffers and 2 shader buffers, we only
have to upload 6 slots. That optimization is left for a later commit.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák d88ca12350 gallium/u_threaded: add a fast path for unbinding shader buffers
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák d4c8f429d1 gallium/u_threaded: add a fast path for unbinding shader images
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Samuel Pitoiset 1468e29e02 radeonsi: get the sampler view type from inst->Texture for TG4
This will also magically fix this special lowering for
bindless samplers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 21:48:16 +02:00
Samuel Pitoiset 5cb2eee557 tgsi: store the sampler view type directly in the instruction
RadeonSI needs to do a special lowering for Gather4 with integer
formats, but with bindless samplers we just can't access the index.

Instead, store the return type in the instruction like the target.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 21:48:16 +02:00
Samuel Pitoiset ac3f6bf608 tgsi: remove some unused OPCODE macros
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 21:48:16 +02:00
Tom Stellard 14e525a4d7 gallivm: Make sure module has the correct data layout when pass manager runs
The datalayout for modules was purposely not being set in order to work around
the fact that the ExecutionEngine requires that the module's datalayout
matches the datalayout of the TargetMachine that the ExecutionEngine is
using.

When the pass manager runs on a module with no datalayout, it uses
the default datalayout which is little-endian.  This causes problems
on big-endian targets, because some optimizations that are legal on
little-endian or illegal on big-endian.

To resolve this, we set the datalayout prior to running the pass
manager, and then clear it before creating the ExectionEngine.

This patch fixes a lot of piglit tests on big-endian ppc64.

Cc: mesa-stable@lists.freedesktop.org
2017-05-18 17:52:47 +00:00
Nicolai Hähnle 6c01c4b907 ac: add radeon_info::num_{sdma,compute}_rings
Vulkan needs them.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:53 +02:00
Nicolai Hähnle 98a2492290 ac_surface: use radeon_info from ac_gpu_info
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle 988c866212 ac/radeonsi: move radeon_info initialization to amd/common
v2: update Android.common.mk (Emil)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle de9dd4f9f1 ac/radeonsi: move struct radeon_info to ac_gpu_info.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle 4d6e75776d ac/radeonsi: move some aspects of sanity checking to ac_surface
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle 00f466bad9 ac/radeonsi: add ac_compute_surface to automatically switch gfx6 vs. gfx9
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle 8aabed64c3 ac/radeonsi: move the bulk of gfx9_surface_init to ac_surface
We can now merge the two *_surface_init functions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:51 +02:00
Nicolai Hähnle db77cd879b ac/radeonsi: move the bulk of gfx6_surface_init to ac_surface
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:51 +02:00
Nicolai Hähnle f187a49322 ac/radeonsi: move amdgpu_addr_create to ac_surface
v2:
- update Android.common.mk (Emil)
- rebase on top of Raven support

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2017-05-18 11:48:51 +02:00
Nicolai Hähnle 15a844986a ac/radeonsi: move surface definitions to new header ac_surface.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:51 +02:00
Eric Anholt e8ea42d245 vc4: Don't allocate new BOs to avoid synchronization when they're shared.
If X11 did a software fallback to the entire screen, we would throw out
the BO the screen is scanning out from and allocate a new one.

Cc: mesa-stable@lists.freedesktop.org
2017-05-17 14:18:29 -07:00
Eric Anholt 50e78cd04f vc4: Drop pointless indirections around BO import/export.
I've since found them to be more confusing by adding indirections than
clarifying by screening off resources from the handle/fd import/export
process.
2017-05-17 14:18:26 -07:00
Eric Anholt 76e4ab5715 vc4: Drop the u_resource_vtbl no-op layer.
We only ever attached one vtbl, so it was a waste of space and
indirections.
2017-05-17 14:18:26 -07:00
Marek Olšák bd4b224fa6 gallium/radeon: use a top-of-pipe timestamp for the start of TIME_ELAPSED
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 20:28:44 +02:00