Commit Graph

96883 Commits

Author SHA1 Message Date
Eric Engestrom 47273d7312 egl: set UseFallback if LIBGL_ALWAYS_SOFTWARE is set
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-18 17:25:41 +01:00
Eric Engestrom 6414d6bd8d egl: drop EGL driver `name`
The "DRI2" name was reported as confusing when printing EGL infos (one
user reported thinking DRI3 was not working on his X server), and the
only alternative is Haiku, which can only be used on a Haiku machine.

The name therefore doesn't add any information that the user wouldn't
know already, so let's just drop it.

Cc: Kai Wasserbäch <kai@dev.carbon-project.org>
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Related-to: b174a1ae72 ("egl: Simplify the "driver" interface")
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-18 17:25:41 +01:00
Eric Engestrom d7e769abec egl: drop always-false TestOnly option
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-18 17:25:41 +01:00
Nicholas Miell 3012885b3f Fix the xf86vm meson dependency
The pkg-config file is called xxf86vm.

Signed-off-by: Nicholas Miell <nmiell@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-18 17:25:41 +01:00
Eric Engestrom 8cb84c8477 egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}
Note: dropping the EGL_BAD_ALLOC in egl_haiku because it's
overwritten by the EGL_NOT_INITIALIZED in eglInitialize().

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-18 17:25:41 +01:00
Eric Engestrom 4893673b15 egl_dri2: drop dri2_egl_driver struct
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-18 17:25:41 +01:00
Eric Engestrom 7823cfe9fe egl_dri2: move glFlush out of struct dri2_egl_driver
There's no reason to store this there, it doesn't depend on the driver.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-18 17:25:41 +01:00
Roland Scheidegger 3d0deed12a llvmpipe: handle shader sample mask output
This probably isn't all that useful for GL, but there are apis where
sample_mask is a valid output even without msaa.
Just discard the pixel if the sample_mask doesn't include the bit for
sample 0.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-10-18 18:16:44 +02:00
Vinson Lee c5124fbc74 anv: Fix instance typos.
Fix build error.

  CC       vulkan/vulkan_libvulkan_common_la-anv_device.lo
In file included from vulkan/anv_device.c:33:0:
vulkan/anv_device.c: In function ‘anv_AllocateMemory’:
vulkan/anv_device.c:1562:37: error: ‘struct anv_device’ has no member named ‘instace’; did you mean ‘instance’?
          result = vk_errorf(device->instace, device,
                                     ^
vulkan/anv_private.h:317:17: note: in definition of macro ‘vk_errorf’
     __vk_errorf(instance, obj, REPORT_OBJECT_TYPE(obj), error,\
                 ^~~~~~~~

Fixes: 9775894f10 ("anv: Move size check from anv_bo_cache_import() to caller (v2)")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-18 09:08:08 -07:00
Brian Paul e17aa6cd9d mesa: fix trivial typo in _mesa_PixelMapusv() error string
Signed-off-by: Brian Paul <brianp@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103323
2017-10-18 09:53:00 -06:00
Eric Engestrom 2515eb63f8 meson: move expat dependency where it's needed
Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-18 14:27:20 +01:00
Hongxu Jia 05fc62d89f automake: intel: move expat handling where it's used
Linking libvulkan_intel.so can fail, due to unresolved references to
libexpat.so.

EXPAT_CFLAGS should be moved as well.

Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-18 14:27:20 +01:00
Timothy Arceri e5e9e21e9f radv: don't create dummy fs when compiling compute stage
Fixes: d1c9f30d7f "radv: add radv_create_shaders() helper"

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 22:47:53 +11:00
Samuel Pitoiset e6b9abf294 radv: use the dispatch initiator for indirect dispatches
Missed that when I allowed waves to be launched out-of-order.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 11:22:41 +02:00
Samuel Pitoiset 095e709717 radv: remove XtoY_temps structs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 11:22:39 +02:00
Tapani Pälli 6ef9bea734 anv: Install as Vulkan HAL module in Android.mk build
Now that anvil fully implements the Vulkan HAL interface, we can install
it as the vendor HAL module at /vendor/lib/hw/vulkan.${board}.so. To do
so:

  - Rename LOCAL_MODULE to vulkan.$(TARGET_BOARD_PLATFORM).
  - Use LOCAL_PROPRIETARY_MODULE to install under vendor path.

Tested by running different Sascha Williams demos on Android-IA.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
[chadv: Extract this hunk from Tapani's patch, and embed it as
 stand-alone patch in my arc-vulkan series].
Signed-off-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-18 00:23:38 -07:00
Chad Versace 053d4c328f anv: Implement VK_ANDROID_native_buffer (v9)
This implementation is correct (afaict), but takes two shortcuts
regarding the import/export of Android sync fds.

  Shortcut 1. When Android calls vkAcquireImageANDROID to import a sync
  fd into a VkSemaphore or VkFence, the driver instead simply blocks on
  the sync fd, then puts the VkSemaphore or VkFence into the signalled
  state. Thanks to implicit sync, this produces correct behavior (with
  extra latency overhead, perhaps) despite its ugliness.

  Shortcut 2. When Android calls vkQueueSignalReleaseImageANDROID to export
  a collection of wait semaphores as a sync fd, the driver instead
  submits the semaphores to the queue, then returns sync fd -1, which
  informs the caller that no additional synchronization is needed.
  Again, thanks to implicit sync, this produces correct behavior (with
  extra batch submission overhead) despite its ugliness.

I chose to take the shortcuts instead of properly importing/exporting
the sync fds for two reasons:

  Reason 1. I've already tested this patch with dEQP and with demos
  apps. It works. I wanted to get the tested patches into the tree now,
  and polish the implementation afterwards.

  Reason 2. I want to run this on a 3.18 kernel (gasp!). In 3.18, i915
  supports neither Android's sync_fence, nor upstream's sync_file, nor
  drm_syncobj. Again, I tested these patches on Android with a 3.18
  kernel and they work.

I plan to quickly follow-up with patches that remove the shortcuts and
properly import/export the sync fds.

Non-Testing
===========
I did not test at all using the Android.mk buildsystem. I may have broke
it. Please test and review that.

Testing
=======
I tested with 64-bit ARC++ on a Skylake Chromebook and a 3.18 kernel.
The following pass (as of patchset v9):

  - a little spinning cube demo APK
  - several Sascha demos
  - dEQP-VK.info.*
  - dEQP-VK.api.wsi.android.*
      (except dEQP-VK.api.wsi.android.swapchain.*.image_usage, because
      dEQP wants to create swapchains with VK_IMAGE_USAGE_STORAGE_BIT)
  - dEQP-VK.api.smoke.*
  - dEQP-VK.api.info.instance.*
  - dEQP-VK.api.info.device.*

v2:
  - Reject VkNativeBufferANDROID if the dma-buf's size is too small for
    the VkImage.
  - Stop abusing VkNativeBufferANDROID by passing it to vkAllocateMemory
    during vkCreateImage. Instead, directly import its dma-buf during
    vkCreateImage with anv_bo_cache_import(). [for jekstrand]
  - Rebase onto Tapani's VK_EXT_debug_report changes.
  - Drop `CPPFLAGS += $(top_srcdir)/include/android`. The dir does not
    exist.

v3:
  - Delete duplicate #include "anv_private.h". [per Tapani]
  - Try to fix the Android-IA build in Android.vulkan.mk by following
    Tapani's example.

v4:
  - Unset EXEC_OBJECT_ASYNC and set EXEC_OBJECT_WRITE on the imported
    gralloc buffer, just as we do for all other winsys buffers in
    anv_wsi.c. [found by Tapani]

v5:
  - Really fix the Android-IA build by ensuring that Android.vulkan.mk
    uses Mesa' vulkan.h and not Android's.  Insert -I$(MESA_TOP)/include
    before -Iframeworks/native/vulkan/include. [for Tapani]
  - In vkAcquireImageANDROID, submit signal operations to the
    VkSemaphore and VkFence. [for zhou]

v6:
  - Drop copy-paste duplication in vkGetSwapchainGrallocUsageANDROID().
    [found by zhou]
  - Improve comments in vkGetSwapchainGrallocUsageANDROID().

v7:
  - Fix vkGetSwapchainGrallocUsageANDROID() to inspect its
    VkImageUsageFlags parameter. [for tfiga]
  - This fix regresses dEQP-VK.api.wsi.android.swapchain.*.image_usage
    because dEQP wants to create swapchains with
    VK_IMAGE_USAGE_STORAGE_BIT.

v8:
  - Drop unneeded goto in vkAcquireImageANDROID. [for tfiga]

v8.1: (minor changes)
  - Drop errant hunks added by rerere in anv_device.c.
  - Drop explicit mention of VK_ANDROID_native_buffer in
    anv_entrypoints_gen.py. [for jekstrand]

v9:
  - Isolate as much Android code as possible, moving it from anv_image.c
    to anv_android.c. Connect the files with anv_image_from_gralloc().
    Remove VkNativeBufferANDROID params from all anv_image.c
    funcs. [for krh]
  - Replace some intel_loge() with vk_errorf() in anv_android.c.
  - Use © in copyright line. [for krh]

Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v5)
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> (v9)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v9)
Cc: zhoucm1 <david1.zhou@amd.com>
Cc: Tomasz Figa <tfiga@chromium.org>
2017-10-18 00:23:38 -07:00
Chad Versace 9775894f10 anv: Move size check from anv_bo_cache_import() to caller (v2)
This change prepares for VK_ANDROID_native_buffer. When the user imports
a gralloc hande into a VkImage using VK_ANDROID_native_buffer, the user
provides no size. The driver must infer the size from the internals of
the gralloc buffer.

The patch is essentially a refactor patch, but it does change behavior
in some edge cases, described below. In what follows, the "nominal size"
of the bo refers to anv_bo::size, which may not match the bo's "actual
size" according to the kernel.

Post-patch, the nominal size of the bo returned from
anv_bo_cache_import() is always the size of imported dma-buf according
to lseek(). Pre-patch, the bo's nominal size was difficult to predict.
If the imported dma-buf's gem handle was not resident in the cache, then
the bo's nominal size was align(VkMemoryAllocateInfo::allocationSize,
4096).  If it *was* resident, then the bo's nominal size was whatever
the cache returned. As a consequence, the first cache insert decided the
bo's nominal size, which could be significantly smaller compared to the
dma-buf's actual size, as the nominal size was determined by
VkMemoryAllocationInfo::allocationSize and not lseek().

I believe this patch cleans up that messy behavior. For an imported or
exported VkDeviceMemory, anv_bo::size should now be the true size of the
bo, if I correctly understand the problem (which I possibly don't).

v2:
  - Preserve behavior of aligning size to 4096 before checking. [for
    jekstrand]
  - Check size with < instead of <=, to match behavior of commit c0a4f56
    "anv: bo_cache: allow importing a BO larger than needed". [for
    chadv]
2017-10-17 23:46:06 -07:00
Dylan Baker fbf39fd7c3 meson: turn on pl111 not vc4 when pl111 driver specificed
Reviewed-by: Eric Anholt <eric@anholt.net>
fixes: 1918c9b162 ("meson: Add support for the pl111 driver.")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-10-17 15:34:35 -07:00
Bas Nieuwenhuizen 06f05040eb radv: Link shaders.
Here we make use of NIR the linking helpers to remove unused
varyings.

Sascha Willems demo results:

computecullandlod 39 -> 41 fps
pipelines ~6100 -> ~6200 fps

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-10-18 09:19:35 +11:00
Timothy Arceri dbbf10541b radv: reuse the multiple shader store & load functions for gs copy variant
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 09:19:35 +11:00
Timothy Arceri 351f9dde60 radv: remove some now unused shader compile code
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 09:19:35 +11:00
Timothy Arceri 7d45d22fdd radv: switch to using radv_create_shaders()
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 09:19:35 +11:00
Bas Nieuwenhuizen d1c9f30d7f radv: add radv_create_shaders() helper
This is a combined shader creation helper than will help us to
create the shaders for each stage at once. This will allow us to
do some link time optimisations.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-10-18 09:19:35 +11:00
Bas Nieuwenhuizen ed9218f154 radv: add radv_hash_shaders() helper
This will be used to create a hash of the combined shaders in the
pipeline.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-10-18 09:19:35 +11:00
Bas Nieuwenhuizen 7f29055751 radv: Add multiple shader cache store & load functions.
Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-10-18 09:19:35 +11:00
Bas Nieuwenhuizen 670c02b430 radv: Change cache datastructures for combined pipelines.
Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-10-18 09:19:35 +11:00
Timothy Arceri 56998558f4 radv: reorder init function calls
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 09:19:35 +11:00
Eric Anholt 4f3e380fa0 meson: Add support for the vc5 driver.
v2: Default vc5 to off, since it requires the simulator currently.  Add
    missing dep on the XML generation from libbroadcom_vc5.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v1)
2017-10-17 13:41:59 -07:00
Eric Anholt 1918c9b162 meson: Add support for the pl111 driver.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-17 13:41:59 -07:00
Eric Anholt 1ae8018a6a meson: Add support for the vc4 driver.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-17 13:41:59 -07:00
Marek Olšák 2f4705afde radeonsi: if there's just const buffer 0, set it in place of CONST/SSBO pointer
SI_SGPR_CONST_AND_SHADER_BUFFERS now contains the pointer to const buffer 0
if there is no other buffer there.

Benefits:
- there is no constbuf descriptor upload and shader load

It's assumed that all constant addresses are within bounds. Non-constant
addresses are clamped against the last declared CONST variable.
This only works if the state tracker ensures the bound constant buffer
matches what the shader needs.

Once we get 32-bit pointers, we can only do this for user constant buffers
where the driver is in charge of the upload so that it can guarantee a 32-bit
address.

The real performance benefit might not be measurable.

These apps get 100% theoretical benefit in all shaders (except where noted):
- antichamber
- barman arkham origins
- borderlands 2
- borderlands pre-sequel
- brutal legend
- civilization BE
- CS:GO
- deadcore
- dota 2 -- most shaders
- europa universalis
- grid autosport -- most shaders
- left 4 dead 2
- legend of grimrock
- life is strange
- payday 2
- portal
- rocket league
- serious sam 3 bfe
- talos principle
- team fortress 2
- thea
- unigine heaven
- unigine valley -- also sanctuary and tropics
- wasteland 2
- xcom: enemy unknown & enemy within
- tesseract
- unity (engine)

Changed stats only:
    SGPRS: 2059998 -> 2086238 (1.27 %)
    VGPRS: 1626888 -> 1626904 (0.00 %)
    Spilled SGPRs: 7902 -> 7865 (-0.47 %)
    Code Size: 60924520 -> 60982660 (0.10 %) bytes
    Max Waves: 374539 -> 374526 (-0.00 %)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 854593b8eb ac: clean up ac_build_indexed_load function interfaces
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák cdb21dfffa radeonsi: handle 64-bit loads earlier in fetch_constant
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák ee0e1a47ce radeonsi: add si_descriptors::gpu_address and remove buffer_offset
This allows us to change the pointer arbitrarily.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 6d2664880c radeonsi: unify code for extracting a buffer address from a descriptor
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 8d2685d129 radeonsi: remove atom parameter from si_upload_descriptors
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 4ddce1b1a4 radeonsi: pack si_descriptors better again
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 859eeffb3d radeonsi: emit dirty consecutive pointers in one SET_SH_REG packet
IB size: -1.6%

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 36626ffe46 radeonsi: split si_emit_shader_pointer
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 69325fa88d radeonsi: generalize the SI_VS_SHADER_POINTER_MASK macro
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 79c2e7388c radeonsi/gfx9: use SPI_SHADER_USER_DATA_COMMON
IB size: -0.4%

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák b762a08896 radeonsi/gfx9: move RW_BUFFERS from s[0:1] to s[8:9] for HS and GS
Let's use the same user data SGPRs in all stages.
(for SPI_SHADER_USER_DATA_COMMON_0)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 0aafedbbb2 radeonsi: add GFX-IB-size query to the HUD
It shows the sum of all IBs per frame.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 4d944c72b1 winsys/amdgpu: disable CPU caching for GFX & SDMA IBs
This should decrease IB fetch latency.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 49f5ce39c1 winsys/amdgpu: don't do read-modify-write on command buffers
i.e. don't use |=

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Eric Anholt cde209960c broadcom/vc4: Fix false-positive for the tiling ioctls on simulator mode.
If there happened to be an ENOENT laying around, we would try using the
ioctls later and fail out resource allocation.
2017-10-17 12:35:16 -07:00
Eric Anholt b202f90f65 broadcom/vc4: Skip BO labeling when in simulator mode.
It was calling down into i915 trying to label the BO, which is definitely
not the right thing.
2017-10-17 12:35:16 -07:00
Eric Anholt d623a34ab2 broadcom/vc5: Don't forget to set the RT format for 1555 textures.
Fixes dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb5_a1
2017-10-17 12:35:16 -07:00
Chad Versace b5dc551014 anv: Add func anv_gem_get_tiling()
Will use in VK_ANDROID_native_buffer.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-17 11:08:26 -07:00