Commit Graph

2206 Commits

Author SHA1 Message Date
Topi Pohjolainen c4ac0d4949 intel/isl/gen4: Represent cube maps with 3D layout
v2 (Jason): Check for !ISL_SURF_DIM_3D instead of CUBE_BIT.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen 171b72542c intel/isl: Add i915 to isl_tiling converter
v2: s/i915_tiling_to_isl_tiling(/isl_tiling_from_i915_tiling/

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Chad Versace 5d69052113 anv/image: Fix VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT
We incorrectly detected VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT.  We looked
for the bit in VkImageCreateInfo::usage, but it's actually in
VkImageCreateInfo::flags.

Found by assertion failures while enabling VK_ANDROID_native_buffer.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-19 11:25:50 -07:00
Topi Pohjolainen 0926fb69a4 intel/blorp/gen4: Drop cube map flag for single face copy
This will falsely trigger an assert on number of layers once
isl is used for 3D layouts of Gen4 cube maps.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:36:13 +03:00
Topi Pohjolainen 4733891e51 intel/isl: Take 3D surfaces into account in image params
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:35:44 +03:00
Jason Ekstrand cd9fd68a50 anv: Advertise support for VK_KHR_variable_pointers
We don't support the general version yet because that requires us to
lower shared variables up-front in SPIR-V -> NIR.  This shouldn't be a
whole lot of work but it's not something we support today.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-07-18 09:43:13 -07:00
Jason Ekstrand bc9319583a anv: Advertise support for VK_KHR_storage_buffer_storage_class
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-07-18 09:43:13 -07:00
Jason Ekstrand 828c437078 intel/isl: Add a row_pitch parameter to surf_get_ccs_surf
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-17 13:48:38 -07:00
Jason Ekstrand c5700ed72e anv/image: Add INPUT_ATTACHMENT to the list of required usages
From the Vulkan 1.0.53 spec VU for vkCreateImageView:

    "image must have been created with a usage value containing at least
    one of VK_IMAGE_USAGE_SAMPLED_BIT, VK_IMAGE_USAGE_STORAGE_BIT,
    VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
    VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT, or
    VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT"

We were missing VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT from out list.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-17 08:18:46 -07:00
Jason Ekstrand cbdfd1daa2 anv: Stop leaking the no_aux sampler surface state
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-17 08:18:46 -07:00
Jason Ekstrand bd41564746 anv/cmd_buffer: Properly handle render passes with 0 attachments
We were early returning and never created the NULL surface state.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: James Legg <jlegg@feralinteractive.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-17 08:18:46 -07:00
Emil Velikov 43c188f970 anv: advertise v6 of the wayland surface extension
Jason updated the Khronos spec to explicitly state that Wayland surfaces
must support VK_PRESENT_MODE_MAILBOX_KHR.

ANV did so since day one (back in 2015)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-17 15:24:32 +01:00
Lionel Landwerlin 59adde0eab anv: ensure device name contains terminating character
v2: Use sizeof() (Chris)

CID: 1415113
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-07-17 14:36:38 +01:00
Jason Ekstrand 0ee8d81718 anv: Implement VK_KHR_external_memory_*
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand c02da9cad6 anv: Implement VK_KHR_dedicated_allocation
We always recommend sub-allocation and don't do anything special for
dedicated allocations.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand 8c82aa5f43 anv: Implement VK_KHR_get_memory_requirements2
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand 5b57bdc1cf anv: Advertise version 1.0.54
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand 227debdc92 vulkan: Update to the new 1.0.54 spec XML and headers
There is one small ANV change here because we used the
VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX enum in the BO cache and that had
to be updated to have the _KHR suffix.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand dc179aa123 anv: Drop support for VK_KHX_external_semaphore_*
These have been formally deprecated by Khronos never to be shipped
again.  The KHR versions should be implemented/used instead.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:58:51 -07:00
Jason Ekstrand 4ac94d0dee anv: Drop support for VK_KHX_external_memory_*
These have been formally deprecated by Khronos never to be shipped
again.  The KHR versions should be implemented/used instead.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-14 22:12:39 -07:00
Juan A. Suarez Romero 5cd4ece34e anv/pipeline: do not use BITFIELD64_BIT()
In the previous commit, forgot to apply v2 suggestions.

Fixes: 28d0c38 (anv/pipeline: use unsigned long long constant to check
enable vertex inputs)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-07-14 10:33:19 +00:00
Juan A. Suarez Romero 28d0c38d85 anv/pipeline: use unsigned long long constant to check enable vertex inputs
When initializing the ANV pipeline, one of the tasks is checking which
vertex inputs are enabled. This is done by checking if the enabled bits
in inputs_read.

But the mask to use is computed doing `(1 << (VERT_ATTRIB_GENERIC0 +
desc->location))`. The problem here is that if location is 15 or
greater, the sum is 32 or greater. But C is handling 1 as a 32-bit
integer, which means the displaced bit is out of range and thus the full
value is 0.

Thus, use 1ull, which is an unsigned long long value.

This fixes:
dEQP-VK.pipeline.vertex_input.max_attributes.16_attributes.binding_one_to_one.interleaved

v2: use 1ull instead of BITFIELD64_BIT() (Matt Turner)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-14 08:09:18 +00:00
Kenneth Graunke b2da123801 i965: Use pushed UBO data in the scalar backend.
This actually takes advantage of the newly pushed UBO data, avoiding
pull loads.

Improves performance in GLBenchmark Manhattan 3.1 by:

   HSW: ~1%, BDW/SKL/KBL GT2: 3-4%, SKL GT4: 7-8%, APL: 4-5%.
   (thanks to Eero Tamminen for these numbers)

shader-db results on Skylake, ignoring programs with spill/fill changes:

   total instructions in shared programs: 13963994 -> 13651893 (-2.24%)
   instructions in affected programs: 4250328 -> 3938227 (-7.34%)
   helped: 28527
   HURT: 0

   total cycles in shared programs: 179808608 -> 172535170 (-4.05%)
   cycles in affected programs: 79720410 -> 72446972 (-9.12%)
   helped: 26951
   HURT: 1248

   LOST:   46
   GAINED: 21

Many "Deus Ex: Mankind Divided" shaders which already spilled end up
spill a lot more (about 240 programs hurt, 9 helped).  The cycle
estimator suggests this is still overall a win (-0.23% in cycle counts)
presumably because we trade pull loads for fills.

v2: Drop "PULL" environment variable left in for initial debugging
    (caught by Matt).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 20:18:54 -07:00
Kenneth Graunke c9ef27e77b i965: Factor out push locations.
With UBOs, the answer of "have we decided to push this uniform" gets
a bit more complicated - for one, we have multiple surfaces.  This
patch refactors things so we can add the new code in a single place.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 20:18:54 -07:00
Kenneth Graunke 4f586cd8f1 i965: Push UBO data, but don't use it just yet.
This patch starts uploading UBO data via 3DSTATE_CONSTANT_* packets,
and updates the compiler to know that there's extra payload data, so
things continue working.  However, it still issues pull loads for all
data.  I wanted to separate the two aspects for greater bisectability.

v2: Update for new intel_bufferobj_buffer parameter.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 20:18:30 -07:00
Kenneth Graunke 6d28c6e52c i965: Select ranges of UBO data to be uploaded as push constants.
This adds a NIR pass that decides which portions of UBOS we should
upload as push constants, rather than pull constants.

v2: Switch to uint16_t for the UBO block number, because we may
    have a lot of them in Vulkan (suggested by Jason).  Add more
    comments about bitfield trickery (requested by Matt).

v3: Skip vec4 stages for now...I haven't finished wiring up support
    in the vec4 backend, and so pushing the data but not using it
    will just be wasteful.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 19:56:49 -07:00
Kenneth Graunke 8ec5a4e4a4 i965: Switch to absolute addressing for constant buffer 0.
By default, 3DSTATE_CONSTANT_* Constant Buffer 0 is relative to dynamic
state base address.  This makes it unusable for pushing UBOs.  I'd like
to be able to use all four push buffers.

There is a bit in the INSTPM register (or CS_DEBUG_MODE2 on Skylake)
which controls whether buffer 0 is relative to dynamic state base
address, or simply a normal pointer.  Setting that gives us full
flexibility.

We can't currently write this on Haswell and earlier, and will need
to update the kernel command parser, and then do the whole version
checking song and dance.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 19:56:49 -07:00
Lionel Landwerlin 6131a1ae40 aubinator: don't leak fd of opened aubfile
CID: 1373563
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:50 +01:00
Lionel Landwerlin d1bd731e30 anv: don't use strcpy for copying strings
CID: 1358935
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:47 +01:00
Lionel Landwerlin 226fae7849 intel/compiler: no need to check unsigned is >= 0
CID: 1338342
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:45 +01:00
Lionel Landwerlin 95c917668c intel/compiler: don't check unsigned is >= 0
CID: 1224468
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:38 +01:00
Lionel Landwerlin a25a533458 intel/compiler: remove check unsigned is >= 0
By definition unsigned are always >= 0.

CID: 742212
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:29 +01:00
Lionel Landwerlin 19869d6091 isl: use 64bit arithmetic to compute size
If we allow the size to be more than 2^32, then we should compute it
in 64bit arithmetic otherwise we might run into overflow issues.

CID: 1412892, 1412891
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:26 +01:00
Jason Ekstrand 5b3363e3f1 intel/isl: Add a helper to convert tilings from ISL to i915
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand a668ba9c18 intel/isl: Add basic modifier introspection
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Anuj Phogat 0a56c5f3f1 intel/compiler: Don't use opt_sampler_eot() optimization on gen10+
This optimization has been removed on gen10+.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-12 11:27:31 -07:00
Eric Anholt 5d6271c6a5 intel: Move the DRM uapi headers to a non-Intel location.
I want to remove vc4's dependency on headers from libdrm as well, but
storing multiple copies of drm_fourcc.h in our tree would be silly.

v2: Update Android.mk as well, move distcheck drm*.h references to
    top-level noinst_HEADERS.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Daniel Stone <daniels@collabora.com> (v1)
Reviewed-by: Rob Herring <robh@kernel.org>
2017-07-12 10:58:33 -07:00
Jason Ekstrand 8e3d9c5d09 anv: Round u_vector element sizes to a power of two
This fixes 32-bit builds of the driver.  Commit 08413a81b9
changed things so that we now put struct anv_states in the u_vector for
binding tables.  On 64-bit builds, sizeof(struct anv_state) is a power
of two but it isn't on 32-bit builds.

Fixes: 08413a81b9
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2017-07-12 10:34:13 -07:00
Lionel Landwerlin 384aaa4d3f intel: add number of subslices to device info
We could have used a single integer to store that value, but
Cannonlake has different number of subslices per slice depending on
the GT.

v2: Add CFL subslice numbers (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2017-07-11 16:14:57 +01:00
Kenneth Graunke c2c37f5185 intel: Fix clflushing on modern (Baytrail+) Atom CPUs.
Thanks to Chris Wilson for pointing this out.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-10 15:55:26 -07:00
Kenneth Graunke 3e50607a40 intel: Move clflush helpers from anv to common/gen_clflush.h.
I want to use these in the OpenGL driver as well.

v2: Add to COMMON_FILES in Makefile.sources (caught by Emil)

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-10 15:55:19 -07:00
Jason Ekstrand 781263486f anv: Stop setting domains to RENDER on EXEC_OBJECT_WRITE
The reason we were doing this was to ensure that the kernel did the
appropriate cross-ring synchronization and flushing.  However, the
kernel only looks at EXEC_OBJECT_WRITE to determine whether or not to
insert a fence.  It only cares about the domain for determining whether
or not it needs to clflush the BO before using it for scanout but the
domain automatically gets set to RENDER internally by the kernel if
EXEC_OBJECT_WRITE is set.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-07-10 08:55:47 -07:00
Nanley Chery 753a7bbc84 Revert "intel/isl: Only create a CCS buffer if the image supports rendering"
This reverts commit 8aaa13467d, which was
based on an incorrect assumption. Unlike the restriction placed on image
views in the Vulkan API, OpenGL allows you to render to texture views
whose formats differ from the originals.

Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=101677
2017-07-07 14:24:58 -07:00
Tomasz Figa 50a8a7377a intel: common: Fix link failure with standalone Android build
Some reshuffle in the Makefiles under src/intel resulted in Android
libraries being no longer linked with code using
src/intel/common/gen_debug.h that contains references to functions
exported by those libraries (namely ALOGW macro, which is currently
resolved into a call to __android_log_print() from cutils).

Fix the build by taking into account ANDROID_CFLAGS and ANDROID_LIBS for
affected module on Android NDK builds.

Fixes: d5b355ce5f ("i965: Move intel_debug.h to intel/common/gen_debug.h")
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-05 18:49:21 +01:00
Samuel Iglesias Gonsálvez 5dd96b1156 anv: check support for enabled features in vkCreateDevice()
From Vulkan spec, 4.2.1. "Device Creation":

  "vkCreateDevice verifies that extensions and features requested in
   the ppEnabledExtensionNames and pEnabledFeatures members of
   pCreateInfo, respectively, are supported by the implementation."

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@gmail.com>
2017-07-03 08:01:31 +02:00
Samuel Iglesias Gonsálvez ba05f6f72b anv: merge tessellation's primitive mode in merge_tess_info()
SPIR-V tessellation shaders that were created from HLSL will have
the primitive generation domain set in tessellation control shader
(hull shader in HLSL) instead of the tessellation evaluation shader.

v2:
- Add assert (Kenneth)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-03 08:00:43 +02:00
Lionel Landwerlin 038c45a40e anv: fix reported timestampPeriod value
We lost some precision on a previous change due to switching to
integers. Since we report a float in timestampPeriod, we want the
division to happen in floats.

CID: 1413021
Fixes: c77d98ef32 ("intel: common: express timestamps units in frequency")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-02 12:11:55 +01:00
Lionel Landwerlin 34560ba9e5 intel: genxml: make a couple of enums show up in aubinator
In particular Shader Channel Select & Texture Address Control Mode.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-07-02 00:45:38 +01:00
Johnson Lin 165e704719 i965/i915: Add UYVY as the supported format
Trigger the correct sampler options for it. Similar with YUYV

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-06-30 10:16:26 +01:00
Mauro Rossi b693fd8464 android: anv: drop libdrm_intel dependency
In addition to Rob Herring "Android: i965: remove libdrm_intel dependency",
we can drop libdrm_intel dependency in anv for Android.

Please check if libdrm has to stay as shared dependency and drop this comment line.

Fixes: 7dd20bc ("anv/i965: drop libdrm_intel dependency completely")
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-29 12:31:00 +01:00
Lionel Landwerlin d8bf2861ad anv: use devinfo for number of thread/eu
It turns out Gen9LP has fewer threads per EU (6 vs 7).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2017-06-29 10:07:52 +01:00
Juan A. Suarez Romero 93b8dc4b94 intel: tools: add intel_aub.h as part of aubinator
Include intel_aub.h in the Makefile.tools.am

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-29 10:03:40 +02:00
Juan A. Suarez Romero be5fe2153b intel: automake: include Makefile.drm.am
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-29 10:03:40 +02:00
Ian Romanick 36bd4a5f21 genxml: Silence about a billion unused parameter warnings
v2: Use textwrap.dedent to make the source line a lot shorter.
Shortening (?) the line was requested by Jason.

v3: Simplify the texwrap.dedent usage.  Suggested by Dylan.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-06-28 14:50:14 -07:00
Lionel Landwerlin 7dd20bc3ee anv/i965: drop libdrm_intel dependency completely
With Ken's work to drop the library dependency on libdrm_intel, we now
only depend on libdrm for the kernel uapi headers it provides. It
seems like we're better off just embeddeding those headers ourselves,
making the lives of people developping news features tightly
integrated with the kernel a tiny bit easier.

This change also makes it a bit more obvious what cflags/libs are
required by the i915 drivers vs i965, by renaming INTEL_CFLAGS/LIBS
into I915_CFLAGS/LIBS.

Headers were generated from drm-tip on the following commit :

   commit 6d61e70ccc21606ffb8a0a03bd3aba24f659502b
   Merge: 338ffbf7cb5e c0bc126f97fb
   Author: Dave Airlie <airlied@redhat.com>
   Date:   Tue Jun 27 07:24:49 2017 +1000

       Backmerge tag 'v4.12-rc7' into drm-next

v2: Use installed files from the kernel (Daniel Vetter)

v3: Use headers from drm-next rather than drm-tip (Dave/Daniel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:29 +03:00
Lionel Landwerlin 230691b8e5 aubinator: import intel_aub.h from libdrm
This enables us to compile aubinator without the libdrm dependency.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:28 +03:00
Topi Pohjolainen fbcc9555c5 intel/anv: Add missing break in anv_CreateDevice()
CID: 1413018
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-27 10:19:55 +03:00
Ian Romanick 1b101ca809 blorp: Use normalized coordinates on Gen6
Apparently, the sampler has some sort of precision issues for
non-normalized texture coordinates with linear filtering.  This caused
some small precision issues in scaled blits.  Work around this by using
normalized coordinates.  There is some extra work necessary because Gen6
uses TEX (instead of TXF) for some multisample resolve blits.

Fixes piglit.spec.arb_framebuffer_object.fbo-blit-stretch on SNB.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68365
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-06-26 13:41:11 -07:00
Nanley Chery d6748f1fc4 anv/gpu_memcpy: Rename the gpu_memcpy function
A GPU memcpy function could alternatively be implemented using MI_*
commands. Provide more detail into how this one operates in case another
memcpy function is created.

v2:
- Update the commit message.
v3:
- Use 'memcpy' instead of 'cpy' (Jason Ekstrand)
- Shorten 'streamout' to 'so'

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 1415e7a997 anv/blorp: Provide surface states for CCS resolves
In the future, we plan on using this method to resolve images whose
surface state fast-clear value is dynamically updated during command
buffer execution. Start using it now for testing and to reduce churn
later on.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 4b2a2b70e0 anv/blorp: Add a surface-state-based CCS resolve function
This will be used in the next patch.

v2:
- Omit BLORP_BATCH_NO_EMIT_DEPTH_STENCIL (Jason Ekstrand)
- Update commit message.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery d1119ab7b6 blorp/clear: Add a binding-table-based CCS resolve function
v2:
- Do layered resolves.
(Jason Ekstrand):
- Replace "bt" suffix with "attachment".
- Rename helper function to prepare_ccs_resolve.
- Move blorp_params_init() into helper function.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 6235f08ff8 anv: Adjust params of color buffer transitioning functions
Splitting out these fields will make the color buffer transitioning
function simpler when it gains more features.

v2: Remove unintended blank line (Iago Toral)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery e15b1c41a4 anv/blorp: Remove 3D subresource transition workaround
For 3D image subresources undergoing a layout transition via
PipelineBarrier, we increase the number of fast-cleared layers to match
the intended behaviour of KHR_maintenance1. When such subresources
undergo layout transitions between subpasses, we don't do this to avoid
failing incorrect CTS tests. Instead, unify the behaviour in both
scenarios, and wait for the CTS tests to catch up. See CL 1111 for the
test fix and Vulkan issue #849 for more information.

On SKL+, this causes 3 test failures under:
dEQP-VK.pipeline.render_to_image.3d.*

v2: Add a reference to the Vulkan issue (Iago Toral).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 5ca2fbcee2 anv/cmd_buffer: Adjust the image view reloc function
Make the function take in an image instead of an image view. This
enables us to record relocations for surfaces states created outside of
the anv_CreateImageView path.

v2 (Jason Ekstrand):
- Use image->offset instead of surf_offset in aux_offset calculation.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 5f4f50419c anv/cmd_buffer: Adjust layout transition aspect checking
Reflect the fact that an image view or subresource range with the color
aspect cannot have any other aspect.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery bc838fc759 anv: Add and use color auxiliary buffer helpers
v2:
- Check for aux levels in layer helper (Jason Ekstrand)
- Don't assert aux is present, return 0 if it isn't.
- Use the helpers.
v3:
- Make the helpers aspect-agnostic (Jason Ekstrand)
- Drop anv_image_has_color_aux()

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 8aaa13467d intel/isl: Only create a CCS buffer if the image supports rendering
v2: Omit the commit message.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery b934330191 intel/isl: Limit CCS to one level and layer on gen7
v2 (Jason Ekstrand):
- Remove Vulkan-specific terminology from the commit title.
- Replace '== 7' with '<= 7' to hint that this is a new feature on BDW+.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 6b23c65f3a intel/blorp: Check for layer fast-clear restriction
v2: Update commit title (Jason Ekstrand)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery b46a071758 intel/blorp: Assert levels and layers are in range
v2 (Jason Ekstrand):
- Update commit title.
- Check aux level and layer as well.
v3 (Jason Ekstrand):
- Move the non-aux layer check.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Eric Engestrom 2b237ff64c anv: use Mesa's u_atomic.h header
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-26 18:21:22 +01:00
Anuj Phogat 7896dee349 anv/cnl: Don't write to Cache Mode Register 1 on gen10+
For PartialResolveDisableInVC field recommendation is to
always set this to 0 and that's the default value of the bit.
So, we have nothing left to write to CACHE_MODE_1.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-23 11:16:00 -07:00
Rafael Antognolli 9b78a52042 genxml: fix gen5 sampler border color state.
Based on the current code, gen5 and gen6 have the same sampler border color
state struct. So fix the gen5 one to match gen6.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-22 16:38:44 -07:00
Rafael Antognolli f43c21cbbd aubinator: Dump sampler state pointers on gen6 too.
We already have a function to dump sampler states, so do that for gen6
too.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-22 16:38:44 -07:00
Chad Versace ecd8f85802 anv: Fix -Wswitch in anv_layout_to_aux_usage()
anv_layout_to_aux_usage() lacked a case for
VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR. Add an unreachable case, because we
don't support the extension.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 15:18:24 -07:00
Anusha Srivatsa de7ed0ba55 i965/CFL: Add PCI Ids for Coffee Lake.
Coffee Lake has a gen9 graphics following KBL.
From 3D perspective, CFL is a clone of KBL/SKL features.

v2: Change commit message, correct alignment <Anuj Phogat>
v3: Update IDs.
v4: Initialize l3_banks, correct nomenclature <Anuj>

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Acked-by: Benjamin Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-06-22 14:28:43 -07:00
Anuj Phogat 43d11b128c intel: Enable vulkan build for gen10
This patch just enables building Vulkan libs for gen10. We
still don't have gen 10 support enabled on Vulkan.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:46 -07:00
Anuj Phogat ac6bc0e034 anv/cnl: Generate and use gen10 functions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:45 -07:00
Anuj Phogat c17e214a6b anv/cnl: Don't set FloatBlendOptimizationEnable{Mask}
This field is remove from CACHE_MODE_1 register in gen10.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:45 -07:00
Anuj Phogat bf1d2c37c6 anv/cnl: Use GENX(xx) in place of GEN9_xx
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:45 -07:00
Anuj Phogat 1e5a5d18d1 anv/cnl: Add #defines for MOCS and genX(x)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:45 -07:00
Anuj Phogat ceed55e7bb intel/genxml: Add Gen10 CACHE_MODE_1 definitions
Few of the fields in this register are changed as compared
to gen9.xml.

V2: Remove some fields which are not valid anymore.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat 6338b63270 intel/genxml: Rename StartInstanceLocation to StartingInstanceLocation
This is required because we already have a macro defined with
the name StartInstanceLocation.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat 8869c8b3dc intel/genxml: Rename IndirectStatePointer to BorderColorPointer
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat 97f75fdfd0 intel/genxml: Combine DataDWord{0, 1} fields in to ImmediateData field
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat c61b909d14 intel/genxml: Add INSTDONE registers in gen10
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat 03fddd3c1d intel/genxml: Add better support for MI_MATH in gen10
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Andres Gomez 5352174d49 anv: FORMAT_FEATURE_TRANSFER_SRC/DST_BIT_KHR not used with VkFormatProperties.bufferFeatures
VK_FORMAT_FEATURE_TRANSFER_[SRC|DST]_BIT_KHR is a flag value of the
VkFormatFeatureFlagBits enum that can only be hold and checked against
the linearTilingFeatures or optimalTilingFeatures members of the
VkFormatProperties struct but not the bufferFeatures member.

>From the Vulkan® 1.0.51, with the VK_KHR_maintenance1 extension,
section 32.3.2 docs for VkFormatProperties:

   "* linearTilingFeatures is a bitmask of VkFormatFeatureFlagBits
      specifying features supported by images created with a tiling
      parameter of VK_IMAGE_TILING_LINEAR.

    * optimalTilingFeatures is a bitmask of VkFormatFeatureFlagBits
      specifying features supported by images created with a tiling
      parameter of VK_IMAGE_TILING_OPTIMAL.

    * bufferFeatures is a bitmask of VkFormatFeatureFlagBits
      specifying features supported by buffers."

    ...

    Bits which can be set in the VkFormatProperties features
    linearTilingFeatures, optimalTilingFeatures, and bufferFeatures
    are:

    typedef enum VkFormatFeatureFlagBits {

    ...

      VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR = 0x00004000,
      VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR = 0x00008000,

    ...

    } VkFormatFeatureFlagBits;

    ...

    The following bits may be set in linearTilingFeatures and
    optimalTilingFeatures, specifying that the features are supported
    by images or image views created with the queried
    vkGetPhysicalDeviceFormatProperties::format:

    ...

    * VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR specifies that an image
      can be used as a source image for copy commands.

    * VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR specifies that an image
      can be used as a destination image for copy commands and clear
      commands."

Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 13:45:22 +03:00
Rafael Antognolli 78b843af3c intel/genxml: Use the same naming convention for Floating Point Mode.
In newer gens, this field has a prefix and the non-IEEEE-745 mode is called
"Alternate", instead of simply "Alt".

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli ce728594fd intel/genxml: Normalize URB Data field in WM_STATE.
On gen6+, this is called "Dispatch GRF Start Register For Constant/Setup Data
0", while on gen5 and lower it's called only "Dispatch GRF Start Register For
URB Data", but it's essentially the same thing (URB data), so rename it to
match newer gens and simplify the C code that handles it.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli 44415056e7 intel/genxml: Rename field on WM_STATE to match gen6+.
"Pixel Shader Kill Pixel" -> "Pixel Shader Kills Pixel", which is how it's
called on newer gens.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli 82c66965ac intel/genxml: Normalize fields on WM_STATE.
On gen4, WM_STATE only has one Kernel Start Pointer and one GRF Register
Count, but we can make the code that handles this on multiple gens simpler if
we add an index 0 to it too.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli eddb1ebccf intel/genxml: Add missing field to CLIP_STATE.
Just because it's not set doesn't mean that it doesn't exist. And since the
field is there on newer gens, having it on gen5 simplifies the code when
porting gen5 and lower.

Also add missing value to API Mode on CLIP_STATE on gen4.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli 9a5ae19cbb intel/genxml: Fix type of UserClipFlags ClipTest Enable Bitmask.
This is a bitmask, so it can't be a boolean. Also rename it so it matches
gen6+.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli 19d1defcd5 intel/genxml: Add missing fields to CLIP_STATE on gen4-5.
These fields are set by brw_clip_unit, so we need them when converting to
genxml.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli faa4f5c42d intel/genxml: Normalize GS_STATE.
Rename "Rendering Enable" to "Rendering Enabled", so it matches gen6+.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Lionel Landwerlin 030abc6109 intel: compiler/i965: fix is_broxton checks
In 5f2fe9302c is_geminilake was introduced for the differenciate
broxton from geminilake. Unfortunately I failed as verifying that
is_broxton is throughout the code base to mean Gen9lp.

Fixes: 5f2fe9302c ("intel: common: add flag to identify platforms by name")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-20 23:26:42 +01:00
Ben Widawsky 3e1055591b i965/cnl: Add l3 configuration for Cannonlake
V2 (Anuj):
Squash the changes in one patch rebase on master.
Address the review comments made by Francisco Jerez.
Do the URB allocation per slice (not per bank).

V3 (Anuj):
Update the comment.
Format the table as other l3 config tables.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
---
V1 was sent out with the heading:
"i965/cnl: Properly handle l3 configuration"
2017-06-20 12:18:26 -07:00
Anuj Phogat 1024dad4d9 i965: Add a variable for way size per bank in get_l3_way_size()
Adding this variable better explains the computation of L3 way
size in the function.

V2: Use const variable for way_size_per_bank.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-06-20 12:18:26 -07:00
Anuj Phogat 8521559e08 i965: Fix broxton 2x6 l3 config
The new table added in this patch matches with the table
in gfxspecs. We were programming the wrong values earlier.

V2: Update the comment.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-06-20 12:18:26 -07:00
Ian Romanick cbb941cdec intel/blorp: Apply source offset in the TEX case
Previously the offset was only applied in the TXF case.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-20 11:07:02 -07:00
Ian Romanick 990f2be139 intel/blorp: Apply Gen4 coord. normalization after cubemap sizes are adjusted
Otherwise the values used for coordinate normalization use the wrong
sizes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-20 11:07:02 -07:00
Jason Ekstrand b2dd61196e intel/blorp: Set needs_(dst|src)_offset for Gen4 cubemaps
We call convert_to_single_slice so they may end up with a non-trivial
offset that needs to be taken into account.

v2 (idr): Also set needs_src_offset.  Suggested by Jason.

Fixes ES2-CTS.functional.texture.specification.basic_copyteximage2d.cube_rgba
and ES2-CTS.functional.texture.specification.basic_copytexsubimage2d.cube_rgba
on G45.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101284
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-20 11:07:02 -07:00
Lionel Landwerlin 6d759cbd49 intel: common: add number of thread per eu
This will be used by to normalize OA counters.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-19 22:11:00 +01:00
Lionel Landwerlin c77d98ef32 intel: common: express timestamps units in frequency
Rather than storing the period as a double that looses some precision.

Also fixes the Gen9LP timestamp frequency which is no 19200123 but
19200000 as pointed by Ville :

https://lists.freedesktop.org/archives/intel-gfx/2017-April/125126.html

Finally add the Cannonlake timestamp frequency.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-19 22:11:00 +01:00
Lionel Landwerlin 5f2fe9302c intel: common: add flag to identify platforms by name
The perf infrastructure needs to identify specific platforms, not just
generations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-19 22:11:00 +01:00
Jonas Kulla a52ee32a9a anv: Fix L3 cache programming on Bay Trail
Valid values for URBAllocation start at 32, so substract that
before programming the register.

This was missed when porting from the GL driver.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-19 12:05:52 -07:00
Topi Pohjolainen 0d1af164e1 intel/isl/gen6: Allow arrayed stencil
Nothing prevents arrayed stencil surfaces even though hardware
doesn't support mipmapping.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-17 06:38:56 +03:00
Rafael Antognolli 3a767f8b06 genxml: The viewport state offset is actually an address.
This fixes code generation on gen45.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-16 15:01:16 -07:00
Rafael Antognolli ad109c16c2 genxml: Rename fields to match gen6+.
"Anti-aliasing Enable" to "Anti-Aliasing Enable".

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-16 15:01:16 -07:00
Rafael Antognolli 1b42cd52a2 genxml: Rename SF_STATE field to match gen6+.
Rename "Use Point Width State" to "Point Width Source". It accepts the same
values and has the same meaning as gen6+, so lets keep them with the same name
to simplify the code.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-16 15:01:16 -07:00
Anuj Phogat c07271fef0 intel/isl: Add the maximum surface size limit
V2: Use 2^31 bytes (2GB) surface size limit on pre-gen9 and
    2^38 bytes for gen9+.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-06-16 09:05:05 -07:00
Anuj Phogat 7022978237 intel/isl: Use uint64_t to store total surface size
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-06-16 09:05:05 -07:00
Jason Ekstrand 7175561598 intel/blorp: Work around Sandy Bridge occlusion query issue
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-06-14 18:15:05 -07:00
Jason Ekstrand 96f9d4de7d intel/isl: Properly set SeparateStencilBufferEnable on gen5-6
On gen5-6, SeparateStencilBufferEnable and HierarchicalDepthBufferEnable
come hand in hand and we have to set either both or neither.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-06-14 18:15:05 -07:00
Kenneth Graunke af373ea4a2 genxml: Fix Gen4-5 SF_STATE "Line Width" fixed point type.
It's a U3.1.  It became a U3.7 on Sandybridge.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-14 15:56:21 -07:00
Ben Widawsky e179a3438a i965/cnl: Add a preliminary device for Cannonlake
v2 (Anuj):
Rebased on master and updated pci ids
Remove redundant initialization of max_wm_threads to 64 * 12.
For gen9+ max_wm_threads are initialized in gen_get_device_info().

v3 (Anuj):
Move the patch to end of series.
Remove unused gt1, gt2, gt3 functions.
Remove l3_banks variable. Variable is now available on master.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-06-09 16:03:00 -07:00
Jason Ekstrand f2cbf738b4 anv: Don't advertise support on anything above gen9
This will prevent the driver from even trying to work on Cannon Lake
until we get actual support added.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-06-09 16:03:00 -07:00
Anuj Phogat 9acc93feeb i965/cnl: Enable CCS_E and RT support for few formats
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:59 -07:00
Anuj Phogat 61f171292e i965/cnl: Reformat surface_format_info table to accomodate gen10+
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:59 -07:00
Anuj Phogat f9e31a26d4 i965/cnl: Make URB {VS, GS, HS, DS} sizes non multiple of 3
v1: By Ben Widawsky <benjamin.widawsky@intel.com>
v2: v1 had an assert only for VS. Add the restriction for GS, HS and
    DS as well and make sure the allocated sizes are not multiple of 3.
v3: Move the entry_size checks in to compiler code (Ken)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-09 16:02:59 -07:00
Anuj Phogat 111881abac i965/cnl: Handle gen10 in switch cases across the driver
V2: Start using gen10 functions isl_gen10*(), gen10_blorp_exec()
    gen10_init_atoms() (Jason)
    Remove Vulkan changes. Do them later in a separate patch.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:59 -07:00
Anuj Phogat 30e749c8f1 i965/cnl: Update few assertions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:59 -07:00
Anuj Phogat 56b4d82729 i965/cnl: Add cnl bits in aubinator
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:58 -07:00
Anuj Phogat dc83ce7a16 i965/cnl: Wire up android Mesa build files for gen10
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-06-09 16:02:58 -07:00
Anuj Phogat e01c5a6824 i965/cnl: Wire up Mesa build files for gen10
V2: Remove isl_gen10.c and isl_gen10.h

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-09 16:02:58 -07:00
Anuj Phogat 2417d5ca19 intel/genxml: Update genx_bits for gen10+
This commit adds a gen10 case to the switch statement and
drops some unneeded code for handling gen numbers which
doesn't work on gen10 and above.

V2: Drop "z = float(z)" and the "z *= 10" lines

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:58 -07:00
Anuj Phogat 98b95a3735 i965/cnl: Add gen10 specific function declarations
These declarations will help the code start compiling
once we wire up the makefiles for gen10. Later patches
will start using these functions for gen10.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:58 -07:00
Anuj Phogat 2704ccc646 i965/cnl: Include gen10_pack.h
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:58 -07:00
Anuj Phogat a48cb9cf7f i965/cnl: Define genX(x) and GENX(x) for gen10
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:58 -07:00
Jason Ekstrand aa416f515a i965/genxml: Add gen10.xml
V2(Anuj):
Add default value for length of 3DPRIMITIVE command
Add values for 'Attribute Active Component Format'
Rename few fields to match gen9.xml

V3 (Ander Conselvan de Oliveira)
Add gen10 alias for MOCS
Make 3DSTATE_CONSTANT_BODY on Gen10 use arrays

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-06-09 16:00:49 -07:00
Ben Widawsky d968f072bc i965: Make feature macros gen8 based
All the "features" of the hardware are similar starting with GEN8, so remove as
much of the GEN9 uniqueness as possible. This makes implementing future gen
platforms a bit easier.

Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 15:27:14 -07:00
Jason Ekstrand a59c7f834c intel/isl: Add an enum for describing auxiliary compression state
This enum describes all of the states that a auxiliary compressed
surface can have.  All of the states as well as normative language for
referring to each of the compression operations is provided in the
truly colossal comment for the new isl_aux_state enum.  There is also
a diagram showing how surfaces move between the different states.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand bacae7221b blorp: Use FullSurfaceDepthandStencilClear for blorp_hiz_op
The blorp_hiz_op entrypoint always acts on a full subresource of a HiZ
buffer so we can just set the flag unconditionally.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 08:54:54 -07:00
Jason Ekstrand 9cb6ac62fb intel/blorp: Plumb through access to the workaround BO
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101283
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 08:54:54 -07:00
Nanley Chery ed5801864e anv/blorp: Move the depth cache flush outside of BLORP
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-07 08:54:54 -07:00
Jason Ekstrand fbd8a33f61 intel/blorp: Refactor the HiZ op interface
This commit does a few things:

 1) Now that BLORP can do HiZ ops on gen8+, drop the gen6 prefix.
 2) Switch parameters to uint32_t to match the rest of blorp.
 3) Take a range of layers and loop internally.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 08:54:54 -07:00
Jason Ekstrand f9fd976e8a i965/miptree: Store fast clear colors in an isl_color_value
This commit, out of necessity, makes a number of changes at once:

 1) Changes intel_mipmap_tree to store the clear color for both color
    and depth as an isl_color_value.

 2) Changes the depth/stencil emit code to do the format conversion of
    the depth clear value on Haswell and earlier instead of pulling a
    uint32_t directly from the miptree.

 3) Changes ISL's depth/stencil emit code to perform the format
    conversion of the depth clear value on Haswell and earlier instead
    of assuming that the depth value in the float is pre-converted.

 4) Changes blorp to pass the depth value through as a float.

 5) Changes the Vulkan driver to pass the depth value to blorp as a
    float rather than a uint.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 08:54:54 -07:00
Anuj Phogat 8d02916e0c intel: Fix broxton 2x6 way size computation
This patch is undoing the changes to way size computation
in broxton 2x6, made by below commit:

Commit: 0d576fbfbe
Author:     Anuj Phogat <anuj.phogat@gmail.com>
i965: Simplify l3 way size computations

By making use of l3_banks field in gen_device_info struct
l3_way_size for gen7+ = 2 * l3_banks.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101306
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 21:30:51 -07:00
Eric Engestrom 63a8a88ac4 tree-wide: remove trailing backslash
Simple search for a backslash followed by two newlines.
If one of the newlines were to be removed, this would cause issues, so
let's just remove these trailing backslashes.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-06-07 01:18:09 +01:00
Alex Smith 922b038864 anv: Set better descriptor set limits
Based on discussions with Jason, Ivy Bridge and Bay Trail only actually
support 16 samplers, while newer hardware can support more than the
current limit of 64. Therefore set the lower limit where needed, and
bump up to 128 for everything else. There is also a limit on the total
number of other resources of around 250.

This allows Dawn of War III to render correctly on ANV.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-06 08:20:09 -07:00
Alex Smith 59c1797d56 anv: Set driver version to Mesa version
As already done by RADV.

v2: Move version calculation function to src/vulkan/util to share with
    RADV.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-06 08:20:00 -07:00
Alex Smith 621b3410f5 util/vulkan: Move Vulkan utilities to src/vulkan/util
We have Vulkan utilities in both src/util and src/vulkan/util. The
latter seems a more appropriate place for Vulkan-specific things, so
move them there.

v2: Android build system changes (from Tapani Pälli)

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-06 08:17:13 -07:00
Lionel Landwerlin 2ef73473c8 intel: gen-decoder: rework how we handle groups
The current way of handling groups doesn't seem to be able to handle
MI_LOAD_REGISTER_* with more than one register. This change reworks
the way we handle groups by building a traversal list on loading the
GENXML files.

Let's say you have

Instruction {
  Field0
  Field1
  Field2
  Group0 (count=2) {
    Field0-0
    Field0-1
  }
  Group1 (count=4) {
    Field1-0
    Field1-1
  }
}

We build of linked on load that goes :

Instruction -> Group0 -> Group1

All of those are gen_group structures, making the traversal trivial.
We just need to iterate groups for the right number of timers (count
field in genxml).

The more fancy case is when you have only a single group of unknown
size (count=0). In that case we keep on reading that group for as long
as we're within the DWordLength of that instruction.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-06 14:04:37 +01:00
Kenneth Graunke 9cd69022d5 i965: Change INTEL_DEBUG=vec4 to INTEL_SCALAR_VS for consistency.
We moved to INTEL_SCALAR_* when we added more than a single stage, but
never went back and converted the VS to work that way.  Be consistent.

Also update the documentation to actually mention these debug variables.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-05 23:32:40 -07:00
Anuj Phogat 0d576fbfbe i965: Simplify l3 way size computations
By making use of l3_banks field in gen_device_info struct
l3_way_size for gen7+ = 2 * l3_banks.

V2: Keep the get_l3_way_size() function.

Suggested-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-06-02 16:21:56 -07:00
Anuj Phogat eb23be1d97 i965: Add and initialize l3_banks field for gen7+
This new field helps simplify l3 way size computations
in next patch.

V2: Initialize the l3_banks to 0 in macros.

Suggested-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-06-02 16:21:56 -07:00
Jason Ekstrand 1a22c4c960 intel/blorp: Handle gen6 stencil/HiZ offsets in the back-end
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:34:01 -07:00
Jason Ekstrand d065a9540c intel/isl: Add a helper for getting the byte/tile offset of a subimage
Frequently, get_image_offset_sa is combined with get_intratile_offset_sa
so it makes sense to have a single helper to do both.  If the caller
doesn't want the intratile offsets, it can simply pass NULL and ISL will
assert that they are 0.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:58 -07:00
Jason Ekstrand b178762d05 intel/isl: Make get_intratile_offset_el take the element size in bits
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:56 -07:00
Jason Ekstrand 757f7087a5 intel/isl: Add a new layout for HiZ and stencil on Sandy Bridge
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:47 -07:00
Jason Ekstrand cb8cdab8e8 intel/isl: Generate phys_total_el from isl_calc_phys_extent
The only surface layout for which slice0 makes any sense is GEN4_2D.
Move all of the slice0 stuff into isl_calc_phys_total_extent_el_gen4_2d
and make the others trivially return the total size in surface elements.
As a side-effect, array_pitch_el_rows is now returned from these helpers
as well.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:45 -07:00
Jason Ekstrand 918f41bb29 intel/isl: Don't check array pitch for gen4 3D textures
Array pitch doesn't matter in this layout.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:43 -07:00
Jason Ekstrand 044bfb292f intel/isl: Refactor to use a phys_total_el extent.
We've already implicitly been using a physical total size in surface
elements.  This just centralizes things a bit.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:41 -07:00
Jason Ekstrand 1547d133ac intel/isl: Add an isl_assert_div helper
This is a fairly common operation and it's nice to be able to just call
the one little function.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:39 -07:00
Jason Ekstrand 58051ad220 intel/isl: Refactor isl_calc_array_pitch_el_rows
Over 90% of the function only applies to ISL_DIM_LAYOUT_GEN4_2D anyway
so we can just handle the other two as special cases at the top.  The
two "generic" cases below the switch only apply on gen9 and above and
only to 3D or CCS surfaces.  This implies that they only apply to
surfaces with ISL_DIM_LAYOUT_GEN4_2D.  Making them look generic is a
lie.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:37 -07:00
Jason Ekstrand fe13c59c1b intel/isl: Move isl_calc_array_pitch_el_rows higher up
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:34 -07:00
Jason Ekstrand c1a70165be intel/isl: Remove the device parameter from isl_tiling_get_info
We were only using it for validating that we don't use Ys/Yf on gen8 and
earlier.  Removing it from isl_tiling_get_info lets us remove it from a
bunch of other things that had no business needing a hardware
generation.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:31 -07:00
Kenneth Graunke fe14a9a501 i965: Drop duplicate shadow variable.
We already initialized this at the top of the function.

Trivial.
2017-06-01 14:28:12 -07:00
Kenneth Graunke fe9699dcb4 genxml: Make 3DSTATE_CONSTANT_BODY on Gen7+ use arrays.
This will let us initialize the constant buffers with loops.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-01 11:49:46 -07:00
Kenneth Graunke 12303bd390 genxml: Fix decoder to print the array element on field members.
Previously we'd print things like:

   0xfffbb568:  0x00010000 : Dword 1
       ReadLength: 0
       ReadLength: 1
   0xfffbb568:  0x00000001 : Dword 1
       ReadLength: 1
       ReadLength: 0

instead of the more obvious:

   0xfffbb568:  0x00010000 : Dword 1
       ReadLength[0]: 0
       ReadLength[1]: 1
   0xfffbb568:  0x00000001 : Dword 1
       ReadLength[2]: 1
       ReadLength[3]: 0

(Yes, the ralloc context here is bogus - the decoder leaks just about
everything.  We need to use proper ralloc contexts someday...)

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-01 11:49:46 -07:00
Kenneth Graunke 73c21e69d0 genxml: Fix decoding of array groups.
If you had a group as the first element of a struct, i.e.

  <struct name="3DSTATE_CONSTANT_BODY" length="10">
    <group count="4" start="0" size="16">
      <field name="ReadLength" start="0" end="15" type="uint"/>
    </group>
    ...
  </struct>

we would get a group_offset of 0, causing create_field() to think the
field wasn't in a group, and fail to offset forward for successive array
elements.  So we'd mark all the array elements as offset 0.

Using ctx->group->elem_size is a better check for "are we in a group?".

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-01 11:49:45 -07:00
Kenneth Graunke d1b949282f genxml: Fix decoder for groups with multiple fields.
If you have something like:

    <group count="0" start="96" size="32">
      <field name="Entry_0" start="0" end="15" type="GATHER_CONSTANT_ENTRY"/>
      <field name="Entry_1" start="16" end="31" type="GATHER_CONSTANT_ENTRY"/>
    </group>

We would reset ctx->group_count to 0 after processing the first field,
so the second would not have a group count.

This is largely untested, as the only groups with multiple fields are
packets we don't emit in Mesa.  Found by inspection.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-01 11:49:45 -07:00
Kenneth Graunke df2d55ba57 genxml: Fix parsing of address fields in groups.
For example,

    <group count="4" start="64" size="64">
      <field name="Pointer" start="5" end="63" type="address"/>
    </group>

used to generate:

   const uint64_t v2_address =
      __gen_combine_address(data, &dw[2], values->Pointer, 0);
   ...
   const uint64_t v4_address =
      __gen_combine_address(data, &dw[4], values->Pointer, 0);
   ...

but now generates code with proper subscripts:

   const uint64_t v2_address =
      __gen_combine_address(data, &dw[2], values->Pointer[0], 0);
   ...
   const uint64_t v4_address =
      __gen_combine_address(data, &dw[4], values->Pointer[1], 0);
   ...

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-01 11:49:45 -07:00
Kenneth Graunke 65f5f3c85c i965: Move SOL PSIZ hacks from draw time to link time.
We can just update the gl_transform_feedback_info fields at link time
to make the VUE header fields have the right location and component.
Then we don't need to handle them specially at draw time, which is
expensive.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-01 00:08:29 -07:00
Kenneth Graunke 56535959fd anv: Port over CACHE_MODE_1 optimization fix enables from brw.
Ben and I haven't observed these to help anything, but they enable
hardware optimizations for particular cases.  It's probably best to
enable them ahead of time, before we run into such a case.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-30 14:59:31 -07:00
Kenneth Graunke 53368b008e genxml: Add Gen9 CACHE_MODE_1 definitons.
These were already in gen8.xml but not gen9.xml.  There are a few new
fields and a couple that have changed.  These are all documented in the
Skylake PRM, Volume 2c Command Reference: Registers, Part 1.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-30 14:59:31 -07:00
Kenneth Graunke 9afe5846d2 genxml: Make a SCISSOR_RECT structure on Gen4-5.
Gen6+ support multiple scissor rectangles, and define a SCISSOR_RECT
structure containing their dimensions.  On Gen4-5, those same fields
exist in SF_VIEWPORT.

This patch extracts the SF_VIEWPORT fields into a SCISSOR_RECT
structure.  Although not a named concept on Gen4-5, it works just
as well, and gives us a consistent SCISSOR_RECT structure across
all generations, making it easier to reuse code.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-29 21:46:37 -07:00
Kenneth Graunke 1e3880544e i965: Ignore INTEL_SCALAR_* debug variables on Gen10+.
Scalar mode has been default since Broadwell, and vector mode is getting
increasingly unmaintained.  There are a few things that don't even fully
work in vector mode on Skylake, but we've never cared because nobody
uses it.  There's no point in porting it forward to new platforms.

So, just ignore the debug options to force it on.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-29 21:40:44 -07:00
Emil Velikov 3e8790bff0 anv: automake: list shared libraries after the static ones
The compiler can discard the shared ones from the link chain, since
there is no user (the static libraries) before it on the command line.

Cc: mesa-stable@lists.freedesktop.org
Reported-by: Laurent Carlier <lordheavym@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-05-29 16:42:41 +01:00
Jason Ekstrand 79f2a5541f i965: Use BLORP for color clears on gen4-5
We don't support replicated data clears yet.  Those take a bit more work
and enabling replicated data clears in its own commit is probably better
for bisectibility anyway.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand fa13ef285d intel/blorp: Assert that no one tries to blit combined depth stencil
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 752d7af77a i965: Add blorp support for gen4-5
Due to complications with things such as URB setup on gen4-5, it's
easier to keep gen4 support in blorp completely internal to i965.  This
makes things a bit awkward because that means there's a file in i965
that includes blorp_priv.h but it's either that or have a file in blorp
that includes brw_context.h.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 23125b7102 intel/blorp: Set additional brw_wm_prog_key fields on gen4-5
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 0ed6f196fc intel/blorp: Add support for gen4-5 SF programs
As part of enabling support for SF programs, we plumb the SF URB size
through to emit_urb_config.  For now, it's always zero but, on gen4, it
may be something larger.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 8bce7bda45 intel/blorp: Make convert_to_single_slice available outside blorp_blit
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 110061afa2 intel/blorp: Use designated initializers to set up VERTEX_ELEMENTS
We also add a slot variable and use it as an iterator.  This will make
it much easier to conditionally put something between the header and the
vertex position.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand ac79806766 intel/blorp: Rename emit_viewport_state to emit_cc_viewport
The real point of this packet is that it sets up CC_VIEWPORT so that
name is a bit better.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 1f2f90be1f intel/blorp: Make the common genX_blorp_exec code gen4-safe
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand a7f5d6df8a intel/blorp: Re-arrange blorp_genX_exec.h
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 302c0488cf intel/blorp: Don't use ffma directly
It isn't supported prior to gen6 and, on gen6+, NIR will fuse the fmul
and fadd into an ffma automatically for us anyway.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 675ec434f3 intel/blorp: Delete isl_to_gen_ds_surfype
It's no longer used.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand e80f0840bf intel/blorp: Pull the pipeline bits of blorp_exec into a helper
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 3d35e5a51e intel/blorp/blit: Add support for normalized coordinates
Gen5 and earlier can't do non-normalized coordinates so we need to
compensate in the shader.  Fortunately, it's pretty easy plumb through.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 18e18a1863 i965: Move clip program compilation to the compiler
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 9fb8a8775b i965: Move SF compilation to the compiler
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 21ba2b4bef intel/compiler: Make brw_disasm take const assembly
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand c336c224a6 intel/decoder: Handle the BLT ring in gen_group_get_length
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 9d1001c8e5 intel/decoder: Handle gen4 VF_STATISTICS and PIPELINE_SELECT
These need special handling because they have no "DWord Length"
parameter and they have an unusual bias of 1.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 87588e546e intel/genxml: Rename 3DSTATE_AA_LINE_PARAMS on gen5
All of the other gens use "PARAMETERS".

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 04f6d975e1 intel/genxml: Use the right subtype for VF_STATISTICS on gen4
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 1fcc5e2399 intel/genxml: Iron Lake doesn't support non-normalized sampler coordinates
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 648b618dc5 intel/genxml: Add SAMPLER_STATE to gen 4.5
Somehow this got missed.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 3f8ee8c703 intel/genxml: Rename the CC_VIEWPORT pointer on gen4-5
It isn't a pointer to "color calc state", that's the packet it's in.
It's a pointer to the CC viewport state.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 0ee1ef0cbb intel/genxml: Sampler state is a pointer on gen4-5
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 64243d3b8e intel/genxml: Suffix KSP0 fields on Iron Lake
Iron Lake introduced the multiple KSP thing and so you have KSP0-3.
However, the genxml didn't have an index on the first "Kernel Start
Pointer" or "GRF Register Count".  Add one to match gen6+.  While we're
here, we drop the brackets from the other "GRF Register Count" fields.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 7769e448aa intel/genxml: Make a bunch of things offsets on gen4-5
Most things on gen4-5 are addresses because we don't have dynamic state
base address and we don't have instruction state base on gen4.  However,
whoever converted things to addresses got a little over-excited and
converted too much.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 8257fe7b18 intel/isl: Add gen4_filter_tiling
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 332a5d7a3f intel/isl: Add support for setting component write disables
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 8958355549 intel/isl: Add support for gen4 cube maps to get_image_offset_sa
Gen4 cube maps are a 2-D surface with ISL_DIM_LAYOUT_GEN4_3D which is a
bit weird but accurate none the less.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand b9b7792d9a intel/isl: Don't request space for stencil/hiz packets unless needed
On Iron Lake, the packets exist but we never emit them so there's no
need for us to ask the driver to make batch space for them.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 554a1731a5 intel/blorp: Move the gen7 stencil format workaround to blorp_blit
It's not needed for blorp_copy because it already overrides formats.
It's also not needed for blorp_clear because it clears stencil as
stencil.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Lionel Landwerlin 359fa0e9a0 aubinator: report error on unknown device id
Since we're going to stop aubinator without a valid device id, better
report an error. This also silences a Coverity warning.

CID: 1405004
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-24 10:50:18 +01:00
Lionel Landwerlin 8f1f1d294d aubinator: be consistent on exit code
We're using both exit(1) & exit(EXIT_FAILURE), settle for one, same
for success.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-24 10:50:18 +01:00
Lionel Landwerlin 6200d835a0 aubinator: fix double free
1;4601;0c
Free previously allocated filename outside the for loop.

CID: 1405014
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-24 10:50:18 +01:00
Jason Ekstrand 39adea9330 anv: Require vertex buffers to come from a 32-bit heap
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 17:37:43 -07:00
Jason Ekstrand 50d0eb5096 anv: Advertise both 32-bit and 48-bit heaps when we have enough memory
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 17:37:42 -07:00
Jason Ekstrand 34581fdd4f anv: Refactor memory type setup
This makes us walk over the heaps one at a time and add the types for
LLC and !LLC to each heap.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:42 -07:00
Jason Ekstrand b83b1af6f6 anv: Make supports_48bit_addresses a heap property
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:40 -07:00
Jason Ekstrand 00df1cd9d6 anv: Stop setting BO flags in bo_init_new
The idea behind doing this was to make it easier to set various flags.
However, we have enough custom flag settings floating around the driver
that this is more of a nuisance than a help.  This commit has the
following functional changes:

 1) The workaround_bo created in anv_CreateDevice loses both flags.
    This shouldn't matter because it's very small and entirely internal
    to the driver.

 2) The bo created in anv_CreateDmaBufImageINTEL loses the
    EXEC_OBJECT_ASYNC flag.  In retrospect, it never should have gotten
    EXEC_OBJECT_ASYNC in the first place.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:38 -07:00
Jason Ekstrand 10fad58b31 anv: Set image memory types based on the type count
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:36 -07:00
Jason Ekstrand f7736ccf53 anv: Add valid_bufer_usage to the memory type metadata
Instead of returning valid types as just a number, we now walk the list
and check the buffer's usage against the usage flags we store in the new
anv_memory_type structure.  Currently, valid_buffer_usage == ~0.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:34 -07:00
Jason Ekstrand 92325a7efc anv: Determine the type of mapping based on type metadata
Before, we were just comparing the type index to 0.  Now we actually
look the type up in the table and check its properties to determine what
kind of mapping we want to do.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:32 -07:00
Jason Ekstrand c1f4343807 anv: Set up memory types and heaps during physical device init
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:30 -07:00
Jason Ekstrand eceaf7e234 anv: Predicate 48bit support on gen >= 8
This doesn't matter right now since it only affects whether or not we
set the kernel bit but, if we ever do anything else based on it, we'll
want it to be correct per-gen.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:27 -07:00
Jason Ekstrand 4eecd534f0 anv/image: Get rid of the memset(aux, 0, sizeof(aux)) hack
Up until now, we've been memsetting the auxiliary surface to 0 at
BindImageMemory time to ensure that it is properly initialized.
However, this isn't correct because apps are allowed to freely alias
memory between different images and buffers so long as they properly
track whether or not a particular image is valid and, if it isn't,
transition from UNINITIALIZED to something else before using it.  We
now implement those transitions so we can drop the hack.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:22 -07:00
Jason Ekstrand cc45c4bb80 anv: Handle transitioning depth from UNDEFINED to other layouts
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:20 -07:00
Jason Ekstrand 75edecf502 anv: Handle color layout transitions from the UNINITIALIZED layout
This causes dEQP-VK.api.copy_and_blit.resolve_image.partial.* to start
failing due to test bugs.  See CL 1031 for a test fix.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:03 -07:00
Nanley Chery 52a6fd9871 intel/isl: Add ASTC HDR to format lists and helpers
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2017-05-22 11:13:53 -07:00
Tapani Pälli f0051fcf2b android: add -Wl,--build-id=sha1 to LDFLAGS for libvulkan_intel
Just like is done on desktop and what is expected by the build-id code.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-20 08:59:57 +03:00
Emil Velikov acf3d2afab configure: check once for DRI3 dependencies
Currently we are having the XCB_DRI3 dependencies duplicated,
partially.

Just do a once-off check and add all of the respective CFLAGS/LIBS
where needed.

As a nice side effect this helps us solve a couple of FIXMEs.

DRI3 is not a thing w/o X11 so disable it in such cases.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-05-19 19:44:15 +01:00
Nanley Chery 56458cb168 anv/formats: Update the three-channel BC1 mappings
The procedure for decompressing an opaque BC1 Vulkan format is dependant on the
comparison of two colors stored in the first 32 bits of the compressed block.
Here's the specified OpenGL (and Vulkan) behavior for reference:

   The RGB color for a texel at location (x,y) in the block is given by:

      RGB0,              if color0 > color1 and code(x,y) == 0
      RGB1,              if color0 > color1 and code(x,y) == 1
      (2*RGB0+RGB1)/3,   if color0 > color1 and code(x,y) == 2
      (RGB0+2*RGB1)/3,   if color0 > color1 and code(x,y) == 3

      RGB0,              if color0 <= color1 and code(x,y) == 0
      RGB1,              if color0 <= color1 and code(x,y) == 1
      (RGB0+RGB1)/2,     if color0 <= color1 and code(x,y) == 2
      BLACK,             if color0 <= color1 and code(x,y) == 3

The sampling operation performed on an opaque DXT1 Intel format essentially
hard-codes the comparison result of the two colors as color0 > color1. This
means that the behavior is incompatible with OpenGL and Vulkan. This is stated
in the SKL PRM, Vol 5: Memory Views:

   Opaque Textures (DXT1_RGB)
      Texture format DXT1_RGB is identical to DXT1, with the exception that the
      One-bit Alpha encoding is removed. Color 0 and Color 1 are not compared, and
      the resulting texel color is derived strictly from the Opaque Color Encoding.
      The alpha channel defaults to 1.0.

      Programming Note
      Context: Opaque Textures (DXT1_RGB)
      The behavior of this format is not compliant with the OGL spec.

The opaque and non-opaque BC1 Vulkan formats are specified to be decoded in
exactly the same way except the BLACK value must have a transparent alpha
channel in the latter. Use the four-channel BC1 Intel formats with the alpha
set to 1 to provide the behavior required by the spec.

v2 (Kenneth Graunke):
- Provide a more detailed commit message.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100925
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2017-05-18 16:46:15 -07:00
Jason Ekstrand c499faebd7 anv: Add an option to abort on device loss
This is mostly for running in our CI system to prevent dEQP from
continuing on to the next test if we get a GPU hang.  As it currently
stands, dEQP uses the same VkDevice for almost all tests and if one of
the tests hangs, we set the anv_device::device_lost flag and report
VK_ERROR_DEVICE_LOST for all queue operations from that point forward
without sending anything to the GPU.  dEQP will happily continue trying
to run tests and reporting failures until it eventually gets crash that
forces the test runner to start over.  This circumvents the problem by
just aborting the process if we ever get a GPU hang.  Since this is not
the recommended behavior most of the time, we hide it behind an
environment variable.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-05-18 16:32:11 -07:00
Jason Ekstrand 53f997de77 anv: Wrap the device lost error in vk_error in QueueSubmit
We weren't wrapping this before because anv_cmd_buffer_execbuf may throw
a more meaningful error message.  However, we do change the error code
into VK_ERROR_DEVICE_LOST, so we should print a new message.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-05-18 16:32:11 -07:00
Iago Toral Quiroga 2322ddf548 anv: fix multiview for clear commands
According to the VK_KHX_multiview spec:

"Multiview causes all drawing and clear commands in the subpass to
behave as if they were broadcast to each view, where each view is
represented by one layer of the framebuffer attachments."

This adds support for multiview clears, which were missing in the
initial implementation.

v2 (Jason):
  - split multiview from regular case
  - Use for_each_bit() macro

Fixes new CTS multiview tests:
dEQP-VK.multiview.clear_attachments.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-18 11:53:25 +02:00
Samuel Iglesias Gonsálvez e69e5c7006 i965/vec4: load dvec3/4 uniforms first in the push constant buffer
Reorder the uniforms to load first the dvec4-aligned variables in the
push constant buffer and then push the vec4-aligned ones. It takes
into account that the relocated uniforms should be aligned to their
channel size.

This fixes a bug were the dvec3/4 might be loaded one part on a GRF and
the rest in next GRF, so the region parameters to read that could break
the HW rules.

v2:
- Fix broken logic.
- Add a comment to explain what should be needed to optimise the usage
  of the push constant buffer slots, as this patch does not pack the
  uniforms.

v3:
- Implemented the push constant buffer usage optimization.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Acked-by: Francisco Jerez <currojerez@riseup.net>
2017-05-18 06:49:54 +02:00
Samuel Iglesias Gonsálvez 8aa6ada838 i965/vec4: fix swizzle and writemask when loading an uniform with constant offset
It was setting XYWZ swizzle and writemask to all uniforms, no matter if they
were a vector or scalar, so this can lead to problems when loading them
to the push constant buffer.

Moreover, 'shift' calculation was designed to calculate the offset in
DWORDS, but it doesn't take into account DFs, so the calculated swizzle
for the later ones was wrong.

The indirect case is not changed because MOV INDIRECT will write
to all components. Added an assert to verify that these uniforms
are aligned.

v2:
- Fix 'shift' calculation (Curro)
- Set both swizzle and writemask.
- Add assert(shift == 0) for the indirect case.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-05-18 06:49:54 +02:00
Samuel Iglesias Gonsálvez 354f7f2cb9 i965/vec4/gs: restore the uniform values which was overwritten by failed vec4_gs_visitor execution
We are going to add a packing feature to reduce the usage of the push
constant buffer. One of the consequences is that 'nr_params' would be
modified by vec4_visitor's run call, so we need to restore it if one of
them failed before executing the fallback ones. Same thing happens to the
uniforms values that would be reordered afterwards.

Fixes GL45-CTS.arrays_of_arrays_gl.InteractionFunctionCalls2 when
the dvec4 alignment and packing patch is applied.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Acked-by: Francisco Jerez <currojerez@riseup.net>
2017-05-18 06:49:28 +02:00
Chih-Wei Huang bfc0c23843 Android: correct libz dependency
Commit 6facb0c0 ("android: fix libz dynamic library dependencies")
unconditionally adds libz as a dependency to all shared libraries.
That is unnecessary.

Commit 85a9b1b5 introduced libz as a dependency to libmesa_util.
So only the shared libraries that use libmesa_util need libz.

Fix Android Lollipop build by adding the include path of zlib to
libmesa_util explicitly instead of getting the path implicitly
from zlib since it doesn't export the include path in Lollipop.

Fixes: 6facb0c0 "android: fix libz dynamic library dependencies"

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rob Herring <robh@kernel.org>
2017-05-17 14:04:18 +01:00
Jason Ekstrand e0d6f9afba intel/isl/gen6: Fix combined depth stencil alignment
All combined depth stencil buffers (even those with just stencil)
require a 4x4 alignment on Sandy Bridge.  The only depth/stencil buffer
type that requires 4x2 is separate stencil.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-16 17:04:26 -07:00
Jason Ekstrand 74d626f383 intel/isl: Refactor gen8_choose_image_alignment_el
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-16 17:04:26 -07:00
Jason Ekstrand 2486c7dd54 intel/isl: Refactor gen6_choose_image_alignment_el
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-16 17:04:26 -07:00
Jason Ekstrand 715f47cb34 intel/isl: Refactor gen7_choose_image_alignment_el
The Ivy Bridge PRM provides a nice table that handles most of the
alignment cases in one place.  For standard color buffers we have a
little freedom of choice but for most depth, stencil and compressed it's
hard-coded.  Chad's original functions split halign and valign apart and
implemented them almost entirely based on restrictions and not the
table.  This makes things way more confusing than they need to be.  This
commit gets rid of the split and makes us implement the exact table
up-front.  If our surface isn't one of the ones in the table then we
have to make real choices.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-16 17:04:26 -07:00
Pohjolainen, Topi 236f17a9f7 intel/isl/gen7: Use stencil vertical alignment of 8 instead of 4
The reasoning Chad gave in the comment for choosing a valign of 4 is
entirely bunk.  The fact that you have to multiply pitch by 2 is
completely unrelated to the halign/valign parameters used for texture
layout.  (Not completely unrelated.  W-tiling is just Y-tiling with a
bit of extra swizzling which turns 8x8 W-tiled chunks into 16x4 y-tiled
chunks so it makes everything easier if miplevels are always aligned to
8x8.)  The fact that RENDER_SURFACE_STATE::SurfaceVerticalAlignmet
doesn't have a VALIGN_8 option doesn't matter since this is gen7 and you
can't do stencil texturing anyway.

v2 (Jason Ekstrand):
 - Delete most of Chad's comment and add a more descriptive commit
   message.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "17.0 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-16 17:04:26 -07:00
Matt Turner 169e1e26ee i965: Fix test_eu_validate.cpp
Broken by commit a7217e909c ("i965: Pass pointer and end of assembly
to brw_validate_instructions").

Reported-by: Aaron Watry <awatry@gmail.com>
2017-05-16 11:45:07 -07:00
Jason Ekstrand b5437fc05c anv: Implement VK_KHR_get_surface_capabilities2
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-16 08:38:46 -07:00
Matt Turner b1af896853 intel/aubinator_error_decode: Disassemble shader programs
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-15 12:04:04 -07:00
Matt Turner 23685f07d1 intel/aubinator_error_decode: Stop decoding after MI_BATCH_BUFFER_END
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-15 11:43:20 -07:00
Matt Turner 8e7221fa5a intel/tools: Refactor gen_disasm_disassemble() to use annotations
Which will allow us to print validation errors found in shader assembly
in GPU hang error states.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-15 11:43:14 -07:00
Matt Turner aaa0329b5f intel/decoder: Fix indentation
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-15 11:43:13 -07:00
Matt Turner 3443bd45a3 genxml: Remove brackets from kernel start pointer names
Newer Gens' names don't have the brackets. Having common names will make
some later patches simpler.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-05-15 11:43:11 -07:00
Matt Turner aae2626be8 i965: Add a weak no-op nir_print_instr() symbol
intel_asm_annotation.c is part of libintel_compiler.la, which contains
code for disassembling and validating shaders that we want to call in
aubinator_error_decode.

dump_assembly() calls nir_print_instr() to print annotations, and
although dump_assembly() is not called by aubinator_error_decode (nor is
any function in intel_asm_annotation.c) it causes undefined references
to nir_print_instr().

To work around, provide a no-op weak symbol to resolve against.
2017-05-15 11:43:01 -07:00
Matt Turner d98e82c772 i965: Allow brw_eu_validate to handle compact instructions
This will allow the validator to run on shader programs we find in the
GPU hang error state.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-15 11:42:56 -07:00
Matt Turner a7217e909c i965: Pass pointer and end of assembly to brw_validate_instructions
This will allow us to more easily run brw_validate_instructions() on
shader programs we find in GPU hang error states.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-15 11:42:47 -07:00
Lionel Landwerlin 55be6653e0 intel: gen-decoder: fix xml parser leak
In the unlikely case the parsing of genxml files fails, we were
leaking an xml parser object.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-05-15 14:06:11 +01:00
Rafael Antognolli d9b4a81672 genxml: Add alias for MOCS.
Use an alias for this field on 3DSTATE_INDEX_BUFFER on gen6+, so we can set
the same value as the defines.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-05-11 21:27:38 -07:00
Kenneth Graunke f790d6e0b4 i965: Port Gen4-5 VS_STATE to genxml.
It's actually not that much code.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-05-11 16:52:59 -07:00
Kenneth Graunke d65e19f5c6 genxml: Fix KSPs on Ironlake to be offsets, not pointers.
We use Instruction State Base Address on Ironlake, so we want KSP to be
an offset not an actual pointer.  Gen4/G45 use pointers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-11 16:33:48 -07:00
Emil Velikov 4c22b99953 anv: document that anv_gem_mmap returns MAP_FAILED on error
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-11 13:58:20 +01:00
Kenneth Graunke 620f12a53f i965: Drop INTEL_DEBUG=stats.
For whatever reason, we had an INTEL_DEBUG=stats option that enabled
various statistics counters on Gen4-5 systems.  It's been around
forever, though I can't think of a single time that it's been useful.

On Gen6+, we enable statistics all the time because they're necessary
to support various query object targets.  Turning them off would break
those queries.

Gen4-5 don't support those queries, so the statistics counters generally
aren't useful; we disabled them by default.  This patch disables them
altogether.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-10 11:37:19 -07:00
Grazvydas Ignotas 0ef302638f anv: don't leak DRM devices
After successful drmGetDevices2() call, drmFreeDevices() needs to be
called.

Fixes: b1fb6e8d "anv: do not open random render node(s)"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> # radv version
2017-05-10 01:13:44 +03:00
Grazvydas Ignotas e0aee8b667 anv: fix possible stack corruption
drmGetDevices2 takes count and not size. Probably hasn't caused problems
yet in practice and was missed as setups with more than 8 DRM devices
are not very common.

Fixes: b1fb6e8d "anv: do not open random render node(s)"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-05-10 01:13:44 +03:00
Jason Ekstrand 037ce253b1 i965/vec4: Delete the system value infastructure
The only thing still using it is INVOCATION_ID for geometry shaders.
That's easily enough inlined into the nir_intrinsic_load_invocation_id
handling code.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:08:07 -07:00
Jason Ekstrand 2e9916ea04 i965/vec4: Use NIR to do GS input remapping
We're already doing this in the FS back-end.  This just does the same
thing in the vec4 back-end.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:08:07 -07:00
Jason Ekstrand e31042ab40 i965/fs: Move remapping of gl_PointSize to the NIR level
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:08:06 -07:00
Jason Ekstrand 5b00c3cc05 i965/nir: Inline remap_inputs_with_vue_map
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:08:06 -07:00
Jason Ekstrand 0d5f89cdc3 i965/vec4: Use NIR remapping for VS attributes
The NIR pass already handles remapping system values to attributes for
us so we delete the system value code as part of the conversion.

We also change nir_lower_vs_inputs to take an explicit inputs_read
bitmask and pass in the inputs_read from prog_data instead from pulling
it out of NIR.  This is because the version in prog_data may get
EDGEFLAG added to it on some old platforms.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:08:06 -07:00
Jason Ekstrand 80aa6e9d32 intel/compiler/vs: Move inputs_read handling to generic code
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:08:03 -07:00
Jason Ekstrand d2fe804d18 i965/vec4: Set VERT_BIT_EDGEFLAG based on the VUE map
We also add a nice little comment to make it more clear exactly what
happens with the edge flag copy.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:07:47 -07:00
Jason Ekstrand ca4d192802 i965/fs: Lower gl_VertexID and friends to inputs at the NIR level
NIR calls these system values but they come in from the VF unit as
vertex data.  It's terribly convenient to just be able to treat them as
such in the back-end.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:07:47 -07:00
Jason Ekstrand 24e6fba500 i965/vs: Set uses_vertexid and friends from brw_compile_vs
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:07:47 -07:00
Jason Ekstrand 5e832302dc i965: Move multiply by 4 for VS ATTR setup into the scalar backend.
The vec4 backend will want to count in units of vec4s, not scalar
components.  The simplest solution is to move the multiplication by 4
into the scalar backend.  This also improves consistency with how we
count varyings.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:07:47 -07:00
Jason Ekstrand 36764b6923 i965/nir: Inline remap_vs_attrs
Now that we have nice block iterators, there's no good reason for this
to be off on it's own.  While we're here, we convert to using the NIR
const index getters/setters instead of whacking const_index values
directly.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:07:47 -07:00
Jason Ekstrand b86dba8a0e nir: Embed the shader_info in the nir_shader again
Commit e1af20f18a changed the shader_info
from being embedded into being just a pointer.  The idea was that
sharing the shader_info between NIR and GLSL would be easier if it were
a pointer pointing to the same shader_info struct.  This, however, has
caused a few problems:

 1) There are many things which generate NIR without GLSL.  This means
    we have to support both NIR shaders which come from GLSL and ones
    that don't and need to have an info elsewhere.

 2) The solution to (1) raises all sorts of ownership issues which have
    to be resolved with ralloc_parent checks.

 3) Ever since 00620782c9, we've been
    using nir_gather_info to fill out the final shader_info.  Thanks to
    cloning and the above ownership issues, the nir_shader::info may not
    point back to the gl_shader anymore and so we have to do a copy of
    the shader_info from NIR back to GLSL anyway.

All of these issues go away if we just embed the shader_info in the
nir_shader.  There's a little downside of having to copy it back after
calling nir_gather_info but, as explained above, we have to do that
anyway.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:07:47 -07:00
Lionel Landwerlin 32f14332f5 intel: compiler: prevent integer overflow
CID: 1399477, 1399478 (Integer handling issues)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-05-09 13:56:17 +01:00
Lionel Landwerlin 85182e490c intel: compiler: remove duplicated code
CID: 1399470: (Control flow issues)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-05-09 13:56:17 +01:00
Lionel Landwerlin 4201b7d1bf intel: gen decoder: don't check for size_t negative values
We should get either 0 or 1 here.

CID: 1373562 (Control flow issues)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2017-05-09 13:54:08 +01:00
Lionel Landwerlin e3a5ab2d66 anv: check return value of anv_execbuf_add_bo
CID: 1405919 (Error handling issues)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-08 14:38:27 +01:00
Lionel Landwerlin 6247b8b413 anv: avoid null pointer dereference
The application might not give an output structure.

CID: 1405765 (Null pointer dereferences)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-08 14:38:27 +01:00
Jason Ekstrand e05e3e07ab anv/allocator: Only write to _vg_ptr if we have valgrind
This fixes the build when not building against valgrind headers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100945
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-05 12:49:51 -07:00
Iago Toral Quiroga 7761cf6d01 anv/query: handle more cases of 'out of host memory'
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-05 08:53:33 +02:00
Jason Ekstrand 98cd512089 anv/allocator: Improve block pool growing asserts
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand 24827fdf50 anv: Drop the instruction pool block size
Now that we can allocate states larger than the block size, we no longer
need a block size of 1MB which can be rather wasteful.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand 955127db93 anv/allocator: Add support for large stream allocations
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand f82d3d38b6 anv/allocator: Allow state pools to allocate large states
Previously, the maximum size of a state that could be allocated from a
state pool was a block.  However, this has caused us various issues
particularly with shaders which are potentially very large.  We've also
hit issues with render passes with a large number of attachments when we
go to allocate the block of surface state.  This effectively removes the
restriction on the maximum size of a single state.  (There's still a
limit of 1MB imposed by a fixed-length bucket array.)

For states larger than the block size, we just grab a large block off of
the block pool rather than sub-allocating.  When we go to allocate some
chunk of state and the current bucket does not have state, we try to
pull a chunk from some larger bucket and split it up.  This should
improve memory usage if a client occasionally allocates a large block of
state.

This commit is inspired by some similar work done by Juan A. Suarez
Romero <jasuarez@igalia.com>.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand 8c079b566e anv/allocator: Support pushing multiple blocks onto a free list at once
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand 8769fb48fb anv/allocator: Add helpers for dealing with bucket sizes
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand 12043ca696 anv/allocator: Add the capability to allocate blocks of different sizes
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand 01170df262 anv/allocator: Rework a comment
This commit just fixes up the English a bit and re-flows the comment.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand bcc5d0defb anv/allocator: Tweak the block pool growing algorithm
The old algorithm worked fine assuming a constant block size.  We're
about to break that assumption so we need an algorithm that's a bit more
robust against suddenly growing by a huge amount compared to the
currently allocated quantity of memory.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand d3ed72e2c2 anv/allocator: Embed the block_pool in the state_pool
Now that the state stream is allocating off of the state pool, there's
no reason why we need the block pool to be separate.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand bb2a3f0df8 anv/allocator: Get rid of the ability to free blocks
Now that everything is going through the state pools, the block pool no
longer needs to be able to handle re-use.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand 08413a81b9 anv: Allocate binding table blocks through the state pool
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand 55f49e6b7e anv/allocator: Add support for "back" allocations to state_pool
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand 49ecaf88d1 anv/allocator: Drop the block_size field from block_pool
Since the state_stream is now pulling from a state_pool, the only thing
pulling directly off the block pool is the state pool so we can just
move the block_size there.  The one exception is when we allocate
binding tables but we can just reference the state pool there as well.

The only functional change here is that we no longer grow the block pool
immediately upon creation so no BO gets allocated until our first state
allocation.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand 30d63ffe26 anv/allocator: Pull the userptr part of block_pool_grow into a helper
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand c73ce41a48 anv/allocator: Roll fixed_size_state_pool into state_pool
The helper functions aren't really gaining us as much as they claim and
are actually about to be in the way.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand 6d02ef011e anv/allocator: Remove the state_size field from fixed_size_state_pool
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand 367031a5c8 anv: Get rid of a bunch of uses of size_t
We should only use size_t when referring to sizes of bits of CPU memory.
Anything on the GPU or just a regular array length should be a type that
has the same size on both 32 and 64-bit architectures.  For state
objects, we use a uint32_t because we'll never allocate a piece of
driver-internal GPU state larger than 2GB (more like 16KB).

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand e86aeecb6a anv/allocator: Convert the state stream to pull from a state pool
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand e049dea5b2 anv/allocator: Return a null state for zero-size allocations
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand 45e1829274 anv/allocator: Add no-valgrind versions of state_pool_alloc/free
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Kenneth Graunke 9377801fbd anv: Simplify Cherryview line handling.
We can just use the new CHVLineWidth field rather than an entirely
different generation's packing function.

v2: Inline the function (requested by Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-04 16:17:34 -07:00
Kenneth Graunke 31f094e691 i965: Fix line width on Cherryview.
We just add another field to gen8.xml for the Cherryview line width,
rather than trying to replicate the gymnastics done in the Vulkan
driver to use gen9 SF pack functions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-04 16:17:34 -07:00
Emil Velikov 9d2aa6e506 anv: fix anv_gem_mmap comment to not mention NULL
The function cannot return NULL, update the comment accordingly.

Fixes: b546c9d ("anv: anv_gem_mmap() returns MAP_FAILED as mapping error")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-04 18:06:18 +01:00
Samuel Iglesias Gonsálvez 939b015736 anv: vkBindImageMemory() should return VK_ERROR_OUT_OF_{HOST,DEVICE}_MEMORY on failure
According to the spec we get VK_ERROR_OUT_OF_HOST_MEMORY or
VK_ERROR_OUT_OF_DEVICE_MEMORY on vkBindImageMemory failure.

Fixes returned value changed by b546c9d.

Fixes: b546c9d ("anv: anv_gem_mmap() returns MAP_FAILED as mapping error")
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-04 15:13:08 +02:00
Samuel Iglesias Gonsálvez b546c9d318 anv: anv_gem_mmap() returns MAP_FAILED as mapping error
Take it into account when checking if the mapping failed.

v2:
- Remove map == NULL and its related comment (Emil)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

Fixes: 6f3e3c715a ("vk/allocator: Add a BO pool")
Fixes: 9919a2d34d ("anv/image: Memset hiz surfaces to 0 when binding memory")
Cc: "17.0 17.1" <mesa-stable@lists.freedesktop.org>
2017-05-04 08:56:36 +02:00
Rafael Antognolli f321f695d3 genxml: Fix 3DSTATE_DEPTH_BUFFER length on gen5.
The hardware docs are wrong, but the length used in the xml is also
wrong.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:51 -07:00
Rafael Antognolli 2e5d65ccb6 anv: Use BRW_BARYCENTRIC_NONPERSPECTIVE_BITS from common header.
In a previous patch some enums were split out from brw_eu_defines.h, so
they could be used by genxml based code. anv can also benefit from this.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 16:58:55 -07:00