2206 Commits

Author SHA1 Message Date
Emil Velikov 5d47dd9c2a intel/blorp: ship blorp_genX_exec.h within the tarball
Fixes: c9cb37b2a6 ("intel/blorp: Add a partial resolve pass for MCS")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-24 15:14:21 +01:00
Jason Ekstrand 6874b953f6 anv/image: zalloc image views
This allows us to avoid some extra zeroing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand a1cad8218e anv/image: Use vk_zalloc instead of an explicit memset
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand 1e32c8303a anv: Separate surface states by layout instead of aux_usage
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand 628bfaf1c6 intel/isl: Add some sanity checks for compressed surfaces
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand 5de4209f91 intel/isl: Add a helper to get a subimage surface
We already have a helper for doing this in BLORP, this just moves the
logic into ISL where we can share it with other components.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand 72bc38cfc5 anv: Get rid of some unused function declarations
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand d4de403f91 intel/isl: Add a helper for determining if a color is 0/1
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand b26b2490e5 intel/blorp: Allow blorp_copy on sRGB formats
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand fb86ac94cb intel/isl/format: Add an srgb_to_linear helper
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand 44e9d65757 intel/isl/format: Dedent the template in gen_format_layout.py
This makes it much easier to edit the template and doesn't really dirty
the python all that much.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand 268ba028dc intel/isl: Add an aux state for "partial clear"
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand c9cb37b2a6 intel/blorp: Add a partial resolve pass for MCS
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Nanley Chery 67027ddf3f anv: Predicate fast-clear resolves
Image layouts only let us know that an image *may* be fast-cleared. For
this reason we can end up with redundant resolves. Testing has shown
that such resolves can measurably hurt performance and that predicating
them can avoid the penalty.

v2:
- Introduce additional resolve state management function (Jason Ekstrand).
- Enable easy retrieval of fast clear state fields.
v3: Use more descriptive field enums (Jason)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery 8e2729fbb8 intel/blorp: Allow BLORP calls to be predicated
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery be516ba9b1 anv/cmd_buffer: Skip some input attachment transitions
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery 597ff919e7 anv: Stop resolving CCS implicitly
With an earlier patch from this series, resolves are additionally
performed on layout transitions. Remove the now unnecessary implicit
resolves within render passes.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery 5ba93e6f5a anv: Transition more color buffer layouts
v2: Expound on comment for the pipe controls (Jason Ekstrand).
v3:
- Cast base_layer to uint64_t to avoid overflow.
- Remove "seems" from the pipe control comment.
- Fix clamp of layer_count (Jason Ekstrand).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery a899747eb3 anv/cmd_buffer: Warn about not enabling CCS_E
Use the performance warning infrastructure to provide helpful
information when testing applications.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery 9c9f63d1c7 anv/cmd_buffer: Move aux_usage assignment up
For readability, bring the assignment of CCS closer to the assignment of
NONE and MCS.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery 62d72bb5d0 anv/cmd_buffer: Always enable CCS_D in render passes
The lifespan of the fast-clear data will surpass the render pass scope.
We need CCS_D to be enabled in order to invalidate blocks previously
marked as cleared and to sample cleared data correctly.

v2: Avoid refactoring.
v3: Allow CCS_D for subpass resolves.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery 8e532aa028 anv/cmd_buffer: Disable CCS on gen7 color attachments upfront
The next patch enables the use of CCS_D even when the color attachment
will not be fast-cleared. Catch the gen7 case early to simplify the
changes required.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery 9fd1f2aa3c anv/cmd_buffer: Ensure fast-clear values are current
v2: Rewrite functions, change location of synchronization.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery 0b16600056 anv/gpu_memcpy: Add a lighter-weight GPU memcpy function
We'll be performing a GPU memcpy in more places to copy small amounts of
data. Add an alternate function that thrashes less state.

v2:
- Make a new function (Jason Ekstrand).
- Move the #define into the function.
v3:
- Update the function name (Jason).
- Update comments.
v4: Use an indirect drawing register as TEMP_REG (Jason Ekstrand).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery dcff5ab9f1 anv/cmd_buffer: Restrict fast clears in the GENERAL layout
v2: Remove ::first_subpass_layout assertion (Jason Ekstrand).
v3: Allow some fast clears in the GENERAL layout.
v4: Remove extra '||' and adjust line break (Jason Ekstrand).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery 9ffe87122b anv/cmd_buffer: Don't partially fast clear image layers
v2: Don't pass in the command buffer (Jason Ekstrand).
v3: Remove an incorrect assertion and an if condition for gen7.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery 07cc2ec9db anv/cmd_buffer: Initialize the clear values buffer
v2: Rewrite functions.
v3 (Jason Ekstrand):
- Don't set ResourceMinLOD.
- Fix clamp of level_count.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery 88200e87f6 anv/image: Append CCS/MCS with a fast-clear state buffer
v2: Update comments, function signatures, and add assertions.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery 325ecffc62 anv/image: Disable CCS if the image doesn't support rendering
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery 01db9a74c6 intel/isl: Add surface state clear value information
This will be used to load and store clear values from surface state
objects.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery b178e239dd anv: Transition MCS buffers from the undefined layout
v2: Define MCS buffers with any sample count (Jason)

Cc: <mesa-stable@lists.freedesktop.org>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2017-07-22 20:12:09 -07:00
Jason Ekstrand f793c57cc5 intel/isl: Tighten up restrictions for CCS on gen7
It may technically be possible to enable some sort of fast-clear support
for at least the base slice of a 2D array texture on gen7.  However,
it's not documented to work, we've never tried to do it in GL, and we
have no idea what the hardware does if you turn on CCS_D with arrayed
rendering.  Let's just play it safe and disallow it for now.  If someone
really cares that much about gen7 performance, they can come along and
try to get it working later.
2017-07-22 20:12:07 -07:00
Jason Ekstrand 20533e0da7 anv/blorp: Assert isl_surf_init success in do_buffer_copy
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 08:21:27 -07:00
Jason Ekstrand cf39fb06e3 anv/blorp: Explicitly set row_pitch in do_buffer_copy
We have a very specific row pitch that we want and we don't want ISL to
be changing it on us so just be explicit about it.

Fixes: a40f043034
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 08:20:07 -07:00
Kenneth Graunke 30d6bc470a i965: Set lower_vote_trivial in vector_nir_options_gen6 too.
There's a second struct for Gen6+.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-21 18:09:01 -07:00
Topi Pohjolainen fbfc6a2f67 intel/isl/gen7: Don't allow multisampled surfaces with valign2
The same constraint exists later on as an assert in
isl_gen7_choose_image_alignment_el(), so catch it earlier in order
to return an error instead of crashing.

Needed to avoid crashes with piglits on IVB and HSW:

arb_internalformat_query2.image_format_compatibility_type pname checks
arb_internalformat_query2.all internalformat_<x>_type pname checks
arb_internalformat_query2.max dimensions related pname checks
arb_copy_image.arb_copy_image-formats --samples=2/4/6/8
arb_texture_float.multisample-fast-clear gl_arb_texture_float

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen df9bb8dc05 intel/isl/gen7: Allow msaa with signed integer formats
These formats are already allowed by the i965 GL driver, and the
feature seems to work just fine.

There are tests for multisampled rendering in piglit:
tests/spec/ext_framebuffer_multisample which can be patched to
try 16I/32I in addition to GL_RGBA8I.
IvyBridge passed all tests with all sample numbers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen abb84e3f2d intel/isl/gen7: Allow msaa with 128-bit formats
These formats are already allowed by the i965 GL driver, and the
feature seems to work just fine.

There are tests for multisampled rendering in piglit:
tests/spec/ext_framebuffer_multisample which can be patched to
try GL_RGBA16F/32F/16I/16UI/32I/32UI in addition to GL_RGBA/8I.
IvyBridge passed all tests with all sample numbers and even
with 128-bit formats.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen 514d68576d intel/isl: Allow 1D surfaces with compressed formats
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen a40f043034 intel/isl: Align non-tiled horizontally by cache line
in order to support blit engine.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Matt Turner 069bf7c907 i965/fs: Match destination type to size for ballot
No use in taking a 64-bit value when we know the high 32-bits are zero.
2017-07-20 16:56:50 -07:00
Matt Turner 1038d385a9 nir: Reduce destination size of ballot intrinsic when possible
Some hardware, like i965, doesn't support group sizes greater than 32.
In that case, we can reduce the destination size of the ballot
intrinsic, which will simplify our code generation.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner 782ef30451 i965/fs: Implement ARB_shader_ballot operations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner 8238930510 i965/fs: Do not move MOVs writing the flag outside of control flow
The implementation of ballotARB() will start by zeroing the flags
register. So, doing something like

        if (gl_SubGroupInvocationARB % 2u == 0u) {
                ... = ballotARB(true);
                [...]
        } else {
                ... = ballotARB(true);
                [...]
        }

(like fs-ballot-if-else.shader_test does) would generate identical MOVs
to the same destination (the flag register!), and we definitely do not
want to pull that out of the control flow.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Francisco Jerez f1b7c47913 i965/fs: Handle explicit flag sources in flags_read()
The implementations of the ARB_shader_ballot intrinsics will explicitly
read the flag as a source register.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-20 16:56:49 -07:00
Matt Turner 43ef75b394 nir: Add system values from ARB_shader_ballot
We already had a channel_num system value, which I'm renaming to
subgroup_invocation to match the rest of the new system values.

Note that while ballotARB(true) will return zeros in the high 32-bits on
systems where gl_SubGroupSizeARB <= 32, the gl_SubGroup??MaskARB
variables do not consider whether channels are enabled. See issue (1) of
ARB_shader_ballot.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner ee9fa4ac18 i965/fs: Implement ARB_shader_group_vote operations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Francisco Jerez 93dc736f4e i965/fs: Handle explicit flag destinations in flags_written()
The implementations of the ARB_shader_group_vote intrinsics will
explicitly write the flag as the destination register.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-20 16:56:49 -07:00
Matt Turner 30b72f4126 i965/vec4: Lower ARB_shader_group_vote intrinsics
I don't expect anyone is going to care about using this in vec4 programs
(vertex/tessellation/geometry on Gen6/7), and no one has come up with a
good way to implement it, much less test it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner d4c9d6a3b2 nir: Add pass to optimize intrinsics
Specifically, constant fold intrinsics from ARB_shader_group_vote, but I
suspect it'll be useful for other things in the future.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Topi Pohjolainen c4ac0d4949 intel/isl/gen4: Represent cube maps with 3D layout
v2 (Jason): Check for !ISL_SURF_DIM_3D instead of CUBE_BIT.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen 171b72542c intel/isl: Add i915 to isl_tiling converter
v2: s/i915_tiling_to_isl_tiling(/isl_tiling_from_i915_tiling/

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Chad Versace 5d69052113 anv/image: Fix VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT
We incorrectly detected VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT.  We looked
for the bit in VkImageCreateInfo::usage, but it's actually in
VkImageCreateInfo::flags.

Found by assertion failures while enabling VK_ANDROID_native_buffer.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-19 11:25:50 -07:00
Topi Pohjolainen 0926fb69a4 intel/blorp/gen4: Drop cube map flag for single face copy
This will falsely trigger an assert on number of layers once
isl is used for 3D layouts of Gen4 cube maps.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:36:13 +03:00
Topi Pohjolainen 4733891e51 intel/isl: Take 3D surfaces into account in image params
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:35:44 +03:00
Jason Ekstrand cd9fd68a50 anv: Advertise support for VK_KHR_variable_pointers
We don't support the general version yet because that requires us to
lower shared variables up-front in SPIR-V -> NIR.  This shouldn't be a
whole lot of work but it's not something we support today.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-07-18 09:43:13 -07:00
Jason Ekstrand bc9319583a anv: Advertise support for VK_KHR_storage_buffer_storage_class
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-07-18 09:43:13 -07:00
Jason Ekstrand 828c437078 intel/isl: Add a row_pitch parameter to surf_get_ccs_surf
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-17 13:48:38 -07:00
Jason Ekstrand c5700ed72e anv/image: Add INPUT_ATTACHMENT to the list of required usages
From the Vulkan 1.0.53 spec VU for vkCreateImageView:

    "image must have been created with a usage value containing at least
    one of VK_IMAGE_USAGE_SAMPLED_BIT, VK_IMAGE_USAGE_STORAGE_BIT,
    VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
    VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT, or
    VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT"

We were missing VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT from our list.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-17 08:18:46 -07:00
Jason Ekstrand cbdfd1daa2 anv: Stop leaking the no_aux sampler surface state
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-17 08:18:46 -07:00
Jason Ekstrand bd41564746 anv/cmd_buffer: Properly handle render passes with 0 attachments
We were early returning and never created the NULL surface state.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: James Legg <jlegg@feralinteractive.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-17 08:18:46 -07:00
Emil Velikov 43c188f970 anv: advertise v6 of the wayland surface extension
Jason updated the Khronos spec to explicitly state that Wayland surfaces
must support VK_PRESENT_MODE_MAILBOX_KHR.

ANV has done so since day one (back in 2015).

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-17 15:24:32 +01:00
Lionel Landwerlin 59adde0eab anv: ensure device name contains terminating character
v2: Use sizeof() (Chris)

CID: 1415113
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-07-17 14:36:38 +01:00
Jason Ekstrand 0ee8d81718 anv: Implement VK_KHR_external_memory_*
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand c02da9cad6 anv: Implement VK_KHR_dedicated_allocation
We always recommend sub-allocation and don't do anything special for
dedicated allocations.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand 8c82aa5f43 anv: Implement VK_KHR_get_memory_requirements2
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand 5b57bdc1cf anv: Advertise version 1.0.54
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand 227debdc92 vulkan: Update to the new 1.0.54 spec XML and headers
There is one small ANV change here because we used the
VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX enum in the BO cache and that had
to be updated to have the _KHR suffix.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand dc179aa123 anv: Drop support for VK_KHX_external_semaphore_*
These have been formally deprecated by Khronos never to be shipped
again.  The KHR versions should be implemented/used instead.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:58:51 -07:00
Jason Ekstrand 4ac94d0dee anv: Drop support for VK_KHX_external_memory_*
These have been formally deprecated by Khronos never to be shipped
again.  The KHR versions should be implemented/used instead.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-14 22:12:39 -07:00
Juan A. Suarez Romero 5cd4ece34e anv/pipeline: do not use BITFIELD64_BIT()
The previous commit forgot to apply the v2 suggestions.

Fixes: 28d0c38 (anv/pipeline: use unsigned long long constant to check
enable vertex inputs)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-07-14 10:33:19 +00:00
Juan A. Suarez Romero 28d0c38d85 anv/pipeline: use unsigned long long constant to check enable vertex inputs
When initializing the ANV pipeline, one of the tasks is checking which
vertex inputs are enabled. This is done by checking the enabled bits
in inputs_read.

But the mask to use is computed doing `(1 << (VERT_ATTRIB_GENERIC0 +
desc->location))`. The problem here is that if location is 15 or
greater, the sum is 32 or greater. But C is handling 1 as a 32-bit
integer, which means the displaced bit is out of range and thus the full
value is 0.

Thus, use 1ull, which is an unsigned long long value.

This fixes:
dEQP-VK.pipeline.vertex_input.max_attributes.16_attributes.binding_one_to_one.interleaved

v2: use 1ull instead of BITFIELD64_BIT() (Matt Turner)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-14 08:09:18 +00:00
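
The mask problem described in 28d0c38d85 can be sketched in isolation. This is a minimal illustration, not the driver's code: the names are hypothetical, and the value 17 for the generic-attribute base is an assumption inferred from the message (location 15 giving a sum of 32). Note that in C, shifting a 32-bit int by 32 or more is undefined behavior rather than a guaranteed 0, which is one more reason to widen the constant before shifting.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for VERT_ATTRIB_GENERIC0. The value 17 is an
 * assumption inferred from the commit message, not taken from Mesa. */
enum { EXAMPLE_VERT_ATTRIB_GENERIC0 = 17 };

/* Build the per-attribute mask with a 64-bit constant so that shift
 * counts of 32 and above stay in range. */
static inline uint64_t
example_input_mask(unsigned location)
{
    unsigned shift = EXAMPLE_VERT_ATTRIB_GENERIC0 + location;
    assert(shift < 64); /* a 64-bit mask only has 64 usable bits */
    return 1ull << shift;
}

/* Return non-zero when the attribute at 'location' is set in inputs_read. */
static inline int
example_input_enabled(uint64_t inputs_read, unsigned location)
{
    return (inputs_read & example_input_mask(location)) != 0;
}
```

With these assumptions, location 15 maps to bit 32, which a plain `1 << 32` on a 32-bit int cannot represent.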
Kenneth Graunke b2da123801 i965: Use pushed UBO data in the scalar backend.
This actually takes advantage of the newly pushed UBO data, avoiding
pull loads.

Improves performance in GLBenchmark Manhattan 3.1 by:

   HSW: ~1%, BDW/SKL/KBL GT2: 3-4%, SKL GT4: 7-8%, APL: 4-5%.
   (thanks to Eero Tamminen for these numbers)

shader-db results on Skylake, ignoring programs with spill/fill changes:

   total instructions in shared programs: 13963994 -> 13651893 (-2.24%)
   instructions in affected programs: 4250328 -> 3938227 (-7.34%)
   helped: 28527
   HURT: 0

   total cycles in shared programs: 179808608 -> 172535170 (-4.05%)
   cycles in affected programs: 79720410 -> 72446972 (-9.12%)
   helped: 26951
   HURT: 1248

   LOST:   46
   GAINED: 21

Many "Deus Ex: Mankind Divided" shaders which already spilled end up
spilling a lot more (about 240 programs hurt, 9 helped).  The cycle
estimator suggests this is still overall a win (-0.23% in cycle counts)
presumably because we trade pull loads for fills.

v2: Drop "PULL" environment variable left in for initial debugging
    (caught by Matt).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 20:18:54 -07:00
Kenneth Graunke c9ef27e77b i965: Factor out push locations.
With UBOs, the answer of "have we decided to push this uniform" gets
a bit more complicated - for one, we have multiple surfaces.  This
patch refactors things so we can add the new code in a single place.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 20:18:54 -07:00
Kenneth Graunke 4f586cd8f1 i965: Push UBO data, but don't use it just yet.
This patch starts uploading UBO data via 3DSTATE_CONSTANT_* packets,
and updates the compiler to know that there's extra payload data, so
things continue working.  However, it still issues pull loads for all
data.  I wanted to separate the two aspects for greater bisectability.

v2: Update for new intel_bufferobj_buffer parameter.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 20:18:30 -07:00
Kenneth Graunke 6d28c6e52c i965: Select ranges of UBO data to be uploaded as push constants.
This adds a NIR pass that decides which portions of UBOs we should
upload as push constants, rather than pull constants.

v2: Switch to uint16_t for the UBO block number, because we may
    have a lot of them in Vulkan (suggested by Jason).  Add more
    comments about bitfield trickery (requested by Matt).

v3: Skip vec4 stages for now...I haven't finished wiring up support
    in the vec4 backend, and so pushing the data but not using it
    will just be wasteful.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 19:56:49 -07:00
Kenneth Graunke 8ec5a4e4a4 i965: Switch to absolute addressing for constant buffer 0.
By default, 3DSTATE_CONSTANT_* Constant Buffer 0 is relative to dynamic
state base address.  This makes it unusable for pushing UBOs.  I'd like
to be able to use all four push buffers.

There is a bit in the INSTPM register (or CS_DEBUG_MODE2 on Skylake)
which controls whether buffer 0 is relative to dynamic state base
address, or simply a normal pointer.  Setting that gives us full
flexibility.

We can't currently write this on Haswell and earlier, and will need
to update the kernel command parser, and then do the whole version
checking song and dance.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 19:56:49 -07:00
Lionel Landwerlin 6131a1ae40 aubinator: don't leak fd of opened aubfile
CID: 1373563
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:50 +01:00
Lionel Landwerlin d1bd731e30 anv: don't use strcpy for copying strings
CID: 1358935
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:47 +01:00
Lionel Landwerlin 226fae7849 intel/compiler: no need to check unsigned is >= 0
CID: 1338342
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:45 +01:00
Lionel Landwerlin 95c917668c intel/compiler: don't check unsigned is >= 0
CID: 1224468
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:38 +01:00
Lionel Landwerlin a25a533458 intel/compiler: remove check unsigned is >= 0
By definition, unsigned values are always >= 0.

CID: 742212
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:29 +01:00
Lionel Landwerlin 19869d6091 isl: use 64bit arithmetic to compute size
If we allow the size to be more than 2^32, then we should compute it
in 64bit arithmetic otherwise we might run into overflow issues.

CID: 1412892, 1412891
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:26 +01:00
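
The overflow concern in 19869d6091 comes down to where the widening happens. The sketch below is only an illustration under assumed names (`row_pitch`, `rows`, and the helpers are hypothetical and are not ISL's actual fields or functions): multiplying two 32-bit values wraps modulo 2^32 unless one operand is widened before the multiply.

```c
#include <assert.h>
#include <stdint.h>

/* Widen one operand first so the multiplication is done in 64 bits. */
static inline uint64_t
example_surface_size(uint32_t row_pitch, uint32_t rows)
{
    return (uint64_t)row_pitch * rows;
}

/* For comparison: the product is truncated to 32 bits, so it wraps
 * modulo 2^32 before it can be stored in a wider type. */
static inline uint32_t
example_surface_size_truncated(uint32_t row_pitch, uint32_t rows)
{
    return (uint32_t)(row_pitch * rows);
}

/* Narrow explicitly, and only after checking that the value fits. */
static inline uint32_t
example_size_to_u32(uint64_t size)
{
    assert(size <= UINT32_MAX);
    return (uint32_t)size;
}
```

For example, a pitch of 65536 bytes and 65536 rows gives exactly 2^32 bytes in the 64-bit version, while the truncated version wraps to 0.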
Jason Ekstrand 5b3363e3f1 intel/isl: Add a helper to convert tilings from ISL to i915
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand a668ba9c18 intel/isl: Add basic modifier introspection
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Anuj Phogat 0a56c5f3f1 intel/compiler: Don't use opt_sampler_eot() optimization on gen10+
This optimization has been removed on gen10+.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-12 11:27:31 -07:00
Eric Anholt 5d6271c6a5 intel: Move the DRM uapi headers to a non-Intel location.
I want to remove vc4's dependency on headers from libdrm as well, but
storing multiple copies of drm_fourcc.h in our tree would be silly.

v2: Update Android.mk as well, move distcheck drm*.h references to
    top-level noinst_HEADERS.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Daniel Stone <daniels@collabora.com> (v1)
Reviewed-by: Rob Herring <robh@kernel.org>
2017-07-12 10:58:33 -07:00
Jason Ekstrand 8e3d9c5d09 anv: Round u_vector element sizes to a power of two
This fixes 32-bit builds of the driver.  Commit 08413a81b9
changed things so that we now put struct anv_states in the u_vector for
binding tables.  On 64-bit builds, sizeof(struct anv_state) is a power
of two but it isn't on 32-bit builds.

Fixes: 08413a81b9
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2017-07-12 10:34:13 -07:00
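
The rounding described in 8e3d9c5d09 can be sketched as follows. This is a generic bit-twiddling version under hypothetical names, not necessarily the helper the commit uses; the 12-byte size mentioned below is purely illustrative and is not the real size of struct anv_state.

```c
#include <assert.h>
#include <stdint.h>

/* Return non-zero when v is a power of two. */
static inline int
example_is_pow2(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Round v up to the next power of two; values that already are a power
 * of two are returned unchanged. */
static inline uint32_t
example_round_up_pow2(uint32_t v)
{
    assert(v != 0 && v <= 0x80000000u); /* result must fit in 32 bits */
    v--;
    /* Smear the highest set bit into every lower bit position. */
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    return v + 1;
}
```

An element size of 12 bytes would be padded to 16 this way, while a size of 16 stays at 16, so the 64-bit layout would be unaffected.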
Lionel Landwerlin 384aaa4d3f intel: add number of subslices to device info
We could have used a single integer to store that value, but
Cannonlake has different number of subslices per slice depending on
the GT.

v2: Add CFL subslice numbers (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2017-07-11 16:14:57 +01:00
Kenneth Graunke c2c37f5185 intel: Fix clflushing on modern (Baytrail+) Atom CPUs.
Thanks to Chris Wilson for pointing this out.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-10 15:55:26 -07:00
Kenneth Graunke 3e50607a40 intel: Move clflush helpers from anv to common/gen_clflush.h.
I want to use these in the OpenGL driver as well.

v2: Add to COMMON_FILES in Makefile.sources (caught by Emil)

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-10 15:55:19 -07:00
Jason Ekstrand 781263486f anv: Stop setting domains to RENDER on EXEC_OBJECT_WRITE
The reason we were doing this was to ensure that the kernel did the
appropriate cross-ring synchronization and flushing.  However, the
kernel only looks at EXEC_OBJECT_WRITE to determine whether or not to
insert a fence.  It only cares about the domain for determining whether
or not it needs to clflush the BO before using it for scanout but the
domain automatically gets set to RENDER internally by the kernel if
EXEC_OBJECT_WRITE is set.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-07-10 08:55:47 -07:00
Nanley Chery 753a7bbc84 Revert "intel/isl: Only create a CCS buffer if the image supports rendering"
This reverts commit 8aaa13467d, which was
based on an incorrect assumption. Unlike the restriction placed on image
views in the Vulkan API, OpenGL allows you to render to texture views
whose formats differ from the originals.

Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=101677
2017-07-07 14:24:58 -07:00
Tomasz Figa 50a8a7377a intel: common: Fix link failure with standalone Android build
Some reshuffle in the Makefiles under src/intel resulted in Android
libraries being no longer linked with code using
src/intel/common/gen_debug.h that contains references to functions
exported by those libraries (namely ALOGW macro, which is currently
resolved into a call to __android_log_print() from cutils).

Fix the build by taking into account ANDROID_CFLAGS and ANDROID_LIBS for
affected module on Android NDK builds.

Fixes: d5b355ce5f ("i965: Move intel_debug.h to intel/common/gen_debug.h")
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-05 18:49:21 +01:00
Samuel Iglesias Gonsálvez 5dd96b1156 anv: check support for enabled features in vkCreateDevice()
From Vulkan spec, 4.2.1. "Device Creation":

  "vkCreateDevice verifies that extensions and features requested in
   the ppEnabledExtensionNames and pEnabledFeatures members of
   pCreateInfo, respectively, are supported by the implementation."

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@gmail.com>
2017-07-03 08:01:31 +02:00
Samuel Iglesias Gonsálvez ba05f6f72b anv: merge tessellation's primitive mode in merge_tess_info()
SPIR-V tessellation shaders that were created from HLSL will have
the primitive generation domain set in tessellation control shader
(hull shader in HLSL) instead of the tessellation evaluation shader.

v2:
- Add assert (Kenneth)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-03 08:00:43 +02:00
Lionel Landwerlin 038c45a40e anv: fix reported timestampPeriod value
We lost some precision on a previous change due to switching to
integers. Since we report a float in timestampPeriod, we want the
division to happen in floats.

CID: 1413021
Fixes: c77d98ef32 ("intel: common: express timestamps units in frequency")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-02 12:11:55 +01:00
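The precision loss described in 038c45a40e can be illustrated with a minimal C sketch. The helper names below are hypothetical and are not taken from the anv source; the 19200000 Hz figure is the Gen9LP timestamp frequency quoted in c77d98ef32. The sketch only assumes that the period is reported in nanoseconds per tick.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nanoseconds per timestamp tick using integer
 * division. The fractional part of the period is truncated. */
static uint64_t period_ns_int(uint64_t timestamp_frequency)
{
    assert(timestamp_frequency != 0);
    return 1000000000ull / timestamp_frequency;
}

/* Hypothetical helper: the same computation with the division done in
 * floats, so the fractional part survives in the reported value. */
static float period_ns_float(uint64_t timestamp_frequency)
{
    assert(timestamp_frequency != 0);
    return 1000000000.0f / (float)timestamp_frequency;
}
```

With a frequency of 19200000 Hz the integer version yields 52, while the float version yields roughly 52.083, which is the kind of difference a float-typed timestampPeriod is meant to carry.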
Lionel Landwerlin 34560ba9e5 intel: genxml: make a couple of enums show up in aubinator
In particular Shader Channel Select & Texture Address Control Mode.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-07-02 00:45:38 +01:00
Johnson Lin 165e704719 i965/i915: Add UYVY as the supported format
Trigger the correct sampler options for it, similar to YUYV.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-06-30 10:16:26 +01:00
Mauro Rossi b693fd8464 android: anv: drop libdrm_intel dependency
In addition to Rob Herring "Android: i965: remove libdrm_intel dependency",
we can drop libdrm_intel dependency in anv for Android.

Please check if libdrm has to stay as shared dependency and drop this comment line.

Fixes: 7dd20bc ("anv/i965: drop libdrm_intel dependency completely")
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-29 12:31:00 +01:00
Lionel Landwerlin d8bf2861ad anv: use devinfo for number of thread/eu
It turns out Gen9LP has fewer threads per EU (6 vs 7).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2017-06-29 10:07:52 +01:00
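As a rough illustration of why d8bf2861ad reads the thread count from device info instead of hard-coding it, consider the C sketch below. The struct and its field names are assumptions made for this example and may not match the real device-info structure; only the 6 vs 7 threads-per-EU figures come from the commit message.

```c
#include <assert.h>

/* Hypothetical subset of a device-info struct; names are illustrative. */
struct device_info_sketch {
    unsigned num_eu;            /* execution units on the part */
    unsigned num_thread_per_eu; /* 6 on Gen9LP, 7 otherwise, per the commit */
};

/* Total hardware threads, derived from per-device data rather than a
 * constant baked into the driver. */
static unsigned total_hw_threads(const struct device_info_sketch *info)
{
    assert(info->num_thread_per_eu > 0);
    return info->num_eu * info->num_thread_per_eu;
}
```

For a made-up part with 18 EUs, a hard-coded 7 would give 126 threads, whereas the per-device value of 6 gives 108.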
Juan A. Suarez Romero 93b8dc4b94 intel: tools: add intel_aub.h as part of aubinator
Include intel_aub.h in the Makefile.tools.am

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-29 10:03:40 +02:00
Juan A. Suarez Romero be5fe2153b intel: automake: include Makefile.drm.am
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-29 10:03:40 +02:00
Ian Romanick 36bd4a5f21 genxml: Silence about a billion unused parameter warnings
v2: Use textwrap.dedent to make the source line a lot shorter.
Shortening (?) the line was requested by Jason.

v3: Simplify the textwrap.dedent usage.  Suggested by Dylan.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-06-28 14:50:14 -07:00
Lionel Landwerlin 7dd20bc3ee anv/i965: drop libdrm_intel dependency completely
With Ken's work to drop the library dependency on libdrm_intel, we now
only depend on libdrm for the kernel uapi headers it provides. It
seems like we're better off just embedding those headers ourselves,
making the lives of people developing new features tightly
integrated with the kernel a tiny bit easier.

This change also makes it a bit more obvious what cflags/libs are
required by the i915 drivers vs i965, by renaming INTEL_CFLAGS/LIBS
into I915_CFLAGS/LIBS.

Headers were generated from drm-tip on the following commit :

   commit 6d61e70ccc21606ffb8a0a03bd3aba24f659502b
   Merge: 338ffbf7cb5e c0bc126f97fb
   Author: Dave Airlie <airlied@redhat.com>
   Date:   Tue Jun 27 07:24:49 2017 +1000

       Backmerge tag 'v4.12-rc7' into drm-next

v2: Use installed files from the kernel (Daniel Vetter)

v3: Use headers from drm-next rather than drm-tip (Dave/Daniel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:29 +03:00
Lionel Landwerlin 230691b8e5 aubinator: import intel_aub.h from libdrm
This enables us to compile aubinator without the libdrm dependency.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:28 +03:00
Topi Pohjolainen fbcc9555c5 intel/anv: Add missing break in anv_CreateDevice()
CID: 1413018
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-27 10:19:55 +03:00
Ian Romanick 1b101ca809 blorp: Use normalized coordinates on Gen6
Apparently, the sampler has some sort of precision issues for
non-normalized texture coordinates with linear filtering.  This caused
some small precision issues in scaled blits.  Work around this by using
normalized coordinates.  There is some extra work necessary because Gen6
uses TEX (instead of TXF) for some multisample resolve blits.

Fixes piglit.spec.arb_framebuffer_object.fbo-blit-stretch on SNB.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68365
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-06-26 13:41:11 -07:00
Nanley Chery d6748f1fc4 anv/gpu_memcpy: Rename the gpu_memcpy function
A GPU memcpy function could alternatively be implemented using MI_*
commands. Provide more detail into how this one operates in case another
memcpy function is created.

v2:
- Update the commit message.
v3:
- Use 'memcpy' instead of 'cpy' (Jason Ekstrand)
- Shorten 'streamout' to 'so'

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 1415e7a997 anv/blorp: Provide surface states for CCS resolves
In the future, we plan on using this method to resolve images whose
surface state fast-clear value is dynamically updated during command
buffer execution. Start using it now for testing and to reduce churn
later on.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 4b2a2b70e0 anv/blorp: Add a surface-state-based CCS resolve function
This will be used in the next patch.

v2:
- Omit BLORP_BATCH_NO_EMIT_DEPTH_STENCIL (Jason Ekstrand)
- Update commit message.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery d1119ab7b6 blorp/clear: Add a binding-table-based CCS resolve function
v2:
- Do layered resolves.
(Jason Ekstrand):
- Replace "bt" suffix with "attachment".
- Rename helper function to prepare_ccs_resolve.
- Move blorp_params_init() into helper function.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 6235f08ff8 anv: Adjust params of color buffer transitioning functions
Splitting out these fields will make the color buffer transitioning
function simpler when it gains more features.

v2: Remove unintended blank line (Iago Toral)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery e15b1c41a4 anv/blorp: Remove 3D subresource transition workaround
For 3D image subresources undergoing a layout transition via
PipelineBarrier, we increase the number of fast-cleared layers to match
the intended behaviour of KHR_maintenance1. When such subresources
undergo layout transitions between subpasses, we don't do this to avoid
failing incorrect CTS tests. Instead, unify the behaviour in both
scenarios, and wait for the CTS tests to catch up. See CL 1111 for the
test fix and Vulkan issue #849 for more information.

On SKL+, this causes 3 test failures under:
dEQP-VK.pipeline.render_to_image.3d.*

v2: Add a reference to the Vulkan issue (Iago Toral).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 5ca2fbcee2 anv/cmd_buffer: Adjust the image view reloc function
Make the function take in an image instead of an image view. This
enables us to record relocations for surfaces states created outside of
the anv_CreateImageView path.

v2 (Jason Ekstrand):
- Use image->offset instead of surf_offset in aux_offset calculation.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 5f4f50419c anv/cmd_buffer: Adjust layout transition aspect checking
Reflect the fact that an image view or subresource range with the color
aspect cannot have any other aspect.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery bc838fc759 anv: Add and use color auxiliary buffer helpers
v2:
- Check for aux levels in layer helper (Jason Ekstrand)
- Don't assert aux is present, return 0 if it isn't.
- Use the helpers.
v3:
- Make the helpers aspect-agnostic (Jason Ekstrand)
- Drop anv_image_has_color_aux()

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 8aaa13467d intel/isl: Only create a CCS buffer if the image supports rendering
v2: Omit the commit message.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery b934330191 intel/isl: Limit CCS to one level and layer on gen7
v2 (Jason Ekstrand):
- Remove Vulkan-specific terminology from the commit title.
- Replace '== 7' with '<= 7' to hint that this is a new feature on BDW+.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 6b23c65f3a intel/blorp: Check for layer fast-clear restriction
v2: Update commit title (Jason Ekstrand)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery b46a071758 intel/blorp: Assert levels and layers are in range
v2 (Jason Ekstrand):
- Update commit title.
- Check aux level and layer as well.
v3 (Jason Ekstrand):
- Move the non-aux layer check.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Eric Engestrom 2b237ff64c anv: use Mesa's u_atomic.h header
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-26 18:21:22 +01:00
Anuj Phogat 7896dee349 anv/cnl: Don't write to Cache Mode Register 1 on gen10+
For the PartialResolveDisableInVC field, the recommendation is to
always set it to 0, which is the default value of the bit.
So, we have nothing left to write to CACHE_MODE_1.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-23 11:16:00 -07:00
Rafael Antognolli 9b78a52042 genxml: fix gen5 sampler border color state.
Based on the current code, gen5 and gen6 have the same sampler border color
state struct. So fix the gen5 one to match gen6.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-22 16:38:44 -07:00
Rafael Antognolli f43c21cbbd aubinator: Dump sampler state pointers on gen6 too.
We already have a function to dump sampler states, so do that for gen6
too.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-22 16:38:44 -07:00
Chad Versace ecd8f85802 anv: Fix -Wswitch in anv_layout_to_aux_usage()
anv_layout_to_aux_usage() lacked a case for
VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR. Add an unreachable case, because we
don't support the extension.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 15:18:24 -07:00
Anusha Srivatsa de7ed0ba55 i965/CFL: Add PCI Ids for Coffee Lake.
Coffee Lake has gen9 graphics, following KBL.
From 3D perspective, CFL is a clone of KBL/SKL features.

v2: Change commit message, correct alignment <Anuj Phogat>
v3: Update IDs.
v4: Initialize l3_banks, correct nomenclature <Anuj>

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Acked-by: Benjamin Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-06-22 14:28:43 -07:00
Anuj Phogat 43d11b128c intel: Enable vulkan build for gen10
This patch just enables building Vulkan libs for gen10. We
still don't have gen 10 support enabled on Vulkan.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:46 -07:00
Anuj Phogat ac6bc0e034 anv/cnl: Generate and use gen10 functions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:45 -07:00
Anuj Phogat c17e214a6b anv/cnl: Don't set FloatBlendOptimizationEnable{Mask}
This field is removed from the CACHE_MODE_1 register in gen10.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:45 -07:00
Anuj Phogat bf1d2c37c6 anv/cnl: Use GENX(xx) in place of GEN9_xx
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:45 -07:00
Anuj Phogat 1e5a5d18d1 anv/cnl: Add #defines for MOCS and genX(x)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:45 -07:00
Anuj Phogat ceed55e7bb intel/genxml: Add Gen10 CACHE_MODE_1 definitions
A few of the fields in this register have changed compared
to gen9.xml.

V2: Remove some fields which are not valid anymore.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat 6338b63270 intel/genxml: Rename StartInstanceLocation to StartingInstanceLocation
This is required because we already have a macro defined with
the name StartInstanceLocation.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat 8869c8b3dc intel/genxml: Rename IndirectStatePointer to BorderColorPointer
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat 97f75fdfd0 intel/genxml: Combine DataDWord{0, 1} fields in to ImmediateData field
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat c61b909d14 intel/genxml: Add INSTDONE registers in gen10
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat 03fddd3c1d intel/genxml: Add better support for MI_MATH in gen10
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Andres Gomez 5352174d49 anv: FORMAT_FEATURE_TRANSFER_SRC/DST_BIT_KHR not used with VkFormatProperties.bufferFeatures
VK_FORMAT_FEATURE_TRANSFER_[SRC|DST]_BIT_KHR is a flag value of the
VkFormatFeatureFlagBits enum that can only be held and checked against
the linearTilingFeatures or optimalTilingFeatures members of the
VkFormatProperties struct but not the bufferFeatures member.

From the Vulkan® 1.0.51, with the VK_KHR_maintenance1 extension,
section 32.3.2 docs for VkFormatProperties:

   "* linearTilingFeatures is a bitmask of VkFormatFeatureFlagBits
      specifying features supported by images created with a tiling
      parameter of VK_IMAGE_TILING_LINEAR.

    * optimalTilingFeatures is a bitmask of VkFormatFeatureFlagBits
      specifying features supported by images created with a tiling
      parameter of VK_IMAGE_TILING_OPTIMAL.

    * bufferFeatures is a bitmask of VkFormatFeatureFlagBits
      specifying features supported by buffers."

    ...

    Bits which can be set in the VkFormatProperties features
    linearTilingFeatures, optimalTilingFeatures, and bufferFeatures
    are:

    typedef enum VkFormatFeatureFlagBits {

    ...

      VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR = 0x00004000,
      VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR = 0x00008000,

    ...

    } VkFormatFeatureFlagBits;

    ...

    The following bits may be set in linearTilingFeatures and
    optimalTilingFeatures, specifying that the features are supported
    by images or image views created with the queried
    vkGetPhysicalDeviceFormatProperties::format:

    ...

    * VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR specifies that an image
      can be used as a source image for copy commands.

    * VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR specifies that an image
      can be used as a destination image for copy commands and clear
      commands."

Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 13:45:22 +03:00
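The rule quoted in 5352174d49 can be sketched in C as a simple mask. The two bit values are the ones listed in the spec excerpt above; the macro and function names are hypothetical and chosen for this example only, so this is not the driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as quoted from VkFormatFeatureFlagBits in the commit message. */
#define SKETCH_TRANSFER_SRC_BIT_KHR 0x00004000u
#define SKETCH_TRANSFER_DST_BIT_KHR 0x00008000u
#define SKETCH_TRANSFER_BITS \
    (SKETCH_TRANSFER_SRC_BIT_KHR | SKETCH_TRANSFER_DST_BIT_KHR)

/* Hypothetical helper: strip the image-only transfer bits from a flag set
 * before it is reported as bufferFeatures. */
static uint32_t sanitize_buffer_features(uint32_t flags)
{
    const uint32_t result = flags & ~SKETCH_TRANSFER_BITS;
    assert((result & SKETCH_TRANSFER_BITS) == 0);
    return result;
}
```

The same flag set could still be reported unchanged through linearTilingFeatures or optimalTilingFeatures, where the transfer bits are allowed.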
Rafael Antognolli 78b843af3c intel/genxml: Use the same naming convention for Floating Point Mode.
In newer gens, this field has a prefix and the non-IEEE-754 mode is called
"Alternate", instead of simply "Alt".

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli ce728594fd intel/genxml: Normalize URB Data field in WM_STATE.
On gen6+, this is called "Dispatch GRF Start Register For Constant/Setup Data
0", while on gen5 and lower it's called only "Dispatch GRF Start Register For
URB Data", but it's essentially the same thing (URB data), so rename it to
match newer gens and simplify the C code that handles it.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli 44415056e7 intel/genxml: Rename field on WM_STATE to match gen6+.
"Pixel Shader Kill Pixel" -> "Pixel Shader Kills Pixel", which is how it's
called on newer gens.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli 82c66965ac intel/genxml: Normalize fields on WM_STATE.
On gen4, WM_STATE only has one Kernel Start Pointer and one GRF Register
Count, but we can make the code that handles this on multiple gens simpler if
we add an index 0 to it too.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli eddb1ebccf intel/genxml: Add missing field to CLIP_STATE.
Just because it's not set doesn't mean that it doesn't exist. And since the
field is there on newer gens, having it on gen5 simplifies the code when
porting gen5 and lower.

Also add missing value to API Mode on CLIP_STATE on gen4.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli 9a5ae19cbb intel/genxml: Fix type of UserClipFlags ClipTest Enable Bitmask.
This is a bitmask, so it can't be a boolean. Also rename it so it matches
gen6+.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli 19d1defcd5 intel/genxml: Add missing fields to CLIP_STATE on gen4-5.
These fields are set by brw_clip_unit, so we need them when converting to
genxml.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli faa4f5c42d intel/genxml: Normalize GS_STATE.
Rename "Rendering Enable" to "Rendering Enabled", so it matches gen6+.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Lionel Landwerlin 030abc6109 intel: compiler/i965: fix is_broxton checks
In 5f2fe9302c is_geminilake was introduced to differentiate
broxton from geminilake. Unfortunately I failed to verify that
is_broxton is used throughout the code base to mean Gen9lp.

Fixes: 5f2fe9302c ("intel: common: add flag to identify platforms by name")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-20 23:26:42 +01:00
Ben Widawsky 3e1055591b i965/cnl: Add l3 configuration for Cannonlake
V2 (Anuj):
Squash the changes in one patch rebase on master.
Address the review comments made by Francisco Jerez.
Do the URB allocation per slice (not per bank).

V3 (Anuj):
Update the comment.
Format the table as other l3 config tables.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
---
V1 was sent out with the heading:
"i965/cnl: Properly handle l3 configuration"
2017-06-20 12:18:26 -07:00
Anuj Phogat 1024dad4d9 i965: Add a variable for way size per bank in get_l3_way_size()
Adding this variable better explains the computation of L3 way
size in the function.

V2: Use const variable for way_size_per_bank.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-06-20 12:18:26 -07:00
Anuj Phogat 8521559e08 i965: Fix broxton 2x6 l3 config
The new table added in this patch matches with the table
in gfxspecs. We were programming the wrong values earlier.

V2: Update the comment.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-06-20 12:18:26 -07:00
Ian Romanick cbb941cdec intel/blorp: Apply source offset in the TEX case
Previously the offset was only applied in the TXF case.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-20 11:07:02 -07:00
Ian Romanick 990f2be139 intel/blorp: Apply Gen4 coord. normalization after cubemap sizes are adjusted
Otherwise the values used for coordinate normalization use the wrong
sizes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-20 11:07:02 -07:00
Jason Ekstrand b2dd61196e intel/blorp: Set needs_(dst|src)_offset for Gen4 cubemaps
We call convert_to_single_slice so they may end up with a non-trivial
offset that needs to be taken into account.

v2 (idr): Also set needs_src_offset.  Suggested by Jason.

Fixes ES2-CTS.functional.texture.specification.basic_copyteximage2d.cube_rgba
and ES2-CTS.functional.texture.specification.basic_copytexsubimage2d.cube_rgba
on G45.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101284
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-20 11:07:02 -07:00
Lionel Landwerlin 6d759cbd49 intel: common: add number of thread per eu
This will be used to normalize OA counters.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-19 22:11:00 +01:00
Lionel Landwerlin c77d98ef32 intel: common: express timestamps units in frequency
Rather than storing the period as a double that loses some precision.

Also fixes the Gen9LP timestamp frequency which is not 19200123 but
19200000 as pointed out by Ville:

https://lists.freedesktop.org/archives/intel-gfx/2017-April/125126.html

Finally add the Cannonlake timestamp frequency.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-19 22:11:00 +01:00
Lionel Landwerlin 5f2fe9302c intel: common: add flag to identify platforms by name
The perf infrastructure needs to identify specific platforms, not just
generations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-19 22:11:00 +01:00
Jonas Kulla a52ee32a9a anv: Fix L3 cache programming on Bay Trail
Valid values for URBAllocation start at 32, so subtract that
before programming the register.

This was missed when porting from the GL driver.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-19 12:05:52 -07:00
Topi Pohjolainen 0d1af164e1 intel/isl/gen6: Allow arrayed stencil
Nothing prevents arrayed stencil surfaces even though hardware
doesn't support mipmapping.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-17 06:38:56 +03:00
Rafael Antognolli 3a767f8b06 genxml: The viewport state offset is actually an address.
This fixes code generation on gen45.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-16 15:01:16 -07:00
Rafael Antognolli ad109c16c2 genxml: Rename fields to match gen6+.
"Anti-aliasing Enable" to "Anti-Aliasing Enable".

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-16 15:01:16 -07:00
Rafael Antognolli 1b42cd52a2 genxml: Rename SF_STATE field to match gen6+.
Rename "Use Point Width State" to "Point Width Source". It accepts the same
values and has the same meaning as gen6+, so lets keep them with the same name
to simplify the code.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-16 15:01:16 -07:00
Anuj Phogat c07271fef0 intel/isl: Add the maximum surface size limit
V2: Use 2^31 bytes (2GB) surface size limit on pre-gen9 and
    2^38 bytes for gen9+.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-06-16 09:05:05 -07:00
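The limits named in c07271fef0, together with the switch to uint64_t in 7022978237, can be sketched in C as follows. The function names are hypothetical, and treating the limit as inclusive is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: maximum surface size in bytes for a hardware
 * generation, using the 2^31 (pre-gen9) and 2^38 (gen9+) limits from the
 * commit message. A 32-bit type cannot represent 2^38, hence uint64_t. */
static uint64_t max_surface_size_bytes(int gen)
{
    assert(gen > 0);
    return gen >= 9 ? (UINT64_C(1) << 38) : (UINT64_C(1) << 31);
}

/* Hypothetical check; the inclusive comparison is an assumption. */
static bool surface_size_ok(int gen, uint64_t size_bytes)
{
    return size_bytes <= max_surface_size_bytes(gen);
}
```

Under these assumptions a 4 GiB surface is rejected on gen8 but accepted on gen9.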
Anuj Phogat 7022978237 intel/isl: Use uint64_t to store total surface size
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-06-16 09:05:05 -07:00
Jason Ekstrand 7175561598 intel/blorp: Work around Sandy Bridge occlusion query issue
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-06-14 18:15:05 -07:00
Jason Ekstrand 96f9d4de7d intel/isl: Properly set SeparateStencilBufferEnable on gen5-6
On gen5-6, SeparateStencilBufferEnable and HierarchicalDepthBufferEnable
come hand in hand and we have to set either both or neither.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-06-14 18:15:05 -07:00
Kenneth Graunke af373ea4a2 genxml: Fix Gen4-5 SF_STATE "Line Width" fixed point type.
It's a U3.1.  It became a U3.7 on Sandybridge.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-14 15:56:21 -07:00
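To make the U3.1 versus U3.7 distinction in af373ea4a2 concrete, here is a minimal C sketch of unsigned fixed-point encoding. The function name and the round-to-nearest, clamp-on-overflow behaviour are assumptions of this example, not a description of the code genxml generates.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: encode a non-negative float as an unsigned
 * fixed-point value with int_bits integer bits and frac_bits fractional
 * bits, rounding to nearest and clamping to the largest encodable value. */
static uint32_t encode_ufixed(float value, unsigned int_bits, unsigned frac_bits)
{
    assert(value >= 0.0f);
    assert(int_bits + frac_bits <= 31);

    const uint32_t max = (1u << (int_bits + frac_bits)) - 1u;
    const float scaled = value * (float)(1u << frac_bits) + 0.5f;

    if (scaled >= (float)max)
        return max;
    return (uint32_t)scaled;
}
```

A line width of 1.5 encodes as 3 in U3.1 and as 192 in U3.7, so packing a value with the wrong fractional width changes the width the hardware sees by a large factor.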
Ben Widawsky e179a3438a i965/cnl: Add a preliminary device for Cannonlake
v2 (Anuj):
Rebased on master and updated pci ids
Remove redundant initialization of max_wm_threads to 64 * 12.
For gen9+ max_wm_threads are initialized in gen_get_device_info().

v3 (Anuj):
Move the patch to end of series.
Remove unused gt1, gt2, gt3 functions.
Remove l3_banks variable. Variable is now available on master.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-06-09 16:03:00 -07:00
Jason Ekstrand f2cbf738b4 anv: Don't advertise support on anything above gen9
This will prevent the driver from even trying to work on Cannon Lake
until we get actual support added.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-06-09 16:03:00 -07:00
Anuj Phogat 9acc93feeb i965/cnl: Enable CCS_E and RT support for few formats
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:59 -07:00
Anuj Phogat 61f171292e i965/cnl: Reformat surface_format_info table to accommodate gen10+
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:59 -07:00
Anuj Phogat f9e31a26d4 i965/cnl: Make URB {VS, GS, HS, DS} sizes non multiple of 3
v1: By Ben Widawsky <benjamin.widawsky@intel.com>
v2: v1 had an assert only for VS. Add the restriction for GS, HS and
    DS as well and make sure the allocated sizes are not multiple of 3.
v3: Move the entry_size checks in to compiler code (Ken)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-09 16:02:59 -07:00
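The restriction in f9e31a26d4 can be sketched in C as below. This is an illustrative guess at the shape of such a check: the function name is hypothetical, and bumping the size by one unit is an assumed way of avoiding a multiple of 3, not necessarily what the compiler code does.

```c
#include <assert.h>

/* Hypothetical helper: on gen10, avoid URB entry sizes that are a multiple
 * of 3 by rounding up to the next size. Other generations are untouched. */
static unsigned adjust_urb_entry_size(int gen, unsigned entry_size)
{
    if (gen == 10 && entry_size % 3 == 0)
        entry_size++;

    /* Postcondition: gen10 never ends up with a multiple of 3. */
    assert(gen != 10 || entry_size % 3 != 0);
    return entry_size;
}
```

Applying the same helper to the VS, GS, HS and DS sizes would cover all four stages named in the commit title.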
Anuj Phogat 111881abac i965/cnl: Handle gen10 in switch cases across the driver
V2: Start using gen10 functions isl_gen10*(), gen10_blorp_exec()
    gen10_init_atoms() (Jason)
    Remove Vulkan changes. Do them later in a separate patch.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:59 -07:00
Anuj Phogat 30e749c8f1 i965/cnl: Update few assertions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:59 -07:00
Anuj Phogat 56b4d82729 i965/cnl: Add cnl bits in aubinator
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:58 -07:00
Anuj Phogat dc83ce7a16 i965/cnl: Wire up android Mesa build files for gen10
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-06-09 16:02:58 -07:00
Anuj Phogat e01c5a6824 i965/cnl: Wire up Mesa build files for gen10
V2: Remove isl_gen10.c and isl_gen10.h

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-09 16:02:58 -07:00
Anuj Phogat 2417d5ca19 intel/genxml: Update genx_bits for gen10+
This commit adds a gen10 case to the switch statement and
drops some unneeded code for handling gen numbers which
doesn't work on gen10 and above.

V2: Drop "z = float(z)" and the "z *= 10" lines

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:58 -07:00
Anuj Phogat 98b95a3735 i965/cnl: Add gen10 specific function declarations
These declarations will help the code start compiling
once we wire up the makefiles for gen10. Later patches
will start using these functions for gen10.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:58 -07:00
Anuj Phogat 2704ccc646 i965/cnl: Include gen10_pack.h
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:58 -07:00
Anuj Phogat a48cb9cf7f i965/cnl: Define genX(x) and GENX(x) for gen10
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:58 -07:00
Jason Ekstrand aa416f515a i965/genxml: Add gen10.xml
V2(Anuj):
Add default value for length of 3DPRIMITIVE command
Add values for 'Attribute Active Component Format'
Rename a few fields to match gen9.xml

V3 (Ander Conselvan de Oliveira)
Add gen10 alias for MOCS
Make 3DSTATE_CONSTANT_BODY on Gen10 use arrays

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-06-09 16:00:49 -07:00
Ben Widawsky d968f072bc i965: Make feature macros gen8 based
All the "features" of the hardware are similar starting with GEN8, so remove as
much of the GEN9 uniqueness as possible. This makes implementing future gen
platforms a bit easier.

Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 15:27:14 -07:00
Jason Ekstrand a59c7f834c intel/isl: Add an enum for describing auxiliary compression state
This enum describes all of the states that an auxiliary compressed
surface can have.  All of the states as well as normative language for
referring to each of the compression operations is provided in the
truly colossal comment for the new isl_aux_state enum.  There is also
a diagram showing how surfaces move between the different states.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand bacae7221b blorp: Use FullSurfaceDepthandStencilClear for blorp_hiz_op
The blorp_hiz_op entrypoint always acts on a full subresource of a HiZ
buffer so we can just set the flag unconditionally.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 08:54:54 -07:00
Jason Ekstrand 9cb6ac62fb intel/blorp: Plumb through access to the workaround BO
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101283
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 08:54:54 -07:00
Nanley Chery ed5801864e anv/blorp: Move the depth cache flush outside of BLORP
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-07 08:54:54 -07:00
Jason Ekstrand fbd8a33f61 intel/blorp: Refactor the HiZ op interface
This commit does a few things:

 1) Now that BLORP can do HiZ ops on gen8+, drop the gen6 prefix.
 2) Switch parameters to uint32_t to match the rest of blorp.
 3) Take a range of layers and loop internally.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 08:54:54 -07:00
Jason Ekstrand f9fd976e8a i965/miptree: Store fast clear colors in an isl_color_value
This commit, out of necessity, makes a number of changes at once:

 1) Changes intel_mipmap_tree to store the clear color for both color
    and depth as an isl_color_value.

 2) Changes the depth/stencil emit code to do the format conversion of
    the depth clear value on Haswell and earlier instead of pulling a
    uint32_t directly from the miptree.

 3) Changes ISL's depth/stencil emit code to perform the format
    conversion of the depth clear value on Haswell and earlier instead
    of assuming that the depth value in the float is pre-converted.

 4) Changes blorp to pass the depth value through as a float.

 5) Changes the Vulkan driver to pass the depth value to blorp as a
    float rather than a uint.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 08:54:54 -07:00
Anuj Phogat 8d02916e0c intel: Fix broxton 2x6 way size computation
This patch undoes the changes to the way size computation
for Broxton 2x6 made by the commit below:

Commit: 0d576fbfbe
Author:     Anuj Phogat <anuj.phogat@gmail.com>
i965: Simplify l3 way size computations

By making use of the l3_banks field in the gen_device_info struct,
l3_way_size for gen7+ = 2 * l3_banks.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101306
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 21:30:51 -07:00
Eric Engestrom 63a8a88ac4 tree-wide: remove trailing backslash
Simple search for a backslash followed by two newlines.
If one of the newlines were to be removed, this would cause issues, so
let's just remove these trailing backslashes.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-06-07 01:18:09 +01:00
Alex Smith 922b038864 anv: Set better descriptor set limits
Based on discussions with Jason, Ivy Bridge and Bay Trail only actually
support 16 samplers, while newer hardware can support more than the
current limit of 64. Therefore set the lower limit where needed, and
bump up to 128 for everything else. There is also a limit on the total
number of other resources of around 250.

This allows Dawn of War III to render correctly on ANV.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-06 08:20:09 -07:00
Alex Smith 59c1797d56 anv: Set driver version to Mesa version
As already done by RADV.

v2: Move version calculation function to src/vulkan/util to share with
    RADV.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-06 08:20:00 -07:00
Alex Smith 621b3410f5 util/vulkan: Move Vulkan utilities to src/vulkan/util
We have Vulkan utilities in both src/util and src/vulkan/util. The
latter seems a more appropriate place for Vulkan-specific things, so
move them there.

v2: Android build system changes (from Tapani Pälli)

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-06 08:17:13 -07:00
Lionel Landwerlin 2ef73473c8 intel: gen-decoder: rework how we handle groups
The current way of handling groups doesn't seem to be able to handle
MI_LOAD_REGISTER_* with more than one register. This change reworks
the way we handle groups by building a traversal list on loading the
GENXML files.

Let's say you have

Instruction {
  Field0
  Field1
  Field2
  Group0 (count=2) {
    Field0-0
    Field0-1
  }
  Group1 (count=4) {
    Field1-0
    Field1-1
  }
}

On load, we build a linked list that goes:

Instruction -> Group0 -> Group1

All of those are gen_group structures, making the traversal trivial.
We just need to iterate over each group the right number of times (the
count field in genxml).

The fancier case is when you have only a single group of unknown
size (count=0). In that case we keep on reading that group for as long
as we're within the DWordLength of that instruction.
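
The traversal described in this message can be sketched in Python roughly as follows. This is an illustrative model, not the decoder's actual code: the `Group` class, its `size_dwords` attribute, and the `link` and `traverse` helpers are all hypothetical names, and the only behavior taken from the message is the linked list of groups, the per-group repeat count, and the count=0 case that keeps reading while within the instruction length.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class Group:
    """Hypothetical stand-in for a gen_group: fields plus a repeat count."""
    name: str
    fields: List[str]
    count: int = 1          # count == 0 means "unknown size"
    size_dwords: int = 1    # dwords consumed by one iteration (assumed attribute)
    next: Optional["Group"] = None


def link(groups: List[Group]) -> Group:
    """Chain the groups into a singly linked list and return the head."""
    for cur, nxt in zip(groups, groups[1:]):
        cur.next = nxt
    return groups[0]


def traverse(head: Group, total_dwords: int) -> Iterator[Tuple[str, int]]:
    """Yield (field name, iteration index) pairs in decode order."""
    used = 0
    group: Optional[Group] = head
    while group is not None:
        if group.size_dwords <= 0:
            raise ValueError("size_dwords must be positive")
        if group.count == 0:
            # Unknown size: keep reading while we stay within the instruction.
            i = 0
            while used + group.size_dwords <= total_dwords:
                for name in group.fields:
                    yield (name, i)
                used += group.size_dwords
                i += 1
        else:
            for i in range(group.count):
                for name in group.fields:
                    yield (name, i)
                used += group.size_dwords
        group = group.next
```

With the example layout above (three instruction fields, Group0 with count=2 and Group1 with count=4, two fields each), this sketch would visit 3 + 2*2 + 4*2 = 15 field instances.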

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-06 14:04:37 +01:00
Kenneth Graunke 9cd69022d5 i965: Change INTEL_DEBUG=vec4 to INTEL_SCALAR_VS for consistency.
We moved to INTEL_SCALAR_* when we added more than a single stage, but
never went back and converted the VS to work that way.  Be consistent.

Also update the documentation to actually mention these debug variables.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-05 23:32:40 -07:00
Anuj Phogat 0d576fbfbe i965: Simplify l3 way size computations
By making use of the l3_banks field in the gen_device_info struct,
l3_way_size for gen7+ = 2 * l3_banks.

V2: Keep the get_l3_way_size() function.

Suggested-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-06-02 16:21:56 -07:00
Anuj Phogat eb23be1d97 i965: Add and initialize l3_banks field for gen7+
This new field helps simplify l3 way size computations
in the next patch.

V2: Initialize the l3_banks to 0 in macros.

Suggested-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-06-02 16:21:56 -07:00
Jason Ekstrand 1a22c4c960 intel/blorp: Handle gen6 stencil/HiZ offsets in the back-end
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:34:01 -07:00
Jason Ekstrand d065a9540c intel/isl: Add a helper for getting the byte/tile offset of a subimage
Frequently, get_image_offset_sa is combined with get_intratile_offset_sa
so it makes sense to have a single helper to do both.  If the caller
doesn't want the intratile offsets, it can simply pass NULL and ISL will
assert that they are 0.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:58 -07:00
Jason Ekstrand b178762d05 intel/isl: Make get_intratile_offset_el take the element size in bits
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:56 -07:00
Jason Ekstrand 757f7087a5 intel/isl: Add a new layout for HiZ and stencil on Sandy Bridge
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:47 -07:00
Jason Ekstrand cb8cdab8e8 intel/isl: Generate phys_total_el from isl_calc_phys_extent
The only surface layout for which slice0 makes any sense is GEN4_2D.
Move all of the slice0 stuff into isl_calc_phys_total_extent_el_gen4_2d
and make the others trivially return the total size in surface elements.
As a side-effect, array_pitch_el_rows is now returned from these helpers
as well.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:45 -07:00
Jason Ekstrand 918f41bb29 intel/isl: Don't check array pitch for gen4 3D textures
Array pitch doesn't matter in this layout.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:43 -07:00
Jason Ekstrand 044bfb292f intel/isl: Refactor to use a phys_total_el extent.
We've already implicitly been using a physical total size in surface
elements.  This just centralizes things a bit.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:41 -07:00
Jason Ekstrand 1547d133ac intel/isl: Add an isl_assert_div helper
This is a fairly common operation and it's nice to be able to just call
the one little function.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:39 -07:00
Jason Ekstrand 58051ad220 intel/isl: Refactor isl_calc_array_pitch_el_rows
Over 90% of the function only applies to ISL_DIM_LAYOUT_GEN4_2D anyway
so we can just handle the other two as special cases at the top.  The
two "generic" cases below the switch only apply on gen9 and above and
only to 3D or CCS surfaces.  This implies that they only apply to
surfaces with ISL_DIM_LAYOUT_GEN4_2D.  Making them look generic is a
lie.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:37 -07:00
Jason Ekstrand fe13c59c1b intel/isl: Move isl_calc_array_pitch_el_rows higher up
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:34 -07:00
Jason Ekstrand c1a70165be intel/isl: Remove the device parameter from isl_tiling_get_info
We were only using it for validating that we don't use Ys/Yf on gen8 and
earlier.  Removing it from isl_tiling_get_info lets us remove it from a
bunch of other things that had no business needing a hardware
generation.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:31 -07:00
Kenneth Graunke fe14a9a501 i965: Drop duplicate shadow variable.
We already initialized this at the top of the function.

Trivial.
2017-06-01 14:28:12 -07:00
Kenneth Graunke fe9699dcb4 genxml: Make 3DSTATE_CONSTANT_BODY on Gen7+ use arrays.
This will let us initialize the constant buffers with loops.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-01 11:49:46 -07:00
Kenneth Graunke 12303bd390 genxml: Fix decoder to print the array element on field members.
Previously we'd print things like:

   0xfffbb568:  0x00010000 : Dword 1
       ReadLength: 0
       ReadLength: 1
   0xfffbb568:  0x00000001 : Dword 1
       ReadLength: 1
       ReadLength: 0

instead of the more obvious:

   0xfffbb568:  0x00010000 : Dword 1
       ReadLength[0]: 0
       ReadLength[1]: 1
   0xfffbb568:  0x00000001 : Dword 1
       ReadLength[2]: 1
       ReadLength[3]: 0

(Yes, the ralloc context here is bogus - the decoder leaks just about
everything.  We need to use proper ralloc contexts someday...)

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-01 11:49:46 -07:00
Kenneth Graunke 73c21e69d0 genxml: Fix decoding of array groups.
If you had a group as the first element of a struct, i.e.

  <struct name="3DSTATE_CONSTANT_BODY" length="10">
    <group count="4" start="0" size="16">
      <field name="ReadLength" start="0" end="15" type="uint"/>
    </group>
    ...
  </struct>

we would get a group_offset of 0, causing create_field() to think the
field wasn't in a group, and fail to offset forward for successive array
elements.  So we'd mark all the array elements as offset 0.

Using ctx->group->elem_size is a better check for "are we in a group?".
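
To make the offsetting concrete, here is a small Python sketch of how the bit range of a field could be computed for each array element, keyed on the element size rather than on the group offset. The function name and parameters are hypothetical. They simply mirror the `count`, `start` and `size` attributes of the `<group>` element and the `start`/`end` attributes of the `<field>` element shown above.

```python
from typing import List, Tuple


def field_bit_ranges(group_start: int, group_count: int, elem_size: int,
                     field_start: int, field_end: int) -> List[Tuple[int, int]]:
    """Return the (start, end) bit positions of a field for each array element."""
    # "Are we in a group?" is decided by the element size, not by the group
    # offset, so a group starting at bit 0 is still treated as a group.
    in_group = elem_size > 0
    if not in_group:
        return [(field_start, field_end)]
    return [
        (group_start + i * elem_size + field_start,
         group_start + i * elem_size + field_end)
        for i in range(group_count)
    ]
```

For the 3DSTATE_CONSTANT_BODY example (count=4, start=0, size=16, field bits 0 to 15), this gives four distinct ranges starting at bits 0, 16, 32 and 48 instead of four copies of the same range.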

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-01 11:49:45 -07:00
Kenneth Graunke d1b949282f genxml: Fix decoder for groups with multiple fields.
If you have something like:

    <group count="0" start="96" size="32">
      <field name="Entry_0" start="0" end="15" type="GATHER_CONSTANT_ENTRY"/>
      <field name="Entry_1" start="16" end="31" type="GATHER_CONSTANT_ENTRY"/>
    </group>

We would reset ctx->group_count to 0 after processing the first field,
so the second would not have a group count.

This is largely untested, as the only groups with multiple fields are
packets we don't emit in Mesa.  Found by inspection.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-01 11:49:45 -07:00
Kenneth Graunke df2d55ba57 genxml: Fix parsing of address fields in groups.
For example,

    <group count="4" start="64" size="64">
      <field name="Pointer" start="5" end="63" type="address"/>
    </group>

used to generate:

   const uint64_t v2_address =
      __gen_combine_address(data, &dw[2], values->Pointer, 0);
   ...
   const uint64_t v4_address =
      __gen_combine_address(data, &dw[4], values->Pointer, 0);
   ...

but now generates code with proper subscripts:

   const uint64_t v2_address =
      __gen_combine_address(data, &dw[2], values->Pointer[0], 0);
   ...
   const uint64_t v4_address =
      __gen_combine_address(data, &dw[4], values->Pointer[1], 0);
   ...
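
The subscripting can be illustrated with a short Python sketch that emits one address line per array element. This is not the actual generator code: `emit_address_fields` and its parameters are hypothetical, and only the shape of the output lines is taken from the example above.

```python
from typing import List


def emit_address_fields(count: int, group_start: int, group_size: int,
                        field_start: int, field_name: str) -> List[str]:
    """Emit C lines for an address field repeated `count` times in a group."""
    lines = []
    for i in range(count):
        # Dword that holds the first bit of this array element's field.
        dw = (group_start + i * group_size + field_start) // 32
        lines.append(f"const uint64_t v{dw}_address =")
        lines.append(
            f"   __gen_combine_address(data, &dw[{dw}], "
            f"values->{field_name}[{i}], 0);"
        )
    return lines
```

Called with count=4, group_start=64, group_size=64, field_start=5 and the name "Pointer", the sketch produces the dw[2]/Pointer[0] and dw[4]/Pointer[1] pairs shown above, followed by dw[6] and dw[8].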

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-01 11:49:45 -07:00
Kenneth Graunke 65f5f3c85c i965: Move SOL PSIZ hacks from draw time to link time.
We can just update the gl_transform_feedback_info fields at link time
to make the VUE header fields have the right location and component.
Then we don't need to handle them specially at draw time, which is
expensive.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-01 00:08:29 -07:00
Kenneth Graunke 56535959fd anv: Port over CACHE_MODE_1 optimization fix enables from brw.
Ben and I haven't observed these to help anything, but they enable
hardware optimizations for particular cases.  It's probably best to
enable them ahead of time, before we run into such a case.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-30 14:59:31 -07:00
Kenneth Graunke 53368b008e genxml: Add Gen9 CACHE_MODE_1 definitions.
These were already in gen8.xml but not gen9.xml.  There are a few new
fields and a couple that have changed.  These are all documented in the
Skylake PRM, Volume 2c Command Reference: Registers, Part 1.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-30 14:59:31 -07:00
Kenneth Graunke 9afe5846d2 genxml: Make a SCISSOR_RECT structure on Gen4-5.
Gen6+ support multiple scissor rectangles, and define a SCISSOR_RECT
structure containing their dimensions.  On Gen4-5, those same fields
exist in SF_VIEWPORT.

This patch extracts the SF_VIEWPORT fields into a SCISSOR_RECT
structure.  Although not a named concept on Gen4-5, it works just
as well, and gives us a consistent SCISSOR_RECT structure across
all generations, making it easier to reuse code.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-29 21:46:37 -07:00
Kenneth Graunke 1e3880544e i965: Ignore INTEL_SCALAR_* debug variables on Gen10+.
Scalar mode has been default since Broadwell, and vector mode is getting
increasingly unmaintained.  There are a few things that don't even fully
work in vector mode on Skylake, but we've never cared because nobody
uses it.  There's no point in porting it forward to new platforms.

So, just ignore the debug options to force it on.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-29 21:40:44 -07:00
Emil Velikov 3e8790bff0 anv: automake: list shared libraries after the static ones
The compiler can discard the shared ones from the link chain, since
there is no user (the static libraries) before them on the command line.

Cc: mesa-stable@lists.freedesktop.org
Reported-by: Laurent Carlier <lordheavym@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-05-29 16:42:41 +01:00
Jason Ekstrand 79f2a5541f i965: Use BLORP for color clears on gen4-5
We don't support replicated data clears yet.  Those take a bit more work
and enabling replicated data clears in its own commit is probably better
for bisectability anyway.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand fa13ef285d intel/blorp: Assert that no one tries to blit combined depth stencil
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 752d7af77a i965: Add blorp support for gen4-5
Due to complications with things such as URB setup on gen4-5, it's
easier to keep gen4 support in blorp completely internal to i965.  This
makes things a bit awkward because that means there's a file in i965
that includes blorp_priv.h but it's either that or have a file in blorp
that includes brw_context.h.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 23125b7102 intel/blorp: Set additional brw_wm_prog_key fields on gen4-5
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 0ed6f196fc intel/blorp: Add support for gen4-5 SF programs
As part of enabling support for SF programs, we plumb the SF URB size
through to emit_urb_config.  For now, it's always zero but, on gen4, it
may be something larger.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 8bce7bda45 intel/blorp: Make convert_to_single_slice available outside blorp_blit
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 110061afa2 intel/blorp: Use designated initializers to set up VERTEX_ELEMENTS
We also add a slot variable and use it as an iterator.  This will make
it much easier to conditionally put something between the header and the
vertex position.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand ac79806766 intel/blorp: Rename emit_viewport_state to emit_cc_viewport
The real point of this packet is that it sets up CC_VIEWPORT so that
name is a bit better.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 1f2f90be1f intel/blorp: Make the common genX_blorp_exec code gen4-safe
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand a7f5d6df8a intel/blorp: Re-arrange blorp_genX_exec.h
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 302c0488cf intel/blorp: Don't use ffma directly
It isn't supported prior to gen6 and, on gen6+, NIR will fuse the fmul
and fadd into an ffma automatically for us anyway.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 675ec434f3 intel/blorp: Delete isl_to_gen_ds_surftype
It's no longer used.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand e80f0840bf intel/blorp: Pull the pipeline bits of blorp_exec into a helper
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 3d35e5a51e intel/blorp/blit: Add support for normalized coordinates
Gen5 and earlier can't do non-normalized coordinates so we need to
compensate in the shader.  Fortunately, it's pretty easy to plumb through.
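
The compensation amounts to scaling texel-space coordinates by the surface dimensions. A minimal Python sketch of that arithmetic follows. The function name and the use of plain floats are assumptions for illustration, and the real change lives in shader code rather than on the CPU.

```python
from typing import Tuple


def normalize_coords(x: float, y: float,
                     width: int, height: int) -> Tuple[float, float]:
    """Map texel-space (x, y) to normalized coordinates in [0, 1]."""
    if width <= 0 or height <= 0:
        raise ValueError("surface dimensions must be positive")
    return (x / width, y / height)
```

For a 256x256 surface, the texel-space point (128, 64) would map to (0.5, 0.25).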

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 18e18a1863 i965: Move clip program compilation to the compiler
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 9fb8a8775b i965: Move SF compilation to the compiler
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 21ba2b4bef intel/compiler: Make brw_disasm take const assembly
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand c336c224a6 intel/decoder: Handle the BLT ring in gen_group_get_length
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 9d1001c8e5 intel/decoder: Handle gen4 VF_STATISTICS and PIPELINE_SELECT
These need special handling because they have no "DWord Length"
parameter and they have an unusual bias of 1.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 87588e546e intel/genxml: Rename 3DSTATE_AA_LINE_PARAMS on gen5
All of the other gens use "PARAMETERS".

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 04f6d975e1 intel/genxml: Use the right subtype for VF_STATISTICS on gen4
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 1fcc5e2399 intel/genxml: Iron Lake doesn't support non-normalized sampler coordinates
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 648b618dc5 intel/genxml: Add SAMPLER_STATE to gen 4.5
Somehow this got missed.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 3f8ee8c703 intel/genxml: Rename the CC_VIEWPORT pointer on gen4-5
It isn't a pointer to "color calc state", that's the packet it's in.
It's a pointer to the CC viewport state.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 0ee1ef0cbb intel/genxml: Sampler state is a pointer on gen4-5
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 64243d3b8e intel/genxml: Suffix KSP0 fields on Iron Lake
Iron Lake introduced the multiple KSP thing and so you have KSP0-3.
However, the genxml didn't have an index on the first "Kernel Start
Pointer" or "GRF Register Count".  Add one to match gen6+.  While we're
here, we drop the brackets from the other "GRF Register Count" fields.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 7769e448aa intel/genxml: Make a bunch of things offsets on gen4-5
Most things on gen4-5 are addresses because we don't have dynamic state
base address and we don't have instruction state base on gen4.  However,
whoever converted things to addresses got a little over-excited and
converted too much.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 8257fe7b18 intel/isl: Add gen4_filter_tiling
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand 332a5d7a3f intel/isl: Add support for setting component write disables
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00