Commit Graph

2206 Commits

Author SHA1 Message Date
Matt Turner ce6b8627d8 i965: Validate destination restrictions with vector immediates
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner 1d79c828d8 i965: Don't let raw-move check be tricked by immediate vector types
UB and B type encodings are the same as UV and VF. Noticed when writing
the following patch.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner 48aa6ecb87 i965: Only change type of 0.0f to VF if destination stride == 1
The destination stride must be equivalent to a dword if VF is used.

Also, since the only compaction table entires with "i:vf" have the
destination as "r:f" specifically check that the destination is of type
float.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner 56a676eed2 i965: Remove CONT/BREAK from instruction compaction test
These cannot be compacted. A similar mistake was fixed in commit
90eaf01616

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner 3d661e6062 i965: Test instruction compaction on all supported Gens
Note that there's no point in testing on G45, since its compaction is
the same as Gen5. Same logic applies to Gen7 variants and low-power
parts.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner 9ff7d9b853 i965: Silence signed/unsigned comparison warning
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner eac89911e5 i965: Move compaction "prepass" into brw_eu_compact.c
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner 17641f6388 i965: Mark src inst pointer const in compaction code
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Topi Pohjolainen 393ec1a507 intel/blorp: Adjust intra-tile x when faking rgb with red-only
v2 (Jason): Adjust directly in surf_fake_rgb_with_red()

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101910

CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-08-21 09:55:08 +03:00
Kenneth Graunke 6f8a577ed2 anv: Use ISL for emitting null surface states.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-19 00:46:48 -07:00
Kenneth Graunke 5db9757bd7 isl: Add a null surface fill function.
ISL already offers functions to fill out most kinds of SURFACE_STATE,
so why not handle null surfaces too?

Null surfaces are simple, so we can just take the dimensions, rather
than an entirte fill structure.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-19 00:46:36 -07:00
Eric Anholt 9caba0f16f anv: Move a comment that got left behind in the u_vector refactor. 2017-08-18 11:56:58 -07:00
Jason Ekstrand 1af8342b0c intel/isl: Replace switch statements of doom with a macro
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-17 18:09:05 -07:00
Jason Ekstrand 2d68d27071 intel/isl: Reduce header file duplication
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-17 18:09:05 -07:00
Jason Ekstrand bf1d2e84f3 anv/gem: Add a stub for sync_file_merge
This fixes make check

Fixes: 5c4e4932e0
2017-08-16 18:44:26 -07:00
Jason Ekstrand 98983503cb anv: Advertise VK_KHR_external_semaphore
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand 55bce22d8d anv: Use DRM sync objects for external semaphores when available
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand f41a0e4b0d anv/gem: Add a drm syncobj support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand 5c4e4932e0 anv: Implement support for exporting semaphores as FENCE_FD
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand e4054ab77b anv/gem: Use EXECBUFFER2_WR when the FENCE_OUT flag is set
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand 017cdb10cf anv: Submit a dummy batch when only semaphores are provided.
Vulkan allows you to do a submit whose only job is to wait on and
trigger semaphores.  The easiest way for us to support that right
now is to insert a dummy execbuf.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand 031f57eba3 anv: Add a basic implementation of VK_KHX_external_semaphore
This patch adds an implementation based on DRM BOs.  We don't actually
advertise the extension yet because we want to add a couple more paths
first.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Scott D Phillips d6539608a4 intel/genxml: Fix gen10 BLEND_STATE variable length packing
BLEND_STATE packing was modified to be variable-length in:

 9670124e31 genxml: Make BLEND_STATE command support variable length array.

The initial gen10.xml still had the old, fixed-length style
definition for BLEND_STATE. So gen10_upload_blend_state would
overwrite the packed BLEND_STATE_ENTRYs with its own fixed array
of all-zero entries when packing BLEND_STATE. This caused
BLEND_STATE upload to not work at all.

Fixes: aa416f515a ("i965/genxml: Add gen10.xml")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-08-15 09:06:29 -07:00
Ben Widawsky 8f6e54c929 i965: Pretend that CCS modified images are two planes
v2: move is_aux into if block. (Jason)
Use else block instead of goto (Jason)

v3: Fix up logic for is_aux (Ben)
Fix up size calculations and add FIXME (Ben)

v4 (Jason Ekstrand):
Use the aux_pitch in the image instead of calculating it

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-08-14 10:43:30 -07:00
Jason Ekstrand cf2e92262b intel/isl: Add support for I915_FORMAT_MOD_Y_TILED_CCS
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-08-14 10:43:30 -07:00
Iago Toral Quiroga 81615ad444 intel/compiler: properly size attribute wa_flags array for Vulkan
Mesa will map user defined vertex input attributes to slots
starting at VERT_ATTRIB_GENERIC0 which gives us room for only 16
slots (up to GL_VERT_ATTRIB_MAX). This sufficient for GL, where
we expose exactly 16 vertex attributes for user defined inputs, but
in Vulkan we can expose up to 28 (which are also mapped from
VERT_ATTRIB_GENERIC0 onwards) so we need to account for this when
we scope the size of the array of attribute workaround flags
that is used during the brw_vertex_workarounds NIR pass. This
prevents out-of-bounds accesses in that array for NIR shaders
that use more than 16 vertex input attributes.

Fixes:
dEQP-VK.pipeline.vertex_input.max_attributes.*

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-11 10:41:44 +02:00
Kenneth Graunke 5563872dbf isl: Validate row pitch of stencil surfaces.
Also, silence an obnoxious finishme that started occurring for all
GL applications which use stencil after the i965 ISL conversion.

v2: Check against 3DSTATE_STENCIL_BUFFER's pitch bits when using
    separate stencil, and 3DSTATE_DEPTH_BUFFER's bits when using
    combined depth-stencil.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-10 15:18:58 -07:00
Jason Ekstrand 4d27c6095e intel/isl: Don't align the height of the last array slice
We were calculating the total height of 2D surfaces by multiplying the
row pitch by the number of slices.  This means that we actually request
slightly more space than actually needed since the padding on the last
slice is unnecessary.  For tiled surfaces this is not likely to make a
difference.  For linear surfaces, on the other hand, this means we may
require additional memory.  In particular, this makes the i965 driver
reject EGL imports of buffers which do not have this extra padding.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
2017-08-07 09:31:11 -07:00
Jason Ekstrand c15b92ce11 intel/isl: Stop padding surfaces
The docs contain a bunch of commentary about the need to pad various
surfaces out to multiples of something or other.  However, all of those
requirements are about avoiding GTT errors due to missing pages when the
data port or sampler accesses slightly out-of-bounds.  However, because
the kernel already fills all the empty space in our GTT with the scratch
page, we never have to worry about faulting due to OOB reads.  There are
two caveats to this:

 1) There is some potential for issues with caches here if extra data
    ends up in a cache we don't expect due to OOB reads.  However,
    because we always trash the entire cache whenever we need to move
    anything between cache domains, this shouldn't be an issue.

 2) There is a potential issue if a surface gets placed at the very top
    of the GTT by the kernel.  In this case, the hardware could
    potentially end up trying to read past the top of the GTT.  If it
    nicely wraps around at the 48-bit (or 32-bit) boundary, then this
    shouldn't be an issue thanks to the scratch page.  If it doesn't,
    then we need to come up with something to handle it.

Up until some of the GL move to ISL, having the padding code in there
just caused us to harmlessly use a bit more memory in Vulkan.  However,
now that we're using ISL sizes to validate external dma-buf images,
these padding requirements are causing us to reject otherwise valid
images due to the size of the BO being too small.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
2017-08-07 09:31:11 -07:00
Jason Ekstrand 06d3115bb9 anv/formats: Allow sampling on depth-only formats on gen7
We can't sample from depth-stencil formats but on gen7 but we can sample
from depth-only formats.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102024
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-08-07 08:27:09 -07:00
Chris Wilson 6c530ad116 i965: Reduce passing 2x32b of reloc_domains to 2 bits
The kernel only cares about whether the object is to be written to or
not, only reduces (reloc.read_domains, reloc.write_domain) down to just
!!reloc.write_domain. When we use NO_RELOC, the kernel doesn't even read
those relocs and instead userspace has to pass that information in the
execobject.flags. We can simplify our reloc api by also removing the
unused read/write domains and only pass the resultant flags.

The caveat to the above are when we need to make the kernel aware that
certain objects need to take into account different work arounds.
Previously, this was done using the magic (INSTRUCTION, INSTRUCTION)
reloc domains. NO_RELOC requires this to be passed in the execobject
flags as well, and now we push that up the callstack.

The API is more compact, more expressive of what happens underneath, but
unfortunately requires more knowledge of the system at the point of use.
Conversely it also means that knowledge is specific and not generally
applied and so not overused.

   text	   data	    bss	    dec	    hex	filename
8502991	 356912	 424944	9284847	 8dacef	lib/i965_dri.so (before)
8500455	 356912	 424944	9282311	 8da307	lib/i965_dri.so (after)

v2: (by Ken) Rebase.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-04 10:26:37 -07:00
Juan A. Suarez Romero 86c68e0a33 anv: Makefile.vulkan.am: ICD json files are now generated with python
Commit 0ab04ba979 (anv: Use python to generate ICD json files) changed
the way ICD json files are created.

Remove the old .in files from extra dist, and add the python script.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-04 09:54:46 +02:00
Lionel Landwerlin 1006cd512d anv: put anv_extensions.c in gitignore
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-03 16:14:45 +01:00
Tapani Pälli ca6237eb4f android: anv_extensions.c is generated to libmesa_vulkan_common
Fixes build error with anv_extensions.c not found for
libmesa_anv_entrypoints.

Fixes: d62063c "anv: Autogenerate extension query and lookup"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-03 13:09:59 +03:00
Dave Airlie 271fa3a684 intel/vec4/gs: reset nr_pull_param if DUAL_INSTANCED compile failed.
If dual object compile fails (as seems to happen with virgl a
fair bit, and does piglit even have any tests for it?), we end up
not restarting the pull params, so we call
vec4_visitor::move_uniform_array_access_to_pull_constant
a second time and it runs over the ends of the alloc.

Fixes: tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test
running inside virgl on ivybridge.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-03 16:54:08 +10:00
Matt Turner 858f554078 i965: Fix indentation 2017-08-02 16:49:32 -07:00
Jason Ekstrand 277644221d anv: Advertise VK_KHR_relaxed_block_layout
There is literally no work for us to do here.  It already just works in
our driver.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-02 09:13:13 -07:00
Jason Ekstrand 600605e3fc anv: Bump the advertised version to 1.0.57
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-02 09:13:13 -07:00
Jason Ekstrand 077b200096 anv: Pull the API version from anv_extensions.py
This way everything stays in sync and we only have the one version
number.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-02 09:13:13 -07:00
Jason Ekstrand 0ab04ba979 anv: Use python to generate ICD json files
This is more lines of code but the python is far easier to read than the
sed expressions we were using before.  Also, this allows us to pull the
API version from anv_entrypoints.py so it never gets out-of-sync.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-02 09:13:13 -07:00
Jason Ekstrand 7382d8a416 anv: Add MAX_API_VERSION to anv_extensions.py
The VkVersion class is probably overkill but it makes it really easy to
compare versions in a way that's safe without the caller having to think
about patch vs. no patch.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-02 09:13:13 -07:00
Jason Ekstrand a25267654b anv: Make some bits of anv_extensions module-private
This way we can use "from anv_extensions import *" in the entrypoint
generator without worrying too much about pollution

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-02 09:13:13 -07:00
Tapani Pälli 99c764b647 intel: move gen_decoder.* back to COMMON_FILES
this change reverts commit 4f695731, we want to be able to build
with -DDEBUG and gen_decoder on Android.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-02 10:31:13 +03:00
Tapani Pälli 9bd15da85d android: link libmesa_intel_common with zlib and expat
Makes it possible to build Mesa on Android with -DDEBUG with
the next patch that reverts 4f695731.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-02 10:30:50 +03:00
Jason Ekstrand d62063ce31 anv: Autogenerate extension query and lookup
As time goes on, extension advertising is going to get more complex.
Today, we either implement an extension or we don't.  However, in the
future, whether or not we advertise an extension will depend on kernel
or hardware features.  This commit introduces a python codegen framework
that generates the anv_EnumerateFooExtensionProperties functions as well
as a pair of anv_foo_extension_supported functions for querying for the
support of a given extension string.  Each extension has an "enable"
predicate that is any valid C expression.  For device extensions, the
physical device is available as "device" so the expression could be
something such as "device->has_kernel_feature".  For instance
extensions, the only option is VK_USE_PLATFORM defines.

This mechanism also means that we have a single one-line-per-entry table
for all extension declarations instead of the two tables we had in
anv_device.c and the one we had in anv_entrypoints_gen.py.  The Python
code is smart and uses the XML to determine whether an extension is an
instance extension or device extension.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-01 11:12:41 -07:00
Jason Ekstrand ddc86c1d0e anv: Add a new centralized extensions file
This will allow us to keep everything in one place when it comes to
declaring what extensions are supported.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-01 11:12:41 -07:00
Iago Toral Quiroga 31f1863ace anv: only expose up to 28 vertex attributes
The EU limit of 128 GRFs should allow 32 vertex elements of 4 GRFs.
However, the maximum allowed value of "Vertex URB Entry Read Length"
in SIMD8 is 15. And 15 * 8 = 120 gives us a limit of 30 vertex elements.
Because we also need to reserve a vertex buffer to upload
VertexIndex/InstanceIndex and another to upload DrawID when needed,
we can only expose 28.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-26 08:16:43 +02:00
Iago Toral Quiroga a848e693ef anv/cmd_buffer: fix off by one error in assertion
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-26 08:02:06 +02:00
Eric Anholt decd2b32aa intel/decoder: Reuse the gen_make_gen() helper.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-25 14:44:52 -07:00
Eric Anholt 19ffa4bfb2 intel/decoder: Reuse the MAX2 macro instead of defining another one.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-25 14:44:52 -07:00
Emil Velikov 5d47dd9c2a intel/blorp: ship blorp_genX_exec.h within the tarball
Fixes: c9cb37b2a6 ("intel/blorp: Add a partial resolve pass for MCS")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-24 15:14:21 +01:00
Jason Ekstrand 6874b953f6 anv/image: zalloc image views
This allows us to avoid some extra zeroing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand a1cad8218e anv/image: Use vk_zalloc instead of an explicit memset
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand 1e32c8303a anv: Separate surface states by layout instead of aux_usage
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand 628bfaf1c6 intel/isl: Add some sanity checks for compressed surfaces
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand 5de4209f91 intel/isl: Add a helper to get a subimage surface
We already have a helper for doing this in BLORP, this just moves the
logic into ISL where we can share it with other components.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand 72bc38cfc5 anv: Get rid of some unused function declarations
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand d4de403f91 intel/isl: Add a helper for determining if a color is 0/1
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand b26b2490e5 intel/blorp: Allow blorp_copy on sRGB formats
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand fb86ac94cb intel/isl/format: Add an srgb_to_linear helper
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand 44e9d65757 intel/isl/format: Dedent the template in gen_format_layout.py
This makes it much easier to edit the template and doesn't really dirty
the python all that much.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand 268ba028dc intel/isl: Add an aux state for "partial clear"
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand c9cb37b2a6 intel/blorp: Add a partial resolve pass for MCS
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Nanley Chery 67027ddf3f anv: Predicate fast-clear resolves
Image layouts only let us know that an image *may* be fast-cleared. For
this reason we can end up with redundant resolves. Testing has shown
that such resolves can measurably hurt performance and that predicating
them can avoid the penalty.

v2:
- Introduce additional resolve state management function (Jason Ekstrand).
- Enable easy retrieval of fast clear state fields.
v3: Use more descriptive field enums (Jason)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery 8e2729fbb8 intel/blorp: Allow BLORP calls to be predicated
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery be516ba9b1 anv/cmd_buffer: Skip some input attachment transitions
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery 597ff919e7 anv: Stop resolving CCS implicitly
With an earlier patch from this series, resolves are additionally
performed on layout transitions. Remove the now unnecessary implicit
resolves within render passes.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery 5ba93e6f5a anv: Transition more color buffer layouts
v2: Expound on comment for the pipe controls (Jason Ekstrand).
v3:
- Cast base_layer to uint64_t to avoid overflow.
- Remove "seems" from the pipe control comment.
- Fix clamp of layer_count (Jason Ekstrand).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery a899747eb3 anv/cmd_buffer: Warn about not enabling CCS_E
Use the performance warning infrastructure to provide helpful
information when testing applications.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery 9c9f63d1c7 anv/cmd_buffer: Move aux_usage assignment up
For readability, bring the assignment of CCS closer to the assignment of
NONE and MCS.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery 62d72bb5d0 anv/cmd_buffer: Always enable CCS_D in render passes
The lifespan of the fast-clear data will surpass the render pass scope.
We need CCS_D to be enabled in order to invalidate blocks previously
marked as cleared and to sample cleared data correctly.

v2: Avoid refactoring.
v3: Allow CCS_D for subpass resolves.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery 8e532aa028 anv/cmd_buffer: Disable CCS on gen7 color attachments upfront
The next patch enables the use of CCS_D even when the color attachment
will not be fast-cleared. Catch the gen7 case early to simplify the
changes required.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery 9fd1f2aa3c anv/cmd_buffer: Ensure fast-clear values are current
v2: Rewrite functions, change location of synchronization.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery 0b16600056 anv/gpu_memcpy: Add a lighter-weight GPU memcpy function
We'll be performing a GPU memcpy in more places to copy small amounts of
data. Add an alternate function that thrashes less state.

v2:
- Make a new function (Jason Ekstrand).
- Move the #define into the function.
v3:
- Update the function name (Jason).
- Update comments.
v4: Use an indirect drawing register as TEMP_REG (Jason Ekstrand).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery dcff5ab9f1 anv/cmd_buffer: Restrict fast clears in the GENERAL layout
v2: Remove ::first_subpass_layout assertion (Jason Ekstrand).
v3: Allow some fast clears in the GENERAL layout.
v4: Remove extra '||' and adjust line break (Jason Ekstrand).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery 9ffe87122b anv/cmd_buffer: Don't partially fast clear image layers
v2: Don't pass in the command buffer (Jason Ekstrand).
v3: Remove an incorrect assertion and an if condition for gen7.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery 07cc2ec9db anv/cmd_buffer: Initialize the clear values buffer
v2: Rewrite functions.
v3 (Jason Ekstrand):
- Don't set ResourceMinLOD.
- Fix clamp of level_count.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery 88200e87f6 anv/image: Append CCS/MCS with a fast-clear state buffer
v2: Update comments, function signatures, and add assertions.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery 325ecffc62 anv/image: Disable CCS if the image doesn't support rendering
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery 01db9a74c6 intel/isl: Add surface state clear value information
This will be used to load and store clear values from surface state
objects.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery b178e239dd anv: Transition MCS buffers from the undefined layout
v2: Define MCS buffers with any sample count (Jason)

Cc: <mesa-stable@lists.freedesktop.org>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2017-07-22 20:12:09 -07:00
Jason Ekstrand f793c57cc5 intel/isl: Tighten up restrictions for CCS on gen7
It may technically be possible to enable some sort of fast-clear support
for at least the base slice of a 2D array texture on gen7.  However,
it's not documented to work, we've never tried to do it in GL, and we
have no idea what the hardware does if you turn on CCS_D with arrayed
rendering.  Let's just play it safe and disallow it for now.  If someone
really cares that much about gen7 performance, they can come along and
try to get it working later.
2017-07-22 20:12:07 -07:00
Jason Ekstrand 20533e0da7 anv/blorp: Assert isl_surf_init success in do_buffer_copy
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 08:21:27 -07:00
Jason Ekstrand cf39fb06e3 anv/blorp: Explicitly set row_pitch in do_buffer_copy
We have a very specific row pitch that we want and we don't want ISL to
be changing it on us so just be explicit about it.

Fixes: a40f043034
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 08:20:07 -07:00
Kenneth Graunke 30d6bc470a i965: Set lower_vote_trivial in vector_nir_options_gen6 too.
There's a second struct for Gen6+.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-21 18:09:01 -07:00
Topi Pohjolainen fbfc6a2f67 intel/isl/gen7: Don't allow multisampled surfaces with valign2
There is the same constraintg later on as assert in
isl_gen7_choose_image_alignment_el() so catch it earlier in order
to return error instead of crash.

Needed to avoid crashes with piglits on IVB and HSW:

arb_internalformat_query2.image_format_compatibility_type pname checks
arb_internalformat_query2.all internalformat_<x>_type pname checks
arb_internalformat_query2.max dimensions related pname checks
arb_copy_image.arb_copy_image-formats --samples=2/4/6/8
arb_texture_float.multisample-fast-clear gl_arb_texture_float

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen df9bb8dc05 intel/isl/gen7: Allow msaa with signed integer formats
These formats are already allowed by the i965 GL driver, and the
feature seems to work just fine.

There are tests for multisampled rendering in piglit:
tests/spec/ext_framebuffer_multisample which can be patched to
try 16I/32I in addition to GL_RGBA8I.
IvyBridge passed all tests with all sample numbers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen abb84e3f2d intel/isl/gen7: Allow msaa with 128-bit formats
These formats are already allowed by the i965 GL driver, and the
feature seems to work just fine.

There are tests for multisampled rendering in piglit:
tests/spec/ext_framebuffer_multisample which can be patched to
try GL_RGBA16F/32F/16I/16UI/32I/32UI in addition to GL_RGBA/8I.
IvyBridge passed all tests with all sample numbers and even
with 128-bit formats.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen 514d68576d intel/isl: Allow 1D surfaces with compressed formats
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen a40f043034 intel/isl: Align non-tiled horizontally by cache line
in order to support blit engine.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Matt Turner 069bf7c907 i965/fs: Match destination type to size for ballot
No use in taking a 64-bit value when we know the high 32-bits are zero.
2017-07-20 16:56:50 -07:00
Matt Turner 1038d385a9 nir: Reduce destination size of ballot intrinsic when possible
Some hardware, like i965, doesn't support group sizes greater than 32.
In that case, we can reduce the destination size of the ballot
intrinsic, which will simplify our code generation.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner 782ef30451 i965/fs: Implement ARB_shader_ballot operations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner 8238930510 i965/fs: Do not move MOVs writing the flag outside of control flow
The implementation of ballotARB() will start by zeroing the flags
register. So, a doing something like

        if (gl_SubGroupInvocationARB % 2u == 0u) {
                ... = ballotARB(true);
		[...]
        } else {
                ... = ballotARB(true);
		[...]
	}

(like fs-ballot-if-else.shader_test does) would generate identical MOVs
to the same destination (the flag register!), and we definitely do not
want to pull that out of the control flow.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Francisco Jerez f1b7c47913 i965/fs: Handle explicit flag sources in flags_read()
The implementations of the ARB_shader_ballot intrinsics will explicitly
read the flag as a source register.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-20 16:56:49 -07:00
Matt Turner 43ef75b394 nir: Add system values from ARB_shader_ballot
We already had a channel_num system value, which I'm renaming to
subgroup_invocation to match the rest of the new system values.

Note that while ballotARB(true) will return zeros in the high 32-bits on
systems where gl_SubGroupSizeARB <= 32, the gl_SubGroup??MaskARB
variables do not consider whether channels are enabled. See issue (1) of
ARB_shader_ballot.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner ee9fa4ac18 i965/fs: Implement ARB_shader_group_vote operations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Francisco Jerez 93dc736f4e i965/fs: Handle explicit flag destinations in flags_written()
The implementations of the ARB_shader_group_vote intrinsics will
explicitly write the flag as the destination register.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-20 16:56:49 -07:00
Matt Turner 30b72f4126 i965/vec4: Lower ARB_shader_group_vote intrinsics
I don't expect anyone is going to care about using this in vec4 programs
(vertex/tessellation/geometry on Gen6/7), no one has come up with a good
way to implement it much less test it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner d4c9d6a3b2 nir: Add pass to optimize intrinsics
Specifically, constant fold intrinsics from ARB_shader_group_vote, but I
suspect it'll be useful for other things in the future.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Topi Pohjolainen c4ac0d4949 intel/isl/gen4: Represent cube maps with 3D layout
v2 (Jason): Check for !ISL_SURF_DIM_3D instead of CUBE_BIT.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen 171b72542c intel/isl: Add i915 to isl_tiling converter
v2: s/i915_tiling_to_isl_tiling(/isl_tiling_from_i915_tiling/

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Chad Versace 5d69052113 anv/image: Fix VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT
We incorrectly detected VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT.  We looked
for the bit in VkImageCreateInfo::usage, but it's actually in
VkImageCreateInfo::flags.

Found by assertion failures while enabling VK_ANDROID_native_buffer.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-19 11:25:50 -07:00
Topi Pohjolainen 0926fb69a4 intel/blorp/gen4: Drop cube map flag for single face copy
This will falsely trigger an assert on number of layers once
isl is used for 3D layouts of Gen4 cube maps.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:36:13 +03:00
Topi Pohjolainen 4733891e51 intel/isl: Take 3D surfaces into account in image params
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:35:44 +03:00
Jason Ekstrand cd9fd68a50 anv: Advertise support for VK_KHR_variable_pointers
We don't support the general version yet because that requires us to
lower shared variables up-front in SPIR-V -> NIR.  This shouldn't be a
whole lot of work but it's not something we support today.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-07-18 09:43:13 -07:00
Jason Ekstrand bc9319583a anv: Advertise support for VK_KHR_storage_buffer_storage_class
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-07-18 09:43:13 -07:00
Jason Ekstrand 828c437078 intel/isl: Add a row_pitch parameter to surf_get_ccs_surf
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-17 13:48:38 -07:00
Jason Ekstrand c5700ed72e anv/image: Add INPUT_ATTACHMENT to the list of required usages
From the Vulkan 1.0.53 spec VU for vkCreateImageView:

    "image must have been created with a usage value containing at least
    one of VK_IMAGE_USAGE_SAMPLED_BIT, VK_IMAGE_USAGE_STORAGE_BIT,
    VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
    VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT, or
    VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT"

We were missing VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT from out list.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-17 08:18:46 -07:00
Jason Ekstrand cbdfd1daa2 anv: Stop leaking the no_aux sampler surface state
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-17 08:18:46 -07:00
Jason Ekstrand bd41564746 anv/cmd_buffer: Properly handle render passes with 0 attachments
We were early returning and never created the NULL surface state.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: James Legg <jlegg@feralinteractive.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-17 08:18:46 -07:00
Emil Velikov 43c188f970 anv: advertise v6 of the wayland surface extension
Jason updated the Khronos spec to explicitly state that Wayland surfaces
must support VK_PRESENT_MODE_MAILBOX_KHR.

ANV did so since day one (back in 2015)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-17 15:24:32 +01:00
Lionel Landwerlin 59adde0eab anv: ensure device name contains terminating character
v2: Use sizeof() (Chris)

CID: 1415113
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-07-17 14:36:38 +01:00
Jason Ekstrand 0ee8d81718 anv: Implement VK_KHR_external_memory_*
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand c02da9cad6 anv: Implement VK_KHR_dedicated_allocation
We always recommend sub-allocation and don't do anything special for
dedicated allocations.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand 8c82aa5f43 anv: Implement VK_KHR_get_memory_requirements2
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand 5b57bdc1cf anv: Advertise version 1.0.54
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand 227debdc92 vulkan: Update to the new 1.0.54 spec XML and headers
There is one small ANV change here because we used the
VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX enum in the BO cache and that had
to be updated to have the _KHR suffix.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand dc179aa123 anv: Drop support for VK_KHX_external_semaphore_*
These have been formally deprecated by Khronos never to be shipped
again.  The KHR versions should be implemented/used instead.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:58:51 -07:00
Jason Ekstrand 4ac94d0dee anv: Drop support for VK_KHX_external_memory_*
These have been formally deprecated by Khronos never to be shipped
again.  The KHR versions should be implemented/used instead.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-14 22:12:39 -07:00
Juan A. Suarez Romero 5cd4ece34e anv/pipeline: do not use BITFIELD64_BIT()
In the previous commit, forgot to apply v2 suggestions.

Fixes: 28d0c38 (anv/pipeline: use unsigned long long constant to check
enable vertex inputs)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-07-14 10:33:19 +00:00
Juan A. Suarez Romero 28d0c38d85 anv/pipeline: use unsigned long long constant to check enable vertex inputs
When initializing the ANV pipeline, one of the tasks is checking which
vertex inputs are enabled. This is done by checking if the enabled bits
in inputs_read.

But the mask to use is computed doing `(1 << (VERT_ATTRIB_GENERIC0 +
desc->location))`. The problem here is that if location is 15 or
greater, the sum is 32 or greater. But C is handling 1 as a 32-bit
integer, which means the displaced bit is out of range and thus the full
value is 0.

Thus, use 1ull, which is an unsigned long long value.

This fixes:
dEQP-VK.pipeline.vertex_input.max_attributes.16_attributes.binding_one_to_one.interleaved

v2: use 1ull instead of BITFIELD64_BIT() (Matt Turner)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-14 08:09:18 +00:00
Kenneth Graunke b2da123801 i965: Use pushed UBO data in the scalar backend.
This actually takes advantage of the newly pushed UBO data, avoiding
pull loads.

Improves performance in GLBenchmark Manhattan 3.1 by:

   HSW: ~1%, BDW/SKL/KBL GT2: 3-4%, SKL GT4: 7-8%, APL: 4-5%.
   (thanks to Eero Tamminen for these numbers)

shader-db results on Skylake, ignoring programs with spill/fill changes:

   total instructions in shared programs: 13963994 -> 13651893 (-2.24%)
   instructions in affected programs: 4250328 -> 3938227 (-7.34%)
   helped: 28527
   HURT: 0

   total cycles in shared programs: 179808608 -> 172535170 (-4.05%)
   cycles in affected programs: 79720410 -> 72446972 (-9.12%)
   helped: 26951
   HURT: 1248

   LOST:   46
   GAINED: 21

Many "Deus Ex: Mankind Divided" shaders which already spilled end up
spill a lot more (about 240 programs hurt, 9 helped).  The cycle
estimator suggests this is still overall a win (-0.23% in cycle counts)
presumably because we trade pull loads for fills.

v2: Drop "PULL" environment variable left in for initial debugging
    (caught by Matt).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 20:18:54 -07:00
Kenneth Graunke c9ef27e77b i965: Factor out push locations.
With UBOs, the answer of "have we decided to push this uniform" gets
a bit more complicated - for one, we have multiple surfaces.  This
patch refactors things so we can add the new code in a single place.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 20:18:54 -07:00
Kenneth Graunke 4f586cd8f1 i965: Push UBO data, but don't use it just yet.
This patch starts uploading UBO data via 3DSTATE_CONSTANT_* packets,
and updates the compiler to know that there's extra payload data, so
things continue working.  However, it still issues pull loads for all
data.  I wanted to separate the two aspects for greater bisectability.

v2: Update for new intel_bufferobj_buffer parameter.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 20:18:30 -07:00
Kenneth Graunke 6d28c6e52c i965: Select ranges of UBO data to be uploaded as push constants.
This adds a NIR pass that decides which portions of UBOS we should
upload as push constants, rather than pull constants.

v2: Switch to uint16_t for the UBO block number, because we may
    have a lot of them in Vulkan (suggested by Jason).  Add more
    comments about bitfield trickery (requested by Matt).

v3: Skip vec4 stages for now...I haven't finished wiring up support
    in the vec4 backend, and so pushing the data but not using it
    will just be wasteful.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 19:56:49 -07:00
Kenneth Graunke 8ec5a4e4a4 i965: Switch to absolute addressing for constant buffer 0.
By default, 3DSTATE_CONSTANT_* Constant Buffer 0 is relative to dynamic
state base address.  This makes it unusable for pushing UBOs.  I'd like
to be able to use all four push buffers.

There is a bit in the INSTPM register (or CS_DEBUG_MODE2 on Skylake)
which controls whether buffer 0 is relative to dynamic state base
address, or simply a normal pointer.  Setting that gives us full
flexibility.

We can't currently write this on Haswell and earlier, and will need
to update the kernel command parser, and then do the whole version
checking song and dance.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 19:56:49 -07:00
Lionel Landwerlin 6131a1ae40 aubinator: don't leak fd of opened aubfile
CID: 1373563
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:50 +01:00
Lionel Landwerlin d1bd731e30 anv: don't use strcpy for copying strings
CID: 1358935
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:47 +01:00
Lionel Landwerlin 226fae7849 intel/compiler: no need to check unsigned is >= 0
CID: 1338342
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:45 +01:00
Lionel Landwerlin 95c917668c intel/compiler: don't check unsigned is >= 0
CID: 1224468
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:38 +01:00
Lionel Landwerlin a25a533458 intel/compiler: remove check unsigned is >= 0
By definition unsigned are always >= 0.

CID: 742212
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:29 +01:00
Lionel Landwerlin 19869d6091 isl: use 64bit arithmetic to compute size
If we allow the size to be more than 2^32, then we should compute it
in 64bit arithmetic otherwise we might run into overflow issues.

CID: 1412892, 1412891
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:26 +01:00
Jason Ekstrand 5b3363e3f1 intel/isl: Add a helper to convert tilings from ISL to i915
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand a668ba9c18 intel/isl: Add basic modifier introspection
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Anuj Phogat 0a56c5f3f1 intel/compiler: Don't use opt_sampler_eot() optimization on gen10+
This optimization has been removed on gen10+.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-12 11:27:31 -07:00
Eric Anholt 5d6271c6a5 intel: Move the DRM uapi headers to a non-Intel location.
I want to remove vc4's dependency on headers from libdrm as well, but
storing multiple copies of drm_fourcc.h in our tree would be silly.

v2: Update Android.mk as well, move distcheck drm*.h references to
    top-level noinst_HEADERS.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Daniel Stone <daniels@collabora.com> (v1)
Reviewed-by: Rob Herring <robh@kernel.org>
2017-07-12 10:58:33 -07:00
Jason Ekstrand 8e3d9c5d09 anv: Round u_vector element sizes to a power of two
This fixes 32-bit builds of the driver.  Commit 08413a81b9
changed things so that we now put struct anv_states in the u_vector for
binding tables.  On 64-bit builds, sizeof(struct anv_state) is a power
of two but it isn't on 32-bit builds.

Fixes: 08413a81b9
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2017-07-12 10:34:13 -07:00
Lionel Landwerlin 384aaa4d3f intel: add number of subslices to device info
We could have used a single integer to store that value, but
Cannonlake has different number of subslices per slice depending on
the GT.

v2: Add CFL subslice numbers (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2017-07-11 16:14:57 +01:00
Kenneth Graunke c2c37f5185 intel: Fix clflushing on modern (Baytrail+) Atom CPUs.
Thanks to Chris Wilson for pointing this out.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-10 15:55:26 -07:00
Kenneth Graunke 3e50607a40 intel: Move clflush helpers from anv to common/gen_clflush.h.
I want to use these in the OpenGL driver as well.

v2: Add to COMMON_FILES in Makefile.sources (caught by Emil)

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-10 15:55:19 -07:00
Jason Ekstrand 781263486f anv: Stop setting domains to RENDER on EXEC_OBJECT_WRITE
The reason we were doing this was to ensure that the kernel did the
appropriate cross-ring synchronization and flushing.  However, the
kernel only looks at EXEC_OBJECT_WRITE to determine whether or not to
insert a fence.  It only cares about the domain for determining whether
or not it needs to clflush the BO before using it for scanout but the
domain automatically gets set to RENDER internally by the kernel if
EXEC_OBJECT_WRITE is set.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-07-10 08:55:47 -07:00
Nanley Chery 753a7bbc84 Revert "intel/isl: Only create a CCS buffer if the image supports rendering"
This reverts commit 8aaa13467d, which was
based on an incorrect assumption. Unlike the restriction placed on image
views in the Vulkan API, OpenGL allows you to render to texture views
whose formats differ from the originals.

Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=101677
2017-07-07 14:24:58 -07:00
Tomasz Figa 50a8a7377a intel: common: Fix link failure with standalone Android build
Some reshuffle in the Makefiles under src/intel resulted in Android
libraries being no longer linked with code using
src/intel/common/gen_debug.h that contains references to functions
exported by those libraries (namely ALOGW macro, which is currently
resolved into a call to __android_log_print() from cutils).

Fix the build by taking into account ANDROID_CFLAGS and ANDROID_LIBS for
affected module on Android NDK builds.

Fixes: d5b355ce5f ("i965: Move intel_debug.h to intel/common/gen_debug.h")
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-05 18:49:21 +01:00
Samuel Iglesias Gonsálvez 5dd96b1156 anv: check support for enabled features in vkCreateDevice()
From Vulkan spec, 4.2.1. "Device Creation":

  "vkCreateDevice verifies that extensions and features requested in
   the ppEnabledExtensionNames and pEnabledFeatures members of
   pCreateInfo, respectively, are supported by the implementation."

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@gmail.com>
2017-07-03 08:01:31 +02:00
Samuel Iglesias Gonsálvez ba05f6f72b anv: merge tessellation's primitive mode in merge_tess_info()
SPIR-V tessellation shaders that were created from HLSL will have
the primitive generation domain set in tessellation control shader
(hull shader in HLSL) instead of the tessellation evaluation shader.

v2:
- Add assert (Kenneth)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-03 08:00:43 +02:00
Lionel Landwerlin 038c45a40e anv: fix reported timestampPeriod value
We lost some precision on a previous change due to switching to
integers. Since we report a float in timestampPeriod, we want the
division to happen in floats.

CID: 1413021
Fixes: c77d98ef32 ("intel: common: express timestamps units in frequency")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-02 12:11:55 +01:00
Lionel Landwerlin 34560ba9e5 intel: genxml: make a couple of enums show up in aubinator
In particular Shader Channel Select & Texture Address Control Mode.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-07-02 00:45:38 +01:00
Johnson Lin 165e704719 i965/i915: Add UYVY as the supported format
Trigger the correct sampler options for it. Similar with YUYV

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-06-30 10:16:26 +01:00
Mauro Rossi b693fd8464 android: anv: drop libdrm_intel dependency
In addition to Rob Herring "Android: i965: remove libdrm_intel dependency",
we can drop libdrm_intel dependency in anv for Android.

Please check if libdrm has to stay as shared dependency and drop this comment line.

Fixes: 7dd20bc ("anv/i965: drop libdrm_intel dependency completely")
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-29 12:31:00 +01:00
Lionel Landwerlin d8bf2861ad anv: use devinfo for number of thread/eu
It turns out Gen9LP has fewer threads per EU (6 vs 7).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2017-06-29 10:07:52 +01:00
Juan A. Suarez Romero 93b8dc4b94 intel: tools: add intel_aub.h as part of aubinator
Include intel_aub.h in the Makefile.tools.am

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-29 10:03:40 +02:00
Juan A. Suarez Romero be5fe2153b intel: automake: include Makefile.drm.am
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-29 10:03:40 +02:00
Ian Romanick 36bd4a5f21 genxml: Silence about a billion unused parameter warnings
v2: Use textwrap.dedent to make the source line a lot shorter.
Shortening (?) the line was requested by Jason.

v3: Simplify the texwrap.dedent usage.  Suggested by Dylan.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-06-28 14:50:14 -07:00
Lionel Landwerlin 7dd20bc3ee anv/i965: drop libdrm_intel dependency completely
With Ken's work to drop the library dependency on libdrm_intel, we now
only depend on libdrm for the kernel uapi headers it provides. It
seems like we're better off just embeddeding those headers ourselves,
making the lives of people developping news features tightly
integrated with the kernel a tiny bit easier.

This change also makes it a bit more obvious what cflags/libs are
required by the i915 drivers vs i965, by renaming INTEL_CFLAGS/LIBS
into I915_CFLAGS/LIBS.

Headers were generated from drm-tip on the following commit :

   commit 6d61e70ccc21606ffb8a0a03bd3aba24f659502b
   Merge: 338ffbf7cb5e c0bc126f97fb
   Author: Dave Airlie <airlied@redhat.com>
   Date:   Tue Jun 27 07:24:49 2017 +1000

       Backmerge tag 'v4.12-rc7' into drm-next

v2: Use installed files from the kernel (Daniel Vetter)

v3: Use headers from drm-next rather than drm-tip (Dave/Daniel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:29 +03:00
Lionel Landwerlin 230691b8e5 aubinator: import intel_aub.h from libdrm
This enables us to compile aubinator without the libdrm dependency.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:28 +03:00
Topi Pohjolainen fbcc9555c5 intel/anv: Add missing break in anv_CreateDevice()
CID: 1413018
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-27 10:19:55 +03:00
Ian Romanick 1b101ca809 blorp: Use normalized coordinates on Gen6
Apparently, the sampler has some sort of precision issues for
non-normalized texture coordinates with linear filtering.  This caused
some small precision issues in scaled blits.  Work around this by using
normalized coordinates.  There is some extra work necessary because Gen6
uses TEX (instead of TXF) for some multisample resolve blits.

Fixes piglit.spec.arb_framebuffer_object.fbo-blit-stretch on SNB.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68365
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-06-26 13:41:11 -07:00
Nanley Chery d6748f1fc4 anv/gpu_memcpy: Rename the gpu_memcpy function
A GPU memcpy function could alternatively be implemented using MI_*
commands. Provide more detail into how this one operates in case another
memcpy function is created.

v2:
- Update the commit message.
v3:
- Use 'memcpy' instead of 'cpy' (Jason Ekstrand)
- Shorten 'streamout' to 'so'

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 1415e7a997 anv/blorp: Provide surface states for CCS resolves
In the future, we plan on using this method to resolve images whose
surface state fast-clear value is dynamically updated during command
buffer execution. Start using it now for testing and to reduce churn
later on.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 4b2a2b70e0 anv/blorp: Add a surface-state-based CCS resolve function
This will be used in the next patch.

v2:
- Omit BLORP_BATCH_NO_EMIT_DEPTH_STENCIL (Jason Ekstrand)
- Update commit message.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery d1119ab7b6 blorp/clear: Add a binding-table-based CCS resolve function
v2:
- Do layered resolves.
(Jason Ekstrand):
- Replace "bt" suffix with "attachment".
- Rename helper function to prepare_ccs_resolve.
- Move blorp_params_init() into helper function.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 6235f08ff8 anv: Adjust params of color buffer transitioning functions
Splitting out these fields will make the color buffer transitioning
function simpler when it gains more features.

v2: Remove unintended blank line (Iago Toral)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery e15b1c41a4 anv/blorp: Remove 3D subresource transition workaround
For 3D image subresources undergoing a layout transition via
PipelineBarrier, we increase the number of fast-cleared layers to match
the intended behaviour of KHR_maintenance1. When such subresources
undergo layout transitions between subpasses, we don't do this to avoid
failing incorrect CTS tests. Instead, unify the behaviour in both
scenarios, and wait for the CTS tests to catch up. See CL 1111 for the
test fix and Vulkan issue #849 for more information.

On SKL+, this causes 3 test failures under:
dEQP-VK.pipeline.render_to_image.3d.*

v2: Add a reference to the Vulkan issue (Iago Toral).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 5ca2fbcee2 anv/cmd_buffer: Adjust the image view reloc function
Make the function take in an image instead of an image view. This
enables us to record relocations for surfaces states created outside of
the anv_CreateImageView path.

v2 (Jason Ekstrand):
- Use image->offset instead of surf_offset in aux_offset calculation.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 5f4f50419c anv/cmd_buffer: Adjust layout transition aspect checking
Reflect the fact that an image view or subresource range with the color
aspect cannot have any other aspect.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery bc838fc759 anv: Add and use color auxiliary buffer helpers
v2:
- Check for aux levels in layer helper (Jason Ekstrand)
- Don't assert aux is present, return 0 if it isn't.
- Use the helpers.
v3:
- Make the helpers aspect-agnostic (Jason Ekstrand)
- Drop anv_image_has_color_aux()

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 8aaa13467d intel/isl: Only create a CCS buffer if the image supports rendering
v2: Omit the commit message.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery b934330191 intel/isl: Limit CCS to one level and layer on gen7
v2 (Jason Ekstrand):
- Remove Vulkan-specific terminology from the commit title.
- Replace '== 7' with '<= 7' to hint that this is a new feature on BDW+.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery 6b23c65f3a intel/blorp: Check for layer fast-clear restriction
v2: Update commit title (Jason Ekstrand)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery b46a071758 intel/blorp: Assert levels and layers are in range
v2 (Jason Ekstrand):
- Update commit title.
- Check aux level and layer as well.
v3 (Jason Ekstrand):
- Move the non-aux layer check.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Eric Engestrom 2b237ff64c anv: use Mesa's u_atomic.h header
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-26 18:21:22 +01:00
Anuj Phogat 7896dee349 anv/cnl: Don't write to Cache Mode Register 1 on gen10+
For PartialResolveDisableInVC field recommendation is to
always set this to 0 and that's the default value of the bit.
So, we have nothing left to write to CACHE_MODE_1.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-23 11:16:00 -07:00
Rafael Antognolli 9b78a52042 genxml: fix gen5 sampler border color state.
Based on the current code, gen5 and gen6 have the same sampler border color
state struct. So fix the gen5 one to match gen6.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-22 16:38:44 -07:00
Rafael Antognolli f43c21cbbd aubinator: Dump sampler state pointers on gen6 too.
We already have a function to dump sampler states, so do that for gen6
too.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-22 16:38:44 -07:00
Chad Versace ecd8f85802 anv: Fix -Wswitch in anv_layout_to_aux_usage()
anv_layout_to_aux_usage() lacked a case for
VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR. Add an unreachable case, because we
don't support the extension.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 15:18:24 -07:00
Anusha Srivatsa de7ed0ba55 i965/CFL: Add PCI Ids for Coffee Lake.
Coffee Lake has a gen9 graphics following KBL.
From 3D perspective, CFL is a clone of KBL/SKL features.

v2: Change commit message, correct alignment <Anuj Phogat>
v3: Update IDs.
v4: Initialize l3_banks, correct nomenclature <Anuj>

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Acked-by: Benjamin Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-06-22 14:28:43 -07:00
Anuj Phogat 43d11b128c intel: Enable vulkan build for gen10
This patch just enables building Vulkan libs for gen10. We
still don't have gen 10 support enabled on Vulkan.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:46 -07:00
Anuj Phogat ac6bc0e034 anv/cnl: Generate and use gen10 functions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:45 -07:00
Anuj Phogat c17e214a6b anv/cnl: Don't set FloatBlendOptimizationEnable{Mask}
This field is remove from CACHE_MODE_1 register in gen10.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:45 -07:00
Anuj Phogat bf1d2c37c6 anv/cnl: Use GENX(xx) in place of GEN9_xx
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:45 -07:00
Anuj Phogat 1e5a5d18d1 anv/cnl: Add #defines for MOCS and genX(x)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:45 -07:00
Anuj Phogat ceed55e7bb intel/genxml: Add Gen10 CACHE_MODE_1 definitions
Few of the fields in this register are changed as compared
to gen9.xml.

V2: Remove some fields which are not valid anymore.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat 6338b63270 intel/genxml: Rename StartInstanceLocation to StartingInstanceLocation
This is required because we already have a macro defined with
the name StartInstanceLocation.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat 8869c8b3dc intel/genxml: Rename IndirectStatePointer to BorderColorPointer
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat 97f75fdfd0 intel/genxml: Combine DataDWord{0, 1} fields in to ImmediateData field
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat c61b909d14 intel/genxml: Add INSTDONE registers in gen10
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat 03fddd3c1d intel/genxml: Add better support for MI_MATH in gen10
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Andres Gomez 5352174d49 anv: FORMAT_FEATURE_TRANSFER_SRC/DST_BIT_KHR not used with VkFormatProperties.bufferFeatures
VK_FORMAT_FEATURE_TRANSFER_[SRC|DST]_BIT_KHR is a flag value of the
VkFormatFeatureFlagBits enum that can only be hold and checked against
the linearTilingFeatures or optimalTilingFeatures members of the
VkFormatProperties struct but not the bufferFeatures member.

>From the Vulkan® 1.0.51, with the VK_KHR_maintenance1 extension,
section 32.3.2 docs for VkFormatProperties:

   "* linearTilingFeatures is a bitmask of VkFormatFeatureFlagBits
      specifying features supported by images created with a tiling
      parameter of VK_IMAGE_TILING_LINEAR.

    * optimalTilingFeatures is a bitmask of VkFormatFeatureFlagBits
      specifying features supported by images created with a tiling
      parameter of VK_IMAGE_TILING_OPTIMAL.

    * bufferFeatures is a bitmask of VkFormatFeatureFlagBits
      specifying features supported by buffers."

    ...

    Bits which can be set in the VkFormatProperties features
    linearTilingFeatures, optimalTilingFeatures, and bufferFeatures
    are:

    typedef enum VkFormatFeatureFlagBits {

    ...

      VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR = 0x00004000,
      VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR = 0x00008000,

    ...

    } VkFormatFeatureFlagBits;

    ...

    The following bits may be set in linearTilingFeatures and
    optimalTilingFeatures, specifying that the features are supported
    by images or image views created with the queried
    vkGetPhysicalDeviceFormatProperties::format:

    ...

    * VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR specifies that an image
      can be used as a source image for copy commands.

    * VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR specifies that an image
      can be used as a destination image for copy commands and clear
      commands."

Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 13:45:22 +03:00
Rafael Antognolli 78b843af3c intel/genxml: Use the same naming convention for Floating Point Mode.
In newer gens, this field has a prefix and the non-IEEEE-745 mode is called
"Alternate", instead of simply "Alt".

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli ce728594fd intel/genxml: Normalize URB Data field in WM_STATE.
On gen6+, this is called "Dispatch GRF Start Register For Constant/Setup Data
0", while on gen5 and lower it's called only "Dispatch GRF Start Register For
URB Data", but it's essentially the same thing (URB data), so rename it to
match newer gens and simplify the C code that handles it.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli 44415056e7 intel/genxml: Rename field on WM_STATE to match gen6+.
"Pixel Shader Kill Pixel" -> "Pixel Shader Kills Pixel", which is how it's
called on newer gens.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli 82c66965ac intel/genxml: Normalize fields on WM_STATE.
On gen4, WM_STATE only has one Kernel Start Pointer and one GRF Register
Count, but we can make the code that handles this on multiple gens simpler if
we add an index 0 to it too.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli eddb1ebccf intel/genxml: Add missing field to CLIP_STATE.
Just because it's not set doesn't mean that it doesn't exist. And since the
field is there on newer gens, having it on gen5 simplifies the code when
porting gen5 and lower.

Also add missing value to API Mode on CLIP_STATE on gen4.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli 9a5ae19cbb intel/genxml: Fix type of UserClipFlags ClipTest Enable Bitmask.
This is a bitmask, so it can't be a boolean. Also rename it so it matches
gen6+.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli 19d1defcd5 intel/genxml: Add missing fields to CLIP_STATE on gen4-5.
These fields are set by brw_clip_unit, so we need them when converting to
genxml.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli faa4f5c42d intel/genxml: Normalize GS_STATE.
Rename "Rendering Enable" to "Rendering Enabled", so it matches gen6+.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Lionel Landwerlin 030abc6109 intel: compiler/i965: fix is_broxton checks
In 5f2fe9302c is_geminilake was introduced for the differenciate
broxton from geminilake. Unfortunately I failed as verifying that
is_broxton is throughout the code base to mean Gen9lp.

Fixes: 5f2fe9302c ("intel: common: add flag to identify platforms by name")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-20 23:26:42 +01:00
Ben Widawsky 3e1055591b i965/cnl: Add l3 configuration for Cannonlake
V2 (Anuj):
Squash the changes in one patch rebase on master.
Address the review comments made by Francisco Jerez.
Do the URB allocation per slice (not per bank).

V3 (Anuj):
Update the comment.
Format the table as other l3 config tables.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
---
V1 was sent out with the heading:
"i965/cnl: Properly handle l3 configuration"
2017-06-20 12:18:26 -07:00
Anuj Phogat 1024dad4d9 i965: Add a variable for way size per bank in get_l3_way_size()
Adding this variable better explains the computation of L3 way
size in the function.

V2: Use const variable for way_size_per_bank.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-06-20 12:18:26 -07:00