HiZ buffers can be multisampled and, on Broadwell and earlier, simply using
interleaved multisampling with a compression block size of 8x4 samples
yields the correct HiZ surface size calculations. Unfortunately,
choose_msaa_layout was rejecting multisampled HiZ buffers because of format
checks. Now that we have a simple helper for determining if a format
supports multisampling, that's an easy enough issue to fix.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Compressed 1-D textures are not well-defined thing in either GL or Vulkan.
However, auxiliary surfaces are treated as compressed textures in ISL and
we can do HiZ and CCS with 1-D so we need to be able to create them. In
order to prevent actually using them (the docs say no), we assert in the
state setup code.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
The assertion that a format is uncompressed in the multisample layouts
isn't quite right. What we really want to assert is that the format
supports multisampling which is a bit more complicated query. We also want
to assert that it has a block size of 1x1 since we do nothing with the
block size in the phys_level0_sa assignment.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
gcc-4 and earlier don't allow compound literals where a constant
is required in -std=c99/gnu99 mode, so we can't use ISL_SWIZZLE()
when populating the anv_formats[] array. There are a few ways around
it: First one would be -std=c89/gnu89, but the rest of the code
depends on c99 so it's not really an option. The second option
would be to upgrade to gcc-5+ where the compiler behaviour was relaxed
a bit [1]. And the third option is just to avoid using compound
literals. I chose the last option since it keeps gcc-4 and earlier
working.
[1] https://gcc.gnu.org/gcc-5/porting_to.html
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Topi Pohjolainen <topi.pohjolainen@intel.com>
Fixes: 7ddb21708c ("intel/isl: Add an isl_swizzle structure and use it for isl_view swizzles")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixed the way the values that span two Dwords are decoded.
Based on the start and end indices of the field, the Dwords
are fetched and decoded accordingly.
v2: rename dw to qw in gen_field_iterator_next
and remove extra white space (Anuj)
v3: change all instances of dw to qw (Anuj)
Earlier, 64-bit fields (such as most pointers on Gen8+)
weren't decoded correctly. gen_field_iterator_next seemed
to walk one DWord at a time, sets v.dw, and then passes it
to field(). So, even though field() takes a uint64_t, we're
passing it a uint32_t (which gets promoted, so the top 32
bits will always be zero). This seems pretty bogus... (Ken)
Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
From the Vulkan spec:
Size is the number of bytes to fill, and must be either a multiple of 4,
or VK_WHOLE_SIZE to fill the range from offset to the end of the buffer.
If VK_WHOLE_SIZE is used and the remaining size of the buffer is not a
multiple of 4, then the nearest smaller multiple is used.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Make gen_device_info a mutable structure so we can update the fields that
can be refined by querying the kernel (like subslices and EU numbers).
This patch does not make any functional change, it just makes
gen_get_device_info() fill a structure rather than returning a const
pointer.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reproduces this commit :
commit 0fb85ac08d
Author: Kenneth Graunke <kenneth@whitecape.org>
Date: Mon Jun 6 21:37:34 2016 -0700
i965: Use the correct number of threads for compute shaders.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This reproduces this commit :
commit 2213ffdb4b
Author: Kenneth Graunke <kenneth@whitecape.org>
Date: Mon Jun 6 21:37:34 2016 -0700
i965: Allocate scratch space for the maximum number of compute threads.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Gen6 only has one additional restriction over Gen7+, so we just add it
to the existing gen7 function (which actually covers later gens too).
This should stop FINISHME spew when running GL on Sandybridge.
v2: Fix bytes per block vs. bits per block confusion (Jason) and
rename function to gen6_filter_tiling (Jason and Chad).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Without this bit set, the value in "L3 Atomic Disable" won't get applied by
the hardware so we won't properly get L3 atomic caching.
Fixes dEQP-VK.spirv_assembly.instruction.compute.opatomic.compex and 198 of
the dEQP-VK.image.atomic_operations.* tests on HSW
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
The Vulkan driver sets 3DSTATE_DRAWING_RECTANGLE once to MAX_INT x MAX_INT
at the GPU initialization time and never sets it again. The GL driver sets
it every time the framebuffer changes. Originally, blorp set it to the
size of the drawing area but meant we had to set it back in the Vulkan
driver. Instead, we can easily just do that in the GL driver's blorp_exec
implementation and not set it in blorp core.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Previously, we relied on a driver hook for 3DSTATE_MULTISAMPLE. However,
now that Vulkan and GL use the same sample positions, we can set up
3DSTATE_MULTISAMPLE directly in blorp and delete the driver hook.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Earlier, the loop pretends to loop over instructions from "start" to "end",
but the callers always pass 8192 for end, which is some huge bogus
value. The real loop termination condition is send-with-EOT or 0. (Ken)
Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Skylake adds new SENDS and SENDSC opcodes, which should be
handled in the send-with-EOT check. Make an is_send() helper
that checks if the opcode is SEND/SENDC/SENDS/SENDSC (Ken)
v2: Make is_send() much more crispier, Mix declaration and
code to make the code compact (Ken)
Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Remove the float/dword union and use the iter->p[f->start / 32]
directly as printf formatter %08x expects uint32_t (Ken)
v2: Make the cleanup much more crispier (Ken)
Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Since Vulkan doesn't allow single-slice 3D storage images, we need to just
set the base_array_layer and array_len to the full size of the 3-D LOD.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
This reverts commit 3943888c94. It turns out
that commit was pretty-much bogus since it breaks binding a 3-D texture as a
2-D storage image. The correct fix for the Vulkan CTS tests needs to be in
the Vulkan driver itself rather than ISL.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Everything that we were once using the blit2d framework for is now done
with blorp.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
This is a lot cleaner and easier to read than the old piles of if
statements.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
This allows us to #undef them later if we don't want them to persist
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
This allows us to #undef them later if we don't want them to persist
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
The original blorp_alloc_binding_table helper was supposed to return the
binding table offset and map along with the surface state maps. This isn't
quite what we want, however. What we really want is the binding table
offsets, surface state offsets, and surface state maps. In the GL driver,
the binding table map *is* an array of surface state offsets. However, in
Vulkan, this isn't quite true as the entries in the binding table are
surface state offsets combined with another binding table block offset.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Previously, we were relying on the fact that VALGRIND_MEMPOOL_FREE came
later on in the function to prevent "link->bo = bo" from causing an invalid
write. However, in the case where the size requested by the user is very
small (less than sizeof(struct anv_bo)), this isn't sufficient. Instead,
we should call VALGRIND_MEMPOOL_FREE early and then use VG_NOACCESS_WRITE.
We do, however, have to call VALGRIND_MEMPOOL_FREE after reading bo_in
because it may be stored in the bo itself.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
The time we want to restrict the Z range of a 3-D surface is when rendering
to it. For storage surfaces, we always want he full range. However, we
still need to set MinimumArrayElement and RenderTargetViewExtent to
sensible values so we'll just set them to the reasonable defaults we used
before we started respecting the base_array_layer and array_len.
This fixes a bunch of Vulkan CTS regressions caused by 48f195d7c6.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97790
Reviewed-by: Chad Versace <chadversary@chromium.org>
Normally, using a non-linear tiling format helps improve cache locality by
ensuring that neighboring pixels are usually close-by in memory. For RGB
formats, this still sort-of holds, but it can also lead to rather terrible
memory access patterns where a single RGB pixel value crosses a tile
boundary and gets split into two pieces in different 4K pages. It also
makes for some rather awkward calculations because your tile size is no
longer an even multiple of surface element size. For these reasons, we
chose to simply never create tiled RGB images in the Vulkan driver.
The GL driver, however, is not so kind so we need to support it somehow. I
briefly toyed with a couple of different schemes but this is the best one I
could come up with. The fundamental problem is that a tile no longer
contains an integer number of surface elements. I briefly considered a
couple other options but found them wanting:
1) Using floats for the logical tile size. This leads to potential
rounding error problems.
2) When presented with a RGB format, just make the tile 3-times as wide.
This isn't so nice because now our tiles are no longer power-of-two
size. Also, it can force the row_pitch to be larger than needed which,
while not strictly a problem for ISL, causes incompatibility problems
with the way the GL driver chooses surface pitches.
The chosen method requires that you pay attention and not just assume that
your tile_info is in the units you think it is. However, it's nice because
it provides a nice "these are the units" declaration in isl_tile_info
itself. Previously, the tile_info wasn't usable as a stand-alone structure
because you had to also know the format. It also forces figuring out how
to deal with inconsistencies between tiling and format back to the caller
which is good because the two different consumers of isl_tile_info really
want to deal with it differently: Computation of the surface size wants
the fewest number of horizontal tiles possible while get_intratile_offset
is far more concerned with things aligning nicely.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Chad Versace <chadversary@chromium.org>
The restriction that Y-tiled surfaces must have valign == 4 only aplies to
render targets but we were applying it universally. This causes problems
if ISL_FORMAT_R32G32B32_FLOAT is used because it requires valign == 2; this
should be okay because you can't render to that format.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
When Ivy Bridge introduced array multisampling, someone made the decision
to do lots of stuff throughout the driver in terms of physical array layers
rather than logical array layers. In ISL, we use logical array layers most
of the time and it really makes no sense to use physical array layers in
the blorp API. Every time someone passes physical array layers into blorp
for an array multisampled surface, they're always divisible by the number
of samples and we divide right away.
Eventually, I'd like to rework most of the GL driver internals to use
logical array layers but that's going to be a big project and will probably
happen as part of the ISL conversion. For now, we'll do the conversion in
brw_blorp and let blorp just use the logical layers.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
The result of this calculation goes into an fma() in the shader and we
would like it to be as precise as possible. The division in particular
was a source of imprecision whenever dst1 - dst0 was not a power of two.
This prevents regressions in some of the new Vulkan CTS tests for blitting
using a filtering of NEAREST.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
While we're here, we also re-arrange the parameters to better match the
parameter order of blorp_blit.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
While it can be useful, the field has substantial limtations. In
particular, the bittom 2 or 3 bits is missing so your offset always has to
be a multiple of 4 or 8. While surface alignments usually work out to make
this ok, when you start trying to fake compressed surfaces as uncompressed
(which we will want to do) this falls apart. The easiest solution is to
simply align all offsets to a tile boundary and munge the regions we're
copying to account for the intratile offset.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
The convert_to_single_slice operation is *mostly* idempotent. The only
non-repeatable thing it does is that, when it sets the intratile offset
fields, it just overwrites them instead of doing a += operation. This is
supposed to be ok because we have an early return at the top that should
make it bail of the surface is already a single slice. Unfortunately, the
if condition has been broken ever since it was first added in 96fa98c18.
This commit fixes the condition and adds an assert to ensure we don't stomp
any non-zero intratile offsets.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
If we use the view format, it may be an uncompressed view of a compressed
image which throws things off. Since we're computing offsets of images, we
want the actual surface offset anyway.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
We're going to use it for more than just stencil textures
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
This should be more compact than the enum isl_channel_select[4] that we
were using before. It's also very convenient because we already had such a
structure in the Vulkan driver we just needed to pull it over.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
In particular, this means that isl_view::base_array_layer and
isl_view::array_len get applied to 3-D textures but only when rendering.
We were already applying isl_view::base_array_layer for rendering into 3-D
textures so this isn't a huge deviation.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Copy the whole devinfo structure instead of just few fields (Ken)
Earlier, copied only couple of fields which added more code. So,
simplify code by copying the whole structure.
Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This provides a nice little place to share notes on what still needs to be
done and/or would be nice to have in BLORP.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Rather than using platform specific methods to retrieve the program
name pass it explicitly. The function is called directly from main().
Similarly - basename comes in two versions POSIX (can modify string,
always pass a copy) and GNU (never modifies the string).
Just printout the complete program name, esp. since the program is not
meant to be installed. Thus using $basename is unlikely to work, not to
mention it is misleading.
Reported-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jonathan Gray <jsg@jsg.id.au>
This will indicate target layer (Render Target Array Index) needed
for layered clears.
v2: Use 3DSTATE_VF_SGVS for gen8+
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Otherwise once mcs buffer gets allocated without delay for lossless
compression (same as we do for msaa), assert starts to fire in
piglit case: tex3d. The test uses depth of one which is in fact
supported even now.
v2 (Jason): Allow also 1D case as there is nothing in the specs
constraining it either.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
program_invocation_short_name is a gnu extension. Limit use of it
to glibc and cygwin and otherwise use getprogname() which is available
on BSD and OS X.
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Include libgen.h for basename as required by posix.
The definition is not found on at least OpenBSD otherwise.
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
error() is a gnu extension and is not present on OpenBSD
and likely other systems.
Convert use of error to fprintf/strerror/exit.
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
The Makefile unconditionally linked libX11-xcb into libvulkan_intel.so.
But it's needed only if HAVE_PLATFORM_X11.
Fixes build of libvulkan_intel.so on Chromium OS, which has no X11
libraries.
Fixes: 71258e9462 ("anv/x11: Add support for Xlib platform")
Cc: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
This is the only remaining part of genX_l3.c and there's really no good
reason for it to be in its own file.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Now that we're using gen_l3_config.c, we no longer have one set of l3
config functions per gen and we can simplify a bit. Also, we know that
only compute uses SLM so we don't need to look for it in all of the stages.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
When Jordan first implement L3$ configuration for Vulkan, he copied+pasted
from the GL driver because we had no good place to share it. Now that we
have src/intel/common, we should be sharing these tables.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Generated by:
sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.c
sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.h
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.c
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.cpp
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.h
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
The first thing to go in this new library is brw_device_info.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
A few inline asserts in anv assume alignments are power of 2, but with
formats like R8G8B8 we have odd alignments.
v2: round up to power of 2 (Ilia)
v3: reuse util_next_power_of_two() from gallium/aux/util/u_math.h (Ilia)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The original pipeline cache the Kristian wrote was based on a now-false
premise that the shaders can be stored in the pipeline cache. The Vulkan
1.0 spec explicitly states that the pipeline cache object is transiant and
you are allowed to delete it after using it to create a pipeline with no
ill effects. As nice as Kristian's design was, it doesn't jive with the
expectation provided by the Vulkan spec.
The new pipeline cache uses reference-counted anv_shader_bin objects that
are backed by a large state pool. The cache itself is just a hash table
mapping keys hashes to anv_shader_bin objects. This has the added
advantage of removing one more hand-rolled hash table from mesa.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97476
Acked-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
This new anv_shader_bin struct stores the compiled kernel (as an anv_state)
as well as all of the metadata that is generated at shader compile time.
The struct is very similar to the old cache_entry struct except that it
is reference counted and stores the actual pipeline_bind_map. Similarly to
cache_entry, much of the actual data is floating-size and stored after the
main struct. Unlike cache_entry, which was storred in GPU-accessable
memory, the storage for anv_shader_bin kernels comes from a state pool.
The struct itself is reference-counted so that it can be used by multiple
pipelines at a time without fear of allocation issues.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Acked-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
All of these worked before because they were depending on prog_data to be
null. Soon, we won't be able to depend on a nice prog_data pointer and
it's nice to be more explicit anyway.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
The range from ANV_MIN_STATE_SIZE_LOG2 to ANV_MAX_STATE_SIZE_LOG2 should
be inclusive and we have asserts that ensure that you never try to allocate
a state larger than (1 << ANV_MAX_STATE_SIZE_LOG2). However, without
adding 1 to the difference, we allocate 1 too few bucckts and so, even
though we have an assert, anything landing in the last bucket will fail to
allocate properly..
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
We hash this data structure so we can't afford to have uninitialized data
even if it is just structure padding.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Topi asked to have the prefix removed because there's nothing gen7 about
it. However, now that everything is in a single file, there is no good
reason to have it split out into a helper function anyway. Let's just put
the contents in emit_urb_config and call it a day.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
This keeps invalid surface states from leaking through and potentially
hanging the GPU. We shouldn't actually be hitting this on a regular basis,
but a helpful assert is better than a hang.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
This allows us to use the actual render format as opposed to the texture
format. I don't know that the hardware actually cares in the case of fast
clears, but it certainly seems more correct.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
At this point, blorp is completely driver agnostic and can be safely moved
into its own folder. Soon, we hope to start using it for doing blits in
the Vulkan driver.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Android porting of commit bebc1a1 "intel: Flatten the makefile structure"
Automake approach was followed, by moving makefiles a level up,
naming them Android.genxml.mk and Android.isl.mk,
performing the necessary adjustments to the paths,
adding src/intel/Android.mk and fixing mesa top level makefile.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
This probably isn't the only thing that needs to be done to get
multisampled array textures working in Vulkan but I think this is all that
ISL really needs and it does fix 8 of the new CTS tests.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason suggested adding an assert(function->impl) here. All callers
of this function actually want ->impl, so I decided just to change
the API.
We also change the nir_lower_io_to_temporaries API here. All but one
caller passed nir_shader_get_entrypoint(), and with the previous commit,
it now uses a nir_function_impl internally. Folding this change in
avoids the need to change it and change it back.
v2: Fix one call I missed in ir3_compiler (caught by Eric).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
This pulls isl and genxml into a single make file so that they can properly
build in parallel. This isn't terribly important now as genxml just
generates sources which happens serially first anyway but it will be more
important as we add more stuff to src/intel.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
The tests assumed that isl would be in the include path but that usually
isn't the case. Instead, we usually have src/intel and you need to add an
"isl/" prefix.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
If the surface has a layout of GEN4_2D then we need to compute a normal 2D
alignment and not use the magic linewar 1D alignment.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
The Sky Lake 1D layout is only used if the surface is linear. For tiled
surfaces such as depth and stencil the old gen4 2D layout is used.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
The pipeline layout affects shader compilation because it is what
determines binding table locations as well as whether or not a particular
buffer has dynamic offsets. Since this affects the generated shader, it
needs to be in the hash. This fixes a bunch of CTS tests now that the CTS
is using a pipeline cache.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
This option makes installed Vulkan ICD files contain only a driver library
name and not a path. This is intended for distros to help them work around
multi-arch issues.
Reviewed-by: Dave Airlie <airlied@redhat.com>
We need to compute detiling coordinates using the physical size of W tiling
(128x32) rather than the logical size (64x64).
v2: Correct comment (Jason)
Fixes dEQP-VK.api.copy_and_blit.image_to_image_stencil
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97448
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Several fixes have been added as part of this as listed below:
1) Fix the mask and add disassembler handling for STATE_DS, STATE_HS
as the mask returned wrong values of the fields.
2) Fix the GEN_TYPE_ADDRESS/GEN_TYPE_OFFSET decoding - the address/
offset were handled the same way as the other fields and that gives
the wrong values for the address/offset.
3) Decode nested/recurssive structures - Many packets contain nested
structures, ex: 3DSATE_SO_BUFFER, STATE_BASE_ADDRESS, etc contain MOC
structures. Previously, the aubinator printed 1 if there was a MOC
structure. Now we decode the entire structure and print out its fields.
4) Print out the DWord address along with its hex value - For a better
clarity of information, it is helpful to print both the address and
hex value of the DWord along with the DWord count. Since the DWord0
contains the instruction code and the instruction length, it is
unnecessary to print the decoded values for DWord0. This information
is already available from the DWord hex value.
5) Decode the <group> and the corresponding fields in the group- The
<group> tag can have fields of several types including structures. A
group can contain one or more number of fields and this has be correctly
decoded. Previously, aubinator did not decode the groups or the
fields/structures inside them. Now we decode the <group> in the
instructions and structures where the fields in it repeat for any number
of times specified.
v2: Fix the formatting (per Matt)
Make the start and end pos calculation to extract fields from a DWord
more appropriate by moving %32 away from mask() method
Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
The Aubinator tool is designed to help the driver developers in debugging
the driver functionality by decoding the data in the .aub files.
Primary Authors of this tool are Damien Lespiau <damien.lespiau at intel.com>
and Kristian Høgsberg Kristensen <krh at bitplanet.net>.
v2: Review comments are incorporated by Sirisha Gandikota as below:
1) Make Makefile.am more crisp, reuse intel_aub.h from libdrm (per Emil)
2) Aubinator will use platform name instead of GEN number (per Matt)
3) Disassmebler gets created based on pciid rather then GEN number (per Matt)
4) Other formatting comments (per Ken, Matt and Emil)
Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
In ca2a8e5628, we updated the format table to add more formats (most of
which are new on SKL) but accidentally marked some integer formats as
filterable. You can't filter an integer format.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
We can't actually clear these images normally because we can't render to
them. Instead, we have to manually unpack the rgb9e5 color value on the
CPU and clear it as R32_UINT. We still have a bit of work to do to clear
non-power-of-two images, but this should get all of the power-of-two clears
working on at least Haswell. This fixes three of the new Vulkan CTS tests
in the dEQP-VK.api.image_clearing.clear_color_image.* group.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
This fixes 104 of the new image_clearing and copy_and_blit Vulkan CTS
tests.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
There were a lot of formats where support was added on Haswell or later but
we never updated the format table.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
The whole point of using RGBX is so that we can render to it so if it isn't
renderable, that kind-of defeats the purpose. Some formats (one example is
R32G32B32X32_SFLOAT) exist in the format table but aren't actually
renderable. Eventually, we'd like to get away from RGBX entirely, but this
fixes hangs on BDW today.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
The only reason we should throw INITIALIZATION_FAILED is if we have found
useable intel hardware but have failed to bring it up for some reason.
Otherwise, we should just throw INCOMPATIBLE_DRIVER which will turn into
successfully advertising 0 physical devices
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
We let the user believe we support some transfer formats which we don't.
This can lead to crashes when actually trying to use those formats for
example on dEQP-VK.api.copy_and_blit.image_to_image.* tests.
Let all formats we can render to or sample from as meta implements transfers
using attachments.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Not providing a path allows the ICD to work on multi-arch systems but
breaks it if you install anywhere other than /usr/lib. Given that users
may be installing locally in .local or similar, we probably do want to
provide a filename. Distros can carry a revert of this commit if they want
an intel_icd.json file without the path.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Chad Versace <chad@kiwitree.net>
This is easier than dealing with structs all the time
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The actual data storred is in float, UNORM24, or UNORM16 depending on the
actual depth format.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>