Stencil on Gfx7 has a vertical alignment element of 8, but the largest
its surface state can express is 4. Apply the Gfx6 solution of changing
the alignment in blorp_surf_retile_w_to_y.
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
On XeHP, NPOT and POT formatted surfaces will use different image
alignment units when emitting surface states. When BLORP fakes an RGB
image as RED, update the image alignment to prevent assert failures when
emitting surface states.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
They are not used yet, but the layout of Yf and Ys tiles depends on
these parameters. While we're here, better document the function.
Rework:
* Nanley: Update crocus.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
This is mostly a bit of future-proofing. We never end up with offsets
that don't fit in 32 bits today because, thanks to driver limitations
caused by relocations, we don't allocate buffers bigger than 2GB today.
However, if we ever did, it's possible to create a surface on modern
platforms that consumes more than 4GB and we would end up with wrapping
in our offset calculations.
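
A minimal sketch of the wrapping being guarded against. The helper names
and the row-pitch layout are hypothetical and only illustrate the
arithmetic; they are not the actual offset code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of a row, computed in 32 bits.
 * The product wraps modulo 2^32 once the surface exceeds 4GB. */
static uint32_t
row_offset_B_32(uint32_t row, uint32_t row_pitch_B)
{
   return row * row_pitch_B;
}

/* Same computation, widened to 64 bits before multiplying. */
static uint64_t
row_offset_B_64(uint32_t row, uint32_t row_pitch_B)
{
   return (uint64_t)row * row_pitch_B;
}

/* Hypothetical: narrow only at the point where a 32-bit field is
 * written, and check that nothing is lost. */
static uint32_t
checked_narrow(uint64_t offset_B)
{
   assert(offset_B <= UINT32_MAX);
   return (uint32_t)offset_B;
}
```

With row = 65536 and row_pitch_B = 65536, the 32-bit version yields 0
while the 64-bit version yields 4294967296.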
Acked-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11765>
It's intel-specific, used to get at MSAA compression information.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775>
We can actually create array surfaces instead of requiring single-slice
in a few cases. This does require us to be very careful about our
checks, though.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11647>
It doesn't matter for the actual copy rectangle and this makes the
asserts a bit nicer as we don't need to bother with the intratile
offsets because there aren't any yet.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11647>
Coverity complains that memset has no effect, because of size 0.
Size of BLEND_STATE struct is 0 on gfx [6, 7.5], so memset has
nothing to do there. This is of course harmless, but we can make
code simpler by replacing memset with an empty initializer list
and at the same time avoid a warning from Coverity.
CID: 1486015
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11574>
Upon looking at caching the raytracing shader (in particular the
trampoline one) I kind of got afraid that some of the keys used for
blorp would end up matching other keys. This is because blorp keys are
fairly simple. There is no SPIRV module hash included.
This change includes a "blorp" string at the beginning of the key to
ensure we don't collide with other keys.
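
A rough sketch of the idea, assuming a made-up key layout (struct
tagged_key and its fields are hypothetical, not the real blorp key
structs): two keys with identical parameters no longer compare equal
once each starts with its own name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key layout: a fixed tag followed by the parameters. */
struct tagged_key {
   char     name[8];       /* e.g. "blorp", zero padded */
   uint32_t shader_type;
   uint32_t params;
};

static void
tagged_key_init(struct tagged_key *key, const char *name,
                uint32_t shader_type, uint32_t params)
{
   assert(strlen(name) < sizeof(key->name));
   memset(key, 0, sizeof(*key));   /* zero everything so memcmp is stable */
   strcpy(key->name, name);
   key->shader_type = shader_type;
   key->params = params;
}

/* Keys are compared bytewise, the way a cache lookup would. */
static int
tagged_key_equal(const struct tagged_key *a, const struct tagged_key *b)
{
   return memcmp(a, b, sizeof(*a)) == 0;
}
```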
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
We need to do this in order to handle Yf and Ys tiling because they use
a four-dimensional tile instead of laying everything out in two
dimensions.
v2 (Jason Ekstrand):
- Update functions added since v1:
- isl_surf_get_image_range_B_tile
- blorp_can_hiz_clear_depth
- get_image_offset_el
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> (v1)
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11330>
Avoids warning with GCC 10:
../src/intel/blorp/blorp_blit.c: In function 'blorp_nir_combine_samples':
../src/intel/blorp/blorp_blit.c:702:25: error: 'texture_data[0]' may be used uninitialized in this function [-Werror=maybe-uninitialized]
702 | texture_data[0] = nir_fmul(b, texture_data[0],
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
703 | nir_imm_float(b, 1.0 / tex_samples));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9833>
Make INTEL_DEBUG=blorp dump the blorp shaders instead of using the
general INTEL_DEBUG=fs,vs, which is now reserved for the actual FS and
VS shaders used by the pipeline.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779>
Makes calling code more explicit about what is being set, and allows
taking advantage of zero initialization for the fields the callsite
doesn't care about.
Besides moving to the struct, two extra "ergonomic" changes were done:
- Add a new shader_time boolean, so shader_time_index is ignored when
unused -- this allows taking advantage of the zero initialization of
unset fields.
- Since we have a struct, provide space for the error_str pointer.
Both iris and i965 were using it, and the extra rstrdup in case of
failure shouldn't be a burden for the others.
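
As an illustration only, a parameter struct of this shape could look
like the sketch below. The struct name, the key field and the compile()
function are hypothetical; only shader_time, shader_time_index and
error_str come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parameter struct for a compile entry point. */
struct compile_params {
   const void *key;
   bool        shader_time;
   int         shader_time_index;   /* only read when shader_time is true */
   const char *error_str;           /* set by the callee on failure */
};

static bool
compile(struct compile_params *params)
{
   assert(params != NULL);
   if (params->key == NULL) {
      params->error_str = "missing key";
      return false;
   }
   /* shader_time_index is ignored unless shader_time was requested. */
   int index = params->shader_time ? params->shader_time_index : -1;
   (void)index;
   return true;
}
```

A caller that doesn't care about shader time can write
"struct compile_params p = { .key = &key };" and rely on every other
field being zero.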
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779>
Use GEN_VERSIONx10 == 75 check in place of GEN_IS_HASWELL macro.
GEN_GEN and GEN_VERSIONx10 macros provide a consistent way to do platform
version checks. We can avoid platform specific macros.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9608>
This patch renames all macros with "GEN_" prefix defined in
common code.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>
This patch renames functions, structures, enums etc. with "gen_"
prefix defined in common code.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>
Changes in this patch include:
- Rename all files in src/intel/common path
- Update the filenames used in source and build files
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>
On BDW and SKL, assert that render target dimensions are 8x4-aligned
when performing HiZ ambiguates on LOD1+. Testing indicates that the
assertion should hold in order to achieve consistent/correct ambiguate
operations on gen9.
v2. Account for the relaxed restrictions on ICL+. (Ken)
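
A sketch of the condition being asserted, under the assumption that the
generation is passed as a plain integer (8 = BDW, 9 = SKL, 11 = ICL).
The function name and signature are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the 8x4 restriction is satisfied or doesn't apply.
 * width and height are the render target dimensions being checked. */
static bool
hiz_ambiguate_dims_ok(int gen, uint32_t level,
                      uint32_t width, uint32_t height)
{
   assert(gen >= 8);

   /* Only BDW and SKL at LOD1+ are restricted in this sketch. */
   bool restricted = (gen == 8 || gen == 9) && level >= 1;
   if (!restricted)
      return true;

   return (width % 8 == 0) && (height % 4 == 0);
}
```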
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3788
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8853>
When calculating a URB configuration, we start with a notion of how
much space each stage /wants/ (to achieve the maximum amount of
concurrency), but sometimes fall back to giving it less than that,
because we don't have enough space. (Typically, this happens when
the per-stage size is large, or there are many stages, or both.)
We now output a "constrained" boolean which is true if we weren't
able to satisfy all the "wants" due to a lack of space.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8721>
This cleans up a bunch of gross sprintfs and keeps the caller from needing
to remember to ralloc_strdup. I added a couple of '"%s", name ? name :
""' to radv where I didn't fully trace through whether a non-null name was
being passed in.
I also took the liberty of adding a basic name to a few shaders (pan_blit,
unit tests)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7323>
These two consumers were the only ones out of the ~65 calls to
init_simple_shader, so there's a pretty clear consensus on how to allocate
simple shaders. I suspect that actually these would be just fine with
b.shader being the mem_ctx, but that would take a bit more rework.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7323>
The current blorp API only allows source layers for 3D images to be
integers. That is causing problems with the Vulkan API where we need
to be able to use a 3D layer that could be in between 2 layers.
This change allows a floating point value to be passed for blits and
internally sets up the input parameters to pass floating point values
to kernels.
v2: Use tex op to determine what types the coordinates are (Jason)
Drop setting params->z (Lionel)
v3: Fix nir_texop_txf_ms_mcs op not considered as having integer coords (Lionel)
v4: Fix incorrect test on nir_texop_txf_ms_mcs (Ivan)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3458
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6909>
I considered a couple other options (including adding #if / #endif
around UNUSED and adding an UNUSED_ON_SOME_GEN), but this seemed the
best. There was also at least one other case of having UNUSED on a
parameter that is sometimes unused (params in
blorp_emit_color_calc_state).
This header gets included in a lot of places (esp. in files that get
built per-Gen), so the warnings are repeated a lot.
In file included from src/mesa/drivers/dri/i965/genX_blorp_exec.c:33:
src/intel/blorp/blorp_genX_exec.h: In function ‘emit_urb_config’:
src/intel/blorp/blorp_genX_exec.h:193:48: warning: unused parameter ‘deref_block_size’ [-Wunused-parameter]
193 | enum gen_urb_deref_block_size *deref_block_size)
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_fill_vertex_buffer_state’:
src/intel/blorp/blorp_genX_exec.h:350:52: warning: unused parameter ‘batch’ [-Wunused-parameter]
350 | blorp_fill_vertex_buffer_state(struct blorp_batch *batch,
| ~~~~~~~~~~~~~~~~~~~~^~~~~
src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_surface_state’:
src/intel/blorp/blorp_genX_exec.h:1403:42: warning: unused parameter ‘aux_op’ [-Wunused-parameter]
1403 | enum isl_aux_op aux_op,
| ~~~~~~~~~~~~~~~~^~~~~~
src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_update_clear_color’:
src/intel/blorp/blorp_genX_exec.h:1867:46: warning: unused parameter ‘batch’ [-Wunused-parameter]
1867 | blorp_update_clear_color(struct blorp_batch *batch,
| ~~~~~~~~~~~~~~~~~~~~^~~~~
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6899>
We should set "Full Surface Depth and Stencil Clear" field of WM_HZ_OP
3DSTATE packet, only when application requires the entire depth surface
to be cleared.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6549>
Instead of having separate lists of variables, roughly sorted by mode,
use a single list for all shader-level NIR variables. This makes a few
list walks a bit longer here and there but list walks aren't a very
common thing in NIR at all. On the other hand, it makes a lot of things
like validation, printing, etc. way simpler. Also, there are a number
of cases where we move variables from inputs/outputs to globals and this
makes it way easier because we no longer have to move them between
lists. We only have to deal with that if moving them from the shader to
a nir_function_impl.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5966>
Add a new aux usage which more accurately describes the behavior of
CCS_E on gen12. On this platform, writes using the 3D engine are either
compressed or substituted with fast-cleared blocks.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5363>
This allows us to do API specific checks before removing a variable
without filling nir_remove_dead_variables() with API specific code.
In the following patches we will use this to support the removal
of dead uniforms in GLSL.
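
A minimal sketch of the callback pattern, using stand-in types. struct
variable, its fields and both functions below are hypothetical and are
not the NIR API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct variable {
   const char *name;
   bool used;
   bool api_pinned;   /* hypothetical: the API still needs this one */
};

typedef bool (*can_remove_cb)(const struct variable *var, void *data);

/* Compacts vars in place and returns the new count. A dead variable is
 * dropped only if the callback is NULL or says removal is allowed. */
static size_t
remove_dead_variables(struct variable *vars, size_t count,
                      can_remove_cb can_remove, void *data)
{
   assert(vars != NULL || count == 0);

   size_t kept = 0;
   for (size_t i = 0; i < count; i++) {
      bool dead = !vars[i].used;
      if (dead && (can_remove == NULL || can_remove(&vars[i], data)))
         continue;
      vars[kept++] = vars[i];
   }
   return kept;
}

/* Example API-specific check, kept out of the generic pass. */
static bool
api_allows_removal(const struct variable *var, void *data)
{
   (void)data;
   return !var->api_pinned;
}
```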
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4797>
This uses a meson builtin to handle -fvisibility=hidden. This is nice
because we don't need to track which languages are used; if C++ is
suddenly added, meson just does the right thing.
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4740>
Identify if view_index is used only for position calculation, and use
Primitive Replication to implement Multiview in Gen12. This feature
allows storing per-view position information in a single execution of
the shader, treating position as an array.
The shader is transformed by adding a for-loop around it that has one
iteration per active view (in the view_mask). Stores to the position
now store into the position array for the current index in the loop,
and load_view_index() will return the view index corresponding to the
current index in the loop.
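
The loop structure can be sketched in plain C as follows. This is an
illustration of the indexing only: compute_position() stands in for the
shader body and every name here is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_VIEWS 16

struct vec4 { float x, y, z, w; };

/* Stand-in for the shader body: only the position depends on the view. */
static struct vec4
compute_position(uint32_t view_index)
{
   struct vec4 p = { (float)view_index, 0.0f, 0.0f, 1.0f };
   return p;
}

/* Runs the body once per bit set in view_mask. positions[i] receives the
 * result for the i-th active view; returns the number of active views. */
static unsigned
run_for_active_views(uint32_t view_mask, struct vec4 positions[MAX_VIEWS])
{
   assert((view_mask >> MAX_VIEWS) == 0);

   unsigned loop_index = 0;
   for (uint32_t view_index = 0; view_index < MAX_VIEWS; view_index++) {
      if (!(view_mask & (1u << view_index)))
         continue;
      /* The body sees the actual view index, not its order in the loop. */
      positions[loop_index++] = compute_position(view_index);
   }
   return loop_index;
}
```

For view_mask = 0x5 the loop runs twice: slot 0 holds the position for
view 0 and slot 1 holds the position for view 2.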
The feature is controlled by setting the environment variable
ANV_PRIMITIVE_REPLICATION_MAX_VIEWS, which defaults to 2 if unset.
For pipelines with view counts larger than that, the regular
instancing will be used instead of Primitive Replication. To disable
it completely set the variable to 0.
v2: Don't assume position is set in vertex shader; remove only stores
for position; don't apply optimizations since other passes will
do; clone shader body without extract/reinsert; don't use
last_block (potentially stale). (Jason)
Fix view_index immediate to contain the view index, not its order.
Check for maximum number of views supported.
Add guard for gen12.
v3: Clone the entire shader function and change it before reinsert;
disable optimization when shader has memory writes. (Jason)
Use a single environment variable with _DEBUG on the name.
v4: Change to use new nir_deref_instr.
When removing stores, look for mode nir_var_shader_out instead
of walking the list of outputs.
Ensure unused derefs are removed in the non-position part of the
shader.
Remove dead control flow when identifying whether primitive
replication can be used.
v5: Consider all the active shaders (including fragment) when deciding
that Primitive Replication can be used.
Change environment variable to ANV_PRIMITIVE_REPLICATION.
Squash the emission of 3DSTATE_PRIMITIVE_REPLICATION into this patch.
Disable Prim Rep in blorp_exec_3d.
v6: Use a loop around the shader, instead of manually unrolling, since
the regular unroll pass will kick in.
Document that we don't expect to see copy_deref or load_deref
involving the position variable.
Recover use_primitive_replication value when loading pipeline from
the cache.
Set VARYING_SLOT_LAYER to 0 in the shader. Earlier versions were
relying on ForceZeroRTAIndexEnable but that might not be
sufficient.
Disable Prim Rep in cmd_buffer_so_memcpy.
v7: Don't use Primitive Replication if position is not set; fall back
to instancing instead. Change environment variable to be
ANV_PRIMITIVE_REPLICATION_MAX_VIEWS and default it to 2 based on
experiments.
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
Change brw_compute_vue_map() to also take the number of pos slots. If
more than one slot is used, the VARYING_SLOT_POS is treated as an
array.
When using Primitive Replication, instead of a single position, the
VUE must contain an array of positions. Padding might be
necessary (after clip distance) to ensure the rest of the attributes
start aligned.
v2: Add note about array in the commit message and assert that
pos_slots >= 1 to make clear 0 is invalid. (Jason)
Move padding to be after the clip distance.
v3: Apply the correct offset when gathering the sources from outputs.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v2]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
Let's make it clear what includes are being added everywhere, so that
they can be cleaned up.
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>
Stencil CCS is slightly different from color CCS. Using a color CCS
resolve with stencil CCS doesn't do the right thing and you can't sample
from a stencil CCS image without the DepthStencilResource bit set or you
will get the wrong data. Stencil CCS also has its own rules: it
doesn't support fast-clear and has no partial resolve. This seems to
indicate that it should probably be its own isl_aux_usage. Now that
adding new isl_aux_usage values is pretty cheap, let's split stencil CCS
out on its own.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>
Previously, we always set the aux_usage to ISL_AUX_USAGE_HIZ_CCS and let
ISL choose write-through based on isl_surf_supports_hiz_ccs_wt. This
commit makes us choose explicitly at surface creation time whether to
use HIZ_CCS or HIZ_CCS_WT based on the same set of conditions. This is
more explicit and should be more robust as it lets us choose WT mode in
one place rather than trusting isl_surf_supports_hiz_ccs_wt to return
the same thing every time.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>
On Gen4 and G45 and earlier, we have to handle weird offsetting to write
to depth and stencil due to a lack of proper depth mipmapping support in
hardware. On Gen6, we have to deal with strange HiZ and stencil
layouts. Prior to Gen9, we also had to do crazy things for stencil
writes because we didn't support GL_ARB_shader_stencil_export and
friends in hardware. However, starting with Gen7 for depth and Gen9 for
stencil, we can easily write out with the "right" hardware. This allows
us to leave HiZ and other compression enabled for blorp_blit() and
blorp_copy() operations.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3717>
Previously, i965/iris tried to reuse the currently programmed URB config
if it was good enough for BLORP, rather than reprogramming it each time.
However, this will make some things harder on Gen12+ and we've not seen
any performance impact from emitting URB more frequently in ANV.
This makes the blorp <-> driver interface a bit simpler on Gen7+ because
now all the driver has to do is to provide the L3$ config rather than
trying to hand off URB re-config to blorp.
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
According to the BSpec, this should prevent hangs when using shaders
with large URB entries. A more precise fix can be done but it requires
re-arranging URB setup.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>