To support multipass, perf counters are queried in the steps
below.
0) There's a scratch reg to set pass indices for perf counters query.
Prepare cmd streams to set each pass index to the reg at device
creation time. See tu_CreateDevice in tu_device.c
1) Emit command streams to read all requested perf counters at all
passes in begin/end query with CP_REG_TEST/CP_COND_REG_EXEC, which
reads the scratch reg where pass index is set.
2) Pick the right cs setting proper pass index to the reg and prepend it
to the command buffer at each submit time.
3) If the test of the pass index in the reg is true, the command
stream below CP_COND_REG_EXEC is executed.
This would also need to be implemented for kgsl in the future.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6808>
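The multipass steps above can be modelled on the CPU as a rough sketch. It assumes the scratch reg holds one bit per pass index, which the commit message does not spell out; the function names and MAX_PERF_PASSES are hypothetical and are not the turnip code.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PERF_PASSES 32u /* hypothetical limit: one bit per pass */

/* Value the per-pass stream prepared at device creation would write to the
 * scratch register (assumption: a single bit at the pass index). */
static uint32_t scratch_value_for_pass(uint32_t pass)
{
   assert(pass < MAX_PERF_PASSES);
   return UINT32_C(1) << pass;
}

/* Stand-in for the CP_REG_TEST / CP_COND_REG_EXEC pair: the commands
 * recorded for `pass` run only if that pass's bit is set in the register. */
static int pass_commands_execute(uint32_t scratch_reg, uint32_t pass)
{
   assert(pass < MAX_PERF_PASSES);
   return (int)((scratch_reg >> pass) & 1u);
}

/* The query records commands for every pass; at submit time the stream for
 * `submit_pass` is prepended, so exactly one recorded block should run. */
static uint32_t count_executed_passes(uint32_t num_passes, uint32_t submit_pass)
{
   uint32_t scratch = scratch_value_for_pass(submit_pass);
   uint32_t executed = 0;

   for (uint32_t p = 0; p < num_passes; p++)
      executed += (uint32_t)pass_commands_execute(scratch, p);
   return executed;
}
```

With four recorded passes and a submit for pass 2, only the block for pass 2 runs in this model.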
Some commands are still unimplemented:
- vkGetPhysicalDeviceQueueFamilyPerformanceQueryPassesKHR:
The following patch supports this.
- vkAcquireProfilingLockKHR / vkReleaseProfilingLockKHR
This patch supports only monitoring perf counters for each submit.
To reserve/configure counters across submits we would need a kernel
interface to be able to do that.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6808>
All odd numbers above 10 need to be rounded up to an even number, so
add one and mask off the least significant bit instead of maintaining
a list of special cases.
Fixes crashes in SuperTuxKart.
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8191>
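The "add one and mask off the least significant bit" rule from the rounding commit above can be written as a one-line helper. This is a minimal sketch; round_up_to_even is a hypothetical name, and the helper rounds any value, not only those above 10.

```c
#include <stdint.h>

/* Rounds n up to the next even number; even inputs are returned unchanged.
 * Adding one makes odd values even (and even values odd), then clearing
 * bit 0 drops the extra one again for inputs that were already even. */
static uint32_t round_up_to_even(uint32_t n)
{
   return (n + 1u) & ~UINT32_C(1);
}
```

For example, 11 and 12 both map to 12, and 13 maps to 14, so no table of special cases is needed.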
This can result in meaningful compression changes so we shouldn't skip.
Fixes: 66131ceb8b "radv: Pass through render loop detection to internal layout decisions."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7004>
All depth/stencil formats are incompatible with each other, so the
mutable bit and the image format list can be ignored.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8126>
Because the state variable of the compile_shader function does
not determine whether the compilation is successful.
Signed-off-by: cheyang <cheyang@bytedance.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8178>
GL_MAX_VARYING_COMPONENTS is bumped to 124 since it should
not include the components of gl_Position. (Same as in blob)
GL_MAX_*_OUTPUT_COMPONENTS is bumped to 128, only
GL_MAX_GEOMETRY_INPUT_COMPONENTS is 64. (Same as in blob)
Per GL 3.2 spec the minimum of:
- GL_MAX_GEOMETRY_OUTPUT_COMPONENTS is 128
- GL_MAX_FRAGMENT_INPUT_COMPONENTS is 128
- others is 64
Per ARB_tessellation_shader the minimum of:
- GL_MAX_TESS_CONTROL_*_COMPONENTS to be 128
- GL_MAX_TESS_EVALUATION_*_COMPONENTS to be 128
Allows passing of:
gl-3.2-minmax
arb_tessellation_shader-minmax
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7917>
POS, PSIZE, CLIP_DIST0, and CLIP_DIST1 have their own predefined
indices, so the map's size should take this into account.
Fixes: 9e063b01 "ir3: Switch tess lowering to use location"
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7917>
The actual max count is 32 which corresponds to 128 output components.
Fixes: 2251a434 "freedreno/a6xx: Write multiple regs for SP_VS_OUT_REG and SP_VS_VPC_DST_REG"
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7917>
This is hard to abstract using the vulkan interface, so just
add support for copying both values in the llvmpipe backend
for the lavapipe frontend.
v2: use a loop
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7981>
This is needed to implement the vulkan transform feedback pause
resume functionality
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7981>
Fix defect reported by Coverity Scan.
Resource leak (RESOURCE_LEAK)
leaked_storage: Variable data going out of scope leaks the storage it points to.
Fixes: 2ea15cd661 ("d3d12: introduce d3d12 gallium driver")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8170>
Fix defect reported by Coverity Scan.
Resource leak (RESOURCE_LEAK)
leaked_storage: Variable data going out of scope leaks the storage it points to.
Fixes: 2ea15cd661 ("d3d12: introduce d3d12 gallium driver")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8150>
MALI_WRAP_MODE_CLAMP doesn't work fully on either GPU generation, so
use other wrap modes instead in some cases.
With nearest filtering, Midgard only clamps to the edge for two of the
edges, and uses the border colour for the other two. Using the clamp
mode on Bifrost causes broken rendering and/or GPU faults.
Fixes piglit test "texwrap" on both Midgard and Bifrost, and fixes
Chromium B.S.U. rendering on Bifrost.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8176>
Depth/stencil resolves are only allowed inside a subpass, which means
the offset is always 0 and the draw/dispatch covers the whole image.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8127>
SPIR-V modules can have multiple shaders (including of the same
stage), but the global variables are all declared for the whole
module. This can result in variables with the same Binding but
incompatible types, so those need to be removed before use.
Previously, a similar issue but with a narrower scope was fixed by
6775665e5e ("spirv: Eliminate dead input/output variables after
translation.").
This patch depends on the previous patch that prevents variables used
only in pointer initializers from being considered dead.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3686
Fixes: 3a266a18 ("nir/spirv: Add support for declaring variables")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8133>
Between the creation of a shader (from GLSL or SPIRV frontends) and
the call to nir_lower_variable_initializers, variables may refer to
other variables for initialization. Those referred variables need to
be kept alive, so consider that in the pass.
Fixes: 7acc81056f ("compiler/nir: Add support for variable initialization from a pointer")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8133>
Uses same NIR intrinsic as glsl_to_nir. Make it an option so it is
easy later to move Vulkan drivers incrementally to use it.
Fixes piglit test spec/arb_gl_spirv/execution/ssbo/unsized-array-length.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3691
Fixes: 15e43907 ("iris: Enable ARB_gl_spirv and ARB_spirv_extensions")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8136>
hash table keys for inserted items have to be valid memory ranges for the
lifetime of the corresponding entry, so using a stack-allocated key like this
is broken and doesn't accurately return the correct renderpass
Fixes: a872f4636924 ("zink: cache render-passes")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8011>
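The key-lifetime rule from the render-pass cache commit above can be illustrated with a small sketch in which each entry owns a copy of its key. A linked list stands in for the hash table, and rp_key, rp_entry and the cache_* functions are hypothetical names, not the zink code.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical key: only uint32_t fields, so there is no padding and
 * memcmp() is a valid equality test. */
struct rp_key {
   uint32_t num_attachments;
   uint32_t formats[4];
};

struct rp_entry {
   struct rp_key key;      /* copy owned by the entry, lives as long as it does */
   int render_pass;        /* stand-in for the cached render pass object */
   struct rp_entry *next;
};

/* Inserts at the head of the list. The caller's key may live on the stack
 * because the entry keeps its own copy. Returns the new head. */
static struct rp_entry *cache_insert(struct rp_entry *head,
                                     const struct rp_key *key, int render_pass)
{
   struct rp_entry *e = malloc(sizeof(*e));
   if (e == NULL)
      return head;
   e->key = *key;
   e->render_pass = render_pass;
   e->next = head;
   return e;
}

static const struct rp_entry *cache_find(const struct rp_entry *head,
                                         const struct rp_key *key)
{
   for (; head != NULL; head = head->next) {
      if (memcmp(&head->key, key, sizeof(*key)) == 0)
         return head;
   }
   return NULL;
}

static void cache_free(struct rp_entry *head)
{
   while (head != NULL) {
      struct rp_entry *next = head->next;
      free(head);
      head = next;
   }
}
```

Had the entry stored only a pointer to the caller's stack key, a later lookup would compare against whatever the stack slot holds by then.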
packed buffers will still return the full format when we're using only the
stencil aspect, so we need to explicitly set the format here
also due to 7ca72f1726 we can't use the provided
swizzle for this case and have to force y -> x component swizzle
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7491>
this reduces to a load_ubo after optimization, so we need to ensure that
the constant data is put in a buffer instead of relying on it happening
coincidentally
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5885>
geometry shaders need to output this variable as well, and the variable
added using the pass on the vertex shader won't be passed through here
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5885>
It should actually be 4 because the maximum fragment size supported
by the hardware is 2x2.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8100>
SWR is missing implementation of pipe_context::flush_resource
function, which is now in the execution path on Windows.
This change adds an empty implementation (flush_resource
is NOOP in SWR) to prevent crashes
Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8159>
LLVM (like NIR) requires phi instructions to be before any other
instructions in the block. ac_branch_exited() can insert non-phi
instructions before visit_block() adds phis, so visit_block() should add
phi instructions before the non-phi instructions ac_branch_exited()
inserts.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Fixes: aa757f4f8c ("ac/llvm: fix demote inside conditional branches")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8054>
This was disabled due to some depth/stencil resolve CTS failures
which are now fixed.
I figured that disabling TC-compat HTILE for D32_SFLOAT+MSAA reduced
performance in Control by 11% on Vega10. In fact, the game only uses
D32_SFLOAT for depth rendering.
This gives a huge boost in Control on Navi10 (eg. +17% in MSAA4x).
Note that the game is still slower than PRO without MSAA on Navi10,
but as fast (or even a bit faster) on Vega10.
I think TC-compat HTILE could also be enabled for D32_SFLOAT_S8_UINT
but it needs more testing first.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8143>
imageSize() expects the last component of the return value to be the
number of layers in the texture array. In the case of cube map array,
it will return a ivec3, with the third component being the number of
layer-faces.
Fixes: dEQP-VK.image.image_size.cube_array.*
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8087>
The glPopAttrib optimizations incorrectly removed it.
Use GL_ALL_ATTRIB_BITS to mean "all texture parameters have changed" to
make it more efficient.
Fixes: d0e18550e2 - mesa: optimize saving/restoring bound textures for glPush/PopAttrib
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8046>
It was a typo, or thinko, sort of.
Fixes: d0e18550e2 - mesa: optimize saving/restoring bound textures for glPush/PopAttrib
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8046>
See the comment. This is something I spotted in the code. There is
no known bug caused by this.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8046>
This changes the code so that program parameters no longer have to be
sorted (meaning uniforms and constants are before state variables).
Instead of checking if the parameter is a state variable for every element,
teach all functions to handle non-state parameters safely. This is better
for the most common case where parameters are sorted or semi-sorted.
The new enum STATE_NOT_STATE_VAR identifies that a parameter is not
a state variable.
Fixes: 63f7d7dd - mesa: take advantage of sorted parameters in _mesa_load_state_parameters
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3914
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8046>
"state" contains NIR, while "vs->base.state" contains TGSI generated
from NIR. It was a typo.
This fixes the arb_vp subtest of: DRAW_USE_LLVM=1 piglit/bin/rasterpos
Fixes: df11ceaaaf - draw: add NIR support to draw_create_vertex_shader
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8046>
We apparently don't have anything else making sure that it's flushed in
between use as a render target and use as a texture source, so bypass-mode
depth texture sampling could get stale data.
Fixes consistent (as far as I could see) failures in FD_MESA_DEBUG=nogmem
on:
dEQP-GLES31.functional.texture.multisample.samples_*.use_texture_depth_2d
dEQP-GLES31.functional.stencil_texturing.render.depth24_stencil8_draw
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8146>
list_del dereferences both next and prev, so if only one of them could
be NULL we would get crashes already.
Should fix "Dereference after null check" reported by Coverity.
Code was added in: 64b73b770b ("iris: Fix bad external BO hash table and zombie list interactions")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8110>
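The list_del argument above can be seen in a minimal doubly linked list. This is a sketch modelled loosely on a circular list with a sentinel head, not a copy of Mesa's util/list.h; all definitions below are local to the example.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
   struct list_head *prev;
   struct list_head *next;
};

static void list_init(struct list_head *head)
{
   head->prev = head;
   head->next = head;
}

/* Inserts item right after list. */
static void list_add(struct list_head *item, struct list_head *list)
{
   item->prev = list;
   item->next = list->next;
   list->next->prev = item;
   list->next = item;
}

/* Unlinks item. Both neighbours are dereferenced unconditionally, so a
 * NULL prev or a NULL next would crash here; checking only one of them
 * before calling list_del() adds no protection. */
static void list_del(struct list_head *item)
{
   assert(item->prev != NULL && item->next != NULL);
   item->prev->next = item->next;
   item->next->prev = item->prev;
   item->prev = NULL;
   item->next = NULL;
}

static int list_is_linked(const struct list_head *item)
{
   return item->next != NULL;
}
```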
If the only user is a trivial bcsel which in a second step
can be turned into a phi, this conversion is also worth it
even if the previous result is not undefined or constant.
Allows for some more loop unrolling or saves a few instructions.
Totals from 62 (0.04% of 139391) affected shaders (NAVI10):
SGPRs: 4976 -> 4992 (+0.32%)
VGPRs: 4408 -> 4472 (+1.45%); split: -0.45%, +1.91%
CodeSize: 453632 -> 464000 (+2.29%); split: -0.32%, +2.60%
MaxWaves: 527 -> 511 (-3.04%); split: +0.38%, -3.42%
Instrs: 84940 -> 86681 (+2.05%); split: -0.36%, +2.41%
Cycles: 11946844 -> 11783708 (-1.37%); split: -1.40%, +0.04%
VMEM: 9403 -> 10357 (+10.15%); split: +11.59%, -1.45%
SMEM: 3003 -> 3025 (+0.73%); split: +1.07%, -0.33%
VClause: 1756 -> 1997 (+13.72%); split: -0.11%, +13.84%
SClause: 2914 -> 2915 (+0.03%); split: -0.10%, +0.14%
Copies: 6426 -> 6768 (+5.32%); split: -4.14%, +9.46%
Branches: 2105 -> 2102 (-0.14%); split: -1.66%, +1.52%
PreSGPRs: 2921 -> 2909 (-0.41%); split: -0.55%, +0.14%
PreVGPRs: 4151 -> 4179 (+0.67%); split: -0.24%, +0.92%
cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8123>
The cl_khr_extended_versioning extension differs from the OpenCL 3.0
version on this specific as it only reports a single supported OpenCL C
version, whereas the OpenCL 3.0 equivalent will report all of them.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7590>
Since we're requiring the branch condition to be in WQM, we have to ensure
that the block is in the worklist.
Fixes Trials Fusion hang at 4K and High settings.
fossil-db (Sienna):
Totals from 216 (0.15% of 139391) affected shaders:
SGPRs: 13392 -> 13360 (-0.24%)
CodeSize: 1321184 -> 1318592 (-0.20%)
Instrs: 255310 -> 254662 (-0.25%)
Cycles: 2178360 -> 2174652 (-0.17%)
Affected fossils in fossil-db are dirt4, nier and youngblood.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3863
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8145>
arraystride is a required decoration for arrays of scalars, so ensure that
we put in some effort on this for the case where an array doesn't specify
an explicit stride
Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8142>
according to spec, dvec3 and dvec4 vertex attribs require 2 slots (locations),
and so the shader loads have to be explicitly split to reflect this
helpfully, gallium already gives us the vertex element state in a split format,
so no other changes are necessary to have this work as expected
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8141>
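The slot rule for double-precision attributes described above can be sketched as a small helper. vertex_attrib_slots is a hypothetical name, and the helper only covers scalar and vector attributes with 1 to 4 components.

```c
#include <assert.h>

/* Number of attribute locations consumed by a vertex input with the given
 * component count and per-component bit size. 64-bit vectors with more
 * than two components (dvec3, dvec4) do not fit in one slot and take two. */
static unsigned vertex_attrib_slots(unsigned components, unsigned bit_size)
{
   assert(components >= 1 && components <= 4);
   return (bit_size == 64 && components > 2) ? 2 : 1;
}
```

A dvec2 still fits in a single slot, which is why only the dvec3 and dvec4 loads need to be split.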
According to the spec:
"pCounterBuffers is an optional array of buffer handles [...]
If pCounterBuffers is NULL, then transform feedback will start
capturing vertex data to byte offset zero in all bound transform
feedback buffers."
"If counterBufferCount is not 0, and pCounterBuffers is not NULL,
pCounterBuffers must be a valid pointer to an array [...]"
So counterBufferCount could be non-zero with pCounterBuffers
being NULL.
Fixes crash in RenderDoc when inspecting draw call with tessellation
or geometry shader present.
Fixes: 98b0d900 "turnip: rework streamout state and add missing counter buffer read/writes"
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8140>
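The pCounterBuffers case quoted above amounts to a NULL check before indexing the array. The sketch below models it without the Vulkan headers: each "counter buffer" is just a pointer to a saved byte offset, and begin_transform_feedback is a hypothetical stand-in, not the turnip entry point.

```c
#include <stddef.h>
#include <stdint.h>

/* For each of the counter_buffer_count streams, picks the byte offset at
 * which capture starts. A NULL array, or a NULL element, means "start at
 * offset zero" instead of being dereferenced. */
static void begin_transform_feedback(uint32_t counter_buffer_count,
                                     const uint64_t *const *counter_buffers,
                                     uint64_t *start_offsets)
{
   for (uint32_t i = 0; i < counter_buffer_count; i++) {
      if (counter_buffers == NULL || counter_buffers[i] == NULL)
         start_offsets[i] = 0;
      else
         start_offsets[i] = *counter_buffers[i];
   }
}
```

Calling it with a count of 2 and a NULL array leaves both streams starting at offset zero.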
According to the spec:
"pTessellationState [...] is ignored if the pipeline does not
include a tessellation control shader stage and tessellation
evaluation shader stage."
Fixes crash in RenderDoc when inspecting draw call with
geometry shader but without tessellation shaders.
Fixes: eefdca2e "turnip: Parse tess state and support PATCH primtype"
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8140>
I thought this was a bug in CTS but the Vulkan spec says:
"VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT specifies write access
to a color, resolve, or depth/stencil resolve attachment during
a render pass or via certain subpass load and store operations."
So, VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT is used to synchronize
depth/stencil resolve attachments. Yes, it's counterintuitive.
This can't actually be fixed properly for now because RADV performs
the end subpass barrier *before* resolve attachments instead of after.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8138>
If one operand was renamed and another operand came
from an incomplete phi, the original name might not
be restored.
This has no impact on the code, but ensures correct SSA
is maintained during RA.
Cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8109>
EGL_EXT_protected_surface introduces EGL_PROTECTED_CONTENT_EXT,
while EGL_EXT_protected_content is about protected context.
When I implemented EGL_EXT_protected_surface I mixed up the 2
names, so this commit fixes it.
Fixes: bd182777c8 ("egl: implement EGL_EXT_protected_surface support")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8122>
Since Gallium supports 8 bit indices, this extension is a simple matter
of plumbing a value through, exposing a feature and flipping the switch
for the extension. This lets zink avoid up-converting the index-buffer
before drawing.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8082>
Instead of checking whether the source and destination are the same,
we should check if the underlying BOs are the same, since we may
be suballocating resources from the same allocation and the kernel
will fail to execute jobs if the BO list has duplicated entries.
Fixes aborts with Unreal Engine due to failed TFU jobs.
Fixes: 30f1fc25ce ('v3dv: implement TFU blits')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8098>
According to the comment of this function, it should return a non-negative
number for the number of scopes between the current scope and
the scope where a symbol was defined.
Signed-off-by: cheyang <cheyang@bytedance.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8084>
Fix defect reported by Coverity Scan.
Uninitialized pointer field (UNINIT_CTOR)
uninit_member: Non-static class member name is not initialized in
this constructor nor in any functions that it calls.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7766>
The LOD bias can be negative, so mark it as signed in the XML.
The code in pan_cmdstream.c already calls FIXED_16 with
'allow_negative' set to true, so doesn't need to be adjusted.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8130>
Any excess sign-extend bits would spill into adjacent fields, so mask
off anything after the end bit.
Shift from 2 instead of 1, because there needs to be one extra bit in
the mask as 'end' is inclusive.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8130>
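The masking described above for signed fields can be sketched as follows. pack_sint is a hypothetical helper with an inclusive bit range [start, end]; it is an illustration of the mask arithmetic, not the generated pack code.

```c
#include <assert.h>
#include <stdint.h>

/* Places the signed value v into bits start..end (inclusive) of a word.
 * A negative v has sign-extension bits all the way up to bit 63, so every
 * bit above `end` is masked off to keep it out of adjacent fields.
 * (2 << end) - 1 has bits 0..end set: one more bit than (1 << end) - 1,
 * because `end` itself belongs to the field. */
static uint64_t pack_sint(int64_t v, uint32_t start, uint32_t end)
{
   assert(start <= end && end < 63);
   const uint64_t mask = (UINT64_C(2) << end) - 1;
   return ((uint64_t)v << start) & mask;
}
```

Packing -1 into bits 4..7 gives 0xF0 here; without the mask, every bit above bit 7 would be set as well.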
If there are too many jobs in a batch, split it. Although the GPU
theoretically supports 65536 jobs in a batch, set the threshold lower
to avoid GPU timeouts.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8130>
This moves the parts of zink_format.c that also operate on zink_screen
into zink_screen.c. This has the benefit that we can start testing the
enum-translation code separately from the state.
This will make the next commit a bit cleaner.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7982>
If we have a context, make sure any work on it's done before
reading from the render target. There may even be pending
MSAA resolves that haven't been submitted yet.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7937>
Always use the experimental shader models feature, which allows
unsigned DXIL to be used, so we don't need a libdxil for WSL.
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7937>
We've been inconsistent between IID_PPV_ARGS,
__uuidof(var), and __uuidof(type). Since Linux doesn't
support the latter of these, they need to be changed.
While we're at it, switch all __uuidof to the more terse
IID_PPV_ARGS option.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7937>
MSVC has an extension for getting IIDs (GUIDs) from types. Other
compilers can support this extension when targeting Windows, but
don't support it when targeting Linux. Instead, winadapter.h
defines __uuidof(var) to uuidof<decltype(var)>. Then dxguids.h
provides inline specialized definitions for the known D3D types.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7937>
The winadapter.h provides typedefs and defines to enable the
D3D/DXCore headers to be included as-is when targeting non-
Windows platforms.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7937>
This is more up-to-date with what's on GitHub, and more importantly,
it embeds some of the non-Windows support logic in the header, instead
of shelling out to a nonexistent header.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7937>
Not all Windows platforms have DXGI, and neither does WSL.
Instead, we can use the DXCore API for adapter enumeration.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7937>
This does 2 things for us:
1. Allows us to compile-time depend on any features from new headers,
instead of having to conditionally compile based on Windows SDK version.
2. Allows us to reference d3d12.h when compiling for non-Windows.
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7937>
On some platforms, the authenticate callback may be NULL, e.g. on
surfaceless. If a client tries to send a wl_drm.authenticate request
the handler tries to dereference the NULL pointer.
This can be reproduced with libva which unconditionally tries to use
wl_drm.authenticate even with render nodes [1]. Run a compositor with
a surfaceless context, then try to start e.g. mpv to trigger the
segfault.
[1]: https://github.com/intel/libva/pull/476
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7992>
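The NULL callback case above comes down to checking the function pointer before calling it. In this sketch, drm_callbacks, handle_authenticate, always_ok and the -1 error value are all hypothetical; the real handler reports the failure to the client through the protocol.

```c
#include <stddef.h>
#include <stdint.h>

struct drm_callbacks {
   /* May be NULL on platforms without authentication, e.g. surfaceless. */
   int (*authenticate)(void *user_data, uint32_t id);
};

/* Handles an authenticate request. Returns the callback's result, or -1
 * when the platform did not provide a callback. */
static int handle_authenticate(const struct drm_callbacks *callbacks,
                               void *user_data, uint32_t id)
{
   if (callbacks->authenticate == NULL)
      return -1;
   return callbacks->authenticate(user_data, id);
}

/* Example callback that accepts every request. */
static int always_ok(void *user_data, uint32_t id)
{
   (void)user_data;
   (void)id;
   return 0;
}
```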
(some) drivers need to have the swizzle set prior to create_sampler_view
being called in order to actually apply it
Fixes: d11fefa961 ("st/mesa: optimize 4-component ubyte glDrawPixels")
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8107>
This uses ralloc for spirv_shader and its data-payload, which seems a
bit neater than having to remember to free twice. We can now also easily
piggy back on more sophisticated ralloc usage as well.
No need to use rzalloc here, as we'll write all memory in the struct,
and the struct isn't used as a hashmap key, so padding shouldn't matter.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8049>
Using the general layout for samplers can have terrible performance, so
let's use shader-read-only-optimal instead.
This is fairly straight-forward if we use conservative bounds for the
barriers, and assume they are being used in all stages.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7655>
Quoting a comment on the bug report:
I suspect the shader is incorrect.
When a (conditional) discard is executed then control flow
becomes non-uniform, meaning that subsequent implicit
derivatives required for the texture operation are not
computed correctly.
Using glsl_correct_derivatives_after_discard fixes it. Note
that for radeonsi this requires LLVM master to work properly.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/1386
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8005>
The spec says:
When disabled, it is as if the line stipple has its default value
(the default value being all 1's)
So treat pattern=0xffff as line stippling = off.
This improves performance in specviewperf13 snx lines tests.
For instance in the last test I get:
* master: 260 fps, gpu-load: ~92%
* with this commit: 280 fps, gpu-load: ~72%
(both tested with d60930c017 reverted)
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8105>
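The line stipple rule above reduces to a small predicate. line_stipple_needed is a hypothetical name, and the sketch looks only at the enable flag and the 16-bit pattern.

```c
#include <stdbool.h>
#include <stdint.h>

/* A pattern of all ones draws every fragment of the line, which is what
 * happens with stippling disabled, so the stipple path can be skipped. */
static bool line_stipple_needed(bool enabled, uint16_t pattern)
{
   return enabled && pattern != 0xffff;
}
```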
This is close to a revert of commit
b5b25ee032, but it limits the scope a bit
to avoid needless performance degradation.
In the long run, we should really allow using tiled resources here, and
instead detile while presenting.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8115>
It doesn't make complete sense to me, but it's copied from the commit
message that made this change.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8022>
The problem was that the shader constants were based on the framebuffer
sample count and ignored the multisample enable state and the line/polygon
smoothing state, which uses MSAA rasterization that only sets SampleMaskIn
to get the coverage for alpha-blended smoothing (the PS epilog computes
the alpha channel from SampleMaskIn and blending generates the AA results).
- This is a complete rework that adds a new state for NGG cull constants.
- It fixes the same thing for the prim discard compute shader.
- It documents how VS_STATE.SMALL_PRIM_PRECISION is encoded.
It fixes blue corruption in Unigine Heaven with MSAA and Medium details
or better.
Fixes: 7648060dc0 - radeonsi: enable NGG culling by default on gfx10.3 dGPUs
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8022>
According to the mali driver output, the Mali-400 GP provides space for
304 vec4 uniforms, globals and temporary variables.
The Mali-PP supports a uniform table up to size 32768 total.
However, indirect access to a uniform only supports indices up to 8192
(a 2048 vec4 array). Trying to access beyond that currently causes a pp
job timeout with both lima and the mali driver. To prevent indices
bigger than that in application uniforms, limit to 8192 for now.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8079>
dep_valgrind gives you -I/usr/include/valgrind (or whatever) so if
valgrind/ wasn't in the search path anyway, these includes would fail.
Found in CI when adding valgrind to the build images.
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7936>
There's no harm in checking for the extension on non-macOS, just do it.
Nor can I see any point in checking for both the layer and the
extension, since you're never going to see the extension if the layer
isn't available, so just check for the extension instead of the reduced
boolean. Simplify some variable naming while we're at it.
Acked-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8019>
This result isn't actually used within zink_create_instance, so don't do
it there.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8019>
Magic parameters are gross, this makes zink_internal_create_screen a bit
more reusable.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8019>
This complements u_bitcast_f2u and u_bitcast_u2f with similar helpers
to cast between double and unsigned integers as well.
Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8034>
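Helpers that cast between double and 64-bit unsigned integers, as described above, are commonly written with memcpy to avoid aliasing problems. The names bitcast_d2u and bitcast_u2d below are hypothetical and may not match the helpers added by the commit.

```c
#include <stdint.h>
#include <string.h>

/* Reinterprets the bits of a double as a 64-bit unsigned integer.
 * memcpy avoids the undefined behaviour of casting pointer types. */
static uint64_t bitcast_d2u(double d)
{
   uint64_t u;
   memcpy(&u, &d, sizeof(u));
   return u;
}

/* Inverse of bitcast_d2u. */
static double bitcast_u2d(uint64_t u)
{
   double d;
   memcpy(&d, &u, sizeof(d));
   return d;
}
```

On a platform with IEEE 754 doubles, bitcast_d2u(1.0) yields 0x3ff0000000000000, and a round trip through both helpers returns the original value.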
These always work on 32-bit variables, so let's make that assumption
explicit.
Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8034>
Fixes the piglit tex-miplevel-selection test by:
1. properly taking texture baselevel and maxlevel into account
2. only enabling lodbias when mipmapping is enabled
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7634>
The DRM_RDWR flag is needed for mmap with PROT_WRITE to work.
Cc: mesa-stable
Signed-off-by: Robin Ole Heinemann <robin.ole.heinemann@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8075>
According to ANDROID_get_native_client_buffer, EGL implementations must
guarantee that the lifetime of an EGLClientBuffer returned by
eglGetNativeClientBufferANDROID is at least as long as that of the
EGLImage it is bound to. Do this by acquiring a reference to the
underlying AHardwareBuffer for all ANativeWindowBuffers which are bound
to an _EGLImage.
Signed-off-by: David Stevens <stevensd@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7805>
This is a noop, as no loader extensions pass a __DRIimage's
loader_private data back to the loader.
Signed-off-by: David Stevens <stevensd@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7805>
It's hooked up in all the pipe wrapper drivers, and all the
frontends except a couple places in glx/xlib.
This enables a more efficient path for drivers which use
swrast's Present, but hardware rendering (e.g. d3d12, zink).
Reviewed-by: Dave Airlie <airlied@redhat.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8045>