The biggest change here is the rename of VK_NVX_ray_tracing to
VK_NV_ray_tracing and the total removal of VK_KHR_mir_surface.
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Enables on R600 and makes pass:
dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.*
dEQP-GLES31.functional.texture.filtering.cube_array.formats.sr8*
v2: remove chunk for dri/radeon (Emil)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Allocating through Gralloc implies buffers are going to be used
outside the driver. We have special MOCS settings for external BOs and
we probably want to use them here too.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a1220e7311 ("anv/android: Set the BO flags in bo_cache_import (v2)")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
This reduces the amount of #ifdef ANDROID we'll have to have inside
the driver. Potentially offering better coverage of the android
extensions.
v2: Move anv_android.h include before anv_entrypoints.h (Tapani)
Fix autotools android build (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
At higher resolutions with the addition of MSAA, the number of tiles
can increase to the point where we use more than one VSC pipe per
tile. Which would cause us to calculate an out-of-bounds offset for
VSC_SIZE_ADDRESS. So don't try to be clever, just always put it at
a fixed offset assuming the max 32 VSC pipes in use.
Signed-off-by: Rob Clark <robdclark@gmail.com>
After commit a9fb331ea ("wayland/egl: update surface size on window
resize"), the surface size is updated as soon as the resize is done, and
`update_buffers()` would resize only if the surface size differs from
the attached size.
However, in the case of swrast, there is no resize callback and the
attached size is updated in `dri2_wl_swrast_commit_backbuffer()` prior
to the `swrast_update_buffers()` so the attached size is always up to
date when it reaches `swrast_update_buffers()` and the surface is never
resized.
This can be observed with "totem" using the GDK backend on Wayland (the
default) when running on software rendering:
$ LIBGL_ALWAYS_SOFTWARE=true CLUTTER_BACKEND=gdk totem
Resizing the window would leave the EGL surface size unchanged.
To avoid the issue, partially revert the part of commit a9fb331ea for
`swrast_update_buffers()` and resize on the win size and not the
attached size.
Fixes: a9fb331ea - wayland/egl: update surface size on window resize
Signed-off-by: Olivier Fourdan <ofourdan@redhat.com>
CC: Daniel Stone <daniel@fooishbar.org>
CC: Juan A. Suarez Romero <jasuarez@igalia.com>
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
If the user provides an invalid display or device the ToVendor lookup
will fail.
In this case, the local [Mesa vendor] error code will be set. Thus on
sequential eglGetError(), the error will be EGL_SUCCESS.
To be more specific, GLVND remembers the last vendor and calls back
into it's eglGetError, although there's no guarantee to ever have had
one.
v2:
- Add _eglError call, so the debug callback is executed (Kyle)
- Drop XXX comment.
Piglit: tests/egl/spec/egl_ext_device_query
Fixes: ce562f9e3f ("EGL: Implement the libglvnd interface for EGL (v3)")
Cc: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kyle Brenneman <kbrenneman@nvidia.com>
eglQueryDevicesEXT (unlike the other three functions) does not depend
on the display. It is implemented in GLVND, which calls into each
driver collecting the list of devices and presenting it to the user.
For the other entrypoints, GLVND acts as pass through stub calling into
the vendor library. The vendor implementation calls back into GLVND to
get the vendor dispatch. Then the driver proceeds to call itself via
the said dispatch.
This design makes is possible to keep using "old" GLVND with newer
vendor drivers. Since effectively all the extension code is within the
latter itself.
Without said entrypoints, any user will outright crash - as reported in
the bug report.
Note: there's a follow-up fix needed to our GLVND code, to make piglit
happy.
v2: add some beefy documentation in the commit message.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108635
Fixes: 7552fcb7b9 ("egl: add base EGL_EXT_device_base implementation")
Reported-by: kyle.devir@mykolab.com
Cc: kyle.devir@mykolab.com
Acked-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Emil Velikov <emil.velikov@collabora.com>
Fixes: 4373dd3215 ("st/va: Support YUV formats in vaCreateSurfaces")
Cc: Drew Davenport <ddavenport@chromium.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Message that may show the culprit of assert now will
be dumped before that for debug purposes.
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Lionel G Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We can only map at page aligned offsets. We got that wrong with buffer
size where (size % 4096) != 0 (anv has a WA buffer of 1024).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Building of 32bit Mesa may fail if __SSE__ is not specified.
Added missed dependency from libm.
v2: avoided dependecy on any flag, just link
v3: meson doesn't fail, but have added dependency on libm
CC: Dylan Baker <dylan@pnwbakers.com>
CC: Lionel G Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108560
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
The spec says:
"vkCmdCopyQueryPoolResults is considered to be a transfer
operation, and its writes to buffer memory must be synchronized
using VK_PIPELINE_STAGE_TRANSFER_BIT and VK_ACCESS_TRANSFER_WRITE_BIT
before using the results."
VK_PIPELINE_STAGE_TRANSFER_BIT will wait for compute to be idle,
while VK_ACCESS_TRANSFER_WRITE_BIT will invalidate both L1 vector
caches and L2. So, it's useless to set those flags internally.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
In the non-SSO case, where multiple shader stages are linked together,
we were recording garbage pipe_stream_output_info structures for all
but the last enabled geometry-processing stage.
Specifically, we were using the gl_transform_feedback_info from
shader_program->last_vert_prog (the stage whose outputs will be
recorded)...but were pairing it with the output varying mappings
from the current shader stage. For example, a program with a VS and
GS, the VS's pipe_shader_state would have a pipe_stream_output_info
based on the GS transform feedback info, but the VS output mapping.
This generally worked out okay because only the pipe_stream_output_info
for the last stage really matters - the others can be ignored. However,
we'd like to avoid confusing the pipe driver. In particular, my new
driver translates the stream out information to hardware packets at
bind_{vs,tes,gs}_state() time...and was hitting asserts about garbage
varyings that didn't exist.
This patch changes st/mesa to record a blank pipe_stream_output_info
with num_outputs = 0 for all stages prior to last_vert_prog. The last
one is captured as normal.
(In the fully-SSO case, nothing should change - each program contains
a single shader stage, so last_vert_prog *is* the current shader.)
Tested with llvmpipe (piglit's gpu profile), and freedreno (a3xx,
gpu profile with -t transform.feedback). Fixes several hundred CTS
tests on my new driver.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
There are some cases where the VS is the only stage enabled, it uses the
entire URB, and the URB is large enough that placing later stages after
the VS exceeds the number of bits for "URB Starting Address".
For example, on Icelake GT2, "varying-packing-simple mat2x4 array" from
Piglit is getting a starting offset of 128 for the GS/HS/DS. But the
field is only large enough to hold an offset of 127.
i965 doesn't hit any genxml assertions because it's still using the old
OUT_BATCH mechanism. 128 << GEN7_URB_STARTING_ADDRESS_SHIFT (57) == 0,
with the extra bit falling off the end. So we place the disabled stage
at the beginning of the URB (overlapping with push constants). This is
likely okay since it's a zero size region (0 entries).
It seems like the Vulkan driver might hit this assertion, however, and
the situation seems harmless. To work around this, always place
disabled stages at the start of the URB, so the last enabled stage can
fill the remaining space without overflowing the field.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
libmesa_git_sha1 whole static dependency is added to get git_sha1.h header
and avoid following building error:
external/mesa/src/amd/vulkan/radv_device.c:46:10:
fatal error: 'git_sha1.h' file not found
^
1 error generated.
Fixes: 9d40ec2cf6 ("radv: Add support for VK_KHR_driver_properties.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
In some cases (not building with llvm, which automatically pulls in
pthreads) nine needs to be directly linked with pthreads. Fixes building
on x86 (32 bit) without llvm.
Distro bug: https://bugs.gentoo.org/670094
Fixes: 6b4c7047d5
("meson: build gallium nine state_tracker")
Tested-by: Rafal Lalik <rafallalik@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
WA_1606682166:
Incorrect TDL's SSP address shift in SARB for 16:6 & 18:8 modes.
Disable the Sampler state prefetch functionality in the SARB by
programming 0xB000[30] to '1'. This is to be done at boot time and
the feature must remain disabled permanently.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
In the same spirit as commit a5889d70f2
"i965/icl: Disable binding table prefetching". Fixes some 110+
intermittent piglit failures with tex-miplevel-selection variants.
WA_1606682166:
Incorrect TDL's SSP address shift in SARB for 16:6 & 18:8 modes.
Disable the Sampler state prefetch functionality in the SARB by
programming 0xB000[30] to '1'. This is to be done at boot time and
the feature must remain disabled permanently.
Anuj: Set SamplerCount = 0 for vs, gs, hs, ds and wm units as well.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The option was removed in LLVM r345763
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
We cannot use nir_build_alu() to create the new alu as it has no
way to know how many components of the src we will use. This
results in it guessing the max number of components from one of
its inputs.
Fixes the following CTS tests:
dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order_frag
dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order_geom
dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order_tessc
dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order_vert
Fixes: 2975422ceb ("nir: propagates if condition evaluation down some alu chains")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
To avoid build error in u_debug_stack_android.cpp
due to now missing u_debug.h header:
external/mesa/src/gallium/auxiliary/util/u_debug_stack_android.cpp:26:10:
fatal error: 'u_debug.h' file not found
#include "u_debug.h"
^
1 error generated.
Fixes: 37db383abb ("util: Move u_debug to utils")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
The bind flags defined by mesa/gallium might not always be in sync
with the ones copied to virglrenderer/gallium. Therefore, use the
flags defined in virgl like it is done for all the other calls to
create resources.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This only adds support on the Gallium core level, for the drivers
it is likely that additional changes are needed to support the
new texture format and thereby enabling the extension.
Enables on softpipe and makes pass:
dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.*
v2: - add include for getting GL_SR8_EXT
v4: - since the extension is not required don't bother providing
a fallback (Ilia Mirkin)
- split patch (2/2) to separate Gallium and mesa/st parts
(Roland Scheidegger)
- trim commit message to only contain the history of the patch
relevant to this part
v5: - don't include GLES headers (required enum has been added to glheader.h)
(Ilia Mirkin)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
This format is needed to support EXT_texture_sRGB_R8. THe patch adds a new
format enum, the format entries in Gallium and and svga, the mapping between
sRGB and linear formats, and tests.
v2: - add mapping to linear format for PIPE_FORMATR_R8_SRGB
v3: - Add texture format to svga format table since otherwise building
mesa will fail when this driver is enabled. It was not tested
whether the extension actually works.
v4: - svga: remove the SVGA specific format definitions and table entries
and only add correct the location of PIPE_FORMAT_R8_SRGB in the
format_conversion_table (Ilia Mirkin)
- Split patch (1/2) to separate Gallium part and mesa/st part.
(Roland Scheidegger)
- Trim the commit message to only contain the relevant parts from the
split.
v5: - svga: correct location of PIPE_FORMAT_SRGB_R8 (Ilia Mirkin)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
v2: - fix format definition line
- disable for desktop GL
- don't add GL_R8_EXT to glext.h since it is already in
GLES2/gl2ext.h in glext.h and include this header where needed
(all Emil)
v3: - swrast: Fill the function table for sRGB_R8
The size of the function table is checked at compile time and must
correspond to the number of mesa texture formats.
dri/swrast being gles-2.0 doesn't support the extension though
v4: - correct format layout comment (Ilia Mirkin)
- correct logic for accepting GL_RED only textures (in part Ilia Mirkin)
EXT_texture_sRGB_R8 requires OpenGL ES 3.0 which includes
ARB_texture_rg/EXT_texture_rg, so one only must check for the first
when SR8_EXT is really requested.
v5: - add define for GL_ES8_XT to glheader.h and don't include GLES
headers (Ilia Mirkin)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
The GLSL 4.6 specification (section 4.1.14. "Implicit Conversions")
says:
"There are no implicit array or structure conversions. For
example, an array of int cannot be implicitly converted to an
array of float."
So let's add a check in place when assigning array initializers to
implicitly sized arrays, to avoid incorrectly allowing code on the
form:
int[] foo = float[](1.0, 2.0, 3.0)
This fixes the following dEQP test-cases:
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.int_to_float_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.int_to_float_fragment
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.int_to_uint_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.int_to_uint_fragment
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.uint_to_float_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.uint_to_float_fragment
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.int_to_float_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.int_to_float_fragment
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.int_to_uint_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.int_to_uint_fragment
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.uint_to_float_vertex
- dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.uint_to_float_fragment
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
EXT_shader_implicit_conversions adds support for implicit conversions
for GLES 3.1 and above.
This is essentially a subset of ARB_gpu_shader5, and augments
OES_gpu_shader5.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
In GLES, we currently either need an exact match with a local function,
or an exact match with a builtin.
However, if we add support for implicit conversions for GLES shaders,
we also need to fall back to a non-exact match in the case where there
were no builtin match either.
Luckily, we already have a variable ready with this, so let's just
return it if the builtin-search failed.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
This makes the code a bit easier to read, as well as reduces repetition,
especially when we add support for EXT_shader_implicit_conversions.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
This makes the code a bit easier to read, as well as will reduce
repetition when we add support for EXT_shader_implicit_conversions.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
If the user gives 0 counterBuffers then the driver should still
enable transform feedback on all targets. This changes the
driver to always enable xfb, and use counter buffers where
one is defined for the target in question.
Fixes: b4eb029062 (radv: implement VK_EXT_transform_feedback)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
In order to handle pause/resume properly, the offset should
be added to the buffer binding not to the begin/end paths.
v2: don't add offset to size
Fixes ext_transform_feedback-alignment* under zink
Fixes: b4eb029062 (radv: implement VK_EXT_transform_feedback)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Since 0c1dd9dee0 ("broadcom/vc4: Allow importing linear BOs with
arbitrary offset/stride."), we have the vc4-side BO properly laid out
(assuming it's linear) in the winsys BO so that we can skip this extra
copy.
The recompile reduction is nice, but this also makes it so that a straight
texture copy could get optimized some day to not unpack/repack the f16
values.
If the caller has passed in a stride for (linear) BO import, we should use
that stride when rendering to the BO (or, if we some day support texturing
from linear-imported BOs, when doing the linear-to-UIF shadow copy). This
lets us remove the extra stride-changing relayout in the simulator.
This came from vc4, where we had a file format for GPU hangs. I don't
have one of those for V3D, and I probably won't ever have the simulator
side produce dumps even if I do.
The default setting of this bit is not the desirable behavior.
WA_1406697149
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
The default setting of this bit is not the desirable behavior.
WA_1406697149
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 7834926a4f
("meson: add support for generating translation mo files")
Some environment (like Travis apparently) set LC_* vars, messing up the
sort ordering, so let's use envvar with the highest priority to make
sure this is actually sorted in ASCII order.
Suggested-by: Michel Dänzer <michel@daenzer.net>
Fixes: b42dc50a5f "egl: fix entrypoint sorting test"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Some memory and file descriptors are not freed/closed.
v2: fixed case where we skipped the 'aub' variable initialization
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Include stdarg.h in error2aub.c otherwise it fails to build on
OpenBSD due to not finding definitions for va_list va_start va_end.
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Ported from RadeonSI. It's always TRUE for CIK+ because RADV
doesn't support 16 samples.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Some of these functions were distributed across different
implementation and header files. Put them at a central place.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
The array type draw is no longer directly dependent on the vbo module.
Thus move array type draws into mesa/main/draw.c.
Rename symbols starting with vbo_* to _mesa_* and apply some
reindenting to make it consistent.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
These calls are just the same in each if branch. So pull that
before the if.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
With this change we preserve the no_current_update property when we
observe a glPrimitiveRestart call. That means that we now also get the
no_current_update optimization for display lists that are made
out of indexed draws using primitive restart.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Instead of coding additional information into the primitive
mode, make the only remaining flag there a direct argument to
vbo_save_NotifyBegin.
v2: Fix incorrect no_current_update in glRectf.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
The _mesa_prim::no_current_update flag should tell the compiled
display list if the current attributes that are placed in the dlists
vbo shall take a defined state past replay of a display list.
Immediate mode draws compiled into display lists should set the
current values. Array draws may leave the current values in
undefined state.
So finally this flag is not a property of every primitive
but it is a property of the compiled display list and there it
is a property of the last primitive compiled into the list.
So move the flag out of _mesa_prim into vbo_save.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
The previous patch left a constant if (0) in the code.
Clean that up now.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
When setting the _mesa_prim::mode field we always filter out
all non OpenGL primitive mode bits. So this tested bit cannot be
there anymore and the test evaluates to zero.
The zero is removed with the next patch to ease review.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Now looking at the implementation of vbo_save_NotifyBegin.
The VBO_SAVE_PRIM_WEAK flag, delivered in the primitive mode
argument to vbo_save_NotifyBegin, is not evaluated anymore.
The two users of the mode argument are the primitive mode
itself, where the VBO_SAVE_PRIM_WEAK bit is masked out to
retrieve the underlying OpenGL primitive mode. The other
user is to check for the VBO_SAVE_PRIM_NO_CURRENT_UPDATE bit
which is different from VBO_SAVE_PRIM_WEAK.
So, since vbo_save_NotifyBegin does not care about
VBO_SAVE_PRIM_WEAK, we can savely remove it from the call
arguments of vbo_save_NotifyBegin.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
The only reader of the weak field in _mesa_prim is pretty
console printing. By that, remove the weak field from _mesa_prim.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
On finishing a display list playback the VBO_SAVE_FALLBACK bit
is still kept in vbo_save_context::replay_flags. But examining
replay_flags and the display list flags that feed this value
the corresponding bit is never set these days anymore.
So, since it is nowhere set or checked, we can safely remove it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
One cannot have haiku and dri2 - surfaceless,x11,etc.
Group things up, which will make the addition of platform_device a bit
easier.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Now that we support the extensions, fully, enabled them.
The specs mandate that we always have at least one device and each dpy
has a device associated with it.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
This is the final requirement from the base EGLDevice spec.
v2:
- split from another patch
- move wayland hunk after we have the fd
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Add implementation based around the drmDevice API. As such it's only
available only when building with libdrm. With the latter already a
requirement when using !SW code paths in the platform code.
Note: the current code will work if a device is hot-plugged. Yet
hot-unplugged is not implemented, since I have no ways of testing it.
v2:
- ddd some _eglDeviceSupports checks
- require DRM_NODE_RENDER
- add _eglGetDRMDeviceRenderNode helper
v3:
- flip inverted asserts (Mathias)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Add a plain software device, which is always available.
We can safely assign it as the first/initial device in _eglGlobals,
although we ensure that's the case with a handful of _eglDeviceSupports
checks throughout the code.
v2:
- s/_eglFindDevice/_eglAddDevice/ (Eric)
- s/_eglLookupAllDevices/_eglRefreshDeviceList/ (Eric)
- move ^^ helpers into a earlier patch (Eric, Mathias)
- set the SW device on _eglGlobal init. (Eric)
- add a number of _eglDeviceSupports checks (Mathias)
- split Device/Display attach to a separate patch
v3:
- flip inverted asserts (Mathias)
- s/on-stack/static/ (Mathias)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Introduce the API for device query and enumeration. Those at the moment
produce nothing useful since zero devices are actually available.
That contradicts with the spec, so the extension isn't advertised just
yet.
With later commits we'll add support for software (always) and hardware
devices. Each one exposing the respective extension string.
v2:
- fold API boilerplate into this patch
- move _eglAddDevice, _eglDeviceSupports, _eglRefreshDeviceList to this
patch (Eric, Mathias)
- make _eglFiniDevice the one called last
v3:
- comment on the dummy _egl_device_extension enum entry (Eric)
- annotate dev as MAYBE_UNUSED (Mathias)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Write down both X and GLX visual types when mapping from one to the
other. Makes grepping through the code a tiny bit easier.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The function definition is no longer around, drop the useless declaration.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This is incorrect, the offset is into the buffer, and it's legal
to write
loc 0,0 -> buffer0, offset 0
loc 0,1 -> buffer1, offset 0
This fixes a bunch of piglits running on my zink xfb code on
radv.
Fixes: 6c21645046 (radv: emit stream outputs for vertex and tessellation stages)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Instead of using a while loop with indexing. This is much cleaner. This
requires some other small changes.
Acked-by: Emil Velikov <emil.velikov@collabora.com>
This is a very common python anti-pattern. Not using length allows us to
go through faster C paths, but has the same meaning.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Using shell redirection to write to a file is more complicated than
necessary, and has the potential to run into unicode encoding problems.
It's also less code.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108530
v2: - update commit message to say less about LANG=C
- use flags instead of positional arguments for the script (Emil)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
gen_xmlpool uses a style unlike the rest of mesa, spaces between
function/method calls and the parens, strange whitespace to force lining
up method calls, and some other whitespace stuff. Since I'm going to be
doing some work in the file, I'm going to start cleaning those up.
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Meson won't put the .gmo files in the layout that python's
gettext.translation() expects, it puts them in the build directory in a
flat layout. This modifies android and autotools to do the same (scons
doesn't work with translations at all)
v3: - Squash 4 patches into this patch
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Meson has handy a handy built-in module for handling gettext called
i18n, this module works a bit differently than our autotools build does,
namely it doesn't automatically generate translations instead it creates
3 new top level targets to run. These are:
xmlpool-pot
xmlpool-update-po
xmlpool-gmo
v2: - Add new files to autotools dist tarball
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
This is a little cleaner than just looking at sys.argv, but it's also
going to allow us to handle the differences in the way meson and
autotools handle translations more cleanly.
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
We need to update the cursor before we check if the alu use is
dominated by the if condition. Previously we were checking if
the current location of the alu instruction was dominated by
the if condition which would miss some optimisation opportunities.
Fixes: a3b4cb3458 ("nir/opt_if: Rework condition propagation")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This patch fixes this freedreno autotools build error.
CXXLD ir3_compiler
/usr/lib/valgrind/libcoregrind-amd64-linux.a(libcoregrind_amd64_linux_a-m_main.o): In function `_start':
(.text+0x0): multiple definition of `_start'
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o:(.text+0x0): first defined here
/usr/bin/ld: /usr/lib/valgrind/libcoregrind-amd64-linux.a(libcoregrind_amd64_linux_a-m_main.o): relocation R_X86_64_32S against undefined symbol `vgPlain_interim_stack' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: /usr/lib/valgrind/libcoregrind-amd64-linux.a(libcoregrind_amd64_linux_a-m_trampoline.o): relocation R_X86_64_32 against `.text' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: /usr/lib/valgrind/libcoregrind-amd64-linux.a(libcoregrind_amd64_linux_a-dispatch-amd64-linux.o): relocation R_X86_64_32S against symbol `vgPlain_stats__n_xindirs_32' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
Fixes: f3cc0d2747 ("freedreno: import libdrm_freedreno + redesign submit")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108595
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Pretty much all of the scripts are python2+3 compatible.
Check and allow using python3, while adjusting the PYTHON2 refs.
Note:
- python3.4 is used as it's the earliest supported version
- python2 chosen prior to python3
v2: use python2 by default
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
gtest is an external project that is copied in this tree for technical
reasons, but isn't maintained by us, so its warnings are irrelevant.
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
This is an external project we have no control over, and will not be
fixing (other than by sometimes pulling the latest sources), so warnings
serve no purpose here.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Since commit "intel/compiler: Stop assuming the entrypoint is called
"main"" there is no need to force the entrypoint name to be "main".
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Patch does a 'dry run' of assign_attribute_or_color_locations before
optimizations to catch cases where we have aliasing of unused attributes
which is forbidden by the GLSL ES 3.x specifications.
We need to run this pass before unused attributes may be removed and with
attribute binding information from program, therefore we re-use existing
pass in linker rather than attempt to write another one.
This fixes WebGL2 test 'gl-bindAttribLocation-aliasing-inactive' and
Piglit test 'gles-3.0-attribute-aliasing'.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106833
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This is a revert of Marek's 2cb9ab53dd revert.
It was needed to revert the previous commit, and didn't have any issue
itself.
--
The "DRI2" name was reported as confusing when printing EGL infos (one
user reported thinking DRI3 was not working on his X server), and the
only alternative is Haiku, which can only be used on a Haiku machine.
The name therefore doesn't add any information that the user wouldn't
know already, so let's just drop it.
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Related-to: b174a1ae72 ("egl: Simplify the "driver" interface")
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
This is a revert of Marek's 84f3afc2e1 revert, with a missing
line added back. I failed a rebase and dropped that crucial line, and
didn't do a runtime test after my rebase, and as a result broke EGL for
everyone.
This commit has been tested by Intel's CI and I re-read it once more, so
it should be good this time.
--
Note: dropping the EGL_BAD_ALLOC in egl_haiku because it's
overwritten by the EGL_NOT_INITIALIZED in eglInitialize().
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
This reverts commit 773d6ea6e7.
Since kernel 4.17 (drm/etnaviv: remove the need for a gpu-subsystem DT
node) the etnaviv DRM driver doesn't have an associated DT node
anymore. This is technically correct, as the etnaviv device is a
virtual device driving multiple hardware devices.
Before 4.17 the userspace had access to the following information:
DRIVER=etnaviv
OF_NAME=gpu-subsystem
OF_FULLNAME=/gpu-subsystem
OF_COMPATIBLE_0=fsl,imx-gpu-subsystem
OF_COMPATIBLE_N=1
MODALIAS=of:Ngpu-subsystemT<NULL>Cfsl,imx-gpu-subsystem
DRIVER=imx-drm
OF_NAME=display-subsystem
OF_FULLNAME=/display-subsystem
OF_COMPATIBLE_0=fsl,imx-display-subsystem
OF_COMPATIBLE_N=1
Afer 4.17:
DRIVER=etnaviv
MODALIAS=platform:etnaviv
The OF node has never been part of the etnaviv UABI, simply due to the
fact that it's still possible to instantiate the etnaviv driver from a
platform file, instead of a devicetree node.
A patch set to fix this problem was send out [1] but it looks like
that a proper solution needs more time to bake.
[1] https://lists.freedesktop.org/archives/dri-devel/2018-October/194651.html
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
I don't think we want to wait for something that hasn't been
correctly submitted. This is similar to the fallback path.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This isn't true for Vulkan so we have to whack it to "main" in anv which
is silly. Instead of walking the list of functions and asserting that
everything is named "main" and hoping there's only one function named
"main", just use the nir_shader_get_entrypoint() helper which has better
assertions anyway.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
ffs() just returns the bit that is set, we need to know what
stage that bit represents so use u_bit_scan() instead.
Fixes: 2ca5d9548f ("st/glsl_to_nir: gather next_stage in shader_info")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This isn't used in mesa, maybe vmware uses this in a closed source state
tracker?
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
In order to pull u_debug into src/util we need to break the generically
useful bits from the bits that are tightly coupled to gallium.
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
amdgpu doesn't use the INPUT but the AVERAGE subfeature:
$ sensors -u
amdgpu-pci-0100
Adapter: PCI adapter
power1:
power1_average: 17.233
power1_cap: 180.000
Signed-off-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
TGSI has no I64MAD/U64MAD opcode.
Fixes: 278580729a ('st/glsl_to_tgsi: add support for 64-bit integers')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Dual source blending behaviour is undefined when shader doesn't
have second color output.
"If SRC1 is included in a src/dst blend factor and
a DualSource RT Write message is not used, results
are UNDEFINED. (This reflects the same restriction in DX APIs,
where undefined results are produced if “o1” is not written
by a PS – there are no default values defined)."
Dismissing fragment in such situation leads to a hang on gen8+
if depth test in enabled.
Since blending cannot be gracefully fixed in such case and the result
is undefined - blending is simply disabled.
v2 (Jason Ekstrand):
- Apply the workaround to each individual entry
- Emit a warning through debug_report
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Dual source blending behaviour is undefined when shader doesn't
have second color output, dismissing fragment in such situation
leads to a hang on gen8+ if depth test in enabled.
Since blending cannot be gracefully fixed in such case and the result
is undefined - blending is simply disabled.
v2 (Kenneth Graunke):
- Listen to BRW_NEW_FS_PROG_DATA in 3DSTATE_PS_BLEND
- Also whack BLEND_STATE[] to keep the two in sync, since we're not
sure exactly which copy of the redundant info the hardware will use.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107088
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Apparently, we're supposed to look at the texture object's built-in
sampler object's sRGB decode setting in order to decide whether to
decode/downsample/re-encode, or simply downsample as-is. Previously,
I had always done the decoding/encoding.
Fixes SKQP's Skia_Unit_Tests.SRGBMipMaps test.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
If we restore the 'new batch' using 'intel_batchbuffer_reset_to_saved'
function we must restore the default state of the batch using
'brw_new_batch' function because the 'intel_batchbuffer_flush'
function will not do it for the 'new batch' again.
At least the following fields of the batch
'state_base_address_emitted','aperture_space', 'state_used'
should be restored to default values to avoid:
1. the aperture_space overflow
2. the missed STATE_BASE_ADDRESS commad in the batch
3. the memory overconsumption of the 'statebuffer'
due to uncleared 'state_used' field.
etc.
v2: merge with new commits, changes was minimized, added the 'fixes' tag
v3: added in to patch series
Fixes: 3faf56ffbd "intel: Add an interface for saving/restoring
the batchbuffer state."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107626
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
There's no point reverting to the last saved point if that save point is
the empty batch, we will just repeat ourselves.
CC: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 3faf56ffbd "intel: Add an interface for saving/restoring
the batchbuffer state."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107626
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
While at it, turn "unreachable" assert() into unreachable().
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
This was required back when MSVC didn't support C99 and was missing this
header, but since MSVC 2013 (or maybe earlier?) this isn't it does and
this code isn't doing anything anymore.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Fixes: 6ccc435e7a "pipe-loader: move dup(fd) within pipe_loader_drm_probe_fd"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
We were doing this late after nir_lower_io, but we can just reuse the core
code. By doing it at this stage, we won't even set up the VS attributes
as inputs, reducing our VPM size.
We always emit 4 slots per slot because things like color output and
position processing in the epilogue will potentially look up more values
than the variable declaration had. However, when we get a .location_frac
!= 0, we don't want to overwrite components of the following
.driver_location.
For supporting scalar VPM i/o at the NIR level, we need to do a pass over
the vars to figure out how big each attribute is after DCE. Once we've
done that, we can just walk over c->vattr_sizes[] instead of bothering
with vars.
This will be used on V3D to cut down the size of the VS inputs in the VPM
(memory area for sharing data between shader stages).
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
On older kernels, the drmGetDevice() call will wake up all the GPUs
on the system, while fetching the PCI revision.
Use the 2 version of the API and pass flags == 0, so we don't fetch the
device PCI revision, since we don't need that information.
Fixes: baa38c144f ("vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Previously, we would create temporary variables and fill them out.
Instead, we create as many function parameters as we need and pass them
through as SSA defs.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
x86 32 bit generic target does not enable ARCH_X86_HAVE_SSE4_1
for this reason all Android library modules using SSE4_1 in mesa
are built conditionally to ARCH_X86_HAVE_SSE4_1
The same approach is now applied to libmesa_intel_tiled_memcpy_sse41
in order to avoid the following building errors:
external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:574:15:
error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int'
__m128i val = _mm_stream_load_si128((__m128i *)src);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:578:15:
error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int'
__m128i val0 = _mm_stream_load_si128(((__m128i *)src) + 0);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:579:15:
error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int'
__m128i val1 = _mm_stream_load_si128(((__m128i *)src) + 1);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:580:15:
error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int'
__m128i val2 = _mm_stream_load_si128(((__m128i *)src) + 2);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:581:15: error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int'
__m128i val3 = _mm_stream_load_si128(((__m128i *)src) + 3);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
5 errors generated.
Fixes: 11b1afdc92 ("i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Preliminary work for adding handling of different pipes to gen_decoder. We
need to be able to distinguish between different pipes in order to decode
the packets correctly due to opcode re-use.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Use the 'DWord Length' and 'bias' fields from the instruction definition to
parse the packet length from the command stream when possible. The hardcoded
mechanism is used whenever an instruction doesn't have this field.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This function was there when the file was introduced in commit
38f10d5a03 "intel: tools: add aubinator viewer", but was
never actually used.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Left over from the cleanup in 6ccc435e7a "pipe-loader: move dup(fd)
within pipe_loader_drm_probe_fd"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Since out variables are copied from shader objects instruction
streams to linked shader instruction steam it should be cloned
at first to keep source instruction steam unaltered.
Fixes: 966a797e43 ("glsl/linker: Link all out vars from a shader
objects on a single stage")
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105731
Meson already automatically tracks included headers, so there's no need
to add them everywhere; cleans up the code a bit.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Fixes: 9d40ec2cf6 "radv: Add support for VK_KHR_driver_properties."
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
This implementation should work and potential bugs can be
fixed during the release candidates window anyway.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This is required for GS multiple streams support.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
For all streams. We basically just need to update the
base address and compute a stride for every stream.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Same as the previous patch, except that is only the number of
components.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
For multiple streams support we have to set the different ring
buffer sizes correctly. This relies on the number of output
components per stream.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This will be also used for splitting the GS->VS ring buffer.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This will be used for splitting the GS->VS ring buffer. The
stream ID is always 0 for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This is different from the GL_ARB_spirv pass because it generates a much
simpler data structure that isn't tied to OpenGL and mtypes.h.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
This doesn't include any new features but it does include an XML and
header typo fix for modifiers.
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
The computation seems correct compared to RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
In the 'inorder' case (ie. FD_MESA_DEBUG=inorder, or old kernel), if the
u_blitter clear path is used (a3xx, a4xx, and some fallback cases on
newer gens), util_blitter_restore_fb_state() will set_framebuffer_state()
to something that is identical to the current fb state, which triggers
an unnecessary flush, and then eventually an assert:
(gdb) bt
#0 0x0000007fbf24a078 in kill () from /lib64/libc.so.6
#1 0x0000007fbe061278 in _debug_assert_fail (expr=0x7fbe93a820 "!batch->flushed", file=0x7fbe93a628 "../src/gallium/drivers/freedreno/freedreno_batch.c", line=491, function=0x7fbe93a990 <__func__.17380> "fd_batch_check_size") at ../src/gallium/auxiliary/util/u_debug.c:322
#2 0x0000007fbe1ccb8c in fd_batch_check_size (batch=0x55556d5a70) at ../src/gallium/drivers/freedreno/freedreno_batch.c:491
#3 0x0000007fbe1d0e08 in fd_clear (pctx=0x55555c61e0, buffers=5, color=0x55556e388c, depth=1, stencil=0) at ../src/gallium/drivers/freedreno/freedreno_draw.c:463
#4 0x0000007fbe57afa4 in st_Clear (ctx=0x55556e17b0, mask=18) at ../src/mesa/state_tracker/st_cb_clear.c:452
The assert was introduced in 4b847b38ae, so from a functionality
standpoint this patch fixes that commit. But it should also avoid an
unnecessary flush in the 'inorder' case, fixing a performance bug.
Fixes: 4b847b38ae freedreno: make fd_batch a one-shot thing
Signed-off-by: Rob Clark <robdclark@gmail.com>
ZSA state can change whether depth or stencil is enabled
This plus previous patch fix stk, and various things w/
FD_MESA_DEBUG=inorder
Fixes: ec717fc629 freedreno: reduce resource dependency tracking overhead
Signed-off-by: Rob Clark <robdclark@gmail.com>
The problem isn't directly with ec717fc629 but rather that commit
exposes the problem. When we switch batch we cannot assume previous
state is clean so we should mark all state dirty.
Fixes: ec717fc629 freedreno: reduce resource dependency tracking overhead
Signed-off-by: Rob Clark <robdclark@gmail.com>
We were previously using relative timeouts and decrementing the
user-provided timeout as we waited. Instead, this commit refactors
things to use absolute timeouts throughout. This should fix a subtle
bug in the waitAll case where we aren't decrementing the timeout after a
successful GPU wait. Since pthread_cond_timedwait already takes an
absolute timeout, it's also significantly simpler.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
It probably doesn't actually break anything but it does cause some
assertions in debug builds.
Fixes: 7a89a0d9ed "anv: Use separate MOCS settings for external BOs"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Now that it is just called once per draw (instead of once for binning
and once for draw), let's just inline it. If nothing else, it makes
perf-annotate easier to look at.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Historically this wasn't in fdN_emit_state(), because prior to addition
of blitter in a5xx, fdN_emit_state() was also used in the clear path.
These days that is only true for a2xx (a3xx and a4xx use u_blitter). So
the reason for it not to be in fd6_emit_state() no longer exists.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Noticed that with webgl (in chromium, at least) we end up generating a
lot of no-op submits just to get a fence. Tracking the last fence and
returning that if there is no rendering since last flush avoids this.
Signed-off-by: Rob Clark <robdclark@gmail.com>
The scissor maxx/maxy are non-inclusive, so don't subtract one from
framebuffer width and height.
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
We get a warning here for assigning a const char * pointer to
char *swizzle in struct ir2_src_register. The constructor strdups a 4
byte string here, so just memcpy to that instead.
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
In the pursuit of lowering driver overhead, it became clear that some
amount of redesign of how libdrm_freedreno constructs the submit ioctl
would be needed. In particular, as the gallium driver is starting to
make heavier use of CP_SET_DRAW_STATE state groups/objects, the over-
head of tracking cmd buffers and relocs becomes too much. And for
"streaming" state, which isn't ever reused (like uniform uploads) the
overhead of allocating/freeing ringbuffer[1] objects is too high.
This redesign makes two main changes:
1) Introduces a fd_submit object for tracking bos and cmds table
for the submit ioctl, making ringbuffer objects more light-
weight. This was previously done in the ringbuffer. But we
have many ringbuffer instances involved in a submit (gmem +
draw + potentially 1000's of state-group rbs), and only need
a single bos and cmds table. (Reloc table is still per-rb)
The submit is also a convenient place for a slab allocator for
ringbuffer objects. Other options would have required locking
because, while we can guarantee allocations will only happen on
a single thread, free's could happen either on the application
thread or the flush_queue thread. With the slab allocator in
the submit object, any frees that happen on the flush_queue
thread happen after we know that the application thread is done
with the submit.
2) Introduce a new "softpin" msm_ringbuffer_sp implementation that
does not use relocs and only has cmds table entries for IB1 (ie.
the cmdstream buffers that kernel needs to CP_INDIRECT_BUFFER
to from the RB). To do this properly will require some updates
on the kernel side, so whether you get the softpin or legacy
submit/ringbuffer implementation at runtime depends on your
kernel version.
To make all these changes in libdrm would basically require adding a
libdrm_freedreno2, so this is a good point to just pull the libdrm code
into mesa. Plus it allows for using mesa's hashtable, slab allocator,
etc. And it lets us have asserts enabled for debug mesa buids but
omitted for release builds. And it makes life easier if further API
changes become necessary.
At this point I haven't tried to pull in the kgsl backend. Although
I left the level of vfunc indirection which would make it possible
to have other backends. (And this was convenient to keep to allow
for the "softpin" ringbuffer to coexist.)
NOTE: if bisecting a build error takes you here, try a clean build.
There are a bunch of ways things can go wrong if you still have
libdrm_freedreno cflags.
[1] "ringbuffer" is probably a bad name, the only level of cmdstream
buffer that is actually a ring is RB managed by kernel. User-
space cmdstream is all IB1/IB2 and state-groups.
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
This reverts commit 0fa9e6d7b3. The real
issue appears to have been that HiZ ops don't like having WM thread
dispatch force-enabled. The previous commit fixes that problem so we
can go back to using the ForceThreadDispatchEnable bit even on SKL+.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Usually when a window is resized, the app calls d3d to resize the back
buffer to the window size. In some cases, it is not done,
and it expects the output resizes to the window size, even if
the back buffer size is unchanged.
This patch introduces the behaviour when a presentation buffer
is used.
ID3DPresent_GetWindowInfo is a function available with
D3DPresent v1.0, and thus we don't need to check if the
function is available.
The function had been introduced to implement this very
feature.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Windows drivers don't set this flag (which affects ff) to more than 8.
Do the same in case some games check for 8.
v2: Remove any dependence on MaxSimultaneousTextures. For non-ff
the number of textures is 16 when the device is able of vs/ps3.
Add this requirement of 16 textures to the driver requirements.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
We didn't implement shadow textures for ps 1.X,
assuming the case couldn't happen...
Well it does.
Fixes: https://github.com/iXit/Mesa-3D/issues/261
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
A lot of these states are used only for the context,
and are unused for stateblocks (which just uses the
changed.* fields instead for a lot of them).
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
If NINE_STATE_FF_MATERIAL is set, the stateblock will upload
its recorded materials matrix.
If NINE_STATE_FF_LIGHTING is set, the lighting set is uploaded.
These flags could be set by a NineDevice9_SetTransform call
or by setting some states related to ff, but that shouldn't trigger
these stateblock behaviours.
We don't need to follow the context states dirtied by render states.
NINE_STATE_FF_VSTRANSF is exactly the state controlling stateblock
updates of transformation matrices, NINE_STATE_FF is too broad.
These two changes avoid setting the two mentionned states when we
shouldn't.
Fixes: https://github.com/iXit/Mesa-3D/issues/320
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
The device state changed.* field are never used.
These fields are used only for stateblocks.
Avoid setting them at all for clarity.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
We avoid allocating space for never unused matrices.
However we must do as if we had captured them.
Thus when a D3DSBT_ALL stateblock apply has fewer matrices
than device state, allocate the default matrices for the stateblock
before applying.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
D3DSBT_ALL stateblocks capture the transform matrices.
Fixes some d3d test programs not displaying properly.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
While to the application we have to track
accurately all 256 world matrices (including
in stateblocks), hw vertex processing enables
to set a limit to the number of world matrices
the hardware can access to in the advertised caps,
which is 8 for nine.
Thus don't bother in the stateblock code to send
the updated values for the unreachable matrices.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
At some point the project was to adapt the
commented version to csmt.
The csmt rework enabled to fix some state aliasing
issues between stateblocks and internal state updates.
The commented version needs a lot of work to work with that.
Just drop it.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
This lets us get rid of a bunch of duplicated error messages.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
This reverts commit a5fd54f8bf.
The whole point was to add a way to pass -DVMX86_STATS to the build,
but we can do that with a command line argument when we invoke scons.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Follow the restriction of making sure the clear value is between the min
and max values defined in CC_VIEWPORT. Avoids a simulator warning for
some piglit tests, one of them being:
./bin/depthstencil-render-miplevels 146 d=z32f_s8
Jason found this to fix incorrect clearing on SKL.
Fixes: 09948151ab
("intel/blorp: Add the BDW+ optimized HZ_OP sequence to BLORP")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Jason Ekstrand <jason@jlekstrand.net>
MESA_GIT_SHA1 resolves to either an empty "" string if not build from git,
or " (git-DEADBEEF)" if it is. No need to wrap it in additional "()".
Fixes: 9d40ec2cf6 "radv: Add support for VK_KHR_driver_properties."
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Use utility function for converting h264 pipe video profile to profile idc,
instead of using array.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
Use utility function for converting h264 pipe video profile to profile idc,
instead of using array.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
Adding a function for converting h264 pipe video profile to profile idc
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
Previously, we would always pull the bit size from the destination which
is wrong for opcodes like nir_ilt where the sources are variable-sized
but the destination is a fixed size. We were getting lucky before
because nir_op_ilt returns a 32-bit value and basically everyone who
uses spec constants uses 32-bit ones.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
We have a helper that does exactly what the bany_inequal was doing. It
emits the same code but is a bit higher level and is designed to operate
on a bvec4.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
They do the same thing in the end but i2b is a bit simpler. Also, let's
clean up the mess of code for SSBO handling with one line of builder.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This isn't a great solution for bit-sizes but we don't have a
particularly convenient way to get a bit size from the system value enum
and this keeps the lowering pass from changing it.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>