Commit Graph

105518 Commits

Author SHA1 Message Date
Kristian H. Kristensen 4222fe8af2 freedreno/a2xx: Squash a compiler warning
We get a warning here for assigning a const char * pointer to
char *swizzle in struct ir2_src_register.  The constructor strdups a 4
byte string here, so just memcpy to that instead.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-10-26 18:10:00 -04:00
Kristian H. Kristensen 4fd6265f42 freedreno/a6xx: Use fd6_emit_ib from a6xx
Move it to a header and use it where possible to avoid vfunc call.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-10-26 18:10:00 -04:00
Rob Clark f3cc0d2747 freedreno: import libdrm_freedreno + redesign submit
In the pursuit of lowering driver overhead, it became clear that some
amount of redesign of how libdrm_freedreno constructs the submit ioctl
would be needed.  In particular, as the gallium driver is starting to
make heavier use of CP_SET_DRAW_STATE state groups/objects, the over-
head of tracking cmd buffers and relocs becomes too much.  And for
"streaming" state, which isn't ever reused (like uniform uploads) the
overhead of allocating/freeing ringbuffer[1] objects is too high.

This redesign makes two main changes:

 1) Introduces a fd_submit object for tracking bos and cmds table
    for the submit ioctl, making ringbuffer objects more light-
    weight.  This was previously done in the ringbuffer.  But we
    have many ringbuffer instances involved in a submit (gmem +
    draw + potentially 1000's of state-group rbs), and only need
    a single bos and cmds table.  (Reloc table is still per-rb)

    The submit is also a convenient place for a slab allocator for
    ringbuffer objects.  Other options would have required locking
    because, while we can guarantee allocations will only happen on
    a single thread, free's could happen either on the application
    thread or the flush_queue thread.  With the slab allocator in
    the submit object, any frees that happen on the flush_queue
    thread happen after we know that the application thread is done
    with the submit.

 2) Introduce a new "softpin" msm_ringbuffer_sp implementation that
    does not use relocs and only has cmds table entries for IB1 (ie.
    the cmdstream buffers that kernel needs to CP_INDIRECT_BUFFER
    to from the RB).  To do this properly will require some updates
    on the kernel side, so whether you get the softpin or legacy
    submit/ringbuffer implementation at runtime depends on your
    kernel version.

To make all these changes in libdrm would basically require adding a
libdrm_freedreno2, so this is a good point to just pull the libdrm code
into mesa.  Plus it allows for using mesa's hashtable, slab allocator,
etc.  And it lets us have asserts enabled for debug mesa buids but
omitted for release builds.  And it makes life easier if further API
changes become necessary.

At this point I haven't tried to pull in the kgsl backend.  Although
I left the level of vfunc indirection which would make it possible
to have other backends.  (And this was convenient to keep to allow
for the "softpin" ringbuffer to coexist.)

NOTE: if bisecting a build error takes you here, try a clean build.
There are a bunch of ways things can go wrong if you still have
libdrm_freedreno cflags.

[1] "ringbuffer" is probably a bad name, the only level of cmdstream
    buffer that is actually a ring is RB managed by kernel.  User-
    space cmdstream is all IB1/IB2 and state-groups.

Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-26 18:10:00 -04:00
Jason Ekstrand aa02d7e878 Revert "anv/skylake: disable ForceThreadDispatchEnable"
This reverts commit 0fa9e6d7b3.  The real
issue appears to have been that HiZ ops don't like having WM thread
dispatch force-enabled.  The previous commit fixes that problem so we
can go back to using the ForceThreadDispatchEnable bit even on SKL+.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-10-26 16:39:47 -05:00
Jason Ekstrand b6b2b27809 blorp: Emit a dummy 3DSTATE_WM prior to 3DSTATE_WM_HZ_OP
Cc: mesa-stable@lists.freedesktop.org
Suggested-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-10-26 16:39:35 -05:00
Axel Davy 2318ca68bb st/nine: Handle window resize when a presentation buffer is used
Usually when a window is resized, the app calls d3d to resize the back
buffer to the window size. In some cases, it is not done,
and it expects the output resizes to the window size, even if
the back buffer size is unchanged.

This patch introduces the behaviour when a presentation buffer
is used.

ID3DPresent_GetWindowInfo is a function available with
D3DPresent v1.0, and thus we don't need to check if the
function is available.
The function had been introduced to implement this very
feature.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy e50d374b61 d3dadapter: Fix wrong naming in header file
GetWindowInfo used to be GetWindowSize before gallium
nine was merged. A left-over remained...

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy 3d975e98e4 st/nine: Reduce MaxSimultaneousTextures to 8
Windows drivers don't set this flag (which affects ff) to more than 8.

Do the same in case some games check for 8.

v2: Remove any dependence on MaxSimultaneousTextures. For non-ff
the number of textures is 16 when the device is able of vs/ps3.
Add this requirement of 16 textures to the driver requirements.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy 739c700950 st/nine: Enable shadow mapping for ps 1.X
We didn't implement shadow textures for ps 1.X,
assuming the case couldn't happen...
Well it does.

Fixes: https://github.com/iXit/Mesa-3D/issues/261

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy 847861aab4 st/nine: Do not set unused states for stateblocks
A lot of these states are used only for the context,
and are unused for stateblocks (which just uses the
changed.* fields instead for a lot of them).

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy 6f373b9b74 st/nine: Fix aliasing states for stateblocks
If NINE_STATE_FF_MATERIAL is set, the stateblock will upload
its recorded materials matrix.
If NINE_STATE_FF_LIGHTING is set, the lighting set is uploaded.

These flags could be set by a NineDevice9_SetTransform call
or by setting some states related to ff, but that shouldn't trigger
these stateblock behaviours.

We don't need to follow the context states dirtied by render states.
NINE_STATE_FF_VSTRANSF is exactly the state controlling stateblock
updates of transformation matrices, NINE_STATE_FF is too broad.

These two changes avoid setting the two mentionned states when we
shouldn't.

Fixes: https://github.com/iXit/Mesa-3D/issues/320

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy 454201b452 st/nine: Never update device changed.* fields
The device state changed.* field are never used.
These fields are used only for stateblocks.

Avoid setting them at all for clarity.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy 2594b2efdc st/nine: Capture also default matrices for D3DSBT_ALL
We avoid allocating space for never unused matrices.
However we must do as if we had captured them.
Thus when a D3DSBT_ALL stateblock apply has fewer matrices
than device state, allocate the default matrices for the stateblock
before applying.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy bbeddb801e st/nine: Mark transform matrices dirty for D3DSBT_ALL
D3DSBT_ALL stateblocks capture the transform matrices.

Fixes some d3d test programs not displaying properly.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy a4e9bbb8f8 st/nine: Don't update unused world matrices
While to the application we have to track
accurately all 256 world matrices (including
in stateblocks), hw vertex processing enables
to set a limit to the number of world matrices
the hardware can access to in the advertised caps,
which is 8 for nine.

Thus don't bother in the stateblock code to send
the updated values for the unreachable matrices.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy 2e51c4c7cc st/nine: Remove two unused states.
NINE_STATE_MATERIAL was used incorrectly at one location.
Replace it with the correct state.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy cb8ea21e1c st/nine: Remove commented nine_context_apply_stateblock
At some point the project was to adapt the
commented version to csmt.

The csmt rework enabled to fix some state aliasing
issues between stateblocks and internal state updates.
The commented version needs a lot of work to work with that.
Just drop it.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Brian Paul 7e64e39f8b nir: Fix array initializer
Empty initializer is not standard C.  This fixes MSVC build.

Trivial.
2018-10-26 12:35:48 -06:00
Jason Ekstrand 07eb8e7466 anv: Return VK_ERROR_DEVICE_LOST from anv_device_set_lost
This lets us get rid of a bunch of duplicated error messages.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-26 13:27:21 -05:00
Jason Ekstrand ade22ae1ac anv/util: Split a vk_errorv helper out of vk_errorf
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-26 13:27:21 -05:00
Brian Paul d6be0b5556 scons/svga: remove opt from the list of valid build types
This reverts commit a5fd54f8bf.

The whole point was to add a way to pass -DVMX86_STATS to the build,
but we can do that with a command line argument when we invoke scons.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2018-10-26 12:09:00 -06:00
Nanley Chery 5bcf479524 intel/blorp: Define the clear value bounds for HiZ clears
Follow the restriction of making sure the clear value is between the min
and max values defined in CC_VIEWPORT. Avoids a simulator warning for
some piglit tests, one of them being:

./bin/depthstencil-render-miplevels 146 d=z32f_s8

Jason found this to fix incorrect clearing on SKL.

Fixes: 09948151ab
       ("intel/blorp: Add the BDW+ optimized HZ_OP sequence to BLORP")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-26 10:34:07 -07:00
Eric Engestrom 285ebc84c7 radv: remove duplicate brackets in version string
MESA_GIT_SHA1 resolves to either an empty "" string if not build from git,
or " (git-DEADBEEF)" if it is. No need to wrap it in additional "()".

Fixes: 9d40ec2cf6 "radv: Add support for VK_KHR_driver_properties."
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-26 18:33:11 +01:00
Eric Engestrom 738f0f789b vulkan: drop always-true param
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-26 18:33:11 +01:00
Boyuan Zhang f4126cfaab radeon/vcn: use util function to get h264 profile idc
Use utility function for converting h264 pipe video profile to profile idc,
instead of using array.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
2018-10-26 13:23:06 -04:00
Boyuan Zhang 55cf565698 radeon/vce: use util function to get h264 profile idc
Use utility function for converting h264 pipe video profile to profile idc,
instead of using array.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
2018-10-26 13:23:06 -04:00
Boyuan Zhang b15d0200a9 vl: get h264 profile idc
Adding a function for converting h264 pipe video profile to profile idc

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
2018-10-26 13:23:06 -04:00
Jason Ekstrand 5cdeefe057 intel/nir: Use the OPT macro for more passes
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-26 11:45:29 -05:00
Jason Ekstrand 18fb2c5d92 spirv: Initialize subgroup destinations with the destination type
Instead of initializing them manually, just use the type that we already
have sitting there.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-26 11:45:29 -05:00
Jason Ekstrand 8fa70cfcfd spirv: Use the right bit-size for spec constant ops
Previously, we would always pull the bit size from the destination which
is wrong for opcodes like nir_ilt where the sources are variable-sized
but the destination is a fixed size.  We were getting lucky before
because nir_op_ilt returns a 32-bit value and basically everyone who
uses spec constants uses 32-bit ones.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-26 11:45:29 -05:00
Jason Ekstrand 1d2ed694c1 nir/prog: Use nir_bany in kill handling
We have a helper that does exactly what the bany_inequal was doing.  It
emits the same code but is a bit higher level and is designed to operate
on a bvec4.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand 2fe3031440 glsl/nir: Use i2b instead of ine for fixing UBO/SSBO Booleans
They do the same thing in the end but i2b is a bit simpler.  Also, let's
clean up the mess of code for SSBO handling with one line of builder.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand 5bfce5fcc2 nir/system_values: Use the bit size from the load_deref
This isn't a great solution for bit-sizes but we don't have a
particularly convenient way to get a bit size from the system value enum
and this keeps the lowering pass from changing it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand a3b4cb3458 nir/opt_if: Rework condition propagation
Instead of doing our own constant folding, we just emit instructions and
let constant folding happen.  This is substantially simpler and lets us
use the nir_imm_bool helper instead of dealing with the const_value's
ourselves.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-26 11:45:29 -05:00
Jason Ekstrand 4cd8a58595 nir/search: Use the nir_imm_* helpers from nir_builder
This requires that we rework the interface a bit to use nir_builder but
that's a nice little modernization anyway.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand 6e32115bd6 nir/builder: Handle 16-bit floats in nir_imm_floatN_t
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand ff45649bc2 nir/builder: Add a nir_imm_true/false helpers
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand 249e32ab17 nir/constant_folding: Use nir_src_as_bool for discard_if
Missed one while converting to the nir_src_as_* helpers.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand 6de1869e86 nir/constant_folding: Add an unreachable to a switch
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand 28bb6abd1d nir/validate: Print when the validation failed
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand 292ebdbf98 anv: Handle the device loss abort in anv_device_set_lost
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-26 08:40:23 -05:00
Jason Ekstrand cd0960b430 anv: Add helpers for setting/checking device lost
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-26 08:40:21 -05:00
Jason Ekstrand 319ff6f1ad anv: Provide a error message with a DEVICE_LOST
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-26 08:40:10 -05:00
Alex Smith 3bd239f71d anv: Fix sanitization of stencil state when the depth test is disabled
When depth testing is disabled, we shouldn't pay attention to the
specified depthCompareOp, and just treat it as always passing. Before,
if the depth test is disabled, but depthCompareOp is VK_COMPARE_OP_NEVER
(e.g. from the app having zero-initialized the structure), then
sanitize_stencil_face() would have incorrectly changed passOp to
VK_STENCIL_OP_KEEP.

v2: Roll the depthTestEnable check into the ds_aspect check below since
    they now both do the same thing.

Fixes: 028e1137e6 "anv/pipeline: Be smarter about depth/stencil state"
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-26 10:25:40 +01:00
Samuel Pitoiset 79bbdf8e45 radv: implement image to image operations for R32G32B32
This should address the remaining failures in Batman Arkhman City.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107765
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-26 10:50:08 +02:00
Samuel Pitoiset 6198245775 radv: fix a comment in radv_meta_buffer_to_image_cs_r32g32b32()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-26 10:50:05 +02:00
Samuel Pitoiset 02ccef7874 radv: add get_image_stride_for_r32g32b32() helper
For the special R32G32B32 paths.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-26 10:50:03 +02:00
Samuel Pitoiset 468c33e2f7 radv: add create_bview_for_r32g32b32() helper
For the special R32G32B32 paths.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-26 10:50:00 +02:00
Samuel Pitoiset e60e3e1b3f radv: add create_buffer_from_image() helper
For the special R32G32B32 paths.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-26 10:49:58 +02:00
Sagar Ghuge 416abe809a intel/compiler: Print message descriptor as immediate source
While disassembling send(c) instruction print message descriptor as
immediate source operand along with message descriptor. This allows
assembler to read immediate source operand and set bits accordingly.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-10-26 06:42:14 +02:00