Commit Graph

2278 Commits

Author SHA1 Message Date
Philip Rebohle 293551dc8d
[d3d10] Fix winelib build
There's no IID_PPV_ARGS on wine.
2019-05-15 21:42:42 +02:00
Philip Rebohle 7d9a75c82c
[dxbc] Use subgroup operations for atomic append/consume operations
Reduces the number of atomic operations performed per subgroup to 1.
2019-05-15 19:32:27 +02:00
Philip Rebohle dfa3caa946
[spirv] Add OpUndef and more subgroup instructions 2019-05-15 19:31:43 +02:00
Philip Rebohle d94d89c3ef
[dxbc] Add option to use subgroup ops for atomic counter operations
This can greatly reduce the number of atomic operations when using
append/consume buffers.
2019-05-15 18:49:02 +02:00
Philip Rebohle 78ab26347d
[d3d10] Add static method implementing D3D10CreateDeviceAndSwapChain1
Same as the D3D11 change to make ReShade happy.
2019-05-15 17:18:05 +02:00
Philip Rebohle 8cae607db0
[d3d11] Add static method implementing D3D11CreateDeviceAndSwapChain
ReShade requires this as it hooks both D3D11CreateDevice and *AndSwapChain,
which means that we can't call D3D11CreateDevice without entering infinite
recursion. Fixes #1057.

Suggested-by: Riesi <riesi@opentrash.com>
2019-05-15 16:46:48 +02:00
Philip Rebohle 192310d481
[util] Don't use `if constexpr`
Fixes compilation on GCC 6.3.
2019-05-15 03:18:23 +02:00
Philip Rebohle b3f61936d2
[dxvk] Don't align pushg constant data to 64 bytes
Causes binaries compiled with GCC 6.3 to crash on device creation.
2019-05-15 03:07:05 +02:00
Philip Rebohle 9c93ca451d
[dxvk] Apply view swizzles to image clears
Fixes tone mapping in Yakuza Kiwami 2, which uses an A8_UNORM
render target and clears the alpha component to 1.
2019-05-14 21:21:29 +02:00
Philip Rebohle 0b61901424
[dxvk] Add method to swizzle clear color values 2019-05-14 21:21:14 +02:00
Philip Rebohle 3efec8960c
[dxvk] Pass clear value to clearRenderTarget by value 2019-05-14 21:19:56 +02:00
Philip Rebohle 8784ed673b
[d3d11] Use private references for render targets
Matches Windows behaviour and fixes a crash in Yakuza Kiwami 2,
which calls Release() on RTVs and DSVs until the public reference
count reaches zero. Close #1053.
2019-05-14 15:22:24 +02:00
Philip Rebohle 61b97e5dd1
[util] Add support for private references in Com<...> wrapper 2019-05-14 15:20:27 +02:00
Philip Rebohle 54d3103b04
[util] Fix COM reference count type
On Windows, ref counts are only 32 bits wide.
2019-05-14 14:48:42 +02:00
Philip Rebohle 26602b296f
[meta] Release 1.2 2019-05-13 20:40:11 +02:00
Entryhazard 8c2709a1c6 Changed visibility of the winelib build to behave more like MinGW 2019-05-13 20:37:57 +02:00
Philip Rebohle 7d6f78182b
[dxvk] Don't use ALL_COMMANDS_BIT to notify events 2019-05-09 18:07:49 +02:00
Philip Rebohle 2c45eb79c4
[dxvk] Increase number of queued command buffers to 12
Might help avoid stalls in some edge cases.
2019-05-09 18:07:38 +02:00
Philip Rebohle a54548dae9
[d3d11] Flush more aggressively when CPU bound
Submitting GPU work early is especially important if there is
a CPU<>GPU synchronization point somewhere.
2019-05-09 18:04:36 +02:00
Philip Rebohle 45be1dfb53
[d3d11] Flush more aggressively on stalling Event queries
Increases GPU utilization in Quake Champions.
2019-05-09 18:04:36 +02:00
Philip Rebohle af45f810b2
[dxvk] Change flushing behaviour of immediate context methods
Should fix some inappropriate flushing, while flushing more
aggressively on render target changes.

We still keep the flush on UpdateSubresource since some games
use it to update large quantities of data.
2019-05-09 18:04:36 +02:00
Philip Rebohle dcd75a4f09
[util] Optimize popcnt operation 2019-05-09 18:04:33 +02:00
Philip Rebohle a1feaa6748
[dxvk] Add aspect mask parameter to clearImageView 2019-05-09 09:10:06 +02:00
Philip Rebohle 1fb8b5ec69
[dxvk] Begin render pass in clearImageViewFb if necessary
Otherwise, we might end up calling vkCmdClearAttachments outside
a render pass instance, which is invalid.
2019-05-09 09:09:11 +02:00
Philip Rebohle 0d40c20aef
[dxvk] Compact vertex buffer bindings
This way, we can always update all vertex buffer bindings with one
single API call, without having to deal with any gaps in binding IDs.

The previous optimization triggers a bug in some drivers when no
vertex buffer is ever bound to a given binding point, and may also
trigger inefficient behaviour if the binding range is assumed to
be contiguous.
2019-05-08 03:37:49 +02:00
Philip Rebohle 644f33a82b
[dxvk] Optimize unbound vertex buffer handling
We can actually just set the stride to 0 when binding a null
buffer, so that we can avoid all the runtime tracking.
2019-05-08 00:52:30 +02:00
Philip Rebohle 8029712aa4
[dxvk] Fix unbound vertex buffer handling
Bit of a brain fart there; we can't just change the meaning
of bindingMask since it indirectly affetcs binding strides.
2019-05-08 00:27:13 +02:00
Philip Rebohle fb70de8852
[dxvk] Optimize vertex buffer binding
If there are gaps in the binding numbers, we don't want to
create overhead by iterating over unused bindings.
2019-05-08 00:21:35 +02:00
Sam Fomenko 078bc27b14 [util] Disable NvAPI hack for Mirror's Edge Catalyst
#893
2019-05-07 22:34:56 +02:00
Philip Rebohle 9355580c4f
[hud] Optimize HUD rendering
Saves some bandwidth by using more compact vertex formats, and
by using push constants for text colors instead of a vertex
attribute.
2019-05-07 22:05:35 +02:00
Philip Rebohle 02768182f1
[dxvk] Implement push constant API 2019-05-07 20:51:27 +02:00
Philip Rebohle 8931013234
[dxvk] Add push constant range info to shaders and pipeline layout 2019-05-07 20:51:18 +02:00
Philip Rebohle d5b2c2fd23
[dxvk] Pass slot mapping to pipeline layout constructor
We're not getting the info from any other source anyway.
2019-05-07 20:36:18 +02:00
Philip Rebohle 0224dbc371
[dxvk] Optimize spinlock implementation 2019-05-07 13:38:02 +02:00
Philip Rebohle 584fd870b2
[dxvk] Bump state cache version to v5 2019-05-06 03:15:45 +02:00
Philip Rebohle 37f1087783
[dxvk] Add API for specialization constants 2019-05-06 03:15:45 +02:00
Philip Rebohle 7687db0303
[dxvk] Remove extra pipeline state
This can be expressed with specialization constants now.
2019-05-06 00:18:59 +02:00
Philip Rebohle a0c67191a7
[d3d11] Implement depth bounds extension 2019-05-06 00:08:58 +02:00
Philip Rebohle 3867270812
[d3d11] Implement MultiDrawIndirectCount extension 2019-05-06 00:08:58 +02:00
Philip Rebohle 492b7db07b
[d3d11] Support count buffer in Set|BindDrawBuffers 2019-05-06 00:08:58 +02:00
Philip Rebohle 117b7b1ba1
[d3d11] Implement MultiDrawIndirect extension 2019-05-06 00:08:58 +02:00
Philip Rebohle 9e57b03e64
[d3d11] Implement barrier control extension 2019-05-06 00:08:58 +02:00
Philip Rebohle 04bef3c67a
[d3d11] Add stub implementation of D3D11DeviceExt 2019-05-06 00:08:58 +02:00
Philip Rebohle 1cd8749234
[d3d11] Add stub implementation of D3D11DeviceContextExt 2019-05-06 00:08:58 +02:00
Philip Rebohle edbbdef787
[d3d11] Add interfaces to support D3D11 extensions 2019-05-06 00:08:57 +02:00
Philip Rebohle 8a3044a342
[dxvk] Implement depth bounds test in backend 2019-05-06 00:08:57 +02:00
Philip Rebohle 5ad212d279
[dxvk] Introduce new pipeline state to enable depth bounds test 2019-05-06 00:08:57 +02:00
Philip Rebohle bacb1f7c60
[dxvk] Implement indirct draw commands with indirect count 2019-05-06 00:08:57 +02:00
Philip Rebohle 13359d68d7
[dxvk] Enable VK_KHR_draw_indirect_count if available
Useful to implement a corresponding D3D11 extension.
2019-05-06 00:08:57 +02:00
Philip Rebohle 66b6b50af6
[dxvk] Fix stale vertex attribute divisor
Not resetting this may result in unnecessary state cache misses.
2019-05-05 23:18:13 +02:00
Philip Rebohle b35f3c14df
[dxvk] Off-load command buffer submission to separate thread
Reduces load on the CS thread a bit, which may yield a small
performance improvement.
2019-05-05 16:49:17 +02:00
Robin 4c0c66892a [d3d11] Fix MSVC 2017 compilation 2019-05-04 22:14:28 +02:00
Philip Rebohle 37f9a7ffff
[meta] Release 1.1.1 2019-05-04 15:59:18 +02:00
Philip Rebohle f733d082f4
[d3d11] Implement D3D11DeviceContext::SwapDeviceContextState 2019-05-04 15:57:57 +02:00
Philip Rebohle 82c6a5eb1a
[d3d11] Implement D3D11Device::CreateDeviceContextState 2019-05-04 15:57:57 +02:00
Philip Rebohle c1929ccb6f
[d3d11] Add class to implement D3DDeviceContextState 2019-05-04 15:57:55 +02:00
Philip Rebohle 81229d66cc
[d3d10] Explicitly define GUID for ID3D10StateBlock
Fixes linker errors when building against winelib.
2019-05-03 20:32:52 +02:00
Philip Rebohle 53aa27336b
[dxvk] Disable depthWriteEnable if depth attachment has read-only layout
Fixes water rendering in SpellForce 3.
2019-05-03 14:38:21 +02:00
Philip Rebohle 6eeb3b6da9
[dxvk] Add helper function to get info about depth-stencil image layouts 2019-05-03 14:37:59 +02:00
Philip Rebohle 81821414b0
[d3d10] Implement state blocks
Improves compatibility to Wine's Direct2D implementation.
2019-05-03 13:08:57 +02:00
Joshua Ashton 7aecd46f93 [spirv] Implement proj sample variants 2019-05-03 08:34:16 +02:00
Philip Rebohle f503ba4c8b
[d3d11] Fix counter value offset in DrawAuto
According to the newly released D3D11.3 functional specification,
we're supposed to subtract the offset of the slot 0 vertex buffer
binding from the counter value.
2019-05-02 16:03:52 +02:00
Philip Rebohle 343818cf1c
[d3d11] Always enable shaderStorageImageWriteWithoutFormat
We compile some compute shaders that need it in FL10/FL9 games.
2019-05-02 08:08:45 +02:00
Philip Rebohle e6eef1d1ec
[d3d11] Minor Map/Unmap optimizations
Avoid unnecessary LockContext call when unmapping a buffer.
This may actually improve performance if the context has
multithreaded protection enabled (e.g. D3D10).
2019-05-01 03:01:36 +02:00
Philip Rebohle f76fd8fa5d
[d3d11] Minor CPU savings 2019-05-01 03:00:23 +02:00
Philip Rebohle 7e1b0ef8e6
[dxvk] Bump state cache format to version 4
Accomodates for the alpha test changes in D3D11.
2019-05-01 01:57:34 +02:00
Philip Rebohle 93bd923c17
[d3d11] Set up extra state for the HUD renderer 2019-05-01 01:57:34 +02:00
Philip Rebohle 9fc09c843d
[d3d11] Set up unused extra state for the backend correctly 2019-05-01 01:57:34 +02:00
Philip Rebohle fc52c1720d
[dxvk] Remove old spec constant code 2019-05-01 01:57:34 +02:00
Philip Rebohle 5714f18d15
[dxvk] Use new specialization constant code for graphics pipelines 2019-05-01 01:57:34 +02:00
Philip Rebohle c3f7dfd197
[dxvk] Use new specialization constant code for compute pipelines 2019-05-01 01:57:34 +02:00
Philip Rebohle 433c707888
[dxvk] Add new structure to generate specialization constant info
We should avoid passing redundant and unused data to the driver, as
that may interfere with caching. It also adds a lot of unnecessary
data to traces.
2019-05-01 01:57:34 +02:00
Philip Rebohle 0a77ebbeaf
[dxgi] Change default of s_gamma_bound to true
In the future, we'll assume true and only pass the spec constant
value to the driver if their value is actually different from the
default, but this requires reliable defaults.
2019-05-01 01:57:34 +02:00
Philip Rebohle 04e6479690
[dxbc] Remove old spec constant code 2019-05-01 01:57:34 +02:00
Philip Rebohle dc3cfc9fa0
[dxbc] Use new spec constant API for rasterizer sample count 2019-05-01 01:57:34 +02:00
Philip Rebohle 7111af423d
[dxbc] Add new emitNewSpecConstant method
Convenience method to declare new specialization constants.
2019-05-01 01:57:34 +02:00
Philip Rebohle a340b3101c
[d3d11] Add missing interface queries for IDXGIObject and IDXGIDeviceSubObject 2019-05-01 01:54:00 +02:00
Eero Kelly b92dc14c4d [util] Enable constant buffer range check for NieR:Automata
This fixes a graphical corruption issue where Operator 6O's portrait
displays as a large flashing circle due to uninitialized buffer reads.

Fixes issue: https://github.com/ValveSoftware/Proton/issues/1543

Reviewed-by: Liam Middlebrook <lmiddlebrook@nvidia.com>
2019-04-30 16:08:52 +02:00
Philip Rebohle dc52212873
[dxbc] Disable Constant Buffer Range Check on AMD
The hardware already behaves as intended, no need to waste GPÜ cycles.
2019-04-30 16:07:24 +02:00
Philip Rebohle 3b1e753bb5
[dxvk] Re-implement early discard with quad granularity
May perform better on some hardware in situations where we cannot
discard a full subgroup. Closes #753.
2019-04-30 14:46:03 +02:00
Danylo Piliaiev 4dd68987d6 [d3d11] Check if uav's counter slice is defined in CopyStructureCount
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
2019-04-30 12:34:27 +02:00
Danylo Piliaiev 261d31cac6 [dxbc] Fix xfb passthrough for system values
vReg should be always allocated for system values which is
necessary for emitXfbOutputSetup and is already done for
hull shader passthrough.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
2019-04-30 12:19:21 +02:00
Philip Rebohle 2afe5ec141 [d3d11] Clean up rasterizer state initialization
The error messages are pointless since all of this is already
handled in NormalizeDesc.
2019-04-29 16:22:42 +02:00
Philip Rebohle 06c144f075 [dxbc] Store sample positions as vec2 array
We can append the zeroes in shader code instead. May
improve generated code on drivers that use scratch
memory or temporary uniform buffers for large arrays.
2019-04-29 14:04:42 +02:00
Philip Rebohle 8f5338b1d1 [spirv] Add constvec2f32 helper
We should probably replace this with a proper template at some point.
2019-04-29 13:47:15 +02:00
Philip Rebohle b14ad7b30c [dxbc] Remove some old TODOs
This is already implemented properly.
2019-04-29 11:48:09 +02:00
Philip Rebohle 04152055b7 [dxbc] Correctly report a sample count of 0 for unbound images 2019-04-29 11:48:09 +02:00
Philip Rebohle a3e0157ab0 [dxbc] Correctly report a size of 0 for unbound images 2019-04-29 11:48:09 +02:00
Philip Rebohle 18cbaefdcb [dxbc] Correctly report a size of 0 for unbound buffers 2019-04-29 11:48:09 +02:00
Philip Rebohle ad5da34c57 [dxbc] Fix typo in sample positions 2019-04-29 11:48:09 +02:00
Philip Rebohle 2c61303976 [d3d11] Implement IDXGIResource1 for textures and buffers 2019-04-27 20:21:54 +02:00
Philip Rebohle 54592b7852 [d3d11] Add basic implementation of IDXGIResource1
We don't support resource sharing and subresource surfaces
yet, but the interface should at least be present.
2019-04-27 20:21:47 +02:00
Philip Rebohle b7a5f11121 [dxgi] Add missing DXGI_CPU_ACCESS_* defines 2019-04-27 19:13:34 +02:00
Philip Rebohle ac79f69a10 [d3d11] Pass texture as D3D11Resource to DXGI interop objects 2019-04-27 16:25:55 +02:00
Philip Rebohle d1a019a043 [d3d11] Implement Map / Unmap for IDXGISurface2 2019-04-27 16:17:50 +02:00
Philip Rebohle af15aa0c32 [d3d11] Implement IDXGISurface2 for compatible 2D textures
Required by SpellForce 3. Fixes #1031.
2019-04-27 15:35:20 +02:00
Philip Rebohle ed0719432d [dxvk] Add support for per-app configuration
Feature request by Alexandr Oleynikov (@tannisroot).
2019-04-26 19:17:32 +02:00
Philip Rebohle e2ebfa9012 [dxvk] Add some 'unlikely' statements 2019-04-26 17:52:53 +02:00
Philip Rebohle 9f264ba008 [d3d11] Remove predication workaround for RADV
This no longer has any effect due to changes in the driver, and
we no longer support Predication anyway.
2019-04-25 18:29:13 +02:00
Joshua Ashton 60827c1b22 [d3d11] Improve CreatePredicate logging 2019-04-25 13:14:23 +02:00
Philip Rebohle 4eff83bdee [d3d11] Disable Predication support
Doesn't work at all in the few games that use it.
2019-04-25 11:55:40 +02:00
Philip Rebohle cd63cebc63 [dxvk] Simplify validateGraphicsState 2019-04-24 23:16:52 +02:00
Philip Rebohle 981ea547f9 [d3d11] Don't use presentation fence 2019-04-23 20:14:34 +02:00
Philip Rebohle 81f6ccb1be [d3d11] Select sync event based on back buffer count
May improve frame pacing in some games.
2019-04-23 20:14:30 +02:00
Philip Rebohle 4cc35da3b2 [d3d11] Allocate one additional swap chain image
DXGI's BufferCount apparently only counts back buffers,
while there's an implicit front buffer.
2019-04-23 20:12:29 +02:00
Philip Rebohle 2245aada03
[dxvk] Use stricter barriers around meta operations
Fixes some rendering issues on AMDVLK in some situations.
2019-04-19 11:41:12 +02:00
Philip Rebohle 95bfac84f1
[dxvk] Support image sub-regions for resolve operations
Required for legacy graphics APIs.
2019-04-19 11:41:12 +02:00
Joshua Ashton a3966b442b [dxvk] Add adapterCount function to DXVKInstance.
Will be needed in the Direct3D9 interface as you can query the number of adapters.
2019-04-18 23:18:15 +02:00
Joshua Ashton 0c34fae9c7 [spirv] Implement constvec3f32 2019-04-18 23:18:02 +02:00
Joshua Ashton a770c73dbc [spirv] Implement opVectorTimesScalar 2019-04-18 23:18:02 +02:00
Joshua Ashton f1a8e02e0f [spirv] Implement opPow 2019-04-18 23:18:02 +02:00
Philip Rebohle 94beec0c13
[dxvk] Fix subresources in barriers for 2D views of 3D images
The array layers passed during framebuffer creation are the selected
3D slices in this case, the image actually only has one array layer.
We should account for that when recording barriers.
2019-04-18 12:08:27 +02:00
Philip Rebohle b44cad4d32
[dxbc] Replace computeResourceSlotId by light-weight alternatives
Slightly reduces overhead of D3D11 binding methods.
2019-04-18 10:06:15 +02:00
Philip Rebohle 044e3967e7
[hud] Show compiler activity indicator for at least one second
Otherwise this would flicker when shaders are already cached.
2019-04-15 12:42:07 +02:00
Philip Rebohle 35a2a02714
[dxbc] Do not emit GS system values if rasterization is disabled
Fixes issue in Star Citizen, which declares a max output vertex count
of 128 in a geometry shader which emits eight components per vertex,
which becomes 12 components in DXVK due to the gl_Position builtin.
This should keep us below the magic limit of 1024 output components.
2019-04-15 09:00:46 +02:00
Philip Rebohle f9e56c97cf
[d3d11] Fix hasing of geometry shaders with stream output
The xfb struct contains pointers, but we should hash the
strings instead, otherwise the hash changes between runs.
2019-04-15 03:48:31 +02:00
Philip Rebohle ca717eeb62
[d3d11] Track query state correctly
Not sure if any game actually needs this, but we should avoid
sending bogus commands to the backend when the app sends bogus
commands to us.
2019-04-14 16:27:15 +02:00
Philip Rebohle 364ae7270d
[d3d11] Don't allocate predicate for unsupported predicates 2019-04-14 14:26:56 +02:00
Philip Rebohle 7dc449ac55
[hud] Add new HUD entry to show shader compiler activity 2019-04-14 13:28:57 +02:00
Philip Rebohle 8b84d002f8
[hud] Pass surface size to HUD renderer 2019-04-14 13:28:57 +02:00
Philip Rebohle bb01318984
[dxvk] Add stat counter for shader compiler activity 2019-04-14 13:28:57 +02:00
Philip Rebohle 2c0ddbd072
[util] Enable D3D11_MAP_FLAG_DO_NOT_WAIT for Anno 1800
Removes a sync point and almost doubles performance as a result.
2019-04-12 10:52:25 +02:00
Philip Rebohle adc447cc9f
[dxvk] Increase query pool sizes
Many games create a very large number of occlusion queries, and
we shouldn't create more pools than necessary.
2019-04-08 01:51:38 +02:00
Philip Rebohle 7018db3614
[dxvk] Implement shader-based resolve
Resolve attachments are currently too broken on most drivers,
so we cannot really rely on them.
2019-04-07 21:07:25 +02:00
Philip Rebohle ea5dcd5b14
[dxvk] Re-implement class to create meta-resolve objects
This time with specialization constants so that we don't have
to read the tetxure's sample count from the descriptor.
2019-04-07 21:07:25 +02:00
Philip Rebohle addbae585f
[dxvk] Enable VK_AMD_shader_fragment_mask if available 2019-04-07 21:07:25 +02:00
Philip Rebohle 14593baebd
[dxvk] Add new resolve shaders 2019-04-07 21:07:21 +02:00
Philip Rebohle 56300ff9b7
[d3d11] Allocate mapped buffers for staging images on cached memory
These will most likely be used for reading, so we should put them
on a memory type which allows reading.
2019-04-07 14:47:43 +02:00
Philip Rebohle 51f229530b
Revert "[d3d11] Select memory type based on CPU access flags"
This reverts commit 6c8042033e.

Batman: Arkham City doesn't set the CPU access flags correctly
for some images it maps for reading, and breaks on Nvidia as a
result.
2019-04-07 14:42:01 +02:00
Philip Rebohle 1da7b1e87c
Revert "[meta] Release 1.1"
This reverts commit a696f69ec2.
2019-04-07 10:13:18 +02:00
Philip Rebohle e901d9d149
[dxgi] Fix broken gamma with combined image samplers
Fixes #1003.
2019-04-07 09:55:54 +02:00
Philip Rebohle a696f69ec2
[meta] Release 1.1 2019-04-06 16:26:21 +02:00
Philip Rebohle f6bdb7bb63
[dxvk] Fix circular reference between DxvkDevice and DxvkGpuQueryPool 2019-04-06 12:31:20 +02:00
pchome 3eb9f35fc3 [build] Use `generator` to produce resource files 2019-04-06 11:33:45 +02:00
Philip Rebohle aa45b3cc31
[dxvk] Fix build failure for some people
Why am I the only one who never has any issues with this?
2019-04-06 10:10:29 +02:00
Sveinar Søpler 4f9dd8d3d0
[build] Add version info to compiled DLLs
Fixes #980.
2019-04-05 21:09:57 +02:00
Philip Rebohle b89646584b
[util] Enable constant buffer range check for Dark Souls Remastered ans Grim Dawn 2019-04-05 20:56:32 +02:00
Philip Rebohle 5819a69302
[d3d11] Add option to enable constant buffer range checks 2019-04-05 20:56:32 +02:00
Philip Rebohle 9b99c55a2e
[dxbc] Implement optional constant buffer range check 2019-04-05 20:56:29 +02:00
Philip Rebohle b9bfbb9ccc
[dxvk] Fix move constructor of DxvkShaderModule 2019-04-04 16:10:44 +02:00
Philip Rebohle da4baefdf0
[spirv] Fix initial allocation size for compressed buffer
The old initial size was still for uint8.
2019-04-04 13:15:59 +02:00
Philip Rebohle ac3cd0b688
[dxvk] Store compressed shader modules in DxvkShader
Reduces the amount of memory used to store shaders to
around ~45%-50% of the original size.
2019-04-04 13:00:31 +02:00
Philip Rebohle f49863f321
[dxvk] Store enabled SPIR-V capabilities explicitly 2019-04-04 13:00:31 +02:00
Philip Rebohle f32200b668
[spirv] Implement in-memory compression for shader modules 2019-04-04 13:00:31 +02:00
Philip Rebohle d2395180af
[util] Add helpers to pack/unpack data to/from larger units 2019-04-04 13:00:31 +02:00
Liam Middlebrook 9d26031dcb [dxvk] Zero-Initialize SpecConstantData
Ensure that specialization constant data passed into the driver is
zero-initialized.

Having the pData field in VkSpecializationInfo be zero-initialized helps
to create more deterministic input to the driver, which is particularly
useful when debugging shader issues.
2019-04-03 23:21:12 +02:00
Philip Rebohle cd93ba570e
[dxvk] Simplify DxvkShaderModule
This is merely a wrapper for a VkShaderModule now, so it really
doesn't need anything fancy and definitely doesn't need to be
heap-allocated.
2019-04-03 20:47:58 +02:00
Philip Rebohle 2bd09e52e7
[dxvk] Don't cache shader modules for graphics pipelines
We're only ever going to need those when actually compiling a new
pipeline, so on average we're just wasting large amounts of memory
by keeping them around.

Trades several hundred MB of memory for a small increase in compile
times. Creating shader modules is typically very cheap.
2019-04-03 20:01:36 +02:00
Philip Rebohle 79e867624a
[dxvk] Don't cache shader modules for compute pipelines 2019-04-03 19:46:28 +02:00
Philip Rebohle 632b254714
[d3d11] Use combined image sampler descriptors for the presenter 2019-04-03 17:40:05 +02:00
Philip Rebohle 257c19ed0a
[hud] Use combined image s1ampler for the font texture 2019-04-03 17:40:05 +02:00
Philip Rebohle ddde5ee6c2
[dxvk] Support combined image sampler descriptors in the backend 2019-04-03 17:40:05 +02:00
Chip Davis 910e1a1835 Only try once to recreate surfaces on surface loss. 2019-04-02 17:26:48 +02:00
Chip Davis 540900b792 [vulkan] Don't loop endlessly on a lost surface.
If the surface is lost in a way that can't be recovered by recreating
the surface from the window, the previous change would wind up looping
forever. Just retry 5 times before giving up.
2019-04-02 17:26:48 +02:00
Chip Davis e633dbc06f [vulkan] Recreate the surface on surface loss.
According to the Vulkan spec:

> Several WSI functions return `VK_ERROR_SURFACE_LOST_KHR` if the
> surface becomes no longer available. After such an error, the surface
> (and any child swapchain, if one exists) **should** be destroyed, as
> there is no way to restore them to a not-lost state.

So if we get this error, we need to recreate the surface in addition to
the swapchain.
2019-04-02 17:26:48 +02:00
Philip Rebohle b5f859915a
[dxvk] Properly reset global barrier access flags
Fixes: adaf98bb9d
2019-04-02 15:05:44 +02:00
Philip Rebohle 6da02c6f56
[dxvk] Fix write access flag for barriers
Fixes: adaf98bb9d
2019-04-02 15:01:37 +02:00
Philip Rebohle adaf98bb9d
[dxvk] Use global memory barrier instead of resource barriers if possible
Hardware doesn't support this type of fine-grained synchronization
anyway, so we really don't need the driver to iterate over anything
up to hundreds of structs - except for layout transitions.
2019-04-02 14:48:39 +02:00
Philip Rebohle 67b9b6e1e1
[dxvk] Pull buffer updates out of render passes whenever possible
Instead of ending the render pass and inserting two barriers, we
perform the update and barrier in a dedicated command buffer.

Improves performance in Sekiro by 5-10% depending on resolution and scene.
2019-04-02 13:17:05 +02:00
Philip Rebohle e59f53abfa
[dxvk] Allow barriers to be recorded into a specific command buffer 2019-04-02 12:14:15 +02:00
Philip Rebohle 2315d55ecc
[dxvk] Rename DxvkCmdBufferFlag -> DxvkCmdBuffer 2019-04-02 12:10:47 +02:00
Philip Rebohle e395712de7
[dxvk] Add missing feature check for conditional rendering 2019-04-02 04:13:23 +02:00
Philip Rebohle 295d583c1d
[d3d11] Lazily allocate predicate on SetPredication
Many games use CreatePredicate to create occlusion queries without
actually using predication, and we don't want to pay any runtime
cost for this when predicates aren't actually being used.
2019-04-02 04:07:05 +02:00
Philip Rebohle 87dc472a8d
[dxvk] Set empty scissor rect when the app requests empty viewport
Since we cannot set the viewport size to zero, we should set an
empty scissor rect so that rasterization is still effectively
disabled for the given viewport index.

Fixes #813, #957.
2019-04-01 15:45:41 +02:00
Philip Rebohle 8702374bf7
[dxvk] Do not invalidate iterator before disabling queries
Reported-by: Joshua Ashton <joshua@froggi.es>
2019-04-01 02:58:02 +02:00
Philip Rebohle 70510bab9a [dxvk] Introduce extra pipeline state
Provides extra state that will be passed in via spec constants.
Whether or not this state is used is determined by the shaders.
2019-04-01 02:31:32 +02:00
Philip Rebohle 18d2905bf7 [dxvk] Remove unused alphaToOne state
Nothing supports this anyway, so no reason to carry it around.
2019-04-01 02:31:22 +02:00
Marin Baron 1c434d86cb [util] Enable deferred surface creation for "Dissidia Final Fantasy NT Free Edition".
Avoid white screen, "D3D11Device: No such vertex shader semantic: COLOR0"...
https://www.reddit.com/r/archlinux/comments/b7e38x/protondxvk_dissidia_nt/
2019-03-31 03:27:05 +02:00
Philip Rebohle a646f8cf2c
[util] Enable deferred surface creation for Nioh
See discussion in #284.
2019-03-29 08:49:37 +01:00
Chip Davis d741bc47ef [dxgi] Use a recursive mutex.
Some games, like Final Fantasy XIV, call `SetFullscreenState()` again
after the window is resized. When the resize itself was triggered by a
`SetFullscreenState()` call, this will cause us to re-enter the mutex.
2019-03-29 08:26:19 +01:00
Philip Rebohle 61adaa941d
[d3d11] Implement fast path for binding full constant buffers
Saves a few CPU cycles in the most common case where
we don't have to perform any sort of range check.
2019-03-28 14:09:08 +01:00
Philip Rebohle 8f580efa25
[d3d11] Correctly handle out-of-bounds constant buffer ranges
Otherwise we pass an invalid offset and length to the backend,
which leads to invalid descriptor set updates in Vulkan.
The D3D11 runtime does not report corrected constant offset
and count parameters to the applicaion in *GetConstantBuffers1.

Reported-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
2019-03-28 13:45:41 +01:00
Philip Rebohle 09d60f42bc
[d3d11] Work around predicate buffer sync issue on RADV
If the predicate buffer is device-local memory, conditional
rendering commands don't seem to see any updates values even
though there is a barrier. When allocating on host-visible
device memory or system memory, it works as expected.
2019-03-28 10:02:11 +01:00
Philip Rebohle 3a3d7fb378
[d3d11] Properly implement SetPredication 2019-03-28 10:02:11 +01:00
Philip Rebohle d81146e3d2
[d3d11] Allocate predicate buffer for predicates 2019-03-28 10:02:11 +01:00
Philip Rebohle 7e16c4cda1
[d3d11] Remove unused revision field from D3D11Query 2019-03-28 10:02:11 +01:00
Philip Rebohle acdb989cfa
[dxvk] Implement conditional rendering 2019-03-28 10:02:11 +01:00
Philip Rebohle 03f00453ef
[dxvk] Add command list functions for conditional rendering 2019-03-28 10:02:11 +01:00
Philip Rebohle 70520e30aa
[dxvk] Enable conditionalRendering feature if present 2019-03-28 10:02:11 +01:00
Philip Rebohle 8f7e606583
[dxvk] Enable VK_EXT_conditional_rendering if available 2019-03-28 10:02:11 +01:00
Philip Rebohle 7f211545ee
[vulkan] Load functions for VK_EXT_conditional_rendering 2019-03-28 10:02:11 +01:00
Chip Davis 13a6ecadcd [dxvk] Remove needless lambda capture of 'this'. 2019-03-27 21:59:15 +01:00
Chip Davis 7a37d88067 [dxvk] Log vertex attributes and buffers when logging pipeline state.
This was invaluable in diagnosing a missing feature from MoltenVK.
2019-03-27 21:59:03 +01:00
Philip Rebohle edd63d3972
[dxvk] Fix buffer offset in copyDepthStencilImageToPackedBuffer 2019-03-27 14:23:58 +01:00
Philip Rebohle 03881dde72
[dxvk] Implement blitImage function 2019-03-27 02:31:04 +01:00
Philip Rebohle 6c8042033e
[d3d11] Select memory type based on CPU access flags 2019-03-26 21:17:52 +01:00
Philip Rebohle 302c6b5e6c
[d3d11] Implement depth-stencil uploads in resource initializer 2019-03-26 18:11:42 +01:00
Philip Rebohle fc3515c16f
[d3d11] Implement depth-stencil uploads in UpdateSubresource1 2019-03-26 18:11:42 +01:00
Philip Rebohle eec1cde1b3
[d3d11] Implement depth-stencil mapping on deferred contexts 2019-03-26 18:05:02 +01:00
Philip Rebohle 97d77fa508
[d3d11] Implement depth-stencil mapping on the immediate context 2019-03-26 18:04:56 +01:00
Philip Rebohle c38b1802a2
[d3d11] Enable shaderStorageImageExtendedFormats device feature 2019-03-26 17:56:57 +01:00
Philip Rebohle 8194bec1bf
[d3d11] Fix image format mapping when creating mapped buffer 2019-03-26 17:54:43 +01:00
Philip Rebohle 7cd3e9a0d4
[d3d11] Add method to look up packed format 2019-03-26 17:54:14 +01:00
Philip Rebohle 6c2f16fce8
[dxgi] Add methods to retrieve original format mappings 2019-03-26 17:54:14 +01:00
Philip Rebohle b3ea1b02eb
[dxvk] Implement depth-stencil upload via temporary buffer 2019-03-26 17:54:14 +01:00
Philip Rebohle 0d889e0dcd
[dxvk] Implement depth-stencil unpacking 2019-03-26 17:54:10 +01:00
Philip Rebohle de45ffd749
[dxvk] Create depth-stencil unpacking pipelines 2019-03-26 16:05:27 +01:00
Philip Rebohle 7124c3f449
[dxvk] Add depth-stencil unpacking shaders 2019-03-26 16:05:27 +01:00
Philip Rebohle 90c7878a53
[dxvk] Rename dxvk_resolve_{vert|geom} -> dxvk_copy_{vert|geom} 2019-03-25 18:22:56 +01:00
Philip Rebohle 7627f6e3ed
[dxvk] Optimize meta copy barriers 2019-03-25 17:58:31 +01:00
Philip Rebohle fd0daa5ec7
[dxvk] Optimize meta geometry shaders 2019-03-25 17:58:25 +01:00
Philip Rebohle be1832a348
[d3d11] Don't sample gamma texture if the gamma curve is identity
Saves some GPU time in games that don't use DXGI gamma control at all.
2019-03-24 18:07:21 +01:00
Philip Rebohle 73bb0d8ae2 [dxvk] Remove shader-based resolve
No longer necessary since we're using render pass resolve now.
2019-03-24 16:36:35 +01:00
Philip Rebohle 75ee1f42c2
[dxvk] Use resolve attachment for meta-resolve ops
Faster than the naive fragment shader-based solution.
2019-03-19 11:45:56 +01:00
Philip Rebohle 209248e26d [dxvk] Use vkResetQueryPoolEXT to reset individual queries
This is much faster than the fallback path which uses GPU functions.
2019-03-17 16:25:00 +01:00
Philip Rebohle 3d53f318fd [dxvk] Enable hostQueryReset device feature if available 2019-03-17 16:24:59 +01:00
Philip Rebohle 9dd9f0ab22 [dxvk] Enable VK_EXT_host_query_reset if available 2019-03-17 16:24:59 +01:00
Philip Rebohle 412d79c8c1
[d3d11] Use new query implementation 2019-03-14 21:16:41 +01:00
Philip Rebohle e5441e841f
[dxvk] Support new query implementation 2019-03-14 21:16:41 +01:00
Philip Rebohle a8144370c8
[dxvk] Create new query pool and forward it to the context 2019-03-14 21:16:41 +01:00
Philip Rebohle 772fa3074f
[dxvk] Add new query implementation 2019-03-14 21:16:41 +01:00
Philip Rebohle 8c3900c533
[d3d11] Use new GPU events for D3D11 Event queries 2019-03-14 21:16:41 +01:00
Philip Rebohle 3dbd755075
[dxvk] Implement method to signal GPU events 2019-03-14 21:16:41 +01:00
Philip Rebohle 6b9653d261
[dxvk] Create GPU event pool and forward it to the context 2019-03-14 21:16:41 +01:00
Philip Rebohle 4da89ccc48
[dxvk] Add GPU event class
GPU events allow for finer-grained CPU<>GPU synchronization than
the current approach, so we should change our implementation.
2019-03-14 21:16:38 +01:00
Philip Rebohle 7fa2fb5188
[meta] Release 1.0.1 2019-03-14 19:07:18 +01:00
Philip Rebohle 19f82826bb
[d3d11] Don't use presentation fence on ANV
Should hopefully fix stuttering issues introduced with 1.0.
2019-03-14 18:50:33 +01:00
Philip Rebohle 1656860486
[dxgi] Remove obsolete global monitor helper functions 2019-03-14 18:26:39 +01:00
Philip Rebohle 5b72e84726
[dxgi] Use IDXGIVkMonitorInfo in DxgiSwapChain 2019-03-14 18:26:39 +01:00
Philip Rebohle 50347e1256
[dxgi] Use IDXGIVkMonitorInfo in DxgiOutput 2019-03-14 18:26:39 +01:00
Philip Rebohle 7d5b5f288c
[dxgi] Implement IDXGIVkMonitorInfo for DxgiFactory 2019-03-14 18:26:39 +01:00
Philip Rebohle cfdac13ea5
[dxgi] Add new COM interface for per-monitor data 2019-03-14 18:26:37 +01:00
Philip Rebohle f272071d8d
[dxvk] Don't enforce HOST_CACHED flag when allocating memory
The better fix would be to support non-coherent memory properly,
but this will have to do for now. Fixes #947.
2019-03-14 16:47:17 +01:00
Philip Rebohle c35af973bb
[util] Disable NvAPI hack for Star Wars Battlefront 2015
Fixes #968.
2019-03-14 16:33:35 +01:00
Philip Rebohle 2d39be4e72
[d3d11] Check image block alignment in UpdateSubresource1
Fixes validation errors in World of Warcraft, which for some reason
tries to update individual pixels of block-compressed textures.
See #964.
2019-03-14 01:11:39 +01:00
Philip Rebohle 833c433556
[util] Enable relaxed barriers for Devil May Cry 5
Same engine as RE2, same ~10% performance improvement.
2019-03-11 18:36:09 +01:00
Philip Rebohle 0326258829 [dxbc] Use SDiv instead of ShiftRightLogical for index calc on Nvidia
Reportedly fixes Resident Evil 2 and Devil May Cry 5 on Nvidia.
2019-03-09 19:59:56 +01:00
Philip Rebohle a40d8d49ea
[dxvk] Only enable xfb subpass barrier if feature is enabled
Silences validation errors when running FL9_x applications, or
a driver which does not support VK_EXT_transform_feedback.
2019-03-02 19:56:01 +01:00
Philip Rebohle 4fc96e60c5
[d3d11] Reimplment GetEnabledShaderStages using getShaderPipelineStages
They do the same thing anyway.
2019-03-02 19:56:01 +01:00
Philip Rebohle d9edb16b75
[dxvk] Use getShaderPipelineStages for dummy resource creation
Fixes some validation errors for FL10_x and FL9_x applications.
2019-03-02 19:56:01 +01:00
Philip Rebohle 2a6d4fa2ba
[dxvk] Implement DxvkDevice::getShaderPipelineStages 2019-03-02 19:56:00 +01:00
Joshua Ashton d01110259c [d3d11, d3d10] Init returnptrs for CreateDevice funcs. 2019-02-27 23:17:08 +01:00
Joshua Ashton 995949a9f9 [d3d10] Fix and cleanup S_FALSE handling 2019-02-27 22:01:04 +01:00
Joshua Ashton 62a833b528 [dxgi] Correct return values for CreateDXGIFactory[1/2] 2019-02-27 22:01:04 +01:00
Joshua Ashton ccf24db428 [d3d10] Fix null pBlendStateDesc being dereferenced on def. desc 2019-02-27 22:01:04 +01:00
Joshua Ashton 2454041903 [d3d10] nullptr checks for resource creation 2019-02-27 22:01:04 +01:00
Joshua Ashton 28df1e0825 [d3d11] nullptr check descs & fix return values 2019-02-27 22:01:04 +01:00
Philip Rebohle 10140f40ca
[dxvk] Release 1.0 2019-02-25 20:26:50 +01:00
Philip Rebohle e03b574cc1
[d3d11] Block on image acquisition fence before presenting
May potentially improve frame timing on drivers where image
acquisition does not block.
2019-02-25 13:34:49 +01:00
Philip Rebohle b6804a95b7
[vulkan] Create per-swap image fences for presenter 2019-02-25 13:34:49 +01:00
Philip Rebohle a6d1fe07f2
[vulkan] Add helper method to wait for presenter fence 2019-02-25 13:34:49 +01:00
Philip Rebohle 2231caaa9e
[vulkan] Add optional fence paratemer to acquireNextImage
We'll reset the fence prior to acquisition, so that the user of
this API won't have to do it.
2019-02-25 13:34:46 +01:00
Philip Rebohle 6d814b24da
[dxbc] Fix invalid SPIR-V for FirstBitHi / FirstBitShi on vectors
Refs #930.
2019-02-23 21:27:40 +01:00
Philip Rebohle d12a8e09a8
[dxbc] Decorate integer fragment shader builtins as flat
Fixes yellow tint in Unreal Engine 4 games on RADV and AMDGPU-PRO.
2019-02-23 14:33:59 +01:00
Philip Rebohle b65520d627
[dxvk] Fix feature query for vertex attribute divisor 2019-02-20 11:47:15 +01:00
Philip Rebohle 38c6eeed26
[dxbc] Only emit depth clamp in fragment shader if necessary
We don't need this if the depth clip extension is supported
by the driver.
2019-02-19 14:27:21 +01:00
Philip Rebohle 9159401b14
[dxvk] Use depthClipEnable during graphics pipeline creation
Fall back to previous behaviour if the feature is not available.
2019-02-19 14:22:36 +01:00
Philip Rebohle 49965fd79e
[dxvk] Enable depthClipEnable feature if available 2019-02-19 13:57:34 +01:00
Philip Rebohle cc5ac885f5
[dxvk] Enable VK_EXT_depth_clip_enable if available 2019-02-19 13:53:56 +01:00
Philip Rebohle 20ea74fa99
[d3d11] Do not enable shaderStorageImageMultisample device feature
See https://github.com/KhronosGroup/MoltenVK/issues/502
2019-02-19 11:32:32 +01:00