KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Scott D Phillips	d849d36c6c	anv: off-by-one in GetDescriptorSetLayoutSupport Loop was accessing one more than bindingCount elements from pBindings, accessing uninitialized memory. Fixes: `ddc4069122` ("anv: Implement VK_KHR_maintenance3") Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-20 07:58:10 -07:00
Caio Marcelo de Oliveira Filho	f6338c3b85	anv/pipeline: set active_stages early Since the intermediate states of active_stages are not used, i.e. active_stages is read only after all stages were set into it, just set its value before compiling the shaders. This will allow to conditionally run certain passes based on what other shaders are being used, e.g. a certain pass might only be applicable to the vertex shader if there's no geometry or tessellation shader being used. v2: Use vk_to_mesa_shader_stage. (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-19 18:00:49 +00:00
Caio Marcelo de Oliveira Filho	318073ce66	anv/pipeline: fail if TCS/TES compile fail v2: Add Fixes tag. (Lionel) Fixes: `e50d4807a3` ("anv: Compile TCS/TES shaders.") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-19 18:00:49 +00:00
Eric Anholt	7db1c09d12	anv: Silence warning about heap_size. We only get VK_SUCCESS if it was initialized, but apparently my compiler doesn't track that far. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-16 15:10:05 -07:00
Eric Anholt	d25640c3a3	i965: Silence compiler warning about promoted_constants. We only have a cfg != NULL if we went through one of the paths that set it, but my compiler doesn't figure that out. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `6411defdcd` ("intel/cs: Re-run final NIR optimizations for each SIMD size")	2018-03-16 15:09:55 -07:00
Eric Anholt	9f89452ea3	anv: Silence compiler warnings about uninitialized bind_offset. This is a legitimate warning: if anv's blorp_alloc_binding_table() throws an error from anv_cmd_buffer_alloc_blorp_binding_table(), we silently continue to use this undefined value. The rest of this code doesn't seem very allocation-error-proof, though, either. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-16 15:09:47 -07:00
Matt Turner	f3833f1ca7	intel/compiler: Use gen_get_device_info() in test_eu_validate Previously the unit test filled out a minimal devinfo struct. A previous patch caused the test to begin assert failing because the devinfo was not complete. Avoid this by using the real mechanism to create devinfo. Note that we have to drop icl from the table, since we now rely on the name -> PCI ID translation done by gen_device_name_to_pci_device_id(), and ICL's PCI IDs are not upstream yet. Fixes: `f89e735719` ("intel/compiler: Check for unsupported register sizes.") Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-16 13:20:21 -07:00
Matt Turner	54db78b196	intel: Add cfl to gen_device_name_to_pci_device_id() Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-16 13:20:21 -07:00
Rafael Antognolli	f89e735719	intel/compiler: Check for unsupported register sizes. Make sure we don't emit 64 bit types if the hardware doesn't support them. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-16 09:27:16 -07:00
Lionel Landwerlin	51783f3e7d	anv: silence unused variable warning Fixes: `59b0ea0c74` ("anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-15 18:56:26 +00:00
Lionel Landwerlin	0f544a3c51	anv: silence unused function warning on gen11 [84/227] Compiling C object 'src/intel/vulkan/libanv_gen110@sta/genX_blorp_exec.c.o'. ../src/intel/vulkan/genX_blorp_exec.c:68:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function] blorp_get_surface_base_address(struct blorp_batch *batch) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-15 18:55:42 +00:00
Karol Herbst	b617bfcccf	compiler: int8/uint8 support OpenCL kernels also have int8/uint8. v2: remove changes in nir_search as Jason posted a patch for that Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-03-14 10:08:42 -04:00
Iago Toral Quiroga	1a0aba7216	anv/entrypoints: VkGetDeviceProcAddr returns NULL for core instance commands `af5f2322d0` addressed this for extension commands, but the spec mandates this behavior also for core API commands. From the Vulkan spec, Table 2. vkGetDeviceProcAddr behavior: device pname return ---------------------------------------------------------- (..) device core device-level command fp (...) See that it specifically states "device-level". Since the vk.xml file doesn't state if core commands are instance or device level, we identify device level commands as the ones that take a VkDevice, VkQueue or VkCommandBuffer as their first parameter. Fixes test failures in new work-in-progress CTS tests. Also see the public issue: https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers/issues/2323 v2: - Include reference to github issue (Emil) - Rebased on top of Vulkan 1.1 changes. v3: - Remove the not in the condition and switch the then/else cases (Jason) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-14 08:09:15 +01:00
Iago Toral Quiroga	a631575ff4	anv/entrypoints: dispatches to VkQueue are device-level v2: - Add trampoline functions (Jason) - Add an assertion for unhandled trampoline cases Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-14 08:09:15 +01:00
Jordan Justen	24b415270f	intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview Ken suggested that we might be underallocating scratch space on HD 400. Allocating scratch space as though there was actually 8 EUs seems to help with a GPU hang seen on synmark CSDof. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-09 16:15:58 -08:00
Ian Romanick	1583f49eaa	i965/vec4: Allow CSE on subset VF constant loads v2: Rewrite the code that generates the VF mask. Suggested by Ken. No changes on other platforms. Haswell, Ivy Bridge, and Sandy Bridge had similar results. (Haswell shown) total instructions in shared programs: 13059891 -> 13059884 (<.01%) instructions in affected programs: 431 -> 424 (-1.62%) helped: 7 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.19% max: 5.26% x̄: 2.05% x̃: 1.49% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -3.39% -0.71% Instructions are helped. total cycles in shared programs: 409260032 -> 409260018 (<.01%) cycles in affected programs: 4228 -> 4214 (-0.33%) helped: 7 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.28% max: 2.04% x̄: 0.54% x̃: 0.28% 95% mean confidence interval for cycles value: -2.00 -2.00 95% mean confidence interval for cycles %-change: -1.15% 0.07% Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-08 15:26:26 -08:00
Ian Romanick	360899d457	i965/vec4: Relax writemask condition in CSE If the previously seen instruction generates more fields than the new instruction, still allow CSE to happen. This doesn't do much, but it also enables a couple more shaders in the next patch. It helped quite a bit in another change series that I have (at least for now) abandoned. v2: Add some extra comentary about the parameters to instructions_match. Suggested by Ken. No changes on Skylake, Broadwell, Iron Lake or GM45. Ivy Bridge and Haswell had similar results. (Ivy Bridge shown) total instructions in shared programs: 11780295 -> 11780294 (<.01%) instructions in affected programs: 302 -> 301 (-0.33%) helped: 1 HURT: 0 total cycles in shared programs: 257308315 -> 257308313 (<.01%) cycles in affected programs: 2074 -> 2072 (-0.10%) helped: 1 HURT: 0 Sandy Bridge total instructions in shared programs: 10506687 -> 10506686 (<.01%) instructions in affected programs: 335 -> 334 (-0.30%) helped: 1 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-08 15:26:26 -08:00
Ian Romanick	52c7df1643	i965/fs: Merge CMP and SEL into CSEL on Gen8+ v2: Fix several problems handling inverted predicates. Add a much bigger comment around the BRW_CONDITIONAL_NZ case. v3: Allow uniforms and shader inputs as sources for the original SEL and CMP instructions. This enables a LOT more shaders to receive CSEL merging (5816 vs 8564 on SKL). v4: Report progress. Broadwell and Skylake had similar results. (Broadwell shown) helped: 8527 HURT: 0 helped stats (abs) min: 1 max: 27 x̄: 2.44 x̃: 1 helped stats (rel) min: 0.03% max: 17.80% x̄: 1.12% x̃: 0.70% 95% mean confidence interval for instructions value: -2.51 -2.36 95% mean confidence interval for instructions %-change: -1.15% -1.10% Instructions are helped. total cycles in shared programs: 559442317 -> 558288357 (-0.21%) cycles in affected programs: 372699860 -> 371545900 (-0.31%) helped: 6748 HURT: 1450 helped stats (abs) min: 1 max: 32000 x̄: 182.41 x̃: 12 helped stats (rel) min: <.01% max: 66.08% x̄: 3.42% x̃: 0.70% HURT stats (abs) min: 1 max: 2538 x̄: 53.08 x̃: 14 HURT stats (rel) min: <.01% max: 96.72% x̄: 3.32% x̃: 0.90% 95% mean confidence interval for cycles value: -179.01 -102.51 95% mean confidence interval for cycles %-change: -2.37% -2.08% Cycles are helped. LOST: 0 GAINED: 6 No changes on earlier platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v3] Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-08 15:26:26 -08:00
Kenneth Graunke	70de61594d	i965/fs: Add infrastructure for generating CSEL instructions. v2 (idr): Don't allow CSEL with a non-float src2. v3 (idr): Add CSEL to fs_inst::flags_written. Suggested by Matt. v4 (idr): Only set BRW_ALIGN_16 on Gen < 10 (suggested by Matt). Don't reset the access mode afterwards (suggested by Samuel and Matt). Add support for CSEL not modifying the flags to more places (requested by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v3] Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-08 15:26:26 -08:00
Jason Ekstrand	c217607b65	anv: Support version overrides While always sketchy to do, this is useful for debugging. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	d6b65222df	anv: Enable Vulkan 1.1 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	03c07ac548	anv: Add support for SPIR-V 1.3 subgroup operations This requires us to bump the subgroup size to 32 for all shader stages because Vulkan requires that to be a physical device query. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	8b4a5e641b	intel/fs: Add support for subgroup quad operations NIR has code to lower these away for us but we can do significantly better in many cases with register regioning and SIMD4x2. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	2292b20b29	intel/fs: Implement reduce and scan opeprations Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	4150920b95	intel/fs: Add a helper for emitting scan operations This commit adds a helper to the builder for emitting "scan" operations. Given a binary operation #, a scan takes the vector [a0, a1, ..., aN] and returns the vector [a0, a0 # a1, ..., a0 # a1 # ... # aN] where each channel contains the combination of all previous channels. The sequence of instructions to perform the scan is fairly optimal; a 16-wide scan on a 32-bit type is only 6 instructions. The subgroup scan and reduction operations will be implemented in terms of this. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	b0858c1cc6	intel/fs: Add a couple of simple helper opcodes Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	90c9f29518	i965/fs: Add support for nir_intrinsic_shuffle Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	7cfece820d	i965/fs: Support nir_intrinsic_vote_feq Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	44681e4795	nir: Generalize nir_intrinsic_vote_eq The SPIR-V extension wants us to be able to do an AllEqual on any vector or scalar type. This has two implications: 1) We need to be able to handle vectors so we switch the vote_eq intrinsics to be vectorized intrinsics. 2) We need to handle floats which have different behavior with respect to +-0, NaN, etc. than the integer variant so we need two variants. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	974daec495	i965/fs: Implement basic SPIR-V subgroup intrinsics Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	59b0ea0c74	anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER From the Vulkan 1.1 spec: "Vulkan 1.0 implementations were required to return VK_ERROR_INCOMPATIBLE_DRIVER if apiVersion was larger than 1.0. Implementations that support Vulkan 1.1 or later must not return VK_ERROR_INCOMPATIBLE_DRIVER for any value of apiVersion." Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	cbab2d1da5	anv: Implement vkEnumerateInstanceVersion Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Iago Toral Quiroga	605fd7c0da	anv/device: fail to initialize device if we have queues with unsupported flags This is not strictly necessary since users should not be requesting any flags that are not valid for the list of enabled features requested and we already fail if they attempt to use an unsupported feature, however it is an easy to implement sanity check that would help developes realize that they are doing things wrong, so we might as well do it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-07 12:13:47 -08:00
Iago Toral Quiroga	b262f17b15	anv/device: GetDeviceQueue2 should only return queues with matching flags From the Vulkan 1.1 spec, VkDeviceQueueInfo2 structure: "The queue returned by vkGetDeviceQueue2 must have the same flags value from this structure as that used at device creation time in a VkDeviceQueueCreateInfo instance. If no matching flags were specified at device creation time then pQueue will return VK_NULL_HANDLE." For us this means no flags at all since we don't support any. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	9c8b40001d	anv: Support querying for protected memory Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	773a51e772	anv: Implement GetDeviceQueue2 This belongs to the protected memory feature but there's nothing about it that's specific to protected memory. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	68df93ecbc	anv: Trivially implement VK_KHR_device_group Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	dfe18be09e	anv: Implement vkCmdDispatchBase This is part of the device groups extension/feature but it's a decent chunk of work in its own right so it's worth breaking into its own patch. The mechanism we use is fairly straightforward: we just push the base work group id into the shader and add it to the work group id we get from dispatch. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	ddc4069122	anv: Implement VK_KHR_maintenance3 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	1deb7967c8	anv: Support VkPhysicalDeviceShaderDrawParameterFeatures This advertises the VK_KHR_shader_draw_parameters functionality as a "core optimal feature" in Vulkan 1.1. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	06719f9d4b	anv/entrypoints: Drop support for protect attributes Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	bd1279bd9f	Get rid of a bunch of KHR suffixes Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	af461986db	anv: Add version 1.1.0 but leave it disabled This requires us to rename any Vulkan API entrypoints which became core in 1.1 to no longer have the KHR suffix. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	eb23ca069f	anv/entrypoints: Generate #ifdef guards from platform attributes Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	05fc377f2e	anv/extensions: Add support for multiple API versions Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	8efa173ed2	anv/entrypoints_gen: Add support for aliases in the XML Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	39d9fcea13	anv/entrypoints: Allow an entrypoint to require multiple extensions In this case, we say an entrypoint is supported if ANY of the extensions is supported. This is because, in the XML, entrypoints don't require extensions so much as extensions require entrypoints. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	8e8f167c72	anv/entrypoints: Add an is_device_entrypoint helper Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	54b3493fc0	anv/entrypoints_gen: Allow the string map to grow Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	d91da06df5	anv/entrypoints_gen: A bit of refactoring Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00

1 2 3 4 5 ...

2842 Commits