KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Timothy Arceri	fea36a8f43	st/glsl: make sure to propagate initialisers to driver storage This essentially reverts `20234cfe3a`. Fixes piglit test: tests/spec/arb_get_program_binary/execution/uniform-after-restore.shader_test Fixes: `20234cfe3a` "st/mesa: don't propagate uniforms when restoring from cache" Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110784	2019-06-04 11:36:45 +10:00
Caio Marcelo de Oliveira Filho	61de825e11	spirv: Like Uniform, do nothing for UniformId Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 17:20:54 -07:00
Caio Marcelo de Oliveira Filho	b4eff83180	spirv: Implement SpvOpCopyLogical This is the same as SpvOpCopyObject but without the type checking, which is how vtn_composite_copy works, so we just need to hook the operation. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 17:20:54 -07:00
Caio Marcelo de Oliveira Filho	81586e9f53	spirv: Generalize OpSelect SPIR-V 1.4 supports OpSelect over any composite type, and also allows scalar boolean condition for vector types -- a case which we already handled to support old GLSLang. Added a helper function to recursively perform nir_bcsel, that makes easier to support structs. v2: Replace asserts() with vtn_fail_if(). (Jason) v3: Simplify Condition and Result types verifications. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 17:20:54 -07:00
Caio Marcelo de Oliveira Filho	17630291e5	spirv: Move OpSelect handling to a function This will make a later change easier to review. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 17:20:54 -07:00
Caio Marcelo de Oliveira Filho	ea0e89859c	nir/vars_to_ssa: Handle UNDEF_NODE in more places Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110832 Fixes: `911ea2c66f` "nir/vars_to_ssa: Use a non-null UNDEF_NODE pointer" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 17:09:22 -07:00
Marek Olšák	b2bbd1a27b	ac/registers: don't use the si, cik, vi names, use gfxN trivial	2019-06-03 20:06:41 -04:00
Nicolai Hähnle	f480b8aaa4	amd/common: use generated register header	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	853ef5ccba	amd/common: use SH{0,1}_CU_EN definitions only of COMPUTE_STATIC_THREAD_MGMT_SE0 The automatic header generation unifies identical registers in a series and only emits definitions for the first one. This is mostly to avoid emitting excessive definitions for CB registers, but special-casing an exception for this family of registers doesn't seem worth it.	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	cf51009ad2	amd/common: unify PITCH_GFX6 and PITCH_GFX9 The definition of the fields differs, but PITCH_GFX9 is a mere extension of PITCH_GFX6 that does not conflict with any other fields. This aligns the definitions with what will be generated from the register JSON. The information about how large the fields really are is preserved in the register database.	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	e04215815e	amd/common: rename R_3F2_CONTROL to IB_CONTROL for disambiguation This "register" name collides with R_370_CONTROL. This aligns the definitions with what will be generated from the register JSON.	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	cd247cf456	amd/common: cleanup DATA_FORMAT/NUM_FORMAT field names The field layout wasn't actually changed in gfx9, so having the suffix isn't very useful. The field contents were changed, but this is reflected in the V_xxx_xxx definitions and is taken into account by the ac_debug logic based on the register JSON. This aligns the definitions with what will be generated from the register JSON.	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	ef6ef098af	amd/common: derive ac_debug tables from register JSON	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	d02286c753	amd/registers: add JSON description of packet3 fields	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	67702e3319	amd/registers: add JSON descriptions of registers The descriptions are mostly derived from parsing the existing register headers.	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	e6184b0892	amd/registers: scripts for processing register descriptions in JSON We will derive both the debugging tables and (the majority of) the register headers from descriptions in JSON, instead of deriving the debugging tables from an awkward parsing of the register headers. Some of the scripts are useful for maintaining the register database itself. The scripts are designed to output reasonably readable JSON by default.	2019-06-03 20:05:20 -04:00
Vinson Lee	d4e70be739	freedreno: Fix GCC build error. ../src/freedreno/vulkan/tu_device.c:900:4: error: initializer element is not constant .minImageTransferGranularity = (VkExtent3D) { 1, 1, 1 }, ^ Suggested-by: Kristian Høgsberg <krh@bitplanet.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110698 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-03 16:46:54 -07:00
Mark Janes	774a088f64	mesa: Use string literals for format strings Android build settings require format strings to be string literals. Fixes: `d2906293c4` "mesa: EXT_dsa add selectorless matrix stack functions" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110833 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-03 16:17:23 -07:00
Caio Marcelo de Oliveira Filho	045aeccf0e	iris: Always reserve binding table space for NIR constants Don't have a separate mechanism for NIR constants to be removed from the table. If unused, we will compact it away. The use_null_surface is needed when INTEL_DISABLE_COMPACT_BINDING_TABLE is set. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho	5611444809	iris: Print binding tables when INTEL_DEBUG=bt Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho	97cd865be2	iris: Compact binding tables Change the iris_binding_table to keep track of what surfaces are actually going to be used, then assign binding table indices just for those. Reducing unused bytes on those are valuable because we use a reduced space for those tables in Iris. The rest of the driver can go from "group indices" (i.e. UBO #2) to BTI and vice-versa using helper functions. The value IRIS_SURFACE_NOT_USED is returned to indicate a certain group index is not used or a certain BTI is not valid. The environment variable INTEL_DISABLE_COMPACT_BINDING_TABLE can be set to skip compacting binding table. v2: (all from Ken) Use BITFIELD64_MASK helper. Improve comments. Assert all group is marked as used when we have indirects. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho	79f1529ae0	iris: Create an enum for the surface groups This will make convenient to handle compacting and printing the binding table. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho	1c8ea8b300	iris: Handle binding table in the driver Stop using brw_compiler to lower the final binding table indices for surface access. This is done by simply not setting the 'prog_data->binding_table.*_start' fields. Then make the driver perform this lowering. This is a better place to perform the binding table assignments, since the driver has more information and will also later consume those assignments to upload resources. This also prepares us for two changes: use ibc without having to implement binding table logic there; and remove unused entries from the binding table. Since the `block` field in brw_ubo_range now refers to the final binding table index, we need to adjust it before using to index shs->constbuf. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho	518f83236b	iris: Pull brw_nir_analyze_ubo_ranges() call out setup_uniforms We'll change iris to perform lowering of the binding table indices earlier (before the backend kick in), but the backend compiler uses the result of the analysis to identify load_ubo intrinsics, so we do the analysis after the lowering to have the right indices. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho	1f8546ba2f	spirv: Implement OpPtrEqual, OpPtrNotEqual and OpPtrDiff Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 13:45:09 -07:00
Caio Marcelo de Oliveira Filho	ca164ab495	nir: Add functions to subtract and compare addresses v2: Fix comparing addresses from formats that have more than one component by using nir_ball_iequal(). (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 13:45:09 -07:00
Caio Marcelo de Oliveira Filho	09cc3389b9	nir: Add nir_ball_iequal() helper Similar to nir_bany_inequal(). Suggested by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 13:45:09 -07:00
Sergii Romantsov	88340372ee	mesa: ARB program parser should clean parameters Program parser allocates parameter list. In case of parsing error some variables will not be freed. Patch adds freeing of it. Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-03 16:41:26 -04:00
Hyunjun Ko	382e3553af	freedreno/ir3: fix counting and printing for half registers. v2: defining 0x100 and use this for setting the FS_OUTPUT_REG.HALF_PRECISION Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 13:31:51 -07:00
Neil Roberts	fb53b326c2	freedreno/ir3: Fix up the half reg source even when src instr==NULL Previously the loop for assigning registers was bailing out early if the register had a null source. I think the intention is that in this case it isn’t necessary to assign a register. However it was also missing out the part to fix up the types. This can happen if the instruction is copy propagated to be a move from a constant half-float input register. In that case it still needs to fix up the types. Fixes assert in dEQP-GLES3.functional.shaders.invariance.highp.subexpression_precision_mediump when lowering the precision of the variables. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 13:31:51 -07:00
Neil Roberts	3222216a58	freedreno/ir3: Add a 16-bit implementation of nir_op_imul Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 13:31:51 -07:00
Hyunjun Ko	daee6bc1a1	freedreno/ir3: set dst type of alu instructions correctly. Though it should be fixed in RA pass, it needs to be set correctly from the beginning according to the bitsize of NIR dest. v2: Would be better for mad,fddx,fddy to fixup later in RA pass. [small cleanup of fallout from imov/fmov removal fallout] Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 13:31:26 -07:00
Hyunjun Ko	43d80a3e20	freedreno/ir3: adjust the bitsize of regs when an array loading. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 12:44:03 -07:00
Hyunjun Ko	cbd1f47433	freedreno/ir3: convert back to 32-bit values for half constant registers. It seems to handle only 32-bit values for half constant registers within floating point opcodes according to the blob driver. So we need to convert back to 32-bit values from 16-bit values, when a lower precision pass is in effect. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 12:44:03 -07:00
Hyunjun Ko	a9b556d3a0	freedreno/ir3: check the type of regs of absneg opcode in is_same_type_mov. If the type of dest reg and src reg of absneg opcode are different, it shouldn't be considered as same type mov. This patch becomes meaningful when we start to use mediump information for doing precision lowering to 16bit. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 12:44:03 -07:00
Hyunjun Ko	6fb8ef3da6	freedreno/ir3: set proper dst type for uniform according to the type of nir dest. eg. uniform mediump vec4 f; This patch means nothing since there's no mediump lowering pass for now, but will be meaningful when the pass land in the near future. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 12:44:03 -07:00
Neil Roberts	689c3c7d40	freedreno/ir3: Use output type size to set OUTPUT_REG_HALF_PRECISION Previously the A5XX_SP_FS_OUTPUT_REG_HALF_PRECISION was set depending on whether half_precision was set in the shader key. With support for mediump precision, it is possible to have different outputs use different precisions. That means we can’t have a global shader state to specify it. Instead it now tries to copy the half-float-ness from the nir_variable for the output into the ir3_shader_variant. This is then used to decide whether to set half-precision for each output. The a6xx version is copied from the a5xx code but it has not been tested. v2. [Hyunjun Ko (zzoon@igalia.com)] There's the half flag recently added, which represents precision based on IR3_REG_HALF. Now use this flag to avoid duplication. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 12:44:03 -07:00
Neil Roberts	8cd1b76b7d	freedreno/ir3: Fix loading half-float immediate vectors Previously the code to load from a constant instruction was always using the u32 pointer. If the constant is actually a 16-bit source this would end up with the wrong values because the pointer would be offset by the wrong size. This fixes it to use the u16 pointer. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 12:44:03 -07:00
Rob Clark	7bbf21e898	freedreno/ir3: immediately schedule meta instructions They aren't real instructions, and don't change # of live values, so no point in them competing with real instructions. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-03 12:44:03 -07:00
Rob Clark	771d04c82d	freedreno/ir3: scheduler improvements For instructions that increase the # of live values, apply a threshold to avoid scheduling them too early. And factor the net change of # of live values that would result from scheduling an instruction, to prioritize instructions that reduce number of live values as the number of live values increases. For manhattan: total instructions in shared programs: 27869 -> 28413 (1.95%) instructions in affected programs: 26756 -> 27300 (2.03%) helped: 102 HURT: 87 total full in shared programs: 1903 -> 1719 (-9.67%) full in affected programs: 1390 -> 1206 (-13.24%) helped: 124 HURT: 9 The reduction in register usage nets ~20% gain in manhattan. (So getting mediump support should be a huge win for gles gfxbench.) Also significantly helps some of the more complex shadertoy shaders, like IQ's Piano (32 to 18 regs, doubles fps). The effect is less pronounced on smaller shaders. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-03 12:44:03 -07:00
Rob Clark	bb3aa44ade	freedreno/ir3: sched should mark outputs used Account for shader outputs and values live in any direct/indirect successor block. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-03 12:44:03 -07:00
Pierre-Eric Pelloux-Prayer	d2906293c4	mesa: EXT_dsa add selectorless matrix stack functions Allows the legacy matrix stacks to be manipulated without disturbing the matrix mode selector. Adapted from a patch from Chris Forbes. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-03 15:28:51 -04:00
Pierre-Eric Pelloux-Prayer	28ce704bb0	mesa: factor out enum -> matrix stack lookup Split this out from glMatrixMode since we're about to need it independently for EXT_DSA. Adapted from Chris Forbes commit. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-03 15:28:49 -04:00
Timothy Arceri	b69584ad69	mesa: add new EXT_direct_state_access tokens Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-03 15:28:47 -04:00
Chris Forbes	028682f7f4	glapi: add EXT_direct_state_access Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-03 15:28:45 -04:00
Timothy Arceri	9c5d86af38	mesa: add a list of EXT_direct_state_access to dispatch sanity This extension is huge and this gives us a TODO list of functions to implement. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-03 15:28:33 -04:00
Pierre-Eric Pelloux-Prayer	4583f09caa	radeonsi: init sctx->dma_copy before using it Commit `a1378639ab` reordered context functions initializations but broke sctx->b.resource_copy_region init when using AMD_DEBUG=forcedma. In this case sctx->dma_copy was assigned a value after being used in: sctx->b.resource_copy_region = sctx->dma_copy; This commit moves the FORCE_DMA special case after sctx->dma_copy initialization. See https://bugs.freedesktop.org/show_bug.cgi?id=110422 Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-03 15:05:30 -04:00
Axel Davy	5820ac6756	d3dadapter9: Revert to old throttling limit value Recently PIPE_CAP_MAX_FRAMES_IN_FLIGHT was changed from 2 to 1: `20909284f2` No driver seems to overwrite the default value. One user reports severe regressions for some games. For now, revert to the value 2 for nine. Cc: "19.1" mesa-stable@lists.freedesktop.org Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-06-03 20:37:13 +02:00
Marek Olšák	486bc1e17e	ac: use amdgpu-flat-work-group-size Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-03 14:32:47 -04:00
Marek Olšák	4b11ed443b	u_blitter: don't fail mipmap generation for depth formats containing stencil Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=109754 Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-06-03 14:32:47 -04:00

111388 Commits
