mirrors/mesa - Frog Git

Commit Graph

Author	SHA1	Message	Date
Dave Airlie	039a2e3630	draw: add jit image type for vs/gs images. This introduces the jit image type into the jit interface for vertex/geom shaders Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:49 +10:00
Dave Airlie	3c2c232059	llvmpipe: move the fragment shader variant key to dynamic length. This mirrors the vs/gs keys, and will be needed when adding images support. The const changes also mirror how the draw code works (as is needed when we add images) Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:42 +10:00
Dave Airlie	d0381ea149	gallivm: add a basic image limit Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:39 +10:00
Dave Airlie	cf84b46a1c	llvmpipe: handle early test property. Also handle setting late for shaders that use stores Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:33 +10:00
Dave Airlie	a1e8fcef47	gallivm: move first/last level jit texture members. This lets us create an image structure with the same basic types as the texture one. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:31 +10:00
Dave Airlie	e8a445d8b5	gallivm: handle helper invocation (v2) Just invert the exec_mask to get if this is a helper or not. v2: get the bld mask (Roland) Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:28 +10:00
Dave Airlie	fb34369eb5	gallivm: make lp_build_float_to_r11g11b10 take a const src This allows using it with a const src later. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:25 +10:00
Dave Airlie	a8ef6b5755	llvmpipe: refactor jit type creation This just cleans the code up so the texture/sampler type creation can be reused. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:21 +10:00
Dave Airlie	1eda49cc3d	gallivm: fix atomic compare-and-swap Not sure how I missed this before, but compswap was hitting an assert here as it is its own special case. Fixes: `b5ac381d8f` ("gallivm: add buffer operations to the tgsi->llvm conversion.") Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:28:17 +10:00
Paulo Zanoni	848d5e444a	intel/fs: grab fail_msg from v32 instead of v16 when v32->run_cs fails Looks like a copy/paste error. This patch prevents a segfault when running the following on BDW: INTEL_DEBUG=no8,no16,do32 ./deqp-vk -n \ dEQP-VK.subgroups.arithmetic.compute.subgroupmin_dvec4 For the curious, the message we're getting is: CS compile failed: Failure to register allocate. Reduce number of live scalar values to avoid this. Fixes: `864737ce6c` ("i965/fs: Build 32-wide compute shader when needed.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-08-26 14:54:16 -07:00
Alyssa Rosenzweig	c30116a2fa	pan/midgard: Fix invert fusing with r26 The invert wasn't applying (correctly) due to the issues addressed here. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-26 13:43:04 -07:00
Alyssa Rosenzweig	75b6be2435	pan/midgard: Fold ssa_args into midgard_instruction This is just a bit of refactoring to simplify MIR. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-26 13:43:04 -07:00
Eric Anholt	0309fb82ec	gallium: Add the ASTC 3D formats. No driver implements them yet, but this is a long way toward gallium having matching format enums for Mesa formats. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-26 19:44:00 +00:00
Eric Anholt	9d988f9291	gallium: Add block depth to the format utils. I decided not to update nblocks() with a depth arg as the callers wouldn't be doing ASTC 3D. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-26 19:44:00 +00:00
Eric Anholt	530f424735	gallium: Add a block depth field to the u_formats table. To add ASTC 3D compression formats, we need to be able to express the block depth. While I'm touching every line, line up the columns of the CSV again as they've drifted over time. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-26 19:44:00 +00:00
Alyssa Rosenzweig	9c328ea66e	pan/midgard: Add imov->fmov optimization When moving constants, if switching to a floating-point representation doesn't break anything, we'd rather have an fmov than an imov, permitting inlining the constant in many circumstances. total quadwords in shared programs: 3408 -> 3366 (-1.23%) quadwords in affected programs: 1188 -> 1146 (-3.54%) helped: 41 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.02 x̃: 1 helped stats (rel) min: 0.19% max: 25.00% x̄: 9.65% x̃: 11.11% 95% mean confidence interval for quadwords value: -1.07 -0.98 95% mean confidence interval for quadwords %-change: -11.38% -7.93% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-26 11:42:33 -07:00
Alyssa Rosenzweig	0acb5c1774	pan/midgard: Switch constants to uint32 Storing constants as float doesn't make sense when we have integer instructions; better to switch to be integer natively and coerce to/from float rather than the opposite. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-26 11:42:32 -07:00
Kenneth Graunke	2e1be771e4	isl: Don't set UnormPathInColorPipe for integer surfaces. This fixes dEQP-GLES3.functional.texture.specification subtests on iris: - texsubimage3d_depth.depth24_stencil8_2d_array - texsubimage3d_depth.depth32f_stencil8_2d_array - texsubimage3d_depth.depth_component32f_2d_array - texsubimage3d_depth.depth_component24_2d_array - texstorage2d.format.depth24_stencil8_2d - texstorage2d.format.depth32f_stencil8_2d - texstorage2d.format.depth_component24_2d - texstorage2d.format.depth_component32f_2d - texstorage3d.format.depth24_stencil8_2d_array - texstorage3d.format.depth32f_stencil8_2d_array - texstorage3d.format.depth_component24_2d_array - texstorage3d.format.depth_component32f_2d_array Here, something appears to be going wrong with having this bit set during blorp_copy operations for texture upload, which override the format to R8G8B8A8_UINT. AFAICT this bit should have no effect for integer surfaces, as it has to do with blending, and integer blending is not a thing. So it should be harmless to disable it. The Windows driver appears to be setting this bit universally, so I am unclear why we would need to. Perhaps they simply haven't run into this issue. Fixes: `f741de236b` ("isl: Enable Unorm Path in Color Pipe") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-26 16:54:20 +00:00
Kenneth Graunke	1b090f065e	isl: Drop UnormPathInColorPipe for buffer surfaces. Jason suggested I remove this in review, and he's right. AFAICT this affects blending, and that just isn't going to happen on buffers. Fixes: `f741de236b` ("isl: Enable Unorm Path in Color Pipe") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-26 16:54:20 +00:00
Alyssa Rosenzweig	85cc78a624	pan/midgard, bifrost: Set lower_fdph = true fdph instructions show up in some desktop GL shaders. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-26 07:47:01 -07:00
Samuel Pitoiset	218ce34962	radv: add mipmap support for the clear depth/stencil values Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 15:56:59 +02:00
Samuel Pitoiset	e36e260c42	radv: add mipmap support for the TC-compat zrange bug Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 15:56:55 +02:00
Samuel Pitoiset	9db0dc6b8e	radv: allocate metadata space for mipmapped depth/stencil images For each mipmap, the driver will store the clear values (8 bytes) and the TC-compat zrange value (4 bytes). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 15:56:51 +02:00
Samuel Pitoiset	76812339f7	radv: decompress mipmapped depth/stencil images during transitions Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 15:56:48 +02:00
Samuel Pitoiset	81c6473b7f	radv: add mipmaps support for decompress/resummarize Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 15:56:45 +02:00
Samuel Pitoiset	18ccde4d68	radv: add radv_process_depth_image_layer() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 15:56:42 +02:00
Connor Abbott	b7acf38073	ac/nir: Remove gfx9_stride_size_workaround_for_atomic The workaround was entirely in common code, and it's needed in radeonsi too so just always do it when necessary. Fixes KHR-GL45.shader_image_load_store.advanced-allStages-oneImage on gfx9 with LLVM 8. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-26 11:00:49 +02:00
Connor Abbott	4849276ea8	ac/nir: add a workaround for viewing a slice of 3D as a 2D image GL and Vulkan allow you to bind a single layer of a 3D texture to a 2D image, and we weren't implementing a workaround for that on gfx9 that TGSI was. Copy it over. Fixes KHR-GL45.shader_image_load_store.non-layered_binding with radeonsi NIR. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 11:00:44 +02:00
Samuel Pitoiset	89671ef205	radv: fix getting the index type size for uint8_t 16-bit and 32-bit values match hardware values but 8-bit doesn't. This fixes dEQP-VK.pipeline.input_assembly.* with 8-bit index. Fixes: `372c3dcfdb` ("radv: implement VK_EXT_index_type_uint8") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 09:23:23 +02:00
Dave Airlie	bba4d2f442	virgl: fix format conversion for recent gallium changes. The virgl formats are fixed in time snapshots of the gallium ones, we just need to provide a translation table between them when we enter the hardware. This fixes a regression since Eric renumbered the gallium table. Fixes: `c45c33a5a2` (gallium: Remove manual defining of PIPE_FORMAT enum values.) Bugzilla: https://bugs.freedesktop.org/111454 v1 by Dave Airlie <airlied@redhat.com> v2: virgl: Add a number of formats to the table that are used, e.g. for vertex attributes v3: cover some more missing formats from a piglit run Signed-off-by: Gert Wollny <gert.wollny@collabora.com>	2019-08-26 06:35:00 +00:00
Dave Airlie	035cd6cdf9	virgl: drop unused format field	2019-08-26 06:35:00 +00:00
Erico Nunes	4379dcc12d	lima/ppir: enable vectorize optimization pp has vector units and some operations can be optimized when bundled together. Benchmarking this with piglit shaders shows that the instruction count can be greatly reduced on many examples with vectorize. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-08-25 18:29:12 +00:00
Erico Nunes	2a8a81d109	lima/ppir: lower selects to scalars nir vec4 fcsel assumes that each component of the condition will be used to select the same component from the options, but pp can't implement that since it only has 1 component for the condition. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-08-25 18:29:12 +00:00
Erico Nunes	27e7603c34	lima: fix ppir spill stack allocation The previous spill stack was fixed and too small, and caused instability in programs requiring spilling for roughly more than one value. This patch adds a dynamic calculation of the buffer size based on stack utilization and switches it to a separate allocation at flush time that will fit the shader that requires the largest buffer. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-08-25 20:08:59 +02:00
Jason Ekstrand	f58e0405b6	intel/fs: Drop the gl_program from fs_visitor It's not used by anything anymore now that so much lowering has been moved into NIR. Sadly, we still need one in brw_compile_gs() for geometry shaders on Sandy Bridge. Short of a lot of pointless work, that one's probably not going away. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-25 01:02:52 -05:00
Qiang Yu	5ff41b9fc5	lima: move format handling to unified place Create a unified table to handle pipe format to texture and render target format lookup. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-08-25 11:52:29 +08:00
Alex Smith	fe0ec41c4d	radv: Change memory type order for GPUs without dedicated VRAM Put the uncached GTT type at a higher index than the visible VRAM type, rather than having GTT first. When we don't have dedicated VRAM, we don't have a non-visible VRAM type, and the property flags for GTT and visible VRAM are identical. According to the spec, for types with identical flags, we should give the one with better performance a lower index. Previously, apps which follow the spec guidance for choosing a memory type would have picked the GTT type in preference to visible VRAM (all Feral games will do this), and end up with lower performance. On a Ryzen 5 2500U laptop (Raven Ridge), this improves average FPS in the Rise of the Tomb Raider benchmark by up to ~30%. Tested a couple of other (Feral) games and saw similar improvement on those as well. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: 19.2 <mesa-stable@lists.freedesktop.org> (Bas: CCing this to 19.2-rc due to high impact and limited complexity)	2019-08-24 17:37:47 +02:00
Vasily Khoruzhick	681e99d11c	lima/ppir: print register index and components number for spilled register It can be useful for debugging purposes Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-24 08:17:31 -07:00
Vasily Khoruzhick	28d4b456a5	lima/ppir: add control flow support This commit adds support for nir_jump_instr, if and loop nir_cf_nodes. Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-24 08:17:31 -07:00
Vasily Khoruzhick	1cdf585613	lima/ppir: add better liveness analysis Add better liveness analysis that was modelled after one in vc4. It uses live ranges and is aware of multiple blocks which is prerequisite for adding CF support Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-24 08:17:31 -07:00
Vasily Khoruzhick	d30a98c896	lima/ppir: validate shader outputs Mali4x0 supports only gl_FragColor. gl_FragDepth is not supported. Check that we don't get anything but gl_FragColor in shader outputs. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-24 08:17:25 -07:00
Vasily Khoruzhick	8dd195e865	lima/ppir: turn store_color into ALU node We don't have a special OP to store color in PP, all we need to do is to store gl_FragColor into reg0, thus it's just a mov and therefore ALU node. Yet we still need to indicate that it's store_color op so regalloc ignores its destination. Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:47 -07:00
Vasily Khoruzhick	7f814d2b46	lima/ppir: create ppir block for each corresponding NIR block Create ppir block for each corresponding NIR block and populate its successors. It will be used later in liveness analysis and in CF support Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:47 -07:00
Vasily Khoruzhick	4e695489df	lima/ppir: add dummy op We can get following from NIR: (1) r1 = r2 (2) r2 = ssa1 Note that r2 is read before it's assigned, so there's no node for it in comp->var_nodes. We need to create a dummy node in this case which sole purpose is to hold ppir_dest with reg in it. Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:47 -07:00
Vasily Khoruzhick	d11e1b7909	lima/ppir: add write after read deps for registers For cases like: (1) r1 = r2 (2) r2 = ssa1 We need to add (1) as dependency of (2), otherwise scheduler may reorder them. Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:47 -07:00
Vasily Khoruzhick	cd8c569ced	lima/ppir: fix ordering deps There can be several root nodes, i.e.: (1) r0 = r1 (2) r2 = r3 (3) branch if (ssa1) We need to make (3) depend on (1) and (2), old code added dependency only for (2), and (1) was kept as root node since there is no branch/discard or store color between two movs. Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:47 -07:00
Vasily Khoruzhick	bf2872eeb2	lima/ppir: set write mask for texture loads if dest is reg Destination for texture load can be a reg, so we need to set write mask in this case Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:47 -07:00
Vasily Khoruzhick	fd129817f0	lima/ppir: add support for unconditional branches and condition negation We need 'negate' modifier for branch condition to minimize branching. Idea is to generate following: current_block: { ...; if (!statement) branch else_block; } then_block: { ...; branch after_block; } else_block: { ... } after_block: { ... } Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:46 -07:00
Vasily Khoruzhick	e15af23b73	lima/ppir: clone ld_{uni,tex,var} into each block ppir_lower_load() and ppir_lower_load_texture() assume that node is in the same block as its successors, fix it by cloning each ld_uni and ld_tex to every block. It also reduces register pressure since values never cross block boundaries and thus never appear in live_in or live_out of any block, so do it for varyings as well. Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:46 -07:00
Vasily Khoruzhick	172f2ad805	lima/ppir: refactor const lowering Const nodes are now cloned for each user, i.e. const is guaranteed to have exactly one successor, so we can use ppir_do_one_node_to_instr() and drop insert_to_each_succ_instr() Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:46 -07:00


114919 Commits All Branches Search
