KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Michel Dänzer	59e7f1413c	gitlab-ci: Update the meson cross file for LLVM_VERSION as well Cross builds don't use the llvm-config path from the native file.	2019-10-22 10:26:29 +00:00
Michel Dänzer	163ec5d808	gitlab-ci: Use native aarch64 runner for ARM build jobs This allows running the regression tests. One downside is that we can't easily build the Vulkan overlay layer, because only x86 binaries of the glslang validator are available. If that's important, we could either use those binaries via qemu, or build it from source. v2: * Add :amd64 suffix to existing debian-9/10 job names (Eric Engestrom) Acked-by: Eric Engestrom <eric.engestrom@intel.com> # v1	2019-10-22 10:26:29 +00:00
Michel Dänzer	c5aa2711a4	gitlab-ci: Explicitly list debian-10 in needs: for .deqp-test template Apparently needs: in a definition overwrites inherited ones. So .deqp-test effectively didn't declare needs: for debian-10, which means any jobs based on .deqp-test could spuriously run after the debian-10 job failed or was cancelled.	2019-10-22 10:26:29 +00:00
Michel Dänzer	38d42cf1d5	gitlab-ci: Bring ARM docker image install script in line with x86_64 Use https:// URLs in the APT configuration. Drop --no-install-recommends, the image generation template disables installation of recommended packages in /etc/apt/apt.conf. Run apt-get autoremove at the end, cleaning up packages which were installed to satisfy dependencies but are no longer needed. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-22 10:26:29 +00:00
Michel Dänzer	e3c7e04dfa	gitlab-ci: Sort ARM docker image packages in alphabetical order No functional change. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-22 10:26:29 +00:00
Samuel Pitoiset	a13320370e	radv: fix updating bound fast ds clear values with different aspects On GFX9, the driver is able to do an optimized fast depth/stencil clear with only one aspect (ie. clear the stencil part of a depth/stencil image). When this happens, the driver should only update the clear values of the given aspect. Note that it's currently only supported on GFX9 but I have some local patches that extend this optimized path for other gens. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1967 Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-22 11:16:13 +02:00
Sagar Ghuge	97e6d34e66	intel/compiler: Refactor disassembly of sources in 3src instruction Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Sagar Ghuge	18b28b5654	intel/compiler: Don't move immediate in register On Gen12, we support mixed mode HF/F operands, and also 3 source instruction supports immediate value support, so keep immediate as it is, if it fits properly in 16 bit field. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Sagar Ghuge	bf943bdf24	intel/compiler: Set bits according to source file On Gen >= 12, if src0 or src2 holds immediate value, we need set src[0/2]_is_imm bits instead of register file. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Sagar Ghuge	c018c5a339	intel/compiler: Add Immediate support for 3 source instruction On Gen >= 10, Either src0 or src2 can use 16-bit immediate value, but not both. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Eric Anholt	fb9362c6fb	ci: Disable lima until its farm can get fixed. It's been throwing the following error today: "<Fault -32603: 'Internal Server Error (contact server administrator for details): could not extend file "base/17952/18226": No space left on device\nHINT: Check free disk space.\n'>" Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-10-21 20:31:34 -07:00
Sagar Ghuge	7fb75ddfa7	intel: Add missing entry for brw_nir_lower_alpha_to_coverage in Makefile Fixes: `7ecfbd4f6d` ("nir: Add alpha_to_coverage lowering pass") Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-21 16:19:24 -07:00
Dave Airlie	bde08ce4d7	llvmpipe: handle compute shader launch with 0 threads If you set LP_NUM_THREADS=0 compute shaders would hang, just execute the workloads in sequence if we have no threads in the pool. Fixes: `1b24e3ba75` ("llvmpipe: add compute threadpool + mutex") Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-10-21 22:51:23 +00:00
Marijn Suijten	0141a4cdc0	freedreno/ir3: Add missing ir3_nir_lower_tex_prefetch.c to Android.mk This file is created in `2a0d45ae6c` but addition to android makefiles was omitted. It breaks the build with missing references which are defined in this file. List the file in ir3_SOURCES to make the build succeed. Signed-off-by: Marijn Suijten <marijns95@gmail.com>	2019-10-21 22:43:00 +00:00
Samuel Pitoiset	39760793b5	ac/llvm: fix ac_to_integer_type() for 32-bit const addr space pointers This fixes some crashes with dEQP-VK.descriptor_indexing.* when read_first_invocation has its source from a descriptor. Most of these tests still fail because of an LLVM bug (they work with ACO). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-21 22:32:01 +02:00
Rhys Perry	73184e51d1	aco: run opt_algebraic in a loop Totals from affected shaders: SGPRS: 13920 -> 13656 (-1.90 %) VGPRS: 12972 -> 12960 (-0.09 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 1005680 -> 1000648 (-0.50 %) bytes LDS: 91 -> 91 (0.00 %) blocks Max Waves: 688 -> 688 (0.00 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-21 19:18:30 +00:00
Rhys Perry	132ae89b19	aco: use nir_lower_idiv_precise v7: rename _nv50/_llvm to _fast/_precise Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-21 18:49:46 +00:00
Rhys Perry	8b98d0954e	nir/lower_idiv: add new llvm-based path v2: make variable names snake_case v2: minor cleanups in emit_udiv() v2: fix Panfrost build failure v3: use an enum instead of a boolean flag in nir_lower_idiv()'s signature v4: remove nir_op_urcp v5: drop nv50 path v5: rebase v6: add back nv50 path v6: add comment for nir_lower_idiv_path enum v7: rename _nv50/_llvm to _fast/_precise v8: fix etnaviv build failure Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-21 18:49:46 +00:00
Sagar Ghuge	f729ecefef	intel/compiler: Remove emit_alpha_to_coverage workaround from backend Remove emit_alpha_to_coverage workaround from backend compiler and start using ported workaround from NIR. v2: Copy comment from brw_fs_visitor (Caio Marcelo de Oliveira Filho) Fixes piglit test on HSW: - arb_sample_shading-builtin-gl-sample-mask-mrt-alpha-to-coverage-combinations Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-21 11:27:29 -07:00
Sagar Ghuge	7ecfbd4f6d	nir: Add alpha_to_coverage lowering pass Importing this pass from fs_visitor::emit_alpha_to_coverage_workaround() in intel/compiler. v2 (Caio Marcelo de Oliveira Filho): - Track store output and sample mask instruction - Nest math instruction for more readability - Bail out early if no gl_SampleMask v3: (Caio Marcelo de Oliveira Filho): - Do math instructions after instruction block - Restructure code - Move pass under src/intel/compiler v4: (Caio Marcelo de Oliveira Filho): - Organize dither mask calculation Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-21 11:27:29 -07:00
Daniel Schürmann	0e4bd261b1	aco: ensure that uniform booleans are computed in WQM if their uses happen in WQM This fixes graphical corruption in SC2. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-21 17:39:46 +00:00
Dylan Baker	a9a9249288	meson: Require meson >= 0.49.1 when using icc or icl 0.49.0 can compile most of mesa with ICC or ICL, but not SWR without additional workarounds in our meson.build files. Bumping patch version is easier and shouldn't be a big burden anyway, especially to cover a niche compiler. The check originally only covered ICC, but now covers ICL as well. Fixes: `3740ffb59c` ("meson: add switches for SWR with MSVC") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1937 Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-21 17:21:57 +00:00
Juan A. Suarez Romero	d33fe2d5eb	docs: update calendar, add news item and link release notes for 19.1.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-10-21 19:13:55 +02:00
Juan A. Suarez Romero	62a0e8421e	docs: add release notes for 19.1.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit cc88eeb6ffc4e86d76dfdbfc601d519bc35b6c41)	2019-10-21 19:10:52 +02:00
Juan A. Suarez Romero	7aa63ffe4f	docs: add release notes for 19.1.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit 5c6d266c591208b1c27e06f61b814210fc6e095f)	2019-10-21 19:10:49 +02:00
Timur Kristóf	7e5f87b533	aco/gfx10: Update constant addresses in fix_branches_gfx10. Due to a bug in GFX10 hardware, s_nop instructions must be added if a branch is at 0x3f. We already do this, but forgot to also update the constant addresses that come after this instruction. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-21 14:33:54 +00:00
Timur Kristóf	f380398f8f	aco/gfx10: Fix PS exports for SPI_SHADER_32_AR. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-21 14:33:54 +00:00
Timur Kristóf	1749953ea3	aco/gfx10: Wait for pending SMEM stores before loads Currently if you have an SMEM store followed by an SMEM load that loads the same location as was written, it won't work because the store isn't finished before the load is executed. This is NOT mitigated by an s_nop instruction on GFX10. Since we currently don't have proper alias analysis, this commit adds a workaround which will insert an s_waitcnt lgkmcnt(0) before each SSBO load if they follow a store. We should further refine this in the future when we can make sure to only add the wait when we load the same thing as has been stored. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-21 14:33:54 +00:00
Boris Brezillon	7fa5cd3ee3	panfrost: Fix the DISCARD_WHOLE_RES case in transfer_map() The current implementation does not synchronize on BO readiness when DISCARD_WHOLE_RES flag is set, which can lead to misbehaviours when the resource being updated is being used by one of the pending or already flushed batches. Adding unconditional BO synchronization would do the trick, but we can sometimes optimize this path by re-allocating a new BO instead of waiting for the existing one to be ready. Reported-by: Daniel Stone <daniels@collabora.com> Reported-by: Heinrich Fink <heinrich.fink@daqri.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-21 14:37:02 +02:00
Iago Toral Quiroga	2d5edf2558	st/mesa: only require ESSL 3.1 for geometry shaders According to the OES_geometry_shader spec, section Dependencies: "OpenGL ES 3.1 and OpenGL ES Shading Language 3.10 are required." Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-21 09:09:15 +00:00
Lepton Wu	f4ba31ff50	egl/android: Remove our own reference to buffers. We currently don't maintain it correctly and the buffer gets leaked if the surface is destroyed before swapping buffers. From Android frameworks/native/libs/nativewindow/include/system/window.h: The window holds a reference to the buffer between dequeueBuffer and either queueBuffer or cancelBuffer, so clients only need their own reference if they might use the buffer after queueing or canceling it. v2: Remove our own reference. Fixes: `0212db3504` ("egl/android: Cancel any outstanding ANativeBuffer in surface destructor") Reviewed-by: Chia-I Wu <olvaffe@gmail.com> (v1) Reviewed-By: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2019-10-21 07:50:31 +00:00
Samuel Pitoiset	b72205a4c1	radv: advertise VK_KHR_spirv_1_4 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-21 09:21:40 +02:00
Samuel Pitoiset	b139198b06	radv: do not dump descriptors twice in hang reports If a pipeline has both graphics and compute, descriptors are same. While we are at it, use queue->device for simplicity. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-21 08:50:39 +02:00
Samuel Pitoiset	cf5e55558e	radv: dump trace files earlier if a GPU hang is detected To make sure a trace file is generated in case the driver crashes during the hang report generation (which happens sometimes). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-21 08:50:39 +02:00
Samuel Pitoiset	bc2319deb2	radv: print which ring is dumped in hang reports Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-21 08:50:39 +02:00
Samuel Pitoiset	076f9dce7c	radv: do not print useless descriptors info in hang reports This information has never been useful. All descriptors are already dumped with colors etc, and it's more useful. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-21 08:50:39 +02:00
Samuel Pitoiset	9da94e510c	radv: enable VK_KHR_shader_float_controls on GFX6-GFX7 Disable 16-bit features because fp16 isn't exposed on these chips. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-21 08:47:28 +02:00
Alyssa Rosenzweig	4c9b9ed5f9	panfrost/ci: Update expectations list A bunch of blend tests fixed on T760. A single blend test regressed on both T760/T860 but I am unable to reproduce locally so am just documenting the regression and moving on. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	b8c4fb235e	pan/midgard: Implement SIMD-aware dead code elimination We would like to eliminate not just entire dead instructions, but also dead components, which increases scheduler flexibility (since some vector instructions can become scalar after eliminating dead components). This also will allow better RA in the future. Results are meh. total instructions in shared programs: 3453 -> 3451 (-0.06%) instructions in affected programs: 60 -> 58 (-3.33%) helped: 2 HURT: 0 total bundles in shared programs: 1826 -> 1824 (-0.11%) bundles in affected programs: 33 -> 31 (-6.06%) helped: 2 HURT: 0 total quadwords in shared programs: 3144 -> 3144 (0.00%) quadwords in affected programs: 0 -> 0 helped: 0 HURT: 0 total registers in shared programs: 321 -> 321 (0.00%) registers in affected programs: 45 -> 45 (0.00%) helped: 11 HURT: 11 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 16.67% max: 50.00% x̄: 39.70% x̃: 50.00% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for registers value: -0.45 0.45 95% mean confidence interval for registers %-change: -1.87% 62.18% Inconclusive result (value mean confidence interval includes 0). total threads in shared programs: 445 -> 447 (0.45%) threads in affected programs: 2 -> 4 (100.00%) helped: 1 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	6c4b97011b	pan/midgard: Create dependency graph bytewise This allows for vec16 dependencies in the scheduler, not that we have any yet (thankfully). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	825f11e739	pan/midgard: Handle nontrivial masks in texture RA The texture instruction has a mask we need to take into account. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	d1d3411ba5	pan/midgard: Implement per-byte liveness tracking Now that we have notion of byte masks, liveness tracking can be updated to reflect this extra granularity without loss of correctness. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	43fd730fc4	pan/midgard: Simplify mir_bytemask_of_read_components There are easy ways to iterate sources! Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	e9202ff3cb	pan/midgard: Report byte masks for read components Read component masks don't have a particular type associated, since the type of the ALU operation may not match the type of the operands in question. So let's generate byte masks instead, and update the rest of the compiler to use byte masks when analyzing reads. Preparation for mixed types. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	d079631248	pan/midgard: Add helpers for manipulating byte masks There are essentially two formats of masks in play beginning with this commit: masks per-channel and masks per-byte. The former make sense within a given fixed-size instruction; the latter are typesize-independent. It turns out you need the latter to meaningfully manipulate instructions containing multiple sizes (which is quite possible with ALU operations). Similarly, we have mir_srcsize. We calculate the size of the source by analyzing the size of the instruction itself and stepping down if there is a half-modifier. Finally, we have mir_round_bytemask_down, for when we want to take a byte mask and "round it down" to a given component size, so that we can use it as a component mask. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	e981b69484	pan/midgard: Implement OP_IS_STORE with table ..rather than open-coding. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	8e31b14858	pan/midgard: Tableize load/store ops This will allow us to encode properties about the load/store ops like we do for ALU ops. We include now properties about whether we have a store, and if there are special cases on the load/store op. We also tag each instruction by its natural size... this is probably not totally right, but it's a start. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	5952add9a9	pan/midgard: Factor out mir_get_alu_src This helper is used in a bunch of places ... might as well make that common. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	f77ea9798d	pan/midgard/disasm: Fix printing 8-bit/16-bit masks The trick is realizing even with a destination override, the masks are encoded in the same mode as the instruction itself, rather than stepping down. The override means that the smaller type is used, but the mask is parsed as if it were the higher type. Overriding down is printed by blindly doing this. Overriding up can be thought of as printing in the upper size, but shifting the alphabet to use the upper half, i.e. shifting xyzw to become abcd. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	d49fdca229	pan/midgard: Identify 64-bit atomic opcodes They are symmetric to their 32-bit counterparts, just shifted. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00

116604 Commits
