KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Dave Airlie	7365626d78	radv: reorder cmd_state to remove a hole. This just removes a hole in the cmd_state and packs some bools together. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 01:10:53 +00:00
Dave Airlie	f0ae06a13c	radv: free attachments on end command buffer. If we allocate attachments in the begin command buffer due to the render pass continue bit, we were leaking them. Since renderpasses inside a cmd buffer malloc/free these properly, and set to NULL, we just need to call free at end. Fixes a memory leak with multithreading demo. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 01:03:47 +00:00
Bas Nieuwenhuizen	608af05ffb	radv: Optimize calling radv_save_descriptors. uint32_t data[MAX_SETS * 2] = {}; was getting executed before the exit and took significant amounts of time. By having the check outside the function, we skip the execution of the clear. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-04 20:18:17 +01:00
Bas Nieuwenhuizen	cecbcf4b2d	radv: Use an array to store descriptor sets. The vram_list linked list resulted in lots of pointer chasing. Replacing this with an array instead improves descriptor set allocation CPU usage by 3x at least (when also considering the free), because it had to iterate through 300-400 sets on average. Not a huge improvement as the pre-improvement CPU usage was only about 2.3% in the busiest thread. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-04 20:18:17 +01:00
Pierre Moreau	b041687ed1	nv50,nvc0: Display shared memory usage in pipe_debug_message Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>	2017-11-04 14:12:07 -04:00
Pierre Moreau	efe532b739	nv50,nvc0: Copy shared memory per block to the program info structure and back In OpenCL/CUDA kernels, shared memory usage can be defined within the kernel code. Those usage will only be picked up while parsing the SPIR-V, during the translation phase of the program. Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>	2017-11-04 14:12:07 -04:00
Pierre Moreau	49752e99f8	nv50/ir: Store shared memory per block in nv50_ir_prog_info Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>	2017-11-04 14:12:07 -04:00
Anuj Phogat	898e5555de	i965/gen10: Implement Wa3DStateMode This workaround doesn't fix any of the piglit hangs we've seen on CNL. But it might be fixing something we haven't tested yet. V2: Remove the bits enabling Float blend optimization. It is enabled through CACHE_MODE_SS register. Update the comment. Move gen10 if block on top of gen9 if block. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-03 14:30:34 -07:00
Anuj Phogat	6c681b4cc1	i965/gen10: Enable float blend optimization This optimization is enabled for previous generations too. See Mesa commit `c17e214a6b` On CNL this bit has been moved to CACHE_MODE_SS register. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-03 14:30:34 -07:00
Anuj Phogat	d3d0fe4572	i965/gen10: Implement WaForceRCPFEHangWorkaround This workaround doesn't fix any of the piglit hangs we've seen on CNL. But it might be fixing something we haven't tested yet. V2: Add the check for Post Sync Operation. Update the workaround comment. Use braces around if-else. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-03 14:30:34 -07:00
Anuj Phogat	3cf4fe2219	i965/gen10: Implement WaSampleOffsetIZ workaround There are few other (duplicate) workarounds which have similar recommendations: WaFlushHangWhenNonPipelineStateAndMarkerStalled WaCSStallBefore3DSamplePattern WaPipeControlBefore3DStateSamplePattern WaPipeControlBefore3DStateSamplePattern has some extra recommendations if driver is using mid batch context restore. Ignoring it for now because We're not doing mid-batch context restore in Mesa. This workaround doesn't fix any of the piglit hangs we've seen on CNL. But it might be fixing something we haven't tested yet. V2: Use brw_load_register_imm32() to program CACHE_MODE_0. Get rid of brw_flush_gpu_caches(). V3: Make the workaround helper functions static. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by :Nanley Chery <nanley.g.chery@intel.com>	2017-11-03 14:30:33 -07:00
Anuj Phogat	7a09be2dc9	i965/gen10: Don't set Antialiasing Enable in 3DSTATE_RASTER if num_samples > 1 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-03 14:30:33 -07:00
Anuj Phogat	2d10eb5ed8	i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-03 14:30:33 -07:00
Andrey Grodzovsky	19fc3cdcfb	winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx. Fixes reverted patch `f03b7c9` by doing VMID reservation per process and not per context. Also updates required amdgpu libdrm version since the change involved interface updates in amdgpu libdrm. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-11-03 18:06:17 +01:00
Lionel Landwerlin	24ec29b919	i965: perf: list registers to program for queries Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-03 14:25:36 +00:00
Lionel Landwerlin	285a2192f9	i965: perf: factorize code for availability Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-03 14:23:39 +00:00
Lionel Landwerlin	05231a4e74	i965: perf: make revision variable available This will be used in the next commit to build up register programming. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-03 14:23:22 +00:00
Nicolai Hähnle	ca63a5ed3e	glsl: fix interpolateAtXxx(some_vec[idx], ...) with dynamic idx The dynamic index of a vector (not array!) is lowered to a sequence of conditional assignments. However, the interpolate_at_* expressions require that the interpolant is an l-value of a shader input. So instead of doing conditional assignments of parts of the shader input and then interpolating that (which is nonsensical), we interpolate the entire shader input and then do conditional assignments of the interpolated result. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-03 14:30:08 +01:00
Nicolai Hähnle	4f42450b86	glsl: allow any l-value of an input variable as interpolant in interpolateAt* The intended rule has been clarified in GLSL 4.60, Section 8.13.2 (Interpolation Functions): "For all of the interpolation functions, interpolant must be an l-value from an in declaration; this can include a variable, a block or structure member, an array element, or some combination of these. Component selection operators (e.g., .xy) may be used when specifying interpolant." For members of interface blocks, var->data.must_be_shader_input must be determined on-the-fly after lowering interface blocks, since we don't want to disable varying packing for an entire block just because one input in it is used in interpolateAt. v2: keep setting must_be_shader_input in ast_function (Ian) v3: follow the relaxed rule of GLSL 4.60 v4: only apply the relaxed rules to desktop GL (the ES WG decided that the relaxed rules may apply in a future version but not retroactively; see also dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_centroid.negative.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101378 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-03 14:30:08 +01:00
Dave Airlie	57372c5a42	nir/serialize: fix build with gcc 4.4.7 I had to build on RHEL6 today, and noticed this. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-03 15:03:35 +10:00
Dave Airlie	0722b6d693	i915g: remove some unknown cap warnings.	2017-11-03 15:03:30 +10:00
Dave Airlie	cc69f2385e	i915g: make gears run again. We need to validate some structs exist before we dirty the states, and avoid the problem in some other places. Fixes: `e027935a7` ("st/mesa: don't update unrelated states in non-draw calls such as Clear")	2017-11-03 15:03:30 +10:00
Timothy Arceri	6e2eb96b64	ac: remove the remaining duplicate llvm types Reviewed-by: Marek Olšák <marek.olsak@amd.com Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	e73a467005	ac: remove usused v4f32 Reviewed-by: Marek Olšák <marek.olsak@amd.com Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	7f4966731f	ac: add v2f32 to the common code and make use of it Reviewed-by: Marek Olšák <marek.olsak@amd.com Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	cd6cfd1095	ac: use the ac f16 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	8f651ae062	ac: use the ac f32 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	368654a299	ac: use the ac f64 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	d927db0672	ac: use the common v8i32 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	9db51b2393	ac: use the common v4i32 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	ee376ac6f4	ac: add v3i32 to the common code and make use of it Reviewed-by: Marek Olšák <marek.olsak@amd.com Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	309a51411d	ac: add v2i32 to the common code and use it Reviewed-by: Marek Olšák <marek.olsak@amd.com Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	c64cfa0392	ac: use the ac i64 llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	3d45acf71c	ac: remove unused i16 llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	4d4799643d	ac: use the ac ivoidt llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	209ad5c16f	ac: use the ac i8 llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	21d71189ec	ac: use the ac i1 llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	bd59a0bb8b	ac: use the ac i32 llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	439a2febc4	ac/radeonsi: add support for tex instr without a derefence These are produced by nir_lower_bitmap(), adding the missing derefence would cause other issues that need to be hacked around such as skipping sampler lowering and uniform location assignment, so this change seems the correct way to go. Fixes 194 piglit crashes on radeonsi using NIR. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:19:51 +11:00
Timothy Arceri	440d08fe93	nir: skip lowering sampler if there is no dereference This avoids a crash on the output of nir_lower_bitmap(). Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:19:46 +11:00
Dave Airlie	de126b0402	r600: add support for early depth/stencil. This add support for the early depth/stencil property found on image shaders. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-03 09:33:37 +10:00
Dave Airlie	f3c6149c26	r600: add support for emitting RAT instructions to the assembler. This adds support for emitting RAT instructions to the assembler. RAT instructions are used to implement image accessors. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-03 09:33:33 +10:00
Dave Airlie	159bf38c3a	r600: add support for mark bit to the assembler. This adds support to the assembler for the mark bit on the export word1. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-03 09:33:30 +10:00
Dave Airlie	90ca378080	r600: add support for valid pixel mode on CF clauses This just adds support to the assembler for setting the valid pixel mode on the CF clause. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-03 09:33:26 +10:00
Dave Airlie	d584b4671f	r600: add support for some ALU sources. These special ALU sources provide the shader engine, simd and hw wave ids. These are required for images support. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-03 09:31:50 +10:00
Samuel Pitoiset	bad31f6a65	radv: use the optimal packets order for dispatch calls This should reduce the time where compute units are idle, mainly for meta operations because they use a bunch of compute shaders. This seems to have a really minor positive effect for Talos, at least. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-02 23:03:59 +01:00
Timothy Arceri	cf5f8f55c3	nir: add tess patch support to nir_remove_unused_varyings() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 08:58:39 +11:00
Dylan Baker	4ff6187b84	es2api/ABI-check: Add es3.x symbols Currently this ABI check only checks for es2 symbols, but es3.x symbols are also exposed. Exposing these symbols is recommended by Khronos, and as such the test should accept that as ABI. see: https://lists.freedesktop.org/archives/mesa-stable/2016-June/004545.html for the discussion about exposing these symbols cc: Ian Romanick <idr@freedesktop.org> Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2017-11-02 14:50:52 -07:00
Dylan Baker	a5635d993a	meson: Set c visibility args for wayland-drm Because otherwise gbm will expose wayland symbols that it shouldn't. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-02 14:50:18 -07:00
Timothy Arceri	4837ad4832	st/glsl_to_nir: pass gl_shader_program to st_finalize_nir() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 08:32:35 +11:00
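Commit 7365626d78 above describes removing a hole in cmd_state and packing some bools together. The following is a minimal sketch of that general technique, not radv's actual structure: the struct names, field names, and the `bytes_saved` helper are all hypothetical and only illustrate how field order affects padding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: each bool sits in front of a field with stricter
 * alignment, so the compiler may insert padding (a "hole") after it. */
struct state_with_holes {
    bool      flag_a;
    void     *attachments;
    bool      flag_b;
    uint64_t  dirty;
    bool      flag_c;
};

/* Same fields, reordered: the wide fields come first and the bools are
 * grouped at the end, so they can share a single padded tail. */
struct state_packed {
    void     *attachments;
    uint64_t  dirty;
    bool      flag_a;
    bool      flag_b;
    bool      flag_c;
};

/* Number of bytes the reordering saves on the current ABI (may be zero
 * on an ABI that needs no padding in the first layout). */
static size_t bytes_saved(void)
{
    return sizeof(struct state_with_holes) - sizeof(struct state_packed);
}
```

The exact sizes depend on the target ABI, so the sketch only relies on the reordered struct never being larger than the original. Tools such as pahole can report the holes in a real struct.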


97308 Commits
