KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Toni Lönnberg	ecf62a967e	intel/genxml: Add engine definition to render engine instructions (gen6) Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions v4: Added missing engine to MEDIA_GATEWAY_STATE Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	571d6447d8	intel/genxml: Add engine definition to render engine instructions (gen5) Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	6463ceca69	intel/genxml: Add engine definition to render engine instructions (gen45) Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	a4ca710c96	intel/genxml: Add engine definition to render engine instructions (gen4) Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	102dadec81	intel/decoder: tools: Use engine for decoding batch instructions The engine to which the batch was sent is now set in the decoder context when decoding the batch. This is needed so that we can distinguish between instructions, as the render and video pipes share some of the instruction opcodes. v2: The engine is now in the decoder context and the batch decoder uses a local function for finding the instruction for an engine. v3: Spec uses engine_mask now instead of engine, replaced engine class enums with the definitions from UAPI. v4: Fix up aubinator_viewer (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	a6aab7e436	intel/decoder: tools: gen_engine to drm_i915_gem_engine_class Removed the gen_engine enum and changed the involved functions to use the drm_i915_gem_engine_class enum from UAPI instead. v3: Wrong engine was being used for blocks in video ring v4: Fixed aubinator_viewer.cpp Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	b00bccd012	intel/decoder: Engine parameter for instructions Preliminary work for adding handling of different pipes to gen_decoder. Each instruction needs to have a definition describing which engine it is meant for. If left undefined, by default, the instruction is defined for all engines. v2: Changed to use the engine class definitions from UAPI v3: Changed I915_ENGINE_CLASS_TO_MASK to use BITSET_BIT, changed engine to engine_mask, added a check for an incorrect engine and added the possibility to define an instruction for multiple engines using "|" as a delimiter in the engine attribute. v4: Fixed the memory leak. v5: Removed an unnecessary ralloc_free(). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Gert Wollny	8d4bb6e5cd	virgl: Add command and flags to initiate debugging on the host (v2) On the host VREND_DEBUG=guestallow must be set to let the guest override the debug flags. v2: Send flag string instead of flags, this avoids the need to keep the flags in sync. v3: Only request host logging if the host actually understands the command Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2018-11-13 14:42:22 +01:00
Gert Wollny	caa964b422	mesa: Reference count shaders that are used by transform feedback objects Transform feedback objects may hold a pointer to a shader program, and at least in Gallium, this must be a valid pointer until ctx->Driver.EndTransformFeedback in glEndTransformFeedback has been called, which conforms with the spec that any program that is part of a current rendering state should only be flagged for deletion by glDeleteProgram. This was not handled properly for the transform feedback objects, so that a call sequence glUseProgram(x) glBeginTransformFeedback(...) glPauseTransformFeedback(...) glDeleteProgram(x) glEndTransformFeedback(...) would result in a use after free bug. With this patch the transform feedback object also updates the reference count of the used program, thereby keeping the program valid as long as the transform feedback object links to it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108713 Fixes: `654587696b` mesa: add end_transform_feedback() helper Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-13 10:57:25 +01:00
Samuel Pitoiset	90d68858ed	radv: set optimal OVERWRITE_COMBINER_WATERMARK on GFX9 Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-13 10:24:36 +01:00
Samuel Pitoiset	f70c5d31cd	radv: set PA.SC_CONSERVATIVE_RASTERIZATION.NULL_SQUAD_AA_MASK_ENABLE Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-13 10:24:33 +01:00
Samuel Pitoiset	b5f213bb1d	radv: binding streamout buffers doesn't change context regs Cc: 18.3 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-13 10:24:31 +01:00
Plamena Manolova	c5f3013cba	nir: Don't lower the local work group size if it's variable. If the local work group size is variable it won't be available at compile time so we can't lower it in nir_lower_system_values(). Signed-off-by: Plamena Manolova <plamena.n.manolova@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-11-13 10:57:04 +02:00
Matt Turner	efb1ccadca	util/ralloc: Make sizeof(linear_header) a multiple of 8 Prior to this patch sizeof(linear_header) was 20 bytes in a non-debug build on 32-bit platforms. We do some pointer arithmetic to calculate the next available location with ptr = (linear_size_chunk *)((char *)&latest[1] + latest->offset); in linear_alloc_child(). The &latest[1] adds 20 bytes, so an allocation would only be 4-byte aligned. On 32-bit SPARC a 'sttw' instruction (which stores a consecutive pair of 4-byte registers to memory) requires an 8-byte aligned address. Such an instruction is used to store to an 8-byte integer type, like intmax_t which is used in glcpp's expression_value_t struct. As a result of the 4-byte alignment returned by linear_alloc_child() we would generate a SIGBUS (unaligned exception) on SPARC. According to the GNU libc manual malloc() always returns memory that has at least an alignment of 8-bytes [1]. I think our allocator should do the same. So, simple fix with two parts: (1) Increase SUBALLOC_ALIGNMENT to 8 unconditionally. (2) Mark linear_header with an aligned attribute, which will cause its sizeof to be rounded up to that alignment. (We already do this for ralloc_header) With this done, all Mesa's unit tests now pass on SPARC. [1] https://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html Fixes: `47e1758692` ("glcpp: use the linear allocator for most objects") Bug: https://bugs.gentoo.org/636326 Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-12 20:54:49 -08:00
Matt Turner	7e3748c268	util/ralloc: Switch from DEBUG to NDEBUG The debug code is all asserts, so protect it with the same thing that controls assert. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-12 20:54:49 -08:00
Timothy Arceri	34dffcf913	nir: add support for removing redundant stores to copy prop var For example the following type of thing is seen in TCS from a number of Vulkan and DXVK games: vec1 32 ssa_557 = deref_var &oPatch (shader_out float) vec1 32 ssa_558 = intrinsic load_deref (ssa_557) () vec1 32 ssa_559 = deref_var &oPatch@42 (shader_out float) vec1 32 ssa_560 = intrinsic load_deref (ssa_559) () vec1 32 ssa_561 = deref_var &oPatch@43 (shader_out float) vec1 32 ssa_562 = intrinsic load_deref (ssa_561) () intrinsic store_deref (ssa_557, ssa_558) (1) /* wrmask=x */ intrinsic store_deref (ssa_559, ssa_560) (1) /* wrmask=x */ intrinsic store_deref (ssa_561, ssa_562) (1) /* wrmask=x */ No shader-db changes on i965 (SKL). vkpipeline-db results RADV (VEGA): Totals from affected shaders: SGPRS: 7832 -> 7728 (-1.33 %) VGPRS: 6476 -> 6740 (4.08 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 469572 -> 456596 (-2.76 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 989 -> 960 (-2.93 %) Wait states: 0 -> 0 (0.00 %) The Max Waves and VGPRS changes here are misleading. What is happening is a bunch of TCS outputs are being optimised away as they are now recognised as unused. This results in more varyings being compacted via nir_compact_varyings() which can result in more register pressure when they are not packed in an optimal way. This is an existing problem independent of this patch. I've run some benchmarks and haven't noticed any performance regressions in affected games. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-13 15:19:36 +11:00
Timothy Arceri	3561108de0	anv/i965: make use of nir_link_constant_varyings() shader-db results for SKL: total instructions in shared programs: 13106498 -> 13091573 (-0.11%) instructions in affected programs: 1186244 -> 1171319 (-1.26%) helped: 6186 HURT: 0 total cycles in shared programs: 332062633 -> 331961653 (-0.03%) cycles in affected programs: 8537165 -> 8436185 (-1.18%) helped: 5371 HURT: 862 LOST: 6 GAINED: 14 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-13 14:06:32 +11:00
Eric Anholt	621b0fa892	egl: Improve the debugging of gbm format matching in DRI configs. Previously the debug would be: libEGL debug: No DRI config supports native format 0x20203852 libEGL debug: No DRI config supports native format 0x38385247 but libEGL debug: No DRI config supports native format R8 libEGL debug: No DRI config supports native format GR88 is a lot easier to understand. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-11-12 15:20:23 -08:00
Eric Anholt	6328536ff2	gbm: Introduce a helper function for printing GBM format names. This requires that the caller make a little (stack) allocation to store the string. v2: Use gbm_format_canonicalize (suggested by Daniel) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-11-12 15:20:23 -08:00
Eric Anholt	ee7f848c00	gbm: Move gbm_format_canonicalize() to the core. I want it for the format name debugging code. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-11-12 15:20:23 -08:00
Dylan Baker	4eab98b66e	meson: fix libatomic tests There are two problems: 1) the extra underscore in MISSING_64BIT_ATOMICS 2) we should link with libatomic if the previous test decided we needed it Fixes: `d1992255bb` ("meson: Add build Intel "anv" vulkan driver") Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>	2018-11-12 13:29:00 -08:00
Marek Olšák	32a334777c	mesa: mark GL_SR8_EXT non-renderable on GLES Fixes: dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.sr8_ext Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-11-12 16:19:43 -05:00
Marek Olšák	e0c7114eb3	st/mesa: disable L3 thread pinning This implementation can have massive drawbacks. Cc: 18.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>	2018-11-12 16:18:15 -05:00
Christian Gmeiner	c6aaafa3a1	nir: add lowering for ffloor Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-12 21:57:25 +01:00
Alyssa Rosenzweig	41c8f99137	util: Fix warning in u_cpu_detect on non-x86 regs is only set and used on x86; on other platforms (like ARM), this code causes a trivial warning, solved by moving the regs declaration to the architecture-dependent usage. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2018-11-12 10:28:04 -08:00
Dylan Baker	9c2a95b298	meson: Don't set -Wall meson does this for you with its warn levels, so we don't need to set it ourselves. Fixes: `d1992255bb` ("meson: Add build Intel "anv" vulkan driver") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-12 08:55:55 -08:00
Rob Clark	4a0c2cfdd6	freedreno/drm: fix unused 'entry' warnings Looks like importing libdrm_freedreno into mesa crossed paths with `e27902a261`. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-12 10:45:48 -05:00
Lionel Landwerlin	89785e2d56	i965: add support for sampling from AYUV Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-12 13:22:54 +00:00
Lionel Landwerlin	252ca7b43f	dri: add AYUV format v2: Add an AYUV entry in the android backend (Tapani) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-12 13:22:54 +00:00
Lionel Landwerlin	8a15f06d19	nir/lower_tex: Add AYUV lowering support Byte ordering is : 0: V 1: U 2: Y 3: A v2: Split refactoring of alpha channel (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1) Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v2)	2018-11-12 13:22:54 +00:00
Lionel Landwerlin	0a30c33e83	nir/lower_tex: add alpha channel parameter for yuv lowering We're about to introduce AYUV support which provides its own alpha channel. So give alpha as a parameter and set it to 1 on existing formats. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-12 13:22:54 +00:00
Samuel Pitoiset	97fb1a02fd	radv: make use of num_good_cu_per_sh in si_emit_graphics() too Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-12 09:35:46 +01:00
Samuel Pitoiset	d9d14346c2	radv: clean up setting partial_es_wave for distributed tess on VI Only needed when the pipeline actually uses tessellation. I don't think that changes anything, except improving readability. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-12 09:35:44 +01:00
Samuel Pitoiset	cc4569b733	radv: cleanup and document a Hawaii bug with offchip buffers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-12 09:35:42 +01:00
Hanno Böck	8dc2085baf	glsl/test: Fix use after free in test_optpass. The variable state is free'd and afterwards state->error is used as the return value, resulting in a use after free bug detected by memory safety tools like address sanitizer. Signed-off-by: Hanno Böck <hanno@hboeck.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108636 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-12 07:42:58 +02:00
Timothy Arceri	a068958692	nir: don't pack varyings ints with floats unless flat Fixes: `1c9c42d16b` ("nir: add varying component packing helpers") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-12 15:38:56 +11:00
Timothy Arceri	9dd737bb02	nir: add glsl_type_is_integer() helper Fixes: `1c9c42d16b` ("nir: add varying component packing helpers") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-12 15:38:56 +11:00
Francisco Jerez	552642066f	intel/fs: Prevent emission of IR instructions not aligned to their own execution size. This can occur during payload setup of SIMD-split send message instructions, which can lead to the emission of header setup instructions with a non-zero channel group and fixed SIMD width. Such instructions could end up using undefined channel enable signals except they don't care since they're always marked force_writemask_all. Not known to affect correctness of any workload at this point, but it would be trivial to back-port to stable if something comes up. Reported-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Sagar Ghuge <sagar.ghuge@intel.com>	2018-11-09 19:39:22 -08:00
Timothy Arceri	590fcb50e7	st/mesa: make use of nir_link_constant_varyings() Shader-db results radeonsi (VEGA): Totals from affected shaders: SGPRS: 161464 -> 161368 (-0.06 %) VGPRS: 86904 -> 86292 (-0.70 %) Spilled SGPRs: 296 -> 314 (6.08 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 3618596 -> 3573852 (-1.24 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 26189 -> 26276 (0.33 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-10 11:41:00 +11:00
Timothy Arceri	d40dd05553	nir: add new linking opt nir_link_constant_varyings() This pass moves constant outputs to the consuming shader stage where possible. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-10 11:41:00 +11:00
Andre Heider	414470854d	st/nine: clean up thread shutdown sequence a bit Just break out of the loop instead, it does the same thing. Signed-off-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Axel Davy <davyaxel0@gmail.com>	2018-11-09 22:37:27 +01:00
Andre Heider	123bf9cbe7	st/nine: plug thread related leaks Signed-off-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Axel Davy <davyaxel0@gmail.com>	2018-11-09 22:37:27 +01:00
Andre Heider	10598c9667	st/nine: fix stack corruption due to ABI mismatch This fixes various crashes and hangs when using nine's 'thread_submit' feature. On 64bit, the thread function's data argument would just be NULL. On 32bit, the data argument would be garbage depending on the compiler flags (in my case -march>=core2). Fixes: `f3fa7e3068` ("st/nine: Use WINE thread for threadpool") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Axel Davy <davyaxel0@gmail.com>	2018-11-09 22:37:26 +01:00
Marek Olšák	d2b2364313	radeonsi: stop command submission with PIPE_CONTEXT_LOSE_CONTEXT_ON_RESET only Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-11-09 14:55:04 -05:00
Marek Olšák	4bec5025ac	gallium: add PIPE_CONTEXT_LOSE_CONTEXT_ON_RESET Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-11-09 14:55:04 -05:00
Marek Olšák	9dc776f3f2	radeonsi: don't set the CB clear color registers for 0/1 clear colors on Raven2 and add has_dcc_constant_encode.	2018-11-09 14:55:04 -05:00
Marek Olšák	832ab883e2	radeonsi: use better DCC clear codes Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-11-09 14:55:04 -05:00
Marek Olšák	d059eae269	ac/surface: remove the overallocation workaround for Vega12 not needed anymore (probably since the tile_swizzle fix) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-11-09 14:55:04 -05:00
Lionel Landwerlin	959e2a5aeb	intel/aub_read: remove useless breaks Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-09 18:17:30 +00:00
Erik Faye-Lund	b55af392d9	Revert "mesa: expose NV_conditional_render on GLES" This reverts commit `5213be9fab`.	2018-11-09 17:39:25 +01:00

105732 Commits