KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Samuel Pitoiset	ce454d02cc	radv: simplify a condition in radv_src_access_flush() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:17 +02:00
Samuel Pitoiset	1ff25c4e6b	radv: save current state just before resolving with FS Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:15 +02:00
Samuel Pitoiset	c3d5f124c6	radv: don't check if a subpass has resolve attachments twice We already check that in radv_cmd_buffer_resolve_subpass(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:13 +02:00
Samuel Pitoiset	0a8127bbfb	radv: make use of radv_subpass_barrier() when resolving subpasses The goal is to use radv_barrier()/radv_subpass_barrier() as much as possible for further optimizations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:11 +02:00
Rhys Perry	409a60df3b	nv50/ir: move LateAlgebraicOpt back to right after ConstantFolding total instructions in shared programs : 5480808 -> 5472107 (-0.16%) total gprs used in shared programs : 647530 -> 647532 (0.00%) total shared used in shared programs : 389120 -> 389120 (0.00%) total local used in shared programs : 21064 -> 21064 (0.00%) total bytes used in shared programs : 58551648 -> 58459352 (-0.16%) local shared gpr inst bytes helped 0 0 73 2609 2609 hurt 0 0 71 34 34	2018-07-19 23:34:58 +02:00
Rhys Perry	2afef231db	nv50/ir: handle SHLADD in IndirectPropagation An alternative solution to the problem fixed in `0bd83d0` ("nv50/ir: move LateAlgebraicOpt to the very end"). total instructions in shared programs : 5481195 -> 5480808 (-0.01%) total gprs used in shared programs : 647535 -> 647530 (-0.00%) total shared used in shared programs : 389120 -> 389120 (0.00%) total local used in shared programs : 21064 -> 21064 (0.00%) total bytes used in shared programs : 58555784 -> 58551648 (-0.01%) local shared gpr inst bytes helped 0 0 2 34 34 hurt 0 0 0 0 0	2018-07-19 23:34:58 +02:00
Rhys Perry	3b6edd0b59	gm107/ir: use CS2R for SV_CLOCK This instruction seems to be faster than S2R and requires no barrier, though the range of special registers it can read from is limited. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-07-19 23:34:58 +02:00
Lionel Landwerlin	94cf964586	intel: tools: dump: remove mentions of intel_aubdump Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-19 20:12:53 +01:00
Lionel Landwerlin	0f9d8b754f	intel: tools: aubwrite: fix invalid frees on finish Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-07-19 20:11:56 +01:00
Samuel Pitoiset	3d41757788	ac/nir: add a workaround for bitfield_extract when count is 0 LLVM 7 returns incorrect results when count is 0, something has been broken since LLVM 6. Of course, the best solution is to fix LLVM but this workaround works as expected for now. Original workaround by Philippe Rebohle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107276 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-19 20:41:10 +02:00
Nanley Chery	e2e32b6afd	intel/isl/gen4: Make depth/stencil buffers Y-Tiled Rendering to a linear depth buffer on gen4 is causing a GPU hang in the CI system. Until a better explanation is found, assume that errata is applicable to all gen4 platforms. Fixes `fbe01625f6` ("i965/miptree: Share tiling_flags in miptree_create"). Reported-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107248 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-19 11:05:07 -07:00
Nanley Chery	44ab26d0c9	i965/misc: Use depth/stencil surf's tiling on gen4-5 Make the 3D engine aware of the depth/stencil surface's tiling before doing any render operations. Fixes `fbe01625f6` ("i965/miptree: Share tiling_flags in miptree_create"). Reported-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107248 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-19 11:05:07 -07:00
Caio Marcelo de Oliveira Filho	507a8037a7	glsl: don't let an 'if' then-branch kill copy propagation (elements) for else-branch When handling 'if' in copy propagation elements, if a certain variable was killed when processing the first branch of the 'if', then the second would not get any propagation from previous nodes. x = y; if (...) { z = x; // This would turn into z = y. x = 22; // x gets killed. } else { w = x; // This would NOT turn into w = y. } With the change, we let copy propagation happen independently in the two branches and only then apply the killed values for the subsequent code. One example in shader-db part of shaders/unity/8.shader_test: (assign (xyz) (var_ref col_1) (var_ref tmpvar_8) ) (if (expression bool < (swiz y (var_ref xlv_TEXCOORD0) )(constant float (0.000000)) ) ( (assign (xyz) (var_ref col_1) (expression vec3 + (var_ref tmpvar_8) ... ) ... ) ) ( (assign (xyz) (var_ref col_1) (expression vec3 lrp (var_ref col_1) ... ) ... ) )) The variable col_1 was replaced by tmpvar_8 in the then-part but not in the else-part. NIR deals well with copy propagation, so it already covered for the missing ones that this patch fixes. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-19 10:00:59 -07:00
Caio Marcelo de Oliveira Filho	e4f32dec23	glsl: change opt_copy_propagation_elements data structures Instead of keeping multiple acp_entries in lists, have a single acp_entry per variable. With this, the implementation of clone is more convenient and now fully implemented. In the previous code, clone was only partial. Before this patch, each acp_entry struct represented a write to a variable including LHS, RHS and a mask of what channels were written to. There were two main hash tables, the first (lhs_ht) stored a list of acp_entries per LHS variable, with the values available to copy for that variable; the second (rhs_ht) was a "reverse index" for the first hash table, so stored acp_entries per RHS variable. After the patch, there's a single acp_entry struct per LHS variable, it contains an array with references to the RHS variables per channel. There now is a single hash table, from LHS variable to the corresponding entry. The "reverse index" is stored in the ACP entry, in the form of a set of variables that copy from the LHS. To make the clone operation cheaper, the ACP entries are created on demand. This should not change the result of copy propagation, a later patch will take advantage of the clone operation. v2: Add note clarifying how the hashtable is destroyed. v3: (all from Eric Anholt) Add remove_unused_var_from_dsts() function for reuse. Remove from dsts as we go instead of clearing at the end. Add clarifying comment to erase(). Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-19 10:00:30 -07:00
Caio Marcelo de Oliveira Filho	7b0d395250	glsl: separate copy propagation state Separate the higher-level logic of visiting instructions and choosing when to store and use new copy data from the data structure holding the copy propagation information. This will also make later patches that change the structure easier. v2: Remove empty destructor and clarify how hash tables are destroyed. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-19 10:00:30 -07:00
Lionel Landwerlin	49e86f09fe	intel: tools: dump: trace memory writes Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-19 16:48:42 +01:00
Lionel Landwerlin	5ba3e5c358	intel: tools: dump: remove command execution feature In commit `86cb05a6d3` ("intel: aubinator: remove standard input processing option") we removed the ability to process aub as an input stream because we now rely on mmapping the aub file to back the buffers aubinator is parsing. intel_aubdump was the provider of the standard input data and since we've copied/reworked intel_aubdump into intel_dump_gpu within Mesa, we don't need that code anymore. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-19 10:11:54 +01:00
Danylo Piliaiev	494a206229	radv: Fix incorrect assumption about ternary operator precedence Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 10:04:27 +02:00
Marek Olšák	dcbcc83003	mesa: fix make check for AMD_performance_monitor	2018-07-19 01:17:01 -04:00
Marek Olšák	f097f0c55c	mesa: remove dead code from api_loopback This should only contain functions not set in vtxfmt.c. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-19 01:10:32 -04:00
Marek Olšák	987c2ece03	mesa: expose ARB_indirect_parameters in the compatibility profile Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1) v2: fix dispatch_sanity	2018-07-19 01:10:18 -04:00
Marek Olšák	d40188800e	vbo: fix ARB_multi_draw_indirect for the compatibility profile Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-19 00:58:51 -04:00
Marek Olšák	6c4652ea8a	mesa: expose ARB_shader_viewport_layer_array in the compatibility profile no changes needed for GL compat Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-19 00:58:51 -04:00
Marek Olšák	da528898bc	mesa: expose ARB_ES3_1_compatibility in the compatibility profile no changes needed for GL compat Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-19 00:58:51 -04:00
Marek Olšák	565dacc3d6	winsys/amdgpu: remove RADEON_SURF_FMASK leftover RADEON_SURF_FMASK is never set.	2018-07-19 00:58:51 -04:00
Marek Olšák	9b82d128c9	ac: run LLVM optimization passes only on the final function after inlining	2018-07-19 00:58:49 -04:00
Bas Nieuwenhuizen	17b5a59b4e	radv: Enable binning and dfsm by default on Raven. Seems like it increases performance by 2-3% for some demos and games. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:38:21 +02:00
Bas Nieuwenhuizen	978570769d	radv: Always set disable zpass increment bit when possible. When no occlusion queries are active even if out of order is enabled. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:38:10 +02:00
Bas Nieuwenhuizen	82664af6cf	radv: Select correct entries for binning. Overshot it by one every time. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:38:01 +02:00
Bas Nieuwenhuizen	760211b77c	radv: Fix number of samples used for binning. Used the wrong register ... CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:37:54 +02:00
Bas Nieuwenhuizen	c0144e915a	radv: Disable disabled color buffers in rbplus opts. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:37:47 +02:00
Marek Olšák	fb049742d6	r600: silence the signed overflow warning like radeonsi r600_gpu_load.c: In function ‘r600_gpu_load_thread’: ../../../../src/util/os_time.h:82:7: warning: assuming signed overflow does not occur when assuming that (X + c) >= X is always true [-Wstrict-overflow] if (start <= end)	2018-07-18 17:48:48 -04:00
Andres Rodriguez	d3d9513556	radv: fix wmaybe-uninitialized in radv_meta_fast_clear.c Assignment and usage of this variable both happen inside if(rad_image_has_dcc()) {} blocks. It seems gcc plays it safe and assumes that both function calls could have different return values. But in this case we should be safe. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-18 15:32:51 -04:00
Sonny Jiang	4bf7234061	radeonsi: emit_guardband packets optimization Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-18 15:04:27 -04:00
Sonny Jiang	80ade05b8d	radeonsi: Save CLEAR_STATE initial values for optimization Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-18 15:04:27 -04:00
Jan Vesely	9baacf3fa7	radeonsi: Refuse to accept code with unhandled relocations They might lead to unrecoverable GPU hang. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: mesa-stable@lists.freedesktop.org	2018-07-18 13:56:56 -04:00
Eric Anholt	70534dbe29	Allow AMD_perfmon on GLES contexts v2: whitespace alignment fix Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:39:21 -07:00
Eric Anholt	4ba478d7cd	egl: Use the canonical drm-uapi fourcc header to avoid local defines. We should only use a #define locally once it's been upstreamed, and at that point you should just update our drm_fourcc.h. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-18 10:37:54 -07:00
Eric Anholt	2c6279d58b	v3d: Fix tiling modifier support to use the new UIF define. You can't use T tiled buffers on V3D 3.x and newer, it's been replaced with a newer layout shared with other hardware blocks.	2018-07-18 10:37:49 -07:00
Eric Anholt	6c0482e176	drm-uapi: Update drm_fourcc.h for new format modifiers. This brings in the Broadcom VC4 SAND and V3D 3.x+ UIF modifiers, from drm-next commit 4da1d4c751c9b1b713c13043bad7c4d27cd1418c.	2018-07-18 10:37:49 -07:00
Marek Olšák	201ebf51d1	st/mesa: notify u_vbuf/driver that draw index bounds are unknown for indirect Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-18 13:33:30 -04:00
Timothy Pearson	e1621fda84	radeonsi: Use signed char for color_interp_vgpr_index color_interp_vgpr_index was declared as a generic char value. Because signed values are used in this variable, the result was not safe across architectures and crashed on ppc64[el] and arm. Declare color_interp_vgpr_index as a signed type. Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-18 13:31:29 -04:00
Jason Ekstrand	aaa6fac8f6	intel/blorp: Take an explicit filter parameter in blorp_blit This lets us move the glBlitFramebuffer nonsense into the GL driver and make the usage of BLORP much more explicit and obvious as to what it's doing. Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-07-18 09:47:28 -07:00
Jason Ekstrand	9fbe2a2007	intel/blorp: Add a blorp_filter enum for use in blorp_blit At the moment, this is entirely internal but we'll expose it to clients of the BLORP API in the next commit. Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-07-18 09:47:28 -07:00
Caio Marcelo de Oliveira Filho	ea556471a1	intel/tools: add missing include for stdarg.h Fixes build in GCC 8.1.1: FAILED: src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o gcc -Isrc/intel/tools/src@intel@tools@@intel_dump_gpu@sha -Isrc/intel/tools -I../../src/intel/tools -Isrc/../include -I../../src/../include -Isrc -I../../src -Isrc/mapi -I../../src/mapi -Isrc/mesa -I../../src/mesa -I../../src/gallium/include -I../../src/gallium/auxiliary -Isrc/intel -I../../src/intel -I../../include/drm-uapi -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -std=c99 -O2 -g -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS '-DVERSION="18.2.0-devel"' -DPACKAGE_VERSION=VERSION '-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"' -DGLX_USE_TLS -DENABLE_ST_OMX_BELLAGIO=0 -DENABLE_ST_OMX_TIZONIA=0 -DHAVE_X11_PLATFORM -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING -DGLX_USE_DRM -DHAVE_DRM_PLATFORM -DHAVE_SURFACELESS_PLATFORM -DENABLE_SHADER_CACHE -DHAVE___BUILTIN_BSWAP32 -DHAVE___BUILTIN_BSWAP64 -DHAVE___BUILTIN_CLZ -DHAVE___BUILTIN_CLZLL -DHAVE___BUILTIN_CTZ -DHAVE___BUILTIN_EXPECT -DHAVE___BUILTIN_FFS -DHAVE___BUILTIN_FFSLL -DHAVE___BUILTIN_POPCOUNT -DHAVE___BUILTIN_POPCOUNTLL -DHAVE___BUILTIN_UNREACHABLE -DHAVE_FUNC_ATTRIBUTE_CONST -DHAVE_FUNC_ATTRIBUTE_FLATTEN -DHAVE_FUNC_ATTRIBUTE_MALLOC -DHAVE_FUNC_ATTRIBUTE_PURE -DHAVE_FUNC_ATTRIBUTE_UNUSED -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT -DHAVE_FUNC_ATTRIBUTE_WEAK -DHAVE_FUNC_ATTRIBUTE_FORMAT -DHAVE_FUNC_ATTRIBUTE_PACKED -DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL -DHAVE_FUNC_ATTRIBUTE_VISIBILITY -DHAVE_FUNC_ATTRIBUTE_ALIAS -DHAVE_FUNC_ATTRIBUTE_NORETURN -D_GNU_SOURCE -DUSE_SSE41 -DUSE_GCC_ATOMIC_BUILTINS -DUSE_X86_64_ASM -DMAJOR_IN_SYSMACROS -DHAVE_SYS_SYSCTL_H -DHAVE_LINUX_FUTEX_H -DHAVE_ENDIAN_H -DHAVE_STRTOF -DHAVE_MKOSTEMP -DHAVE_POSIX_MEMALIGN -DHAVE_TIMESPEC_GET -DHAVE_MEMFD_CREATE -DHAVE_STRTOD_L -DHAVE_DLADDR -DHAVE_DL_ITERATE_PHDR -DHAVE_ZLIB 
-DHAVE_PTHREAD -DHAVE_LIBDRM -DHAVE_LLVM=0x0600 -DMESA_LLVM_VERSION_PATCH=1 -DHAVE_VALGRIND -DHAVE_LIBUNWIND -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED -DHAVE_DRI3 -DHAVE_DRI3_MODIFIERS -Wall -Werror=implicit-function-declaration -Werror=missing-prototypes -fno-math-errno -fno-trapping-math -Wno-missing-field-initializers -fPIC -fvisibility=hidden -Wno-override-init -MD -MQ 'src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o' -MF 'src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o.d' -o 'src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o' -c ../../src/intel/tools/aub_write.c ../../src/intel/tools/aub_write.c: In function ‘fail_if’: ../../src/intel/tools/aub_write.c:243:4: error: implicit declaration of function ‘va_start’; did you mean ‘assert’? [-Werror=implicit-function-declaration] va_start(args, format); ^~~~~~~~ assert ../../src/intel/tools/aub_write.c:245:4: error: implicit declaration of function ‘va_end’; did you mean ‘rand’? [-Werror=implicit-function-declaration] va_end(args); ^~~~~~ rand cc1: some warnings being treated as errors Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-18 09:19:22 -07:00
Jason Ekstrand	2be30a1a39	intel/tools: Rename error2aub to intel_error2aub Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-18 09:03:05 -07:00
Danylo Piliaiev	d219521379	i965: Sweep NIR after linking phase to free held memory After optimization passes and many transformations, most of the memory NIR holds is garbage, which was being freed only after shader deletion. Freeing it at the end of linking will save memory, which is useful when a lot of complex shaders are being compiled. The common case for this issue is a 32-bit game running under Wine. The cost of the optimization is around 3-5% of compilation speed with complex shaders. V2: by Jason Ekstrand - Move nir_sweep up, right after the last change of NIR Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103274 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org	2018-07-18 09:00:18 -07:00
Marek Olšák	51d6b163da	winsys/amdgpu: fix VDPAU interop by having one amdgpu_winsys_bo per BO (v2) Dependencies between rings are inserted correctly if a buffer is represented by only one unique amdgpu_winsys_bo instance. Use a hash table keyed by amdgpu_bo_handle to have exactly one amdgpu_winsys_bo per amdgpu_bo_handle. v2: return offset and stride properly Tested-by: Leo Liu <leo.liu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2018-07-18 11:56:28 -04:00
Marek Olšák	e06b8ec106	winsys/amdgpu: use a better hash_pointer function Tested-by: Leo Liu <leo.liu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2018-07-18 11:56:28 -04:00
Marek Olšák	53684e9163	winsys/amdgpu: clean up error handling in amdgpu_bo_from_handle Tested-by: Leo Liu <leo.liu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2018-07-18 11:56:28 -04:00
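
The 'if' handling described in commit 507a8037a7 above can be illustrated with a small sketch. This is not Mesa's implementation: the tuple-based statement format and the names `kill` and `propagate` are hypothetical, chosen only to show the idea of running copy propagation independently in each branch and applying the killed variables afterwards.

```python
# Toy model, NOT Mesa's GLSL IR. Hypothetical statement forms:
#   ("copy", dst, src)            dst = src, where src is a variable name
#   ("const", dst, value)         dst = constant
#   ("if", then_body, else_body)  condition omitted for brevity


def kill(acp, var):
    """Drop every available copy that writes or reads `var`."""
    for dst in [d for d, s in acp.items() if d == var or s == var]:
        del acp[dst]


def propagate(stmts, acp):
    """Rewrite copy sources using `acp` (dst -> src).

    Returns the rewritten statements and the set of variables written.
    """
    out, killed = [], set()
    for st in stmts:
        if st[0] == "copy":
            _, dst, src = st
            src = acp.get(src, src)  # use an available copy if there is one
            kill(acp, dst)
            killed.add(dst)
            if dst != src:
                acp[dst] = src
            out.append(("copy", dst, src))
        elif st[0] == "const":
            kill(acp, st[1])
            killed.add(st[1])
            out.append(st)
        else:  # "if"
            _, then_body, else_body = st
            # Each branch works on its own clone of the incoming state,
            # so a kill in the then-branch cannot affect the else-branch.
            then_out, then_killed = propagate(then_body, dict(acp))
            else_out, else_killed = propagate(else_body, dict(acp))
            # Only after both branches are done are the kills applied
            # to the state used by the code following the 'if'.
            for var in then_killed | else_killed:
                kill(acp, var)
            killed |= then_killed | else_killed
            out.append(("if", then_out, else_out))
    return out, killed


# The example from the commit message: x = y; if (...) {z = x; x = 22} else {w = x}
program = [
    ("copy", "x", "y"),
    ("if",
     [("copy", "z", "x"), ("const", "x", 22)],
     [("copy", "w", "x")]),
]
result, _ = propagate(program, {})
print(result)
```

In this sketch both `z = x` and `w = x` are rewritten to read from `y`, and `x` is no longer treated as a copy of `y` after the 'if'. Cloning the state per branch is what makes the else-branch independent of kills in the then-branch.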

103642 Commits, All Branches