KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Marek Olšák	874db83e24	egl/dri2: don't use the template keyword for C++ editors Reviewed-by: Brian Paul <brianp@vmware.com>	2017-09-30 19:03:07 +02:00
Benedikt Schemmer	3797a82e78	radeonsi/uvd: clean up si_video_buffer_create V2: remove code duplication and one unnecessary variable, minor whitespace fix Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-09-30 19:03:07 +02:00
Marek Olšák	e9cf64a67c	radeonsi/uvd: fix planar formats broken since `f70f6baaa3` Tested-by: Benedikt Schemmer <ben@besd.de> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-09-30 19:03:07 +02:00
Roland Scheidegger	740a1618c3	gallium: add new LOD opcode The operation performed is all the same as LODQ, but with the usual differences between dx10 and GL texture opcodes, that is separate resource and sampler indices (plus result swizzling, and setting z/w channels to zero). Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-30 02:58:09 +02:00
Kamil Páral	d5e7ce28b5	drirc: whitelist glthread for Outlast FPS increase 10-20% in starting locations on Core i5-4570 + Radeon R9 270.	2017-09-29 20:53:32 +02:00
Jan Vesely	7148795665	travis: Add clover build using llvm-5.0 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-09-29 12:14:34 -04:00
Jan Vesely	8af90b59f9	travis: Add clover build using llvm-4.0 llvm-4 needs gcc 4.8: http://releases.llvm.org/4.0.1/docs/ReleaseNotes.html#non-comprehensive-list-of-changes-in-this-release Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-09-29 12:14:34 -04:00
Jan Vesely	b9a358a3e6	travis: Add clover build using llvm-3.9 Use r600,radeonsi instead of i915 Update binutils, new linker is required for llvm-3.9: https://www.ubuntuupdates.org/package/core/trusty/universe/updates/binutils-2.26 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-09-29 12:14:34 -04:00
Leo Liu	361d8f82c0	st/va: add dst rect to avoid scale on deint For 1080p video transcode, the height will be scaled to 1088 when deint to progressive buffer. Set dst rect to make sure no scale. Fixes: `3ad8687` "st/va: use new vl_compositor_yuv_deint_full() to deint" Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Andy Furniss <adf.lists@gmail.com>	2017-09-29 10:06:30 -04:00
Nicolai Hähnle	d190bfc1ad	radeonsi: emit DLDEXP and DFRACEXP TGSI opcodes Note: this causes spurious regressions in some current piglit tests, because the tests incorrectly assume that there is no denorm support for doubles. I'm going to send out a fix for those tests as well. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 12:08:07 +02:00
Nicolai Hähnle	061303e4fd	radeonsi: emit LDEXP opcode The LLVM intrinsic has existed for a long time. The current name was established in LLVM 3.9. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 12:08:04 +02:00
Nicolai Hähnle	6de5147d20	st/glsl_to_tgsi: use LDEXP when available Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 12:08:03 +02:00
Nicolai Hähnle	cad959d901	gallium: add LDEXP TGSI instruction and corresponding cap Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 12:08:01 +02:00
Nicolai Hähnle	2b0bfc51de	tgsi: infer that dst[1] of DFRACEXP is an integer Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 12:07:59 +02:00
Nicolai Hähnle	5cf279bf7e	gallivm: add support for TGSI instructions with two outputs Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 12:07:57 +02:00
Nicolai Hähnle	7af64b4d4a	gallivm: add dst register index to lp_build_tgsi_context::emit_store Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 12:07:55 +02:00
Nicolai Hähnle	3c78215a1c	tgsi: clarify the semantics of DFRACEXP The status quo is quite the mess: 1. tgsi_exec will do a per-channel computation, and store the dst[0] result (significand) correctly for each channel. The dst[1] result (exponent) will be written to the first bit set in the writemask. So per-component calculation only works partially. 2. r600 will only do a single computation. It will replicate the exponent but not the significand. 3. The docs pretend that there's per-component calculation, but even get dst[0] and dst[1] confused. 4. Luckily, st_glsl_to_tgsi only ever emits single-component instructions, and kind-of assumes that everything is replicated, generating this for the dvec4 case: DFRACEXP TEMP[0].xy, TEMP[1].x, CONST[0][0].xyxy DFRACEXP TEMP[0].zw, TEMP[1].y, CONST[0][0].zwzw DFRACEXP TEMP[2].xy, TEMP[1].z, CONST[0][1].xyxy DFRACEXP TEMP[2].zw, TEMP[1].w, CONST[0][1].zwzw Settle on the simplest behavior, which is single-component calculation with replication, document it, and adjust tgsi_exec and r600. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 12:07:50 +02:00
Nicolai Hähnle	dbe7fc00d5	tgsi: fix the documentation of DLDEXP Sourcing the exponent for the zw destination pair from Z is consistent with both tgsi_exec and gallivm. In practice, st_glsl_to_tgsi always generates per-channel instructions anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 12:07:46 +02:00
Nicolai Hähnle	d713af711d	tgsi: infer that DLDEXP's second source has an integer type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 12:07:33 +02:00
Nicolai Hähnle	93bf9c114b	glsl/lower_instruction: handle denorms and overflow in ldexp correctly GLSL ES requires both, and while GLSL explicitly doesn't require correct overflow handling, it does appear to require handling input inf/denorms correctly. Fixes dEQP-GLES31.functional.shaders.builtin_functions.precision.ldexp.* Cc: mesa-stable@lists.freedesktop.org Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 12:07:08 +02:00
Nicolai Hähnle	a208cd7ae4	util/queue: fix a race condition in the fence code A tempting alternative fix would be adding a lock/unlock pair in util_queue_fence_is_signalled. However, that wouldn't actually improve anything in the semantics of util_queue_fence_is_signalled, while making that test much more heavy-weight. So this lock/unlock pair in util_queue_fence_destroy for "flushing out" other threads that may still be in util_queue_fence_signal looks like the better fix. v2: rephrase the comment Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Gustaw Smolarczyk <wielkiegie@gmail.com>	2017-09-29 11:52:41 +02:00
Nicolai Hähnle	c49400a03b	r600: cleanup set_occlusion_query_state This fixes a warning caused by the fork (note the change in the function signature): ../../../../../mesa-src/src/gallium/drivers/r600/r600_state_common.c: In function ‘r600_init_common_state_functions’: ../../../../../mesa-src/src/gallium/drivers/r600/r600_state_common.c:2974:36: warning: assignment from incompatible pointer type [-Wincompatible-pointer-types] rctx->b.set_occlusion_query_state = r600_set_occlusion_query_state; Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-29 11:47:37 +02:00
Nicolai Hähnle	5184a1e8ee	r300: add missing case PIPE_SHADER_CAP_INT64_ATOMICS Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-29 11:47:34 +02:00
Nicolai Hähnle	797dd12c7b	radeonsi: fix border color translation for integer textures This fixes the extremely unlikely case that an application uses 0x80000000 or 0x3f800000 as border color for an integer texture and helps in the also, but perhaps slightly less, unlikely case that 1 is used as a border color. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 11:45:08 +02:00
Nicolai Hähnle	6eb9483912	radeonsi: clamp border colors for upgraded depth textures The hardware does this automatically for unorm formats, but we need to do it manually for unorm depth formats that have been upgraded to Z32_FLOAT. Fixes dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth and others. Fixes: `d4d9ec55c5` ("radeonsi: implement TC-compatible HTILE") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 11:45:05 +02:00
Nicolai Hähnle	4c56e07029	radeonsi: clamp depth comparison value only for fixed point formats The hardware usually does this automatically. However, we upgrade depth to Z32_FLOAT to enable TC-compatible HTILE, which means the hardware no longer clamps the comparison value for us. The only way to tell in the shader whether a clamp is required seems to be to communicate an additional bit in the descriptor table. While VI has some unused bits in the resource descriptor, those bits have unfortunately all been used in gfx9. So we use an unused bit in the sampler state instead. Fixes dEQP-GLES3.functional.texture.shadow.2d.linear.equal_depth_component32f and many other tests in dEQP-GLES3.functional.texture.shadow.* Fixes: `d4d9ec55c5` ("radeonsi: implement TC-compatible HTILE") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 11:44:50 +02:00
Nicolai Hähnle	7dfa891f32	radeonsi/gfx9: fix geometry shaders without output vertices Not that those are super common or useful, but hey! Fun corner cases of the API... Fixes dEQP-GLES31.functional.geometry_shading.emit.* Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 11:43:09 +02:00
Nicolai Hähnle	a6ea4c1b93	amd/common: save an instruction in the build_cube_select sequence Avoid a v_cndmask: the absolute value is free due to input modifiers. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 11:43:07 +02:00
Nicolai Hähnle	5be5c1e0fa	amd/common: fix build_cube_select Fix the custom cube coord selection sequence to be identical to the hardware v_cubesc/tc and OpenGL spec. Affects texture sampling with user-provided derivatives. Fixes dEQP-GLES3.functional.shaders.texture_functions.texturegrad.* Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 11:43:04 +02:00
Nicolai Hähnle	8ea7d3a5c8	st/glsl_to_tgsi: fix conditional assignments to packed shader outputs Overriding the default (no-op) swizzle is clearly counter-productive, since the whole point is putting the destination register as one of the source operands so that it remains unmodified when the assignment condition is false. Fragment depth and stencil outputs are a special case due to how their source swizzles are manipulated in translate_src when compiling to TGSI. Fixes dEQP-GLES2.functional.shaders.conditionals.if.*_vertex Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 11:42:59 +02:00
Nicolai Hähnle	2703fa613b	st/glsl_to_tgsi: fix a use-after-free in merge_two_dsts Found by address sanitizer. The loop here tries to be safe, but in doing so, it ends up doing exactly the wrong thing: the safe foreach is for when the loop variable (inst) could be deleted and nothing else. However, this particular loop can delete inst's successor, but not inst itself. Fixes: `8c6a0ebaad` ("st/mesa: add st fp64 support (v7.1)") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 11:42:38 +02:00
Nicolai Hähnle	4ed419328d	radeonsi: move descriptor logs to after corresponding draw/compute packet It has to happen after descriptor uploads since otherwise we'll print out the wrong GPU list / incorrectly claim descriptor corruption. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-29 11:37:06 +02:00
Nicolai Hähnle	9ddc6e16a9	amd/common: remove ac_shader_abi::chip_class Redundant with the recently added ac_llvm_context::chip_class. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-29 11:37:03 +02:00
Nicolai Hähnle	5b86c53b47	gallium/radeon: fix a comment Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-29 11:36:46 +02:00
Iago Toral Quiroga	47e527bd81	i965/fs: force pull model for 64-bit GS inputs Triggering the push model when 64-bit inputs are involved is not easy due to the constraints on the maximum number of registers that we allow for this mode, however, for GS with 'points' primitive type and just a couple of double varyings we can trigger this and it just doesn't work because the implementation is not 64-bit aware at all. For now, let's make sure that we don't attempt this model with 64-bit inputs and we always fall back to pull model for them. Also, don't enable the VUE handles in the thread payload on the fly when we find an input for which we need the pull model, this is not safe: if we need to resort to the pull model we need to account for that when we set up the thread payload so we compute the first non-payload register properly. If we didn't do that correctly and we enable it on-the-fly here then we will end up with VUE handles on the first non-payload register which will probably lead to GPU hangs. Instead, always enable the VUE handles for the pull model so we can safely use them when needed. The GS is going to resort to pull model almost in every situation anyway, so this shouldn't make a significant difference and it makes things easier and safer. v2: Always enable the VUE handles for pull model, this is easier and safer and the GS is going to fall back to pull model almost always anyway (Ken) v3: Only clamp the URB read length if we are over the maximum reserved for push inputs as we were doing in the original code (Ken). v4: No need to clamp the urb read length if invocations > 1 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-09-29 08:18:25 +02:00
Jason Ekstrand	2df897cf1f	i965/link: Use prog->nir instead of creating a temporary This way, when NIR_PASS_V makes a clone of the shader (for testing nir_clone), the new and lowered version gets re-assigned to prog->nir. [jordan.l.justen@intel.com: Tested NIR_TEST_CLONE=1 with valgrind] Tested-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-09-28 16:20:41 -07:00
Jason Ekstrand	006533d5ef	i965/link: Make more use of NIR_PASS [jordan.l.justen@intel.com: Tested NIR_TEST_CLONE=1 with valgrind] Tested-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-09-28 16:20:35 -07:00
Jason Ekstrand	69ed3244d4	i965/link: Make better use of temporary variables The way NIR_PASS works (and, by extension, nir_optimize) is that they may clone the shader and throw the old one away. (We use this for testing nir_clone.) It's better if we just make a temporary variable, use it for everything, and re-assign to the gl_program at the end. [jordan.l.justen@intel.com: Tested NIR_TEST_CLONE=1 with valgrind] Tested-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-09-28 16:19:54 -07:00
Thomas Helland	ce09364d4e	util: fix in-class initialization of static member Fix a compile error with G++ 4.4 string_buffer_test.cpp:43: error: ISO C++ forbids initialization of member ‘str1’ string_buffer_test.cpp:43: error: making ‘str1’ static string_buffer_test.cpp:43: error: invalid in-class initialization of static data member of non-integral type ‘const char*’ Tested-by: Vinson Lee <vlee at freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103002	2017-09-28 23:22:07 +02:00
Eric Engestrom	a35f25068a	REVIEWERS: add myself as a Meson reviewer Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-09-28 18:08:59 +01:00
Eric Engestrom	573a60f177	REVIEWERS: add Meson Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-09-28 18:08:01 +01:00
Dylan Baker	a118322b4e	meson: remove duplicate libisl dependency in anv Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-09-28 10:06:00 -07:00
Brian Paul	4d5497d50d	svga: add missing PIPE_SHADER_CAP_INT64_ATOMICS switch cases Silences a compiler warning. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-09-28 10:41:33 -06:00
Brian Paul	e8d09f80ea	svga: trivial whitespace clean-ups in svga_screen.c	2017-09-28 10:41:33 -06:00
Brian Paul	f33fbe2cf9	gallium/util: use new util_vasprintf() function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-28 10:41:33 -06:00
Brian Paul	864148d69e	util: add util_vasprintf() for Windows (v2) We don't have vasprintf() on Windows so we need to implement it ourselves. v2: compute actual length of output string, per Nicolai Hähnle. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-28 10:41:33 -06:00
Brian Paul	76a4209dc0	st/mesa: don't call close() on Windows Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-28 10:41:33 -06:00
Neha Bhende	652bc4b537	svga: start advertising PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION Since our driver supports arb_provoking_vertex, we can start advertising PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION Fixes ./clipflat & ./arb-provoking-vertex-render piglit tests Tested piglit, glretrace on Hw 11 and Hw 13 Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-09-28 10:41:33 -06:00
Marek Olšák	9d54025cd1	mesa: fix texture updates for ATI_fragment_shader Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-09-28 17:48:33 +02:00
Lucas Stach	15e3657e43	etnaviv: optimize RS transfers Currently we are blitting the whole resource when the RS is used to de-/tile a resource. This can be very inefficient for large resources where the transfer is only changing a small part of the resource (happens a lot with glTexSubImage2D). Optimize this by only blitting the tile aligned subregion of the resource, which the transfer is going to change. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>	2017-09-28 17:41:07 +02:00
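
The last entry above (15e3657e43) describes blitting only the tile-aligned subregion that a transfer touches. A minimal sketch of that idea follows. It is not the etnaviv driver's code: `struct box2d`, `align_down`, `align_up`, and `tile_align_box` are hypothetical names introduced for illustration, and the sketch assumes the resource is padded to whole tiles so the expanded box never leaves it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 2D box in pixels; not a Mesa type. */
struct box2d {
    uint32_t x, y, w, h;
};

/* Round v down to a multiple of a (a must be non-zero). */
static uint32_t align_down(uint32_t v, uint32_t a)
{
    return v - (v % a);
}

/* Round v up to a multiple of a (a must be non-zero; assumes no overflow). */
static uint32_t align_up(uint32_t v, uint32_t a)
{
    return align_down(v + a - 1, a);
}

/*
 * Expand a transfer box outward to tile boundaries so that only whole
 * tiles are covered. Assumption: the resource is padded to a multiple
 * of the tile size, so no clamping against its extent is done here.
 */
static struct box2d tile_align_box(struct box2d b,
                                   uint32_t tile_w, uint32_t tile_h)
{
    assert(tile_w > 0 && tile_h > 0);

    uint32_t x0 = align_down(b.x, tile_w);
    uint32_t y0 = align_down(b.y, tile_h);
    uint32_t x1 = align_up(b.x + b.w, tile_w);
    uint32_t y1 = align_up(b.y + b.h, tile_h);

    struct box2d out = { x0, y0, x1 - x0, y1 - y0 };
    return out;
}
```

With an assumed 4x4 tile, a 10x3 update at (5, 9) expands to a 12x4 region at (4, 8), which is the only part that would need to be blitted instead of the whole resource.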

96339 Commits

All Branches