KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Francisco Jerez	e4124f9bc1	glapi: Update XML for last revision of EXT_shader_framebuffer_fetch. Desktop GL is now supported, and there is an additional entry-point for EXT_shader_framebuffer_fetch_non_coherent. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	6a8ec78c2a	mesa: Rename MESA_shader_framebuffer_fetch gl_extensions bits to EXT. The changes I had originally planned for the MESA_shader_framebuffer_fetch extension have been merged into the EXT spec, there's no point in keeping MESA_shader_framebuffer_fetch extension enables. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	d0bef79f12	mesa: Rename dd_function_table::BlendBarrier to match latest EXT spec. This GL entry point was renamed to glFramebufferFetchBarrier() in the EXT extension on request from Khronos members. Update the Mesa codebase to match the latest spec. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	27c829da28	i965: Fix KHR_blend_equation_advanced with some render targets. This reverts two bogus and seemingly useless changes from the commits referenced below, which broke KHR_blend_equation_advanced (and EXT_shader_framebuffer_fetch_non_coherent which wasn't exposed yet) for any kind of render target surface that would cause the get_isl_surf() call in brw_emit_surface_state() to do anything useful (notice how the result of get_isl_surf() is completely ignored by the caller right now), as was the case while using those extensions with 1D array or 3D framebuffers in particular. Fixes: `f5859b45b1` "i965/miptree: Switch remaining surfaces to isl" Fixes: `bf24c3539e` "i965/miptree: Clean-up unused" Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Marek Olšák	fb410ae392	radeonsi: remove si_descriptors parameter from emit_shader_pointer functions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	63ea0a00a3	radeonsi: preload the tess offchip ring in TES so that it's not done multiple times in branches Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	2d03c4cac8	radeonsi: move tess ring address into TCS_OUT_LAYOUT, removes 2 TCS user SGPRs TCS_OUT_LAYOUT has 13 unused bits. That's enough for a 32-bit address aligned to 512KB. Hey, it's a 13-bit pointer! Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	190e064e63	radeonsi: move 2nd-shader descriptor pointers into s[0:1] If 32-bit pointers are supported, both pointers can be moved into s[0:1] and then ESGS has exactly the same user data SGPR declarations as VS. If 32-bit pointers are not supported, only one pointer can be moved into s[0:1]. In that case, the 2nd pointer is moved before TCS constants, so that the location is the same in HS and GS. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	1d1df76d2b	radeonsi: change si_descriptors::shader_userdata_offset type to short We will want to use SH registers outside of user data SGPRs, like the GFX9 special SGPRs. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Marek Olšák	fca7dee9c6	radeonsi: put both tessellation rings into 1 buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Marek Olšák	d2963d8b5f	radeonsi: move tessellation ring info into si_screen Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Marek Olšák	41895c26d3	radeonsi: move TCS_OUT_LAYOUT.PatchVerticesIn to lower bits For a later patch. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Karol Herbst	f0b39779a0	nvir: don't optimize mad with subops to shladd Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-24 18:48:13 +01:00
James Legg	afd8fd0656	radv: Really use correct HTILE expanded words. When transitioning to an htile compressed depth format, Set the full depth range, so later rasterization can pass HiZ. Previously, for depth only formats, the depth range was set to 0 to 0. This caused unwanted HiZ rejections with a VK_FORMAT_D16_UNORM depth buffer (VK_FORMAT_D32_SFLOAT was not affected somehow). These values are derived from PAL [0], since I can't find the specification describing the htile values. [0] `5cba4ecbda/src/core/hw/gfxip/gfx9/gfx9MaskRam.cpp (L1500)` CC: Dave Airlie <airlied@redhat.com> CC: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> CC: mesa-stable@lists.freedesktop.org Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Fixes: `5158603182` "radv: Use correct HTILE expanded words."	2018-02-24 02:16:22 +01:00
Mauro Rossi	8eed942136	radv/extensions: fix c_vk_version for patch == None Similar to `cb0d1ba156` ("anv/extensions: Fix VkVersion::c_vk_version for patch == None") fixes the following building errors: out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/radv_entrypoints.c:1161:48: error: use of undeclared identifier 'None'; did you mean 'long'? return instance && VK_MAKE_VERSION(1, 0, None) <= core_version; ^~~~ long external/mesa/include/vulkan/vulkan.h:34:43: note: expanded from macro 'VK_MAKE_VERSION' (((major) << 22) | ((minor) << 12) | (patch)) ^ ... fatal error: too many errors emitted, stopping now [-ferror-limit=] 20 errors generated. Fixes: `e72ad05c1d` ("radv: Return NULL for entrypoints when not supported.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-24 00:31:31 +01:00
Eric Anholt	b4b4ada761	broadcom/vc5: Fix layout of 3D textures. Cube maps are entire miptrees repeated, while 3D textures have each level have all of its layers next to each other. Fixes tex3d and tex-miplevel-selection GL2:texture() 3D.	2018-02-23 15:07:26 -08:00
Eric Anholt	97dc077303	broadcom/vc5: Ignore unused usage flags in is_format_supported. Like for vc4, the new DISPLAY_TARGET flag ended up causing no formats to match. Just drop the whole retval == usage thing and return early when we hit a known unsupported case. Fixes: `f7604d8af5` ("st/dri: only expose config formats that are display targets")	2018-02-23 15:07:18 -08:00
Eric Anholt	880573e737	gbm: Fix the alpha masks in the GBM format table. Once GBM started looking at the values of the alpha masks, ARGB/ABGR wouldn't match any more because we had both A and R in the low bits. Fixes: `2ed344645d` ("gbm/dri: Add RGBA masks to GBM format table") Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-23 15:03:36 -08:00
Mathias Fröhlich	b54bf0e3e3	mesa: Update vertex processing mode on _mesa_UseProgram. The change is a bug fix for 92d76a169: mesa: Provide an alternative to get_vp_mode() that actually got exposed through 4562a7b0: vbo: Make use of _DrawVAO from the dlist code. Fixes: KHR-GLES31.core.shader_image_load_store.advanced-sso-simple Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105229 Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 21:08:35 +01:00
Marek Olšák	d169438d8e	mesa: rename has_core_gs -> has_gs in get_programiv This is also true for GLES. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:23 +01:00
Marek Olšák	1881f41b6c	mesa: replace some API_OPENGL_CORE checks with _mesa_is_desktop_gl This is more accurate with respect to the compatibility profile. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:22 +01:00
Marek Olšák	1defc973db	mesa: add some of missing compatibility support for ARB_bindless_texture The extension is exposed in the compatibility profile. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:20 +01:00
Marek Olšák	b8e2e9e1a1	mesa: expose ARB_enhanced_layouts in the compatibility profile GLSL 1.40 is required. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:19 +01:00
Marek Olšák	a0c8b49284	mesa: enable OpenGL 3.1 with ARB_compatibility Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:17 +01:00
Marek Olšák	605a7f6db5	mesa: implement ARB_compatibility Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:15 +01:00
Emil Velikov	14a2c87c41	swr: remove dead LLVM code paths LLVM requirement was bumped to 4.0.0 with earlier commit. Hence any code tailored for older versions is now unreachable. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-By: George Kyriazis <george.kyriazis@intel.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-02-23 19:17:31 +00:00
Eric Anholt	5980a41c0f	broadcom/vc4: Remove the retval==usage check in is_format_supported(). This got us into trouble recently, so just remove it entirely.	2018-02-23 08:42:13 -08:00
Eric Anholt	bc3d16e633	broadcom/vc4: Add support for YUV textures using unaccelerated blits. Previously we would assertion fail about having no hardware format. This is enough to get kmscube -M nv12-2img working.	2018-02-23 08:42:13 -08:00
Eric Anholt	c824a045ea	broadcom/vc4: Fix double-unrefcounting of prsc->next with shadows. When we set up the shadow resource we were copying the original resource as the template, including its prsc->next field. When we shadowed the first YUV plane's resource for linear-to-tiled conversion, we would end up unbalancing the refcount on the shadow resource's destruction.	2018-02-23 08:42:13 -08:00
Eric Anholt	6deb158ec1	broadcom/vc4: Add pipe_reference debugging for vc4_bos. Trying to track down the YUV EGLImage use-after-free, it helps to see what the mystery objects are that are being refcounted.	2018-02-23 08:42:13 -08:00
Eric Anholt	34ea1aca92	broadcom/vc4: Remove dead vc4_bo_set_reference(). It would be broken if NULL was passed to it anyway, since it wouldn't participate in screen->bo_handles management.	2018-02-23 08:42:13 -08:00
Eric Anholt	a49738290c	broadcom/vc4: Use pipe_resource_reference in sampler views. Improves u_debug_refcount output.	2018-02-23 08:42:13 -08:00
Eric Anholt	0c1dd9dee0	broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride. This is part of supporting YUV textures -- MMAL will be handing us a single GEM BO with the planes at offsets within it, and MMAL-decided stride.	2018-02-23 08:42:13 -08:00
Eric Anholt	978b884afc	broadcom/vc4: Ignore PIPE_BIND_DISPLAY_TARGET in is_format_supported(). We were failing the retval == usage check at the end. Fixes: `f7604d8af5` ("st/dri: only expose config formats that are display targets")	2018-02-23 08:42:13 -08:00
Lucas Stach	8df11f3fad	etnaviv: fix in-place resolve tile count TS tiles map to a fixed amount of bytes in the color/depth surface, so the blocksize of the format needs to be taken into account when calculating the number of tiles to fill. The simplest fix is to just use the layer stride, which is the surface size in bytes. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2018-02-23 15:34:39 +01:00
Lucas Stach	add23b59c9	etnaviv: switch magic single buffer state to "3" Some of the 16bit formats misrender with missing tiles with the current "2" state. As all the previously working formats also work with the "3" state, just always use that one. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2018-02-23 15:34:39 +01:00
Lucas Stach	8befc11186	etnaviv: add debug switch to disable single buffer feature This feature has caused some trouble already. Add a debug switch to allow users to quickly check if a specific issue is caused by this feature. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-02-23 15:34:31 +01:00
Dylan Baker	5c460337fd	meson: Fix GL and EGL pkg-config files with glvnd Currently meson will generate a pkg-config that links to EGL_mesa (or GLX_mesa), but this isn't correct, it should always link to EGL or GL. Probably the "right" solution is to have glvnd itself provide the pkg config files for GL and EGL, but that also means that glvnd needs to provide many of the header files, which makes it a more involved job. Fixes: `a47c525f32` ("meson: build glx") Fixes: `035ec7a2bb` ("meson: Add support for EGL glvnd") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-23 13:30:28 +00:00
Frank Binns	6160bf97db	egl/dri2: fix segfault when display initialisation fails dri2_display_destroy() is called when platform specific display initialisation fails. However, this would typically lead to a segfault due to the dri2_egl_display vtbl not having been set up. Fixes: `2db9548296` ("loader_dri3/glx/egl: Optionally use a blit context for blitting operations") Signed-off-by: Frank Binns <francisbinns@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-23 11:13:22 +00:00
Juan A. Suarez Romero	e1623b303c	mesa: add missing RGB9_E5 format in _mesa_base_fbo_format RGB9_E5 should be accepted by RenderbufferStorage if the EXT_texture_shared_exponent is exposed. It is left to the implementations to return GL_FRAMEBUFFER_UNSUPPORTED_EXT when checking the framebuffer completeness if they do not support rendering in this format. Discussed in: https://github.com/KhronosGroup/OpenGL-API/issues/32 This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5 v2: Added more info to the commit message (Antia) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Antia Puentes <apuentes@igalia.com>	2018-02-23 10:12:06 +01:00
Christian Gmeiner	e72062b66d	etnaviv: npot_tex_any_wrap needs one bit only Reduces size of struct etna_specs from 100 to 94 bytes. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2018-02-23 09:38:16 +01:00
Mathias Fröhlich	4562a7b0e8	vbo: Make use of _DrawVAO from the dlist code. Finally use an internal VAO to execute display list draws. Avoid duplicate state validation for display list draws. Remove client arrays previously used exclusively for display lists. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:34:14 +01:00
Mathias Fröhlich	2f35140846	mesa: Use atomics for shared VAO reference counts. VAOs will be used in the next change as immutable object across multiple contexts. Only reference counting may write concurrently on the VAO. So, make the reference count thread safe for those and only those VAO objects. v3: Use bool/true/false for gl_vertex_array_object::SharedAndImmutable. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:34:11 +01:00
Mathias Fröhlich	8a3a4b6fae	vbo: Make use of _DrawVAO from immediate mode draw Finally use an internal VAO to execute immediate mode draws. Avoid duplicate state validation for immediate mode draws. Remove client arrays previously used exclusively for immediate mode draws. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:34:07 +01:00
Mathias Fröhlich	c757e416ce	vbo: Implement tool functions for vbo specific VAO setup. Correct VBO_MATERIAL_SHIFT value. The functions will be used next in this series. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:34:04 +01:00
Mathias Fröhlich	ef8028017d	mesa: Add flush_vertices to _mesa_bind_vertex_buffer. We will need the flush_vertices argument later in this series. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:34:01 +01:00
Mathias Fröhlich	354b76ad20	mesa: Make _mesa_vertex_attrib_binding public. Change vertex_attrib_binding() to _mesa_vertex_attrib_binding(), add a flush_vertices argument, and make it publicly available. The function will be needed later in the series. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:58 +01:00
Mathias Fröhlich	4331969ac4	mesa: Add flush_vertices to _mesa_{enable,disable}_vertex_array_attrib. We will need the flush_vertices argument later in this series. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:55 +01:00
Mathias Fröhlich	195bb990ed	vbo: Use _DrawVAO for array type draw commands. Switch over to use the _DrawVAO for all the array type draws. The _DrawVAO needs to be set before we enter _mesa_update_state, so move setting the draw method in front of the first call to _mesa_update_state which is in turn called from the validateDraw* calls. Using the gl_vertex_array_object::_Enabled bitmask, gl_vertex_program_state::_VPMode and gl_vertex_array_object::_AttributeMapMode we can already set varying_vp_inputs before we call _mesa_update_state the first time. Thus remove duplicate state validation. v2: Update comments. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:50 +01:00
Mathias Fröhlich	6002ab564b	vbo: Implement method to track the inputs array. Provided the _DrawVAO and the derived state that is maintained if we have the _DrawVAO set, implement a method to incrementally update the array of gl_vertex_array input pointers. v2: Add some more comments. Rename _vbo_array_init to _vbo_init_inputs. Rename vbo_context::arrays to vbo_context::draw_arrays. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:46 +01:00
Mathias Fröhlich	08c7474189	mesa: Introduce a yet unused _DrawVAO. During the patch series this VAO gets populated with either the currently bound VAO or an internal VAO that will be used for immediate mode and dlist rendering. v2: More comments about the _DrawVAO, filter and enabled mask. Rename _DrawVAOEnabled to _DrawVAOEnabledAttribs. v3: Fix and move comment. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:43 +01:00
Mathias Fröhlich	ce3d2421a0	vbo: Remove get_vp_mode() and enum vp_mode. Is now unused. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:40 +01:00
Mathias Fröhlich	60c3ca1b23	vbo: Use _VPMode instead of get_vp_mode(). At those places where we used get_vp_mode() use gl_vertex_program_state::_VPMode instead. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:36 +01:00
Mathias Fröhlich	92d76a1691	mesa: Provide an alternative to get_vp_mode() To get information equivalent to get_vp_mode(), track the vertex processing mode in a per context variable at gl_vertex_program_state::_VPMode. This aims to replace get_vp_mode() as seen in the vbo module. But instead of the get_vp_mode() implementation which only gives correct answers past calling _mesa_update_state() this context variable is immediately tracked when the vertex processing state is modified. The correctness of this value is asserted on state validation. With this in place we should be able to untangle the dependency with varying_vp_inputs and state invalidation. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:30 +01:00
Ilia Mirkin	d73f1f2ad8	nv50,nvc0: fix integer MS resolves using 2d engine We don't want filtering for integer textures, same as depth/stencil. Fixes: KHR-GL45.direct_state_access.renderbuffers_storage_multisample Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-02-22 20:47:48 -05:00
Ilia Mirkin	33ce3569c5	nvc0: fix writing query results into buffer We need to mark the range as valid, and validate the resource using a helper to ensure that the buffer status is marked properly. Fixes some CTS pipeline stats query tests, and KHR-GL45.direct_state_access.queries_functional Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-02-22 20:47:48 -05:00
Ilia Mirkin	f6e4f95668	nv50,nvc0: fix clear buffer acceleration Two things were off: - valid range was not updated, which could affect waiting for future maps - fencing was done manually instead of using the *_resource_validate helper, which resulted in a missed dirty buffer flag being set Fixes: KHR-GL45.direct_state_access.buffers_clear Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-02-22 20:47:48 -05:00
Lionel Landwerlin	bd9672695b	i965: perf: ensure reading config IDs from sysfs isn't interrupted Fixes: `458468c136` "i965: Expose OA counters via INTEL_performance_query" Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-23 01:44:07 +00:00
Bas Nieuwenhuizen	032870beda	radv: Fix autotools build. Somewhere along the way the Makefile changes got lost ... Fixes: `4db78f3a6b` "radv: Put supported extensions in a struct." Acked-by: Dave Airlie <airlied@redhat.com>	2018-02-23 01:54:12 +01:00
Bas Nieuwenhuizen	e72ad05c1d	radv: Return NULL for entrypoints when not supported. This implements strict checking for the entrypoint ProcAddr functions. - InstanceProcAddr with instance = NULL, only returns the 3 allowed entrypoints. - DeviceProcAddr does not return any instance entrypoints. - InstanceProcAddr does not return non-supported or disabled instance entrypoints. - DeviceProcAddr does not return non-supported or disabled device entrypoints. - InstanceProcAddr still returns non-supported device entrypoints. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	414f5e0e14	radv: Reword radv_entrypoints_gen.py With a big inspiration from anv as always ... Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	076f7cfc6b	radv: Track enabled extensions. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	4db78f3a6b	radv: Put supported extensions in a struct. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Samuel Pitoiset	d6b7539206	ac/nir: remove emission of nir_op_fpow fpow is now lowered at NIR level. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:44:46 +01:00
Samuel Pitoiset	7aa008d1d7	radv: enable lowering of fpow to fexp2 and flog2 There is no fpow in hardware, so it's always lowered somewhere, but it appears that lowering at NIR level is better. Figured while comparing compute shaders between RadeonSI and RADV. Polaris10: Totals from affected shaders: SGPRS: 18936 -> 18904 (-0.17 %) VGPRS: 12240 -> 12220 (-0.16 %) Spilled SGPRs: 2809 -> 2809 (0.00 %) Code Size: 718116 -> 719848 (0.24 %) bytes Max Waves: 1409 -> 1410 (0.07 %) Vega10: Totals from affected shaders: SGPRS: 18392 -> 18392 (0.00 %) VGPRS: 12008 -> 11920 (-0.73 %) Spilled SGPRs: 3001 -> 2981 (-0.67 %) Code Size: 777444 -> 778788 (0.17 %) bytes Max Waves: 1503 -> 1504 (0.07 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:40:47 +01:00
Samuel Pitoiset	63fb30c674	nir: lower fexp2(fmul(flog2(a), 2)) to fmul(a, a) Similar for the 4 case. Suggested by Bas. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:40:45 +01:00
Samuel Pitoiset	b18997876f	nir: add is_used_once for fmul(fexp2(a), fexp2(b)) to fexp2(fadd(a, b)) Otherwise the code size increases because the original fexp2() instructions can't be deleted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:40:43 +01:00
Samuel Pitoiset	a01e9996b5	ac/nir: set GLC=1 for load/store of coherent/volatile images This disables persistence across wavefronts. F1 2017 and Wolfenstein 2 appear to use some coherent images but this patch doesn't seem to change anything. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:39:55 +01:00
Samuel Pitoiset	3c40be126f	spirv: apply memory qualifiers to images Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:39:53 +01:00
Chuck Atkins	540e49e105	glx: Properly handle cases where screen creation fails This fixes a segfault exposed by `a29d63ecf7` which occurs when swr is used on an unsupported architecture. v2: re-work to place logic in xmesa_init_display Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Cc: mesa-stable@lists.freedesktop.org Cc: George Kyriazis <george.kyriazis@intel.com> Cc: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-22 10:20:32 -05:00
Iago Toral Quiroga	7668b594e6	anv/blorp: multisample resolve all attachment layers We were only resolving the first. v2: - Do not require that the number of layers on dst and src are an exact match, it is okay if the dst has more layers so long as it has at least the same that we are going to resolve. - Do not always resolve array_len layers, we should resolve only from base_array_layer to array_len. v3: - v2 was assuming that array_len represented the total number of layers in the image, but it represents the number of layers starting at the base array layer. v4: - The number of layers to resolve should be taken from the framebuffer (Nanley). Fixes new CTS tests for multisampled layered rendering: dEQP-VK.renderpass.multisample_resolve.layers_* Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-22 08:23:39 +01:00
Jason Ekstrand	2dce4ac6ac	intel/isl: Improve the documentation on get_default_aux_state Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Jason Ekstrand	24952160fd	i965: Use finish_external instead of make_shareable in setTexBuffer2 The setTexBuffer2 hook from GLX is used to implement glXBindTexImageEXT which has tighter restrictions than just "it's shared". In particular, it says that any rendering to the image while it is bound causes the contents to become undefined. The GLX_EXT_texture_from_pixmap extension provides us with an acquire and release in the form of glXBindTexImageEXT and glXReleaseTexImageEXT. The extension spec says, "Rendering to the drawable while it is bound to a texture will leave the contents of the texture in an undefined state. However, no synchronization between rendering and texturing is done by GLX. It is the application's responsibility to implement any synchronization required." From the EGL 1.4 spec for eglBindTexImage: "After eglBindTexImage is called, the specified surface is no longer available for reading or writing. Any read operation, such as glReadPixels or eglCopyBuffers, which reads values from any of the surface’s color buffers or ancillary buffers will produce indeterminate results. In addition, draw operations that are done to the surface before its color buffer is released from the texture produce indeterminate results." In other words, between the bind and release calls, we effectively own those pixels and can assume, so long as we don't crash, that no one else is reading from/writing to the surface. The GLX and EGL implementations call the setTexBuffer2 and releaseTexBuffer function pointers that the driver can hook. In theory, this means that, between BindTexImage and ReleaseTexImage, we own the pixels and it should be safe to track aux usage so we can avoid redundant resolves so long as we start off with the right assumption at the start of the bind/release pair. In practice, however, X11 has slightly different expectations. It's expected that the server may be drawing to the image at the same time as the compositor is texturing from it. 
In that case, the worst expected outcome should be tearing or partial rendering and not random corruption like we see when rendering races with scanout with CCS. Fortunately, the GEM rules about texture/render dependencies save us here. If X11 submits work to write to a pixmap after the compositor has submitted work to texture from it, GEM inserts a dependency between the compositor and X11. If X11 is using a high-priority context, this will cause the compositor to get a temporarily boosted priority while the batch from X11 is waiting on it. This means that we will never have an actual race between X11 and the compositor so no corruption can happen. Unfortunately, however, this means that X11 will likely be rendering to it between the compositor's BindTexImage and ReleaseTexImage calls. If we want to avoid strange issues, we need to be a bit careful about resolves because we can't really transition it away from the "default" aux usage. The only case where this would practically be a problem is with image_load_store where we have to do a full resolve in order to use the image via the data port. Even there it would only be a problem if batches were split such that X11's rendering happens between the resolve and the use of it as a storage image. However, the chances of this happening are very slim so we just emit a warning and hope for the best. This commit adds a new helper intel_miptree_finish_external which resets all aux state to whatever ISL says is the right worst-case "default" for the given modifier. It feels a little awkward to call it "finish" because it's actually an acquire from the perspective of the driver, but it matches the semantics of the other prepare/finish functions. This new helper gets called in intelSetTexBuffer2 instead of make_shareable. We also add an intelReleaseTexBuffer (we passed NULL to releaseTexBuffer before) and call intel_miptree_prepare_external in it. 
This probably does nothing most of the time but it means that the prepare/finish calls are properly matched. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Jason Ekstrand	00926a2730	i965/tex_image: Reference the renderbuffer miptree in setTexBuffer2 The old code made a new miptree that referenced the same BO as the renderbuffer and just trusted in the memory aliasing to work. There are only two ways in which the new miptree is liable to differ from the one in the renderbuffer and neither of them matter: 1) It may have a different target. The only targets that we can ever see in intelSetTexBuffer2 are GL_TEXTURE_2D and GL_TEXTURE_RECTANGLE and the difference between the two doesn't matter as far as the miptree is concerned; genX(update_sampler_state) only looks at the gl_texture_object and not the miptree when determining whether or not to use normalized coordinates. 2) It may have a very slightly different format. Again, this doesn't matter because we've supported texture views for quite some time so we always look at the gl_texture_object format instead of the miptree format for hardware setup anyway. On the other hand, because we were recreating the miptree, we were using intel_miptree_create_for_bo which doesn't understand modifiers. We really want this function to work without doing a resolve so long as you have modifiers so we need to fix that. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Jason Ekstrand	41d45eb21e	i965/tex_image: Pull the tex format from the renderbuffer in intelSetTexBuffer2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Jason Ekstrand	344b57b10b	i965/miptree: Loosen the format check in miptree_match_image This function is used to determine when we need to re-allocate a miptree. Since we do nothing different in miptree allocation for sRGB vs. linear, loosening this should be safe and may lead to less copying and reallocating in some odd cases. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Jason Ekstrand	5b1b710e6f	i965/state: Ignore intel_obj->_Format for depth/stencil and ETC2 We're about to start letting the intel_obj->_Format be the "real" texture format. For depth/stencil textures, this may be a combined depth stencil format. For ETC2 on gen7 and earlier, this will be the actual ETC2 format. This makes a bit more GL sense but means we have to be careful in state upload. Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Kenneth Graunke	183ce5e629	glsl: Parse 'layout' as a token with advanced blending or bindless Both KHR_blend_equation_advanced and ARB_bindless_texture provide layout qualifiers, and are exposed in compatibility contexts. We need to parse the layout qualifier as a token in order for those to work, but forgot to extend this check. ARB_shader_image_load_store would need a similar treatment, but we don't expose that in legacy OpenGL contexts. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105161 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-02-21 17:50:57 -08:00
Daniel Stone	c7e22483fe	vulkan/wsi/x11: Consistently update and return swapchain status Use a helper function for updating the swapchain status. This will be used later to handle VK_SUBOPTIMAL_KHR, where we need to make a non-error status stick to the swapchain until recreation. Instead of direct comparisons to VK_SUCCESS to check for error, test for negative numbers meaning an error status, and positive numbers indicating non-error statuses. v2 (Jason Ekstrand): - Use a pattern of "return x11_swapchain_result(chain, VK_WHATEVER)" - Handle wsi_queue_pull returning VK_TIMEOUT - Call x11_swapchain_result in x11_present_to_x11 Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-21 22:37:10 +00:00
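The sign convention described in the commit above (negative values are errors, positive values are non-error statuses) can be sketched in C roughly as follows. This is a minimal illustration under stated assumptions, not the Mesa implementation: `demo_result`, `demo_swapchain`, and `demo_swapchain_result` are hypothetical names, and the enum values mirror the corresponding `VkResult` codes only so the example compiles without the Vulkan headers. The sticky handling of the suboptimal status is what the commit message says the helper is meant to enable later.

```c
/* Hypothetical stand-ins so the sketch compiles without <vulkan/vulkan.h>.
 * The numeric values mirror the matching VkResult codes. */
typedef enum {
    DEMO_SUCCESS           = 0,
    DEMO_TIMEOUT           = 2,
    DEMO_SUBOPTIMAL        = 1000001003,
    DEMO_ERROR_OUT_OF_DATE = -1000001004
} demo_result;

struct demo_swapchain {
    demo_result status; /* sticky status; starts out as DEMO_SUCCESS */
};

/* Fold a new result into the swapchain's sticky status and return the
 * value the caller should report. */
static demo_result
demo_swapchain_result(struct demo_swapchain *chain, demo_result result)
{
    /* Once an error has been recorded, keep reporting it. */
    if (chain->status < 0)
        return chain->status;

    /* Negative values are errors: record and return them. */
    if (result < 0) {
        chain->status = result;
        return result;
    }

    /* A timeout is transient: pass it through without recording it. */
    if (result == DEMO_TIMEOUT)
        return result;

    /* A suboptimal status is not an error, but it sticks until the
     * swapchain is recreated. */
    if (result == DEMO_SUBOPTIMAL) {
        chain->status = result;
        return result;
    }

    /* Plain success: report whatever sticky status we already have. */
    return chain->status;
}
```

With this shape, every return site can use the `return x11_swapchain_result(chain, VK_WHATEVER)` pattern mentioned in the v2 notes, and callers test `result < 0` instead of comparing against `VK_SUCCESS`.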
Jason Ekstrand	6937c61324	vulkan/wsi/x11: Set OUT_OF_DATE if wait_for_special_event fails This most likely means we lost our connection to the X server so OUT_OF_DATE is reasonable. This was also the one case where we pushed a UINT32_MAX into the queue without setting an error condition. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-21 22:37:10 +00:00
Daniel Stone	bfa22266cd	vulkan/wsi/wayland: Add support for zwp_dmabuf zwp_linux_dmabuf_v1 lets us use multi-planar images and buffer modifiers. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-21 22:37:10 +00:00
Jason Ekstrand	c757fd2852	anv/image: Add support for modifiers for WSI This adds support for the modifiers portion of the WSI "extension". Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-21 22:37:10 +00:00
Jason Ekstrand	adca1e4a92	anv/image: Separate modifiers from legacy scanout For a bit there, we had a bug in i965 where it ignored the tiling of the modifier and used the one from the BO instead. At one point, we thought this was best fixed by setting a tiling from Vulkan. However, we've decided that i965 was just doing the wrong thing and have fixed it as of `5048572352`. The old assumptions also affected the solution we used for legacy scanout in Vulkan. Instead of treating it specially, we just treated it like a modifier like we do in GL. This commit goes back to making it its own thing so that it's clear in the driver when we're using modifiers and when we're using legacy paths. v2 (Jason Ekstrand): - Rename legacy_scanout to needs_set_tiling Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-21 22:37:10 +00:00
Jason Ekstrand	f5433e4d6c	vulkan/wsi: Add modifiers support to wsi_create_native_image This involves extending our fake extension a bit to allow for additional querying and passing of modifier information. The added bits are intended to look a lot like the draft of VK_EXT_image_drm_format_modifier. Once the extension gets finalized, we'll simply transition all of the structs used in wsi_common to the real extension structs. Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-21 22:37:10 +00:00
Daniel Stone	55b27e1e5f	vulkan/wsi: Add drm_modifier member to wsi_image Not yet used anywhere. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-21 22:37:10 +00:00
Daniel Stone	61c3feb38d	vulkan/wsi: Add multiple planes to wsi_image Not currently used. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-21 22:37:10 +00:00
Timothy Arceri	cdeac00267	nir: remove old assert This was originally intended to make sure the remap location was not -1. However the code has changed a lot since then, the location is now never set to -1 and we also handle components meaning this old assert has been doing comparisons with the pointer to the array of component data. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105183	2018-02-22 09:31:00 +11:00
Timothy Arceri	86098696fc	radeonsi/nir: collect more accurate output_usagemask Fixes assert in the glsl-1.50-gs-max-output-components piglit test. Note that the double handling will only work for doubles that don't take up multiple slots i.e. double and dvec2. However dual slot double handling is an existing bug which is made no worse by this patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
Timothy Arceri	79dc94828a	radeonsi/nir: disable GLSL IR loop unrolling Delaying unrolling and allowing NIR to do it instead has been shown to result in better code in drivers such as i965. shader-db results appear to show the same is true for radeonsi. The other advantage is that using NIR unrolling improves compile times significantly. Totals from affected shaders: SGPRS: 9624 -> 10016 (4.07 %) VGPRS: 6800 -> 6464 (-4.94 %) Spilled SGPRs: 0 -> 2 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 359176 -> 332264 (-7.49 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 1355 -> 1432 (5.68 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
Timothy Arceri	e6269ffc2e	radeonsi/nir: fix tess varying loads for doubles Fixes the following piglit tests: tests/spec/arb_tessellation_shader/execution/double-array-vs-tcs-tes.shader_test tests/spec/arb_tessellation_shader/execution/double-vs-tcs-tes.shader_test Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
Timothy Arceri	6d338d757f	ac/radeonsi: pass type to load_tess_varyings() We need this to be able to load 64bit varyings. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
Daniel Stone	eef890b7b1	x11/dri3: Store raw present completion mode The DRI3 drawable info struct currently stores a boolean for whether the last completed operation was a flip or not. As we need to track the full completion mode for handling suboptimal returns, change the 'flipping' field to the raw present completion mode from the server. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-21 21:57:38 +00:00
Daniel Stone	a6f1952814	x11/dri3: Don't open-code ARRAY_SIZE Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-21 21:57:38 +00:00
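For readers unfamiliar with the idiom, "open-coding ARRAY_SIZE" means writing the `sizeof` division by hand at each use site. A minimal sketch of the macro form is below; the macro body is the common C idiom, while `demo_formats` and `demo_sum_formats` are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Number of elements in a true array. This does not work on a pointer,
 * where sizeof yields the pointer size instead of the array size. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical table, standing in for a driver-side list. */
static const int demo_formats[] = { 10, 20, 30, 40 };

static int
demo_sum_formats(void)
{
    int sum = 0;

    /* The loop bound follows the array automatically if entries are
     * added or removed. */
    for (size_t i = 0; i < ARRAY_SIZE(demo_formats); i++)
        sum += demo_formats[i];

    return sum;
}
```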
Jason Ekstrand	52056206e1	anv: Don't assert that stencil HiZ clears are single-slice It's true for depth HiZ clears because we only have HiZ on single-slice images right now. However, for stencil-only clears there is no such restriction. Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-21 13:54:11 -08:00
Jason Ekstrand	7dd0f73fe1	anv: Only copy clear dwords if we're rendering to the first slice Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-02-21 12:47:17 -08:00
Marek Olšák	b494ed168c	radeonsi: don't flush when si_eliminate_fast_color_clear is no-op	2018-02-21 20:03:11 +01:00
Marek Olšák	5f55f4c59f	radeonsi: make texture_discard_cmask/eliminate functions non-static	2018-02-21 20:03:11 +01:00
James Zhu	81dd4a7637	radeonsi: enable uvd encode for HEVC main Enable UVD encode for HEVC main profile Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	b38b208ff8	radeonsi:create uvd hevc enc entry Add UVD hevc encode pipe video codec creation entry Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	e7d51e27ed	radeon/uvd:add uvd hevc enc functions Implement UVD hevc encode functions Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	2b86f5fa0b	radeon/uvd:add uvd hevc enc hw ib implementation Implement required IBs for UVD HEVC encode. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	461508c15c	radeon/uvd:add uvd hevc enc hw interface header Add hevc encode hardware interface for UVD Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	c6acae22c8	winsys/amdgpu:add uvd hevc enc support in amdgpu cs Support UVD HEVC encode in amdgpu cs Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	f0ad908e79	amd/common:add uvd hevc enc support check in hw query Based on amdgpu hardware query information to check if UVD hevc enc support Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-21 13:53:38 -05:00
Karol Herbst	7319311a50	nvir/nvc0: fix legalizing of ld unlock c0[0x10000] We have to increase the file index also for 0x10000 not just for values greater than 0x10000. Fixes: `37b67db6ae` Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-21 11:12:45 +01:00
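The boundary condition in the commit above can be illustrated with a small C sketch. It assumes, based only on the commit text, that an in-instruction constant offset must stay below 0x10000 and that larger offsets are folded into the file index; `demo_const_ref` and `demo_legalize_const_offset` are hypothetical names and do not reflect the actual nouveau code generator.

```c
#include <stdint.h>

/* Hypothetical representation of a c<file_index>[offset] reference. */
struct demo_const_ref {
    uint32_t file_index;
    uint32_t offset;
};

static struct demo_const_ref
demo_legalize_const_offset(uint32_t file_index, uint32_t offset)
{
    struct demo_const_ref ref = { file_index, offset };

    /* Note ">=" rather than ">": an offset of exactly 0x10000 already
     * falls outside the 0x0000..0xffff range and must bump the index. */
    if (offset >= 0x10000) {
        ref.file_index += offset >> 16;
        ref.offset = offset & 0xffff;
    }

    return ref;
}
```

Under these assumptions, `demo_legalize_const_offset(0, 0x10000)` yields index 1 with offset 0, whereas a strict `>` comparison would have left the out-of-range offset untouched.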
Samuel Pitoiset	a6accad68f	ac/nir: add glsl_is_array_image() helper For consistency. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-21 09:41:51 +01:00
Samuel Pitoiset	ff83dfb364	ac/nir: set the DA field when performing atomics on 3D images This doesn't fix anything known but it should definitely be set. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-21 09:41:49 +01:00
Eric Anholt	afa7b2f199	i965: Fix compiler warning about write being undefined. This looks like it should be protected by the assume() about nr_color_regions, but my compiler warns anyway. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-02-20 20:23:57 -08:00
Eric Anholt	4636ce362d	glsl/tests: Fix a compiler warning about signed/unsigned loop comparison. Fixes: `d32956935e` ("glsl: Walk a list of ir_dereference_array to mark array elements as accessed") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-20 20:23:57 -08:00
Eric Anholt	7075c084fc	loader: Fix compiler warnings about truncating the PCI ID path. My build was producing: ../src/loader/loader.c:121:67: warning: ‘%1u’ directive output may be truncated writing between 1 and 3 bytes into a region of size 2 [-Wformat-truncation=] and we can avoid this careful calculation by just using asprintf (as we do elsewhere in the file). Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-20 20:23:57 -08:00
Eric Anholt	1b313eedb5	glsl: Silence warnings in the uniform initializer test about 16-bit types They should probably get unit tests implemented, but this cleans up a bunch of warnings in my build for now. Fixes: `59f458cd87` ("glsl: Add 16-bit types") Cc: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-20 20:23:57 -08:00
Jordan Justen	96fe36f7ac	i965: Enable disk shader cache by default Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-20 18:49:43 -08:00
Dave Airlie	baa0feb73d	radv: don't send num_tcs_input_cp to sgprs. We never use it in the shaders. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:36 +00:00
Dave Airlie	952222ddd4	radv/tess: don't need to look in constant for vertices_per_patch This just avoids passing this value via user sgprs. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:28 +00:00
Dave Airlie	77fd1b9187	ac/radv: cleanup some tcs output values access Just consolidates some code to make it easier to change. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:23 +00:00
Dave Airlie	0e6f0d400b	ac/radv: remove total_vertices variable This just removes an unneeded variable. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:19 +00:00
Dave Airlie	e9b9fb3616	ac/radv: don't mark tess inner as used if we don't use it. This just avoids marking it as a used output if we don't actually use it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:15 +00:00
Dave Airlie	d5b2d7ed67	ac/nir: to integer the args to bcsel. dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw was hitting an llvm assert due to one value being an int and the other a float. This just casts both values to integer and fixes the test. Fixes: dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-20 23:15:18 +00:00
Jason Ekstrand	c66fb12117	anv/blorp: Use layout_to_aux_usage when a layout is provided Instead of having aux usage and ANV_AUX_USAGE_DEFAULT to mean "give me something reasonable" we now use anv_layout_to_aux_usage whenever a layout is available. If a layout is available, we ignore the aux_usage parameter. For the cases where we have an explicit aux usage such as clears and aux ops, we have a new ANV_IMAGE_LAYOUT_EXPLICIT_AUX layout. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-20 13:57:17 -08:00
Jason Ekstrand	0fa040e6f5	anv/cmd_buffer: Delete some assert-only variables Checking the sample count is almost as good as aux usage in this case. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-20 13:57:16 -08:00
Jason Ekstrand	e10a62662b	anv/cmd_buffer: Use layout_to_* helpers in compute_aux_usage Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-20 13:57:14 -08:00
Jason Ekstrand	7ea8131aa0	anv/cmd_buffer: Simplify transition_depth_buffer If we don't have HiZ, then anv_layout_to_aux_usage will return NONE for both layouts. If the two layouts are the same, they will get the same aux usage. In either case, the code below will give us ISL_AUX_OP_NONE and we'll return without doing anything. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-20 13:57:09 -08:00
Jason Ekstrand	87e86ee2e6	anv/cmd_buffer: Do subpass image transitions in begin/end_subpass Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	7d5f6b6088	anv/cmd_buffer: Mark depth/stencil surfaces written in begin_subpass Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	8a3f086a42	anv/cmd_buffer: Sync clear values in begin_subpass This is quite a bit cleaner because we now sync the clear values at the same time as we do the fast clear. For loading the clear values into the surface state, we now do it once when we handle the LOAD_OP_LOAD instead of every subpass. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	a4136b8c1a	anv/pass: Store usage in each subpass attachment This requires us to ditch the VkAttachmentReference struct in favor of an anv-specific struct. However, we can now easily identify from just the subpass attachment what kind of an attachment it is. This will make iteration over anv_subpass::attachments a little easier in some case. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	bd356e1bcf	anv/cmd_buffer: Add a concept of pending load aspects These are the same as pending clear aspects only for the "load" operation. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	e526d49edd	anv/cmd_buffer: Iterate all subpass attachments when clearing This unifies things a bit because we now handle depth and stencil at the same time. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	2cc3445eb2	anv/cmd_buffer: Decide whether or not to HiZ clear up-front This moves the decision out of begin_subpass and into BeginRenderPass like the decision for color clears. We use a similar name for the function for depth/stencil as for color even though no aux usage is really getting computed. v2 (Jason Ekstrand): - Don't always disable HiZ clears by accident - Use the initial layout to decide whether to do fast clears Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	6fc8555610	anv/cmd_buffer: Move the rest of clear_subpass into begin_subpass Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	7991838973	intel/blorp: Add a blorp_hiz_clear_depth_stencil helper This is similar to blorp_gen8_hiz_clear_attachments except that it takes actual images instead of trusting in the already set depth state. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	1900dd76d0	anv/cmd_buffer: Move the color portion of clear_subpass into begin_subpass This doesn't really change much now but it will give us more/better control over clears in the future. The one interesting functional change here is that we are now re-emitting 3DSTATE_DEPTH_BUFFERS and friends for each clear. However, this only happens at begin_subpass time so it shouldn't be substantially more expensive. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	6fb9d6c6f5	anv/cmd_buffer: Pass a subpass id into begin_subpass This is a bit less awkward than passing in the subpass because it means we don't have to extract the subpass id from the subpass. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	01223b8199	anv/cmd_buffer: Add begin/end_subpass helpers Having begin/end_subpass is a bit nicer than the begin/next/end hooks that Vulkan gives us. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	b5bd3fb4e4	anv/cmd_buffer: Apply subpass flushes before set_subpass This seems slightly more correct because it means that the flushes happen before any clears or resolves implied by the subpass transition. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	869448a8ab	anv: Use framebuffer layers for implicit subpass transitions Fixes: `de3be61801` "anv/cmd_buffer: Rework aux tracking" Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	85d0bec961	anv: Be more careful about fast-clear colors Previously, we just used all the channels regardless of the format. This is less than ideal because some channels may have undefined values and this should be ok from the client's perspective. Even though the driver should do the correct thing regardless of what is in the undefined value, it makes things less deterministic. In particular, the driver may choose to fast-clear or not based on undefined values. This level of nondeterminism is bad. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	4796025ba5	intel/isl: Add an isl_color_value_is_zero helper Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
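The idea behind the two fast-clear commits above, looking only at channels the format actually defines, can be sketched as follows. This is an illustration under assumptions, not the ISL helper itself: `demo_color_value` and `demo_color_value_is_zero` are hypothetical names, and the explicit `channel_mask` parameter stands in for whatever per-format channel lookup the real code performs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical clear-color container: four channels viewed as float,
 * unsigned, or signed 32-bit values. */
union demo_color_value {
    float    f32[4];
    uint32_t u32[4];
    int32_t  i32[4];
};

/* channel_mask: bit i set means channel i (R, G, B, A in that order)
 * exists in the format. Channels outside the mask are ignored, so any
 * undefined data in them cannot influence the result. The comparison is
 * bitwise, so a float -0.0 counts as non-zero in this sketch. */
static bool
demo_color_value_is_zero(union demo_color_value value, unsigned channel_mask)
{
    for (unsigned i = 0; i < 4; i++) {
        if ((channel_mask & (1u << i)) && value.u32[i] != 0)
            return false;
    }
    return true;
}
```

For an RGB-only format (mask `0x7`), garbage in the alpha slot no longer changes the answer, which keeps the decision to fast-clear deterministic.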
Jason Ekstrand	116e818ef1	anv/gpu_memcpy: CS Stall before a MI memcpy on gen7 This fixes a pile of hangs caused by the recent shuffling of resolves and transitions. The particularly problematic case is when you have at least three attachments with load ops of CLEAR, LOAD, CLEAR. In this case, we execute the first CLEAR followed by a MI memcpy to copy the clear values over for the LOAD followed by a second CLEAR. The MI commands cause the first CLEAR to hang which causes us to get stuck on the 3DSTATE_MULTISAMPLE in the second CLEAR. We also add guards for BLORP to fix the same issue. These shouldn't actually do anything right now because the only use of indirect clears in BLORP today is for resolves which are already guarded by a render cache flush and CS stall. However, this will guard us against potential issues in the future. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:19 -08:00
Guillaume Charifi	a572ec2efe	st/mesa: Factorize duplicate code for atomic buffer binding Signed-off-by: Guillaume Charifi <guillaume.charifi@sfr.fr> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 20:54:49 +01:00
Guillaume Charifi	56bfcd50f7	st/mesa: Factorize duplicate code in st_update_framebuffer_state() Signed-off-by: Guillaume Charifi <guillaume.charifi@sfr.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 20:54:49 +01:00
Rob Clark	4c4e6232ee	freedreno/ir3: fix use_count refcnt'ing issue Was hitting an assert with vs-varying-array-mat4-index-col-row-wr.shader_test When eliminating a copy, we were dropping the use_count of the mov that is skipped, but not increasing the use_count of its src instruction. Fixes: `76440fcca9` freedreno/ir3: clean up dangling false-dep's Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-20 13:43:42 -05:00
Brian Paul	e7d1a93723	svga: replaced 'unsigned' with proper enum types in shader code Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-02-20 08:11:06 -07:00
Andres Gomez	36ac485bd1	swr: bump minimum supported LLVM version to 4.0 Since radv and radeonsi removed support for LLVM 3.9 the distcheck target got broken because SWR distribution needed 3.9.x. After checking with George Kyriazis, SWR is OK with moving to LLVM 4.0 and above, which will solve this problem. Fixes: `3bf1e036e8` ("amd: remove support for LLVM 3.9") Cc: George Kyriazis <george.kyriazis@intel.com> Cc: Tim Rowley <timothy.o.rowley@intel.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2018-02-20 17:03:06 +02:00
Samuel Pitoiset	1ac741d690	ac/nir: move ac_declare_lds_as_pointer() outside of the switch Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-20 10:44:59 +01:00
Samuel Pitoiset	b5d111ae76	radv: allow to force family using RADV_FORCE_FAMILY Useful for pipeline-db. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-20 10:44:47 +01:00
Thomas Hellstrom	f386776ea5	loader_dri3/glx/egl: Reinstate the loader_dri3_vtable get_dri_screen callback Removing this callback caused rendering corruption in some multi-screen cases, so it is reinstated but without the drawable argument which was never used by implementations and was confusing since the drawable could have been created with another screen. Cc: "17.3 18.0" mesa-stable@lists.freedesktop.org Fixes: `5198e48a0d` (loader_dri3/glx/egl: Remove the loader_dri3_vtable get_dri_screen callback) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105013 Reported-by: Daniel van Vugt <daniel.van.vugt@canonical.com> Tested-by: Timo Aaltonen <tjaalton@ubuntu.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-20 10:36:53 +01:00
Thomas Hellstrom	80c31f7837	svga: Fix a leftover debug hack Fix what appears to be a leftover debug hack. The hack would force the driver to take a different blit path; possibly, although unverified, reverting to software blits. Tested using piglit tests/quick. No related regressions. Cc: "17.2 17.3 18.0" <mesa-stable@lists.freedesktop.org> Fixes: `9d81ab7376` (svga: Relax the format checks for copy_region_vgpu10 somewhat) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104625 Reported-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-20 10:12:19 +01:00
Iago Toral Quiroga	af5f2322d0	anv/entrypoints: make vkGetDeviceProcAddr return NULL for instance commands Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-20 08:12:32 +01:00
Ilia Mirkin	e1a70aed10	nv50,nvc0: mark ABGR format as displayable instead of ARGB format This matches the hardware's capabilities. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-19 22:33:58 -05:00
Ilia Mirkin	f7604d8af5	st/dri: only expose config formats that are display targets In the case of NVIDIA hardware, ABGR is displayable but ARGB is not. Only advertise the one set in the visuals list. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Stone <daniels@collabora.com>	2018-02-19 22:33:58 -05:00
Ilia Mirkin	ebdc4c31e2	mesa: add xbgr support adjacent to xrgb Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Daniel Stone <daniels@collabora.com>	2018-02-19 22:33:58 -05:00
Timothy Arceri	d88a2906f8	st/shader_cache: copy nir pointer to gl_program after deserializing This fixes a crash when running the arb_get_program_binary-api-errors piglit test twice. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 13:15:02 +11:00
Timothy Arceri	691c320de0	radeonsi: add nir shader cache support In future we might want to try avoid calling nir_serialize() but this works for now. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 13:15:02 +11:00
Timothy Arceri	2b431808ab	radeonsi: rename variables tgsi_binary -> ir_binary This better represents that the ir could be either tgsi or nir. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 13:15:02 +11:00
Marek Olšák	f78fe98fff	radeonsi: fix regression from 32-bit pointers on CI Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2018-02-19 17:56:23 +01:00
Samuel Pitoiset	549c7f3724	radv: compact varyings after removing unused ones It makes no sense to compact before, and the description of nir_compact_varyings() confirms that. Polaris10: Totals from affected shaders: SGPRS: 108528 -> 108128 (-0.37 %) VGPRS: 74548 -> 74500 (-0.06 %) Spilled SGPRs: 844 -> 814 (-3.55 %) Code Size: 3007328 -> 2992932 (-0.48 %) bytes Max Waves: 16019 -> 16009 (-0.06 %) Vega10: Totals from affected shaders: SGPRS: 106088 -> 106232 (0.14 %) VGPRS: 74652 -> 74700 (0.06 %) Spilled SGPRs: 692 -> 658 (-4.91 %) Code Size: 2967708 -> 2953028 (-0.49 %) bytes Max Waves: 18178 -> 18162 (-0.09 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-19 12:19:17 +01:00
Timothy Arceri	51e745cf77	radeonsi/nir: fix gl_FragCoord for pixel_center_integer Fixes piglit test glsl-arb-fragment-coord-conventions Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-19 08:47:48 +11:00
Timothy Arceri	347038baa9	glsl/nir: add pixel_center_integer to shader info Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-19 08:47:48 +11:00
Ilia Mirkin	fe76fc11b1	gm107/ir: avoid using kepler instruction capabilities Split up the op properties table into generation-specific bits, and only use the kepler ones on kepler. Fixes some CTS images tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-02-17 23:41:21 -05:00
Ilia Mirkin	f08fd676bf	nvc0: add support for bindless on maxwell+ Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-17 23:41:21 -05:00
Ilia Mirkin	0255550eb1	gm107/ir: change how SUQ works in preparation for bindless All this information can be retrieved from the TIC directly. Avoid having to dip into the constbuf information about the image. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-17 23:41:21 -05:00
Kenneth Graunke	fa8a764b62	i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+. By default, 3DSTATE_CONSTANT_* Constant Buffer 0 is relative to dynamic state base address. This makes it unusable for pushing UBOs. There is a bit in the INSTPM register (or CS_DEBUG_MODE2 on Skylake) which controls whether buffer 0 is relative to dynamic state base address, or simply a normal pointer. Setting that gives us full flexibility. This lets us push up to 4 UBO ranges. We can't currently write this on Haswell and earlier, and will need to update the kernel command parser, and then do the whole version checking song and dance. We also need a brand new kernel that supports context isolation - on older kernels, newly created contexts inherit register state from whatever happened to be running. So, setting this would have catastrophic impact on other drivers such as libva, Beignet, or older Mesa. See commit `8ec5a4e4a4` where we did this once before, but had to revert it in commit `013d331220`. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-02-17 11:26:31 -08:00
Kenneth Graunke	a63c74be85	i965: Stop restoring the default L3 configuration on Kernel 4.16+. Kernel 4.16 has proper context isolation, which means we can change the L3 configuration without worrying about that leaking to other newly created contexts, breaking the assumptions of other userspace. So, disable our workaround to reprogram it back to the default. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-02-17 11:26:18 -08:00
Mikko Perttunen	5a1606c51f	nvc0: Use GP100_COMPUTE_CLASS on GP10B GP10B requires the use of GP100_COMPUTE_CLASS instead of GP104_COMPUTE_CLASS as is used for other non-GP100 chips. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-17 14:16:10 -05:00
Daniel Stone	9d21dbeb88	i965: Fix aux-surface size check The previous commit reworked the checks in intel_from_planar() to check the right individual cases for regular/planar/aux buffers, and do size checks in all cases. Unfortunately, the aux size check was broken, and required the aux surface to be allocated with the correct aux stride, but full image height (!). As the ISL aux surface is not recorded in the DRIimage, we cannot easily access it to check. Instead, store the aux size from when we do have the ISL surface to hand, and check against that later when we go to access the aux surface. Signed-off-by: Daniel Stone <daniels@collabora.com> Fixes: `c2c4e5bae3` ("i965: Fix bugs in intel_from_planar") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-17 10:22:35 +00:00
Marek Olšák	931ec80eeb	radeonsi: implement 32-bit pointers in user data SGPRs (v2) User SGPRs changes: VS: 14 -> 9 TCS: 14 -> 10 TES: 10 -> 6 GS: 8 -> 4 GSCOPY: 2 -> 1 PS: 9 -> 5 Merged VS-TCS: 24 -> 16 Merged VS-GS: 18 -> 11 Merged TES-GS: 18 -> 11 SGPRS: 2170102 -> 2158430 (-0.54 %) VGPRS: 1645656 -> 1641516 (-0.25 %) Spilled SGPRs: 9078 -> 8810 (-2.95 %) Spilled VGPRs: 130 -> 114 (-12.31 %) Scratch size: 1508 -> 1492 (-1.06 %) dwords per thread Code Size: 52094872 -> 52692540 (1.15 %) bytes Max Waves: 371848 -> 372723 (0.24 %) v2: - the shader cache needs to take address32_hi into account - set amdgpu-32bit-address-high-bits Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)	2018-02-17 04:52:17 +01:00
Marek Olšák	5722cd4084	radeonsi: disallow constant buffers with a 64-bit address in slot 0 State trackers must use a user buffer or const_uploader, or set pipe_resource::flags same as const_uploader->flags. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	d790b6cece	radeonsi: move const_uploader allocations to 32-bit address space Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	50581549b7	winsys/radeon: implement and enable 32-bit VM allocations Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	1104d1e9d3	winsys/radeon: add struct radeon_vm_heap Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	48ecacfefa	winsys/amdgpu: enable 32-bit VM allocations Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	c2da45be86	gallium/radeon: add 32-bit address space heaps Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	0977b7f7b3	ac: query high bits of 32-bit address space	2018-02-17 04:51:58 +01:00
Marek Olšák	16be55da94	gallium: use PIPE_CAP_CONSTBUF0_FLAGS	2018-02-17 04:20:55 +01:00
Marek Olšák	8e7222f4e5	gallium: allow drivers to impose BO flags restrictions on constant buffer 0 Required by radeonsi for optimal behavior.	2018-02-17 04:20:55 +01:00
Alexander von Gluck IV	834d221512	meson: Add Haiku platform support v4 Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-16 16:56:34 -06:00
Anuj Phogat	7b283544dc	anv/icl: Add render target flush after uploading binding table The PIPE_CONTROL command description says: "Whenever a Binding Table Index (BTI) used by a Render Target Message points to a different RENDER_SURFACE_STATE, SW must issue a Render Target Cache Flush by enabling this bit. When render target flush is set due to new association of BTI, PS Scoreboard Stall bit must be set in this packet." Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	136f583a24	anv/icl: Enable float blend optimization Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	cd7102972f	anv/icl: Use gen11 functions Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	9673c21d4f	anv/icl: Build anv libs for gen11 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	1f108b436b	anv/icl: Generate gen11 entry point functions Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	a86c0a08df	anv/icl: Don't use DISPATCH_MODE_SIMD4X2 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	cd5fc634a8	anv/icl: Don't use SingleVertexDispatch Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	6e3940b3cf	anv/icl: Don't set ResetGatewayTimer Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	41a4c2c8e8	anv/icl: Add #define genX Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:31 -08:00
Anuj Phogat	413d475b44	anv/icl: Add gen11 mocs defines Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:31 -08:00
Kenneth Graunke	1d6cf433d2	i965: Implement GenerateMipmap directly, rather than using Meta. Meta is awful and we'd like to stop using it. Implementing this using BLORP allows us to stop trashing a bunch of GL state every time. This follows the structure of st_generate_mipmap(). compute_num_levels is lifted directly from there. Improves performance in Gl41HdrBloom by about 11.794% +/- 1.01919% (n=3) on Kabylake GT2 at 1280x720 (the difference seems much smaller at higher resolutions). v2 (idr): Don't try depth or depth-stencil blorp blits on Gen4 or Gen5 because it's not implemented yet. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-02-16 10:48:10 -08:00
Kenneth Graunke	9bcd31ea90	mesa: Move compute_num_levels from st_gen_mipmap.c to mipmap.c. I want to use compute_num_levels inside i965. Rather than duplicating it, move it from mesa/st to core Mesa, and make it non-static. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-16 10:48:10 -08:00
Dylan Baker	03ab40b1f7	meson: freedreno depends on nir This fixes a race condition in building targets that link in freedreno. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105120 Fixes: `0bbecc5a85` ("meson: define driver dependencies") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Mark Janes <mark.a.janes@intel.com>	2018-02-16 10:10:18 -08:00
George Kyriazis	f1fbeb1a53	swr/rast: blend_epi32() should return Integer, not Float fix gcc8 compiler error for KNL. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105029 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:02 -06:00
George Kyriazis	7dd793d10c	swr/rast: Normalize path for debug metadata in template gen_llvm.hpp Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:02 -06:00
George Kyriazis	f979d0bc2f	swr/rast: Consolidate archrast Draw events Consolidate archrast draw events into single draw event with an attribute that represents the type of draw - Add handlers for new private proto versions of DrawInstancedEvent, DrawIndexedInstancedEvent, DrawInstancedSplitEvent, and DrawIndexedInstancedSplitEvent - Convert the draw events to generic DrawInfoEvents - parse_proto_event_fields() replaces 'AR_DRAW_TYPE' as a field type with 'uint32_t'. This draw type is actually an enum, but can be represented as an unsigned integer. - is_draw_or_dispatch() recognizes DrawInfoEvent as a draw event Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:02 -06:00
George Kyriazis	45df1a6520	swr/rast: Add semantics for translating address Added support for another full translation path in fetch jitter. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:02 -06:00
George Kyriazis	c09483cf0a	swr/rast: Convert C Sampler intrinsics Convert portions of the C sampler to the rasty SIMD lib. Also fix SRL call with a non-immediate. Don't count on the compiler automagically converting an srli call to srl if the shift count isn't an immediate. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	37ebf86add	swr/rast: Make SIMDLib templated types easier to use "typename SIMD_T::TypeName" --> "TypeName<SIMD_T>" Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	74e8bb4a22	swr/rast: Be more explicit when fetching next component Use a new function to denote that we want to get offset to next component and hide the fact that GEP is used underneath. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	da77eb55d5	swr/rast: Fix bug related to passing AR handle We were passing a garbage handle. Let's not do that. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	48d62409f8	swr/rast: Fix primitive replication issue in tessellation PA. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	e12db47a7d	swr/rast: Use llvm intrinsic masked gather Use llvm intrinsic masked.gather instead of manual unroll for the cases where we have vector of pointers. Improves llvm IR debug experience by reducing a ton of IR to a single intrinsic call. Also seems to reduce overall stack use considerably. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00


92605 Commits