Commit Graph

76894 Commits

Author SHA1 Message Date
Nicolai Hähnle ff7e9412be radeonsi: extract the texture descriptor computation into its own function
This will allow this code to be re-used for shader images.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-09 15:02:27 +01:00
Nicolai Hähnle 1197c69bdd radeonsi: extract the buffer descriptor computation into its own function
This will allow it to be re-used for shader image descriptors.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-09 15:02:27 +01:00
Nicolai Hähnle 2bf8ee34b8 radeonsi: remove resource field from si_sampler_view
view->resource is redundant with view->base.texture, so get rid of it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák 2dec5e09e1 radeonsi: accept pipe_resource in si_sampler_view_add_buffer
and rename .._buffers -> .._buffer

Based loosely on Nicolai's patch. This will make it easier to cherry-pick
Nicolai's patches from his image support branch.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák f18fc70d6f radeonsi: disable DCC on handle export if expecting write access
This should be okay except that sampler views and images are not re-set.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Bas Nieuwenhuizen 1e48ec7571 radeonsi: add DCC decompression (v2)
This is currently not needed but will be necessary when we have
features that do not work with DCC enabled, such as image stores
and sharing non-scanout surfaces.

v2: Marek: rebase, remove decompression from si_flush_resource (not needed)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák b744ac9f44 radeonsi: allocate DCC in the same backing buffer as the texture
To allow sharing textures with DCC enabled.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák 60c08aa90b gallium/radeon: disable CMASK on handle export if sharing doesn't allow it (v2)
v2: remove the list of all contexts

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák 970b979da1 gallium/radeon: eliminate fast color clear before sharing
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák abac6bf67a gallium/radeon: don't use fast color clear if sharing doesn't allow it
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák d4e847ea33 gallium/radeon: disallow handle export for MSAA & depth textures
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák d95f593758 gallium/radeon: remember that texture_from_handle was called and its flags
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák c034d3dde0 gallium/radeon: check that handle usage doesn't change for a resource
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák 6b187bbd9f gallium/radeon: disallow reallocation of shared buffers
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák ecbd3aba17 gallium/radeon: if we can't discard a whole resource, discard the range instead
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák afdaffcbdb gallium/radeon: buffer valid range tracking only works with unshared buffers
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák be73d35829 gallium/radeon: don't set texture metadata for buffers
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák f914779c75 gallium/radeon: set texture metadata only once
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák 69d8b75114 gallium/radeon: clean up r600_texture_get_handle
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák e3cee38e13 gallium/radeon: move code initializing texture metadata to its own function
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák f4aa3256ef winsys/amdgpu: allow drivers to set/get opaque metadata
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák bd1feb2827 gallium/radeon: rename winsys buffer_get/set_tiling to buffer_get/set_metadata
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák 6011d7cf25 gallium/radeon: remove rcs parameter from radeon_winsys::buffer_set_tiling
This was needed for DRM < 2.12.0 where the kernel was rewriting tiling flags
in IBs.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:25 +01:00
Marek Olšák 260ef9c9be gallium/radeon: use a structure for passing tiling flags from/to winsys
and call it radeon_bo_metadata

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:25 +01:00
Marek Olšák 82db518f15 gallium: add external usage flags to resource_from(get)_handle (v2)
This will allow drivers to make better decisions about texture sharing
for DRI2, DRI3, Wayland, and OpenCL.

v2: add read/write flags, take advantage of __DRI_IMAGE_USE_BACKBUFFER

Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-03-09 15:02:25 +01:00
Axel Davy d943ac432d dri: add backbuffer use flag
This will be used by the next commit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-09 15:02:25 +01:00
Timothy Arceri 2188c77a0e glsl: dont allow undefined array sizes in ES
This applies the rule to empty declarations.

Fixes:
dEQP-GLES3.functional.shaders.arrays.invalid.empty_declaration_without_var_name_vertex
dEQP-GLES3.functional.shaders.arrays.invalid.empty_declaration_without_var_name_fragment

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-09 20:30:42 +11:00
Tamil velan 353a4f844f radeon/uvd: increase max height to 4096 for VI and newer
With this issue 'mpv --hwdec=vdpau --vo=vdpau <stream>' fails
for vdpau decode if the stream height is 4096. Vdpau decode of
height upto 4096 is necessary usecase on amdgpu driver for VI
and newer platforms.

The fix is in driver specific implementation of "Decoder
Query Capabilities" API to return 4096 for VI and newer
platforms. With this fix vdpauinfo reports height support as
4096 and mpv for vdpau decode works fine for 4096 height streams.

Signed-off-by: Tamil velan <Tamil-Velan.Jayakumar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-08 19:01:19 -05:00
Bas Nieuwenhuizen 6373845d98 winsys/amdgpu: enlarge buffer_indices_hashlist
Enlarge the buffer hashlist to prevent large numbers of misses
due to adding more buffers than can be cached in the hashlist.

The game I tested had CS's with up to 1500 buffers and the overhead
of amdgpu_lookup_buffer for various sizes was:

4096 1.97% (new value)
2048 4.37%
1024 6.92%
512  9.47% (old value)

(percentage of CPU usage in render thread as determined by perf)

The time spent in amdgpu_add_buffer self is ~4.2% in all cases and
for 4096 the time needed to clear the hashlist is still < 0.10%,
so I am not expecting significant regressions.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-09 00:52:07 +01:00
Samuel Pitoiset 32e848b016 nvc0: add a new validation path for compute
This makes use of the new state validation interface to be consistent
with 3d.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-09 00:19:21 +01:00
Samuel Pitoiset db9b41d302 nvc0: rework the validation path for 3D
This exposes an interface for state validation that will be also used
to rework the compute validation path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-09 00:19:16 +01:00
Jordan Justen a100a57e30 i965/hsw: Initialize SLM index in state register
For Haswell, we need to initialize the SLM index in the state
register. This can be copied out of the CS header dword 0.

v2:
 * Use UW move to avoid changing upper 16-bits of sr0.1 (mattst88)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94081
Fixes: piglit arb_compute_shader/execution/shared-atomics.shader_test
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-08 14:27:18 -08:00
Jordan Justen d8347f12ea i965/compute: Skip SIMD8 generation if it can't be used
If the local workgroup size is sufficiently large, then the SIMD8
program can't be used. In this case we can skip generating the SIMD8
program. For complex programs this can save a significant amount of
time.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-08 14:27:18 -08:00
Jordan Justen e1d54b1ba5 i965/fs: Allow spilling for SIMD16 compute shaders
For fragment shaders, we can always use a SIMD8 program. Therefore, if
we detect spilling with a SIMD16 program, then it is better to skip
generating a SIMD16 program to only rely on a SIMD8 program.

Unfortunately, this doesn't work for compute shaders. For a compute
shader, we may be required to use SIMD16 if the local workgroup size
is bigger than a certain size. For example, on gen7, if the local
workgroup size is larger than 512, then a SIMD16 program is required.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93840
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-08 14:27:18 -08:00
Timothy Arceri 91630d7453 glsl: don't always reject shaders with mismatching ifc blocks
Since we store some member qualifiers in the interface type
we need to be more careful about rejecting shaders just because
the pointer doesn't match. Its perfectly valid for some qualifiers
such as precision to not match across shader interfaces.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-09 09:21:42 +11:00
Timothy Arceri 3026b3565a glsl: make interstage_match() static
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-09 09:21:36 +11:00
Timothy Arceri ebc419fcbd glsl: don't validate ifc blocks using validation meant for variables
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-09 09:21:31 +11:00
Kenneth Graunke 19f13b2096 mesa: Fix error code for GetFramebufferAttachmentParameter in ES 3.0+.
The ES 3.0+ specifications contain the exact same text as the OpenGL
specification, which says that we should return GL_INVALID_OPERATION.

ES 2.0 contains different text saying we should return GL_INVALID_ENUM.

Previously, Mesa chose the error code based on API (GL vs. ES).
This patch makes ES 3.0+ follow the GL behavior.  ES 2 remains as is.

Fixes dEQP-GLES3.functional.fbo.api.attachment_query_empty_fbo.
However, breaks the dEQP-GLES2 variant of the same test for drivers
which silently promote to ES 3.0.  This can be worked around by
exporting MESA_GLES_VERSION_OVERRIDE=2.0, but is a bug in dEQP-GLES2.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-08 12:46:28 -08:00
Kenneth Graunke 8b3496f378 mesa: Add GL_RED and GL_RG to ES3 effective internal format mapping.
The dEQP-GLES3.functional.fbo.completeness.renderable.texture.
{color0,depth,stencil}.{red,rg}_unsigned_byte tests appear to expect
GL_RED/GL_RG and GL_UNSIGNED_BYTE to map to GL_R8/GL_RG8, rather than
returning an INVALID_OPERATION error.

This makes perfect sense.  However, RED and RG are strangely missing
from the ES 3.0/3.1/3.2 spec's "Effective internal format corresponding
to external format and type" tables.  It may be worth filing a spec bug.

Fixes the 6 dEQP tests mentioned above.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-03-08 12:46:28 -08:00
Samuel Pitoiset 752769e053 nv50,nvc0: make sure to destroy the mutex used for blits
This mutex is initialized when the blitter is created, but it is never
destroyed. This doesn't hurt anything but it makes sense to destroy it
at blitter deletion.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-08 21:24:46 +01:00
Marek Olšák 3146014d5f gallium/radeon: don't use temporary buffers for persistent mappings
Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-08 20:08:52 +01:00
Jason Ekstrand 14b18aba89 nir: Add a pass for lower indirect variable dereferences
This new pass lowers load/store_var intrinsics that act on indirect derefs
to if-ladder of direct load/store_var intrinsics.  The if-ladders perform a
simple binary search on the indirect.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-03-08 10:41:54 -08:00
Alejandro Piñeiro ef76ea4ba9 i965/fs/nir: "surface_access::" prefix not needed
"using namespace brw::surface_access" is already present at the
top of the source file.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-08 17:55:28 +01:00
Brian Paul 6857420e79 mesa: fix malformed assertion in _image_format_class_to_glenum()
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
2016-03-08 08:42:56 -07:00
Brian Paul 3ed8729f7b program: minor whitespace clean-ups in program_parse_extra.c 2016-03-08 08:42:56 -07:00
Christian König 37402aa4c6 st/mesa: conditionally enable GL_NV_vdpau_interop
Only enable it when we compile the state tracker as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-08 13:00:04 +01:00
Christian König e148a3b6e9 radeon/uvd: disable MPEG1
The hardware simply doesn't support that correctly.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-08 12:57:08 +01:00
Alejandro Piñeiro 0548844e86 i965/vec4/nir: no need to use surface_access:: to call emit_untyped_atomic
Now that brw_vec4_visitor::emit_untyped_atomic was removed, there is no need
to explicitly set it.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-08 08:22:26 +01:00
Alejandro Piñeiro d3a89a7c49 i965/vec4/nir: remove emit_untyped_surface_read and emit_untyped_atomic at brw_vec4_visitor
surface_access emit_untyped_read and emit_untyped_atomic provides the same
functionality.

v2: surface parameter of emit_untyped_atomic is a const, no need to
    specify default predicate on emit_untyped_atomic, use retype
    (Francisco Jerez).

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-08 08:22:26 +01:00
Alejandro Piñeiro 0c5c2e2c93 i965/vec4: pass the correct src_sz to emit_send at emit_untyped_atomic
If the src is invalid, so src size is zero, the src_sz passed to emit
send should be zero too, instead of a default 1 if we are in a simd4x2
case. This can happens if using emit_untyped_atomic for an atomic
dec/inc.

v2: use the proper src_sz when calling emit_send, instead of just
    avoid loading src at emit_send if BAD_FILE (Francisco Jerez)

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-08 08:22:26 +01:00