Commit Graph

22859 Commits

Author SHA1 Message Date
Marek Olšák 2792ec2cdd radeonsi: rename rview -> sview
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:26:45 -05:00
Marek Olšák 96610f625d radeonsi: rename rscreen -> sscreen
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:25:57 -05:00
Marek Olšák 86e25ed5a3 radeonsi: disable render cond & pipeline stats for internal compute dispatches 2019-01-22 12:24:35 -05:00
Sonny Jiang 1b25d340b7 radeonsi: use compute for resource_copy_region when possible
v2: marek: fix snorm8 blits

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-01-22 12:24:35 -05:00
Jiang, Sonny 8daf5bb209 radeonsi: add compute_last_block to configure the partial block fields 2019-01-22 12:22:46 -05:00
Marek Olšák b443465fb9 gallium/util: add util_format_snorm8_to_sint8 (from radeonsi) 2019-01-22 12:21:43 -05:00
Marek Olšák 4d5f8f39f3 radeonsi: move PKT3_WRITE_DATA generation into a helper function
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Marek Olšák c252273f98 radeonsi: don't use WRITE_DATA.DST_SEL == MEM_GRBM on >= CIK
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Marek Olšák a545415eb9 radeonsi: fix the top-of-pipe fence on SI
SI doesn't have MEM.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Marek Olšák e402961e1d radeonsi: correct WRITE_DATA.DST_SEL definitions
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Marek Olšák c605738113 radeonsi: compile clear and copy buffer compute shaders on demand
same as all other shaders
2019-01-22 11:59:27 -05:00
Marek Olšák f139589069 radeonsi: remove redundant call to emit_cache_flush in compute clear/copy
launch_grid calls it.
2019-01-22 11:59:27 -05:00
Marek Olšák e3d283eaca radeonsi: use buffer_store_format_x & xy 2019-01-22 11:59:27 -05:00
Marek Olšák 4c4c8bb1f0 radeonsi: fix rendering to tiny viewports where the viewport center is > 8K
This fixes an assertion failure with GL CTS when cts-runner is used.
(not a specific test)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108877
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
2019-01-22 11:59:27 -05:00
Marek Olšák caa2dcd730 radeonsi: fix a u_blitter crash after a shader with FBFETCH
This fixes an assertion failure with GL CTS when cts-runner is used.
(not a specific test)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108877
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
2019-01-22 11:59:27 -05:00
Jonathan Marek fc4f6b2f12 freedreno: a2xx: add partial lower_scalar pass for ir2
Some instructions can only be scalar on a2xx, lower these only

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-22 14:45:03 +00:00
Jonathan Marek 9f614c74b7 freedreno: a2xx: add ir2 copy propagation
Two cases:
* replacing srcs which refer to MOV instructions
* replacing MOVs used to write to exports

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-22 14:45:03 +00:00
Jonathan Marek c7dbf0b280 freedreno: a2xx: insert scalar MOV to allow 2 source scalar
If we want to use a scalar instruction with two sources, both sources have
to be in the same register. This covers a common case by inserting a scalar
MOV into a previous instruction with only a vector alu instruction.

A better method would be to have the sources end up in the same register in
the first place, but when one source is a constant this is the only way.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-22 14:45:03 +00:00
Jonathan Marek 67610a0323 freedreno: a2xx: NIR backend
This patch replaces the a2xx TGSI compiler with a NIR compiler.

It also adds several new features:
-gl_FrontFacing, gl_FragCoord, gl_PointCoord, gl_PointSize
-control flow (including loops)
-texture related features (LOD/bias, cubemaps)
-filling scalar ALU slot when possible

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
2019-01-22 14:45:03 +00:00
Karol Herbst 8bb46de08b mesa: add MESA_SHADER_KERNEL
used for CL kernels

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 20:36:41 +01:00
Jonathan Marek 5886c5d092 freedreno: a2xx: sysmem rendering
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:22:34 -05:00
Jonathan Marek bec6e4b054 freedreno: a2xx: fix non-zero texture base offsets
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:22:27 -05:00
Jonathan Marek 02ab85afd8 freedreno: a2xx: fix VERTEX_REUSE/DEALLOC on a20x
On a20x, set VGT_VERTEX_REUSE_BLOCK_CNTL to 2 and don't change it. Small
rearrangement on a220 to reduce the size of draw commands.

Only set DEALLOC_CNTL on a20x because the correct a220 value is not known.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:22:22 -05:00
Jonathan Marek 0286a11b7e freedreno: a2xx: fix gmem2mem viewport
Fixes cases where previous viewport values might case gmem2mem to fail.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:22:16 -05:00
Jonathan Marek 64b12520a2 freedreno: a2xx: cleanup REG_A2XX_PA_CL_VTE_CNTL
Doesn't change much, but reduces the size of fd2_emit_state

gmem2mem does not need to change the value: no Z clipping on resolve
mem2gmem now needs to restore the common value after rendering

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:22:10 -05:00
Jonathan Marek 6ef7700ac6 freedreno: a2xx: cleanup init_shader_const
Only 3 vertices are used so we can drop the data for vertex 4

It doesn't make sense to have 1.1 for some coordinates, use 1.0 instead

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-21 09:21:51 -05:00
Karol Herbst 0a793c78a3 nir: add bit_size parameter to system values with multiple allowed bit sizes
v2: add assert to verify we have at least one valid bit_size
v3: fix use of load_front_face in nir_lower_two_sided_color and tgsi_to_nir

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 00:17:18 +01:00
Rhys Kidd 8002eaab6c nv50,nvc0: add missing CAPs for unsupported features
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-20 13:51:01 -05:00
Karol Herbst 6fefd69724 nir: rename nir_var_ssbo to nir_var_mem_ssbo
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Karol Herbst 3afc1e068f nir: rename nir_var_ubo to nir_var_mem_ubo
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Karol Herbst 9b24028426 nir: rename nir_var_function to nir_var_function_temp
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Kristian H. Kristensen 5486c9d526 freedreno/a6xx: Turn on texture tiling by default
The color swap isn't available for tiled formats and it's not needed
either. We pick one channel order and use for all non-linear formats.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-18 14:27:15 -08:00
Kristian H. Kristensen 60c6778dda freedreno: Synchronize batch and flush for staging resource
Staging blit downloads would wait on the src resource instead of the
staging resource and didn't make sure to submit the blit batch first.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-01-18 14:27:12 -08:00
Karol Herbst 80dae7022e gm107/ir: disable TEXS for tex with derivAll set
fixes deqp tests:
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.samplercube_fixed_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.samplercube_float_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.isamplercube_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.usamplercube_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler3d_fixed_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler3d_float_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.isampler3d_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.usampler3d_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler2dshadow_vertex
dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler3d_fixed_vertex
dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler3d_float_vertex
dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.isampler3d_vertex
dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.usampler3d_vertex
dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler2dshadow_vertex

Fixes: f821e80213
       "gm107/ir: use scalar tex instructions where possible"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-18 03:27:51 +01:00
Karol Herbst 30b5c9eda2 nv50/ir: disable tryCollapseChainedMULs in ConstantFolding for precise instructions
fixes dEQP-GLES2.functional.shaders.invariance.mediump.loop_3

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-01-18 02:03:30 +01:00
Eric Anholt 59527a36e9 v3d: Restructure RO allocations using resource_from_handle.
I had bugs in the old path where I was laying out as tiled (so we'd render
tiled) but then only allocating space in the shared object for linear
rendering.  The resource_from_handle makes it so the same layout choices
are made in both the import and export scanout cases.  Also, fixes a leak
of the fd that was tripping up the CTS.

Now that we're checking PIPE_BIND_SHARED to choose to use RO, the
DRM_FORMAT_MOD_LINEAR check wasn't needed any more.

Fixes visual corruption and MMU faults in X in renderonly mode.

Fixes: bd09bb1629 ("v3d: SHARED but not necessarily SCANOUT buffers on RO must be linear.")
2019-01-16 16:28:41 -08:00
Eric Anholt d70eb2302b v3d: If the modifier is not known on BO import, default to linear for RO.
Part of fixing DRI3 rendering with RO on X11.

Fixes: e113b21cb7 ("v3d: Add renderonly support.")
2019-01-16 16:28:41 -08:00
Timothy Arceri e106e0f2dd radeonsi/nir: get correct type for images inside structs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-17 10:35:36 +11:00
Alok Hota 187a6506a3 swr/rast: Store cached files in multiple subdirs
This improves cache filesystem performance, especially during CI tests
Also updated jitcache magic number due to codegen parameter changes
Removed 2 `if constexpr` to prevent C++17 requirement
2019-01-16 13:53:30 -06:00
Alok Hota bb98be61f4 swr/rast: New execution engine per JIT
Fixes relocation errors with LLVM 7.0.0
2019-01-16 13:53:30 -06:00
Alok Hota b135db5d58 swr/rast: Scope MEM_CLIENT enum for mem usages
Avoids confusion with other defaulted integer parameters

- fixed some unspecified usages
- removed unnecessary includes
- removed unecessary protected access specifier in buckets framework
2019-01-16 13:53:30 -06:00
Alok Hota c722ad7379 swr/rast: Unaligned and translations in gathers
- added graphics address translation in odd gathers
- added support for unaligned gathers in fetch shader
- changed how 2+ GB offsets are handled to make them compatible with
unaligned offsets
2019-01-16 13:53:30 -06:00
Alok Hota 9459863dfa swr/rast: partial support for Tiled Resources
- updated sample from TRTT surfaces correctly
- implemented mapped status return for TRTT surfaces
- implemented per-sample instruction minLod clamp
- updated bilinear filter weight calculation to be closer to D3D specs
- implemented "ReducedTexcoordRange" operation from D3D specs to avoid
loss of precision on high-value normalized coordinates
2019-01-16 13:53:30 -06:00
Alok Hota 9cacf9d877 swr/rast: Add annotator to interleave isa text
To make debugging simpler
2019-01-16 13:53:30 -06:00
Alok Hota c9fa2ee343 swr/rast: Use gfxptr_t value in JitGatherVertices
Use gfxptr_t type value for stream pointer uses in gather and similar
calls
2019-01-16 13:53:30 -06:00
Bruce Cherniak ed7673afd2 gallium/swr: Fix multi-context sync fence deadlock.
Various recreation scenarios lead to API thread getting stuck in
swr_fence_finish().  This is a multi-context issue, whereby one context
overwrites the fence read-value with a previous sync's lesser value.
The fence sync value is supposed to be always increasing.

In swr_fence_cb(), only update the "read" value if the new value is
greater.

(This may seem like we're not waiting on the other context to finish, but
had we needed for it to finish there would have been a wait prior to
submitting a new sync.)

cc: mesa-stable@lists.freedesktop.org
2019-01-16 09:26:36 -06:00
Marek Olšák 5183e794af radeonsi: also apply the GS hang workaround to draws without tessellation
ported from AMDVLK.

Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 18:55:58 -05:00
Eric Anholt bd09bb1629 v3d: SHARED but not necessarily SCANOUT buffers on RO must be linear.
We don't have a way to talk to RO about modifiers it can do yet, so assume
the minimum.
2019-01-14 15:40:55 -08:00
Eric Anholt 6281f26f06 v3d: Add support for shader_image_load_store.
This is only exposed on V3D 4.1+, because we didn't have the TMU write
operations for images on 3.3 (To do GLES 3.1 there, you have to lower it
to SSBO load/stores, which is a problem to solve later).
2019-01-14 15:40:55 -08:00
Eric Anholt 5932c2f0b9 v3d: Add SSBO/atomic counters support.
So far I assume that all the buffers get written.  If they weren't, you'd
probably be using UBOs instead.
2019-01-14 15:40:55 -08:00