Commit Graph

112040 Commits

Author SHA1 Message Date
Hyunjun Ko f7f8fb1b55 freedreno/ir3: fix typo
Fixes: a9b556d3a0 ("freedreno/ir3: check the type of regs of absneg opcode in is_same_type_mov")
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-06-20 08:34:09 -07:00
Alyssa Rosenzweig 546236e27f panfrost: Load from tiled images
Now that we have lima tiling code available, use it to load from a tiled
source.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 08:22:38 -07:00
Alyssa Rosenzweig 035a07c0ae panfrost: Switch to lima tiling
Lima and Panfrost both have implementations of software tiling
(the Lima one was forked off the Panfrost one which was forked off the
original Lima one...). Switch to the most recent Lima code, since it's
more complete than ours at this point.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 08:22:38 -07:00
Alyssa Rosenzweig 7b46f09f26 panfrost: Fix tiled NPOT textures with bpp<4
Panfrost's tiling routines (incorrectly) ignored the source stride,
masking this bug; lima's routines respect this stride, causing issues
when tiling NPOT textures whose stride is not a multiple of 64
(for instance, NPOT textures with bpp=1).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 08:22:38 -07:00
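To illustrate what "respecting the source stride" means in the commit above, here is a minimal sketch of a row copy that steps through each buffer by its own stride instead of assuming tightly packed rows. The helper name `copy_rows` and its parameters are hypothetical; this is not the lima or panfrost tiling code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy `height` rows of `row_bytes` bytes each,
 * advancing by each buffer's own stride rather than assuming that the
 * rows are tightly packed. */
static void
copy_rows(uint8_t *dst, size_t dst_stride,
          const uint8_t *src, size_t src_stride,
          size_t row_bytes, unsigned height)
{
   /* A row can never be wider than the stride of either buffer. */
   assert(row_bytes <= src_stride && row_bytes <= dst_stride);

   for (unsigned y = 0; y < height; ++y)
      memcpy(dst + y * dst_stride, src + y * src_stride, row_bytes);
}
```

For a 5-pixel-wide texture with bpp=1 whose rows are padded to 8 bytes, a call such as `copy_rows(dst, 5, src, 8, 5, height)` reads row `y` at byte offset `8 * y`. A routine that ignored the source stride would read it at offset `5 * y` and pick up padding bytes.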
Alyssa Rosenzweig 413242277a lima,panfrost: Move lima_tiling.c/h to /src/panfrost
This will allow both drivers to share this code. Both drivers
build-tested with meson. Android build not tested.

v2: Change naming from tiling->shared, in case Lima and Panfrost can
share more in the future. Fix Android build system.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-and-tested-by: Qiang Yu <yuq825@gmail.com>
2019-06-20 08:06:35 -07:00
Kenneth Graunke c57b4c86c0 iris: Use render_batch/compute_batch locals in memory_barrier
We have them, may as well use them.
2019-06-20 10:04:38 -05:00
Lionel Landwerlin 4a61be24fe anv: only resort to sync fds internally with no syncobj support
We can rely on only one kind of synchronization object (drm-syncobj)
when it is available. This reduces the number of file descriptors we
use in our implementation.

This will be required later for timeline semaphores implementation, at
this point we won't ever want to use anything else but syncobjs.

v2: Only use has_syncobj for semaphores (Jason)

v3: Only has_syncobj in assert on semaphores in QueueSubmit (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-20 14:59:51 +00:00
Alyssa Rosenzweig 1d7e53a854 panfrost: Remove other commented pointers
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 07:48:05 -07:00
Alyssa Rosenzweig 2608da14b9 panfrost/decode: Elide more zero fields
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 07:48:05 -07:00
Alyssa Rosenzweig cfc2218a8c panfrost/decode: Remove memory comments
These do more harm than good at this point.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 07:48:04 -07:00
Alyssa Rosenzweig 8643b89c48 panfrost: Add missing 0x in invocation_count
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 07:48:04 -07:00
Alyssa Rosenzweig b6d46d09c2 panfrost/decode: Skip decode of fragment backend in non-fragment
This is all zero for anything but fragment shaders.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 07:48:04 -07:00
Alyssa Rosenzweig ae2bfab7b7 panfrost/decode: Clip mali_compute_fbd at 64-bytes
Looking at internal evidence (later fields including a literal other
compute job inception-style, seeming memory corruption, no clear
function, and the field after this being a pointer to *itself*), it
looks like this is really a much smaller descriptor.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 07:48:04 -07:00
Alyssa Rosenzweig 3faf33488a panfrost/decode: Print COMPUTE uniforms as pointers
In OpenGL, uniforms generally represent fp32 vec4s (at least in highp
mode). In OpenCL, they represent vec2s of 64-bit pointers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 07:48:04 -07:00
Alyssa Rosenzweig 0021fae7f8 panfrost/decode: Show int uniforms
Float is ambiguous.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 07:48:04 -07:00
Alyssa Rosenzweig 1f7dfee1b4 panfrost/decode: Expand pointers in compute descriptor
Just as an aid.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 07:48:04 -07:00
Alyssa Rosenzweig 0aa5d89acb panfrost/decode: Identify "compute FBD"
There is fundamentally not a framebuffer associated with a compute job.
Allocate a new structure for it so we don't mess up graphics when
decoding.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 07:48:04 -07:00
Tomeu Vizoso 4f881237c3 panfrost: Allocate panfrost_job in panfrost_context
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 15:48:35 +02:00
Tomeu Vizoso b5db7cce60 panfrost: Release transient pools
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 15:48:35 +02:00
Tomeu Vizoso 6cec937e22 panfrost: ci: Exclude flip-flops from results
These tests are failing at times, blacklist for now:

dEQP-GLES2.functional.fbo.render.shared_colorbuffer_clear.tex2d_rgba
dEQP-GLES2.functional.fbo.render.shared_colorbuffer_clear.tex2d_rgb
dEQP-GLES2.functional.shaders.matrix.mul.dynamic_highp_mat4_vec4_vertex

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-20 15:48:15 +02:00
Alejandro Piñeiro 6a159bca9d util: add empty line before virgl options
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-06-20 15:21:39 +02:00
Alejandro Piñeiro 790c3dbac8 util: add missing DRI_CONF_OPT_END
When DRI_CONF_GLES_EMULATE_BGRA was added for the virgl driver, it
missed a DRI_CONF_OPT_END.

This makes some drivers, like vc4/v3d, crash with the following
error:
Fatal error in __driConfigOptions line 99, column 2: mismatched tag.

Not sure why it doesn't fail with virgl.

Fixes: b793663449
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-20 14:11:30 +02:00
Eric Engestrom a9e09d56a9 isl: tag unreachable path as such
GCC should be able to figure out that all the possible enum values are
exhausted in the switch() and all the branches return from the function,
but apparently it doesn't, so let's tell the compiler explicitly.

This gets rid of the following warnings in GCC 9:

    [1/24] Compiling C object 'src/intel/isl/60d23f8@@isl@sta/isl.c.o'.
    ../src/intel/isl/isl.c: In function ‘isl_surf_init_s’:
    ../src/intel/isl/isl.c:1569:10: warning: ‘array_pitch_el_rows’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     1569 |    *surf = (struct isl_surf) {
          |    ~~~~~~^~~~~~~~~~~~~~~~~~~~~
     1570 |       .dim = info->dim,
          |       ~~~~~~~~~~~~~~~~~
     1571 |       .dim_layout = dim_layout,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~
     1572 |       .msaa_layout = msaa_layout,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1573 |       .tiling = tiling,
          |       ~~~~~~~~~~~~~~~~~
     1574 |       .format = info->format,
          |       ~~~~~~~~~~~~~~~~~~~~~~~
     1575 |
          |
     1576 |       .levels = info->levels,
          |       ~~~~~~~~~~~~~~~~~~~~~~~
     1577 |       .samples = info->samples,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~
     1578 |
          |
     1579 |       .image_alignment_el = image_align_el,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1580 |       .logical_level0_px = logical_level0_px,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1581 |       .phys_level0_sa = phys_level0_sa,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1582 |
          |
     1583 |       .size_B = size_B,
          |       ~~~~~~~~~~~~~~~~~
     1584 |       .alignment_B = base_alignment_B,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1585 |       .row_pitch_B = row_pitch_B,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1586 |       .array_pitch_el_rows = array_pitch_el_rows,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1587 |       .array_pitch_span = array_pitch_span,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1588 |
          |
     1589 |       .usage = info->usage,
          |       ~~~~~~~~~~~~~~~~~~~~~
     1590 |    };
          |    ~
    ../src/intel/isl/isl.c:1488:24: warning: ‘*((void *)&phys_total_el+4)’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     1488 |    struct isl_extent2d phys_total_el;
          |                        ^~~~~~~~~~~~~
    ../src/intel/isl/isl.c:1335:38: warning: ‘phys_total_el’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     1335 |       isl_align_div(phys_total_el->w * tile_el_scale,
          |                     ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
    ../src/intel/isl/isl.c:1488:24: note: ‘phys_total_el’ was declared here
     1488 |    struct isl_extent2d phys_total_el;
          |                        ^~~~~~~~~~~~~

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-20 12:05:14 +00:00
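The pattern described in the isl commit above can be sketched as follows: a switch covers every enum value and each branch returns, yet the compiler may still assume control can fall out of the switch, so the path after it is marked unreachable explicitly. The enum, the function and the `UNREACHABLE` macro here are hypothetical stand-ins, not the isl code or Mesa's own macro, and `__builtin_unreachable()` assumes GCC or Clang.

```c
#include <assert.h>

/* Hypothetical stand-in for an "unreachable" macro: assert in debug
 * builds, and tell the compiler this point is never reached. */
#define UNREACHABLE(msg)        \
   do {                         \
      assert(!msg);             \
      __builtin_unreachable();  \
   } while (0)

/* Hypothetical enum; every value is handled in the switch below. */
enum layout { LAYOUT_A, LAYOUT_B, LAYOUT_C };

static int
rows_for_layout(enum layout layout, int height)
{
   switch (layout) {
   case LAYOUT_A:
      return height;
   case LAYOUT_B:
      return height * 2;
   case LAYOUT_C:
      return height * 4;
   }

   /* Without this, the compiler may treat falling out of the switch as
    * possible and warn about values derived from this function. */
   UNREACHABLE("invalid layout");
}
```

Leaving out a `default:` label is deliberate in this sketch: it keeps `-Wswitch` able to flag enum values added later, while the explicit marker covers the fall-through path.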
Samuel Pitoiset f179febde0 radv: enable DCC for mipmapped color textures on GFX8
It's tricky on GFX9, so only GFX8 for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-20 11:04:02 +02:00
Samuel Pitoiset 17f94e1984 radv: do not fast clear if one level can't be fast cleared
And fall back to slow color clears.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-20 11:03:58 +02:00
Samuel Pitoiset 450bce522a radv: add fast clears support for mipmapped color images with DCC
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-20 11:03:57 +02:00
Samuel Pitoiset fa903ba799 radv: add radv_dcc_clear_level() helper
For clearing only one level.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-20 11:03:53 +02:00
Samuel Pitoiset b92d87f7f0 radv: re-initialize DCC metadata after decompressing using compute
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-20 11:03:52 +02:00
Samuel Pitoiset dc6e3053a7 radv: initialize levels without DCC during layout transitions
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-20 11:03:49 +02:00
Thomas Hellstrom 71b43490dd svga: Support ARB_buffer_storage
This basically boils down to supporting persistent and coherent buffer
storage.
We chose to use coherent buffer storage for all persistent buffers
even if it's not explicitly specified, since using glMemoryBarrier to
obtain coherency would be particularly expensive in our driver stack,
and require a lot of additional bookkeeping.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-06-20 09:30:22 +02:00
Thomas Hellstrom 8c01e5ed5f gallium/util: Make it possible to disable persistent maps in the upload manager
For svga, the use of persistent / coherent maps is typically slightly
slower than without them. It's probably a bit case-dependent and
possible to tune, but for now, make sure we can disable those.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-06-20 09:30:22 +02:00
Thomas Hellstrom 3b828c4e68 svga: Map vertex- index- and constant buffers asynchronously when reading
With SWTNL and index translation we're mapping buffers for reading. These
buffers are commonly upload_mgr buffers that might already be referenced
by another submitted or unsubmitted GPU command. A synchronous map will
then trigger a flush and sync, at least on Linux that doesn't distinguish
between read- and write referencing. So map these buffers async. If they
for some obscure reason happen to be dirty (stream-output, buffer-copy),
the resource_buffer code will read-back and sync anyway. For persistent /
coherent buffers a corresponding read-back and sync will happen in the
kernel fault handler.

Testing: Piglit quick. No regressions.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-06-20 09:30:22 +02:00
Thomas Hellstrom f51915ba62 svga: Fix index buffer uploads
In the case of SWTNL and index translation we were uploading index buffers
and then reading out from them using the CPU. Furthermore, when translating
indices we often cached the results with an upload_mgr buffer, causing the
cached indexes to be immediately discarded on the next write to that
upload_mgr buffer.

Fix this by only uploading when we know the index buffer is going to be
used by hardware. If translating, only cache translated indices if the
original buffer was not a user buffer. In the latter case when we're not
caching, use an upload_mgr buffer for the hardware indices.

This means we can also remove the SWTNL hand-crafted index buffer upload
mechanism in favour of the upload_mgr.

Finally avoid using util_upload_index_buffer(). It wastes index buffer
space by trying to make sure that the offset of the indices in the
upload_mgr buffer is larger or equal to the position of the indices in
the source buffer. From what I can tell, the SVGA device does not
require that.

Testing done: Piglit quick. No regressions.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-06-20 09:30:22 +02:00
Thomas Hellstrom 4f59d51d82 winsys/svga: Make it possible to specify coherent resources
Add a flag in the surface cache key and a winsys usage flag to
specify coherent memory.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-06-20 09:30:22 +02:00
Thomas Hellstrom 4412be40dd gallium/util: Make u_debug_flush support persistent maps
Previously, unsynchronized maps were assumed to also be persistent.
Now distinguish between persistent and unsynchronized maps and also support
PIPE_TRANSFER_PERSISTENT from ARB_buffer_storage.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-06-20 09:30:22 +02:00
Gert Wollny a478e56fbd virgl: Add debug flag to bypass driconf to enable the BGRA tweaks
This is useful for testing, also because with vtest the dri configuration
is not read.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-06-20 08:50:38 +02:00
Gert Wollny 5dbecf7863 virgl: Add a tweak to set the value for emulated queries of GL_SAMPLES_PASSED
On GLES hosts GL_SAMPLES_PASSED is emulated by GL_ANY_SAMPLES_PASSED, which returns a boolean.
With this tweak the value that is returned if any sample passed can be set. This
may be of interest when an application decides whether some geometry is rendered based
on an amount of visibility and not just a binary decision. virglrenderer sets a default
of 1024 on the host.

v2: Remove reference from virgl and correct description (Emil)
v3: Send the tweak binary encoded instead of using strings (Gurchetan)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-06-20 08:50:38 +02:00
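A minimal sketch of the emulation described in the commit above, assuming the boolean GL_ANY_SAMPLES_PASSED result is available as a `bool`. The function name, its parameters and the `SAMPLES_PASSED_DEFAULT` constant are hypothetical; only the value 1024 comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant: the default the commit message says
 * virglrenderer sets on the host. */
#define SAMPLES_PASSED_DEFAULT 1024u

/* Map a boolean "any samples passed" result to a sample count. A GLES
 * host cannot report the real count, so a configurable value is
 * returned whenever at least one sample passed. */
static uint32_t
emulate_samples_passed(bool any_samples_passed, uint32_t tweak_value)
{
   return any_samples_passed ? tweak_value : 0;
}
```

An application that compares the query result against a visibility threshold would then see either 0 or the configured value, never the true sample count.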
Gert Wollny 59757dbad6 virgl: Add tweak to apply a swizzle when drawing/blitting to an emulated BGRA texture
With Qemu this final swizzle is not needed, but with vtest it is, i.e. it depends on
how a program using virglrenderer uses the surface that is rendered to, hence
a tweak is added.

v2: Update description and fix spelling (Emil)
v3: Send tweak as binary value instead of using strings (Gurchetan)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-06-20 08:50:38 +02:00
Gert Wollny b793663449 virgl: Add driconf tweak for emulating BGRA surfaces on GLES
These tweaks are used to fix rendering issues with Valve games and
at least also "The Raven Remastered" when run on a GLES host.

v2: Fix type in define and remove virgl from driconf option (Emil)
v3: Encode tweak binary instead of using strings (Gurchetan)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-06-20 08:50:38 +02:00
Gert Wollny 13d4a34c44 virgl: Add override for BGRA format to use swizzled SRGB format
Tie in the check whether the host supports tweaks and whether this tweak
is enabled.

v2: Add comment about the emulated formats not being used directly in the
    guest (Gurchetan)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-06-20 08:50:38 +02:00
Gert Wollny 22edafb239 virgl: Add code to accept BGRx_SRGB as RGBx_SRGB
This will be enabled in later patches by the emulation tweak.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-06-20 08:50:38 +02:00
Gert Wollny d8967b7951 virgl: Add skeleton to evaluate cap and send tweaks
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-06-20 08:50:38 +02:00
Gert Wollny 28dc096e15 virgl: factor out format host bits check
This will make it a single location when we want to replace a format.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-06-20 08:50:38 +02:00
Gert Wollny 30eb1fdc51 gallium/virgl: Add code path for virgl to read driconf
This works only for the drm variant of virgl and not for the vtest
variant.

v2: Rebase, replace the configuration query function by a pointer to
    the configuration data.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-06-20 08:50:38 +02:00
Gert Wollny cf800998af virgl: Add driinfo file and tie it into the build
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-06-20 08:50:37 +02:00
Caio Marcelo de Oliveira Filho 9b0720c436 glspirv: Call pass to lower frexp instructions
These were previously handled by spirv_to_nir, but that changed to
be an explicit pass in 23d30f4099 "spirv,nir: lower
frexp_exp/frexp_sig inside a new NIR pass"

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-19 22:07:57 -07:00
Caio Marcelo de Oliveira Filho 12131096fa spirv: Restrict use of descriptor intrinsics to Vulkan
In ARB_gl_spirv we'll be able to use variables for uniform buffers, so
don't use the descriptor intrinsics to lower the block access.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-19 22:07:51 -07:00
Nicolai Hähnle 21dd881416 ac/rtld: report better error messages for LDS overallocation
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-06-19 20:30:32 -04:00
Marek Olšák b64bd5887e ac/rtld: check correct LDS max size
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-06-19 20:30:32 -04:00
Nicolai Hähnle 1ee0f0d315 radeonsi: add s_sethalt to shaders for debugging
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-06-19 20:30:32 -04:00