Commit Graph

117507 Commits

Author SHA1 Message Date
Marek Olšák e00791c552 st/mesa: fix Sanctuary and Tropics by disabling ARB_gpu_shader5 for them
They use the "sample" keyword as a variable name.

Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 19:23:37 -05:00
Lionel Landwerlin 34f32a6d66 anv: implement VK_KHR_timeline_semaphore
v2: Fix inverted condition in vkGetPhysicalDeviceExternalSemaphoreProperties()

v3: Add anv_timeline_* helpers (Jason)

v4: Avoid variable shadowing (Jason)
    Split timeline wait/signal device operations (Jason/Lionel)

v5: s/point/signal_value/ (Jason)
    Drop piece of drm-syncobj timeline code (Jason)

v6: Add missing sync_fd semaphore signaling (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Jason Ekstrand 5a4f15ef2c anv: Plumb timeline semaphore signal/wait values through from the API
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin edc6606d4e anv/wsi: signal the semaphore in the acquireNextImage
We seem to have forgotten about the semaphore in the
acquireNextImageInfo.

v2: Signal semaphore/fence regardless of presentation status (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Jason Ekstrand b10b455c1d anv: Lock around fetching sync file FDs from semaphores
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin 246261f0ad anv: prepare the driver for delayed submissions
Timeline semaphore introduce support for wait before signal behavior,
which means that it is now allowed to call vkQueueSubmit() with wait
semaphores not yet submitted for execution. Our kernel driver requires
all of the wait primitives to be created before calling the execbuf
ioctl. As a result, we must delay submissions in the userspace driver.
This change store the necessary information to be able to delay a
VkSubmitInfo submission to the kernel driver.

v2: Fold count++ into array access (Jason)
    Move queue list to another patch (Jason)

v3: Document cleanup of temporary semaphores (Jason)

v4: Track semaphores of SYNC_FD type that needs updating after delayed
    submission

v5: Don't forget to update sync_fd in signaled semaphores after
    submission (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin 3e22363537 anv: refcount semaphores
Delayed submissions required by timeline semaphores mean we need to be
able to update the sync fd backed semaphores in a delayed fashion.
This could mean a race between the application destroying the
semaphore and the submission code trying to update it with the new
sync fd.

This change prepares semaphores to be refcounted, we'll most likely
only take a reference for cases where we signal a sync fd semaphore.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin 3da798c9f1 anv: prepare driver to report submission error through queues
When we will submit to i915 from a submission thread, we won't be able
to directly report the error to the user (in particular through the
debug report callbacks). So prepare 2 paths to report errors device ->
notifying the user immediately, queue -> notifying the user the next
time an entry point is called.

In this change we still report directly for both paths, this will
change in the next commit.

v2: Split NULL batch parameter handling in
    anv_queue_submit_simple_batch() in a different commit

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin 89de271bc2 anv: allow NULL batch parameter to anv_queue_submit_simple_batch
We can reuse device->trivial_batch_bo

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin f606c12731 anv: move queue init/finish to anv_queue.c
Prepare the queue initialization to take on more responsabilities and
possibly fail.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin 206ab49ba1 anv: expose timeout helpers outside of anv_queue.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin 2f4dcc8a1c anv: detach batch emission allocation from device
In the future we'll have 2 different allocations depending on whether
we're using threaded submission or not.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin 935f8f0e56 anv: remove list items on batch fini
This doesn't seem to fix anything because those destroy() calls happen
right before the command buffer object & its list of batch_bo is also
destroyed. Still looks a bit cleaner.

v2: Found a second occurence

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
Fixes: 26ba0ad54d ("vk: Re-name command buffer implementation files")
Cc: <mesa-stable@lists.freedesktop.org>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin 048f0690ee anv: invalidate file descriptor of semaphore sync fd at vkQueueSubmit
We always close the in_fence at the end the anv_cmd_buffer_execbuf()
so when we take it from the semaphore, let's not forget to invalidate
it.

Note that the code leaks the fence_in if we get any error before
reaching the close(). Let's fix that in another patch or better,
rewrite the whole thing!

v2: drop redundant fd = -1 (Jason)

v3: Update commit message (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Rhys Perry de998d3eb5 radv: fix radv_nir_get_max_workgroup_size when nir=NULL
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 84a1a2578 ('compiler: pack shader_info from 160 bytes to 96 bytes')
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-11 20:44:12 +00:00
Lionel Landwerlin f93bb90302 mesa: check framebuffer completeness only after state update
The change made in 88d665830f ("mesa: check draw buffer completeness
on glClearBufferfi/glClearBufferiv") correctly updated the state prior
to checking the framebuffer completeness on glClearBufferiv but not in
glClearBufferfi.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Fixes: 88d665830f ("mesa: check draw buffer completeness on glClearBufferfi/glClearBufferiv")
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/2072
2019-11-11 22:04:55 +02:00
Caio Marcelo de Oliveira Filho d4a3b09c4b glsl: Check earlier for MaxTextureImageUnits and MaxImageUniforms
Currently the linker do all the work then check for the limits, which
means num_textures and num_images in shader_info may have to store more
than the limit.  This breaks down now since shader_info was packed and
doesn't expect to store larger invalid values.

To fix this, pull the check before we set the counts in shader_info.
Add necessary plumbing to make sure we bail once those errors are
found.

Fixes: 84a1a2578d ("compiler: pack shader_info from 160 bytes to 96 bytes")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 10:58:40 -08:00
Caio Marcelo de Oliveira Filho fce76ae769 glsl: Check earlier for MaxShaderStorageBlocks and MaxUniformBlocks
Currently the linker do all the work then check for the limits, which
means num_ssbos and num_ubos in shader_info may have to store more
than the limit.  This breaks down now since shader_info was packed and
doesn't expect to store larger invalid values.

To fix this, pull the check before we set the counts in shader_info.
One drawback of this approach is that for some cases we might not see
the collected errors from various stages, but bail as soon as a stage
breaks the limits.

Fixes: 84a1a2578d ("compiler: pack shader_info from 160 bytes to 96 bytes")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 10:58:40 -08:00
Dylan Baker a8d941091f util: Use ZSTD for shader cache if possible
This allows ZSTD instead of ZLIB to be used for compressing the shader
cache.

On a 72 core system emulating skl with a full shader-db (with i965):
ZSTD:
    1915.10s user 229.27s system 5150% cpu 41.632 total (cold cache)
    225.40s user 10.87s system 3810% cpu 6.201 total (warm cache)
    154M (235M on disk)
ZLIB:
    2231.33s user 194.24s system 1899% cpu 2:07.72 total (cold cache)
    229.15s user 10.63s system 3906% cpu 6.139 total (warm cache)
    163M (244M on disk)

Tim Arceri sees (8 core ryzen and a full shader-db):
ZSTD:
    2505.22 user 40.50 system 3:18.73 elapsed 1280% CPU (cold cache)
    418.71 user 14.93 system 0:46.53 elapsed 931% CPU (warm cache)
    454.3 MB (681.7 MB on disk)
ZLIB:
    3069.83 user 40.02 system 4:20.13 elapsed 1195% CPU (cold cache)
    425.50 user 15.17 system 0:46.80 elapsed 941% CPU (warm cache)
    470.3 MB (701.4 MB on disk)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-11 18:53:45 +00:00
Laurent Carlier 57acf921e2 egl: avoid local modifications for eglext.h Khronos standard header file
Move differences in eglextchromium.h header file, then provide the same header than libglvnd-1.2
So program that omit to include eglextchromium.h will fail to build with both mesa and libglvnd headers.

Fixes: a0a8109f "include: add the definition of EGL_EXT_image_flush_external"
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-11 17:20:16 +00:00
Eric Engestrom eaf4396602 egl: move #include of local headers out of Khronos headers
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-11 17:20:16 +00:00
Jason Ekstrand 69244fc72a intel/fs: Lower large local arrays to scratch
Shader-db results on Kaby Lake:

    total instructions in shared programs: 14929212 -> 14880028 (-0.33%)
    instructions in affected programs: 72428 -> 23244 (-67.91%)
    helped: 6
    HURT: 2
    helped stats (abs) min: 2165 max: 15981 x̄: 8590.00 x̃: 7624
    helped stats (rel) min: 56.06% max: 74.52% x̄: 67.55% x̃: 72.08%
    HURT stats (abs)   min: 1178 max: 1178 x̄: 1178.00 x̃: 1178
    HURT stats (rel)   min: 350.60% max: 361.35% x̄: 355.97% x̃: 355.97%
    95% mean confidence interval for instructions value: -11947.03 -348.97
    95% mean confidence interval for instructions %-change: -125.72% 202.37%
    Inconclusive result (%-change mean confidence interval includes 0).

    total cycles in shared programs: 368585300 -> 342557344 (-7.06%)
    cycles in affected programs: 28144921 -> 2116965 (-92.48%)
    helped: 6
    HURT: 2
    helped stats (abs) min: 1404978 max: 7766106 x̄: 4353922.00 x̃: 3890682
    helped stats (rel) min: 82.01% max: 95.57% x̄: 89.95% x̃: 92.28%
    HURT stats (abs)   min: 47778 max: 47798 x̄: 47788.00 x̃: 47788
    HURT stats (rel)   min: 278.20% max: 282.98% x̄: 280.59% x̃: 280.59%
    95% mean confidence interval for cycles value: -5900438.73 -606550.27
    95% mean confidence interval for cycles %-change: -140.79% 146.16%
    Inconclusive result (%-change mean confidence interval includes 0).

    total spills in shared programs: 9243 -> 8901 (-3.70%)
    spills in affected programs: 2718 -> 2376 (-12.58%)
    helped: 4
    HURT: 4

    total fills in shared programs: 21831 -> 10141 (-53.55%)
    fills in affected programs: 11804 -> 114 (-99.03%)
    helped: 6
    HURT: 2

    total sends in shared programs: 815912 -> 815912 (0.00%)
    sends in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    LOST:   1
    GAINED: 3

The helped shaders are all compute shaders in Aztec Ruins.  There is
also a compute shader in synmark2 OglCSDof that's helped but it doesn't
show up in above shader-db results because it went from SIMD8 to SIMD16.
That shader improves enough to yield an 15-20% performance boost to the
benchmark as a whole on my KBL laptop.  The hurt shaders are a couple
shaders in Kerbal Space Program and a couple in Aztec Ruins.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-11 17:17:02 +00:00
Jason Ekstrand 53bfcdeecf intel/fs: Implement the new load/store_scratch intrinsics
This commit fills in a number of different pieces:

 1. We add support to brw_nir_lower_mem_access_bit_sizes to handle the
    new intrinsics.  This involves simple plumbing work as well as a
    tiny bit of extra logic to always scalarize scratch intrinsics

 2. Add code to brw_fs_nir.cpp to turn nir_load/store_scratch intrinsics
    into byte/dword scattered read/write messages which use the A32
    stateless model.

 3. Add code to lower_surface_logical_send to handle dword scattered
    messages and the A32 stateless model.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-11 17:17:02 +00:00
Jason Ekstrand e2297699de intel/nir: Plumb devinfo through lower_mem_access_bit_sizes
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-11 17:17:02 +00:00
Jason Ekstrand 1dff48af05 intel/fs: refactor surface header setup
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-11 17:17:02 +00:00
Jason Ekstrand a0999bc049 intel/fs: Add DWord scattered read/write opcodes
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-11 17:17:02 +00:00
Jason Ekstrand 83f04d80b0 intel/nir: Use nir_extract_bits in lower_mem_access_bit_sizes
The new helper solves most of the annoying problems with data wrangling
in brw_nir_lower_mem_access_bit_sizes.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-11 17:17:02 +00:00
Jason Ekstrand b8d45d9307 nir: Add tests for nir_extract_bits 2019-11-11 17:17:02 +00:00
Jason Ekstrand d0bbf98c96 nir/builder: Add a nir_extract_bits helper
This new helper is better than nir_bitcast_vector because it's able to
take a (mostly) arbitrary range from the source vector.  The only
requirement is that first_bit has to be aligned to the smaller of the
two bit sizes.  It wouldn't be hard to lift that requirement but it's
reasonable for now.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-11 17:17:02 +00:00
Eric Engestrom 86d3a346f1 egl: fix _EGL_NATIVE_PLATFORM fallback
When the X11 or Haiku platforms were compiled in, they would bypass the
`_EGL_NATIVE_PLATFORM` fallback by always returning themselves instead.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-11 17:14:07 +00:00
Ricardo Garcia 20b403aad0 anv: Unify GetDeviceQueue and GetDeviceQueue2
Avoid duplicating some checks and code by making anv_GetDeviceQueue a
subcase of anv_GetDeviceQueue2, like radv does.

Signed-off-by: Ricardo Garcia <rgarcia@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 16:14:56 +00:00
Alyssa Rosenzweig 5b31182665 panfrost: Select format-specific blending intrinsics
If we have an accelerated path for a particular framebuffer format,
let's use it to save a bunch of instructions in a blend shader.

[Tomeu: Only use the faster intrinsic on >T760]

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-11 15:23:44 +00:00
Alyssa Rosenzweig 3295edaadf pan/midgard: Pack load/store masks
While most load/store operations on 32-bit/vec4 intriniscally, some are
not and have special type-size-dependent semantics for the mask. We need
to convert into this native format.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-11 15:23:44 +00:00
Alyssa Rosenzweig 843874c7c3 pan/midgard: Implement nir_intrinsic_load_output_u8_as_fp16_pan
We can use the native Midgard ops for this, depending what chip we're
on.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-11 15:23:44 +00:00
Alyssa Rosenzweig 5885b64e42 pan/midgard: Identify ld_color_buffer_u8_as_fp16*
There are two versions of this opcode, depending what version of the ISA
you're using. I'm not sure if there's a semantic difference; I think
there might be some slight subtleties but it's too early to know at this
stage.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-11 15:23:44 +00:00
Alyssa Rosenzweig 03f73c7fc6 nir: Add load_output_u8_as_fp16_pan intrinsic
This is a single opcode, at least on newer Midgard chips. It's easier to
have this represented in NIR rather than trying to optimize out the
conversions, so let's add the intrinsic.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-11 15:23:44 +00:00
Tomeu Vizoso ee5321f239 panfrost: Set depth and stencil for SFBD based on the format
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-11 15:23:44 +00:00
Erik Faye-Lund b4d47e21d7 zink: correct depth-stencil format
When using packed vulkan-formats on little-endian systems, we need to
swap the components for the gallium formats. And since Zink isn't
big-endian safe yet, little-endian is the only endianess we care about
right now.

This fixes a bunch of piglit tests, amongs others:
- spec@arb_depth_texture@depth-level-clamp
- spec@arb_depth_texture@depthstencil-render-miplevels * d=z24
- spec@arb_depth_texture@fbo-depth-gl_depth_component24-blit
- spec@arb_depth_texture@fbo-depth-gl_depth_component24-copypixels
- spec@arb_depth_texture@fbo-depth-gl_depth_component24-drawpixels
- spec@arb_depth_texture@fbo-depth-gl_depth_component24-readpixels

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 8d46e35d16 ("zink: introduce opengl over vulkan")
2019-11-11 14:35:53 +00:00
Erik Faye-Lund d7a6cc8f4a zink/spirv: add support for nir_op_flrp
This fixes the following piglit:

spec@ati_fragment_shader@ati_fragment_shader-render-fog

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-11-11 14:25:30 +01:00
Chris Wilson 863872e141 egl: Mention if swrast is being forced
The system can be disabling HW acceleration unbeknown to the user,
leading to a long debug session trying to work out which component is
failing. A quick mention that it is the environment override would be
very useful.

v2: Use more generic "CPU renderer" and so try to avoid jargon.

Reviewed-By: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Martin Peres <martin.peres@linux.intel.com>
2019-11-11 11:52:02 +00:00
Jason Ekstrand 9e440b8d0b spirv: Sort out the mess that is sampled image
This commit makes two major changes.  First, we add a second case to
OpLoad for sampled images which constructs a vtn_sampled_image and
stashes that rather than stashing a pointer to the combined image
sampler like we do for bare samplers and images.  This should be more in
line with how SPIR-V is intended to work and hopefully doesn't cause any
weird problems.  The second is a rework of vtn_handle_texture to assume
that everything has an image but not everything has a sampler.  We also
add a vtn_fail_if for the case where a texture instructions require a
sampler but none is provided.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-09 15:29:01 +00:00
Jason Ekstrand 9cc4c2c916 spirv: Add a vtn_decorate_pointer helper
This helper makes a duplicate copy of the pointer if any new access
flags are set at this stage.  This way we don't end up propagating
access flags further than they actual SPIR-V decorations.  In several
instances where we create new pointers, we still call the decoration
helper directly because no copy is needed.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-09 15:29:01 +00:00
Jason Ekstrand 4f9688e571 spirv: Remove the type from sampled_image
We have types on all vtn_values at this point so there's no reason to
carry the redundant type information.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-09 15:29:01 +00:00
Rob Clark a3dc975ee7 freedreno/ir3: also track # of nops for shader-db
The instruction count is (mostly) a measure of what optimization passes
can do, while # of nops is more an indication of how effectively the
scheduler is balancing register pressure vs instruction count.  So track
these independently.

(There could be opportunities to rematerialize values to reduce register
pressure, swapping some nop's with other alu instructions, so nothing is
truely independent.. but it is still useful to break these stats out.)

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark 5f45818673 freedreno/ir3: sync disasm changes from envytools
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark f3980a8ef7 freedreno/a4xx: fix SP_FS_MRT_REG.HALF_PRECISION
Set flag based on actual output reg type.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark f0f9ec6882 freedreno/a3xx: fix SP_FS_MRT_REG.HALF_PRECISION
We should really be setting this based on the actual output register
type.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark df229977c3 freedreno/ir3: remove obsolete comment
The meta PHI instruction was removed long ago.  And fanin/fanout
themselves to not contribute actual instructions (at least not by the
time you get to sched, they may prevent copy-propagating away a mov)

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark e804b42fd7 freedreno/ir3/ra: remove ir print after livein/out
The IR hasn't changed at this point, so it isn't really adding any
value.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark 8b92052f10 freedreno/ir3/ra: move regs_count==0 check
Fold it in to writes_gpr() (since a register that does not reference any
registers by definition does not write a register).  This lets us avoid
having to handle this case in a few other places.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00