116006 Commits

Author SHA1 Message Date
Rhys Perry 77ebb030ed aco: fix load_constant with multiple arrays
I thought I fixed this, but I guess I must have broken it again.

Fixes various dEQP-VK.draw.* tests

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-04 22:43:11 +01:00
Eric Anholt ce76be9933 nir: Fix some wonky whitespace in nir_search.h.
Reviewed-by: Ian Romanick <ian.d.romainck@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-04 19:15:01 +00:00
Eric Anholt 3cc914921e nir: Factor out most of the algebraic passes C code to .c/.h.
Working on the algebraic implementation, I was being driven nuts by my
editor not highlighting and handling indentation for the C code.  It turns
out that it's basically not pass-specific code, and we can move it over to
the relevant .c file.  Replaces 30KB of code with 34KB of data on my i965
build.  No perf diff on shader-db (n=3)

Reviewed-by: Ian Romanick <ian.d.romainck@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-04 19:15:01 +00:00
Eric Anholt c23db0df18 nir: Keep the range analysis HT around intra-pass until we make a change.
This lets us memoize range analysis work across instructions.  Reduces
runtime of shader-db on Intel by -30.0288% +/- 2.1693% (n=3).

Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-04 19:15:01 +00:00
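
A minimal sketch of the memoization idea in the commit above, not the actual NIR code: analysis results are kept in a cache across queries and dropped only when the pass changes something. The names `range_cache`, `cached_range`, `invalidate_ranges` and the toy `analyze_value` are hypothetical, and a fixed array stands in for the hash table.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_VALUES 64

enum range { RANGE_UNKNOWN = 0, RANGE_GE_ZERO, RANGE_LT_ZERO };

/* Hypothetical cache: one slot per SSA value index. */
struct range_cache {
   bool valid[MAX_VALUES];
   enum range result[MAX_VALUES];
   unsigned misses; /* counts how often the real analysis had to run */
};

/* Toy stand-in for the expensive analysis (assumption, not real logic). */
static enum range
analyze_value(unsigned value)
{
   return (value % 2 == 0) ? RANGE_GE_ZERO : RANGE_LT_ZERO;
}

/* Return the cached result if present, otherwise compute and store it. */
static enum range
cached_range(struct range_cache *cache, unsigned value)
{
   assert(value < MAX_VALUES);
   if (!cache->valid[value]) {
      cache->result[value] = analyze_value(value);
      cache->valid[value] = true;
      cache->misses++;
   }
   return cache->result[value];
}

/* Called once the pass has modified the shader: old results may be stale. */
static void
invalidate_ranges(struct range_cache *cache)
{
   memset(cache->valid, 0, sizeof(cache->valid));
}
```

Repeated queries for the same value between two changes hit the cache; only a call to `invalidate_ranges()` forces the analysis to run again.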
Eric Anholt 7025dbe794 nir: Skip emitting no-op movs from the builder.
Having passes generate these is just making more work for copy
propagation (and thus probably calling more optimization passes)
later.  Noticed while trying to debug nir_opt_algebraic()
top-to-bottom having O(n^2) behavior due to not finding new matches in
replacement code.

Reviewed-by: Ian Romanick <ian.d.romainck@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-04 19:15:01 +00:00
Eric Anholt e7b754a05c nir: Make nir_search's dumping go to stderr.
Reviewed-by: Ian Romanick <ian.d.romainck@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-04 19:15:01 +00:00
Adam Jackson 3746ee912f surfaceless: Support EGL_WL_bind_wayland_display
Feature parity with the drm, x11, and wayland platforms.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1870
Tested-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2019-10-04 15:49:10 +00:00
Rhys Perry 1264acdf4b nir/print: always use the right FILE *
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-04 15:24:10 +00:00
Erik Faye-Lund 49b32233a0 nir: initialize needs_helper_invocations as well
Similar to the previous commit, we should also initialize
needs_helper_invocations here.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-04 14:55:40 +00:00
Erik Faye-Lund 1d6d2ca9f1 nir: initialize uses_discard to false
This matches what we do for uses_sample_qualifier, and what we
do in ir_set_program_inouts.cpp as well.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-04 14:55:40 +00:00
Rhys Perry a87b0f5141 radv/aco,aco: set lower_fmod
This simplifies ACO and allows the lowered code to be optimized (in
particular, constant folded).

Totals from affected shaders:
SGPRS: 1776 -> 1776 (0.00 %)
VGPRS: 1436 -> 1436 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 203452 -> 203564 (0.06 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 103 -> 103 (0.00 %)

At least some of the code size increase seems to be from literals being
applied to instructions as a result of constant folding.

v2: remove fmod/frem handling in init_context()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-04 14:00:46 +00:00
Prodea Alexandru-Liviu 0fe2e04f2d scons/windows: Fix build with LLVM>=8
Fixes eebe091d29
("scons/windows: Enable compute shaders when possible.")
Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-10-04 13:48:08 +00:00
Michel Dänzer b012f06d66 dri3: Pass __DRI2_THROTTLE_COPYSUBBUFFER from loader_dri3_copy_drawable
0 is __DRI2_THROTTLE_SWAPBUFFER, which doesn't really make sense here.

Avoids dri_flush() throttling twice for the same glFlush call with front
buffer rendering, as described in
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2057 .

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-04 10:55:43 +02:00
Gert Wollny 7cbb44aa6a r600: Fix interpolateAtCentroid
If the instruction interpolateAtCentroid is used the extra interpolator
must also be enabled in the state.

Fixes: fs-interpolateatcentroid-block

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-10-04 10:09:01 +02:00
Dylan Baker 1481d05409 meson: Only error building gallium video without libdrm when the platform is drm
Fixes: 3b265f61f5
       ("meson: gallium media state trackers require libdrm with x11")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1878
Tested-by: Vinson Lee <vlee@freedesktop.org>
2019-10-03 22:14:20 -07:00
Alyssa Rosenzweig dcd2f26b98 pan/midgard: Replace mir_is_live_after with new pass
Now that we have live_out calculated per block as metadata, calculating
liveness of an instruction at a given point in the program becomes O(n)
in the size of the block worst-case, rather than O(n) in the size of the
program.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:51 -04:00
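
A rough sketch of why per-block live_out makes the query linear in the block size: start from the block's live_out set and walk backwards to the instruction of interest, killing destinations and adding sources. The types `lv_ins` and `lv_block` and the function `is_live_after` are hypothetical and much simpler than the Midgard IR; temporaries are assumed to be numbered below 32 so a set fits in one bitmask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical instruction: one destination and two sources, -1 if unused. */
struct lv_ins {
   int dest;
   int src[2];
};

struct lv_block {
   const struct lv_ins *instructions;
   unsigned count;
   uint32_t live_out; /* bit i set: temporary i is live on block exit */
};

/* Is `temp` live immediately after instruction number `at` in the block? */
static bool
is_live_after(const struct lv_block *blk, unsigned at, unsigned temp)
{
   uint32_t live = blk->live_out;

   /* Walk backwards over the instructions that follow `at`. */
   for (unsigned i = blk->count; i-- > at + 1; ) {
      const struct lv_ins *ins = &blk->instructions[i];

      /* A write ends the live range of the old value... */
      if (ins->dest >= 0)
         live &= ~(1u << ins->dest);

      /* ...and every read makes its source live before the instruction. */
      for (unsigned s = 0; s < 2; ++s) {
         if (ins->src[s] >= 0)
            live |= 1u << ins->src[s];
      }
   }

   return (live & (1u << temp)) != 0;
}
```

The loop only ever touches instructions inside one block, so the cost is bounded by the block length rather than the program length.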
Alyssa Rosenzweig 39a4b3ebe9 pan/midgard: Calculate temp_count for liveness
This needs to be correct or the analysis fails.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:51 -04:00
Alyssa Rosenzweig ad5fcac005 pan/midgard: Invalidate liveness for mir_is_live_after
Callers should have liveness info ready. Ideally we'd have a nice
metadata tracking framework like NIR to handle this automatically, but
for now this will allow us to make forward progress... when we're about
to do something with liveness, invalidate everything ahead to force a
clean calculation.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:51 -04:00
Alyssa Rosenzweig 3450c013c5 pan/midgard: Begin tracking liveness metadata
This will allow us to explicitly invalidate liveness analysis results so
we can cache liveness results.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:51 -04:00
Alyssa Rosenzweig 846e5d5ba8 pan/midgard: Don't try to OR live_in of successors
By definition, once liveness analysis has occurred:

   live_out = OR {succ} succ->live_in

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:50 -04:00
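
The identity quoted in the commit above can be written out as a small sketch. The `cfg_block` type and `block_live_out` helper are hypothetical; liveness sets are modelled as 32-bit masks with one bit per temporary.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SUCCESSORS 2

/* Hypothetical basic block with bitmask liveness sets. */
struct cfg_block {
   uint32_t live_in;
   uint32_t live_out;
   struct cfg_block *successors[MAX_SUCCESSORS];
   unsigned nr_successors;
};

/* live_out = OR over all successors of succ->live_in */
static uint32_t
block_live_out(const struct cfg_block *blk)
{
   uint32_t live = 0;

   for (unsigned i = 0; i < blk->nr_successors; ++i)
      live |= blk->successors[i]->live_in;

   return live;
}
```

Once the analysis has converged, a block's stored live_out already equals this value, so a consumer can read the stored field instead of recomputing the OR over its successors.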
Alyssa Rosenzweig 013cd6bed2 pan/midgard: Move RA's liveness analysis into midgard_liveness.c
There are unfortunately two distinct liveness analysis passes in the
compiler right now -- one good (but complex) pass used by RA based on
solving data flow equations, and one awful (but simple) pass used for
dead code elimination and bundling based on an abstract walk of the AST.

Let's move RA's pass into shared code so we can work on unifying.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:50 -04:00
Alyssa Rosenzweig 76a76de7af pan/midgard: Add mir_calculate_temp_count helper
This allows us to fill in ctx->temp_count explicitly, even if we haven't
squished down the MIR.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:50 -04:00
Alyssa Rosenzweig c59fae0fef pan/midgard: Remove mir_has_multiple_writes
We already enforce this with the SSA/register distinction in the
backend. There is no need to duplicate this logic merely for an assert.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:50 -04:00
Erik Faye-Lund 3f4be0d199 .mailmap: add a couple of aliases for Jakob Bornecrantz
Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
2019-10-03 17:11:20 -04:00
Erik Faye-Lund 2eb916a58d .mailmap: add an alias for Tomeu Vizoso
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-10-03 17:11:10 -04:00
Erik Faye-Lund 27ae5c81f7 .mailmap: add an alias for Gert Wollny
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-10-03 17:10:59 -04:00
Erik Faye-Lund 28b64049d0 .mailmap: add an alias for Alexandros Frantzis
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-10-03 17:10:28 -04:00
Erik Faye-Lund b7baf70778 .mailmap: specify spelling for Elie Tournier
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-10-03 17:09:42 -04:00
Boris Brezillon 1ac33aae49 panfrost: Get rid of the flush in panfrost_set_framebuffer_state()
Now that we track inter-batch dependencies, the flush done in
panfrost_set_framebuffer_state() is no longer needed. Let's get rid of
it.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon 70cf93c4d7 panfrost: Kill the explicit serialization in panfrost_batch_submit()
Now that we have all the pieces in place to support pipelining batches
we can get rid of the drmSyncobjWait() at the end of
panfrost_batch_submit().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon 0a12a16bae panfrost: Do fine-grained flushing when preparing BO for CPU accesses
We don't have to flush all batches when we're only interested in
reading/writing a specific BO. Thanks to the
panfrost_flush_batches_accessing_bo() and panfrost_bo_wait() helpers
we can now flush only the batches touching the BO we want to access
from the CPU.

This fixes the dEQP-GLES2.functional.fbo.render.texsubimage.* tests.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon 2225383af8 panfrost: Make sure the BO is 'ready' when picked from the cache
This is needed if we want to free the panfrost_batch object at submit
time in order to not have to GC the batch on the next job submission.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon 22190bc27b panfrost: Add flags to reflect the BO imported/exported state
Will be useful to make the ioctl(WAIT_BO) call conditional on BOs that
are not exported/imported (meaning that all GPU accesses are known
by the context).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon 82399b58d3 panfrost: Add a panfrost_flush_batches_accessing_bo() helper
This will allow us to only flush batches touching a specific resource,
which is particularly useful when the CPU needs to access a BO.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon a45984b244 panfrost: Add a panfrost_flush_all_batches() helper
And use it in panfrost_flush() to flush all batches, and not only the
one currently bound to the context.

We also replace all internal calls to panfrost_flush() by
panfrost_flush_all_batches() ones.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon b5d8f9bbbf panfrost: Prepare panfrost_fence for batch pipelining
The panfrost_fence logic currently waits on the last submitted batch,
but the batch serialization that was enforced in
panfrost_batch_submit() is about to go away, allowing for several
batches to be pipelined, and the last submitted one is not necessarily
the one that will finish last.

We need to make sure the fence logic waits on all flushed batches, not
only the last one.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon 2dad9fde50 panfrost: Start tracking inter-batch dependencies
The idea is to track which BO are being accessed and the type of access
to determine when a dependency exists. Thanks to that we can build a
dependency graph that will allow us to flush batches in the correct
order.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
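
A simplified sketch of the dependency rule described in the commit above, assuming a batch records each BO it touches together with read/write flags. The names `sketch_batch`, `bo_access` and `batch_depends_on` are hypothetical and do not mirror the panfrost data structures; a small array stands in for whatever container the driver uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum bo_access_flags {
   ACCESS_READ  = 1 << 0,
   ACCESS_WRITE = 1 << 1,
};

/* One BO touched by a batch, with the kind of access performed. */
struct bo_access {
   const void *bo;
   unsigned flags;
};

#define MAX_BOS 8

struct sketch_batch {
   struct bo_access accesses[MAX_BOS];
   unsigned nr_accesses;
};

/* A later batch depends on an earlier one if both touch the same BO and at
 * least one of the two accesses is a write. Two reads never conflict. */
static bool
batch_depends_on(const struct sketch_batch *later,
                 const struct sketch_batch *earlier)
{
   for (unsigned i = 0; i < later->nr_accesses; ++i) {
      for (unsigned j = 0; j < earlier->nr_accesses; ++j) {
         const struct bo_access *a = &later->accesses[i];
         const struct bo_access *b = &earlier->accesses[j];

         if (a->bo == b->bo && ((a->flags | b->flags) & ACCESS_WRITE))
            return true;
      }
   }

   return false;
}
```

Edges found this way form the dependency graph; flushing a batch then means flushing the batches it depends on first.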
Boris Brezillon 40a07bfbd7 panfrost: Add a panfrost_freeze_batch() helper
We'll soon need to freeze a batch not only when it's flushed, but also
when another batch depends on us, so let's add a helper to avoid
duplicating the logic.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon 819738e4af panfrost: Use the per-batch fences to wait on the last submitted batch
We just replace the per-context out_sync object by a pointer to the
fence of the last submitted batch. Pipelining of batches will
come later.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon 6936b7f319 panfrost: Add a batch fence
So we can implement fine-grained dependency tracking between batches.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon a8bd265cef panfrost: Make panfrost_batch->bos a hash table
So we can store the flags as data and keep the BO as a key. This way
we keep track of the type of access done on BOs.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon ada752afe4 panfrost: Extend the panfrost_batch_add_bo() API to pass access flags
The type of access being done on a BO has impacts on job scheduling
(shared resources being written enforce serialization while those
being read only allow for job parallelization) and BO lifetime (the
fragment job might last longer than the vertex/tiler ones, if we can,
it's good to release BOs earlier so that others can re-use them
through the BO re-use cache).

Let's pass extra access flags to panfrost_batch_add_bo() and
panfrost_batch_create_bo() so the batch submission logic can take the
appropriate when submitting batches. Note that this information is not
used yet, we're just patching callers to pass the correct flags here.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon 12f790f7da panfrost: Add the shader BO to the batch in patch_shader_state()
We know a shader will be used by a batch when
panfrost_patch_shader_state() is called, so let's add the shader BO at
that time.

Suggested-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Andres Gomez 02c265be9d egl: Remove the 565 pbuffer-only EGL config under X11.
The CTS finally has agreed to drop the requirement for a
565-no-depth-no-stencil config for ES 3.0. Hence we can now remove the
code to satisfy this requirement using a pbuffer-only visual with
whatever other buffers the driver happens to have given us.

This reverts commit 82607f8a90,
commit 6ad31c4ff3 and
commit dacb11a585.

v2:
  - Reference the VK-GL-CTS issue (Eric E.).

v3:
  - Don't revert
    fc21394bc4 ("egl: Quiet warning about front buffer rendering for pixmaps/pbuffers")
    (Kenneth).

References: VK-GL-CTS issue 1601.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-03 23:51:46 +03:00
Dylan Baker 974e3ad004 bin: delete unused releasing scripts
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-03 20:15:19 +00:00
Dylan Baker 3226b12a09 release: Add an update_release_calendar.py script
This script is for updating post version bump.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-03 20:15:19 +00:00
Dylan Baker 86079447da scripts: Add a gen_release_notes.py script
This script is responsible for generating an entire page in the
docs/relnotes/ directory. It includes a template for the page, and uses
mako to fill in the necessary bits. It is designed to be purely fire and
forget, calculating previous versions, shortlogs, bug fixes, and dates.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-03 20:15:19 +00:00
Dylan Baker 7ff49c25ed docs: add a new_features.text file and remove 19.3.0 release notes
The next patch is going to introduce a tool that creates the entire
release html page for us, without any user intervention. As such we
can't be editing it. To that end the script will read the
new_features.txt file to get a list of new features.

This is a flat text file, one entry per line.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-03 20:15:19 +00:00
Rafael Antognolli cdc331c6f9 anv/block_pool: Align anv_block_pool state to 64 bits.
On 64-bit platforms, some atomic operations like __sync_fetch_and_add()
have constant time, but on 32-bit platforms they are implemented with a
loop and might take much longer.

Additionally, it seems like if their operands are not aligned to 64
bits, they also require extra memory accesses. From the Intel
Architecture's Developer Manual Vol. 1, 4.1.1:

 "A word or doubleword operand that crosses a 4-byte boundary or a
 quadword operand that crosses an 8-byte boundary is considered
 unaligned and requires two separate memory bus cycles for access."

Forcing the u64 field to be aligned to 64 bits seems to make the unit
tests that are stressing this finish much faster.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-03 12:40:33 -07:00
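
A minimal sketch of the alignment fix described in the commit above, assuming a state made of two 32-bit halves that is also updated as a single 64-bit value. The `pool_state` union and `pool_state_advance` helper are hypothetical stand-ins, not the anv definitions, and C11 `alignas` is used here in place of whatever attribute the driver actually uses.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pool state: two 32-bit fields overlaid with one 64-bit word. */
union pool_state {
   struct {
      uint32_t next;
      uint32_t end;
   };
   /* Force 8-byte alignment so the 64-bit atomic never crosses an 8-byte
    * boundary, even on 32-bit ABIs where uint64_t inside a struct may only
    * be 4-byte aligned. */
   alignas(8) uint64_t u64;
};

/* Atomically add `delta` to the packed state and return the old value. */
static inline uint64_t
pool_state_advance(union pool_state *state, uint64_t delta)
{
   return __sync_fetch_and_add(&state->u64, delta);
}
```

With the explicit alignment, any structure embedding `union pool_state` places it on an 8-byte boundary regardless of the target ABI.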
Erik Faye-Lund 0103d4747a loader/dri3: do not blit outside old/new buffers
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-03 18:58:34 +00:00