Commit Graph

99287 Commits

Author SHA1 Message Date
Bas Nieuwenhuizen dbf1e918cd radv: emit pa_sc_mode_cntl_0 with multisample state.
We don't have the meta kludge with 0 viewports anymore,
so we can always enable them.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-19 23:35:12 +01:00
Kenneth Graunke c7dcee58b5 i965: Avoid problems from referencing orphaned BOs after growing.
Growing the batch/state buffer is a lot more dangerous than I thought.

A number of places emit multiple state buffer sections, and then write
data to the returned pointer, or save a pointer to brw->batch.state.bo
and then use it in relocations.  If each call can grow, this can result
in stale map references or stale BO pointers.  Furthermore, fences refer
to the old batch BO, and that reference needs to continue working.

To avoid these woes, we avoid ever swapping the brw->batch.*.bo pointer,
instead exchanging the brw_bo structures in place.  That way, stale BO
references are fine - the GEM handle changes, but the brw_bo pointer
doesn't.  We also defer the memcpy until a quiescent point, so callers
can write to the returned pointer - which may be in either BO - and
we'll sort it out and combine the two properly in the end.

v2/v3:
- Handle stale pointers in the shadow copy case, where realloc may or
  may not move our shadow copy to a new address.
- Track the partial map explicitly, to avoid problems with buffer reuse
  where multiple map modes exist (caught by Chris Wilson).

v4:
- Don't use realloc in the CPU shadow case, it isn't safe.

Fixes: 2dfc119f22 "i965: Grow the batch/state buffers if we need space and can't flush."
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v3]
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-01-19 11:30:10 -08:00
Kenneth Graunke 8a5bc304ff i965: Rename 'aux' to 'prog_data' in program cache.
'aux' is a very generic name, suggesting it can be a bunch of things.
However, it's always the brw_*_prog_data structure.  So, call it that.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-01-19 11:29:47 -08:00
Chuck Atkins a4be2bcee2 swr: allow a single swr architecture to be builtin
Part 2 of 2 (part 1 is autoconf changes, part 2 is C++ changes)

When only a single SWR architecture is being used, this allows that
architecture to be builtin rather than as a separate libswrARCH.so that
gets loaded via dlopen.  Since there are now several different code
paths for each detected CPU architecture, the log output is also
adjusted to convey where the backend is getting loaded from.

This allows SWR to be used for static mesa builds which are still
important for large HPC environments where shared libraries can impose
unacceptable application startup times as hundreds of thousands of copies
of the libs are loaded from a shared parallel filesystem.

Based on an initial implementation by Tim Rowley.

v2: Refactor repetitive preprocessor checks to reduce code duplication
v3: Formatting changes per Bruce C. Also delay screen creation until end
    to avoid leaks when failure conditions are hit.

Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
CC: Tim Rowley <timothy.o.rowley@intel.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 13:16:00 -06:00
Chuck Atkins 2ed8b6f827 swr: (autoconf) allow a single swr architecture to be builtin
Part 1 of 2 (part 1 is autoconf changes, part 2 is C++ changes)

When only a single SWR architecture is being used, this allows that
architecture to be builtin rather than as a separate libswrARCH.so that
gets loaded via dlopen.  Since there are now several different code
paths for each detected CPU architecture, the log output is also
adjusted to convey where the backend is getting loaded from.

This allows SWR to be used for static mesa builds which are still
important for large HPC environments where shared libraries can impose
unacceptable application startup times as hundreds of thousands of copies
of the libs are loaded from a shared parallel filesystem.

Based on an initial implementation by Tim Rowley.

v2: Fix comment placement pointed out by Bruce C.

Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
CC: Tim Rowley <timothy.o.rowley@intel.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 13:15:54 -06:00
Greg V 8ff8c82630 swr: fix clang 5 null cast warning
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-19 16:15:56 +00:00
Gert Wollny ea89843b3d mesa/program: Fix -Wunused-param warning
v2: Don't annotate, but remove the unused ctx parameter

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-19 15:45:57 +00:00
Gert Wollny 81d8a0f4a4 mesa/program/prog_execute.c: Silence -Wunused-param
v2: Don't annotate, but remove the unused ctx parameter

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-19 15:45:57 +00:00
Gert Wollny fef4b16523 mesa: Make numSamples an unsigned int
As a followup to the previous patch propagate the change of numSamples
from int to unsigned to gl_config::samples and consequently fix some
-Wsign-compare warnings.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-19 15:45:57 +00:00
Gert Wollny d0e37599ab gallium: Make (num_)samples an unsigned int
According to the ARB_multisample num_samples is a non-negative integer.
Consequently define it as such, fail in glx/choose_visual if a negative
number is given.

v2: split patch into gallium and mesa part

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-19 15:45:57 +00:00
Andres Gomez 7a2c87177a docs: correct a typo in releasing instructions
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-19 15:25:53 +02:00
Andres Gomez 7760566ab7 docs: move untar line in basic testing instructions for coherence
For scons, windows/mingw dealing with LLVM_CONFIG is done before
untarring. This is also more convenient for copy and paste.

Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-19 15:25:39 +02:00
Andres Gomez bd8537fa71 docs: add a notice whenever a release is the final in a series
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-19 15:25:27 +02:00
Andres Gomez b910eec489 docs: add final release note for 17.2.8
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-19 15:25:20 +02:00
Andres Gomez 4b50cfef44 docs: add final release note for 17.1.10
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-19 15:20:57 +02:00
Grazvydas Ignotas e6abc613e2 st/vdpau: release held lock in error path
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2018-01-19 13:30:22 +02:00
Juan A. Suarez Romero 302ff82434 docs: update calendar, add news and link release notes to 17.3.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-01-19 10:46:18 +01:00
Juan A. Suarez Romero 059db12097 docs: add sha256 checksums for 17.3.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit bc1503b13fcf8190262757ea7f86613e14e25981)
2018-01-19 10:46:18 +01:00
Juan A. Suarez Romero 3205a45fc3 docs: add release notes for 17.3.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 80f5f279b3f9fc752ba35b1cb2878a936f8ace90)
2018-01-19 10:46:18 +01:00
Samuel Iglesias Gonsálvez 7109a1fe13 anv: avoid segmentation fault due to vk_error()
vk_error() is a macro that calls __vk_errorf() with instance == NULL.

Then, __vk_errorf() passes a pointer to instance->debug_report_callbacks
to vk_debug_error(), which segfaults as this pointer is invalid but not
NULL.

Fixes: e5b1bd6ab8 "vulkan: move anv VK_EXT_debug_report implementation to common code."

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-19 09:39:05 +01:00
Bas Nieuwenhuizen 32170d87e3 ac/nir: Fix vector extraction if source vector has >4 elements.
v2: Add forgotten argument and start offset.

Fixes: 91074bb11b "radv/ac: Implement Float64 SSBO stores."
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-01-19 02:00:28 +01:00
Bas Nieuwenhuizen f4211e6f93 ac/nir: Use correct 32-bit component writemask for 64-bit SSBO stores.
Fixes: 91074bb11b "radv/ac: Implement Float64 SSBO stores."
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-01-19 02:00:14 +01:00
Bas Nieuwenhuizen 4a9fd90e1e ac/nir: Fix TCS output LDS offsets.
When a channel was not set we also did not increase the LDS address,
while that obviously should happen.

The output loading code was inadvertently fixed which resulted in a
mismatch causing the SaschaWillems tessellation demo to result
in corrupt rendering.

Fixes: 7898eb9a60 "ac: rework load_tcs_{inputs,outputs}"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-19 01:54:59 +01:00
Bas Nieuwenhuizen bd5c942cef radv: Use correct bindings for inputRate in key generation.
The bindings also have an index field.

Fixes: 49d035122e "radv: Add single pipeline cache key."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104677
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-19 01:54:59 +01:00
Bas Nieuwenhuizen b1444c9ccb radv: Implement VK_ANDROID_native_buffer.
Passes
  dEQP-VK.api.smoke.*
  dEQP-VK.wsi.android.*

with android-cts-7.1_r12 .

Unlike the initial anv implementation this does
use syncobjs instead of waiting on the CPU.

This is missing meson build coverage for now.

One possible todo is that linux 4.15 now has a
sycall that allows us to export amdgpu fence to
a sync_file, which allows us not to force all
fences and semaphores to use syncobjs. However,
I had trouble with my kernel crashing regularly
with NULL pointers, and I'm not sure how beneficial
it is in the first place given that intel uses
syncobjs for all fences if available.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-19 01:43:55 +01:00
Bas Nieuwenhuizen a3e241ed07 radv: Add create image flag to not use DCC/CMASK.
If we import an image, we might not have space in the
buffer for CMASK, even though it is compatible.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-19 01:43:55 +01:00
Bas Nieuwenhuizen e344cd8178 radv: Generate VK_ANDROID_native_buffer.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-19 01:43:55 +01:00
Bas Nieuwenhuizen 0f89f9b8eb radv: Replace an assert with unreachable.
Otherwise we get uninitialized variable warnings for es_vgpr_comp_cnt.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-19 00:38:45 +01:00
Bas Nieuwenhuizen e417ab212b radv: Remove DCC check on CS resolve dst image.
Gives a warning when the assert is disabled, and not even
necessarily true.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-19 00:38:45 +01:00
George Kyriazis f76ca91ae0 gallivm: support avx512 (16x32) in interleave2_half
lp_build_interleave2_half was not doing the right thing for avx512-style
16-wide loads.

This path is hit in the swr driver with a 16-wide vertex shader. It is
called from lp_build_transpose_aos, when doing texel fetches and the
fetched data needs to be transposed to one component per output register.

Special-case the post-load swizzle operations for avx512 16x32 (16-wide
32-bit values) so that we move the xyzw components correctly to the outputs.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-18 17:07:06 -06:00
Brian Paul 9e6efdd177 vbo: fix VBO optimization regression
The optimization in change 8e4efdc895 ("vbo: optimize some display
list drawing") missed the loopback case.  This is used when the
glBegin/End primitive doesn't have a uniform set of vertex attributes.
The new Piglit gl-1.0-dlist-materials test hits this.

So check the aligned_vertex_buffer_offset(list) value and adjust the
buffer offset accordingly.

We also need to remove the 'start == 0' assertion in the loopback
code since it no longer applies.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-18 15:07:17 -07:00
Dylan Baker 26bde1e354 meson: ensure that xmlpool_options.h is generated for targets that need it
Currently a couple of gallium targets race with xmlpool_options.h being
generated, don't do that.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-01-18 13:31:47 -08:00
Timothy Arceri 3bccb5dba9 ac: fix visit_ssa_undef() for doubles
V2: use LLVMIntTypeInContext()

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-19 08:09:04 +11:00
Dave Airlie 3153d74207 ac/nir: account for view index in the user sgpr allocation.
The view index user sgpr wasn't being accounted for properly,
this refactors out the code to decide if it's required and then
uses that info to account for it.

Fixes: 180c1b924e (ac/nir: Add shader support for multiviews.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 19:47:40 +00:00
Dave Airlie 5758a8c402 r600: enable ARB_enhanced_layouts
Only one piglit test fails,
sso-vs-gs-fs-array-interleave

There are 3 tests using ssbo without checking sizes failing also
but those are test bugs.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-19 05:33:44 +10:00
Chris Wilson 34499e8ddc intel: Future-proof ring names for aubinator_error_decode
The kernel is moving to a $class$instance naming scheme in preparation
for accommodating more rings in the future in a consistent manner. It is
already using the naming scheme internally, and now we are looking at
updating some soft-ABI such as the error state to use the new naming
scheme. This of course means we need to teach aubinator_error_decode how
to map both sets of ring names onto its register maps.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
2018-01-18 17:35:21 +00:00
Kenneth Graunke 3e18c53e59 i965: Bind null render targets for shadow sampling + color.
Portal 2 appears to bind RGBA8888_UNORM textures to a sampler2DShadow,
and calls shadow2D() on it.  This causes undefined behavior in OpenGL.

Unfortunately, our sampler appears to hang in this scenario, which is
not acceptable.  Just give them a null surface instead, which returns
all zeroes.

Fixes GPU hangs in Portal 2 on Kabylake.

Huge thanks to Jason Ekstrand for noticing this crazy behavior while
sifting through crash dumps.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104487
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-18 09:32:28 -08:00
Iago Toral Quiroga 7ec6e4e689 anv/query: implement multiview interactions
From the Vulkan spec with KHX extensions:

  "If queries are used while executing a render pass instance that has
   multiview enabled, the query uses N consecutive query indices
   in the query pool (starting at query) where N is the number of bits
   set in the view mask in the subpass the query is used in.

   How the numerical results of the query are distributed among the
   queries is implementation-dependent. For example, some implementations
   may write each view's results to a distinct query, while other
   implementations may write the total result to the first query and write
   zero to the other queries. However, the sum of the results in all the
   queries must accurately reflect the total result of the query summed
   over all views. Applications can sum the results from all the queries to
   compute the total result."

In our case we only really emit a single query (in the first query index)
that stores the aggregated result for all views, but we still need to manage
availability for all the other query indices involved, even if we don't
actually use them.

This is relevant when clients call vkGetQueryPoolResults and pass all N
queries to retrieve the results. In that scenario, without this patch,
we will never see queries other than the first being available since we
never emit them.

v2: we need the same treatment for timestamp queries.

v3 (Jason):
 - Better an if instead of an early return.
 - We can't write to this memory in the CPU, we should use
   MI_STORE_DATA_IMM and emit_query_availability (Jason).

v4 (Jason):
 - No need to take the value to write as parameter, just hard code it to 0.

Fixes test failures in some work-in-progress CTS multiview+query tests.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-18 16:37:06 +01:00
Emil Velikov c9b2cb7897 vc5: add missing files to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-18 11:36:36 +00:00
Emil Velikov 393cf04fa4 broadcom: add missing headers to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-18 11:21:35 +00:00
Mario Kleiner d67ef48580 i965/screen: Allow drirc to set 'allow_rgb10_configs' again.
Since setup of ALLOW_RGB10_CONFIGS was moved to i965's own
brw_config_options.xml, this was hard-coded to false and
could not be overriden by drirc. Add some parsing into
i965's private screen->optionCache to enable drirc again.

Fixes: b391fb26df ("dri_util: remove ALLOW_RGB10_CONFIGS option (v2)")
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-18 08:18:47 +02:00
Samuel Iglesias Gonsálvez eac629deb6 anv: return VK_ERROR_OUT_OF_DEVICE_MEMORY when surface size is out of HW limits
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-18 06:48:47 +01:00
Timothy Arceri 9248f72c4e ac: tidy up array indexing logic
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-18 15:59:27 +11:00
Rob Clark 4c69961daf mesa/st: translate SO info in glsl_to_nir() case
This was handled for VS, but not for GS.

Fixes for gallium drivers using nir:
spec@arb_gpu_shader5@arb_gpu_shader5-xfb-streams-without-invocations
spec@arb_gpu_shader5@arb_gpu_shader5-xfb-streams*
spec@arb_transform_feedback3@arb_transform_feedback3-ext_interleaved_two_bufs_gs*
spec@ext_transform_feedback@geometry-shaders-basic
spec@ext_transform_feedback@* use_gs
spec@glsl-1.50@execution@geometry@primitive-id*
spec@glsl-1.50@execution@geometry@tri-strip-ordering-with-prim-restart gl_triangle_strip *
spec@glsl-1.50@transform-feedback-builtins
spec@glsl-1.50@transform-feedback-type-and-size

v2: don't call st_translate_program_stream_output) for TCS

v3: drop scanning patch outputs as TCS can't output xfb

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-01-18 15:35:58 +11:00
Dave Airlie 44a27cdcec r600/sb: add lds related peepholes.
if no destination:
a) convert _RET instructions to non _RET variants if no dst
b) set src0 to undefined if it's a READ, this should get DCE then.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:38:17 +00:00
Dave Airlie 3bb2b2cc45 r600/sb: use different stacks for tracking lds and queue usage.
The normal ssa renumbering isn't sufficient for LDS queue access,
this uses two stacks, one for the lds queue, and one for the
lds r/w ordering.

The LDS oq values are incremented in their use in a linear
fashion.
The LDS rw values are incremented in their definitions and used
in the next lds operation to ensure reordering doesn't occur.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:38:09 +00:00
Dave Airlie 8cfec333c0 r600/sb: schedule LDS ops in appropriate places.
So LDS ops have to be SLOT_X,
and LDS OQ reads have read port restrictions so we try
and force those into only having one per slot and avoiding
bank swizzles.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:38:05 +00:00
Dave Airlie 71a50de4fc r600/sb: hit the scheduler with a big hammer to avoid lds splits.
This tries to avoid an lds queue read getting scheduled separately
from an lds ret read, the non-sb code uses the same style of hammer,
this isn't foolproof.

We can do better, but it's a bit tricky, as you have to scan ahead
and either schedule more lds oq moves and more lds reads and that
could lead to you running out of space anyways.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:37:56 +00:00
Dave Airlie 46549bd6b6 r600/sb: adding lds oq tracking to the scheduler
This adds support for tracking the lds oq read/writes
so can avoid scheduling other things in between.

This patch just adds the tracking and assert to show
problems.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:37:52 +00:00
Dave Airlie 5002dd4052 r600/sb: add gcm support to avoid clause between lds read/queue read
You have to schedule LDS_READ_RET _, x and MOV reg, LDS_OQ_A_POP
in the same basic block/clause. This makes sure once we've issues
and MOV we don't add another block until we balance it with an
LDS read.

Acked-By: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:37:42 +00:00