Commit Graph

26205 Commits

Author SHA1 Message Date
Erico Nunes 7a51abab42 lima: actually wait for bo in lima_bo_wait
PIPE_TIMEOUT_INFINITE is unsigned and gets assigned to signed fields
where it ends up as -1. When this reaches the kernel as a timeout it
gets translated as no timeout, which cause the waiting functions to
return immediately and not actually wait for a completion.

This seems to cause unstable results with lima where even piglit tests
randomly fail.

Handle this by setting the signed max value in case of infinite timeout.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-08-16 16:31:29 +02:00
Vasily Khoruzhick 861c2b8d31 lima: fix compilation of standalone compiler
Fixes: e0aeee946004("lima: add summary report for shader-db")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-15 16:59:51 -07:00
Alyssa Rosenzweig 6fe4822cca panfrost: Add R10G10B10A2_SSCALED vertex format
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig 375d4c2c74 panfrost: Extend blending to MRT
Our hardware supports independent (per-RT) blending, but we need to
route those settings through from Gallium.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig 6ed3843224 pan/mfbd: Stuff in RT count
Fixes DATA_INVALID_FAULTs with multiple render targets.

We do always allocate space for 4 cbufs just to keep things sane. This
may not be strictly necessary.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig b78e04c17b panfrost: Note "MFBD preload disable" bit
It's a chicken bit, as far as I can tell. Buck buck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:39:57 -07:00
Alyssa Rosenzweig de2efd5ea7 panfrost: Ensure we upload at least 1 blend RT
Otherwise we'll get memory junk.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig 54438267c3 panfrost: Zero tripipe on initialize
I don't think the hardware cares, but this adds a lot of noise to traces
that we would rather not need to look at.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig 3e6f2e7aba panfrost: Remove panfrost_add_dependency asserts
It doesn't... make a ton of sense to need to assert and this routine is
hotter than you might expect. Doesn't matter for release builds, of
course.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 14:58:34 -07:00
Marek Olšák aafc95ceb6 radeonsi: add support for Renoir
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-14 17:31:04 -04:00
Christian Gmeiner 17200bb67a etnaviv: fix weird indentation
Fixes: 797a2e4fd0 ("etnaviv: update logic to determine uniform limits")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 21:29:48 +02:00
Christian Gmeiner 1290cc3e27 etnaviv: split destroy_shader
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 15:10:07 +02:00
Christian Gmeiner f90b23b8c4 etnaviv: split link_shader
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 15:10:07 +02:00
Christian Gmeiner 0765a1dd0e etnaviv: split dump_shader
Also this adds the missing impl for etna_dump_shader_nir(..).

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 15:10:07 +02:00
Christian Gmeiner a36d04daa1 etnaviv: mv etnaviv_compiler.c etnaviv_compiler_tgsi.c
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 15:10:07 +02:00
Christian Gmeiner b2da8a8357 etnaviv: correct PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE handling
Have a correct answer to GL_MAX_FRAGMENT_UNIFORM_VECTORS and
GL_MAX_VERTEX_UNIFORM_VECTORS.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach l.stach@pengutronix.de
2019-08-14 12:29:56 +02:00
Christian Gmeiner 797a2e4fd0 etnaviv: update logic to determine uniform limits
Taken 1:1 from the header file.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach l.stach@pengutronix.de
2019-08-14 12:29:56 +02:00
Christian Gmeiner 45cb5eee5d etnaviv: put uniform limit determination into own function
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach l.stach@pengutronix.de
2019-08-14 12:29:56 +02:00
Marek Vasut 8f97262cdd etnaviv: Use reentrant screen lock around flush
The flush callback may be called on the same pipe context, and thus
the same stream, from two different threads of execution. However,
etna_cmd_stream_flush{,2}() must not be called on the same stream
from two different threads of execution as that would mess up the
etna_bo refcounting and likely have other ugly side effects.

Fix this by using a reentrant screen lock around the flush callback.

Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-08-14 10:36:36 +02:00
Gert Wollny 742d3c918f softpipe: Add support for ARB_derivative_control
Enables and passes piglits:

spec/ARB_drivative_control/
        dfdx-coarse
        dfdx-dfdy
        dfdx-fine
        dfdy-coarse
        dfdy-fine

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-14 07:03:15 +00:00
Vasily Khoruzhick b579af77f3 lima/ppir: print srcs and dests in ppir_node_print_prog()
Now we have an accessors for ppir src, so it's possible to easily
print all srcs and dests while dumping ppir representation.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-13 22:44:07 -07:00
Vasily Khoruzhick 6920710af5 lima/ppir: use src accessors in ppir regalloc
Get rid of most switch/case by using src accessors

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-13 22:44:07 -07:00
Vasily Khoruzhick a5e7c12ced lima/ppir: add ppir_node to ppir_src
We'll need it if we want to walk through node sources

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-13 22:43:58 -07:00
Vasily Khoruzhick afa64a2105 lima/ppir: introduce accessors for ppir_node sources
Sometimes we need to walk through ppir_node sources, common
accessor for all node types will simplify code a lot.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-13 22:38:07 -07:00
Jordan Justen 0f5be81edd
iris: Expose aux buffer as 2nd plane w/modifiers
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-13 15:20:47 -07:00
Jordan Justen 246eebba4a
iris: Export and import surfaces with modifiers that have aux data
The DRI interface for modifiers with aux data treats the aux data as a
separate plane of the main surface.

When the dri layer requests the plane associated with the aux data, we
save the required information into the dri aux plane image.

Later when the image is used, the dri plane image will be available in
the pipe_resource structure's `next` field. Therefore in iris, we
reconstruct the aux setup from this separate dri plane image when the
image is used.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-13 15:20:47 -07:00
Kenneth Graunke 99c8eb997d
iris: Do proper format checks for Y+CCS modifier support
We need to ensure that the DRI image format supports CCS.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-08-13 15:20:47 -07:00
Jordan Justen 51f941c20c
iris: Create single bo for surfaces with modifiers and aux data
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-13 15:20:47 -07:00
Jordan Justen 2c7b577e13
iris: Split iris_resource_alloc_aux to enable aux modifiers
Reworks:

 * If the aux-state is not ISL_AUX_STATE_AUX_INVALID, then use memset
   even when memset_value is zero. The hiz buffer initial aux-state
   will be set to invalid, and therefore we can skip the memset. But,
   for CCS it will be set to ISL_AUX_STATE_PASS_THROUGH, and therefore
   the aux data must be cleared to 0 with the memset. Previously we
   would use BO_ALLOC_ZEROED with the CCS aux data, so this memset
   wasn't required. Now, the CCS aux data may be part of the main
   surface. We prefer to not use BO_ALLOC_ZEROED excessively, so the
   memset is needed for the CCS case. (Nanley)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-13 15:20:46 -07:00
Jordan Justen aad36dfd16
iris: Add aux offset into hiz_address
This is not currently required because the hiz buffer is in a separate
buffer, and therefore the offset is 0. If we combine the aux buffer
with the main surface buffer, then the hiz offset may become non-zero.

Suggested-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-13 15:20:39 -07:00
Alyssa Rosenzweig 0c56330361 panfrost: Workaround bug in partial update implementation
We can't intersect with empty regions.

Fixes: 65ae86b854 ("panfrost: Add support for KHR_partial_update()")

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-13 11:13:48 -07:00
Alyssa Rosenzweig 29cfd154e3 panfrost: Implement transform feedback
Midgard has no hardware support for transform feedback, so we simulate
it in software. Lucky us.

What Midgard does do is write out vertex shader outputs to main memory
unconditonally. Fragment shaders read varyings back from main memory;
there's no on-chip storage for varyings. Whether this was a reasonable
design is a question I will not be engaging in this commit message.

What that does mean is that, in some sense, Midgard *always* does
transform feedback uncondtionally, and there's no way to turn off
transform feedback. Normally, we would allocate some scratch memory
every frame to store the varyings in an arbitrary format (interleaved
for simplicity), and then feed that scratch to the fragment shader and
discard when the rendering completes.

The only difference now is that sometimes, for some buffers, we use a BO
provided to us by Gallium and a format provided by Gallium, instead of
allocating the memory and choosing the format ourselves. This has some
limitations -- in particular, it only works at vec4 granularity, so a
corresponding GLSL linkage patch is needed to correctly implement
transform feedback for non-vec4 types. Nevertheless, given the hardware
already works in this admittedly-bizarre fashion, transform feedback is
"free". Or, at least, it's no more expensive than any other rendering.

Specifically not implemented is dynamically-sized transform feedback
(i.e. with geometry/tesselation shaders).

Spoiler alert: Midgard has no support for geometry *or* tessellation
shaders, despite advertising support. They get compiled to *massive*
compute shaders. How's that for checkbox compliance?

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:43:41 -07:00
Alyssa Rosenzweig 7c29588c07 panfrost: Increment offsets[] per draw
We have to maintain the internal offset ourselves. Per v3d.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:43:39 -07:00
Alyssa Rosenzweig e7a05a601e panfrost: Fixup stream out information per variant
We could probably get away with doing this once per pipe_shader_state
but let's not jump down that rabbit hole quite yet.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:43:32 -07:00
Alyssa Rosenzweig 5b0a1a4e49 panfrost: Route outputs_written through the compiler
It's there in shader_info, but we need to access it from pan_context.c

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:43:17 -07:00
Alyssa Rosenzweig f714eab882 panfrost: Import stream out utility from iris
We'll need this in a moment. Ken's implementation, lightly edited for
Panfrost.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:43:14 -07:00
Alyssa Rosenzweig 9b2514d6c6 panfrost: Flush when using transform feedback
This is a huge hack to workaround incomplete BO flushing logic, but it's
enough for the dEQP transform feedback tests, and doing the resource
management to get this right is out-of-scope for this patch series.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:43:11 -07:00
Alyssa Rosenzweig 4b0001c42d panfrost: Set PIPE_CAP_TGSI_TEXCOORD
It doesn't really make sense, since we don't have special texture
coordinate varyings, but it'll make some code simpler for XFB and it
doesn't hurt us, even if I lose a bit of my soul setting it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:43:09 -07:00
Alyssa Rosenzweig 72fc06df9c panfrost: Wire up statistics for primitives
GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN should now be handled.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:43:04 -07:00
Alyssa Rosenzweig 7c224c1008 panfrost: Implement callbacks for PRIMITIVES queries
We're just going to compute them in the driver but let's get the
structures setup to handle them. Implementation from v3d.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:42:48 -07:00
Rob Clark 72d086fc36 freedreno/a6xx: move SSBO/image consts to IBO stateobj
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark ab01ab4d4f freedreno/a6xx: move VS driverparams to it's own stateobj
If driver-params are required, we really should emit it on every draw
for correctness.  And if not required, we should emit a DISABLE so that
un-applied state updates from previous draws don't corrupt the const
state.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark 882d53d8e3 freedreno/ir3+a6xx: same VBO state for draw/binning
Worth ~+20% on gl_driver2

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark 4b82d1bbb7 freedreno/a6xx: add fd_emit_take_group()
Which takes ownership of the stateobj.  Useful for streaming state-
objs, to avoid an extra ref/unref

Worth ~5% at gl_driver2

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark 4a188e4215 freedreno/ir3: track # of driver params
To avoid emitting unneeded const state.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark 7f1e3391c6 freedreno/a6xx: move immediates to program stateobj
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark f0b91730a1 freedreno/a6xx: stop using ir3_emit_{vs,fs}_consts()
Should be no functional change.  Next step is to re-arrange various
const state into different stateobjs.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark 53667a43c4 freedreno/ir3: push ctx further up call chain
Move more of the code to deal just w/ screen, without requiring ctx.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark 4080dfb8af freedreno/ir3: move ring_wfi() further up call chain
Hoist them out of code-paths that will eventually be called directly for
various a6xx+ const related stateobjs.

This ends up duplicating one constlen check in ir3_emit_vs_consts(), to
avoid what could otherwise be an unnecessary WFI on older gens.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark c6fab232c8 freedreno/all: move more emit helpers to screen
framebuffer_barrier() still depends on the ctx, but the rest can move to
screen.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark 684f4b5843 freedreno/a3xx-a6xx+ir3: move emit_const* to screen
These don't need to be in context, and we'll need them in screen in a
later patch.  Plus it's a good cleanup.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark 566f2281c5 freedreno/a6xx: add fd6_emit_init_screen()
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark e89255b0a5 freedreno/a5xx: add fd5_emit_init_screen()
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:25 -07:00
Rob Clark d256e3f34a freedreno/a3xx: add fd3_emit_init_screen()
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:25 -07:00
Rob Clark b9d3f39728 freedreno/a2xx: add fd2_emit_init_screen()
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:08:07 -07:00
Rob Clark ec0ec641d8 freedreno/a4xx: add fd4_emit_init_screen()
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:08:07 -07:00
Rob Clark 2f94de2372 freedreno/a2xx: call fd2_emit_ib() directly from fd2
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:08:07 -07:00
Rob Clark eb45422c5f freedreno/a5xx: call fd5_emit_ib() directly from fd5
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:08:07 -07:00
Rob Clark 50e15e1c6f freedreno/a4xx: call fd4_emit_ib() directly from fd4
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:08:07 -07:00
Rob Clark 4326eeac97 freedreno/a3xx: call fd3_emit_ib() directly from fd3
No reason for the indirection when called from a3xx specific code.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:08:07 -07:00
Rob Clark 32014afa44 freedreno/ir3: move VS driver-param emit
Move DP emit to it's own function.  No functional change, just code
motion to prepare for splitting up const state into multiple state-
objs on a6xx.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:08:07 -07:00
Boris Brezillon 65ae86b854 panfrost: Add support for KHR_partial_update()
Implement ->set_damage_region() region to support partial updates.

This is a dummy implementation in that it does not try to merge
damage rects. It also does not deal with distinct regions and instead
pick the largest quad as the only damage rect and generate up to 4
reload rects out of it (the left/right/top/bottom regions surrounding
the biggest damage rect).

We also do not try to reduce the number of draws by passing all quad
vertices to the blit request (would require extending u_blitter)

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-13 14:41:10 +02:00
Jordan Justen fc12fd05f5
iris: Implement pipe_screen::resource_get_param
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-13 01:12:30 -07:00
Iago Toral Quiroga 2353f7f7ef vc4: clamp gl_PointSize to a minimum of 1.0
The OpenGL ES spec requires that the value of gl_PointSize is clamped
to an implementation-dependent range matching what is advertised by
GL_ALIASED_POINT_SIZE_RANGE. For VC4 this is [1.0, 512.0], but the
hardware won't clamp to the minimum side of the range and won't render
points with a size strictly smaller than 1.0 either, so we need to
clamp manually. For points larger than the maximum size of the range
the hardware clamps automatically.

Fixes piglit test:
spec/!opengl 2.0/vs-point_size-zero

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 09:44:54 +02:00
Iago Toral Quiroga b594796f1b v3d: do not automatically flush current job for SSBOs and shader images
If the current job has a sequence of draw calls involving SSBOs and/or
shader images, we would flush the job in between each draw call.
With this change, we won't flush the current job and we rely on the
application inserting correct barriers by issuing glMemoryBarrier()
when needed.

v2 (Eric):
 - When mapping a buffer for writing, we always need to flush.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 08:25:15 +02:00
Iago Toral Quiroga f1cf1153e8 v3d: only process glMemoryBarrier() for SSBOs and images
PIPE_BARRIER_UPDATE is defined as:
PIPE_BARRIER_UPDATE_BUFFER | PIPE_BARRIER_UPDATE_TEXTURE

Which means we were flushing for any flags other than these two, but
this was intended to only flush for ssbos and images.

Actually, the driver automatically flushes jobs as we need, including
writes/reads involving SSBOs and images, so we don't really need to
flush anything when the program emits a barrier. However, this may
lead to excessive flushing in some cases, so we will soon change this
to avoid atutomatic flushing of the current job for SSBOs and images,
meaning that we will rely on the application to emit correct memory
barriers for these that we should make sure to process here.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 08:25:15 +02:00
Iago Toral Quiroga f1559ca922 v3d: fix flushing of SSBOs and shader images
If the current draw call includes SSBOs, then we must flush any jobs
that are writing to the same SSBOs (so that our SSBOs reads are correct),
as well as jobs reading from the same SSBO (so that our SSBO writes don't
stomp previous SSBO reads).

The exact same logic applies to shader images. In this case we were already
flushing previous writes, but we should also flush previous reads.

Note that We don't need to call v3d_flush_jobs_reading_resource() and
v3d_flush_jobs_writing_resource() separately though, since flushing
jobs that read a resource also flushes those writing to it.

Suggested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 08:25:15 +02:00
Rafael Antognolli a1a499e7fe iris/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced.
If the pixel pipes have a different number of subslices, emit a slice
hashing table that will ensure proper workload distribution.

v2: Don't need to set the mask - it's mbo (Ken).
v3: Don't keep a reference to the resource used for emitting the table
(Ken).
2019-08-12 16:19:08 -07:00
Jason Ekstrand 134607760a intel/compiler: Fill a compiler statistics struct
This commit is all annoying plumbing work which just adds support for a
new brw_compile_stats struct.  This struct provides a binary driver
readable form of the same statistics we dump out to stderr when we
INTEL_DEBUG is set with a shader stage.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Khaled Emara 2720ad5fd9 freedreno: disable tiling for cubemaps
Tiling doesn't work quite well with cubemaps.
Revert to linear textures, until it's fixed.
2019-08-12 22:30:54 +00:00
Khaled Emara 0ae16fb565 freedreno: add tiling parameters for 2D/2DArray/3D 2019-08-12 22:30:54 +00:00
Khaled Emara aeaba3e4a6 freedreno: simplified slices setup for a3xx
a3xx doesn't support ASTC and layout_first always returns false
2019-08-12 22:30:54 +00:00
Khaled Emara e11a239e8c freedreno: enable tiled textures for debug builds 2019-08-12 22:30:54 +00:00
Rhys Perry 7740149852 nir: merge and extend nir_opt_move_comparisons and nir_opt_move_load_ubo
v2: add to series
v3: update Makefile.sources
v4: don't remove a comment and break statement
v4: use nir_can_move_instr

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-12 22:01:30 +00:00
Rhys Perry da8ed68aca nir: replace nir_move_load_const() with nir_opt_sink()
This is mostly the same as nir_move_load_const() but can also move
undef instructions, comparisons and some intrinsics (being careful with
loops).

v2: actually delete nir_move_load_const.c
v3: fix nir_opt_sink() usage in freedreno
v3: update Makefile.sources
v4: replace get_move_def with nir_can_move_instr and nir_instr_ssa_def
v4: handle if uses
v4: fix handling of nested loops
v5: re-write adjust_block_for_loops
v5: re-write setting of use_block for if uses

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-12 22:01:30 +00:00
Andreas Baierl 1c45541c7f lima/ppir: Add fddx and fddy
Lower fddx and fddy and set the right bits in codegen.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-08-12 23:20:04 +02:00
Francisco Jerez 026773397b iris/gen9: Optimize slice and subslice load balancing behavior.
See "i965/gen9: Optimize slice and subslice load balancing behavior."
for the rationale.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-12 13:17:58 -07:00
Alyssa Rosenzweig 15954ab6ca pan/midgard: Implement nir_intrinsic_load_num_work_groups
Just a sysval to route through.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:02 -07:00
Alyssa Rosenzweig 60d80157d1 panfrost: Force flush every compute job
This is of course suboptimal for performance, forcing each
glDispatchCompute call to be submitted separately to the kernel and
finish to completion. However, for the initial bring-up of compute jobs,
this simplifies quite a bit.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:42:59 -07:00
Alyssa Rosenzweig 2efa025b05 panfrost: Add SSBO system value
For each SSBO index we get from Gallium/NIR, we need two pieces of
information in the shader:

1. The address of the SSBO in GPU memory. Within the shader, we'll be
accessing it with raw memory load/store, so we need the actual address,
not just an index.

2. The size of the SSBO. This is not strictly necessary, but at some
point, we may like to do bounds checking on SSBO accesses.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:42:59 -07:00
Marek Olšák 8ce4f9bbc3 radeonsi: remove the always_nir option
tgsi_to_nir is no longer optional if NIR is enabled.
2019-08-12 14:52:17 -04:00
Marek Olšák 4e545f934f radeonsi/nir: implement default tess level system values
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-12 14:52:17 -04:00
Marek Olšák 5167ca27fa gallium: add TGSI_SEMANTIC_DEFAULT_OUTER/INNER_LEVEL
for radeonsi NIR support.
2019-08-12 14:52:17 -04:00
Marek Olšák 1b881852bc compiler: add SYSTEM_VALUE_USER_DATA_AMD
for internal radeonsi shaders
2019-08-12 14:52:17 -04:00
Marek Olšák f0ccc5457a compiler: add shader_info.cs.user_data_components_amd 2019-08-12 14:52:17 -04:00
Marek Olšák 028dbd35ba compiler: add shader_info.vs.blit_sgprs_amd
for internal radeonsi shaders
2019-08-12 14:52:17 -04:00
Marek Olšák 902dd50cf0 gallium: add AMD-specific compute TGSI enums
for tgsi_to_nir
2019-08-12 14:52:17 -04:00
Marek Olšák 6a2bdb8d01 gallium: add TGSI_PROPERTY_VS_BLIT_SGPRS_AMD for tgsi_to_nir
needed by radeonsi NIR support
2019-08-12 14:52:17 -04:00
Christian Gmeiner 914ecc9384 etnaviv: fix compile warnings in release build
[27/31] Compiling C object 'src/gallium/drivers/etnaviv/df32d18@@etnaviv@sta/etnaviv_compiler_nir.c.o'.
In file included from ../../src/gitlab_mesa/src/gallium/drivers/etnaviv/etnaviv_compiler_nir.c:552:
../../src/gitlab_mesa/src/gallium/drivers/etnaviv/etnaviv_compiler_nir_emit.h: In function 'ra_assign':
../../src/gitlab_mesa/src/gallium/drivers/etnaviv/etnaviv_compiler_nir_emit.h:903:9: warning: unused variable 'ok' [-Wunused-variable]
    bool ok = ra_allocate(g);
         ^~
../../src/gitlab_mesa/src/gallium/drivers/etnaviv/etnaviv_compiler_nir.c: In function 'etna_compile_shader_nir':
../../src/gitlab_mesa/src/gallium/drivers/etnaviv/etnaviv_compiler_nir.c:663:9: warning: unused variable 'ok' [-Wunused-variable]
    bool ok = emit_shader(c->nir, &options, &v->num_temps, &num_consts);
         ^~

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-12 16:58:13 +00:00
Tapani Pälli d4b574f26a iris: reorder arguments as expected by the function
CID: 1452262
Fixes: b4c54894bb "iris: Handle vertex shader with window space position"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
2019-08-12 13:08:26 +03:00
Tapani Pälli 590ba15d6e iris/android: move iris_query.c to 'per gen' LIBIRIS_SRC_FILES
Fixes Iris build on Android.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-08-12 10:06:36 +03:00
Kenneth Graunke 0f3768bc5d iris: Free query on error path
CID: 1452276
2019-08-11 14:04:31 -07:00
Kenneth Graunke 661be3fef9 iris: Add missing 'break'
We don't want to fall through to unreachable().

CID: 1452277
2019-08-11 14:04:31 -07:00
Caio Marcelo de Oliveira Filho 5ed4e31c08 spirv: Drop lower_workgroup_access_to_offsets
Intel drivers are not using this anymore, and turnip still don't have
Compute Shaders, so won't make a difference to stop using this option.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Rob Clark <robdclark@chromium.org>
2019-08-10 22:15:35 -07:00
Kenneth Graunke f1dba99639 iris: minor restyling 2019-08-10 00:16:45 -07:00
Mark Janes 9c597514d4 iris/query: enable amd performance monitors
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:34 -07:00
Mark Janes 469af7fdc9 iris/perf: get monitor results
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:32 -07:00
Mark Janes 1cb4fc184f iris/perf: add begin/end hooks
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:24 -07:00
Mark Janes 8c4c346665 iris/perf: add delete query
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:17 -07:00
Mark Janes aca42759ff iris/perf: implement iris_create_monitor_object
This is the first call that provides the iris context to the monitor
implementation.  On the first call, use the iris context to initialize
the monitor context.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:14 -07:00
Mark Janes 0fd4359733 iris/perf: implement routines to return counter info
With this commit, Iris will report that AMD_performance_monitor is
supported, and will allow the caller to query the available metrics.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:03 -07:00
Lionel Landwerlin 8818db8f2c vc4: prepare for p_compiler.h dependency removal
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-09 22:59:43 +03:00
Alyssa Rosenzweig 9bc99e60a8 panfrost: Assign varying buffers dynamically
Rather than hardcoding certain varying buffer indices "by convention",
work it out at draw time. This added flexibility is needed for
futureproofing and will be enable streamout.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-09 11:53:21 -07:00
Alyssa Rosenzweig 46dae9ef58 panfrost: Assign indices at draw-time
This will allow us to shuffle buffers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-09 11:53:21 -07:00
Alyssa Rosenzweig af6d3f7cb5 panfrost: Break out pan_varyings.c
This code is fairly self-contained, so let's factor it out of the giant
pan_context.c monster.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-09 11:53:21 -07:00
Alyssa Rosenzweig 4dba493fd7 panfrost: Enable PIPE_CAP_STREAM_OUTPUT_INTERLEAVE_BUFFERS
Just as easy/hard as the rest of XFB.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-09 11:53:21 -07:00
Alyssa Rosenzweig 5ff7973560 panfrost: Import streamout data structures
Pretty much copypasted from v3d to jumpstart us.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-09 11:53:21 -07:00
Krzysztof Raszkowski c0ab268f9c gallium/swr: Fix glClear when it's used with glEnable/glDisable GL_SCISSOR_TEST
When GL_SCISSOR_TEST is enabled glClear is handled by state tracker
and there is no need to do this in gallium driver.

Reviewed-by: Alok Hota alok.hota@intel.com
2019-08-09 18:56:13 +02:00
Christian Gmeiner 889e752965 etnaviv: fix typo
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-09 13:08:20 +00:00
Christian Gmeiner de5070ea8d etnaviv: add gpu_supports_texture_target(..)
Currently I am seeing a handful of the following debug message:
translate_texture_target:495: Unhandled texture target: 0

PIPE_BUFFER is not handled in translate_texture_target(..) which makes
sense as it is used to translate from PIPE_XXX to GPU specific value
during etna_create_sampler_view_state(..).

To fix this problem introduce gpu_supports_texture_target(..) which just
checks if the texture target is supported.

Fixes: dfe048058f ("etnaviv: support 3D and 2D array textures")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-09 13:08:20 +00:00
Vasily Khoruzhick 39a90749af lima: introduce a struct describing texture descriptor
Use a struct with bitfields to construct texture descriptor
instead of poking bits in array of uint32_t. It improves code
readability and makes it easier to experiment with unknown fields.

Also fix mipmapping while we're at it - Utgard can have up to 13
levels, but 64 bytes is enough only for 10. Calculate descriptor
size dynamically to account extra levels if we need them.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-08 19:17:20 -07:00
Vasily Khoruzhick edf008c04e lima: add texel format table
Introduce a table for supported texel formats and use it to check
whether format is supported and for converting pipe format to lima
texel format.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-08 19:17:20 -07:00
Gurchetan Singh 42759dc986 virgl: check scanout mask
Otherwise, virgl will report renderable or texturable formats as
also scan-out formats.

v2: drop host feature check (@kusma)

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-08-08 16:21:57 -07:00
Gurchetan Singh 3da029ac1a virgl: fixup_readback_format --> fixup_formats
This function is generalizable.

Suggested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-08-08 16:21:57 -07:00
Gurchetan Singh bf0ca99ec7 virgl: access caps in a less verbose way in virgl_is_format_supported
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-08-08 16:21:57 -07:00
Greg V c0dc5c1859 meson: define ETIME to ETIMEDOUT if not present
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-08 21:44:33 +01:00
Roman Stratiienko 28061e0ab0 lima: Fix Android.mk
1. Update LOCAL_SRC_FILES according to commit
54434fe670 ("lima/gpir: Rework the scheduler").

2. Add libpanfrost_shared.a dependency.

3. Generate lima_nir_algebraic.c with Android.mk
Fixes Android build error introduced by commit 5adfc8602c
("lima/ppir: move sin/cos input scaling into NIR")

Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Acked-by: Qiang Yu <yuq825@gmail.com>
2019-08-08 17:47:22 +00:00
Rhys Perry c52c54a746 anv,i965,iris: deduplicate setting of total_shared
v5: add patch

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-08 12:10:39 -05:00
Lucas Stach 68c24b09c2 etnaviv: remember data offset into BO
Imported resources might not start at offset 0 into the buffer object.
Make sure to remember the offset that is provided with the handle on
import.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-08-08 16:11:34 +02:00
Jan Zielinski 207026d29e swr/rasterizer: modernize thread TLB
Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-08 12:33:21 +02:00
Jan Zielinski 387599a661 swr/rasterizer: Refactor events collection mechanism
Several improvements and cleanups in events and statstics mechanisms

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-08 11:15:07 +02:00
Jan Zielinski ff75c35846 swr/rasterizer: improvements in simdlib
1. fix build issues with MSVC 2019 compiler

The MSVC 2019 compiler seems to have an issue with optimized code-gen
when using the _mm256_and_si256() intrinsic.
Only disable use of integer vpand on buggy versions MSVC 2019.
Otherwise allow use of integer vpand intrinsic.

2. Remove unused vec/matrix functionality

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-08 10:53:47 +02:00
Jan Zielinski b55a93fdd4 swr/rasterizer: Events are now grouped and enabled by knobs
All events are now grouped as follows:

-Framework (i.e. ThreadStart) [always ON]
-Api (i.e. SwrSync) [always ON]
-Pipeline [default ON]
-Shader [default ON]
-SWTag [default OFF]
-Memory [default OFF]

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-08 10:33:25 +02:00
Jan Zielinski 982d99490f swr/rasterizer: do not mark tiles dirty until actually rendered
Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-08 10:16:20 +02:00
Jan Zielinski 4f04f260d9 swr/rasterizer: enable size accumulation in mem stats
Small refactoring is also performed

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-08 10:16:20 +02:00
Jan Zielinski 365ad367f1 swr/rasterizer: enable using AOS vertex data format
Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-08 10:16:20 +02:00
Iago Toral Quiroga fb9f7872e7 v3d: handle wait requirement when retrieving query results correctly
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-08 08:36:52 +02:00
Iago Toral Quiroga 0f2d1dfe65 v3d: use the GPU to record primitives written to transform feedback
We can use the PRIMITIVE_COUNTS_FEEDBACK packet to write various primitive
counts to a buffer, including the number of primives written to transform
feedback buffers, which will handle buffer overflow correctly.

There are a couple of caveats with this:

Primitive counters are reset when we emit a 'Tile Binning Mode Configuration'
packet, which can happen in the middle of a primitives query, so we need to
read the buffer when we submit a job and accumulate the counts in the context
so we don't lose them.

We also need to do the same when we switch primitive type during transform
feedback so we can compute the correct number of recorded vertices from
the number of primitives. This is necessary so we can provide an accurate
vertex count for draw from transform feedback.

v2:
 - When computing the number of vertices for a primitive, pass in the base
   primitive, since that is what the hardware will count.
 - No need to update primitive counts when switching primitive types if
   the base primitives are the same.
 - Log perf warning when mapping the primitive counts BO for readback (Eric).
 - Only emit the primitive counts packet once at job end (Eric).
 - Use u_upload mechanism for the primitive counts buffer (Eric).
 - Use the XML to generate indices into the primitive counters buffer (Eric).

Fixes piglit tests:
spec/ext_transform_feedback/overflow-edge-cases
spec/ext_transform_feedback/query-primitives_written-bufferrange
spec/ext_transform_feedback/query-primitives_written-bufferrange-discard
spec/ext_transform_feedback/change-size base-shrink
spec/ext_transform_feedback/change-size base-grow
spec/ext_transform_feedback/change-size offset-shrink
spec/ext_transform_feedback/change-size offset-grow
spec/ext_transform_feedback/change-size range-shrink
spec/ext_transform_feedback/change-size range-grow
spec/ext_transform_feedback/intervening-read prims-written

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-08 08:36:52 +02:00
Iago Toral Quiroga 9eb8699e0f v3d: be more explicit about the query types supported
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-08 08:36:52 +02:00
Iago Toral Quiroga 9b316ab57a v3d: generate packet unpack functions
These were not being compiled because of the lack of __gen_unpack_address.

v2:
 - Shift raw address correctly (Eric).

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-08 08:36:52 +02:00
Tomeu Vizoso e7eac8a1e8 panfrost: Print errors from kernel
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 07:42:52 +02:00
Tomeu Vizoso 7c8434889d panfrost: Mark buffers as PANFROST_BO_HEAP
What we call GROWABLE in Mesa corresponds to the HEAP BO flag in the
kernel. These buffers cannot be memory mapped in the CPU side at the
moment, so make sure they are also marked INVISIBLE.

This allows us to allocate a big heap upfront (16MB) without actually
reserving space unless it's needed.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 07:42:52 +02:00
Tomeu Vizoso 19afd41e65 panfrost: Mark BOs as NOEXEC
Unless a BO has the EXECUTABLE flag, mark it as NOEXEC.

v2: - Rework version detection (Alyssa).

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 07:42:52 +02:00
Tomeu Vizoso 9398932c2d panfrost: Take into account flags when looking up in the BO cache
This will be useful right now so we avoid retrieving a non-executable
buffer when a executable one is needed.

As we support more flags, this logic will need to be extended to
consider the different trade-offs to be made when matching BO
specifications to BOs in the cache.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 07:42:52 +02:00
Tomeu Vizoso 950b5fc596 panfrost: Allocate shaders in their own BOs
Instead of all shaders being stored in a single BO, have each shader in
its own.

This removes the need for a 16MB allocation per context, and allows us
to place transient blend shaders in BOs marked as executable (before
they were allocated in the transient pool, which shouldn't be
executable).

v2: - Store compiled blend shaders in a malloc'ed buffer, to avoid
      reading from GPU-accessible memory when patching (Alyssa).
    - Free struct panfrost_blend_shader (Alyssa).
    - Give the job a reference to regular shaders when emitting
      (Alyssa).

v3: - Split out the allocation flags change (Rob).

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 07:42:52 +02:00
Mark Janes 2446f5cfd8 intel/perf: move perf-related constants to common location
The perf subsystem needs several macro definitions that were
duplicated in Iris and i965 headers.  Place these macros within perf,
if the perf implementation contains the only references to the values.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Ilia Mirkin 9ff8da0e50 nvc0: fix program dumping, use _debug_printf
This debug situation is unforunate. debug_printf only does something
with DEBUG set, but in practice all that needs to be moved to !NDEBUG.
For now, use _debug_printf which always prints. However the whole
function is guarded by !NDEBUG.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-08-07 22:32:02 -04:00
Ilia Mirkin f6af104340 nvc0: add support for ATOMC_WRAP TGSI operations
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-08-07 22:32:02 -04:00
Ilia Mirkin a2bb7b26a1 gallium: redefine ATOMINC_WRAP to be more hardware-friendly
Both AMD and NVIDIA hardware define it this way. Instead of replicating
the logic everywhere, just fix it up in one place.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-07 22:31:56 -04:00
Pierre-Eric Pelloux-Prayer 519bebdb40 radeonsi: limit DPBB context_states_per_bin batches when using gfx9 workaround
It seems that using 'context_states_per_bin = 1' for DPBB fixes the reported issue.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110214

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-07 18:45:24 -04:00
Pierre-Eric Pelloux-Prayer 120d0ef937 radeonsi: reduce DPBB persistent_states_per_bin value for APUs
Fixes some reported GPU hangs on RAVEN.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111231

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-07 18:45:22 -04:00
Pierre-Eric Pelloux-Prayer 6bda9ca062 radeonsi: fix typo in DPBB register field
Also only set FLUSH_ON_BINNING_TRANSITION for GPU families that needs it (matches
what si_emit_dpbb_disable is doing).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-07 18:45:20 -04:00
Pierre-Eric Pelloux-Prayer 90bded140e radeonsi: fix S_028C48_MAX_ALLOC_COUNT value
This field uses "value minus 1" encoding.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-07 18:45:09 -04:00
Christian Gmeiner 323cda475b etnaviv: drop struct etna_3d_state
Also drop #if 0 code block.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com>
2019-08-07 22:12:00 +02:00
Bas Nieuwenhuizen 5a26f528cb meson,i965: Link with android deps when building for android.
The DBG marco in brw_blorp.c ends up calling an android log function:

error: undefined reference to '__android_log_print'

v2: On suggestion from Lionel, hang the Android dependency onto a new
    libintel_common dependency.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-07 15:34:46 +02:00
John Stultz fcfa2d1447 mesa: freedreno: Android.registers.mk: Fix up register xml.h file generation
The current Androdi.registers.mk file causes build failures that
look like:
 FAILED:
 external/mesa3d/src/freedreno/Android.registers.mk:49: error: implicit rules are obsolete: out/target/product/linaro_db845c/gen/STATIC_LIBRARIES/libfreedreno_registers_intermediates/registers/%.xml.h

Caused by the following Android build rule change:
https://android.googlesource.com/platform/build/+/HEAD/Changes.md#implicit_rules

I tried to replace this with something similar to the static
pattern suggested in the URL above, but ended up getting all the
xml.h files generated using only the first a2xx.xml source file.

So I've fallen back to explicitly defining the make rules for
each.

Additionally, we needed to provide the proper
LOCAL_EXPORT_C_INCLUDE_DIRS and add the defined static library
to the components that depend on the register headers.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2019-08-07 02:18:38 +00:00
John Stultz 96baf052b2 mesa: Add ir3/ir3_nir_imul.c generation to Android.mk
With current master we're seeing build failures with AOSP:
  error: undefined symbol: ir3_nir_lower_imul

This is due to the ir3_nir_imul.c file not being generated
in the Android.mk files.

This patch simply adds it to the Android build, after which
thigns build and book ok on db410c.

Cc: Rob Clark <robdclark@chromium.org>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Amit Pundir <amit.pundir@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Greg Hartman <ghartman@google.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2019-08-07 02:18:19 +00:00
Rohan Garg 16edd56fcc panfrost: Take into account a index_bias for glDrawElementsBaseVertex calls
Midgard does not accept a index_bias directly and relies instead on a
bias correction offset (offset_bias_correction) in order to calculate
the unbiased vertex index.

We need to make sure we adjust offset_start and vertex_count in order to
take into account the index_bias as required by a
glDrawElementsBaseVertex call and then supply a additional
offset_bias_correction to the hardware.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-06 17:18:19 -07:00
Pierre-Eric Pelloux-Prayer f84c9ad17a radeonsi: enable EXT_shader_image_load_store
This depends on LLVM 10 because this needs https://reviews.llvm.org/D65283

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:41:07 -04:00
Pierre-Eric Pelloux-Prayer 25fff591c1 radeonsi: add support for nir atomic_inc_wrap/atomic_dec_wrap
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:41:06 -04:00