Commit Graph

38134 Commits

Author SHA1 Message Date
Krzysztof Raszkowski 4ff02b3edd swr: fix support for GL_ARB_copy_image extension
This commit fix support and adjusts the capabilities
returned by the SWR driver and the documentation
to correctly report the GL_ARB_copy_image extension.

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-06-05 15:26:47 +00:00
Guido Günther b921df352d build: Build etnaviv drm
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-06-05 08:58:05 +00:00
Guido Günther 3696235f82 etnaviv: gallium: Use internal etnaviv_drmif.h
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-06-05 08:58:05 +00:00
Guido Günther 3835e21369 etnaviv: untabify
Two driver files had tabs mixed with spaces. Remove the tabs.

Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-06-05 08:58:05 +00:00
Tomeu Vizoso c7a6e07454 panfrost: bifrost: Fix format string in disassembler
The compiler configuration was hardened to fail on format warnings and
things stopped building.

Fixes: c9c1e26106 ("mesa: prevent common string formatting security issues")
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-By: Ryan Houdek <Sonicadvance1@gmail.com>
2019-06-05 10:40:19 +02:00
Kenneth Graunke 8d4f68ee20 iris: Free the buffer when reading from the disk cache. 2019-06-04 23:53:57 -07:00
Alyssa Rosenzweig bfa9f56a2a panfrost/midgard: Don't promote non-SSA to pipeline registers
Fixes: 33800f4612 ("panfrost/midgard: Implement "pipeline register"
prepass")

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-05 00:12:36 +00:00
Eric Anholt 36cb209787 freedreno: Drop invalid scissor optimization.
We do support TF now, so it's no longer valid.  Besides, if we want this
optimization, we should probably have mesa/st doing it right for everyone.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-06-04 16:44:37 -07:00
Chia-I Wu 65439291a0 virgl: resolve to correct level during texture read
When PIPE_TRANSFER_READ requires a resolve, we blit from the host
storage to a temporary storage, and do a format conversion from the
temporary storage to the guest storage.  This change makes sure we
convert to the correct level of the guest storage.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-06-04 21:37:03 +00:00
Chia-I Wu 067018d4e7 virgl: fix texture resolving with compressed formats
util_format_translate_3d expects the source box to be aligned to the
block size.  When resolving, make sure the size of the staging
buffer is aligned to the block size.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-06-04 21:37:03 +00:00
Bas Nieuwenhuizen a6a5a6f67f freedreno: Add printf pattern string.
Some new flag setting disallows it due to being a security risk.

Fixes: c9c1e26106 "mesa: prevent common string formatting security issues"
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-06-04 23:20:50 +02:00
Bas Nieuwenhuizen 6256925b11 Revert "vl: Enable DRM by default."
Reason:

meson.build:586:7: ERROR: Unknown variable "dep_libdrm".

if building without x11 platform.

This reverts commit 392c60928a.
2019-06-04 23:14:56 +02:00
Alyssa Rosenzweig 4a03d37827 panfrost/midgard: .pos propagation
A previous optimization converts fmax(x, 0.0) instructions to fmov.pos.
This pass then propagates the .pos from the move up to the source
instruction (when possible). From there, copy propagation will eliminate
the move.

In the future, we might prefer to do this in common NIR code like we do
for saturate, as Bifrost can also benefit.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig 5da0a33fab panfrost/midgard: Cleanup copy propagation
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig 33800f4612 panfrost/midgard: Implement "pipeline register" prepass
This prepass, run after scheduling but before RA, specializes to
pipeline registers where possible. It walks the IR, checking whether
sources are ever used outside of the immediate bundle in which they are
written. If they are not, they are rewritten to a pipeline register (r24
or r25), valid only within the bundle itself. This has theoretical
benefits for power consumption and register pressure (and performance by
extension). While this is tested to work, it's not clear how much of a
win it really is, especially without an out-of-order scheduler (yet!).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig 2a79afc5f0 panfrost/midgard: Helpers for pipeline
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig 3c7abbfbe8 panfrost/midgard: Refactor schedule/emit pipeline
First, this moves the scheduler and emitter out of midgard_compile.c
into their own dedicated files.

More interestingly, this slims down midgard_bundle to be essentially an
array of _pointers_ to midgard_instructions (plus some bundling
metadata), rather than the instructions and packing themselves. The
difference is critical, as it means that (within reason, i.e. as long as
it doesn't affect the schedule) midgard_instrucitons can now be modified
_after_ scheduling while having changes updated in the final binary.

On a more philosophical level, this removes an IR. Previously, the IR
before scheduling (MIR) was separate from the IR after scheduling
(post-schedule MIR), requiring a separate set of utilities to traverse,
using different idioms. There was no good reason for this, and it
restricts our flexibility with the RA. So unify all the things!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig 0524ab9c37 panfrost/midgard: Cleanup RA (stylistic changes)
Trivial.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig debc29b9ad panfrost/midgard: Share MIR utilities
These are more generally useful than the files they were constrained to.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig 1bfa0d6ccc panfrost/midgard: Misc. cleanup for readibility
Mostly, this fixes a number of instances of lines >> 80 chars,
refactoring them into something legible.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig 2d98022330 panfrost/midgard: Extend RA to non-vec4 sources
This represents a major break with the former RA design. We now use
conflicting register classes to represent the subdivision of Midgard's
128-bit registers into varying sizes and arrangement. We determine class
based on the number of components in the instructions' masks. To support
this, we include a number of helpers in the RA to allow composing
swizzles and masks, such that MIR written implicitly assuming .xyzw
sources can be transformed to use actual (non-aligned) sources.

The net result is a marked decrease in register pressure on
non-vec4-exclusive shaders. We could still be doing much better. Not
implemented yet are:

   - Register spilling
   - Per-component liveness

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig c1715b558a panfrost/midgard: Set masks on ld_vary
These masks distinguish scalar/vec2/vec3 loads from the default vec4,
which helps with assembly readability (since it's immediately obvious
how many components are _actually_ affected, rather than doing
mysterious things to an unknown number of unused components). Later in
the series, this will enable smarter register allocation, as the unused
components will not be interpreted abnormally.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig 550be763fa panfrost/midgard: Fix liveness analysis bugs
This fixes liveness analysis with respect to inline constants and
branching. in practice, the symptom is abnormally high register
pressure.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig c54f3f42eb panfrost/midgard: Set int outmod for "pasted" code
These snippets of integer assembly are injected for various purposes.
Eventually, we'll want to implement these in NIR directly. Regardless,
the "default" output modifier is different between floats and ints, so
let's set the right one.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig 51196c3591 panfrost/midgard: Hoist some utility functions
These were static to midgard_compile.c but are more generally useful
across the compiler.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig 005d9b1ada panfrost/midgard: Remove pinning
This mechanism is only used by blend shaders, so just use a move here.
Ideally, it'll be copy-propped and DCE'd away; this removes a source of
considerable indirection and will simplify RA logic.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-06-04 20:14:50 +00:00
Bas Nieuwenhuizen 392c60928a vl: Enable DRM by default.
If libdrm is found the pipe loader enables drm anyway, and that is
pretty much the only extra dependency this code has.

This enables creating libva display using a drm fd without having
to enable the DRM (GBM really) backend of EGL, which is completely
unrelated.

Leaving the X11 platforms alone as they would still result in the
additional inclusion of extra deps.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-06-04 20:01:34 +02:00
Connor Abbott d68218dbca radeonsi/nir: Fix type in bindless address computation
Bindless handles in GL are 64-bit. This fixes an assert failure in LLVM.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-04 15:15:46 +02:00
Christian Gmeiner a6e879984c etnaviv: implement set_active_query_state(..) for hw queries
Clear w/ quad uses a normal draw which adds up to OQ. st/meta
uses set_active_query_state(..) to tell the driver to pause
queries in such cases.

Fixes spec@arb_occlusion_query@occlusion_query_meta_save piglit.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-06-04 14:58:02 +02:00
Kenneth Graunke 34d3103dee iris: Fix SO stride units for DrawTransformFeedback
Mesa measures in DWords.  The hardware also claims to measure in DWords.
Except the SO_WRITE_OFFSET field is actually bits 31:2, with 1:0 MBZ.
Which means that it really measures in bytes.  So, convert to bytes.

Without this, our offset / stride denominator was 1/4th the size it
should be, leading to 4x the vertex count that we should have had.

Fixes GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_two_buffers
2019-06-03 22:51:18 -07:00
Nicolai Hähnle f480b8aaa4 amd/common: use generated register header 2019-06-03 20:05:20 -04:00
Nicolai Hähnle 853ef5ccba amd/common: use SH{0,1}_CU_EN definitions only of COMPUTE_STATIC_THREAD_MGMT_SE0
The automatic header generation unifies identical registers in a series
and only emits definitions for the first one. This is mostly to avoid
emitting excessive definitions for CB registers, but special-casing
an exception for this family of registers doesn't seem worth it.
2019-06-03 20:05:20 -04:00
Nicolai Hähnle cf51009ad2 amd/common: unify PITCH_GFX6 and PITCH_GFX9
The definition of the fields differs, but PITCH_GFX9 is a mere extension
of PITCH_GFX6 that does not conflict with any other fields.

This aligns the definitions with what will be generated from the
register JSON.

The information about how large the fields really are is preserved in
the register database.
2019-06-03 20:05:20 -04:00
Nicolai Hähnle cd247cf456 amd/common: cleanup DATA_FORMAT/NUM_FORMAT field names
The field layout wasn't actually changed in gfx9, so having the suffix
isn't very useful. The field *contents* were changed, but this is
reflected in the V_xxx_xxx definitions and is taken into account by
the ac_debug logic based on the register JSON.

This aligns the definitions with what will be generated from the
register JSON.
2019-06-03 20:05:20 -04:00
Caio Marcelo de Oliveira Filho 045aeccf0e iris: Always reserve binding table space for NIR constants
Don't have a separate mechanism for NIR constants to be removed from
the table.  If unused, we will compact it away.  The use_null_surface
is needed when INTEL_DISABLE_COMPACT_BINDING_TABLE is set.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho 5611444809 iris: Print binding tables when INTEL_DEBUG=bt
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho 97cd865be2 iris: Compact binding tables
Change the iris_binding_table to keep track of what surfaces are
actually going to be used, then assign binding table indices just for
those.  Reducing unused bytes on those are valuable because we use a
reduced space for those tables in Iris.

The rest of the driver can go from "group indices" (i.e. UBO #2) to
BTI and vice-versa using helper functions.  The value
IRIS_SURFACE_NOT_USED is returned to indicate a certain group index is
not used or a certain BTI is not valid.

The environment variable INTEL_DISABLE_COMPACT_BINDING_TABLE can be
set to skip compacting binding table.

v2: (all from Ken)
    Use BITFIELD64_MASK helper. Improve comments.
    Assert all group is marked as used when we have indirects.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho 79f1529ae0 iris: Create an enum for the surface groups
This will make convenient to handle compacting and printing the
binding table.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho 1c8ea8b300 iris: Handle binding table in the driver
Stop using brw_compiler to lower the final binding table indices for
surface access.  This is done by simply not setting the
'prog_data->binding_table.*_start' fields.  Then make the driver
perform this lowering.

This is a better place to perfom the binding table assignments, since
the driver has more information and will also later consume those
assignments to upload resources.

This also prepares us for two changes: use ibc without having to
implement binding table logic there; and remove unused entries from
the binding table.

Since the `block` field in brw_ubo_range now refers to the final
binding table index, we need to adjust it before using to index
shs->constbuf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho 518f83236b iris: Pull brw_nir_analyze_ubo_ranges() call out setup_uniforms
We'll change iris to perform lowering of the binding table indices
earlier (before the backend kick in), but the backend compiler uses
the result of the analysis to identify load_ubo intrinsics, so we do
the analysis after the lowering to have the right indices.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-03 14:14:45 -07:00
Hyunjun Ko 382e3553af freedreno/ir3: fix counting and printing for half registers.
v2: defining 0x100 and use this for setting the FS_OUTPUT_REG.HALF_PRECISION
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-06-03 13:31:51 -07:00
Neil Roberts 689c3c7d40 freedreno/ir3: Use output type size to set OUTPUT_REG_HALF_PRECISION
Previously the A5XX_SP_FS_OUTPUT_REG_HALF_PRECISION was set depending
on whether half_precision was set in the shader key. With support for
mediump precision, it is possible to have different outputs use
different precisions. That means we can’t have a global shader state
to specify it. Instead it now tries to copy the half-float-ness
from the nir_variable for the output into the ir3_shader_variant. This
is then used to decide whether to set half-precision for each output.

The a6xx version is copied from the a5xx code but it has not been
tested.

v2. [Hyunjun Ko (zzoon@igalia.com)] There's the half flag recently
added, which represents precision based on IR3_REG_HALF. Now use this
flag to avoid duplication.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-06-03 12:44:03 -07:00
Pierre-Eric Pelloux-Prayer 4583f09caa radeonsi: init sctx->dma_copy before using it
Commit a1378639ab reordered context functions initializations but broke
sctx->b.resource_copy_region init when using AMD_DEBUG=forcedma.

In this case sctx->dma_copy was assigned a value after being used in:
   sctx->b.resource_copy_region = sctx->dma_copy;

This commit moves the FORCE_DMA special case after sctx->dma_copy initialization.

See https://bugs.freedesktop.org/show_bug.cgi?id=110422

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-06-03 15:05:30 -04:00
Axel Davy 5820ac6756 d3dadapter9: Revert to old throttling limit value
Recently PIPE_CAP_MAX_FRAMES_IN_FLIGHT was changed from 2
to 1:
20909284f2

No driver seems to overwrite the default value.

One user reports severe regressions for some games.
For now, revert to the value 2 for nine.

Cc: "19.1" mesa-stable@lists.freedesktop.org

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2019-06-03 20:37:13 +02:00
Marek Olšák 486bc1e17e ac: use amdgpu-flat-work-group-size
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-03 14:32:47 -04:00
Marek Olšák 4b11ed443b u_blitter: don't fail mipmap generation for depth formats containing stencil
Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=109754

Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-06-03 14:32:47 -04:00
Christian Gmeiner 3135ca4172 etnaviv: drop a bunch of duplicated gallium PIPE_CAP default code
Now that we have the util function for the default values, we can get
rid of the boilerplate.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-06-03 16:29:59 +02:00
Jonathan Marek 91672becc3 nir: copy intrinsic type when lowering load input/uniform and store output
Fixes: c1275052 "nir: add type information to load uniform/input and store output intrinsics"

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Tested-by: Erico Nunes <nunes.erico@gmail.com>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
2019-06-03 12:46:14 +00:00
Caio Marcelo de Oliveira Filho 27497c5c02 iris: Drop unused locals from iris_clear.c to avoid warning
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-05-31 15:55:05 -07:00
Jonathan Marek f387c2b238 nir: remove bool lowering from lower_int_to_float
Removes the bool_to_float logic from the int_to_float pass, so that both
can be used separately. By having separate passes we have better validation
and it makes it possible to use with the lower_ftrunc option (int lowering
generates ftrunc, but lower_ftrunc generates bools, ftrunc lowering should
probably be reworked). For now we always expect lower_bool to come after
lower_int.

Also fixes f2i32 to become ftrunc and adds u2f/f2u cases.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-05-31 21:35:26 +00:00