Commit Graph

116432 Commits

Author SHA1 Message Date
Marek Olšák 64dfc82340 st/mesa: rename basic -> common for st_common_program
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-17 20:31:34 -04:00
Marek Olšák 33d53f0614 st/mesa: rename st_xxx_program::tgsi to state
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-17 20:31:34 -04:00
Marek Olšák dd4d791821 st/mesa: lower doubles for NIR after linking
This allows dropping 1 call to st_nir_opts, because shaders are always
optimized after linking.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-17 20:31:37 -04:00
Marek Olšák 7908e82f60 st/mesa: call st_nir_opts for linked shaders only once
The removed st_nir_opts calls are mostly redundant.

There is an improvement with shader-db on radeonsi:

Before:
    real	1m54.047s
    user	28m37.857s
    sys 	0m7.573s

After:
    real	1m52.012s
    user	28m3.412s
    sys 	0m7.808s

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-17 20:31:34 -04:00
Ian Romanick 92252219d3 intel/vec4: Don't try both sources as immediates for DPH
DPH isn't actually commutative, so this doesn't work.  If the immediate
in src0 would be a VF candidate, we could do better. *shrug*

No shader-db changes on any Intel platform.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: b04beaf41d ("intel/vec4: Try both sources as candidates for being immediates")
2019-10-17 15:07:01 -07:00
Ian Romanick 050e4e28bf nir/search: Fix possible NULL dereference in is_fsign
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 09705747d7 ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern")
2019-10-17 15:07:01 -07:00
Jordan Justen da10fa9d63
iris: Let isl decide the supported tiling in more situations
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Suggested-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-17 14:47:23 -07:00
Jordan Justen be89fbd51e
intel/isl: Add gen12 depth/stencil surface alignments
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-17 14:47:23 -07:00
Jason Ekstrand d9565160b2
intel/isl: Select Y-tiling for stencil on gen12
Rework:
 * Disallow linear 1D stencil buffers (Nanley)
 * Force Y for gen12 stencil rather than ~W (Nanley)

Co-authored-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-17 14:47:22 -07:00
Jason Ekstrand 9dd9c3363b
intel/genxml: Remove W-tiling on gen12
It's no longer supported by the hardware

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-17 14:47:22 -07:00
Jordan Justen 523ba0a3e7
intel/genxml,isl: Add gen12 stencil buffer changes
Rework:
 * NULL stencil buffer path (Jason)
 * genxml fixes (Nanley)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-17 14:47:22 -07:00
Jordan Justen d2a490d1d9
intel/genxml,isl: Add gen12 depth buffer changes
Reworks:
 * Fix 3DSTATE_DEPTH_BUFFER "Surface Format" end in xml (Jason)
 * Remove WM_HZ_OP changes (Nanley)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-17 14:47:22 -07:00
Jordan Justen 6c9f9a82d7
intel/genxml,isl: Add gen12 render surface state changes
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-17 14:47:17 -07:00
Eric Anholt 75c601b6cf mesa: Refactor the entirety of _mesa_format_matches_format_and_type().
This function was difficult to implement for new formats due to the
combination of endianness and swapbytes support.  Since it's mostly
used for fast paths, bugs in it were often missed during testing.

Just reimplement it on top of the recent
_mesa_format_from_format_and_type() which can give us a canonical
MESA_FORMAT for a format and type enum (while respecting endianness).

Fixes:
- R4G4B4A4_UNORM, B4G4R4_UINT, R4G4B4A4_UINT incorrectly matched with
  swapBytes (you can't just reverse the channels if the channels
  aren't bytes)
- A4R4G4B4_UNORM and A4R4G4B4_UINT missing BGRA/4444_REV matches
- failing to match RGB/BGR unorm8 array formats on BE
- 2101010 formats incorrectly matching with swapBytes set.
- UINT/SINT byte formats failed to match with swapBytes set.

This deletes the part of tests/mesa_formats.cpp that called
_mesa_format_matches_format_and_type() to make sure it didn't
assertion fail, as it now would assertion fail due to the fact that we
were passing an invalid format (GL_RG) for most types.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-17 21:07:29 +00:00
Eric Anholt d77c77936b mesa: Add support for array formats of depth and stencil.
In desktop GL, you can specify things like GL_DEPTH_COMPONENT/GL_BYTE as a
ReadPixels format, and we need to be able to represent that to see if we
have proper MESA_FORMATs for them.  That's exactly what the
mesa_array_format enum is for.

v2: Drop _mesa from static fn.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-17 21:07:29 +00:00
Eric Anholt 4f4fc75357 mesa: Add format/type matching for DEPTH/UINT_24_8.
We had missed this case where GLES3 allows glReadPixels(DEPTH, UINT_24_8),
and just got lucky by the readpixels path never asking for the matching
format from this function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-17 21:07:29 +00:00
Eric Anholt 7be72b24f5 mesa: Fix depth/stencil ordering in _mesa_format_from_format_and_type().
The GL spec says the 24-bit component is in the high bits, and
format_unpack.c looks at the high 24 bits in the S8Z24 case, not
Z24SS8.

Avoids a regression in the next commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-17 21:07:29 +00:00
Eric Anholt df5fe86232 mesa: Add debug info to _mesa_format_from_format_and_type() error path.
The unreachable() that follows isn't very useful for debug, and by adding
this here we get a nice description of the failure in debug builds.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-17 21:07:29 +00:00
Kristian H. Kristensen 0a4e6726ba freedreno/a6xx: Turn on geometry shaders
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:45:03 -07:00
Kristian H. Kristensen d3945e3b9b freedreno/ci: Add failing tests to skip list
Some queries are still failing and layered rending needs more work.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:45:03 -07:00
Kristian H. Kristensen 622afc8dbd freedreno/a6xx: Implement PIPE_QUERY_PRIMITIVES_GENERATED for GS
When we don't have streamout enabled, we have to read this register to
get the number of primitives emitted.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen c8e1522a50 freedreno/blitter: Save GS state
We have GS state now.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen 946a1e206f st/mesa: Also enable GS when ESSLVersion > 320
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen 7cb672227b freedreno/a6xx: Support layered render targets
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen 0eebedb619 freedreno/a6xx: Emit program state for GS
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen d6ed39e20e freedreno/ir3: End VS with CHMASK and CHSH in GS pipelines
When used in a GS pipeline, the VS doesn't end with the END
instruction. Instead it chains to the GS, which continues running with
the same register allocation.  The intended use cases seems to be that
you can compile a regular VS (ie outputs in registers and ending with
END) but then tack on link-time generated code past the END to write
the outputs using STLW, in case the VS is used with GS.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen 4b7312b763 freedreno/ir3: Start GS with (ss) and (sy)
We don't know what kind of loads we might have to wait on when coming
in from chsh in the VS so set both sync flags.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen c347708bea freedreno/ir3: Pre-color GS header and primitive ID
These sysvals have to be unclobbered by VS and in the same registers
in both VS and GS, since the chsh from VS to GS doesn't reload the
values. We use the pre-color argument to ir3_ra() to always place
these values in r0.x and r0.y.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen ce08fddbbe freedreno/ir3: Setup ir3 inputs and outputs for GS
Inputs are the GS header, which contains vertex ID, local primitive ID
and thread ID as well as primitive ID. The setup is a little different
from other sysvals, since we always have to receive them in the VS so
that it can pass them on into the GS.

The vertex flag outputs from GS is set up as a proper nir output in
the lowering pass and doesn't need special handling here.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen 0293d14719 freedreno/ir3: Implement primitive layout intrinsics
This implements the load_vs_primitive_stride_ir3,
load_vs_vertex_stride_ir3 and load_primitive_location_ir3 intrinsics,
used for getting the primitive layout strides and locations.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen 8e16fb1528 freedreno/ir3: Implement lowering passes for VS and GS
This introduces two new lowering passes. One to lower VS to explicit
outputs using STLW and one to lower GS to load input using LDLW and
implement the GS specific functionality.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen 8f39985b01 freedreno/ir3: Add has_gs flag to shader key
Since the presence of GS changes how the VS operates we need to track
that in the shader key.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen 2703844cb3 freedreno/a6xx: Add missing adjacency primitives to table
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen 0324706764 freedreno/ir3: Add intrinsics that map to LDLW/STLW
These intrinsics will let us do all the offset calculations in nir,
which is nicer to work with and lets nir_opt_algebraic eat it all up.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen 436d125adf freedreno/ir3: Add new LDLW/STLW instructions
These access memory used for passing data between geometry stages.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen 956d319446 freedreno/ir3: Extend RA with mechanism for pre-coloring registers
We'll need to pre-color certain input registers betwee VS and GS
shaders.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen 0b6625d825 freedreno/ir3: Use third register for offset for LDL and LDLV
Before, offset held the offset, which can be either immediate or a
register.  Use a third register to hold the offset so that we can use
a register.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen 3a93e60e7b freedreno/ir3: Add support for CHSH and CHMASK instructions
Just add the constructors for now and special case similar to END so
we don't remove them.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen f335a6663d freedreno/a6xx: Trim a few regs from fd6_emit_restore()
We know what these do an either write them in the program stateobj or
don't need to write them.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen 610c8c938e freedreno/registers: Update with GS, HS and DS registers
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Eric Anholt 628ed1bbd5 freedreno/ci: Ban texsubimage2d_pbo.r16ui_2d, due to two flakes reported.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-10-17 20:32:46 +00:00
Marek Olšák 12d92714e9 st/mesa: silence a warning in st_nir_lower_tex_src_plane
trivial
2019-10-17 16:07:26 -04:00
Marek Olšák 3ed1dd3d42 gallium/u_blitter: remove an unused variable
trivial
2019-10-17 16:07:02 -04:00
Marek Olšák 9aa5b348de radeonsi: recreate aux_context after a GPU reset
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-17 14:56:26 -04:00
Marek Olšák 438ede3ca3 radeonsi: call the reset callback if get_device_reset_status returns a failure
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-17 14:56:24 -04:00
Marek Olšák 93707457b6 st/mesa: call the reset callback if glGetGraphicsResetStatus returns a failure
so that we immediately set the no-op dispatch

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-17 14:56:23 -04:00
Caio Marcelo de Oliveira Filho c847bfaaf5 intel/fs/gen12: Add tests for scoreboard pass
Tests the combinations of cases of RAW, WAW and WAR hazards involving
both inorder and outoforder instructions.  Also tests that
dependencies combine and propagate correctly through control
flow (loops and conditionals).

v2: Add an extra test illustrating that the non-logical CFG edge
    between then-block and else-block is being taking into
    account.  (Curro)

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-10-17 10:02:35 -07:00
Daniel Schürmann 4b458b3e8f aco: don't combine minmax3 if there is a neg or abs modifier in between
This fixes a graphical corruption in HotS.
No pipelinedb changes other than that.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-17 16:21:19 +00:00
Roland Scheidegger 045f05a2f6 gallivm: Fix saturated signed psub/padd intrinsics on llvm 8
LLVM 8 did remove both the signed and unsigned sse2/avx intrinsics in
the end, and provide arch-independent llvm intrinsics instead.
Fixes a crash when using snorm framebuffers (tested with piglit
arb_color_buffer_float-render GL_RGBA8_SNORM -auto).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
2019-10-17 17:42:16 +02:00
Samuel Pitoiset c644644c65 radv: fix DCC fast clear code for intensity formats (correctly)
Previous fix was pretty bogus.

This fixes a rendering regression with Nier (minimap too large).

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1943
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1952
Fixes: ea92273cea ("radv: fix DCC fast clear code for intensity formats")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-17 15:29:43 +02:00