109426 Commits

Author SHA1 Message Date
Rob Clark 8eb16ae8bf freedreno/ir3: fix regmask for merged regs
On a6xx+ with half-regs conflicting with full-regs, the legalize pass
needs to set appropriate sync bits, such as (sy), on writes to full regs
that conflict with half regs, and vice versa.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark 1dffb089f9 freedreno/ir3: fix sam.s2en encoding
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark 45b7a581b4 freedreno/ir3: fix sam.s2en decoding
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark 2d31cf9d3b freedreno/ir3/ra: fix half-class conflicts
On a6xx, half-regs conflict with full-regs.  But we were only setting up
conflicts for the first class (ie. scalar, but not hvec2/hvec3/hvec4),
resulting in higher half-reg classes getting assigned to regs that
overwrite full-regs.

Noticed while trying to enable indirect-sampler (sam.s2en) which uses an
hvec2 argument to pass the sampler/tex index.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
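The class issue described above can be sketched as a plain overlap test. This is an illustration only, not the ir3 register allocator: it assumes a merged register file in which full scalar register `f` occupies half-register slots `2f` and `2f + 1`, and the function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of ir3.
 *
 * Assumption: in a merged register file, full scalar register f aliases
 * half-register slots 2*f and 2*f + 1.
 *
 * half_base/half_size: first half slot and component count of a half-reg
 *                      value (1 = scalar, 2 = hvec2, and so on).
 * full_base/full_size: first full scalar slot and component count of a
 *                      full-reg value.
 */
bool half_conflicts_with_full(unsigned half_base, unsigned half_size,
                              unsigned full_base, unsigned full_size)
{
    unsigned half_end   = half_base + half_size;         /* exclusive */
    unsigned full_start = 2 * full_base;                 /* in half slots */
    unsigned full_end   = 2 * (full_base + full_size);   /* exclusive */

    /* Two half-open ranges overlap when each starts before the other ends. */
    return half_base < full_end && full_start < half_end;
}
```

Under these assumptions, a scalar half-reg at slot 1 does not touch full register 1 (slots 2 and 3), but an hvec2 starting at slot 1 does, which is the kind of overlap that a scalar-only conflict table would miss.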
Rob Clark cc5ca9391c freedreno/ir3: better cat6 encoding detection
These two bits seem to be a better way to detect which encoding we are
looking at.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Samuel Pitoiset 00327f827f ac: fix incorrect argument type for tbuffer.{load,store} with LLVM 7
GLC/SLC are boolean.

This fixes the following LLVM error when checkir is set:
Intrinsic has incorrect argument type!
void (i32, <4 x i32>, i32, i32, i32, i32, i32, i32, i32, i32)* @llvm.amdgcn.tbuffer.store.i32

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 14:02:00 +01:00
Samuel Pitoiset 20cac1f498 ac: fix 16-bit shifts
This fixes the following LLVM error when checkir is set:
Type too small for ZExt

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 14:01:58 +01:00
Samuel Pitoiset 2ac5c5c1b5 ac: add 16-bit support to fract
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 12:13:09 +01:00
Samuel Pitoiset 0eb1478ac2 ac: add 16-bit support to fsign
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 12:13:07 +01:00
Samuel Pitoiset ff11c9dcc7 ac: add f16_0 and f16_1 constants
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 12:13:05 +01:00
Timothy Arceri 427a6fee43 nir: only override previous alu during loop analysis if supported
Users of this function expect alu to be a supported comparison
if the induction variable is not NULL. Since we attempt to
override the return values if the first limit is not a const, we
must make sure we are dealing with a valid comparison before
overriding the alu instruction.

Fixes an unreachable in inverse_comparison() with the game
Assassin's Creed Odyssey.

Fixes: 3235a942c1 ("nir: find induction/limit vars in iand instructions")

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110216
2019-03-21 21:51:21 +11:00
Michel Dänzer 6d0a7f798c gitlab-ci: Use 8 CPU cores in autotools job
This cuts down the job runtime from ~9.5 to ~7 minutes with my personal
runner on an 8-core Ryzen 7 1700.

While this might result in slightly higher load on shared runners, it
should be OK, since libtool doesn't use the CPU cores as effectively as
e.g. ninja does; a significant part of the CPU load tends to be in bash
processes at any time, which should be relatively light on memory.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-03-21 09:58:31 +01:00
Michel Dänzer a2cce701e6 gitlab-ci: List some longer-running jobs before others of the same stage
This increases the chance of them running earlier, which can have an
impact on the total duration of the pipeline.

v2:
* Minor style fix-up to moved comment (Eric Anholt)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-03-21 09:55:08 +01:00
Samuel Pitoiset db07f0554a radv: add missing initializations since VK_EXT_pipeline_creation_feedback
This fixes the world.

Fixes: 5f5ac19f13 ("radv: Implement VK_EXT_pipeline_creation_feedback.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:42:31 +01:00
Rhys Perry 037f11d42e radv: enable VK_KHR_8bit_storage
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:27 +01:00
Rhys Perry 3cc72a88d8 ac/nir: implement 8-bit conversions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:25 +01:00
Rhys Perry c73f8b6576 ac/nir: add 8-bit types to glsl_base_to_llvm_type
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:22 +01:00
Rhys Perry 9c5067acf1 ac/nir: implement 8-bit ssbo stores
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:20 +01:00
Samuel Pitoiset b235d77e18 ac: add ac_build_tbuffer_store_byte() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:18 +01:00
Rhys Perry b12e074b89 ac/nir: implement 8-bit push constant, ssbo and ubo loads
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:16 +01:00
Samuel Pitoiset 104dbc64a5 ac: add ac_build_tbuffer_load_byte() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:14 +01:00
Samuel Pitoiset 6e632eb24b ac: add various int8 definitions
Original patch by Rhys Perry.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:10 +01:00
Tapani Pälli 4e1bbb000c anv/radv: release memory allocated by glsl types during spirv_to_nir
Fixes leaks for each glsl_type generated:

   ==32470== 384 bytes in 3 blocks are possibly lost in loss record 18 of 18
   ==32470==    at 0x483880B: malloc (vg_replace_malloc.c:309)
   ==32470==    by 0x4C43F4A: ralloc_size (ralloc.c:119)
   ==32470==    by 0x4C44014: rzalloc_size (ralloc.c:151)
   ==32470==    by 0x4C44258: rzalloc_array_size (ralloc.c:215)
   ==32470==    by 0x4D38957: glsl_type::glsl_type(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:114)
   ==32470==    by 0x4D3BEED: glsl_type::get_struct_instance(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:1146)
   ==32470==    by 0x4D42ECC: glsl_struct_type (nir_types.cpp:501)
   ==32470==    by 0x4CDB5A1: vtn_handle_type (spirv_to_nir.c:1269)
   ==32470==    by 0x4CE53DD: vtn_handle_variable_or_type_instruction (spirv_to_nir.c:4018)
   ==32470==    by 0x4CD8CFF: vtn_foreach_instruction (spirv_to_nir.c:365)
   ==32470==    by 0x4CE5E6B: spirv_to_nir (spirv_to_nir.c:4490)
   ==32470==    by 0x497AF10: anv_shader_compile_to_nir (anv_pipeline.c:173)

v2: move release call to vkDestroyInstance
v3: apply fix also to radv driver

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-21 08:30:22 +02:00
Jason Ekstrand 6e19348ad1 spirv: Drop inline tg4 lowering
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-03-21 02:58:41 +00:00
Jason Ekstrand 08f804ec0c anv,radv,turnip: Lower TG4 offsets with nir_lower_tex
v2: turn on for turnip as well (Karol Herbst)

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-03-21 02:58:41 +00:00
Karol Herbst d8a0658d8b nir/lower_tex: Add support for tg4 offsets lowering
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2019-03-21 02:58:41 +00:00
Karol Herbst 99f202432b nv50/ir/nir: support gather offsets
v2: only emit offsets if those are !0

Signed-off-by: Karol Herbst <kherbst@redhat.com>
2019-03-21 02:58:41 +00:00
Karol Herbst 71c66c254b nir: add support for gather offsets
Values inside the offsets parameter of textureGatherOffsets are required to be
constants in the range of [GL_MIN_PROGRAM_TEXTURE_GATHER_OFFSET,
GL_MAX_PROGRAM_TEXTURE_GATHER_OFFSET].

As this range is never outside [-32, 31] for all existing drivers inside mesa,
we can simply store the offsets as a int8_t[4][2] array inside nir_tex_instr.

Right now only Nvidia hardware supports this in hardware, so we can turn this
on inside Nouveau for the NIR path as it is already enabled with the TGSI one.

v2: use memcpy instead of for loops
    add missing bits to nir_instr_set
    don't show offsets if they are all 0
v3: default offsets aren't all 0
v4: rename offsets -> tg4_offsets
    rename nir_tex_instr_has_explicit_offsets -> nir_tex_instr_has_explicit_tg4_offsets

Signed-off-by: Karol Herbst <kherbst@redhat.com>
2019-03-21 02:58:41 +00:00
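The storage scheme in the message above (an `int8_t[4][2]` array, `memcpy` instead of loops, non-zero defaults, and a "has explicit offsets" query) might look roughly like the following. This is a sketch, not the NIR code: `struct tex_instr_sketch` and the `tex_*` helpers are hypothetical stand-ins, and the default offset values are an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the part of nir_tex_instr holding the offsets. */
struct tex_instr_sketch {
    int8_t tg4_offsets[4][2];
};

/* Assumed defaults: not all zero, as the v3 note says. The exact values
 * are an assumption made for this sketch. */
static const int8_t default_tg4_offsets[4][2] = {
    { 0, 1 }, { 1, 1 }, { 1, 0 }, { 0, 0 },
};

/* Start every instruction with the default gather footprint. */
void tex_init(struct tex_instr_sketch *tex)
{
    memcpy(tex->tg4_offsets, default_tg4_offsets, sizeof(tex->tg4_offsets));
}

/* Store constant offsets; reject anything outside [-32, 31] so that the
 * values are known to fit the int8_t storage. Nothing is written on failure. */
bool tex_set_tg4_offsets(struct tex_instr_sketch *tex,
                         const int offsets[4][2])
{
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 2; j++)
            if (offsets[i][j] < -32 || offsets[i][j] > 31)
                return false;

    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 2; j++)
            tex->tg4_offsets[i][j] = (int8_t)offsets[i][j];

    return true;
}

/* Offsets count as explicit only when they differ from the defaults. */
bool tex_has_explicit_tg4_offsets(const struct tex_instr_sketch *tex)
{
    return memcmp(tex->tg4_offsets, default_tg4_offsets,
                  sizeof(tex->tg4_offsets)) != 0;
}
```

Comparing against the defaults with `memcmp` rather than against zero reflects the v3 note: an all-zero array would itself be an explicit set of offsets.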
Dave Airlie b95b33a5c7 nir/deref: remove casts of casts which are likely redundant (v3)
Not sure how ptr_stride should be taken into account here, if at all.

v2: reorder check to avoid src walking (Jason)
v3: remove is_cast_cast checks, keep going afterwards (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-21 10:58:06 +10:00
Dave Airlie 3b3653c4cf nir/spirv: don't use bare types, remove assert in split vars for testing
For OpenCL we never want to strip the info from the types, and it makes
type comparisons easier in later stages. We might later need a nir pass to
strip this for GLSL, but so far the only regression is the assert and Jason
said removing that is fine.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-03-21 10:25:40 +10:00
Rafael Antognolli e7c8402163 iris: Let blorp update the clear color for us.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:26 -07:00
Rafael Antognolli 93123417dd iris: Track fast clear color.
v2: Update tracked clear color when we update the surface state.
v3: Update all aux surface states when updating the clear color.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:26 -07:00
Rafael Antognolli 5658c661de iris: Stall on the CPU and resolve predication during fast clears.
Only if the clear color/depth is changing. In those cases, it's hard to
keep track of the current clear color, and aux state of some layers,
when predication is enabled. So simplify everything by stalling on the
few cases where we would have a fast clear color change with
predication.

v2:
 - fix comment (Ken)
 - explicitly check for predicate state after resolving it (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:26 -07:00
Rafael Antognolli ce830a364e iris: Add iris_resolve_conditional_render().
This function can be used to stall on the CPU and resolve the predicate
for the conditional render. It will convert ice->state.predicate from
IRIS_PREDICATE_STATE_USE_BIT to either IRIS_PREDICATE_STATE_RENDER or
IRIS_PREDICATE_STATE_DONT_RENDER, depending on the result of the query.

v2:
 - return void (Ken)
 - update the stored condition (Ken)
 - simplify the code leading to resolve the predicate (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
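The state transition described above can be sketched in isolation. The enum, struct, and function below are hypothetical stand-ins, not the iris code: the sketch assumes the query result is already readable on the CPU after the stall, and that rendering proceeds when "result is non-zero" differs from the stored condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the three predicate states named in the message. */
enum predicate_state_sketch {
    PREDICATE_STATE_RENDER,
    PREDICATE_STATE_DONT_RENDER,
    PREDICATE_STATE_USE_BIT,
};

/* Hypothetical container for the pieces the resolve step needs. */
struct cond_render_sketch {
    enum predicate_state_sketch predicate;
    uint64_t query_result; /* assumed available on the CPU after the stall */
    bool condition;        /* stored condition of the conditional render */
};

/* Replace USE_BIT with a definite RENDER or DONT_RENDER decision.
 * Already-resolved states are left untouched. */
void resolve_conditional_render(struct cond_render_sketch *cr)
{
    if (cr->predicate != PREDICATE_STATE_USE_BIT)
        return;

    bool nonzero = cr->query_result != 0;

    /* Assumed rule: render when the result and the condition disagree. */
    cr->predicate = (nonzero != cr->condition)
                        ? PREDICATE_STATE_RENDER
                        : PREDICATE_STATE_DONT_RENDER;
}
```

After the call, callers only ever see one of the two definite states, so they do not need to reason about a GPU-side predicate bit.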
Rafael Antognolli 131b42f0aa iris: Implement fast clear color.
If all the restrictions are satisfied, do a fast clear instead of
regular clear.

v2:
 - add perf_debug() when we can't fast clear (Ken)
 - improve comment: s/miptree/resource/ (Ken)
 - use swizzle_color_value from blorp (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli bd6f51ec21 intel/blorp: Make swizzle_color_value public.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli d97eddff25 intel/isl: Add isl_format_has_color_component() function.
v2: Get luminance bits from luminance component (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli 7f6344a726 iris: Bring back check for srgb and fast clear color.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli a8b5ea8ef0 iris: Add function to update clear color in surface state.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli 32c8fa6411 iris: Add helper to convert fast clear color.
It needs to be converted to a value that can be used by ISL (and our
hardware SURFACE_STATE structure).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli 51638cf18a iris: Fast clear depth buffers.
Check and do a fast clear instead of a regular clear on depth buffers.

v3:
 - remove switch with some cases that we shouldn't worry about (Ken)
 - more parens into the has_hiz check (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli 34d00b4410 iris: Use the clear depth when emitting 3DSTATE_CLEAR_PARAMS.
Take the clear depth into account when IRIS_DIRTY_DEPTH_BUFFER is marked
as dirty.

Also update the blorp surface clear color.

v2: Use a single if (zres && zres->aux.bo) (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli 37f2692591 iris: Allocate buffer space for the fast clear color.
Also store clear color in the iris_resource.

Always allocate clear color state buffer.

v2:
 - Make clear_color_offset be 64 bits (Ken).
 - Simplify the logic to decide when to memset the aux buffer (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Bas Nieuwenhuizen 5f5ac19f13 radv: Implement VK_EXT_pipeline_creation_feedback.
Does what it says on the tin.

The per stage time is only an approximation due to linking and
the Vega merged stages.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-03-20 21:19:46 +00:00
Samuel Pitoiset 72e366b4c2 ac: use new LLVM 8 intrinsics in ac_build_buffer_store_dword()
New buffer intrinsics have a separate soffset parameter.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:19 +01:00
Samuel Pitoiset 9d960c17a8 ac: use new LLVM 8 intrinsic when storing 16-bit values
vindex is always 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:14 +01:00
Samuel Pitoiset 2a9d331898 ac: add ac_build_{struct,raw}_tbuffer_store() helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:12 +01:00
Samuel Pitoiset 30c2aca67f ac: use new LLVM 8 intrinsics in ac_build_buffer_load()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:08 +01:00
Samuel Pitoiset da46dbb1be ac/nir: use ac_build_buffer_store_dword() for SSBO store operations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:06 +01:00
Samuel Pitoiset 6b573c00c9 ac/nir: use ac_build_buffer_load() for SSBO load operations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:02 +01:00