Commit Graph

113548 Commits

Author SHA1 Message Date
Jason Ekstrand 799f0f7b28 nir/lower_subgroups: Properly lower masks when subgroup_size == 0
Instead of building a constant mask (which depends on knowing the
subgroup size), we build an expression.  Because the pass uses the
nir_shader_lower_instructions helper, subgroup lowering will be run on
any newly emitted instructions as well as the previously existing
instructions.  In particular, if the subgroup size is known, the newly
emitted subgroup_size intrinsic will get turned into a constant and a
later constant folding pass will clean it up.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-24 12:55:40 -05:00
Jason Ekstrand 256e6c2d94 vulkan: Update the XML and headers to 1.1.116
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-24 12:55:40 -05:00
Jason Ekstrand c84b8eeeac intel/compiler: Be more conservative about subgroup sizes in GL
The rules for gl_SubgroupSize in Vulkan require that it be a constant
that can be queried through the API.  However, all GL requires is that
it's a uniform.  Instead of always claiming that the subgroup size in
the shader is 32 in GL like we have to do for Vulkan, claim 8 for
geometry stages, the maximum for fragment shaders, and the actual size
for compute.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-24 12:55:40 -05:00
Jason Ekstrand 1981460af2 intel/compiler: Lower gl_SubgroupSize in postprocess_nir
Instead of lowering the subgroup size so early, wait until we have more
information.  In particular, we're going to want different subgroup
sizes from different stages depending on the API.  We also defer
lowering of subgroup masks because the ge/gt masks require the subgroup
size to generate a subgroup mask.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-24 12:55:40 -05:00
Jason Ekstrand f62227f2b7 intel/nir: Make brw_nir_apply_sampler_key more generic
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-24 12:55:40 -05:00
Sagar Ghuge 87cef718e1 nir: Add lowering for nir_op_irem and nir_op_imod
Tested on Gen > 9.

v2: 1) Fix lowering
    2) Keep a consistent i/u order (Matt Turner)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-24 10:33:09 -07:00
Yevhenii Kolesnikov 882fe09a74 main: Fix memleaks in mesa_use_program
Add freeing of SubroutineIndexes to the _mesa_free_shader_state.

Fixes: 4566aaaa5b ("mesa/subroutines: start adding per-context
subroutine index support (v1.1)")
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-07-24 16:54:21 +00:00
Andrii Simiklit fa2fc68de1 intel/compiler: don't use a keyword struct for a class fs_reg
warning: struct 'fs_reg' was previously declared as a class
Fixes: e64be391 ("intel/compiler: generalize the combine constants pass")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
2019-07-24 13:26:42 +00:00
Qiang Yu 280dfa02fa lima/ppir: fix disassembler temp read/write print
temp read/write use negtive offset, and handle
alignment==1 case.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
2019-07-24 20:39:39 +08:00
Eric Engestrom e7e31b18d6 gallium+mesa: fix tgsi_semantic array type
Fixes: ed23335a31 ("gallium: use enums in p_shader_tokens.h (v2)")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-07-24 09:33:29 +01:00
Eric Engestrom f986741a91 util: fix no-op macro (bad number of arguments)
Fixes: b8e077daee ("util: no-op __builtin_types_compatible_p() for non-GCC compilers")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-24 09:13:58 +01:00
Samuel Pitoiset 4389e85dc9 radv/gfx10: enable VK_EXT_transform_feedback
When a pipeline uses transform feedback, the driver fallbacks to
the legacy path because NGG support for streamout is a non-trivial
amount of work.

AMDVLK also uses the legacy path for streamout, while RadeonSI
uses the new NGG path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-24 08:23:37 +02:00
Samuel Pitoiset a3a4fa1860 radv/gfx10: do not enable NGG if a pipeline uses XFB
NGG GS for streamout requires a bunch of work, so enable it with
the legacy path only for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-24 08:23:34 +02:00
Samuel Pitoiset 09abe571a2 radv/gfx10: emit streamout shader config
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-24 08:23:32 +02:00
Samuel Pitoiset 383c2e625a radv/gfx10: declare streamout user SGPRs
Required for legacy streamout.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-24 08:23:30 +02:00
Samuel Pitoiset fd195d8085 radv/gfx10: update streamout descriptors
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-24 08:23:27 +02:00
Samuel Pitoiset ea337c8b7e radv/gfx10: fix VS input VGPRs with the legacy path
For some reasons, InstanceID is VGPR3 although StepRate0 is set to 1.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-24 08:23:21 +02:00
Dave Airlie 2631fd3b0b gallivm: rework lp_build_tgsi_soa to take a struct
The parameters were getting messy and I have to add a few more
for compute shaders, so clean it up before proceeding.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-07-24 09:20:09 +10:00
Jason Ekstrand 9700e45463 nir/lower_io: Return SSA defs from helpers
I can't find a single place where nir_lower_io is called after going out
of SSA which is the only real reason why you wouldn't do this. Returning
SSA defs is more idiomatic and is required for the next commit.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-23 17:48:49 -05:00
Dylan Baker 7cf50af6f5 meson: allow building all glx without any drivers
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111016
Fixes: a47c525f32
       ("meson: build glx")
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-23 15:34:23 -07:00
Jan Zielinski 3d6cffffcf swr/rasterizer: Fix 3D resource copies.
Ensure constant attributes stay constant with barycentric interpolation.

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-07-23 21:55:09 +02:00
Jan Zielinski ec4a5f5e13 swr/rasterizer: Fix return type on SIMD8 version of Clamp and Normalize utility functions
Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-07-23 21:55:09 +02:00
Jan Zielinski 47cdb0ac27 swr/rasterizer: small formatting changes
Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-07-23 21:55:09 +02:00
Jan Zielinski ccc6b4f96b swr/rasterizer: Adding support for unhandled clipEnable state
Clipping is not correctly handled by the rasterizer - fixing this.

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-07-23 21:55:09 +02:00
Bas Nieuwenhuizen e5b3f0a867 radv/gfx10: Enable binning.
Numbers for Talos:

gfx10 without binning: 77.0 77.7 77.2 77.6
gfx10 with binning: 82.3 82.0 82.7 82.4

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-23 21:26:59 +02:00
Bas Nieuwenhuizen 3268c806fb radv/gfx10: Implement bin size calculation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-23 21:26:59 +02:00
Bas Nieuwenhuizen 4b757697e9 radv/gfx9: Select between depth/color bins based on area.
Mirrors radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-07-23 21:26:59 +02:00
Bas Nieuwenhuizen 22f2f76789 radv: Generalize binning settings.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-07-23 21:26:59 +02:00
Bas Nieuwenhuizen 793cbf6161 radv/gfx10: Use new scan converter.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-23 21:26:59 +02:00
Bas Nieuwenhuizen 4058b354c5 radv: Set FLUSH_ON_BINNING_TRANSITION.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-23 21:26:59 +02:00
Bas Nieuwenhuizen 906fcfccfd radv: Use pbb_allow for framebuffer BREAK_BATCH.
Ported from radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-07-23 21:26:59 +02:00
Marek Olšák 264ab6ffcd radeonsi/nir: set tgsi_shader_info::uses_fbfetch for KHR_blend_equation_adv.
This doesn't implement the color buffer load.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-23 15:08:37 -04:00
Marek Olšák 45556731b6 tgsi/scan: add uses_fbfetch
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-23 15:08:37 -04:00
Marek Olšák ee858871bd radeonsi: fail if importing a texture with incorrect last_level or samples
v2: don't fail if the texture comes from an incompatible driver.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> (v1)
2019-07-23 15:08:27 -04:00
Marek Olšák f8b6c5a1a6 radeonsi: rewrite si_get_opaque_metadata, also for gfx10 support
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-23 15:03:51 -04:00
Marek Olšák e718f8e713 radeonsi: simplify si_get_input_prim and remove incorrect TODO comment
u_vertices_per_prim(QUADS) is the same as TRIANGLES.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-23 15:03:49 -04:00
Marek Olšák 16392cc3f3 radeonsi/gfx10: fix and enable CLEAR_STATE
it was a driver bug.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-23 15:03:47 -04:00
Marek Olšák ad642d5b3a radeonsi: stop using info.opcode_count[TGSI_OPCODE_INTERP_SAMPLE]
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-23 15:03:46 -04:00
Marek Olšák 6ac2146a98 ac/nir: implement nir_op_pack_{us}norm_2x16
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-23 15:03:44 -04:00
Pierre-Eric Pelloux-Prayer 079e5f73d7 mesa/st: rewrite src var when lowering tex_src_plane
The assign_extra_samplers() adds the needed extra samplers but they need
to be used in the nir_tex_instr.
Otherwise the plane information is simply lost and all nir_tex_instr use the
same sampler.
Here's an example of the bug:

NIR before st_nir_lower_tex_src_plane:
	vec1 32 ssa_8 = load_const (0x00000000 /* 0.000000 */)
	vec4 32 ssa_9 = tex ssa_0 (texture_deref), ssa_0 (sampler_deref), ssa_5 (coord), ssa_8 (plane)
	vec1 32 ssa_10 = load_const (0x00000001 /* 0.000000 */)
	vec4 32 ssa_11 = tex ssa_0 (texture_deref), ssa_0 (sampler_deref), ssa_5 (coord), ssa_10 (plane)

After:

	vec4 32 ssa_9 = tex ssa_0 (texture_deref), ssa_0 (sampler_deref), ssa_5 (coord)
	vec4 32 ssa_11 = tex ssa_0 (texture_deref), ssa_0 (sampler_deref), ssa_5 (coord)

This fixes the following piglit test for radeonsi + NIR:
  - ext_image_dma_buf_import-sample_nv12
  - ext_image_dma_buf_import-sample_yuv420
  - ext_image_dma_buf_import-sample_yvu420

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-07-23 15:00:43 -04:00
Pierre-Eric Pelloux-Prayer e9cf8c1d30 u_blitter: add a msaa parameter to util_blitter_clear
Fixes: ea5b7de138 ("radeonsi: make gl_SampleMaskIn = 0x1 when MSAA is disabled")

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-07-23 14:42:20 -04:00
Pierre-Eric Pelloux-Prayer d811446e6c u_blitter: enable msaa when dst num samples is > 1
Commit ea5b7de138 broke some piglit tests on radeonsi (Bonaire hardware).
This commit fixes half of the regression by enabling msaa if the dest surface has
more than 1 sample (instead of hardcoding it to false).

Fixes: ea5b7de138 ("radeonsi: make gl_SampleMaskIn = 0x1 when MSAA is disabled")

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-07-23 14:42:20 -04:00
Jason Ekstrand ae392d73c9 nir/gather_info: Look for uses of helper invocations
The one obvious omission here is gl_HelperInvocation itself.  However,
the spec doesn't require that we generate then when gl_HelperInvocation
is used, it merely mandates that we report them if they are there.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-23 13:40:41 -05:00
Jason Ekstrand 41ab92a327 nir/gather_info: Move setting uses_64bit out of the switch
Otherwise, as we add things to the switch, we're going to forget and add
some 64-bit op at some point in the future and it'll stop getting
flagged.  There's no reason why we can't do the check for derivatives.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-23 13:40:41 -05:00
Jason Ekstrand 0e6cb481fa nir: Add a nir_tex_instr_has_implicit_derivatives helper
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-23 13:40:41 -05:00
Jason Ekstrand 7a98c7804c nir: Move nir_alu_instr_is_comparison to the ALU section
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-23 13:40:41 -05:00
Rafael Antognolli 1f4cbc9a06 intel/genxml: Add new test for subgroups.
Make sure that a <group> tag within another <group> tag work just fine.

v2: rename 'halfbyte' to 'byte' to match the size (Lionel).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-07-23 17:45:19 +00:00
Rafael Antognolli fe5ae96d66 intel/genxml: Add basic infra for encoding/decoding unit tests.
Adding option to print quiet.

v2: Add license header.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-07-23 17:45:19 +00:00
Rafael Antognolli e25ebe2ec9 intel/gen_decoder: Decode <group> inside <group>.
Now we can decode a <group> tag inside another <group> tag, and properly
print its indices and content.

v2: Use push/pop stack to fields, groups and iters (Lionel).
v3: Add assert(iter->level < DECODE_MAX_ARRAY_DEPTH) (Lionel).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-07-23 17:45:19 +00:00
Rafael Antognolli f670c2e1ff intel/gen_decoder: Add the concept of array "levels".
We currently only support one level, which is the basic level of a
<group> tag.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-07-23 17:45:19 +00:00