Commit Graph

85931 Commits

Author SHA1 Message Date
Marek Olšák a2ea653a49 radeonsi: remove cb0_is_integer handling
st/mesa does this for us.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-19 19:26:30 +02:00
Marek Olšák 54f8efeb02 st/mesa: disable alpha-test, alpha-to-coverage, alpha-to-one for integer FBs
v2: rebased

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-10-19 19:26:30 +02:00
Marek Olšák c64da9d499 mesa: remove gl_shader_compiler_options::EmitNoNoise
it's always true

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-19 19:26:30 +02:00
Marek Olšák 2897cb3dba glsl_to_tgsi: remove code for fixing up TGSI labels
I don't know what this was supposed to do, but all TGSI labels were
always 0.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-19 19:26:30 +02:00
Marek Olšák ec35ff4e2b glsl_to_tgsi: remove subroutine support
Never used. The GLSL compiler doesn't even look at EmitNoFunctions.

v2: add back "return" support in "main"

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-19 19:26:30 +02:00
Marek Olšák eacda2c080 mesa_to_tgsi: remove remnants of flow control and subroutine support
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-19 19:26:30 +02:00
Marek Olšák 82f4c0126d mesa_to_tgsi: drop support for instructions that can't occur here
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-19 19:26:30 +02:00
Marek Olšák 4e42898d9d glsl_to_tgsi: allocate glsl_to_tgsi_instruction::tex_offsets on demand
sizeof(glsl_to_tgsi_instruction): 384 -> 264

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-19 19:26:30 +02:00
Marek Olšák 4d3d620f26 glsl_to_tgsi: merge buffer and sampler fields in glsl_to_tgsi_instruction
sizeof(glsl_to_tgsi_instruction): 416 -> 384

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-19 19:26:30 +02:00
Marek Olšák dbf64ea28b glsl_to_tgsi: reduce the size of glsl_to_tgsi_instruction using bitfields
sizeof(glsl_to_tgsi_instruction): 464 -> 416

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-19 19:26:30 +02:00
Marek Olšák 9015cbb3a3 glsl_to_tgsi: reduce the size of st_dst_reg and st_src_reg
I noticed that glsl_to_tgsi_instruction is too huge.

sizeof(glsl_to_tgsi_instruction): 752 -> 464 (-38%)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-19 19:26:30 +02:00
Marek Olšák 222c599b61 glsl_to_tgsi: remove unused st_translate::tex_offsets
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-19 19:26:30 +02:00
Marek Olšák 0d95eeb79c glsl_to_tgsi: remove unused parameters from calc_deref_offsets
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-19 19:26:30 +02:00
Marek Olšák 6980480052 glsl_to_tgsi: use array_id for temp arrays instead of hacking high bits
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-19 19:26:30 +02:00
Adam Jackson 4276b5c16a reviewers: Throw myself on the GLX grenade
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-10-19 12:37:22 -04:00
Eric Engestrom 8acb79dfac egl: bring back the default glapi.so name
Earlier commit replaced the default platform specific libglapi.so name
with an #error.

This may have been overzealous since the name is the correct for the BSD
platforms, at least. Reinstate the hunk - bringing back OpenBSD, et al.
to a successful build state.

Fixes: 7a9c92d071 ("egl/dri2: non-shared glapi cleanups")
[Emil Velikov: format the patch from Eric, add commit message and tag.]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-10-19 15:09:26 +01:00
Iago Toral Quiroga 66d8bd3b7e i965: fix subnr overflow in suboffset()
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-10-19 11:48:21 +02:00
Dave Airlie 86c4575a81 radv: decompress fmask before reading using texture unit
Before we can read the fmask using the compute shader, we need
to decompress the fmask in place.

This fixes a bunch of remaining failure and hopefully multisampling
in Talos.
2016-10-19 17:39:47 +10:00
Dave Airlie 67c91ef2a2 radv: fix samples_identical return value.
This was returning an inversion, so not doing as it should have.

We need to compare the fmask value with 0, and return the result
from that.
2016-10-19 17:39:01 +10:00
Dave Airlie 93ba86c307 radv: fix wsi porting regression in swapchain destroy.
The code in anv is right, there's a pending patch to fix this up
different, but I'll sync the code for now.
2016-10-19 13:54:49 +10:00
Dave Airlie 63406b669e radv: fix fmask ptr issue
We were using the wrong descriptor in the fmask picking code.
2016-10-19 13:16:25 +10:00
Dave Airlie db7ae14b60 radv: simplify fast clear shaders
There is no need for anything but a noop shader here.
2016-10-19 13:16:14 +10:00
Dave Airlie 1ec5e6e702 vulkan/wsi: fix out of tree build. 2016-10-19 10:54:42 +10:00
Dave Airlie b0e11a153c radv: start using defines for the user sgpr offsets
This adds some comments and adds defines for the user sgprs,
so that we can move them around easier later and not have
to change/revalidate every one of these.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-19 10:17:48 +10:00
Dave Airlie 6c3bd1cdb3 radv: port to common wsi codebase
This drops all the radv WSI code in favour of using
the new shared code that was ported from anv

This regresses Talos for now, Jason has pointed out
the bug is in Talos and we should wait for them to fix it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:43 +10:00
Dave Airlie 3f7ef24889 anv: move to using shared wsi code
This moves the shared code to a common subdirectory
and makes anv linked to that code instead of the copy
it was using.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:43 +10:00
Dave Airlie ec0bc14a70 anv/wsi: remove all anv references from WSI common code
the WSI code should be now be clean for sharing.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:43 +10:00
Dave Airlie 971523410f anv: move common wsi code to x11/wayland common files.
Next task is to rename all the anv_ out of this,
and move to a common location

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:43 +10:00
Dave Airlie e0d15fbe1d anv/wsi/wayland: add callback to get device format properties.
This avoids having to know the toplevel API name.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:43 +10:00
Dave Airlie 4392de6771 anv/wsi/wl: stop using device in more places
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:43 +10:00
Dave Airlie 507722b882 anv/wsi: split out surface creation to avoid instance API
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:43 +10:00
Dave Airlie 954cd09e66 anv/wsi: move further away from passing anv displays around
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:43 +10:00
Dave Airlie 1720bbd353 anv/wsi: split image alloc/free out to separate fns.
This moves these outside the wsi platform code, so we can reuse
that code

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:43 +10:00
Dave Airlie 828b8dbce4 anv/wsi: switch to using VkDevice in swapchain
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:42 +10:00
Dave Airlie 6542001345 anv/wsi/x11: more refactoring to use generic handles
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:42 +10:00
Dave Airlie 340e72f056 anv/wsi/x11: start refactoring out the image allocation/free functionality
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:42 +10:00
Dave Airlie c264c272a5 anv/wsi: drop device from get format
Just use the wsi_device instead.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:42 +10:00
Dave Airlie 467d161e6a anv/wsi: remove device from get_support interface
replace with wsi_device and allocator.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:42 +10:00
Dave Airlie b8e7460563 anv/wsi/x11: abstract WSI interface from internals.
This allows the API and the internals to be split, and the
internals shared.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:42 +10:00
Dave Airlie 36e6be2e0d anv/wsi/x11: push anv_device out of the init/finish routines
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:42 +10:00
Dave Airlie 7c10258567 anv/wsi: abstract wsi interfaces away from device a bit more.
This is a step towards separating out the wsi code for sharing

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:42 +10:00
Dave Airlie be61fff6da anv/wsi/x11: push device out of x11 connection fns.
just pass the allocator/wsi_interface instead.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:42 +10:00
Dave Airlie e9cf7c4460 anv/wsi: drop device from get caps
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:42 +10:00
Dave Airlie 0e4abc3e10 anv/wsi: drop get present modes device arg
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:42 +10:00
Dave Airlie 32d70c0d66 radv/anv/wsi: drop unneeded parameter
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-19 10:15:42 +10:00
Roland Scheidegger aeceec54a8 draw: improve vertex fetch (v2)
The per-element fetch has quite some calculations which are constant,
these can be moved outside both the per-element as well as the main
shader loop (llvm can figure out it's constant mostly on its own, however
this can have a significant compile time cost).
Similarly, it looks easier swapping the fetch loops (outer loop per attrib,
inner loop filling up the per vertex elements - this way the aos->soa
conversion also can be done per attrib and not just at the end though again
this doesn't really make much of a difference in the generated code). (This
would also make it possible to vectorize the calculations leading to the
fetches.)
There's also some minimal change simplifying the overflow math slightly.
All in all, the generated code seems to look slightly simpler (depending
on the actual vs), but more importantly I've seen a significant reduction
in compile times for some vs (albeit with old (3.3) llvm version, and the
time reduction is only really for the optimizations run on the IR).
v2: adapt to other draw change.

No changes with piglit.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-10-19 01:44:59 +02:00
Roland Scheidegger 0942fe548e draw: improved handling of undefined inputs
Previous attempts to zero initialize all inputs were not really optimal
(though no performance impact was measurable). In fact this is not really
necessary, since we know the max number of inputs used.
Instead, just generate fetch for up to max inputs used by the shader,
directly replacing inputs for which there was no vertex element by zero.
This also cleans up key generation, which previously would have stored
some garbage for these elements.
And also drop the assertion which indicates such bogus usage by a
debug_printf (the whole point of initializing the undefined inputs was to
make this case safe to handle).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-10-19 01:44:59 +02:00
Roland Scheidegger d1b4a3451e gallivm: print out time for jitting functions with GALLIVM_DEBUG=perf
Compilation to actual machine code can easily take as much time as the
optimization passes on the IR if not more, so print this out too.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-10-19 01:44:59 +02:00
Roland Scheidegger 6f2f0daeb4 gallivm: Use native packs and unpacks for the lerps
For the texturing packs, things looked pretty terrible. For every
lerp, we were repacking the values, and while those look sort of cheap
with 128bit, with 256bit we end up with 2 of them instead of just 1 but
worse, plus 2 extracts too (the unpack, however, works fine with a
single instruction, albeit only with llvm 3.8 - the vpmovzxbw).

Ideally we'd use more clever pack for llvmpipe backend conversion too
since we actually use the "wrong" shuffle (which is more work) when doing
the fs twiddle just so we end up with the wrong order for being able to
do native pack when converting from 2x8f -> 1x16b. But this requires some
refactoring, since the untwiddle is separate from conversion.

This is only used for avx2 256bit pack/unpack for now.

Improves openarena scores by 8% or so, though overall it's still pretty
disappointing how much faster 256bit vectors are even with avx2 (or
rather, aren't...). And, of course, eliminating the needless
packs/unpacks in the first place would eliminate most of that advantage
(not quite all) from this patch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-10-19 01:44:59 +02:00
Dave Airlie 7e1e06bc75 anv: drop pointless struct decl.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-19 09:05:26 +10:00