81635 Commits

Author SHA1 Message Date
Kenneth Graunke dac10e8a13 i965, anv: Use NIR FragCoord re-center and y-transform passes.
This handles gl_FragCoord transformations and other window system vs.
user FBO coordinate system flipping by multiplying/adding uniform
values, rather than recompiles.

This is much better because we have no decent way to guess whether
the application is going to use a shader with the window system FBO
or a user FBO, much less the drawable height.  This led to a lot of
recompiles in many applications.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-20 14:30:08 -07:00
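
The idea behind dac10e8a13 can be sketched in plain C: the orientation of the render target is folded into a pair of values computed at draw time, so the same compiled shader serves both window-system and user FBOs. This is a minimal sketch, not the i965 or NIR code; the struct, field names, and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical uniform pair; names are illustrative, not Mesa's. */
struct wpos_transform {
   float y_scale;   /* +1 keeps the orientation, -1 flips it */
   float y_bias;    /* drawable height when flipping, else 0 */
};

/* Computed on the CPU at draw time, once the bound FBO is known. */
static struct wpos_transform
wpos_transform_for(bool flip_y, float drawable_height)
{
   struct wpos_transform t;
   t.y_scale = flip_y ? -1.0f : 1.0f;
   t.y_bias = flip_y ? drawable_height : 0.0f;
   return t;
}

/* What the lowered shader would evaluate per fragment:
 * a separate multiply and add on uniform values. */
static float
transform_frag_y(float y, struct wpos_transform t)
{
   return y * t.y_scale + t.y_bias;
}
```

Changing the FBO or the drawable height only changes the two values, never the shader code, which is why no recompile is needed.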
Kenneth Graunke 6e5d86c07a nir: Add a simple nir_lower_wpos_center() pass for Vulkan drivers.
nir_lower_wpos_ytransform() is great for OpenGL, which allows
applications to choose whether their coordinate system's origin is
upper left/lower left, and whether the pixel center should be on
integer/half-integer boundaries.

Vulkan, however, has much simpler requirements: the pixel center
is always half-integer, and the origin is always upper left.  No
coordinate transform is needed - we just need to add <0.5, 0.5>.
This means that we can avoid using (and setting up) a uniform.

I thought about adding more options to nir_lower_wpos_ytransform(),
but making a new pass that never even touched uniforms seemed simpler.

v2: Use normal iterator rather than _safe variant (noticed by Matt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Rob Clark <robdclark@gmail.com>
2016-05-20 14:30:00 -07:00
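
As a rough illustration of what 6e5d86c07a describes, the Vulkan case reduces to adding a constant to x and y, with no uniform involved. The struct and function below are hypothetical stand-ins, not the NIR pass itself.

```c
/* Hypothetical four-component coordinate, standing in for gl_FragCoord. */
struct frag_coord {
   float x, y, z, w;
};

/* Move the coordinate to the half-integer pixel center by adding <0.5, 0.5>.
 * z and w are left untouched. */
static struct frag_coord
recenter_frag_coord(struct frag_coord c)
{
   c.x += 0.5f;
   c.y += 0.5f;
   return c;
}
```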
Kenneth Graunke 12ab7fc6ac nir: Don't use ffma in nir_lower_wpos_ytransform().
ffma is an explicitly fused multiply add with higher precision.
The optimizer will take care of promoting mul/add to fma when
it's beneficial to do so.

This fixes failures on Gen4-5 when using this pass, as those platforms
don't actually implement fma().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-20 14:29:04 -07:00
Kenneth Graunke b8b1b1c34c nir: Handle fddy_fine and fddy_coarse in nir_lower_wpos_ytransform.
These also need flipping!

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-05-20 14:29:04 -07:00
Kenneth Graunke 4b7577fad8 nir: Make lower_wpos_ytransform_block a void function.
The return value was used for the old nir_foreach_block callback system,
but at this point it no longer means anything.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-05-20 14:29:04 -07:00
Kenneth Graunke 88ea960aa7 nir: Make nir_lower_wpos_ytransform() match FragCoord by location.
gl_FragCoord is a shader input with location == VARYING_SLOT_POS.
ARB_fragment_programs have an equivalent input at VARYING_SLOT_POS,
but it isn't called gl_FragCoord.  We do want to transform it.

Matching by location guarantees we catch both.

Fixes several fp tests on a branch which uses this pass on i965.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-05-20 14:29:04 -07:00
Kenneth Graunke c9192fcbd2 nir: Add interp_var_at_offset flipping.
The Y-offset needs flipping as well, similar to ddy.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-05-20 14:29:04 -07:00
Kenneth Graunke 287f099db1 nir: Fix fddy swizzles in nir_lower_wpos_ytransform().
The original value might have been swizzled.  That's taken care of in
the fmul source - we don't want to reswizzle it again.

Fixes validation failures in glsl-derivs-varyings on a branch of mine
which uses this pass in i965.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-05-20 14:29:04 -07:00
Kenneth Graunke 7fe9a19302 nir: Fix wpos_ytransform lowering state_slot swizzle.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-05-20 14:28:30 -07:00
Kenneth Graunke 1539009bf0 i965: Fix brw_regs_equal() for NaN and positive/negative zero.
We'd like the comparisons to mean "the exact same bits".  Comparing
doubles won't do that for NaN values or positive vs. negative zero.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-20 14:28:06 -07:00
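
The distinction drawn in 1539009bf0 is easy to reproduce outside the driver. The sketch below is not brw_regs_equal(); it only contrasts a floating-point comparison with a comparison of the underlying bytes, using two hypothetical helpers.

```c
#include <math.h>
#include <stdbool.h>
#include <string.h>

/* Floating-point comparison: NaN never equals itself, and +0.0 == -0.0. */
static bool
same_value(double a, double b)
{
   return a == b;
}

/* Bitwise comparison: true only when both doubles have the exact same bits. */
static bool
same_bits(double a, double b)
{
   return memcmp(&a, &b, sizeof(a)) == 0;
}
```

With these helpers, a NaN compared against itself fails same_value() but passes same_bits(), while 0.0 and -0.0 pass same_value() but fail same_bits().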
Dave Airlie b19a0d506d virgl: handle cull distance cap.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-21 06:19:54 +10:00
Rob Herring 2235b80f2a virgl: Add missing texture transfer_inline_write
transfer_inline_write cannot be NULL and the virgl renderer doesn't support
inline writes for textures, so add the default version.

This fixes a crash in st_TexSubImage since commit fb9fe352ea ("st/mesa:
use transfer_inline_write for memcpy TexSubImage path").

Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-21 06:07:18 +10:00
Kristian Høgsberg Kristensen 12dc89d844 anv: Merge in my TODO list items
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
2016-05-20 10:35:57 -07:00
Matt Turner 015f2207cf mesa: Replace uses of Shared->Mutex with hash-table mutexes
We were locking the Shared->Mutex and then calling functions like
_mesa_HashInsert that do additional per-hash-table locking internally.

Instead just lock each hash-table's mutex and use functions like
_mesa_HashInsertLocked and the new _mesa_HashRemoveLocked.

In order to do this, we need to remove the locking from
_mesa_HashFindFreeKeyBlock since it will always be called with the
per-hash-table lock taken.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-20 10:05:09 -07:00
Matt Turner aded1160e5 hash: Add _mesa_HashRemoveLocked() function.
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-20 10:05:09 -07:00
Matt Turner fb5dcb81cc i965: Pass nir_src/nir_dest by reference.
Cuts 6K of .text.

   text    data     bss     dec     hex filename
5772372  264648   29320 6066340  5c90a4 lib/i965_dri.so before
5766074  264648   29320 6060042  5c780a lib/i965_dri.so after

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-20 10:04:06 -07:00
Mark Janes 9ca5ec2a31 glsl: Guard against NULL dereference
This trivially corrects mesa 3ca1c221, which introduced a check that
crashes when a match is not found.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005
Fixes: piglit.spec.glsl-1_50.compiler.interface-blocks-name-reused-globally-4.vert
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-05-20 09:52:49 -07:00
Nanley Chery 9b8c4000d0 anv: Enable textureCompressionASTC_LDR on Gen9+
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-20 09:27:11 -07:00
Nanley Chery 0d2847e177 anv/format: Reorder ASTC mappings to match ISL enum ordering
Keep the lists consistent for ease of use.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-20 09:27:11 -07:00
Nanley Chery f3ed3a0a15 genxml: Expand SKL's SurfaceFormat field width for ASTC
In the expanded field, only ASTC format enums have the MSB set to 1.
Expanding the field width makes the process of handling these formats
identical to the way other formats are handled.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-20 09:27:11 -07:00
Nanley Chery a141576887 isl: Handle npot ASTC block dimensions on Gen9+
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-20 09:27:11 -07:00
Nanley Chery de86fb875d isl: Add 2D ASTC format layouts and enums
Also, make changes needed for successful compilation and registration
as a texture compression mode.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-20 09:27:11 -07:00
Youry Metlitsky 4e2c9a0435 mesa: Build EGL without X11 headers after interop patchset
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-20 08:44:18 -07:00
Rob Clark df361fc58c nir/validate: assume() that hashtable entry exists
At this point, it would require a logic error in nir_validate to not
have already populated this hashtable entry, but coverity doesn't
realize that:

CID 1265547 (#1 of 1): Dereference null return value (NULL_RETURNS)
3. dereference: Dereferencing a null pointer entry.

CID 1271039 (#1 of 1): Dereference null return value (NULL_RETURNS)
3. dereference: Dereferencing a null pointer entry.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-20 11:13:50 -04:00
Rob Clark fcd6b3f42b nir: coverity uninitialized pointer read
Not sure how coverity arrives at the conclusion that we can read comp[j]
uninitialized (around line 204), other than not being aware that ncomp is
greater than 1 so it won't underflow in the 'if (tex->is_array)' case.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-20 11:13:50 -04:00
Rob Clark 53c48feae0 nir: coverity sign-extension fix
Not 100% sure, but I think being an unsigned literal will help:

CID 1358505 (#1 of 1): Unintended sign extension (SIGN_EXTENSION)
sign_extension: Suspicious implicit sign extension:
load1->def.num_components with type unsigned char (8 bits, unsigned) is
promoted in load1->def.num_components * (load1->def.bit_size / 8) to
type int (32 bits, signed), then sign-extended to type unsigned long (64
bits, unsigned). If load1->def.num_components * (load1->def.bit_size /
8) is greater than 0x7FFFFFFF, the upper bits of the result will all be
1.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-20 11:13:50 -04:00
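
The promotion chain in the Coverity report for 53c48feae0 can be shown with a small sketch. The helper names are hypothetical, and load_size_bytes() only models the shape of the reported expression with an unsigned literal; it is not the actual NIR code.

```c
#include <stdint.h>

/* A signed 32-bit value widened to 64 bits is sign-extended. */
static uint64_t
widen_signed(int32_t v)
{
   return (uint64_t)v;
}

/* An unsigned 32-bit value widened to 64 bits is zero-extended. */
static uint64_t
widen_unsigned(uint32_t v)
{
   return (uint64_t)v;
}

/* Hypothetical size computation. The 8u literal makes the division, and
 * therefore the product, unsigned int, so widening zero-extends. */
static uint64_t
load_size_bytes(uint8_t num_components, uint8_t bit_size)
{
   return num_components * (bit_size / 8u);
}
```

With 8-bit operands the product can never approach 0x7FFFFFFF, so the unsigned literal changes the types involved rather than any result that can occur in practice.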
Rob Clark bb993da795 nir/glsl_to_nir: quell some uninit_member coverity errors
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Acked-by: Matt Turner <mattst88@gmail.com>
2016-05-20 11:13:50 -04:00
Rob Clark 3a1bbd6a0a freedreno/ir3: need to lower fmod too
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-20 11:13:50 -04:00
Mark Janes a2d28ddc01 i965: Fix strerror error code sign
This trivial fix to error-handling corrects the sign of drm error
codes before passing them to strerror.

Identified by Coverity: CID1358581
2016-05-20 05:58:18 -07:00
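
A minimal sketch of the kind of fix a2d28ddc01 describes, assuming the error codes follow the negative-errno convention. The helper name is hypothetical and this is not the i965 code.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: ret is a negative errno-style code such as -ENOMEM.
 * strerror() expects the positive errno value, so negate before the call. */
static const char *
drm_error_string(int ret)
{
   return strerror(-ret);
}
```

Passing the negative code straight to strerror() typically produces an "Unknown error" string rather than the intended description.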
Jason Ekstrand eb384daae8 nir/spirv: Handle the NonReadable decoration on struct members
2016-05-19 21:18:59 -07:00
Jason Ekstrand ea8c11fdc2 anv/pipeline: Bounds-check resource indices when robuts_buffer_access is enabled
2016-05-19 21:18:59 -07:00
Jason Ekstrand 902628bce6 anv/pipeline: Only do buffer bounds checks if robustBufferAccess is enabled
2016-05-19 21:18:59 -07:00
Jason Ekstrand 23090b51e0 anv/apply_dynamic_offsets: Use rewrite_src instead of a regular assignment
Originally we removed the instruction, changed the source, and then
re-inserted it.  This works, but nir_instr_rewrite_src is a bit more
obviously correct.
2016-05-19 21:18:59 -07:00
Jason Ekstrand c29ffea6d1 anv/device: Add a boolean for robust buffer access
2016-05-19 21:18:59 -07:00
Jason Ekstrand d5b4638d6a anv: Add a TODO file
2016-05-19 20:09:31 -07:00
Dave Airlie 3ca1c2216d glsl: handle same struct redeclaration (v2)
This works around a bug in older versions of UE4, where a shader
defines the same structure twice. Although we aren't sure this is correct
GLSL (it most likely isn't), there are enough UE4-based things out there
that we should deal with this.

This drops the error to a warning if the struct names and contents match.

v1.1: do better C++ on record_compare declaration (Rob)
v2: restrict this to desktop GL only (Ian)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-20 11:22:52 +10:00
Matt Turner 8a65b5135a i965/fs: Recognize and emit ld_lz, sample_lz, sample_c_lz.
Ken suggested that, instead of a big and complicated optimization pass,
we just recognize the operations here. It's certainly less code and a lot
prettier, but it seems to actually perform worse for currently unknown
reasons.

total instructions in shared programs: 8923452 -> 8904108 (-0.22%)
instructions in affected programs: 814563 -> 795219 (-2.37%)
helped: 3336
HURT: 10

total cycles in shared programs: 66970734 -> 66651476 (-0.48%)
cycles in affected programs: 10582686 -> 10263428 (-3.02%)
helped: 2438
HURT: 691

total spills in shared programs: 1811 -> 1789 (-1.21%)
spills in affected programs: 85 -> 63 (-25.88%)
helped: 4

total fills in shared programs: 3143 -> 3109 (-1.08%)
fills in affected programs: 167 -> 133 (-20.36%)
helped: 4

LOST:   2
GAINED: 36

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-19 17:27:49 -07:00
Matt Turner 75dccf5ac2 i965: Add infrastructure for sample lod-zero operations.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-19 17:27:49 -07:00
Matt Turner 07353599e0 i965/fs: Add and use get_nir_src_imm().
The next patch wants to inspect the LOD argument and do something
different if it's 0.0f. But at that point we've emitted a MOV for it and
we just have a register to look at.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-19 17:27:49 -07:00
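
The motivation given in 07353599e0 can be illustrated with a toy model: the choice of a lod-zero operation is only possible while the LOD is still visible as an immediate, not after it has been moved into a register. The opcodes, struct, and function below are hypothetical and do not reflect the i965 backend's types.

```c
#include <stdbool.h>

/* Hypothetical opcodes: a sample with explicit LOD, and its lod-zero form. */
enum sample_op { SAMPLE_L, SAMPLE_LZ };

/* Hypothetical source operand: either a known constant or a register. */
struct lod_src {
   bool is_immediate;
   float value;      /* valid only when is_immediate is true */
};

/* Pick the lod-zero form only when the LOD is a constant 0.0f. */
static enum sample_op
choose_sample_op(struct lod_src lod)
{
   if (lod.is_immediate && lod.value == 0.0f)
      return SAMPLE_LZ;
   return SAMPLE_L;
}
```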
Ilia Mirkin 8bf5493899 nvc0: account for shader-allocated local memory needs
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-19 20:20:23 -04:00
Ilia Mirkin 5c6b8cc7d0 nv50/ir: treat addresses as local
Address registers are always loaded right before use. Don't treat them
as "global", which will cause them to be put into the function's
linkage, and will make the register allocator hold onto that
register until the end of the function.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-19 20:20:23 -04:00
Tim Rowley 65c2abf6fd swr: [rasterizer] utility functions for shared libs
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:27:18 -05:00
Tim Rowley 6deb9f7f2c swr: [rasterizer jitter] fix assert in AVX implementation of MASKLOADD
llvm changed the mask type to vector of ints with 3.8.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:27:12 -05:00
Tim Rowley 600528168b swr: [rasterizer core] apply KNOB_TOSS_DRAW to more functions
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:27:06 -05:00
Tim Rowley 6d212cccf0 swr: [rasterizer jitter] add instancing to non-gather fetch path
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:27:01 -05:00
Tim Rowley 63d7ed835a swr: [rasterizer core] move MultisampleTrait static from header to cpp
Move a MultisampleTrait static from header to cpp as clang seemed to get
confused with some specializations in the header vs some in cpp.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:26:54 -05:00
Tim Rowley c969ef2d42 swr: [rasterizer core] clang override for _mm_undefined*
Not supported in older xcode versions.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:26:49 -05:00
Tim Rowley da75160039 swr: [rasterizer common] add OSX to unix portability sections
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:26:44 -05:00
Tim Rowley 4997169779 swr: [rasterizer] rename _aligned_malloc to AlignedMalloc
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:26:38 -05:00
Tim Rowley 2e4ef23523 swr: [rasterizer jitter] rename MEMCPY function to MEMCOPY
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:26:30 -05:00