Commit Graph

66212 Commits

Author SHA1 Message Date
Eric Anholt 8c7ac377b7 vc4: Reuse uniform_data/contents indices when making uniforms.
This allows vc4_opt_cse.c to CSE-away operations involving the same
uniform values.

total instructions in shared programs: 37341 -> 36906 (-1.16%)
instructions in affected programs:     10233 -> 9798 (-4.25%)
total uniforms in shared programs: 10523 -> 10320 (-1.93%)
uniforms in affected programs:     2467 -> 2264 (-8.23%)
2014-10-24 18:04:26 +01:00
Eric Anholt 18ccda7b86 vc4: When asked to discard-map a whole resource, discard it.
This saves a bunch of extra flushes when texsubimaging a whole texture
that's been used for rendering, or subdataing a whole BO.  In particular,
this massively reduces the runtime of piglit texture-packed-formats (when
the probes have been moved out of the inner loop).
2014-10-24 18:04:26 +01:00
Eric Anholt a71c3b885a vc4: Refactor flushing before mapping a BO.
I'm going to want to make some other decisions here before flushing.
2014-10-24 18:04:26 +01:00
Eric Anholt 52824811b9 vc4: Allow dead code elimination of unused varyings.
total instructions in shared programs: 39022 -> 37341 (-4.31%)
instructions in affected programs:     26979 -> 25298 (-6.23%)
total uniforms in shared programs: 11242 -> 10523 (-6.40%)
uniforms in affected programs:     5836 -> 5117 (-12.32%)
2014-10-24 18:04:26 +01:00
Eric Anholt 5d32e26335 vc4: Add debug output to match shaderdb info to program dumps.
I'm going to be using VC4_DEBUG=shaderdb,norast to do shaderdb stats, but
when debugging regressions, I want to match shaderdb output to shader
disassembly.
2014-10-24 18:04:26 +01:00
Andreas Boll 14bdcc6ff9 radeon: enable Hyper-Z on r600g and radeonsi by default
This reverts commit 01e6371149.
Since then many Hyper-Z issues have been fixed or worked around.

Enable Hyper-Z by default so that we get enough feedback for the upcoming
mesa 10.4 release.

If you have issues with Hyper-Z try to disable Hyper-Z using the enviroment
variable R600_DEBUG=nohyperz and please report the issue on the bugtracker.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75011
See also: https://bugs.freedesktop.org/show_bug.cgi?id=75112

Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-24 09:11:51 +02:00
Matt Turner 76f27a6b03 i965: Silence unused variable warning. 2014-10-23 16:20:07 -07:00
Matt Turner 40492be2a4 i965/fs: Silence uninitialized variable warning.
The compiler isn't privy to the knowledge that we're doing at least one
framebuffer write.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-10-23 16:20:07 -07:00
Matt Turner 2695891088 util: Add assume() macro.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-23 16:20:07 -07:00
Jan Vesely bbe93161e7 glapi: Fix compiler warning and script name
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-23 16:03:16 +01:00
Rob Clark 4f1fec6060 Revert "freedreno/a3xx: only emit dirty consts"
This reverts commit 94bb33617d.

Which somehow broke gnome-shell.. and needs more investigation.  For
now, revert..
2014-10-23 10:46:51 -04:00
Rob Clark 6eabc11936 freedreno: fix PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE
fd_bo_cpu_prep() doesn't realize the bo is already referenced in
unflushed cmdstream.  It could be made to do so (but would have to be
implemented twice, ie. both for msm and kgsl).  But we still can't do
the expected thing if the caller isn't using _NOSYNC.  Because of the
way the tiling works, we need to build quite a bit of cmdstream at flush
time, which is not possible to do at the libdrm level.

So rather than trying to make fd_bo_cpu_prep() smarter than it can
possibly be, just *always* discard and reallocate if the
PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE flag is set.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-23 10:46:51 -04:00
Jan Vesely ab53830b95 clover: Require libelf
v2: test for libelf once, check in both radeon and clover

CC: Tom Stellard <tom@stellard.net>
CC: Emil Velikov <emil.l.velikov@gmail.com>
CC: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-23 15:19:00 +01:00
Emil Velikov b4039cf15a clover: use correct typenames for compat::pair's first/second
Seems to be a typo judging from the overall declaration of the
template.

Cc: EdB <edb+mesa@sigluy.net>
Cc: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-23 15:18:12 +01:00
Emil Velikov c63eb5dd5e auxiliary/os: get the mmap/munmap wrappers working with android
- Use macro for munmap under Android - the STATIC_ASSERT uses
a off_t which is not used under Android for mmap. As loff_t size
does not vary as does off_t just ignore the assert.

 - Wrap the long lines to improve readability.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-23 15:18:11 +01:00
Mauro Rossi 417b17378a gallium/nouveau: fully build the driver under android
Fix the trivial typo in the variable name.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-10-23 15:18:11 +01:00
Alon Levy d897e7c34a mesa/shaderimage.c: fix inconsistent sign warning
Signed-off-by: Alon Levy <alevy@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-10-23 14:45:41 +01:00
Alon Levy 501baa6bbb wgl: stw_pixelformat_get_info: correct type for index variable
Signed-off-by: Alon Levy <alevy@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-10-23 14:45:40 +01:00
Alon Levy 23080e49c4 u_math.h: fix 64 to 32 bit truncation warning
Signed-off-by: Alon Levy <alevy@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-10-23 14:45:40 +01:00
José Fonseca 75ad4fe78e gallivm: Fix build with LLVM 3.3.
The setMCJITMemoryManager method doesn't exist in LLVM 3.3.

I thought I had tested the latest version of my earlier change with LLVM
3.3, but it looks I missed it.

Trivial.
2014-10-23 10:42:12 +01:00
José Fonseca 065256dfc4 gallivm: Properly update for removal of JITMemoryManager in LLVM 3.6.
JITMemoryManager was removed in LLVM 3.6, and replaced by its base class
RTDyldMemoryManager.

This change fixes our JIT memory managers specializations to derive from
RTDyldMemoryManager in LLVM 3.6 instead of JITMemoryManager.

This enables llvmpipe to run with LLVM 3.6.

However, lp_free_generated_code is basically a no-op because there are
not enough hook points in RTDyldMemoryManager to track and free the code
of a module.  In other words, with MCJIT, code once created, stays
forever allocated until process destruction.  This is not speicfic to
LLVM 3.6 -- it will happen whenever MCJIT is used regardless of version.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-10-23 10:19:33 +01:00
José Fonseca 3fd220e2eb gallivm: Fix white-space.
Replace tabs with spaces.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-10-23 10:19:33 +01:00
José Fonseca 013ff2fae1 gallivm,llvmpipe,clover: Bump required LLVM version to 3.3.
We'll need to update gallivm for the interface changes in LLVM 3.6, and
the fewer the number of older LLVM versions we support the less hairy that
will be.

As consequence HAVE_AVX define can disappear.  (Note HAVE_AVX meant
whether LLVM version supports AVX or not.  Runtime support for AVX is
always checked and enforced independently.)

Verified llvmpipe builds and runs with with LLVM 3.3, 3.4, and 3.5.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-10-23 10:18:56 +01:00
Ilia Mirkin 9ad80d1d18 mesa: remove conditional render and rgtc from ES3 requirements
The functionality exposed by those extensions does not appear in ES3

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-23 00:45:08 -04:00
Brian Paul c9a6ec1978 u_blitter: put a comment on util_blitter_cache_all_shaders()
Trivial.
2014-10-22 17:33:40 -06:00
Brian Paul f82a84c097 u_blitter: use ctx->bind_fs_state(), not pipe->bind_fs_state()
Consistently use the function pointer we saved earlier.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-22 17:33:40 -06:00
Brian Paul 0bcd9f5469 u_blitter: create basic fs shaders in util_blitter_cache_all_shaders()
We need to create all fs shaders in this function.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-22 17:33:40 -06:00
Brian Paul 27de89d266 u_blitter: do error checking assertions for shader caching
If the user calls util_blitter_cache_all_shaders() set a flag and assert
that we never try to create any new fragment shaders after that point.
If the assertions fails, it means we missed generating some shader in
util_blitter_cache_all_shaders().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-22 17:33:40 -06:00
Anuj Phogat 7a652c41b4 glsl: Use signed array index in update_max_array_access()
Avoids a crash in case of negative array index is used in a
shader program.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-22 16:13:37 -07:00
Anuj Phogat 6f0089e92e glsl: Fix crash due to negative array index
Currently Mesa crashes with a shader like this:

[fragmnet shader]
float[5] array;
int idx = -2;
void main()
{
   gl_FragColor = vec4(0.0, 1.0, 0.0, array[idx]);
}

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-22 16:13:37 -07:00
Marek Olšák 8ec40adf7e radeonsi: implement pipe_rasterizer_state::clip_halfz
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-10-22 21:05:00 +02:00
Marek Olšák a3591da1a0 r600g: implement pipe_rasterizer_state::clip_halfz
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-10-22 21:04:58 +02:00
Marek Olšák 8ddd2f7aee r300g: implement pipe_rasterizer_state::clip_halfz
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-10-22 21:04:56 +02:00
Michel Dänzer ae879718c4 r600g: Drop references to destroyed blend state
Fixes use-after-free when the currently bound blend state is destroyed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85267
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84140

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

Cc: mesa-stable@lists.freedesktop.org
2014-10-22 17:09:43 +09:00
Kenneth Graunke 6dc6e6e0d9 i965/vec4: Generate better code for ir_triop_csel.
Previously, we generated an extra CMP instruction:

   cmp.ge.f0(8)    g6<1>D          g1<0,4,1>F      0F
   cmp.nz.f0(8)    null            g6<4,4,1>D      0D
   (+f0) sel(8)    g5<1>F          g1.4<0,4,1>F    g2<0,4,1>F

The first operand is always a boolean, and we want to predicate the SEL
on that.  Rather than producing a boolean value and comparing it against
zero, we can just produce a condition code in the flag register.

Now we generate:

   cmp.ge.f0(8)    null            g1<0,4,1>F      0F
   (+f0) sel(8)    g5<1>F          g1.4<0,4,1>F    g2<0,4,1>F

No difference in shader-db.

v2: Remember to delete the old code (thanks Matt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-21 21:14:03 -07:00
Kenneth Graunke f5c3f095b9 i965/vec4: Simplify visit(ir_expression *)'s result_src/dst setup.
Using dst_reg(this, ir->type) automatically sets the writemask to the
proper size for the type; src_reg(dst_reg) preserves that.  This should
be equivalent, but less code.

Note that src_reg(dst_reg) either uses SWIZZLE_XXXX or SWIZZLE_XYZW, so
the old code did need the manual writemask adjustment, since it
constructed the registers the other way around.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-21 21:14:00 -07:00
Kenneth Graunke cb36e79f96 i965/vec4: Delete some dead code in visit(ir_expression *).
Nothing uses the vector_elements temporary variable.

Setting this->result.file is dead because we overwrite this->result a
few lines later.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-21 21:13:37 -07:00
Kenneth Graunke 4d34c4b582 i965/fs: Generate better code for ir_triop_csel.
Previously, we generated an extra CMP instruction:

   cmp.ge.f0(8)   g4<1>D          g2<0,1,0>F      0F
   cmp.nz.f0(8)   null            g4<8,8,1>D      0D
   (+f0) sel(8)   g120<1>F        g2.4<0,1,0>F    g3<0,1,0>F

The first operand is always a boolean, and we want to predicate the SEL
on that.  Rather than producing a boolean value and comparing it against
zero, we can just produce a condition code in the flag register.

Now we generate:

   cmp.ge.f0(8)    null            g2<0,1,0>F      0F
   (+f0) sel(8)    g124<1>F        g2.4<0,1,0>F    g3<0,1,0>F

total instructions in shared programs: 5473459 -> 5473253 (-0.00%)
instructions in affected programs:     6219 -> 6013 (-3.31%)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-21 21:13:37 -07:00
Kenneth Graunke 32364a1fe5 glsl: Delete unused gl_uniform_driver_format enum values.
A while back, Matt made the uniform upload functions simply upload
ctx->Const.UniformBooleanTrue for boolean values instead of 0/1, which
removed the need to convert it later.  We also set UniformBooleanTrue to
1.0f for drivers which want to treat booleans as 0.0/1.0f.

Nothing ever sets these, so they are dead.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-21 18:53:13 -07:00
Rob Clark 36310d9d56 freedreno/a3xx: fix depth/stencil restore format
Also fix z16 restore format which was completely wrong.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-21 20:08:49 -04:00
Rob Clark 2bc2ab66d9 freedreno/a3xx: fix viewport state during clear
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-21 20:08:49 -04:00
Rob Clark 3eb8289aa4 freedreno: mark scissor state dirty when enable bit changes
We don't have a scissor enable bit in hw, so when a raster state change
results in scissor enable bit changing, we need to also mark scissor
state as dirty.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-21 20:08:49 -04:00
Rob Clark 01b757e2b0 freedreno: clear vs scissor
The optimization of avoiding restore (mem2gmem) if there was a clear
falls down a bit if you don't have a fullscreen scissor.  We need to
make the decision logic a bit more clever to keep track of *what* was
cleared, so that we can (a) completely skip mem2gmem if entire buffer
was cleared, or (b) skip mem2gmem on a per-tile basis for tiles that
were completely cleared.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-21 20:08:49 -04:00
Vinson Lee 1ab6543431 clover: Fix build error with LLVM 3.4.
DataLayoutPass was added in LLVM 3.5 r202168, commit
57edc9d4ff1648568a5dd7e9958649065b260dca "Make DataLayout a plain
object, not a pass.".

This patch fixes this build error with LLVM 3.4.

  CXX      llvm/libclllvm_la-invocation.lo
llvm/invocation.cpp: In function 'void {anonymous}::optimize(llvm::Module*, unsigned int, const std::vector<llvm::Function*>&)':
llvm/invocation.cpp:324:18: error: expected type-specifier
       PM.add(new llvm::DataLayoutPass(mod));
                  ^

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85189
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-10-21 15:40:47 -07:00
Marek Olšák 43b2432368 r600g,radeonsi: convert TGSI shader type to LLVM shader type
The values are hardcoded in the LLVM backend, but the TGSI definitions are
going to be changed with tessellation, e.g. TGSI_PROCESSOR_COMPUTE will be
increased by 2.

We'll use VS for LS and HS, because there's nothing special about them
from the LLVM backend point of view, even though the hardware side is
different. We do the same for ES.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:39:50 +02:00
Marek Olšák c5a44cf3f8 radeonsi: add some missing register definitions
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:39:50 +02:00
Marek Olšák fc3b3354d7 radeonsi: load ring resource descriptors only once
v2: document the new functions

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:39:35 +02:00
Marek Olšák d787608957 radeonsi: clarify shader constant load functions
I'll need indexed loads without the meta data flag for tessellation later.
Also rename load_const to buffer_load_const to distinguish it from indexed
const loads.

v2: add comments

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:35:44 +02:00
Marek Olšák 55a9b778c8 radeonsi: statically declare resource and sampler arrays
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:48 +02:00
Marek Olšák e827bb6fe7 radeonsi: remove conversion of DX9 FACE input to GL
st/mesa and gallium expect the DX9 format, so this is useless.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:41 +02:00