Commit Graph

12818 Commits

Author SHA1 Message Date
Glenn Kennard 444c8c2f28 r600g: Implement sm5 interpolation functions
Requires evergreen/cayman

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
2014-10-28 23:20:44 +01:00
Tobias Klausmann 1a170980a0 nv50: handle inverted render conditions
This enables ARB_conditional_render_inverted.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-26 07:33:16 -04:00
Rob Clark 13862812dc freedreno/ir3: consider instruction neighbors in cp
Fanin (merge) nodes require it's srcs to be "adjacent" in consecutive
scalar registers.  Keep track of instruction neighbors in copy-
propagation step and avoid eliminating mov's which would cause an
instruction to need multiple distinct left and/or right neighbors.

This lets us not fall on our face when we encounter things like:

  1: MOV TEMP[2], IN[0].xyzw
  2: TEX OUT[0].xy, TEMP[2], SAMP[0], SHADOW2D
  3: MOV TEMP[2].xy, IN[0].yxzz
  4: TEX OUT[0].zw, TEMP[2], SAMP[0], SHADOW2D
  5: END

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-25 12:07:43 -04:00
Rob Clark 4dff2a6429 freedreno/ir3: always mov tex coords
Always insert extra mov's for the tex coord into the fanin.  This
simplifies things a bit, and avoids a scenario where multiple sam
instructions can have mutually exclusive input's to it's fanin, for
example:

  1: TEX OUT[0].xy, IN[0].xyxx, SAMP[0], 2D
  2: TEX OUT[0].zw, IN[0].yxxx, SAMP[0], 2D

The CP pass can always remove the mov's that are not actually needed,
so better to start out with too many mov's in the front end, than not
enough.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-25 12:07:34 -04:00
Rob Clark 33193540fc freedreno: rename a couple debug flags
dscis -> noscis
dbypass -> nobypass

a bit more consistant w/ nobin, etc.  And IMO a bit more sensible names.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-25 12:07:21 -04:00
Rob Clark ded5013c4c freedreno/ir3: skip virtual outputs in standalone compiler
Kills get added to the outputs list, to ensure they get scheduled.  But
they aren't *really* outputs so skip them in the header comment block.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-25 10:25:15 -04:00
Rob Clark d6252d0f63 freedreno/ir3: standalone compiler updates for ir3test
In order to test compiler changes more easily, spit out the assembled
shader with some header information so that we can know about
inputs/outputs more easily.

See: git://people.freedesktop.org/~robclark/ir3test

In ir3test we have a big collection of tgsi shaders and reference
ir3_compiler outputs.  When making compiler changes, regenerate the
compiler outputs and feed to ir3test to compare the new vs reference
shader.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-25 09:08:15 -04:00
Chia-I Wu 762c68b879 ilo: improve blob decoding
The last few dwords were skipped if the total number of dwords was not a
multiple of 4.  Change the formatting for better readability.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-10-25 14:28:08 +08:00
José Fonseca 701f739d7f llvmpipe: Ensure the packed input of the lp_test_format is aligned.
Fixes:
- https://bugs.freedesktop.org/show_bug.cgi?id=85377
- http://llvm.org/bugs/show_bug.cgi?id=21365

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-10-24 21:35:23 +01:00
José Fonseca 1ef6d439ba llvmpipe: Flush stdout on lp_test_* unit tests.
So that the order of test messages and gallivm/llvmpipe debug output is
preserved.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-10-24 21:35:09 +01:00
Mathias Fröhlich 56088131d0 gallium: introduce PIPE_CAP_CLIP_HALFZ.
In preparation of ARB_clip_control. Let the driver decide if
it supports pipe_rasterizer_state::clip_halfz being set to true.

v3:
Initially enable on ilo.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de
2014-10-24 19:21:21 +02:00
Eric Anholt 8c7ac377b7 vc4: Reuse uniform_data/contents indices when making uniforms.
This allows vc4_opt_cse.c to CSE-away operations involving the same
uniform values.

total instructions in shared programs: 37341 -> 36906 (-1.16%)
instructions in affected programs:     10233 -> 9798 (-4.25%)
total uniforms in shared programs: 10523 -> 10320 (-1.93%)
uniforms in affected programs:     2467 -> 2264 (-8.23%)
2014-10-24 18:04:26 +01:00
Eric Anholt 18ccda7b86 vc4: When asked to discard-map a whole resource, discard it.
This saves a bunch of extra flushes when texsubimaging a whole texture
that's been used for rendering, or subdataing a whole BO.  In particular,
this massively reduces the runtime of piglit texture-packed-formats (when
the probes have been moved out of the inner loop).
2014-10-24 18:04:26 +01:00
Eric Anholt a71c3b885a vc4: Refactor flushing before mapping a BO.
I'm going to want to make some other decisions here before flushing.
2014-10-24 18:04:26 +01:00
Eric Anholt 52824811b9 vc4: Allow dead code elimination of unused varyings.
total instructions in shared programs: 39022 -> 37341 (-4.31%)
instructions in affected programs:     26979 -> 25298 (-6.23%)
total uniforms in shared programs: 11242 -> 10523 (-6.40%)
uniforms in affected programs:     5836 -> 5117 (-12.32%)
2014-10-24 18:04:26 +01:00
Eric Anholt 5d32e26335 vc4: Add debug output to match shaderdb info to program dumps.
I'm going to be using VC4_DEBUG=shaderdb,norast to do shaderdb stats, but
when debugging regressions, I want to match shaderdb output to shader
disassembly.
2014-10-24 18:04:26 +01:00
Andreas Boll 14bdcc6ff9 radeon: enable Hyper-Z on r600g and radeonsi by default
This reverts commit 01e6371149.
Since then many Hyper-Z issues have been fixed or worked around.

Enable Hyper-Z by default so that we get enough feedback for the upcoming
mesa 10.4 release.

If you have issues with Hyper-Z try to disable Hyper-Z using the enviroment
variable R600_DEBUG=nohyperz and please report the issue on the bugtracker.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75011
See also: https://bugs.freedesktop.org/show_bug.cgi?id=75112

Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-24 09:11:51 +02:00
Rob Clark 4f1fec6060 Revert "freedreno/a3xx: only emit dirty consts"
This reverts commit 94bb33617d.

Which somehow broke gnome-shell.. and needs more investigation.  For
now, revert..
2014-10-23 10:46:51 -04:00
Rob Clark 6eabc11936 freedreno: fix PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE
fd_bo_cpu_prep() doesn't realize the bo is already referenced in
unflushed cmdstream.  It could be made to do so (but would have to be
implemented twice, ie. both for msm and kgsl).  But we still can't do
the expected thing if the caller isn't using _NOSYNC.  Because of the
way the tiling works, we need to build quite a bit of cmdstream at flush
time, which is not possible to do at the libdrm level.

So rather than trying to make fd_bo_cpu_prep() smarter than it can
possibly be, just *always* discard and reallocate if the
PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE flag is set.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-23 10:46:51 -04:00
Mauro Rossi 417b17378a gallium/nouveau: fully build the driver under android
Fix the trivial typo in the variable name.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-10-23 15:18:11 +01:00
José Fonseca 013ff2fae1 gallivm,llvmpipe,clover: Bump required LLVM version to 3.3.
We'll need to update gallivm for the interface changes in LLVM 3.6, and
the fewer the number of older LLVM versions we support the less hairy that
will be.

As consequence HAVE_AVX define can disappear.  (Note HAVE_AVX meant
whether LLVM version supports AVX or not.  Runtime support for AVX is
always checked and enforced independently.)

Verified llvmpipe builds and runs with with LLVM 3.3, 3.4, and 3.5.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-10-23 10:18:56 +01:00
Marek Olšák 8ec40adf7e radeonsi: implement pipe_rasterizer_state::clip_halfz
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-10-22 21:05:00 +02:00
Marek Olšák a3591da1a0 r600g: implement pipe_rasterizer_state::clip_halfz
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-10-22 21:04:58 +02:00
Marek Olšák 8ddd2f7aee r300g: implement pipe_rasterizer_state::clip_halfz
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-10-22 21:04:56 +02:00
Michel Dänzer ae879718c4 r600g: Drop references to destroyed blend state
Fixes use-after-free when the currently bound blend state is destroyed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85267
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84140

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

Cc: mesa-stable@lists.freedesktop.org
2014-10-22 17:09:43 +09:00
Rob Clark 36310d9d56 freedreno/a3xx: fix depth/stencil restore format
Also fix z16 restore format which was completely wrong.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-21 20:08:49 -04:00
Rob Clark 2bc2ab66d9 freedreno/a3xx: fix viewport state during clear
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-21 20:08:49 -04:00
Rob Clark 3eb8289aa4 freedreno: mark scissor state dirty when enable bit changes
We don't have a scissor enable bit in hw, so when a raster state change
results in scissor enable bit changing, we need to also mark scissor
state as dirty.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-21 20:08:49 -04:00
Rob Clark 01b757e2b0 freedreno: clear vs scissor
The optimization of avoiding restore (mem2gmem) if there was a clear
falls down a bit if you don't have a fullscreen scissor.  We need to
make the decision logic a bit more clever to keep track of *what* was
cleared, so that we can (a) completely skip mem2gmem if entire buffer
was cleared, or (b) skip mem2gmem on a per-tile basis for tiles that
were completely cleared.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-21 20:08:49 -04:00
Marek Olšák 43b2432368 r600g,radeonsi: convert TGSI shader type to LLVM shader type
The values are hardcoded in the LLVM backend, but the TGSI definitions are
going to be changed with tessellation, e.g. TGSI_PROCESSOR_COMPUTE will be
increased by 2.

We'll use VS for LS and HS, because there's nothing special about them
from the LLVM backend point of view, even though the hardware side is
different. We do the same for ES.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:39:50 +02:00
Marek Olšák c5a44cf3f8 radeonsi: add some missing register definitions
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:39:50 +02:00
Marek Olšák fc3b3354d7 radeonsi: load ring resource descriptors only once
v2: document the new functions

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:39:35 +02:00
Marek Olšák d787608957 radeonsi: clarify shader constant load functions
I'll need indexed loads without the meta data flag for tessellation later.
Also rename load_const to buffer_load_const to distinguish it from indexed
const loads.

v2: add comments

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:35:44 +02:00
Marek Olšák 55a9b778c8 radeonsi: statically declare resource and sampler arrays
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:48 +02:00
Marek Olšák e827bb6fe7 radeonsi: remove conversion of DX9 FACE input to GL
st/mesa and gallium expect the DX9 format, so this is useless.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:41 +02:00
Marek Olšák a18f803a86 radeonsi: revert hack for random failures in glsl-max-varyings
This reverts commit 032e5548b3.

I've run glsl-max-varyings 30 times and it always passed.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:29 +02:00
Marek Olšák b9b0973db2 radeonsi: generate shader pm4 states right after shader compilation
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:26 +02:00
Marek Olšák c94af8f0d7 radeonsi: make pm4 state generation for shaders independent of the context
The si_pm4_delete_state calls became useless, because the pm4 state is
always generated only once.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:22 +02:00
Marek Olšák 139bde061a radeonsi: inline si_pm4_alloc_state
It seemed like the function needed a context pointer. Let's remove it
to make it less confusing.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:15 +02:00
Marek Olšák 22c5886f3f r300g: replace r300_get_num_samples with a util variant 2014-10-21 22:03:55 +02:00
Marek Olšák 5f5b83cbba gallium: add PIPE_SHADER_CAP_MAX_OUTPUTS and use it in st/mesa
With 5 shader stages and various combinations of enabled and disabled shaders,
the maximum number of outputs in one shader doesn't have to be equal to
the maximum number of inputs in the following shader.

v2: return 32 for softpipe and llvmpipe
2014-10-21 21:59:02 +02:00
Eric Anholt ef280c95f2 vc4: Fix SRC_ALPHA_SATURATE blending.
Fixes glean blendFunc.
2014-10-21 15:46:48 +01:00
Eric Anholt cc298023c9 vc4: Fix stencil writemask handling.
If the writemask doesn't compress, then we want to put in the uncompressed
writemask, not the compressed writemask failure value (all-on).

Fixes glean's stencil2 and fbo-clear-formats on stencil.
2014-10-21 15:16:41 +01:00
Eric Anholt 48f6351940 vc4: Don't look at back stencil state unless two-sided stencil is enabled.
Fixes regressions in the next bugfix, because gallium util stuff leaves
the back stencil state as 0 if !back->enabled.
2014-10-21 15:16:41 +01:00
Rob Clark 4f17e026bb freedreno/ir3: add debug flag to disable cp
FD_MESA_DEBUG=nocp will disable copy propagation pass.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Ilia Mirkin f0ca26725e freedreno: positions come out as integers, not half-integers
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Rob Clark 3fcb021201 freedreno/a3xx: disable early-z when we have kill's
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Rob Clark 8a0ffedd8d freedreno/ir3: fix potential gpu lockup with kill
It seems like the hardware is unhappy if we execute a kill instruction
prior to last input (ei).  Probably the shader thread stops executing
and the end-input flag is never set.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Rob Clark ab33a24089 freedreno/ir3: comment + better fxn name
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Rob Clark 94bb33617d freedreno/a3xx: only emit dirty consts
If app only updates (for example) vertex uniforms, it would be nice to
only re-emit those and not also frag uniforms.  Means we need to mark
the first frag shader const buffer dirty after a clear.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00