Commit Graph

56834 Commits

Author SHA1 Message Date
Eric Anholt da2880bea0 intel: Extend the force_y_tiling flag to allow forcing no tiling.
For a blit-uploaded temporary, it's faster on current hardware to memcpy
the data into a linear CPU mapping than to go through the GTT.

v2: Turn the not-fully-supported mask into 3 supported enum values.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Paul Berry <stereotype441@gmail.com> (v2)
Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v2)
2013-05-28 13:06:43 -07:00
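
The v2 note above describes replacing a partially supported mask with three enum values. A minimal sketch of that shape follows, assuming the three values are "let the heuristic decide", "force Y tiling" and "force no tiling". Every identifier here (`TilingMode`, `Tiling`, `choose_tiling`) is hypothetical and not taken from the driver source.

```cpp
#include <cassert>

// Hypothetical names for illustration only; not the driver's identifiers.
enum class TilingMode { Any, ForceY, ForceNone };
enum class Tiling { Linear, X, Y };

// Picks a tiling layout. `heuristic_choice` stands in for whatever the
// driver would normally choose, which this sketch does not model.
inline Tiling choose_tiling(TilingMode mode, Tiling heuristic_choice) {
    switch (mode) {
    case TilingMode::ForceY:
        return Tiling::Y;       // caller insists on Y tiling
    case TilingMode::ForceNone:
        return Tiling::Linear;  // caller insists on a linear (untiled) layout
    case TilingMode::Any:
        break;                  // no override requested
    }
    return heuristic_choice;
}
```

An enum makes each case explicit, whereas a bit mask also admits combinations that the code may not handle.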
Eric Anholt 045612c90e intel: Add an assert for glCopyTexSubImage() being called on MSAA buffers.
This is just in case someone else trips over this due to our weird reuse
of this code in glBlitFramebuffer().

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:44 -07:00
Eric Anholt 7638f5578e i965: Allow glCopyTexSubImage() on depth textures.
If the hw is pre-gen5 and can't blit depth, it'll cleanly error out.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:39 -07:00
Eric Anholt 48a22340cf i965: Prefer blorp glBlitFramebuffer() to the glCopyTexSubImage-based blit.
I think we've measured no performance difference from this in the past,
except that the blorp code can do things like multisample resolves.
Prevents piglit regression in the next commit when a testcase started
trying to do a multisampled resolve through the old glCopyTexSubImage()
path.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:35 -07:00
Eric Anholt 9720d436d1 i965: Consistently do depth resolves before blitting.
We were protected for a long time by the fact that depth was Y tiled and
you couldn't blit Y.  Now that we can blit Y, we were failing to resolve
depth in glCopyPixels().

Note in the comment about swrast, that the swrast map path does resolves
appropriately already.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:30 -07:00
Eric Anholt 6a7c27786c intel: Make a wrapper for intelEmitCopyBlit using miptrees.
I had previously asserted that it was hard to write a useful, simpler
blit function, but I think this might be it.

This has the side effect of extending the 32k pitch check to a few more
places that were missing it.

v2: Update comment for being moved inside intel_miptree_blit().

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:25 -07:00
Eric Anholt 0ae294bf7c intel: Rename intel_renderbuffer_tile_offsets.
This makes it more consistent with intel_miptree_get_tile_offsets().

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:21 -07:00
Eric Anholt 4e8eafd8f4 intel: Reduce intel_renderbuffer_tile_offsets to a thin wrapper.
Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:15 -07:00
Eric Anholt 5c85e1cf55 intel: Make intel_miptree_get_tile_offsets return a page offset.
Right now, the callers in i965 don't expect a nonzero page offset to
actually occur (since that's being handled elsewhere), but it seems
like a trap to leave it this way.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:00 -07:00
José Fonseca 4eaa0999b5 glsl: Fix MSVC build.
It appears that `sizeof(Class::member)` is either non-standard or
merely unsupported in MSVC.

So use `sizeof(instance->member)` instead, which is guaranteed to work
everywhere.

Also promote the assert to a static assert.

Trivial.
2013-05-28 13:56:18 +01:00
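
The two `sizeof` spellings mentioned above can be compared in a small sketch. The struct `Uniform` and its member `slots` are made up for illustration, and the C++11 `static_assert` keyword is an assumption about how the check might be written today, not necessarily what the commit used. Because the operand of `sizeof` is never evaluated, naming the member through a null pointer is well defined.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type for illustration.
struct Uniform {
    unsigned char slots[16];
};

// The class-qualified form, sizeof(Uniform::slots), is valid only from
// C++11 onward. Going through an instance pointer works in older dialects
// too; the operand is unevaluated, so no object is accessed.
inline std::size_t slot_bytes(const Uniform *u) {
    return sizeof(u->slots);
}

// Compile-time check using the instance form with a null pointer.
static_assert(sizeof(static_cast<Uniform *>(nullptr)->slots) == 16,
              "slots must be 16 bytes");
```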
Marek Olšák d4a06d77f5 mesa: fix GLSL program objects with more than 16 samplers combined
The problem is the sampler units are allocated from the same pool for all
shader stages, so if a vertex shader uses 12 samplers (0..11), the fragment
shader samplers start at index 12, leaving only 4 sampler units
for the fragment shader. The main cause is probably the fact that samplers
(texture unit -> sampler unit mapping, etc.) are tracked globally
for an entire program object.

This commit adapts the GLSL linker and core Mesa such that the sampler units
are assigned to sampler uniforms for each shader stage separately
(if a sampler uniform is used in all shader stages, it may occupy a different
sampler unit in each, and vice versa, an i-th sampler unit may refer to
a different sampler uniform in each shader stage), and the sampler-specific
variables are moved from gl_shader_program to gl_shader.

This doesn't require any driver changes, and it fixes piglit/max-samplers
for gallium and classic swrast. It also works with any number of shader
stages.

v2: - converted tabs to spaces
    - added an assertion to _mesa_get_sampler_uniform_value

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-28 13:05:30 +02:00
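
The difference between one shared pool and per-stage assignment, as described above, can be sketched as follows. This is a simplified model under stated assumptions: samplers are identified by name, stages are plain list indices, and the function names are hypothetical. It is not the linker's actual data structure.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// For each stage, the sampler uniform names that stage uses (hypothetical model).
using StageSamplers = std::vector<std::vector<std::string>>;
// Sampler uniform name -> sampler unit index.
using UnitMap = std::map<std::string, int>;

// Old behaviour as described: one counter for the whole program, so units
// consumed by an earlier stage are unavailable to later stages.
inline std::vector<UnitMap> assign_shared_pool(const StageSamplers &stages) {
    std::vector<UnitMap> result(stages.size());
    UnitMap program_wide;
    int next_unit = 0;
    for (std::size_t s = 0; s < stages.size(); ++s) {
        for (const std::string &name : stages[s]) {
            auto it = program_wide.find(name);
            if (it == program_wide.end())
                it = program_wide.emplace(name, next_unit++).first;
            result[s][name] = it->second;
        }
    }
    return result;
}

// New behaviour as described: the counter restarts at 0 for every stage,
// so the same uniform may land on a different unit in each stage.
inline std::vector<UnitMap> assign_per_stage(const StageSamplers &stages) {
    std::vector<UnitMap> result(stages.size());
    for (std::size_t s = 0; s < stages.size(); ++s) {
        int next_unit = 0;
        for (const std::string &name : stages[s])
            if (result[s].count(name) == 0)
                result[s][name] = next_unit++;
    }
    return result;
}
```

With twelve samplers in the first stage, the shared pool places the second stage's first sampler at unit 12, while per-stage assignment places it at unit 0.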
Marek Olšák b4cb857dbf swrast: increase array size of TextureSample
to match the size of ctx->Texture.Unit, and it will also fix
piglit/max-samplers with the following commit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-28 13:05:30 +02:00
Marek Olšák 15a4b6db21 mesa: declare UniformBufferBindings as an array with a static size
Some Gallium drivers were crashing, because the array was not large enough.

v2: clamp the per-shader maximum in st/mesa, then sum them all up

NOTE: This is a candidate for the stable branches.
2013-05-28 13:05:30 +02:00
Michel Dänzer cdad129f9c radeonsi: Enable GLSL 1.30 2013-05-28 11:20:53 +02:00
Michel Dänzer 0495adbac5 radeonsi: Handle TGSI TXQ opcode 2013-05-28 11:20:53 +02:00
Michel Dänzer 3623111960 radeonsi: Add support for TGSI TXF opcode 2013-05-28 11:20:53 +02:00
Michel Dänzer beaa5eb03a radeonsi: Use tgsi_util_get_texture_coord_dim() 2013-05-28 11:20:53 +02:00
Michel Dänzer 0afeea5ad2 radeonsi: Handle TGSI_SEMANTIC_CLIPDIST 2013-05-28 11:20:16 +02:00
Michel Dänzer 784df2e115 radeonsi: Make border colour state handling safe for integer textures 2013-05-28 09:55:46 +02:00
Michel Dänzer e369f40a9b radeonsi: Fix hardware state for dual source blending
Set up CB_SHADER_MASK register according to pixel shader exports, and enable
some minimal state for colour buffer 1 in case dual source blending is used.
2013-05-28 09:55:46 +02:00
Vadim Girlin 08810ca9ef r600g/sb: handle more cases for folding in gvn pass
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-28 05:24:53 +04:00
Christian König 5328c8001b st/vdpau: destroy handle table only when it's empty
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-27 18:18:32 +02:00
Christian König f796b67431 st/vdpau: remove vlCreateHTAB from surface functions
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-27 18:18:32 +02:00
Christian König 8ea34fa0e8 st/vdpau: invalidate the handles on destruction
Fixes a problem with xbmc when switching channels.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-27 18:18:32 +02:00
Vadim Girlin 5de41575a1 r600g/sb: improve folding for SETcc
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 15:30:01 +04:00
Vadim Girlin 88e700329b r600g/sb: optimize CNDcc instructions
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 15:29:56 +04:00
Vadim Girlin 725671a83a r600g/sb: improve optimization of conditional instructions
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 15:19:20 +04:00
Chia-I Wu 5285c4c88e ilo: enable multiple constant buffers
This effectively enables uniform buffer object support.
2013-05-27 12:31:42 +08:00
Chia-I Wu 3a5dd39b1d ilo: add support for indirect access of CONST in FS
Unlike other register files, CONST is read with a message and indirect access
is easier to implement.
2013-05-27 12:30:51 +08:00
Chia-I Wu 8e7987cc49 ilo: add support for TBOs on GEN6
This hunk was missing in the last commit.
2013-05-27 12:30:42 +08:00
Chia-I Wu 11c9aaf30a ilo: advertise support for pure integer formats
For pure integer formats, no filtering nor blending is needed.
2013-05-27 11:02:57 +08:00
Chia-I Wu fb40aca879 ilo: add support for texture buffer objects
Take care of sampler views that have buffers as the underlying resources.
Update caps related to TBOs.
2013-05-27 11:02:57 +08:00
Chia-I Wu 441aa9326a tgsi: add buffer texture to tgsi_util_get_texture_coord_dim()
TGSI_TEXTURE_BUFFER is one-dimensional.  Assert that exec_tex() is never
called with TGSI_TEXTURE_BUFFER.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-27 11:02:06 +08:00
Vadim Girlin 63d09a0cb7 r600g/sb: improve handling of KILL instructions
This patch improves handling of unconditional KILL instructions inside
the conditional blocks, uncovering more opportunities for if-conversion.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 01:45:07 +04:00
Vadim Girlin 880f435a7e r600g/sb: fix peephole optimization for PRED_SETE
Fixes incorrect condition that prevented optimization for
PRED_SETE/PRED_SETE_INT.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 01:45:07 +04:00
Vadim Girlin ff2a611699 r600g/sb: fix scheduling of PRED_SET instructions
PRED_SET instructions that update exec mask should be scheduled immediately
prior to the "if-then-else" block, because any instruction that is
inserted after alu clause with PRED_SET and before conditional block is
also conditionally executed by hw (exec mask is already updated at that
moment).

Probably it's better to make PRED_SET a part of conditional
"if-then-else" block in the IR to handle this more cleanly,
but for now this temporary solution should prevent the problem.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 01:45:07 +04:00
Vadim Girlin 44a117ab9a r600g/sb: fix handling of preloaded inputs for compute shaders
For compute shaders we need to let the backend know that
GPRs 0 and 1 are preloaded with some compute-specific input
values, otherwise any use of these regs without previous
definition is considered as undefined value and usually
is simply replaced with 0.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-25 22:56:53 +04:00
Brian Paul fd9fe4470b xlib: add null ctx check in glXDestroyContext()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64934
NOTE: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-24 16:35:25 -06:00
Brian Paul fd29e4acda st/glx: add null ctx check in glXDestroyContext()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64934
NOTE: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-24 16:35:25 -06:00
Brian Paul db4580cbdf st/mesa: add switch cases for new IR enums to silence warnings 2013-05-24 16:35:25 -06:00
Brian Paul 820de34ceb st/glx/xlib: assorted whitespace, comment fixes 2013-05-24 16:35:24 -06:00
Vadim Girlin 8e41ced4b3 r600g/sb: fix incorrect assert
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-24 21:00:54 +04:00
Vadim Girlin e9aa46e665 r600g/sb: relax some restrictions for FETCH instructions
This allows GVN rewrite pass to propagate non-const (register)
values to FETCH source operands, helping to eliminate unnecessary
copies in some cases.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-24 21:00:54 +04:00
Vadim Girlin 5a68a29706 r600g/sb: relax register allocation for compute shaders
We have to assume that all GPRs in compute shader can be indirectly
addressed because LLVM backend doesn't provide any indirect array info.
That's why for compute shaders GPR array is created that covers all used
GPRs (0..r600_bytecode::ngpr-1), but this seriously restricts register
allocation in sb.

This patch checks for actual use of indirect access in the shader and
if it's not used then GPR array is not created, so that regalloc is not
unnecessarily restricted.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-24 21:00:54 +04:00
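
The check described above (create the all-covering GPR array only when the shader really uses indirect access) might look roughly like this. The types `Instr` and `GprArray` and both function names are hypothetical stand-ins; the real sb data structures are not shown in this log.

```cpp
#include <cassert>
#include <vector>

// Hypothetical instruction model: only the property the check needs.
struct Instr {
    bool uses_indirect_gpr;
};

// A range of GPRs that register allocation must keep in place.
struct GprArray {
    unsigned first;
    unsigned count;
};

// True if any instruction addresses a GPR indirectly.
inline bool shader_uses_indirect(const std::vector<Instr> &code) {
    for (const Instr &i : code)
        if (i.uses_indirect_gpr)
            return true;
    return false;
}

// Without array info from the frontend, the conservative choice is a single
// array covering GPRs 0..ngpr-1. This sketch skips it when no indirect
// access is present, which leaves the allocator unrestricted.
inline std::vector<GprArray> compute_gpr_arrays(const std::vector<Instr> &code,
                                                unsigned ngpr) {
    std::vector<GprArray> arrays;
    if (ngpr > 0 && shader_uses_indirect(code))
        arrays.push_back(GprArray{0, ngpr});
    return arrays;
}
```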
Vadim Girlin 0b5b3f8816 r600g/sb: fix gpr array handling for compute shaders
Fixes segfault with bfgminer and R600_DEBUG=sbcl.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-24 16:45:58 +04:00
Vadim Girlin d1e0dc6275 r600g/sb: fix buffer overflow in sb_ostream
Fixes segfault during bytecode dump with bfgminer kernel

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-24 16:40:58 +04:00
Tom Stellard b1797c3a38 r600g/compute: Use common transfer_{map,unmap} functions for global resources
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-05-23 14:52:34 -07:00
Tom Stellard 65d67bcc4b r600g/compute: Use common transfer_{map,unmap} functions for kernel inputs
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-05-23 14:52:34 -07:00
Kenneth Graunke 062317d667 i965: Go back to using the kernel SOL reset feature.
It turns out the MI_LOAD_REGISTER_IMM approach doesn't work on Haswell,
and regressed essentially all the transform feedback Piglit tests.

This morally reverts eaa6fbe6d5.  However,
the code is still simpler than it was.  On BeginTransformFeedback, we
simply flush the batch and set the SOL reset flag so that the next batch
will start with zeroed offsets.  There's still no software counting.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64887
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-23 13:32:02 -07:00
Rob Clark 95670bdee2 freedreno: scissor fix
Don't assume the state-tracker will set the scissor after the
framebuffer state is changed.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-05-23 14:35:21 -04:00