KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Roland Scheidegger	1d28650b55	llvmpipe: kill off llvmpipe_variant_count Unused except it was increased for both fs and setup shader variants created. Probably some leftover from ages ago. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-05-15 02:35:26 +02:00
José Fonseca	0b239d9ed9	llvmpipe: Delete unneeded LLVM stuff earlier. Same as Frank's change to draw module but for llvmpipe module. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-14 11:05:00 +01:00
Roland Scheidegger	9477d8c862	llvmpipe: add support for b5g6r5_srgb The conversion code for srgb was tuned for n x 4x8bit AoS -> 4 x nxfloat SoA (and vice versa), fix this to handle also 16bit 565-style srgb formats. Still not really all that generic, things like r10g10b10a2_srgb or r4g4b4a4_srgb wouldn't work (the latter trivial to fix, the former would not require more work to not crash but near certainly need some higher precision calculation) but not needed right now. The code is not fully optimized for this (could use more direct calculation instead of expanding to 8-bit range first) but should be good enough. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-21 17:23:38 +01:00
Roland Scheidegger	1d53603f1f	llvmpipe: fix denorm handling for r11g11b10_float format when blending The code re-enabling denorms for small float formats did not recognize this format due to format handling hacks (mainly, the lp_type doesn't have the floating bit set). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-01-31 19:51:06 +01:00
José Fonseca	8771285054	s/Tungsten Graphics/VMware/ Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the old copyright name is creating unnecessary confusion, hence this change. This was the sed script I used: $ cat tg2vmw.sed # Run as: # # git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed # # Rename copyrights s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g /Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./ s/TUNGSTEN GRAPHICS/VMWARE/g # Rename emails s/alanh@tungstengraphics.com/alanh@vmware.com/ s/jens@tungstengraphics.com/jowen@vmware.com/g s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/ s/jrfonseca\?@tungstengraphics.com/jfonseca@vmware.com/g s/keithw\?@tungstengraphics.com/keithw@vmware.com/g s/michel@tungstengraphics.com/daenzer@vmware.com/g s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/ s/zack@tungstengraphics.com/zackr@vmware.com/ # Remove dead links s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g # C string src/gallium/state_trackers/vega/api_misc.c s/"Tungsten Graphics, Inc"/"VMware, Inc"/ Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-17 20:00:32 +00:00
Brian Paul	d6fa71fbb0	llvmpipe: handle NULL color buffer pointers Fixes regression from `9baa45f78b` v2: incorporate a few small changes suggested by Roland. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-17 08:52:11 -08:00
Zack Rusin	93b953d139	llvmpipe: do constant buffer bounds checking in shaders It's possible to bind a smaller buffer as a constant buffer, than what the shader actually uses/requires. This could cause nasty crashes. This patch adds the architecture to pass the maximum allowable constant buffer index to the jit to let it make sure that the constant buffer indices are always within bounds. The behavior follows the d3d10 spec, which says the overflow should always return all zeros, and overflow is only defined as access beyond the size of the currently bound buffer. Accesses beyond the declared shader constant register size are not considered an overflow and expected to return garbage but consistent garbage (we follow the behavior which some wlk tests expect which is to return the actual values from the bound buffer). Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-16 16:33:57 -05:00
Si Chen	72c6d0e506	llvmpipe: Implement alpha_to_coverage for non-MSAA framebuffers. Implement Alpha to Coverage by discarding a fragment when its alpha component is less than 0.5. This is a joint work of Jose and Si. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-07 16:04:42 +00:00
Matthew McClure	e84a1ab3c4	llvmpipe: add plumbing for ARB_depth_clamp With this patch llvmpipe will adhere to the ARB_depth_clamp enabled state when clamping the fragment's zw value. To support this, the variant key now includes the depth_clamp state. key->depth_clamp is derived from pipe_rasterizer_state's (depth_clip == 0), thus depth clamp is only enabled when depth clip is disabled. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-12-11 18:24:21 +00:00
Zack Rusin	155139059b	llvmpipe: fix blending with half-float formats The fact that we flush denorms to zero breaks our half-float conversion and blending. This patch enables denorms for blending. It's a little tricky due to the llvm bug that makes it incorrectly reorder the mxcsr intrinsics: http://llvm.org/bugs/show_bug.cgi?id=6393 Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-12-10 16:39:48 -05:00
Matthew McClure	0319ea9ff6	llvmpipe: clamp fragment shader depth write to the current viewport depth range. With this patch, generate_fs_loop will clamp any fragment shader depth writes to the viewport's min and max depth values. Viewport selection is determined by the geometry shader output for the viewport array index. If no index is specified, then the default viewport index is zero. Semantics for this path can be found in draw_clamp_viewport_idx and lp_clamp_viewport_idx. lp_jit_viewport was created to store viewport information visible to JIT code, and is validated when the LP_NEW_VIEWPORT dirty flag is set. lp_rast_shader_inputs is responsible for passing the viewport_index through the rasterizer stage to fragment stage (via lp_jit_thread_data). Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-12-09 12:57:02 +00:00
Roland Scheidegger	754319490f	gallivm,llvmpipe: fix float->srgb conversion to handle NaNs d3d10 requires us to convert NaNs to zero for any float->int conversion. We don't really do that but mostly seems to work. In particular I suspect the very common float->unorm8 path only really passes because it relies on sse2 pack intrinsics which just happen to work by luck for NaNs (float->int conversion in hw gives integer indeterminate value, which just happens to be -0x80000000 hence gets converted to zero in the end after pack intrinsics). However, float->srgb didn't get so lucky, because we need to clamp before blending and clamping resulted in NaN behavior being undefined (and actually got converted to 1.0 by clamping with sse2). Fix this by using a zero/one clamp with defined nan behavior as we can handle the NaN for free this way. I suspect there's more bugs lurking in this area (e.g. converting floats to snorm) as we don't really use defined NaN behavior everywhere but this seems to be good enough. While here respecify nan behavior modes a bit, in particular the return_second mode didn't really do what we wanted. From the caller's perspective, we really wanted to say we need the non-nan result, but we already know the second arg isn't a NaN. So we use this now instead, which means that cpu architectures which actually implement min/max by always returning non-nan (that is adhering to ieee754-2008 rules) don't need to bend over backwards for nothing. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-14 12:24:55 +00:00
Vinson Lee	76df7edacf	llvmpipe: Remove unnecessary null check of shader. shader has already been dereferenced earlier so cannot be null here. Fixes "Dereference before null check" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-30 22:00:54 -07:00
José Fonseca	1569b3e536	llvmpipe: Fix rendering to PIPE_FORMAT_R10G10B10A2_UNORM. We must take rounding in consideration when re-scaling to narrow normalized channels, such as 2-bit normalized alpha. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-09-20 17:34:57 +01:00
Zack Rusin	27cedd8aec	llvmpipe: fix pipeline statistics with a null ps If the fragment shader is null then pixel shader invocations have to be equal to zero. And if we're running a null ps then clipper invocations and primitives should be equal to zero but only if both stencil and depth testing are disabled. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-08-14 18:23:36 -04:00
Roland Scheidegger	4ef19f7fec	llvmpipe: clamp inputs for srgb render buffers Usually with fixed point renderbuffers clamping is done as part of conversion. However, since we blend in float format, we essentially skip all conversion steps pre-blend but since this is still a fixed point renderbuffer we must still clamp the inputs in this case. Makes no difference for piglit though. Obviously we could skip this if fragment color clamping is enabled, but a) this is deprecated in OpenGL (d3d never had it) and b) we don't support it natively so it gets baked into the shader. Also add some comment about logic ops being broken for srgb, luckily no test tries to do that as there's no easy fix... Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-07-18 19:04:20 +02:00
Roland Scheidegger	e57b98bad3	llvmpipe: fix blending with SRC_ALPHA_SATURATE with some formats without alpha We were fixing up the blend factor to ZERO, however this only works correctly with fixed point render buffers where the input values are clamped to 0/1 (because src_alpha_saturate is min(As, 1-Ad) so can be negative with unclamped inputs). Haven't seen any failure anywhere due to that with fixed point SNORM buffers (which clamp inputs to -1/1) but it should apply there as well (snorm blending is rare, even opengl 4.3 doesn't require snorm rendertargets at all, d3d10 requires them but they are not blendable). Doesn't look like piglit hits this though (some internal testing hits the float case at least). (With legacy OpenGL we could theoretically still use the fixup to zero if the fragment color clamp is enabled, but we can't detect that easily since we don't support native clamping hence it gets baked into the shader.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-07-18 19:03:35 +02:00
Roland Scheidegger	dc1cc928ed	llvmpipe: support sRGB framebuffers Just use the new conversion functions to do the work. The way it's plugged in into the blend code is quite hacktastic but follows all the same hacks as used by packed float format already. Only support 4x8bit srgb formats (rgba/rgbx plus swizzle), 24bit formats never worked anyway in the blend code and are thus disabled, and I don't think anyone is interested in L8/L8A8. Would need even more hacks otherwise. Unless I'm missing something, this is the last feature except MSAA needed for OpenGL 3.0, and for OpenGL 3.1 as well I believe. v2: prettify a bit, use separate function for packing. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-16 01:54:51 +02:00
Roland Scheidegger	2e4da1f594	llvmpipe: add support for nested / overlapping queries OpenGL doesn't support this but d3d10 does. It is a bit of a pain as it is necessary to keep track of queries still active at the end of a scene, which is also why I cheat a bit and limit the amount of simultaneously active queries to (arbitrary) 16 (simplifies things because don't have to deal with a real list that way). I can't think of a reason why you'd really want large numbers of overlapping/nested queries so it is hopefully fine. (This only affects queries which need to be binned.) v2: don't copy remainder of array when deleting an entry simply replace the deleted entry with the last one (order doesn't matter). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-26 23:17:53 +02:00
Adam Jackson	2151d893fb	gallium: Fix llvmpipe on big-endian machines Squashed commit of the following: commit 0857a7e105bfcbc4d1431b2cc56612094c747ca3 Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:07 2013 -0400 gallivm: Fix lp_build_rgba8_to_fi32_soa for big endian Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 0d65131649a8aa140e2db228ba779d685c4333e3 Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:07 2013 -0400 gallivm: Fix big-endian machines This adds a bit-shift count to the format table, and adds the concept of vector or bitwise alignment on gathers. Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 9740bda9b7dc894b629ed38be9b51059ce90818f Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:07 2013 -0400 llvmpipe: Fix convert_to_blend_type on big-endian Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit ae037c2de0f029e4e99371c0de25560484f0d8df Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:06 2013 -0400 util: Convert color pack to packed formats This fixes them on big-endian. 
Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 5b05ac0c89ae092ea8ba5bba9f739708d7396b5c Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:06 2013 -0400 graw-xlib: Convert to packed formats Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 51396e7d098cb6ff794391cf11afe4dbf86dbea0 Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:06 2013 -0400 format: Convert to packed formats Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 417b60bc66eb450e68a92ab0e47f76e292b385e6 Author: Adam Jackson <ajax@redhat.com> Date: Tue Jun 18 12:25:06 2013 -0400 st/dri: Convert to packed formats Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 0934b2e022a5e0847d312c40734e2b44cac52fd8 Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:06 2013 -0400 st/xlib: Convert to packed formats Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit a307ea3c3716a706963acce7966b5e405ba11db9 Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:06 2013 -0400 gbm: Convert to packed formats Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 53eebdd253e1960a645ea278f31d7ef6a6cf4aeb Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:06 2013 -0400 tests: Convert to packed formats Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 2f77fe3ee524945eacd546efcac34f7799fb3124 Author: Adam Jackson <ajax@redhat.com> Date: Tue Jun 18 13:07:37 2013 -0400 gallium: Document packed formats Signed-off-by: Adam Jackson <ajax@redhat.com> commit 1f1017159ce951f922210a430de9229f91f62714 Author: Richard Sandiford 
<r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:06 2013 -0400 gallium: Introduce 32-bit packed format names These are for interacting with buffers natively described in terms of bit shifts, like X11 visuals: uint32_t xyzw8888 = (x << 0) | (y << 8) | (z << 16) | (w << 24); Define these in terms of (endian-dependent) aliases to the array-style format names. Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 6cc7ab1ee66ed668da78c1d951dfd7782b4e786a Author: Adam Jackson <ajax@redhat.com> Date: Mon Jun 3 12:10:32 2013 -0400 gallium: Document format name conventions v2: - Fix a channel name thinko (Michel Dänzer) - Elaborate on SCALED versus INT - Add links to DirectX and FOURCC docs Signed-off-by: Adam Jackson <ajax@redhat.com> commit df4d269e7fb62051a3c029b84147465001e5776e Author: Adam Jackson <ajax@redhat.com> Date: Tue Jun 18 12:25:06 2013 -0400 gallivm: Remove all notion of byte-swapping Signed-off-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2013-06-24 09:48:56 -04:00
Roland Scheidegger	008fd03600	llvmpipe: improve alignment calculation for fetching/storing pixels This was always doing per-pixel alignment which isn't necessary, except for the buffer case (due to the per-element offset). The disabled code for calculating it was incorrect because it assumed that always the full block would be fetched, which may not be the case, so fix this up. The original code failed for instance for r10g10b10a2 the alignment would have been calculated as 4 (block_width) * 4 (bytes) so 16, but the actual fetch may have only fetched 2 values at a time, hence only alignment 8 - it is unclear what exactly would happen in this case (alignment larger than size to fetch). So just use the (already calculated) fetch size instead and get alignment from that which should always work, no matter if fetching 1,2 or 4 pixels. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-05 00:29:47 +02:00
Roland Scheidegger	ffe2a1ca3c	llvmpipe: reduce alignment requirement for 1d resources from 4x4 to 4x1 For rendering to buffers, we cannot have any y alignment. So make sure that tile clear commands only clear up to the fb width/height, not more (do this for all resources actually as clearing more seems pointless for other resources too). For the jit fs function, skip execution of the lower half of the fragment shader for the 4x4 stamp completely, for depth/stencil only load/store the values from the first row (replace other row with undef). For the blend function, also only load half the values from fs output, replace the rest with undefs so that everything still operates on the full 4x4 block to keep code the same between 4x1 and 4x4 (except for load/store of course which also needs to skip (store) or replace these values with undefs (load)), at the cost of slightly less optimal code being produced in some cases. Also reduce 1d and 1d array alignment too, because they can be handled the same as buffers so don't need to waste memory. v2: don't try to run special blend code for 4x1, (very) slightly less complexity if we just use the same code as for 4x4 which may or may not make it easier to optimize in the future (as we care a lot more about 4x4 performance than 1d). v2: don't use undef values for unused fs src outputs with llvm 3.1 as it apparently can trigger a bug in llvm. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-05 00:29:47 +02:00
Roland Scheidegger	ef3e887084	llvmpipe: cleanup of generate_unswizzled_blend Some parameters were used inconsistently, for instance not using block_width/block_height/block_size for deferring number of pixels but rather relying on guesses from the number of fragment shaders etc, so fix this up (no actual change in behavior since the block size stays fixed). (Though most of the code would work with different block_height, with three exceptions, one being the hacked r11g11b10 conversions and twiddle code which only work with block_height 2 not 1, and the last one being blend vector type not being 128bit wide.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-05 00:29:47 +02:00
Roland Scheidegger	80e2cc0f97	llvmpipe: disable simple_shader optimization This optimization disabled mask checks if the shader is simple enough. While this should work correctly, the problem is that it can hide real issues because shaders in practice are usually complex enough (8 instructions or 1 texture is already enough) so this doesn't get used, whereas dumbed-down tests which should hit all the same code paths suddenly do something quite different. This was the reason that bug 41787 could not be easily tracked as stencil test not working correctly (piglit would in fact have failed some tests without that optimization). So disable it for now, it's unclear if it's much of a win in any case. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-22 22:57:27 +02:00
Roland Scheidegger	e108716429	llvmpipe: fix early depth test / late depth write stencil issues We actually did early depth/stencil test and late depth/stencil write even when the shader could kill the fragment (alpha test or discard). Since it matters for the new stencil value if the fragment is killed by depth/stencil test or by the shader (in which case it will not reach the depth/stencil test) this simply cannot work (we also would possibly skip writing the new stencil value due to mask checks but this is a secondary issue). So use late depth test / late depth write instead in this case. (No piglit changes as it doesn't seem to hit such bogus early depth test / late depth write path.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-22 22:57:27 +02:00
Roland Scheidegger	82d7733b52	llvmpipe: fix issue with not writing new stencil values We did mask checks between depth/stencil testing and depth/stencil write. This meant that if the depth/stencil test killed off all fragments we never actually wrote the new stencil value. This issue affected all early/late test/write combinations. So move the mask check after depth/stencil write (for early depth test, could do the same for late depth test but might not be worth it at that point so just skip it there). This addresses https://bugs.freedesktop.org/show_bug.cgi?id=41787. Piglit does not hit this issue because of the simple_shader optimization in generate_fs_loop() which means we're skipping the mask checks. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-22 22:57:27 +02:00
Roland Scheidegger	070a9afb54	llvmpipe: handle z32s8x24 depth/stencil format We need to split up the depth and stencil values in this case, and there's some new logic required to handle float depth and stencil simultaneously. Also make sure we get the 64bit zs clear values and masks propagated correctly.	2013-05-18 00:32:33 +02:00
Roland Scheidegger	ae507b6260	llvmpipe: get rid of depth swizzling. Eliminating this we no longer need to copy between linear and swizzled layout. This is probably not quite ideal since it's a bit more work for now, could do some optimizations by moving depth testing outside the fragment shader loop (but tricky for early depth test as we have neither the mask nor the interpolated z in the right order handy). The large amount of tile/untile code is no longer needed and will be deleted in the next commit. No piglit regressions. v2: change a forgotten LAYOUT_NONE to LAYOUT_LINEAR. v3: fix (bogus) uninitialized variable warnings, add comments, fix a bad type Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-03 21:36:20 +02:00
José Fonseca	c08b04992a	llvmpipe: Ignore depth-stencil state if format has no depth/stencil. Prevents assertion failures inside the driver for such state combinations. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-20 23:25:36 +01:00
José Fonseca	a930136977	llvmpipe: Support half integer pixel center fs coord. Tested with graw/fs-fragcoord 2/3, and piglit glsl-arb-fragment-coord-conventions. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-18 14:18:25 +01:00
José Fonseca	b191be52f2	llvmpipe: Remove the static interpolation. No longer used. If we ever want the old behavior we can run a loop unroller pass. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-18 14:18:22 +01:00
José Fonseca	6e833d4d09	gallivm: Drop pos arg from lp_build_tgsi_soa. Never used. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-18 14:18:13 +01:00
Zack Rusin	e96f4e3b85	gallium/llvm: implement geometry shaders in the llvm paths This commit implements code generation of the geometry shaders in the SOA paths. All the code is there but bugs are likely present. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:02 -07:00
Brian Paul	c0f16df938	gallivm: init vars to silence warnings Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-25 12:24:11 -06:00
Vinson Lee	7d0c1f2437	llvmpipe: Fix assertions with assignment instead of comparison. Fixes assign instead of compare defects reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-03-24 14:49:22 -07:00
Roland Scheidegger	b101a094b5	llvmpipe: add EXT_packed_float render target format support New conversion code to handle conversion from/to r11g11b10 AoS to/from SoA floats, and also add code for conversion from rgb9e5 AoS to float SoA (which works pretty much the same as r11g11b10 except for the packing). (This code should also be used for texture sampling instead of relying on u_format conversion but it's not yet, so rgb9e5 is unused.) Unfortunately a crazy amount of hacks is necessary to get the conversion code running in llvmpipe's generate_unswizzled_blend, which isn't well suited for formats where the storage representation has nothing to do with what's needed for blending (moreover, the conversion will convert from packed AoS values, which is the storage format, to float SoA values, because this is much more natural for the conversion, and likewise from SoA values to packed AoS values - but the "blend" (which includes trivial things like partial mask) works on AoS values, so incoming fs values will go SoA->AoS, values from destination will go packed AoS->SoA->AoS, then do blend, then AoS->SoA->packed AoS which probably isn't the most efficient way though the shuffles are probably bearable). Passes piglit fbo-blending-formats (with GL_EXT_packed_float parameter), still need to verify Inf/NaNs (where most of the complexity in the conversion comes from actually). v2: drop the (very bogus) rgb9e5 part, and do component extraction in the helper code for r11g11b10 to float conversion, making the code slightly more compact (suggested by Jose), now that there are no other callers left this works quite well. (Could do the same for the opposite way but it's less than ideal there, final part of packing needs to be done in caller anyway and there'd be another conditional.) v3: minor style and comment fixes. Also fix a potential issue with negative zero being potentially returned by max(src, zero) as we don't have well-defined min/max behavior (fortunately no additional cost). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-22 20:10:53 +01:00
Roland Scheidegger	b6f15954b4	llvmpipe: Fix rendering into PIPE_FORMAT_X8*_UNORM. Mesa state tracker recently started using PIPE_FORMAT_X8B8G8R8_UNORM, causing segfaults in texture-packed-formats, because swizze[chan] was 0xff for padding channel (X). Signed-off-by: José Fonseca <jfonseca@vmware.com>	2013-02-22 09:00:45 +00:00
Roland Scheidegger	8b8bca06df	llvmpipe: implement dual source blending link up the fs outputs and blend inputs, and make sure the second blend source is correctly loaded and converted (which is quite complex). There's a slight refactoring of the monster generate_unswizzled_blend() function where it makes sense to factor out alpha conversion (which needs to run twice for dual source blend). This passes piglit arb_blend_func_extended tests. v2: remove new but ultimately not used function... Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-12 03:41:48 +01:00
Roland Scheidegger	67906f91c9	llvmpipe: first steps of adding dual source blend support This adds support of the additional blending factors to the blend function itself, and also enables testing of it in lp_test_blend (which passes). Still need to add the glue code of linking fs shader outputs to blend inputs in llvmpipe, and probably need to add special handling if destination doesn't include alpha (which lp_test_blend doesn't test). Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-08 16:32:30 -08:00
Roland Scheidegger	8e44f4117a	llvmpipe: refactoring of visibility counter handling There can be other per-thread data than just vis_counter, so pass a struct around instead (some of our non-public code uses this already and this difference is a major cause of merge pain). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-08 16:32:30 -08:00
José Fonseca	0ca384fb39	llvmpipe: Support Z16_UNORM as depth-stencil format. Simply by adjusting the vector element width after/before reading/writing the depth-stencil values. Ran several GL_DEPTH_COMPONENT16 piglit tests without regressions. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-01-29 07:06:36 +00:00
Roland Scheidegger	c789b981b2	gallivm: split sampler and texture state Split the sampler interface to use separate sampler and texture (sampler_view) state. This is needed to support dx10-style sampling instructions. This is not quite complete since both draw/llvmpipe don't really track textures/samplers independently yet, as well as the gallivm code not quite using the right sampler or texture index respectively (but it should work for the sampling codes used by opengl). We are however losing some optimizations in the process, apply_max_lod will no longer work, and we potentially could end up with more (unnecessary) recompiles (if switching textures with/without mipmaps only so it shouldn't be too bad). v2: don't use different callback structs for sampler/sampler view functions (which just complicates things), fix up sampling code to actually use the right texture or sampler index, and similar for llvmpipe/draw actually distinguish between samplers and sampler views. v3: fix more of PIPE_MAX_SAMPLER / PIPE_MAX_SHADER_SAMPLER_VIEWS mismatches (both in draw and llvmpipe), based on feedback from José get rid of unneeded static sampler derived state.(which also fixes the only 2 piglit regressions due to a forgotten assignment), fix comments based on Brian's feedback. v4: remove some accidental unrelated whitespace changes Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-01-28 06:50:36 -08:00
Roland Scheidegger	f2a87a1f5b	llvmpipe: more fixes for integer color buffers Cast back the fake floats to ints, and make sure we don't try to do scaling in format conversion (which only makes sense with normalized values). Also need to disable blending and alpha test (as per spec) for such buffers. This makes fbo-blending from the piglit ext_texture_integer tests work for most formats (some crash, and the luminance and intensity variants have the GB or GBA channels respectively wrong). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-01-18 09:14:52 -08:00
Roland Scheidegger	dc6bc3b642	llvmpipe: trivial code and comment cleanup. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-01-18 09:14:52 -08:00
Roland Scheidegger	8c84a82383	llvmpipe: fix using wrong format with MRT in blend code We were passing in the rt index however this was always 0 for non-independent blend case. (The format was only actually used to decide if the color mask covered all channels so this went unnoticed and was discovered by accident.) Additionally, there was a second problem because we do fixups in the key based on color buffer format we cannot use non-independent blend anyway as the fixed up values would never get used. So always turn non-independent blending into independent. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-01-18 09:14:52 -08:00
Brian Paul	8ef27e8fa9	llvmpipe: remove unneeded draw_flush() call This is redundant since we're calling draw_bind_fragment_shader() which already does a flush. v2: the redundant flush in llvmpipe_set_constant_buffer() has already been removed by commit `3427466e6d` Reviewed-by: José Fonseca <jfonseca@vmware.com>	2012-12-12 08:45:45 -07:00
Brian Paul	3427466e6d	llvmpipe: support pipe_resource-based constant buffers Before this we only supported user-based constant buffers. First, we basically plumb pipe_constant_buffer objects through llvmpipe rather than pipe_resource objects. Second, update llvmpipe_set_constant_buffer() and try_update_scene_state() so they understand both resource- and user-based constant buffers. The problem with user constant buffers is the potential for use-after-free, as seen in some WebGL tests. The next patch will flip the switch for resource-based const buffers. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2012-12-11 12:48:06 -07:00
José Fonseca	1d35f77228	gallivm,llvmpipe,draw: Support multiple constant buffers. Support 16 (defined in LP_MAX_TGSI_CONST_BUFFERS) as opposed to 32 (as defined by PIPE_MAX_CONSTANT_BUFFERS) because that would make the jit context become unnecessarily large. v2: Bump limit from 4 to 16 to cover ARB_uniform_buffer_object needs, per Dave Airlie. Reviewed-by: Brian Paul <brianp@vmware.com>	2012-12-07 15:03:07 +00:00
José Fonseca	294d8a71ef	llvmpipe: Fix alignment. My understanding and actual implementation of how the pixels are being fetched differed. This fixes bug 57863. Trivial.	2012-12-04 19:33:04 +00:00
James Benton	16f0d70ffe	llvmpipe: Implement PIPE_QUERY_TIMESTAMP and PIPE_QUERY_TIME_ELAPSED. This required an update for the query storage in llvmpipe, there can now be an active query per query type, so an occlusion query can run at the same time as a time elapsed query. Based on PIPE_QUERY_TIME_ELAPSED patch from Dave Airlie. v2: fix up piglits for timers (also from Dave Airlie) a) if we don't render anything the result is 0, so just return the current time b) add missing screen get_timestamp callback. Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: José Fonseca <jfonseca@vmware.com>	2012-12-03 17:21:57 +00:00
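
The packed-format naming described in commit 2151d893fb (shift-based names such as xyzw8888, aliased to array-style names depending on endianness) can be illustrated with a small sketch. This is not code from Mesa: the helper names `pack_xyzw8888` and `memory_bytes` are hypothetical, and Python's `struct` module is used only to show how the same 32-bit value is laid out in memory on little-endian and big-endian hosts.

```python
import struct


def pack_xyzw8888(x, y, z, w):
    """Pack four 8-bit channels with the shift layout quoted in the commit message.

    Hypothetical helper for illustration; not part of Mesa.
    """
    for channel in (x, y, z, w):
        if not 0 <= channel <= 0xFF:
            raise ValueError("channel out of 8-bit range")
    return (x << 0) | (y << 8) | (z << 16) | (w << 24)


def memory_bytes(value, big_endian):
    """Return the bytes a 32-bit value occupies in memory for the given endianness."""
    fmt = ">I" if big_endian else "<I"
    return struct.pack(fmt, value)


packed = pack_xyzw8888(0x11, 0x22, 0x33, 0x44)
little = memory_bytes(packed, big_endian=False)  # byte order in memory: x, y, z, w
big = memory_bytes(packed, big_endian=True)      # byte order in memory: w, z, y, x
```

The packed integer is the same on both hosts, but the byte array seen in memory is reversed, which is why a shift-defined name has to map to a different array-style name on each endianness.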

279 Commits