KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
José Fonseca	542c5b3703	gallivm: Fix trivial out-of-bounds indirection in lp_build_cube_lookup(). Courtesy of clang: src/gallium/auxiliary/gallivm/lp_bld_sample.c:1483:10: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds] tmp[2] = lp_build_swizzle_aos(coord_bld, ddx_ddy[1], swizzle02); ^ ~ src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here LLVMValueRef ddx_ddy[2], tmp[2], rho_vec; ^ src/gallium/auxiliary/gallivm/lp_bld_sample.c:1487:56: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds] rho_vec = lp_build_add(coord_bld, rho_vec, tmp[2]); ^ ~ src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here LLVMValueRef ddx_ddy[2], tmp[2], rho_vec; ^ src/gallium/auxiliary/gallivm/lp_bld_sample.c:1491:56: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds] rho_vec = lp_build_max(coord_bld, rho_vec, tmp[2]); ^ ~ src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here LLVMValueRef ddx_ddy[2], tmp[2], rho_vec; ^	2013-04-26 08:44:37 +01:00
Jerome Glisse	abb96fdea7	winsys/radeon: consolidate tracing into winsys v2 This moves the tracing timeout and printing into winsys and adds a debug environment variable for it (R600_DEBUG=trace_cs). Lots of files touched because of winsys API changes. v2: Do not write lockup file if ib uniq id does not match last one Signed-off-by: Jerome Glisse <jglisse@redhat.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-25 18:36:31 -04:00
Tom Stellard	53fbae7eac	r600g/compute: Removed unused and untested code There was a lot of code in evergreen_compute_internal.c that was not being used at all and most of it was duplicating code from other parts of the driver. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-04-25 13:32:22 -07:00
Tom Stellard	f986087d5c	r600g/compute: Use a constant buffer to store kernel parameters v2 v2: - Fix usage of set_constant_buffer() - Fix typo in comment Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-25 13:32:17 -07:00
Tom Stellard	ffadc71afb	r600g: Add evergreen_emit_cs_constant_buffers() v2 v2: - Bump R600_NUM_ATOMS Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-25 13:25:00 -07:00
Tom Stellard	83a00a1de8	r600g/compute: Don't use radeon_winsys::buffer_wait() after dispatching a kernel The state tracker should be responsible for waiting for the kernel to finish. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-04-25 13:24:51 -07:00
Tom Stellard	09e47f7a25	r600g/compute: Fix input buffer size calculation Buffer size should be in bytes not dwords. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-04-25 13:24:24 -07:00
Rob Clark	73de07cbbc	freedreno: use writecombine buffers Better than uncached for writes, which are common for vertex buffer upload, etc. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-25 15:10:56 -04:00
Rob Clark	f706d4d340	freedreno: don't patch and re-emit same shader as much New textures or vertex buffers don't always require patching and re-emitting the shaders. So do a better job of figuring out when we actually have to patch the shader. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-25 15:10:56 -04:00
José Fonseca	12096f334b	draw: Yield zeros for LLVM fetches of non-existing vertex elements. If a bug in an app/state-tracker causes vertex buffer to fetch vertex elements that are not bound, simply return zeros instead of crashing. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-25 16:16:21 +01:00
José Fonseca	28e6a272fc	trace: Only close trace files on exit. Many applications don't exit cleanly, others may create and destroy a screen multiple times, so we only write </trace> tag and close at exit time.	2013-04-25 14:18:33 +01:00
José Fonseca	74d1153c9c	graw: Set the vertex shader constant buffer. We were setting the fragment shader, which wasn't needed.	2013-04-25 14:06:50 +01:00
José Fonseca	e88a1dba09	graw: Simple utilities to dump and disassemble TGSI tokens. Useful for core dumps, where calling tgsi_dump() from gdb is not an alternative.	2013-04-25 13:03:06 +01:00
José Fonseca	1687932d2b	scons: Support clang. clang supports most gcc options / extensions, with some exceptions. The biggest advantage of using clang is that compilation times are much shorter. One can tell scons to use clang when building by invoking it as CC=clang CXX=clang++ scons libgl-xlib	2013-04-25 11:59:01 +01:00
José Fonseca	f0c296773d	util/u_sse: Fix _mm_shuffle_epi8 prototype for clang. Clang does not support __artificial__. Instead match precisely what's in the clang headers.	2013-04-25 11:59:01 +01:00
José Fonseca	45a60e2e7a	scons: Remove redundant code. -fvisibility=hidden is already elsewhere for the whole tree.	2013-04-25 11:59:01 +01:00
Rob Clark	49a7624973	freedreno: fix bogus IMM const reg index We were assigning incorrect const register for immediates, and potentially writing immediate const to the wrong location. This fixes an incorrect-rendering bug with xonotic. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-24 21:09:46 -04:00
Rob Clark	9495ee12c6	freedreno: clear fixes and debugging Set a few extra registers to make sure we are in proper state for clearing. And also add some debug options to mark all state dirty in clear and gmem operations to aid in debugging. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-24 21:09:46 -04:00
Rob Clark	d5d6ec8843	freedreno: fix texture fetch type There is a bit we need to set for 2D vs 3D fetch, to tell the hw whether there are two or three valid input components. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-24 21:09:46 -04:00
Rob Clark	d086bb22bc	freedreno: fix temp register usage The previous approach of using the dst register as an intermediate temporary doesn't work in a lot of cases. For example, if the dst register is the same as one of the src registers. For now, just simplify it and always allocate a new register to use as an intermediate. In some cases this will result in more registers used than required. I think the best solution would be to implement an optimization pass to reduce the number of registers used, which would also solve the problem we have now of not being able to use GPRs that are assigned for TGSI_FILE_INPUT. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-24 21:09:46 -04:00
Rob Clark	7a837da556	freedreno: add noop driver It is useful for debugging. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-24 21:09:46 -04:00
Rob Clark	eec37f1cdc	freedreno: use u_math macros/helpers more Get rid of a few self-defined macros: ALIGN() -> align() min() -> MIN2() max() -> MAX2() Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-24 21:09:46 -04:00
Rob Clark	38d8b02eba	freedreno: implement fd_screen_destroy() Oops, didn't notice that I had left it stubbed out. Also, make things fail a bit more gracefully when things go wrong. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-24 21:09:46 -04:00
Rob Clark	a64e2d9d9f	freedreno: set SWAP bit based on format Really this should be set based on buffer format, not on color vs depth/stencil. Probably there should be more formats that set the bit as we add support for more render target formats. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-24 21:09:46 -04:00
Tom Stellard	d9a32b84e3	radeon/llvm: Fix segfault with a specific libelf implementation The libelf implementation that is distributed here: http://www.mr511.de/software/english.html requires calling elf_version() prior to calling elf_memory() Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-24 16:51:25 -07:00
Alex Deucher	5bbeae7a3d	r600g: use CP DMA for buffer clears on evergreen+ Lighter weight than using streamout. Only evergreen and newer asics support embedded data as src with CP DMA. Reviewed-by: Jerome Glisse <jglisse@redhat.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-04-24 18:54:31 -04:00
Tom Stellard	f64058803a	r600g/llvm: Pass struct r600_bytecode to r600_llvm_compile This way we don't need to update the function signature every time we emit a new config value. This also fixes the build with --enable-opencl.	2013-04-24 12:42:41 -04:00
José Fonseca	e29525f79f	winsys/sw/xlib: Prevent shared memory segment leakage. Running piglit with this was causing all sorts of weird stuff to happen to my desktop (Chromium webpages became blank, Qt Creator flickered, etc). I tracked this down to shared memory segment leakage when GL is not shut down properly. The segments can be seen by running `ipcs` and looking for nattch==0. This change fixes this by calling shmctl(IPC_RMID) soon after creation (which does not remove the segment immediately, but simply marks it for removal when no more processes are attached). This matches src/mesa/drivers/x11/xm_buffer.c behaviour. v2: - move shmctl(IPC_RMID) after XShmAttach() for *BSD, per Chris Wilson - remove stray debug printfs, spotted by Ian Romanick NOTE: This is a candidate for stable branches. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-24 16:54:58 +01:00
Zack Rusin	1a87473998	draw/gs: preserve leading vertex info for gs We need to handle the leading vertex information when assembling primitives for the geometry shader otherwise the resulting triangles will have vertices at incorrect input locations. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-23 06:17:59 -04:00
Christian König	c5c754d184	radeonsi: cleanup disabling tiling for UVD v3 Should fix: https://bugs.freedesktop.org/show_bug.cgi?id=63702 v2: add a comment that this is just a workaround v3: fix typo in comment Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-24 11:07:26 +02:00
Kenneth Graunke	f0cb66b699	mesa: Restore 78-column wrapping of license text in C++-style comments. The previous commit introduced extra words, breaking the formatting. This text transformation was done automatically via the following shell command: $ git grep 'THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY' | sed 's/:.$//' | xargs -I {} sh -c 'vim -e -s {} < vimscript2 where 'vimscript2' is a file containing: /THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY/;/^ $/ !fmt -w 78 -p '// ' :wq Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-23 22:07:12 -07:00
Kenneth Graunke	3d8d5b298a	mesa: Restore 78-column wrapping of license text in C-style comments. The previous commit introduced extra words, breaking the formatting. This text transformation was done automatically via the following shell command: $ git grep 'THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY' | sed 's/:.$//' | xargs -I {} sh -c 'vim -e -s {} < vimscript where 'vimscript' is a file containing: /THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY/;/\\// !fmt -w 78 -p ' * ' :wq Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-23 22:07:09 -07:00
Kenneth Graunke	96ff2edc73	mesa: Add "OR COPYRIGHT HOLDERS" to license text disclaiming liability. This brings the license text in line with the MIT License as published on the Open Source Initiative website: http://opensource.org/licenses/mit-license.php Generated automatically by the following shell command: $ git grep 'THE AUTHORS BE LIABLE' | sed 's/:.*$//g' | xargs -I '{}' \ sed -i 's/THE AUTHORS/THE AUTHORS OR COPYRIGHT HOLDERS/' {} This introduces some wrapping issues, to be fixed in the next commit. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-23 22:07:06 -07:00
Kenneth Graunke	dd404bc94f	mesa: Change "BRIAN PAUL" to "THE AUTHORS" in license text. Generated automatically by the following shell command: $ git grep 'BRIAN PAUL BE LIABLE' | sed 's/:.*$//g' | xargs -I '{}' \ sed -i 's/BRIAN PAUL/THE AUTHORS/' {} The intention here is to protect all authors, not just Brian Paul. I believe that was already the sensible interpretation, but spelling it out is probably better. More practically, it also prevents people from accidentally copy & pasting the license into a new file which says Brian is not liable when he isn't even one of the authors. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-23 22:06:38 -07:00
José Fonseca	2737abb44e	gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center. Squashed commit of the following: commit 04c5fa2cbb8e89d6f2fa5a75af1cca03b1f6b852 Author: José Fonseca <jfonseca@vmware.com> Date: Tue Apr 23 17:37:18 2013 +0100 gallium: s/lower_left_origin/bottom_edge_rule/ commit 4dff4f64fa83b9737def136fffd161d55e4f1722 Author: José Fonseca <jfonseca@vmware.com> Date: Tue Apr 23 17:35:04 2013 +0100 gallium: Move diagram to docs. commit 442a63012c8c3c3797f45e03f2ca20ad5f399832 Author: James Benton <jbenton@vmware.com> Date: Fri May 11 17:50:55 2012 +0100 gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center. This change is necessary to achieve correct results when using OpenGL FBOs. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-23 19:42:47 +01:00
Marek Olšák	b692076420	r600g: initialize CMASK and HTILE with the GPU using streamout This fixes a crash when a resource cannot be mapped to the CPU's address space because it's too big. This puts a global pipe_context in r600_screen, which is guarded by a mutex, so that we can use pipe_context when there isn't one around. Hopefully our multi-context support is solid. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> NOTE: This is a candidate for the 9.1 branch.	2013-04-23 20:26:20 +02:00
Marek Olšák	1ba46bbb4c	gallium/u_blitter: implement buffer clearing Although this might be useful for ARB_clear_buffer_object, I need it for initializing resources in r600g. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> v2: comment cleanups NOTE: This is a candidate for the 9.1 branch.	2013-04-23 20:26:20 +02:00
Vincent Lejeune	edd90a19ca	r600/llvm: Read stacksize from config header	2013-04-23 19:52:29 +02:00
Vincent Lejeune	a7f73f5155	/bin/bash: q: command not found	2013-04-23 19:52:02 +02:00
Tom Stellard	a0c8942bb4	radeon/llvm: Fix build with LLVM >= r180063	2013-04-23 11:53:05 -04:00
Tom Stellard	ead4db420e	gallivm: Fix build with LLVM >= r180063	2013-04-23 11:53:05 -04:00
Zack Rusin	1fb8c3ce55	draw: use the prim count for ia primitives Number of vertices to fetch doesn't always equal the number of input vertices. To correctly compute the number of IA primitives we need to use the total number of input vertices, not only those that need to be fetched. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-22 20:36:07 -04:00
Zack Rusin	76587d2e5e	tgsi/scan: set correct input limits for geometry shader TGSI geometry shader input declarations are of the IN[][2] format and the dimensions of the array have to be deduced from the input primitive property. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-22 20:36:07 -04:00
Zack Rusin	913ed25f18	draw: add code to reset instance dependent data We want to be able to reset certain parts of the pipeline, in particular the input primitive index, but only either with separate invocations of the draw_vbo or new instances. In all other cases (e.g. new invocations due to primitive restart) that data needs to be preserved. Add a function through which we can reset instance dependent data. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-22 20:36:07 -04:00
Zack Rusin	2aad06844f	softpipe: fix streamout with an empty geometry shader Same approach as in the llvmpipe, if the geometry shader is null and we have stream output then attach it to the vertex shader right before executing the draw pipeline. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-22 20:36:07 -04:00
José Fonseca	7c1bf8e381	gallium: Add a new clip_halfz rasterizer state. gl_rasterization_rules lumps too many different flags. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-22 18:39:06 +01:00
José Fonseca	c0538860bf	gallivm: Fix assignment of unsigned values to OUT register. TEMP is not the only register file that accepts unsigned. OUT does too. Actually, what determines the appropriate type of the destination value is not the opcode, but rather the register. Also cleanup/simplify code. Add a few more asserts, but also make code more robust by handling it gracefully if an assert fails. This fixes segfault / assertion in the included vert-uadd.sh graw shader. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-22 18:23:42 +01:00
José Fonseca	9fb5b2f45c	Revert "gallivm: Emit vector selects." It caused numerous regressions (LLVM 3.1) in blending. In particular: - lp_test_blend type=u8nx16 rgb_func=sub rgb_src_factor=zero rgb_dst_factor=inv_src_color alpha_func=rev_sub alpha_src_factor=one alpha_dst_factor=const_color ... MISMATCH Src: 0 0 0 b5 49 29 0 a2 0 21 de 0 c3 1b ec 0 Src1: 2d 85 14 0 f8 0 79 a1 99 0 d8 0 59 16 0 0 Dst: 0 a9 97 0 c0 0 78 0 0 8b aa f0 bd 0 78 f6 Con: 7d 0 c0 0 0 bb 77 0 0 0 50 0 40 51 0 0 Res: 0 0 0 0 0 29 0 0 0 0 c8 0 97 1b e3 0 Ref: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 type=u8nx16 rgb_func=max rgb_src_factor=one rgb_dst_factor=inv_const_color alpha_func=min alpha_src_factor=zero alpha_dst_factor=inv_src1_alpha ... MISMATCH Src: d 0 0 e9 0 37 35 f0 62 0 0 b2 e9 f7 0 5c Src1: 8f 0 bf 0 a8 5 0 0 c4 0 d7 7 92 a 0 17 Dst: cb 0 1e 0 0 0 19 8e 0 4d 0 0 0 0 3 46 Con: aa 5a 5f 8f 0 0 bc 92 0 88 0 0 b7 8a c0 88 Res: 44 0 13 0 0 0 7 8e 0 24 0 0 0 0 1 40 Ref: 44 0 13 0 0 37 35 0 62 24 0 0 e9 f7 1 0 This reverts commit `1e266c7ef0`.	2013-04-21 09:07:19 +01:00
José Fonseca	d8a4c4c524	llvmpipe: verify function on blend test.	2013-04-21 08:53:31 +01:00
José Fonseca	a79990bec0	llvmpipe: Don't support Z32_FLOAT_S8X24_UINT texture sampling either. We don't support it, and the u_format fallback doesn't work for zs formats. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-20 23:25:36 +01:00
José Fonseca	c08b04992a	llvmpipe: Ignore depth-stencil state if format has no depth/stencil. Prevents assertion failures inside the driver for such state combinations. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-20 23:25:36 +01:00
José Fonseca	f701a5a0fe	gallivm: Disable LLVM 2.7 workaround on other versions. 2.7 was a particularly trouble-ridden release. Furthermore, the bug can no longer be reproduced since the first_level state was taken into account. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-20 23:25:36 +01:00
José Fonseca	1e266c7ef0	gallivm: Emit vector selects. They are supported on LLVM 3.1, at least on x86. (I haven't tested on PPC though.) Actually lp_build_linear_mip_levels() already has been emitting them for some time. This avoids intrinsics, which tend to be an obstacle for certain optimization passes. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-20 23:25:36 +01:00
Rob Clark	26b39df08f	freedreno: move ir -> ir2 There will be a new IR for a3xx, which has a very different shader ISA (more scalar oriented). So rename to avoid conflicts later when I start adding a3xx support to the gallium driver. Signed-off-by: Rob Clark <Rob Clark robdclark@freedesktop.org>	2013-04-20 17:59:41 -04:00
Rob Clark	d8134792ae	freedreno: cleanup some cruft left over from fdre The standalone shader assembler needed some meta-data to know about attributes/varyings/etc, to do the shader linkage. We don't need these parts with gallium/tgsi, so just get rid of it. Signed-off-by: Rob Clark <Rob Clark robdclark@freedesktop.org>	2013-04-20 17:31:47 -04:00
Roland Scheidegger	85974e5fee	gallivm: implement switch opcode Should be able to handle all things which make this tricky to implement. Fallthroughs, including most notably into/out of default, should be handled correctly but are quite a mess. If we see largely unoptimized switches in the wild should probably think about some "real" switch optimization pass, e.g. things like this: switch case1 someinst brk case2 default case3 someinst brk case4 someinst endswitch are legal, but the pointless case2/case3 statements not only cause condition evaluation but will turn this into a "fake" fallthrough case (because mask and defaultmask are already updated for case2 when default is encountered) requiring executing code twice. If default is at the end though, there's never any code re-execution, and if that's not the case if there's no fallthrough in (not even a fake one) and out of default there's no code re-execution either. v2: add comments, and use enum for break type instead of magic boolean. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-20 02:27:53 +02:00
Roland Scheidegger	8f5d4283c0	gallivm: use uint build context for mask instead of float Unsurprisingly no one was using it except for grabbing builder. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-20 02:27:53 +02:00
Roland Scheidegger	107550e71a	gallivm/tgsi: fix up breakc It seems there was a typo in gallivm breakc handling (I am actually still not sure it is really needed but otherwise that statement really should go away). Also fix the wrong src argument type, even though they weren't really used. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-20 02:27:53 +02:00
Roland Scheidegger	e8d1b26a82	svga: remove TGSI_OPCODE_BREAKC instruction translation While initially that opcode probably was meant for something along the lines of sm3 break_comp it has never worked that way (not even the argument count was right) and now the opcode has quite different semantics so just remove it. (Discovered by Jose Fonseca)	2013-04-20 02:27:53 +02:00
Roland Scheidegger	794579105a	gallium: document breakc and switch/case/default/endswitch docs were missing, especially the opcode-from-hell switch however is anything but obvious. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-20 02:27:53 +02:00
Roland Scheidegger	443950c6aa	gallivm: increase nesting limit to 66 This is still not really correct, since at least for sm 4.0 the nesting limit is 64 per subroutine, and subroutine nesting itself has a limit of 32, so since we have a flat stack we'd need 32*64. But this should probably be better fixed with per-subroutine stacks, since otherwise these structures get really big (like 100kB for the lp_exec_mask). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-20 02:27:53 +02:00
Zack Rusin	12eab7cc56	draw: implement primitive assembler Input assembler needs to be able to decompose adjacency primitives into something that can be understood by the rest of the pipeline. The specs say that the adjacency primitives are only visible in the geometry shader, for everything else they need to be decomposed. Which in most of the cases is not an issue, because the geometry shader always decomposes them for us, but without geometry shader we were passing unchanged adjacency primitives to the rest of the pipeline and causing crashes everywhere. This commit introduces a primitive assembler which, if geometry shader is missing and the input primitive is one of the adjacency primitives, decomposes them into something that the rest of the pipeline can understand. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-18 11:51:22 -07:00
Zack Rusin	e4752d0f56	util/prim: fix decomposed counts for adjacency primitives Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-18 11:37:37 -07:00
Zack Rusin	c1299204ad	draw/so: uses the correct index with the pre clipped coordinates pre_clip_pos is a float[4] we just used (*float)[4] to be able to jump within the array of vertex_headers with it. So if the idx happened to be anything but 0, we'd actually read from some garbage in memory. Change it to just be a simple pointer instead of casting it to something that it's not. As suggested by Jose. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-18 11:36:38 -07:00
Eric Anholt	ea6cf2b686	mesa: Use quotes on bool driconf options to prevent stdbool.h breakage. Since stdbool.h's "true" and "false" are #defines, they got expanded when used as macro arguments, and that expanded value was stored in the XML string, producing XML that driconf would then fail to parse. Currently no drivers included stdbool along with driconf, but I keep accidentally doing so on intel as we move towards using normal C. v2: rebase on master. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2013-04-19 10:10:22 -07:00
Brian Paul	cecbfce5eb	svga: whitespace, comment fixes in svga_pipe_query.c	2013-04-19 10:04:11 -06:00
Brian Paul	ef1b2b8da7	svga: whitespace, comment fixes in svga_pipe_fs/vs.c	2013-04-19 10:03:56 -06:00
José Fonseca	dbb690872e	gallivm: Fix half floats with MCJIT. Prevents: LLVM ERROR: Cannot select: intrinsic %llvm.x86.vcvtph2ps.128	2013-04-19 10:13:19 +01:00
Jerome Glisse	d0e9aaa31c	radeonsi: add support for compressed texture v2 Most tests pass; the remaining issues are with border color and swizzle. Based on ircnick<maelcum> patch. v2: Restaged commit hunk Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-04-18 17:25:38 -04:00
Jerome Glisse	dc21e30a62	radeonsi: add 2d tiling support for texture v3 v2: Remove left over code v3: Properly restage the commit so hunks of the first one are not in the second one. Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-04-18 17:25:38 -04:00
Vadim Girlin	f732036f12	gallium: handle drirc disable_glsl_line_continuations option NOTE: This is a candidate for the 9.1 branch Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-04-19 01:05:03 +04:00
José Fonseca	b72ff373fb	llvmpipe: Take into consideration all current constant buffers when mapping. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-04-18 20:48:12 +01:00
Christoph Bumiller	78eaaff696	nv50: add remaining RGBX formats Not all are supported as render targets. The state tracker fallback of using RGBA instead of RGBX currently fails for blending, we could work around this by clearing their alpha to 1 and modifying the color mask to disable writing alpha.	2013-04-18 21:04:22 +02:00
Christoph Bumiller	729abfd0f5	st/mesa: optionally apply texture swizzle to border color v2 This is the only sane solution for nv50 and nvc0 (really, trust me), but since on other hardware the border colour is tightly coupled with texture state they'd have to undo the swizzle, so I've added a cap. The dependency of update_sampler on the texture updates was introduced to avoid doing the apply_depthmode to the swizzle twice. v2: Moved swizzling helper to u_format.c, extended the CAP to provide more accurate information.	2013-04-18 20:35:40 +02:00
Christoph Bumiller	246ff8f887	nv50: set BORDER_COLOR_SRGB in sampler objects	2013-04-18 20:35:40 +02:00
Christoph Bumiller	2d5d054752	nv50: fix 4th component of Lx_SINT/UINT formats	2013-04-18 20:35:40 +02:00
Tom Stellard	3b20170b2f	r600g: Fix build with --enable-opencl	2013-04-18 11:24:48 -07:00
Roland Scheidegger	50cbcf0c46	gallivm: change cubemaps / derivatives handling, take 55 Turns out the previous "fix" for handling per-pixel face selection and derivatives didn't work out that well - the derivatives were wrong by quite a bit, in theory transformation of the derivatives into cube space should work, but would be _a lot_ more work than the "simplified" transform used. So, for explicit derivatives, I'm just giving up and go back to not honoring them. For implicit derivatives (and the fake explicit ones) however we try something a little different, we just calculate rho as we would for a 3d texture, that is after scaling the coords by the inverse major axis. This gives the same results as calculating the derivs after projection of the coords to the same face as long as all pixels hit the same face (and only without rho_no_opt, otherwise it should be a bit worse). And when not all pixels are hitting the same face, the results aren't so hot but not catastrophically bad (I believe not off by more than a factor of 2 without no_rho_approx and not more than sqrt(2) with no_rho_approx). I think this is better than just picking the wrong face but who knows... Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-18 17:06:43 +02:00
Roland Scheidegger	0d07f05ee8	gallivm: Add no_rho_approx debug option This will calculate rho correctly as sqrt(max((ds/dx)^2 + (dt/dx)^2 + (dr/dx)^2, (ds/dy)^2 + (dt/dy)^2 + (dr/dy)^2)) instead of max(|ds/dx|,|dt/dx|,|dr/dx|,|ds/dy|,|dt/dy|,|dr/dy|) (for 3 coords - 2 coords work analogously, for 1 coord there's no point doing the exact version), for both implicit and explicit derivatives. While such approximation seems to be allowed in OpenGL some APIs may be less forgiving, and the error can be quite large (sqrt(2) for 2 coords, sqrt(3) for 3 coords so wrong by nearly one mip level in the latter case). This also helps to single out "real" bugs from "expected" ones, so it is debug only (though at least combined with no_brilinear I didn't really see much of a performance difference but only tested with a debug build - at least with implicit mipmaps the instruction count is almost exactly the same though the instructions are more complex (1 sqrt and mul/adds instead of and/max mostly)). The code when the option isn't set stays exactly the same. v2: rename no_rho_opt to no_rho_approx. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-18 17:04:01 +02:00
José Fonseca	a930136977	llvmpipe: Support half integer pixel center fs coord. Tested with graw/fs-fragcoord 2/3, and piglit glsl-arb-fragment-coord-conventions. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-18 14:18:25 +01:00
José Fonseca	b191be52f2	llvmpipe: Remove the static interpolation. No longer used. If we ever want the old behavior we can run a loop unroller pass. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-18 14:18:22 +01:00
José Fonseca	6e833d4d09	gallivm: Drop pos arg from lp_build_tgsi_soa. Never used. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-18 14:18:13 +01:00
Stuart Abercrombie	1a59cc777f	i915g: Release old fragment shader sampler views with current pipe We were trying to use a destroy method from a deleted context. This fix is based on what's in the svga driver. Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>	2013-04-17 18:15:12 -07:00
Zack Rusin	8e7f7e9693	draw/so: respect leading/provoking vertex info we were ignoring leading/provoking vertex settings which was breaking decomposition of some strips. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-17 15:43:50 -07:00
Zack Rusin	6bb217a489	softpipe/so: use the correct variable for reporting stream out we were using the wrong vars, reporting incorrect stream output statistics. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-17 15:28:54 -07:00
Zack Rusin	cb58c79efb	gallivm/gs: fix indirect addressing in geometry shaders We were always treating the vertex index as a scalar but when the shader is using indirect addressing it will be a vector of indices for each channel. This was causing some nasty crashes inside LLVM. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-17 15:28:54 -07:00
Brian Paul	02039066a8	st/wgl: fix issue with SwapBuffers of minimized windows If a window's minimized we get a zero-size window. Skip the SwapBuffers in that case to avoid some warning messages with the VMware svga driver. Internal bug #996695 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-17 16:23:19 -06:00
Zack Rusin	f01f754ca1	draw/gs: make sure geometry shaders don't overflow The specification says that the geometry shader should exit if the number of emitted vertices is bigger or equal to max_output_vertices and we can't do that because we're running in the SoA mode, which means that our storing routines will keep getting called on channels that have overflown (even though they will be masked out, but we just can't skip them). So we need some scratch area where we can keep writing the overflown vertices without overwriting anything important or crashing. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-16 23:38:47 -07:00
Zack Rusin	be497ac9d3	draw/gs: Return early if the passed geometry shader is null Can happen if we were using stream output without geometry shader, by returning early we avoid a crash. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-16 23:38:47 -07:00
Zack Rusin	80ee4a407a	draw: implement pipeline statistics in the draw module This is a basic implementation of the pipeline statistics in the draw module. The interface is similar to the stream output statistics and also requires that the callers explicitly enable it. Included is the implementation of the interface in llvmpipe and softpipe. Only softpipe enables the pipeline statistics capability though because llvmpipe is lacking gathering of the fragment shading and rasterization statistics. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-16 23:38:47 -07:00
Zack Rusin	b739376cff	gallivm/gs: fix the end primitive calls The issue with SOA execution and end_primitive opcode is that it can be executed both when we haven't emitted any vertices, in which case we don't want to emit an empty primitive, and when the execution mask is zero and the execution should be skipped. We handled only the latter of those conditions. Now we're combining the execution mask with a mask created from emitted vertices to handle both cases. As a result we don't need the pending_end_primitive flag which was broken because it was static and could be affected by both above mentioned conditions at run-time. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-16 23:38:46 -07:00
Zack Rusin	93627e33cc	tgsi/exec: geometry shaders are executed on a single primitive which means that our execution mask in GS is equal to 1 not 0xf. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-16 23:38:46 -07:00
Zack Rusin	88db6f0a73	tgsi/exec: fix the udiv and umod instructions Same as with llvmpipe: we can't be dividing/moding by zero and we need to make sure that dividing/moding by zero produces 0xffffffff. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-16 23:38:46 -07:00
José Fonseca	b8f6858fcb	gallivm: JIT symbol resolution with linux perf. Details on docs/llvmpipe.html Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-17 16:50:52 +01:00
José Fonseca	35ef27d485	draw: Silence uninitialized var warnings. Trivial.	2013-04-17 16:50:52 +01:00
Vincent Lejeune	2b9ed257c0	r600g/llvm: Use gprcount from llvm	2013-04-17 17:24:29 +02:00
José Fonseca	50b3fc6204	gallium: Disambiguate TGSI_OPCODE_IF. TGSI_OPCODE_IF condition had two possible interpretations: - src.x != 0.0f - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was false for either vertex or fragment shaders - gallivm/llvmpipe - postprocess - vl state tracker - vega state tracker - most old drivers - old internal state trackers - many graw examples - src.x != 0U - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was true for both vertex and fragment shaders - tgsi_exec/softpipe - r600 - radeonsi - nv50 And drivers that use draw module also were a mess (because Mesa would emit float IFs, but draw module supports native integers so it would interpret IF arg as integers...) This sort of works if the source argument is limited to float +0.0f or +1.0f, integer 0, but would fail if source is float -0.0f, or integer in the float NaN range. It could also fail if source is integer 1, and hardware flushes denormalized numbers to zero. But with this change there are now two opcodes, IF and UIF, with clear meaning. Drivers that do not support native integers do not need to worry about UIF. However, for backwards compatibility with old state trackers and examples, it is advisable that native integer capable drivers also support the float IF opcode. I tried to implement this for r600 and radeonsi based on the surrounding code. I couldn't do this for nouveau, so I just shunted IF/UIF together, which matches the current behavior. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> v2: - Incorporate Roland's feedback. - Fix r600_shader.c merge conflict. - Fix typo in radeon, spotted by Michel Dänzer. - Incorporate Christoph Bumiller's patch to handle TGSI_OPCODE_IF(float) properly in nv50/ir.	2013-04-17 10:54:08 +01:00
José Fonseca	f61b7da80e	gallium: Eliminate TGSI_OPCODE_IFC. Never used or implemented. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-17 10:54:08 +01:00
Christian König	13ddf9baf2	r600/uvd: cleanup disabling tiling on pre EG asics Set transfer flag instead of fiddling with the tiling params directly. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-04-16 22:36:51 +02:00
Martin Andersson	4c3ed79566	r600g: Workaround for a hardware bug with nested loops on Cayman There is a hardware bug on Cayman where a BREAK/CONTINUE followed by LOOP_STARTxxx for nested loops may put the branch stack into a state such that ALU_PUSH_BEFORE doesn't work as expected. Work around this by replacing the ALU_PUSH_BEFORE with a PUSH + ALU Fixes piglit tests EXT_transform_feedback/order* v2: Use existing loop count and improve comment v3: [Vadim Girlin] Set jump address for PUSH instructions NOTE: This is a candidate for the 9.1 branch Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-04-16 18:02:11 +04:00
Marek Olšák	8616b224bf	gallium/hud: fix FPS computation for framerate > 4.2k	2013-04-16 13:56:47 +02:00
Marek Olšák	332af88c39	gallium/hud: increase vertex buffer size for background black rectangles Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-16 13:56:47 +02:00
Marek Olšák	0108114619	gallium/hud: update the contents of GALLIUM_HUD=help Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-16 13:56:47 +02:00
Marek Olšák	30284f8892	gallium/hud: remove pipeline-statistics- prefix in query names for the env var string not to be awfully long v2: fix bug in indexing of "name" Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-16 13:56:47 +02:00
Marek Olšák	dfe5367f0f	r600g: implement pipeline statistics query	2013-04-16 13:56:47 +02:00
Marek Olšák	817723baf8	winsys/radeon: use query_value for timestamp, remove query_timestamp	2013-04-16 13:56:47 +02:00
Marek Olšák	413ca78af3	r600g: add a debug flag for printing virtual addresses of resources	2013-04-16 13:56:47 +02:00
Marek Olšák	05fa3595e0	r600g: add a query returning the amount of time spent during bo_map sync.	2013-04-16 13:56:47 +02:00
Matt Turner	b3f1f665b0	build: Get rid of GALLIUM_WINSYS_DIRS configure still uses it to print the enabled winsys. Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:05:55 -07:00
Matt Turner	3a6e548a85	build: Get rid of GALLIUM_TARGET_DIRS configure still uses it to print the enabled targets. Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:05:55 -07:00
Matt Turner	2f7a37d858	build: Build pipe-loader before gallium tests And don't build it from other Makefiles. That's awful, and breaks distclean. Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:05:55 -07:00
Matt Turner	0d3b1b0e2e	build: Get rid of GALLIUM_MAKE_DIRS Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:05:55 -07:00
Matt Turner	69b69b1a0b	build: Stop using GALLIUM_STATE_TRACKERS_DIRS for SUBDIRS configure still uses it to print the enabled state trackers. Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:04:26 -07:00
Matt Turner	70531b4a25	build: Remove GALLIUM_DIRS It's always constant anyway. Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:04:26 -07:00
Matt Turner	d5e9426b96	build: Move src/mapi/mapi/* to src/mapi/ Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:04:25 -07:00
Tom Stellard	d50343dff1	radeonsi: Read config values from the .AMDGPU.config ELF section Instead of emitting configuration values (e.g. number of gprs used) in a predefined order, the LLVM backend now emits these values in register/value pairs. The first dword contains the register address and the second dword contains the value to write. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-15 10:54:30 -07:00
Tom Stellard	9277b04c02	radeon/llvm: Handle ELF formatted binary output from the LLVM backend	2013-04-15 10:54:29 -07:00
Tom Stellard	7782d19cdc	radeon/llvm: Use a struct for storing compiled code	2013-04-15 10:13:10 -07:00
Roland Scheidegger	1d6eb23f2d	gallivm: fix small but severe bug in handling multiple lod level strides Inserting the value for the second quad in the wrong place for the following shuffle. This meant the row or image stride was undefined which is quite catastrophic, can lead to bogus texels fetched or just segfault. This code is only hit for SoA path currently, still surprising it didn't crash more or cause more visible issues (I think llvm used a broadcast shuffle for the undefined parts of the vector, hence the undefined value for the second quad was just the same as that from the first quad, so as long as both quads hit the same mip level everything was fine, and since lower mips always have the same large stride it made it less likely to hit out-of-bound memory in case of differing lods). Note: this is a candidate for stable branches. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-15 15:23:40 +02:00
Francisco Jerez	02b808b08a	clover: Fix usage of incorrect object as destination in clEnqueueCopyBufferToImage. Signed-off-by: Francisco Jerez <currojerez@riseup.net>	2013-04-13 14:24:10 +02:00
Francisco Jerez	1a8ad6c2e3	clover: Define platform class and merge with device_registry. Null platform IDs are OK according to the spec, but some applications have been reported to get paranoid and assume that our NULL platform is unusable. As it doesn't hurt to have device enumeration separate from the rest of the device code (quite the opposite, it makes the code cleaner), make the API use an actual platform object that keeps track of the available devices instead of the former NULL pointer. Reported-and-reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Francisco Jerez <currojerez@riseup.net>	2013-04-13 14:20:16 +02:00
Francisco Jerez	6ace452055	clover: Add missing fields to the module serializer. Signed-off-by: Francisco Jerez <currojerez@riseup.net>	2013-04-13 14:12:49 +02:00
Tom Stellard	c6a86fb563	r300g: Fix bug in OMOD optimization https://bugs.freedesktop.org/show_bug.cgi?id=60503 NOTE: This is a candidate for the stable branches.	2013-04-12 08:33:31 -07:00
Emil Velikov	ac1118d53c	nvc0: set ret variable if launch desc allocation failed Pointed out by gcc nve4_compute.c: In function 'nve4_launch_grid': nve4_compute.c:511:7: warning: 'ret' may be used uninitialized in this function [-Wmaybe-uninitialized] if (ret) ^ Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Edit by Christoph Bumiller: Set it to -1 to indicate failure and only when it's actually required.	2013-04-12 17:15:14 +02:00
Emil Velikov	48bcb94dc3	nvc0: bail out early during nve4_compute_setup() Exit gracefully rather than trying to create a random object, whenever the chipset is unknown Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-04-12 17:10:11 +02:00
Emil Velikov	e28c266682	nvc0: compile nve4_cache_split_name() only in debug build As otherwise it is unused - pointed out by gcc nve4_compute.c:586:20: warning: 'nve4_cache_split_name' defined but not used [-Wunused-function] static const char *nve4_cache_split_name(unsigned value) ^ Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-04-12 17:09:03 +02:00
Emil Velikov	249f3d73cf	nv50/codegen: do not emitATOM() if the subOp is unknown For debug build we'll hit the assert, for release we are going to emit random data as subOp is used uninitialised. Spotted by gcc codegen/nv50_ir_emit_nv50.cpp: In member function 'void nv50_ir::CodeEmitterNV50::emitATOM(const nv50_ir::Instruction*)': codegen/nv50_ir_emit_nv50.cpp:1554:12: warning: 'subOp' may be used uninitialized in this function [-Wmaybe-uninitialized] uint8_t subOp; ^ Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-04-12 17:08:26 +02:00
Christoph Bumiller	4da54c91d2	nvc0: implement multisample textures	2013-04-12 13:02:18 +02:00
Christoph Bumiller	71c1c8a9b8	nvc0: patch up TEX cases with 5 or 6 sources on nve4 Hackishly fixes alignment requirement of 2nd tuple for now.	2013-04-12 11:41:35 +02:00
Christoph Bumiller	2b62ba7cb0	nvc0: fix 2D engine MS2 resolve	2013-04-12 11:41:35 +02:00
Christoph Bumiller	69804c2ab8	nv50,nvc0: add RGBX16/32_FLOAT formats	2013-04-12 11:41:35 +02:00
Dave Airlie	f024c72476	r600g: add get_sample_position support (v3) v2: I rewrote this to use the sample positions properly. v3: rewrite properly to use bitfield to cast back to signed ints Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-04-11 21:09:29 +01:00
Dave Airlie	cc906396c7	gallium: add get_sample_position interface This is to be used to implement glGet GL_SAMPLE_POSITION. Reviewed-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-04-11 21:09:28 +01:00
Dave Airlie	184278a804	r600g: fix two issues in compressed msaa reading code I've no idea when sample_chan would ever be 4 here, but 4 is most definitely wrong, array textures have it as 3 as well. Also the cayman code though unused is obviously wrong. Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-04-11 21:09:27 +01:00
Christian König	5b2855bfe7	radeon/uvd: add UVD implementation v5 Just everything you need for UVD with r600g and radeonsi. v2: move UVD code to radeon subdir, clean up build system additions, remove an unused SI function, disable tiling on SI for now. v3: some minor indentation fix and rebased v4: dpb size calculation fixed v5: implement proper fall-back in case the kernel doesn't support UVD, based on patches from Andreas Boll but cleaned up a bit more. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-04-11 17:10:28 +02:00
Christian König	f91e4d2c9d	radeon/winsys: add uvd ring support to winsys v3 Separated from UVD patch for clarity. v2: sync with next tree for 3.10 v3: as pointed out by Andreas Boll check for drm minor >= 32 http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-3.10-wip Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-04-11 17:10:01 +02:00
Fredrik Höglund	fb69dbb0d1	r600g: Add support for GL_ARB_texture_buffer_range Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-11 00:10:45 +02:00
Marek Olšák	34c3f98641	r600g: fix valgrind warning on Cayman Warning: "Conditional jump or move depends on uninitialised value(s)".	2013-04-10 21:56:51 +02:00
Zack Rusin	fe29f99293	gallivm/tgsi: handle untyped moves both mov and ucmp can be used to move variables of any type. correctly note that about ucmp in the tgsi_info and make sure gallivm can handle that by correctly casting the untyped moves. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-10 12:37:17 -07:00
Zack Rusin	d56f2d5267	gallivm: fix loops and conditionals within GS We were using simple temporaries, without using alloca or phi nodes which meant that on every iteration of the loop our temporaries, which were holding the number of vertices and primitives which were emitted, were being reset to zero. Now we're using alloca to allocate those variables to preserve them across conditionals. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-10 12:33:59 -07:00
Zack Rusin	c1cd19c3b8	llvmpipe: implement PIPE_QUERY_SO_STATISTICS We were missing the implementation of PIPE_QUERY_SO_STATISTICS query, this change implements it on top of the existing facilities. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-10 12:32:56 -07:00
Zack Rusin	7466e0b6c8	gallivm: fix unsigned divide and remainder opcodes We want to both make sure we never divide by zero to not generate sigfpe and that divide by zero is guaranteed to return 0xffffffff. Based on José's idea. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-10 12:31:22 -07:00
Zack Rusin	1ad4a4eeb3	gallivm: fix breakc we break when the mask values are 0, not 1, plus it's a bit comparison not a floating point comparison. This fixes both. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-10 12:25:34 -07:00
Christian König	ccf3e8fc9b	radeonsi: remove sampler writemask v3 v2: fix intrinsic name as well v3: LLVM revision incremented as well Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-10 10:41:29 +02:00
Niels Ole Salscheider	31f14f3def	pipe-loader: Fix out of source build Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>	2013-04-10 09:45:04 +02:00
Brian Paul	acd4fb8b5a	st/osmesa: re-use buffers in OSMesaMakeCurrent() Rather than creating a new buffer each time. Fixes problems found with vtk. Tested-by: Kevin H. Hobbs <hobbsk@ohio.edu>	2013-04-09 18:30:23 -06:00
Christian König	462647453c	st/vdpau: fix subtitle related bug v2 Drawing subtitles didn't increase the dirty area of the surface. Reported and tested by freeedrich on irc. v2: don't clear the surface Signed-off-by: Christian König <christian.koenig@amd.com>	2013-04-09 21:11:32 +02:00
Brian Paul	4ad360133c	softpipe: misc updates to image dumping in softpipe_flush()	2013-04-09 08:27:53 -06:00
Vinson Lee	04ffce3004	tgsi: Ensure struct tgsi_ind_register field Index is initialized. Fixes uninitialized scalar variable defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-08 18:59:34 -07:00
Martin Andersson	a8246927e3	r600g: Fix UMAD on Cayman The multiplication part of tgsi_umad did not work on Cayman, because it did not populate the correct vector slots. This fixed hardlocks in the EXT_transform_feedback/order tests. NOTE: This is a candidate for the stable branches. (might not be easy to cherry-pick though) Signed-off-by: Marek Olšák <maraeo@gmail.com>	2013-04-09 03:09:37 +02:00
Vincent Lejeune	5019af2145	r600g/llvm: Add support for native isa for pre EG This fixes bug 62756 : https://bugs.freedesktop.org/show_bug.cgi?id=62756#c12	2013-04-08 15:11:59 +02:00
Marek Olšák	eff66bc9f8	gallium/util: add const to a parameter of util_max_layer	2013-04-06 23:57:15 +02:00
Tom Stellard	302f53dc20	radeonsi: Add compute support v3 v2: - Only dump shaders when env variable is set. v3: - Don't emit VGT registers Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-05 18:43:34 -04:00
Tom Stellard	4f7fe2cf2c	radeonsi: Set TCL1_ACTION_ENA when invalidating the texture cache Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-05 18:43:34 -04:00
Tom Stellard	0ccf82c557	radeonsi: Remove si_pm4_inval_vertex_cache() This function is a holdover from r600g and is identical to si_pm4_inval_texture_cache(), so it is not needed. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-05 18:43:34 -04:00
Tom Stellard	c5e5b3401c	gallium: PIPE_COMPUTE_CAP_IR_TARGET - allow drivers to specify a processor v2 This target string now contains four values instead of three. The old processor field (which was really being interpreted as arch) has been split into two fields: processor and arch. This allows drivers to pass a more detailed description of the hardware to compiler frontends. v2: - Adapt to libclc changes Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-04-05 18:43:34 -04:00
Wladimir	1a868acbec	util: add ETC as compressed format Add UTIL_FORMAT_LAYOUT_ETC to util_format_is_compressed. It was missing. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-05 16:14:51 -06:00
Brian Paul	de99b6d117	gallium/u_blitter: fix is_blit_generic_supported() stencil checking Don't check if there's sampler support for stencil if we're not going to actually blit/copy stencil values. Fixes the case where we mistakenly said we can't support a blit of depth values from S8Z24 to X8Z24. Also, rename the is_stencil variable to dst_has_stencil to improve readability. NOTE: This is a candidate for the stable branches. Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-05 16:14:51 -06:00
Rob Clark	aac7f06ad8	freedreno: use autogenerated register defs Switch to use the envytools generated headers for register/bitfield definitions. This is the first step in preparing to add a3xx support, since it avoids having conflicting names for a3xx and a2xx registers. And since I'm using envytools for a3xx it is simpler to just use it for everything. This shouldn't cause any functional change, it is really just a lot of renaming. Signed-off-by: Rob Clark <robdclark@gmail.com>	2013-04-05 14:33:16 -04:00
José Fonseca	1fefc65d20	st/wgl: Install our windows message hook to threads created before the ICD is loaded. Otherwise we will not receive destroy windows events, causing framebuffers to leak. This happens particularly with java and jogl. Tested with java + jogl, MATLAB. VMware Internal Bug Number: 1013086. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-05 18:27:54 +01:00
Adam Jackson	ca70de9bd2	llvmpipe: Work without sse2 if llvm is new enough At least on llvm 3.2 this appears to work fine. Tested on an Athlon XP 2600+, which has sse and 3dnow but not sse2. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2013-04-05 11:32:53 -04:00
Jerome Glisse	b8998f976e	winsys/radeon: add command stream replay dump for faulty lockup v3 Build time option, set RADEON_CS_DUMP_ON_LOCKUP to 1 in radeon_drm_cs.h to enable it. When enabled after each cs submission the code will try to detect lockup by waiting on one of the buffers of the cs to become idle, after a timeout it will consider that the cs triggered a lockup and will write a radeon_lockup.c file in current directory that has all information for replaying the cs. To build this file : gcc -O0 -g radeon_lockup.c -ldrm -o radeon_lockup -I/usr/include/libdrm v2: Add radeon_ctx.h file to mesa git tree v3: Slightly improve dumped file for easier editing, only dump first faulty cs Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-04-05 10:22:05 -04:00
Brian Paul	5192262833	st/xlib: add HUD support for xlib/GLX For the softpipe and llvmpipe drivers. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-04 17:00:42 -06:00
Brian Paul	f5071783c1	gallium/hud: add GALLIUM_HUD_PERIOD env var To set the graph update rate, in seconds. The default update rate has also been changed to 1/2 second. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-04 17:00:42 -06:00
Brian Paul	6211c45186	gallium/hud: initialize sampler state The default wrap mode (PIPE_TEX_WRAP_REPEAT) is incompatible with unnormalized texcoords (at least for softpipe). v2: use PIPE_TEX_WRAP_CLAMP_TO_EDGE Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-04 17:00:42 -06:00
Roland Scheidegger	9eef86bb55	gallivm: some minor cube map cleanup The ar_ge_as_at variable was just very very confusing since the condition was actually the other way around (as_at_ge_ar). So change the condition (and the selects depending on it) to match the variable name. And also change the chosen major axis in case the coord values are the same. OpenGL doesn't care one bit which one is chosen in this case but it looks like dx10 would require z chosen over y, and y chosen over x (previously did x chosen over y, y chosen over z). Since it's all the same effort just honor dx10's wishes. (Though actually, for some preferred orderings, we could save one (or two with derivatives) selects since the tnewx and tnewz (and the corresponding dmax values) are the same.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-04 23:22:10 +02:00
Zack Rusin	be9a42e980	llvmpipe: implement ucmp and add a test for it Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-04 12:09:55 -07:00
Paul Berry	5db2249493	Avoid spurious GCC warnings in STATIC_ASSERT() macro. GCC 4.8 now warns about typedefs that are local to a scope and not used anywhere within that scope. This produced spurious warnings with the STATIC_ASSERT() macro (which used a typedef to provoke a compile error in the event of an assertion failure). This patch switches to a simpler technique that avoids the warning. v2: Avoid GCC-specific syntax. Also update p_compiler.h. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-04 09:52:18 -07:00
Erik Faye-Lund	456f40e18d	freedreno: document debug flag Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Brian Paul <brianp@vmware.com>	2013-04-04 10:41:50 -06:00
Brian Paul	e95514c0ea	st/wgl: add HUD support v2: fix a few minor issues spotted by Jose. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-04 10:41:35 -06:00
Brian Paul	0c1dcf906d	st/wgl: make stw_current_context() non-static Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-04 08:50:16 -06:00
Brian Paul	92e5e45ff1	util: add debug_memory_check_block(), debug_memory_tag() The former just checks that the given block is valid by checking the header and footer. The latter sets the memory block's tag. With extra debug code, we can use that for monitoring/checking particular allocations. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-04 08:50:15 -06:00
Brian Paul	a408ea9692	gallium/hud: replace malloc w/ MALLOC To match the FREE() call used later. Fixes things on Windows. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-04 08:50:15 -06:00
Vincent Lejeune	9276961223	r600g/llvm: Workaround for wrong tex.offset_*	2013-04-04 16:03:04 +02:00
Roland Scheidegger	ce5096a0a9	gallivm: honor explicit derivatives values for cube maps. This is trivial now, though need to make sure we pass all the necessary derivative values (which is 3 each for ddx/ddy not 2). Passes piglit arb_shader_texture_lod-texgradcube test. v2: add the forgotten abs() for all incoming derivatives (discovered by new piglit arb_shader_texture_lod-texgradcube test, though more by luck as it was failing only for exactly one pixel...). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-04 01:03:42 +02:00
Roland Scheidegger	f621015cb5	gallivm: do per-pixel cube face selection (finally!!!) This proved to be tricky, the problem is that after selection/mirroring we cannot calculate reasonable derivatives (if not all pixels in a quad end up on the same face the derivatives could get "randomly" exceedingly large). However, it is actually quite easy to simply calculate the derivatives before selection/mirroring and then transform them similar to the cube coordinates (they only need selection/projection, but not mirroring as we're not interested in the sign bit, of course). While there is a tiny bit more work to do (need to calculate derivs for 3 coords instead of 2, and additional selects) it also simplifies things somewhat for the coord selection itself (as we save some broadcast aos shuffles, and we don't need to calculate the average vector) - hence if derivatives aren't needed this should actually be faster. Also, this has the benefit that this will (trivially) work for explicit derivatives too, which we completely ignored before that (will be in a separate commit for better trackability). Note that while the way for getting rho looks very different, it should result in "nearly" the same values as before (the "nearly" is only because before the code would choose the face based on an "average" vector and hence the derivatives calculated according to this face, where now (for implicit derivatives) the derivatives are projected on the face selected for the first (top-left) pixel in a quad, so not necessarily the same face). The transformation done might not quite be state-of-the-art, calculating length(dx,dy) as max(dx,dy) certainly isn't either but this stays the same as before (that is I think a better transform would _somehow_ take the "derivative major axis" into account so that derivative changes in the major axis wouldn't get ignored). 
Should solve some accuracy problems with cubemaps (can easily be seen with the cubemap demo when switching wrapping/filtering), though we still don't do seamless filtering to fix it completely (so not per-sample but per-pixel is certainly better than per-quad and already sufficient for accurate results with nearest tex filter). As for performance, it seems to be a tiny bit faster too (maybe 3% or so with cubemap demo). Which I'd have expected with nearest/nearest filtering where this will be less instructions, but the difference seems to actually be larger with linear/linear_mipmap_linear where it is slightly more instructions, probably the code appears less serialized allowing better scheduling (on a sandy bridge cpu). It actually seems to be now at least as fast as the old path using a conditional when using 128bit vectors too (that is probably more a result of testing with a newer cpu though), for now that old path is still there but unused. No piglit regressions. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-04 01:03:42 +02:00
Roland Scheidegger	bdfbeb9633	gallivm: minor rho calculation optimization for 1 or 3 coords Using a different packing for the single coord case should save a shuffle. Plus some minor style fixes. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-04 01:03:42 +02:00
Roland Scheidegger	067a0ae420	gallivm: use f16c hw support for float->half and half->float conversion Should be way faster of course on cpus supporting this (includes AMD Bulldozer and Jaguar cores, Intel Ivy Bridge and up (except budget models)). Passes piglit fbo-blending-formats GL_ARB_texture_float -auto on Ivy Bridge. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-04 01:03:42 +02:00
Zack Rusin	302df7cc85	draw/llvmpipe: allow independent so attachments to the vs When geometry shaders are present, one needs to be able to create an empty geometry shader with stream output that needs to be resolved later and attached to the currently bound vertex shader. Lets add support for it to llvmpipe and draw. draw allows attaching independent stream output info to any vertex shader and llvmpipe resolves at draw time which vertex shader the given empty geometry shader should be linked to. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	246e68735f	llvmpipe: reset so buffers when not appending We need to reset the internal state of the so buffers or we'll keep appending even though we're not supposed to. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	7ca65a68e1	draw: remove unused function we use draw_set_mapped_so_targets nowadays Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	b16ae0f792	draw/llvm: use an enum instead of magic numbers I think this was there before and got accidentally removed during a merge. Same code as for the GS context, which is also using an enum instead of hardcoded numbers. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	49b7d933f8	draw/gs: cleanup some debugging code Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	822c21c776	draw/so: maintain an exact number of written vertices It's quite helpful during the rendering when we know exactly the count of the vertices available in the buffer. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	d8543bd752	draw: Implement support for primitive id We were largely ignoring primitive id. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	f6bfb62c50	draw/so: Fix bogus assert We do support so with multiple primitives. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	e6fc635351	draw/gs: Fix memory corruption with multiple primitives We were flushing with incorrect number of primitives. TGSI exec can only work with a single primitive at a time. Plus the fetching with multiple primitives on llvm paths wasn't copying the last element. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	f313b0c850	gallivm: cleanup the gs interface Instead of void pointers use a base interface. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Brian Paul	ac114c6824	svga: add new memory-used HUD query To track the amount of memory used by all pipe_resources (textures and buffers). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-03 11:02:47 -06:00
Brian Paul	a69efa9482	util: add new util_resource_size() function in u_resource.[ch] Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-03 11:02:47 -06:00
Brian Paul	a3cccdec90	util: move functions from u_resource.c to u_transfer.c The functions are prototyped in u_transfer.h and are related to the other functions in u_transfer.c. The next patch will re-use the u_resource.c file for new code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-03 11:02:47 -06:00
Vincent Lejeune	159d934066	r600g/llvm: Do not override llvm provided stack_size	2013-04-03 18:39:49 +02:00
Vincent Lejeune	097a6ecdfe	r600g/llvm: Do not change cf_alu inst when adding alus	2013-04-03 18:22:40 +02:00
Marek Olšák	ff01e0db0e	radeonsi: add more cases for copying unsupported formats to resource_copy_region Ported from r600g commit: `8891b2f9c9` Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> NOTE: This is a candidate for the 9.1 branch.	2013-04-03 10:58:33 -04:00
Brian Paul	3838edaf5d	svga: add HUD queries for number of draw calls, number of fallbacks The fallbacks count is the number of drawing calls that use a "draw" module fallback, such as polygon stipple. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-03 09:56:08 -06:00
Brian Paul	49ed1f3cb3	svga: refactor occlusion query code This is in preparation for adding new query types for the HUD. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-03 09:56:07 -06:00
Brian Paul	a9ae7e9c28	gallium/hud: try L8 texture for font if I8 format isn't supported	2013-04-03 09:44:57 -06:00
Brian Paul	0289ebaa0f	svga: add case for PIPE_CAP_QUERY_PIPELINE_STATISTICS	2013-04-03 08:19:44 -06:00
Christoph Bumiller	80eef069f0	nv50,nvc0: remove MS resolve formats hack Mesa now allows BlitFramebuffer resolve between RGBA and BGRA.	2013-04-03 13:19:15 +02:00
Christoph Bumiller	4de70bf43c	nvc0: fix 128 bit compressed storage type selection	2013-04-03 12:54:44 +02:00
Christoph Bumiller	8e1dd58a7e	nvc0: place staging textures in GART and map them directly	2013-04-03 12:54:44 +02:00
Christoph Bumiller	ba9b0b682f	nv50: account for pesky prefetch in size calculation of linear textures	2013-04-03 12:54:44 +02:00
Christoph Bumiller	f0a0d59f0f	nvc0: honour scaled coordinates setting for linear textures	2013-04-03 12:54:44 +02:00
Christoph Bumiller	d801545964	nvc0: fix for 2d engine R source formats writing RRR1 and not R001	2013-04-03 12:54:43 +02:00
Christoph Bumiller	6417d56c19	nv50,nvc0: disable DEPTH_RANGE_NEAR/FAR clipping during blit We send position.z == 0, DEPTH_RANGE may be some arbitrary range not including 0 (for example in piglit's hiz tests).	2013-04-03 12:54:43 +02:00
Christoph Bumiller	2a8145d36b	nouveau: accelerate buffer copies in resource_copy_region	2013-04-03 12:54:43 +02:00
Christoph Bumiller	3ed4bbd769	nvc0: demagic some of the NVE4_COMPUTE_UPLOAD methods It's actually the same as P2MF.	2013-04-03 12:54:43 +02:00
Christoph Bumiller	fb0334adb3	nvc0: read PM counters for each warp scheduler separately	2013-04-03 12:54:43 +02:00
Christoph Bumiller	7bac075f25	nvc0: add some metrics to driver specific queries	2013-04-03 12:54:43 +02:00
Christoph Bumiller	198f514aa6	nvc0: add some driver statistics queries	2013-04-03 12:54:43 +02:00
Christoph Bumiller	7628cc247f	nvc0: disable compressed storage type 0xdb for now Single-sample color compression doesn't seem that useful anyway.	2013-04-03 12:54:43 +02:00
Christoph Bumiller	ea12fc3f6c	nvc0: use correct hw query for PRIMITIVES_GENERATED It was the same as SO_STATISTICS[1] before.	2013-04-03 12:54:43 +02:00
Christoph Bumiller	6bca4e7085	nvc0: use fence to check state of queries that don't write sequence This still isn't optimal, since the fence will signal a bit late, but better than checking on the bo, which may never be ready if it is shared (which is likely).	2013-04-03 12:54:43 +02:00
Christoph Bumiller	3d2790cead	gallium/hud: add support for PIPE_QUERY_PIPELINE_STATISTICS Also, renamed "pixels-rendered" to "samples-passed" because the occlusion counter increments even if colour and depth writes are disabled, or (on some implementations) for killed fragments that passed the depth test when PS early_fragment_tests is set.	2013-04-03 12:54:43 +02:00
Christoph Bumiller	c620aad71c	gallium/docs: fix definition of PIPE_QUERY_SO_STATISTICS Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-03 12:54:43 +02:00
Christoph Bumiller	f35e96d973	gallium: add PIPE_CAP_QUERY_PIPELINE_STATISTICS Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-03 12:54:43 +02:00
Roland Scheidegger	450950c57a	gallivm: bring back optimized but incorrect float to smallfloat optimizations Conceptually the same as previously done in float_to_half. Should cut down number of instructions from 14 to 10 or so, but will promote some NaNs to Infs, so it's disabled. It gets a bit tricky though handling all the cases correctly... Passes basic tests either way (though there are no tests testing special cases, but some manual tests injecting them seemed promising). v2: style and comment fixes suggested by Jose Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-02 18:24:31 +02:00
Roland Scheidegger	3febc4a1cd	gallivm: consolidate code for float-to-half and float-to-packed conversion. This replaces the existing float-to-half implementation. There are definitely a couple of differences - the old implementation had unspecified(?) rounding behavior, and could at least in theory construct Inf values out of NaNs. NaNs and Infs should now always be properly propagated, and rounding behavior is now towards zero (note this means too large but non-Infinity values get propagated to max representable value, not Infinity). The implementation will definitely not match util code, however (which does nearest rounding, which also means too large values will get propagated to Infinity). Also fix a bogus round mask probably leading to rounding bugs... v2: fix a logic bug in handling infs/nans. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-02 18:24:31 +02:00
Vadim Girlin	9be624b3ef	r600g: don't reserve more stack space than required v5 Reduced stack size allows to run more threads in some cases, improving performance for the shaders that use stack (that is, for the shaders with control flow instructions). E.g. with unigine-based apps. v4: implement exact computation taking into account wavefront size v5: add cases for RV620, RS880 Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-04-02 19:34:14 +04:00
Vadim Girlin	7e04227f39	r600g: fix range handling for tgsi input declarations v2 Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-04-02 19:34:14 +04:00
Marek Olšák	f8502b7e71	gallium/hud: do .xxxx swizzling for the font texture in the fragment shader This allows using L8 and R8 for the font if I8 isn't supported. Tested-by: Brian Paul <brianp@vmware.com>	2013-04-02 16:57:57 +02:00
Brian Paul	98b64cc20f	hud: flush/unmap the vertex buffer before drawing The VMware svga driver is picky about making sure the VBO is unmapped before drawing. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-02 08:17:28 -06:00
Brian Paul	bdd3770b78	draw: use pipe_transfer_unmap() to match pipe_transfer_map()	2013-04-02 08:17:28 -06:00
Roland Scheidegger	9b329f4c09	gallivm: fix signed small float to float conversion Introduced by `5f41e08cf3`, just a silly typo. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=62921.	2013-04-02 13:21:07 +02:00
Christian König	a0dca4409a	radeonsi: add instance divisor support v3 v2: reduce key size, don't copy key around too much. v3: remove key size reduction Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-02 13:01:43 +02:00
Christian König	cf9b31f78a	radeonsi: add start instance support This works different than on R600, we need to add the start instance manually. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-02 13:01:43 +02:00
Christian König	e4ed58763a	radeonsi: add instanceid support Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-02 13:01:43 +02:00
Christian König	83df955ca9	radeon/llvm: move system value fetching to common code This should be used by both SI and R600. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-02 13:01:42 +02:00
Michel Dänzer	c6efb4870b	radeonsi: Handle arbitrary 2-byte formats in resource_copy_region Fixes mplayer -vo vdpau OSD. NOTE: This is a candidate for the 9.1 branch. Reported-by: Igor Vagulin <igor.vagulin@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Christian König <christian.koenig@amd.com>	2013-04-02 11:42:35 +02:00
Maarten Lankhorst	6d20c646d6	nvc0: Fix fd leak in nvc0_create_decoder NOTE: This is a candidate for the 9.0 and 9.1 branches. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-04-02 10:25:26 +02:00
Vincent Lejeune	50fd9c4544	r600g/llvm: Update LLVM_REVISION.txt	2013-04-01 23:50:20 +02:00
Vincent Lejeune	8c8c4e3977	r600g/llvm: Use stack_size provided from llvm.	2013-04-01 23:43:57 +02:00
Vincent Lejeune	4ac0d85ca6	r600g/llvm: uses function attribute to pass shader type	2013-04-01 23:43:42 +02:00
Vincent Lejeune	af38695f51	r600g/llvm: Add support for cf_alu native encode	2013-04-01 23:43:27 +02:00
Mike Lothian	777a7f2003	clover: Fix build with LLVM 3.3	2013-04-01 10:50:23 -07:00
Brian Paul	1165ff1af1	llvmpipe: use triangle subdivision to avoid fixed-point overflow issues If we're drawing to a surface that's 2048 x 2048 pixels or larger there's danger of fixed-point overflow in the triangle rasterization code. That leads to various rendering glitches. Rather than implement some intricate changes to the rasterization code, simply subdivide triangles into smaller subtriangles to avoid the issue. Only do this when the drawing surface is larger than 2048 by 2048. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-01 08:40:35 -06:00
Adam Jackson	e26d5940ff	gallivm: Minor comment cleanup Signed-off-by: Adam Jackson <ajax@redhat.com>	2013-04-01 09:45:38 -04:00
Vincent Lejeune	c3fb34ee8d	r600g/llvm: Update LLVM_REVISION	2013-03-31 21:37:20 +02:00
Vincent Lejeune	67a8ee7aaa	r600g/llvm: use native encode for tex	2013-03-31 21:35:47 +02:00
Roland Scheidegger	5f41e08cf3	gallivm: consolidate some half-to-float and r11g11b10-to-float code Similar enough that we can try to use shared code. v2: fix a stupid bug using wrong variable causing mayhem with Inf and NaNs. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-29 16:39:40 +01:00
Christoph Bumiller	ee624ced36	nvc0: implement MP performance counters There's more, but this only adds (most) of the counters that are handled directly by the shader processors. The other counter domains are not handled on the multiprocessor and there are no FIFO object methods for configuring them. Instead, they have to be programmed by the kernel via PCOUNTER, and the interface for this isn't in place yet.	2013-03-29 00:33:01 +01:00
Christoph Bumiller	480359bcf6	nvc0: enable compression when supported	2013-03-29 00:33:01 +01:00
Christoph Bumiller	25722e3454	nvc0: use NOUVEAU_GETPARAM_GRAPH_UNITS to get MP count	2013-03-29 00:33:00 +01:00
Christoph Bumiller	443b247878	nv50,nvc0: fix 3d blits, restore viewport after blit	2013-03-29 00:33:00 +01:00
Christoph Bumiller	090e73fc46	nv50: fix 3D render target setup	2013-03-29 00:33:00 +01:00
Brian Paul	b54ce3738a	llvmpipe: put .bmp extension on dumped image files	2013-03-28 17:17:26 -06:00
Brian Paul	e90c56bc4e	llvmpipe: add 'f' suffix to 1.0 in fixed_to_float()	2013-03-28 17:17:26 -06:00
Brian Paul	499aa3ddb4	draw: fix some build breakage when LLVM is not used Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62883 Tested-by: Vinson Lee <vlee@freedesktop.org>	2013-03-28 17:15:58 -06:00
Marek Olšák	a19f6e880a	st/dri: fix crash with HUD and single buffering	2013-03-28 18:17:21 +01:00
Zack Rusin	d066133a76	llvmpipe/draw: Fix texture sampling in geometry shaders We weren't correctly propagating the samplers and sampler views when they were related to geometry shaders. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:02 -07:00

18425 Commits