KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Francisco Jerez	95ad9bca2f	nv50/ir/ra: Fix maxGPR calculation for programs with multiple functions.	2013-03-12 12:55:33 +01:00
Francisco Jerez	ca04e71024	nv50/ir/ra: Fix traversal before the beginning of the active list in buildRIG.	2013-03-12 12:55:33 +01:00
Francisco Jerez	fe17d8a7c0	nv50/ir/ra: Fix RegisterSet::occupy(const Value *v).	2013-03-12 12:55:33 +01:00
Francisco Jerez	49ded0e132	nv50/ir/ra: Fix argument const-ness in RegisterSet::idToUnits and idToBytes	2013-03-12 12:55:33 +01:00
Francisco Jerez	5959d4247a	nv50/ir/opt: Fix tryPropagateBranch for BBs with several exit branches. Comments and "if (bf->cfg.incidentCount() == 1)" condition added by Christoph Bumiller.	2013-03-12 12:55:33 +01:00
Francisco Jerez	572bf83ec0	nv50/ir: Clean up references to function values before destroying them.	2013-03-12 12:55:33 +01:00
Francisco Jerez	12f65e38c0	nouveau: Bail out from nouveau_fence_wait if flushing the pushbuf fails.	2013-03-12 12:55:33 +01:00
Vinson Lee	543d032885	mesa: Use correct functions for enum conversion. Fixes mixing enum types defects reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-11 23:44:10 -07:00
Rob Clark	6173cc19c4	freedreno: gallium driver for adreno Currently works on a220. Others in the a2xx family look pretty similar and should be pretty straightforward to support with the same driver. The a3xx has a new shader ISA, and while many registers appear similar, the register addresses have been completely shuffled around. I am not sure yet whether it is best to support with the same driver, but different compiler, or whether it should be split into a different driver. v1: original v2: build file updates from review comments, and remove GPL licensed header files from msm kernel v3: smarter temp/pred register assignment, fix clear and depth/stencil format issues, resource_transfer fixes, scissor fixes Signed-off-by: Rob Clark <robdclark@gmail.com>	2013-03-11 21:53:24 -04:00
José Fonseca	44a8e51354	d3d1x: Remove. Unused/unmaintained. Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2013-03-12 00:35:06 +00:00
José Fonseca	7db60f049f	nv50: Remove nv0_ir_from_sm4.* Unused, depends on d3d1x. Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2013-03-12 00:35:06 +00:00
Roland Scheidegger	5c41d1c222	gallivm: clean up passing derivatives around Previously, the derivatives were calculated and passed in a packed form to the sample code (for implicit derivatives, explicit derivatives were packed to the same format). There's several reasons why this wasn't such a good idea: 1) the derivatives may not even be needed (not as bad as it sounds since llvm will just throw the calculations needed for them away but still) 2) the special packing format really shouldn't be part of the sampler interface 3) depending what the sample code actually does the derivatives will be processed differently, hence there is no "ideal" packing. For cube maps with explicit derivatives (which we don't do yet) for instance the packing looked downright useless, and for non-isotropic filtering we'd need different calculations too. So, instead just pass the derivatives as is (for explicit derivatives), or let the rho calculating sample code calculate them itself. This still does exactly the same packing stuff for implicit derivatives for now, though explicit ones are handled in a more straightforward manner (quick estimates show performance should be quite similar, though it is much easier to follow and also does the rho calculation per-pixel until the end, which we eventually need for spec compliance anyway). No piglit changes. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-12 00:24:22 +01:00
Chad Versace	b7262ac7ea	i965: Fix typo in doxygen hyperlink s/brw_state_upload/brw_upload_state/ Found because the link was broken. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-03-11 16:01:19 -07:00
Eric Anholt	11b8df0c01	mesa: Reduce memory usage for reg alloc with many graph nodes (part 2). After the previous fix that almost removes an allocation of 4*n^2 bytes, we can use a bitset to reduce another allocation from n^2 bytes to n^2/8 bytes. Between the previous commit and this one, the peak heap size for an oglconform ARB_fragment_program max instructions test on i965 goes from 4GB to 255MB. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55825 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:54 -07:00
Eric Anholt	6aa3afbfd6	mesa: Reduce the memory usage for reg alloc with many graph nodes (part 1) We were allocating an adjacency_list entry for every possible interference that could get created, but that usually doesn't happen. We can save a lot of memory by resizing the array on demand. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:54 -07:00
Eric Anholt	5daf867f6c	i965/fs: Improve CSE performance by expiring some available expressions. We're already walking the list, and we can easily know when something has no reason to be in the list any longer, so take a brief extra step to reduce our worst-case runtime (an oglconform test that emits the maximum instructions in a fragment program). I don't actually know what the worst-case runtime was, because it was too long and I got bored. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:54 -07:00
Eric Anholt	f179f419d1	i965/fs: Improve live variables calculation performance. We can execute way fewer instructions by doing our boolean manipulation on an "int" of bits at a time, while also reducing our working set size. Reduces compile time of L4D2's slowest shader from 4s to 1.1s (-72.4% +/- 0.2%, n=10) v2: Remove redundant masking (noted by Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:54 -07:00
Eric Anholt	4dc7e6dcbf	i965/fs: Also do the gen4 SEND dependency workaround against other SENDs. We were handling the dependency workaround for the first written reg of a send preceding the one we're fixing up, but didn't consider the other regs. Thus if you had two sampler calls that got allocated to the same set of regs, one might, rarely, overwrite the other. This was occurring in XBMC's GLSL shaders. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44567 NOTE: This is a candidate for the stable branches. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:53 -07:00
Eric Anholt	4c1fdae0a0	i965/fs: Switch to using sampler LD messages for uniform pull constants. When forcing the compiler to always generate pull constants instead of push constants (in order to have an easy to use testcase), improves performance of my old GLSL demo 23.3553% +/- 1.42968% (n=7). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60866 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:53 -07:00
Eric Anholt	1323772543	i965/fs: Fix broken rendering in large shaders with UBO loads. The lowering process creates a new vgrf on gen7 that should be represented in live interval analysis. As-is, it was getting a conflicting allocation with gl_FragDepth in the dolphin emulator, producing broken rendering. NOTE: This is a candidate for the 9.1 branch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61317 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:53 -07:00
Eric Anholt	c588cd2031	i965/fs: Add a comment about an implementation detail. I was going to fix the code above like the previous commit, but we already had that covered (otherwise all our uniform access would have been broken, unlike just pull constants). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:53 -07:00
Eric Anholt	f10f5e4980	i965/fs: Fix register allocation for uniform pull constants in 16-wide. We were allowing a compressed instruction to write a register that contained the last use of a uniform pull constant (either UBO load or push constant spillover), so it would get half its values smashed. Since we need to see the actual instruction to decide this, move the pre-gen6 pixel_x/y logic here, which should improve the performance of register allocation since virtual_grf_interferes() is called more than once per instruction. NOTE: This is a candidate for the stable branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61317 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:53 -07:00
Eric Anholt	f09a8e17e5	intel: Remove some unused debug flags. I was looking at the list to see what might be interesting to document for application developers, and it turns out some are completely dead. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:53 -07:00
Zack Rusin	7295fad204	draw/gs: Correctly iterate the emitted primitives We were assuming that each emitted primitive had the same number of vertices. That is incorrect. Emitted primitives can have an arbitrary number of vertices. Simply increment index on iteration to fix it. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-07 20:16:07 -08:00
Zack Rusin	e5406f7058	tgsi/exec: Correctly reset NumOutputs before parsing the shader Whenever we're binding the shaders we're incrementing NumOutputs, assuming the parser spots an output declaration, but we were never resetting the variable. That means that each subsequent bind of a geometry shader would add its number of outputs to the number of outputs bound by all previously run shaders and our indexes would get completely messed up. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-07 20:16:00 -08:00
Roland Scheidegger	9060c835fd	draw/llvm: another quick hack for drawing with no position output Also need to skip things if we have no cv value but pos value (happens with geometry shaders enabled). Needs a round of cleanup, though.	2013-03-11 17:07:51 +01:00
Roland Scheidegger	ef17cc9cb6	softpipe: don't use samplers with prebaked sampler and sampler_view state This is needed for handling the dx10-style sample opcodes. This also simplifies the logic by getting rid of sampler variants completely (sampler_views though OTOH have sort of variants because some of their state is different depending on the shader stage they are bound to). No significant performance difference (openarena run: 840 frames in 459.8 seconds vs. 840 frames in 460.5 seconds). v2: fix reference counting bug spotted by Jose. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-11 17:07:51 +01:00
Roland Scheidegger	f33c744fb9	tgsi: emit code for SVIEWINFO and SAMPLE_I Can handle them since the single sampler interface was introduced. v2: simplify txf/sample_i handling a bit according to Brian's feedback. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-11 17:07:51 +01:00
Roland Scheidegger	7b3a0bb45d	tgsi: fix wrong reg used for unit for TGSI_OPCODE_TXF Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-11 17:07:51 +01:00
Tom Stellard	a0676968b9	r600g/llvm: Fix build	2013-03-11 11:10:51 -04:00
Marek Olšák	e4e655fd11	r600g: add debug options disabling various copy-buffer-related features This will be invaluable for debugging and bug reports.	2013-03-11 13:44:46 +01:00
Marek Olšák	4b69c1a92d	mesa: don't allocate a texture if width or height is 0 in CopyTexImage NOTE: This is a candidate for the stable branches. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-11 13:44:14 +01:00
Marek Olšák	68ed4c9c89	gallium/util: attempt to fix blitting multisample texture arrays We don't have a test for this yet, but obviously the swizzle was wrong.	2013-03-11 13:43:36 +01:00
Marek Olšák	52efa01de0	r600g: allocate FMASK right after the texture, so that it's aligned with it This avoids the kernel CS checker errors with MSAA textures. Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	2c339f8015	r600g: remove r600.h, move the stuff elsewhere (mostly to r600_pipe.h) Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	ec7d775790	r600g: remove r600_hw_context_priv.h, move the stuff to r600_pipe.h Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	1724ef8908	r600g: remove deprecated state management code It's nice to see so much code that did pretty much nothing go away. Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	65cbf89567	r600g: atomize pixel shader Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	63042af933	r600g: atomize vertex shader Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	167263ecb1	r600g: inline r600_pipe_shader function also change names of other functions, so that they make sense Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	65b2a449bc	r600g: dump vertex elements state along with the fetch shader	2013-03-11 13:43:36 +01:00
Marek Olšák	3f0a51d677	gallium/util: dump instance_divisor	2013-03-11 13:43:36 +01:00
Marek Olšák	3832059b10	r600g: remove bytecode dumping Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	4bf0ebdd4f	r600g: use a single env var R600_DEBUG, disable bytecode dumping Only the disassembler is used to dump shaders. Here are a few examples of how to use R600_DEBUG. Log compute info: R600_DEBUG=compute Dump all shaders: R600_DEBUG=fs,vs,gs,ps,cs Dump pixel shaders only: R600_DEBUG=ps Disable Hyper-Z: R600_DEBUG=nohyperz Disable the LLVM backend: R600_DEBUG=nollvm Or use any combination of the above, or print all options: R600_DEBUG=help Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	2ca73bc7f7	r600g: cleanup #include recursion between r600_pipe.h and evergreen_compute.h Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	43d3e0cd3d	r600g: don't check for R600_ENABLE_S3TC env var	2013-03-11 13:43:36 +01:00
Stefan Brüns	b21a9d46e4	glapi/gen: Remove duplicate PYTHON_FLAGS PYTHON_GEN calls python with PYTHON_FLAGS Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>	2013-03-09 16:24:51 -08:00
Frank Henigman	89559c50e7	i965: Link i965_dri.so with C++ linker. Force C++ linking of i965_dri.so by adding a dummy C++ source file. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-03-08 21:21:53 -08:00
Maxence Le Doré	ba588dd45d	gallium/util: Correct shift value for TSC feature detection. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-03-08 21:21:53 -08:00
Matt Turner	07f2dee731	configure.ac: Build dricommon for DRI gallium drivers Commit `67ef7559` added an || test "x$enable_dri" check in an attempt to get the DRI common bits built in some necessary cases. That change was inappropriate as it made these common DRI pieces be built unconditionally, so some builds were broken. Subsequently, commit `998d975e3` changed the "|| test" to a "-a" conjunction within the existing test invocation. This made the '-a "x$enable_dri" = xyes' clause have no effect (as it was inside an enclosing test for the same condition). So the new breakage from commit `67ef7559` was addressed, but the original problems were regressed. The immediately preceding commit removed the redundant condition. Now, finally, this commit fixes the original problem as described in the commit message of `67ef7559`: this code should be compiled when using the DRI state tracker. In order to do so, the HAVE_*_DRI conditionals must be moved after the last assignment of HAVE_COMMON_DRI. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61821 Tested-by: Stéphane Marchesin <marcheu@chromium.org>	2013-03-08 21:21:46 -08:00
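
Commit 11b8df0c01 above describes replacing a one-byte-per-pair interference table (n^2 bytes) with a bitset (n^2/8 bytes). The following is a minimal sketch of that general idea only, not the code from the commit: the struct and function names (`interference_graph`, `graph_init`, `graph_add_interference`, `graph_interferes`, `graph_bytes`, `graph_free`) are hypothetical and are not Mesa's register allocator API.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the Mesa register allocator API. */
#define WORD_BITS (sizeof(unsigned) * CHAR_BIT)

struct interference_graph {
    unsigned node_count;
    size_t words_per_row;  /* words needed to hold node_count bits */
    unsigned *bits;        /* node_count rows, one bit per (row, column) pair */
};

/* Allocate a zeroed bit matrix: roughly n^2/8 bytes instead of n^2. */
static bool graph_init(struct interference_graph *g, unsigned node_count)
{
    g->node_count = node_count;
    g->words_per_row = (node_count + WORD_BITS - 1) / WORD_BITS;
    g->bits = calloc((size_t)node_count * g->words_per_row, sizeof(unsigned));
    return g->bits != NULL || node_count == 0;
}

static void set_bit(struct interference_graph *g, unsigned row, unsigned col)
{
    g->bits[row * g->words_per_row + col / WORD_BITS] |= 1u << (col % WORD_BITS);
}

/* Record that nodes a and b interfere; the relation is kept symmetric. */
static void graph_add_interference(struct interference_graph *g,
                                   unsigned a, unsigned b)
{
    assert(a < g->node_count && b < g->node_count);
    set_bit(g, a, b);
    set_bit(g, b, a);
}

/* Test a single bit: true if a and b were marked as interfering. */
static bool graph_interferes(const struct interference_graph *g,
                             unsigned a, unsigned b)
{
    assert(a < g->node_count && b < g->node_count);
    unsigned word = g->bits[a * g->words_per_row + b / WORD_BITS];
    return (word >> (b % WORD_BITS)) & 1u;
}

/* Size of the bit matrix in bytes, for comparison against n * n. */
static size_t graph_bytes(const struct interference_graph *g)
{
    return (size_t)g->node_count * g->words_per_row * sizeof(unsigned);
}

static void graph_free(struct interference_graph *g)
{
    free(g->bits);
    g->bits = NULL;
}
```

Each row is rounded up to a whole number of words, so the saving is slightly less than a factor of eight for small node counts. The benefit matters most in cases like the one the commit message cites, where the node count is very large.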

55685 Commits (All Branches)
