KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Matt Turner	ad4507b355	i965/fs: Add Haswell cycle timings Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 10:13:27 -07:00
Matt Turner	7997e59b65	i965: Note that write-after-write dependencies are blocking. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 10:13:26 -07:00
Matt Turner	f91e371fee	i965: Reword comment about the shared mathbox. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 10:13:26 -07:00
Roland Scheidegger	5f41e08cf3	gallivm: consolidate some half-to-float and r11g11b10-to-float code Similar enough that we can try to use shared code. v2: fix a stupid bug using wrong variable causing mayhem with Inf and NaNs. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-29 16:39:40 +01:00
Chris Forbes	4412f3bc13	mesa: provide default implementation of QuerySamplesForFormat Previously at least i915 failed to provide an implementation, but exposed ARB_internalformat_query anyway, leading to crashes when QueryInternalformativ was called. Default implementation just returns 1 for everything, so is suitable for any driver which does not support multisampling. V2: - Move from intel to core mesa. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-29 20:54:36 +13:00
Christoph Bumiller	ee624ced36	nvc0: implement MP performance counters There's more, but this only adds (most) of the counters that are handled directly by the shader processors. The other counter domains are not handled on the multiprocessor and there are no FIFO object methods for configuring them. Instead, they have to be programmed by the kernel via PCOUNTER, and the interface for this isn't in place yet.	2013-03-29 00:33:01 +01:00
Christoph Bumiller	480359bcf6	nvc0: enable compression when supported	2013-03-29 00:33:01 +01:00
Christoph Bumiller	25722e3454	nvc0: use NOUVEAU_GETPARAM_GRAPH_UNITS to get MP count	2013-03-29 00:33:00 +01:00
Christoph Bumiller	443b247878	nv50,nvc0: fix 3d blits, restore viewport after blit	2013-03-29 00:33:00 +01:00
Christoph Bumiller	090e73fc46	nv50: fix 3D render target setup	2013-03-29 00:33:00 +01:00
Brian Paul	b54ce3738a	llvmpipe: put .bmp extension on dumped image files	2013-03-28 17:17:26 -06:00
Brian Paul	e90c56bc4e	llvmpipe: add 'f' suffix to 1.0 in fixed_to_float()	2013-03-28 17:17:26 -06:00
Brian Paul	499aa3ddb4	draw: fix some build breakage when LLVM is not used Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62883 Tested-by: Vinson Lee <vlee@freedesktop.org>	2013-03-28 17:15:58 -06:00
Marek Olšák	9ad9141917	mesa: handle STATE_CURRENT_ATTRIB_MAYBE_VP_CLAMPED for parameter printing Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-28 20:02:50 +01:00
Kenneth Graunke	9fe47756b3	i965: Tidy shader time printing code by using printf's field widths. We can use %-6s%-6s rather than manually counting characters, resulting in much more readable code. This necessitates a small secondary change: using "total fs16" and "" now causes the "" string to be padded out to 6 characters, resulting in too much whitespace. Splitting it into "total" and "fs16" produces the same output as before. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:44 -07:00
Eric Anholt	6192e9b377	i965/vs: Include URB payload setup in shader_time. This much more accurately reflects the cost of the vertex shader, since the payload setup is often a significant fraction of the instructions in the VS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:41 -07:00
Eric Anholt	55feb19704	i965/vs: Use a send from a 2-register VGRF for shader time writes. This will let us emit it later, after we're setting up MRFs for the URB write. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:37 -07:00
Eric Anholt	130138030a	i965/vs: Teach copy propagation about sends from GRFs. This incidentally also teaches it a bit about gen6 math -- we now allow unswizzled, unmodified GRF temps as the sources for math. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:34 -07:00
Eric Anholt	c3a22d42a8	i965/vs: Prepare split_virtual_grfs() for the presence of SENDs from GRFs. v2: Fix silly bool handling, and don't add new tabs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:29 -07:00
Eric Anholt	47e795d861	i965/fs: Include everything but the final FB write in shader_time. Previously, if you just wrote a constant color to the render target, no time got noted at all. This is convenient for doing single-instruction timings, but not so much for actual program analysis. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:23 -07:00
Eric Anholt	5c5218ea61	i965/fs: Switch shader_time writes to using GRFs. This avoids conflicts between shader_time and FB writes, so we can include more of the program under our profiling. This does mean hiding more of the message setup from the optimizer, which doesn't have a way to handle multi-reg sends from GRFs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:15 -07:00
Eric Anholt	5c039543db	i965: Provide more detailed information to match shader_time to programs. Ken asked me the other day what -1 vs 0 vs 3 vs other meant in our shader names, and I realized that it was really unclear. I'd like to do even better, like noting which one is the clear shader, but that would require exposing the metaops struct to the driver. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:11 -07:00
Eric Anholt	d2ba1c24b4	i965: Track ARB program state along with GLSL state for shader_time. This will let us do much better printouts for non-GLSL programs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:01 -07:00
Marek Olšák	a19f6e880a	st/dri: fix crash with HUD and single buffering	2013-03-28 18:17:21 +01:00
Marek Olšák	6b5dfa42c9	st/mesa: remove leftover printfs from ReadPixels Oops, I thought I had removed all debugging code.	2013-03-28 18:17:21 +01:00
Eric Anholt	eda434921d	i965/fs: Improve performance of copy propagation dataflow using bitsets. Reduces compile time of l4d2's slowest shader by 17.8% +/- 1.3% (n=10). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 09:48:50 -07:00
Zack Rusin	d066133a76	llvmpipe/draw: Fix texture sampling in geometry shaders We weren't correctly propagating the samplers and sampler views when they were related to geometry shaders. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:02 -07:00
Zack Rusin	186a6bffdd	draw/llvm: Cleanup the store debugging code Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:02 -07:00
Zack Rusin	10964fc73d	draw: Allocate the output buffer for output primitives We were allocating the output buffer but using the input primitives. We need to allocate that buffer using the maximum number of output, not input, primitives. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:02 -07:00
Zack Rusin	f20f981553	gallivm: Implement the breakc instruction Required by more modern examples. Like BRK but with a condition. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:02 -07:00
Zack Rusin	b66ffcf2f8	gallivm: implement implicit primitive flushing TGSI semantics currently require an implicit endprim at the end of GS if an ending primitive hasn't been emitted. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:02 -07:00
Zack Rusin	e96f4e3b85	gallium/llvm: implement geometry shaders in the llvm paths This commits implements code generation of the geometry shaders in the SOA paths. All the code is there but bugs are likely present. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:02 -07:00
Zack Rusin	edcebe665d	draw/gs: Fetch more than one primitive per invocation Allows executing gs on up to 4 primitives at a time. Will also be required by the llvm code because there we definitely don't want to flush with just a single primitive. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:01 -07:00
Zack Rusin	014c4d1cd7	draw/gs: Abstract the portions of GS that are tgsi specific To be able to add llvm paths later on we need to have some common interface for them. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:01 -07:00
Zack Rusin	a85c83e427	draw/llvm: Remove unused gs_constants from jit_context The member was never used and we'll need to handle it differently because gs will also need samplers/textures setup. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:01 -07:00
Zack Rusin	90ee8de700	graw/gs: add missing max output vertices to all tests A few tests were missing this crucial property. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:01 -07:00
Jerome Glisse	3f7d9710e8	radeonsi: add cs tracing v3 Same as on r600, trace cs execution by writting cs offset after each states, this allow to pin point lockup inside command stream and narrow down the scope of lockup investigation. v2: Use WRITE_DATA packet instead of WRITE_MEM v3: Remove useless nop packet Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-03-27 11:38:02 -04:00
Chris Forbes	21a2dfa55d	mesa: only check sample count if we actually wanted multisampling Fixes various test fallout from `90b5a2425a` on Pineview, which claims to support ARB_internalformat_query but doesn't actually provide the driverfunc. That driver is still broken [GetInternalformativ will still segfault!] but it was silly to be going through the sample count logic in the nonmultisampling case at all. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-27 07:49:12 +13:00
Christian König	c77159cc11	radeon/llvm: document LLVM commit We need at least that revision to work correctly now. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-26 15:08:00 +01:00
Christian König	1c10018925	radeonsi: add preloading for all samplers Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-03-26 12:57:43 +01:00
Christian König	0f6cf2bc79	radeonsi: add preloading of all constants Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-03-26 12:57:40 +01:00
Christian König	44e3224554	radeonsi: mark most intrinsics as readnone/nounwind Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-03-26 12:57:36 +01:00
Christian König	206f059e1f	radeonsi: mark all loads as constant Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-03-26 12:57:33 +01:00
Christian König	86f6fc2f1d	radeonsi: remove wqm intrinsic Now the backend handles that itself. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-03-26 12:57:30 +01:00
Christian König	6249db73ea	radeon/llvm: remove uneeded inclusion The include isn't needed and the file has moved with LLVM master. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-03-26 12:57:23 +01:00
Christian König	0f001fbff1	glsl_to_tgsi: avoid creating arrays if driver doesn't support them Avoid creating arrays if we replace indirect addressing anyway. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-26 10:22:27 +01:00
Christian König	462de2e65f	glsl_to_tgsi: make simplify_cmp work with arrays Even when we have arrays it is possible for simplify_cmp to work on temps, just not on arrays. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=62696 Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-26 10:22:27 +01:00
Marek Olšák	98a8e5b87e	gallium/docs: document get_driver_query_info	2013-03-26 01:37:40 +01:00
Marek Olšák	8ddae684af	r600g: add a driver query returning the amount of requested VRAM and GTT memory	2013-03-26 01:28:19 +01:00
Marek Olšák	2504380aaf	r600g: add a driver query returning the number of draw_vbo calls between begin_query and end_query	2013-03-26 01:28:19 +01:00

55813 Commits