KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Chris Forbes	e0015c819c	mesa: allow multisample texture targets in [Get]TexParameter* ARB_texture_storage_multisample allows texture parameters to be queried for TEXTURE_2D_MULTISAMPLE and TEXTURE_2D_MULTISAMPLE_ARRAY targets. Some parameters may also be set, with the following exceptions: - TEXTURE_BASE_LEVEL may not be set to a nonzero value; generates INVALID_OPERATION - any state which appears in the `per-sampler` state table may not be set; generates INVALID_OPERATION V2: Don't introduce bogus handling of TEXTURE_MAX_LEVEL Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-31 22:19:36 +13:00
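The two exceptions listed in this message can be sketched as a small validation routine. This is only an illustration under stated assumptions: the enum names, the `is_sampler_state` helper, and `check_multisample_tex_param` are hypothetical stand-ins, not Mesa's actual code, and only a few parameters are modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a handful of GL texture parameters. */
enum param {
    PARAM_BASE_LEVEL,
    PARAM_MAX_LEVEL,
    PARAM_MIN_FILTER,   /* sampler state */
    PARAM_WRAP_S,       /* sampler state */
    PARAM_SWIZZLE_R
};

/* Hypothetical result codes. */
enum result { OK, INVALID_OPERATION };

/* True for parameters that belong to the per-sampler state table
 * (only two of them are modelled in this sketch). */
static bool is_sampler_state(enum param p)
{
    return p == PARAM_MIN_FILTER || p == PARAM_WRAP_S;
}

/* Checks whether a parameter may be set on a multisample texture target. */
enum result check_multisample_tex_param(enum param p, int value)
{
    if (is_sampler_state(p))
        return INVALID_OPERATION;       /* sampler state may not be set */
    if (p == PARAM_BASE_LEVEL && value != 0)
        return INVALID_OPERATION;       /* only a base level of 0 is allowed */
    return OK;
}
```

Under these assumptions, setting the base level to 0 or a swizzle succeeds, while setting a filter, a wrap mode, or a nonzero base level is rejected.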
Chris Forbes	b15c558c85	mesa: improve reported function name in Tex*Multisample Now that there are 4 variants, just pass the function name into teximagemultisample rather than reconstructing it. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-31 22:19:34 +13:00
Chris Forbes	9cbfe98bfc	mesa: add enable bit for ARB_texture_storage_multisample Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-31 22:19:32 +13:00
Chris Forbes	719974b54c	glapi: add definition of ARB_texture_storage_multisample Adds XML for the extension, dispatch_sanity enabling, and the two new entrypoints. These are both implemented by calling the shared teximagemultisample() with immutable=GL_TRUE. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-31 22:19:28 +13:00
Chris Forbes	788b0f8535	mesa: add support for immutable textures to teximagemultisample() The new entrypoints will come later, but this adds the actual logic for supporting immutable multisample textures: - The immutability flag is set as desired. - Attempting to modify an immutable multisample texture produces INVALID_OPERATION. Note: The extension spec does not mention adding this behavior to TexImage*Multisample, but it seems like the reasonable thing to do. V2: - Cover missing error cases (unsized formats; texture object zero) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> [V1] Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-31 22:19:22 +13:00
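A minimal sketch of the immutability rule described in this message, assuming a heavily simplified texture object. The `struct texture` layout and the `tex_image_multisample` name are hypothetical and do not reproduce the real teximagemultisample() signature.

```c
#include <stdbool.h>

/* Hypothetical result codes. */
enum result { OK, INVALID_OPERATION };

/* Hypothetical, simplified multisample texture object. */
struct texture {
    bool immutable;
    int samples;
    int width, height;
};

/* (Re)specifies the texture's storage. Once the texture has been made
 * immutable, any further attempt to respecify it is rejected. */
enum result tex_image_multisample(struct texture *tex, int samples,
                                  int width, int height, bool immutable)
{
    if (tex->immutable)
        return INVALID_OPERATION;

    tex->samples = samples;
    tex->width = width;
    tex->height = height;
    tex->immutable = immutable;   /* the flag is set as requested */
    return OK;
}
```

In this sketch a call with `immutable` set to true succeeds once, and every later call on the same texture fails and leaves its state unchanged.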
Chris Forbes	7f32b9560b	mesa: extract _mesa_is_legal_tex_storage_format helper This is about to be used in teximagemultisample() when immutable=true. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-31 22:19:13 +13:00
Kenneth Graunke	fdc5941972	mesa: Delete VERT_ATTRIB_GENERIC_NV and VERT_BIT_GENERIC_NV macros. These haven't been used since we deleted NV_vertex_program support. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-30 19:19:45 -07:00
Eric Anholt	0967c362bf	i965: Fix an inconsistency in the VUE map with gl_ClipVertex on gen4/5. We are intentionally not allocating a slot for gl_ClipVertex. But by leaving the bit set in the slots_valid, the fragment shader's computation of where varyings are in urb entry coming out of the SF would be off by one. Fixes rendering in Freespace 2 SCP, and improves rendering in TF2. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62830 Tested-by: Joaquín Ignacio Aramendía <samsagax@gmail.com> NOTE: This is a candidate for the 9.1 branch. Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-03-30 17:24:18 -07:00
Eric Anholt	9dd19575d3	intel: Remove a never-taken debug print path. Alessandro Pignotti noted when I added this code in commit `0e723b135b` that it's in the else block for "if (busy)", so this debug print couldn't happen. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-30 17:23:50 -07:00
Brian Paul	c34bbe110d	st/mesa: add ir_lod case in GLSL->TGSI code to silence warning	2013-03-29 17:21:33 -06:00
Ian Romanick	e0131196ca	glsl: Generate masked write instead of vector array index for UBO lowering When reading a column from a row-major matrix, we would slot the single value read into the vector using an ir_dereference_array of the vector with a constant index. This will (eventually) get optimized to a masked-write, so just generate the masked write in the first place. v2: Remove unused variable 'chan'. Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Eric Anholt <eric@anholt.net>	2013-03-29 12:01:14 -07:00
Ian Romanick	65cc68f430	glsl: Replace open-coded dot-product with dot Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Eric Anholt <eric@anholt.net> Cc: Paul Berry <stereotype441@gmail.com>	2013-03-29 12:01:11 -07:00
Ian Romanick	dbf94d105a	glsl: Replace constant-index vector array accesses with swizzles Search and replace: ][0] -> ].x ][1] -> ].y ][2] -> ].z ][3] -> ].w Fixes piglit tests inverse-mat[234].{vert,frag}. These tests call the inverse function with constant parameters and expect proper constant folding to happen. My suspicion is that this patch papers over some bug in constant propagation involving array accesses. Either way, all of these accesses eventually get lowered to swizzles. This cuts out the middle man (saving a trivial amount of CPU). NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Eric Anholt <eric@anholt.net> Cc: Paul Berry <stereotype441@gmail.com>	2013-03-29 12:01:07 -07:00
Ian Romanick	c770faea0a	glsl: Add missing bool case in glsl_type::get_scalar_type Since the case was missing, bvec4->get_scalar_type() would return bvec4, but vec4->get_scalar_type() would return float. NOTE: This is a candidate for stable branches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-03-29 12:01:01 -07:00
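The kind of bug this message describes can be illustrated with a switch over base types. This is a simplified sketch in C rather than the real C++ glsl_type class: `struct type`, the `TYPE_*` enum, and the static scalar instances are hypothetical.

```c
/* Hypothetical base types of a simplified type system. */
enum base_type { TYPE_UINT, TYPE_INT, TYPE_FLOAT, TYPE_BOOL };

/* Hypothetical type descriptor: base type plus vector width. */
struct type {
    enum base_type base;
    unsigned components;
};

static const struct type uint_type  = { TYPE_UINT, 1 };
static const struct type int_type   = { TYPE_INT, 1 };
static const struct type float_type = { TYPE_FLOAT, 1 };
static const struct type bool_type  = { TYPE_BOOL, 1 };

/* Returns the scalar type matching a vector type's base type. */
const struct type *get_scalar_type(const struct type *t)
{
    switch (t->base) {
    case TYPE_UINT:  return &uint_type;
    case TYPE_INT:   return &int_type;
    case TYPE_FLOAT: return &float_type;
    case TYPE_BOOL:  return &bool_type;  /* the kind of case that was missing */
    default:         return t;           /* unhandled types come back unchanged */
    }
}
```

Without the `TYPE_BOOL` case, a four-component bool vector would fall through to the default branch and be returned unchanged, which matches the symptom in the message.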
Kenneth Graunke	57a502518e	i965: Fix INTEL_DEBUG=shader_time for fragment shaders with discards. "discard" instructions generate HALT instructions which jump to a final HALT near the end of the shader. Previously, fs_generator created this final jump target when it saw the first FS_OPCODE_FB_WRITE, causing it to jump right before the FB write epilogue. This is normally good. However, INTEL_DEBUG=shader_time also has an epilogue section which records the final timestamp. The frontend emits IR for this just before FS_OPCODE_FB_WRITE. Unfortunately, this led to the following ordering: 1. Shader Time Epilogue 2. Final HALT (where discards jump) 3. Framebuffer Write Epilogue This meant that discarded pixels completely skipped the shader time epilogue, causing no ending timestamp to be written. This obviously led to inaccurate results. This patch adds a new FS_OPCODE_PLACEHOLDER_HALT in the IR stream just before any epilogue sections. This is where the final HALT should be generated, and makes it easy to ensure the correct ordering: 1. Final HALT 2. Shader Time Epilogue 3. Framebuffer Write Epilogue For shaders that don't discard, this opcode compiles away to nothing. The scheduler adds barrier dependencies to make sure that it doesn't get moved above any FS_OPCODE_DISCARD_JUMP instructions. One 8-wide shader in GLBenchmark 2.7 dropped from 2291.67 Gcycles to a mere 5.13 Gcycles. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 11:39:32 -07:00
Eric Anholt	20d846ce8b	i965: Add names for all instructions to dump_instruction() in FS and VS. I'd previously added the minimum names to understand my dumps, but this makes dumps in general much easier to read. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-29 11:39:21 -07:00
Matt Turner	ed6186f0e8	i965: Enable ARB_texture_query_lod. v2: Support Ironlake as well. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-29 10:21:14 -07:00
Matt Turner	b8aa9f7d3a	i965/fs: Generate LOD sampler message from ir_lod. v2: Support Ironlake as well. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-29 10:21:14 -07:00
Dave Airlie	110ca8b1f3	glsl: Implement ARB_texture_query_lod v2 [mattst88]: - Rebase. - #define GL_ARB_texture_query_lod to 1. - Remove comma after ir_lod in ir.h for MSVC. - Handled ir_lod in ir_hv_accept.cpp, ir_rvalue_visitor.cpp, opt_tree_grafting.cpp. - Rename textureQueryLOD to textureQueryLod, see https://www.khronos.org/bugzilla/show_bug.cgi?id=821 - Fix ir_reader of (lod ...). v3 [mattst88]: - Rename textureQueryLod to textureQueryLOD, pending resolution of Khronos 821. - Add ir_lod case to ir_to_mesa.cpp. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-29 10:20:26 -07:00
Matt Turner	0e0ab8a071	i965/fs: Use measured Gen7 instruction timings on Gen6. Benchmark comparison, x = before, + = after (ASCII distribution plot omitted): x: N 23, Min 8083.78, Max 8287.83, Median 8205.55, Avg 8162.7461, Stddev 68.307951; +: N 23, Min 8107.56, Max 8358.74, Median 8224.33, Avg 8186.1765, Stddev 71.506301. No difference proven at 95.0% confidence. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 10:13:27 -07:00
Matt Turner	f085b21b25	i965/fs: Increase and document MAD latency on Gen7. 58% of mad(8) generated in shader-db are reading registers from the same bank. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 10:13:27 -07:00
Matt Turner	414ea2f560	i965/fs: Add LRP instruction latency. Set its latency to what happens to be the default floating-point instruction latency. One day we may want to handle latency based on register bank information. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 10:13:27 -07:00
Matt Turner	ad4507b355	i965/fs: Add Haswell cycle timings Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 10:13:27 -07:00
Matt Turner	7997e59b65	i965: Note that write-after-write dependencies are blocking. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 10:13:26 -07:00
Matt Turner	f91e371fee	i965: Reword comment about the shared mathbox. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 10:13:26 -07:00
Roland Scheidegger	5f41e08cf3	gallivm: consolidate some half-to-float and r11g11b10-to-float code Similar enough that we can try to use shared code. v2: fix a stupid bug using wrong variable causing mayhem with Inf and NaNs. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-29 16:39:40 +01:00
Chris Forbes	4412f3bc13	mesa: provide default implementation of QuerySamplesForFormat Previously at least i915 failed to provide an implementation, but exposed ARB_internalformat_query anyway, leading to crashes when QueryInternalformativ was called. Default implementation just returns 1 for everything, so is suitable for any driver which does not support multisampling. V2: - Move from intel to core mesa. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-29 20:54:36 +13:00
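One way to picture a default hook that "returns 1 for everything" is sketched below. The hook table, the function names, and the signature are hypothetical, since the real driver interface is not shown in the message. Reading "returns 1" as "reports a single supported sample count of 1" is also an assumption.

```c
#include <stddef.h>

/* Hypothetical, simplified driver hook table. */
struct driver_funcs {
    /* Fills samples[] with the supported sample counts for a format
     * and returns how many entries were written. */
    size_t (*query_samples_for_format)(int internal_format, int samples[16]);
};

/* Default for drivers without multisampling support: every format
 * reports exactly one supported sample count, namely 1. */
static size_t default_query_samples_for_format(int internal_format,
                                               int samples[16])
{
    (void)internal_format;   /* the answer does not depend on the format */
    samples[0] = 1;
    return 1;
}

/* Installs the default so the hook is never left NULL; a driver that
 * supports multisampling would overwrite it afterwards. */
void init_driver_funcs(struct driver_funcs *funcs)
{
    funcs->query_samples_for_format = default_query_samples_for_format;
}
```

Installing the default up front means core code can call the hook unconditionally, which avoids the crash described for drivers that never set it.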
Christoph Bumiller	ee624ced36	nvc0: implement MP performance counters There's more, but this only adds (most) of the counters that are handled directly by the shader processors. The other counter domains are not handled on the multiprocessor and there are no FIFO object methods for configuring them. Instead, they have to be programmed by the kernel via PCOUNTER, and the interface for this isn't in place yet.	2013-03-29 00:33:01 +01:00
Christoph Bumiller	480359bcf6	nvc0: enable compression when supported	2013-03-29 00:33:01 +01:00
Christoph Bumiller	25722e3454	nvc0: use NOUVEAU_GETPARAM_GRAPH_UNITS to get MP count	2013-03-29 00:33:00 +01:00
Christoph Bumiller	443b247878	nv50,nvc0: fix 3d blits, restore viewport after blit	2013-03-29 00:33:00 +01:00
Christoph Bumiller	090e73fc46	nv50: fix 3D render target setup	2013-03-29 00:33:00 +01:00
Brian Paul	b54ce3738a	llvmpipe: put .bmp extension on dumped image files	2013-03-28 17:17:26 -06:00
Brian Paul	e90c56bc4e	llvmpipe: add 'f' suffix to 1.0 in fixed_to_float()	2013-03-28 17:17:26 -06:00
Brian Paul	499aa3ddb4	draw: fix some build breakage when LLVM is not used Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62883 Tested-by: Vinson Lee <vlee@freedesktop.org>	2013-03-28 17:15:58 -06:00
Marek Olšák	9ad9141917	mesa: handle STATE_CURRENT_ATTRIB_MAYBE_VP_CLAMPED for parameter printing Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-28 20:02:50 +01:00
Kenneth Graunke	9fe47756b3	i965: Tidy shader time printing code by using printf's field widths. We can use %-6s%-6s rather than manually counting characters, resulting in much more readable code. This necessitates a small secondary change: using "total fs16" and "" now causes the "" string to be padded out to 6 characters, resulting in too much whitespace. Splitting it into "total" and "fs16" produces the same output as before. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:44 -07:00
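The padding behaviour this message relies on can be reproduced with standard printf field widths. The helper below is a hypothetical illustration, not the driver's printing code.

```c
#include <stdio.h>

/* Hypothetical helper: formats two left-justified columns, each padded
 * with spaces to a minimum width of 6 characters. Longer strings are
 * not truncated. Returns the number of characters snprintf produced. */
int format_columns(char *buf, size_t size, const char *a, const char *b)
{
    return snprintf(buf, size, "%-6s%-6s", a, b);
}
```

With `"total"` and `"fs16"` the result is 12 characters wide. With `"total fs16"` and `""` the first column overflows its width and the empty second column is still padded to 6 spaces, giving 16 characters. That is the extra whitespace the message mentions.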
Eric Anholt	6192e9b377	i965/vs: Include URB payload setup in shader_time. This much more accurately reflects the cost of the vertex shader, since the payload setup is often a significant fraction of the instructions in the VS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:41 -07:00
Eric Anholt	55feb19704	i965/vs: Use a send from a 2-register VGRF for shader time writes. This will let us emit it later, after we're setting up MRFs for the URB write. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:37 -07:00
Eric Anholt	130138030a	i965/vs: Teach copy propagation about sends from GRFs. This incidentally also teaches it a bit about gen6 math -- we now allow unswizzled, unmodified GRF temps as the sources for math. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:34 -07:00
Eric Anholt	c3a22d42a8	i965/vs: Prepare split_virtual_grfs() for the presence of SENDs from GRFs. v2: Fix silly bool handling, and don't add new tabs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:29 -07:00
Eric Anholt	47e795d861	i965/fs: Include everything but the final FB write in shader_time. Previously, if you just wrote a constant color to the render target, no time got noted at all. This is convenient for doing single-instruction timings, but not so much for actual program analysis. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:23 -07:00
Eric Anholt	5c5218ea61	i965/fs: Switch shader_time writes to using GRFs. This avoids conflicts between shader_time and FB writes, so we can include more of the program under our profiling. This does mean hiding more of the message setup from the optimizer, which doesn't have a way to handle multi-reg sends from GRFs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:15 -07:00
Eric Anholt	5c039543db	i965: Provide more detailed information to match shader_time to programs. Ken asked me the other day what -1 vs 0 vs 3 vs other meant in our shader names, and I realized that it was really unclear. I'd like to do even better, like noting which one is the clear shader, but that would require exposing the metaops struct to the driver. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:11 -07:00
Eric Anholt	d2ba1c24b4	i965: Track ARB program state along with GLSL state for shader_time. This will let us do much better printouts for non-GLSL programs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:01 -07:00
Marek Olšák	a19f6e880a	st/dri: fix crash with HUD and single buffering	2013-03-28 18:17:21 +01:00
Marek Olšák	6b5dfa42c9	st/mesa: remove leftover printfs from ReadPixels Oops, I thought I had removed all debugging code.	2013-03-28 18:17:21 +01:00
Eric Anholt	eda434921d	i965/fs: Improve performance of copy propagation dataflow using bitsets. Reduces compile time of l4d2's slowest shader by 17.8% +/- 1.3% (n=10). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 09:48:50 -07:00
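As a rough illustration of why bitsets help here, the sketch below solves a generic forward "available copies" dataflow problem, where each copy is one bit and set operations become single word-wide AND/OR instructions. It is not the i965 implementation: `struct block`, `solve_copy_dataflow`, the 64-copy limit, and the conservative initialization are all assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_PREDS 4

/* Hypothetical basic block; each bit stands for one tracked copy. */
struct block {
    uint64_t copy;      /* copies made in this block, valid at its end */
    uint64_t kill;      /* copies invalidated by writes in this block */
    uint64_t livein;    /* copies available on entry */
    uint64_t liveout;   /* copies available on exit */
    int preds[MAX_PREDS];
    int num_preds;
};

/* Iterates to a fixed point. A copy is available on entry only if it is
 * available at the end of every predecessor (bitwise AND). It survives
 * the block unless killed, and the block's own copies are added (OR). */
void solve_copy_dataflow(struct block *blocks, int n)
{
    /* Conservative start: nothing is available on entry anywhere. */
    for (int i = 0; i < n; i++) {
        blocks[i].livein = 0;
        blocks[i].liveout = blocks[i].copy;
    }

    bool progress;
    do {
        progress = false;
        for (int i = 0; i < n; i++) {
            struct block *b = &blocks[i];
            uint64_t in = b->num_preds ? ~UINT64_C(0) : 0;
            for (int p = 0; p < b->num_preds; p++)
                in &= blocks[b->preds[p]].liveout;

            uint64_t out = b->copy | (in & ~b->kill);
            if (in != b->livein || out != b->liveout) {
                b->livein = in;
                b->liveout = out;
                progress = true;
            }
        }
    } while (progress);
}
```

Each iteration handles up to 64 copies per block with a handful of word operations instead of walking per-copy lists. Starting from empty entry sets gives a safe but possibly less precise answer in the presence of loops.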
Zack Rusin	d066133a76	llvmpipe/draw: Fix texture sampling in geometry shaders We weren't correctly propagating the samplers and sampler views when they were related to geometry shaders. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:02 -07:00
Zack Rusin	186a6bffdd	draw/llvm: Cleanup the store debugging code Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:02 -07:00

55885 Commits