KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Marek Olšák	fd7000bd78	radeonsi: pass TGSI processor type to si_shader_binary_read for dumping the parameter will be used later Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	3ce0a2fd7f	radeonsi: pass TGSI processor type to si_compile_llvm for dumping the parameter will be used later Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	dd79034ca6	radeonsi: rename shader parameter definitions and variables for more clarity Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Ilia Mirkin	34217018c4	nvc0/ir: add support for PK2H/UP2H Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-03 16:20:52 -05:00
Ilia Mirkin	20dee333f3	st/mesa: use PK2H/UP2H when supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-03 16:20:47 -05:00
Ilia Mirkin	e9f43d6333	gallium: add PIPE_CAP_TGSI_PACK_HALF_FLOAT to indicate UP2H/PK2H support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-03 16:20:41 -05:00
Ilia Mirkin	459e4532af	tgsi: update PK2H/UP2H channel behavior info Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-03 16:20:27 -05:00
Ilia Mirkin	6eb74b87b8	gallium: document PK2H/UP2H Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-03 16:19:57 -05:00
Samuel Pitoiset	0ab2c21b93	st/mesa: fix parameter names for tesseval/tessctrl prototypes Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-03 22:01:18 +01:00
Ilia Mirkin	bf34748b39	nouveau: fix double-const qualifier Reported by Tom^ on IRC. The original intent was to mark the pointer constant as well as the data being pointed to, so move the *. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-03 11:32:15 -05:00
Rob Clark	3684e899ea	freedreno/ir3: use NIR_PASS helper macros Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-03 09:11:27 -05:00
Rob Clark	317628dbb3	nir: extract out helper macros for running passes Note these are a bit uglier, due to avoidance of GNU C extensions. But drivers which do not need to be built with compilers that don't support the extension can wrap these macros with their own. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-01-03 09:11:27 -05:00
Rob Clark	23bd6affb2	freedreno/ir3: we require block_index metadata Found during NIR_TEST_CLONE=1 piglit run. We were using block->index but forgetting to require it. Causing things to not work with a cloned shader which didn't preserve block_index. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-03 09:11:27 -05:00
Rob Clark	74135f804a	freedreno/ir3: refactor NIR IR handling Immediately convert into NIR and do an initial key-agnostic lowering/ optimization pass. This should let us share most of the per-variant transformations between each variant, and hopefully minimize the draw- time variant creation part of the compilation process. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-03 09:11:27 -05:00
Rob Clark	ab4efb19dc	freedreno/ir3: drop unnecessary unreachable() case It will still hit a compile_assert() in emit_tex, which has the advantage of dumping out the offending shader. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-03 09:11:27 -05:00
Samuel Pitoiset	6a49fcfb1f	gallium/tests: fix build with clang compiler Nested functions are supported as an extension in GNU C, but Clang doesn't support them. This fixes compilation errors when (manually) building compute.c, or when passing --enable-gallium-tests to the configure script. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75165 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-03 12:18:00 +01:00
Samuel Pitoiset	53dddab78c	nv50,nvc0: optimize coherent buffer checking at draw time Instead of iterating over all the buffer resources looking for coherent buffers, we keep track of a context-wide count. This will save some iterations (and CPU cycles) in 99.99% of cases, because coherent buffers are rarely used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-03 12:17:05 +01:00
Kenneth Graunke	28dea26626	i965: Make TCS precompile use the TES primitive mode when available. If there's a linked TES program, we should just use the actual primitive mode. If not, just guess triangles (as we did before). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-02 18:46:16 -08:00
Kenneth Graunke	4a1c8a3037	i965: Push most TES inputs in SIMD8 mode. Using the push model for inputs is much more efficient than pulling inputs - the hardware can simply copy a large chunk into URB registers at thread creation time, rather than having the thread send messages to request data from the L3 cache. Unfortunately, it's possible to have more TES inputs than fit in registers, so we have to fall back to the pull model in some cases. However, it turns out that most tessellation evaluation shaders are fairly simple, and don't use many inputs. An arbitrary cut-off of 32 vec4 slots (16 registers) is more than sufficient to ensure that 100% of TES inputs are pushed for Shadow of Mordor, Unigine Heaven, GPUTest/TessMark, and SynMark. Note that unlike most SIMD8 stages, this actually reads packed vec4 data, since that is what our vec4 TCS programs write. Improves performance in GPUTest's tessmark_x64 microbenchmark by 93.4426% +/- 5.35541% (n = 25) on my Lenovo X250 at 1024x768. Improves performance in Synmark's Gl40TerrainFlyTess microbenchmark by 22.74% +/- 0.309394% (n = 5). Improves performance in Shadow of Mordor at low settings with tessellation enabled at 1280x720 by 2.12197% +/- 0.478553% (n = 4). shader-db statistics for files containing tessellation shaders: total instructions in shared programs: 184358 -> 181181 (-1.72%) instructions in affected programs: 27971 -> 24794 (-11.36%) helped: 226 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-02 18:46:16 -08:00
Kenneth Graunke	b022150d70	i965: Use LOAD_PAYLOAD for SIMD8 TES input loads, not MOV. We need a MOV to replicate g0.0<0,1,0> to all 8 channels. Since the message payload is a single register, MOV seemed more sensible than LOAD_PAYLOAD. However, MOV cannot be CSE'd, while LOAD_PAYLOAD can. All input loads can use the same header - we don't need to re-expand g0 every time. CSE accomplishes this, saving instructions. shader-db statistics for files containing tessellation shaders: total instructions in shared programs: 186923 -> 184358 (-1.37%) instructions in affected programs: 30536 -> 27971 (-8.40%) helped: 226 HURT: 0 total cycles in shared programs: 1009850 -> 1005356 (-0.45%) cycles in affected programs: 168206 -> 163712 (-2.67%) helped: 226 HURT: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-02 18:46:16 -08:00
Kenneth Graunke	53a9b6223f	i965: Move 3-src subnr swizzle handling into the vec4 backend. While most align16 instructions only support a SubRegNum of 0 or 4 (using swizzling to control the other channels), 3-src instructions actually support arbitrary SubRegNums. When the RepCtrl bit is set, we believe it ignores the swizzle and uses the equivalent of a <0,1,0> region from the subnr. In the past, we adopted a vec4-centric approach of specifying subnr of 0 or 4 and a swizzle, then having brw_eu_emit.c convert that to a proper SubRegNum. This isn't a great fit for the scalar backend, where we don't set swizzles at all, and happily set subnrs in the range [0, 7]. This patch changes brw_eu_emit.c to use subnr and swizzle directly, relying on the higher levels to set them sensibly. This should fix problems where scalar sources get copy propagated into 3-src instructions in the FS backend. I've only observed this with TES push model inputs, but I suppose it could happen in other cases. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-02 18:46:16 -08:00
Eric Anholt	64253fdb2e	vc4: Fix build from upload changes.	2016-01-02 17:33:19 -08:00
Nicolai Hähnle	8f384d07a8	gallium/radeon: send LLVM diagnostics as debug messages Diagnostics sent during code generation and the error message reported by LLVMTargetMachineEmitToMemoryBuffer are disjoint reporting mechanisms. We take care of both and also send an explicit message indicating failure at the end, so that log parsers can more easily tell the boundary between shader compiles. Removed an fprintf that could never be triggered. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:24 -05:00
Nicolai Hähnle	255ccd1e99	gallium/radeon: pass pipe_debug_callback into radeon_llvm_compile (v2) This will allow us to send shader debug info via the context's debug callback. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:24 -05:00
Nicolai Hähnle	f8cd11403a	radeonsi: send shader info as debug messages in addition to stderr output The output via stderr is very helpful for ad-hoc debugging tasks, so that remains unchanged, but having the information available via debug messages as well will allow the use of parallel shader-db runs. Shader stats are always provided (if the context is a debug context, that is), but you still have to enable the appropriate R600_DEBUG flags to get disassembly (since it is rather spammy and is only generated by LLVM when we explicitly ask for it). Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:24 -05:00
Nicolai Hähnle	4bb1c8dfec	radeonsi: pass pipe_debug_callback down into si_shader_binary_read (v2) This will allow us to send shader debug info. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:23 -05:00
Nicolai Hähnle	b6847062dd	gallium/radeon: implement set_debug_callback Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:23 -05:00
Marek Olšák	ecb2da1559	u_upload_mgr: allow specifying PIPE_USAGE_* for the upload buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:45 +01:00
Marek Olšák	37d0aea772	u_upload_mgr: remove alignment parameter from u_upload_create Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:45 +01:00
Marek Olšák	1bb79c3a7b	u_upload_mgr: pass alignment to u_upload_buffer manually Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Marek Olšák	e0f932846c	u_upload_mgr: pass alignment to u_upload_data manually Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Marek Olšák	020009f7cc	u_upload_mgr: pass alignment to u_upload_alloc manually The fixed alignment of u_upload_mgr will go away. This is the first step. The motivation is that one u_upload_mgr can have multiple users, each allocating from the same buffer, but requiring a different alignment. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Marek Olšák	ffc4716e97	u_upload_mgr: rework the application of alignment The function only aligned the size, but not the offset. The offset was aligned only when the previous suballocation was aligned. That yielded the correct offset alignment if the alignment was constant for all suballocations. Instead, directly align the offset, but allow an unaligned size. There is no change in behavior, because the alignment is constant at the moment. This is a prerequisite for allowing a variable alignment for suballocations. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Marek Olšák	36c93a6fae	st/mesa: fix GLSL uniform updates for glBitmap & glDrawPixels (v2) Spotted by luck. The GLSL uniform storage is only associated once in LinkShader and can't be reallocated afterwards, because that would break the association. v2: don't remove st_upload_constants calls, clarify why they're needed Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>	2016-01-02 15:15:44 +01:00
Marek Olšák	294ed5cd13	program: add _mesa_reserve_parameter_storage The next commit will use this. Reviewed-by: Brian Paul <brianp@vmware.com> Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>	2016-01-02 15:15:44 +01:00
Jordan Justen	a2942d8f26	mesa: Fix warning with MESA_VERBOSE=api for BindBufferRange Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-01 17:27:14 -08:00
Ilia Mirkin	c1d14c6817	nv50,nvc0: make sure there's pushbuf space and that we ref the bo early First off, we can't flush in the middle of a command. Secondly requesting the extra push space might cause a flush to happen. If that flush happens, we'd have to do the PUSH_REFN again. So instead do PUSH_REFN after the push space request. This helps avoid rare crashes with supertuxkart in libdrm due to assertion failures. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-01-01 19:52:41 -05:00
Ilia Mirkin	33a415310b	st/mesa: sort extensions enablement array Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-01 19:50:02 -05:00
Rob Clark	816ddee6b8	nir/lower_clip: add missing writemask on store Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-01-01 15:32:46 -05:00
Jordan Justen	3dce7bf268	mesa: Add MESA_VERBOSE=api for GL_ARB_program_interface_query v2: * Add braces '{}' when the _mesa_debug call spans multiple lines (Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-01 12:00:51 -08:00
Jordan Justen	36db91c4c4	mesa: Add MESA_VERBOSE=api for several indexed BindBuffer variants v2: * Add braces '{}' when the _mesa_debug call spans multiple lines (Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-01 12:00:51 -08:00
Dave Airlie	b835255992	st/glsl_to_tgsi: fix block movs for doubles While playing with fp64, I disabled varying packing to debug something else, and noticed we never emitted half the output movs for double matrix arrays. We should be moving the left index two slots for dual source doubles, and the right index two slots for non-vs input doubles. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:54 +10:00
Dave Airlie	d214ce86cf	st/glsl_to_tgsi: handle different attrib size vertex inputs are counted differently in some cases, with vertex inputs we need to make sure we don't double count them. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:54 +10:00
Dave Airlie	dc7b33c1f3	st/glsl_to_tgsi: readd the double_reg2 for input index mapping Otherwise we end up emitting the wrong index for the second double. This fixes dmat-vs-gs-tcs-tes.shader_test and dvec3-vs-gs-tcs-tes.shader_test Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:54 +10:00
Dave Airlie	84dbf3c4ff	st/glsl_to_tgsi: when doing reladdr get vec4 of correct type This fixes fp64 relative addressing, in the upcoming dmat-vs-gs-tcs-tes.shader_test. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	d87894b98f	st/glsl_to_tgsi: handle double immediates in matrices properly. This handles matrix initialisation properly. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	7351c7684f	st/glsl_to_tgsi: setup writemask for double arrays and matrices. It's important for the double instruction emission code that the writemasks are correct going in for double so it knows which channels to replicate. This fixes it for the array and matrix cases. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	14506dcae2	st/glsl_to_tgsi: handle doubles in array shrinking code. This code takes into account double inputs in the array shrinking code. This fixes some issues with doubles and geom/tess inputs. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	aab0c6c9c4	st/glsl_to_tgsi: handle doubles outputs in arrays. This handles the case where a double output is stored in an array, and tracks it for use in the double instruction emit code. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	fc890d703e	st/glsl_to_tgsi: store if dst is double in array This is just a precursor patch to a fix for doubles with tessellation that I've written. We need to descend into output arrays in that case and mark dst's as double. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
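
Commit ffc4716e97 above describes aligning the offset of each suballocation directly while leaving the size unaligned, so that different users of one upload buffer can later request different alignments. The following is only a minimal sketch of that idea, not the code from u_upload_mgr: the struct `sketch_upload` and the functions `sketch_align` and `sketch_upload_alloc` are hypothetical names invented for illustration, and the sketch assumes power-of-two alignments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an upload buffer; not Mesa's u_upload_mgr. */
struct sketch_upload {
    size_t size;   /* total number of bytes in the buffer */
    size_t offset; /* first free byte; may be unaligned */
};

/* Round value up to the next multiple of alignment.
 * Assumption: alignment is a power of two. */
static size_t sketch_align(size_t value, size_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Suballocate size bytes whose start offset honors the caller's alignment.
 * Returns 1 and stores the start offset on success, 0 if it does not fit. */
static int sketch_upload_alloc(struct sketch_upload *up, size_t size,
                               size_t alignment, size_t *out_offset)
{
    /* Align the offset itself instead of relying on the previous
     * allocation having had an aligned size. */
    size_t start = sketch_align(up->offset, alignment);

    if (start > up->size || size > up->size - start)
        return 0;

    *out_offset = start;
    up->offset = start + size; /* the size is deliberately left unaligned */
    return 1;
}
```

With this shape, a 5-byte allocation followed by a request with 16-byte alignment starts the second allocation at offset 16, regardless of how the first one was sized.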

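
Commit bf34748b39 above mentions a double-const qualifier where the intent was to make the pointer constant as well as the data it points to, fixed by moving the `*`. The sketch below illustrates the difference in plain C; it is not the nouveau code, and `sketch_info`, `sketch_table`, `sketch_first_id`, and `sketch_second_id` are hypothetical names used only for this example.

```c
#include <assert.h>

/* Hypothetical record type, used only for this illustration. */
struct sketch_info {
    int id;
};

static const struct sketch_info sketch_table[] = { { 1 }, { 2 } };

/* Written as "const struct sketch_info const *p", both qualifiers apply to
 * the pointed-to data; the second const is redundant (compilers commonly
 * warn about it) and the pointer itself stays reassignable. */

/* Pointer to const data: the data is read-only, the pointer can move. */
static int sketch_second_id(void)
{
    const struct sketch_info *cursor = sketch_table;
    cursor++; /* allowed: only the pointed-to data is const */
    return cursor->id;
}

/* Const pointer to const data: the second const follows the '*'. */
static int sketch_first_id(void)
{
    const struct sketch_info *const fixed = sketch_table;
    /* fixed++; would be rejected: the pointer itself is const */
    return fixed->id;
}
```

The placement of `const` relative to `*` decides what is qualified: before the `*` it applies to the data, after the `*` it applies to the pointer.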
75428 Commits

All Branches