KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Jason Ekstrand	d00abcc283	nir/algebraic: Add more lowering This commit adds lowering options for the following opcodes: - nir_op_fmod - nir_op_bitfield_insert - nir_op_uadd_carry - nir_op_usub_borrow Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-07 16:14:38 -08:00
Jason Ekstrand	b0d4ee520e	nir/opcodes: Fix up uadd_carry and usub_borrow Both were defined as returning bool but the gpu_shader5 functions are defined to return int. Also, we had the parameters for usub_borrow backwards in the folding expression. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-07 16:14:25 -08:00
Ilia Mirkin	67b31b3c59	nvc0: add ARB_indirect_parameters support I chose to make separate macros for this due to the additional complexity and extra scratch usage. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	9a54ccf30a	st/mesa: expose ARB_indirect_parameters when the backend driver allows Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	e1eab5a76f	mesa: add support for ARB_indirect_parameters draw functions Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	9327e2d312	mesa: add parameter buffer, used for ARB_indirect_parameters Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	b3e2c21fe5	glapi: add ARB_indirect_parameters definitions Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	7ca67c752b	nvc0: add support for real ARB_multi_draw_indirect The draw groups are now split up into groups of 32 if there's a non-packed stride, or in groups of 400-500 if the draw data is packed. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	d3e43baffe	nvc0: adjust indirect draw macros to handle multiple draws at once These are still invoked one at a time, but the underlying macro can handle multiple draws. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	2860f20859	st/mesa: add support for new mesa indirect draw interface This shifts all indirect draws to go through the new function. If the driver doesn't have support for multi draws, we break those up and perform N draws. Otherwise, we pass everything through for just a single draw call. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	d67b9ba9a1	gallium: add caps to expose support for multi indirect draws Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	3e11656694	gallium: add sufficient draw interface to allow new indirect features This makes it possible to support indirect multidraws as well as having the number of such draws to come from a separate GPU resource. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	60d0cfd429	vbo: create a new draw function interface for indirect draws All indirect draws are passed to the new draw function. By default there's a fallback implementation which pipes it right back to draw_prims, but eventually both the fallback and draw_prim's support for indirect drawing should be removed. This should allow a backend to properly support ARB_multi_draw_indirect and ARB_indirect_parameters. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-07 18:38:45 -05:00
Roland Scheidegger	2923c7a0ed	llvmpipe: do 64bit plane calculations in the sse path The sse path was pretty much disabled for practical purposes because the largest allowed fb size was 128x128. So, adapt it for 64bit plane calculations. This is actually not that difficult, though a problem is that we can't do a signed 32x32->64bit mul, only unsigned, so need to fix that up. Overall, the code still looks reasonable, though it's not like changes there in setup really make much of a difference in the end... Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 00:34:14 +01:00
Roland Scheidegger	fad283ba9e	llvmpipe: don't store eo as 64bit int eo, just like dcdx and dcdy, cannot overflow 32bit. Store it as unsigned though just in case (it cannot be negative, but in theory twice as big as dcdx or dcdy so this gives it one more bit). This doesn't really change anything, albeit it might help minimally on 32bit archs. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 00:34:14 +01:00
Roland Scheidegger	b61b9a377e	llvmpipe: use aligned data for the assembly program in setup Back in the day (before `24678700ed`) the values were not actually in a struct but even then I can't see why we didn't simply align the values. Especially since it's trivial to do so. (Not that it actually matters since the code is pretty much unused for now.) Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>	2016-01-08 00:34:13 +01:00
Roland Scheidegger	9db7309595	draw: initialize prim header flags when clipping lines Otherwise, clipped lines would have undefined stippling reset bit if line stippling is enabled. (Untested, and I just assume copying over the bits from the original line is actually the right thing to do.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-08 00:34:13 +01:00
Roland Scheidegger	64da11f052	draw: fix line stippling with unfilled prims The unfilled stage was not filling in the prim header, and the line stage then decided to reset the stipple counter or not based on the uninitialized data. This causes some failures in conform linestipple test (albeit quite randomly happening depending on environment). So fill in the prim header in the unfilled stage - I am not entirely sure if anybody really needs determinant after that stage, but there's at least later stages (wide line for instance) which copy over the determinant as well. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 00:34:13 +01:00
Timothy Arceri	5cf156c6b4	glsl: replace null check with assert This was added in `54f583a20`; since then error handling has improved. The test this was added to fix now fails earlier since `01822706ec`. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-08 09:12:45 +11:00
Nicolai Hähnle	051603efd5	i965: use _mesa_delete_buffer_object This is more future-proof, plugs the memory leak of Label and properly destroys the buffer mutex. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-07 17:07:12 -05:00
Nicolai Hähnle	1b74c02e83	i915: use _mesa_delete_buffer_object This is more future-proof, plugs the memory leak of Label and properly destroys the buffer mutex. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-07 17:07:09 -05:00
Nicolai Hähnle	8882b46226	radeon: use _mesa_delete_buffer_object This is more future-proof, plugs the memory leak of Label and properly destroys the buffer mutex. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-07 17:07:03 -05:00
Nicolai Hähnle	1c2187b1c2	st/mesa: use _mesa_delete_buffer_object This is more future-proof than the current code. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-01-07 17:06:58 -05:00
Nicolai Hähnle	6aed083b93	mesa/bufferobj: make _mesa_delete_buffer_object externally accessible gl_buffer_object has grown more complicated and requires cleanup. Using this function from drivers will be more future-proof. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-07 17:05:54 -05:00
Oded Gabbay	f41b6cfb07	llvmpipe: use sse2 conv code for altivec In lp_build_conv() and lp_build_conv_auto(), there is a special case of conversion when sse2 is present. That code path is suitable without any changes to altivec, because all the functions that are called in that code path already support altivec. This patch increases the FPS on the POWER arch across the board by 10%-25%. I checked ipers, glxgears, glxspheres64, openarena, xonotic and glmark2. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-07 22:07:02 +02:00
Marek Olšák	bca18057a3	radeonsi: adjust the parameters of si_shader_dump The function will be extended to dump all binaries shaders will consist of, so si_shader* makes sense here. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	0a51b010e5	radeonsi: move si_shader_dump call out of si_compile_llvm Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	b0df5f4c19	radeonsi: inline si_shader_binary_read Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	c9c031f3d0	radeonsi: move si_shader_dump call out of si_shader_binary_read Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	f8b34fe093	radeonsi: separate shader dumping code to si_shader_dump and *_dump_stats Eventually, I'd like to dump stats for several combined binaries, which is why you don't see a binary parameter in si_shader_dump_stats Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	ccd7d7e13d	radeonsi: add si_shader_destroy_binary Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	5c9f104567	radeonsi: don't pass si_shader to si_compile_llvm Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	54ed83669e	radeonsi: move si_shader_binary_upload out of si_compile_llvm Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	f20a76a4fd	radeonsi: always keep shader code, rodata, and relocs in memory We won't compile shaders in draw calls, but we will concatenate shader binaries according to states in draw calls, so keep the binaries. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	63345cfc3a	radeonsi: don't pass si_shader to si_shader_binary_read Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	2d3a96448a	radeonsi: don't pass si_shader to si_shader_binary_read_config Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	20b9b5d7f5	radeonsi: add struct si_shader_config There will be 1 config per variant, which will be a union of configs from {prolog, main, epilog}. For now, just add the structure. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	890873d106	radeonsi: move NULL exporting into a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	a72ed2f6bc	radeonsi: move MRT color exporting into a separate function This will be used by a fragment shader epilog. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	0ffe3d3772	radeonsi: use EXP_NULL for pixel shaders without outputs This never happens currently. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	677c65968b	radeonsi: only use LLVMBuildLoad once when updating color outputs at the end without LLVMBuildStore. So: - do LLVMBuildLoad - update the values as necessary - export Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	185267a6fd	radeonsi: export "undef" values for undefined PS outputs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	1ce659f820	radeonsi: move MRTZ export into a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	5f3e6b5b0f	radeonsi: simplify setting the DONE bit for PS exports First find out what the last export is and simply set the DONE bit there. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	e00f3f23b1	radeonsi: set SPI color formats and CB_SHADER_MASK outside of compilation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	4e597c25c7	radeonsi: write all MRTs only if there is exactly one output This doesn't fix a known bug, but better safe than sorry. Also, simplify the expression in si_shader.c. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	746a7a7498	radeonsi: determine SPI_SHADER_Z_FORMAT outside of shader compilation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:05 +01:00
Marek Olšák	2cb8bf90cd	radeonsi: determine DB_SHADER_CONTROL outside of shader compilation because the API pixel shader binary will not emulate alpha test one day, so the KILL_ENABLE bit must be determined elsewhere. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:05 +01:00
Marek Olšák	ff7e77724e	tgsi/scan: set which color components are read by a fragment shader This will be used by radeonsi. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:05 +01:00
Marek Olšák	18ec76730a	tgsi/scan: fix tgsi_shader_info::reads_z This has no users in Mesa. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:05 +01:00
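
The two nir commits at the top of this listing (`d00abcc283`, `b0d4ee520e`) concern `uadd_carry` and `usub_borrow`, which return an integer 0 or 1 rather than a bool. As a rough illustration of what lowering these opcodes to plain integer arithmetic can look like, here is a minimal C sketch. It is not taken from Mesa: the function names are hypothetical, and the operand order for the borrow is assumed to follow GLSL's `usubBorrow(x, y)`, where a borrow is needed when `x < y`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not Mesa code.
 * Carry out of a 32-bit unsigned add: the wrapped sum is smaller than
 * an operand exactly when the addition overflowed. Returns 0 or 1. */
uint32_t lowered_uadd_carry(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b; /* wraps modulo 2^32 */
    return sum < a ? 1u : 0u;
}

/* Hypothetical helper, not Mesa code.
 * Borrow out of the 32-bit unsigned subtract a - b: a borrow is needed
 * exactly when a < b. Swapping the operands here gives the kind of
 * "parameters backwards" bug the commit message mentions. */
uint32_t lowered_usub_borrow(uint32_t a, uint32_t b)
{
    return a < b ? 1u : 0u;
}

/* Small self-check comparing against 64-bit reference arithmetic. */
void lowering_self_check(void)
{
    uint32_t a = 0xFFFFFFF0u, b = 0x20u;
    uint64_t wide_sum = (uint64_t)a + (uint64_t)b;
    assert(lowered_uadd_carry(a, b) == (uint32_t)(wide_sum >> 32));
    assert(lowered_usub_borrow(b, a) == 1u); /* 0x20 - 0xFFFFFFF0 borrows */
    assert(lowered_usub_borrow(a, b) == 0u);
}
```

Both helpers return `uint32_t` rather than `bool` to mirror the integer return type described in `b0d4ee520e`.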

75866 Commits