KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Emil Velikov	a0d9279e3b	docs: add news item and link release notes for 11.1.4/11.2.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-09 14:28:20 +01:00
Emil Velikov	0c5752b672	docs: add sha256 checksums for 11.2.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-09 14:25:08 +01:00
Emil Velikov	f746aa348e	docs: add release notes for 11.2.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-09 14:25:07 +01:00
Emil Velikov	596c881162	docs: add sha256 checksums for 11.1.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-09 14:25:04 +01:00
Emil Velikov	f93d8a885c	docs: add release notes for 11.1.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-09 14:25:02 +01:00
Jose Fonseca	c521f2d737	scons: Improve Python module dependency discovery. Several NIR scripts were using `from ... import ...` syntax, which wasn't supported. Using Python standard library's modulefinder solves the problem with less effort and hacks. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-09 14:19:24 +01:00
Marek Olšák	172bfdaa9e	r300g: add support for PIPE_FORMAT_x8R8G8B8_* And set endian swap for packed formats the way it should be done in theory. This allows big endian to work again, but it can still be buggy. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71789 Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-09 13:11:40 +02:00
Daniel Stone	e54b2e902a	Revert "i965: Always use Y-tiled buffers on SKL+" This commit broke Weston, Mutter, and xf86-video-modesetting, on KMS. In order to use Y-tiled buffers, the kernel requires the tiling mode to be explicitly named through the I915_FORMAT_MOD_Y_TILED AddFB2 modifier; it disallows any attempt to infer the buffer's tiling mode. As the GBM API does not have a way to extract modifiers for a buffer, this commit broke all users of GBM on SKL+. Revert it for now, until we get a way to extract modifier information from GBM, and also let GBM users inform the implementation that it intends to use the modifiers. This reverts commit `6a0d036483`. Signed-off-by: Daniel Stone <daniels@collabora.com> Acked-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: Hans de Goede <hdegoede@redhat.com>	2016-05-09 10:35:55 +01:00
Dave Airlie	920d78a32c	mesa/shader_query: add missing subroutines cases ARRAY_SIZE and LOCATION should accept the SUBROUTINE_UNIFORM types. Fixes: GL43-CTS.program_interface_query.subroutines-vertex GL43-CTS.program_interface_query.subroutines-tess-control GL43-CTS.program_interface_query.subroutines-tess-eval GL43-CTS.program_interface_query.subroutines-geometry GL43-CTS.program_interface_query.subroutines-fragment GL43-CTS.program_interface_query.subroutines-compute Reviewed-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-09 06:30:52 +10:00
Kenneth Graunke	742bc53d04	spirv: Fix structure splitting with per-vertex interface arrays. We want to use interface_type, not vtn_var->type. They're normally equivalent, but for geometry/tessellation per-vertex interface arrays, we need to unwrap a level. Otherwise, we tried to iterate over a structure's members but instead used an array length. If the array length was longer than the number of fields in the structure, we'd crash. Fixes the CreatePipelineGeometryInputBlockPositive layer validation test. v2: Just use glsl_without_array() on the vtn_var type (requested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-07 15:44:41 -07:00
Kenneth Graunke	1896682d27	compiler: Add a C wrapper for glsl_type::without_array(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-07 15:44:41 -07:00
Nicolai Hähnle	b9e6e8e7d4	radeonsi: fix undefined behavior (memcpy arguments must be non-NULL) Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:59 -05:00
Nicolai Hähnle	146927ce7b	radeonsi: fix some reported undefined left-shifts One of these is an unsigned bitfield, which I suspect is a false positive, but gcc 5.3.1 complains about it with -fsanitize=undefined. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:59 -05:00
Nicolai Hähnle	60d2fc233b	gallium/radeon: clean left-shift undefined behavior Shifting into the sign bit of a signed int is undefined behavior. Unfortunately, there are potentially many places where this happens using the register macros. This commit is the result of running sed -ie "s/(((\(\w\+\)) & 0x\(\w\+\)) << \(\w\+\))/(((unsigned)(\1) \& 0x\2) << \3)/g" on all header files in gallium/{r600,radeon,radeonsi}. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:59 -05:00
Nicolai Hähnle	62b7958cd0	gallium: fix various undefined left shifts into sign bit Funnily enough, some of these were turned into a compile-time error by gcc with -fsanitize=undefined ("initializer is not a constant"). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:59 -05:00
Nicolai Hähnle	945c6887ab	compiler/glsl: do not downcast list sentinel This crashes gcc's undefined behaviour sanitizer. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:58 -05:00
Nicolai Hähnle	bdad1393a0	mesa/main: fix another undefined left shift Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:45:04 -05:00
Nicolai Hähnle	3e1cf8bf3f	mesa/main: define _NEW_xxx flags as unsigned shifts Since 1 << 31 complains about undefined behaviour; the others are changed only for consistency. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:44:33 -05:00
Bas Nieuwenhuizen	6291f19f71	radeonsi: Compute correct LDS size for fragment shaders. Not sure where the 36 came from, but we clearly need at least 48 bytes per attribute per primitive. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-06 21:40:17 +02:00
Eric Anholt	a1f698881e	vc4: Add support for loading immediate values in QIR. This will be used for resetting the uniform stream in the presence of branching, but may also be useful as an optimization to reduce how many uniforms we have to copy out per draw call (in exchange for increasing icache pressure).	2016-05-06 10:25:55 -07:00
Eric Anholt	890dc19eeb	vc4: Make vc4_qpu_validate() produce more verbose failures. Seeing the expansion of a QPU_GET_FIELD in an assert isn't very informative, and it's hard to find what's going wrong without getting a dump of the instruction that failed.	2016-05-06 10:25:55 -07:00
Eric Anholt	8e2d0843c0	vc4: Add a small QIR validate pass. This has caught a couple of bugs during loop development so far, and I should probably have written it long ago.	2016-05-06 10:25:55 -07:00
Eric Anholt	daaa9d579d	vc4: Fix the src count on exp2/log2. Found by the upcoming QIR validate pass.	2016-05-06 10:25:55 -07:00
Eric Anholt	d36b28402f	vc4: Reuse QPU disasm's cond flags in QIR. In the process, this made me flatten out the "%s%s%s%s" fprintf arguments.	2016-05-06 10:25:55 -07:00
Eric Anholt	419fee92ee	vc4: When emitting an instruction to an existing temp, mark it non-SSA. Prevents a bug in the later control-flow support series.	2016-05-06 10:25:55 -07:00
Eric Anholt	1387e722cd	vc4: Make sure that we don't overwrite the signal for PROG_END. We should have already emitted a NOP due to the last instruction being a TLB or VPM write. However, if you disable dead code elimination then you might get dead code at the end, and that dead code might have the signal bits set to something non-default, at which point you die in assertion failure.	2016-05-06 10:25:55 -07:00
Samuel Pitoiset	44de03b0f8	nvc0: unreference images when the context is destroyed Like other resources, we need to unreference all images. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-06 15:15:32 +02:00
Jose Fonseca	8ae78f7d28	nir: Remove spurious return from void function. Left over from `450c061362`. Trivial. Built locally with clang and gcc. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95296	2016-05-06 12:03:34 +01:00
Marek Olšák	901f57dff5	radeonsi: set DECOMPRESS_Z_ON_FLUSH if nr_samples >= 4 Vulkan always sets this. It only affects in-place Z decompression. This is recommended for performance, but what app uses MSAA depth texturing? Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-06 12:56:47 +02:00
Marek Olšák	4489d75a58	r600g: use the hw MSAA resolving if formats are compatible This allows resolving RGBA into RGBX. This should improve HL2 Lost Coast performance. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-06 12:56:47 +02:00
Kenneth Graunke	bd326c229c	Revert "i965: Switch to scalar TCS by default." This reverts commit `b593737ed8`. Apparently it causes GPU hangs on some image load store tests. Let's turn it back off until we figure out why.	2016-05-05 18:03:23 -07:00
Leo Liu	fef0e993a1	st/omx/enc: fix incorrect reference picture order for B frames Stacking frames is for drivers capable of dual-instance encoding. This feature is not enabled for B frames currently. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-05-05 19:26:43 -04:00
Jason Ekstrand	7bc987abe0	i965/fs: Move handling of samples_identical into the switch statement This is where we handle texop_texture_samples so it makes things more consistent.	2016-05-05 16:25:21 -07:00
Jason Ekstrand	3ba228f997	i965/fs: Simplify texture destination fixups There are a few different fixups that we have to do for texture destinations that re-arrange channels, fix hardware vs. API mismatches, or just shrink the result to fit in the NIR destination. These were all being done in a somewhat haphazard manner. This commit replaces all of the shuffling with a single LOAD_PAYLOAD operation at the end and makes it much easier to insert fixups between the texture instruction itself and the LOAD_PAYLOAD. Shader-db results on Haswell: total instructions in shared programs: 6227035 -> 6226669 (-0.01%) instructions in affected programs: 19119 -> 18753 (-1.91%) helped: 85 HURT: 0 total cycles in shared programs: 56491626 -> 56476126 (-0.03%) cycles in affected programs: 672420 -> 656920 (-2.31%) helped: 92 HURT: 42	2016-05-05 16:25:21 -07:00
Jason Ekstrand	7de0ae634e	i965/fs: stop including glsl/ir.h in brw_fs.h We are no longer using anything from GLSL IR in the FS backend.	2016-05-05 16:25:21 -07:00
Jason Ekstrand	a815499294	i965/fs: Merge nir_emit_texture and emit_texture The fs_visitor::emit_texture helper originated when we still had both NIR and IR visitors for the FS backend. Since the old visitor was removed, emit_texture serves no real purpose beyond arbitrarily splitting heavily-linked code across two functions.	2016-05-05 16:25:21 -07:00
Connor Abbott	4fab8dd5ea	nir: remove now-unused nir_foreach_block*_call() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:42 -07:00
Connor Abbott	7c36f9eb52	vc4: fixup for new nir_foreach_block() Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-05 16:19:41 -07:00
Connor Abbott	582815d9ea	ir3: fixup for new nir_foreach_block()	2016-05-05 16:19:41 -07:00
Jason Ekstrand	31fc4a2528	nir/lower_double_ops: fixup for new nir_foreach_block() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Jason Ekstrand	450c061362	nir/lower_double_pack: fixup for new nir_foreach_block() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Jason Ekstrand	8c807cc2a6	nir/gather_info: fixup for new foreach_block() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Connor Abbott	331b9f73a2	nir/lower_two_sided_color: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Connor Abbott	d40fbbc27e	nir/lower_tex: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Connor Abbott	8a7fe634d2	nir/lower_outputs_to_temporaries: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Kenneth Graunke	b593737ed8	i965: Switch to scalar TCS by default. Normally, we expect SIMD8 shaders to be more instructions than SIMD4x2 shaders, as it takes four instructions to operate on a vec4, rather than a single instruction. However, the benefit is that it can process 8 objects per shader thread instead of 2. Surprisingly, the shader-db statistics show an improvement in both instruction and cycle counts: Synmark: -31.25% instructions, -29.27% cycles, 0 hurt. Tessmark: -36.92% instructions, -37.81% cycles, 0 hurt. Unigine Heaven: -3.42% instructions, -17.95% cycles, 0 hurt. Shadow of Mordor: +13.24% instructions (26 with fewer instructions, 45 with more), -5.23% cycles (44 with fewer cycles, 27 with more cycles). Presumably, this is because the SIMD8 URB messages are a much more natural fit than the SIMD4x2 URB messages - there's a ton less header setup. I benchmarked Shadow of Mordor and Unigine Heaven on my Skylake GT3e, and the performance seems to be the same or increase ever so slightly (< 1 FPS difference). So I believe it's strictly superior. There's also a lot more optimization potential we can do in scalar mode. This will also help us finish fp64 support, as scalar support is going to land much sooner than vec4-mode support. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-05 14:24:00 -07:00
Kenneth Graunke	bc0062c54a	nir: Optimize out stores of undefs. There are a couple of cycle count changes in shader-db, but it's basically a wash. However, with the Broadwell scalar TCS backend enabled, many Shadow of Mordor shaders benefit from this patch. Because we don't batch up output writes for TCS, vec4 outputs might not have all components defined. Many output writes have a value of undef, which is useless. With scalar TCS, stats for tessellation shaders on Broadwell: total instructions in shared programs: 1283000 -> 1280444 (-0.20%) instructions in affected programs: 34302 -> 31746 (-7.45%) helped: 71 HURT: 0 total cycles in shared programs: 10798768 -> 10780682 (-0.17%) cycles in affected programs: 158004 -> 139918 (-11.45%) helped: 71 HURT: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-05 14:24:00 -07:00
Kenneth Graunke	c7a8b32700	nir: Replace vecN(undef, undef, ...) with a single undef. shader-db statistics on Broadwell: total instructions in shared programs: 8963409 -> 8962455 (-0.01%) instructions in affected programs: 60858 -> 59904 (-1.57%) helped: 318 HURT: 0 total cycles in shared programs: 71408022 -> 71406276 (-0.00%) cycles in affected programs: 398416 -> 396670 (-0.44%) helped: 199 HURT: 51 GAINED: 1 The only shaders affected were in Dota 2 Reborn. It also sets up for the next optimization. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-05 14:24:00 -07:00
Kenneth Graunke	49ea7454a1	nir: Rename opt_undef_alu to opt_undef_csel; update comments. This better reflects what it does. I plan to add other ALU optimizations as well, so the old name would be confusing. In preparation for that, also move the file comments about csels above the opt_undef_csel function, and delete the ones about there not being other optimizations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-05 14:24:00 -07:00
Kenneth Graunke	a808ba5965	i965: Rework passthrough TCS checks. According to Timothy, using program_string_id == 0 to identify the passthrough TCS is going to be problematic for his shader cache work. So, change it to strcmp() the name at visitor creation time. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-05 14:24:00 -07:00
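
Commit 60d2fc233b above quotes a sed expression that adds an `(unsigned)` cast to register-field macros so the shift no longer lands in the sign bit of a signed int. As a rough illustration only, the same substitution can be sketched with Python's `re` module. The function name and the `S_EXAMPLE_FIELD` macro are hypothetical and do not come from the Mesa headers; the pattern is a direct translation of the quoted sed expression, not the tooling the author used.

```python
import re

# Translation of the sed expression quoted in commit 60d2fc233b.
# In sed's basic regex syntax "(" is a literal and "\(" opens a group;
# Python's syntax is the reverse, so the escapes are swapped here.
FIELD_SHIFT = re.compile(r"\(\(\((\w+)\) & 0x(\w+)\) << (\w+)\)")


def add_unsigned_cast(line: str) -> str:
    """Rewrite (((x) & 0xM) << S) as (((unsigned)(x) & 0xM) << S)."""
    return FIELD_SHIFT.sub(r"(((unsigned)(\1) & 0x\2) << \3)", line)


# Hypothetical macro in the style of a register-field header (assumption).
before = "#define S_EXAMPLE_FIELD(x) (((x) & 0x1) << 31)"
print(add_unsigned_cast(before))
```

Lines that do not contain the mask-and-shift pattern pass through unchanged, which is why a blanket run over whole header files is safe for this particular rewrite.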

