KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Samuel Pitoiset	5e32cc9192	nv50/ir: fix a comment in canDualIssue() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 16:50:25 +02:00
Samuel Pitoiset	70834d05cd	nv50/ir: fix SUSTx constraints on Kepler To prevent out-of-bounds access and format mismatch we add a predicate on sustp, but we have to account for it when the sources are condensed because a predicate is a source. Using the range 3:6 will only condense the input data and it's always the case. This also fixes constraints when an indirect access is used. This ensures that sources are correctly aligned. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 16:06:14 +02:00
Kenneth Graunke	9c0d16adc1	i965: Just read the existing tally on EndTransformFeedback if paused. If the transform feedback object is paused when ending, then there are no new snapshots to add to the tally. In fact, we haven't written a starting snapshot, so we'd best not try and compute (end - start). Just load the existing tally so we can convert it to the number of vertices written and store it to the final result location. This is the Haswell+ equivalent of the previous commit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-20 19:55:42 -07:00
Kenneth Graunke	915f7c25fa	i965: Don't write a counter snapshot on EndTransformFeedback if paused. If the transform feedback object is paused, then we've already written an ending counter snapshot. We don't want to write another one. This fixes assertions in GL33-CTS.transform_feedback.api_errors_test, which calls EndTransformfeedback after PauseTransformFeedback. On the next BeginTransformFeedback, we tried to tally up the results, and saw an odd number of snapshots (due to the double-end), and tripped an assertion. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-20 19:55:40 -07:00
Kenneth Graunke	47fbe178fa	mesa: Call TransformFeedback driver hooks before setting flags. This way, the driver's EndTransformFeedback() hook can tell whether the transform feedback operation was paused. It's also convenient to have Paused remain false until the driver's PauseTransformFeedback hook finishes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-20 19:55:26 -07:00
Kenneth Graunke	f7eb95a526	nir: Fix crash in nir_lower_wpos_center(). Otherwise we rewrote the fadd to use itself, causing crashes in validation. Instead, start after the last use like we should. A brown paper bag fix. Fixes crashes in several Vulkan tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-20 16:33:24 -07:00
Dave Airlie	0970c563d6	nir: remove dead glsl variables before lowering io. For cull distance GLSL will let unsized unused arrays get into the backend, we should nuke those straight away, to save caring about them later. This fixes: arb_separate_shader_objects/linker/large-number-of-unused-varyings as a side effect (even without culling changes). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-21 08:56:45 +10:00
Kenneth Graunke	de45da6a8c	spirv: Handle the PixelCenterInteger execution mode. This isn't allowed by Vulkan, but might be useful someday for SPIR-V in OpenGL (if that ever becomes a thing). It's easy enough to hook up, and as precedent, we already do so for OriginLowerLeft. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 14:44:22 -07:00
Kenneth Graunke	9b8b3f7501	i965: Delete dead dFdy flipping code. Rob's nir_lower_wpos_ytransform() pass flips dFdy in the opposite case of what I expected, so we always take the negate_value case. It doesn't really matter. v2: Write src0 before src1 in ADD instructions (requested by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 14:30:09 -07:00
Kenneth Graunke	08bc74e694	i965: Delete brw_wm_prog_key::render_to_fbo and drawable_height. Now that we handle flipping and other gl_FragCoord transformations via a uniform, these key fields have no users. This patch actually eliminates the associated recompiles. The Tomb Raider benchmark's minimum FPS increases from ~1 FPS to a reasonable number. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 14:30:09 -07:00
Kenneth Graunke	dac10e8a13	i965, anv: Use NIR FragCoord re-center and y-transform passes. This handles gl_FragCoord transformations and other window system vs. user FBO coordinate system flipping by multiplying/adding uniform values, rather than recompiles. This is much better because we have no decent way to guess whether the application is going to use a shader with the window system FBO or a user FBO, much less the drawable height. This led to a lot of recompiles in many applications. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 14:30:08 -07:00
Kenneth Graunke	6e5d86c07a	nir: Add a simple nir_lower_wpos_center() pass for Vulkan drivers. nir_lower_wpos_ytransform() is great for OpenGL, which allows applications to choose whether their coordinate system's origin is upper left/lower left, and whether the pixel center should be on integer/half-integer boundaries. Vulkan, however, has much simpler requirements: the pixel center is always half-integer, and the origin is always upper left. No coordinate transform is needed - we just need to add <0.5, 0.5>. This means that we can avoid using (and setting up) a uniform. I thought about adding more options to nir_lower_wpos_ytransform(), but making a new pass that never even touched uniforms seemed simpler. v2: Use normal iterator rather than _safe variant (noticed by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:30:00 -07:00
Kenneth Graunke	12ab7fc6ac	nir: Don't use ffma in nir_lower_wpos_ytransform(). ffma is an explicitly fused multiply add with higher precision. The optimizer will take care of promoting mul/add to fma when it's beneficial to do so. This fixes failures on Gen4-5 when using this pass, as those platforms don't actually implement fma(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	b8b1b1c34c	nir: Handle fddy_fine and fddy_coarse in nir_lower_wpos_ytransform. These also need flipping! Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	4b7577fad8	nir: Make lower_wpos_ytransform_block a void function. The return value was used for the old nir_foreach_block callback system, but at this point it no longer means anything. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	88ea960aa7	nir: Make nir_lower_wpos_ytransform() match FragCoord by location. gl_FragCoord is a shader input with location == VARYING_SLOT_POS. ARB_fragment_programs have an equivalent input at VARYING_SLOT_POS, but it isn't called gl_FragCoord. We do want to transform it. Matching by location guarantees we catch both. Fixes several fp tests on a branch which uses this pass on i965. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	c9192fcbd2	nir: Add interp_var_at_offset flipping. The Y-offset needs flipping as well, similar to ddy. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	287f099db1	nir: Fix fddy swizzles in nir_lower_wpos_ytransform(). The original value might have been swizzled. That's taken care of in the fmul source - we don't want to reswizzle it again. Fixes validation failures in glsl-derivs-varyings on a branch of mine which uses this pass in i965. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	7fe9a19302	nir: Fix wpos_ytransform lowering state_slot swizzle. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:28:30 -07:00
Kenneth Graunke	1539009bf0	i965: Fix brw_regs_equal() for NaN and positive/negative zero. We'd like the comparisons to mean "the exact same bits". Comparing doubles won't do that for NaN values or positive vs. negative zero. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 14:28:06 -07:00
Dave Airlie	b19a0d506d	virgl: handle cull distance cap. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-21 06:19:54 +10:00
Rob Herring	2235b80f2a	virgl: Add missing texture transfer_inline_write transfer_inline_write cannot be NULL and the virgl renderer doesn't support inline writes for textures, so add the default version. This fixes a crash in st_TexSubImage since commit `fb9fe352ea` ("st/mesa: use transfer_inline_write for memcpy TexSubImage path"). Cc: Marek Olšák <marek.olsak@amd.com> Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-21 06:07:18 +10:00
Kristian Høgsberg Kristensen	12dc89d844	anv: Merge in my TODO list items Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2016-05-20 10:35:57 -07:00
Matt Turner	015f2207cf	mesa: Replace uses of Shared->Mutex with hash-table mutexes We were locking the Shared->Mutex and then using calling functions like _mesa_HashInsert that do additional per-hash-table locking internally. Instead just lock each hash-table's mutex and use functions like _mesa_HashInsertLocked and the new _mesa_HashRemoveLocked. In order to do this, we need to remove the locking from _mesa_HashFindFreeKeyBlock since it will always be called with the per-hash-table lock taken. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-20 10:05:09 -07:00
Matt Turner	aded1160e5	hash: Add _mesa_HashRemoveLocked() function. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-20 10:05:09 -07:00
Matt Turner	fb5dcb81cc	i965: Pass nir_src/nir_dest by reference. Cuts 6K of .text. text data bss dec hex filename 5772372 264648 29320 6066340 5c90a4 lib/i965_dri.so before 5766074 264648 29320 6060042 5c780a lib/i965_dri.so after Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 10:04:06 -07:00
Mark Janes	9ca5ec2a31	glsl: Guard against NULL dereference This trivially corrects mesa `3ca1c221`, which introduced a check that crashes when a match is not found. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005 Fixes: piglit.spec.glsl-1_50.compiler.interface-blocks-name-reused-globally-4.vert Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-20 09:52:49 -07:00
Nanley Chery	9b8c4000d0	anv: Enable textureCompressionASTC_LDR on Gen9+ Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 09:27:11 -07:00
Nanley Chery	0d2847e177	anv/format: Reorder ASTC mappings to match ISL enum ordering Keep the lists consistent for ease of use. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 09:27:11 -07:00
Nanley Chery	f3ed3a0a15	genxml: Expand SKL's SurfaceFormat field width for ASTC In the expanded field, only ASTC format enums have the MSB set to 1. Expanding the field width makes the process of handling these formats identical to the way other formats are handled. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 09:27:11 -07:00
Nanley Chery	a141576887	isl: Handle npot ASTC block dimensions on Gen9+ Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 09:27:11 -07:00
Nanley Chery	de86fb875d	isl: Add 2D ASTC format layouts and enums Also, make changes needed for successful compilation and registration as a texture compression mode. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 09:27:11 -07:00
Youry Metlitsky	4e2c9a0435	mesa: Build EGL without X11 headers after interop patchset Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-20 08:44:18 -07:00
Rob Clark	df361fc58c	nir/validate: assume() that hashtable entry exists At this point, it would require a logic error in nir_validate to not have already populated this hashtable entry, but coverity doesn't realize that: CID 1265547 (#1 of 1): Dereference null return value (NULL_RETURNS)3. dereference: Dereferencing a null pointer entry. CID 1271039 (#1 of 1): Dereference null return value (NULL_RETURNS)3. dereference: Dereferencing a null pointer entry. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 11:13:50 -04:00
Rob Clark	fcd6b3f42b	nir: coverity unitialized pointer read Not sure how coverity arrives at the conclusion that we can read comp[j] unitialized (around line 204), other than not being aware that ncomp is greater than 1 so it won't underflow in the 'if (tex->is_array)' case. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 11:13:50 -04:00
Rob Clark	53c48feae0	nir: coverity sign-extension fix Not 100% sure, but I think being an unsigned literal will help: CID 1358505 (#1 of 1): Unintended sign extension (SIGN_EXTENSION)sign_extension: Suspicious implicit sign extension: load1->def.num_components with type unsigned char (8 bits, unsigned) is promoted in load1->def.num_components * (load1->def.bit_size / 8) to type int (32 bits, signed), then sign-extended to type unsigned long (64 bits, unsigned). If load1->def.num_components * (load1->def.bit_size / 8) is greater than 0x7FFFFFFF, the upper bits of the result will all be 1. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 11:13:50 -04:00
Rob Clark	bb993da795	nir/glsl_to_nir: quell some uninit_member coverity errors Signed-off-by: Rob Clark <robclark@freedesktop.org> Acked-by: Matt Turner <mattst88@gmail.com>	2016-05-20 11:13:50 -04:00
Rob Clark	3a1bbd6a0a	freedreno/ir3: need to lower fmod too Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-20 11:13:50 -04:00
Mark Janes	a2d28ddc01	i965: Fix strerror error code sign This trivial fix to error-handling corrects the sign of drm error codes before passing them to strerror. Identified by Coverity: CID1358581	2016-05-20 05:58:18 -07:00
Jason Ekstrand	eb384daae8	nir/spirv: Handle the NonReadable decoration on struct members	2016-05-19 21:18:59 -07:00
Jason Ekstrand	ea8c11fdc2	anv/pipeline: Bounds-check resource indices when robuts_buffer_access is enabled	2016-05-19 21:18:59 -07:00
Jason Ekstrand	902628bce6	anv/pipeline: Only do buffer bounds checks if robustBufferAccess is enabled	2016-05-19 21:18:59 -07:00
Jason Ekstrand	23090b51e0	anv/apply_dynamic_offsets: Use rewrite_src instead of a regular assignment Originally we removed the instruction, changed the source, and then re-inserted it. This works, but nir_instr_rewrite_src is a bit more obviously correct.	2016-05-19 21:18:59 -07:00
Jason Ekstrand	c29ffea6d1	anv/device: Add a boolean for robust buffer access	2016-05-19 21:18:59 -07:00
Jason Ekstrand	d5b4638d6a	anv: Add a TODO file	2016-05-19 20:09:31 -07:00
Dave Airlie	3ca1c2216d	glsl: handle same struct redeclaration (v2) This works around a bug in older version of UE4, where a shader defines the same structure twice. Although we aren't sure this is correct GLSL (it most likely isn't) there are enough UE4 based things out there we should deal with this. This drops the error to a warning if the struct names and contents match. v1.1: do better C++ on record_compare declaration (Rob) v2: restrict this to desktop GL only (Ian) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-20 11:22:52 +10:00
Matt Turner	8a65b5135a	i965/fs: Recognize and emit ld_lz, sample_lz, sample_c_lz. Ken suggested instead of a big and complicated optimization pass, to just recognize the operations here. It's certainly less code and a lot prettier, but it seems to actually perform worse for currently unknown reasons. total instructions in shared programs: 8923452 -> 8904108 (-0.22%) instructions in affected programs: 814563 -> 795219 (-2.37%) helped: 3336 HURT: 10 total cycles in shared programs: 66970734 -> 66651476 (-0.48%) cycles in affected programs: 10582686 -> 10263428 (-3.02%) helped: 2438 HURT: 691 total spills in shared programs: 1811 -> 1789 (-1.21%) spills in affected programs: 85 -> 63 (-25.88%) helped: 4 total fills in shared programs: 3143 -> 3109 (-1.08%) fills in affected programs: 167 -> 133 (-20.36%) helped: 4 LOST: 2 GAINED: 36 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-19 17:27:49 -07:00
Matt Turner	75dccf5ac2	i965: Add infrastucture for sample lod-zero operations. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-19 17:27:49 -07:00
Matt Turner	07353599e0	i965/fs: Add and use get_nir_src_imm(). The next patch wants to inspect the LOD argument and do something different if it's 0.0f. But at that point we've emitted a MOV for it and we just have a register to look at. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-19 17:27:49 -07:00
Ilia Mirkin	8bf5493899	nvc0: account for shader-allocated local memory needs Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-19 20:20:23 -04:00


81645 Commits, All Branches