KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Karol Herbst	78c5336ca9	nouveau: fix nir and TGSI shader cache collision v9: rename variable to driver_flags use constants for shader cache flags Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-03-17 10:33:28 +01:00
Karol Herbst	f014ae3c7c	nouveau: add support for nir not all those nir options are actually required, it just made the work a little easier. v2: fix asserts parse compute shaders don't lower bitfield_insert v3: fix memory leak v4: don't lower fmod32 v5: set lower_all_io_to_temps to false fix memory leak because we take over ownership of the nir shader merge: use the lowering helper v6: include TGSI debug header for proper assert call add nv50 support v7: fix Automake build v8: free shader only for the set shader type v9: check for IR type inside get_compiler_options squash "nouveau: add env var to make nir default" fix memory leak when creating compute shaders use debug_get_bool_option as it is available in non debug builds return failure if unsupported IR is encountered don't lower fpow in nir lower int 64 divmod inside nir to prevent crashes Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-03-17 10:33:28 +01:00
Karol Herbst	a211c92c4b	nv50/ir: add lowering helper if we start supporting multiple input IRs we might want to move lowering code into a common place and keep the initial translation simplier. This will also allows us to react on ISA changes more easily. v5: also handle SAT v6: rename type variables fixed lowering of NEG add lowering of NOT v8: don't require C++11 features Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-03-17 10:33:28 +01:00
Karol Herbst	a0393010c4	nv50/ir: move common converter code in base class v2: remove TGSI related bits Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-03-17 10:33:28 +01:00
Karol Herbst	bb50cb66f0	nvc0: print the shader type when dumping headers this makes debugging the shader header a little easier Acked-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:27 +01:00
Bas Nieuwenhuizen	213de3ea99	radeonsi: Remove implicit const cast. Fixes: `b9e02fe138` "gallium: add pipe_grid_info::last_block" Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-03-17 00:07:38 +01:00
Bas Nieuwenhuizen	158d45db0c	gitlab-ci: Build turnip. No autotools build to care about. The half baked turnips param is kind of ugly, but felt like a waste defining more variables for it now. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-16 14:38:51 +00:00
Bas Nieuwenhuizen	42ed6d9789	turnip: Deconflict vk_format_table regeneration Avoids src/freedreno/vulkan/meson.build:42:0: ERROR: Tried to create target "vk_format_table.c", but a target of that name already exists. when building both radv and turnip. Fixes: `26380b3a9f` "turnip: Add driver skeleton (v2)" Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-16 14:38:51 +00:00
Bas Nieuwenhuizen	e1161d2ea7	turnip: Fix GCC compiles. Apparently GCC does not consider static const variables to be integer constants, and hence the array size and the static assert result in compile failures. Fixes: `4b9f967cd1` "turnip: add a more complete format table" Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-16 14:38:51 +00:00
Jason Ekstrand	d3386e73c5	intel/nir: Lower array-deref-of-vector UBO and SSBO loads This fixes a serious performance issue with DXVK: https://github.com/doitsujin/dxvk/issues/937 This was caused by a recent change that to improve performance on RADV which back-fired on ANV and killed performance for some apps: `e5a06d3f4a` Throwing in this bit of lowering lets us come along and CSE those UBO loads (or copy-prop for SSBO load) and get one load where we previously would have gotten several. VkPipeline-db results on Kaby Lake: total instructions in shared programs: 5115361 -> 5073185 (-0.82%) instructions in affected programs: 1754333 -> 1712157 (-2.40%) helped: 5331 HURT: 63 total cycles in shared programs: 2544501169 -> 2481144545 (-2.49%) cycles in affected programs: 2531058653 -> 2467702029 (-2.50%) helped: 9202 HURT: 4323 total loops in shared programs: 3340 -> 3331 (-0.27%) loops in affected programs: 9 -> 0 helped: 9 HURT: 0 total spills in shared programs: 3246 -> 3053 (-5.95%) spills in affected programs: 384 -> 191 (-50.26%) helped: 10 HURT: 5 total fills in shared programs: 4626 -> 4452 (-3.76%) fills in affected programs: 439 -> 265 (-39.64%) helped: 10 HURT: 5 All of the shaders with hurt spilling were in Rise of the Tomb Raider which also had shaders solidly helped in the spilling department. Not shown in those results (because I've not had success dumping the shaders) is Witcher 3 where this reduces spilling and improves over-all perf by around 20-25%. There were no shader-db changes. Apparently, this just isn't a pattern that happens in OpenGL. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: "19.0" mesa-stable@lists.freedesktop.org	2019-03-15 23:10:27 -05:00
Jason Ekstrand	35b8f6f40b	nir: Add a new pass to lower array dereferences on vectors This pass was originally written for lowering TCS output reads and writes but it is also applicable just about anything including UBOs, SSBOs, and shared variables. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 23:10:27 -05:00
Jason Ekstrand	fe9a6c0f14	nir/builder: Add a vector extract helper This one's a tiny bit better than what we had in spirv_to_nir because it emits a binary tree rather than a linear walk. It also doesn't leave around unneeded bcsel instructions for a constant index and returns an undef for constant OOB access. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 23:10:26 -05:00
Gert Wollny	9bb63e9a7c	softpipe: Enable PIPE_CAP_MIXED_COLORBUFFER_FORMATS It seems softpipe actually supports this. This change enables the following piglits as passing without regressions in the gpu test set: gl-3.1-mixed-int-float-fbo gl-3.1-mixed-int-float-fbo int_second fbo-blending-format-quirks Changes for deqp: dEQP-GLES2.functional.fbo.completeness.attachment_combinations.rbo_tex_none_none QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.rbo_tex_none_rbo QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.rbo_tex_none_tex QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.rbo_tex_rbo_none QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.rbo_tex_tex_none QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.tex_rbo_none_none QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.tex_rbo_none_rbo QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.tex_rbo_none_tex QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.tex_rbo_rbo_none QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.tex_rbo_tex_none QualityWarning -> Pass dEQP-GLES3.functional.fbo.completeness.samples.rbo0_rbo0_tex Fail -> Pass dEQP-GLES3.functional.fbo.completeness.samples.rbo0_tex_none Fail -> Pass dEQP-GLES3.functional.fbo.completeness.samples.rbo1_rbo1_rbo1 Fail -> Pass dEQP-GLES3.functional.fragment_out.random.* NotSupported -> Pass dEQP-GLES31.functional.shaders.builtin_functions.common.frexp._fragment Fail -> Pass dEQP-GLES31.functional.shaders.builtin_functions.common.frexp._vertex Fail -> Pass dEQP-GLES31.functional.shaders.builtin_functions.precision.frexp._fragment. Fail -> Pass dEQP-GLES31.functional.shaders.builtin_functions.precision.frexp._vertex. Fail -> Pass Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-15 19:04:05 +01:00
Rob Clark	ca11f9263e	freedreno/ir3/cp: fix ldib bug Something that we didn't hit earlier because of the extra shr.b Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-15 10:52:11 -07:00
James Zhu	abfd572bd2	gallium/auxiliary/vl: Change weave compute shader implementation Use 2D_ARRARY instead of RECT to fetch texels for weave compute shader. Problem 2,3: Fixed interpolation issue with weave de-interlace Fixes: `9364d66cb7` (Add video compositor compute shader render) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109646 Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Tested-by: Bruno Milreu <bmilreu@gmail.com>	2019-03-15 11:53:15 -04:00
James Zhu	a8ee07d83e	gallium/auxiliary/vl: Change grid setting Using draw area for grid setting instead of destination buffer size. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Tested-by: Bruno Milreu <bmilreu@gmail.com>	2019-03-15 11:53:15 -04:00
James Zhu	998dca4dbb	gallium/auxiliary/vl: Increase shader_params size Increase shader_params size to pass sampler data to compute shader during weave de-interlace. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Tested-by: Bruno Milreu <bmilreu@gmail.com>	2019-03-15 11:53:15 -04:00
Marek Olšák	b276e8358a	omx: add a compute path in enc_LoadImage_common Acked-by: Leo Liu <leo.liu@amd.com>	2019-03-15 11:53:08 -04:00
Marek Olšák	323e7be91c	omx: clean up enc_LoadImage_common - add *pipe - add documentation Acked-by: Leo Liu <leo.liu@amd.com>	2019-03-15 11:53:08 -04:00
Marek Olšák	b9e02fe138	gallium: add pipe_grid_info::last_block The OpenMAX state tracker will use this. RadeonSI is adapted to use pipe_grid_info::last_block instead of its internal state. Acked-by: Leo Liu <leo.liu@amd.com>	2019-03-15 11:53:08 -04:00
Alejandro Piñeiro	34b3b92bbe	nir/xfb: move varyings info out of nir_xfb_info When varyings was added we moved to use to dynamycally allocated pointers, instead of allocating just one block for everything. That breaks some assumptions of some vulkan drivers (like anv), that make serialization and copying easier. And at the same time, varyings are not needed for vulkan. So this commit moves them out. Although it seems a little an overkill, fixing the anv side would require a similar, or more, changes, so in the end it is about to decide where do we want to put our effort. v2: (from Jason review) * Don't use a temp variable on the _create methods, just return result of rzalloc_size * Wrap some lines too long. Fixes: `cf0b2ad486` ("nir/xfb: adding varyings on nir_xfb_info and gather_info") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-15 11:59:32 +01:00
Samuel Pitoiset	d5befdbe4a	radv: always load 3 channels for formats that need to be shuffled This fixes a rendering issue with Hellblade and DXVK. Fixes: `a66b186beb` ("radv: use typed buffer loads for vertex input fetches") Reported-by: Philip Rebohle <philip.rebohle@tu-dortmund.de> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-15 11:35:52 +01:00
Mathias Fröhlich	ebc15ecde5	mesa: Add assert to _mesa_primitive_restart_index. Make sure the inde_size parameter is meant to be in bytes. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Mathias Fröhlich	d66faa54b2	vbo: Fix GL_PRIMITIVE_RESTART_FIXED_INDEX in display list compiles. The maximum value primitive restart index is different for each index data type. Use the appropriate fixed restart index value. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Mathias Fröhlich	a503f0562a	vbo: Fix basevertex handling in display list compiles. The standard requires that the primitive restart comparison happens before the basevertex value is added. Do this now, drop a reference to the standard why this happens at this place. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Mathias Fröhlich	94b64eb462	mesa: Use mapping tools in debug prints. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Mathias Fröhlich	a8183c1334	mesa: Remove _ae_{,un}map_vbos and dependencies. Since mapping and unmapping the buffer objects in a VAO is handled directly from the VAO, this part of the _NEW_ARRAY state is no longer used. So remove this part of array element state. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Mathias Fröhlich	b89ae55a70	mesa: Replace _ae_{,un}map_vbos with _mesa_vao_{,un}map_arrays Due to the use of bitmaps, the _mesa_vao_{,un}map_arrays functions should provide comparable runtime efficienty to the currently used _ae_{,un}map_vbos functions. So use this functions and enable further cleanup. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Mathias Fröhlich	b43fae364f	mesa: Use _mesa_array_element in dlist save. Make use of the newly factored out _mesa_array_element function in display list compilation. For now that duplicates out the primitive restart logic. But that turns out to need a fix in display list handling anyhow. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Mathias Fröhlich	80e319485a	mesa: Factor out _mesa_array_element. The factored out function handles emitting the vertex attributes at the given index. The now public accessible function gets used in the following patches. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Mathias Fröhlich	85fd380878	mesa: Implement helper functions to map and unmap a VAO. Provide a set of functions that maps or unmaps all VBOs held in a VAO. The functions will be used in the following patches. v2: Update comments. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Jason Ekstrand	efa4fc0ebd	st/mesa: Let NIR lower UBO and SSBO access when we have it Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	be2990d8fb	i965: Stop setting LowerBuferInterfaceBlocks Instead, we do UBO and SSBO deref lowering in NIR after we've given it a chance to optimize SSBO access: Shader-db results on Kaby Lake: total instructions in shared programs: 15235775 -> 15235484 (<.01%) instructions in affected programs: 14992 -> 14701 (-1.94%) helped: 19 HURT: 20 total cycles in shared programs: 339220331 -> 339027307 (-0.06%) cycles in affected programs: 79831981 -> 79638957 (-0.24%) helped: 540 HURT: 602 total loops in shared programs: 4402 -> 4348 (-1.23%) loops in affected programs: 186 -> 132 (-29.03%) helped: 27 HURT: 0 total spills in shared programs: 23261 -> 23234 (-0.12%) spills in affected programs: 38 -> 11 (-71.05%) helped: 1 HURT: 0 total fills in shared programs: 31442 -> 31371 (-0.23%) fills in affected programs: 98 -> 27 (-72.45%) helped: 1 HURT: 0 LOST: 12 GAINED: 12 Most of the help and hurt in instruction counts was just churn caused by re-ordering of optimizations and the fact that the NIR deref lowering code is emitting slightly different instructions. Nothing was hurt by more than three instructions and most things weren't helped by more than four. The primary exception to this is one Car Chase shader: shaders/non-free/gfxbench4/carchase/341.shader_test CS SIMD32: 1144 -> 821 (-28.23%) There is also one compute shader in Manhattan 3.1 and a fragment shader in the UE4 Shooter Game demo that now get a loop partially unrolled. Those showed up in the results as hurt instructions but were manually removed to get the results above. The lost/gained was a dozen Car Chase shaders that went from SIMD8 to SIMD16 thanks to improved register pressure: shaders/non-free/gfxbench4/carchase/366.shader_test CS shaders/non-free/gfxbench4/carchase/368.shader_test CS shaders/non-free/gfxbench4/carchase/370.shader_test CS shaders/non-free/gfxbench4/carchase/372.shader_test CS shaders/non-free/gfxbench4/carchase/376.shader_test CS shaders/non-free/gfxbench4/carchase/378.shader_test CS shaders/non-free/gfxbench4/carchase/380.shader_test CS shaders/non-free/gfxbench4/carchase/382.shader_test CS shaders/non-free/gfxbench4/carchase/384.shader_test CS shaders/non-free/gfxbench4/carchase/388.shader_test CS shaders/non-free/gfxbench4/carchase/4.shader_test CS shaders/non-free/gfxbench4/carchase/6.shader_test CS Given how much it appeared to be improved, I ran Car Chase on my laptop. Unfortunately, I wasn't able to see any measurable improvement. It might be helped by 1-2% but it's in the noise. It does render correctly as far as I can tell so the improvement is legitimate. All of the loops that got delete were in dolphin uber shaders. I've had no opportunity to test them for correctness or performance. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	810dde2a6b	glsl/nir: Add a pass to lower UBO and SSBO access Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	77e5ec394e	glsl/nir: Handle unlowered SSBO atomic and array_length intrinsics We didn't have any of these before because all NIR consumers always called lower_ubo_references. Soon, we want to pass the derefs straight through to NIR so we need to handle these intrinsics directly. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	76ba225184	glsl/nir: Set explicit types on UBO/SSBO variables We want to be able to use variables and derefs for UBO/SSBO access in NIR. In order to do this, the rest of NIR needs to know the type layout information. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	8f3ab8aa78	glsl: Don't lower vector derefs for SSBOs, UBOs, and shared All of these are backed by some sort of memory so if you have multiple threads writing to different components of the same vector at the same time, the load-vec-store pattern that GLSL IR emits won't work. This shouldn't affect any drivers today as they all call GLSL IR lowering which lowers access to these variables to index+offset intrinsics before we get to this point. However, NIR will start handling the derefs itself and won't want the lowering. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	3c11fc7654	nir/lower_io: Add a new buffer_array_length intrinsic and lowering Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	c8d42c8cf6	nir: Rename nir_address_format_vk_index_offset to not be vk It's just a 32-bit index and offset. We're going to want to use it in GL as well so stop talking about Vulkan. Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	60af3a93e9	nir/deref: Consider COHERENT decorated var derefs as aliasing If we get to two deref_var paths with different variables, we usually know they don't alias. However, if both of the paths are marked coherent, we don't have to worry about it. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	8b073832ff	compiler/types: Add helpers to get explicit types for standard layouts We also need to modify the current size/align helpers to not blow up when they encounter an explicitly laid out type. Previously we considered using the size/align helpers mutually exclusive with standard layouts but now we just assert that they match. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	5b2b144566	compiler/types: Add a C wrapper to get full struct field data Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	ef4ca44780	compiler/types: Add a new is_interface C wrapper Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	b315f6f82b	nir/validate: Allow 32-bit boolean load/store intrinsics With UBOs and SSBOs we have boolean types but they're actually 32-bit values. Make the validator a little less strict so that we can do a 32-bit load/store on boolean types. We're about to add a lowering pass called gl_nir_lower_buffers which will lower boolean load/store operations to 32-bit and insert i2b and b2i instructions to convert to/from 1-bit booleans. We want that to be legal. Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	5d26f2d3d5	nir/validate: Only require bare types to match for copy_deref If we want to be able to use copy_deref instructions on explicitly laid out types, we have to be a little more flexible about what types we allow. Instead, of requiring the types to exactly match, only require the bare types to match. Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	2b76de9b5d	nir/algebraic: Add a couple optimizations for iabs and ishr Shader-db results on Kaby Lake: total instructions in shared programs: 15225213 -> 15222365 (-0.02%) instructions in affected programs: 43524 -> 40676 (-6.54%) helped: 203 HURT: 0 Lots of shaders in Shadow Warrior had this pattern along with Deus Ex, Civ, Shadow of Mordor, and several others. Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-15 01:02:19 +00:00
Eric Anholt	0803bef006	mesa/st: Fix leaks of TGSI tokens in VP variants. Starting a glxgears and closing it, I was seeing a lot of leaked TGSI for the fixed function VPs. v2: drop unused delete_ir() arg. Fixes: `3b4929ec6e` ("st/mesa: Copy VP TGSI tokens if they exist, even for NIR shaders.") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-14 16:18:59 -07:00
Eric Anholt	e0806c1ea0	mesa/st: Make sure that prog_to_nir NIR gets freed. GLSL NIR gets freed on relink by _mesa_delete_program(), but for ARB programs we need to free the old NIR when PSN is used to set up new NIR in the same gl_program. Additionally, set the base .nir field so that it will get freed by _mesa_delete_program(). Fixes: `3d7611e9a6` ("st/nir: use NIR for asm programs") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-14 16:18:38 -07:00
Alyssa Rosenzweig	1ea42894c7	panfrost/midgard: Implement fpow We have a native op for this, which was just found in a disassembly -- so instead of lowering, use it! Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-14 22:50:24 +00:00
Alyssa Rosenzweig	2eb65c2173	panfrost: Compute viewport state on the fly Previously, we were caching this incorrectly; there's no real reason to given how variable it is (sensitive to changes in viewport, framebuffer dimensions, and scissors) and how cheap it is to recompute. So, just do it on the fly each draw. Fixes glmark-es2 -bshadow and -brefract. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-14 22:47:12 +00:00

1 2 3 4 5 ...

109272 Commits All Branches Search

109272 Commits

All Branches