KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Eric Anholt	d845dca0f5	nir: Make algebraic backtrack and reprocess after a replacement. The algebraic pass was exhibiting O(n^2) behavior in dEQP-GLES2.functional.uniform_api.random.3 and dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13 (along with other code-generated tests, and likely real-world loop-unroll cases). In the process of using fmul(b2f(x), b2f(x)) -> b2f(iand(x, y)) to transform: result = b2f(a == b); result = b2f(c == d); ... result = b2f(z == w); -> temp = (a == b) temp = temp && (c == d) ... temp = temp && (z == w) result = b2f(temp); nir_opt_algebraic, proceeding bottom-to-top, would match and convert the top-most fmul(b2f(), b2f()) case each time, leaving the new b2f to be matched by the next fmul down on the next time algebraic got run by the optimization loop. Back in 2016 in `7be8d07732` ("nir: Do opt_algebraic in reverse order."), Matt changed algebraic to go bottom-to-top so that we would match the biggest patterns first. This helped his cases, but I believe introduced this failure mode. Instead of reverting that, now that we've got the automaton, we can update the automaton's state recursively and just re-process any instructions whose state has changed (indicating that they might match new things). There's a small chance that the state will hash to the same value and miss out on this round of algebraic, but this seems to be good enough to fix dEQP. Effects with NIR_VALIDATE=0 (improvement is better with validation enabled): Intel shader-db runtime -0.954712% +/- 0.333844% (n=44/46, obvious throttling outliers removed) dEQP-GLES2.functional.uniform_api.random.3 runtime -65.3512% +/- 4.22369% (n=21, was 1.4s) dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13 runtime -68.8066% +/- 6.49523% (was 4.8s) v2: Use two worklists, suggested by @cwabbott, to cut out a bunch of tricky code. Runtime of uniform_api.random.3 down -0.790299% +/- 0.244213% compred to v1. v3: Re-add the nir_instr_remove() that I accidentally dropped in v2, fixing infinite loops. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-26 10:13:46 -08:00
Eric Anholt	90ad6304bf	nir: Refactor algebraic's block walk My motivation was to clarify the changes in the following commit, but incidentally, it reduces runtime of dEQP-GLES2.functional.uniform_api.random.3 (an algebraic-heavy testcase) by -5.39524% +/- 2.21179% (n=15) Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-26 10:13:40 -08:00
Connor Abbott	305d1300f9	nir: Maintain the algebraic automaton's state as we work. In order to have nir_opt_algebraic be able to do further algebraic work on the output of a replacement, we need to maintain the automaton's state. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-26 10:13:19 -08:00
Jonathan Marek	2da4a58ed9	etnaviv: support 3d/array/integer formats in texture descriptors Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-26 19:07:04 +01:00
Jonathan Marek	7806e058c9	etnaviv: blt: fix partial ZS clears with TS If not all bits are cleared, then BLT needs to be given the current clear value and not the new one. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-26 19:04:51 +01:00
Daniel Schürmann	7cd548d352	aco: don't value-number instructions from within a loop with ones after the loop. Fixes: Wolfenstein:Youngblood (w/o shader_ballot) dEQP-VK.descriptor_indexing.combined_image_sampler_in_loop_with_lod Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-26 14:39:27 +00:00
Rhys Perry	46420dd294	aco: set dlc/glc correctly for image loads Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-11-26 14:39:27 +00:00
Rhys Perry	37843e454e	aco: allow constant offsets for global/scratch instructions on GFX10 I don't think the bug applies for global/scratch instructions and load_barycentric_at_sample selection expects this feature to work. Fixes various dEQP-VK.pipeline.multisample_interpolation.* tests on GFX10. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-11-26 14:39:27 +00:00
Bas Nieuwenhuizen	02375b8436	radv: Enable VK_KHR_buffer_device_address. Still no capture/replay or multi device support. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-26 11:59:52 +00:00
Samuel Pitoiset	34dd4251e2	radv: fix reporting subgroup size with VK_KHR_pipeline_executable_properties Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-26 10:48:48 +01:00
Bas Nieuwenhuizen	25bc9102d8	radv: Allocate cmdbuffer space for buffer marker write. Fixes: `946193ae00` "radv: add support for VK_AMD_buffer_marker" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-26 09:35:02 +00:00
Gert Wollny	e41958e344	r600: Disable eight bit three channel formats Commit `0899bf55` made some deqp-gles3 tests related to RGB8 PBOs fail on R600 because it exposed PIPE_FORMAT_R8G8B8_UNORM and R600 doesn't propely handle this. Disabling this format also for buffers fixes the issue. In addition, disabling also the related RGB8 integer formats for buffers fixes some deqp-gles3 tests: dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb8ui_cube dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb8i_2d dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb8i_cube dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb8ui_2d dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb8ui_cube dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb8i_2d_array dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb8i_3d dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb8ui_2d_array dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb8ui_3d dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb8i_2d_array dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb8i_3d dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb8ui_2d_array dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb8ui_3d Fixes: `0899bf55` st/mesa: Map MESA_FORMAT_RGB_UNORM8 <-> PIPE_FORMAT_R8G8B8_UNORM Closes #2118 Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-26 09:28:52 +01:00
Samuel Pitoiset	f6770b9726	ac/llvm: fix warning in ac_build_canonicalize() ../src/amd/llvm/ac_llvm_build.c: In function ‘ac_build_canonicalize’: ../src/amd/llvm/ac_llvm_build.c:4567:9: warning: ‘intr’ may be used uninitialized in this function [-Wmaybe-uninitialized] 4567 \| return ac_build_intrinsic(ctx, intr, type, params, 1, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 4568 \| AC_FUNC_ATTR_READNONE); \| ~~~~~~~~~~~~~~~~~~~~~~ ../src/amd/llvm/ac_llvm_build.c:4567:9: warning: ‘type’ may be used uninitialized in this function [-Wmaybe-uninitialized] Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-26 08:35:10 +01:00
Tapani Pälli	5d58fea660	mapi: add GetInteger64vEXT with EXT_disjoint_timer_query From EXT_disjoint_timer_query spec: "Interaction: This extension adds GetInteger64vEXT if OpenGL ES 3.0 is not supported" See https://github.com/KhronosGroup/OpenGL-Registry/issues/326. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2090 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-26 07:41:24 +02:00
Jason Ekstrand	200a3301e2	vulkan: Update the XML and headers to 1.1.129 Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-26 02:48:42 +00:00
Jason Ekstrand	854859fefa	anv/entrypoints: Better handle promoted extensions In the case of promoted extensions we can end up with an entrypoint that we support being an alias of an entrypoint we do not support. For instance, if an extension gets promoted from EXT to KHR, the EXT entry- points may be aliases of the KHR ones. We want to leave everything as EXT until we get around to advertising the KHR so that we don't break things when we update the XML and headers. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-26 02:48:42 +00:00
Jason Ekstrand	121551bfdb	vulkan/enum_to_str: Handle out-of-order aliases The current code can only handle enum aliases if the original enum is declared first followed by the alias as we walk the XML in a linear fashion. This commit allows us to handle aliases where the alias declaration comes before the thing it's aliasing. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-26 02:48:42 +00:00
Kenneth Graunke	f6aa51103b	iris: Update SURFACE_STATE addresses when setting sampler views We may have replaced the backing storage for a texture buffer while it was unbound, at which point iris_rebind_buffer would not have caught it and updated it. We need to ensure that the current resource's address matches the one our SURFACE_STATE points at. If not, update addresses and re-upload the SURFACE_STATE. Shader images and buffers do not suffer from this problem because we re-stream the surface state on every set call, since there isn't a created CSO object for those with a saved SURFACE_STATE. Constant buffers are also currently re-streamed (we pitch the SURFACE_STATE on every set_constant_buffer call). Surfaces would need this treatment (as they're created CSOs) except that we never swap out their backing storage today (we only do it for buffers), so it's OK for now. Fixes misrendering in Unreal 4 demos (Elemental, Matinee Fight Scene). Huge thanks to Andrii Simiklit for tracking down the problem - it was quite difficult to find! Also fixes Andrii's new Piglit test for the bug, 'arb_texture_buffer_object-re-init'. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1365	2019-11-25 15:54:54 -08:00
Kenneth Graunke	060a2c52fa	iris: Maintain CPU-side SURFACE_STATE copies for views and surfaces. When replacing the backing storage for texture buffers, image buffers, and so on, we may need to update the "Surface Base Address" field in any corresponding SURFACE_STATE. This is easier to accomplish if we have a copy on the CPU - we can just compare the current field, update it, and re-upload. This patch adds a CPU-side copy to the new iris_surface_state wrapper struct, and reworks allocation and upload to fill things out on the CPU copy first, then upload that to the GPU when finished. This will be necessary to fix iris_invalidate_resource bugs shortly. Technically, we never replace the backing storage for pipe_surfaces (render targets), so we don't need to make this change there. However, it's nice to have surfaces, sampler views, and image views handled similarly. Plus, if we ever wanted to swap out backing storage for busy textures, we'd need this infrastructure. v2: Properly free memory (caught by Andrii Simiklit)	2019-11-25 15:54:54 -08:00
Kenneth Graunke	2b09e818dc	iris: Create an "iris_surface_state" wrapper struct Today, we only have a state reference to the GPU buffer containing our uploaded SURFACE_STATEs. However, we're going to want a CPU-side copy soon. Making a wrapper struct means we can talk about both together, and also put both in the field called "surface_state".	2019-11-25 15:54:54 -08:00
Kenneth Graunke	4c1f81ad62	iris: Drop 'old_address' parameter from iris_rebind_buffer We can just compare the VERTEX_BUFFER_STATE address field to the current BO's address. When calling rebind, we've already updated the resource to the new buffer, but the state will have the old address.	2019-11-25 15:54:54 -08:00
Kenneth Graunke	518be59c1a	iris: Stop mutating the resource in get_rt_read_isl_surf(). Mutating fields of global resources is generally not safe, and the only reason we were doing it was to avoid passing an extra parameter to the fill_surface_state helper.	2019-11-25 15:54:54 -08:00
Marek Olšák	b02e0d2604	radeonsi/nir: don't run si_nir_opts again if there is no change 0.3% less overhead Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-25 16:48:27 -05:00
Marek Olšák	4675cb2019	radeonsi: initialize the per-context compiler on demand This takes a noticable amount of time in piglit and some tests don't need it. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-25 16:48:27 -05:00
Marek Olšák	f671cc4d95	ac: set swizzled bit in cache policy as a hint not to merge loads/stores LLVM now merges loads and stores for all opcodes, so this must be set. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-25 16:48:27 -05:00
Eric Anholt	8afab607ac	nir: Add a scheduler pass to reduce maximum register pressure. This is similar to a scheduler I've written for vc4 and i965, but this time written at the NIR level so that hopefully it's reusable. A notable new feature it has is Goodman/Hsu's heuristic of "once we've started processing the uses of a value, prioritize processing the rest of their uses", which should help avoid the heuristic otherwise making such systematically bad choices around getting texture results consumed. Results for v3d: total instructions in shared programs: 6497588 -> 6518242 (0.32%) total threads in shared programs: 154000 -> 152828 (-0.76%) total uniforms in shared programs: 2119629 -> 2068681 (-2.40%) total spills in shared programs: 4984 -> 472 (-90.53%) total fills in shared programs: 6418 -> 1546 (-75.91%) Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> (v1) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (v2) v2: Use the DAG datastructure, fold in the scheduling-for-parallelism patch, include SSA defs in live values so we can switch to bottom-up if we want. v3: Squash in improvements from Alejandro Piñeiro for getting V3D to successfully register allocate on GLES3.1 dEQP. Make sure that discards don't move after store_output. Comment spelling fix.	2019-11-25 21:12:21 +00:00
Jonathan Marek	5159db60fc	etnaviv: implement 64bpp clear At the same time, update etna_clear_blit_pack_rgba to work with integer formats. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-25 20:23:22 +01:00
Jonathan Marek	2214f99c07	etnaviv: avoid using RS for 64bpp formats At the same time, this change allows using BLT for 8bpp formats Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-25 20:22:43 +01:00
Christian Gmeiner	92d5e3c692	etnaviv: add support for extended pe formats Use the extended format if an such a format was passed. v1 -> v2: - set FORMAT_MASK bit when using ext PE format as suggested by Wladimir J. van der Laan Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-11-25 20:12:52 +01:00
Christian Gmeiner	396818fd9d	etnaviv: handle 8 byte block in tiling Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-11-25 20:11:30 +01:00
Samuel Pitoiset	2af39c719e	radv: select the depth decompress path based on the aspect mask Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-25 16:29:23 +01:00
Samuel Pitoiset	905c005561	radv: create decompress pipelines for separate depth/stencil layouts No functional changes as the driver still uses the depth+stencil pipeline. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-25 16:29:21 +01:00
Samuel Pitoiset	faa58201f3	radv: rework creation of decompress/resummarize meta pipelines This refactoring will help for creating more decompress pipelines. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-25 16:29:18 +01:00
Samuel Pitoiset	8f0fb38825	radv: set the image view aspect mask before resolves No functional changes, but it will be used to decompress separate depth/stencil aspects. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-25 16:29:16 +01:00
Samuel Pitoiset	9dec90b7bc	radv: set the image view aspect mask during subpass transitions No functional changes because the aspect mask is still not used during image transitions but it will be needed for the separate depth/stencil aspects logic. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-25 16:29:13 +01:00
Rhys Perry	459bc77763	aco: enable load/store vectorizer Totals from affected shaders: SGPRS: 1890373 -> 1900772 (0.55 %) VGPRS: 1210024 -> 1215244 (0.43 %) Spilled SGPRs: 828 -> 828 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 252 -> 252 (0.00 %) dwords per thread Code Size: 81937504 -> 74608304 (-8.94 %) bytes LDS: 746 -> 746 (0.00 %) blocks Max Waves: 230491 -> 230158 (-0.14 %) In NeiR:Automata and GTA V, the code decrease is especially large: -13.79% and -15.32%, respectively. v9: rework the callback function v10: handle load_shared/store_shared in the callback Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v9)	2019-11-25 13:59:11 +00:00
Rhys Perry	0a759c3be6	nir: add load/store vectorizer tests v7: run nir_opt_algebraic v9: rework the callback function v9: update alignment on all loads/stores, even if they're not vectorized v10: add tests for 64-bit offsets v10: add tests for signed offsets Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v9)	2019-11-25 13:59:11 +00:00
Rhys Perry	ce9205c03b	nir: add a load/store vectorization pass This pass combines intersecting, adjacent and identical loads/stores into potentially larger ones and will be used by ACO to greatly reduce the number of memory operations. v2: handle nir_deref_type_ptr_as_array v3: assume explicitly laid out types for derefs v4: create less deref casts v4: fix shared boolean vectorization v4: fix copy+paste error in resources_different v4: fix extract_subvector() to pass nir_load_store_vectorize_test.ssbo_load_intersecting_32_32_64 v4: rebase v5: subtract from deref/offset instead of scheduling offset calculations v5: various non-functional changes/cleanups v5: require less metadata and preserve more v5: rebase v6: cleanup and improve dependency handling v6: emit less deref casts v6: pass undef to components not set in the write_mask for new stores v7: fix 8-bit extract_vector() with 64-bit input v7: cleanup creation of store write data v7: update align correctly for when the bit size of load/store increases v7: rename extract_vector to extract_component and update comment v8: prevent combining of row-major matrix column acceses v9: rework process_block() to be able to vectorize more v9: rework the callback function v9: update alignment on all loads/stores, even if they're not vectorized v9: remove entry::store_value, since it will not be updated if it's was from a vectorized load v9: fix bug in subtract_deref(), causing artifacts in Dishonored 2 v9: handle nir_intrinsic_scoped_memory_barrier v10: use nir_ssa_scalar v10: handle non-32-bit offsets v10: use signed offsets for comparison v10: improve create_entry_key_from_offset() v10: support load_shared/store_shared v10: remove strip_deref_casts() v10: don't ever pass NULL to memcmp v10: remove recursion in gcd() v10: fix outdated comment v11: use the new nir_extract_bits() v12: remove use of nir_src_as_const_value in resources_different v13: make entry key hash function deterministic v13: simplify mask_sign_extend() v14: add comment in hash_entry_key() about hashing pointers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v9)	2019-11-25 13:59:11 +00:00
Rhys Perry	b3a3e4d1d2	radv: set alignment for load_ssbo/store_ssbo in meta shaders Otherwise, nir_intrinsic_align() will assert when called on the intrinsics Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-25 13:59:11 +00:00
Rhys Perry	c14f823ee5	nir: add nir_num_variable_modes and nir_var_mem_push_const These will be useful in the upcoming load/store vectorizer. v11: rebase Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-25 13:59:11 +00:00
Connor Abbott	01eb6ef870	aco: Make unused workgroup id's 0 It shouldn't matter, but the 1 was leftover from when it was handled together with workgroup_size and num_work_groups. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-25 14:17:51 +01:00
Connor Abbott	bb78f9b4e4	aco: Use common argument handling Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-25 14:17:51 +01:00
Connor Abbott	e7f4cadd02	radv: Replace supports_spill with explict_scratch_args The former was always true and hence dead code. We will want to explicitly declare the ring offset register with ACO, but we also want to declare the scratch offset too, and we can't try to disable it since ACO also supports spilling and the determination of whether spilling has to happen occurs well after setting up registers. So replace supports_spill with something that will actually be used for ACO. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-25 14:17:51 +01:00
Connor Abbott	4d6676d78a	aco: Make num_workgroups and local_invocation_ids one argument each To match the LLVM argument setup code. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-25 14:17:51 +01:00
Connor Abbott	a7f1c63442	aco: Split vector arguments at the beginning Due to how LLVM works we have to make some of the FS inputs become vectors, and therefore have to split them early so that they don't take up extra register pressure due to how RA currently works. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-25 14:17:51 +01:00
Connor Abbott	b45c54ff8d	aco: Use radv_shader_args in aco_compile_shader() Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-25 14:17:51 +01:00
Connor Abbott	680b086db1	aco: Constify radv_nir_compiler_options in isel It's already const for everything else. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-25 14:17:51 +01:00
Connor Abbott	66c703b3e8	radv: Move argument declaration out of nir_to_llvm Now it's executed for ACO too. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-25 14:17:51 +01:00
Connor Abbott	3b143369a5	ac/nir, radv, radeonsi: Switch to using ac_shader_args Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2019-11-25 14:17:10 +01:00
Connor Abbott	9885af3bdf	ac: Add a shared interface between radv, radeonsi, LLVM and ACO ac_shader_args will be similar to ac_shader_abi, except for being free from LLVM-specific concepts and therefore capable of being shared between LLVM and ACO. This will help us accomplish a few different things: - Decouple setting up SGPR and VGPR arguments from translating to LLVM, so that we can reference these arguments in NIR lowering passes, which will let us lower e.g. descriptor sets in NIR. - Stop using radv-specific structures for things like determining the chip generation in ACO. In the end, we should replace ac_shader_abi with this structure + driver-specific lowering passes. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-25 14:12:46 +01:00

1 2 3 4 5 ...

117984 Commits All Branches Search

117984 Commits

All Branches