Commit Graph

122366 Commits

Author SHA1 Message Date
Connor Abbott 274f3815a5 ir3: Plumb through bindless support
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09 15:56:55 +00:00
Connor Abbott 7d0bc13fca ir3: LDC also has a destination
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09 15:56:55 +00:00
Connor Abbott 1842961e58 ir3: Also don't propagate immediate offset with LDC
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09 15:56:55 +00:00
Connor Abbott de7d90ef53 ir3: Plumb through support for a1.x
This will need to be used in some cases for the upcoming bindless
support, plus ldc.k instructions which push data from a UBO to const
registers.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09 15:56:55 +00:00
Connor Abbott c8b0f90439 ir3: Add bindless instruction encoding
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09 15:56:55 +00:00
Connor Abbott 122a900d7d freedreno/a6xx: Add registers for the bindless model
In Vulkan, descriptors for samplers, SSBO's, etc. are collected into
descriptor sets, and shaders can use multiple descriptor sets. At
command-recording time, users can swap out only some of the descriptor
sets, and the driver is supposed to do the minimum amount necessary to
update any internal binding tables, knowing that only some of the
descriptors have changed.

With the old binding model, focused on GL, where there are separate
tables for each type of resource, we can do somewhat better than now by
preserving descriptors from lower descriptor sets when switching higher
descriptor sets. However we still have to copy around descriptors before
each draw.

At least for a6xx, qualcomm went further, essentially copying the Vulkan
binding model as an alternate way to load resources. There's an array of
registers (actually an array for compute and one for everything else),
where each register holds a pointer to a descriptor set that can contain
various different descriptor types. The descriptors are padded out to 16
dwords, so that every instruction can use an index instead of a dword
offset. It's called "bindless", I think, because it can also be used to
implement the old GL bindless extensions (presumably it allows more
samplers and textures than the old model).

This commit adds the register and cmdstream parts. Next up will be the
instruction encoding.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09 15:56:55 +00:00
Connor Abbott e088d82aa6 freedreno/a6xx: Add UBO size field
Verified with the vulkan blob, which uses ldc and UBO descriptors, and
turnip will too soon.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09 15:56:55 +00:00
Connor Abbott d3b7681df2 tu: ir3: Emit push constants directly
Carve out some space at the beginning for push constants, and push them
directly, rather than remapping them to a UBO and then relying on the
UBO pushing code. Remapping to a UBO is easy now, where there's a single
table of UBO's, but with the bindless model it'll be a lot harder. I
haven't removed all the code to move the remaining UBO's over by 1,
though, because it's going to all get rewritten with bindless anyways.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09 15:56:55 +00:00
Connor Abbott 63c2e8137d tu: Dump out shader assembly when requested
We don't use the ir3 variant machinery, so we have to do this ourselves.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09 15:56:55 +00:00
Daniel Schürmann d22e2b3bd0 aco: RA - move all std::function objects into proper functions
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann 5351fee56a aco: move all needed helper containers to ra_ctx
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann 2ae27b96ef aco: change live_out variables to std::unordered_set
Improves performance of live_var_analysis for larger shaders

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann acc10a7e51 aco: change some std::map to std::unordered_map in register_allocation
This improves compile times slightly for larger shaders

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann 69b6069dd2 aco: refactor try_remove_trivial_phi() in RA
Minor refactoring to avoid some pointer chasing.
This patch also changes the live_out argument to be
passed by reference to avoid an unnecessary copy.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann b66f474121 aco: improve speed of live_var_analysis
by merging live_sgprs and live_vgprs sets.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann 09850e0a94 aco: during RA only insert into renames table if a variable got renamed
This improves the speed of register allocation.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann 48a74b6815 aco: replace assignment hashmap by std::vector in register allocation
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann ba482c2e5f aco: improve register assignment when live-range splits are necessary
When finding a good place for a register, we can ignore
killed operands.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann fb5a7902f2 aco: improve hashing for value numbering
An improved hashing greatly reduces the number of collisions,
and thus, increases the speed for lookups in the hash table.
The hash function now uses Murmur3 written by Austin Appleby.

This patch also pre-reserves space for the hashmap to avoid rehashing.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann c99107ece0 aco: add explicit padding for all Instruction sub-structs
This patch also adds static_asserts on the size of Instructions
to ensure no internal padding is present.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann 7f962a9362 aco: guarantee that Temp fits in 4 bytes
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Jonathan Marek 2e084c2cb3 turnip: new clear/blit implementation with shader path fallback
The shader path is used to implement the following cases:
* stencil aspect mask on D24S8 (for image_to_buffer,buffer_to_image)
* clear/copy msaa destination (2D engine can't have msaa dest)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
2020-04-09 14:43:02 +00:00
Jonathan Marek de6967488a turnip: add vk_format_is_snorm/is_float
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
2020-04-09 14:43:02 +00:00
Jonathan Marek 51fe52d2fd turnip: rework format helpers
* Take tile_mode as input directly
* tu6_format_gmem to tu6_base_format, use may not be limited to GMEM
* Add new helpers that will return the correct tile_mode as for image level
  as part of the format.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
2020-04-09 14:43:02 +00:00
Jonathan Marek 009082dcff turnip: use dirty bits for dynamic viewport/scissor state
CmdClearAttachments shader path will overwrite this state, so it needs to
be re-emitted with dirty bits in that case.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
2020-04-09 14:43:02 +00:00
Jonathan Marek ed83281f0c turnip: save attachment samples in renderpass state
This is needed to be able to know the number of samples during
CmdClearAttachments which can be used while the framebuffer is unknown.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
2020-04-09 14:43:02 +00:00
Jonathan Marek 0637eab678 turnip: disable 8x msaa
Not everything supports 8x msaa, and the blob doesn't support it at all.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
2020-04-09 14:43:02 +00:00
Jonathan Marek f03e63cd99 turnip: fix nir validate failure from push constant lowering
Fixes newly added checks in nir validate failing.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
2020-04-09 14:43:02 +00:00
Jonathan Marek 86d1a4c907 turnip: split up gmem/tile alignment
Note: the x1/y1 align in tu6_emit_blit_scissor was broken

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
2020-04-09 14:43:02 +00:00
Jonathan Marek f494799a7f turnip: RB_CCU_CNTL fixes
* Correct bypass value for a618
* Bypass value for blitter
* Don't set RB_CCU_CNTL again unnecessarily in tu6_emit_binning_pass

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
2020-04-09 14:43:02 +00:00
Jonathan Marek cca7c29980 freedreno/a6xx: set bypass RB_CCU_CNTL value for blitter
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
2020-04-09 14:43:02 +00:00
Jonathan Marek e4c05a5335 freedreno/registers: add RB_CCU_CNTL bitfields
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
2020-04-09 14:43:02 +00:00
Samuel Pitoiset 2d8453e6e6 radv: allow TC-compat HTILE with GENERAL outside of render loops
This gives +8% with Wolfeinstein Youngblood on my Vega64, and
according to someone else, it also improves performance with Doom
2016 and Wolfenstein 2 (and probably other ID Tech games).

This improvement is because Youngblood uses GENERAL for the main
depth-only pass and TC-compat HTILE is now enabled with GENERAL if
we know that we are outside of a render loop. This obviously also
reduces the number of HTILE decompressions from/to GENERAL.

Note that Youngblood violates the Vulkan spec regarding render loops
because they are only allowed with input attachments. Expect possible
rendering issues if apps use render loops with the wrong way (ie.
without input attachmens) because HTILE might not be coherent if
a depth-stencil texture is sampled and rendered in the same draw.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2704
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4391>
2020-04-09 12:10:37 +00:00
Samuel Pitoiset 4de84c8cbd radv: only enable TC-compat HTILE for images readable by a shader
If no texture fetches happen it's useless to enable TC-compat HTILE.

Because the driver currently doesn't support TC-compat HTILE for
storage images we don't have to check.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4497>
2020-04-09 11:55:59 +00:00
Samuel Pitoiset 63f07a3047 radv: only expose fp16 control features for chips with double rate fp16
This disables all fp16 shader control features on GFX8 because only
GFX9+ supports double rate packed math.

This improves consistency regarding other AMD Vulkan drivers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
2020-04-09 13:35:08 +02:00
Samuel Pitoiset 1e4bd1de98 radv: only expose storageInputOutput16 for chips with double rate fp16
This feature allows to use both 16-bit integers and 16-bit floats
as inputs/outputs.

This disables storageInputOutput16 on GFX8 because only GFX9+ supports
double rate packed math.

This improves consistency regarding other AMD Vulkan drivers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
2020-04-09 13:35:08 +02:00
Samuel Pitoiset 1d74c6565d radv: only expose shaderFloat16 for chips with double rate fp16
This disables shaderFloat16 on GFX8 because only GFX9+ supports
double rate packed math.

This improves consistency regarding other AMD Vulkan drivers and
it makes no sense to enable that feature without packed math.

This also reduces performance with Wolfeinstein Youngblood if
fp16 is forced enabled on GFX8, while it's similar on GFX9.

We might re-introduce that feature in the future with ACO support
if it ends up being faster and correct.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
2020-04-09 13:34:36 +02:00
Samuel Pitoiset a3113e07b9 ac,radv: add ac_gpu_info::has_double_rate_fp16
Only GFX9+ support double rate packed math instructions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
2020-04-09 13:30:54 +02:00
Jonathan Marek 420ca1e4a1 turnip: use buffer size instead of bo size for VFD_FETCH_SIZE
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4224>
2020-04-09 02:05:52 +00:00
Jonathan Marek e62f8ae15a turnip: improve vertex input handling
Emit vertexBindingDescriptionCount bindings, instead of one per attribute.

Verified with dEQP-VK.pipeline.vertex_input.*

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4224>
2020-04-09 02:05:52 +00:00
James Zhu 98743f648a radeonsi: fix Segmentation fault during vaapi enc test
Fix Segmentation fault during vaapi enc test on Arcturus.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by Leo Liu <leo.liu@amd.com>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4472>
2020-04-08 18:11:45 +00:00
Bas Nieuwenhuizen a7e2efa7c9 radv: Use correct buffer count with variable descriptor set sizes.
Fixes dEQP-VK.binding_model.descriptorset_random.sets16.noarray.ubolimitlow.sbolimitlow.imglimitlow.iublimitlow.frag.ialimitlow.0

CC: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2607
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4489>
2020-04-08 15:26:50 +00:00
Bas Nieuwenhuizen bb7e44a23d radv: Whitespace fixup.
Review comment that I did, but forgot to git add before amending ...

From https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4334

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4488>
2020-04-08 11:18:18 +00:00
Samuel Iglesias Gonsálvez 8b42d26132 radv: set sparseAddressSpaceSize to RADV_MAX_MEMORY_ALLOCATION_SIZE
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4487>
2020-04-08 11:03:35 +00:00
Samuel Iglesias Gonsálvez cc678c9ce9 radv: check buffer size in vkCreateBuffer()
Fixes:

   dEQP-VK.api.buffer.basic.size_max_uint64

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4487>
2020-04-08 11:03:35 +00:00
Bas Nieuwenhuizen a3682670c8 radv: Consider maximum sample distances for entire grid.
The other pixels in the grid might have samples with a larger
distance than the (0,0) pixel.

Fixes dEQP-VK.pipeline.multisample.sample_locations_ext.verify_location.samples_8_packed
when CTS is compiled with clang.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4480>
2020-04-08 10:53:33 +00:00
Samuel Pitoiset 9f005f1f85 radv: enable lowering of GS intrinsics for the LLVM backend
This replaces emit_vertex with:
   if (vertex_count < max_vertices) {
      emit_vertex_with_counter vertex_count ...
      vertex_count += 1
   }

Which is exactly what NIR->LLVM was doing but at NIR level. This
pass is already called by ACO.

pipeline-db changes on GFX10:
Totals from affected shaders:
SGPRS: 1952 -> 1912 (-2.05 %)
VGPRS: 2112 -> 2044 (-3.22 %)
Code Size: 189368 -> 185620 (-1.98 %) bytes
Max Waves: 494 -> 491 (-0.61 %)

No pipeline-db changes on other generations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4182>
2020-04-08 08:24:05 +02:00
Samuel Pitoiset cd99ea7318 radv: remove radv_layout_has_htile() helper
The goal of this function was to return whether a depth-stencil image
has HTILE, in comparison to radv_layout_is_htile_compressed() which
is used to know whether a depth-stencil image has HTILE compressed.

These two functions are actually similar and they have never been
used for what they were supposed to. Remove radv_layout_has_htile()
in favour of radv_layout_is_htile_compressed() for now. If it's
needed in the future, I will re-introduce this concept properly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4389>
2020-04-08 07:55:16 +02:00
Samuel Pitoiset ffea3e7348 radv: cleanup creating the decompress/resummarize pipelines
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4389>
2020-04-08 07:55:14 +02:00
Samuel Pitoiset 6f6276bd24 radv: rename extra graphics pipeline decompress/resummarize fields
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4389>
2020-04-08 07:55:12 +02:00