Commit Graph

140 Commits

Author SHA1 Message Date
Marek Olšák c63e225bff radeonsi: remove some definitions and helpers from r600_pipe_common.h
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Nicolai Hähnle bc65dcab3b radeonsi: avoid syncing the driver thread in si_fence_finish
It is really only required when we need to flush for deferred fences.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:16:11 +01:00
Nicolai Hähnle 1a6d9e087a radeonsi: record and dump time of flush
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:04 +01:00
Marek Olšák 529cdce799 radeonsi: remove 'Authors:' comments
It's inaccurate. Instead, see the copyright and use "git log" and
"git blame" to know the authorship.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-02 18:19:03 +01:00
Marek Olšák 65f2e33500 radeonsi: import r600_streamout from drivers/radeon
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:26:55 +02:00
Marek Olšák 3784ce9782 radeonsi: enumerize DBG flags
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:20:16 +02:00
Nicolai Hähnle 6b416ec3d6 radeonsi: move and rename scissor and viewport state and functions
v2: change GET_MAX_SCISSOR to SI_MAX_SCISSOR

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:45 +02:00
Marek Olšák 06bfb2d28f r600: fork and import gallium/radeon
This marks the end of code sharing between r600 and radeonsi.
It's getting difficult to work on radeonsi without breaking r600.

A lot of functions had to be renamed to prevent linker conflicts.
There are also minor cleanups.

Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-26 04:21:14 +02:00
Marek Olšák c3ebac6890 radeonsi/gfx9: implement primitive binning
This increases performance, but it was tuned for Raven, not Vega.
We don't know yet how Vega will perform, hopefully not worse.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-05 12:09:02 +02:00
Marek Olšák 113278ee79 radeonsi: remove Constant Engine support
We have come to the conclusion that it doesn't improve performance.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Nicolai Hähnle 4c3f36ec6b radeonsi: print saved CS to the log context
Use the auto logger facility, so that CS chunks will be interleaved
with other log info.

v2:
- fix some crashes when not using CE
- fix skipping "previous" chunks of current (unflushed) IB
- fix error handling in si_begin_cs_debug

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:53:14 +02:00
Marek Olšák db039d67aa radeonsi: don't prefetch VBO descriptors if vertex elements == NULL
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-21 23:06:42 +02:00
Marek Olšák 13aa8d3da9 radeonsi: don't use CLEAR_STATE on SI
This fixes random hangs with Unigine Valley.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102201

Fixes: 064550238e ("radeonsi: use CLEAR_STATE to initialize some registers")
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-18 15:59:22 +02:00
Marek Olšák e887c68bd2 radeonsi: add a separate dirty mask for prefetches
so that we don't rely on si_pm4_state_enabled_and_changed, allowing us
to move prefetches after draw calls.

v2: ckear the dirty mask after unbinding shaders

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
2017-08-07 21:12:24 +02:00
Marek Olšák 58d062b87d radeonsi: de-atomize L2 prefetch
I'd like to be able to move the prefetch call site around.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák 1aeafb59e6 radeonsi: print CE IBs into ddebug reports
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-01 17:06:38 +02:00
Marek Olšák 28c7fbbe0f radeonsi: rely on CLEAR_STATE for clearing UCP and blend color registers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-28 08:03:24 +02:00
Marek Olšák 7c721b28f6 radeonsi: rely on CLEAR_STATE for resetting the framebuffer and sample mask
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-28 08:03:24 +02:00
Marek Olšák c2f82fc1d3 Revert "radeonsi: don't emit partial flushes at the end of IBs (v2)"
This reverts commit c9040dc9e7.

People have reported it causes corruption on VI, and I see GPU hangs
on GFX9.
2017-06-23 19:13:55 +02:00
Marek Olšák c9040dc9e7 radeonsi: don't emit partial flushes at the end of IBs (v2)
The kernel sort of does the same thing with fences.

v2: do emit partial flushes on SI

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 13:15:27 +02:00
Samuel Pitoiset 333c8f65cf radeonsi: add all resident buffers to the current CS
Resident buffers have to be added to every new command stream.
Though, this could be slightly improved when current shaders
don't use any bindless textures/images but usually applications
tend to use bindless for almost every draw call, and the winsys
thread might help when buffers are added early.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Marek Olšák 38828094e9 radeonsi: update si_ce_needed_cs_space
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák edb59ef2dc radeonsi: do only 1 big CE dump at end of IBs and one reload in the preamble
A later commit will only upload descriptors used by shaders, so we won't do
full dumps anymore, so the only way to have a complete mirror of CE RAM
in memory is to do a separate dump after the last draw call.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák 2769dadb0f gallium/radeon: always flush asynchronously and wait after begin_new_cs
This hides the overhead of everything in the driver after the CS flush and
before returning from pipe_context::flush.
Only microbenchmarks will benefit.

+2% FPS for glxgears.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-17 01:22:11 +02:00
Nicolai Hähnle ff39f0d59c radeonsi: emit VS_STATE register explicitly from si_draw_vbo
We will merge other derived state information into this register.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-13 17:30:18 +02:00
Marek Olšák 408f9a1584 radeonsi: atomize the scratch buffer state
The update frequency is very low.

Difference: Only account for the size when allocating a new one and when
            starting a new IB, and check for NULL. (v3)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 17:29:36 +01:00
Marek Olšák c78177fc64 radeonsi: move VGT_VERTEX_REUSE_BLOCK_CNTL into shader states for Polaris
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák 802fcdc0d2 radeonsi: atomize L2 prefetches
to move the big conditional statement out of draw_vbo

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák e16245b339 radeonsi: turn SDMA IBs into de-facto preambles of GFX IBs
Draw calls no longer flush SDMA IBs. r600_need_dma_space is
responsible for synchronizing execution between both IBs.

Initial buffer clears and fast clears will stay unflushed in the SDMA IB
(up to 64 MB) as long as the GFX IB isn't flushed either.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-05 18:43:24 +01:00
Marek Olšák 29144d0f34 gallium/radeon: stop using PIPE_BIND_CUSTOM
it has no effect whatsoever

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-26 13:02:58 +02:00
Nicolai Hähnle b5cd7dfe3e gallium/radeon: implement set_device_reset_callback
Check for device reset on flush. It would be nicer if the kernel just
reported this as an error on the submit ioctl (and similarly for fences),
but this will do for now.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-05 15:51:56 +02:00
Marek Olšák 3ee9be42ac radeonsi: move VGT_LS_HS_CONFIG to derived tess_state
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:11:53 +02:00
Marek Olšák a67d81580b radeonsi: remove the cache_flush atom
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-09-09 22:45:06 +02:00
Marek Olšák fe40a65fb6 radeonsi: skip redundant INDEX_TYPE writes
Ported from Vulkan.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-09-07 11:13:13 +02:00
Marek Olšák 687c4be9cf gallium/radeon: set VPORT_ZMIN/MAX registers correctly
Calculate depth ranges from viewport states and
pipe_rasterizer_state::clip_halfz.

The evergreend.h change is required to silence a warning.

This fixes this recently updated piglit: arb_depth_clamp/depth-clamp-range

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-09-05 18:01:15 +02:00
Marek Olšák fe91ae06d3 gallium/radeon: unify and simplify checking for an empty gfx IB
We can take advantage of the fact that multi_fence does the obvious thing
with NULL fences.

This fixes unflushed fences that can get stuck due to empty IBs.
2016-08-25 21:19:17 +02:00
Marek Olšák 16d568d911 gallium/radeon: count gfx IB flushes
This will be used as a counter for whether fence_finish needs to flush
the IB.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 14:29:30 +02:00
Marek Olšák c5ff0d3e65 gallium/radeon: move radeon_winsys::cs_memory_below_limit to drivers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák a6bfafa083 gallium/radeon: move last_gfx_fence from radeonsi to common code
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-03 17:46:46 +02:00
Marek Olšák 1a1cc67edd gallium/radeon: remove RADEON_FLUSH_KEEP_TILING_FLAGS flag
always set

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-19 23:45:06 +02:00
Nicolai Hähnle d938b8c0bf radeonsi: explicitly choose center locations for 1xAA on Polaris
Unlike SC, the small primitive filter does not automatically use center
locations in 1xAA mode, so this is needed to avoid artifacts caused by
the small primitive filter discarding triangles that it shouldn't.

As a side effect of how the effective number of samples is now calculated,
this patch also avoids submitting the sample locations for line/poly smoothing
when they're not really needed.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-07-08 10:52:50 +02:00
Marek Olšák 28d0d0c5b4 radeonsi: fix fractional odd tessellation spacing for Polaris
ported from Vulkan (and no source explains why this is needed)

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 17:36:43 +02:00
Nicolai Hähnle d46a9db840 radeon: check VM faults from DMA flush
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-24 12:36:03 +02:00
Nicolai Hähnle 80dd7870fe radeonsi: move gfx fence wait out of si_check_vm_faults
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-24 12:36:03 +02:00
Nicolai Hähnle ad8438403b radeonsi: extract IB and bo list saving into separate functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-24 12:36:02 +02:00
Bas Nieuwenhuizen 54f755fa0f radeonsi: Reinitialize all descriptors in CE preamble.
This fixes a problem with the CE preamble and restoring only stuff in the
preamble when needed.

To illustrate suppose we have two graphics IB's 1 and 2, which  are submitted in
that order. Furthermore suppose IB 1 does not use CE ram, but IB 2 does, and we
have a context switch at the start of IB 1, but not between IB 1 and IB 2.

The old code put the CE RAM loads in the preamble of IB 2. As the preamble of
IB 1 does not have the loads and the preamble of IB 2 does not get executed, the
old values are not load into CE RAM.

Fix this by always restoring the entire CE RAM.

v2: - Just load all descriptor set buffers instead of load and store the entire
      CE RAM.
    - Leave the ce_ram_dirty tracking in place for the non-preamble case.

v3: - Fixed parameter alignment.
    - Rebased to master (Nicolai's descriptor series).

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-10 12:18:29 +02:00
Nicolai Hähnle 89ba076de4 radeon/winsys: introduce radeon_winsys_cs_chunk
We will chain multiple chunks together and will keep pointers to the older
chunks to support IB dumping.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:20 +02:00
Nicolai Hähnle d6211a61b0 gallium/radeon: use cs_check_space throughout
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:18 +02:00
Marek Olšák fc4896e686 radeonsi: don't flush TC at the end of IBs on DRM >= 3.2.0
It's not needed since it was fixed in the kernel.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-31 16:41:22 +02:00
Nicolai Hähnle 0558564200 gallium/radeon: add radeon_emitted to check for non-trivial IBs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-17 15:28:39 -05:00