iris: Use the hardware blitter for DRI PRIME blits

In a hybrid graphics setup, Mesa allocates two buffers for the window
surface.  The first is what the discrete card renders to; it lives in
VRAM and is usually tiled and possibly compressed.  The second is a
shadow copy that lives in system memory, where the integrated card
that drives the displays can read it; it's usually linear and
uncompressed.

Mesa's window system code schedules blits to update the shadow copy
when needed, typically at the end of a frame.  These can be fairly
costly when running a full-screen application at high resolutions.

We'd like to use the blitter for these copies, as it lets us perform
the copy asynchronously, letting the 3D engine race ahead and start
rendering the next frame.  If we used the 3D engine, the next frame
could not start rendering until the PRIME blit finishes, giving us
less time to draw the frame.  Fortunately, Tigerlake introduced new
blitter commands which can operate at full memory bandwidth.
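
To make the scheduling argument above concrete, here is a toy model of
the steady-state frame period in the two cases.  It is only a sketch
under simplifying assumptions: the function names are hypothetical (not
Mesa code), synchronization overhead and memory bandwidth contention are
ignored, and the overlap between the blit and the next frame's rendering
is assumed to be perfect.

```c
#include <assert.h>

/* Toy model only; these names are hypothetical and not part of Mesa. */

/* Blit on the 3D engine: the next frame cannot start until the PRIME
 * blit finishes, so the two costs add up. */
static double
frame_period_serialized(double render_ms, double blit_ms)
{
   assert(render_ms >= 0.0 && blit_ms >= 0.0);
   return render_ms + blit_ms;
}

/* Blit on a separate engine: the copy overlaps with rendering of the
 * next frame, so (with perfect overlap) the slower of the two stages
 * sets the frame period. */
static double
frame_period_overlapped(double render_ms, double blit_ms)
{
   assert(render_ms >= 0.0 && blit_ms >= 0.0);
   return render_ms > blit_ms ? render_ms : blit_ms;
}
```

With made-up inputs of 10 ms of rendering and 2 ms of blitting, the
serialized model gives a 12 ms period and the overlapped model gives
10 ms.  These numbers are illustrative and are not measurements.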

DRI PRIME blits happen via the Gallium blit() hook.  We can detect them
by looking for the PIPE_BIND_PRIME_BLIT_DST flag on the destination
resource.  When it is set on Gfx12+, this patch calls iris_copy_region()
on IRIS_BATCH_BLITTER to handle the blit.  We know a priori that the
blitter can handle this operation: it's not a scaled blit, the formats
match and should not be 96bpp, and there's no combined depth/stencil or
other weird edge case.  blorp_copy() will also assert that these edge
cases don't occur.
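
The detection described above boils down to a small predicate.  The
following is a minimal standalone sketch of it, not the driver code: the
enum, the function, and the flag value are hypothetical stand-ins for
the real Gallium and iris definitions, and the generation check mirrors
the Gfx12+ condition in the diff.

```c
#include <assert.h>

/* Hypothetical stand-in for PIPE_BIND_PRIME_BLIT_DST; the value is made
 * up for this sketch and does not match the real Gallium flag. */
#define SKETCH_BIND_PRIME_BLIT_DST (1u << 0)

/* Hypothetical stand-in for the choice of iris batch. */
enum sketch_engine {
   SKETCH_ENGINE_RENDER,
   SKETCH_ENGINE_BLITTER,
};

/* Pick the engine for a blit from the hardware generation and the
 * destination resource's bind flags. */
static enum sketch_engine
sketch_pick_engine(int gfx_ver, unsigned dst_bind)
{
   assert(gfx_ver > 0);

   /* PRIME blit destinations go to the blitter on Gfx12 and later. */
   if (gfx_ver >= 12 && (dst_bind & SKETCH_BIND_PRIME_BLIT_DST))
      return SKETCH_ENGINE_BLITTER;

   /* Everything else stays on the existing path. */
   return SKETCH_ENGINE_RENDER;
}
```

For example, sketch_pick_engine(12, SKETCH_BIND_PRIME_BLIT_DST) selects
the blitter, while the same flag on an older generation, or a
destination without the flag, falls through to the existing path.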

Together with the next patch, this improves performance on DG1 Hybrid
scenarios by about 5-6%.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13877>
Kenneth Graunke 2022-02-02 20:16:22 -08:00 committed by Marge Bot
parent f9eba6e2b5
commit e3cb620b55
1 changed file with 14 additions and 0 deletions

@@ -27,6 +27,7 @@
 #include "pipe/p_screen.h"
 #include "util/format/u_format.h"
 #include "util/u_inlines.h"
+#include "util/u_surface.h"
 #include "util/ralloc.h"
 #include "intel/blorp/blorp.h"
 #include "iris_context.h"
@@ -401,6 +402,19 @@ iris_blit(struct pipe_context *ctx, const struct pipe_blit_info *info)
       return;
    }

+   /* Do DRI PRIME blits on the hardware blitter on Gfx12+ */
+   if (devinfo->ver >= 12 &&
+       (info->dst.resource->bind & PIPE_BIND_PRIME_BLIT_DST)) {
+      assert(!info->render_condition_enable);
+      assert(util_can_blit_via_copy_region(info, false, false));
+      iris_copy_region(&ice->blorp, &ice->batches[IRIS_BATCH_BLITTER],
+                       info->dst.resource, info->dst.level,
+                       info->dst.box.x, info->dst.box.y, info->dst.box.z,
+                       info->src.resource, info->src.level,
+                       &info->src.box);
+      return;
+   }
+
    if (abs(info->dst.box.width) == abs(info->src.box.width) &&
        abs(info->dst.box.height) == abs(info->src.box.height)) {
       if (info->src.resource->nr_samples > 1 &&