In case blorp needs to configure it will be just as if render or
compute pipeline had configured it.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
In the past, BLORP has clobbered all BRW_NEW_* state flags, to trigger
re-emission of the entire 3D pipeline on the next draw. However, there
are some packets BLORP simply leaves alone, so there's no need to
re-emit them. Trying to reduce the set of dirty bits flagged after
BLORP runs is tricky.
Instead, we introduce a BRW_NEW_BLORP flag. This should be set on any
atom which emits a packet that BLORP also emits. When BLORP runs, it
will flag BRW_NEW_BLORP, causing those packets to get re-emitted.
This also makes it easy to avoid re-emitting specific atoms - we can
simply drop the BRW_NEW_BLORP flag on those.
To start, we assume that all packets need to be re-emitted. This is the
safest approach and closest to the existing code's behavior. Many of
these are obviously not required, and can be dropped in subsequent
patches.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
This is identical to the blorp version which only differs in case
fragment shader isn't used. In that case blorp would reset batch
buffer address to zero.
This is not really needed, and having blorp to use base state
address setup that is compatible with normal upload allows one to
skip resetting it.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We're trying to move to more of the new style intrinsics with include
the correct target name, and map directly to ISA instructions.
v2:
- Only do this with LLVM 3.8 and newer.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The range metadata tells LLVM the range of expected values for this intrinsic,
so it can do some additional optimizations on the result.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Although Gen9 samples from most HDR ASTC surfaces of correctly,
there currently are no software workarounds to fix the incorrect
sampling that occurs in others of certain color endpoint modes.
With this change, we are no longer failing the 14 tests from:
dEQP-GLES3.functional.texture.compressed.astc.endpoint_value_hdr_cem_15.*
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Switch boolean template arguments to typename template arguments of type
std::integral_constant<bool, VALUE>.
This allows the template argument unroller to easily be extended to enums.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Part of fixing piglit EXT_framebuffer_multisample/sample-coverage inverted
(there is also a bug with RCL tiled blits)
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
There's no reason we couldn't do non-MSAA full resolution tile buffer
load/stores, but we would have claimed buffer overflow was being
attempted. Nothing does this currently.
This was a bug from the MSAA enabling. Tests for surfaces with
nr_samples==1 instead of 0 (generally GL renderbuffers) would incorrectly
fail out.
Fixes the ARB_framebuffer_sRGB piglit tests other than srgb_conformance.
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
I had made the previous blit fix non-MSAA only because I was thinking
about how the hardware infers stride from the RENDERING_CONFIG packet.
However, I'm also inferring the stride for both MSAA src and dst in
vc4_render_cl.c from the width argument in the ioctl.
Fixes 15 EXT_framebuffer_multisample piglit tests.
On Broadwell, I get the following shader-db statistics:
Tessellation Control Shaders:
total instructions in shared programs: 57327 -> 57012 (-0.55%)
instructions in affected programs: 27334 -> 27019 (-1.15%)
helped: 45
HURT: 0
total cycles in shared programs: 265692 -> 255188 (-3.95%)
cycles in affected programs: 263122 -> 252618 (-3.99%)
helped: 184
HURT: 26
Tessellation Evaluation Shaders:
total instructions in shared programs: 23236 -> 23157 (-0.34%)
instructions in affected programs: 2791 -> 2712 (-2.83%)
helped: 27
HURT: 0
total cycles in shared programs: 151858 -> 149704 (-1.42%)
cycles in affected programs: 151858 -> 149704 (-1.42%)
helped: 101
HURT: 114
Geometry Shaders:
Orbital Explorer goes from 6442 -> 6356 instructions.
Two Shadow of Mordor shaders increase by a single instruction.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
NIR will lower it in nir_opt_algebraic.
No change in shader-db.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The timestamps are stored in a funny place, and even though they are a
64-bit result, are not stored with is64bit. Account for that when
retrieving the query result into a resource.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
This lets us delete some redundant code and keep all of the
image_load_store format lowering logic in one place: libisl.
Reviewed-by: Chad Versace <chad.versace@intel.com>
Previously, we were relying on has_matching_typed_format returning true for
MESA_FORMAT_NONE which, in turn, relied on _mesa_get_format_bytes returning
1 for MESA_FORMAT_NONE. When we switch to ISL, this behaviour will no
longer be something we can rely on.
Reviewed-by: Chad Versace <chad.versace@intel.com>