KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Ian Romanick	5116646a76	nir/algebraic: Recognize open-coded fsat with modifiers This change also enables a later change (nir/algebraic: Replace 1-fsat(a) with fsat(1-a)) to affect more shaders. Almost all of the affected shaders are in Bioshock Infinite, and all of those shaders all require GLSL 4.10. All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17228584 -> 17228376 (<.01%) instructions in affected programs: 31438 -> 31230 (-0.66%) helped: 105 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 1.98 x̃: 1 helped stats (rel) min: 0.08% max: 1.53% x̄: 0.73% x̃: 0.70% 95% mean confidence interval for instructions value: -2.20 -1.76 95% mean confidence interval for instructions %-change: -0.80% -0.67% Instructions are helped. total cycles in shared programs: 360936431 -> 360935690 (<.01%) cycles in affected programs: 420100 -> 419359 (-0.18%) helped: 71 HURT: 21 helped stats (abs) min: 1 max: 160 x̄: 19.28 x̃: 10 helped stats (rel) min: <.01% max: 9.78% x̄: 0.95% x̃: 0.48% HURT stats (abs) min: 1 max: 198 x̄: 29.90 x̃: 10 HURT stats (rel) min: 0.05% max: 8.36% x̄: 1.24% x̃: 0.90% 95% mean confidence interval for cycles value: -16.77 0.66 95% mean confidence interval for cycles %-change: -0.85% -0.06% Inconclusive result (value mean confidence interval includes 0). Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2019-05-14 11:38:22 -07:00
Ian Romanick	c769641c8e	nir/algebraic: Push unary operations into source operands of fsat source Pushing a unary operation, like fneg, into the operation that generates its operand allows the fsat to be applied to the inner instruction instead of on a separate instruction that performs the unary operation. This changes fmul ssa_100, ssa_99, ssa_98 fmov.sat ssa_101, -ssa_100 into fmul.sat ssa_100, -ssa_99, ssa_98 Ice Lake, Skylake, and Broadwell had similar results. (Ice Lake shown) total instructions in shared programs: 17228658 -> 17228584 (<.01%) instructions in affected programs: 3163 -> 3089 (-2.34%) helped: 49 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.51 x̃: 2 helped stats (rel) min: 0.58% max: 9.09% x̄: 3.69% x̃: 3.51% 95% mean confidence interval for instructions value: -1.66 -1.37 95% mean confidence interval for instructions %-change: -4.37% -3.00% Instructions are helped. total cycles in shared programs: 360937144 -> 360936431 (<.01%) cycles in affected programs: 24029 -> 23316 (-2.97%) helped: 47 HURT: 2 helped stats (abs) min: 4 max: 18 x̄: 15.34 x̃: 16 helped stats (rel) min: 0.69% max: 6.18% x̄: 3.78% x̃: 4.27% HURT stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 HURT stats (rel) min: 0.34% max: 0.67% x̄: 0.50% x̃: 0.50% 95% mean confidence interval for cycles value: -16.05 -13.05 95% mean confidence interval for cycles %-change: -4.07% -3.15% Cycles are helped. All Gen7 and earlier platforms had similar results. (Haswell shown) total instructions in shared programs: 13536059 -> 13535884 (<.01%) instructions in affected programs: 8797 -> 8622 (-1.99%) helped: 150 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1 helped stats (rel) min: 0.40% max: 11.11% x̄: 3.51% x̃: 1.96% 95% mean confidence interval for instructions value: -1.23 -1.11 95% mean confidence interval for instructions %-change: -3.97% -3.05% Instructions are helped. total cycles in shared programs: 357696119 -> 357694193 (<.01%) cycles in affected programs: 50216 -> 48290 (-3.84%) helped: 109 HURT: 14 helped stats (abs) min: 2 max: 92 x̄: 18.97 x̃: 16 helped stats (rel) min: 0.26% max: 19.09% x̄: 7.37% x̃: 5.37% HURT stats (abs) min: 2 max: 26 x̄: 10.14 x̃: 5 HURT stats (rel) min: 0.18% max: 4.73% x̄: 1.84% x̃: 0.92% 95% mean confidence interval for cycles value: -19.27 -12.05 95% mean confidence interval for cycles %-change: -7.34% -5.31% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-14 11:38:22 -07:00
Ian Romanick	3b74790941	nir/algebraic: Recognize open-coded flrp(a, b, fsat(c)) All Gen6+ GPUs had similar results. (Skylake shown) total instructions in shared programs: 15336712 -> 15336622 (<.01%) instructions in affected programs: 3952 -> 3862 (-2.28%) helped: 24 HURT: 0 helped stats (abs) min: 3 max: 5 x̄: 3.75 x̃: 4 helped stats (rel) min: 1.75% max: 2.70% x̄: 2.34% x̃: 2.46% 95% mean confidence interval for instructions value: -4.06 -3.44 95% mean confidence interval for instructions %-change: -2.47% -2.22% Instructions are helped. total cycles in shared programs: 355722052 -> 355721235 (<.01%) cycles in affected programs: 27326 -> 26509 (-2.99%) helped: 20 HURT: 4 helped stats (abs) min: 1 max: 227 x̄: 44.75 x̃: 14 helped stats (rel) min: 0.12% max: 22.95% x̄: 3.83% x̃: 1.23% HURT stats (abs) min: 2 max: 64 x̄: 19.50 x̃: 6 HURT stats (rel) min: 0.21% max: 3.63% x̄: 1.24% x̃: 0.55% 95% mean confidence interval for cycles value: -61.61 -6.47 95% mean confidence interval for cycles %-change: -5.59% -0.39% Cycles are helped. No changes on Ice Lake, Iron Lake, or GM45. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-14 11:38:21 -07:00
Ian Romanick	a79570099b	intel/fs: Allow cmod propagation to instructions with saturate modifier v2: Add unit tests. Suggested by Matt. All Intel GPUs had similar results. (Ice Lake shown) total instructions in shared programs: 17229441 -> 17228658 (<.01%) instructions in affected programs: 159574 -> 158791 (-0.49%) helped: 489 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 1.60 x̃: 1 helped stats (rel) min: 0.07% max: 2.70% x̄: 0.61% x̃: 0.59% 95% mean confidence interval for instructions value: -1.72 -1.48 95% mean confidence interval for instructions %-change: -0.64% -0.58% Instructions are helped. total cycles in shared programs: 360944149 -> 360937144 (<.01%) cycles in affected programs: 1072195 -> 1065190 (-0.65%) helped: 254 HURT: 27 helped stats (abs) min: 2 max: 234 x̄: 30.51 x̃: 9 helped stats (rel) min: 0.04% max: 8.99% x̄: 0.75% x̃: 0.24% HURT stats (abs) min: 2 max: 83 x̄: 27.56 x̃: 24 HURT stats (rel) min: 0.09% max: 3.79% x̄: 1.28% x̃: 1.16% 95% mean confidence interval for cycles value: -30.11 -19.75 95% mean confidence interval for cycles %-change: -0.70% -0.41% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]	2019-05-14 11:38:21 -07:00
Ian Romanick	a7724b1cbb	nir/algebraic: Add missing ffma(-1, a, b) pattern All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17229439 -> 17229377 (<.01%) instructions in affected programs: 9859 -> 9797 (-0.63%) helped: 41 HURT: 0 helped stats (abs) min: 1 max: 6 x̄: 1.51 x̃: 1 helped stats (rel) min: 0.08% max: 11.54% x̄: 1.65% x̃: 0.67% 95% mean confidence interval for instructions value: -1.88 -1.14 95% mean confidence interval for instructions %-change: -2.48% -0.81% Instructions are helped. total cycles in shared programs: 360944145 -> 360942989 (<.01%) cycles in affected programs: 178167 -> 177011 (-0.65%) helped: 36 HURT: 19 helped stats (abs) min: 1 max: 222 x̄: 38.03 x̃: 5 helped stats (rel) min: 0.01% max: 31.01% x̄: 4.01% x̃: 0.45% HURT stats (abs) min: 1 max: 34 x̄: 11.21 x̃: 6 HURT stats (rel) min: 0.03% max: 2.74% x̄: 0.72% x̃: 0.50% 95% mean confidence interval for cycles value: -36.01 -6.02 95% mean confidence interval for cycles %-change: -4.18% -0.57% Cycles are helped. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-14 11:25:03 -07:00
Ian Romanick	7b4ff6a1af	nir: Mark ffma as 2src_commutative This doesn't make any real difference now, but future work (not in this series) will add a LOT of ffma patterns. Having to duplicate all of them for ffma(a, b, c) and ffma(b, a, c) is just terrible. No shader-db changes on any Intel platform. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-14 11:25:02 -07:00
Ian Romanick	e049a9c92b	nir: Add support for 2src_commutative ops that have 3 sources v2: Instead of handling 3 sources as a special case, generalize with loops to N sources. Suggested by Jason. v3: Further generalize by only checking that number of sources is >= 2. Suggested by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-14 11:25:02 -07:00
Ian Romanick	ede45bf9cf	nir: Rename commutative to 2src_commutative The meaning of the new name is that the first two sources are commutative. Since this is only currently applied to two-source operations, there is no change. A future change will mark ffma as 2src_commutative. It is also possible that future work will add 3src_commutative for opcodes like fmin3. v2: s/commutative_2src/2src_commutative/g. I had originally considered this, but I discarded it because I did't want to deal with identifiers that (should) start with 2. Jason suggested it in review, so we decided that _2src_commutative would be used in nir_opcodes.py. Also add some comments documenting what 2src_commutative means. Also suggested by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-14 11:25:02 -07:00
Jason Ekstrand	e99081e76d	intel/fs/ra: Spill without destroying the interference graph Instead of re-building the interference graph every time we spill, we modify it in place so we can avoid recalculating liveness and the whole O(n^2) interference graph building process. We make a simplifying assumption in order to do so which is that all spill/fill temporary registers live for the entire duration of the instruction around which we're spilling. This isn't quite true because a spill into the source of an instruction doesn't need to interfere with its destination, for instance. Not re-calculating liveness also means that we aren't adjusting spill costs based on the new liveness. The combination of these things results in a bit of churn in spilling. It takes a large cut out of the run-time of shader-db on my laptop. Shader-db results on Kaby Lake: total instructions in shared programs: 15311224 -> 15311360 (<.01%) instructions in affected programs: 77027 -> 77163 (0.18%) helped: 11 HURT: 18 total cycles in shared programs: 355544739 -> 355830749 (0.08%) cycles in affected programs: 203273745 -> 203559755 (0.14%) helped: 234 HURT: 190 total spills in shared programs: 12049 -> 12042 (-0.06%) spills in affected programs: 2465 -> 2458 (-0.28%) helped: 9 HURT: 16 total fills in shared programs: 25112 -> 25165 (0.21%) fills in affected programs: 6819 -> 6872 (0.78%) helped: 11 HURT: 16 Total CPU time (seconds): 2469.68 -> 2360.22 (-4.43%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	147665d0a2	intel/fs/ra: Put the VGRFs at the end of the nodes This is slightly less convenient in some places but it will make it much easier when we want to start adding nodes dynamically. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	e7b7d572b3	intel/fs/ra: Re-arrange interference setup The old code was arranged by the type of interference being added. It would set up payload registers and then add payload interference for all VGRFs. It would set up MRFs and add MRF interference for all VGRFs. This commit re-arranges things to be organized differently. It first creates and sets up all RA nodes and then groups interference into two new categories: live range and instruction interference. Once all the RA nodes have been set up, it walks the list of VGRFs and sets up their live range interference and then walks the list of instructions and sets up instruction interference. This new arrangement will be advantageous for a future patch but, at the moment, it cuts 2% off the run-time of shader-db on my laptop. Shader-db results on Kaby Lake: total instructions in shared programs: 15311224 -> 15311224 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 355544739 -> 355544739 (0.00%) cycles in affected programs: 0 -> 0 helped: 0 HURT: 0 Total CPU time (seconds): 2523.45 -> 2469.68 (-2.13%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	0fd60e95fb	intel/fs/ra: Do the spill loop inside RA Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	47b1dcdcab	intel/fs/ra: Only add MRF hack interference if we're spilling The only use of the MRF hack these days is for spilling and there we don't need the precise MRF usage information. If we're spilling then we know pretty well how many MRFs are going to be used. It is possible if the only things that are spilled have fewer SIMD channels than the dispatch width of the shader that this may be more MRFs than needed. That's a risk we're willing to takd. Shader-db results on Kaby Lake: total instructions in shared programs: 15311100 -> 15311224 (<.01%) instructions in affected programs: 16664 -> 16788 (0.74%) helped: 1 HURT: 5 total cycles in shared programs: 355543197 -> 355544739 (<.01%) cycles in affected programs: 731864 -> 733406 (0.21%) helped: 3 HURT: 6 The hurt shaders are all SIMD32 compute shaders where we reserve enough space for a 32-wide spill/fill but don't need it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	69878a9bb0	intel/fs/ra: Pull the guts of RA into its own class This accomplishes two things. First, it makes interfaces which are really private to RA private to RA. Second, it gives us a place to store some common stuff as we go through the algorithm. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	9e00a251be	intel/fs/ra: Move assign_regs further down in the file It's the main function from which all the other functions are called. It belongs at the bottom. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	5d9ac57c8c	intel/fs/ra: Split building the interference graph into a helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	472ef2f98d	intel/fs/ra: Initialize grf_used with first_non_payload_grf There's no reason why we need to use the calculated payload_node_count value which is just first_non_payload_grf aligned up. The grf_used value will be aligned up to 16 anyway (which is a much bigger alignment) before being handed off to hardware. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	096ad8a809	intel/fs/ra: Stop adding RA interference to too many SENDS nodes We only have one node per VGRF so this was adding way too much interference. No idea how we didn't catch this before. Shader-db results on Kaby Lake: total instructions in shared programs: 15311100 -> 15311100 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 355468050 -> 355543197 (0.02%) cycles in affected programs: 2472492 -> 2547639 (3.04%) helped: 17 HURT: 20 Fixes: `014edff0d2` "intel/fs: Add interference between SENDS sources" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	5911abd76f	util/ra: Assert nodes are in-bounds in add_node_interference Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	88cac12230	intel/fs/ra: Only add dest interference to sources that exist Fixes: `83dedb6354` "i965: Add src/dst interference for certain" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	e291cd8a7e	util/ra: Don't destroy the graph in ra_allocate() We want to be able to call ra_allocate() and, when it fails, mutate the graph and try again rather than re-building the graph from scratch. This commit moves all the scratch bits except the final register allocation (which is really an out value not scratch) into sub-structs named "tmp" to make it clear which things are scratch. It also adds bits to the ra_select() initialization loop to initialize things (since we can't trust rzalloc anymore) and copy q_test and forced_reg over. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	9040215f5d	util/ra: Add a helper for resetting a node's interference Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	698bb9b984	util/ra: Add helpers for adding nodes to an interference graph Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	6c0f75c953	util/ralloc: Add helpers for growing zero-initialized memory Unfortunately, we can't quite follow the standard C conventions for these because ralloc doesn't know the sizes of pointers. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	6212326941	intel/fs: Stop doing extra RA calls In the last phase of the schedule and RA loop, the RA call is redundant if we spill. Immediately afterwards, we're going to see that we couldn't allocate without spilling and call back into RA and tell it to go ahead and spill. We've known about it for a while but we've always brushed over it on the theory that, if you're going to spill, you'll be calling RA a bunch anyway and what does one extra RA hurt? As it turns out, it hurts more than you'd expect. Because the RA interference graph gets sparser with each spill and the RA algorithm is more efficient on sparser graphs, the RA call that we're duplicating is actually the most expensive call in the RA-and-spill loop. There's another extra RA call we do that's a bit harder to see which this also removes. If we try to compile a shader that isn't the minimum dispatch width and it fails to allocate without spilling we call fail() to set an error but then go ahead and do the first spilling RA pass and only after that's complete do we detect the fail and bail out. By making minimum dispatch widths part of the spill condition, we side-step this problem. Getting rid of these extra spills takes the compile time of a nasty Aztec Ruins shader from about 28 seconds to about 26 seconds on my laptop. It also makes shader-db 1.5% faster Shader-db results on Kaby Lake: total instructions in shared programs: 15311100 -> 15311100 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 355468050 -> 355468050 (0.00%) cycles in affected programs: 0 -> 0 helped: 0 HURT: 0 Total CPU time (seconds): 2524.31 -> 2486.63 (-1.49%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	41b310e219	util/ra: Improve the performance of ra_simplify The most expensive part of register allocation is the ra_simplify step which is a fixed-point algorithm with a worst-case complexity of O(n^2) which adds the registers to a stack which we then use later to do the actual allocation. This commit uses bit sets and changes the core loop of ra_simplify to first walk 32-node chunks and then walk each chunk. This lets us skip whole 32-node chunks in one go based on bit operations and compute the minimum q value potentially 32x as fast. Of course, the algorithm still has the same fundamental O(n^2) run-time but the constant is now much lower. In the nasty Aztec Ruins compute shader, this shaves a full four seconds off the 30s compile time for a release build of mesa. In a debug build (needed for accurate stack traces), perf says that ra_select takes 20% of runtime before this patch and only 5-6% of runtime after this patch. It also makes shader-db runs faster. Shader-db results on Kaby Lake: total instructions in shared programs: 15311100 -> 15311100 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 355468050 -> 355468050 (0.00%) cycles in affected programs: 0 -> 0 helped: 0 HURT: 0 Total CPU time (seconds): 2602.37 -> 2524.31 (-3.00%) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	e1511f1d4c	util/ra: Only update q_total if the reg is not assigned We only use q_total if the reg is not assigned so there's no point in updating it if the reg is not assigned. This has no known perf benefit but it will reduce churn in a future commit. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	9d6d1f47e7	util/ra: Only update best_optimistic_node if !progress This shaves about half a second off the 30 second compile time of one of the compute shaders in Aztec ruins. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	de56d3a2d1	util/ra: Make in_stack a bitset in the graph Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	7720ad65ae	util/ra: Get rid of tabs Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Chia-I Wu	34810f4237	virgl: clean up virgl_res_needs_flush Add comments and some minor cleanups. v2: document the function Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> (v1) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2019-05-14 17:00:22 +00:00
Chia-I Wu	08241624ad	virgl: comment on a sync issue in transfers Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-14 17:00:22 +00:00
Chia-I Wu	76e45534d2	virgl: PIPE_TRANSFER_READ does not imply flush virgl_res_needs_flush should suffice. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-14 17:00:22 +00:00
Chia-I Wu	9f8521882a	virgl: do not skip readback because of explicit flush Both apps and we (see virgl_buffer_transfer_flush_region) might flush regions that are unmodified. We have to read back for those flushes. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-14 17:00:22 +00:00
Chia-I Wu	be8eeb3b59	virgl: remove unused virgl_transfer_inline_write It currently has no user and is probably incorrect (resource_wait is required in some more cases). Remove it so that we can focus on transfers first. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-14 17:00:22 +00:00
Nanley Chery	e81392868e	iris/resource: Drop redundant checks for aux support Drop some checks that are already done by ISL. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	75a3947af4	iris/resource: Fall back to no aux if creation fails No surface requires an auxiliary surface to operate correctly. Fall back to an uncompressed surface if mesa fails to create and allocate an auxiliary surface. This enables adding more restrictions to ISL without having to update iris. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	1423b78633	i965/miptree: Refactor intel_miptree_supports_ccs_e() Update and rename this function to format_supports_ccs_e() to better match its behavior. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	779bd8d332	i965/miptree: Drop intel_*_supports_hiz() intel_tiling_supports_hiz() and intel_miptree_supports_hiz() duplicate much the work done by isl_surf_get_hiz_surf(). Replace them with simple expressions. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	29a13eb71d	isl: Add restrictions to isl_surf_get_hiz_surf() Import some restrictions from intel_tiling_supports_hiz() and intel_miptree_supports_hiz(). Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	942755bec4	i965/miptree: Drop intel_*_supports_ccs() intel_tiling_supports_ccs() and intel_miptree_supports_ccs() duplicate much the work done by isl_surf_get_ccs_surf(). Drop them both and index a boolean array to choose CCS_D in intel_miptree_choose_aux_usage(). Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	d57242190e	isl: Add restriction and comments to isl_surf_get_ccs_surf() Import some restrictions and comments from intel_miptree_supports_ccs(). Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	91a42537d1	i965/miptree: Drop intel_miptree_supports_mcs() This function duplicates much the work done by isl_surf_get_mcs_surf(). Replace it with a simple expression. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	1de089797c	isl: Modify restrictions in isl_surf_get_mcs_surf() Import some restrictions from intel_miptree_supports_mcs() and don't assume that the caller knows which device generations are supported. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	cf758c4182	i965/miptree: Fall back to no aux if creation fails No surface requires an auxiliary surface to operate correctly. Fall back to an uncompressed surface if mesa fails to create and allocate an auxiliary surface. This enables adding more restrictions to ISL without having to update i965. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Mathias Fröhlich	fc455797c1	mesa: Set _NEW_VARYING_VP_INPUTS iff varying_vp_inputs are set. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-14 18:09:49 +02:00
Mathias Fröhlich	b4b1df5a17	mesa: Avoid setting _NEW_VARYING_VP_INPUTS in non fixed function mode. Instead of checking the API variant on entry of set_varying_vp_inputs to check if we can ever be interrested in fixed function processing or not, we can check if we are actually fixed function processing. To check this we can use the immediately updated gl_context::VertexProgram._VPMode value that tells us if we have a user provided shader program or if we are in fixed function processing either through an internal TNL shader of directly through hardware. When doing so, we also need to recheck the varying_vp_inputs variable at the time gl_context::VertexProgram._VPMode is set to VP_MODE_FF. Put asserts at the consumers of gl_context::varying_vp_inputs to make sure gl_context::VertexProgram._VPMode is set to VP_MODE_FF. By that gl_context::varying_vp_inputs should be up to date then. By not looking at the opengl api for this decision we should actually catch more cases where we can avoid setting a state change flag, including the ones where we cannot get into VP_MODE_FF by the choice of the api. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-14 18:09:49 +02:00
Mathias Fröhlich	663f93c869	mesa: Fix test for setting the _NEW_VARYING_VP_INPUTS flag. The precondition stated in the comment is not true. The values mentioned are only set from _mesa_update_state which in turn may not yet be called. For now set the _NEW_VARYING_VP_INPUTS flag a bit more often, we will narrow that down to a minimum again in a later patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-14 18:09:49 +02:00
Mathias Fröhlich	df50af19d3	mesa: Make _mesa_set_varying_vp_inputs static in state.c. Is no longer used outside that file. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-14 18:09:49 +02:00
Mathias Fröhlich	99952579f3	mesa: Fix old outdated variable name in a comment. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-14 18:09:49 +02:00

... 3 4 5 6 7 ...

111195 Commits All Branches Search

111195 Commits

All Branches