KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Mark Janes	e67b8f504b	iris: implement iris layer of INTEL_MEASURE Acked-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7354>	2021-02-01 17:24:57 -08:00
Francisco Jerez	aa78d05a23	iris: Remove depth cache set tracking and synchronization. The depth cache set is now redundant with the more general seqno matrix-based cache tracking mechanism. Removed as a separate patch for bisectability. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>	2020-06-03 23:12:22 +00:00
Francisco Jerez	fc221875cf	iris: Introduce cache coherency matrix for batch-local memory ordering. This introduces a representation of the cache coherency status of the GPU at any point in the batch. This is done by defining a matrix C of synchronization sequence numbers such that at any point of batch construction, a memory operation from domain i introduced into the batch is guaranteed to be ordered after any memory operation from domain j in a previous batch section with seqno n if the following condition holds: C_i_j >= n This allows us to efficiently determine whether additional flushing and/or invalidation is required in order to access a buffer object from some arbitrary domain. Except for batch buffer reset which requires clearing the whole matrix, all operations on the matrix are either O(n) or O(1) on the number of caching domains (which is basically constant). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>	2020-06-03 23:12:22 +00:00
Francisco Jerez	4b7fd91be6	iris: Report use of any in-flight buffers on first draw call after sync boundary. This is the main performance trade-off of this cache tracking mechanism: In order for the seqno vector of buffer objects to be accurate, they need to be marked as used again every time the batch is split into a new synchronization section if they remain bound to the pipeline. This can be achieved easily by re-using iris_restore_render_saved_bos() and iris_restore_compute_saved_bos(), which currently serve a similar purpose across batch buffer boundaries. The impact on Piglit drawoverhead results seems to be within a standard deviation of the current results. XXX - It might be possible to completely remove the current iris_batch::contains_draw flag at a small additional performance cost. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>	2020-06-03 23:12:22 +00:00
Francisco Jerez	eb5d1c2722	iris: Annotate all BO uses with domain and sequence number information. Probably the most annoying patch to review from the whole series -- Mark every buffer object use as accessed through some caching domain with the sequence number of the current synchronization section of the batch. The additional argument of iris_use_pinned_bo() makes sure I'd have gotten a compile error if I had missed any buffer added to the batch validation list. There are only a few exceptions where a buffer is left untracked while adding it to the validation list, justified below: - Batch buffers: These are strictly read-only for the moment. - BLORP buffer objects: Their seqnos are bumped manually at the end of iris_blorp_exec() instead, in order to avoid plumbing domain information through BLORP address combining. - Scratch buffers: The contents of these are strictly thread-local. - Shader images and SSBOs: Accesses of these buffers are explicitly synchronized at the API level. v2: Opt out of tracking more aggressively (Ken): In addition to the above, surface states, binding tables, instructions and most dynamic states are now left untracked, which means a lot more BO uses marked IRIS_DOMAIN_NONE which need to be reviewed extremely carefully, since the cache tracker won't be able to provide any coherency guarantees for them. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>	2020-06-03 23:12:22 +00:00
Francisco Jerez	8cbe953548	iris: Add infrastructure to partition batch into sync boundaries. This introduces some minimalistic infrastructure which will be used in order to partition the batch into a series of sections, each one with a unique, monotonically-increasing sequence number. Section boundaries will typically lie at points in the batch where the execution and memory coherency status of some previous commands are known, e.g. at batch buffer boundaries or PIPE_CONTROL commands. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>	2020-06-03 23:12:22 +00:00
Francisco Jerez	45918e0d8c	iris: Simplify iris_batch_prepare_noop(). This makes iris_batch_prepare_noop() return a boolean instead of passing through the relevant set of dirty flags. It will make it easier to change the representation of dirty flags. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5279>	2020-06-03 22:22:19 +00:00
Chris Wilson	034329128b	iris: Rename iris_seqno to iris_fine_fence Rename iris_seqno to iris_fine_fence, borrowed from si_fine_fence, to avoid introducing any confusion with any other seqno used for tracking pipelines. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5233>	2020-05-28 12:47:19 -07:00
Lionel Landwerlin	07781f0afe	iris: store workaround address This will allow selecting a different address later, leaving the beginning of the buffer for some other use. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3203>	2020-05-20 15:58:22 +00:00
Chris Wilson	e31b703c42	iris: Place a seqno at the end of every batch We can use seqno as a basis for fast userspace fences: where we can check a value directly to test for fence completion without having to query using the kernel. To do so we need to write a breadcrumb from the batch and track those writes as the basis for our lightweight fences. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>	2020-05-01 19:00:02 +00:00
Kenneth Graunke	c94379c770	iris: Give up on not passing ice to iris_init_batch We're going to need it to create an uploader in the batch soon. We still avoid storing it, to maintain the charade of separation, and make people think twice about fetching random fields from there and intertwining things even worse. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>	2020-05-01 19:00:02 +00:00
Kenneth Graunke	4a1ed75b85	iris: Rename iris_syncpt to iris_syncobj for clarity. This is just a refcounted wrapper around a drm_syncobj. There is enough terminology going on in the area of synchronization (sync objects, sync files, ...) that I'd rather not invent our own. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>	2020-05-01 19:00:02 +00:00
Mike Blumenkrantz	91375f13ce	iris: move iris_vtable to iris_screen instead of inlining this into every context, now a struct is used in the screen struct to reduce memory usage and simplify a couple of the methods Closes: https://gitlab.freedesktop.org/kwg/mesa/-/issues/6 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4376>	2020-04-29 16:59:45 +00:00
Paulo Zanoni	2c82b13c8f	iris: make BATCH_SZ smaller by BATCH_RESERVED bytes Iris allocates gem buffers using buckets of allocation sizes that are page aligned. We always ask for batch buffers of size BATCH_SZ + BATCH_RESERVED, which is not page aligned: we ask for 65552 bytes, which ends up in the bucket of size 81920, resulting in 20% unused space. Adjust things so there is no waste of space: BATCH_SZ + BATCH_RESERVED is now 65536. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4561>	2020-04-15 21:35:14 +00:00
Lionel Landwerlin	4151d84323	iris: add support INTEL_blackhole_render v2: Use a software mechanism to manage blackhole state v3: s/iris_batchbuffer/iris_batch/ (Ken) v4: Fixup state transition mistake (Ken/Lionel) v5: Cleanup iris_batch_flush (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2964> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2964>	2020-02-13 17:05:05 +00:00
Kenneth Graunke	ba148813d7	iris: Support multiple chained batches. There was never much point in artificially limiting chaining to two batches - we can trivially support arbitrary length chains. Currently, we should only ever have 1 or 2, but this may change. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3613> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3613>	2020-01-29 19:53:22 +00:00
Kenneth Graunke	afcb6625e3	iris: Drop 'engine' from iris_batch. For the moment, everything is I915_EXEC_RENDER, so this isn't necessary. But even should that change, I don't think we want to handle multiple engines in this manner. Nowadays, we have batch->name (IRIS_BATCH_RENDER, IRIS_BATCH_COMPUTE, possibly an IRIS_BATCH_BLIT for blorp batches someday), which describes the functional usage of the batch. We can simply check that and select an engine for that class of work (assuming there ever is more than one). Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3613>	2020-01-29 19:53:22 +00:00
Jordan Justen	2e6a7ced4d	iris/gen12: Write GFX_AUX_TABLE base address register Rework: * Move last_aux_map_state to iris_batch. (Nanley, Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 00:09:14 -07:00
Kenneth Graunke	382f92a814	iris: Increase BATCH_SZ to 64kB This seems to improve performance by roughly ~1% across the board. Thanks to Rafael Antognolli and Dan Walsh for their help tuning.	2019-08-06 09:09:26 -07:00
Kenneth Graunke	db878a728c	iris: Make an iris_batch_get_signal_syncpt() helper. This returns a pointer to the signalling syncpt, without incrementing the reference count. This can be useful for comparisons. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2019-06-26 09:49:01 -07:00
Kenneth Graunke	7d2b54e393	iris: Record state sizes for INTEL_DEBUG=bat decoding. Felix noticed a crash when using INTEL_DEBUG=bat decoding. It turned out that we were sometimes placing variable length data near the end of a buffer, and with the decoder guessing random lengths rather than having an actual count, it was walking off the end and crashing. So this does more than improve the decoder output. Unfortunately, this is a bit more complicated than i965's handling, because we don't have a single state buffer. Various places upload data via u_upload_mgr, and so there isn't a central place to record the size. We don't need to catch every single place, however, since it's only important to record variable length packets (like viewports and binding tables). State data also lives arbitrarily long, rather than being discarded on every batch like i965, so we don't know when to clear out old entries either. (We also don't have a callback when an upload buffer is released.) So, this tracking may space leak over time. That's probably okay though, as this is only a debugging feature and it's a slow leak. We may also get lucky and overwrite existing entries as we reuse BOs, though I find this unlikely to happen. The fact that the decoder works in terms of offsets from a state base address is also not ideal, as dynamic state base address and surface state base address differ for iris. However, because dynamic state addresses start from the top of a 4GB region, and binding tables start from addresses [0, 64K), it's highly unlikely that we'll get overlap. We can always improve this, but for now it's better than what we had.	2019-05-23 08:07:08 -07:00
Kenneth Graunke	c61862ddfc	iris: Expose PIPE_CAP_DEVICE_RESET_STATUS_QUERY This provides a way for the application to query whether any resets have happened, which lets us expose "robust" contexts. This also enables the KHR_robust_buffer_access_behavior tests.	2019-05-09 16:49:07 -07:00
Kenneth Graunke	343f41781c	iris: Hook up device reset callbacks This mechanism lets the driver inform the state tracker about GPU resets, say for destroying a robust API context and reporting a "device lost" error to the application, making it take action to deal with this.	2019-05-09 16:49:07 -07:00
Chris Wilson	04ddff1aa4	iris: Wire up EGL_IMG_context_priority Add the missing PIPE_CAP_CONTEXT_PRIORITY_MASK and parsing of the context construction flags. Testcase: piglit/egl-context-priority Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-07 20:27:10 -08:00
Jordan Justen	bd0ad651e0	iris: Always use in-tree i915_drm.h Ref: `f1374805a8` "drm-uapi: use local files, not system libdrm" Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-24 21:06:40 -08:00
Kenneth Graunke	1cd001aa63	iris: Make a iris_batch_reference_signal_syncpt helper function. Suggested by Chris Wilson. More obvious what's going on.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	e092ed9213	iris: Drop dead state_size hash table I inherited this from i965. It would be nice to track the state size so INTEL_DEBUG=color,bat decoding can print the right number of e.g. binding table entries or blend states, but...without a single point of entry for state, it's a little tricky to get right. Punt for now, and drop the dead code in the meantime.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	f1a7392be1	iris: Put batches in an array We keep re-making this array all over the place	2019-02-21 10:26:10 -08:00
Kenneth Graunke	d69bc4ac12	iris: Hang on to the last batch's sync-point, so we can wait on it	2019-02-21 10:26:10 -08:00
Chris Wilson	fae74234d9	iris: Tag each submitted batch with a syncobj (adjusted by Ken to make the signalling sync object immediately on batch reset, rather than batch finish time. this will work better with deferred flushes...)	2019-02-21 10:26:10 -08:00
Kenneth Graunke	3e332af611	iris: Drop vestiges of throttling code	2019-02-21 10:26:10 -08:00
Kenneth Graunke	3455f57575	iris: replace vestiges of fence fds with newer exec_fence API patch by me and Chris Wilson	2019-02-21 10:26:10 -08:00
Kenneth Graunke	587e438128	iris: Print the batch name when decoding	2019-02-21 10:26:09 -08:00
Kenneth Graunke	c3cc525c7a	iris: Cross-link iris_batches so they can potentially flush each other This makes e.g. the render batch aware of the compute batch, so it can ask questions like "is this BO referenced by some other batch?" and do something about that.	2019-02-21 10:26:09 -08:00
Kenneth Graunke	eff081cdd9	iris: Support multiple binder BOs, update Surface State Base Address	2019-02-21 10:26:08 -08:00
Kenneth Graunke	888efcd192	iris: Allow inlining of require/get_command_space eliminates so many callqs for ptr++	2019-02-21 10:26:08 -08:00
Kenneth Graunke	621cb43f41	iris: rename ring to engine makes more sense these days. split from a patch by Chris Wilson	2019-02-21 10:26:08 -08:00
Kenneth Graunke	3f863cf680	iris: fix the validation list on new batches	2019-02-21 10:26:06 -08:00
Kenneth Graunke	a9e357caac	iris: fix release builds	2019-02-21 10:26:06 -08:00
Kenneth Graunke	9ea05ccf1f	iris: completely rewrite binder now we get a new one per batch, and flush if it fills up	2019-02-21 10:26:06 -08:00
Kenneth Graunke	604a1a1614	iris: chaining not growing	2019-02-21 10:26:06 -08:00
Kenneth Graunke	ca735c5e0c	iris: delete growing code and just die for now we need proper batch chaining. without relocations, we can't grow, since we've only allocated so much VMA for the batch, and the mechanism only works if we can pin it at the old address	2019-02-21 10:26:06 -08:00
Kenneth Graunke	c9d9e44720	iris: bits of blorp code	2019-02-21 10:26:06 -08:00
Kenneth Graunke	60d708bb80	iris: copy over i965's cache tracking needed to split out vtbl so I can pipe control without ice	2019-02-21 10:26:06 -08:00
Kenneth Graunke	c75a1254a4	iris: Move get_command_space to iris_batch.c for reuse in blorp. it's a better interface anyway.	2019-02-21 10:26:06 -08:00
Kenneth Graunke	d890aee15d	iris: SBA once at context creation, not per batch hooray!	2019-02-21 10:26:05 -08:00
Kenneth Graunke	bf90d8a125	iris: delete more trash	2019-02-21 10:26:05 -08:00
Kenneth Graunke	65073c2217	iris: hook up batch decoder	2019-02-21 10:26:05 -08:00
Kenneth Graunke	1af84d345a	iris: set EXEC_OBJECT_WRITE	2019-02-21 10:26:05 -08:00
Kenneth Graunke	651be7cf3d	iris: rewrite to use memzones and not relocs	2019-02-21 10:26:05 -08:00
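
The ordering condition described in commit fc221875cf (C_i_j >= n) can be illustrated with a small Python model. This is only a sketch of the check as the commit message states it, not the driver's C code: the domain count, the function names, and the per-buffer `last_use_seqnos` list are hypothetical and chosen for illustration.

```python
# Hypothetical number of caching domains; the real driver defines its own set.
NUM_DOMAINS = 3


def new_matrix():
    """Return a cleared coherency matrix (the state after a batch reset)."""
    return [[0] * NUM_DOMAINS for _ in range(NUM_DOMAINS)]


def is_ordered_after(C, i, j, n):
    """True if an access from domain i is ordered after domain j's
    accesses in the section with seqno n, i.e. if C[i][j] >= n."""
    return C[i][j] >= n


def needs_sync(C, i, last_use_seqnos):
    """Decide whether accessing a buffer from domain i needs extra
    flushing or invalidation.

    last_use_seqnos[j] is the seqno of the last section in which
    domain j touched the buffer (0 meaning "never"). This per-buffer
    vector is an assumption made for the sketch.
    """
    return any(
        not is_ordered_after(C, i, j, n)
        for j, n in enumerate(last_use_seqnos)
    )


if __name__ == "__main__":
    C = new_matrix()
    last_use = [0, 2, 0]  # domain 1 last used the buffer in section 2
    print(needs_sync(C, 0, last_use))  # no ordering recorded yet
    C[0][1] = 2  # record that domain 0 is now ordered after domain 1 up to seqno 2
    print(needs_sync(C, 0, last_use))
```

The check walks one row of the matrix, so it is linear in the number of domains, which matches the commit's remark that operations are O(n) or O(1) in the (essentially constant) number of caching domains.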
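
The allocation arithmetic in commit 2c82b13c8f can be checked with a short Python sketch. The 65552-byte request, the 81920-byte bucket, and the 65536-byte target come from the commit message; the 16-byte value of BATCH_RESERVED is inferred from 65552 minus the 64kB BATCH_SZ of commit 382f92a814, and the rest of the bucket list is an assumption made for illustration rather than the allocator's actual table.

```python
# Values taken or derived from the commit messages above.
OLD_BATCH_SZ = 64 * 1024          # 65536, per "Increase BATCH_SZ to 64kB"
BATCH_RESERVED = 65552 - 65536    # 16, inferred from the quoted request size
NEW_BATCH_SZ = OLD_BATCH_SZ - BATCH_RESERVED

# Hypothetical bucket sizes around 64kB. Only 81920 is named in the
# commit message; the other entries are assumed for this sketch.
BUCKETS = [65536, 81920, 98304, 114688, 131072]


def bucket_for(size):
    """Return the smallest bucket that can hold a request of `size` bytes."""
    for bucket in BUCKETS:
        if size <= bucket:
            return bucket
    raise ValueError("request larger than the largest sketched bucket")


def wasted_fraction(size):
    """Fraction of the chosen bucket left unused by the request."""
    bucket = bucket_for(size)
    return (bucket - size) / bucket


if __name__ == "__main__":
    for batch_sz in (OLD_BATCH_SZ, NEW_BATCH_SZ):
        request = batch_sz + BATCH_RESERVED
        print(request, bucket_for(request), round(wasted_fraction(request), 4))
```

With the old size the request spills into the next bucket and leaves roughly a fifth of it unused; with the reduced BATCH_SZ the request fits the 65536-byte bucket exactly.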

58 Commits