Commit Graph

86 Commits

Author SHA1 Message Date
Dave Airlie 127bcbed18 gallivm/st/lvp: add flags arg to get_query_result_resource api.
Currently this just has wait, but in order to get the right answer
for vulkan partial, lavapipe/llvmpipe need to pass a partial flag
through here in the future.

This just changes the API so that's possible.

v2: use an enum (zmike)

Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15009>
2022-02-15 10:12:01 +10:00
Paulo Zanoni b2c811ccdc iris: syncobjs are now owned by bufmgr instead of screen
The next patches will justify the new ownership. We want the BOs to
have references on the batches' syncobjs so we can implement implicit
tracking. In other words: BOs will be able to wait on syncobjs owned
by different screens. Since our syncobjs are actually just a Kernel
handle with a refcount, they can be used globally and it makes more
sense to map them to the bufmgr, just like the BOs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12363>
2021-09-01 21:48:13 +00:00
Nanley Chery 2944f49610 intel: Parse INTEL_NO_HW for devinfo construction
This commit does several things:

* Unify code common to several drivers by evaluating INTEL_NO_HW within
  intel_get_device_info_from_fd (suggested by Jordan).
* For drivers that keep a copy of the intel_device_info struct, a
  separate copy of the no_hw field is now unnecessary. Remove them.
* Minimize kernel queries when INTEL_NO_HW is true. This is done for
  code simplification, but we may find reason to undo this later on.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>
2021-08-24 00:12:47 +00:00
Anuj Phogat 61e8636557 intel: Rename gen_device prefix to intel_device
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "gen_device" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_device/intel_device/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:33 +00:00
Anuj Phogat 9da8a55b08 intel: Rename GEN_GEN macro to GFX_VER
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "GEN_GEN" -rIl $SEARCH_PATH | xargs sed -ie "s/GEN_GEN/GFX_VER/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>
2021-04-02 18:33:06 +00:00
Lionel Landwerlin 33bc2977e5 intel/mi_builder: use device info to use the right CS prefetch size
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9679>
2021-03-18 20:08:45 +00:00
Kenneth Graunke 1b1c857248 iris: Make various classes inherit from u_threaded_context base classes
u_threaded_context requires various objects to inherit from a new
threaded_foo base class rather than directly from pipe_foo.  This
patch does most of the mechanical changes required for that.

It also initializes the new threaded_resource fields.

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
2021-03-04 13:59:21 -08:00
Jason Ekstrand 1e53e0d2c7 intel/mi_builder: Drop the gen_ prefix
mi_ is already a unique prefix in Mesa so the gen_ isn't really gaining
us anything except extra characters.  It's possible that MI_ may
conflict a tiny bit with GenXML but it doesn't seem to be a problem
today and we can deal with that in the future if it's ever an issue.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9393>
2021-03-04 15:14:27 +00:00
Nanley Chery 04ac3a6620 iris: Delete iris_resolve_conditional_render
This function has no more users.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7762>
2021-01-08 23:23:31 +00:00
yshi18 3aaac40b12 iris: fix memleak for query_buffer_uploader
In the Chrome WebGL Aquarium  stress test, 20 instances of Chrome will run
Aquarium  simultaneously over 20+ hours. That causes Chrome crash.
During the stress, glBeginQueryIndexed is called frequently.

1.Each query will only use 32 bytes from query_buffer_uploader. After the offset
exceed 4096, it will alloc new buffer for query_buffer_uploader->buffer
and release the old buffer.

2.But iris_begin_query will call u_upload_alloc when the offset changed, and it
will increase the query_buffer_uploader->buffer->reference.count every time
when it called u_upload_alloc.

3.So when u_upload_release_buffer try to release the resource of
query_buffer_uploader->buffer, its reference.count is
already equal to 129. pipe_reference_described will only decrease its reference
count to 128.So it never called old_dst->screen->resource_destroy.

4.The old resouce bo will never be freeed. And chrome will called mmap every time
when it alloc new resource bo.

5. Chrome process map too many vmas in its process. Its map count exceed the
sysctl_max_map_count which is 65530 defined in kernel.

6. When iris_begin_query want to alloc new resource bo, it will meet NULL pointer
because mmap return failed. Finally chrome crashed when it access this NULL resource
bo.

The fix is decrease the reference count in iris_destroy_query.

Patch is verified by chrome webgl Aquarium test case for more than 72 hours.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Yang Shi <yang.a.shi@intel.com>
Reviewed-by: Alex Zuo <alex.zuo@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7890>
2020-12-08 11:48:53 +00:00
Francisco Jerez ae88e79f69 iris: Drop redundant iris_address::write flag.
The write flag is redundant since it can be inferred easily from the
iris_address::access domain.  This allows the iris_address struct to
be laid out more efficiently in memory, leading to a measurable
improvement in several Piglit Drawoverhead test-cases.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>
2020-06-03 23:12:22 +00:00
Francisco Jerez eb5d1c2722 iris: Annotate all BO uses with domain and sequence number information.
Probably the most annoying patch to review from the whole series --
Mark every buffer object use as accessed through some caching domain
with the sequence number of the current synchronization section of the
batch.  The additional argument of iris_use_pinned_bo() makes sure I'd
have gotten a compile error if I had missed any buffer added to the
batch validation list.

There are only a few exceptions where a buffer is left untracked while
adding it to the validation list, justified below:

 - Batch buffers: These are strictly read-only for the moment.

 - BLORP buffer objects: Their seqnos are bumped manually at the end
   of iris_blorp_exec() instead, in order to avoid plumbing domain
   information through BLORP address combining.

 - Scratch buffers: The contents of these are strictly thread-local.

 - Shader images and SSBOs: Accesses of these buffers are explicitly
   synchronized at the API level.

v2: Opt out of tracking more aggressively (Ken): In addition to the
    above, surface states, binding tables, instructions and most
    dynamic states are now left untracked, which means a *lot* more BO
    uses marked IRIS_DOMAIN_NONE which need to be reviewed extremely
    carefully, since the cache tracker won't be able to provide any
    coherency guarantees for them.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>
2020-06-03 23:12:22 +00:00
Francisco Jerez e81c07de41 iris: Bracket batch operations which access memory within sync regions.
This delimits all batch operations which access memory between
iris_batch_sync_region_start() and iris_batch_sync_region_end() calls.
This makes sure that any buffer objects accessed within the region are
considered in use through the same caching domain until the end of the
region.

Adding any buffer to the batch validation list outside of a sync
region will lead to an assertion failure in a future commit, unless
the caller explicitly opted out of the cache tracking mechanism.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>
2020-06-03 23:12:22 +00:00
Francisco Jerez 46183a999b iris: Extend iris_context dirty state flags to 128 bits.
We're nearly out of dirty bits, and some patches pending review on
GitLab no longer apply due to that.  Make room for them by splitting
off shader stage-specific bits into a separate stage_dirty mask.

An alternative would be to split compute-related bits into a separate
mask, but that would prevent the '<< stage' indexing done in various
parts of the driver from working.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5279>
2020-06-03 22:22:19 +00:00
Kenneth Graunke 4a1ed75b85 iris: Rename iris_syncpt to iris_syncobj for clarity.
This is just a refcounted wrapper around a drm_syncobj.  There is
enough terminology going on in the area of synchronization (sync
objects, sync files, ...) that I'd rather not invent our own.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
2020-05-01 19:00:02 +00:00
Mike Blumenkrantz 91375f13ce iris: move iris_vtable to iris_screen
instead of inlining this into every context, now a struct is used in the screen
struct to reduce memory usage and simplify a couple of the methods

Closes: https://gitlab.freedesktop.org/kwg/mesa/-/issues/6
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4376>
2020-04-29 16:59:45 +00:00
Lionel Landwerlin 33b9c7a7f6 intel/perf: break GL query stuff away
This stuff is somewhat specific to the GL extension & drivers. On
Vulkan we won't use this, it also made a rather large file.

v2: Fix Android build (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
2020-03-27 14:14:49 +00:00
Danylo Piliaiev f3ca47d9f3 iris/query: Implement PIPE_QUERY_GPU_FINISHED
Implementation is similar to radeonsi in 5f1cef76

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-06 12:43:14 +02:00
Kenneth Graunke 0f3768bc5d iris: Free query on error path
CID: 1452276
2019-08-11 14:04:31 -07:00
Mark Janes 9c597514d4 iris/query: enable amd performance monitors
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:34 -07:00
Mark Janes aca42759ff iris/perf: implement iris_create_monitor_object
This is the first call that provides the iris context to the monitor
implementation.  On the first call, use the iris context to initialize
the monitor context.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:14 -07:00
Mark Janes 2446f5cfd8 intel/perf: move perf-related constants to common location
The perf subsystem needs several macro definitions that were
duplicated in Iris and i965 headers.  Place these macros within perf,
if the perf implementation contains the only references to the values.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Kenneth Graunke 0e24d10ff5 iris: Use gen_mi_builder to handle CS ALU operations.
In a few cases, we switch to MI_MATH instead of MI_PREDICATE,
just because we were already doing math and it's easier to chain
together.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-25 18:42:55 +00:00
Kenneth Graunke fe7ed6b057 iris: Make iris_query.c a genxml-compiled file.
This will let us use Jason's new MI-builder shortly.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-25 18:42:55 +00:00
Kenneth Graunke 975f7e4a59 iris: Move iris_resolve_conditional_render to the vtable.
It's going to be in genxml code shortly.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-25 18:42:55 +00:00
Ilia Mirkin 0e30c6b8a7 gallium: switch boolean -> bool at the interface definitions
This is a relatively minimal change to adjust all the gallium interfaces
to use bool instead of boolean. I tried to avoid making unrelated
changes inside of drivers to flip boolean -> bool to reduce the risk of
regressions (the compiler will much more easily allow "dirty" values
inside a char-based boolean than a C99 _Bool).

This has been build-tested on amd64 with:

Gallium drivers: nouveau r300 r600 radeonsi freedreno swrast etnaviv v3d
                 vc4 i915 svga virgl swr panfrost iris lima kmsro
Gallium st:      mesa xa xvmc xvmc vdpau va

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 22:13:51 -04:00
Kenneth Graunke 1d5ee31553 iris: Drop copy and pasted iris_timebase_scale
Lionel moved brw_timebase_scale to gen_device_info_timebase_scale a few
months ago, so we should just use that, and not our own copy in iris.
2019-07-16 17:22:48 -07:00
Kenneth Graunke 712ac83033 iris: Simplify devinfo access in calculate_result_on_gpu()
We have devinfo, no need for screen->devinfo.
2019-07-12 00:33:19 -07:00
Kenneth Graunke ab009b7d6e iris: Fix overzealous query object batch flushing.
In the past, each query object had their own BO.  Checking if the batch
referenced that BO was an easy way to check if commands were still
queued to compute the query value.  If so, we needed to flush.

More recently (c24a574e6c), we started using an u_upload_mgr for query
objects, placing multiple queries in the same BO.  One side-effect is
that iris_batch_references is a no longer a reasonable way to check if
commands are still queued for our query.  Ours might be done, but a
later query that happens to be in the same BO might be queued.  We don't
want to flush in that case.

Instead, check if the current batch's signalling syncpt is the one we
referenced when ending the query.  We know the syncpt can't have been
reused because our query is holding a reference, so a simple pointer
comparison should suffice.

Removes all batch flushing caused by query objects in Shadow of Mordor.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2019-06-26 09:49:01 -07:00
Kenneth Graunke d4a4384b31 iris: Implement INTEL_DEBUG=pc for pipe control logging.
This prints a log of every PIPE_CONTROL flush we emit, noting which bits
were set, and also the reason for the flush.  That way we can see which
are caused by hardware workarounds, render-to-texture, buffer updates,
and so on.  It should make it easier to determine whether we're doing
too many flushes and why.
2019-06-20 13:32:15 -05:00
Kenneth Graunke bbbf7a538c iris: Bail on queries for INTEL_NO_HW=1.
We don't execute any of the commands to record snapshots, so we can't
actually produce a real result.  We do however need to avoid waiting
on a syncpt which will never be signalled.  So, just return 0.
2019-06-19 11:55:43 -05:00
Illia Iorin a35269cf44 iris: Implement ARB_indirect_parameters
iris_draw_vbo is divided into two functions to remove unnecessary
operations from the loop. This implementation of ARB_indirect_parameters
takes into account NV_conditional_render by saving MI_PREDICATE_RESULT
at the start of a draw call and restoring it at the end also the result
of NV_conditional_render is taken into account when computing predicates
that limit draw calls for ARB_indirect_parameters in a similar way
to 1952fd8d in ANV.

v2: Optimize indirect draws (suggested by Kenneth Graunke)
v3: (by Kenneth Graunke)
 - Fix an issue where indirect draws wouldn't set patch information
   before updating the compiled TCS.
 - Move some code back to iris_draw_vbo to avoid duplicating it.
 - Fix minor indentation issues.

Signed-off-by: Illia Iorin <illia.iorin@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-11 23:56:52 -07:00
Kenneth Graunke 2208d5a683 iris: Fix DrawTransformFeedback math when there's a buffer offset
We need to subtract the starting offset from the final offset before
dividing by the stride.  See src/intel/vulkan/genX_cmd_buffer.c:3142.

Not known to fix anything.
2019-04-23 15:57:07 -07:00
Kenneth Graunke 8d9e169bdd iris: Save/restore MI_PREDICATE_RESULT, not MI_PREDICATE_DATA.
MI_PREDICATE_DATA is an intermediate storage for the MI_PREDICATE
command's calculations - it holds the result of the subtraction when
the compare operation is SRCS_EQUAL or DELTAS_EQUAL.  But the actual
result of the predication is MI_PREDICATE_RESULT, which is what we
want to copy from the render context to the compute context.
2019-04-04 11:41:10 -07:00
Rafael Antognolli ce830a364e iris: Add iris_resolve_conditional_render().
This function can be used to stall on the CPU and resolve the predicate
for the conditional render. It will convert ice->state.predicate from
IRIS_PREDICATE_STATE_USE_BIT to either IRIS_PREDICATE_STATE_RENDER or
IRIS_PREDICATE_STATE_DONT_RENDER, depending on the result of the query.

v2:
 - return void (Ken)
 - update the stored condition (Ken)
 - simplify the code leading to resolve the predicate (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Kenneth Graunke 1cd001aa63 iris: Make a iris_batch_reference_signal_syncpt helper function.
Suggested by Chris Wilson.  More obvious what's going on.
2019-02-21 10:26:11 -08:00
Kenneth Graunke 9376799bd6 iris: Use READ_ONCE and WRITE_ONCE for snapshots_landed
Suggested by Chris Wilson, if only to make it obvious to the human
readers that these are volatile reads.  It may also be necessary for
the compiler in a few cases.
2019-02-21 10:26:11 -08:00
Kenneth Graunke 18e31a9b31 iris: Fix accidental busy-looping in query waits
When switching from bo_wait to sync-points, I missed that we turned an
if (not landed) bo_wait into a while (not landed) check_syncpt(), which
has a timeout of 0.  This meant, rather than sleeping until the batch
is complete, we'd busy-loop, continually asking the kernel "is the batch
done yet???".  This is not what we want at all - if we wanted a busy
loop, we'd just loop on !snapshots_landed.  We want to sleep.

Add an effectively infinite timeout so that we sleep.
2019-02-21 10:26:11 -08:00
Kenneth Graunke 3b1ac8244e iris: Add a timeout_nsec parameter, rename check_syncpt to wait_syncpt
I want to be able to wait with a non-zero timeout from elsewhere.
2019-02-21 10:26:11 -08:00
Sagar Ghuge c24a574e6c iris: Don't allocate a BO per query object
Instead of allocating 4K BO per query object, we can create a large blob
of memory and split it into pieces as required.

Having one BO for multiple query objects, we don't want to wait on all
of them, instead when we write last snapshot, we create a sync point, and
check syncpoints while waiting on particular object.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-02-21 10:26:11 -08:00
Kenneth Graunke 5d3d757178 iris: Zero the compute predicate when changing the render condition
1. Set a render condition.  We emit it immediately on the render
   engine, and stash q->bo as ice->state.compute_predicate in case
   the compute engine needs it.

2. Clear the render condition.  We were incorrectly leaving a stale
   compute_predicate kicking around...

3. Dispatch compute.  We would then read the stale compute predicate,
   and try to load it into MI_PREDICATE_DATA.  But q->bo may have been
   freed altogether, causing us to try and use garbage memory as a BO,
   adding it to the validation list, failing asserts, and tripping
   EINVALs in execbuf.

Huge thanks to Mark Janes for narrowing this sporadic GL CTS failure
down to a list of 48 tests I could easily run to reproduce it.  Huge
thanks to the Valgrind authors for the memcheck tool that immediately
pinpointed the problem.
2019-02-21 10:26:11 -08:00
Kenneth Graunke 1f91f688e8 iris: Switch to the new PIPELINE_STATISTICS_QUERY_SINGLE capability
I had a hack in place earlier to pass the query type as q->index
for the regular statistics query, but we ended up adjusting the
interface and adding a new query type.  Use that instead, fixing
pipeline statistics queries since the rebase.
2019-02-21 10:26:11 -08:00
Kenneth Graunke a23c06cabc iris: Use new PIPE_STAT_QUERY enums rather than hardcoded numbers. 2019-02-21 10:26:11 -08:00
Kenneth Graunke 5aef30b886 iris: Fix Broadwell WaDividePSInvocationCountBy4
We were dividing by 4 in calculate_result_on_gpu(), and also in
iris_get_query_result().  We should stop doing the latter, and instead
divide by 4 in calculate_result_on_cpu() as well.

Otherwise, if snapshots were available, and you hit the
calculate_result_on_cpu() path, but requested it be written to a QBO,
you'd fail to get a divide.
2019-02-21 10:26:11 -08:00
Dave Airlie df60241ff7 iris: handle qbo fragment shader invocation workaround 2019-02-21 10:26:11 -08:00
Dave Airlie 5ae2e5aa94 iris: add fs invocations query workaround for broadwell 2019-02-21 10:26:11 -08:00
Kenneth Graunke 3ab3aa23c2 iris: Add a more long term TODO about timebase scaling 2019-02-21 10:26:11 -08:00
Dave Airlie aaaf611130 iris: fix gpu calcs for timestamp queries 2019-02-21 10:26:11 -08:00
Kenneth Graunke 5307ff6a5f iris: Implement DrawTransformFeedback()
We get the count by dividing the offset by the stride.
2019-02-21 10:26:11 -08:00
Jason Ekstrand 2e103fff63 iris: Copy anv's MI_MATH helpers for multiplication and division
(import done by Ken but with author set to Jason because it's his
code that's being imported, so he deserves the credit)
2019-02-21 10:26:11 -08:00