Commit Graph

114165 Commits

Author SHA1 Message Date
Iago Toral Quiroga fb9f7872e7 v3d: handle wait requirement when retrieving query results correctly
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-08 08:36:52 +02:00
Iago Toral Quiroga 0f2d1dfe65 v3d: use the GPU to record primitives written to transform feedback
We can use the PRIMITIVE_COUNTS_FEEDBACK packet to write various primitive
counts to a buffer, including the number of primives written to transform
feedback buffers, which will handle buffer overflow correctly.

There are a couple of caveats with this:

Primitive counters are reset when we emit a 'Tile Binning Mode Configuration'
packet, which can happen in the middle of a primitives query, so we need to
read the buffer when we submit a job and accumulate the counts in the context
so we don't lose them.

We also need to do the same when we switch primitive type during transform
feedback so we can compute the correct number of recorded vertices from
the number of primitives. This is necessary so we can provide an accurate
vertex count for draw from transform feedback.

v2:
 - When computing the number of vertices for a primitive, pass in the base
   primitive, since that is what the hardware will count.
 - No need to update primitive counts when switching primitive types if
   the base primitives are the same.
 - Log perf warning when mapping the primitive counts BO for readback (Eric).
 - Only emit the primitive counts packet once at job end (Eric).
 - Use u_upload mechanism for the primitive counts buffer (Eric).
 - Use the XML to generate indices into the primitive counters buffer (Eric).

Fixes piglit tests:
spec/ext_transform_feedback/overflow-edge-cases
spec/ext_transform_feedback/query-primitives_written-bufferrange
spec/ext_transform_feedback/query-primitives_written-bufferrange-discard
spec/ext_transform_feedback/change-size base-shrink
spec/ext_transform_feedback/change-size base-grow
spec/ext_transform_feedback/change-size offset-shrink
spec/ext_transform_feedback/change-size offset-grow
spec/ext_transform_feedback/change-size range-shrink
spec/ext_transform_feedback/change-size range-grow
spec/ext_transform_feedback/intervening-read prims-written

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-08 08:36:52 +02:00
Iago Toral Quiroga cf8986bce0 gallium/util: add a helper to compute vertex count from primitive count
v2:
  - Only compute vertex counts for base primitives.
  - Add a unit test (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-08 08:36:52 +02:00
Iago Toral Quiroga 9eb8699e0f v3d: be more explicit about the query types supported
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-08 08:36:52 +02:00
Iago Toral Quiroga 9b316ab57a v3d: generate packet unpack functions
These were not being compiled because of the lack of __gen_unpack_address.

v2:
 - Shift raw address correctly (Eric).

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-08 08:36:52 +02:00
Iago Toral Quiroga 5ffb8b1716 v3d: add header guards in v3d_packet_helpers.h
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-08 08:36:52 +02:00
Tomeu Vizoso e7eac8a1e8 panfrost: Print errors from kernel
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 07:42:52 +02:00
Tomeu Vizoso 7c8434889d panfrost: Mark buffers as PANFROST_BO_HEAP
What we call GROWABLE in Mesa corresponds to the HEAP BO flag in the
kernel. These buffers cannot be memory mapped in the CPU side at the
moment, so make sure they are also marked INVISIBLE.

This allows us to allocate a big heap upfront (16MB) without actually
reserving space unless it's needed.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 07:42:52 +02:00
Tomeu Vizoso 19afd41e65 panfrost: Mark BOs as NOEXEC
Unless a BO has the EXECUTABLE flag, mark it as NOEXEC.

v2: - Rework version detection (Alyssa).

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 07:42:52 +02:00
Tomeu Vizoso 9398932c2d panfrost: Take into account flags when looking up in the BO cache
This will be useful right now so we avoid retrieving a non-executable
buffer when a executable one is needed.

As we support more flags, this logic will need to be extended to
consider the different trade-offs to be made when matching BO
specifications to BOs in the cache.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 07:42:52 +02:00
Tomeu Vizoso 950b5fc596 panfrost: Allocate shaders in their own BOs
Instead of all shaders being stored in a single BO, have each shader in
its own.

This removes the need for a 16MB allocation per context, and allows us
to place transient blend shaders in BOs marked as executable (before
they were allocated in the transient pool, which shouldn't be
executable).

v2: - Store compiled blend shaders in a malloc'ed buffer, to avoid
      reading from GPU-accessible memory when patching (Alyssa).
    - Free struct panfrost_blend_shader (Alyssa).
    - Give the job a reference to regular shaders when emitting
      (Alyssa).

v3: - Split out the allocation flags change (Rob).

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 07:42:52 +02:00
Tomeu Vizoso 5804d75b9c util/hash_table: Fix hashing in clears on 32-bit
Some hash functions (eg. key_u64_hash) will attempt to dereference the
key, causing an invalid access when passed DELETED_KEY_VALUE (0x1) or
FREED_KEY_VALUE (0x0).

When in 32-bit arch a 64-bit key value doesn't fit into a pointer, so
hash_table_u64 internally use a pointer to a struct containing the
64-bit key value.

Fix _mesa_hash_table_u64_clear() to handle the 32-bit case by creating a
temporary hash_key_u64 to pass to the hash function.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-08-08 07:42:52 +02:00
Tapani Pälli aba57b11ee anv: support GetSwapchainGrallocUsage2ANDROID for Android
New function supports gralloc1 usage flags that get set separately
for producer and consumer. As we still need to support old method too,
let's share common code and use android_convertGralloc0To1Usage helper.
Bump the VK_ANDROID_native_buffer version to indicate support for the
new call.

Changes were tested on Android Celadon P with Basemark GPU and various
Sascha Willems Vulkan demos.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-08 05:08:01 +00:00
Mark Janes 51c3ab618b st/mesa: eliminate unnecessary redirection
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes 61c54a8878 intel/perf: fix debug typo
Misspelling was seen with INTEL_DEBUG=perfmon.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes 2df1ab4d48 intel/perf: make gen_perf_query_object private
Encapsulate the details of this structure within the perf implemenation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes deea3798b6 intel/perf: make perf context private
Encapsulate the details of this data structure.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes 1f4f421ce0 intel/perf: print debug information
INTEL_DEBUG=perfmon will iterate over the perf queries, printing
information about the state of each query.  Some of this information
will be private to intel/perf, and needs to a dump routine that can be
called from i965.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes a663c8c26e intel/perf: make internal methods private
Now that all references from i965 have been moved to perf, we can make
internal methods private again.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes be8b466cff intel/perf: make oa_sample_buffers private
All references to this data structure have been moved inside the perf
subsystem.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes f2a049b4e3 intel/perf: expose method to create query
By encapsulating this implementation within perf, we can eventually
make struct gen_perf_ctx private.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes 9f5c160d82 intel/perf: move initialization of pipeline statistics metrics to gen_perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes 9f84efb452 intel/perf: move get_query_data into gen_perf
This refactor moves several helper functions for get_query_data as
well:

 - accumulate_oa_reports
 - read_gt_frequency
 - get_pipeline_stats_data
 - get_oa_counter_data

Functions which are no longer referenced in brw_performance_query.c
have been removed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes 73eccdc4a5 intel/perf: move delete_query to gen_perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes 8c9eac1234 intel/perf: move is_query_ready to gen_perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes a9be292722 intel/perf: move wait_query to perf
The following methods have duplicate implementation of read_oa_samples_until in
brw_performance_query.c:

 - read_oa_samples_for_query
 - read_oa_samples_until

They ar still referenced by other methods in the file and will be
removed on the subsequent commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes 3c8ed58486 intel/perf: create a vtable entry for bo_busy
Iris and i965 variants of this method need to be called by perf
routines.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes 6fed756388 intel/perf: create a vtable entry for bo_wait_rendering
Iris and i965 variants of this method need to be called by perf
routines.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes 511bb15d4b intel/perf: create a vtable entry for batch_references
Iris and i965 variants of this method need to be called by perf
routines.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes 3ecb23092e intel/perf: refactor gen_perf_end_query into gen_perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes 018f9b81e5 intel/perf: refactor gen_perf_begin_query into gen_perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes 52d3db9ab6 intel/perf: move perf-related state into gen_perf_context
To move more operations into intel/perf, several state items are
needed.  Save references to that state in the perf_ctxt, rather than
passing them in for every operation.

This commit includes an initializer for gen_perf_context, to set those
references and also encapsulate the initialization of the sample
buffer state.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes df18acee78 intel/perf: create a vtable entries for buffer object map/unmap
These operations are needed to refactor subsequent methods into perf

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes a330d759c5 intel/perf: move client reference counts into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes 4d0d4aa1b5 intel/perf: move open_perf into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes 79ded7cc8f intel/perf: move close_perf into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes f57c8a6dc1 intel/perf: create a vtable entry for emit_mi_flush
This method is needed to move subsequent methods into perf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes 52f7a0bff7 intel/perf: use temporary pointers to simplify access to perf state
Most accesses to perf state were made through repeated dereferences of
brw_context members.  Prefering temporary variables of perf_ctx and
perf_cfg has the following advantages:

 - more concise implementation
 - easier refactor when moving subsequent methods to perf

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes a157f5acb1 intel/perf: move snapshot_statistics_registers into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes 8ae6667992 intel/perf: move query_object into perf
Query objects can now be encapsulated within the perf subsystem.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes 7e890ed476 intel/perf: create a vtable entry for store_register_mem64
This method is needed to move subsequent methods into perf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes 4b2c885207 intel/perf: move free_sample_bufs into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes 2f712d21b9 intel/perf: move reap_old_sample_buffers into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes 31758bd36c intel/perf: move get_free_sample_buf into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes e08a69b7f4 intel/perf: move the perf context into perf
The "context" that is necessary to submit and process perf commands to
the hardware was previously present in the brw_context.perfquery
struct.  This commit moves it into perf and provides a more
understandable name.

The intention is for this struct to be private, when all methods that
access it are migrated into perf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes fb622054f7 intel/perf: move get_metric_id to perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes b14e15e26a intel/perf: move oa_sample_buf structure to perf
oa_sample_buf holds the data provided by the kernel that will be
collated into performance metrics.  Since this functionality will be
implemented in perf, the struct needs to be defined there.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes e091f33990 intel/perf: enumerate query-based metrics in perf
Iris and i965 both need to enumerate the available metrics, so these
routines must be located in perf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes 2446f5cfd8 intel/perf: move perf-related constants to common location
The perf subsystem needs several macro definitions that were
duplicated in Iris and i965 headers.  Place these macros within perf,
if the perf implementation contains the only references to the values.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes 67675a5802 intel/perf: create a vtable entry for capture_frequency_stat_register
In preparation for calling both Iris and i965 implementions from perf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00