Commit Graph

90003 Commits

Author SHA1 Message Date
Grazvydas Ignotas 274aaa331c util/disk_cache: check rename result
I haven't seen this causing problems in practice, but for correctness
we should also check if rename succeeded to avoid breaking accounting
and leaving a .tmp file behind.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-20 08:24:46 +11:00
Grazvydas Ignotas 67911fa4b8 util/disk_cache: delete .tmp if target exists
At the time of target file check, .tmp file is already created and file
lock is held, so we should remove the .tmp, like in other error paths.

With this, piglit no longer leaves large amount of empty .tmp files
behind, which waste directory entries and may interfere with eviction.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-20 08:24:38 +11:00
Grazvydas Ignotas bd93cea691 util/disk_cache: fix stored_keys index
It seems there is a bug because:
- 20 bytes are compared, but only 1 byte stored_keys step is used
- entries can overlap each other by 19 bytes
- index_mmap is ~1.3M in size, but only first 64K is used

With this fix for Deus Ex:
- startup time (from launch to Feral logo): ~38s -> ~16s
- disk_cache_has_key() hit rate: ~50% -> ~96%

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-20 08:14:31 +11:00
Ilia Mirkin 663e7c25f5 nv30: create uploader after pipe->screen is set
Fixes crashes after recent upload rework.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-19 01:24:06 -04:00
Ilia Mirkin 0e9232dbcc nv50,nvc0: enable TEX_LZ and TXF_LZ
There should be minimal gain, if any, for nvc0, but nv50 may end up
noticing more often that the lod argument is uniform. This, in turn,
will remove the need for some unnecessary transformations, which were
being hit due to the checks being done pre-ssa.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-18 20:37:52 -04:00
Ilia Mirkin dab88e9af7 st/mesa: set result writemask based on ir type
This prevents textureQueryLevels, which maps as LODQ, from ending up
with a xyzw writemask, which is illegal.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100061
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-18 20:16:45 -04:00
Karol Herbst 09f16de7e6 nvc0/ir: treat FMA like MAD for operand propagation
Helps mainly Feral-ported games, due to their use of fma()

shader-db changes:
total instructions in shared programs : 3901147 -> 3842505 (-1.50%)
total gprs used in shared programs    : 471258 -> 467359 (-0.83%)
total local used in shared programs   : 27405 -> 27361 (-0.16%)
total bytes used in shared programs   : 35749888 -> 35214176 (-1.50%)

                local        gpr       inst      bytes
    helped          17        1829        4091        4091
      hurt           4          44           3           3

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-03-18 20:15:45 -04:00
Alan Swanson a7eb7984bf util/disk_cache: pass predicate functions file stats directly (v4)
Since switching to LRU eviction the only user of these predicate
functions now resolves directory entry stats itself so pass them
directly saving calling fstat and strlen twice (and the
expensive strlen is skipped entirely if access time is newer).

v2: Update for empty cache dir detection changes
v3: Fix passing string length to predicate with the +1 for NULL
    termination and also pass sb as pointer
v4: Missed ampersand for passing sb as pointer

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-18 14:32:57 +11:00
Timothy Arceri bf8bc6190e glsl: use set for copy propagation kills
Previously each time we saw a variable we just created a duplicate
entry in the list. This is particularly bad for loops were we add
everything twice, and then throw nested loops into the mix and the
list was growing expoentially.

This stops the glsl-vs-unroll-explosion test which has 16 nested
loops from reaching the tests mem usage limit in this pass. The
test now hits the mem limit in opt_copy_propagation_elements()
instead.

I suspect this was also part of the reason this pass can be so
slow with some shaders.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-03-18 14:21:09 +11:00
Timothy Arceri 9e42b93f33 st/dri: wait for thread to finish before unbinding context
Fixes a bunch of piglit crashes that hit an assert() when trying
to delete the framebuffer. The assert() was triggered because
WinSysDrawBuffer was set to NULL before glDeleteFramebuffers()
was called.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-18 14:15:52 +11:00
Timothy Arceri 40bc1afc94 glsl: don't leak memory when trying to count loop iterations
Suggested-by: Damian Dixon <damian.dixon@gmail.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99789
2017-03-18 14:12:40 +11:00
Jason Ekstrand 1d5f4f46da genxml: Make MI_STORE_DATA_IMM have a single 64-bit data field
This is way more convenient than having two separate dword fields.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 15:31:19 -07:00
Jason Ekstrand ced61fd53e anv: Turn on inherited queries
It all just works since it's just a hardware register so we might as
well turn it on.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Ilia Mirkin e675f57d4f anv: Implement pipeline statistics queries
In the end, pipeline statistics queries look a lot like occlusion
queries only with between 1 and 11 begin/end pairs being generated
instead of just the one.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand dda54890f3 anv: Disable VF statistics for blorp and SOL memcpy
In order to get accurate statistics, we need to disable statistics for
blits, clears, and the surface state memcpy at the top of each secondary
command buffer.  There are two possible approaches to this:

 1) Disable before the blit/memcpy and re-enable afterwards

 2) Move emitting 3DSTATE_VF_STATISTICS from initialization and make it
    part of pipeline state and then just disabale statistics before
    blits and memcpy operations.

Emitting 3DSTATE_VF_STATISTICS should be fairly cheap so it doesn't
really matter which path we take.  We choose the second option as it's
more consistent with the way the rest of the statistics are enabled and
disabled.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand 9576cea519 anv/pipeline: Enable clipper statistics
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand 2a616242cd genxml: s/Clipper Statistics Enable/Statistics Enable/
It's in 3DSTATE_CLIP, so it doesn't really need the extra detail.  This
matches what we do for VS, FS, etc.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand 149d10d38a anv/query: Rework store_query_result
The new version is a nice GPU parallel to cpu_write_query_result and it
nicely handles things like dealing with 32 vs. 64-bit offsets in the
destination buffer.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand c773ae88df anv/query: Break GPU query calculation into a helper
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand 7de73f0c94 genxml: Add pipeline statistics registers on gen7+
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand 0557dfdb4a anv/query: Add a helper for writing a query pool result
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand bce4a935c6 anv/query: Use a variable-length slot size
Not all queries are the same.  Even the two queries we support today
require a different amount of data per slot.  Once we introduce pipeline
statistics queries, the size will vary wildly.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:49 -07:00
Jason Ekstrand 1c797af2c6 anv/query: Move the available bits to the front
We're about to make slots variable-length and always having the
available bits at the front makes certain operations substantially
easier once we do that.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:47 -07:00
Jason Ekstrand 9d43afa3dc anv/query: Let 32-bit values wrap
From the Vulkan 1.0.39 Specification:

   "If VK_QUERY_RESULT_64_BIT is not set and the result overflows a
   32-bit value, the value may either wrap or saturate."

So we can either clamp or wrap.  Wrapping is both easier and what the
user gets if they use vkCmdCopyQueryPoolResults and we should be
consistent.  We could make vkCmdCopyQueryPoolResults clamp but it's
annoying and ends up burning extra batch for something the spec clearly
doesn't require.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:11:35 -07:00
Alex Deucher c2a97fb7ae radeonsi: add new polaris12 pci id
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-17 14:13:17 -04:00
Marek Olšák 4b064d16e5 gallium/radeon: formalize that create_batch_query doesn't need pipe_context
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Marek Olšák be6173e7d6 gallium/radeon: formalize that create_query doesn't need pipe_context
for threaded gallium

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Marek Olšák 04e6977e5d gallium/radeon: reference pipe_resource in pipe_transfer
for threaded gallium

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Marek Olšák 03127bb6d5 radeonsi: compile all TGSI compute shaders asynchronously
required by threaded gallium

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Marek Olšák e9c6953ddb radeonsi: require that compiler threads are enabled
threaded gallium can't use pipe_context's LLVM target machine, because
create_shader_selector can be called from a non-driver thread.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Marek Olšák 080f322f06 trace: remove leftover assertions after pipe_resource wrapping removal 2017-03-17 18:30:21 +01:00
Marek Olšák 6c0a28084d gallium/u_upload: make the first persistent mapping unsynchronized
This is simpler for drivers.
2017-03-17 18:30:21 +01:00
Robert Bragg a27b62e794 anv/device: init timestampPeriod from devinfo
Now that there's a timebase_scale in gen_device_info which is
effectively the 'period' this switches anv_GetPhysicalDeviceProperties
to using this common device info to initialize the timestampPeriod
device limit.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-17 16:10:22 +00:00
Robert Bragg 344d1a4015 i965: Allow a per gen timebase scale factor
Prior to Skylake the Gen HW timestamps were driven by a 12.5MHz clock
with the convenient property of being able to scale by an integer (80)
to nanosecond units.

For Skylake the frequency is 12MHz or a scale factor of 83.333333

This updates gen_device_info to track a floating point timebase_scale
factor and makes corresponding _queryobj.c changes to no longer assume a
scale factor of 80 works across all gens.

Although the gen6_ code could have been been left alone, the changes
keep the code more comparable, and it now shares a few utility functions
for scaling raw timestamps and calculating deltas. The utility for
calculating deltas takes into account 32 or 36bit overflow depending on
the current kernel version.

Note: this leaves the timestamp handling of ARB_query_buffer_object
untouched, which continues to use an incorrect scale of 80 on Skylake
for now. This is more awkward to solve since the scaling is currently
done using a very limited uint64 ALU available to the command parser
that doesn't support multiply or divide where it's already taking a
large number of instructions just to effectively multiple by 80.

This fixes piglit arb_timer_query-timestamp-get on Skylake

v2: (Ken) Update timebase_scale for platforms past Skylake/Broxton too.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-17 15:45:19 +00:00
Jason Ekstrand 28b134c75c anv/device: Remove a use of a compound literal
Older versions of GCC don't like compound literals in static const
variable declarations because they don't think it's an actual constant
value.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 08:40:30 -07:00
Robert Bragg 76dc49f3fb i965: bounds checks while concatenating sysfs paths
This adds some missing return value checks for all uses of snprintf in
brw_performance_query.c. This also switches a use of strncpy + strncat
for snprintf for consistency and to avoid the chance of the strncpy
leaving an unterminated string in the dest buffer if the src is too
long.

This issue with strncpy was picked up by Coverity.

CID: 1402201
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 13:40:29 +00:00
Emil Velikov f8b1b9404e mesa: automake: add all headers to the tarball.
Fixes: d8d81fbc31 ("mesa: Add infrastructure for a worker thread to process GL commands.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-17 13:10:09 +00:00
Emil Velikov d9a41ce8aa mapi: automake: add all python scripts to EXTRA_DIST
Otherwise it'll be missing in the tarball and make distcheck will fail.

Fixes: 05dd4a1104 ("glapi: Generate GL API marshalling code from the XML.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-17 13:10:09 +00:00
Jonathan Gray 9e8d6ba1d6 glapi: avoid using $< in non-suffix make rules
Using $< in non-suffix make rules is a GNU extension.  Explicitly use
the name of the python script to fix the build on OpenBSD.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabore.com>
2017-03-17 13:06:26 +00:00
Alex Smith ce4058dafd radv/ac: Fix shared memory offset calculation
The index passed to get_shared_memory_ptr is an attribute slot index,
i.e. the index of a vec4 within LDS. Therefore this must be scaled by
sizeof(vec4) to give the LDS byte offset.

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: <mesa-stable@lists.freedesktop.org>
2017-03-17 09:35:48 +01:00
James Legg e88cac1df0 radv: Fix using more than 4 bound descriptor sets
Avoid a buffer overflow in ac_nir_to_llvm.c's create_function when
using more than 4 descriptor sets. radv claims support for 8.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-17 09:12:43 +01:00
Tapani Pälli 70d25cae8b util/build-id: check dlpi_name before strstr call
According to dl_iterate_phdr man page first object visited is the
main program where dlpi_name is an empty string. This fixes segfault
on Android when using build-id as identifier.

Fixes: d4fa083e11 ("util: Add utility build-id code.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-17 07:34:26 +02:00
Tapani Pälli 4d4558411d android: fix segfault within swap_buffers
Function droid_swap_buffers may get called without dri2_surf->buffer set,
in these cases we don't have a back buffer set either. Patch fixes segfault
seen with 3DMark that uses android.opengl.GLSurfaceView for rendering it's UI.

backtrace:
   #00 pc 00013f88  /system/lib/egl/libGLES_mesa.so (droid_swap_buffers+104)
   #01 pc 000117b2  /system/lib/egl/libGLES_mesa.so (dri2_swap_buffers+50)
   #02 pc 000058b2  /system/lib/egl/libGLES_mesa.so (eglSwapBuffers+386)
   #03 pc 00011329  /system/lib/libEGL.so (eglSwapBuffersWithDamageKHR+553)
   #04 pc 000118e7  /system/lib/libEGL.so (eglSwapBuffers+55)
   #05 pc 000754dc  /system/lib/libandroid_runtime.so

v2: do like other backends, call get_back_bo (Emil Velikov)

Fixes: 2acc69d ("EGL/Android: Add EGL_EXT_buffer_age extension")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-17 07:30:34 +02:00
Timothy Arceri 72ab7bb765 radv: make sure gs copy shader is retrieved from the cache with the variant
Apps can limit the size of the cache via VkAllocationCallbacks so we
can't be sure that both are always in the cache.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-17 16:17:10 +11:00
Timothy Arceri 2845a108a9 radv: fallback to an in-memory cache when no pipline cache is provided
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-17 16:17:10 +11:00
Timothy Arceri 315e8a9321 radv: always create an fallback pipeline cache
This will be used as an in-memory cache when a pipeline cache is
not provided by the app.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-17 16:17:10 +11:00
Timothy Arceri 4ffdab78b9 radv: move cache check inside insert and search functions
This will allow us to use fallback in-memory and on-disk caches
should the app not provide a pipeline cache.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-17 16:17:10 +11:00
Timothy Arceri 124ec417f9 st/mesa: call glthread_destroy() before _vbo_DestroyContext()
Otherwise we have a race condition between vbo calls in the
glthread and the _vbo_DestroyContext() call.

This fixes a bunch of piglit crashes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-17 09:47:02 +11:00
Jason Ekstrand 08df015b9d anv/GetQueryPoolResults: Actually implement the spec
The Vulkan spec is fairly clear about when we should and should not
write query pool results.  We're also supposed to return VK_NOT_READY if
VK_QUERY_RESULT_PARTIAL_BIT is not set and we come across any queries
which are not yet finished.  This fixes rendering corruptions on The
Talos Principle where geometry flickers in and out due to bogus query
results being returned by the driver.  These issues are most noticable
on Sky Lake GT4 2hen running on "ultra" settings.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100182
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-16 15:08:18 -07:00
Jason Ekstrand 81840130c0 anv/query: Invalidate the correct range
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-16 15:08:17 -07:00