Commit Graph

95817 Commits

Author SHA1 Message Date
Gert Wollny d3675812b5 travis: force llvm-3.3 for "make Gallium ST Other"
In Ubuntu Trusty the default version of llvm is 3.4 and the build was
actually randomly picking 3.5 or 3.9. Adding libunwind would then result
is build success or failure depending of what version was picked.

Install the llvm-3.3-dev package and force its use: On one hand it is
the minimum required version we want to the build test against, and on
the other hand forcing the version stabilizes the build.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-15 13:56:28 +01:00
Gert Wollny c75d781610 mesa/st/tests: Correct build flags and force -std=c++11
Include src/gallium/Automake.inc, correct the build flags accordingly.

Force -std=c++11 (extensively used by the test) as otherwise it gets
defined only when building against llvm >= 3.9.

Fixes: 7be6d8fe12  ("mesa/st: glsl_to_tgsi: add tests for the new
temporary lifetime tracker")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102665
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
2017-09-15 13:56:28 +01:00
Emil Velikov 3c5fb7346f automake: include radv_shader.h in the sources list
Otherwise it will be missing from the tarball, leadin to build failure.

Fixes: d4d777317b ("radv: move shaders related code to radv_shader.c")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-15 13:56:27 +01:00
Gurkirpal Singh 6a8aa11c20 st/omx_bellagio: Rename state tracker and option
Changes --enable-omx option to --enable-omx-bellagio

Signed-off-by: Gurkirpal Singh <gurkirpal204@gmail.com>
Reviewed-and-Tested-by: Julien Isorce <julien.iso...@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-09-15 14:28:36 +02:00
Tapani Pälli acbfcb7105 i965: fix build warning on clang
fixes following warning:
   warning: format specifies type 'long' but the argument has type 'uint64_t' (aka 'unsigned long long')

cast is needed to avoid this change turning in to another warning:
   warning: format specifies type 'unsigned long long' but the argument has type 'uint64_t' (aka 'unsigned long')

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-09-15 12:39:33 +03:00
Samuel Pitoiset 8e8c7c6703 radv: fix a potential crash if attachments allocation failed
Also, it's useless to set the error code twice. Though, we
should probably skip the next commands when the command buffer
is considered invalid.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-15 09:16:38 +02:00
Samuel Pitoiset a0495d4bb3 radv: dump the device name into the hang report
Similar to RadeonSI renderer string.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-15 09:16:35 +02:00
Samuel Pitoiset 176c2ad10c radv: add get_chip_name() callback
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-15 09:16:34 +02:00
Dave Airlie 1b163238f5 r600: add .gitignore for egd_tables.h 2017-09-15 13:55:01 +10:00
Timothy Arceri a70a401f52 radeonsi: enable STD430 packing of UBOs by default
Before this change we were defaulting to STD140 which is slightly
less efficient at packing arrays.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:55 +10:00
Timothy Arceri fac9f2c4b0 st/mesa: set UseSTD430AsDefaultPacking const based on CAP
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:55 +10:00
Timothy Arceri c96e45ebf0 gallium: introduce PIPE_CAP_LOAD_CONSTBUF
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:55 +10:00
Timothy Arceri b4401cc104 radeonsi: make use of LOAD for UBOs
v2: always set can_speculate and allow_smem to true

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:55 +10:00
Timothy Arceri 51cf16319d mesa/st: add LOAD support for UBOs
This will allow us to use STD430 packing by default if the driver
supports it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:55 +10:00
Timothy Arceri ee0fbc8b71 mesa/st: create add_buffer_to_load_and_stores() helper
Will be used to add LOAD support to UBOs.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:54 +10:00
Timothy Arceri 6fa60b5e40 gallium: add CONSTBUF type to tgsi_file_type
This will be use to distinguish between load types when using
the TGSI_OPCODE_LOAD opcode.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:54 +10:00
Dave Airlie b6f6ead198 virgl: drop const dimensions on first block.
The virgl protocol version of tgsi doesn't handle this yet,
transform it back to the old ways.

Thanks to Nicolai Hähnle <nicolai.haehnle@amd.com>
for also writing nearly the same patch.

Fixes: 41e342d5 tgsi/ureg: always emit constants (and their decls) as 2D
Tested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-15 10:33:14 +10:00
Dave Airlie a7a7bf21bd st/glsl->tgsi: fix u64 to bool comparisons.
Otherwise we end up using a 32-bit comparison which didn't end well.

Timothy caught this while playing around with some opt passes.

Fixes: 278580729a (st/glsl_to_tgsi: add support for 64-bit integers)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-15 09:49:50 +10:00
Kenneth Graunke 62f2670cba i965: Print size of validation and relocation lists in INTEL_DEBUG=flush
It's nice to have this information.  While we're at it, tweak the
formatting to try and vertically align numbers in the common case.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke 7c5988e615 i965: Disentangle batch and state buffer flushing.
We now flush the batch when either the batchbuffer or statebuffer
reaches the original intended batch size, instead of when the sum of
the two reaches a certain size (which makes no sense now that they're
separate buffers).

With this change, we also need to update our "are we near the end?"
estimate to require separate batch and state buffer space.  I obtained
these estimates by looking at the size of draw calls in the Unreal 4
Elemental Demo (using INTEL_DEBUG=flush and always_flush_batch=true).

This will significantly impact the size of our batches.  I've adjusted
both down to try and be roughly similar to what we had been doing.  On
various benchmarks, a 20kB batch and 16kB statebuffer seemed to about
right, but we may need to adjust this further.  I tried a 16kB batch,
but that regressed Synmark OglMultithread performance by a fair bit.
32kB for both would have significantly increased our batch sizes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke 2c46a67b41 i965: Delete BATCH_RESERVED handling.
Now that we can grow the batchbuffer if we absolutely need the extra
space, we don't need to reserve space for the final do-or-die ending
commands.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke 9034d157c0 i965: Make BLORP properly avoid batch wrapping.
We need to set brw->no_batch_wrap to actually avoid flushing in the
middle of our BLORP operation, and instead grow the batchbuffer.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke 2dfc119f22 i965: Grow the batch/state buffers if we need space and can't flush.
Previously, we would just assert fail and die in this case.  The only
safeguard is the "estimated max prim size" checks when starting a draw
(or compute dispatch or BLORP operation)...which are woefully broken.

Growing is fairly straightforward:

1. Allocate a new larger BO.
2. memcpy the existing contents over to the new buffer
3. Set the new BO to the same GTT offset as the old BO.  When emitting
   relocations, we write the presumed GTT offset of the target BO.  If
   we changed it, we'd have to update all the existing values (by
   walking the relocation list and looking at offsets), which is more
   expensive.  With the old BO freed, ideally the kernel could simply
   place the new BO at that offset anyway.
4. Update the validation list to contain the new BO.
5. Update the relocation list to have the GEM handle for the new BO
   (which we can skip if using I915_EXEC_HANDLE_LUT).

v2: Update to handle malloc'd shadow buffers.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke 78c404f106 i965: Use a separate state buffer, but avoid changing flushing behavior.
Previously, we emitted GPU commands and indirect state into the same
buffer, using a stack/heap like system where we filled in commands from
the start of the buffer, and state from the end of the buffer.  We then
flushed before the two met in the middle.

Meeting in the middle is fatal, so you have to be certain that you
reserve the correct amount of space before emitting commands or state
for a draw.  Currently, we will assert !no_batch_wrap and die if the
estimate is ever too small.  This has been mercifully obscure, but has
happened on a number of occasions, and could in theory happen to any
application that issues a large draw at just the wrong time.

Estimating the amount of batch space required is painful - it's hard to
get right, and getting it right involves a lot of code that would burn
CPU time, and also be painful to maintain.  Rolling back to a saved
state and retrying is also painful - failing to save/restore all the
required state will break things, and redoing state emission burns a
lot of CPU.  memcpy'ing to a new batch and continuing is painful,
because commands we issue for a draw depend on earlier commands as well
(such as STATE_BASE_ADDRESS, or the GPU being in a pirtacular state).

The best plan is to never run out of space, which is totally doable but
pretty wasteful - a pessimal draw requires a huge amount of space, and
rarely occurs.  Instead, we'd like to grow the batch buffer if we need
more space and can't safely flush.

We can't grow with a meet in the middle approach - we'd have to move the
state to the end, which would mean updating every offset from dynamic
state base address.  Using separate batch and state buffers, where both
fill starting at the beginning, makes it easy to grow either as needed.

This patch separates the two concepts.  We create a separate state
buffer, with a second relocation list, and use that for brw_state_batch.

However, this patch tries to retain the original flushing behavior - it
adds the amount of batch and state space together, as if they were still
co-existing in a single buffer.  The hope is to flush at the same time
as before.  This is necessary to avoid provoking bugs caused by broken
batch wrap handling (which we'll fix shortly).  It also avoids suddenly
increasing the size of the batch (due to state not taking up space),
which could have a significant performance impact.  We'll tune it later.

v2:
- Mark the statebuffer with EXEC_OBJECT_CAPTURE when supported (caught
  by Chris).  Unfortunately, we lose the ability to capture state data
  on older kernels.
- Continue to support the malloc'd shadow buffers.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke 0bf3fa4c53 i965: Pass screen to intel_batchbuffer_reset().
This will let us access screen->kernel_features in the next patch.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke 2e68c4e454 i965: Prepare INTEL_DEBUG=bat decoding for a separate statebuffer.
We'll need to read from both buffers when decoding state.

This also drops the "failed to map" fallback - it's completely useless
on LLC systems where we write directly to the mapped BO.  It's not that
useful on non-LLC systems either.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke e723255901 i965: Split brw_emit_reloc into brw_batch_reloc and brw_state_reloc.
brw_batch_reloc emits a relocation from the batchbuffer to elsewhere.
brw_state_reloc emits a relocation from the statebuffer to elsewhere.

For now, they do the same thing, but when we actually split the two
buffers, we'll change brw_state_reloc to use the state buffer.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke 1674a0bcbc i965: Refactor relocs into a brw_reloc_list structure.
I'm planning on splitting batch and state into separate buffers, at
which point we'll need two relocation lists.  In preparation for that,
this patch refactors the relocation stuff into a structure we can
replicate...which looks a lot like anv_reloc_list.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-09-14 16:17:36 -07:00
Kenneth Graunke 1bc44e0e7f i965: Move brw_state_batch code to intel_batchbuffer.c
The batch buffer and state buffer code is fairly tied together,
and having it in one .c file will make refactoring easier.

Also, drop some commentary above brw_state_batch.  The "aperture
checking performance hacks" are long since gone, so that paragraph
makes little sense at this point.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke 3b812e62a1 i965: Drop a useless ret == 0 check.
Prior to the previous patch, we would pwrite the batchbuffer contents,
and wanted to skip the execbuffer if that failed.  Now that we memcpy,
we don't set ret != 0 on failure anymore, so it will always be 0.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke 717e753912 i965: Use a WC map and memcpy for the batch instead of pwrite.
We'd like to eliminate the malloc'd shadow copy eventually, but there
are still unresolved performance problems.  In the meantime, let's at
least get rid of pwrite.

On Apollolake, improves Synmark OglBatch6 performance by:
1.53581% +/- 0.269589% (n=108).

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke 343aa09a22 i965: Use batch->bo->size in brw_emit_reloc assertion.
This makes the assertion safe against batchbuffers growing.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke d124521141 i965: Delete a batch size assertion that isn't very useful.
This assertion prevents you from doing intel_batchbuffer_require_space
with a size so huge it won't fit in the batchbuffer.  This doesn't seem
like a common mistake, and I've never seen the assert to be useful.

Soon, I hope to have batches grow, at which point this won't make sense.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Jason Ekstrand 939b53d332 i965/screen: Implement queryDmaBufFormatModifierAttirbs
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-14 14:47:42 -07:00
Jason Ekstrand 9c52aef7d7 i965/screen: Report the correct number of image planes
For non-CCS images, we were reporting just one plane even though they
may have multiple in the case of YUV.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-14 14:47:40 -07:00
Jason Ekstrand 8824141b8d gbm: Add a gbm_device_get_format_modifier_plane_count function
This allows the user to query the number of planes required by a given
format+modifier combination without having to create a bo or surface.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-14 14:47:39 -07:00
Jason Ekstrand 0a25a417ce dri/image: Add a format modifier attributes query
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-14 14:47:18 -07:00
Christoph Berliner 7ffd4d2a66 drirc: enable glthread for more games (Civ5, CivBE, Dreamfall, Hitman, SR3)
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-09-14 21:02:36 +02:00
Iago Toral Quiroga 98141366f9 glsl: avoid accessing invalid memory after get_variable_being_redeclared()
After get_variable_being_redeclared() has been called, it is no longer
safe to access the original variable pointer, since its memory might have
been freed.

Since callers of this function should only be accessing the variable pointer
returned by the function, avoid potential bugs by re-assigning the
original variable pointer to the result of the function call,
making it impossible for the remaining code to access an invalid variable
pointer.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-14 11:23:26 +02:00
Iago Toral Quiroga a7017746d7 glsl: make the redeclared variable NULL if it is deleted
get_variable_being_redeclared() can delete the original variable
in a specific scenario. The code sets it to NULL after this so other
code in that same function doesn't try to access trashed memory after
the fact, however, the copy of that variable in the caller code
won't see any of this making it very easy to overlook.

Make the function a bit safer by taking a pointer to the original
variable so we can also make NULL the caller's pointer to the variable
if this function deletes it.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-14 11:23:26 +02:00
Iago Toral Quiroga 4af156224e glsl: use 'declared_var' instead of 'var' after checking redeclarations
Since the original 'var' might have been deleted from this point forward.

Bugzila: https://bugs.freedesktop.org/show_bug.cgi?id=102685
Fixes: 51bf007d2c (glsl: Disallow unsized array of atomic_uint)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-14 11:23:26 +02:00
Eric Engestrom 412ab3f6fd dri/radeon: use ARRAY_SIZE macro
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-09-14 09:56:00 +01:00
Samuel Pitoiset 49c72d84c2 radv: dump the list of enabled options when a hang occured
Useful to know which debug/perftest options were enabled when
a hang report is generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset 302e34d24b radv: dump last 60 lines of dmesg when a hang occured
Copied from dd_dump_dmesg().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset 26bc664ca0 radv: dump descriptors when a hang occured
Might be useful for checking if all descriptors are sets by
the application.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset b3c8de1c55 radv: save all descriptor pointers into the trace BO
To dump them when a hang is detected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset d7f2430703 radv: dump annotated shaders using UMR
This might be very useful in order to figure out where a shader
is stucked. This uses UMR to detect which instruction is executing
bad things.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset f0d09d9012 radeonsi: move si_get_wave_info() to AMD common code
This will allow us to use it from radv.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset 8181427b14 radv: dump some status MMIO registers when a hang occured
Might report some useful information to help figuring out where
does the hang happened.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset 140621f7c4 radv/winsys: add a read_registers() callback
To dump some status MMIO registers when a hang is detected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00