Commit Graph

91020 Commits

Author SHA1 Message Date
Kenneth Graunke 59fdd94b85 i965/drm: Merge bo->handle and bo_gem->gem_handle.
These fields are the same value.  In the bad old days, bo->handle could
have been an identifier from the pre-GEM fake bufmgr, but that's long
gone.  Keep the "gem_handle" name for clarity.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:32:08 -07:00
Kenneth Graunke eb41aa82c4 i965/drm: Rewrite relocation handling.
The execbuf2 kernel API requires us to construct two kinds of lists.
First is a "validation list" (struct drm_i915_gem_exec_object2[])
containing each BO referenced by the batch.  (The batch buffer itself
must be the last entry in this list.)  Each validation list entry
contains a pointer to the second kind of list: a relocation list.
The relocation list contains information about pointers to BOs that
the kernel may need to patch up if it relocates objects within the VMA.

This is a very general mechanism, allowing every BO to contain pointers
to other BOs.  libdrm_intel models this by giving each drm_intel_bo a
list of relocations to other BOs.  Together, these form "reloc trees".

Processing relocations involves a depth-first-search of the relocation
trees, starting from the batch buffer.  Care has to be taken not to
double-visit buffers.  Creating the validation list has to be deferred
until the last minute, after all relocations are emitted, so we have the
full tree present.  Calculating the amount of aperture space required to
pin those BOs also involves tree walking, which is expensive, so libdrm
has hacks to try and perform less expensive estimates.

For some reason, it also stored the validation list in the global
(per-screen) bufmgr structure, rather than as an local variable in the
execbuffer function, requiring locking for no good reason.

It also assumed that the batch would probably contain a relocation
every 2 DWords - which is absurdly high - and simply aborted if there
were more relocations than the max.  This meant the first relocation
from a BO would allocate 180kB of data structures!

This is way too complicated for our needs.  i965 only emits relocations
from the batchbuffer - all GPU commands and state such as SURFACE_STATE
live in the batch BO.  No other buffer uses relocations.  This means we
can have a single relocation list for the batchbuffer.  We can add a BO
to the validation list (set) the first time we emit a relocation to it.
We can easily keep a running tally of the aperture space required for
that list by adding the BO size when we add it to the validation list.

This patch overhauls the relocation system to do exactly that.  There
are many nice benefits:

- We have a flat relocation list instead of trees.
- We can produce the validation list up front.
- We can allocate smaller arrays and dynamically grow them.
- Aperture space checks are now (a + b <= c) instead of a tree walk.
- brw_batch_references() is a trivial validation list walk.
  It should be straightforward to make it O(1) in the future.
- We don't need to bloat each drm_bacon_bo with 32B of reloc data.
- We don't need to lock in execbuffer, as the data structures are
  context-local, and not per-screen.
- Significantly less code and a better match for what we're doing.
- The simpler system should make it easier to take advantage of
  I915_EXEC_NO_RELOC in a future patch.

Improves performance in Synmark 7.0's OglBatch7:

    - Skylake GT4e: 12.1499% +/- 2.29531%  (n=130)
    - Apollolake:   3.89245% +/- 0.598945% (n=35)

Improves performance in GFXBench4's gl_driver2 test:

    - Skylake GT4e: 3.18616% +/- 0.867791% (n=229)
    - Apollolake:   4.1776%  +/- 0.240847% (n=120)

v2: Feedback from Chris Wilson:
    - Omit explicit zero initializers for garbage execbuf fields.
    - Use .rsvd1 = ctx_id rather than i915_execbuffer2_set_context_id
    - Drop unnecessary fencing assertions.
    - Only use _WR variant of execbuf ioctl when necessary.
    - Shrink the arrays to be smaller by default.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:32:00 -07:00
Kenneth Graunke e7ab0ea5e7 i965/drm: Make register write check handle execbuffer directly.
I'm about to rewrite how relocation handling works, at which point
drm_bacon_bo_emit_reloc() and drm_bacon_bo_mrb_exec() won't exist
anymore.  This code is already largely not using the batchbuffer
infrastructure, so just go all the way and handle relocations, the
validation list, and execbuffer ourselves.  That way, we don't have
to think the weird case where we only have a screen, and no context,
when redesigning the relocation handling.

v2: Write reloc.presumed_offset + reloc.delta into the batch, rather
    than duplicating the comment, so it's obvious that they match
    (suggested by Chris).  Also add a comment about why we don't do
    any error checking.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:56 -07:00
Kenneth Graunke 6368284a34 i965: Make a screen::aperture_threshold field.
This is the threshold after which drm_intel_bufmgr_check_aperture_space
returns -ENOSPC, signalling that it thinks an execbuf is likely to fail
and we need to roll back and flush the batch.

We'll need this when we rewrite aperture space checking, shortly.
In the meantime, we can also use it in GLX_MESA_query_renderer.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:55 -07:00
Kenneth Graunke 6079f4f16e i965: Make/use a brw_batch_references() wrapper.
We'll want to change the implementation of this shortly.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:54 -07:00
Kenneth Graunke 6537a3ca11 i965: Use brw_emit_reloc() instead of drm_bacon_bo_emit_reloc().
I'm about to make brw_emit_reloc do actual work, so everybody needs
to start using it and not the raw drm_bacon function.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:52 -07:00
Kenneth Graunke eadd5d1b51 i965: Change intel_batchbuffer_reloc() into brw_emit_reloc().
This renames intel_batchbuffer_reloc to brw_emit_reloc and changes the
parameter naming and ordering to match drm_intel_bo_emit_reloc().

For now, it's a trivial wrapper that accesses batch->bo.  When we
rework relocations, it will start doing actual work.

target_offset should be expanded to a uint64_t to match the kernel,
but for now we leave it as its original 32-bit type.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:51 -07:00
Kenneth Graunke fbb3297165 i965/drm: Drop GEM_SW_FINISH stuff.
This is only useful when doing an incoherent CPU mapping of the current
scanout buffer.  That's a terrible plan, so we never do it.  We always
use an uncached GTT map.

So, this is useless.  Drop the code.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:49 -07:00
Kenneth Graunke 80761a42e0 i965/drm: Drop code to search for an existing bufmgr.
This functionality was added by libdrm commit
743af59669386cb6e063fa4bd85f0a0b2da86295 (intel: make bufmgr_gem
shareable from different API) in an attempt to solve libva/mesa buffer
sharing problems.  Specifically, this was working around an issue hit
by Chromium, which used the same drm_fd for multiple APIs, and shared
buffers between them.

This code attempted to work around that issue by using the same bufmgr
for both libva and Mesa.  It worked because libdrm_intel was loaded by
both libraries.  However, now that Mesa has forked, we don't have a
common library, and this code cannot work.

The correct solution is to have each API open its own file descriptor
(and get a corresponding buffer manager), and then use PRIME export
and import to share BOs across those APIs.  Then the kernel can manage
those shared resources.  According to Chris, the kernel will pass back
the same handle for a prime FD if the lookup is from the same device FD.

We believe Chromium has since moved to this model.

In Mesa, there is already only one screen per FD, and so there will
only be one bufmgr per FD.  We don't need any of this code.

v2: Add a big warning comment written by Chris Wilson.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:48 -07:00
Kenneth Graunke b666654201 i965/drm: Unwrap the unnecessary drm_bacon_reloc_target_info struct.
This used to have another field, but now it's just a BO pointer.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:46 -07:00
Kenneth Graunke 2662894baa i965/drm: Switch from uthash to Mesa's hash table.
No performance data has been gathered about this choice.  I just don't
want that many hash tables.  Chris points out that this is not
performance critical - we should not be recreating that many handles
from scratch.  In the past we used a linear list, which became
unreasonable in stress tests that used hundreds of thousands of BOs.
In real usage, it shouldn't matter that much.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:45 -07:00
Kenneth Graunke ad1b1cce44 i965/drm: Drop bo_gem::kflags.
It's always zero now.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:43 -07:00
Kenneth Graunke a972c903cb i965/drm: Drop has_exec_async related API.
Mesa doesn't use this yet.  We'll almost certainly want to, but we can
add the functionality back after we clean up the messy drm code.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:42 -07:00
Kenneth Graunke d606f64e2d i965/drm: Drop softpin support for now.
We may want this eventually, but simplify for now.  We can add it back
later when we actually intend to use it.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:41 -07:00
Kenneth Graunke 0314eed3b1 i965/drm: Drop userptr support for now.
We'll want userptr support for GL_AMD_pinned_memory support someday,
and possibly some other upload optimizations.  Chris says "not in this
form" though.  Drop it and simplify for now - we can add it back later
when we're ready to hook it up fully.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:39 -07:00
Kenneth Graunke a460e1eb51 i965/drm: Delete engine checks.
This is basically handholding to prevent a bogus caller from trying to
execbuffer on a bogus engine.  i965 already does this correctly.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:37 -07:00
Kenneth Graunke 1dc02da6d7 i965/drm: Drop intel_chipset.h in favor of using gen_device_info.
This moves the PCI ID detection to intel_screen.c and makes
drm_bacon_bufmgr_gem_init() take a devinfo pointer.

We also drop the HAS_LLC query stuff - devinfo has that info already,
without kernel queries, and it makes no sense to have two has_llc flags
set by different mechanisms.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:36 -07:00
Kenneth Graunke 55ee8f36a8 i965/drm: Drop deprecated drm_bacon_bo::offset.
This field was the wrong size, so we replaced it with offset64.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:35 -07:00
Kenneth Graunke a29fb9b2ee i965/drm: Drop has_wait_timeout.
The wait-ioctl was introduced in kernel v3.6 (20120930) and that is our
current minimum requirement for screen creation.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:33 -07:00
Kenneth Graunke b97bcf3b6b i965/drm: Assume aperture size query will work.
This query has been available since 2.6.28.  We require 3.6.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:32 -07:00
Kenneth Graunke c28691ab77 i965/drm: Combine drm_bacon_bufmgr_gem and drm_bacon_bufmgr classes.
The distinction was required when the bufmgr was virtualised, now there
is only one class, we no longer need the distraction of pretending it is
a subclass.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:31 -07:00
Kenneth Graunke 3673b89bf3 i965/drm: Move _drm_bacon_context to intel_bufmgr_gem.c.
This moves us one step closer to killing off intel_bufmgr_priv.h.

We might want to nuke it altogether, since it's basically just a
uint32_t handle, but for now, let's focus on removing files.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:29 -07:00
Kenneth Graunke b0d1c5983b i965/drm: Drop cliprects and dr4 from execbuf variants.
Legacy DRI1 leftovers.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:28 -07:00
Kenneth Graunke 2c257ff226 i965/drm: Devirtualize the bufmgr.
libdrm_bacon used to have a GEM-based bufmgr and a legacy fake bufmgr,
but that's long since dead (and we never imported it to i965).  So,
drop the extra layer of function pointers.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:27 -07:00
Kenneth Graunke dca224a9ef i965/drm: Check INTEL_DEBUG & DEBUG_BUFMGR directly.
Eliminates some API around this, and more importantly, the last
field in one bufmgr class.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:25 -07:00
Kenneth Graunke 68cb0c6d92 i965/drm: Use Mesa's macros.h instead of duplicating them.
Replace the duplicated macros imported from libdrm:

   ARRAY_SIZE, MAX2, ALIGN, STATIC_ASSERT

and remove unused ROUND_UP_TO and ROUND_UP_TO_MB.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:24 -07:00
Kenneth Graunke c5cdb0f405 i965/drm: Use ALIGN, not ROUND_UP_TO.
ROUND_UP_TO handles a NPOT alignment, but all the alignments we use
are power of two anyway, so there's no need.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:23 -07:00
Kenneth Graunke 1d476e64e5 i965/drm: Delete execbuf1 support.
execbuf2 has been around since v2.6.33.  We require v3.6.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:21 -07:00
Kenneth Graunke ddf01d3f41 i965/drm: Remove Gen2-3 fence accounting.
Since gen4, we do not use fence registers for any GPU access and so
never have to account for the fence during batch construction. All the
related fence functions are unused.

Based on Kristian Høgsberg's patch; commit message by Chris Wilson.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:20 -07:00
Kenneth Graunke 4f698b0049 i965/drm: Remove some unused functions and macros.
Mesa doesn't use these functions or macros, so we can delete them,
and save work refactoring and cleaning them up.  We'll delete a lot
more later, too.

Based on a patch by Kristian Høgsberg.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:18 -07:00
Kenneth Graunke 09b2f6124a i965/drm: Switch to util/list.h instead of libdrm_lists.h.
Both are kernel style lists, so this is trivial.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:16 -07:00
Kenneth Graunke 7c64096b2d i965/drm: Port to Mesa's atomic header.
Drop xf86atomic.h in favor of Mesa's util/u_atomic.h.  We replace the
atomic_t wrapper struct with a bare integer, switch to the 'p_atomic'
naming conventions, and move over the one extra helper.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:13 -07:00
Kenneth Graunke eed86b975e i965/drm: Use our internal libdrm (drm_bacon) rather than the real one.
Now we can actually test our changes.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:11 -07:00
Kenneth Graunke 91b973e3a3 i965/drm: s/drm_intel/drm_bacon/g
Using drm_intel_* as a prefix is hazardous - we don't want to conflict
with the actual libdrm_intel symbols.  In particular, I think we could
get into trouble during the final megadrivers linking.

So, rename everything to an different yet arbitrary prefix.  bacon and
intel are the same number of characters, so we don't have to reindent
the world.  It's also an homage to Ian's "Bacon Trail" platform.

I was going to use "drm_relic" to poke fun at libdrm being ancient,
and so we could explain the name with a "historical reasons" pun,
but it sounds too much like ralloc.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:09 -07:00
Kenneth Graunke 4ad0758f51 i965/drm: Drop libpciaccess dependencies.
i965 doesn't use drm_intel_get_aperture_sizes(), so we can delete
support for it.  This avoids a build dependency on libpciaccess.

Chris also notes:

"There's a really old bug that hopefully has been closed already
 (although as far as I can tell, it has never been fixed) about
 how using libpciaccess from libdrm_intel breaks the world (since
 libpciaccess uses a singleton that is torn down at the first request
 rather than upon the last user)."

This bug should go away in two commits when we switch over to our
internal copy of libdrm_intel.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84325
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:05 -07:00
Kenneth Graunke d614135e95 i965/drm: Make libdrm_lists.h compile by defining typeof.
typeof doesn't seem to exist, so this won't compile (but we don't yet
try).  Define it to __typeof__.  This code is going to die soon anyway.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:03 -07:00
Kenneth Graunke b97c7ef4c8 i965/drm: remove legacy defines, aub functions, and decoder prototypes
We never imported any of this code, so drop the prototypes, unused
enums, and defines.

Based on patches by Emil Velikov.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:00 -07:00
Kenneth Graunke 514db96c11 i965: Import libdrm_intel.
This imports commit 19c4cfc54918d361f2535aec16650e9f0be667cd of
libdrm/intel/*.[ch], minus a few files that we're never going to use
(and would immediately delete), plus a few necessary dependencies.

We rename intel_bufmgr.h to brw_bufmgr.h to avoid #include conflicts.
We also fix UTF-8 symbol problems in intel_bufmgr_gem.c comments
because vim keeps trying to fix that every time I edit the file,
and we may as well fix it right away.

Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:30:53 -07:00
Kenneth Graunke 915820cc59 i965: Make sure we don't use CPU maps for the scanout buffer.
Using an incoherent CPU map on the active scanout buffer is really
sketchy - we may need extra flushing via GEM_SW_FINISH, or using
drmModeDirtyFB() and kernel commit a6a7cc4b7db6d (4.10+).

Chris suggests "never ever do that", which seems like a wise plan!

intel_miptree_map_raw() uses CPU maps on linear buffers.

Having a linear scanout buffer should be really rare, and mapping the
front buffer should be similarly rare.  Together, it should basically
never happen.  But, in case it does somehow...make sure that mapping
the scanout buffer always goes through an uncached GTT map.

v2: Add a giant comment written by Chris Wilson.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-10 14:30:49 -07:00
Kenneth Graunke eb28ce2b0b i965: Stop calling drm_intel_bufmgr_gem_enable_fenced_relocs().
This does nothing on Gen4+, which is the only hardware we support.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:30:44 -07:00
Kenneth Graunke 034b220dc4 i965: Fix GLX_MESA_query_renderer video memory on 32-bit.
On modern systems with 4GB apertures, the size in bytes is 4294967296,
or (1ull << 32).  The kernel gives us the aperture size as a __u64,
which works out great.

Unfortunately, libdrm "helpfully" returns the data as a size_t, which
on 32-bit systems means it truncates the aperture size to 0 bytes.
We've happily reported this value as 0 MB of video memory via
GLX_MESA_query_renderer since it was originally exposed.

This patch bypasses libdrm and calls the ioctl ourselves so we can
use a proper uint64_t, avoiding the 32-bit integer overflow.  We now
report a proper video memory size on 32-bit systems.

Chris points out that the aperture size (CPU mappable size limit)
isn't really the right thing to be checking.  But libdrm_intel uses
it to fail execbuffer, so it is an actual limit for now.  Once that's
fixed we can probably move to something else.  In the meantime, fix
the obvious typecasting bug.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:30:40 -07:00
Samuel Pitoiset 5bcfe90501 gallium/radeon: add HUD queries for GPU temperature and clocks
Only the Radeon kernel driver exposed the GPU temperature and
the shader/memory clocks, this implements the same functionality
for the AMDGPU kernel driver.

These queries will return 0 if the DRM version is less than 3.10,
I don't explicitely check the version here because the query
codepath is already a bit messy.

v2: - rebase on top of master

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 23:06:19 +02:00
Samuel Pitoiset 0f39fb8500 configure.ac: require libdrm_amdgpu 2.4.79
The sensor info requires amdgpu_query_sensor_info().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 23:06:17 +02:00
Samuel Pitoiset def02007cd radeonsi: add new si_check_render_feedback_texture() helper
For bindless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 23:05:41 +02:00
Samuel Pitoiset fbcc8664fd radeonsi: add new si_decompress_color_texture() helper
For bindless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 23:05:38 +02:00
Samuel Pitoiset 6646212de0 radeonsi: add new depth_needs_decompression() helper
v2: - rename to depth_needs_decompression() instead

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 23:05:32 +02:00
Samuel Pitoiset 9cc91ba6d5 radeonsi: add a 'break' in si_check_render_feedback_*()
No need to check all color buffers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 23:05:29 +02:00
Samuel Pitoiset 51d6641700 radeonsi: re-use 'desc' in si_set_shader_image()
No need to compute the offset in the descriptor twice.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 23:05:27 +02:00
Samuel Pitoiset a1c37ff9e4 ac: add unreachable() in ac_build_image_opcode()
To silent the following compiler warning:

common/ac_llvm_build.c: In function ‘ac_build_image_opcode’:
common/ac_llvm_build.c:1080:3: warning: ‘name’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   snprintf(intr_name, sizeof(intr_name), "%s%s%s%s.v4f32.%s.v8i32",
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    name,
    ~~~~~
    a->compare ? ".c" : "",
    ~~~~~~~~~~~~~~~~~~~~~~~
    a->bias ? ".b" :
    ~~~~~~~~~~~~~~~~
    a->lod ? ".l" :
    ~~~~~~~~~~~~~~~
    a->deriv ? ".d" :
    ~~~~~~~~~~~~~~~~~
    a->level_zero ? ".lz" : "",
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~
    a->offset ? ".o" : "",
    ~~~~~~~~~~~~~~~~~~~~~~
    type);
    ~~~~~

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 23:02:12 +02:00
Constantine Kharlamov 61e47d92c5 r600g: get rid of dummy pixel shader
The idea is taken from radeonsi. The code mostly was already checking for null
pixel shader, so little checks had to be added.

Interestingly, acc. to testing with GTAⅣ, though binding of null shader happens
a lot at the start (then just stops), but draw_vbo() never actually sees null
ps.

v2: added a check I missed because of a macros using a prefix to choose
a shader.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 22:45:22 +02:00