Commit Graph

104527 Commits

Author SHA1 Message Date
Erik Faye-Lund 1b2444dffc virgl: replace fprintf-call with debug_printf
This is the only direct call-site for fprintf in virgl; all other
call-sites call debug_printf instead. So let's follow in style here.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2018-08-28 14:13:43 +02:00
Erik Faye-Lund 2ebfa90abe virgl: delete commented out fprintf-call
This is just debug-cruft left over. Let's just get rid of it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2018-08-28 14:13:43 +02:00
Guido Günther 9de34b4dde meson: Don't enable any vulkan drivers on arm, aarch64
There's no Vulkan support for arm atm.

Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-27 11:32:04 -07:00
Guido Günther 05e2fc6860 meson: Be a bit more helpful when arch or OS is unknown
V2: Add one missing @0@

Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-27 11:31:52 -07:00
Sagar Ghuge a1e3305f75 intel/eu: print bytes instead of 32 bit hex value
INTEL_DEBUG=hex prints 32 bit hex value and due to endianness of CPU
byte order is reversed. In order to disassemble binary files, print
each byte instead of 32 bit hex value.

v2: Print blank spaces in order to vertically align output of compacted
    instructions hex value with uncompacted instructions hex value.
    (Matt Turner)

v3: Fix line wrap at correct length

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-08-27 11:07:39 -07:00
Lionel Landwerlin 440a988bd1 intel: decoder: handle 0 sized structs
Gen7.5 has a BLEND_STATE of size 0 which includes a variable length
group. We did not deal with that very well, leading to an endless
loop.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107544
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-27 18:33:18 +01:00
Rhys Perry e56e600bd3 nv50/ir,nvc0: use constant buffers for compute when possible on Kepler+
Gives a +7.79% increase in FPS with Hitman on lowest quality settings on
my GTX 1060.

total instructions in shared programs : 5787979 -> 5748677 (-0.68%)
total gprs used in shared programs    : 669901 -> 669373 (-0.08%)
total shared used in shared programs  : 548832 -> 548832 (0.00%)
total local used in shared programs   : 21068 -> 21064 (-0.02%)

                local     shared        gpr       inst      bytes
    helped           1           0         152         274         274
      hurt           0           0           0           0           0

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-08-27 14:23:42 +01:00
Rhys Perry d27c791891 nv50/ir: optimize multiplication by 16-bit immediates into two xmads
Rather than the usual three that would be created.

total instructions in shared programs : 5796385 -> 5786560 (-0.17%)
total gprs used in shared programs    : 670103 -> 669968 (-0.02%)
total shared used in shared programs  : 548832 -> 548832 (0.00%)
total local used in shared programs   : 21164 -> 21068 (-0.45%)

                local     shared        gpr       inst      bytes
    helped           1           0          64        1040        1040
      hurt           0           0          27           0           0

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-08-27 13:57:11 +01:00
Rhys Perry 400a4eb964 nv50/ir: optimize near power-of-twos into shladd
total instructions in shared programs : 5819319 -> 5796385 (-0.39%)
total gprs used in shared programs    : 670571 -> 670103 (-0.07%)
total shared used in shared programs  : 548832 -> 548832 (0.00%)
total local used in shared programs   : 21164 -> 21164 (0.00%)

                local     shared        gpr       inst      bytes
    helped           0           0         318        1758        1758
      hurt           0           0          63           0           0

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-08-27 13:57:01 +01:00
Rhys Perry 2f52925f5c nv50/ir: move a * b -> a << log2(b) code into createMul()
With this commit, OP_MAD is handled on nv50 too. This commit is also
useful for later commits.

Also, instead of creating a shladd, it relies on LateAlgebraicOpt to
create one. This simplifies the code and helps shader-db slightly overall.

total instructions in shared programs : 5820882 -> 5819319 (-0.03%)
total gprs used in shared programs    : 670595 -> 670571 (-0.00%)
total shared used in shared programs  : 548832 -> 548832 (0.00%)
total local used in shared programs   : 21164 -> 21164 (0.00%)

                local     shared        gpr       inst      bytes
    helped           0           0          18         230         230
      hurt           0           0           8         263         263

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-08-27 13:56:47 +01:00
Rhys Perry b60bc7a4ab nv50/ir: optimize imul/imad to xmads
This hits the shader-db numbers a good bit, though a few xmads is way
faster than an imul or imad and the cost is mitigated by the next commit,
which optimizes many multiplications by immediates into shorter and less
register heavy instructions than the xmads.

total instructions in shared programs : 5768871 -> 5820882 (0.90%)
total gprs used in shared programs    : 669919 -> 670595 (0.10%)
total shared used in shared programs  : 548832 -> 548832 (0.00%)
total local used in shared programs   : 21068 -> 21164 (0.46%)

                local     shared        gpr       inst      bytes
    helped           0           0          38           0           0
      hurt           1           0         365        3076        3076

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-08-27 13:56:44 +01:00
Rhys Perry bcbcdf8448 gm107/ir: add support for OP_XMAD on GM107+
v4: make the immediate field 16 bits
v5: don't ever emit h1 flags for immediates

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-08-27 13:56:41 +01:00
Rhys Perry 5d6952d2de nv50/ir: add preliminary support for OP_XMAD
v4: remove uint16_t(...)
v4: don't allow immediates outside [0,65535] in insnCanLoad()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-08-27 13:56:36 +01:00
vadym.shovkoplias 4a8444d5bc glsl/linker: Allow unused in blocks which are not declated on previous stage
>From Section 4.3.4 (Inputs) of the GLSL 1.50 spec:

    "Only the input variables that are actually read need to be written
     by the previous stage; it is allowed to have superfluous
     declarations of input variables."

Fixes:
    * interstage-multiple-shader-objects.shader_test

v2:
  Update comment in ir.h since the usage of "used" field
  has been extended.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101247
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-27 12:13:53 +02:00
Jason Ekstrand 07a227f543 nir: Pull block_ends_in_jump into nir.h
We had two different implementations in different files.  May as well
have one and put it in nir.h.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-27 02:15:38 -05:00
Samuel Iglesias Gonsálvez 59a8e0dbf8 anv: Add support for protected memory properties on anv_GetPhysicalDeviceProperties2()
VkPhysicalDeviceProtectedMemoryProperties structure is new on Vulkan 1.1.

Fixes Vulkan CTS CL#2849.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-27 09:07:52 +02:00
Jason Ekstrand aad501f15e intel/tools: Add 0x in front of a couple of hex values
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-25 18:47:08 -05:00
Jason Ekstrand 76b0e4d8c9 anv: Fill holes in the VF VUE to zero
This fixes a GPU hang in DOOM 2016 running under wine.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104809
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-25 18:47:08 -05:00
Kai Wasserbäch b2313ef4a8 intel: tools: Fix aubinator_error's fprintf call (format-security)
The recent commit 4616639b49 introduced
the new function aubinator_error() which is a trivial wrapper around
fprintf() to STDERR. The call to fprintf() however is passed the message
msg directly:
  fprintf(stderr, msg);

This is a format-security violation and leads to an FTBFS with
-Werror=format-security (GCC 8):
  ../../../src/intel/tools/aubinator.c: In function 'aubinator_error':
  ../../../src/intel/tools/aubinator.c:74:4: error: format not a string literal and no format arguments [-Werror=format-security]
      fprintf(stderr, msg);
      ^~~~~~~

This patch fixes this trivially by introducing a catch-all "%s" format
argument.

Fixes: 4616639b49 ("intel: tools: split aub parsing from aubinator")
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-25 16:52:12 +01:00
Jason Ekstrand 70de31d0c1 intel/batch_decoder: Print blend states properly
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-25 07:50:45 -05:00
Jason Ekstrand cbd4bc1346 intel/batch_decoder: Fix dynamic state printing
Instead of printing addresses like everyone else, we were accidentally
printing the offset from state base address.  Also, state_map is a void
pointer so we were incrementing in bytes instead of dwords and every
state other than the first was wrong.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-25 07:50:43 -05:00
Jason Ekstrand d1971be6ea intel/decoder: Print ISL formats for vertex elements
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-25 07:50:40 -05:00
Jason Ekstrand 2abd7ae189 intel/decoder: Clean up field iteration and fix sub-dword fields
First of all, setting iter->name in advance_field is unnecessary because
it gets set by gen_decode_field which gets called immediately after
gen_decode_field in the one call-site.  Second, we weren't properly
initializing start_bit and end_bit in the initial condition of
gen_field_iterator_next so the first field of a struct would get printed
wrong if it doesn't start on the first bit.  This is fixed by adding a
iter_start_field helper which sets the field and also sets up the other
bits we need.  This fixes decoding of 3DSTATE_SBE_SWIZ.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-25 07:50:36 -05:00
Kenneth Graunke 1281608849 gallium: Split out PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE.
Some hardware can do PIPE_TEX_WRAP_MIRROR_REPEAT but not
PIPE_TEX_WRAP_MIRROR_CLAMP and PIPE_TEX_WRAP_MIRROR_CLAMP_TO_BORDER.

Drivers for such hardware would like to advertise support for
ARB_texture_mirror_clamp_to_edge but not EXT_texture_mirror_clamp.

This commit adds a new PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE bit,
changes the extension enable to be based on that, and enables it
in all upstream drivers which supported PIPE_CAP_TEXTURE_MIRROR_CLAMP
(so they continue supporting this mode).
2018-08-24 17:25:36 -07:00
Lionel Landwerlin f430a37fa7 intel: decoder: unify MI_BB_START field naming
The batch decoder looks for a field with a particular name to decide
whether an MI_BB_START leads into a second batch buffer level. Because
the names are different between Gen7.5/8 and the newer generation we
fail that test and keep on reading (invalid) instructions.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107544
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-24 23:10:08 +01:00
Dylan Baker 7f745c19c1 docs: Update calendar, news, relnotes for 18.1.7 2018-08-24 09:35:24 -07:00
Dylan Baker 82c2e7bf9e docs: Add mesa 18.1.7 notes 2018-08-24 09:34:03 -07:00
Dylan Baker 2d8569073e docs: Add mesa 18.1.7 docs 2018-08-24 09:33:59 -07:00
Andres Gomez 0d3bb146a8 docs: update calendar 18.2.0-rc4 is out, extend to 18.2.0-rc5
Signed-off-by: Andres Gomez <agomez@igalia.com>
2018-08-24 18:58:00 +03:00
Kevin Rogovin e345247092 docs/relnotes: Mark NV_fragment_shader_interlock support in i965
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-24 08:59:54 -05:00
Emil Velikov 081395e99d egl/drm: use gbm_dri_bo() wrapper
Remove the explicit cast, using the appropriate wrapper instead.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2018-08-24 11:53:24 +01:00
Emil Velikov 7b4269a5e0 egl/drm: use gbm_dri_surface() wrapper
Remove the explicit cast, using the appropriate wrapper instead.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2018-08-24 11:53:20 +01:00
Emil Velikov 7eb4a28d41 egl/drm: use gbm_dri_device() wrapper
Remove the explicit cast, using the appropriate wrapper instead.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2018-08-24 11:52:48 +01:00
Emil Velikov 2c049384b1 egl/android: simplify device open/probe
Currently droid_probe_device, does not do any 'probing' but filtering
out a device if it doesn't match the vendor string given.

Rename the function, straighten the return type and call it only as
needed - an actual vendor string is provided.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-08-24 11:52:44 +01:00
Emil Velikov 2f8403a4ca egl/android: remove drmVersion::name NULL check
The name string is guaranteed to be non-NULL.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-08-24 11:52:41 +01:00
Emil Velikov d1211f3112 egl/android: remove droid_probe_driver()
The function name is misleading - it effectively checks if
loader_get_driver_for_fd fails. Which can happen only only on strdup
error - a close to impossible scenario.

Drop the function - we call the loader API at at later stage.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-08-24 11:52:39 +01:00
Emil Velikov 9b5bf7afce egl/android: use strcmp with drmVersion::name
The name string is guaranteed to be NULL terminated. Drop the explicit
length check that comes with strncmp().

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-08-24 11:52:37 +01:00
Emil Velikov 3827966643 egl/android: use drmDevice instead of the manual /dev/dri iteration
Replace the manual handling of /dev/dri in favor of the drmDevice API.
The latter provides a consistent way of enumerating the devices,
providing device details as needed.

v2:
 - Use ARRAY_SIZE (Frank)
 - s/famour/favor/ typo (Frank)
 - Make MAX_DRM_DEVICES a macro - fix vla errors (RobF)
 - Remove left-over dev_path instance (RobF)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com> (v1)
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-08-24 11:50:36 +01:00
Emil Velikov cff80b6c15 Revert "configure: allow building with python3"
This reverts commit ae7898dfdb.

Turns out the python scripts are _not_ fully python 3 compatible.
As Ilia reported using get_xmlpool.py with LANG=C produces some weird
output - see the link for details.

Even though the issue was spotted with the autoconf build, it exposes a
genuine problem with the script (and lack of lang handling of the meson
build.)

https://lists.freedesktop.org/archives/mesa-dev/2018-August/203508.html
2018-08-24 11:14:15 +01:00
Emil Velikov 7a4d2d1fdf Revert "travis: use python3 for the autoconf builds"
This reverts commit 855af9a5a2.

Turns out the python scripts are _not_ fully python 3 compatible.
As Ilia reported using get_xmlpool.py with LANG=C produces some weird
output - see the link for details.

Even though the issue was spotted with the autoconf build, it exposes a
genuine problem with the script (and lack of lang handling of the meson
build.)

https://lists.freedesktop.org/archives/mesa-dev/2018-August/203508.html
2018-08-24 11:10:24 +01:00
Kenneth Graunke 93e8e17fa4 Revert "mesa: bump GL_MAX_ELEMENTS_INDICES and GL_MAX_ELEMENTS_VERTICES"
This reverts commit 095515e16c.

This breaks KHR-GL46.map_buffer_alignment.functional on i965.

This code was apparently not reviewed and I don't know why we would
move from a driver configurable constant to a hardcoded value for all
drivers.  This really looks like an accidental hack push.
2018-08-24 00:36:01 -07:00
Kenneth Graunke 9d670fd86c Revert recent changes about not including compute in combined limits.
As far as I can tell, no one reviewed these changes, they made i965
assert fail on driver load, and I am not certain they are correct.
(Hopefully reverting these does not break radeonsi too badly...)

The uniform related changes seem fine and reasonable, but the texture
image units change is possibly incorrect.  According to the
OES_tessellation_shader spec issue 5:

   (5) How are aggregate shader limits computed?

    RESOLVED: Following the GL 4.4 model, but we restrict uniform
    buffer bindings to 12/stage instead of 14, this results in

        MAX_UNIFORM_BUFFER_BINDINGS = 72
            This is 12 bindings/stage * 6 shader stages, allowing a static
            partitioning of the bindings even though at most 5 stages can
            appear in a program object).
        MAX_COMBINED_UNIFORM_BLOCKS = 60
            This is 12 blocks/stage * 5 stages, since compute shaders can't
            be mixed with other stages.
        MAX_COMBINED_TEXTURE_IMAGE_UNITS = 96
            This is 16 textures/stage * 6 stages.

which definitely is including compute shaders in that last limit.
Not including compute shaders breaks the following test:
dEQP-GLES31.functional.state_query.integer.max_combined_texture_image_units_getinteger

There was enough breakage that I figured we should just send this back
to the drawing board.

Revert "i965: don't include compute resources in "Combined" limits"
Revert "st/mesa: don't include compute resources in "Combined" limits"
Revert "mesa: don't include compute resources in MAX_COMBINED_* limits"

This reverts commit b03dcb1e5f.
This reverts commit cff290df4c.
This reverts commit 45f87a48f9.
2018-08-24 00:36:01 -07:00
Roland Scheidegger 8e1be9a34a gallivm: don't use saturated unsigned add/sub intrinsics for llvm 8.0
These have been removed. Unfortunately auto-upgrade doesn't work for
jit. (Worse, it seems we don't get a compilation error anymore when
compiling the shader, rather llvm will just do a call to a null
function in the jitted shaders making it difficult to detect when
intrinsics vanish.)

Luckily the signed ones are still there, I helped convincing llvm
removing them is a bad idea for now, since while the unsigned ones have
sort of agreed-upon simplest patterns to replace them with, this is not
the case for the signed ones, and they require _significantly_ more
complex patterns - to the point that the recognition is IMHO probably
unlikely to ever work reliably in practice (due to other optimizations
interfering). (Even for the relatively trivial unsigned patterns, llvm
already added test cases where recognition doesn't work, unsaturated
add followed by saturated add may produce atrocious code.)
Nevertheless, it seems there's a serious quest to squash all
cpu-specific intrinsics going on, so I'd expect patches to nuke them as
well to resurface.

Adapt the existing fallback code to match the simple patterns llvm uses
and hope for the best. I've verified with lp_test_blend that it does
produce the expected saturated assembly instructions. Though our
cmp/select build helpers don't use boolean masks, but it doesn't seem
to interfere with llvm's ability to recognize the pattern.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106231
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-08-24 07:50:13 +02:00
Marek Olšák 45b5f5fa25 st/mesa: expose KHR_texture_compression_astc_sliced_3d
This is ASTC 2D LDR allowing texture arrays and 3D, compressing each
slice as a separate 2D image. Tested by piglit. Trivial.
2018-08-24 00:36:18 -04:00
Marek Olšák dae4cf397d st/mesa: expose EXT_disjoint_timer_query
same cap as ARB_timer_query, no changes needed, tested by piglit
2018-08-24 00:36:18 -04:00
Marek Olšák 263c962cfd mesa: expose EXT_vertex_attrib_64bit
because the closed driver exposes it.
It's the same as the ARB extension.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-24 00:36:18 -04:00
Marek Olšák 5c90091036 mesa: expose AMD_query_buffer_object
it's a subset of the ARB extension.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-24 00:36:18 -04:00
Marek Olšák 056b9a5a36 mesa: expose AMD_multi_draw_indirect
because the closed driver exposes it.
This is equivalent to the ARB extension.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-24 00:36:18 -04:00
Marek Olšák b3c17330e6 mesa: expose AMD_gpu_shader_int64
because the closed driver exposes it.

It's equivalent to ARB_gpu_shader_int64.
In this patch, I did everything the same as we do for ARB_gpu_shader_int64.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-24 00:36:18 -04:00
Marek Olšák 1cf3631b9c mesa: expose ARB_post_depth_coverage in the Compatibility profile
It only contains GLSL changes.

v2: allow the layout qualifier on GLSL <= 1.30
2018-08-24 00:36:18 -04:00