Commit Graph

104394 Commits

Author SHA1 Message Date
Emil Velikov 2c049384b1 egl/android: simplify device open/probe
Currently droid_probe_device, does not do any 'probing' but filtering
out a device if it doesn't match the vendor string given.

Rename the function, straighten the return type and call it only as
needed - an actual vendor string is provided.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-08-24 11:52:44 +01:00
Emil Velikov 2f8403a4ca egl/android: remove drmVersion::name NULL check
The name string is guaranteed to be non-NULL.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-08-24 11:52:41 +01:00
Emil Velikov d1211f3112 egl/android: remove droid_probe_driver()
The function name is misleading - it effectively checks if
loader_get_driver_for_fd fails. Which can happen only only on strdup
error - a close to impossible scenario.

Drop the function - we call the loader API at at later stage.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-08-24 11:52:39 +01:00
Emil Velikov 9b5bf7afce egl/android: use strcmp with drmVersion::name
The name string is guaranteed to be NULL terminated. Drop the explicit
length check that comes with strncmp().

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-08-24 11:52:37 +01:00
Emil Velikov 3827966643 egl/android: use drmDevice instead of the manual /dev/dri iteration
Replace the manual handling of /dev/dri in favor of the drmDevice API.
The latter provides a consistent way of enumerating the devices,
providing device details as needed.

v2:
 - Use ARRAY_SIZE (Frank)
 - s/famour/favor/ typo (Frank)
 - Make MAX_DRM_DEVICES a macro - fix vla errors (RobF)
 - Remove left-over dev_path instance (RobF)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com> (v1)
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-08-24 11:50:36 +01:00
Emil Velikov cff80b6c15 Revert "configure: allow building with python3"
This reverts commit ae7898dfdb.

Turns out the python scripts are _not_ fully python 3 compatible.
As Ilia reported using get_xmlpool.py with LANG=C produces some weird
output - see the link for details.

Even though the issue was spotted with the autoconf build, it exposes a
genuine problem with the script (and lack of lang handling of the meson
build.)

https://lists.freedesktop.org/archives/mesa-dev/2018-August/203508.html
2018-08-24 11:14:15 +01:00
Emil Velikov 7a4d2d1fdf Revert "travis: use python3 for the autoconf builds"
This reverts commit 855af9a5a2.

Turns out the python scripts are _not_ fully python 3 compatible.
As Ilia reported using get_xmlpool.py with LANG=C produces some weird
output - see the link for details.

Even though the issue was spotted with the autoconf build, it exposes a
genuine problem with the script (and lack of lang handling of the meson
build.)

https://lists.freedesktop.org/archives/mesa-dev/2018-August/203508.html
2018-08-24 11:10:24 +01:00
Kenneth Graunke 93e8e17fa4 Revert "mesa: bump GL_MAX_ELEMENTS_INDICES and GL_MAX_ELEMENTS_VERTICES"
This reverts commit 095515e16c.

This breaks KHR-GL46.map_buffer_alignment.functional on i965.

This code was apparently not reviewed and I don't know why we would
move from a driver configurable constant to a hardcoded value for all
drivers.  This really looks like an accidental hack push.
2018-08-24 00:36:01 -07:00
Kenneth Graunke 9d670fd86c Revert recent changes about not including compute in combined limits.
As far as I can tell, no one reviewed these changes, they made i965
assert fail on driver load, and I am not certain they are correct.
(Hopefully reverting these does not break radeonsi too badly...)

The uniform related changes seem fine and reasonable, but the texture
image units change is possibly incorrect.  According to the
OES_tessellation_shader spec issue 5:

   (5) How are aggregate shader limits computed?

    RESOLVED: Following the GL 4.4 model, but we restrict uniform
    buffer bindings to 12/stage instead of 14, this results in

        MAX_UNIFORM_BUFFER_BINDINGS = 72
            This is 12 bindings/stage * 6 shader stages, allowing a static
            partitioning of the bindings even though at most 5 stages can
            appear in a program object).
        MAX_COMBINED_UNIFORM_BLOCKS = 60
            This is 12 blocks/stage * 5 stages, since compute shaders can't
            be mixed with other stages.
        MAX_COMBINED_TEXTURE_IMAGE_UNITS = 96
            This is 16 textures/stage * 6 stages.

which definitely is including compute shaders in that last limit.
Not including compute shaders breaks the following test:
dEQP-GLES31.functional.state_query.integer.max_combined_texture_image_units_getinteger

There was enough breakage that I figured we should just send this back
to the drawing board.

Revert "i965: don't include compute resources in "Combined" limits"
Revert "st/mesa: don't include compute resources in "Combined" limits"
Revert "mesa: don't include compute resources in MAX_COMBINED_* limits"

This reverts commit b03dcb1e5f.
This reverts commit cff290df4c.
This reverts commit 45f87a48f9.
2018-08-24 00:36:01 -07:00
Roland Scheidegger 8e1be9a34a gallivm: don't use saturated unsigned add/sub intrinsics for llvm 8.0
These have been removed. Unfortunately auto-upgrade doesn't work for
jit. (Worse, it seems we don't get a compilation error anymore when
compiling the shader, rather llvm will just do a call to a null
function in the jitted shaders making it difficult to detect when
intrinsics vanish.)

Luckily the signed ones are still there, I helped convincing llvm
removing them is a bad idea for now, since while the unsigned ones have
sort of agreed-upon simplest patterns to replace them with, this is not
the case for the signed ones, and they require _significantly_ more
complex patterns - to the point that the recognition is IMHO probably
unlikely to ever work reliably in practice (due to other optimizations
interfering). (Even for the relatively trivial unsigned patterns, llvm
already added test cases where recognition doesn't work, unsaturated
add followed by saturated add may produce atrocious code.)
Nevertheless, it seems there's a serious quest to squash all
cpu-specific intrinsics going on, so I'd expect patches to nuke them as
well to resurface.

Adapt the existing fallback code to match the simple patterns llvm uses
and hope for the best. I've verified with lp_test_blend that it does
produce the expected saturated assembly instructions. Though our
cmp/select build helpers don't use boolean masks, but it doesn't seem
to interfere with llvm's ability to recognize the pattern.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106231
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-08-24 07:50:13 +02:00
Marek Olšák 45b5f5fa25 st/mesa: expose KHR_texture_compression_astc_sliced_3d
This is ASTC 2D LDR allowing texture arrays and 3D, compressing each
slice as a separate 2D image. Tested by piglit. Trivial.
2018-08-24 00:36:18 -04:00
Marek Olšák dae4cf397d st/mesa: expose EXT_disjoint_timer_query
same cap as ARB_timer_query, no changes needed, tested by piglit
2018-08-24 00:36:18 -04:00
Marek Olšák 263c962cfd mesa: expose EXT_vertex_attrib_64bit
because the closed driver exposes it.
It's the same as the ARB extension.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-24 00:36:18 -04:00
Marek Olšák 5c90091036 mesa: expose AMD_query_buffer_object
it's a subset of the ARB extension.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-24 00:36:18 -04:00
Marek Olšák 056b9a5a36 mesa: expose AMD_multi_draw_indirect
because the closed driver exposes it.
This is equivalent to the ARB extension.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-24 00:36:18 -04:00
Marek Olšák b3c17330e6 mesa: expose AMD_gpu_shader_int64
because the closed driver exposes it.

It's equivalent to ARB_gpu_shader_int64.
In this patch, I did everything the same as we do for ARB_gpu_shader_int64.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-24 00:36:18 -04:00
Marek Olšák 1cf3631b9c mesa: expose ARB_post_depth_coverage in the Compatibility profile
It only contains GLSL changes.

v2: allow the layout qualifier on GLSL <= 1.30
2018-08-24 00:36:18 -04:00
Jason Ekstrand 8d8222461f intel/nir: Enable nir_opt_find_array_copies
We have to be a bit careful with this one because we want it to run in
the optimization loop but only in the first brw_nir_optimize call.
Later calls assume that we've lowered away copy_deref instructions and
we don't want to introduce any more.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15176942 -> 15176942 (0.00%)
    instructions in affected programs: 0 -> 0
    helped: 0
    HURT: 0

In spite of the lack of any shader-db improvement, this patch completely
eliminates spilling in the Batman: Arkham City tessellation shaders.
This is because we are now able to detect that the temporary array
created by DXVK for storing TCS inputs is a copy of the input arrays and
use indirect URB reads instead of making a copy of 4.5 KiB of input data
and then indirecting on it with if-ladders.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:47:51 -05:00
Jason Ekstrand 53072582dc nir: Add an array copy optimization
This peephole optimization looks for a series of load/store_deref or
copy_deref instructions that copy an array from one variable to another
and turns it into a copy_deref that copies the entire array.  The
pattern it looks for is extremely specific but it's good enough to pick
up on the input array copies in DXVK and should also be able to pick up
the sequence generated by spirv_to_nir for a OpLoad of a large composite
followed by OpStore.  It can always be improved later if needed.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:47:47 -05:00
Jason Ekstrand a4a9c07549 intel/nir: Use nir_shrink_vec_array_vars
Shader-db results on Kaby Lake:

    total instructions in shared programs: 15177605 -> 15176765 (<.01%)
    instructions in affected programs: 4259 -> 3419 (-19.72%)
    helped: 1
    HURT: 0

    total spills in shared programs: 10954 -> 10855 (-0.90%)
    spills in affected programs: 295 -> 196 (-33.56%)
    helped: 1
    HURT: 0

    total fills in shared programs: 22222 -> 22117 (-0.47%)
    fills in affected programs: 417 -> 312 (-25.18%)
    helped: 1
    HURT: 0

The helped shader is from the OglCSDof synmark test.  On my Kaby Lake
laptop, the actual framerate of the benchmark didn't appear to improve
beyond the noise.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:46:56 -05:00
Jason Ekstrand be8d009908 nir: Add a array-of-vector variable shrinking pass
This pass looks for variables with vector or array-of-vector types and
narrows the type to only the components used.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:46:56 -05:00
Jason Ekstrand 02a5442dd7 intel/nir: Use the new structure and array splitting passes
We call structure splitting once because it is guaranteed to split all
the structures in the entire shader in one go.  We call array splitting
in the loop in case future optimizations turn indirects into direct
dereferences and we can split more arrays.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15177605 -> 15177605 (0.00%)
    instructions in affected programs: 0 -> 0
    helped: 0
    HURT: 0

This is unsurprising because nir_lower_vars_to_ssa already effectively
does structure and array splitting internally.  It doesn't actually
split the variables but it's ability to reason about aliasing in the
presence of arrays and structures and pick out scalars or vectors to be
lowered to SSA values is fairly advanced.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:44:14 -05:00
Jason Ekstrand fa6417495c nir: Add an array splitting pass
This pass looks for array variables where at least one level of the
array is never indirected and splits it into multiple smaller variables.

This pass doesn't really do much now because nir_lower_vars_to_ssa can
already see through arrays of arrays and can detect indirects on just
one level or even see that arr[i][0][5] does not alias arr[i][1][j].
This pass exists to help other passes more easily see through arrays of
arrays.  If a back-end does implement arrays using scratch or indirects
on registers, having more smaller arrays is likely to have better memory
efficiency.

v2 (Jason Ekstrand):
 - Better comments and naming (some from Caio)
 - Rework to use one hash map instead of two

v2.1 (Jason Ekstrand):
 - Fix a couple of bugs that were added in the rework including one
   which basically prevented it from running

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:44:14 -05:00
Jason Ekstrand 26eb077ec4 nir: Add a structure splitting pass
This pass doesn't really do much now because nir_lower_vars_to_ssa can
already see through structures and considers them to be "split".  This
pass exists to help other passes more easily see through structure
variables.  If a back-end does implement arrays using scratch or
indirects on registers, having more smaller arrays is likely to have
better memory efficiency.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:44:14 -05:00
Jason Ekstrand b489998e63 nir/types: Add array_or_matrix helpers
Reviewed-by: Thomas Helland<thomashelland90@gmail.com>
2018-08-23 21:44:14 -05:00
Kenneth Graunke b03dcb1e5f i965: don't include compute resources in "Combined" limits
The combined limits should only include shader stages that can be active
at the same time.  We don't need to include compute.

See also cff290df4c for st/mesa.

Unbreaks i965 from assert failing on driver load since Marek's
45f87a48f9, which dropped the core
Mesa capabilities before adjusting driver limits down to match.
2018-08-23 17:27:27 -07:00
Marek Olšák 9176703788 radeonsi: increase the maximum UBO size to 2 GB
Same as the closed driver.

This causes a failure in GL45-CTS.compute_shader.max, which has a trivial
bug.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák 5693ca865d radeonsi: bump MAX_GS_INVOCATIONS
same as the closed driver

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák d3c1b212bc gallium: add PIPE_CAP_MAX_SHADER_BUFFER_SIZE
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák f6ccd594e7 gallium: add PIPE_CAP_MAX_GS_INVOCATIONS
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák 8c71b70f07 tgsi/ureg: don't call tgsi_sanity when it's too slow
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák 80aecad0ca st/mesa: fix up uniform limits to be able to expose large UBOs
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák cff290df4c st/mesa: don't include compute resources in "Combined" limits
The combined limits should only include shader stages that can be active
at the same time.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák d36af3a9d9 st/mesa: set ctx->Const.SubPixelBits
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák 3867af39f9 glsl: fix error checking against MAX_UNIFORM_LOCATIONS
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák f01338118c mesa: make MaxCombinedUniformComponents 64-bit to allow large UBOs
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák a8b71f2db8 mesa: add ctx->Const.MaxGeometryShaderInvocations
radeonsi wants to report a different value

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák 45f87a48f9 mesa: don't include compute resources in MAX_COMBINED_* limits
5 is the maximum number of shader stages that can be used by 1 execution
call at the same time (e.g. a draw call). The limit ensures that each
stage can use all of its binding points.

Compute is separate and doesn't need the 5x multiplier.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák 095515e16c mesa: bump GL_MAX_ELEMENTS_INDICES and GL_MAX_ELEMENTS_VERTICES
same number as our closed GL driver

v2: don't use MaxArrayLockSize

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák 356ff963ec mesa: remove incorrect change for EXT_disjoint_timer_query
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák 37eee90df7 glapi: actually implement GL_EXT_robustness for GLES
The extension was exposed but not the functions.

This fixes:
    dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.readn_pixels
    dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_nuniformfv
    dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_nuniformiv

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-08-23 16:54:30 -04:00
Kenneth Graunke 578e45ab7b intel/decoder: Decode SFIXED values.
This lets us example SAMPLER_STATE's LOD Bias field, among other things.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-08-23 13:04:53 -07:00
Emil Velikov 855af9a5a2 travis: use python3 for the autoconf builds
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-23 17:00:28 +01:00
Emil Velikov ae7898dfdb configure: allow building with python3
Pretty much all of the scripts are python2+3 compatible.
Check and allow using python3, while adjusting the PYTHON2 refs.

Note:
 - python3.4 is used as it's the earliest supported version
 - python3 chosen prior to python2

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-23 17:00:13 +01:00
Emil Velikov c51e7486d9 bin/git_sha1_gen.py: remove execute bit/shebang
The script is executed explicitly via the build system, that uses
PYTHON/prog_python and equivalent.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-23 17:00:04 +01:00
Eric Engestrom 993a456360 vk/wsi: avoid reading uninitialised memory
It will be ignored by x11_swapchain_result() anyway (because reaching
the `fail` label without setting `result` means the swapchain status was
already a hard error), but the compiler still complains about reading
uninitialised memory.

While at it, drop the unused assignment right before returning.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 14:47:59 +01:00
Eric Engestrom a0f6a11944 egl: drop unused _EGL_BUILT_IN_DRIVER_DRI2
Unused since b174a1ae72 "egl: Simplify the "driver" interface".

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-08-23 14:47:59 +01:00
Samuel Pitoiset 87fbc16e34 radv/gfx9: implement coherent shaders for VK_ACCESS_SHADER_READ_BIT
Single-sample color and single-sample depth (not stencil)
are coherent with shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl
2018-08-23 15:42:56 +02:00
Mathieu Bridon 6027d354d1 bin/install_megadrivers.py: Remove shebang and executable bit
Since the script is never executed directly, but launched by Meson as an
argument to the Python interpreter, those are not needed any more.

In addition, they are the reason this script was missed when I moved the
Meson buildsystem to Python 3, so removing them helps avoiding future
confusion.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-23 12:12:06 +01:00
Mathieu Bridon 8c8fd0bb8e meson: Run the install script with Python 3
The script was being run directly as an executable, and it has a
Python 2 shebang.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-08-23 12:12:06 +01:00