97968 Commits

Author SHA1 Message Date
Dylan Baker 33627d23d0 meson: add logic to select apple and windows dri
This is still not fully correct (Haiku and BSD are notably probably not
correct), but Linux is not regressed and this should be correct for
macOS and Windows.

v2: - set the dri_platform to windows on Cygwin as well (Jon)
v3: - Add a better todo for Hurd (Eric)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:47:43 -08:00
Dylan Baker 2d1a3bf657 meson: Fix LLVM requires for radeonsi
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:47:43 -08:00
Dylan Baker 48f64e591f meson: convert llvm option to tristate
This option has been acting as a strange sort of half-tri state anyway.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:47:43 -08:00
Dylan Baker 4b61b07e4b meson: Convert platform to auto
This is necessary to support operating systems other than the *nix
family (excluding macOS). For Linux nothing has changed, the defaults
are still the same.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:47:43 -08:00
Dylan Baker b5d98a101b meson: Remove duplicate _GNU_SOURCE
There is one provided unconditionally, and one guarded by platform ==
linux. Remove the unconditional one.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:47:43 -08:00
Dylan Baker 9c3e894ebe meson: Remove completed or irrelevant TODO comments
These are all either done already, or are autotools specific. The
misspelled gallium G3DVL is the autotools specific bit, meson is
handling that via build_by_default.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:46:00 -08:00
Dylan Baker e89842ebbc meson: Fix TODO for missing dl_iterate_phdr function
This function is required for both the Intel "Anvil" vulkan driver and
the i965 GL driver. Error out if either of those is enabled but this
function isn't found.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:46:00 -08:00
Dylan Baker 2d62fc0646 meson: disable x86 asm in fewer cases.
This patch allows building asm for x86 on x86_64 platforms, when the
operating system is the same. Previously cross compile always turned off
assembly. This allows using a cross file to cross compile x86 binaries
on x86_64 with asm.

This could probably be relaxed further thanks to meson's "exe_wrapper",
which is a way to specify an emulator or compatibility layer (wine) that
can run the foreign binaries on the build system. Since the meson build
at this point only supports building on Linux I can't test this and I
don't want to write/enable code that cannot even be build tested.

v4: - set condition to build == x86_64 and host == x86 and
      build.system == host.system

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:46:00 -08:00
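
The v4 condition in 2d62fc0646 can be read as a small predicate. The sketch below is written in C rather than meson, and the `machine` struct, its fields and `can_build_x86_asm_cross` are hypothetical names used only for illustration; it is not the code from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical description of a machine: CPU family and OS name. */
struct machine {
    const char *cpu_family; /* e.g. "x86", "x86_64" */
    const char *system;     /* e.g. "linux" */
};

/* Cross-compile case from the v4 note: asm stays enabled only when the
 * build machine is x86_64, the host is x86, and both run the same OS. */
static bool can_build_x86_asm_cross(const struct machine *build,
                                    const struct machine *host)
{
    assert(build != NULL && host != NULL);
    return strcmp(build->cpu_family, "x86_64") == 0 &&
           strcmp(host->cpu_family, "x86") == 0 &&
           strcmp(build->system, host->system) == 0;
}
```

With an x86_64 Linux build machine, an x86 Linux host passes the check, while an x86 Windows host does not.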
Dylan Baker 84486f6462 meson: Enable SSE4.1 optimizations
This patch checks for and then enables SSE4.1 optimizations if the
host machine will be x86/x86_64.

v2: - Don't compile code, it's unnecessary since we require a compiler
      which always has SSE4.1 (Matt)
v3: - x64 -> x86_64 (Matt)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-22 12:46:00 -08:00
Eric Anholt 6a78416dab broadcom/vc5: Fix BASE_LEVEL handling with txl.
The HW doesn't add the base level anywhere (the min/max lod clamping is
what does base level), so we need to add it manually in this case.

Fixes piglit tex-miplevel-selection *Lod 2D.
2017-11-22 10:56:31 -08:00
Eric Anholt c55813c22e broadcom/vc5: Fix array texture layer count setup.
Fixes piglit array-texture.
2017-11-22 10:56:31 -08:00
Eric Anholt ad1521d708 broadcom/vc5: Don't increment primitive queries while they're paused.
Fixes ext_transform_feedback-generatemipmap prims_generated
2017-11-22 10:56:31 -08:00
Eric Anholt 1214c2ea2a broadcom/vc5: Fix incorrect padding of TF outputs.
After the first output, we were padding by an extra size of the previous
output.  Fixes piglit ext_transform_feedback-output-type mat4x3[2] and
friends.
2017-11-22 10:56:31 -08:00
Eric Anholt b18840ac6e broadcom/vc5: Fix UIF surface size setup for ARB_fbo's mismatched sizes.
The HW was computing an implicit height for the surface based on the image
size, but that may be smaller than the surface with ARB_fbo mismatched
sizes.  In that case, we need to tell it about the pad, either with the
little 4-bit field in the RT config, or the extended field in
CLEAR_COLORS_PART3.

Fixes piglit arb_framebuffer_object-mixed-buffer-sizes.
2017-11-22 10:56:31 -08:00
Wladimir J. van der Laan 9f162fa107 etnaviv: Put HALTI level in specs
The HALTI level is an indication of the gross architecture of the GPU.
It determines in large part what feature level the GPU has, what
state (especially frontend state) is there, and where it is located.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-11-22 14:42:06 +01:00
Wladimir J. van der Laan 391c958f08 etnaviv: Const-correctness etnaviv_emit.h
The relocation structure is never changed by submitting it.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-11-22 14:42:00 +01:00
Juan A. Suarez Romero 1b0638c65f meson: add si_driinfo.h in libgallium_dri
v2: generate target conditionally (Dylan)

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-22 12:35:38 +01:00
Iago Toral Quiroga a217cbd7ec nir/gather_info: recognize load_patch_vertices_in as a system value
This intrinsic is produced to load SYSTEM_VALUE_VERTICES_IN, which is
generated to load gl_PatchVerticesIn in the SPIR-V path for both
Vulkan and OpenGL.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-22 08:03:55 +01:00
Jordan Justen 386f6cd041 i965: Support decoding INTERFACE_DESCRIPTOR_DATA with INTEL_DEBUG=bat
This will dump the INTERFACE_DESCRIPTOR_DATA along with the associated
samplers & surfaces.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-11-21 12:11:57 -08:00
Kristian H. Kristensen 24609377f9 intel/genxml: Add helpers for determining field type
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-21 11:15:06 -08:00
Matt Turner beaea7abfa i965/fs: Check ADD/MAD with immediates in satprop unit test
The gen had to be changed from 4 to 6 so that we could test MAD, which
is new on Gen6.

mad_imm_float_neg_mov_sat tests the case fixed by the previous commit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-11-21 10:13:07 -08:00
Matt Turner a05af1f7b8 i965/fs: Handle negating immediates on MADs when propagating saturates
MADs don't take immediate sources, but we allow them in the IR since it
simplifies a lot of things. I neglected to consider that case.

Fixes: 4009a9ead4 ("i965/fs: Allow saturate propagation to propagate
                      negations into MADs.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103616
Reported-and-Tested-by: Ruslan Kabatsayev <b7.10110111@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-11-21 10:13:07 -08:00
Juan A. Suarez Romero ce221cbbcf mesa/teximage: add TEXTURE_CUBE_MAP_ARRAY target for CompressedTexImage3D
From section 8.7, page 179 of OpenGL ES 3.2 spec:

  An INVALID_OPERATION error is generated by CompressedTexImage3D
  if internalformat is one of the formats in table 8.17 and target
  is not TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY or TEXTURE_3D.

  An INVALID_OPERATION error is generated by CompressedTexImage3D if
  target is TEXTURE_CUBE_MAP_ARRAY and the “Cube Map Array”
  column of table 8.17 is not checked, or if target is
  TEXTURE_3D and the “3D Tex.” column of table 8.17 is not checked.

So far it was only considering TEXTURE_2D_ARRAY as valid target. But as
"Cube Map Array" column is checked for all the cases, in practice we can
consider also TEXTURE_CUBE_MAP_ARRAY.

This fixes KHR-GLES32.core.texture_cube_map_array.etc2_texture

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-11-21 13:05:42 +01:00
Tapani Pälli 6236ffeb83 intel: fix disasm_info memory leaks
Fixes: 4f82b17287 ("i965: Rewrite disassembly annotation code")
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-11-21 08:36:43 +02:00
Timothy Arceri 04a9558497 st/glsl_to_nir: don't generate nir twice for gs
This was left out of c980a3aa31

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-21 15:57:39 +11:00
Roland Scheidegger b5957cee92 llvmpipe: fix snorm blending
The blend math gets a bit funky due to inverse blend factors being
in range [0,2] rather than [-1,1]; our normalized math can't really
cover this.
src_alpha_saturate blend factor has a similar problem too.
(Note that piglit fbo-blending-formats test is mostly useless for
anything but unorm formats, since not just all src/dst values are
between [0,1], but the tests are crafted in a way that the results
are between [0,1] too.)

v2: some formatting fixes, and fix a fairly obscure (to debug)
issue with alpha-only formats (not related to snorm at all), where
blend optimization would think it could simplify the blend equation
if the blend factors were complementary, however was using the
completely unrelated rgb blend factors instead of the alpha ones...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-11-21 04:06:29 +01:00
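
The range problem described in b5957cee92 can be seen with a little arithmetic: a signed-normalized factor lies in [-1,1], so its inverse (one minus the factor) lies in [0,2]. The sketch below only illustrates that range; `snorm8_to_float` and `inverse_factor` are hypothetical helpers, not llvmpipe functions.

```c
#include <assert.h>
#include <stdint.h>

/* Decode an 8-bit signed-normalized value to [-1, 1]
 * (-128 and -127 both map to -1). */
static float snorm8_to_float(int8_t v)
{
    float f = (float)v / 127.0f;
    return f < -1.0f ? -1.0f : f;
}

/* Inverse blend factor, e.g. ONE_MINUS_SRC_ALPHA. For a factor in
 * [-1, 1] the result lies in [0, 2]. */
static float inverse_factor(float factor)
{
    assert(factor >= -1.0f && factor <= 1.0f);
    return 1.0f - factor;
}
```

A factor of -1 gives an inverse of 2, which has no representation in a format whose values stop at 1.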
Dave Airlie 464c2d8083 r600: add cull distance support
This passes all the tests in piglit.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-21 09:00:52 +10:00
Aravindan Muthukumar 971b3c019b i965: Optimize bucket index calculation
Reducing Bucket index calculation to O(1).

This algorithm calculates the index using matrix method.  Assuming
PAGE_SIZE is 4096, matrix arrangement is as below:

          1*4096   2*4096    3*4096    4*4096
          5*4096   6*4096    7*4096    8*4096
          10*4096  12*4096   14*4096   16*4096
          20*4096  24*4096   28*4096   32*4096
           ...      ...       ...       ...
           ...      ...       ...       ...
           ...      ...       ...   max_cache_size

From this matrix it is clear that every row follows the pattern below:

          ...       ...       ...        n
        n+(1/4)n  n+(1/2)n  n+(3/4)n    2n

The row is calculated as log2(size/PAGE_SIZE). The column is calculated
by converting the difference between the elements to fit into a
power-of-two size and indexing it.

Final Index is (row*4)+(col-1)

Tested with Intel Mesa CI.

Improves performance of 3DMark on BXT by 0.705966% +/- 0.229767% (n=20)

v4: Review comments on style and code comments implemented (Ian).
v3: Review comments implemented (Ian).
v2: Review comments implemented (Jason).

Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Signed-off-by: Kedar Karanje <kedar.j.karanje@intel.com>
Reviewed-by: Yogesh Marathe <yogesh.marathe@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2017-11-20 14:52:42 -08:00
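
The matrix in 971b3c019b can be turned into a small index function. This is a minimal sketch under stated assumptions, not the code from the commit: `bucket_index` and `PAGE_SIZE_BYTES` are hypothetical names, sizes are rounded up to whole pages, and the row is found with a short doubling loop for portability, where a count-leading-zeros instruction would make that step constant time.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u /* assumed page size, as in the message */

/* Map a size in bytes to its bucket index, (row * 4) + (col - 1).
 * Rows end at 4, 8, 16, 32, ... pages; each row has four columns. */
static unsigned bucket_index(uint64_t size)
{
    assert(size > 0);
    const uint64_t pages = (size + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;

    /* Find the row whose largest bucket can hold `pages`. */
    unsigned row = 0;
    uint64_t row_max = 4;
    while (row_max < pages) {
        row_max *= 2;
        row++;
    }

    /* Row 0 starts from zero; later rows start after the previous maximum. */
    const uint64_t prev_max = (row == 0) ? 0 : row_max / 2;
    /* Step between neighbouring buckets in this row: 1, 1, 2, 4, ... pages. */
    const uint64_t col_size = (row_max - prev_max) / 4;
    /* Round the offset into the row up to a whole column, giving 1..4. */
    const unsigned col =
        (unsigned)((pages - prev_max + col_size - 1) / col_size);

    return row * 4 + (col - 1);
}
```

For example, a request of 11 pages lands in the third row and rounds up to the 12*4096 bucket, which is index 9.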
Dylan Baker c8417c8d25 meson: Guard the gallium dri component
Currently the target has a redundant guard, and the state tracker isn't
properly guarded.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-20 14:28:31 -08:00
Dylan Baker 689fb74716 meson: don't build gallium subdir unless we're building gallium
This will allow us to simplify some guards within the gallium directory.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-20 14:28:31 -08:00
Eric Anholt 494effd242 broadcom/vc5: Align 1D texture miplevels to 64b.
Fixes tex-miplevel-selection GL2:texture() 1D
2017-11-20 13:54:45 -08:00
Eric Anholt 9d5972da80 broadcom/vc5: Clamp min lod to the last level.
Otherwise, the simulator would complain in tex-miplevel-selection that the
min/max clamp was out of order.  The actual HW seems to have clamped to
the max anyway.
2017-11-20 13:52:33 -08:00
Eric Anholt 2c8913e224 broadcom/vc5: Increase simulator memory for tex-miplevel-selection.
We were overflowing, because of all the little 4k allocations for CLs that
were getting expanded to 128kb in the simulator due to the GMP alignment.
2017-11-20 13:52:33 -08:00
Tim Rowley 34838c2212 swr/rast: Repair simd8 frontend code rot
Keep non-default simd8 frontend code running for comparison purposes.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:51:10 -06:00
Tim Rowley 005d937e15 swr/rast: Implement AVX-512 GATHERPS in SIMD16 fetch shader
Disabled for now.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:51:06 -06:00
Tim Rowley 2e244c7168 swr/rast: Simplify GATHER* jit builder api
General cleanup, and prep work for possibly moving to llvm masked
gather intrinsic.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:51:01 -06:00
Tim Rowley 44025def06 swr/rast: Add alignment to transpose targets
Needed to ensure alignment for avx512.

Fixes address sanitizer crash.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:50:56 -06:00
Tim Rowley bc356b0fc0 swr/rast: Cache eventmanager
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:50:51 -06:00
Tim Rowley 395a298fa5 swr/rast: Enable AVX-512 targets in the jitter
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:50:45 -06:00
Tim Rowley 37bb69fb88 swr/rast: Points with clipdistance can't go through simplepoints path
Fixes piglit glsl-1.20:vs-clip-vertex-primitives and
glsl-1.30:vs-clip-distance-primitives.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:50:38 -06:00
Tim Rowley d9de8f3122 swr/rast: Code style change (NFC)
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:50:29 -06:00
Tim Rowley 08512c52de swr/rast: Widen fetch shader to SIMD16
Widen fetch shader to SIMD16, enable SIMD16 types in the jitter,
and provide EXTRACT/INSERT SIMD8 <-> SIMD16 utility functions.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:50:23 -06:00
Tim Rowley e612231f20 swr/rast: Support flexible vertex layout for DS output
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20 13:49:59 -06:00
Nicolai Hähnle 3f17d3c017 gallium/u_threaded: avoid syncing in threaded_context_flush
We could always do the flush asynchronously, but if we're going to wait
for a fence anyway and the driver thread is currently idle, the additional
communication overhead isn't worth it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:16:15 +01:00
Nicolai Hähnle bc65dcab3b radeonsi: avoid syncing the driver thread in si_fence_finish
It is really only required when we need to flush for deferred fences.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:16:11 +01:00
Nicolai Hähnle 3db1ce01b1 radeonsi: recompute the relative timeout after waiting for ready fence
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:16:06 +01:00
Nicolai Hähnle f5ea8d18ff ddebug: fix the hang detection timeout calculation
Fixes: c9fefa062b ("ddebug: rewrite to always use a threaded approach")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:16:03 +01:00
Nicolai Hähnle 16f8da2997 ddebug: fix use-after-free of streamout targets
Fixes: b47727a83a ("ddebug: implement pipelined hang detection mode")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:16:00 +01:00
Nicolai Hähnle aaebf49eba gallium/u_threaded: properly initialize fence unflushed tokens
This got lost in a rebase but never hurt anything because we happened
to always sync in fence_finish anyway...

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:15:56 +01:00
Nicolai Hähnle 81aabb20f3 util/u_queue: really use futex-based fences
The relevant define changed in the final revision of the simple mutex
patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-20 18:15:53 +01:00