Commit Graph

116735 Commits

Author SHA1 Message Date
Dylan Baker c6d41e7f0b bin/gen_release_notes.py: Return "None" if there are no new features
Which is very likely .Z > 0 releases.

Fixes: 86079447da
       ("scripts: Add a gen_release_notes.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-25 15:46:03 -07:00
Dylan Baker df3d4ad82d bin/gen_release_notes.py: strip '#' from gitlab bugs
If they use the `Fixes: #1` form.

Fixes: 86079447da
       ("scripts: Add a gen_release_notes.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-25 15:46:00 -07:00
Dylan Baker 69f540c017 bin/gen_release_notes.py: fix conditional of bugfix
Previously this would result in the .0 warning be generated for .z > 0
and the .z == 0 would get the other message.

Fixes: 86079447da
       ("scripts: Add a gen_release_notes.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-25 15:45:53 -07:00
Illia Iorin 6b672e342a mesa/main: Ignore filter state for MS texture completeness
After the discussion in
https://github.com/KhronosGroup/OpenGL-API/issues/45
the section 8.17 (texture completeness) of the OpenGL 4.6 core profile
was changed to explicitly say that multisample texture completeness
ignores filter state of the texture.

"Using the preceding definitions, a texture is complete unless any of the
 following conditions hold true:
   ...
  - The minification filter requires a mipmap (is neither NEAREST nor LINEAR),
    the texture is not multisample, and the texture is not mipmap complete.
  - The texture is not multisample; either the magnification filter is not
    NEAREST, or the minification filter is neither NEAREST nor NEAREST_-
    MIPMAP_NEAREST; and any of
    – The internal format of the texture is integer (see table 8.12).
    – The internal format is STENCIL_INDEX.
    – The internal format is DEPTH_STENCIL, and the value of DEPTH_-
      STENCIL_TEXTURE_MODE for the texture is STENCIL_INDEX."

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Signed-off-by: Illia Iorin <illia.iorin@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-25 21:16:23 +00:00
Illia Iorin 71d4ece366 Revert "mesa/main: Fix multisample texture initialize"
This reverts commit a113a42e73.

Per https://github.com/KhronosGroup/OpenGL-API/issues/45 it
was a wrong way to fix the issue.

Signed-off-by: Illia Iorin <illia.iorin@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-25 21:16:23 +00:00
Marek Olšák 88e9042b6c glsl/serialize: optimize for equal offsets in uniform remap tables
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1416

This decreases the shader cache size in the ticket from 1.6 MB to 40 KB.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-25 17:01:26 -04:00
Marek Olšák e90269d90a glsl/serialize: restructure remap table code
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-25 17:01:25 -04:00
Kenneth Graunke f306d07932 nir: Use VARYING_SLOT_TESS_MAX to size indirect bitmasks
MAX_VARYINGS_INCL_PATCH subtracts VARYING_SLOT_VAR0 giving us a size
that's too small, so BITSET_SET writes words out of bounds, corrupting
the stack and causing all kinds of chaos.  VARYING_SLOT_TESS_MAX is
the right value to use here, as it's the largest location.

Closes: 2002
Fixes: ee2050b111 ("nir: Use BITSET for tracking varyings in lower_io_arrays")
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-25 13:29:09 -07:00
Neil Armstrong e919c44c3b Revert "ci: Disable lima until its farm can get fixed."
This reverts commit fb9362c6fb.

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-10-25 20:52:03 +02:00
Jason Ekstrand e2bb7fef94 Revert "mapi: Inline call x86_current_tls."
This reverts commit e137b3a9b7.  It
completely broke 32-bit EGL such that wflinfo can't even run without
crashing.
2019-10-25 11:31:51 -05:00
Jon Turney 2649609ac5
rbug: Fix use of alloca() without #include "c99_alloca.h"
[12/60] Compiling C object 'src/gallium/auxiliary/eb820e8@@gallium@sta/rbug_rbug_texture.c.o'.
FAILED: src/gallium/auxiliary/eb820e8@@gallium@sta/rbug_rbug_texture.c.o
[...]
../src/gallium/auxiliary/rbug/rbug_texture.c: In function 'rbug_send_texture_info_reply':
../src/gallium/auxiliary/rbug/rbug_texture.c:302:21: error: implicit declaration of function 'alloca'; did you mean 'malloc'? [-Werror=implicit-function-declaration]
  uint32_t *height = alloca(sizeof(uint32_t) * height_len);
                     ^~~~~~
                     malloc
../src/gallium/auxiliary/rbug/rbug_texture.c:302:21: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
../src/gallium/auxiliary/rbug/rbug_texture.c:303:20: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
  uint32_t *depth = alloca(sizeof(uint32_t) * height_len);
                    ^~~~~~
cc1: some warnings being treated as errors

Include c99_alloca.h to portably make the alloca() prototype available.

See also: 498d9d0f, adfb9c5c, fc8139b1

Fixes: 6174cba7 ("rbug: fix transmitted texture sizes")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-25 16:04:34 +01:00
Alyssa Rosenzweig f98e9a2771 pan/midgard: Express allocated registers as offsets
Rather than supplying a mask/swizzle to compose with the original, just
supply the offset of the allocated register so we can directly offset
the mask/swizzle, without resorting to composition.

This is simpler, cleaner, and will generalize to non-32-bit.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-25 08:45:39 -04:00
Alyssa Rosenzweig c1d36eb115 pan/midgard: Expose more typesize manipulation routines
These internal mir.c routines will help the RA.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-25 08:45:39 -04:00
Alyssa Rosenzweig 9bba182840 pan/midgard: Add mir_set_bytemask helper
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-25 08:45:39 -04:00
Timur Kristóf 85cc40f7ce st/nine: Fix unused variable warnings in release build.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-25 12:44:44 +02:00
Timur Kristóf f091b02825 st/nine: Fix build with -Werror=empty-body
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1995
Fixes: 8d43e2b2de ("meson: add -Werror=empty-body to disallow `if(x);`")

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-25 12:44:44 +02:00
Timur Kristóf c580f134ae aco: Refactor hazard mitigations, separate pass for GFX10.
GFX10 hazards require a different approach compared to previous
generations, for example it doesn't need s_nop, and most hazards
can't be solved by adding NOPs at all. Also, they are not
resolved by branch instructions.

This commit reorganizes aco_insert_NOPs so that there is now a
separate pass for GFX10. The new GFX10 pass also respects the
control flow of the shader.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-25 10:10:42 +02:00
Timur Kristóf b01847bd94 aco/gfx10: Fix mitigation of VMEMtoScalarWriteHazard.
This commit refines the VMEMtoScalarWriteHazard mitigation, based
upon a closer look at what LLVM does. Also changes the code to
match the structure of the other hazard mitigations.

* The hazard is not only triggered by VMEM, FLAT and GLOBAL
  but also SCRATCH and DS instructions.
* The SMEM/SALU instructions only cause a hazard when they
  write a register that the VMEM/etc. are reading.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-25 10:10:42 +02:00
Timur Kristóf c037ba1bb7 aco/gfx10: Mitigate LdsBranchVmemWARHazard.
There is a hazard caused by there is a branch between a
VMEM/GLOBAL/SCRATCH instruction and a DS instruction.
This commit adds a workaround that avoids the problem.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-25 10:10:42 +02:00
Timur Kristóf 09d676d81a aco/gfx10: Mitigate SMEMtoVectorWriteHazard.
There is a hazard that happens when an SMEM instruction
reads an SGPR and then a VALU instruction writes that same SGPR.
This commit adds a workaround that avoids the problem.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-25 10:10:42 +02:00
Timur Kristóf d6dfce02d0 aco/gfx10: Mitigate VcmpxExecWARHazard.
There is a hazard when a non-VALU instruction reads the EXEC mask
and then a VALU instruction writes the EXEC mask.
This commit adds a workaround that avoids the problem.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-25 10:10:42 +02:00
Timur Kristóf e5a8616973 aco/gfx10: Mitigate VcmpxPermlaneHazard.
Any permlane instruction that follows any VOPC instruction can cause a hazard,
this commit implements a workaround that avoids this causing a problem.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-25 10:10:42 +02:00
Timur Kristóf 99aed688d3 aco/gfx10: Add notes about some GFX10 hazards.
ACO currently mitigates VMEMtoScalarWriteHazard and Offset3fBug
(names from LLVM). There are some bugs that ACO needn't care about.
Just to be on the safe side, add an assertion that makes sure
that we aren't hit by FlatSegmentOffsetBug.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-25 10:10:41 +02:00
Samuel Pitoiset 2bf8a9b337 radv: fix VK_KHR_shader_float_controls dependency on GFX6-7
From the Vulkan spec 1.1.126 :
   "VK_SHADER_FLOAT_CONTROLS_INDEPENDENCE_32_BIT_ONLY_KHR specifies
    that shader float controls for 32-bit floating point can be set
    independently; other bit widths must be set identically to each
    other."

Forgot to update this when I enabled that extension recently.

Fixes dEQP-VK.spirv_assembly.instruction.compute.float_controls.independence_settings.independence_setting

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-25 07:49:20 +02:00
Lepton Wu e137b3a9b7 mapi: Inline call x86_current_tls.
This saves one return and a simple benchmark which calls glGetString
repeatedly on my desktop shows it improves calls per second from 118M
to 128M.

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 23:37:18 +00:00
Lepton Wu a4fec4dd6a virgl: Remove formats with unusual sample count.
Most GPU require the sample count is power of 2. Just remove those
formats with unusual sample count. This decreases dEQP EGL tests run
time a lot.

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-24 23:11:08 +00:00
Kristian H. Kristensen ee2050b111 nir: Use BITSET for tracking varyings in lower_io_arrays
MAX_VARYINGS_INCL_PATCH is greater than 64, so we'll need more that 64
bits (per component) to track which vars have indirects. This pass was
trying to track patch varyings (which start at bit 63) in a separate
64 bit word, but failed to subtract VARYING_SLOT_PATCH0 and accessed
out of bounds.

Do away with the ad-hoc bit mask tracking and just use a BITSET.

Fixes: dEQP-GLES31.functional.tessellation.user_defined_io.per_patch_block.vertex_io_array_size_implicit.triangles
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 15:32:20 -07:00
Dylan Baker 5658b13fb6 docs: update calendar, add news item and link release notes for 19.2.2 2019-10-24 14:12:04 -07:00
Dylan Baker 89d94f5ecc docs: Add sha256 sum for 19.2.2 2019-10-24 14:08:25 -07:00
Dylan Baker 849415d615 docs: Add release notes for 19.2.2 2019-10-24 14:07:30 -07:00
Rob Clark bc67b892d0 freedreno/ir3: handle the progress case
In some cases, in particular when you have things that can be src
modifiers ((abs)/(neg)), once eliminating one mov, there is a
possibility to remove another.  Handle this by re-visiting an
instruction after eliminating a copy on one of it's srcs.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 13:08:56 -07:00
Rob Clark 97b24efd9f freedreno/ir3: remove restrictions on const + (abs)/(neg)
These date back to relatively early days of ir3, when a lot was still
not well understood.  But according to CI (and what I've seen blob
driver do), these are not actually real restrictions.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 13:08:56 -07:00
Rob Clark e665e65f96 freedreno/ir3: allow copy-propagate out of fanout
Now that we fixed the sharp edges that this was papering over, we can
relax the restriction about eliminating a mov coming out of a fanout
(for example from result of texture fetch).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 13:08:56 -07:00
Rob Clark 3ac328875e freedreno/ir3: treat high vs low reg as conversion
This avoids copy-propagating a high register into an instruction which
cannot consume it.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 13:08:56 -07:00
Rob Clark 9e211b57b8 freedreno/ir3: propagate dest flags for collect/fanin
We did this properly already for split/fanout.  But collect was missed.
Extract out a helper to share.

This way we avoid copy propagating a mov from high or half reg into an
instruction which cannot consume a high/half reg.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 13:08:56 -07:00
Rob Clark 49ab94694d freedreno/ir3: make high regs easier to see in IR dumps
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 13:08:56 -07:00
Rob Clark 0f395f0933 freedreno/ir3: debug cleanup
1) deduplicate IR3_SHADER_DEBUG=disasm versus fs/vs/etc handling
2) standardize shader stage name prints, in particular VERT vs BVERT
3) don't mix stderr and stdout

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 13:08:56 -07:00
Caio Marcelo de Oliveira Filho d31f415ba0 spirv: Add helper to find args of Image Operands
Avoid keeping track of the idx and all possible image operands for
each operation.  Note for convenience we split up the handling of
ImageOperandsOffsetMask and ImageOperandsConstOffsetMask.

Suggested by Jason Ekstrand.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho c7d8fe2f0d spirv: Check that only one offset is defined as Image Operand
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho d27b853c08 spirv: Add imageoperands_to_string helper
Change the information to also include the category, so that the
particulars of BitEnum enumeration can be handled in the template.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho 06aecb14c0 anv: Implement VK_KHR_vulkan_memory_model
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho b8784fe652 spirv: Handle MakePointerAvailable/Visible
Emit barriers with semantics matching the access operand and the
storage class of the pointer.

v2: Fix order of visible / available emission relative to the
    operations.  (Bas)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho 129c85c28b spirv: Handle MakeTexelAvailable/Visible
Set the memory semantics and scope for later emitting the barrier.
Note the barrier emission code already exist in vtn_handle_image for
the Image atomics.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho c649e64edc spirv: Add option to emit scoped memory barriers
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho c022043102 spirv: Add SpvMemoryModelVulkan and related capabilities
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho 1bb191a0d1 spirv: Emit memory barriers for atomic operations
Add a helper to split the memory semantics into before and after the
operation, and use that result to emit memory barriers.

v2: Be more explicit about which bits we are keeping around when
    splitting memory semantics into a before and after.  For now
    we are ignoring Volatile.  (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho d6992f996b spirv: Parse memory semantics for atomic operations
Including the right storage memory semantic based on the storage class
of the operation.  These will be used later to emit memory barriers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho e142061399 intel/fs: Implement scoped_memory_barrier
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho 901071044e nir/tests: Add copy propagation tests with scoped_memory_barrier
Three groups of tests, effectively defining what cases the
optimization is allowed or prevented

- Redudant loads       (a load  generated the value)
- Propagate SSA values (a store generated the value)
- Propagate a var      (a copy  generated the value)

Change the shader type of the tests to be COMPUTE so
nir_var_mem_shared can also be used.  Doesn't affect the semantic of
the copy propagation.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho 73572abc2a nir: Add scoped_memory_barrier intrinsic
Add a NIR instrinsic that represent a memory barrier in SPIR-V /
Vulkan Memory Model, with extra attributes that describe the barrier:

- Ordering: whether is an Acquire or Release;
- "Cache control": availability ("ensure this gets written in the memory")
  and visibility ("ensure my cache is up to date when I'm reading");
- Variable modes: which memory types this barrier applies to;
- Scope: how far this barrier applies.

Note that unlike in SPIR-V, the "Storage Semantics" and the "Memory
Semantics" are split into two different attributes so we can use
variable modes for the former.

NIR passes that took barriers in consideration were also changed

- nir_opt_copy_prop_vars: clean up the values for the mode of an
  ACQUIRE barrier.  Copy propagation effect is to "pull up a load" (by
  not performing it), which is what ACQUIRE restricts.

- nir_opt_dead_write_vars and nir_opt_combine_writes: clean up the
  pending writes for the modes of an RELEASE barrier.  Dead writes
  effect is to "push down a store", which is what RELEASE restricts.

- nir_opt_access: treat the ACQUIRE and RELEASE as a full barrier for
  the modes.  This is conservative, but since this is a GL-specific
  pass, doesn't make a difference for now.

v2: Fix the scoped barrier handling in copy propagation.  (Jason)
    Add scoped barrier handling to nir_opt_access and
    nir_opt_combine_writes.  (Rhys)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:55 -07:00