Commit Graph

115626 Commits

Author SHA1 Message Date
Samuel Pitoiset 46b7512b0a radv: fix writing depth/stencil clear values to image
Use the fastest way only if both aspects are used. Oops.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111728
Fixes: 218ce34962 ("radv: add mipmap support for the clear depth/stencil values")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-18 13:27:46 +02:00
Michel Dänzer 88e5796daa gitlab-ci: Merge scons-nollvm and scons-llvm jobs
The new job tests scons without LLVM and with all LLVM versions >= 6.0.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 10:36:48 +00:00
Michel Dänzer baa5024e24 gitlab-ci: Test scons with all LLVM versions
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 10:36:48 +00:00
Michel Dänzer 0374aacac0 gitlab-ci: Move scons build/test commands to a separate shell script
Preparatory, no functional change intended.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 10:36:48 +00:00
Michel Dänzer 8a8388ca67 gitlab-ci: Use crossbuild-essential-* packages
They are convenience packages which pull in everything needed for
cross-building via dependencies.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 10:36:48 +00:00
Michel Dänzer a01230e73a gitlab-ci: Use newer packages from backports by default
This is needed in particular to get a recent enough version of meson in
the stretch image, but should be generally beneficial.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 10:36:48 +00:00
Michel Dänzer 8a19992869 gitlab-ci: Create separate docker images for Debian stretch & buster
Pros:
* Less fragile due to not mixing packages from stretch and buster
* No longer need to use third-party LLVM packages
* The buster image now uses GCC 8 for C++ as well (previously 6 for C++,
  8 for C), allowing to drop some hacks

Con:
* The stretch image now only uses GCC 6 for C as well as C++
* Need separate jobs for testing old LLVM versions

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 10:36:48 +00:00
Michel Dänzer 26fcc8baba gitlab-ci: Pass --no-remove to apt-get where possible
If installing new packages would require removing previously installed
ones, this flag causes apt-get to abort with an error instead,
preventing later obscure failures due to the missing packages.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 10:36:48 +00:00
Michel Dänzer 2259b45174 gitlab-ci: Reference full ci-templates commit hash
8 digits might become ambiguous at some point.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 10:36:48 +00:00
Haihao Xiang 8a9b81ab9d i965: support AYUV/XYUV for external import only
Fixes: 89785e2d56 ("i965: add support for sampling from AYUV")
Fixes: 7cab8d3661 ("i965: Add support for sampling from XYUV images")
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-09-18 12:07:23 +03:00
Boris Brezillon 1e483a87bc panfrost: Allocate tiler and scratchpad BOs per-batch
If we want to execute several batches in parallel they need to have
their own tiler and scratchpad BOs. Let move those objects to
panfrost_batch and allocate them on a per-batch basis.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:40:17 +02:00
Boris Brezillon 0eec73a800 panfrost: Add FBO BOs to batch->bos earlier
If we want the batch dependency tracking to work correctly we must
make sure all BOs are added to the batch->bos set early enough. Adding
FBO BOs when generating the fragment job is clearly to late. Add a
panfrost_batch_add_fbo_bos helper and call it in the clear/draw path.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:37:56 +02:00
Boris Brezillon 5a4d095f9b panfrost: Add the panfrost_batch_create_bo() helper
This helper automates the panfrost_bo_create()+panfrost_batch_add_bo()+
panfrost_bo_unreference() sequence that's done for all per-batch BOs.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:37:31 +02:00
Boris Brezillon 9af4aeaaf7 panfrost: Don't return imported/exported BOs to the cache
We don't know who else is using the BO in that case, and thus shouldn't
re-use it for something else.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:35:52 +02:00
Boris Brezillon 90b8934547 panfrost: Add panfrost_bo_{alloc,free}()
Thanks to that we avoid the recursive call into panfrost_bo_create()
and we can get rid of panfrost_bo_release() by inlining the code in
panfrost_bo_unreference().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:35:29 +02:00
Boris Brezillon cb71ae5572 panfrost: Stop using panfrost_bo_release() outside of pan_bo.c
panfrost_bo_unreference() should be used instead.

The only difference caused by this change is that the scratchpad,
tiler_heap and tiler_dummy BOs are now returned to the cache instead
of being freed when a context is destroyed. This is only a problem if
we care about context isolation, which apparently is not the case since
transient BOs are already returned to the per-FD cache (and all contexts
share the same address space anyway, so enforcing context isolation
is almost impossible).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:35:06 +02:00
Boris Brezillon e15ab939fd panfrost: Stop passing screen around for BO operations
Store a screen pointer in panfrost_bo so we don't have to pass a screen
object to all functions manipulating the BO.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:34:27 +02:00
Boris Brezillon 10ce751726 panfrost: Don't check if BO is mmaped before calling panfrost_bo_mmap()
panfrost_bo_mmap() already takes care of that.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:34:08 +02:00
Boris Brezillon a06e08def9 panfrost: Stop exposing panfrost_bo_cache_{fetch,put}()
They are not expected to be called directly, users should use
panfrost_bo_{create,release}() instead.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:33:51 +02:00
Boris Brezillon 154cb725d4 panfrost: Move the BO API to its own header
Right now, the BO API is spread over pan_{allocate,resource,screen}.h.
Let's move all BO related definitions to a separate header file.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:29:13 +02:00
Boris Brezillon 34efaafc93 panfrost: s/PAN_ALLOCATE_/PAN_BO_/
Change the prefix for BO allocation flags to make it consistent with
the rest of the BO API.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:28:55 +02:00
Boris Brezillon 29d0e5c177 panfrost: Move panfrost_bo_{reference,unreference}() to pan_bo.c
This way we have all BO related functions placed in the same source
file.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:28:39 +02:00
Boris Brezillon 0500c9e514 panfrost: Get rid of pan_drm.c
pan_drm.c was only meaningful when we were supporting 2 kernel drivers
(mali_kbase, and the drm one). Now that there's now kernel-driver
abstraction we're better off moving those functions were they belong:

* BO related functions in pan_bo.c
* fence related functions + query_gpu_version() in pan_screen.c
* submit related functions in pan_job.c

While at it, we rename the functions according to the place they're
being moved to.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:28:22 +02:00
Boris Brezillon 1e47c3ee7b panfrost: Stop passing has_draws to panfrost_drm_submit_vs_fs_batch()
has_draws can be inferred directly from the batch->last_job value, no
need to pass it around.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:28:03 +02:00
Boris Brezillon 07085fe8a4 panfrost: Kill a useless memset(0) in panfrost_create_context()
ctx is allocated with rzalloc() which takes care of zero-ing the memory
region. No need to call memset(0) on top.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:27:47 +02:00
Boris Brezillon 4eac1b2008 panfrost: Add polygon_list to the batch BO set at allocation time
That's what we do for other per-batch BOs, and we'll soon add an helper
to automate this create_bo()+add_bo()+bo_unreference() sequence, so
let's prepare the code to ease this transition.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:27:30 +02:00
Boris Brezillon c16fb1f48d panfrost: Add missing panfrost_batch_add_bo() calls
Some BOs are used by batches but never explicitly added to the BO set.
This is currently not a problem because we wait for the execution of
a batch to be finished before releasing a BO, but we will soon relax
this rule.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:27:09 +02:00
Boris Brezillon a94d028065 panfrost: Use the correct type for the bo_handle array
The DRM driver expects an array of u32, let's use the correct type, even
if using an int works in practice because it's still a 32-bit integer.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:26:49 +02:00
Boris Brezillon 2b771b8424 panfrost: Stop exposing internal panfrost_*_batch() functions
panfrost_{create,free,get}_batch() are only called inside pan_job.c.
Let's make them static.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:26:21 +02:00
Christian Gmeiner 8d5f905faa etnaviv: disable ARB_shadow
Looks like only HALT2 GPUs have support for it but that is not yet
implemented so disable ARB_shadow for now.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-18 06:47:26 +02:00
Christian Gmeiner dcc0e23438 Revert "gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP"
There are GPUs that do not support this feature.

This reverts commit e871abe452

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-18 06:47:21 +02:00
Lepton Wu 417d602fda virgl: Remove wrong EAGAIN handling for drmIoctl
drmIoctl handles EAGAIN itself and actually it always return -1 on errors.
Remove the wrong handling of its return value. Also, print a warning when
it fails.

v2: - use _debug_printf instead of fprintf (Gurchetan Singh)

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
2019-09-18 03:36:10 +00:00
Kenneth Graunke f8c44e4ed7 iris: Skip allocating a null surface when there are 0 color regions.
The compiler now sets the "Null Render Target" bit in the RT write
extended message descriptor, causing it to write to an implicit null
surface without us needing to set one up in the binding table.

Together with the last patch, this improves performance in Car Chase on
an Icelake 8x8 (locked to 700Mhz) by 0.0445526% +/- 0.0132736% (n=832).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 14:27:51 -07:00
Kenneth Graunke f76a724e06 intel/compiler: Set "Null Render Target" ex_desc bit on Gen11
When there are no color regions (i.e. a depth only pass), we can set
the "Null Render Target" bit in the Gen11 RT write extended message
descriptor to indicate that it should behave as if it's writing to a
null render target, without the need for a binding table entry.

This lets drivers avoid setting up that null RT binding table entry,
but more importantly means the HW doesn't actually have to bother
looking up the surface state.

Together with the next patch, this improves performance in Car Chase on
an Icelake 8x8 (locked to 700Mhz) by 0.0445526% +/- 0.0132736% (n=832).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 14:27:51 -07:00
Samuel Iglesias Gonsálvez f55f7b199d docs/relnotes: add support for VK_KHR_shader_float_controls on Intel
v2:
- Move to 19.2.0 release notes (Andres).

v3:
- Move to 19.3.0 release notes (Andres).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez f5dd6dfe01 anv: enable VK_KHR_shader_float_controls and SPV_KHR_float_controls
This adds support for
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT_CONTROLS_PROPERTIES_KHR and
enables de Vulkan and SPIR-V extensions.

Also, notice that this includes the updates applied to the
VkPhysicalDeviceFloatControlsPropertiesKHR structure in the extension
VK_KHR_shader_float_controls v4 and Vulkan 1.1.116.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez 9b07020a4f i965/fs: add support for shader float control to remove_extra_rounding_modes()
The remove_extra_rounding_modes() optimization will remove duplicated
rounding mode changes.

v2:
- Fix bug in the rounding mode change (Alejandro).

v3:
- Fix rounding modes.

v4:
- Updated to renamed shader info member and enum values (Andres).

v5:
- Simplify flags logic operations (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez 9bd88d10d8 i965/fs: set rounding mode when emitting nir_op_f2f32 or nir_op_f2f16
v2:
- Consider nir_op_f2f16 case too (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez ba1e25e1aa i965/fs: set rounding mode when emitting fadd, fmul and ffma instructions
v2:
- Updated to renamed shader info member (Andres).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez 9da56ffc52 i965/fs: add emit_shader_float_controls_execution_mode() and aux functions
We need this function to emit code that setups the control register
later with the defined execution mode for the shader. Therefore, we
emit it as the first instruction.

v2:
- Fix bug in setting the default mode mask in brw_rnd_mode_from_nir().
- Fix support for rounding modes in brw_rnd_mode_from_nir().

v3:
- Updated to renamed shader info member and enum values (Andres).

v4:
- Add actual emission as first instruction of emit_nir_code (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez 8a6507b6fe i965/fs/generator: add new opcode to set float controls modes in control register
Before this commit, we had only FPRoundingMode decoration (the per
instruction one) that is applied during the SPIR-V handling. In
vtn_alu we find out the rounding mode, and generate the code
accordingly that later will be used to look for the respective
nir_op_f2f16_{rtz,rtne}.

Per-instruction gets prioritized because we make them explicit
conversions (with RTZ or RTNE nir opcodes) and they will override the
default execution mode defined with float controls. However, we need
to come back to the mode defined by float controls after the execution
of the FP Rounding instruction.

Therefore, the new SHADER_OPCODE_FLOAT_CONTROL_MODE opcode will be
used to set the default rounding mode and denorms treatment in the
whole shader while the pre-existent SHADER_OPCODE_RND_MODE, will be
used as prioritized rounding mode in a per-instruction basis.

v2:
- Fix bug in defining BRW_CR0_FP_MODE_MASK.

v3:
- Update comment (Caio).

v4:
- Split the patch into the helper and the new opcode (this
  one) (Caio).

v5:
- Add an explanation on the actual purpose and priority of the newly
  introduced opcode in the commit log (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez 28da9558f5 i965/fs/generator: refactor rounding mode helper in preparation for float controls
v2:
- Fix bug in defining BRW_CR0_FP_MODE_MASK.

v3:
- Update comment (Caio).

v4:
- Split the patch into the helper (this one) and the new
  opcode (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez cdace5b0c6 i965/fs/nir: add nir_op_unpack_half_2x16_split_*_flush_to_zero
The denorm mode is set in the control register, no need to do
something else.

v2:
- Add an assert to make sure that we realize if this assumption is
  broken in the future (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez 3c474f8513 intel/nir: do not apply the fsin and fcos trig workarounds for consts
If we have fsin or fcos trigonometric operations with constant values
as inputs, we will multiply the result by 0.99997 in
brw_nir_apply_trig_workarounds, making the result wrong.

Adjusting the rules so they do not apply to const values we let a
later constant fold to deal with it.

v2:
- Do not early constant fold but only apply the trig workaround for
  non constants (Caio).
- Add fixes tag to commit log (Caio).

Fixes: bfd17c76c1 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez dba4d0a319 nir: fix fmin/fmax support for doubles
Until now, it was using the floating point version of fmin/fmax,
instead of the double version.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez 2abc6299bf nir: fix denorm flush-to-zero in sqrt's lowering at nir_lower_double_ops
v2:
- Replace hard coded value with DBL_MIN (Connor).

v3:
- Have into account the FLOAT_CONTROLS_DENORM_PRESERVE_FP64
  flag (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v2]
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez 1e0e3ed15a nir: fix denorms in unpack_half_1x16()
According to VK_KHR_shader_float_controls:

"Denormalized values obtained via unpacking an integer into a vector
 of values with smaller bit width and interpreting those values as
 floating-point numbers must: be flushed to zero, unless the entry
 point is declared with the code:DenormPreserve execution mode."

v2:
- Add nir_op_unpack_half_2x16_flush_to_zero opcode (Connor).

v3:
- Adapt to use the new NIR lowering framework (Andres).

v4:
- Updated to renamed shader info member and enum values (Andres).

v5:
- Simplify flags logic operations (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v2]
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez f097247dd8 nir/algebraic: disable inexact optimizations depending on float controls execution mode
If FLOAT_CONTROLS_SIGNED_ZERO_INF_NAN_PRESERVE or
FLOAT_CONTROLS_DENORM_FLUSH_TO_ZERO are enabled, do not apply the
inexact optimizations so the VK_KHR_shader_float_controls execution
mode is respected.

v2:
- Do not apply inexact optimizations if SHADER_DENORM_FLUSH_TO_ZERO is
  enabled (Andres).

v3:
- Updated to renamed shader info member (Andres).

v4:
- Directly access execution mode instead of dragging it by parameter (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v1]
2019-09-17 23:39:18 +03:00
Andres Gomez 3f782cdd25 nir/algebraic: mark float optimizations returning one parameter as inexact
With the arrival of VK_KHR_shader_float_controls algebraic
optimizations for float types of the form (('fop', a, b), a) become
inexact depending on the execution mode.

For example, if we have activated SHADER_DENORM_FLUSH_TO_ZERO, in case
of a denorm value for the "a" parameter, we cannot return it still as
a denorm, it needs to be flushed to zero. Therefore, we mark now all
those operations as inexact.

Suggested-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez 5e22f3e29a nir/constant_expressions: mind rounding mode converting from float to float16 destinations
v2:
- Move the op-code specific knowledge to nir_opcodes.py even if it
  means a rount trip conversion (Connor).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:18 +03:00