Commit Graph

108830 Commits

Author SHA1 Message Date
Jason Ekstrand 0ce1aea88b i965: Compile the fp64 program based on nir options
Instead of looking the devinfo directly, look at the lowering options we
provided to NIR.  This is more accurate as it's now checking for "do we
need full software lowering" rather than a hardware bit.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-06 17:24:57 +00:00
Jason Ekstrand 9314084237 nir: Teach loop unrolling about 64-bit instruction lowering
The lowering we do for 64-bit instructions can cause a single NIR ALU
instruction to blow up into hundreds or thousands of instructions
potentially with control flow.  If loop unrolling isn't aware of this,
it can unroll a loop 20 times which contains a nir_op_fsqrt which we
then lower to a full software implementation based on integer math.
Those 20 invocations suddenly get a lot more expensive than NIR loop
unrolling currently expects.  By giving it an approximate estimate
function, we can prevent loop unrolling from going to town when it
shouldn't.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-06 17:24:57 +00:00
Jason Ekstrand ebb3695376 nir: Expose double and int64 op_to_options_mask helpers
We already have one internally for int64 but we don't have a similar one
for doubles so we'll have to make one.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-06 17:24:57 +00:00
Iago Toral Quiroga ca2b5e9069 compiler/nir: add an is_conversion field to nir_op_info
This is set to True only for numeric conversion opcodes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-06 17:24:57 +00:00
Ian Romanick 55e6454d5e intel/fs: Fix extract_u8 of an odd byte from a 64-bit integer
In the old code, we would generate the exact same instruction for
extract_u8(some_u64, 0) and extract_u8(some_u64, 1).  The mask-a-word
trick only works for even numbered bytes.

This fixes the (new) piglit test
tests/spec/arb_gpu_shader_int64/execution/fs-ushr-and-mask.shader_test.

v2: Use a SHR instead of an AND.  This saves an instruction compared to
using two moves.  Suggested by Jason.

Fixes: 6ac2d16901 ("i965/fs: Fix extract_i8/u8 to a 64-bit destination")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-06 08:35:45 -08:00
Ian Romanick 4aaf139ea4 intel/fs: nir_op_extract_i8 extracts a byte, not a word
Fixes: 6ac2d16901 ("i965/fs: Fix extract_i8/u8 to a 64-bit destination")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-06 08:35:42 -08:00
Ian Romanick bbf20a1ca3 intel/compiler: Silence unused parameter warning in brw_interpolation_map.c
The parameter is never used, and it's not part of a common interface
idiom.  Remove it.

src/intel/compiler/brw_interpolation_map.c: In function ‘brw_setup_vue_interpolation’:
src/intel/compiler/brw_interpolation_map.c:62:59: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
                             const struct gen_device_info *devinfo)
                                                           ^~~~~~~

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-06 08:35:36 -08:00
Ian Romanick dea19138dd intel/compiler: Silence many unused parameter warnings in brw_eu.h
In file included from src/intel/compiler/brw_eu_util.c:34:0:
src/intel/compiler/brw_eu.h: In function ‘brw_message_desc_header_present’:
src/intel/compiler/brw_eu.h:288:63: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_message_desc_header_present(const struct gen_device_info *devinfo,
                                                               ^~~~~~~
src/intel/compiler/brw_eu.h: In function ‘brw_message_ex_desc’:
src/intel/compiler/brw_eu.h:296:51: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_message_ex_desc(const struct gen_device_info *devinfo,
                                                   ^~~~~~~
src/intel/compiler/brw_eu.h: In function ‘brw_message_ex_desc_ex_mlen’:
src/intel/compiler/brw_eu.h:303:59: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_message_ex_desc_ex_mlen(const struct gen_device_info *devinfo,
                                                           ^~~~~~~
src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_binding_table_index’:
src/intel/compiler/brw_eu.h:337:68: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_sampler_desc_binding_table_index(const struct gen_device_info *devinfo,
                                                                    ^~~~~~~
src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_sampler’:
src/intel/compiler/brw_eu.h:344:56: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_sampler_desc_sampler(const struct gen_device_info *devinfo, uint32_t desc)
                                                        ^~~~~~~
src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_return_format’:
src/intel/compiler/brw_eu.h:371:62: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_sampler_desc_return_format(const struct gen_device_info *devinfo,
                                                              ^~~~~~~
src/intel/compiler/brw_eu.h: In function ‘brw_dp_desc_binding_table_index’:
src/intel/compiler/brw_eu.h:405:63: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_dp_desc_binding_table_index(const struct gen_device_info *devinfo,
                                                               ^~~~~~~
src/intel/compiler/brw_eu.h: In function ‘brw_dp_a64_untyped_atomic_desc’:
src/intel/compiler/brw_eu.h:754:41: warning: unused parameter ‘exec_size’ [-Wunused-parameter]
                                unsigned exec_size, /**< 0 for SIMD4x2 */
                                         ^~~~~~~~~
src/intel/compiler/brw_eu.h: In function ‘brw_dp_a64_untyped_atomic_float_desc’:
src/intel/compiler/brw_eu.h:775:47: warning: unused parameter ‘exec_size’ [-Wunused-parameter]
                                      unsigned exec_size,
                                               ^~~~~~~~~

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-06 08:35:31 -08:00
Eric Engestrom 89241eeafc meson: remove unused include_directories(vulkan)
The correct include path is "vulkan/…".

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-03-06 12:46:11 +00:00
Eric Engestrom ad862c36e5 meson: fix with_dri2 definition for GNU Hurd
Suggested-by: Dylan Baker <dylan@pnwbakers.com>
Cc: Timo Aaltonen <tjaalton@debian.org>
Cc: James Clarke <jrtc27@debian.org>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-03-06 12:40:06 +00:00
Lionel Landwerlin b49726afd4 radv: set num_components on vulkan_resource_index intrinsic
In 61e009d2c4 we changed the number of components in the
vulkan_resource_index intrinsic and forgot the update Radv's code for
it.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 61e009d2c4 ("spirv: Use the same types for resource indices as pointers")
Reviewed-by: Samuel Pitoiset samuel.pitoiset@gmail.com
2019-03-06 11:56:21 +00:00
Timothy Arceri 54522d0506 nir: rename glsl_type_is_struct() -> glsl_type_is_struct_or_ifc()
Replace done using:
find ./src -type f -exec sed -i -- \
's/glsl_type_is_struct(/glsl_type_is_struct_or_ifc(/g' {} \;

Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-06 13:10:02 +11:00
Timothy Arceri e16a27fcf8 glsl: rename record_types -> struct_types
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-06 13:10:02 +11:00
Timothy Arceri 8294295dbd glsl: rename record_location_offset() -> struct_location_offset()
Replace done using:
find ./src -type f -exec sed -i -- \
's/record_location_offset(/struct_location_offset(/g' {} \;

Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-06 13:10:02 +11:00
Timothy Arceri 88d8c4e290 glsl: rename get_record_instance() -> get_struct_instance()
Replace done using:
find ./src -type f -exec sed -i -- \
's/get_record_instance(/get_struct_instance(/g' {} \;

Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-06 13:10:02 +11:00
Timothy Arceri 81ee2cd8ba glsl: rename is_record() -> is_struct()
Replace was done using:
find ./src -type f -exec sed -i -- \
's/is_record(/is_struct(/g' {} \;

Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-06 13:10:02 +11:00
Karol Herbst 272e927d0e nir/spirv: initial handling of OpenCL.std extension opcodes
Not complete, mostly just adding things as I encounter them in CTS. But
not getting far enough yet to hit most of the OpenCL.std instructions.

Anyway, this is better than nothing and covers the most common builtins.

v2: add hadd proof from Jason
    move some of the lowering into opt_algebraic and create new nir opcodes
    simplify nextafter lowering
    fix normalize lowering for inf
    rework upsample to use nir_pack_bits
    add missing files to build systems
v3: split lines of iadd/sub_sat expressions

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-05 22:28:29 +01:00
Karol Herbst d0b47ec4df nir/vtn: add support for SpvBuiltInGlobalLinearId
v2: use formula with fewer operations

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-05 22:28:29 +01:00
Karol Herbst f48c672965 nir: add support for address bit sized system values
v2: add assert in else clause
    make local group intrinsics 32 bit wide
v3: always use 32 bit constant for local_size
v4: add comment by Jason

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-05 22:28:29 +01:00
Karol Herbst 5f8257fb0b nir/spirv: improve parsing of the memory model
v2: add some vtn_fail_ifs

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-05 22:28:29 +01:00
Karol Herbst 5d48359a2c nir: replace magic numbers with M_PI
we define it inside 'include/c99_math.h' so it is safe to use.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-05 22:28:29 +01:00
Caio Marcelo de Oliveira Filho 69cc6272fb anv: Implement VK_EXT_external_memory_host
v2: Ignore the import if handleType == 0. (Jason)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-05 12:59:50 -08:00
Eric Anholt 5c655c47db v3d: Drop the V3D 3.x vpm read dead code elimination.
We now have NIR dead code eliminating our VPM reads, so this shouldn't be
necessary.
2019-03-05 12:57:39 -08:00
Eric Anholt e8ee1f8eaf v3d: Eliminate the TLB and TLBU files.
We can just use the magic register file like we do for other magic waddrs.
2019-03-05 12:57:39 -08:00
Eric Anholt 110f14d4b4 v3d: Use ldunif instructions for uniforms.
The idea is that for repeated use of the same uniform, we could avoid
loading it on each consumer.  The results look pretty good.

total instructions in shared programs: 6413571 -> 6521464 (1.68%)
total threads in shared programs: 154214 -> 154000 (-0.14%)
total uniforms in shared programs: 2393604 -> 2119629 (-11.45%)
total spills in shared programs: 4960 -> 4984 (0.48%)
total fills in shared programs: 6350 -> 6418 (1.07%)

Once we do scheduling at the NIR level, the register pressure (and thus
also instructions) issues we see here will drop back down.
2019-03-05 12:57:39 -08:00
Eric Anholt 4036fce8fd v3d: Add support for register-allocating a ldunif to a QFILE_TEMP.
On V3D 4.x, we can use ldunifrf to load uniforms to any register, and this
will let us schedule the ldunif wherever we want in the program.
2019-03-05 12:57:39 -08:00
Eric Anholt 70df388219 v3d: Drop the old class bits splitting up the accumulators.
This seems to be left over from vc4, and I don't use them any more.
2019-03-05 12:57:39 -08:00
Eric Anholt dff1fc04e0 v3d: Add support for vir-to-qpu of ldunif instructions to a temp.
We can load a uniform to any register, so add support for non-ALU
instructions with sig.ldunif to a temp.
2019-03-05 12:57:39 -08:00
Eric Anholt 4739181a16 v3d: Switch implicit uniforms over to being any qinst->uniform != ~0.
I'm not sure why I didn't do this before -- it's clearly much simpler to
add dumping of the extra thing than to have it as another implicit source.
2019-03-05 12:57:39 -08:00
Eric Anholt 1e98f02d88 v3d: Do uniform rematerialization spilling before dropping threadcount
This feels like the right tradeoff for threads vs uniforms, particularly
given that we often have very short thread segments right now:

total instructions in shared programs: 6411504 -> 6413571 (0.03%)
total threads in shared programs: 153946 -> 154214 (0.17%)
total uniforms in shared programs: 2387665 -> 2393604 (0.25%)
2019-03-05 12:57:39 -08:00
Eric Anholt 060979a380 v3d: Fix temporary leaks of temp_registers and when spilling.
On each iteration of successfully spilling a reg, we'd allocate another
copy of temp_registers, and when decrementing thread conut we'd allocate
another copy of the graph.  These all got cleaned up on freeing the
compile.
2019-03-05 12:57:39 -08:00
Eric Engestrom faf9e40f35 gitlab-ci: drop job prefixes
It is already obvious whether the job is building a container or running
a mesa build, so let's drop that prefix so that we can see more
information on the screen (eg. in the jobs list on a pipeline page).

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-03-05 20:49:42 +00:00
Timur Kristóf 45809bcb33 tgsi_to_nir: Set correct location for uniforms.
Previously, only the driver_location was set for all variables,
but constants need to use the location field instead. This change
is necessary because the nine state tracker can produce non-packed
constants whose location needs to be explicitly set.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-03-05 19:13:27 +00:00
Timur Kristóf 770faf546d tgsi_to_nir: Improve interpolation modes.
This patch extracts the interpolation mode translation
into a separate function called ttn_translate_interp_mode,
adds support for TGSI_INTERPOLATE_COLOR which was missing,
and also sets the proper interpolation mode to output
variables, which were not set previously.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-03-05 19:13:27 +00:00
Kenneth Graunke 2fb800fd1d tgsi_to_nir: use sampler variables and derefs
v2: fix is_shadow, is_array and txq

Some drivers (eg. iris) need the presence of sampler variables and derefs
so that they can count them to determine the number of samplers used.
This change also makes the output NIR closer to what glsl_to_nir outputs.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-03-05 19:13:27 +00:00
Timur Kristóf 674045d04b tgsi_to_nir: Support FACE and POSITION properly.
Previously, FACE was hard-coded as a sysval, but TTN emulated
it incorrectly. Also, POSITION was not supported when it was
a sysval. This patch fixes these by allowing both of them to
be sysvals or inputs, based on driver capabilities. It also
fixes the TGSI FACE emulation based on the TGSI spec.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-03-05 19:13:27 +00:00
Timur Kristóf f748fa47f8 tgsi_to_nir: Extract ttn_emulate_tgsi_front_face into its own function.
We'll need to use the same logic in other places, so it makes sense to
have a separate function for this.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-03-05 19:13:27 +00:00
Timur Kristóf 840c7d1ebd tgsi_to_nir: Restructure system value loads.
Minor cleanup to the way system value loads work in tgsi_to_nir.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-03-05 19:13:27 +00:00
Timur Kristóf 9a834447d6 tgsi_to_nir: Produce optimized NIR for a given pipe_screen.
With this patch, tgsi_to_nir will output NIR that is tailored to
the given pipe, by reading its capabilities and adjusting the NIR code
to those capabilities similarly to how glsl_to_nir works.

It also adds an optimization loop that brings the output NIR in line
with what glsl_to_nir outputs. This is necessary for the same reason
why glsl_to_nir has its own optimization loop: currently not every
driver does these optimizations yet.

For uses which cannot pass a pipe_screen we also keep a variant
called tgsi_to_nir_noscreen which keeps the old behavior.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Acked-By: Eric Anholt <eric@anholt.net>
2019-03-05 19:13:27 +00:00
Timur Kristóf e582e761b7 freedreno: Plumb pipe_screen through to irX_tgsi_to_nir.
This patch makes it possible for freedreno to pass a pipe_screen
to tgsi_to_nir. This will be needed when tgsi_to_nir supports reading
pipe capabilities.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-03-05 19:13:27 +00:00
Timur Kristóf 6684e039eb nir: Add multiplier argument to nir_lower_uniforms_to_ubo.
Note that locations can be set in different units, and the multiplier
argument caters to supporting these different units. For example,
st_glsl_to_nir uses dwords (4 bytes) so the multiplier should be 4,
while tgsi_to_nir uses bytes, so the multiplier should be 16.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-03-05 19:13:27 +00:00
Timur Kristóf 909d1f50f3 nir: Move nir_lower_uniforms_to_ubo to compiler/nir.
The nir_lower_uniforms_to_ubo function is useful outside of
mesa/state_tracker, and in fact is needed to produce NIR for
drivers that have the PIPE_CAP_PACKED_UNIFORMS capability.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-03-05 19:13:27 +00:00
Timur Kristóf 4dba72c4b3 tgsi_to_nir: Split to smaller functions.
Previously, tgsi_to_nir was a single big function, and this patch
intends to make the code easier to understand by splitting it up
to multiple smaller pieces.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-By: Tested-by: Rob Clark <robdclark@gmail.com>
2019-03-05 19:13:27 +00:00
Timur Kristóf 950aebbc53 tgsi_to_nir: Make the TGSI IF translation code more readable.
This patch is a minor cleanup that only intends to make the
TGSI IF translation a bit easier to read.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-03-05 19:13:27 +00:00
Timur Kristóf fa076acbc0 tgsi_to_nir: Fix TGSI LIT translation by using flt.
TGSI spec says LIT needs a "greater than" comparison. NIR doesn't have that,
so let's use "less than" and swap the arguments. Previously "greater than or equal"
was used by tgsi_to_nir which is incorrect.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-03-05 19:13:27 +00:00
Timur Kristóf 28be7b33b9 tgsi_to_nir: Fix the TGSI ARR translation by converting the result to int.
According to the TGSI spec, ARR needs to do a rounding and then
a float-to-integer conversion which was missing. This patch also
makes the rounding a bit more efficient by using nir_fround_even
instead of the previous nir_ffloor+nir_fadd trick.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-03-05 19:13:27 +00:00
Timur Kristóf 317f10bf40 nir: Add ability for shaders to use window space coordinates.
This patch adds a shader_info field that tells the driver to use window
space coordinates for a given vertex shader. It also enables this feature
in radeonsi (the only NIR-capable driver that supported it in TGSI),
and makes tgsi_to_nir aware of it.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-03-05 19:13:27 +00:00
Eric Anholt 2780a99ff8 v3d: Move the stores for fixed function VS output reads into NIR.
This lets us emit the VPM_WRITEs directly from
nir_intrinsic_store_output() (useful once NIR scheduling is in place so
that we can reduce register pressure), and lets future NIR scheduling
schedule the math to generate them.  Even in the meantime, it looks like
this lets NIR DCE some more code and make better decisions.

total instructions in shared programs: 6429246 -> 6412976 (-0.25%)
total threads in shared programs: 153924 -> 153934 (<.01%)
total loops in shared programs: 486 -> 483 (-0.62%)
total uniforms in shared programs: 2385436 -> 2388195 (0.12%)

Acked-by: Ian Romanick <ian.d.romanick@intel.com> (nir)
2019-03-05 10:59:40 -08:00
Eric Anholt a9dd227a47 v3d: Translate f2i(fround_even) as FTOIN.
This appears to be just what the opcode does.  Needed for equivalence when
moving FF VPM stores into NIR.
2019-03-05 10:59:40 -08:00
Eric Anholt a4f612b4cf nir: Improve printing of load_input/store_output variable names.
We were printing only when the channel was exactly the start channel, so
scalarized loads/stores would be missing the name on the rest.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-03-05 10:59:40 -08:00