Commit Graph

104337 Commits

Author SHA1 Message Date
Rob Clark 066930e54d freedreno/ir3: create all inputs in first block
create_input()/create_input_compmask() should take the ctx as arg,
rather than block, to enforce that all inputs are created in the first
block, so that RA sees them as live at the start of the shader.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-14 17:59:02 -04:00
Rob Clark 62da068fd3 freedreno/ir3: rename s/frag_pos/frag_vcoord/g
Make it more clear that this is varying fetch related.  Also fixup some
comments.  Just cleanup for next patches.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-14 17:59:02 -04:00
Rob Clark 4a7f9feada compiler: add SYSTEM_VALUE_VARYING_COORD
Used internally in freedreno/ir3 for the vec2 value that hw passes to
shader to use as coordinate for bary.f (varying fetch) instruction.
This is not the same as SYSTEM_VALUE_FRAG_COORD.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-14 17:59:02 -04:00
Rob Clark b5a098b202 freedreno/ir3: move per-generation compiler config
Move it from the compile ctx to the compiler object, before adding
new things for a6xx.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-14 17:59:02 -04:00
Bas Nieuwenhuizen 66e12451ac radv: Update to new VK_EXT_vertex_attribute_divisor to version 2.
Behavior wrt firstInstance got changed, and a divisor of 0 has been
disallowed.

The new version of the ext got published in specification 1.1.81.

Sending to stable since the only known user is DXVK, which needs
this for correctness.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
CC: 18.2 <mesa-stable@lists.freedesktop.org>
2018-08-14 22:13:09 +02:00
Bas Nieuwenhuizen 4bb6c49375 radv: Allow ETC2 on RAVEN and VEGA10 instead of all GFX9.
Follow radeonsi.

Fixes: 3665f66ef2 "radv: Add support for ETC2 textures."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 22:11:04 +02:00
Bas Nieuwenhuizen bf33ca7512 radv: Fix missing Android platform define.
CC: <mesa-stable@lists.freedesktop.org>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 22:11:04 +02:00
Rob Clark 13b9d32fb1 freedreno: move free() into fdN_context_destroy()
Following patches will be doing further cleanup after calling
fd_context_destroy() so it is easier if we move the free() into
the per-gen backend code.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-14 15:46:34 -04:00
Jonathan Marek dc9705f30d freedreno: a2xx: ir2 update
this patch brings a number of changes to ir2:
-ir2 now generates CF clauses as necessary during assembly. this simplifies
 fd2_program/fd2_compiler and is necessary to implement optimization passes
-ir2 now has separate vector/scalar instructions. this will make it easier
 to implementing scheduling of scalar+vector instructions together. dst_reg
 is also now seperate from src registers instead of a single list
-ir2 now implements register allocation. this makes it possible to compile
 shaders which have more than 64 TGSI registers
-ir2 now implements the following optimizations: removal of IN/OUT MOV
 instructions generated by TGSI and removal of unused instructions when
 some exports are disabled
-ir2 now allows full 8-bit index for constants
-ir2_alloc no longer allocates 4 times too many bytes

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-14 12:46:25 -04:00
Andres Gomez 5406eb5513 docs: update calendar 18.2.0-rc1 and 18.2.0-rc2 are out
Signed-off-by: Andres Gomez <agomez@igalia.com>
2018-08-14 17:07:09 +03:00
Bas Nieuwenhuizen fbcd167314 radv: Add on-demand compilation of built-in shaders.
In environments where we cannot cache, e.g. Android (no homedir),
ChromeOS (readonly rootfs) or sandboxes (cannot open cache), the
startup cost of creating a device in radv is rather high, due
to compiling all possible built-in pipelines up front. This meant
depending on the CPU a 1-4 sec cost of creating a Device.

For CTS this cost is unacceptable, and likely for starting random
apps too.

So if there is no cache, with this patch radv will compile shaders
on demand. Once there is a cache from the first run, even if
incomplete, the driver knows that it can likely write the cache
and precompiles everything.

Note that I did not switch the buffer and itob/btoi compute pipelines
to on-demand, since you cannot really do anything in Vulkan without
them and there are only a few.

This reduces the CTS runtime for the no caches scenario on my
threadripper from 32 minutes to 8 minutes.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-08-14 10:26:24 +02:00
Bas Nieuwenhuizen 24a9033d6f radv: Refactor blit pipeline creation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-08-14 10:26:11 +02:00
Bas Nieuwenhuizen 806a792b43 radv: Make fs key exemplars ordered to be a reverse fs_key lookup.
While at it, share the exemplars and account for a non-occurring
fs key.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-08-14 10:26:06 +02:00
Dave Airlie 0be5e9f5a1 virgl: ARB_texture_barrier support
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2018-08-14 16:55:56 +10:00
Dylan Baker 6d61aed231 docs: update calendar, add news item and link release notes for 18.1.6
Signed-off-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-13 10:06:45 -07:00
Dylan Baker 973ae7a06b docs: Add sha256 sums for 18.1.6 2018-08-13 10:05:44 -07:00
Dylan Baker 66c8a64e67 docs: Add release notes for 18.1.6 2018-08-13 10:05:42 -07:00
Alejandro Piñeiro 668ab8aeb1 mesa/glspirv: fix compilation with MSVC
From AppVeyor #8582, it seems that MSVC doesn't like uint, so this
patch replaces it with unsigned.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-08-13 18:57:18 +02:00
Eric Engestrom f976d22759 travis: install correct version of mako for each build system
Meson now uses python3, so let's add a block for Autotools, move that
line into the buildsys-specific blocks, and set the correct version for
Meson.

Fixes: 2ee1c86d71 "meson: Build with Python 3"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-13 17:29:42 +01:00
Erik Faye-Lund ae5770171c mesa/st/glsl_to_tgsi: fixup copy-paste mistake
This is clearly a copy-paste error; if we validate the reladdr2-pointer,
we don't want to traverse to the reladdr-pointer. Especially since the
check above shows that reladdr could be NULL here.

Noticed by Coverity.

CID: 1438389, 1438390
Fixes: 568bda2f2d ("mesa/st/glsl_to_tgsi: Split arrays whose elements are only accessed directly")
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-08-13 18:15:36 +02:00
Neil Roberts c91a5f70fb i965/nir: Use the nir copy of shader_info to handle gl_PatchVerticesIn
Instead of using the copy of shader_info stored in gl_program, it now
uses the one in nir_shader. This is needed for SPIR-V because the
info.tess.tcs_vertices_out is filled in via _mesa_spirv_to_nir which
happens much later than with a GLSL shader. The copy of shader_data in
gl_program is only updated later via brw_shader_gather_info but that
is too late.

For GLSL this shouldn't create any problems because the nir copy of
the shader_info is immediately copied from the gl_program in
glsl_to_nir.

v2: updated after commit "i965: Combine both gl_PatchVerticesIn
    lowering passes." (488972) (Alejandro Piñeiro)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-13 16:28:27 +02:00
Neil Roberts a105c1e6e5 mesa/glspirv: Set separate_shader on shader_info
The value is copied from the gl_program. If we don’t do this then it
will get reset back to zero in brw_shader_gather_info. This isn’t a
problem for GLSL because in that case the nir_shader is initialised
with a copy of the shader_info from the gl_program.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-13 16:28:27 +02:00
Iago Toral Quiroga 40947d4744 mesa/glspirv: pick off the only entry point we need
This is the same we do for vulkan drivers

This is needed to pass the following CTS test:
KHR-GL45.gl_spirv.spirv_modules_shader_binary_multiple_shader_objects_test

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-13 16:28:27 +02:00
Alejandro Piñeiro 32e1d4c34b mesa/glspirv: compute double inputs and remap attributes
input locations used by input attributes are not handled in the same
way in OpenGL vs Vulkan. There is a detailed explanation of such
differences on the following commit:

c2acf97fcc

So with this commit, the same adjustment that is done after
glsl_to_nir, is being done after spirv_to_nir, when it is used on
OpenGL (ARB_gl_spirv).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-13 16:28:27 +02:00
Alejandro Piñeiro d6c8066663 nir/glsl: make nir_remap_attributes public
As we plan to reuse it for ARB_gl_spirv implementation.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-13 16:28:27 +02:00
Alejandro Piñeiro af194bd38e nir/lower_samplers: don't assume a deref for both texture and sampler srcs
After commit "nir: Use derefs in nir_lower_samplers"
(75286c2d08) assumes one deref for both
the texture and the sampler. However there are cases (on OpenGL, using
ARB_gl_spirv) where SPIR-V is not providing a sampler, like for
texture query levels ops. Although we could make spirv_to_nir to
provide a sampler deref for those cases, it is not really needed, and
wrong from the Vulkan point of view.

This patch fixes the following (borrowed) tests run on SPIR-V mode:
  arb_compute_shader/execution/basic-texelFetch.shader_test
  arb_gpu_shader5/execution/sampler_array_indexing/fs-simple-texture-size.shader_test
  arb_texture_query_levels/execution/fs-baselevel.shader_test
  arb_texture_query_levels/execution/fs-maxlevel.shader_test
  arb_texture_query_levels/execution/fs-miptree.shader_test
  arb_texture_query_levels/execution/fs-nomips.shader_test
  arb_texture_query_levels/execution/vs-baselevel.shader_test
  arb_texture_query_levels/execution/vs-maxlevel.shader_test
  arb_texture_query_levels/execution/vs-miptree.shader_test
  arb_texture_query_levels/execution/vs-nomips.shader_test
  glsl-1.30/execution/fs-textureSize-compare.shader_test

v2: merge lower_tex_src_to_offset and calc_sampler_offsets together,
    update texture/sampler index and texture_array_size directly on
    lower_tex_src_to_offset (Jason)
v3: clarify one comment (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-13 16:28:27 +02:00
Alejandro Piñeiro fe2de39fb2 nir/linker: take into account hidden uniforms
So they are not exposed through the introspection API.

It is worth to note that the number of hidden uniforms of GLSL linking
vs SPIR-V linking would be somewhat different due the differen order
of the nir lowerings/optimizations.

For example: gl_FbWposYTransform. This is introduced as part of
nir_lower_wpos_ytransform. On GLSL that is executed after the IR-based
linking. So that means that on GLSL the UniformStorage will not
include this uniform. With the SPIR-V linking, that uniform is already
present, but marked as hidden. So it will be included on the
UniformStorage, but as hidden.

One alternative would create a special how_declared for that case, but
seemed an overkill. Using hidden should be ok as far as it is used
properly.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-13 16:28:27 +02:00
Alejandro Piñeiro 5332d7582d nir: add how_declared to nir_variable.data
Equivalent to the already existing how_declared at GLSL IR. The only
difference is that we are not adding all the declaration_type
available on GLSL, only the one that we will use on the short term. We
would add more mode if needed on the future.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-13 16:28:26 +02:00
Neil Roberts be6f472b23 spirv: Make VertexIndex and VertexId both non-zero-based
GLSL has gl_VertexID which is supposed to be non-zero-based.

SPIR-V has both VertexIndex and VertexId builtins whose meanings are
defined by the APIs.

Vulkan defines VertexIndex as being non-zero-based. In Vulkan VertexId
and InstanceId have no meaning and are pretty much just reserved for
OpenGL at this point.

GL_ARB_spirv removes VertexIndex and defines VertexId to be the same
as gl_VertexId (which is also non-zero-based).

Previously in Mesa it was treating VertexIndex as non-zero-based and
VertexId as zero-based, so it was breaking for GL. This behaviour was
apparently based on Khronos bug 14255. However that bug doesn’t seem
to have made a final decision for VertexId.

Assuming there really is no other definition for VertexId for Vulkan
it seems better to just make them both have the same value.

v2: update comment and commit descriptions, based on Jason Ekstrand
    explanation of the meaning/rationale behind all those builtins
    (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-13 16:23:36 +02:00
Alejandro Piñeiro 624c00f1a6 spirv: fill info.gs.input_primitive too
info.gs.output_primitive was already being filled. Not sure why this
is not needed on Vulkan, but we found to be needed for
ARB_gl_spirv. Specifically, this is needed to get the following test
passing:

KHR-GL45.gl_spirv.spirv_validation_builtin_variable_decorations_test

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-13 12:56:51 +02:00
Tapani Pälli ed94a5799d docs/features: mark GL_EXT_render_snorm as done for i965
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
2018-08-13 13:08:22 +03:00
Tapani Pälli fa9e6c235d i965: enable EXT_render_snorm
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-08-13 12:03:17 +03:00
Tapani Pälli 0d356cf478 mesa: enable EXT_render_snorm extension
Patch sets additional formats renderable and enables the extension
when OpenGL ES 3.1 is supported.

v2: instead of dummy_true, have a separate toggle for extension
    (Eric Anholt)

v3: add missing checks, simplify some existing checks and fix
    glCopyTexImage2D check (Nanley Chery)

    add SHORT and BYTE support in read_pixels_es3_error_check

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-08-13 12:03:17 +03:00
Kenneth Graunke de57926dc9 blorp: Properly handle Z24X8 blits.
One of the reasons we didn't notice that R24_UNORM_X8_TYPELESS
destinations were broken was that an earlier layer was swapping it
out for B8G8R8A8_UNORM.  That made Z24X8 -> Z24X8 blits work.

However, R32_FLOAT -> R24_UNORM_X8_TYPELESS was still totally broken.
The old code only considered one format at a time, without thinking
that format conversion may need to occur.

This patch moves the translation out to a place where it can consider
both formats.  If both are Z24X8, we continue using B8G8R8A8_UNORM to
avoid having to do shader math workarounds.  If we have a Z24X8
destination, but a non-matching source, we use our shader hacks to
actually render to it properly.

Fixes: 804856fa57 (intel/blorp: Handle more exotic destination formats)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-11 12:34:01 -07:00
Kenneth Graunke 8a29086285 blorp: Don't try to use R32_UNORM for R24_UNORM_X8_TYPELESS rendering.
The hardware doesn't support rendering to R24_UNORM_X8_TYPELESS, so
Jason decided to fake it with a bit of shader math and R32_UNORM RTs.

The only problem is that R32_UNORM isn't renderable either...so we've
just traded one bad format for another.

This patch makes us use R32_UINT instead.

Fixes: 804856fa57 (intel/blorp: Handle more exotic destination formats)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-11 12:33:27 -07:00
Jason Ekstrand a9f7bcfdf9 intel: Switch the order of the 2x MSAA sample positions
The Vulkan 1.1.82 spec flipped the order to better match D3D.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-08-11 10:58:12 -05:00
Gert Wollny 8a87138885 mesa/st/tests: Add array life range estimation and renumbering tests
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny 0981fc84df mesa/st/tests: Add array life range tests infrastructure to common test class
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny d8c2119f9b mesa/st/glsl_to_tgsi: Expose array live range tracking and merging
This patch ties in the array split, merge, and interleave code.

shader-db changes in the TGSI code are:

              original code  |  array-merge  |       change
              mean      max  |  mean    max  | best  mean %  worst
      -----------------------------------------------------------
      arrays   0.05       2  |   0.00     0  |  -2   -100      0
total temps    5.05      21  |   4.92    20  | -15   -2.59     1
      instr   55.33     988  |  55.20   988  | -15   -0.24     0

Evaluation:

Run shader-db in single thread mode (otherwise the output is
not ordered and the best and worst column don't make sense) to
get results pre-stats.txt and post-stats.txt. Then using
python pandas:

 import pandas as pd
 old_stats = pd.read_csv('pre-stats.txt')
 new_stats = pd.read_csv('post-stats.txt')
 omean = old_stats.mean()
 omax = old_stats.max()
 nmean = new_stats.mean()
 nmax = new_stats.max()
 delta =  new_stats - old_stats
 pd.concat([omean, omax, nmean, nmax, delta.min(),
            delta.mean()/old_stats.mean()*100, delta.max()],
            axis=1, keys=['mean', 'max', 'mean', 'max', 'best',
            'avg change %', 'worst'])

v4: - Correct typo and add bugs that are fixed by this series.
    - Update stats and describe stats evaluation

Bugzilla:
  https://bugs.freedesktop.org/show_bug.cgi?id=105371
  https://bugs.freedesktop.org/show_bug.cgi?id=100200
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny c317d0ab54 mesa/st/glsl_to_tgsi: add array life range evaluation into tracking code
v4: Also track the register given in inst->resource. (thanks: Benedikt Schemmer
    for testing the patches on radeonsi, which revealed that I was missing
    tracking this)
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny 5e58eb37f1 mesa/st/glsl_to_tgsi: add class for array access tracking
Because of the indirect access it is impossible to obtain an accurate per
component and array element tracking. Therefore, the tracking is simplified
to only track whether any element was accessed, whether this happend
conditionally in a loop. In addition, while tracking of temporaries requires
a per-componet tracking that is later fused, for arrays only the components
access mask is neede. The resulting tracking code and evaluation of the array
live range is sufficiently different from the evaluation of the live range of
temporaries to justify implementing this in a different class instead of
adding more complexity to the already existing code for temporary life
range evaluation.

v4: Update commit message to make it clearer why this class is seperate from
    the tracking of temporaries.
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny 7d55d01b53 mesa/st/glsl_to_tgsi: move evaluation of read mask up in the call hierarchy
In preparation of the array live range tracking the evaluation of the read
mask is moved out the register live range tracking to the enclosing call
of the generalized read access tracking.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny f2a4636339 mesa/st/glsl_to_tgsi: rename access_record to register_merge_record and some more renames
In preparartion of adding the tracking of the live range the classes that refer
to temporary registers are renamed.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny 8c89728889 mesa/st/tests: Add tests for array merge helper classes.
v2: - Define tests also in the meson.build file.
v4: - Check no-op mapping of all bits.
    - Convert tests to the new class layout used in the merge evaulation.
    - remove dependency on llvm in meson build (Thanks Dylan Baker for pointing
       out that this might not needed)
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny 12316aa217 mesa/st/glsl_to_tgsi: Add array merge logic
v4: - Update the code to use the new merge logic.
    - Use a cleaner, class-based approach for the evaluation of merges.
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny d097ef4204 mesa/st/glsl_to_tgsi: Add helper classes to apply array merging and interleaving
v4: - Remove logic for evaluation of swizzles and merges since this
        was moved to array_live_range. This class now only handles the
        actual remapping.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny d54c2f92f9 mesa/st/glsl_to_tgsi: Add helper class for array live range merging and interleaving
This class holds the array length, live range, and accessed components, and
it implements the logic for evaluating how arrays are merged and interleaved.

v4: - Add logic to evaluate merge and interleave of a pair of arrays to
      the class array_live_range.
    - document class
    - update commit message

Thanks Nicolai Hähnle for the pointers given.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny 331ae3cde5 mesa/st/glsl_to_tgsi:rename lifetime to register_live_range
On one hand "live range" is the term used in the literature, and on the
other hand a distinction is needed from the array live ranges.

v4: Fix indentions and white spaces

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3)
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny f40c9d0225 mesa/st/glsl_to_tgsi: Properly resolve life times simple if/else + use constructs
in constructs like below, currently the live range estimation extends the live range
of t unecessarily to the whole loop because it was not detected that t is
unconditional written and later read only in the "if (a)" scope.

  while (foo)  {
    ...
    if (a) {
       ...
       if (b)
         t = ...
       else
         t = ...
       x = t;
       ...
    }
     ...
  }

This patch adds a unit test for this case and corrects the minimal live range estimation
accordingly.

v4: update comments
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00
Gert Wollny 568bda2f2d mesa/st/glsl_to_tgsi: Split arrays whose elements are only accessed directly
Array whose elements are only accessed directly are replaced by the
according number of temporary registers. By doing so the otherwise
reserved register range becomes subject to further optimizations like
copy propagation and register merging.

Thanks to the resulting reduced register pressure this patch makes
the piglits

  spec/glsl-1.50/execution -
      variable-indexing/vs-output-array-vec3-index-wr-before-gs
      geometry/max-input-components

pass on r600 (barts) where they would fail before with a "GPR limit exceeded"
error (even with the spilling that was recently added).

v2: * rename method dissolve_arrays to split_arrays
    * unify the tracking and remapping methods for src and dst registers
    * also track access to arrays via reladdr*

v3: * enable this optimization only if the driver requests register merge

v4: * Correct comments
    * Also update inst->resource if it is an array element
      (thanks: Benedikt Schemmer for testing the patches on radeonsi, which
       revealed that I was missing tracking this)

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2018-08-11 12:32:42 +02:00