95817 Commits

Author SHA1 Message Date
Tim Rowley 6cb20c9f3a swr/rast: FE/Binner - unify SIMD8/16 functions using simdlib types
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:02:12 -05:00
Tim Rowley 6afdc8732c swr/rast: Removed some trailing whitespace caught during review
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:02:06 -05:00
Tim Rowley 4edc5d8305 swr: set caps for VB 4-byte alignment
Needed to compensate for change to fetch jit requiring
alignment.

Fixes regressions in piglit: vertex-buffer-offsets and about
another hundred of the vs-input*byte* tests.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:01:59 -05:00
Tim Rowley 4475583f5e swr/rast: Allow gather of floats from fetch shader with 2-4GB offsets
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:01:39 -05:00
Samuel Pitoiset 5c9af800cb radv: fix error code when resizing the upload BO
malloc() failures are unrelated to the device memory.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-06 15:52:19 +02:00
Gert Wollny 107ecd97f1 mesa/st/st_glsl_to_tgsi_temprename.cpp: Fix compilation with MSVC
If <windows.h> is included then max is a macro that clashes
with std::numeric_limits::max, hence undefine it.
For some reason the struct access_record is not recognized
outside the anonymous namespace, so make it a class.
The patch was successfully tested on AppVeyor.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 15:12:19 +02:00
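
The macro clash described in 107ecd97f1 can be reproduced outside of Mesa. The following is a minimal sketch, not the actual patch: the macro definition stands in for what <windows.h> provides when NOMINMAX is not defined, and largest_int() is a hypothetical helper.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Stand-in for the function-like macro that <windows.h> defines unless
// NOMINMAX is set (assumption for illustration; no Windows header is used).
#define max(a, b) (((a) > (b)) ? (a) : (b))

// Removing the macro lets the member function name be parsed normally.
// Without this, the call below is treated as a macro invocation with the
// wrong number of arguments and does not compile.
#ifdef max
#undef max
#endif

// Hypothetical helper, only here to exercise the call.
int largest_int()
{
    return std::numeric_limits<int>::max();
}
```

An alternative that leaves the macro in place is to parenthesize the name, as in (std::numeric_limits<int>::max)(), which prevents function-like macro expansion.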
Gert Wollny 09ffe274b0 mesa/st: glsl_to_tgsi: tie in new temporary register merge approach
This patch replaces the old register lifetime estimation and
rename mapping evaluation with the new one.

To compare the current and the new implementation, performance
was measured by running the shader-db in one thread.

-----------------------------------------------------------
                    old          new(std::sort)

---------------- time ./run -j1 shaders --------------------

  real              5.80s          5.75s
  user              5.75s          5.70s
  sys               0.05s          0.05s

---- valgrind --tool=callgrind --dump-instr=yes------------

 merge               0.08%         0.18%
 estimate lifetime   0.02%         0.11%
 evaluate mapping  (incl=0.3%)     0.04%
 apply mapping       0.03%         0.02%

---   perf (approximate because of statistic sampling) ----

merge (total)        0.09%         0.16%
estimate lifetime    0.03%         0.10%
evaluate mapping  (incl=0.02%)     0.04%
apply mapping        0.04%         0.04%

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:52 +02:00
Gert Wollny 33b7728bf9 mesa/st: glsl_to_tgsi: Add test set for evaluation of rename mapping
The patch adds tests for the register rename mapping evaluation and
combined life time estimation and renaming.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:49 +02:00
Gert Wollny 84529c077b mesa/st: glsl_to_tgsi: add register rename mapping evaluator
The remapping evaluator first sorts the temporary registers ascending
based on their first life time instruction, and then uses a binary search
to find merge candidates.
For the initial sorting it uses std::sort because qsort is quite slow in
comparison. By removing the define USE_STL_SORT in
  src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
one can enable the alternative code path that uses qsort.

Registers that are not written to are not considered for renaming since in
glsl_to_tgsi_visitor::renumber_registers they are eliminated anyway.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:46 +02:00
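
The sort-then-binary-search idea from 84529c077b can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in st_glsl_to_tgsi_temprename.cpp: the Lifetime struct, evaluate_renames(), and the rule that a register may be merged only if it starts strictly after the target ends are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record: register index plus first and last instruction
// at which the register is live.
struct Lifetime {
    int reg;
    int begin;
    int end;
};

// Returns rename[r] = register that r is merged into, or -1 if r is kept.
std::vector<int> evaluate_renames(std::vector<Lifetime> regs, int num_regs)
{
    std::vector<int> rename(num_regs, -1);

    // Step 1: sort ascending by the first instruction of each lifetime.
    std::sort(regs.begin(), regs.end(),
              [](const Lifetime &a, const Lifetime &b) { return a.begin < b.begin; });

    std::vector<bool> merged(regs.size(), false);
    for (std::size_t t = 0; t < regs.size(); ++t) {
        if (merged[t])
            continue;
        std::size_t from = t + 1;
        while (from < regs.size()) {
            // Step 2: binary search for the first lifetime that begins
            // after the (possibly extended) end of the target.
            auto it = std::upper_bound(
                regs.begin() + from, regs.end(), regs[t].end,
                [](int end, const Lifetime &l) { return end < l.begin; });
            // Skip candidates that were already merged elsewhere.
            while (it != regs.end() && merged[it - regs.begin()])
                ++it;
            if (it == regs.end())
                break;
            std::size_t s = it - regs.begin();
            rename[it->reg] = regs[t].reg;  // reuse the target register
            regs[t].end = it->end;          // target now lives longer
            merged[s] = true;
            from = s + 1;
        }
    }
    return rename;
}
```

Extending the target's end after each merge keeps later searches from picking a candidate that overlaps a register already merged into the same target.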
Gert Wollny 7be6d8fe12 mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker
This patch adds a set of unit tests for the new lifetime tracker.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:43 +02:00
Gert Wollny 978c437b12 mesa/st: glsl_to_tgsi: implement new temporary register lifetime tracker
This patch adds a class for tracking the life times of temporary registers
in the glsl to tgsi translation. The algorithm runs in three steps:
First, in order to minimize the number of needed memory allocations the
program is scanned to evaluate the number of scopes.
Then, the program is scanned a second time to record the important register
access time points: first and last reads and writes and their link to the
execution scope (loop, if/else branch, switch case).
In the third step for each register the actual minimal life time is
evaluated.

In addition, when compiled in debug mode (i.e. NDEBUG is not defined)
the shaders and estimated temporary life times can be logged to stderr
by setting the environment variable GLSL_TO_TGSI_RENAME_DEBUG.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:39 +02:00
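
To make the notion of a register life time from 978c437b12 concrete, here is a deliberately reduced sketch that records only the first write and the last access of each temporary in straight-line code. It ignores the scope handling (loops, branches, switch cases) that the commit describes, and the Instr and Life types as well as estimate_lifetimes() are hypothetical names, not part of Mesa.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical instruction: one destination temporary (-1 if none)
// and any number of source temporaries.
struct Instr {
    int dst;
    std::vector<int> srcs;
};

// Life time as a closed interval of instruction indices; -1 means unused.
struct Life {
    int begin = -1;
    int end = -1;
};

std::vector<Life> estimate_lifetimes(const std::vector<Instr> &prog, int num_temps)
{
    std::vector<Life> life(num_temps);
    for (int line = 0; line < static_cast<int>(prog.size()); ++line) {
        // Every read pushes the end of the life time forward.
        for (int src : prog[line].srcs)
            life[src].end = std::max(life[src].end, line);

        int dst = prog[line].dst;
        if (dst >= 0) {
            // The first write starts the life time.
            if (life[dst].begin < 0)
                life[dst].begin = line;
            // A write keeps the register alive at least until this line.
            life[dst].end = std::max(life[dst].end, line);
        }
    }
    return life;
}
```

A tracker that handles control flow has to widen these intervals, for example when a register written before a loop is read inside it, which is why the commit also records the execution scope of each access.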
Gert Wollny 732246701f mesa/st: glsl_to_tgsi move some helper classes to extra files
To prepare the implementation of a temp register lifetime tracker
some of the classes are moved into separate header/implementation
files to make them accessible from other files.

Specifically these are:

    class st_src_reg;
    class st_dst_reg;
    class glsl_to_tgsi_instruction;
    struct rename_reg_pair;

    int swizzle_for_type(const glsl_type *type, int component);

  as inline:

    bool is_resource_instruction(unsigned opcode);
    unsigned num_inst_dst_regs(const glsl_to_tgsi_instruction *op);
    unsigned num_inst_src_regs(const glsl_to_tgsi_instruction *op);

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:27 +02:00
Dave Airlie b65ff7a02d st_glsl_to_tgsi: rewrite rename registers to use array fully.
Instead of having to search the whole array, just use the whole
thing and store a valid bit in there with the rename.

Removes this from the profile on some of the fp64 tests

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 11:44:16 +02:00
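
The data layout change in b65ff7a02d can be illustrated with a small sketch. The names RenameEntry and apply_renames() are hypothetical and the code is not taken from Mesa; it only shows why a table indexed by register number with a valid bit avoids searching a list of rename pairs.

```cpp
#include <cassert>
#include <vector>

// One slot per register index. 'valid' says whether this register is
// renamed at all; 'new_reg' is only meaningful when 'valid' is set.
struct RenameEntry {
    bool valid = false;
    int new_reg = 0;
};

// Rewrites register operands in place. Each lookup is a single array
// access instead of a scan over all rename pairs.
void apply_renames(std::vector<int> &operands,
                   const std::vector<RenameEntry> &renames)
{
    for (int &reg : operands) {
        if (renames[reg].valid)
            reg = renames[reg].new_reg;
    }
}
```

The table costs one entry per register, whether renamed or not, in exchange for constant-time lookups.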
Nicolai Hähnle 45c5c44451 radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug
When the HS wave is empty, the hardware writes the LS VGPRs starting at
v0 instead of v2. Work around this by shifting them back into place when
necessary. For simplicity, this is always done in the LS prolog.

According to the hardware team, this will be fixed in future chips,
so take that into account already.

Note that this is not a bug fix, as the bug was already worked
around by commit 166823bfd2 ("radeonsi/gfx9: add a temporary workaround
for a tessellation driver bug"). This change merely replaces the
workaround by one that should be better.

v2: add workaround code to shader only when necessary
v3: clarify the prefer_mono comment

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 10:02:49 +02:00
Nicolai Hähnle 552aaa11ed ac/debug: take ASIC generation into account when printing registers
There were some overlapping changes in gfx9 especially in the CB/DB
blocks which made register dumps rather misleading.

The split is along the lines of the header files, so we'll print VI-only
fields on SI and CI, for example, but we won't print GFX9 fields on
SI/CI/VI, and we won't print SI/CI/VI fields on GFX9.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 09:59:19 +02:00
Nicolai Hähnle 274f1dace7 amd/common: pass chip_class to ac_dump_reg
Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 09:59:17 +02:00
Nicolai Hähnle 925ad7d2f6 ac/sid_tables: add FieldTable object
Automatically re-use table entries like StringTable and IntTable do.
This allows us to get rid of the "fields_owner" logic, and simplifies
the next change.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 09:59:14 +02:00
Nicolai Hähnle 981335b704 ac/sid_tables: remove unused variable varname_values
Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 09:59:07 +02:00
Nicolai Hähnle 34124e412f radeonsi/gfx9: always flush DB metadata on framebuffer changes
This fixes GL45-CTS.shader_image_load_store.basic-glsl-earlyFragTests.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 09:57:08 +02:00
Nicolai Hähnle 1e247511e5 util/ralloc: set prev-pointers correctly in ralloc_adopt
Found by inspection.

I'm not aware of any actual failures caused by this, but a precise
sequence of ralloc_adopt and ralloc_free should be able to cause
problems.

v2: make the code slightly clearer (Eric)

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-06 09:56:19 +02:00
Iago Toral Quiroga 94f740e3fc mesa/main: Fix GetTextureImage error reporting
GetTex*Image should return INVALID_ENUM if target is not valid, however,
GetTextureImage does not receive a target, and instead should return
INVALID_OPERATION if the effective target is not valid. From the
OpenGL 4.6 core profile spec, section 8.11 Texture Queries:

"An INVALID_OPERATION error is generated by GetTextureImage if the effective
 target is not one of TEXTURE_1D, TEXTURE_2D, TEXTURE_3D, TEXTURE_1D_ARRAY,
 TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY, TEXTURE_RECTANGLE, or
 TEXTURE_CUBE_MAP (for GetTextureImage only)."

Fixes:
KHR-GL45.direct_state_access.textures_image_query_errors

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-09-06 08:19:53 +02:00
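
The rule from 94f740e3fc amounts to picking the error code based on the kind of entry point. A minimal sketch, assuming a hypothetical helper illegal_target_error() and redefining the two GL error values locally so that no OpenGL header is needed:

```cpp
#include <cassert>

// Numeric values of the standard OpenGL error enums, redefined here so
// the sketch compiles without GL headers.
constexpr unsigned kInvalidEnum = 0x0500;       // GL_INVALID_ENUM
constexpr unsigned kInvalidOperation = 0x0502;  // GL_INVALID_OPERATION

// Hypothetical helper: GetTextureImage (direct state access) has no
// target parameter, so an invalid effective target is reported as
// INVALID_OPERATION; the GetTex*Image entry points that take a target
// report INVALID_ENUM.
unsigned illegal_target_error(bool is_dsa_entry_point)
{
    return is_dsa_entry_point ? kInvalidOperation : kInvalidEnum;
}
```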
Tapani Pälli c77ea0501c egl: remove unused 'Screens' array from _egl_display
This was used by EGL_MESA_screen_surface that has been removed
in commit 7a58262e58.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-06 07:59:14 +03:00
Dave Airlie e38685cc62 Revert "radv: disable support for VEGA for now."
This reverts commit 611076a41a.

With the two previous commits, Vega should no longer be unstable.
It doesn't pass CTS yet, but it can do a complete run, and games
shouldn't hang anymore, so bring it back online.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 03:23:10 +01:00
Dave Airlie 6d929d3f85 radv/gfx9: set descriptor up for base_mip to level range.
This is required on GFX9, fixes a bug in Talos where all the
mipmaps overlay each other.

Just pushing this as well as it fixes Talos.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 03:22:22 +01:00
Dave Airlie d118ff8765 radv: disable 1d/2d linear optimisation on gfx9.
This causes hangs in some of the CTS tests with a 2d
1536x2 texture.

This fixes hangs with:
dEQP-VK.pipeline.image.suballocation.sampling_type.combined.iew_type.1d_aray.format.r4g4b4a4_unorm_pack16.count_1.size.512x1_array_of_3
If we re-enable it, make sure these don't regress.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 03:06:08 +01:00
Dave Airlie b880cd3b59 radv/gfx9: fix buffer size on gfx9.
The VI sizing only applies to VI.

This fixes:
dEQP-VK.image.image_size.buffer.*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 03:05:44 +01:00
Bas Nieuwenhuizen ff23e03d60 radv: Fix vkCopyImage with both depth and stencil aspects.
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-06 01:54:37 +02:00
Dave Airlie 9e6b382142 mesa/mtypes: repack gl_sampler_object.
Reduces the size from 160 to 152 bytes.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 06:14:25 +10:00
Dave Airlie ff6123925c mesa/mtypes: repack gl_texture_object.
reduces size from 1144 to 1128.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 06:13:52 +10:00
Dave Airlie ef660abdd5 mesa/mtypes: repack gl_shader_program_data.
This reduces the size from 144 bytes to 128 bytes.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 06:13:22 +10:00
Dave Airlie 449ac347dd mesa/mtypes: reorganise gl_shader
This reduces this from 200->182 bytes.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 06:13:03 +10:00
Dave Airlie a53c63e46b mesa/mtypes: repack display list structs.
This reduces each of these by 8 bytes.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 06:12:53 +10:00
Dave Airlie a265ffa69f mesa/mtypes: reduce size of gl_sync_object.
Drops from 40->32 bytes.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 06:12:47 +10:00
Dave Airlie e4bcbe03b5 mesa/mtypes: reorg vertex/fragment program state.
reduces both of these by 8 bytes.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 06:12:44 +10:00
Dave Airlie cff02d214f mesa/bindless: reorder gl_bindless_image gl_bindless_sampler.
This makes these use 16-bytes instead of 24-bytes.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 06:12:12 +10:00
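
The mtypes and bindless repacking commits above all rely on the same effect: reordering members so the compiler inserts less padding. The sketch below uses two hypothetical structs, not the real Mesa ones. The exact sizes depend on the ABI; on a typical 64-bit target the first struct occupies 24 bytes and the second 16.

```cpp
#include <cassert>
#include <cstdint>

// Small members on both sides of an 8-byte member: on a typical 64-bit
// ABI the compiler pads after flag_a and again after flag_b.
struct Unpacked {
    std::uint8_t flag_a;
    std::uint64_t handle;
    std::uint8_t flag_b;
};

// Same members, largest first: the two flags share one padded tail.
struct Repacked {
    std::uint64_t handle;
    std::uint8_t flag_a;
    std::uint8_t flag_b;
};

// Holds on any ABI: reordering this way never makes the struct larger.
static_assert(sizeof(Repacked) <= sizeof(Unpacked),
              "repacking must not grow the struct");
```

On Linux, a tool such as pahole can print the holes in an existing struct, which makes candidates for this kind of reordering easy to spot.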
Samuel Pitoiset 7f952eb931 radv: fix a memleak when compiling the GS copy shader
Found by inspection.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-05 21:36:44 +02:00
Charmaine Lee c12ef63b69 svga: move index buffer bind flag assertion
The buffer bind flags can be promoted in svga_buffer_handle(), so
move the assertion after it. This has already been done for
the vertex buffer in commit 6b4bf7e8be, but that commit missed the
one for the index buffer.

Fixes an assertion when running WarThunder.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-09-05 10:31:18 -06:00
Charmaine Lee 98badd7f6e svga: avoid emitting redundant SetShaderResources and SetVertexBuffers
Minor performance improvement in avoiding binding the same shader resource
or the same vertex buffer for the same slot.

Tested with MTT glretrace.

v2: Per Brian's suggestion, add a helper function to do vertex buffer
    comparison.
v3: Change the helper function to vertex_buffers_equal().

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-09-05 10:31:18 -06:00
Jason Ekstrand e439908af9 spirv: Add support for the HelperInvocation builtin
I have no idea how this got missed but it's been missing since forever.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-09-05 16:39:24 +03:00
Thomas Hellstrom 86df05eb26 loader/dri3: Use client local back to front blit in copySubBuffer if available
The copySubBuffer functionality always attempted a server side blit from
back to fake front if a fake front was present, and we weren't displaying
on a remote GPU.

Now that we always have local blit capability on modern drivers, first
attempt a local blit, and only if that fails, try the server blit.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Axel Davy <axel.davy@normalesup.org>
2017-09-05 12:22:17 +02:00
Marek Olšák c3ebac6890 radeonsi/gfx9: implement primitive binning
This increases performance, but it was tuned for Raven, not Vega.
We don't know yet how Vega will perform, hopefully not worse.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-05 12:09:02 +02:00
Marek Olšák 51e10c2770 radeonsi: add more state flags into si_state_dsa
3 flags for primitive binning, 2 flags for out-of-order rasterization
(but that will be done some other time)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-05 12:09:02 +02:00
Marek Olšák 0797eea758 radeonsi/gfx9: don't use BREAK_BATCH and FLUSH_DFSM if DFSM is disabled
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-05 12:09:02 +02:00
Tapani Pälli 0986f68632 vbo: fix build errors on android
incompatible pointer to integer conversion assigning to 'GLintptr' (aka 'int')
from 'const char *' [-Werror,-Wint-conversion]

      offset = indices;
             ^ ~~~~~~~

Fixes: 2d93b462b4 ("vbo: fix offset in minmax cache key")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-05 07:55:34 +03:00
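
The diagnostic quoted in 0986f68632 comes from assigning a pointer directly to an integer. A minimal sketch of the usual remedy, an explicit conversion, is shown below. The alias offset_t is a hypothetical stand-in for GLintptr and index_offset() is an illustrative helper; the actual change in the commit may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for GLintptr: an integer wide enough for a pointer.
using offset_t = std::intptr_t;

// When an index buffer is bound, the 'indices' argument carries a byte
// offset disguised as a pointer. Writing `offset = indices;` is rejected
// (or warned about, then turned into an error by -Werror), so the
// conversion has to be spelled out.
offset_t index_offset(const void *indices)
{
    return reinterpret_cast<offset_t>(indices);
}
```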
Emil Velikov bddf4a51c1 docs: add news item and link release notes for 17.2.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-04 18:26:34 +01:00
Emil Velikov cd48ffc755 docs: add sha256 checksums for 17.2.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b4473dd5191878249ccb53f40407206f1e57fa6f)
2017-09-04 18:24:52 +01:00
Emil Velikov f60fe7a448 docs: Update 17.2.0 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f5925b2897308530c64e1abf44ebc1ee0e017ada)
2017-09-04 18:24:51 +01:00
Marek Olšák fb7ba68f6c radeonsi: eliminate PS color outputs when colormask kills them
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-04 15:10:39 +02:00
Marek Olšák 468c131033 gallium/radeon: sort DBG shader flags according to pipe_shader_type
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-04 15:10:39 +02:00
Nicolai Hähnle 50283109aa radeonsi: ensure cache flushes happen before SET_PREDICATION packets
The data is read when the render_cond_atom is emitted, so we must
delay emitting the atom until after the flush.

Fixes: 0fe0320dc0 ("radeonsi: use optimal packet order when doing a pipeline sync")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-04 13:50:57 +02:00