Commit Graph

144210 Commits

Author SHA1 Message Date
Timur Kristóf abcc83e713 aco: Fix to_uniform_bool_instr when operands are not suitable.
Don't attempt to transform uniform boolean instructions when
their operands are unsuitable. This can happen eg. due to other
optimizations that combine SALU instructions which clear out
the uniform instruction labels.

Cc: mesa-stable
Fixes: 8a32f57fff
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11573>
2021-08-25 12:43:50 +00:00
Lionel Landwerlin a13e79843e nir: prevent peephole from generating invalid NIR
We can't append instructions following a return/halt instruction
because the control flow helpers will modify the successor of the
block containing the return/halt. And the NIR validator enforces that
the return/halt must have the end of the function as successor.

This tends to happen following lower_shader_calls lowering which
inserts halts. This probably doesn't prevent the optimization, it'll
just happen in one of the return shaders after the halt has been
removed.

v2: Move prev block ending check earlier in the function (Daniel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12506>
2021-08-25 11:38:21 +00:00
Samuel Pitoiset e0a703af11 ci: update the list of expected failures/skips for RADV
Against CTS 1.2.7.0.
Tested chips are Pitcairn, Polaris10, Navi14 and Sienna Cichlid.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12539>
2021-08-25 13:00:07 +02:00
Tomeu Vizoso 3f5053b899 iris/ci: Add manual jobs for tracking performance
Use Piglit's replay profile to measure and store the time that frames
take to render in the GPU.

This job won't run automatically in regular pipelines, but will be
triggered automatically by a script for every successful pre-merge
pipeline.

This is because we want to generate performance data for every relevant
commit merged in main, but we don't want to keep a device busy during
the pre-merge run.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12236>
2021-08-25 09:32:17 +02:00
Samuel Pitoiset cff106c4b6 nir/opt_algebraic: optimize fmax(-fmin(b, a), b) -> fmax(fabs(b), -a)
and fmin(-fmax(b, a)) to fmin(-fabs(b), -a).

fossils-db (Sienna Cichlid):
Totals from 34 (0.02% of 150170) affected shaders:
CodeSize: 388540 -> 387748 (-0.20%)
Instrs: 74621 -> 74423 (-0.27%)
Latency: 1039407 -> 1039011 (-0.04%)
InvThroughput: 208364 -> 208150 (-0.10%)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12519>
2021-08-25 07:18:24 +02:00
Dave Airlie ad78643061 crocus: add missing fs dirty on reduced prim change.
the reduced prim is used to decide some line antialiasing settings.
this fixes mesa-demos antialias

Fixes: f3630548f1 ("crocus: initial gallium driver for Intel gfx 4-7")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12536>
2021-08-25 03:30:16 +00:00
Dave Airlie 6b7a68b7c2 crocus: add missing line smooth bits.
Just noticed this in passing.

Fixes: f3630548f1 ("crocus: initial gallium driver for Intel gfx 4-7")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12536>
2021-08-25 03:30:16 +00:00
Mike Blumenkrantz 560dc4f790 zink: fix pipeline caching
this was apparently always broken, but in a very, very subtle way
where the hash table would compare the current pipeline state against
itself instead of using the cache entry's state

Cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12535>
2021-08-25 03:11:41 +00:00
Mike Blumenkrantz 712a4d2fd2 zink: fix program init flag
this was accidentally !! instead of ! as intended

Fixes: c4702204bc ("zink: optimize shader recalc")

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12535>
2021-08-25 03:11:41 +00:00
Michael Tang 4237aa3a7e spirv_to_dxil: Run nir_lower_tex during compilation
We need this to get e.g. a default lod for some instructions when it is
not provided.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12462>
2021-08-24 22:18:30 +00:00
Dave Airlie 4c260f017c crocus: drop u_primconvert header.
This is just leftover.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12531>
2021-08-24 21:38:27 +00:00
Mike Blumenkrantz ea18b0930b zink: add better TODO note for surface swizzles
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12529>
2021-08-24 21:23:45 +00:00
Mike Blumenkrantz 6ff5eaa7d5 zink: make void swizzle clamping util public
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12529>
2021-08-24 21:23:45 +00:00
Mike Blumenkrantz 52032d5efa zink: make component mapping function a static inline
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12529>
2021-08-24 21:23:45 +00:00
Mike Blumenkrantz 08bad3b2b8 zink: move void format detection function to zink_format
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12529>
2021-08-24 21:23:45 +00:00
Mike Blumenkrantz e645e3c523 nine: replace unnecessary dynamic-sized array with bitfield
PIPE_MAX_VERTEX_STREAMS is 4, so this can be simplified to reduce cpu usage

Reviewed-by: <Axel Davy davyaxel0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12523>
2021-08-24 20:38:41 +00:00
Alyssa Rosenzweig 16b4916432 panfrost: Take a ctx when submitting/destroying
This reduces the number of batch->ctx shenanigans we do, and in turn
should reduce raciness.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12366>
2021-08-24 20:20:29 +00:00
Ian Romanick fe956d0182 spirv: Add support for SPV_KHR_integer_dot_product
v2 (Ivan): Add missing capability enum handling.

v3 (idr): Properly handle cases where dest_size != 32.

v4 (idr): Rewrite most of the error checking to use vtn_fail_if.  Use
nir_ssa_def with vtn_push_nir_ssa instead of vtn_ssa_value with
vtn_push_ssa_value.  All suggested by Jason.  Massive rewrite of the
handling of packed 4x8 saturating opcodes.  Based on some observations
made by Jason.

v5 (idr): Remove some debugging cruft accidentally added in v4.  Noticed
by Jason.

v6: Emit packed versions of vectored instructions when possible.
Suggested by Jason.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Ian Romanick 652d304ee9 spirv: Update headers and metadata from latest Khronos commit
This corresponds to e7b49d7 ("Implement SPV_INTEL_optnone extension
(#230)") in https://github.com/KhronosGroup/SPIRV-Headers.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Ian Romanick a6db40605e nir/algebraic: Add some extract optimizations
These help quite a bit when vectored versions of SpvOpSDotKHR and
friends are emitted as packed versions and then lowered.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Ian Romanick 839495efc6 nir/algebraic: Add lowering for dot_4x8 instructions
v2: Fix copy-and-paste bugs in lowering patterns.

v3: Add has_sudot_4x8 flag.  Requested by Rhys.

v4: Since the names of the opcodes changed from dp4 to dot_4x8, also
change the names of the lowering helpers.  Suggested by Jason.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Ian Romanick 0f809dbf40 intel/compiler: Basic support for DP4A instruction
v2: Very significant rebase on changes to previous commits.
Specifically, brw_fs_nir.cpp changes were pretty much rewritten from
scratch after changing the NIR opcode names and types.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Ian Romanick 806cd2341c nir/algebraic: Basic patterns for dot_4x8
v2: Add and modify patterns to let constant folding do better.

v3: Remove '(is_not_zero)' from the patterns that try to combine
addends.  I honestly don't know why I had it there in the first place,
and nothing in my deep git logs could help clue me in.  Noticed by
Alyssa.  Remover patterns that detect open-coded udot_4x8.  Suggested by
Alyssa and Jason.  Add missing sudot_4x8 patterns.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Ian Romanick 6c18a3b497 nir/opcodes: Add integer dot-product opcodes
Six opcodes are added: sdot_4x8_iadd, udot_4x8_uadd, sudot_4x8_iadd,
sdot_4x8_iadd_sat, udot_4x8_uadd_sate, and sudot_4x8_iadd_sat.  These
represent the combinations of integer dot-product and add that operate
on packed source vectors.  That is, the four 8-bit values for each
vector is stored in a single 32-bit integer.

Some hardware may prefer to operate on unpacked byte vectors.  When such
hardware comes to Mesa, we'll have to figure out how to name things.

v2: Add nir_op_iudp4a and nir_op_iudp4a_sat instructions.  These opcodes
are not 2-source commutative.

v3: Rename all opcodes to be more like some existing 4x8 opcodes.
Suggested by Timur.  Change type of packed vector sources to uint32,
change types of constant folding variables to have explicit size, and
delete some extra casts.  All suggested by Jason.

v4: Fix typo previously noticed by Alyssa but missed in v2.

v5: Add has_sudot_4x8 flag.  Requested by Rhys.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Ian Romanick 7d8bf7c167 nir/lower_bit_size: Support add_sat and sub_sat
Without this, lowered saturating ALU instructions would only clamp to
the range of the new type instead of the range of the old type.

v2: Use nir_iclamp.  Suggested by Jason. Use new
u_{int,uint}N_{min,max}() helpers.

Fixes: 090e282407 ("nir: Add a saturated unsigned integer add opcode")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Alyssa Rosenzweig 8503cab2e0 panfrost: Replace writers pointer with hash table
This ensures each context can have a separate batch writing a resource
and we don't race trying to flush each other's batches. Unfortunately
the extra hash table operations regress draw-overhead numbers by about
8% but I'd rather eat the overhead and have an obviously correct
implementation than leave known buggy code in tree.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>
2021-08-24 19:39:51 +00:00
Alyssa Rosenzweig afebbadda8 panfrost: Remove writer = NULL assignments
These already happened.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>
2021-08-24 19:39:51 +00:00
Alyssa Rosenzweig e7eb28fed0 panfrost: Remove rsrc->track.users
No longer needed.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>
2021-08-24 19:39:51 +00:00
Alyssa Rosenzweig e98aa55413 panfrost: Prefer batch->resources to rsrc->users
This expresses the semantic of the flush only applying to batches within
the context, not globally, in line with OpenGL's multithreading rules.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>
2021-08-24 19:39:51 +00:00
Alyssa Rosenzweig 5c4fbae571 panfrost: Add foreach_batch iterator
Using the active mask.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>
2021-08-24 19:39:51 +00:00
Alyssa Rosenzweig 79dd1a4e63 panfrost: Maintain a bitmap of active batches
This is on the context, so no concurrency issues. This will allow us to
efficiently iterate active batches.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>
2021-08-24 19:39:51 +00:00
Alyssa Rosenzweig b8da5b1b7f panfrost: Cache number of users of a resource
This can be tracked efficiently with atomics, and reduces the places we
use the rsrc->track.users bitmap which has concurrency issues.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>
2021-08-24 19:39:51 +00:00
Alyssa Rosenzweig 2f63ccd080 panfrost: Switch resources from an array to a set
This will help us reduce shared state and simplify multithreading, at
the expense of additional CPU overhead.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>
2021-08-24 19:39:51 +00:00
Mike Blumenkrantz 8e2159a57f zink: stop referencing framebuffers
this is a waste of cycles now that surfaces are accurately tracked;
no-attachment fbs are still deferred to avoid premature deletion

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12429>
2021-08-24 19:23:48 +00:00
Mike Blumenkrantz 1af6618694 zink: defer deletion of no-attachment framebuffers
the ref on these is owned by the context, so defer deletion to avoid
premature destruction if the fb might be in use

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12429>
2021-08-24 19:23:48 +00:00
Alyssa Rosenzweig 3622562e44 panfrost: Inline add_fbo_bos
Only used once, it's just complicating the batch cache interface.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>
2021-08-24 14:31:55 -04:00
Alyssa Rosenzweig 4991e17297 panfrost: Remove get_fresh_batch
Unused, and of dubious value.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>
2021-08-24 14:31:54 -04:00
Alyssa Rosenzweig bd15e5e6af panfrost: Move bo->label assignment into the lock
We already took the lock, we just unlocked too early. Since the label is
reset in the BO cache, this is racy. Minimal impact in practice but is
still /wrong/ and caught by helgrind.

Fixes: 3fa1f93dac ("panfrost: Label all BOs in userspace")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>
2021-08-24 14:31:54 -04:00
Alyssa Rosenzweig e6924be737 panfrost: Don't use ralloc for resources
ralloc is not thread safe, so we cannot use a pipe_screen as a ralloc
context unless we lock the screen. The allocation patterns for resources
are trivial, so just use malloc/calloc/free directly instead of ralloc.
This fixes a segfault in:

dEQP-EGL.functional.sharing.gles2.multithread.random.images.copytexsubimage2d.1

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>
2021-08-24 14:31:54 -04:00
Alyssa Rosenzweig 9307028255 panfrost: Simplify get_fresh_batch_for_fbo
Makes the code easier to read, too.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>
2021-08-24 14:31:54 -04:00
Alyssa Rosenzweig 81a76d9e42 panfrost: Remove null check in batch_cleanup
Shouldn't happen.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>
2021-08-24 14:31:54 -04:00
Alyssa Rosenzweig 40edc87956 panfrost: Protect the variants array with a lock
Without a lock, two threads may bind the same shader CSO simultaneously,
allocate the same variant simultaneously, and then race each other in
the compiler. This manifests in various ways, most commonly failing the
assertion that UBO pushing has only run once. The simple_mtx_t solution
is used in Iris. Fixes the crash in:

dEQP-EGL.functional.sharing.gles2.multithread.simple.buffers.bufferdata_render

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>
2021-08-24 14:31:54 -04:00
Alyssa Rosenzweig 67821af1de panfrost/ci: Don't skip matrix inverse tests
Older versions of these tests were buggy and failed on Bifrost. The test
bug has been resolved upstream, but the skip list was not updated when
dEQP was uprevved with the fix. Run the tests.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12313>
2021-08-24 18:12:14 +00:00
Alyssa Rosenzweig 324a32ac14 panfrost/ci: Switch to suite support
Use the new deqp-runner suite support to combine our dEQP-GLES2,
dEQP-GLES3, and dEQP-GLES31 jobs into a single job. This simplifies load
balancing, enabling us to expand our test coverage without impacting
wall clock time.

With the new infrastructure in place, we add KHR-GLES* jobs for Mali
G52. This would have caught some recent regressions. Once we hit
conformance it's essential we remain conformant.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12313>
2021-08-24 18:12:14 +00:00
Rhys Perry 3d228b6926 nir/gcm: pin some instructions which require uniform sources
fossil-db (Sienna Cichlid, GCM enabled):
Totals from 6192 (4.12% of 150170) affected shaders:
VGPRs: 548392 -> 542040 (-1.16%)
SpillSGPRs: 3702 -> 3990 (+7.78%); split: -0.54%, +8.32%
CodeSize: 62418488 -> 62481516 (+0.10%); split: -0.07%, +0.17%
MaxWaves: 70582 -> 71718 (+1.61%)
Instrs: 11768497 -> 11795079 (+0.23%); split: -0.07%, +0.30%
Latency: 445891848 -> 523561297 (+17.42%); split: -0.07%, +17.49%
InvThroughput: 115675481 -> 121494913 (+5.03%); split: -0.09%, +5.12%
VClause: 164914 -> 164934 (+0.01%); split: -0.05%, +0.06%
SClause: 405991 -> 395302 (-2.63%); split: -2.64%, +0.00%
Copies: 907216 -> 926429 (+2.12%); split: -1.11%, +3.23%
Branches: 456373 -> 457478 (+0.24%); split: -0.13%, +0.38%
PreSGPRs: 648030 -> 642953 (-0.78%); split: -0.88%, +0.10%
PreVGPRs: 522425 -> 516355 (-1.16%); split: -1.16%, +0.00%

Seems to affect Detroit: Become Human and Cyberpunk 2077. The Cyberpunk
2077 changes look like a fixed bug. At least some of the Detroit: Become
Human changes could probably be removed with better divergence analysis.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12444>
2021-08-24 16:52:31 +00:00
Rhys Perry 884ac52eaa nir: consider push constant loads as always dynamically uniform
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12444>
2021-08-24 16:52:31 +00:00
Daniel Schürmann a3110c308f radv: call nir_lower_flrp() after the first radv_optimize_nir()
instead of inside the optimization loop

Totals from 2504 (1.67% of 150170) affected shaders: (GFX10.3)
VGPRs: 162592 -> 162416 (-0.11%); split: -0.12%, +0.01%
CodeSize: 18399756 -> 18383552 (-0.09%); split: -0.10%, +0.01%
MaxWaves: 42654 -> 42748 (+0.22%)
Instrs: 3499404 -> 3497075 (-0.07%); split: -0.08%, +0.01%
Latency: 87087238 -> 87064270 (-0.03%); split: -0.06%, +0.03%
InvThroughput: 21159621 -> 21150546 (-0.04%); split: -0.05%, +0.01%
VClause: 56653 -> 56667 (+0.02%); split: -0.00%, +0.03%
Copies: 226332 -> 226423 (+0.04%); split: -0.15%, +0.19%
Branches: 110027 -> 110025 (-0.00%); split: -0.05%, +0.04%
PreSGPRs: 168087 -> 168076 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 160814 -> 160705 (-0.07%)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12061>
2021-08-24 16:10:30 +00:00
Daniel Schürmann 2cf164feb9 nir/opt_algebraic: optimize flrp(fadd, fadd, x) only if fadd are used_once
Totals from 201 (0.13% of 150170) affected shaders: (GFX10.3)
VGPRs: 13880 -> 13856 (-0.17%)
CodeSize: 1517328 -> 1518124 (+0.05%); split: -0.04%, +0.10%
MaxWaves: 3184 -> 3192 (+0.25%)
Instrs: 285487 -> 285569 (+0.03%); split: -0.06%, +0.08%
Latency: 7774066 -> 7780877 (+0.09%); split: -0.10%, +0.19%
InvThroughput: 1936341 -> 1935287 (-0.05%); split: -0.07%, +0.02%
SClause: 11446 -> 11448 (+0.02%); split: -0.01%, +0.03%
Copies: 17500 -> 17506 (+0.03%); split: -0.51%, +0.55%
Branches: 8174 -> 8180 (+0.07%); split: -0.13%, +0.21%
PreVGPRs: 12507 -> 12427 (-0.64%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12061>
2021-08-24 16:10:30 +00:00
Daniel Schürmann 89a842b2b6 nir/loop_analyze: consider instruction cost of nir_op_flrp
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12061>
2021-08-24 16:10:30 +00:00
Chia-I Wu 572ed22494 venus: use uint32_t in vn_ring_submit
And in vn_ring_write_buffer as well, to fix the assert in
vn_ring_write_buffer.

The ring code uses 32-bit unsigned integers and relies on that their
overflow/underflow behavior is well-defined.  When ring->shared.head is
about to overflow and ring->cur has overflowed, this expression

  ring->cur + size - vn_ring_load_head(ring)

gives an incorrect result when size is 64-bit.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12494>
2021-08-24 08:56:16 -07:00