KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Timur Kristóf	abcc83e713	aco: Fix to_uniform_bool_instr when operands are not suitable. Don't attempt to transform uniform boolean instructions when their operands are unsuitable. This can happen eg. due to other optimizations that combine SALU instructions which clear out the uniform instruction labels. Cc: mesa-stable Fixes: `8a32f57fff` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11573>	2021-08-25 12:43:50 +00:00
Lionel Landwerlin	a13e79843e	nir: prevent peephole from generating invalid NIR We can't append instructions following a return/halt instruction because the control flow helpers will modify the successor of the block containing the return/halt. And the NIR validator enforces that the return/halt must have the end of the function as successor. This tends to happen following lower_shader_calls lowering which inserts halts. This probably doesn't prevent the optimization, it'll just happen in one of the return shaders after the halt has been removed. v2: Move prev block ending check earlier in the function (Daniel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12506>	2021-08-25 11:38:21 +00:00
Samuel Pitoiset	e0a703af11	ci: update the list of expected failures/skips for RADV Against CTS 1.2.7.0. Tested chips are Pitcairn, Polaris10, Navi14 and Sienna Cichlid. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12539>	2021-08-25 13:00:07 +02:00
Tomeu Vizoso	3f5053b899	iris/ci: Add manual jobs for tracking performance Use Piglit's replay profile to measure and store the time that frames take to render in the GPU. This job won't run automatically in regular pipelines, but will be triggered automatically by a script for every successful pre-merge pipeline. This is because we want to generate performance data for every relevant commit merged in main, but we don't want to keep a device busy during the pre-merge run. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12236>	2021-08-25 09:32:17 +02:00
Samuel Pitoiset	cff106c4b6	nir/opt_algebraic: optimize fmax(-fmin(b, a), b) -> fmax(fabs(b), -a) and fmin(-fmax(b, a), b) -> fmin(-fabs(b), -a). fossil-db (Sienna Cichlid): Totals from 34 (0.02% of 150170) affected shaders: CodeSize: 388540 -> 387748 (-0.20%) Instrs: 74621 -> 74423 (-0.27%) Latency: 1039407 -> 1039011 (-0.04%) InvThroughput: 208364 -> 208150 (-0.10%) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12519>	2021-08-25 07:18:24 +02:00
Dave Airlie	ad78643061	crocus: add missing fs dirty on reduced prim change. the reduced prim is used to decide some line antialiasing settings. this fixes mesa-demos antialias Fixes: `f3630548f1` ("crocus: initial gallium driver for Intel gfx 4-7") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12536>	2021-08-25 03:30:16 +00:00
Dave Airlie	6b7a68b7c2	crocus: add missing line smooth bits. Just noticed this in passing. Fixes: `f3630548f1` ("crocus: initial gallium driver for Intel gfx 4-7") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12536>	2021-08-25 03:30:16 +00:00
Mike Blumenkrantz	560dc4f790	zink: fix pipeline caching this was apparently always broken, but in a very, very subtle way where the hash table would compare the current pipeline state against itself instead of using the cache entry's state Cc: mesa-stable Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12535>	2021-08-25 03:11:41 +00:00
Mike Blumenkrantz	712a4d2fd2	zink: fix program init flag this was accidentally !! instead of ! as intended Fixes: `c4702204bc` ("zink: optimize shader recalc") Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12535>	2021-08-25 03:11:41 +00:00
Michael Tang	4237aa3a7e	spirv_to_dxil: Run nir_lower_tex during compilation We need this to get e.g. a default lod for some instructions when it is not provided. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12462>	2021-08-24 22:18:30 +00:00
Dave Airlie	4c260f017c	crocus: drop u_primconvert header. This is just leftover. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12531>	2021-08-24 21:38:27 +00:00
Mike Blumenkrantz	ea18b0930b	zink: add better TODO note for surface swizzles Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12529>	2021-08-24 21:23:45 +00:00
Mike Blumenkrantz	6ff5eaa7d5	zink: make void swizzle clamping util public Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12529>	2021-08-24 21:23:45 +00:00
Mike Blumenkrantz	52032d5efa	zink: make component mapping function a static inline Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12529>	2021-08-24 21:23:45 +00:00
Mike Blumenkrantz	08bad3b2b8	zink: move void format detection function to zink_format Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12529>	2021-08-24 21:23:45 +00:00
Mike Blumenkrantz	e645e3c523	nine: replace unnecessary dynamic-sized array with bitfield PIPE_MAX_VERTEX_STREAMS is 4, so this can be simplified to reduce cpu usage Reviewed-by: Axel Davy <davyaxel0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12523>	2021-08-24 20:38:41 +00:00
Alyssa Rosenzweig	16b4916432	panfrost: Take a ctx when submitting/destroying This reduces the number of batch->ctx shenanigans we do, and in turn should reduce raciness. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12366>	2021-08-24 20:20:29 +00:00
Ian Romanick	fe956d0182	spirv: Add support for SPV_KHR_integer_dot_product v2 (Ivan): Add missing capability enum handling. v3 (idr): Properly handle cases where dest_size != 32. v4 (idr): Rewrite most of the error checking to use vtn_fail_if. Use nir_ssa_def with vtn_push_nir_ssa instead of vtn_ssa_value with vtn_push_ssa_value. All suggested by Jason. Massive rewrite of the handling of packed 4x8 saturating opcodes. Based on some observations made by Jason. v5 (idr): Remove some debugging cruft accidentally added in v4. Noticed by Jason. v6: Emit packed versions of vectored instructions when possible. Suggested by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>	2021-08-24 19:58:57 +00:00
Ian Romanick	652d304ee9	spirv: Update headers and metadata from latest Khronos commit This corresponds to e7b49d7 ("Implement SPV_INTEL_optnone extension (#230)") in https://github.com/KhronosGroup/SPIRV-Headers. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>	2021-08-24 19:58:57 +00:00
Ian Romanick	a6db40605e	nir/algebraic: Add some extract optimizations These help quite a bit when vectored versions of SpvOpSDotKHR and friends are emitted as packed versions and then lowered. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>	2021-08-24 19:58:57 +00:00
Ian Romanick	839495efc6	nir/algebraic: Add lowering for dot_4x8 instructions v2: Fix copy-and-paste bugs in lowering patterns. v3: Add has_sudot_4x8 flag. Requested by Rhys. v4: Since the names of the opcodes changed from dp4 to dot_4x8, also change the names of the lowering helpers. Suggested by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>	2021-08-24 19:58:57 +00:00
Ian Romanick	0f809dbf40	intel/compiler: Basic support for DP4A instruction v2: Very significant rebase on changes to previous commits. Specifically, brw_fs_nir.cpp changes were pretty much rewritten from scratch after changing the NIR opcode names and types. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>	2021-08-24 19:58:57 +00:00
Ian Romanick	806cd2341c	nir/algebraic: Basic patterns for dot_4x8 v2: Add and modify patterns to let constant folding do better. v3: Remove '(is_not_zero)' from the patterns that try to combine addends. I honestly don't know why I had it there in the first place, and nothing in my deep git logs could help clue me in. Noticed by Alyssa. Remove patterns that detect open-coded udot_4x8. Suggested by Alyssa and Jason. Add missing sudot_4x8 patterns. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>	2021-08-24 19:58:57 +00:00
Ian Romanick	6c18a3b497	nir/opcodes: Add integer dot-product opcodes Six opcodes are added: sdot_4x8_iadd, udot_4x8_uadd, sudot_4x8_iadd, sdot_4x8_iadd_sat, udot_4x8_uadd_sat, and sudot_4x8_iadd_sat. These represent the combinations of integer dot-product and add that operate on packed source vectors. That is, the four 8-bit values for each vector are stored in a single 32-bit integer. Some hardware may prefer to operate on unpacked byte vectors. When such hardware comes to Mesa, we'll have to figure out how to name things. v2: Add nir_op_iudp4a and nir_op_iudp4a_sat instructions. These opcodes are not 2-source commutative. v3: Rename all opcodes to be more like some existing 4x8 opcodes. Suggested by Timur. Change type of packed vector sources to uint32, change types of constant folding variables to have explicit size, and delete some extra casts. All suggested by Jason. v4: Fix typo previously noticed by Alyssa but missed in v2. v5: Add has_sudot_4x8 flag. Requested by Rhys. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>	2021-08-24 19:58:57 +00:00
Ian Romanick	7d8bf7c167	nir/lower_bit_size: Support add_sat and sub_sat Without this, lowered saturating ALU instructions would only clamp to the range of the new type instead of the range of the old type. v2: Use nir_iclamp. Suggested by Jason. Use new u_{int,uint}N_{min,max}() helpers. Fixes: `090e282407` ("nir: Add a saturated unsigned integer add opcode") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>	2021-08-24 19:58:57 +00:00
Alyssa Rosenzweig	8503cab2e0	panfrost: Replace writers pointer with hash table This ensures each context can have a separate batch writing a resource and we don't race trying to flush each other's batches. Unfortunately the extra hash table operations regress draw-overhead numbers by about 8% but I'd rather eat the overhead and have an obviously correct implementation than leave known buggy code in tree. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>	2021-08-24 19:39:51 +00:00
Alyssa Rosenzweig	afebbadda8	panfrost: Remove writer = NULL assignments These already happened. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>	2021-08-24 19:39:51 +00:00
Alyssa Rosenzweig	e7eb28fed0	panfrost: Remove rsrc->track.users No longer needed. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>	2021-08-24 19:39:51 +00:00
Alyssa Rosenzweig	e98aa55413	panfrost: Prefer batch->resources to rsrc->users This expresses the semantic of the flush only applying to batches within the context, not globally, in line with OpenGL's multithreading rules. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>	2021-08-24 19:39:51 +00:00
Alyssa Rosenzweig	5c4fbae571	panfrost: Add foreach_batch iterator Using the active mask. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>	2021-08-24 19:39:51 +00:00
Alyssa Rosenzweig	79dd1a4e63	panfrost: Maintain a bitmap of active batches This is on the context, so no concurrency issues. This will allow us to efficiently iterate active batches. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>	2021-08-24 19:39:51 +00:00
Alyssa Rosenzweig	b8da5b1b7f	panfrost: Cache number of users of a resource This can be tracked efficiently with atomics, and reduces the places we use the rsrc->track.users bitmap which has concurrency issues. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>	2021-08-24 19:39:51 +00:00
Alyssa Rosenzweig	2f63ccd080	panfrost: Switch resources from an array to a set This will help us reduce shared state and simplify multithreading, at the expense of additional CPU overhead. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>	2021-08-24 19:39:51 +00:00
Mike Blumenkrantz	8e2159a57f	zink: stop referencing framebuffers this is a waste of cycles now that surfaces are accurately tracked; no-attachment fbs are still deferred to avoid premature deletion Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12429>	2021-08-24 19:23:48 +00:00
Mike Blumenkrantz	1af6618694	zink: defer deletion of no-attachment framebuffers the ref on these is owned by the context, so defer deletion to avoid premature destruction if the fb might be in use Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12429>	2021-08-24 19:23:48 +00:00
Alyssa Rosenzweig	3622562e44	panfrost: Inline add_fbo_bos Only used once, it's just complicating the batch cache interface. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>	2021-08-24 14:31:55 -04:00
Alyssa Rosenzweig	4991e17297	panfrost: Remove get_fresh_batch Unused, and of dubious value. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>	2021-08-24 14:31:54 -04:00
Alyssa Rosenzweig	bd15e5e6af	panfrost: Move bo->label assignment into the lock We already took the lock, we just unlocked too early. Since the label is reset in the BO cache, this is racy. Minimal impact in practice but is still /wrong/ and caught by helgrind. Fixes: `3fa1f93dac` ("panfrost: Label all BOs in userspace") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>	2021-08-24 14:31:54 -04:00
Alyssa Rosenzweig	e6924be737	panfrost: Don't use ralloc for resources ralloc is not thread safe, so we cannot use a pipe_screen as a ralloc context unless we lock the screen. The allocation patterns for resources are trivial, so just use malloc/calloc/free directly instead of ralloc. This fixes a segfault in: dEQP-EGL.functional.sharing.gles2.multithread.random.images.copytexsubimage2d.1 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>	2021-08-24 14:31:54 -04:00
Alyssa Rosenzweig	9307028255	panfrost: Simplify get_fresh_batch_for_fbo Makes the code easier to read, too. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>	2021-08-24 14:31:54 -04:00
Alyssa Rosenzweig	81a76d9e42	panfrost: Remove null check in batch_cleanup Shouldn't happen. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>	2021-08-24 14:31:54 -04:00
Alyssa Rosenzweig	40edc87956	panfrost: Protect the variants array with a lock Without a lock, two threads may bind the same shader CSO simultaneously, allocate the same variant simultaneously, and then race each other in the compiler. This manifests in various ways, most commonly failing the assertion that UBO pushing has only run once. The simple_mtx_t solution is used in Iris. Fixes the crash in: dEQP-EGL.functional.sharing.gles2.multithread.simple.buffers.bufferdata_render Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>	2021-08-24 14:31:54 -04:00
Alyssa Rosenzweig	67821af1de	panfrost/ci: Don't skip matrix inverse tests Older versions of these tests were buggy and failed on Bifrost. The test bug has been resolved upstream, but the skip list was not updated when dEQP was uprevved with the fix. Run the tests. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12313>	2021-08-24 18:12:14 +00:00
Alyssa Rosenzweig	324a32ac14	panfrost/ci: Switch to suite support Use the new deqp-runner suite support to combine our dEQP-GLES2, dEQP-GLES3, and dEQP-GLES31 jobs into a single job. This simplifies load balancing, enabling us to expand our test coverage without impacting wall clock time. With the new infrastructure in place, we add KHR-GLES* jobs for Mali G52. This would have caught some recent regressions. Once we hit conformance it's essential we remain conformant. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12313>	2021-08-24 18:12:14 +00:00
Rhys Perry	3d228b6926	nir/gcm: pin some instructions which require uniform sources fossil-db (Sienna Cichlid, GCM enabled): Totals from 6192 (4.12% of 150170) affected shaders: VGPRs: 548392 -> 542040 (-1.16%) SpillSGPRs: 3702 -> 3990 (+7.78%); split: -0.54%, +8.32% CodeSize: 62418488 -> 62481516 (+0.10%); split: -0.07%, +0.17% MaxWaves: 70582 -> 71718 (+1.61%) Instrs: 11768497 -> 11795079 (+0.23%); split: -0.07%, +0.30% Latency: 445891848 -> 523561297 (+17.42%); split: -0.07%, +17.49% InvThroughput: 115675481 -> 121494913 (+5.03%); split: -0.09%, +5.12% VClause: 164914 -> 164934 (+0.01%); split: -0.05%, +0.06% SClause: 405991 -> 395302 (-2.63%); split: -2.64%, +0.00% Copies: 907216 -> 926429 (+2.12%); split: -1.11%, +3.23% Branches: 456373 -> 457478 (+0.24%); split: -0.13%, +0.38% PreSGPRs: 648030 -> 642953 (-0.78%); split: -0.88%, +0.10% PreVGPRs: 522425 -> 516355 (-1.16%); split: -1.16%, +0.00% Seems to affect Detroit: Become Human and Cyberpunk 2077. The Cyberpunk 2077 changes look like a fixed bug. At least some of the Detroit: Become Human changes could probably be removed with better divergence analysis. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12444>	2021-08-24 16:52:31 +00:00
Rhys Perry	884ac52eaa	nir: consider push constant loads as always dynamically uniform Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12444>	2021-08-24 16:52:31 +00:00
Daniel Schürmann	a3110c308f	radv: call nir_lower_flrp() after the first radv_optimize_nir() instead of inside the optimization loop Totals from 2504 (1.67% of 150170) affected shaders: (GFX10.3) VGPRs: 162592 -> 162416 (-0.11%); split: -0.12%, +0.01% CodeSize: 18399756 -> 18383552 (-0.09%); split: -0.10%, +0.01% MaxWaves: 42654 -> 42748 (+0.22%) Instrs: 3499404 -> 3497075 (-0.07%); split: -0.08%, +0.01% Latency: 87087238 -> 87064270 (-0.03%); split: -0.06%, +0.03% InvThroughput: 21159621 -> 21150546 (-0.04%); split: -0.05%, +0.01% VClause: 56653 -> 56667 (+0.02%); split: -0.00%, +0.03% Copies: 226332 -> 226423 (+0.04%); split: -0.15%, +0.19% Branches: 110027 -> 110025 (-0.00%); split: -0.05%, +0.04% PreSGPRs: 168087 -> 168076 (-0.01%); split: -0.01%, +0.00% PreVGPRs: 160814 -> 160705 (-0.07%) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12061>	2021-08-24 16:10:30 +00:00
Daniel Schürmann	2cf164feb9	nir/opt_algebraic: optimize flrp(fadd, fadd, x) only if fadd are used_once Totals from 201 (0.13% of 150170) affected shaders: (GFX10.3) VGPRs: 13880 -> 13856 (-0.17%) CodeSize: 1517328 -> 1518124 (+0.05%); split: -0.04%, +0.10% MaxWaves: 3184 -> 3192 (+0.25%) Instrs: 285487 -> 285569 (+0.03%); split: -0.06%, +0.08% Latency: 7774066 -> 7780877 (+0.09%); split: -0.10%, +0.19% InvThroughput: 1936341 -> 1935287 (-0.05%); split: -0.07%, +0.02% SClause: 11446 -> 11448 (+0.02%); split: -0.01%, +0.03% Copies: 17500 -> 17506 (+0.03%); split: -0.51%, +0.55% Branches: 8174 -> 8180 (+0.07%); split: -0.13%, +0.21% PreVGPRs: 12507 -> 12427 (-0.64%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12061>	2021-08-24 16:10:30 +00:00
Daniel Schürmann	89a842b2b6	nir/loop_analyze: consider instruction cost of nir_op_flrp Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12061>	2021-08-24 16:10:30 +00:00
Chia-I Wu	572ed22494	venus: use uint32_t in vn_ring_submit And in vn_ring_write_buffer as well, to fix the assert in vn_ring_write_buffer. The ring code uses 32-bit unsigned integers and relies on their overflow/underflow behavior being well-defined. When ring->shared.head is about to overflow and ring->cur has overflowed, the expression ring->cur + size - vn_ring_load_head(ring) gives an incorrect result when size is 64-bit. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Reviewed-by: Ryan Neph <ryanneph@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12494>	2021-08-24 08:56:16 -07:00
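
The wraparound problem described in the venus commit (572ed22494) can be illustrated with a small C sketch. This is not the Mesa code: the function names `used_u32` and `used_mixed` and the sample values are hypothetical, and only the shape of the expression `cur + size - head` is taken from the commit message. It assumes a platform where `int` is 32 bits or narrower, so that `uint32_t` arithmetic wraps modulo 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: every operand is 32-bit, so the whole expression
 * wraps modulo 2^32. Unsigned wraparound is well defined in C. */
static uint32_t used_u32(uint32_t cur, uint32_t size, uint32_t head)
{
    return cur + size - head;
}

/* Hypothetical helper: the same expression with a 64-bit size. The usual
 * arithmetic conversions widen cur and head to 64 bits, so the subtraction
 * wraps modulo 2^64 instead of 2^32 and the result is no longer the
 * distance between the two 32-bit counters. */
static uint64_t used_mixed(uint32_t cur, uint64_t size, uint32_t head)
{
    return cur + size - head;
}
```

With `head = UINT32_MAX - 15` (about to overflow) and `cur = 16` (already wrapped), 32 bytes are in flight. Adding a write of 8 bytes, `used_u32` yields 40, while `used_mixed` yields a huge 64-bit value whose low 32 bits are 40. A size check against the ring capacity would then fail spuriously, which is consistent with the assert mentioned in the commit message.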

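The packed dot-product opcodes listed in commit 6c18a3b497 treat each 32-bit source as four 8-bit lanes. As a rough reference model, assuming the semantics implied by the opcode names (multiply lane by lane, sum, add the accumulator, and optionally saturate), two of the six opcodes could be sketched in C as below. The helper names `lane_u8`, `lane_s8`, `ref_udot_4x8_uadd`, and `ref_sdot_4x8_iadd_sat` are hypothetical and do not come from Mesa.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned 8-bit lane i (0..3) of a packed 32-bit vector. */
static uint32_t lane_u8(uint32_t v, unsigned i)
{
    return (v >> (8 * i)) & 0xffu;
}

/* Signed 8-bit lane i (0..3), sign-extended without relying on
 * implementation-defined narrowing conversions. */
static int32_t lane_s8(uint32_t v, unsigned i)
{
    int32_t b = (int32_t)lane_u8(v, i);
    return b >= 128 ? b - 256 : b;
}

/* Assumed model of udot_4x8_uadd: unsigned dot product plus accumulator,
 * wrapping modulo 2^32. */
static uint32_t ref_udot_4x8_uadd(uint32_t a, uint32_t b, uint32_t acc)
{
    uint32_t sum = acc;
    for (unsigned i = 0; i < 4; i++)
        sum += lane_u8(a, i) * lane_u8(b, i);
    return sum;
}

/* Assumed model of sdot_4x8_iadd_sat: signed dot product plus accumulator,
 * computed in 64 bits and clamped to the int32_t range. */
static int32_t ref_sdot_4x8_iadd_sat(uint32_t a, uint32_t b, int32_t acc)
{
    int64_t sum = acc;
    for (unsigned i = 0; i < 4; i++)
        sum += (int64_t)lane_s8(a, i) * lane_s8(b, i);
    if (sum > INT32_MAX)
        return INT32_MAX;
    if (sum < INT32_MIN)
        return INT32_MIN;
    return (int32_t)sum;
}
```

For example, `ref_udot_4x8_uadd(0x01020304, 0x01010101, 5)` sums the lanes 4 + 3 + 2 + 1 and adds 5, giving 15. The saturating variant differs only when the 64-bit intermediate leaves the 32-bit range.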

144210 Commits

All Branches