diff --git a/docs/relnotes/21.2.0.rst b/docs/relnotes/21.2.0.rst new file mode 100644 index 00000000000..875deca1b55 --- /dev/null +++ b/docs/relnotes/21.2.0.rst @@ -0,0 +1,5272 @@ +Mesa 21.2.0 Release Notes / 2021-08-04 +====================================== + +Mesa 21.2.0 is a new development release. People who are concerned +with stability and reliability should stick with a previous release or +wait for Mesa 21.2.1. + +Mesa 21.2.0 implements the OpenGL 4.6 API, but the version reported by +glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / +glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. +Some drivers don't support all the features required in OpenGL 4.6. OpenGL +4.6 is **only** available if requested at context creation. +Compatibility contexts may report a lower version depending on each driver. + +Mesa 21.2.0 implements the Vulkan 1.2 API, but the version reported by +the apiVersion property of the VkPhysicalDeviceProperties struct +depends on the particular driver being used. + +SHA256 checksum +--------------- + +:: + + 0cb3c802f4b8e7699b1602c08c29d06a4d532ab5b8f7a64676c4ca6bb8f4d426 mesa-21.2.0.tar.xz + + +New features +------------ + +- zink supports GL_ARB_texture_filter_minmax, GL_ARB_shader_clock + +- VK_EXT_provoking_vertex on RADV. + +- VK_EXT_extended_dynamic_state2 on RADV. + +- VK_EXT_global_priority_query on RADV. + +- VK_EXT_physical_device_drm on RADV. + +- VK_KHR_shader_subgroup_uniform_control_flow on Intel and RADV. + +- VK_EXT_color_write_enable on RADV. + +- 32-bit x86 builds now disable x87 math by default and use SSE2. + +- GL ES 3.1 on GT21x hardware. + +- VK_EXT_acquire_drm_display on RADV and ANV. 
+ +- VK_EXT_vertex_input_dynamic_state on lavapipe + +- wideLines on lavapipe + +- VK_EXT_line_rasterization on lavapipe + +- VK_EXT_multi_draw on ANV, lavapipe, and RADV + +- VK_KHR_separate_depth_stencil_layouts on lavapipe + +- VK_EXT_separate_stencil_usage on lavapipe + +- VK_EXT_extended_dynamic_state2 on lavapipe + +- NGG shader based primitive culling is now supported by RADV. + +- Panfrost supports OpenGL ES 3.1 + +- New Asahi driver for the Apple M1 + +- GL_ARB_sample_locations on zink + +- GL_ARB_sparse_buffer on zink + +- GL_ARB_shader_group_vote on zink + +- DRM format modifiers on zink + +- freedreno+turnip: Initial support for a6xx gen4 (a660, a635) + +- None + + +Bug fixes +--------- + +- The image is distorted while use iGPU(Intel GPU) rendering and output via dGPU (AMD GPU) +- lima: regression in plbu scissors cmd +- freedreno: regression in org.skia.skqp.SkQPRunner#gles_multipicturedraw_*_tiled +- Incorrect rendering +- intel/isl: Wrong surface format name in batch +- [RADV] FSR in Resident Evil: Village looks very pixelated on Polaris +- 21.2.0rc1 Build Failure - GCC6.3 +- Crash in update_buffers after closing KDE "splash screen" downloader +- Firefox (wayland) crash in wayland_platform +- Crash in update_buffers after closing KDE "splash screen" downloader +- Firefox (wayland) crash in wayland_platform +- radeonsi: persistent, read-only buffer maps are slow to read +- substance painter flickering with jagged texture and masks shown black +- radv: FP16 mode in FidelityFX FSR doesn't look right +- Regression in Turnip with KGSL and Zink running opengl in proot +- Validation crash on wlroots after wl_shm appeared +- [RADV] Blocky corruption in Scarlet Nexus and vkd3d-proton 2.4 +- Use out encoding for float immediates +- Radeon RX580 and 5700 XT: Reloading ARB assembly shaders causes very glitchy rendering +- i915g: dEQP-GLES2.functional.fragment_ops.depth_stencil..* failures +- i915g: dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.* 
and fragment_ops.random.* failures +- KHR-GL46.shader_ballot_tests.ShaderBallot* tests fails on the main +- i965 nir compiler is lowering fdiv twice or initializing struct twice +- [radv] Textures missing in Doom (2016) w/ any temporal AA setting enabled +- Drop Android.mk +- [build error] macros.h:88:26: error: size of unnamed array is negative +- Game Issue: Nuclear Throne crashes in RadeonSI +- iris: reduce shader storage buffer object alignment +- agx_compile.h:29:10: fatal error: asahi/lib/agx_pack.h: No such file or directory +- radv: VBO range check issues with odd strides and sizes +- Crash in glLinkProgram while trying to craft the link error +- i915g: wide point failures +- Wolfenstein II: The New Colossus - Screen goes black in some cases +- [radv] [regression] Textures missing in Doom (2016) +- Shader compilation memory leaks +- radv: fd leak in Android WSI radv_AcquireImageANDROID +- SpaceEngine in Steam Proton cannot start with Mesa >=20.3 +- [bisected] KDE plasma menu text renders like stretched strangely [amdgpu] +- radeonsi: glitches in Euro Truck Simulator 2 +- White box for Webrender Firefox with R600_DEBUG=nir on Evergreen GPU +- radv_AllocateDescriptorSets: validation on variable description count is too strict +- Luna Sky Crashes on Launch +- Mesa crashes on undefined texture behaviour +- Mesa crashes on undefined texture behaviour +- cache_test uses uninitialized stack memory +- nir/opt_load_store_vectorize: check_for_robustness() crashes on derefs +- [anv] GravityMark (benchmark) crashes on ANV +- turnip: corrupted geometry after tesselation shaders in GTAV +- [opengl] We happy few not being rendered correctly +- anv: dynamic state prim type is hard +- [iris][bisected] piglit test ...ext_external_objects.vk-image-display-muliple-textures failing after enabled +- Factorio: GPU hang when opening machine inventory +- RuneScape on Mesa 21.1.1 (VEGA10) has bad performance and leaks memory +- LLVM12 breaks atomicCompSwap tests with radeonsi +- 
freedreno: tex-miplevel-selection causes a creation of too many BOs without flushes, causing a crash +- RADV: Resident Evil Village Freezes during a specific cutscene +- Supraland: flickering black bars on ground +- u_queue.c:173:7: error: implicit declaration of function 'timespec_get' is invalid in C99 +- radv: front face and cull mode dynamic state can desync +- radv: GPU hang in Cyberpunk 2077 on Big Navi +- Cyberpunk 1.22 crashes with amdgpu ring gfx_0.0.0 timeout +- [amdgpu][renoir][rx5500m]: [drm:0xffffffff8198ad5e] \*ERROR* ring gfx timeout, signaled seq=10952, emitted seq=10954 +- [spirv-fuzz] SPIR-V parsing FAILED: Invalid back or cross-edge in the CFG +- anv: dEQP-VK.graphicsfuzz.spv-stable-bubblesort-flag-complex-conditionals fails +- panfrost Mount and Blade: Warband (glitches) +- anv: dEQP-VK.robustness.robustness2*no_fmt_qual.null_descriptor.samples* fail +- KHR-GLES31.core.shader_image_load_store.basic-glsl-earlyFragTests may be flakey on RPI4 +- SPIRV AMD Driver compiler memory leak/usage - 8Gb+ to compile single 41Kb SPIRV file, Ubuntu 21.04 +- Regression of !10941: mutter Wayland on bare metal crashes when closing HW accelerated windows +- radv: ACO miscompiles a specific DIRT 5 shader +- Copy paste bug in v3dv_cmd_buffer.c +- Segfault in mtx_unlock/amdgpu_bo_slab_destroy +- [radeonsi] Incorrect rendering when using glDisable(GL_MULTISAMPLE) with multisample backbuffer +- virtio(vulkan): Error building on Android/arm32 +- [i965] regression: piglit.spec.arb_depth_buffer_float.fbo-depthstencil-gl_depth32f_stencil8-drawpixels +- iris: Select memory map cacheability settings at BO allocation time +- zink: regression for primitive-restart on ANV +- zink: Unimplemented ALU {un,}pack_half_2x16 in nir_to_spirv +- venus: dEQP-VK.api.command_buffers.bad_inheritance_info_random test failure +- [radeonsi] glClearTexImage on 1D array only clears first slice +- zink: Expected Image Operand ConstOffset to be a const object +- docs: bullet-lists no longer show 
any bullets +- [RADV] - Path of Exile (238960) - Ground decals are missing or corrupted using the Vulkan renderer. +- [ADL-S / TGL-U / TGL-H] Pixels missing / flickering when render some app on weston +- [iris][biected] piglit.spec.glsl-1_50.execution.geometry.clip-distance-vs-gs-out +- panfrost ff9a0000.gpu: Unhandled Page fault in AS1 at VA 0x0000000009801200 +- black stripes in X/Xwayland applications under panfrost/midgard +- radv: significant overhead from radv_pipeline_has_ngg() +- mesa-21.1.1/src/gallium/drivers/vc4/vc4_resource.c:790: reading from uninitialised array +- [iris,i965][hsw,ivb,snb,bdw][bisected][regression] wflinfo crashes +- [iris,i965][hsw,ivb,snb,bdw][bisected][regression] wflinfo crashes +- The image is distorted while use iGPU(Intel GPU) rendering and output via dGPU (AMD GPU) +- [radv] Revert !7207 once BG3 is fixed. +- piglit test 'spec.glsl-1_30.execution.range_analysis_fsat_of_nan' failure +- iris: Rework iris_transfer_mapto better use iris_has_color_unresolved +- ir3_cf breaks nir_op_fquantize2f16 +- [i915g] implicit fallthrough +- Add asahi to CI's build +- RADV - Regression - CPU side 'hang' in metro exodus +- Assassin's Creed Odyssey crash on launch +- Metro Exodus not starting under xwayland +- [i915g] PIPE_SHADER_CAP_MAX_HW_ATOMIC_COUNTERS is Unknown cap 38 +- Elite Dangerous: Odyssey alpha crashes GPU on launch +- glmark2-es2 -b terrain crashes since Bifrost FP16 +- [iris][bisected][regression] dEQP-GLES31.functional.texture.multisample.* tests crash on multiple platforms +- gallium: tc regression +- panfrost: Look into invalidate_resource() +- [ivb,hsw][i965][bisected] dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darrayshadow_vertex failing +- ci: Explicitly test shader caching +- radv: RoTR fails on Raven APU +- Tungsten Graphics links in Gallium docs +- Obs Studio Broken on Latest Mesa Git(Regression)(Bisected) +- Graphics corruption and GPU hang with RADV/LLVM +- old kernels (4.19) support in radv +- Elite 
Dangerous: Odyssey alpha crashes GPU on launch +- CSGO: Some default variables can cause problems with trust mode +- freedreno: dEQP-GLES3.functional.fence_sync.client_wait_sync_finish flakes +- glxclient.h:56:10: fatal error: 'loader.h' file not found +- mesa git started to break wine + UnrealTournament.exe (old dx6 game) +- SuperTuxKart artifacting on RK3399 +- [amdgpu]: Golf With Your Friends (431240): ERROR Waiting for fences timed out +- don't flush for each blit/grid +- No sRGB capable visuals/fbconfigs reported in glx +- Documentation request: AMD debug variables +- docs: versions is out-of-date +- Strange results when trying to read from VK_FORMAT_R64_SFLOAT in compute shader +- anv: dEQP-VK.binding_model.buffer_device_address.set3.depth3.basessbo.convertcheck* slow +- Iris doesn't support INTEL_performance_query anymore +- [iris][bisected][regression] dEQP-GLES3.functional.texture.specification.teximage2d* failures +- RADV: TRUNC_COORD breaks gather operations +- [RADV] corruption in avatar after dying in Heroes of the Storm +- drm-shim build error with glibc 2.33 +- Metro Exodus crashing due to memory overflow +- Sauerbraten shader rendering broken on RV530 (r300g) +- texture glitches on CS:GO on Tiger Lake +- Incorrect texture blitting/mapping when running Unigine-Heaven 4.0 on ADL-S / TGL-H, TGL-U +- Build fail due to "parameter name omitted" on Gallium Nine +- v3dv: DiligentEngine fail to run with SRGB swapchain +- Non-DRI builds broken by recent cleanups in Mesa core +- Cinnamon core dump after installing latest oibaf mesa build (165a69d2) +- yuv sampler lowering regression +- clover llvm build failure ‘class llvm::VectorType’ has no member named ‘getNumElements’; +- Possible corruption for apps using multiple Z-buffers on TGL + + +Changes +------- + +Aaron Liu (1): + +- amd: add Yellow Carp support + +Abel García Dorta (2): + +- i915g: add HW atomic counters as unsupported +- i915g: fix implicit fallthrough + +Adam Jackson (53): + +- Revert "glx: 
Lift sending the MakeCurrent request to top-level code" +- gallium/xlib: Fix for recent gl_config changes +- glx/drisw: Enable GLX_ARB_create_context_no_error +- glx: Remove silly __glXGetGLVersion() indirection +- glx: Remove some truly ancient unused code +- glx: Remove major/minor version tracking from extension table +- glx: Mark GLX_{ATI_pixel_format_float,NV_float_buffer} as supported +- glx: Remove some non-functional GL extension from the table +- glx: Generalize __glXGetStringFromTable a little +- glx: Remove redundant client_support field from extension table +- glx: Enable pure-client-library extensions explicitly +- glx: Fold client_gl_only array into its one real user +- glx: Remove some ancient backwards-compatibility typedefs +- zink: Learn about VK_KHR_swapchain +- zink: Fix format query for minmax reduction support +- gallium: Fix PIPE_BIND_SAMPLER_REDUCTION_MINMAX definition to be unique +- dri: Try harder to infer the drawable fbconfig if needed +- glx: Return the right GLX opcode in synthetic MakeCurrent errors +- glx: s/dri_message/glx_message/ +- glx: Add and use DebugMessageF convenience macro +- glx: Convert undocumented LIBGL_DIAGNOSTIC to LIBGL_DEBUG=verbose +- glx: Remove unused debugging printfs +- glx: Implement GLX_EXT_no_config_context +- glx: Stop pretending the GLX major number isn't 1 +- glx: Stop force-enabling extensions "implied" by GLX 1.3 +- glx: Simplify glXIsDirect +- glx: Remove warn-once-ery around GLX 1.3 functions +- glx: Remove unused opcode argument to __glX{Get,QueryServer}String +- glx: Stash a copy of the XExtCodes in the glx_display +- glx: Simplify some overuse of GetGLXScreenConfigs +- glx: Move server GLX vendor and version strings to glx_screen +- glx: s/Display \*/struct glx_display \*/ over internal API +- glx: Remove some dead declarations from glxclient.h +- Revert "glx: s/Display \*/struct glx_display \*/ over internal API" +- include: Remove unused i810_pci_ids.h +- i915c: Add a symlink for i830_dri.so +- 
mesa: Remove unused _mesa_{create,destroy}_visual +- mesa: Ignore the depth buffer when computing framebuffer floatness +- gallium: Reset attachments to ST_ATTACHMENT_INVALID when revalidating +- format/fxt1: Clean up fxt1_variance's argument list +- mesa: s/malloc/calloc/ to silence a warning +- gallium: Remove unused st_visual::render_buffer +- gallivm: Fix a signature mismatch warning +- zink/ntv: Don't call free() on ralloc'd memory +- gallium/dri: Remove unused dri_drawable::drisw_surface +- drisw: Don't bzero displaytarget pixels +- intel: properly constify isl_format_layouts +- classic/xlib: Fix the build after !9817 +- swrast: Fix a warning from gcc 11 +- loader/dri3: Properly initialize the XFIXES extension +- loader/dri3: Don't churn through xfixes regions in SwapBuffers +- vl/dri3: Don't leak regions on the X server +- meson: Make prefer-{crocus,iris} always take effect + +Alejandro Piñeiro (30): + +- v3dv/debug: print correct stage name +- v3dv/debug: use gl stage when checking debug flag +- v3dv/pipeline: track descriptor maps per stage, not per pipeline +- v3dv: remove custom icd json generation +- v3dv: move extensions table to v3dv_device +- v3dv: don't use typedef enum with broadcom stages +- v3dv: remove unused v3dv_zs_buffer_from_vk_format +- broadcom/compiler: use proper type field for atomic operations +- v3d/simulator: capture hub interrupts +- v3d/simulator: add a cache flush mode enum +- v3d/simulator: wait for cache flushes +- v3d/simulator: use the proper register when waiting on a CSD submit +- v3d/simulator: use BFC/RFC registers to wait for bin/render to complete +- broadcom/common: move v3d_tiling to common +- v3d/simulator: hw mem is now an v3d_size_t, typedef to uint32_t +- v3d/simulator: get rid of has_gca wrapper +- v3dv: rename v3dv_pack for v3dvx_pack +- v3dv/cmd_buffer: add helper job_emit_binning_prolog +- v3dv/cmd_buffer: move cl_emit calls for Draw methods to helpers +- v3dv: start to move and wrap hw-version code with 
v3dv_queue +- v3dv: split v3dv_pipeline hw version dependant code to a new source file +- v3dv: split v3dv_image hw version dependant code to a new source file +- v3dv: split v3dv_format hw version dependant code to a new source file +- v3dv: split v3dv_device hw version dependant code to a new source file +- v3dv: move several hw version dependant code to their own (v3dvx) source file +- v3dv: split v3dv_descriptor hw version dependant to a new source file. +- v3dv: don't use cl_packet_length for prepacked data +- v3dv: remove gen-dependant includes from v3dv_private +- v3dv/build: meson infrastructure for multi-hw-version support +- v3dv/format: expose properly that some formats are not filterable + +Alexander Monakov (2): + +- freedreno/drm-shim: pretend to offer DRM 1.6.0 +- freedreno/drm-shim: keep GEM buffers page-aligned + +Alexander Shi (1): + +- mesa: texparam: Add a clamping macro to handle out-of-range floats returned as integers. + +Alexey Nurmukhametov (1): + +- tu/kgsl: Fix file descriptor double close + +Alyssa Rosenzweig (668): + +- nir: Update some comments referring to imov +- panfrost: Don't allow_forward_pixel_to_kill for Z/S blit +- panfrost: Set allow_forward_pixel_to_be_killed for blit +- panfrost: Set clean_fragment_write for blits +- panfrost: Invert blend_reads_dest logic +- panfrost: Don't allow FPK if a RT is missing +- panfrost: Allow FPK when there are no side effects +- panfrost: Keep Bifrost blendable -> pixel in table +- panfrost: Specialize blendable formats for sRGB +- panfrost: Simplify format_to_bifrost_blend prototype +- panfrost: Drop blendable format accessor +- panfrost: Always pick dithered tb formats +- panfrost: Remove padded unorm blendable formats +- docs/macos: Explain Apple GLX versus OSMesa on macOS +- nir/lower_fragcolor: Use shader_instructions_pass +- nir/lower_fragcolor: Handle fp16 outputs +- panfrost: Fix formats converting uninit from AFBC +- nir/lower_fragcolor: Fix driver_location assignment +- 
nir/lower_fragcolor: Take max cbufs as argument +- d3d12: Switch to nir_lower_fragcolor +- util/bitset: Add BITSET_COUNT helper +- nir: Add fsin_agx opcode +- asahi: Stub command-line compiler for AGX G13B +- agx: Add opcode descriptions as Python +- agx: Generate opcode list +- agx: Generate runtime-accessible opcode table +- agx: Generate builder routines +- agx: Stub NIR backend compiler +- agx: Remap varyings to match AGX ABI +- agx: Stub control flow walking +- agx: Stub NIR instruction iteration +- agx: Stub emit_intrinsic +- agx: Implement load_const as mov +- agx: Implement direct st_vary +- agx: Add agx_alu_src_index helper for emit_alu +- agx: Implement vec2/vec3/vec4 ops +- agx: Implement fragment_out +- agx: Add instruction printing +- agx: Add a trivial register allocator +- agx: Add instruction packing +- agx: Add packing for memory loads/stores +- agx: Add st_vary(_final) instruction packing +- agx: Terminate programs with stop and traps +- agx: Implement ld_vary +- agx: Implement simple floating point ops +- agx: Implement fsin/fcos +- agx: Add 8-bit AGX minifloat routines +- agx: Add minifloat tests +- agx: Implement native float->int conversions +- agx: Implement native int->float conversions +- agx: Add bitwise operations +- agx: Add iadd/imad integer arithmetic +- agx: Add saturated integer add/subtract support +- agx: Add 32-bit bitwise shifts +- agx: Add forward optimizing pass for fmov +- agx: Add dead code eliminator +- agx: Propagate fmov backwards as well +- agx: Propagate immediates +- agx: Implement limited case of i2i16/i2i32 as iadd +- agx: Add sysval management helper +- agx: Implement load_ubo/kernel_input +- agx: Set flag on last st_vary instruction +- agx: Lower load_attr to device memory accesses +- agx: Implement vertex_id +- agx: Add agx_tex_dim helper +- agx: Emit texture ops +- agx: Pack texture ops +- agx: Add min/max support +- agx: Support 1-bit booleans +- agx: Implement b2f +- agx: Add b2i implementation +- agx: Pack 
cmpsel +- agx: Support bcsel +- asahi: Add hexdump utility +- asahi: Add command buffer XML definitions +- asahi: Add allocation data structure +- asahi: Add a GenXML fork +- asahi: Add (clean room) IOKit uABI header +- asahi: Add command buffer decode helpers +- asahi: Add tiling routines +- asahi: Add device abstraction +- asahi: Add pool data structure +- asahi: Add uniform upload routines +- asahi: Add some magic IOGPU routines +- asahi: Add vertex formats table +- asahi: Add Gallium driver +- nir/opcodes: Reword confusing comment +- pan/bi: Add missing sr_count to pseudo-atomics +- pan/bi: Don't reference uninit source in ATOM_C1 +- pan/bi: Add simple constant folding pass +- pan/bi: Don't reference nir_lower_mediump_outputs +- pan/bi: Simplify Python expression +- pan/bi: Union modifiers from across variants +- pan/bi: Support 16-bit load_interpolated_input +- pan/bi: Emit int CSEL instead of float by default +- pan/bi: Implement vectorized f32_to_f16 +- pan/bi: Fix 16-bit fsat +- pan/bi: Improve assert for vector size errors +- pan/bi: Implement vectorized int downcasts +- pan/bi: Fix loads and stores smaller than 32 bits +- pan/bi: Lower swizzles on CLPER +- pan/bi: Add and use bi_negzero helper +- pan/bi: Don't schedule clamps to +FADD.v2f16 +- pan/bi: Workaround \*V2F32_TO_V2F16 erratum +- panfrost: Don't unroll loops in GLSL +- panfrost: Remove old dEQP workaround +- pan/bi: Track dual-src blend type +- pan/bi: Handle different sizes of LD_TILE +- pan/bi: Add single-component 8-bit mkvec lowering +- pan/bi: Handle swizzles in i2i8 +- pan/bi: Lower 8-bit fragment input +- panfrost: Make comment less confusing +- panfrost: Support alpha_to_one +- panfrost: Minor cleanup of blend CSO +- panfrost: Don't clobber RT0 if RTn is disabled +- pan/lower_blend: Clean up type size handling +- pan/lower_blend: Use NIR helpers +- pan/lower_blend: Rename is_bifrost->scalar +- panfrost/blend: Fix outdated comments +- panfrost/blend: Workaround a v7 implementation-detail 
+- panfrost/blend: Distribute to_c_factor +- panfrost/blend: Prepare for lower_fragcolor +- panfrost: Call nir_lower_fragcolor based on key +- panfrost: Assume lower_fragcolor has been called +- panfrost/lower_framebufffer: Don't use i2imp +- pan/blend: Emit explicit conversions for all types +- panfrost: Key blend shaders to the input types +- pan/mdg: Hide units behind MIDGARD_MESA_DEBUG=verbose +- pan/mdg: More concise RMU name +- pan/mdg: Don't print zero shifts +- pan/mdg: Suppress most attribute tables +- pan/mdg: Don't print explicit .rte +- pan/mdg: Don't print mem addr brackets +- pan/mdg: Reduced printed parens +- pan/mdg: Don't print zero +- pan/bi: Add imm_uintN helper +- pan/bi: Handle integer min/max ourselves +- pan/bi: Handle ineg +- pan/bi: Handle b2f ourselves +- pan/bi: Handle b2i8/16 +- pan/bi: Track scalarness of 16-bit ALU +- pan/bi: Don't swizzle scalars +- pan/bi: Switch to 1-bit bools +- pan/bi: Use nir_lower_to_bit_size +- pan/mdg: Use _output_ type for outmod printing +- pan/mdg: Remove midgard_opt_copy_prop_reg +- pan/mdg: Enable nir_opt_{move, sink} +- panfrost/blend: Inline blend constants +- pan/mdg: Model blend shader interference +- panfrost: Fix typo handling blend types +- pan/bi: Change swizzled scalars to identity +- pan/bi: Adapt branching for 1-bit bools +- pan/bi: Handle make_vec with 1-bit bools +- pan/bi: Temporarily switch back to 0/~0 bools +- pan/bi: Enable NIR vectorization +- pan/bi: Fix int<-->float size converts +- pan/bi: Copyprop constants +- pan/bi: Garbage collect bifrost_nir.h +- pan/bi: Enable mediump BLEND lowering +- panfrost: Enable 16-bit support on Bifrost +- pan/lower_framebuffer: Fix bitsize mismatch +- nir: Add blend lowering pass +- panfrost: Use common blend lowering +- nir/divergence_anlysis: Add intrinsics for Bifrost +- pan/bi: Drop load_sampler_lod_parameters_pan +- pan/bi: Map load_subgroup_invocation to FAU +- pan/bi: Add "lanes per warp" accessor +- pan/bi: Add divergent intrinsic lowering pass 
+- asahi: Translate blend CSO to lower_blend options +- asahi: Augment Gallium key with blend state +- asahi: Call nir_lower_blend with selected key +- asahi: Garbage collect bind_state +- asahi: Implement set_blend_color +- asahi: Add blend constant system value +- asahi: Call nir_lower_fragcolor +- asahi: Fix shader key hash function +- asahi: Pass through "reads tilebuffer?" bit +- agx: Return agx_instr* from emit_intrinsic +- agx: Implement blend constant color sysvals +- agx: Rename blend -> st_tile +- agx: Add ld_tile opcode +- agx: Assume lower_fragcolor has been called +- agx: Condition writeout ops on already being emitted +- agx: Implement load_output +- agx: Set reads_tib appropriately +- panfrost: Drop panfrost_fence in favour of pipe_fence_handle +- docs: Simplify now that kmsro is autoenabled +- pan/bi: Add first_vertex to vertex ID +- panfrost: Track buffers needing resolve +- panfrost: Set discard based on the resolve set +- panfrost: Implement framebuffer invalidation +- panfrost: Hide CAP_INT16 behind is_deqp +- panfrost: Don't translate compare funcs +- panfrost: Remove spurious assignment +- panfrost: Clean up cases for emit_fbd +- panfrost: Don't upload empty push uniform table +- pan/mdg: Use smaller LD_UNIFORM instructions +- ci: Build asahi in meson-gallium job +- panfrost: Fix major flaw in BO cache +- panfrost: Drop random #define +- panfrost: Use natural shader limits +- panfrost: Make clear which limits are arbitrary +- panfrost: Garbage collect comment +- panfrost: Shorten iffy comment +- pan/mdg: Remove unused midgard_int_alu_op_prefix +- pan/mdg: Fix output types for scalar fields +- pan/mdg: Fix spills to TLS +- pan/mdg: Set lower_uniforms_to_ubo +- panfrost: Add unowned mode to pan_pool +- panfrost: Label all BOs in userspace +- panfrost: Label pools +- panfrost: Make pool slab size configurable +- panfrost: Add reference type for unowned pool +- panfrost: Pool shaders +- panfrost: Pool texture views +- panfrost: Reduce blitter pool 
size +- panfrost: Fix blending for unbacked MRT +- panfrost: Fix the reads_dest prototype +- panfrost: Fix is_opaque prototype +- panfrost: Fix blend constant fetch prototype +- panfrost: Fix blend fixed-function prototype +- panfrost: Fix pan_blend_to_fixed_function_equation prototype +- panfrost: Move blend properties to CSO create +- panfrost: Translate fixed-function blend at CSO create +- panfrost: Garbage collect Gallium blend includes +- panfrost: Pack blend equations at CSO create time +- panfrost: Distribute out constant colour code +- panfrost: Simplify blend_final +- panfrost: Pass batch to panfrost_get_blend +- panfrost: Streamline fixed-function get_blend path +- panfrost: Remove unused dither flag +- panfrost: Split Bifrost BLEND emit by word +- panfrost: Precompute bifrost_blend_type_from_nir +- panfrost: Add draw-time merge helper +- panfrost: Prepack partial RSD at compile time +- panfrost: Move depth/stencil/alpha to CSO create +- panfrost: Preset evaluate_per_sample +- panfrost: Correct the type of sample_mask +- panfrost: Fill out the rasterizer CSO +- panfrost: Move early-z decision earlier +- panfrost: Streamline the !fs_required case +- panfrost: Hoist allow_forward_pixel_to_be_killed +- panfrost: Partially determine FPK state +- panfrost: Distribute masks for FPK selection +- panfrost: Pull erratum workaround into own function +- panfrost: Hoist part of shader_reads_tilebuffer +- panfrost: Pack draw-time RSD all-at-once +- panfrost: Move batch_set_requirements to the CSO +- panfrost: Deduplicate some code from indirect/direct draws +- panfrost: Pass batch to panfrost_get_index_buffer_bounded +- panfrost: Remove silly assertion +- panfrost: Mark job_index > 10000 as unlikely +- panfrost: Simplify panfrost_bind_sampler_states +- panfrost: Express viewport in terms of the batch +- asahi: Set PACKED_STREAM_OUTPUT +- glsl: Fix subscripted arrays with no XFB packing +- glsl: Fix packing of matrices for XFB +- panfrost: Streamline varying linking 
code +- panfrost: Define dirty tracking flags +- panfrost: Add the usual clean/dirty helpers +- panfrost: Dirty all state when batch is set +- panfrost: Dirty track RSDs +- panfrost: Dirty track textures/samplers +- panfrost: Dirty track viewport descriptor +- panfrost: Dirty track fragment images +- panfrost: Add PAN_MESA_DEBUG=dirty option +- panfrost/ci: Disable GLES2 jobs when we run GLES3 +- panfrost/ci: Disable G72 jobs for now +- panfrost/ci: Split rules by ISA +- ci: Condition ppc64-el on specific drivers +- ci: Condition s390x on specific drivers +- panfrost: Only link varyings once in good conditions +- panfrost: Lower max inputs again +- panfrost: Abort on faults in SYNC mode +- panfrost: Remove minimal mode +- panfrost: Increase tiler_heap max allocation to 64MB +- panfrost/ci: Disable terrain trace +- panfrost/ci: Remove reference to dated flag +- panfrost/ci: Run jobs with PAN_MESA_DEBUG=sync +- panfrost: Add Message Preload descriptor XML +- panfrost: Add message preload to pan_shader_info +- panfrost: Inline pan_prepare_shader_descriptor +- panfrost: Don't take ctx in panfrost_shader_compile +- panfrost: Expose PIPE_CAP_SHAREABLE_SHADERS +- asahi: Fix meson.build definition to depend on agx_pack.h +- agx: Drop cmdline version back to ES3.0 +- agx: Pack ld_var Dx +- agx: Enable 1-bit load_const +- agx: Implement boolean mov +- agx: Track current_block +- agx: Track block offsets +- agx: Add nest field to IR +- agx: Add invert_cond (ccn) to IR +- agx: Add branch target to IR +- agx: Add inner loop nesting count field +- agx: Model control flow instructions +- agx: Model pop_exec +- agx: Add push_exec alias +- agx: Pack control flow instructions +- agx: Model jump instructions +- agx: Fix up branch offsets at pack time +- agx: Implement emit_if the simplest way +- agx: Optimize out empty else blocks +- agx: Implement loops in the simplest way +- agx: Add break/continue support +- agx: Zero r0l before first use of control flow +- asahi: Fix scissor 
descriptor definition +- asahi: Add "set scissor" command +- asahi: Add scissor enable bit +- asahi: Defer viewport pack +- asahi: Dirty track viewport descriptor +- asahi: Track scissor states +- asahi: Mark scissor dirty if rast->scissor changes +- asahi: Skip draws if the scissor culls everything +- agx: Add scissor upload BO +- asahi: Expose PIPE_CAP_CLIP_HALFZ +- asahi: Add unknown bits seen with the GL driver +- asahi: Enable depth culling +- asahi: Update viewport descriptor depth fields +- asahi: Implement scissors and scissor to viewport +- asahi: Fix off-by-one in viewport scissoring +- asahi: Implement wide lines +- asahi: Determine tiling vs linear for internal textures +- asahi: Use dt_stride for line_stride where needed +- asahi: Add layout enum to XML +- asahi: Translate layouts for texture and RTs +- asahi: Identify line stride in texture/RT XML +- asahi: Respect linear strides +- asahi: Handle linear display targets as well as tiled +- asahi: Note that "render target" lacks an sRGB bit +- asahi: Align strides to 16 bytes +- asahi: Print unknown enum values +- asahi: Add format enums +- asahi: Hide pixel formats behind an opaque type +- asahi: Scaffold format table +- asahi: Use pixel table in is_format_supported +- asahi: Respect render target format swizzle +- asahi: Add ETC2 formats to table +- asahi: Add "hacks for dEQP" flag +- asahi: Lift streamout scaffolding from Panfrost +- asahi: Fake CAPs for ES3 with AGX_MESA_DEBUG=deqp +- asahi: Flesh out the formats table +- asahi: Allow half-float vertex buffers +- asahi: Make data_valid a bitset to save memory +- asahi: Abort on blit() +- asahi: Add mipmapping state to the XML +- asahi: Set levels in texture descriptor +- asahi: Allocate slices for mipmapping +- panfrost: Update comment +- panfrost: Shrink pan_draw_mode return type +- panfrost: Add draw parameters dirty flags +- panfrost: Analyze sysval dirty flags +- panfrost: Dirty track constant buffers +- panfrost: Don't allocate empty varying 
buffer +- panfrost: Dirty track stack sizes +- panfrost: Write translate_index_size better +- panfrost: Minor changes to draw_vbo +- panfrost: Bubble up errors +- panfrost: Elucidate thread group split field +- panfrost: Eliminate reserve_* functions +- panfrost/ci: Report flakes on IRC +- vc4: Use Rn_UINT instead of In_UINT for index buffers +- v3d: Use Rn_UINT instead of In_UINT for index buffers +- etnaviv: Use Rn_UINT instead of In_UINT for index buffers +- freedreno: Use Rn_UINT instead of In_UINT for index buffers +- lima: Use Rn_UINT instead of In_UINT for index buffers +- si: Use Rn_UINT instead of In_UINT for index buffers +- docs/gallium: Document the index buffer format convention +- nir: Add nir_intrinsic_load_back_face_agx +- asahi: Mark special fragment inputs as sysvals +- agx: Model get_sr +- agx: Generate enums from Python +- agx: List sr enum in Python +- agx: Pack SR immediate +- agx: Lower front face to back face +- agx: Handle load_back_face_agx +- ci: Disable the iris APL jobs +- nir/lower_fragcolor: Avoid redundant load_output +- pan/bi: Pull out bi_count_write_registers +- pan/bi: Use TEXS_2D for rect textures +- pan/bi: Simplify TEXC codegen for sr_count=0 +- pan/bi: Fix bi_rewrite_passthrough ordering +- pan/bi: Bundle after RA +- pan/bi: Add post-RA optimizer +- pan/bi: Track liveness while scheduling +- pan/bi: Allow IADD.u32 on FMA as \*IADDC +- pan/bi: Use explicit affinities in RA +- pan/bi: Inline spilling in RA +- pan/bi: Explicit zero reg_live_{in, out} when needed +- pan/bi: Model interference with preloaded regs +- pan/bi: Allow move/sink in blend shaders +- pan/bi: Don't restrict the register file in non-blend shaders +- pan/bi: Model +BLEND clobbering of r48 +- pan/bi: Handle images in vertex shaders +- pan/bi: Lower loads with component > 0 +- pan/bi: Lower stores with component != 0 +- pan/bi: Lower 64-bit ints again +- pan/bi: Emit a dummy ATEST if needed +- pan/bi: Simplify spill code +- pan/bi: Track words instead of bytes 
in RA +- pan/bi: Don't allocate past the end of the reg file +- panfrost: Remove AFBC format fixups +- panfrost: Add missing 'Reverse issue order flag' +- panfrost: Disable AFBC on v7 +- panfrost: Don't duplicate attribute buffers +- panfrost: Separate image attribute and buffer emit +- panfrost: Be explicit in image modifier handling +- panfrost: Use util_last_bit for images +- panfrost: Default indirect attributes to 1D type +- pan/indirect: Factor out is_power_of_two_or_zero +- pan/indirect_draw: Use unsigned comparisons +- pan/indirect_draw: Fix 1 instance, nonzero divisor +- panfrost: Correctly size varyings +- panfrost: Use varying format from frag shader +- pan/bi: Force u32 for flat varyings +- panfrost: Fix vertex image attribute overrun +- panfrost: Simplify compute_checksum_size formula +- panfrost: Fix crc_valid condition +- panfrost: Zero r_dimension for buffer textures +- panfrost: Add util_draw_indirect() debug path +- panfrost: Align NPOT divisor records +- panfrost: Fix src_offset data type +- panfrost: Make instancing code more obvious +- panfrost: Assert alignment of indirect records +- pan/mdg: Use consistent casing in midgard_print +- pan/mdg: Make -Wswitch happy +- pan/mdg: Stub memory_barrier{_image} +- panfrost: Clarify how fs_sidefx works with oq +- panfrost: Simplify Midgard blend disable +- panfrost: Don't force early-z with occlusion query +- panfrost: Respect early-Z force on Midgard +- pan/mdg: Fix units for SUBSAT +- pan/mdg: Handle {i,u}{add,sub}_sat +- pan/mdg: Update r1.w comment +- pan/mdg: Fix incorrect rewrite in Midgard scheduler +- panfrost: Mark 16/32_UNORM as non-renderable (v5) +- panfrost: Don't allocate WLS when not needed +- pan/mdg: Wire in PAN_SYSVAL_VERTEX_INSTANCE_OFFSETS +- pan/mdg: Lower away gl_VertexID offset +- pan/mdg: Use more accurate ld/st reg estimates +- pan/mdg: Don't skip unit-based checks in choose_instruction +- pan/mdg: Assert scheduled instructions are reasonable +- pan/mdg: Insert moves to 
load/store registers +- panfrost: Fix dirty state emission +- panfrost: Emulate indirect draws on Midgard +- panfrost: Add some missing BGRA formats +- panfrost: Remove scissor_culls_everything +- panfrost: Don't set a blend shader for no_colour +- panfrost: Allocate XFB buffers per-instance +- panfrost: Fix BUFFER image handling +- panfrost: Make image buffers robust +- panfrost: Lower max compute size +- panfrost: Set PIPE_COMPUTE_CAP_SUBGROUP_SIZE +- panfrost: Set PIPE_COMPUTE_CAP_MAX_THREADS_PER_BLOCK +- panfrost: Drop todo on PIPE_COMPUTE_CAP_IMAGES_SUPPORTED +- panfrost: Don't CRC mipmapped textures +- panfrost: Reduce pan_image_state indirection +- pan/indirect_dispatch: Indent NIR blocks +- pan/indirect_dispatch: Simplify empty command case +- pan/indirect_dispatch: Distinguish minus-1 defs +- pan/indirect_dispatch: Expand split expressions +- pan/indirect_dispatch: Use extracted values +- panfrost: Use direct dispatch with shared memory +- panfrost: Don't clobber indirect dispatch fields +- panfrost: Make data_valid a bitset +- panfrost: Remove pan_image_state +- panfrost: Set valid_buffer_range for GPU writes +- panfrost: Add XML for vertex/instance ID records +- panfrost: Clean up vertex/instance ID on Midgard +- panfrost: Flush everything for glMemoryBarrier +- panfrost: Flush before compute jobs +- panfrost: Set vertex job_barrier +- panfrost: Add "Cache Flush" job XML +- panfrost: Advertise GLES3.1 +- pan/decode: Fix image attribute counting +- pan/decode: Handle cache flush jobs +- panfrost/ci: Blank G52 flakes file +- panfrost/ci: Don't skip SSBO tests on G52 +- panfrost/ci: Do fractional dEQP-GLES31 run on Midgard +- docs/features: Mark GLES3.1 as done on Panfrost +- docs/panfrost: Update API versions +- pan/bi: Include modifier info in opcode table +- pan/bi: Move bi_word_node to common code +- pan/bi: Move typesize to common code +- pan/bi: Track instruction size in opcode table +- pan/bi: Handle fsat_signed and fclamp_pos +- pan/bi: Report 
tuples, not nops, in shader-db +- pan/bi: Propagate fabs/neg/sat +- pan/bi: Add back custom algebraic opts +- pan/bi: Fuse fclamp_pos and fsat_signed +- pan/bi: Schedule FCMP.v2f16 with abs modifier +- pan/bi: Fuse abs into FCMP/FMIN/FMAX.v2f16 +- nir: Fix constant folding for irhadd/urhadd +- agx: Mark components as ASSERTED +- agx: Add agx_immediate_f helper +- agx: Add perspective bit to ld_var +- agx: Update ld_vary encoding mask +- agx: Add ld_vary_flat opcode +- asahi: Identify varying descriptor fields +- agx: Rename remap_varyings -> remap_varyings_vs +- agx: Implement nir_intrinsic_load_frag_coord +- agx: Implement ld_vary_flat +- agx: Rename agx_pack to agx_pack_binary +- agx: Remap fragment shader varyings explicitly +- asahi: Unify varying linking code with vertex shaders +- agx: Pull out agx_write_components +- agx: Add agx_exit_block helper +- agx: Add liveness analysis pass +- agx: Mark sources that kill +- agx: Count write registers, not components +- agx: Lift agx_block_add_successor from Panfrost +- agx: Track logical control flow graph +- asahi: Wire in tgsi_to_nir +- asahi: Fix random \*2 +- asahi: Guard for overflow when packing +- asahi: Always flush when setting framebuffer state +- asahi: Handle Z16_UNORM textures +- asahi: Add zsbuf to batch +- asahi: Save zsbuf ptr +- asahi: Add internal (renderable) formats to the table +- asahi: Set fragment key for non-U8NORM render targets +- asahi: Implement colour buffer reloads +- asahi: Remove spurious assignment +- asahi: Remove spurious varying assignment +- asahi: Generalize varying linking +- asahi: Add ASAHI_MESA_DEBUG=no16 option +- agx: Fix 32-bit bitwise shifts +- agx: Fix LOD_MIN enum +- agx: Pack LOD descriptors +- agx: Fix lod_mode shift +- agx: Legalize LOD sources to be 16-bit only +- agx: Handle txl +- asahi: Fail on LOD clamps/bias +- asahi: Identify texture/sampler count fields +- asahi: Identify vertex texture/sampler counts +- asahi: Set vertex texture/sampler counts +- asahi: 
Track more Gallium state +- asahi: Wire in u_blitter +- asahi: Handle nonzero first_level +- asahi: Fix meson dependency on packing in compiler +- asahi: Prepack rasterizer faces +- asahi: Implement the stencil test +- asahi: Flush for accesses to Z/S buffer +- asahi: Comment on an embedded data structure +- asahi: Skip over holes in the vbufs +- asahi: Add XML for the attachment structure +- asahi: Sync attachment magic with asahi demo +- asahi: Parametrize software "command buffer" size +- asahi: Identify "command buffer" size field in map +- asahi: Move IOGPU header to XML +- asahi: Extend IOGPU header to contain encoder +- asahi: Use GenXML for main bind fragment +- asahi: Identify attachment length field +- asahi: Set data_valid for the depth buffer +- asahi: Enable primitive restart +- asahi: Use XML for interpolation packet +- panfrost: Express dependencies as resources, not BOs +- panfrost: Wrap occlusion query in pipe_resource +- panfrost: Split "flush writer" from "flush accessing" +- panfrost: Eliminate redundant flushes with AFBC +- panfrost: Add secondary shader XML fields +- pan/decode: Handle IDVS jobs on Bifrost +- agx: Don't choke on registers in the optimizer +- agx: Count read registers as well +- agx: Assign registers locally +- agx: Pipe in nir_register +- agx: Ensure we don't overallocate registers +- panfrost: Move draw_vbo to pan_cmdstream.c +- panfrost: Move most CSO creates to pan_cmdstream.c +- panfrost: Split out prepare_rsd into a vtbl +- panfrost: Move blend CSO to cmdstream/context +- panfrost: Don't ralloc panfrost_blend_state +- panfrost: Move launch_grid to pan_cmdstream +- panfrost: Move panfrost_emit_tile_map to pan_job +- panfrost: Use vtable for fragment descriptor functions +- panfrost: Clean up pan_cmdstream.h +- panfrost: Move sample accessor to pan_cmdstream +- panfrost: Remove pan_cmdstream.h +- panfrost: Remove unused midgard-pack.h includes +- docs: Update relnotes for panfrost/asahi +- pan/bi: Improve clause printing +- 
pan/bi: Fix skip/lod_mode aliasing with VAR_TEX +- pan/bi: Add bi_foreach_instr_global_rev_safe helper +- pan/bi: Pack staging_barrier for the -next- clause +- pan/bi: Try to hit full occupancy on v7 +- pan/bi: Only spill nodes that could progress in RA +- pan/bi: Report cycle counts +- pan/bi: Track LOD mode even for TEXC +- pan/bi: Analyze helper invocations +- pan/bi: Fuse LD_VAR+TEXS_2D -> VAR_TEX +- pan/bi: Add a constant subexpression elimination pass +- pan/bi: Workaround widen restrictions on +FADD.f32 +- pan/bi: Simplify cube map descriptor generation +- pan/bi: Comment the fexp2 implementation +- pan/bi: Factor out exp2/log2 code +- pan/bi: Don't lower fpow +- panfrost: Fix FPK enable condition +- panfrost: Add a performance counter dump utility +- panfrost: Don't set zs_update_operation in vertex shaders +- panfrost: Zero depth_source in vertex shaders +- panfrost: Query tiler features +- panfrost: Enable more tiler levels if we can +- panfrost: Generalize pan_blitter's reg count assert +- panfrost: Set register allocation in the v7 RSD +- asahi: Move fixed internal shaders to agx_blit.c +- asahi: Add missing copyright/guards for magic.c/h +- asahi: Remove unused bo_access property +- asahi/decode: Only dump mapped allocations +- asahi: Make track_free safer +- asahi/decode: Check fewer zeroes after a command buffer +- asahi: Reserve more space to stop a command buffer +- asahi: Identify more unknown fields in the memmap +- asahi/decode: Fix up high word +- asahi/decode: Handle CULL packets +- asahi/decode: Fix decoding of draw calls +- asahi: Allow specifying an encoder ID +- asahi: Allocate global IDs +- asahi: Consolidate some magic numbers +- asahi: Garbage collect senseless cmdbuf struct +- asahi/decode: Print clear/store pipelines +- asahi/decode: Print some IOGPU stuff +- asahi: Set bits in UNK11 needed for points +- asahi: Set point magic bit in rasterizer +- asahi: Set bit for psiz +- asahi: Lower PIPE_CAPF_MAX_POINT_WIDTH to hw limit +- asahi: 
Unpack varying descriptors (1x) +- asahi: Identify triangle/lines vs point varyings +- asahi: Handle point coordinates +- agx: Flip point coordinates because OpenGL +- panfrost: Inline flip_compare_func into pan_encoder.h +- panfrost: Move panfrost_vertex/instance_id to per-gen +- panfrost: Inline away pan_pool.c +- panfrost: Express pack_work_groups more concisely +- panfrost: Inline away pan_invocation.c +- panfrost: Assert that injected jobs are for blits +- panfrost: Inline panfrost_get_z_internal_format +- panfrost: Move arch-independent pan_format code +- panvk: Don't use panfrost_bifrost_swizzle +- panfrost: Remove panfrost_bifrost_swizzle +- panfrost: Add GenXML macros +- panfrost: Compile format table multiple times +- panfrost: Specialize blendable_formats for v6 +- panfrost: Use smaller sizes in blend table +- panfrost: Give WLS Instances a default +- panfrost: Pin an architecture for blending +- panfrost: Use generic delete for ZSA +- panfrost: Remove reference to mali_blend_equation_packed +- panfrost: Avoid GenXML enum dependences +- panfrost: Remove pan_blitter integration +- panfrost: Init/destroy blitter from per-gen file +- panfrost: Only access blitter from per-gen +- pan/bi: Refuse to CSE non-SSA sources +- pan/bi: Make bi_foreach_instr_in_tuple safer +- pan/bi: Update ins->link after scheduling +- pan/bi: Do helper termination analysis on clauses +- pan/bi: Handle multiple destinations in scheduler +- pan/bi: Add bi_before_tuple convenience method +- pan/bi: Handle 4-src instructions in scheduler +- pan/bi: Calculate dependency graph when bundling +- pan/bi: Add a bundling heuristic +- panfrost: Fix format swizzles on G72 +- targets/graw-xlib: Add missing dep_x11 +- pan/mdg: Garbage collect silly quirk +- asahi: Fix sampler filtering flag +- agx: Fix mismatched units in load_ubo +- agx: Plug memory leak in register allocator +- pan/bi: Restrict swizzles on same cycle temporaries +- pan/bi: Remove incorrect errata workaround + +Andres Gomez 
(25): + +- ci: Uprev piglit to 9d87cc3d79e ("framework/replay: send backend's subprocess stderr to sys.stderr") +- ci: Add test which occasionally times out to lavapipe-vk skips +- ci: add xorg to the x86_test-vk container +- ci: allow starting xorg for piglit run +- ci: remove results directory content only with piglit runners +- ci: make sure we only read the first line from install/VERSION +- ci: update some radv trace checksums +- ci: update some radv trace checksums +- ci: update radv's trace job tag for Raven +- ci: remove radv's trace job for Polaris10 +- ci: uprev apitrace to 10.0 +- ci: uprev DXVK to 1.8.1 +- ci: add radv's trace job for Navy Flounder +- ci: include VKD3D-Proton tests into the VK test container +- ci: add VKD3D-Proton testsuite runner +- ci: add VKD3D-Proton testsuite job for radv's Navy Flounder +- ci: disentangle tags for containers and artifacts produced by them +- ci: remove glslangValidator installation from the VK test container +- ci: replace glslangValidator with glslang-tools +- ci: fix the vkd3d-proton runner +- ci: build the hang-detection tool into x86_test-vk +- ci: update some radv trace checksums +- ci: bump x86_test-base tag +- ci: remove unzip from several containers that don't use it at all +- ci: use bash with download-git-cache.sh + +Andrii Simiklit (1): + +- Remove redundant assignment + +Antonio Caggiano (15): + +- panfrost: Fix invalid conversions +- panfrost: Meson dependency +- util: Perfetto SDK v15.0 +- pps: Gfx-pps v0.3.0 +- pps: Gfx-pps config tool +- pps: Documentation +- intel/perf: Extern C +- pps: Intel pps driver +- pps: Intel documentation +- ci: Add a manual job for tracking the performance of Freedreno +- panfrost: Counter definitions +- panfrost: Performance configuration +- panfrost: Fix pan_pool_ref construction +- pps: Panfrost pps driver +- pps: Panfrost documentation + +Anuj Phogat (39): + +- intel: Rename files with gen_debug prefix +- intel: Rename gen_debug prefix to intel_debug +- intel: 
Rename GEN_DEBUG prefix to INTEL_DEBUG +- intel: Rename intel_device_info.c to intel_dev_info.c +- intel: Rename gen_device prefix in filenames +- intel: Rename gen_device prefix to intel_device +- intel: Fix alignment and line wrapping due to gen_device renaming +- intel: Rename GEN_DEVICE prefix in macros to INTEL_DEVICE +- intel: Rename gen_get_device prefix to intel_get_device +- intel: Rename gen_get_aperture_size to intel_get_aperture_size +- intel: Drop gen prefix in gen_has_get_tiling() +- intel: Rename gen_context.h to intel_context.h +- intel: Rename gen_context prefix to intel_context +- intel: Rename gen_perf prefix in filenames to intel_perf +- intel: Rename gen_perf prefix to intel_perf in source files +- intel: Fix alignment and line wrapping due to gen_perf renaming +- intel: Rename GEN_PERF prefix to INTEL_PERF in build files +- intel: Rename GEN_PERF prefix to INTEL_PERF in source files +- intel: Rename gen_{pipeline, oa, counter, hw} to intel_{..} +- intel: Rename brw_gen_enum.h to brw_gfx_ver_enum.h +- intel: Rename gen enum to gfx_ver +- intel: Rename gen keyword in test_eu_validate.cpp +- intel: Rename gens keyword to gfx_vers +- intel: Rename index_gen keyword to index_ver +- intel: Rename eu compact instruction tests +- intel: Rename gen_{mapped, clflush, invalidate} prefix to intel_{..} +- intel: Remove devinfo_to_gen() helper function +- intel: Rename isl_to_gen keyword to isl_encode +- intel: Rename vk_to_gen keyword to vk_to_intel +- intel: Rename gen_10 to ver_10 +- intel: Rename calculate_gen_slm_size to intel_calculate_slm_size +- intel: Rename _gen_{program, part, batch, freq} to _intel_{..} +- intel: Rename GEN_PART to INTEL_PART +- intel: Rename {i965, iris, anv, isl}_gen prefix in build files +- intel: Rename since_gen to since ver +- intel: Rename _gen keyword to _gfx_ver in few build files +- intel: Fix GEN_GEN macro checks +- intel/gfx12+: Add Wa_14013840143 +- intel: Rename GFX 12.5 to XE_HP + +Axel Davy (1): + +- st/nine: Fix 
compilation error on non-x86 platforms + +Bas Nieuwenhuizen (41): + +- radv: Fix memory leak on descriptor pool reset with layout_size=0. +- amd/common: Use cap to test kernel modifier support. +- radv: Only require DRM 3.23. +- radeon/vcn: Use the correct pitch for chroma surface. +- nir: Add load_sbt_amd intrinsic. +- radv: Add sbt descriptors user SGPR input. +- aco: Add load_sbt_amd intrinsic implementation. +- radv: Use global BO list with raytracing. +- radv: Add support for RT bind point. +- radv: Add RT pipeline bind. +- radv: Implement vkCmdTraceRays. +- radv: Use correct border swizzle on GFX9+. +- nir: Add bvh64_intersect_ray_amd intrinsic. +- aco: Implement bvh64_intersect_ray_amd intrinsic. +- nir/lower_returns: Deal with single-arg phis after if. +- radv: Don't skip barriers that only change queues. +- radv: Actually return correct value for read-only DCC compressedness. +- radv: Allow DCC images to be compressed with foreign queues. +- gallium/vl: Use format plane count for sampler view creation. +- gallium/va: Add support for PRIME_2 import. +- radv: Use the global BO list for acceleration structures. +- radv: Add initial CPU BVH building. +- radv: Implement device-side BVH building. +- radv: Add acceleration structure descriptor set support. +- radv: Convert lower_intrinsics to a switch statement +- radv: Implement load_vulkan_descriptor for acceleration structures. +- radv: Expose formats for acceleration structure. +- radv: Add rt perftest flag. +- radv: Enable VK_KHR_acceleration_structure with RADV_PERFTEST=rt. +- radv: Add -Wpointer-arith. +- util/fossilize_db: Pull seek into lock. +- util/fossilize_db: Split out reading the index. +- util/fossilize_db: Do not lock the fossilize db permanently. +- util/fossilize_db: Only lock the db file, not the index. +- nir: Add lowered vendor independent raytracing intrinsics. +- nir: Add raytracing shader call lowering pass. +- meson: Bump libdrm for amdgpu to 2.4.107. 
+- radv/winsys: Return vulkan errors for buffer creation. +- radv/winsys: Add support for a fixed VA address for replay. +- radv: Support address capture and replay. +- ac/surface: Handle non-retiled displayable DCC correctly for modifiers. + +Bastian Beranek (1): + +- glx: Assign unique serial number to GLXBadFBConfig error + +BillKristiansen (2): + +- d3d12: Fixes stale context bindings after copy, resolve, and clear +- d3d12: Sets all SRV descriptors as data-static + +Billy Laws (1): + +- meson: Increase Android Platform SDK version limit + +Boris Brezillon (60): + +- panfrost: Don't advertise AFBC mods when the format is not supported +- panfrost: Reserve thread storage descriptor in panfrost_launch_grid() +- panfrost: Fix RSD emission on Bifrost v6 +- panfrost: Fix indirect draws +- pan/bi: Don't set the EOS flag if there's at least one successor +- panfrost: Keep panfrost_batch_reserve_framebuffer() private +- panfrost: Fix ZS reloading on Bifrost v6 +- pan/midg: Fix 2 memory leaks +- pan/bi: Expand pseudo instructions when nosched is set +- pan/midg: Fix midgard_pack_common_store_mask() +- pan/midg: Make sure the constant offset is in range in mir_match_iadd() +- panfrost: Make sure pack_work_groups_compute() is passed valid dimensions +- panfrost: Add helpers to emit indirect dispatch jobs +- panfrost: Hook-up indirect dispatch support +- panfrost: Only advertise INDIRECT_DRAW if the kernel supports HEAP BOs +- ci: Update to a kernel that has the panfrost MMU fixes +- panfrost/ci: Test GLES 3.1 on Bifrost +- panfrost/ci: Skip draw_indirect.compute_interop.large.* +- panfrost/ci: Run the full deqp-gles3 testsuite +- panfrost: Fix format definitions to match gallium expectations +- Revert "gallium/util: Fix depth/stencil blit shaders" +- panfrost: Pass an image view to panfrost_estimate_texture_payload_size() +- panfrost: Fix blit shader names +- panfrost: Pack pan_blit_surface fields +- panfrost: Get rid of the vertex_count arg in pan_preload_emit_varying() 
+- panfrost: Make pan_preload_emit_*_sampler() applicable to blits +- panfrost: Stop assigning ->position in pan_preload_emit_varying() +- panfrost: Make pan_preload_emit_*_textures() applicable to blits +- panfrost: Make pan_preload_emit_viewport() applicable to blits +- panfrost: Rename pan_preload_emit_varying() +- panfrost: Shrink the number of args passed to prepare_{bifrost,midgard}_rsd() +- panfrost: Don't select the blit shader fragout type twice +- panfrost: Stop assuming the viewport will always cover the framebuffer +- panfrost: Extend pan_blitter to support blit/resolve operations +- panfrost: Use pan_blit() when PAN_MESA_DEBUG=panblit +- panfrost: Split the indexed and !indexed indirect draw info structs +- pan/bi: Add support for gl_{BaseVertex,BaseInstance} +- pan/bi: Add support for gl_DrawID +- panfrost: Expose the DRAW_PARAMETERS cap on Bifrost +- panfrost: Flag indirect draw/dispatch shaders as internal +- panfrost: Relax the stride check when importing resources +- panfrost: Try to align scanout resource stride on 64 bytes +- panfrost: Don't freeze blit batches +- panfrost: Avoid duplicate entries in access->readers +- panfrost: Simplify the dependency tracking logic +- panfrost: Limit the number of active batch to 32 +- ci: Update to a new kernel fixing a bug in the panfrost driver +- panfrost: Constify the constants pointers passed to pan_blend functions +- panfrost: Make panfrost_scoreboard_initialize_tiler() return the job pointer +- pan/midg: Add a flag to dump internal shaders +- panfrost: Add a Vulkan driver for Midgard/Bifrost GPUs +- panfrost: Don't add blit context BOs twice +- panfrost: Pass a memory pool to pan_blit_ctx_init() +- panfrost: Add alignment info to write-value and cache-flush jobs +- panfrost: Allocate WRITE_VALUE jobs with panfrost_pool_alloc_desc() +- panvk: Use the desc allocator when we can +- panfrost: Start splitting the panfrost pool logic +- panvk: Support returning BOs allocated by panvk_pool to a 'free BO' pool 
+- panfrost: Replace the batch->bos hashmap by a sparse array +- panfrost: Do tracking of resources, not BOs + +Boyuan Zhang (8): + +- vl: add st_rps_bits for HEVC decode +- frontends/va: get st_rps_bits from VA pic param hevc +- frontends/vdpau: disable UseStRpsBits for vdpau hevc +- radeon/vcn: enable parsing support for st_rps_bits +- frontends/omx: use pipe buffer map instead of texture map +- radeon/vcn: move calc_dpb_size into create_decoder +- radeon/vcn: allocate non-tmz context buffer for VCN2+ +- radeon/vcn: use st_rps_bits only when it's set + +Caio Marcelo de Oliveira Filho (25): + +- spirv: Don't replicate patch bool in vtn_variable +- nir: Remove now unnecessary conditions from emit_load/store helpers +- intel/compiler: Add common function for CS dispatch info +- iris: Use brw_cs_get_dispatch_info() +- anv: Use brw_cs_get_dispatch_info() +- i965: Use brw_cs_get_dispatch_info() +- intel/compiler: Remove unused exported functions +- nir: Move shared_memory_explicit_layout bit into common shader_info +- intel/compiler: Clarify why VUE is recomputed by FS +- nir: Rename nir_is_per_vertex_io to nir_is_arrayed_io +- compiler: Rename local_size to workgroup_size +- compiler: Rename SYSTEM_VALUE_LOCAL_GROUP_SIZE to SYSTEM_VALUE_WORKGROUP_SIZE +- nir: Rename nir_intrinsic_load_local_group_size to nir_intrinsic_load_workgroup_size +- nir: Rename WORK_GROUP (and similar) to WORKGROUP +- nir: Move zero_initialize_shared_memory into common shader_info +- nir: Move workgroup_size and workgroup_variable_size into common shader_info +- anv: Support workgroup memory in other shaders +- nir/lower_io: Rename vertex_index to array_index in helpers +- nir/gather_info: Rename per_vertex to is_arrayed +- spirv: Fix handling of OpBranchConditional with same THEN and ELSE +- nir/opt_if: Don't split ALU for single block infinite loops +- nir: Add test to check edge case in Split ALU optimization +- spirv: Update headers and metadata from latest Khronos commit +- spirv: Support 
SPV_KHR_subgroup_uniform_control_flow +- anv: Advertise VK_KHR_shader_subgroup_uniform_control_flow + +Carsten Haitzler (Rasterman) (1): + +- panfrost: Fix Bo imports to not take the process down if fd is invalid + +Charlie (10): + +- v3dv: enable KHR_image_format_list +- v3dv: enable KHR_sampler_mirror_clamp_to_edge +- v3dv: enable KHR_incremental_present +- v3dv: enable KHR_uniform_buffer_standard_layout +- v3dv: clamp srgb render targets +- v3dv: remove sRGB blending workaround +- v3dv: add the unswizzled RGBA4444 format +- v3dv: divide by block size in copy_image_blit +- v3dv: add ASTC formats to get_compatible_tlb_format +- v3dv: enable ASTC formats + +Charlie Birks (1): + +- v3dv: document two supported extensions + +Charlie Turner (5): + +- ci: Remove obsolete reference to DEQP_SKIPS +- radv: Merge dEQP default skips into all generation-specific skip lists +- radv: Add a STONEY baseline for dEQP. +- radv: Provide a toggle to avoid warnings about unsupported devices. +- spirv_to_nir: Add environment variable to change default log level + +Charmaine Lee (1): + +- svga: fix texture rectangle sampling when no sampler view declaration is defined + +Chia-I Wu (93): + +- venus: stop using vn_renderer_sync in vn_fence +- venus: stop using vn_renderer_sync in vn_semaphore +- venus: stop using vn_renderer_sync in vn_queue +- venus: remove vn_renderer_sync support from vn_queue_submission +- venus: remove VN_SYNC_TYPE_SYNC +- venus: remove vn_queue::sync_queue_index +- venus: remove vn_ring_wait_all +- venus: wait on vkQueuePresentKHR +- venus: remove vn_renderer_info::has_timeline_sync +- venus: add vn_image_init_memory_requirements +- venus: add vn_image_create +- venus: add vn_wsi_create_scanout_image +- venus: refactor vn_queue_submission_count_semaphores +- venus: clang-format clean +- venus: change SpaceBeforeParens style option +- venus: provide accessors for vn_instance_submit_command +- venus: update venus-protocol headers to use accessors +- venus: rename 
VN_CS_ENCODER_INITIALIZER +- venus: add vn_renderer_shmem +- venus: use vn_renderer_shmem +- venus: add dev->renderer pointer +- venus: pass vn_renderer in vn_renderer_bo functions +- venus: move vn_renderer_bo_ops to vn_renderer +- venus: merge bo create and init ops +- venus: move some common members to vn_renderer_bo +- venus: use sparse array to manage vn_renderer_bo +- venus: make sure gem_handle and vn_renderer_bo are 1:1 +- venus: update venus-protocol for external memory +- venus: rework external memory capability queries +- venus: enable external memory support +- venus: fix render pass without attachments +- venus: fix dmabuf import mmap_size check +- venus: fix dmabuf import fail path +- venus: add VN_MAX_API_VERSION +- venus: rename vn_instance::renderer_version +- venus: clarify/fix instance renderer versions +- venus: clarify/fix device renderer version +- venus: refactor vn_physical_device_init_extensions +- venus: avoid strcmp for spec version override +- venus: refactor vn_physical_device_init_supported_extensions +- venus: init supported extensions in one place +- venus: add extension check for ANDROID_native_buffer +- venus: clean up vn_device_fix_create_info +- venus: get rid of #ifdef's in vn_CreateImage +- pps: fix a missing include in Intel pps driver +- util/u_thread: fix u_thread_setname for long names +- venus: add struct vn_command_buffer_builder +- venus: remember cmd buffer level and queue family +- venus: ignore pInheritanceInfo when we should +- docs: add basic documentation for venus +- vulkan/wsi: provide more info in wsi_image_create_info +- venus: add vn_device_memory_alloc as a helper +- venus: fix asserts on mem bo +- venus: fix opaque fd re-import +- venus: move wsi_image_create_info parsing +- venus: remember image wsi states +- venus: handle VN_COMMAND_BUFFER_STATE_INVALID +- venus: remember render pass PRESENT_SRC attachments +- venus: remember render pass PRESENT_SRC barriers +- venus: remember image view image +- venus: 
remember framebuffer attachments +- venus: remember cmd buffer render pass and framebuffer +- venus: remember cmd buffer fb attachments +- venus: add vn_image_memory_barrier_has_present_src +- venus: add vn_cmd_get_image_memory_barriers +- venus: clean up vn_get_intercepted_barriers +- venus: undo wsi image ownership transfer for Android +- venus: reland wsi image ownership transfer for Android +- venus: enable wsi image ownership transfer for common wsi +- vulkan/util: add vk_default_allocator +- radv: use vk_default_allocator +- v3dv: use vk_default_allocator +- tu: use vk_default_allocator +- anv: use vk_default_allocator +- venus: use vk_default_allocator +- venus: silence compiler warnings +- venus: query experimental features in one call +- venus: document the darkest corner of venus +- venus: move vn_renderer_sync_ops to vn_renderer +- venus: simplify vn_renderer_sync creation +- venus: update venus-protocol headers +- venus: add support for external fence on Android +- venus: add support for external semaphores on Android +- venus: clean up vn_physical_device_get_native_extensions +- venus: fix compatibility with older host drivers +- venus: be verbose about which physical devices are skipped +- vulkan/wsi: fix select_memory_type when all MTs are local +- venus: fix empty submits with BOs +- egl/surfaceless: try kms_swrast before swrast +- meson: allow egl_native_platform to be specified +- venus: clean up vn_AllocateMemory +- venus: suballocate memory in more cases +- vulkan/wsi/x11: do not inherit last_present_mode + +Christian Gmeiner (3): + +- ci: disable initrd support +- drm-shim: fix compile with glibc >= 2.33 +- ci: bare-metal: drop webdav support + +Connor Abbott (133): + +- ir3: Fix list corruption in legalize_block() +- ir3: Reduce max const file indirect offset base to 9 bits +- ir3, tu: Add compiler flag for robust UBO behavior +- tu: Correctly preserve old push descriptor contents +- tu: Handle robust UBO behavior for pushed UBO ranges +- tu: 
Handle null descriptors +- tu: Expose VK_EXT_robustness2 +- ir3/parser: Fix oob write with immediates array +- ir3: Improve cat1 modifier disassembly +- ir3: Assemble and disassemble swz/gat/sct +- ir3: Prevent oob writes to inputs/outputs array +- nir/lower_clip_disable: Fix store writemask +- ir3, tu: Cleanup indirect i/o lowering +- freedreno: Don't lower indirects in GLSL IR +- freedreno/a6xx: Better document SP_GS_PRIM_SIZE +- freedreno/a6xx: Fix SP_GS_PRIM_SIZE for large sizes +- tu: Fix SP_GS_PRIM_SIZE for large sizes +- ir3/postsched: Fix dependencies for a0.x/p0.x +- ir3/cp: Clone registers for compare-folding optimization +- ir3/sched: Use correct src index +- ir3/postsched: Use correct src index +- ir3/delay: Remove special case for array deps +- ir3/postsched: Fix ir3_postsched_node::delay calculation +- ir3/cp_postsched: Fixup SSA use pointer for direct reads +- ir3: Refactor nir->ir3 block handling +- ir3: Make predecessors an array +- ir3: Rework outputs +- ir3: Don't assume regs[1] exists in ir3_fixup_src_type() +- nir/lower_phis_to_scalar: Add "lower_all" option +- ir3/cf: Rewrite pass +- ir3: Use round-to-nearest-even for fquantize2f16 +- ir3: Call nir_lower_wrmask() again after lowering scratch +- ir3: Only use per-wave pvtmem layout for compute +- ir3: Introduce phi and parallelcopy instructions +- ir3: Add ir3_start_block() +- ir3: Readd support for translating NIR phi nodes +- ir3: Prepare for instructions with multiple destinations +- ir3: Improve register printing for SSA +- ir3: Add ir3_register::array.base +- ir3/delay: Fix full->half and half->full delay +- ir3: Add reg_elems(), reg_elem_size(), and reg_size() +- ir3: Make branch conditions non-SSA +- ir3: Rewrite delay calculation +- ir3/delay: Delete pre-RA repeat handling +- ir3/postsched: Don't use SSA source information +- ir3: Remove unused check_src_cond() +- ir3: Add dominance infrastructure +- ir3: Add pass to lower arrays to SSA +- ir3: Expose occupancy calculation functions +- 
ir3: Rewrite register allocation +- ir3/ra: Add a validation pass +- ir3: Remove right and left copy prop restrictions +- ir3/sched: Don't schedule collect early +- ir3/sched: Make collects count against tex/sfu limits +- ir3/sched: Consider unused destinations when computing live effect +- ir3: Add simple CSE pass +- ir3: Insert output collects in the main shader +- ir3: Copy propagate immed/const to meta instructions +- ir3: Improve printing of array parallelcopies/phis +- ir3/ra: Fix array parallelcopy confusion +- ir3: Make tied sources/destinations part of the IR +- ir3: Split read-modify-write array dests in two +- ir3: Update ir3_register::instr when cloning instructions +- ir3: Validate that ir3_register::instr is correct +- ir3: Add is_reg_special() +- ir3: Make ir3_instruction::address a normal register +- ir3: Split ir3_reg_create() into ir3_{src,dst}_create() +- ir3: Add separate src/dst count in ir3_instr +- ir3/legalize: Construct branch properly +- ir3: Add srcs/dsts arrays to ir3_instruction +- freedreno/isa: Convert to srcs/dsts +- freedreno/tests: Convert to srcs/dsts +- ir3/sched: Convert to srcs/dsts arrays +- ir3/core: Switch to srcs/dsts arrays +- ir3/ra: Switch to srcs/dsts arrays +- ir3/parser: Switch to srcs/dsts arrays +- ir3/array_to_ssa: Switch to srcs/dsts arrays +- ir3/legalize: Switch to srcs/dsts arrays +- ir3/print: Switch to srcs/dsts arrays +- ir3/validate: Switch to srcs/dsts arrays +- ir3/opts: Switch to srcs/dsts arrays +- ir3/frontend: Switch to srcs/dsts arrays +- ir3: Remove regs array +- ir3: Remove IR3_REG_DEST +- ir3/ra: Fix corner case in collect handling +- freedreno/a6xx: Make SP_XS_PVT_MEM_HW_STACK_OFFSET non-inline +- freedreno, tu: Set SP_XS_PVT_MEM_HW_STACK_OFFSET +- freedreno/computerator: Fix local_size typo +- ir3/sched: Speed up live_effect +- ir3: Stop creating dummy dest registers +- ir3: Prepare dest helpers for multi-dst instructions +- ir3: Add foreach_dst/foreach_dst_n +- ir3: Support multi-mov 
instructions +- ir3/delay: Support multi-mov instructions +- ir3/postsched: Support multi-mov instructions +- ir3/legalize: Support multi-mov instructions +- ir3: Use correct flags for movmsk & multi-mov +- ir3/validate: Support multi-mov instructions +- ir3: Print multi-mov instructions +- ir3: Add min gen for multi-mov instructions +- ir3/lower_parallelcopy: Use SWZ +- nir/subgroups: Replace lower_vote_eq_to_ballot with lower_vote_eq +- nir/subgroups: Support > 1 ballot components +- nir: Add read_invocation_cond_ir3 intrinsic +- tu, ir3: Plumb through support for CS subgroup size/id +- ir3/nir: Call nir_lower_subgroups +- ir3: Handle shared register liveness correctly +- ir3: Handle unreachable blocks +- ir3: Prevent propagating shared regs out of loops +- ir3: Better valid flags for shared regs +- ir3: Actually allow shared reg moves to be folded +- ir3: Fix shared reg delay +- ir3: Make MOVMSK use repeat +- ir3: Fix infinite loop in scheduler when splitting +- ir3/sched: Handle branch condition in split_pred() +- ir3: Cleanup ir3_legalize jump optimization +- ir3: Support any/all/getone branches +- ir3: Add subgroup pseudoinstructions +- ir3: Handle shared registers in lower_parallelcopy +- ir3: Implement nir subgroup intrinsics +- ir3: Fix convergence behavior for loops with continues +- ir3/legalize: Fix loop convergence behavior +- tu: Update subgroup properties +- ir3/nir: Lower indirect references of compact variables +- ir3: Add missing include to ir3_parser.y +- ir3: Add ir3_collect() for fixed-size collects +- ir3/lower_parallelcopy: Don't manually set wrmask +- ir3: Update .editorconfig and .dir-locals.el +- ir3: Manually reformat some places +- freedreno: Add some options to .clang-format +- ir3: Reformat source with clang-format +- ir3/print: Manual formatting fixups +- ir3: Preserve gl_ViewportIndex in the binning shader + +Corentin Noël (4): + +- ci: Use the caching proxy for Mesa artifacts +- ci: Re-enable virgl tesselation shader +- ci: Bump 
virglrenderer +- ci: actually run piglit tests with virgl + +Daniel Schürmann (33): + +- aco: fix additional register requirements for spilling +- aco: relax validation rules for p_reduce dst RegType +- driconf: set vk_x11_strict_image_count for Metro: Exodus +- aco/ra: prevent underflow register for p_create_vector operands +- radv: call nir_copy_prop() after load-store vectorization +- aco/ra: also prevent overflow register for p_create_vector operands +- aco: remove condition operand from branch in invert block +- radv,aco: scalarize all phis via nir_lower_phis_to_scalar() +- aco: simplify Phi RegClass selection +- aco/ra: only create phi-affinities for killed operands +- aco/ra: refactor affinity coalescing +- aco/ra: refactor register assignment for vector operands +- amd/ci: add hawaii-specific skip and fail lists +- aco/ra: handle copies of definition registers +- aco/ra: handle copies of copies better +- aco/util: replace DIV_ROUND_UP(n+1,m) by n/m+1 +- aco: reorder and cleanup #includes +- aco: add missing Licenses and remove Authors from files +- aco: add 'common/' and 'llvm/' prefix to #includes +- aco/meson: remove unnecessary dependencies +- aco: refactor SDWA opcode validation +- aco: remove (wrong) GCC array-bounds warning +- util/meson: include inc_gallium +- aco: add .clang-format file +- aco: Format. 
+- aco/meson: remove inc_gallium from include_directories +- aco: fix self-intersecting register swaps +- aco: fix extract_vector optimization +- aco/isel: avoid unnecessary calls to nir_unsigned_upper_bound() +- aco/insert_waitcnt: Remove many unnecessary wait_imm.combine() +- aco/live_var_analysis: change worklist to a single integer +- aco/optimizer: ensure to not erase high bits when propagating packed constants +- aco: include in aco_util.h + +Daniel Stone (69): + +- CI: Disable Panfrost and radeonsi +- CI: Disable all Panfrost/AMD/Iris automatic jobs +- CI: Disable rk3399-gru-kevin jobs for now +- doc: Gratuituous promotion of Wayland +- docs: Even more gratutious nitpicks +- Revert "CI: Disable rk3399-gru-kevin jobs for now" +- CI: Fix path confusion in OpenCL Piglit execution +- ci/zink: Skip flaky GLX test +- ci/radeonsi: Skip flaky glx-swap-copy test +- ci/windows: Artifact Meson build and test logs +- ci/windows: Re-enable Windows build +- ci: Add Piglit gl-1.0-blend-func to everyone's skips +- ci/lava: Iterate all job results, not just the first +- ci/lava: Handle proxy download failures +- ci/lava: Add validate-only mode to job submitter +- ci/lava: Add --dump-yaml option to submitter +- ci/bare-metal: Factor out environment to a separate script +- ci/bare-metal: Don't leak JWT into logs +- ci/lava: Move LAVA files to lava/ +- ci/lava: Pass JWT separately from environment variables +- ci/lava: Cosmetic reordering of job init +- ci/lava: Wrap submission in a shell script +- ci/lava: Clean up variable naming, document them +- ci: Make PIPELINE_ARTIFACTS_BASE a common variable +- ci: Add JOB_ARTIFACTS_BASE variable +- ci: Use JOB_ARTIFACTS_BASE for Piglit fails +- ci/lava: Use per-job rootfs overlay for environment +- ci/panfrost: Remove useless variable +- ci/lava: Generate job name from lava-submit.sh +- ci/lava: Remove unused arguments +- ci/lava: Add explicit fatal-error handler +- ci/lava: Disable stdout/stderr buffering +- ci/lava: Dump and artifact 
YAML again +- ci/lava: Avoid tee as it ruins exit status +- ci/piglit: Fix path to uploaded images +- ci/lava: Always upload Piglit replay images to MinIO +- ci/lava: Set PIGLIT_NO_WINDOW +- ci/lava: Explicitly start Xorg for Iris EGL tests +- ci/bare-metal: Rename BM_KERNEL_MODULES to HWCI_KERNEL_MODULES +- ci/lava: Use HWCI_KERNEL_MODULES to load modules +- ci/lava: Rename environment variable script +- ci/bare-metal: Try harder to do NTP +- ci/bare-metal: Reorder init so network comes first +- ci: Move bare-metal init script to common directory +- ci: Be consistent about install path +- ci/bare-metal: Consistently set library paths +- ci/bare-metal: Split init script into two stages +- ci/bare-metal: Move devcoredump capture to CI common +- ci/lava: Start using devcoredump captures +- ci: Consistent pass/fail result output +- ci: Unify {BM,LAVA}_START_XORG environment +- ci: Unify {BARE_METAL,LAVA}_TEST_SCRIPT environment +- ci/bare-metal: Set CPU and GPU governors to max, disable GPU runtime PM +- ci/lava: Pass MinIO path on the command line +- ci/lava: Use common stage-2 init +- ci/lava: Drop bitrotten fastboot support +- ci/lava: Make kernel image type a normal argument +- ci/lava: Generate YAML from Python, not Jinja +- llvmpipe: Add handle export for resource_get_param +- Revert "ci: disable panfrost t760 jobs" +- CI: Disable LAVA devices for maintenance +- Revert "CI: Disable LAVA devices for maintenance" +- util/disk_cache: Don't leak when cache is empty +- panfrost/genxml: Decode Bifrost index-driven vertex jobs +- ci/panfrost: Temporarily disable sun50i/RK3288 +- Revert "ci/panfrost: Temporarily disable sun50i/RK3288" +- vulkan/wsi/wayland: Initialise wl_shm pointer in VkImage +- egl/wayland: Error on invalid native window +- egl/wayland: Allow EGLSurface to outlive wl_egl_window + +Danylo Piliaiev (36): + +- nir: add lowering pass for helperInvocationEXT() +- turnip: implement VK_EXT_shader_demote_to_helper_invocation +- turnip: implement 
VK_KHR_shader_terminate_invocation +- ir3: treat 16b imul as mul.s24 +- turnip: enable shaderInt16 +- ir3: do not double threadsize when exceeding branchstack limit +- ir3: make possible to specify branchstack up to 64 +- tu: do not corrupt unwritten render targets +- ir3: do not move varying inputs that depend on unmovable instrs +- ir3: do not fold cmps from different blocks with non-null address +- ir3: memory_barrier also controls shared memory access order +- ir3: update bar/fence bits in accordance to blob +- turnip: implement VK_KHR_vulkan_memory_model +- docs: mark off VK_KHR_vulkan_memory_model for turnip +- turnip,freedreno/a6xx: SP_BLEND_CNTL has per-mrt blend enable bit +- freedreno/a5xx: SP_BLEND_CNTL has per-mrt blend enable bit +- turnip: copy all layers specified in vkCmdCopyImage +- ci/turnip: drop fail annotation for float_control tests +- ci/turnip: drop fail annotation for image.extend_operands_spirv1p4.* +- turnip: do not ignore early_fragment_tests +- turnip: make possible to create read-only bo with tu_bo_init_new +- turnip: make cmdstream bo's read-only to GPU +- turnip: place a limit on the growth of BOs +- freedreno: reduce the upper bound of IB size by one +- turnip: reset push descriptor set on command buffer reset +- turnip: emit vb stride dynamic state when it is dirty +- turnip: fix register_index calculations of xfb outputs +- turnip: implement VK_EXT_provoking_vertex +- turnip: do not re-emit same vs params +- turnip: early exit in tu6_draw_common to save cpu cycles +- freedreno/computerator: pass iova of buffer to const register +- freedreno/isa: add uoffset type to print positive-only offsets +- ir3: add ldg.a,stg.a which allow complex in-place offset calculation +- glsl: Prohibit implicit conversion of mem parameter in atomicOP functions +- ir3: add newly found shlg.b16 instruction +- freedreno: fix wrong tile aligment for 3 CCU gpu + +Dave Airlie (146): + +- iris: move get_time into a static in bufmgr code. 
+- iris: move target to isl dim translate to inline. +- lavapipe: add support for non-dri loader on linux +- llvmpipe: split screen init up. +- llvmpipe: wrap late screen init with a mutex. +- llvmpipe: delay late screen creation until context init. +- lavapipe: fix mipmapped resolves. +- lavapipe: mark event_storage as volatile +- intel: move brw_ff_gs_prog_key/data to compiler. +- intel/compiler: add support for compiling fixed function gs +- i965: port fixed function geom shader to use compiler paths +- i965: drop old brw ff gs code. +- intel/genxml: align gen4/5 xml for store data immediate +- intel/genxml: rewrite the prefilterop xml to be more consistent. +- intel/gemxml: move blitter command to render on gen4/5 +- intel/genxml: fix raster op fields on gen4/5 +- intel/decoder: fixup batch decoder for binding tables on gen4/5 +- intel/decoder: add gen4/5 geometry state decode +- gallivm: handle texture arrays in non-fragment shaders with lod. +- llvmpipe: fix non-multisampled rendering to multisampled framebuffer +- llvmpipe: add the interesting bit of cpu detection to the cache. +- st/nir: always revectorise if scalarising happens. +- intel/gfx6: move xfb_setup outside the gs compiler into the driver. +- intel/isl: decrease isl_format_layouts size by 36k +- intel/isl: convert null surface fill to a struct. +- intel/isl: add levels and minimum array element to null fill +- intel/isl: add blend enable flag to gen4/5 +- u_blitter: fix fs used when no color emitted +- u_blitter: fix stencil blit fallback for crocus. +- iris: drop unused function declaration +- nir/edgeflags: update outputs written when lowering edge flags. +- st/mesa: also disable other int textures +- intel/decode: handle gen4/5 WM state fragment shaders +- intel: reorder base program key. 
+- intel/compiler: add flag to indicate edge flags vertex input is last +- crocus: initial gallium driver for Intel gfx 4-7 +- ci: add crocus to the build tests +- crocus: Don't call SET_TILING for dmabuf imports +- crocus: Make iris_bo_import_dmabuf take a modifier +- crocus: introduce main resource configuration helper. +- crocus: Drop buffer support in resource_from_handle +- crocus: hook up memory object creation from handle +- crocus: hook up resource creation from memory object +- crocus: plumb device/driver UUID generators +- crocus: enable GL_EXT_memory_object feature on gen7 +- crocus: fix scanout tiling so glamor/modesetting can work. +- crocus: fixed some missing WM dirtys. +- crocus: fixup render aux usage function. +- crocus: disable Z16 +- crocus/gen6: fix depth blit blorp regression. +- i965: fix regression in pipe control on g45 +- crocus: drop dead gen prototypes. +- crocus: fixup stray tab +- crocus: rename genX proto functions to avoid iris conflicts. +- crocus: fixup workaround_bo to match 965. +- crocus: convert a bunch of is_haswell into verx10 checks. +- crocus: refactor blend state code. +- crocus/gen8: limit some pipe controls to gen7/hsw +- crocus: limit texture gather workarounds to gen7/hsw +- crocus/stencil: limit stencil workaround to gen7 +- crocus/query: add gen8 support to queries by extending hsw checks +- crocus: extend l3 config setup to gen8 +- crocus/gen8: add push constant support (extend hsw) +- crocus/gen8: extend some compute + state functions to gen8 +- crocus/gen8: extend image support to gen8 +- crocus: extend hsw cut index to gen8 +- crocus/gen8: extend predicate handling to gen8. +- crocus/gen8: add sampler / border color support for gen8 +- crocus/gen8: add l3 config support +- crocus/gen8: extending gen7 binding table pointers +- crocus/gen8: limit vertex buffer workarounds to ivb +- crocus/gen8: add raw pipe control support for gen8 workarounds +- crocus/gen8: add support for vertex instancing and index buffers. 
+- crocus/gen8: state base address + misc setup state. +- crocus/gen8: add VF topology support +- crocus/gen8: add PMA fix from iris +- crocus/gen8: add streamout support +- crocus/gen8: add SBE swiz support +- crocus/gen8: add VF SGVS support. +- crocus/gen8: add PS blend command support. +- crocus/gen8: refactor blend state for gen8 +- crocus/gen8: add rasterizer state changes. +- crocus/gen8: add viewport support +- crocus/gen8: add depth stencil state support +- crocus/gen8: port over vs/gs/ds state changes. +- crocus/gen8: port over ps/wm state changes from iris. +- crocus/gen8: port over VFE/compute state changes +- croucs/gen8: handle gfx8 surface aux addr reloc. +- crocus/gen8: handle sampler differences +- crocus/gen8: hookup gen8 state generators +- crocus/gen8: add support for cherryview (env var for bdw) +- croucs: limit stencil swizzle change to older generations +- crocus/bufmgr: fix userptr left over fail +- crocus: Explicitly cast value to uint64_t +- crocus: free context state properly. +- crocus: fix vertex buffer leak on screen end. +- crocus: fix batch state bo leak +- meson/crocus: add prefer-crocus option. +- crocus/query: poll the syncobj in the no wait situation +- intel/genxml: fix gfx6 GS SVB_INDEX encoding +- crocus/gfx6: fix sampler view first level. +- crocus: dirty blend state more often. +- crocus: Avoid replacing backing storage for buffers with no contents +- crocus/gfx6: always be dirtying gs attachments for xfb +- crocus: fix another printf specifier. +- crocus/gen8: add back z16 support for gen8 +- crocus: disable Z16 unorm textures on pre-gen8 as well. +- gallium/sw: add sw_vk bit to avoid having to futz with env vars for lavapipe +- zink: drop getenv hacking now that gallium is fixed. 
+- iris: make iris_bind_reserve_3d and Wa_1604061319 only check for dirty render bindings +- crocus: cleanup some deadcode in the gen5 blend emit +- crocus: expose ARB_blend_func_extended on gen 45/50 +- crocus/gen5: enable support for GL_EXT_gpu_shader4 +- crocus: fix crash on index buffer rebinding. +- crocus: fixup index buffer dirtying. +- draw: fix tessellation output vertex size calculation +- draw/tess: write correct primitive id into vertices +- crocus: inline the d/s resource handling functions +- crocus: don't update draw parameters unless needed +- crocus: optimise bo_unref path a little. +- crocus: inline group_index<->bti +- crocus: reorder version checks on indirect xfb +- crocus: restrict prim_restart on index buffer check to pre-hsw +- crocus: support rebinding streamout target buffers +- crocus: use threaded context base classes +- crocus/tc: init/deinit threaded resource +- crocus: add unsync transfer pool +- crocus: enable threaded context support +- ac: fix win32 build +- crocus/gen8: fix wrap mode needs border color. +- crocus: add GL_CLAMP emulation in driver again. +- vulkan/wsi/wl: add wl_shm support for lavapipe. +- lavapipe: add the separate depth/stencil layout enable. +- crocus: use simple_mtx in the bufmgr +- lvp: fixup multi draw memcpys +- draw: handle resetting draw_id between instances. +- softpipe/aniso: move DDQ calculation to after scaling. +- crocus/gen4-5: fix ff gs emit on VS vue map change. +- llvmpipe: add support for time elapsed queries. +- draw/llvmpipe: multiply polygon offset units by 2 +- teximage: return correct desktop GL error for compressedteximage +- crocus/gen4: restrict memcpy mapping to gen5 +- intel/fs: restrict max push length on older GPUs to a smaller amount +- crocus/gen45: fix mapping compressed textures +- intel/genxml: fix raster operation field in blt genxml +- crocus: add support for set alpha to one with blt. + +Dmitry Baryshkov (2): + +- freedreno/regs: split DSI PHY registers to separate xml files. 
+- freedreno/regs: split old/not used phy registers to separate DB + +Drew Davenport (1): + +- radeonsi: Report multi-plane formats as unsupported + +Duncan Hopkins (3): + +- zink: Correct compiler issue with have_moltenvk member having been moved. +- gallium/dri: Guard DRI driver global variables on MacOS if Zink is enabled. +- zink: Fix MacOS compiling issues + +Dylan Baker (27): + +- meson: OpenMP is supposed to be optional +- docs: add release notes for 21.0.3 +- docs: update sha256 sum for mesa 21.0.3 +- docs: update calendar and link releases notes for 21.0.3 +- docs: update calendar for 21.1.0-rc1 +- docs: update calendar for 21.1.0-rc2 +- docs: update calendar for 21.1.0-rc3 +- meson/vulkan: fix linkage on windows +- docs: Add calendar entries for 21.2 release candidates. +- VERSION: bump for 21.2-rc1 +- .pick_status.json: Update to f40a08d25c91256cd3dff0211b8e10d5bbb3734e +- .pick_status.json: Update to a62973580b7846f2213cbd2589e9473c26596683 +- .pick_status.json: Update to 27534a49cf3872646cb8ef9371707d74a81b1986 +- VERSION: bump for 21.2-rc2 +- .pick_status.json: Update to b45cddda183230232937387f91d009500b2372c9 +- .pick_status.json: Update to 49908c602ffd2d84063effa7ddd0ee842be41a89 +- VERSION: bump for 21.2.0-rc3 +- .pick_status.json: Update to dff0d9911d176802b54890c796e19f56c50f24e1 +- .pick_status.json: Update to b8e29e89366a5264391dc7c10e778330b7add66a +- freedreno/ir3: Add build id to the disassembler test +- .pick_status.json: Mark 8cb795b4772f882024b20c4d4b051b2411dd1a8c as denominated +- .pick_status.json: Update to 87b0962fef4e447a2ea9c76a611aa20b109a259d +- .pick_status.json: Update to 842b8c8965327615f4692384a905dd63f1fba63d +- .pick_status.json: Update to 97be8e42e42f3b739c3de808553094f86ad8879f +- bin/gen_release_notes: Add basic tests for parsing issues +- bin/gen_release_notes: Don't consider issues for other projects +- bin/gen_release_notes: Fix commits with multiple Closes: + +Eleni Maria Stea (5): + +- egl: fix in expected type +- 
util: replaced ENODATA with ENOATTR for non-Linux systems +- util: Removed unused statement from FreeBSD build +- intel: struct bitset is renamed to brw_bitset +- intel: PAGE_SIZE used in allocators shouldn't be defined on FreeBSD + +Ella-0 (1): + +- anv: expose primary node to VK_EXT_physical_device_drm even when VK_KHR_display is not enabled + +Emil Velikov (1): + +- gbm: list to stderr all the missing extension + +Emma Anholt (251): + +- ci/freedreno: Merge a630 piglit to a single job. +- freedreno: Fix YUV sampler regression. +- ci/virgl: Mark a couple of new Crash tests as flakes. +- ci/freedreno: Skip some precision tests on a530. +- nir_to_tgsi: Use ARL instead of UARL in the !native_integers case. +- nir: Generate load_ubo_vec4 directly for !PIPE_CAP_NATIVE_INTEGERS +- ci/lavapipe: Don't include deqp's shader_cache in the artifacts. +- ci/lava: Return the run's results/ artifacts from the DUTs. +- ci/piglit: Always include the HTML summary in a run. +- ci/lava: Point the shader cache at tmpfs. +- mesa: Remove dead _mesa_unpack_rgba_block(). +- util: Switch the non-block formats to unpacking rgba rows instead of rects. +- util/format: Add some NEON intrinsics-based u_format_unpack. +- panfrost: Enable packed uniforms. +- zink: Enable PIPE_CAP_PACKED_UNIFORMS. +- ci: Build deqp-egl targeting x11_egl_glx +- ci/llvmpipe: Test dEQP-EGL against Xvfb. +- ci/freedreno: Test dEQP-EGL against Xorg. +- mapi: Respect MESA_DEBUG=silent for no-op debug output. +- freedreno: Mark glsl-fs-fogscale as a Fail. +- freedreno/a6xx: Don't try to do Z-as-RGBA blits for mismatched formats. +- util: Fix big-endian handling of z/s formats. +- mesa: Deduplicate _mesa_pack_ubyte_stencil_row() +- mesa: Deduplicate _mesa_pack_float_z_row(). +- mesa: Deduplicate _mesa_pack_uint_z_row(). +- mesa: Remove dead _mesa_get_pack_float_z_func(). +- msea: Move z24s8-to-z24s8 packing fastpath to swrast. +- mesa: Move per-pixel Z pack functions to swrast. 
+- mesa: Remove dead _mesa_pack_ubyte_rgba_rect(). +- mesa: Replace _mesa_pack_ubyte_rgba_row() with pack_ubyte_rgba_8unorm(). +- ci/radeonsi: Mark a glx_arb_sync_control/timing flake. +- turnip: Only write the tu_RegisterDeviceEXT() out fence on success. +- ci: Add missing vulkan dep for freedreno (turnip) and v3dv test jobs. +- u_format: Fix z32_s8x24 s8 unpacking on big-endian. +- u_format: Add missing BE swizzles for R8SG8SB8UX8U_NORM +- ci/freedreno: Mark dEQP-EGL flakes reported on IRC since its introduction. +- ci/freedreno: Mark new flakes from the go-fast branch. +- ci/freedreno: Mark another recent piglit flake. +- ci/freedreno: Fix the recent-a5xx-texture-flakes matches. +- ci/freedreno: Add another db820c flake that's appeared in the last few months. +- tgsi: Mark the tgsi_exec_channel and tgsi_double_channel ALIGN16. +- tunrip: Add support for VK_EXT_separate_stencil_usage. +- ci/freedreno: Mark a5xx texture gather as flaky. +- turnip: Demote API version to 1.1. +- ci/llvmpipe: Add testing of gles3/31/gl. +- ci/lavapipe: Add fractional NIR stress test job. +- freedreno/a5xx: Fix up border color pointers. +- gallium/tgsi_exec: Drop the unused dst_datatypes from dest stores. +- tgsi_exec: Drop unused destination dimension support. +- tgsi_exec: Mark the store file default case as unreachable. +- gallium/tgsi_exec: Simplify bounds checks on the const file. +- turnip: Switch to the shared vulkan ICD generator. +- turnip: Move the extension tables to tu_device.c +- ci/freedreno: Add another daily dose of a530 flakes. +- turnip: Drop wideLines properties since we don't support wide lines. +- turnip: Claim 2 discrete queue priorities. +- freedreno: Update editorconfig and emacs settings for freedreno reformat. +- ci/turnip: Clean up some stale fail annotations. +- ci/turnip: Add some links to issues and MRs for some test failures. +- turnip: Drop fail annotation for driver_properties. 
+- ci: Switch to apitraces for glmark2 +- ci/panfrost: Add some more traces to replay. +- ci/iris: Add some more traces to replay. +- ci/freedreno: Skip refract on a306 now that it hangchecks sometimes. +- midgard: Fix type for vertex_builtin_arg() and compute_builtin_arg(). +- ci/freedreno: Skip a test that's taking out the a530 boards. +- ci/freedreno: Mark two more recent intermittent a530 flakes. +- ci/deqp: Make DEQP_EXPECTED_RENDERER a required regex for VK like for GLES. +- ci/intel: Add test jobs for dEQP. +- vulkan: Avoid stomping array padding in the MemoryProperties wrapper. +- mesa/st: Only use 16-bit ints or floats in the NIR path. +- i915g: Disable 3D-pipeline clears. +- i915g: Switch batchbuffer dumping to mesa_logi(). +- i915g: Fix dumping of the FS in batchbuffers. +- ci/i915g: Introduce manual testing of i915g using anholt's runner. +- i915g: Make the FS for compile failures write red instead of DIFFUSE. +- i915g: Add support for the .Absolute flag on TGSI srcs. +- i915g: Stop advertising support for indirect addressing in the FS. +- i915g: Fix writing of undefined depth value if not writing any outputs. +- i915g: Fix undefined results for TGSI_OPCODE_KILL +- ci/iris: Switch GLK back to manual testing. +- ci/freedreno: Clear compswap flake annotation. +- ci/freedreno: Clear stale validation failure flake annotation. +- ci/freedreno: Drop a630 flake annotation from the go-fast changes. +- ci/freedreno: Add a link explaining get_display_plane_capabilities +- ci/freedreno: Drop VK flake annotations not seen in the last ~year. +- ci/freedreno: Consolidate ssbo.fragment_binding_array flake annotation. +- ci/freedreno: Mark a630 glx-visuals-depth/stencil as piglit flakes. +- ci/freedreno: Also mark waitformsc as flaky. +- ci/freedreno: Add glx-copy-sub-buffer to flakes on a530 and a630. +- mesa/st: Fix iris regression with clip distances. +- ci/freedreno: Add another a630 piglit flake. 
+- ci/freedreno: Turn off default a530 quick_gl testing, do full quick_shader. +- turnip: Reorganize copy_format()'s switch statement. +- turnip: Make sure that SNORM blits don't clamp ambiguous -1.0 values. +- Revert "ci: Configure DUTs for max performance" +- ci: Add known-flake handling for the IRC flake reports +- ci: Move the flakes channels to OFTC +- util: Add a helper macro for defining initial-exec variables. +- android: Fix ELF TLS support. +- ci/android: Update to building for SDK 29 by default. +- u_format: Fix some pep8 in u_format_parse.py. +- u_format: Drop redundant .name init. +- u_format: Move the BE swizzle computation into Format init. +- u_format: Use the nice helper for reversing an array. +- u_format: Assert that array formats don't include BE swizzles. +- u_format: Define tests for r3g3b2 formats and fix BE swizzles for them. +- u_format: Fix the BE channel ordering for R5G5B5A1_UINT. +- u_format: Sanity check the BE channels for all bitmask formats. +- u_format: Sanity check that BE swizzles are appropriately mapped from LE. +- u_format: Use the computed BE channels/swizzles for bitmask formats. +- ci/freedreno: Add some more known flakes from recent marge runs. +- docs/freedreno: Update for the fanin/fanout -> collect/split rename. +- docs/freedreno: Rewrite the section on array access. +- tgsi_exec: Garbage-collect the FAST_MATH path. +- u_math: Reduce fast-log2 table size from 65k entries back to 256. +- llvmpipe: Don't call util_init_math(). +- ra: Add a unit test. +- ra: Document that class index is allocated in order, use that in r300. +- ra: Use struct ra_class in the public API. +- ra: Add fast-path support for register classes of contiguous regs. +- vc4: Use the ra_alloc_contig_reg_class() function to speed up RA. +- v3d: Use the ra_alloc_contig_reg_class() function to speed up RA. +- intel/fs: Use ra_alloc_contig_reg_class() to speed up RA. +- intel/vec4: Use ra_alloc_contig_reg_class() to reduce RA overhead. 
+- lima: Use ra_alloc_contig_reg_class(). +- util/ra: Use the conflicting neighbor to skip unavailable registers. +- ci/i915g: Fix incorrect expectation. +- i915g: Make sure we don't try to texture from the const file. +- ci/lava: Finish garbage-collecting the TEST_SUITE variable +- ci: Update piglit and deqp/piglit-runner. +- ci/freedreno: Enable running all of piglit_gl for a530's manual test. +- ci/piglit: Skip WGL on all the Linux runs. +- ci/fastboot: Add a serial timeout to catch fastboot prompt failure. +- ci/fastboot: Consistently restart the run on intermittent conditions. +- ci/iris: Enable piglit testing on AML-Y. +- ci: Disable Xorg's screensaver entirely. +- ci/deqp: Drop stress/perf skips lists. +- ci/deqp: Skip flush_finish on all CI jobs. +- ci/softpipe: Move the flake to the flakes list. +- ci: Add a flakes IRC channel for llvmpipe/softpipe. +- ci/deqp: Skip dEQP-VK.wsi.display.get_display_plane_capabilities +- ci/piglit: Move the WGL skip to a common skips file. +- ci/piglit: Skip glx_arb_sync_control@timing.* on all systems. +- freedreno: Drop batch-cache orphan tracking. +- freedreno: Make a bunch of the batch cache take ctx as the arg. +- freedreno: Drop a bit of indirection around the batch cache flush path. +- freedreno: Be more strict about QUERY_AVAILABLE to simplify the code. +- freedreno: Fix batch reference handling in flush_resource(). +- freedreno: Move the !MAP_WRITE write batch refcounting to the branch. +- freedreno: Remove broken back_blit optimization. +- freedreno: Flush batches upon destroying the ctx. +- freedreno: Add perf_debug() for our software conditional rendering. +- freedreno: Move FD_MESA_DEBUG=msgs output to mesa_logi. +- freedreno/fdl: Give the tiling mode a nice name in debug dumps. +- freedreno: Add more detailed blit debug in FD_MESA_DEBUG=msgs. +- freedreno: Skip staging blits from uninitialized resources. +- freedreno: Add some cheza flakes from the last week. 
+- i915: Disable vertex texturing and delete the code. +- i915: Drop assertion failure about seeing each const decled once. +- i915g: Allow fragment coord conventions TGSI properties to be set. +- nir/lower_int_to_float: Make sure the cursor is in the right spot. +- nir: Do peephole select on other instructions if the limit is ~0. +- nir_to_tgsi: Fix internal handling of NIR uints for !CAP_INTEGERS +- nir_to_tgsi: Support integer sysvals on !CAP_INTEGERS hardware. +- i915g: Handle fragment depth being in OUT[1] not OUT[0]. +- i915g: Switch to using nir-to-tgsi. +- i915g: Add triangle provoking vertex support. +- freedreno: Move some driver debug printfs to mesa_logd. +- freedreno/ir3: Move the assert output to mesa_loge(). +- util/log: Add a streaming printf interface. +- freedreno/ir3: Use mesa_log_stream() for ir3 disassembly. +- freedreno/ir3: Move the native code output to mesa_log as well. +- nir: Add an interface for logging shaders with mesa_log*. +- freedreno/ir3: Move NIR printing to mesa_log. +- gallium/draw: Garbage collect draw_set_force_passthrough +- gallium/draw: Garbage collect draw_pt_fetch_emit. +- gallium/util: Introduce a helper for finding whole-resource blits. +- freedreno: Move the rsc-based batch flushing to helper functions. +- freedreno: Handle full blit discards by invalidating the resource. +- freedreno: Cooperate with tc to stop checking the BC for resource_busy(). +- ci/llvmpipe: Mark two more multithread program link flakes. +- i915g: Remove nr_cbufs loop. +- i915g: Create an i915_surface for our pipe_surfaces. +- i915g: Compute 3DSTATE_BUF_INFO flags at surface create time. +- i915g: Move cbuf color swizzle lookup to CSO creation time. +- i915g: Simplify color write mask setup. +- i915g: Use the color swizzle to reshuffle the blend const color. +- i915g: Clear xfails for vertex texturing. +- freedreno: Fix leak of the screen hash table. +- i915g: Fix GL_ARB_copy_buffer assertion fails. 
+- i915g: Fix bad naming of depth texture formats. +- i915g: Finish out blend factor overrides for both RGBx and A8. +- ci/i915g: Skip the piglit glx tests since we're not running X. +- freedreno/ir3: Report RA failure with mesa_loge(). +- turnip: Link more MRs and issues related to our xfails. +- turnip: Use vk_startup_errorf() in more startup paths. +- ci/turnip: Document create_instance_device_intentional_alloc_fail's fail. +- turnip: Disable buffer texturing on 422 formats. +- Revert "freedreno: Cooperate with tc to stop checking the BC for resource_busy()." +- nir: Add a helper for chasing movs with nir_ssa_scalar(). +- turnip: Short-circuit if ladder generation for constant index SSBO/UBOs. +- i915g: Apply clang-format. +- i915g: Bake the decls and program together. +- i915g: Allow use of I915_DEBUG= options on non-DEBUG builds. +- i915g: Enable dumping of fragment shaders under I915_DEBUG=fs. +- i915g: Use the normal compile error path for empty FSes. +- i915g: Log program compile errors to mesa_loge(). +- i915g: Stop translating the fragment program on the first error. +- i915g: Improve logging of unsupported opcodes. +- i915g: replace "uint" with normal uint32_t. +- i915g: Use stdbool.h instead of custom bools. +- i915g: Remove redundant p->error setting. +- i915g: Mark program errors on setting up temps, constants, and immediates. +- i915g: Fix off-by-one in constant count assertion. +- intel: Early exit from inst_is_in_block(). +- i915g: Finish the uint -> uint32_t conversion. +- i915g: Add the nice cube map layout comments from i915c. +- i915g: Fix FS debug dumping for declarations. +- i915g: Delete redundant i915_hw_sampler_views atom. +- i915g: Add curly braces for normal mesa style (and helps clang-format) +- i915g: Set up the cube map texture wrap modes. +- freedreno: Update comments about PIPE_BUFFER shadowing. +- freedreno: swap ->valid when shadowing resources. +- freedreno/a5xx: Make sure to mark blit read/write access in the BC. 
+- freedreno: Stop manually marking blit dst buffers as valid. +- freedreno: Swap needs_ubwc_clear when shadowing. +- freedreno: Flush the shadowed resource's write batch up front. +- i915g: Add support for per-vertex point size. +- i915g: whitespace fixup from the cube map fix. +- i915g: Force 1D textures to use wrap mode for the Y coordinate. +- i915g: Make sure the 1D texture Y channel is initialized. +- anv: Fix unused var warning on release builds from an assertion. +- nir: Add a nir_instr_remove that recursively removes dead code. +- nir: Use remove_and_dce for nir_shader_lower_instructions(). +- nir: Free the instructions in a DCE instr removal. +- i915g: Fix writemasking of SEQ/SNE/SSG. +- nir_to_tgsi: Run copy prop (and thus dce) after lower_bool_to_float. +- nir_to_tgsi: Declare immediates as float on non-native-ints hardware. +- turnip: Fix allocation size for vkCmdUpdateBuffer. +- i915g: Fix dumping of 3DSTATE_BACKFACE_STENCIL_OPS. +- i915g: Fix backface stencil when front_ccw is set. +- ci: Make sure that we build the piglit dmabuf tests. +- freedreno: Suballocate our long-lived ring objects. +- freedreno/a6xx: Reduce the size of the config stateobj allocation. +- freedreno/a6xx: Reduce the size of the long-lived texture stateobj. +- freedreno/a6xx: Allocate just enough memory for SO state, only if we do SO. +- freedreno: Optimize duplicate obj-obj ring relocs. +- i915g: Fix release build compiler warnings. +- ci: Enable testing of i915g in the debian -Werror release build. +- freedreno: Lock access to msm_pipe for RB object suballocation.
+ +Enrico Galli (10): + +- microsoft/compiler: zero out unused WebGPU system values +- microsoft/compiler: Remove de-duplication of arbitrary semantic names +- d3d12, microsoft/compiler: Switching semantic names to TEXCOORD +- d3d12, microsoft/compiler: Moving driver_location allocation to compiler +- util: Add simple test for util_qsort_r +- util: Add qsort_r/s args adapter for MSVC and BSD/macOS +- nir: Add modes filter to nir_sort_variables +- microsoft/compiler: Switch io sort to use nir_sort_variables_with_modes +- microsoft/spirv_to_dxil: Add drive_location assignment +- microsoft/compiler: Add support for get_ssbo_size to translator + +Eric Engestrom (15): + +- VERSION: bump to 21.2.0-devel +- docs: reset new_features.txt +- egl/x11: don't forget to exit the attrib list loop +- docs: add release notes for 21.1.0 +- docs: add release notes for 21.1.1 +- docs: update calendar and link releases notes for 21.1.0 +- docs: update calendar and link releases notes for 21.1.1 +- docs/release-calendar: add the schedule for the 21.1 branch +- docs: add release notes for 21.1.2 +- docs: update calendar and link releases notes for 21.1.2 +- docs: add release notes for 21.1.3 +- docs: update calendar and link releases notes for 21.1.3 +- docs: add release notes for 21.1.4 +- docs: update calendar and link releases notes for 21.1.4 +- docs/release-calendar: add a few more 21.1 releases + +Erico Nunes (7): + +- gallium/hud: create vs_text to match fs_text +- gallium/hud: extend check for has_srgb +- docs/lima: add an initial page for Lima +- lima: enable z16 format +- lima: add reload command to the command dump +- meson: kmsro: require dri3 for X11 +- lima: avoid crash with negative viewport values + +Erik Faye-Lund (193): + +- zink: fix stencil-export cap emission +- lavapipe: resolve border-color when creating sampler +- lavapipe: implement VK_EXT_custom_border_color +- nir/lower_tex: do not stumble on 16-bit inputs +- zink: document requirement of 
VK_EXT_custom_border_color +- gallivm: handle 16-bit input in i2b32 +- gallivm: run nir_opt_algebraic_late +- gallivm: add 16-bit integer support +- zink: do not require vulkan memory model for shader-images +- docs: write basic meta-documentation +- zink: do not read outside of array +- docs: remove out-of-date gles info +- docs: remove documentation of MESA_CI_VISUAL +- docs: remove documentation of MESA_PRIVATE_CMAP +- docs: remove documentation of MESA_HPCR_CLEAR +- docs: nest cherry-pick example under note +- docs: use tables instead of pre-formatted text +- docs: use math notation for example matrices +- docs: use code-block for console-content +- docs: use code-block for glsl +- docs: use code-block for c +- docs: use code-block for ini +- zink: only emit extended-formats cap if needed +- zink: remove memory-model leftovers +- docs: fixup link to extension +- docs: fix quoting around a few limits +- zink: correct image cap checks +- docs: add missing zink-requirement +- docs: someome -> someone +- zink: enable required instance ext +- zink: make zink_binding private +- zink: remove stray semicolons +- zink: fixup bad indentation +- docs: remove out-of-date versions doc +- zink: fix shader-image requirements +- zink: correct an extension-link +- docs: fixup indentation of radeonsi envvar values +- docs: document r600 envvars +- zink: use UINT32_MAX instead of UINT_MAX +- zink: respect bit-size of dref-result +- zink: run nir_opt_algebraic_late +- zink: always lower function-temp derefs +- zink: support emitting 16-bit int types +- zink: enable 16-bit int support +- zink: support emitting 16-bit float types +- zink: perform fp16 texture-lookups as fp32 and then convert +- zink: enable 16-bit float support +- zink/codegen: prefer first definition of prop/feature structs +- zink: also enable float16 from KHR extension +- lavapipe: consistently use nir macros +- docs: update gallium doxygen docs +- zink: handle matrix-types after vectors +- zink: cache SpvId for 
aggregate glsl_types +- zink: always enable fixed shader-caps +- zink: do not check for varying output for fragment shaders +- zink: emit cap early +- zink: remove needless shader-info from context +- zink: emit sample-shading cap early +- zink: emit cap early +- zink: only emit ImageBuffer cap if needed +- docs: do not generate redirects on error +- gallium/u_vbuf: avoid dereferencing NULL pointer +- freedreno/a5xx: Remove ppgtt hack +- docs: remove doxygen support +- zink: remove incorrect border-swizzle assumption +- lavapipe: emit correct textures_used for texture-arrays +- zink: do not ask glsl-compiler to unroll +- lavapipe: fix fsum with swizzle +- st/mesa: do not take util_logbase2 of a negative size +- zink: check for error when binding memory +- gallium: allow to report errors from p_screen::resource_bind_backing +- lavapipe: report out-of-memory when binding +- llvmpipe: allow calculating size of overly large texture +- lavapipe: report allocation-error +- lavapipe: correct reported number of UBOs +- translate: reserve more vertex-shader outputs +- translate: assert that nr_elements is in range +- ci: Uprev piglit to 3351e8952 ("max-texture-size: report merged results") +- docs/features: document GL_ARB_ES3_2_compatibility support for zink +- docs/features: mark a few more extensions as done for zink +- zink: fix provoking-vertex cap for quads +- docs: promote #dri-devel on oftc over freenode +- docs: update link to #zink +- docs: update location of #panfrost +- docs: update link to #lima +- zink: simplify emit_load_const +- v3d: use helper to simplify things +- ci: downgrade sphinx to v3.x +- docs: update another IRC reference +- docs: update another IRC reference +- docs: drop clayton from intel-ci notice +- zink: use actual const for const offset +- lavapipe: handle cube-array image-views +- lavapipe: do not interpret cube-compatible as cubemap +- zink: only mark resources as cube-compatible if supported +- zink: mark 2d-arrays as cube-compatible +- 
zink: implement half-float packing +- zink: untangle have_EXT_debug_utils and ZINK_DEBUG_VALIDATION +- zink: add support for string-markers +- util/prim_restart: revert part of bad fix +- docs: quote a few defines +- docs: fix header-levels in envvars.rst +- docs: use file-role for paths +- docs: use envvar role for envvars +- docs: add the doc-comment for fse-vars +- docs: do not list all gles major versions +- docs: update list of apis to match website +- docs: update llvm requirement +- docs: rename vmware-guest article +- docs: clean up list of deprecated systems +- docs: move swrast to deprecated drivers list +- docs: clean up software-drivers list +- docs: clean up openswr links +- docs: split out layered driver to its own list +- docs: clean up freedreno links +- docs: add links to documented drivers +- r600: explicitly advertise index buffer format support +- zink: limit images we mark as cube-compatible +- zink: rename spirv_15 bool to spirv_1_4_interfaces +- zink: allow to specify any spir-v version to nir_to_spirv +- zink: calculate spir-v version based on vk version +- zink: only enable vote if we can support it +- zink: use a macro for spir-v versions +- st/pbo: use correct type for images and textures +- docs: update master -> main in edit-links +- zink/ci: increase piglit and deqp-runner timeouts +- llvmpipe: fix edge-rule logic for lines +- llvmpipe: consistently deal with post-rast state +- llvmpipe: fix multisample lines again +- llvmpipe: do not always use pixel-rounded coordinates for points +- zink/ci: re-enable test +- zink: reject more illegal blits +- zink: limit non-extension version feature to spirv 1.5 +- zink: use correct type for u_bit_scan +- zink: do not unmap dt-buffers twice +- zink: drop paranoid code +- zink: add missing compiler-dependency +- zink: drop some more vla usage +- zink: fix more initializer styles +- zink: introduce a define for max descriptors per type +- zink: use max-descriptor define +- zink: use alloca instead of 
hard-to-size vlas +- zink: correct type of flags to flush +- zink: fixup signedness of subtraction +- zink: remove unused function +- zink: drop repeated usage-bit +- zink: do not check buffer-format for usage-bits +- docs: remove outdated meson-section +- docs: remove outdated clarification +- docs: drop historic meson details +- docs: use more file-roles +- docs: use rst captions +- wgl: remove hard limit on pixelformats +- zink: drop unused macros +- zink: remove unused function-pointers +- zink: unbreak moltenvk code +- zink: remove unused moltenvk functions +- zink: do not store moltenvk functions in screen +- zink: remove some needless moltenvk details +- libgl-gdi: add missing include +- iris/ci: disable amly jobs +- aux/trace: fix bool argument +- zink: cast pointers to uintptr_t +- ci/windows: work around meson encoding issues +- ci/windows: enable msvc builds of zink +- ci/windows: fix zink msvc build-rules +- gallium/u_threaded: do not apply start twice +- ci: fix source-deps for radv on windows +- zink: hook up line-rasterization ext +- zink: use bit-allocation for boolean rasterizer-state +- zink: support line stippling +- zink: fill in the right line-mode based on state +- docs: update zink requirements +- llvmpipe: reject unsupported shader-image formats +- lavapipe: query formats for shader-image support +- llvmpipe: only report supported shader-image formats +- lavapipe: expose more storage-image features +- lavapipe: do not disable multisampling for smooth lines +- lavapipe: fix disable_multisample condition +- gallium: explicitly specify line rasterization mode +- draw: respect line_rectangular state +- llvmpipe: respect rectangular_lines +- lavapipe: re-expose line-rasterization extension +- lavapipe: expose strict-lines feature +- zink: implement support for non-planar DRM modifiers +- zink: remove duplicate format-mapping on little-endian +- vulkan: do not map zero-sized region of memory +- vulkan: allocate host-visible memory for swapchain 
images +- zink: check for right feature +- zink: respect line_rectangular state +- lavapipe: do not assert on more than 32 samplers +- lavapipe: do not mark unsupported tests as crashing +- d3d12: split up root parameter update and set + +Erik Kurzinger (1): + +- vulkan/device_select: avoid segfault on Wayland if wl_drm is unavailable + +Ernst Sjöstrand (1): + +- nv50: Fix use of initializers on older compilers + +Ezequiel Garcia (2): + +- panfrost: Add GPU IDs for G52 1-Core-2EE (RK3568/RK3566) +- panfrost: Rename G52 product ID 0x7402 as G52r1 + +Felix DeGrood (16): + +- intel: add L3 Bypass Disable to gen xml +- iris: Cache VB/IB in L3$ for Gen12 +- iris: reduce redundant tile cache flushes +- intel/blorp: remove tile flush from emit surface state +- intel/compiler: Use switch for DERIVATIVE_GROUP logic +- intel/compile: refactor DERIVATIVE_GROUP logic +- intel/compiler: tileY friendly LID order for CS +- intel/compiler: balanced tileY/linear friendly LID order for CS +- anv: Cache VB/IB in L3$ for Gfx12 +- anv: Add debug messages for DEBUG_PIPE_CONTROL +- anv: Clear all pending stall after pipe flush +- anv: Remove Tile Cache flush from SBA, Pipe Select +- anv: remove unnecessary Tile Cache flushes +- anv: Only flush Tile Cache on VK_ACCESS_HOST_R/W +- anv: Add ANV_PIPE_HDC_PIPELINE_FLUSH_BIT +- anv: Replace DC Flush with HDC Pipeline Flush + +Francisco Jerez (20): + +- intel/fs: Implement representation of SWSB cross-pipeline synchronization annotations. +- intel/fs: Add helper functions inferring sync and exec pipeline of an instruction. +- intel/fs: Represent SWSB in-order dependency addresses as vectors. +- intel/fs: Calculate SWSB cross-pipeline synchronization information. +- intel/fs: Use CHV/BXT implementation of 64-bit MOV_INDIRECT on XeHP+. +- intel/fs: Fix repclear assembly for XeHP+ regioning restrictions. +- intel/fs: Handle regioning restrictions of split FP/DP pipelines. +- intel/eu: Teach EU validator about FP/DP pipeline regioning restrictions. 
+- intel/compiler: Lower integer division on XeHP. +- intel/fs: Introduce lowering pass to implement derivatives in terms of quad swizzles. +- intel/fs: Add more efficient fragment coordinate calculation. +- iris/gen12: Work around push constant corruption on context switch. +- iris/gfx12: Invalidate ISP at the end of every batch. +- intel/fs/xehp: Assert that the compiler is sending all 3 coords for cubemaps. +- intel/fs: Track single accumulator in scoreboard lowering pass. +- intel/fs: Implement Wa_22012725308 for cross-pipe accumulator data hazard. +- intel/fs: Add SWSB dependency annotations for cross-pipeline WaR data hazards on XeHP+. +- intel/fs: Teach IR about EOT instruction writing the accumulator implicitly on TGL+. +- intel/fs: Fix synchronization of accumulator-clearing W/A move on TGL+. +- intel/fs: Implement Wa_14013745556 on TGL+. + +Georg Lehmann (11): + +- radv: Fix compatible image handle type for dmabufs. +- v3dv: use VKAPI_ATTR and VKAPI_CALL. +- zink: Add a missing VKAPI_ATTR. +- vulkan: Update the XML and headers to 1.2.180 +- radv: Implement VK_EXT_global_priority_query. +- ac: Check me_fw_feature for 32bit predication on gfx10.3 +- ac: Enable 32bit predication on gfx10. +- ac: Enable 32bit predication on gfx9 with fw feature version 52. +- lavapipe: Use common default allocator. +- lavapipe: Add a missing VKAPI_ATTR. +- vulkan/wsi/wayland: Add support for more SRGB formats. 
+ +Gert Wollny (40): + +- Revert "r600: don't set an index_bias for indirect draw calls" +- Revert "r600: Don't advertise support for scaled int16 vertex formats" +- r600: don't set an index_bias for indirect draw calls +- virgl: use pipe_draw_info::restart_index only when primitive_restart is enabled +- r600: update pipe_draw_info::restart_index only when primitive_restart is enabled +- nir/opt_algebraic: optimizations for add umax/umin with zero +- nir: Add filter callback for lower_to_scalar to the options +- gallium: pass lower_to_scalar_filter to lower_to_scalar pass +- r600/sfn: lower to scalar with filter applied +- mesa: add an extension MESA_bgra +- compiler/nir: check whether var is an input in lower_fragcoord_wtrans +- nir/linker: add option to ignore the IO precisions for better varying packing +- r600/sfn: Ignore precision when linking +- r600: don't put INTERP_X and INTERP_Z into one instruction group +- r600/sfn: Use valid pixel mode only in fragment shaders +- r600/sfn: Use valid pixel mode for SSBO and Image result fetches +- r600/sfn: force new CF if fetch through TC would be used in same clause +- r600/sfn: Lower FS pos input w-transform in NIR +- r600/sfn: Don't check the faction when searching for the input slot +- r600/sfn: count only distinct literals per instruction group +- r600/sfn: Fix Cayman trans ops +- r600/sfn: Use unified index register code for samplers +- r600/sfn: Use unified code path for index register load +- r600/sfn: Fix texture gather for Cayman +- r600/sfn: Fix ssbo/image atomic swap for Cayman +- r600/sfn: Fix Cayman SSBO write with more than one value +- r600/sfn: Fix Geometry shader for Cayman +- r600/sfn: read number of images from shader info +- r600/sfn: Fix cube query layer number for indirect access +- r600/sfn: Add lowering pass to legalize image access +- r600/sfn: legalize image access on Cayman +- r600: Enable NIR debug flags also for Cayman +- r600/sfn: don't designates initializers, since they are c++20 +- 
r600/sfn: don't read back unused image atomic result values +- r600/sfn: Drop method for emit_atomic_add, it is handled in generic code +- r600/sfn: Don't read return values of atomic ops that are not used +- r600/sfn: Clean up some ALU lowering and move code +- r600/sfn: Lower offset in TXF instructions +- virgl: Enable ASTC formats also for 3D textures +- r600/sfn: initialize all texture lower options + +Gustavo Padovan (10): + +- traces-iris: fix expectation for Intel GLK +- gitlab-ci: enable Intel AML-Y as experimental +- gitlab-ci: rule anchor for experimental devices as manual in MRs +- gitlab-ci: enable all 3 intel devices as manual in MR pipelines +- iris/ci: disable failing gimark test for now +- iris/ci: enable intel devices automatically in MR pipelines +- gitlab-ci: add python script to submit lava jobs +- gitlab-ci: enable testing on Intel Kaby Lake as experimental +- ci/lava: properly report test failure through sys.exit() +- ci/lava: do not save lava.yaml in the artifacts + +Hans-Kristian Arntzen (2): + +- radv: Take image alignment into account when allocating MUTABLE pool. +- radv: Allocate buffer list for MUTABLE descriptor types as well.
+ +Heinrich Fink (6): + +- softpipe: add missing sentinel to debug option array +- llvmpipe: unmap display target of shader image/sampler +- softpipe: unmap display target of shader sampler +- llvmpipe: do not leak map of display target in fs setup +- llvmpipe: do not leak display target mapped ptr in cs setup +- gbm/dri: Fix leaking bo memory on failure path + +Hoe Hao Cheng (15): + +- vulkan/util: generate vk_dispatch_table that combines all dispatch tables +- nir: define NIR_ALU_MAX_INPUTS +- zink: remove variable length arrays in ntv +- zink: introduce vk_dispatch_table +- zink/codegen: split commands into three groups +- zink/codegen: add zink_verify_*_extensions() +- zink: slight refactor of load_device_extensions() +- zink: use the dispatch tables +- zink/codegen: allow conditional enabling of instance extensions +- zink/codegen: clean the constructor of Extension up +- zink: do not fail when EXT_calibrated_timestamps is unavailable +- zink: move extension function verification to when it is used +- zink: zero-init structs with ISO C +- zink: standardize zero-init code style +- zink: make codegen compatible with python 3.5 + +Hubert Jasudowicz (1): + +- docs/egl: Add missing backticks + +Hyunjun Ko (6): + +- turnip: prep work for timeline semaphore support +- turnip: Implement VK_KHR_timeline_semaphore. +- turnip/kgsl: Fix to build on android. 
+- turnip: add missing VKAPI_ATTR/CALL +- turnip: Copy command buffers to deferred submit request +- turnip/kgsl: new flag TU_USE_KGSL + +Iago Toral Quiroga (118): + +- v3dv: avoid redundant BO job additions for textures and samplers +- v3dv: avoid redundant BO job additions for UBO/SSBO +- v3dv: avoid redundant BO job additions for spill / shared BOs +- v3dv: optimize a few cases of BO job additions +- v3dv: use a bitfield to implement a quick check for job BO tracking +- v3dv: fix descriptor set limits +- v3dv: fix array sizes when tracking BOs during uniform setup +- v3dv: don't use a dedicated BO for each occlusion query +- v3dv: fix sRGB blending workaround +- v3dv: improve dirty descriptor set state tracking +- v3dv: dirty viewport doesn't affect fragment shaders +- v3dv: better tracking of dirty push constant state +- vulkan/wsi: give drivers the option to decide if they need to blit +- v3dv: implement wsi hook to decide if we can present directly on device +- compiler/nir: add a divergence analysis option for non-uniform workgroup id +- v3dv: choose a larger CSD supergroup size if possible +- broadcom/compiler: track if a shader has control barriers in prog_data +- v3dv: limit supergroup size in presence of TSY barriers +- broadcom/common: move CSD supergroup sizing to a common helper +- v3d: choose a larger CSD supergroup size if possible +- broadcom/compiler: add a loop unrolling pass +- v3dv: setup loop unrolling +- v3d: move NIR compiler options to GL driver +- broadcom/compiler: add a compiler strategy to disable loop unrolling +- broadcom/compiler: refactor compile strategies +- broadcom/compiler: specify maximum thread count in compile strategies +- v3d: enable NIR loop unrolling +- v3d: re-enable GLSL loop unrolling +- broadcom/compiler: change register allocation policy for accumulators +- broadcom/compiler: move vertex shader output handling to its own function +- broadcom/compiler: implement non-uniform offset on vertex outputs +- 
broadcom/compiler: make vir_VPM_WRITE_indirect handle non-uniform offsets +- broadcom/compiler: enable PIPE_SHADER_CAP_INDIRECT_OUTPUT_ADDR +- broadcom/compiler: don't use nir_src_is_dynamically_uniform +- v3dv: don't lower indirect derefs on output variables +- broadcom/compiler: don't unroll due to indirect indexing of outputs +- v3d: disable GLSL loop unrolling again +- broadcom/compiler: clarify PIPE_SHADER_CAP_INDIRECT_INPUT_ADDR setting +- broadcom/compiler: don't emit TLB loads for components that don't exist +- broadcom/compiler: consider RT component size when lowering logic ops in Vulkan +- broadcom/ci: update fail list for v3dv +- v3d: take TLB blit framebuffer dimensions from smallest surface dimensions +- v3dv: implement VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_POINT_CLIPPING_PROPERTIES +- v3dv: fix texture_size() +- v3dv: allow creating uncompressed views from compressed images and vice versa +- v3dv: expose VK_KHR_maintenance2 +- v3dv: define V3D_MAX_BUFFER_RANGE +- v3dv: implement VK_KHR_maintenance3 +- v3dv: implement VK_KHR_bind_memory2 +- v3dv: implement VK_KHR_get_memory_requirements2 +- v3dv: keep track of whether an image may be backed by external memory +- v3dv: implement VK_KHR_dedicated_allocation +- v3dv: trivially handle VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO_KHR +- v3dv: add v3dv_GetImageSparseMemoryRequirements back +- v3dv: implement vkCmdDispatchBase +- v3dv: create a helper for image creation +- v3dv: implement interactions of VK_KHR_device_group with VK_KHR_swapchain +- v3dv: implement VK_KHR_device_group +- v3dv: don't keep an open file descriptor for imported fences/semaphores +- v3dv: implement external semaphore/fence extensions +- v3dv: increase number of supported SSBOs +- v3dv: expose KHR_relaxed_block_layout +- v3dv: document VK_KHR_relaxed_block_layout as implemented +- v3dv: expose VK_KHR_storage_buffer_storage_class +- v3dv: refactor descriptor updates +- v3dv: implement VK_KHR_descriptor_update_template +- v3dv: fix
incorrect render area setup +- v3dv: expose KHR_variable_pointers +- v3dv: don't lower vulkan resource index result to scalar +- v3dv: implement VK_KHR_get_display_properties2 +- v3dv: handle Vulkan 1.1 feature and property queries +- v3dv: don't support VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT +- Revert "v3dv: allow creating uncompressed views from compressed images and vice versa" +- v3dv: expose VK_KHR_shader_non_semantic_info +- v3dv: implement VK_EXT_index_type_uint8 +- v3dv: implement vkCmdBlitImage2KHR +- v3dv: implement vkCmdCopyBuffer2KHR +- v3dv: implement vkCmdCopyBufferToImage2KHR and vkCmdCopyImageToBuffer2KHR +- v3dv: implement vkCmdCopyImage2KHR +- v3dv: implement vkCmdResolveImage2KHR +- v3dv: expose VK_KHR_copy_commands2 +- v3dv: remove const qualifier for resource pointer in view objects +- broadcom/compiler: implement nir_intrinsic_load_subgroup_id correctly +- broadcom/compiler: lower nir_intrinsic_load_num_subgroups +- broadcom/compiler: add FLAFIRST and FLNAFIRST opcodes +- broadcom/compiler: implement more subgroup intrinsics +- broadcom/compiler: add a ntq_emit_cond_to_bool helper +- broadcom/compiler: add a set_a_flags_for_subgroup helper +- broadcom/compiler: track if a compute shader uses subgroup functionality +- broadcom/util: don't use compute supergroup packing with subgroups +- v3dv: expose correct subgroup size +- v3dv: expose support for basic subgroup operations +- broadcom/compiler: use nir_sort_variables_with_modes +- v3dv: account for dst offset of copy query results operations +- v3dv: always free pipeline stages after compiling +- v3dv: extend broadcom stages to include geometry +- v3dv: define a generic helper to create binning pipeline stages +- v3dv: add a few more broadcom shader stage helpers +- broadcom/compiler: track if geometry shaders write gl_PointSize +- v3dv: add support for geometry shaders to pipelines +- broadcom/compiler: create a helper for computing VPM config +- v3dv: emit state packets for geometry 
shaders +- v3dv: handle QUNIFORM_FB_LAYERS +- v3dv: fix copy buffer to image TFU path for 3D images +- broadcom/compiler: handle compact input arrays for geometry shaders +- broadcom/compiler: don't ignore constant offset on per-vertex input loads +- v3dv: implement layered attachment clears +- v3dv: remove fallback path for vkCmdClearAttachments +- v3dv: remove deferred vkCmdClearAttachments path +- broadcom/ci: update expected fails for v3dv after enabling geometry shaders +- v3dv: expose geometry shaders +- v3dv: fix push constant range for texel buffer copy pipelines +- v3dv: implement layered texel buffer copies using a geometry shader +- v3dv: allow batching texel buffer copies for 3D images +- v3dv: use defines for push constant offsets used by texel buffer copy shaders +- v3d: better scissor tracking +- broadcom/compiler: implement gl_PrimitiveID in FS without a GS +- v3dv: remove more dead clearing code + +Ian Romanick (49): + +- tgsi_exec: Fix NaN behavior of saturate +- tgsi_exec: Fix NaN behavior of min and max +- ci: Uprev piglit to b3a9fa345 ("framework/replay: Quote resource names before signing") +- tgsi_exec: Use C99 functions for min and max instead of open coding +- gallivm: Fix NaN behavior of min and max +- gallivm: Use range analysis to generate better fmin and fmax code +- gallivm: Use GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN for norm clamping +- gallivm: Remove unused GALLIVM_NAN_RETURN_NAN +- nir/algebraic: Remove some optimizations of comparisons with fsat +- nir/algebraic: Tautology replacements require sources be numbers +- nir/algebraic: Invert comparisons less often +- nir/algebraic: Equality comparison inversions require sources be numbers +- nir/algebraic: Mark some more comparison reductions exact +- nir/algebraic: Mark some more logic-joined comparison reductions as exact +- nir/algebraic: Rearrange some logic-joined comparisons and reduce +- nir/algebraic: Add algebraic opt for float comparisons with identical operands.
+- util/format: Delete trailing whitespace +- dri: Fix typo before __DRI_IMAGE_COMPONENTS defines +- egl/dri2: Rely on drm-uapi for DRM_FORMAT defines +- mesa/st: Always call st_nir_lower_tex_src_plane if samplers were lowered +- nir/lower_tex: Add support for lowering Y41x formats +- util/format: Add Y41x formats +- egl/dri2: Add Y41x formats +- gallium/dri: Add Y41x formats +- util/format: Add Y21x formats +- egl/dri2: Add Y21x formats +- gallium/dri: Add Y21x formats +- intel/isl: Add mappings for PIPE_FORMAT_R8G8_R8B8_UNORM and PIPE_FORMAT_G8R8_B8R8_UNORM +- mesa: Add R8G8_R8B8 and G8R8_B8R8 formats +- nir/lower_tex: Add support for lowering YUYV formats +- gallium/dri: Allow use of R8G8_R8B8 for YUYV and G8R8_B8R8 for UYVY +- iris: Return correct enum names from fmt_swizzle +- iris: Silence warnings about implicit enum type conversions +- mesa/st: Don't assert !unify_interfaces in the passthrough edge flags case +- radeonsi: Use util_cpu_caps to detect number of CPUs +- util: Zero out all of mask in util_set_thread_affinity +- util: Change order of PIPE_OS_UNIX code in util_cpu_detect_once +- util: Trivial cleanup in the BSD code of util_cpu_detect_once +- util: Fix setting nr_cpus on some BSD variants +- util: Set util_cpu_caps.num_cpu_mask_bits based on total CPUs in the system +- util: Use maximum number of CPUs for determining cache topology +- util: Consider CPU affinity when detecting number of CPUs +- v3d: ci: Add KHR-GLES31.core.shader_image_load_store.basic-glsl-earlyFragTests to flakes +- intel/compiler: Add the ability to defer IP updates in backend_instruction::remove +- intel/compiler: Add cfg_t::adjust_block_ips() method +- intel/compiler: Update block IPs once in dead_code_eliminate +- intel/compiler: Update block IPs once in register_coalesce +- intel/compiler: Update block IPs once in opt_cmod_propagation +- nir/gcm: Clear out pass_flags before starting + +Icecream95 (38): + +- panfrost: Assert staging resource allocation was successful +- 
panfrost: Unset shared/scanout binding flags for staging resources
+- pan/bi: Skip nir_opt_move/sink for blend shaders
+- panfrost: Fix shader texture count
+- pan/decode: Allow frame shader DCDs to be in another BO than the FBD
+- pan/decode: Print errors when closing dump file
+- pan/mdg: Fix calculation of available work registers
+- panfrost: Remove incorrect comment
+- panfrost: Fix viewport scissor for preload draws
+- panfrost: Split panfrost_batch_submit to prevent stack overflows
+- pan/bi: Add "lane_dest" modifier
+- pan/bi: Replace lane0 modifier with lane_dest for load instructions
+- panfrost: Make pan_select_crc_rt a non-static function
+- panfrost: Always write reloaded tiles when making CRC data valid
+- pan/bi: Add two tuples to a clause when needed with NOSCHED
+- panfrost: Skip blit shader labelling if the buffer has no space
+- panfrost: Fix polygon list size computations
+- pan/mdg: Try scheduling load/store ops in pairs
+- pan/decode: Flush the dump stream after decoding
+- panfrost: Call abort() when aborting on fault
+- panfrost: Use first_tiler to check if tiling is needed
+- pan/mdg: Add a bundle ID to instructions
+- pan/mdg: Reorder some code in mir_spill_register
+- pan/mdg: Fill from TLS before spilling non-SSA nodes
+- pan/mdg: Fix reading a spilt register in the bundle it's written
+- pan/mdg: Add 16 bytes of padding to the end of shaders
+- panfrost: Don't set dirty_mask for constant buffers
+- pan/bi: Create a mask of UBOs that need to be uploaded
+- pan/mdg: Create a mask of UBOs that need to be uploaded
+- panfrost: Only upload UBOs when needed
+- panfrost: Set bound dimensions to framebuffer size
+- Revert "panfrost: Fix crc_valid condition"
+- panfrost: Always use a fragment shader when alpha test is enabled
+- panfrost: Fix GPU ID for t76x in get_perf_config
+- panfrost: Fix full_threads calculation on v6
+- pan/bi: Create a nop clause when the shader starts with ATEST
+- panfrost: Initialise the blend equation in create_blend_state
+- pan/mdg: Analyze helper termination after scheduling
+
+Ilia Mirkin (29):
+
+- nv50/ir: offset accesses to shared memory
+- nv50/ir: refine limitation on load/store loading offsets, include atomics
+- nv50/ir: "zero" register does not work with g[] memory
+- nv50/ir: mark ATOM as having 3 arguments
+- nv50/ir: wipe any info about memory when seeing a locking op
+- nv50/ir: optimize shift of 0 bits
+- nv50: pass surface/buffer parameters to shader via aux buffer
+- nv50/ir: add surface op lowering
+- nv50/ir: add lowering for shared atomics
+- nv50: add compute invocations counter
+- nv50: add remapping of buffers/images into unified space
+- nv50: add support for doing membars
+- nv50: add indirect compute support
+- intel: fix MI builder for pre-gen7
+- nv50: fix streamout queries
+- nvc0: fix 3d images
+- vdpau: allow state tracker to report a lower number of macroblocks
+- nouveau: improve video limit reporting
+- st/mesa: avoid enabling image/buffer/compute extensions for weak hardware
+- mesa: relax ES 3.1 compute shader requirements
+- st/mesa: properly encode OES_geometry_shader requirement
+- mesa/get: allow image/buffer/atomic variables to be fetched in es3.1
+- st/mesa: allow hardware to claim ES 3.1 without hw indirect draws
+- nv50: expose images/buffers/compute
+- nv50: expose GL ES 3.1 for nva3+ hardware
+- mesa: always expose NV_image_formats and OES_shader_image_atomic
+- mesa: also flush after compute dispatch when debug flag enabled
+- nv50: use the no-mipmap texture type for 2d ms views
+- st/mesa: always report the max samples as supported
+
+Ishi Tatsuyuki (1):
+
+- radv: ignore redundant variable descriptor counts (v2)
+
+Italo Nicola (28):
+
+- pan/mdg: fix midgard writemask encoding for stores
+- util: add util_sign_extend
+- pan/mdg: clean up redundant/unused variables in disassemble.c
+- pan/mdg: rename dest_override to shrink_mode
+- pan/mdg: improve outmod printing
+- pan/mdg: refactor mir_pack_swizzle
+- pan/mdg: add proper expand_mode enum
+- pan/mdg: encode/decode expand_mode properly
+- pan/mdg: add midgard_src_expand_mode validation
+- pan/mdg: improve input modifier printing
+- pan/mdg: improve swizzle decoding
+- pan/mdg: fix/change ALU opcodes descriptions and add some missing ops
+- pan/mdg: stop querying datatype by reading opcode name
+- pan/mdg: print input data type for ALU opcodes
+- pan/mdg: stop using size disambiguation suffixes
+- pan/mdg: fix midgard.h indentation
+- pan/mdg: improve mask decoding
+- pan/mdg: remove register prefixes
+- pan/mdg: print special alu arg outmods
+- pan/mdg: misc cleanups
+- pan/mdg: add helpers for load/store special read regs
+- pan/mdg: improve ldst opcode names and add missing ops
+- pan/mdg: print names of non-work registers
+- pan/mdg: properly encode/decode ldst instructions
+- pan/mdg: improve tex opcode decoding and add missing ops
+- panfrost/ci: Improve coverage for T860
+- virgl: implement EXT_multisampled_render_to_texture
+- panfrost: fix GL_EXT_multisampled_render_to_texture regression
+
+Iván Briano (2):
+
+- intel/nir: Fix txs for null surfaces
+- anv: fix feature/property/sizes reported for fragment shading rate
+
+James Jones (18):
+
+- gbm: Remove stat and refcount fields from gbm_device
+- gbm: Inline load_backend function content
+- gbm: Create device directly in find_backend
+- gbm: Consolidate env var and default backend loops
+- gbm: Give getenv backend override its own function
+- gbm: Give gbm_device a reference to its backend
+- gbm: Add gbm_core struct to export code to backends
+- gbm: Move majority of gbmint.h to gbm_backend_abi.h
+- gbm: Version the GBM backend interface
+- gbm: Add backend ABI-check test
+- gbm: Rename backend description list to builtin_backends
+- loader: Factor out driver library loading code
+- meson: Add a GBM backends search path build option
+- gbm: Rename the DRI backend from gbm_dri.so to dri
+- gbm: Put common device creation in a helper function
+- gbm: Support dynamically loading named backends
+- gbm: Load backend based on DRM device driver name
+- loader: Handle failure to load DRI driver library
+
+James Park (14):
+
+- meson: Fix winflexbison warnings
+- ac/surface: Move drm_fourcc.h to common header
+- radv: Use ac_drm_fourcc.h
+- meson: Add wrap for libelf on Windows
+- meson: Disable libdrm for RADV on Windows
+- meson: Disable MSVC warning 5105
+- amd: Fix warnings around variable sizes
+- radv: Add _WIN32 guard in radv_check_gpu_hangs
+- radv: Fix unused label warning on Windows
+- radv: Add on WIndows for missing close()
+- draw/clip: Use NAN to make MSVC happy
+- llvmpipe: Remove stray ## operator for MSVC
+- ci: Update Windows image to build RADV
+- vulkan: Support 32-bit "weak" symbols on MSVC
+
+Jan Beich (1):
+
+- anv: adjust headers for non-GNU after e9e1e0362b6c
+
+Jason Ekstrand (139):
+
+- intel/compiler: Don't insert barriers for NULL sources
+- anv: Use the same re-order mode for streamout as for GS
+- vulkan: Update the XML and headers to 1.2.177
+- anv: Implement VK_EXT_provoking_vertex
+- gallium: Add a transcode_astc driconf option
+- intel/isl: There are seven aux states
+- intel/isl: Fix isl_color_value_unpack to match the prototype
+- intel/eu: SVB writes only happen on Gen6
+- intel/fs: Stop using brw_dp_read/write_desc in Gen7+ only code
+- intel/eu: Set message subtype properly for SIMD8 FB fetch
+- intel/fs: Don't use pixel_z for Gen4-5 source_depth_to_render_target
+- intel/nir: Set lower txs with non-zero LOD
+- nir/builder: Move clamp helpers to nir_builder.h
+- anv: Check offset instead of alloc_size for freeing surface states
+- anv: Allow storage on all formats that support typed writes
+- anv: Plumb the shader into push constant helpers
+- anv: Support pushing shader constants
+- anv: Push at most 32 regs for vec4 shaders
+- intel/vec4: Don't spill fp64 registers more than once
+- intel/vec4: Add some asserts to move_push_to_pull
+- intel/vec4: Update nr_params in pack_uniform_registers
+- intel/vec4: Set up push ranges before we emit any code
+- intel/vec4: Add support for masking pushed data
+- intel/vec4: Add support for UBO pushing
+- nir: Add a nir_instr_move helper
+- nir/gather_info: Expose a nir_intrinsic_writes_external_memory helper
+- nir: Add a discard optimization pass
+- intel/fs: Handle non-perspective-correct interpolation on gen4-5
+- intel/nir,i965: Move HW generation check for UBO pushing to i965
+- intel/vec4: Also use MOV_FOR_SCRATCH for swizzle resolves
+- intel/isl: Fix isl_format_is_valid
+- intel/fs/ra: Fix payload node setup for SIMD16 on Gen4-5
+- ttn: Stop manually managing system_values_read
+- anv: Require softpin on Gen8+
+- anv: Make use_softpin compile-time in genX code
+- anv: Handle OOM in the pinned path in anv_reloc_list_add
+- anv: Add a helper to add a BO to the batch list without a reloc
+- anv: Make anv_batch_emit_reloc inline and optimize SKL+
+- anv: Fast-path surface relocs when we have softpin
+- anv: Optimize anv_address_physical when ANV_ALWAYS_SOFTPIN
+- anv/blorp: Optimize addresses/relocations when ANV_ALWAYS_SOFTPIN
+- iris: Use isl_surf_get_image_surf instead of hand-rolling it
+- iris: Move target_to_isl_surf_dim to iris_resource.c
+- intel/isl: Add a isl_surf_get_image_offset_B_tile_el helper
+- intel/blorp: Use isl_surf_get_image_offset_B_tile_el in ccs_ambiguate
+- intel/isl: Make the offset helpers four dimensional
+- intel/isl: Make tile logical extents four dimensional
+- intel/isl: Use a 4D physical total extent for size calculations
+- i965: Use nir_lower_passthrough_edgeflags
+- anv: Agressively no-op Flush/InvalidateMappedMemoryRanges
+- docs: Begin documenting ISL
+- isl: Document more members of isl_surf
+- docs/isl: Document ISL's units
+- docs/isl: Add detailed documentation about isl formats
+- docs/isl: Add detailed documentation about tiling on Intel GPUs
+- docs/isl: Add detailed documentation about CCS compression
+- util: Move the 4x4 matrix inverse function to u_math
+- crocus: Drop extra_aux support
+- nir,amd: Suffix nir_op_cube_face_coord/index with _amd
+- nir,panfrost: Suffix fsat_signed and fclamp_pos with _mali
+- nir,vc4: Suffix a bunch of unorm 4x8 opcodes _vc4
+- vulkan: Update the XML and headers to 1.2.182
+- nir: Require vectorized ALU ops to be all-or-nothing
+- nir,docs: Add docs for NIR ALU instructions
+- nir: Document all the ALU opcodes
+- docs,isl: Document Sandy Bridge HiZ/stencil
+- editorconfig: Use 3-space tabs for .rst
+- docs/nir: Use 3-space tabs
+- docs/isl: Consistently use 3-space tabs
+- spirv: Create acceleration structure and shader record variables
+- anv: Add minimal boilerplate for VK_KHR_acceleration_structure
+- anv: Add stub support for acceleration structures
+- anv: Add support for binding acceleration structures
+- anv: Add minimal boilerplate for VK_KHR_ray_tracing_pipeline
+- anv: Get ready for more pipeline stages
+- anv: Add a ray-tracing pipeline object
+- anv: Add support for binding ray-tracing pipelines
+- anv,iris: Move the SHADER_RELOC enums to brw_compiler.h
+- intel/compiler: Generalize shader relocations a bit
+- intel/compiler: Add a U32 reloc type
+- intel/fs: Add support for compiling bindless shaders with resume shaders
+- intel/rt: Use reloc constants for the resume SBT
+- anv: Disallow UBO pushing for bindless shaders
+- nir/apply_pipeline_layout: Handle bindless shaders
+- anv: Support fetching descriptor addresses from push constants
+- anv: Compile ray-tracing shaders
+- anv: Compile trivial return and trampoline shaders
+- intel/fs: Don't pull CS push constants if uses_inline_data
+- anv: Create and return ray-tracing pipeline SBT handles
+- anv: Compute scratch sizes for ray-tracing pipelines and shader groups
+- anv: Add support for vkCmdSetRayTracingPipelineStackSizeKHR
+- anv: Allow _anv_combine_address with a NULL batch
+- anv: Make anv_address::offset 64-bit
+- anv: Implement vkCmdTraceRays and vkCmdTraceRaysIndirect
+- isl: Assert some iris invariants in isl_surf_get_ccs_surf
+- isl: Take a hiz_or_mcs_surf in isl_surf_supports_ccs
+- isl,iris: Move the extra_aux_surf logic into iris
+- isl,docs: Add a chapter on AUX state tracking
+- docs/isl: Improve the bit[6] swizzling section of the tiling chapter
+- include/drm-uapi: bump headers
+- anv: Claim to be a discrete GPU if has_lmem
+- util: Add an implementation of qsort_r for non-GNU platforms
+- nir: Add a function for sorting variables
+- intel/genxml: Add SURFTYPE_SCRATCH on GFX version 12.5
+- intel/isl: Add support for scratch buffers
+- intel/fs: Implement spilling on XeHP
+- intel/fs: Implement load/store_scratch on XeHP
+- intel/genxml: Add new ScratchSpaceBuffer fields on GFX version 12.5
+- iris: Add a MEMZONE_BINDLESS and uploader
+- iris: Add support for scratch on XeHP
+- anv: Add support for scratch on XeHP
+- intel/genxml: Remove old scratch fields on GFX version 12.5
+- iris/bufmgr: Stop changing mapping modes on buffers
+- intel/devinfo: Add a has_lsc bit
+- intel/compiler: Add LSC to messages brw_ir_performance
+- intel/fs: Lower uniform pull constant load message to LSC dataport
+- docs/isl/tiling: Fix swizzle pattern for X-tiling
+- intel/isl: Pull the uncompressed surface view code from anv
+- intel/blorp: Adjust the compressed copy rectangle before convert_to_single_slice
+- intel/blorp: Use isl_surf_get_uncompressed_surf
+- intel/isl: Add more cases to isl_surf_get_uncompressed_surf
+- iris: Don't leak the surface if uncompressed re-interp fails
+- iris: Use isl_surf_get_uncompressed_surf
+- nir: Drop nir_ssa_def::name and nir_register::name
+- android: Drop the Android.mk build system
+- android: Restore android/Android.mk
+- nir/lower_subgroups: Pad ballot values before bitcasting
+- docs: Add docs for running a local Mesa build
+- mailmap: Update for Emma's new e-mail address
+- Convert a few files to UTF-8
+- mailmap: Add two more lines for Alyssa Rosenzweig
+- glsl: Delete lower_texture_projection
+- anv/allocator: Use list->u64 in free_list_push
+- iris: Re-emit MEDIA_VFE_STATE for variable group size shaders
+- anv: Handle errors properly in anv_i915_query
+- intel: Pull anv_i915_query into common code
+- anv: Use intel_i915_query_alloc for memory regions
+- iris: Use intel_i915_query for meminfo
+- nir/lower_tex: Rework invalid implicit LOD lowering
+
+Jeremy Huddleston (2):
+
+- libgl-xlib: Set darwin-versions
+- libgl-xlib: Add missing dep_x11 dependency
+
+Jeremy Newton (1):
+
+- Update libva requirement
+
+Jesse Natalie (44):
+
+- microsoft/spirv_to_dxil: Lower samplers from deref to index
+- microsoft/spirv_to_dxil: Lower loads/stores to DXIL
+- microsoft/compiler: Support raw SRVs/UAVs through dxil_module_get_res_type
+- microsoft/compiler: Support arrays of UBOs
+- microsoft/compiler: Emit CBVs via variables for Vulkan
+- microsoft/compiler: Emit SSBO variables
+- microsoft/compiler: Split Vulkan resource_index / descriptor processing
+- microsoft/compiler: Better support UBO/SSBO references to descriptors
+- microsoft/compiler: Store nir_shader in the ntd_context
+- microsoft/compiler: Support raw SRVs in addition to typed SRVs
+- microsoft/compiler: Propagate access when lowering SSBO loads
+- microsoft/clc: If local size isn't specified either in the shader or at runtime, set it to (1,1,1)
+- gallium: Define PIPE_ARCH_AARCH64 for MSVC arm64 builds
+- nir: Fix MSVC warning C4334 (32bit shift cast to 64bit)
+- d3d12: Fix MSVC warning C4334 (32bit shift cast to 64bit)
+- microsoft/clc: Fix MSVC unreferenced variable warnings
+- microsoft/clc: Fix undeclared function warning
+- microsoft/compiler: Fix MSVC warning C4334 (32bit shift cast to 64bit)
+- shader_enums: Fix MSVC warning C4334 (32bit shift cast to 64bit)
+- gallium/aux: Fix MSVC warning C4334 (32bit shift cast to 64bit)
+- llvmpipe: Fix MSVC warning C4334 (32bit shift cast to 64bit)
+- xmlconfig: Fix MSVC warning C4334 (32bit shift cast to 64bit)
+- CI: Windows: Bump warning level to W3 (except for zlib)
+- microsoft/compiler: Remove hardcoded limits on numbers of resource arrays
+- microsoft/compiler: Remove assert-only resource size or usage tracking
+- microsoft/compiler: For Vulkan environment, don't create resource handles upfront
+- vtn: Propagate access data that's present on all struct members to the struct itself
+- vtn: Propagate access data from UBO/SSBO/push constant types to variables of that type, not just their pointers
+- nir: Rename nir_lower_cl_images_to_tex, replace 'cl' with 'readonly'
+- nir_lower_readonly_images_to_tex: Support non-CL semantics
+- nir_lower_readonly_images_to_tex: Use nir_shader_lower_instructions
+- microsoft/compiler: Treat read-only SSBOs as SRVs
+- microsoft/spirv_to_dxil: Treat read-only storage images as SRVs
+- d3d12, microsoft/compiler: Use SRV/sampler variable binding data
+- microsoft/compiler: Rewrite sampler splitting pass to be smarter and handle derefs
+- microsoft/compiler: Fix function signature for bufferStore to support overloads
+- microsoft/compiler: Map descriptor set -> binding space
+- microsoft/compiler: Handle unbounded arrays
+- llvmpipe: Fix optimization loop to actually loop
+- nir: Add relaxed 24bit opcodes
+- vtn: Use relaxed 24bit opcodes for CL 24bit math
+- microsoft/compiler: Change behavior for emitting inexpressible barriers
+- nir_lower_readonly_images: Clear variable data when changing the type
+- mesa/main: Check for fbo attachments when importing EGL images to textures
+
+John Bates (1):
+
+- add execmem build option
+
+Jonathan Marek (4):
+
+- freedreno/registers: define REG_DSI_CPHY_MODE_CTRL
+- tu: remove workaround for conditional rendering + hw binning
+- freedreno/a6xx: larger gmem_page_align from tile align instead of gpu id
+- freedreno/common: unhardcode CCU color cache offset
+
+Jordan Justen (20):
+
+- Revert "intel/compiler: Silence unused parameter warning in update_inst_scoreboard"
+- intel/eu: Allow 64-bit registers on XeHP.
+- intel/fs: Disable 3-src immediates on XeHP.
+- intel/fs: End computer shader with message gateway on XeHP.
+- intel/compiler: Lower txd for 3D samplers on XeHP.
+- intel/compiler: Fix INTEL_DEBUG=hex
+- commit_in_branch_test.py: Rename branch master to main
+- bin/pick: Rename master branch to main
+- .gitlab-ci.yml: Use main branch for gitlab ci
+- issue_templates/Bug Report: Rename master branch to main
+- docs/releasing.rst: Rename master branch to main
+- docs: Rename master branch to main
+- mesa: NOTE! Default branch is now main
+- intel/isl: Add Wa_22011186057 to disable CCS on ADL GT2 A0
+- intel/dev: Add device info for ADL GT2
+- intel: Add 2 ADL-S pci-ids
+- intel/gen125.xml: Drop GPGPU_WALKER
+- intel/devinfo: Add has_local_mem
+- iris/bufmgr: Align vma addresses to 64K for local memory
+- intel/dev: Set has_local_mem for DG1
+
+Jose Maria Casanova Crespo (5):
+
+- v3d: YUV formats at is_dmabuf_modifier_supported are external_only
+- v3d: YUV formats at query_dmabuf_modifiers are external_only
+- v3d: DRM_FORMAT_MOD_BROADCOM_SAND128 only available for NV12 format.
+- ci/v3d: Update piglit expectations.
+- v3d/driconf: Expose non-MSAA texture limits for mutter and gnome-shell
+
+Joshua Ashton (5):
+
+- radv: Handle unnormalized samplers in YCbCr lowering
+- venus: Fix zero-initialized fd causing apps to hang/crash
+- driconf: Add more workarounds for Teardown
+- llvmpipe: Handle NULL views in llvmpipe_cleanup_stage_sampling
+- lavapipe: Use common Vulkan format helpers
+
+Joshua Watt (1):
+
+- v3d, vc4: Fix dmabuf import for non-scanout buffers
+
+José Fonseca (8):
+
+- lavapipe: Fix lvp_execute_cmds' pipe_stream_output_target leak.
+- lavapipe: Fix lvp_pipeline_compile's nir_xfb_info leak.
+- wgl: Remove opengl32.mingw.def.
+- draw: Allocate extra padding for extra shader outputs.
+- draw: Plug leak when combining tessellation with primitive assembly.
+- d3d10umd,d3d10sw: Initial import.
+- d3d10sw: Add a sanity test.
+- d3d10umd: Avoid duplication in CreateDevice.
+
+Juan A. Suarez Romero (42):
+
+- v3dv: avoid dereferencing null value
+- ci: support KHR-GLES testing
+- ci/v3d: add KHR-GLES test jobs
+- ci/llvmpipe: run KHR-GLES2.* tests
+- ci/softpipe: run KHR-GLESxx tests
+- iris: hook up memory object creation from handle
+- iris: hook up resource creation from memory object
+- iris: enable GL_EXT_memory_object feature
+- ci/broadcom: update expected results
+- ci/vc4: add KHR-GLES2.* job test
+- ci/broadcom: add EGL testing jobs
+- v3dv: check returned values
+- ci/v3d: execute all piglit tests
+- v3dv/pipeline_cache: bail out in case of error
+- ci/v3d: fix typo in job name
+- ci/v3dv: update flakes
+- ci/baremetal: propagate ASAN_OPTIONS to devices
+- ci/broadcom: update expected results
+- v3d: rename header include guards
+- v3d: rename VC5 enums and definitions
+- broadcom/qpu: rename from VC5 to V3D
+- broadcom/simulator: change references to VC5
+- v3dv: rename VC5 to V3D
+- v3dv: check dest bitsize in color blit
+- util/hash_table: do not leak u64 struct key
+- ci/broadcom: update expected results
+- v3d: fix resource leak in error path
+- st/mesa: fix pipe resource leak
+- broadcom/compiler: fix dynamic-stack-buffer-overflow error
+- ci: Update VK-GL-CTS to 1.2.6.1
+- ci/broadcom: update expected results
+- vc4: initialize array
+- ci/v3dv: update expected results
+- ci/broadcom: unset manual jobs
+- ci/v3dv: test v3dv in arm64 environment
+- broadcom/ci: Report flakes on IRC
+- ci/vc4: update piglit failures
+- ci: update VK-GL-CTS to 1.2.6.2
+- broadcom/compiler: emit TMU flush before a jump
+- v3dv: assert job->cmd_buffer is valid
+- broadcom: remove v3dv3 from neon library
+- gallium/hud: initialize query
+
+Kai-Heng Feng (1):
+
+- iris: Avoid abort() if kernel can't allocate memory
+
+Karol Herbst (18):
+
+- clover/llvm: handle Fixed vs Scalable vectors explicitly starting with llvm-11
+- util/format: fix value declarations for big endian
+- nv50/query: fix stringop-overflow gcc warning
+- nvc0: fix implicit-fallthrough gcc warning
+- clover/memory: fix data race in buffer subclasses
+- nouveau: fix race in nouveau_screen_get_name
+- nouveau/mm: pass mm_bucket to mm_slab_new
+- nouveau/mm: remove unused nouveau_mm_allocation.next field
+- nv50/ir: when constant folding shl(mul, a) we need to copy muls type
+- nv50/ir: don't optimize shl(mul_hi, a) to mul_hi
+- nv50/ir/ra: fixes upcoming barrier file
+- nv50/ir: add barrier and thread_state files
+- gv100/ir: add support for barrier thread state files for OP_CVT
+- gm107/ir: emit barrier sources for quadon/pop
+- gv100/ir: fix quadop/pop lowering
+- nv50/ir: fix surface lowering when values get shared accross operations
+- nv50/ir/nir: fix smem size for GL
+- nv30: fix emulated vertex index buffers
+
+Keith Packard (1):
+
+- vulkan/x11: Mark present complete using serial instead of MSC
+
+Kenneth Graunke (29):
+
+- iris: only flush the render cache for aux changes, not format changes
+- isl: Work around NVIDIA and AMD display pitch requirements
+- i965: Don't advertise Y-tiled modifiers for scanout buffers on Gfx8-
+- iris: Don't advertise Y-tiled modifiers for scanout buffers on Gfx8
+- iris: Replace no_gpu flag with PIPE_MAP_DIRECTLY
+- iris: Promote to MAP_DIRECTLY when required before NULL return
+- iris: Delete a comment suggesting we use tiled staging buffers
+- iris: Make an iris_bo_is_external() helper and use it in a few places
+- iris: Track imported vs. exported status separately
+- iris: Use staging blits for reads from uncached buffers.
+- iris: Use staging blits for transfers involving imported BOs
+- iris: Assert on mapping a tiled buffer without MAP_RAW
+- iris: Drop fallback GEM_MMAP_GTT if GEM_MMAP with I915_MMAP_WC fails
+- iris: Delete GTT mapping support
+- iris: Pick a single mmap mode (WB/WC) at BO allocation time
+- iris: Use bo->mmap_mode in transfer map read check
+- iris: Add a flags argument to iris_bo_alloc()
+- iris: Add an alignment parameter to iris_bo_alloc()
+- iris: Only use SET/GET_TILING when exporting/importing BOs
+- iris: Add a BO_ALLOC_SMEM flag for allocating from system memory
+- anv: Fix dynamic primitive topology for tess on Gfx7.x too
+- iris: Stop calling I915_GEM_SET_CACHING on discrete GPUs
+- iris: Fail BO allocation if we can't enable snooping properly.
+- iris: Delete unused bo->cache_coherent flag
+- vulkan/wsi: Fix prime blits to use system memory for the destination
+- iris: Reduce SSBO alignment requirements from 64B to 4B
+- crocus: Reduce SSBO alignment requirements from 64B to 4B.
+- iris: Force device local memory for u_upload_mgr buffers
+- iris: Use simple_mtx in the bufmgr.
+
+Leo Liu (9):
+
+- frontends/va: add VASurfaceAttribUsageHint attribute
+- frontends/va: fix multi planes for external memeory type
+- frontends/va: use pipe buffer map instead of texture map
+- radeon/vcn/enc: use surface swizzle mode instead of linear
+- radeonsi: add PIPE_FORMAT_P010 for HEVC Main10 profile to encode param
+- radeonsi: separate video hw info based on HW engine individually
+- frontends/va: use the correct entrypoint to get config attributes
+- frontends/va: include the profile queries for encoder as well
+- frontends/va: use the entrypoint from context instead of the hard-coded one
+
+Lepton Wu (3):
+
+- virgl: move new added field to the end.
+- Revert "virgl: Cache depth and stencil buffers"
+- gallium: Reset {d,r}Priv in dri_unbind_context
+
+Lionel Landwerlin (59):
+
+- anv: fix 3DSTATE_MULTISAMPLE emission on gen8+
+- anv: disable baked in pipeline bits from dynamic emission path
+- vulkan/util: cast enums to int64_t in switch
+- spirv: fix uToAccelerationStructure handling
+- spirv: fixup pointer_to/from_ssa with acceleration structures
+- vulkan: bump headers/registry to version 1.2.175
+- anv: drop extension check for dynamic state
+- anv: prepare pipeline for delayed emission of color writes
+- anv: implement VK_EXT_color_write_enable
+- anv: reuse define for number of render target assert
+- vulkan/wsi/display: don't report support if there is no drm fd
+- i965/bufmgr: fix invalid assertion
+- intel/dev: printout correct subslice/dualsubslice name
+- intel/genxml: Add coarse pixel shading instructions
+- intel/decoder: decode CPS_STATE
+- intel/compiler: make sure we keep the lowest dispatch limit
+- intel/compiler: rework message descriptors for render targets
+- intel/compiler: use existing helpers to pull bits of descriptors
+- intel/compiler: handle coarse pixel in render target writes descriptors
+- intel/compiler: add support for fragment shading rate variable
+- intel/compiler: add support for fragment coordinate with coarse pixels
+- intel/compiler: add coarse pixel offset on Gfx12.5+
+- intel/compiler: add restrictions related to coarse pixel shading
+- anv: implement VK_KHR_fragment_shading_rate
+- isl: document format fields
+- intel/fs: use the final destination type for regioning restrictions
+- intel/mi_builder: fix resolve call
+- anv: fix perf query pass with command buffer batching
+- anv: handle spirv parsing failure
+- iris: fix assert to reflect correct limit for encoded size
+- intel/perf: allow opening perf stream with no context filtering
+- intel/perf: allow metric sets to be loaded with on OA reports
+- anv: fixup physical device properties of fragment shading rate
+- intel/fs: make sure shuffle is lowered to supported types
+- intel/perf: update gen9/11 TestOa configs
+- intel/perf: update Gen11 RenderBasic programming
+- intel/perf: update Gen11 RenderBasic programming
+- intel/perf: add EHL availability condition to HDCAndSF counters
+- intel/perf: update Gen9/11 programming for AsyncCompute
+- intel/perf: rename metric descriptions
+- anv: implement VK_EXT_physical_device_drm
+- blorp: add blorp string in shader keys
+- anv: cache raytracing trampoline shader
+- anv: store more RT shader data in pipeline_stage object
+- anv: move trivial return shader to device
+- anv: implement caching for ray tracing pipelines
+- intel/rt: switch to common pass for shader calls lowering
+- nir: drop the btd_resume_intel intrinsic
+- nir: use a more fitting index for btd_stack_push_intel
+- anv: bound checks buffer memory binding in debug builds
+- anv: allocate bigger batches as we grow command buffers
+- intel/perf: use the right popcount for 64bits
+- intel/compiler: Track latency/perf of LSC fences
+- isl: fix mapping of format->stringname
+- loader/dri3: create linear buffer with scanout support
+- nir/lower_shader_calls: adding missing stack offset alignment
+- anv: fix submission batching with perf queries
+- drm-shim: implement stat/fstat when xstat variants are not there
+- intel/disasm: fix missing oword index decoding
+
+Lucas Stach (9):
+
+- etnaviv: fix vertex sampler setup
+- dri: add loader_dri_create_image helper
+- loader/dri3: convert to loader_dri_create_image
+- loader/dri: hook up createImageWithModifiers2
+- gallium/dri: copy image use in dup_image
+- dri: don't call modifier interfaces when modifiers_count is 0
+- frontend/dri: add EXPLICIT_FLUSH hint in dri2_resource_get_param
+- etnaviv: remove double assigment of surface->texture
+- etnaviv: flush used render buffers on context flush when neccessary
+
+Luis Felipe Strano Moraes (2):
+
+- meson: print information about layers being built as part of summary
+- overlay_layer: add missing undef
+
+Maksim Sisov (2):
+
+- iris: export GEM handle with RDWR access rights
+- i965: export GEM handle with RDWR access rights
+
+Marcin Ślusarz (28):
+
+- intel/tools: remove unused macros
+- intel/batch_decoder: set foreground color of decoded instructions
+- i965: fully populate perf_config before using it to initialize perf_context
+- iris: fully populate perf_config before using it to initialize perf_context
+- intel/perf: move calculation of period_exponent to perf ctx init
+- gallium/u_threaded: implement INTEL_performance_query hooks
+- gallium/u_threaded: offload begin/end_intel_perf_query
+- nir: handle float atomics in nir_gather_info
+- nir: handle float atomics in nir_lower_memory_model
+- intel: simplify is_haswell checks, part 1
+- intel: simplify is_haswell checks, part 2
+- i965: simplify gfx version checks
+- intel/isl: replace format_gen by verx10
+- intel/disasm: decode/describe more send messages
+- intel/disasm: remove useless space after "("
+- iris: fix error message on I915_GEM_[GS]ET_TILING failure
+- intel/decoder: add assert for register size
+- anv: fix potential integer overflows
+- intel/tools: fix left shift overflow on 32-bit
+- intel/tools: fix int-to-pointer/pointer-to-int cast warnings on 32-bit
+- intel/tools: fix invalid type in argument to printf format specifier
+- intel/tools: fix potential memory leaks
+- intel/blorp: initialize BLEND_STATE using braced initializer list
+- intel/fs: use stack for temporary array
+- anv: keep descriptor set's address directly in anv_descriptor_set
+- anv: handle push descriptor sets when they are sent with push constants
+- anv: drop unused argument of anv_descriptor_set_address
+- intel/compiler: document register types
+
+Marek Olšák (190):
+
+- ci: don't build clover with LLVM 9 on radeonsi because it's unsupported
+- amd: drop support for LLVM 9
+- amd: drop support for LLVM 10
+- amd: remove some references to older LLVM versions in comments
+- amd/registers: fix the kernel header parser with latest headers
+- amd/registers: clean up gfx103.json
+- amd/registers: rename IMG_FORMAT to GFX10_FORMAT to disambiguate the meaning
+- radeonsi: don't decompress DCC for float formats in si_compute_copy_image
+- radeonsi: fix incorrect comments in culling code and NIR lowering
+- radeonsi: fix automatic DCC retiling after DCC clear and DCC decompression
+- radeonsi: fix automatic DCC retiling after compute image stores
+- gallium/util: add easy profiling helpers using TIME_ELAPSED queries
+- Revert "st/pbo: use cso_set_vertex_buffers_and_elements() for st_pbo_draw"
+- Revert "ci/radeonsi: Add expected failures due to #4674 having slipped in"
+- ac/surface: document more meta equation dependencies
+- radeonsi: make the gfx9 DCC MSAA clear shader depend on the number of samples
+- radeonsi: remove the separate DCC optimization for Stoney
+- amd: addrlib update for April
+- gallium: renumber PIPE_MAP_* enums to remove holes
+- gallium: remove 4 bytes from pipe_transfer
+- gallium+(u_threaded,r300,r600,radeonsi): move transfer offset into pipe_transfer
+- util: print CPU caps in release builds too
+- util: fix (re-enable) L3 cache pinning
+- Revert "gallium/u_threaded: align batches and call slots to 16 bytes"
+- gallium/u_threaded: move base_valid_buffer_range to transfer where it belongs
+- gallium/u_threaded: handle sampler views == NULL better
+- gallium/u_threaded: rewrite slot layout to reduce wasted space
+- gallium/u_threaded: don't set resource pointers to NULL after driver calls
+- gallium/u_threaded: fix 32-bit breakage due to incorrect pointer arithmetic
+- gallium/u_threaded: pass last into and return call size from execute callbacks
+- gallium/u_threaded: merge draws in tc_call_draw_single
+- gallium/u_threaded: add callbacks and documentation for resource busy checking
+- gallium/u_threaded: track whether TCS, TES, or GS have ever been used
+- gallium/u_threaded: query shader resource limits
+- gallium/u_threaded: add buffer lists - tracking of buffers referenced by tc
+- gallium/u_threaded: add driver-internal flush tracking for buffer lists
+- gallium/u_threaded: don't invalidate idle buffers
+- gallium/u_threaded: always map idle buffers unsynchronized
+- gallium/u_threaded: don't reference resource in pipe_transfer
+- util: add thread-safe version of idalloc
+- zink: don't set u_resource_vtbl
+- gallium: remove unused u_default_transfer_unmap
+- gallium: remove resource_get_handle from u_resource_vtbl
+- gallium: remove transfer_flush_region from u_resource_vtbl
+- radeonsi: stop using u_resource_vtbl::resource_destroy
+- r300: stop using u_resource_vtbl::resource_destroy
+- nouveau: stop using u_resource_vtbl::resource_destroy
+- i915g: stop using u_resource_vtbl::resource_destroy
+- virgl: stop using u_resource_vtbl::resource_destroy
+- svga: stop using u_resource_vtbl::resource_destroy
+- r600: stop using u_resource_vtbl::resource_destroy
+- gallium: remove u_resource_vtbl::resource_destroy
+- gallium: split transfer_(un)map into buffer_(un)map and texture_(un)map
+- gallium: remove u_resource_vtbl::transfer_(un)map
+- gallium: remove empty structure u_resource_vtbl
+- gallium: remove structure u_resource
+- radeonsi: simplify the NGG culling vertex count heuristic
+- amd: add Beige Goby support
+- amd/registers: don't generate 32-bit register fields
+- amd/registers: regenerate json files without 32-bit register fields
+- amd: fix incorrect addrlib comment for HTILE equations
+- ac/gpu_info: set has_zero_index_buffer_bug for Navi12 too
+- ac/llvm: set target features per function instead of per target machine
+- ac/llvm: expose set_range_metadata to more users
+- ac/llvm: allow ac_build_optimization_barrier with SGPRs, pointers, and metadata
+- ac/llvm: set range metadata on mbcnt and deduplicate get_thread_id
+- ac/llvm: don't draw the primitive for the dummy export workaround for Navi1x
+- winsys/amdgpu: don't hold a mutex while accessing is_shared
+- radeonsi: remove unused SI_IMAGE_ACCESS_AS_BUFFER
+- radeonsi: handle PIPE_CAP_MAX_VERTEX_BUFFERS
+- radeonsi: add a gfx10 bug workaround for NOT_EOP
+- radeonsi: fix a coherency issue when VS memory stores are not visible in PS
+- radeonsi: always use the L2 LRU cache policy for faster clears and copies
+- radeonsi: don't disable L2 caching for staging textures
+- radeonsi: don't use GS fast launch with small instances
+- radeonsi: fix the fast launch vert/prim thread counts if they are trimmed
+- radeonsi: remove a twice duplicated workaround for VERT_GRP_SIZE
+- radeonsi: re-enable fast launch with indexed tri strips because it doesn't hang
+- radeonsi: improve generated culling code by adding optimization barriers
+- radeonsi: change si_resource::alignment to alignment_log2 for better packing
+- radeonsi: remove 8 bytes from si_resource, turn other 4 bytes into padding
+- radeonsi: add a gfx10 hw bug workaround with the barrier before gs_alloc_req
+- radeonsi: add missing threaded_resource_deinit calls in fail paths
+- radeonsi: rewrite the prefix sum computation for shader culling
+- radeonsi: allow changing the NGG subgroup size to 256 but don't change it yet
+- radeonsi: generate buffer_id_unique for u_threaded_context
+- radeonsi: implement threaded context callbacks for resource busy checking
+- radeonsi: disable DFSM on gfx9 by default because it decreases performance a lot
+- radeonsi: remove DFSM after we discovered how bad it is
+- gallium/u_vbuf: add a fast path to skip refcounting for uploaded user buffers
+- mesa: move _mesa_copy_vertex_attrib/buffer functions to their only use
+- mesa: don't call _mesa_set_draw_vao in glPushClientAttrib
+- mesa: optimize glPush/PopClientAttrib for GL_CLIENT_VERTEX_ARRAY_BIT
+- mesa: optimize unreferencing VBOs in glPopClientAttrib
+- mesa: don't call FLUSH_VERTICES in glPopClientAttrib
+- mesa: don't save/restore VAO NumUpdates and IsDynamic to fix update tracking
+- st/mesa: execute glFlush asynchronously if no image has been imported/exported
+- gallium/u_threaded: don't update valid_buffer_range for read-only shader buffers
+- gallium/u_threaded: clear valid buffer range only if it's not bound for write
+- gallium/u_threaded: use tc_drop_resource_reference in call_draw_single_drawid
+- gallium/u_threaded: merge draws faster by merging indexbuf unreferencing
+- radeonsi: check is_buffer once instead of 4 times in si_set_sampler_view_desc
+- radeonsi: use the restrict keyword to set sampler view descriptors faster
+- radeonsi: don't clear register fields in si_set_mutable_tex_desc_fields
+- radeonsi: restructure si_set_sampler_views for faster unbinding trailing slots
+- radeonsi: remove no-op unref in si_set_constant_buffer
+- radeonsi: set desc[3] of all buffer descriptors at context creation
+- radeonsi: move a few functions from si_state_draw.cpp into si_gfx_cs.c
+- radeonsi: compile si_state_draw.cpp for each gfx generation separately
+- radeonsi: remove the chip_class dimension from the draw_vbo array
+- radeonsi: remove -Wstrict-overflow=0 since it doesn't seem to be needed
+- gallium/pb: change alignment to 32 bits
+- shader_enums: change VERT_BIT back to the 32-bit shift
+- glthread: change when glFlush flushes asynchronously
+- st/mesa: fix an incorrect comment in st_context_flush
+- st/mesa: move the st_flush_bitmap_cache call into st_flush
+- mesa: add gallium flush_flags param into ctx->Driver.Flush
+- mesa: move _mesa_notifySwapBuffers into the x11 swrast driver
+- mesa: execute glFlush asynchronously if no image has been imported/exported
+- radeonsi: fix compile failures with SI_PRIM_DISCARD_DEBUG enabled
+- radeonsi: use ac_build_bit_count instead of opencoding it
+- radeonsi: fix incorrect counting of compute_num_verts_rejected
+- radeonsi: fix multi draws for the prim discard CS
+- ac/llvm: add a callback to ac_cull_triangle to generate code in inner-most block
+- radeonsi: move the accepting code into the bbox cull branch in NGG cull code
+- ac/surface/tests: fix RB counts
+- ac/surface: don't set DCC_PIPE_ALIGN modifier bit for gfx10 with 1 RB
+- radeonsi: restructure si_get_vs_vgpr_comp_cnt for readability
+- radeonsi: merge 2 conditional blocks with same condition into 1 in culling code
+- radeonsi: set more precise max_waves in NGG code
+- radeonsi: remove incorrect comment about PA
+- radeonsi: try to keep all VS input loads together for better perf
+- radeonsi: don't compile TES and GS draw_vbo variants for the prim discard CS
+- radeonsi: remove the Z culling option from the primitive discard CS
+- radeonsi: drop gfx7 support from the prim discard CS to simplify code
+- radeonsi: drop support for triangle fans from the prim discard CS
+- radeonsi: skip buffer_atomic_add(ptr, n) when n=0 in the prim discard CS
+- radeonsi: cleanup some primitive discard CS TODOs regarding instancing, etc.
+- ac/llvm: don't set skip-uniform-regions to fix atomic.cmpswap
+- mesa: unreference zombie buffers when creating buffers to lower memory usage
+- radeonsi: document why VBO descriptors in user SGPRs are beneficial
+- radeonsi: if shader culling culls all vertices, cull the primitive exports too
+- radeonsi: remove incorrect comment about hangs in gfx10_ngg_gs_emit_epilogue
+- radeonsi: don't use NGG culling on 1 RB chips
+- ac/gpu_info: adjust the condition for use_late_alloc
+- radeonsi: optimize set_inlinable_constants when they don't change
+- st/mesa: don't track VS sampler views for st_draw_feedback in st_context
+- st/mesa: don't track FS sampler views for bitmap/drawpix in st_context
+- st/mesa: don't memset the sampler view array, don't init trailing slots to NULL
+- st/mesa: sink _mesa_get_samplerobj into st_update_single_texture
+- st/mesa: read Target only once in st_update_single_texture
+- st/mesa: return sview from st_update_single_texture via return value, not param
+- st/mesa: remove the const qualifier for a few st_sampler_view instances
+- st/mesa: sink refcounting from
st_get_sampler_views into st_sampler_view.c +- st/mesa: add a mechanism to bypass atomics when binding sampler views +- st/mesa: remove the sampler min_lod/max_lod value swap +- cso: disallow NULL sampler state templates in cso_single_sampler +- cso: update max_sampler_seen only once in cso_set_samplers +- cso: don't look up a sampler CSO if the last one is identical +- mesa: use atomics instead of mutexes for refcounting texture objects +- mesa: use atomics instead of mutexes for refcounting sampler objects +- mesa: use atomics instead of mutexes for refcounting renderbuffers +- mesa: remove mutex locking from a glBindTexture early out path +- mesa: translate into pipe_sampler_state in GL functions +- mesa: add LodBias quantization from st/mesa +- mesa: add IsBorderColorNonZero to skip border color update for st/mesa faster +- mesa: lower GL_CLAMP in texture and sampler functions instead of st/mesa +- radeonsi: remove the GDS variants of compute-based primitive discard +- radeonsi: change how the prim discard CS is enabled and splitting limits +- radeonsi: fix issues with draw-level splitting for the prim discard CS +- radeonsi: add optimal multi draws and draw-level splitting for prim discard CS +- radeonsi: move the accepting code into the bbox cull branch in prim discard CS +- radeonsi: drop smoothing quality to 4xAA for better performance +- ac/llvm: don't return a status from ac_cull_triangle because it's unused +- ac/llvm: rework how negative W affects culling to not call accept_func twice +- radeonsi: rewrite a confusing comment in si_upload_and_prefetch_VB_descriptors +- ac/surface/tests: fix the ARM build +- radeonsi,radv: fix a late alloc deadlock with <= 6 CUs per SA +- radeonsi: move an incorrectly placed comment about late alloc +- ac,radeonsi: move late alloc computation into common code and shader states +- radeonsi: enable uniform inlining by default +- util/idalloc: change num_elements to units of elements instead of bits +- util/idalloc: fold the 
size call into init +- util/idalloc: reserving an ID that already exists should be no-op +- util/idalloc: hide or remove unused public functions +- util/idalloc: add exists and foreach helpers +- util/idalloc: add util_idalloc_alloc_range +- radeonsi: don't expose no-attachment MSAA 16x on all 1 RB chips due to issues +- mesa: fix incorrect comment in draw_gallium_multimode +- st/mesa: always use PIPE_USAGE_STAGING for GL_MAP_READ_BIT usage + +Mark Janes (11): + +- iris: Increase the size of upload buffers +- iris: Upload constant resources for efficient GPU access +- iris: Use const_uploader for iris_create_stream_output_target +- iris: Use const uploader for blorp vertex data +- iris: Use const uploader for draw parameters +- iris: Use const uploader for user index data +- intel/compiler: Add getter helpers for LSC message descriptor fields +- intel/compiler: Add LSC messages to brw_schedule_instructions +- intel/fs: Lower DW untyped r/w messages to LSC when available +- intel/fs: Lower untyped atomic messages to LSC when available +- intel/fs: Lower A64 untyped r/w messages to LSC when available + +Martin Krastev (1): + +- compiler/glsl: Use mutex lock while freeing up mem_ctx + +Martin Peres (1): + +- ci: add the dEQP expectations for radv on Renoir + +Matt Turner (10): + +- intel/eu: Add instruction compaction support on XeHP. 
+- compiler/glsl: Return progress from propagate_invariance() +- compiler/glsl: Propagate invariant/precise when splitting arrays +- compiler/glsl: Always propagate_invariance() last +- freedreno/afuc: Print uintptr_t with PRIxPTR +- sparc: Avoid some redefinition warnings +- tu: Provide a toggle to avoid warnings about unsupported devices +- freedreno/ci: Use TU_IGNORE_CONFORMANCE_WARNING to reduce warnings +- ci: Unify on MESA_VK_IGNORE_CONFORMANCE_WARNING +- amd/ci: Use MESA_VK_IGNORE_CONFORMANCE_WARNING to reduce warnings + +Matti Hamalainen (11): + +- gallium/tools: clean up tracediff.sh a bit +- gallium/tools: improve option handling in dump_state.py +- gallium/tools: implement better suppression of variants +- gallium/tools: implement 'named' pointers option in dump.py +- gallium/tools: use left-column output mode of sdiff in tracediff.sh +- gallium/tools: improve tracediff.sh argument handling +- gallium/tools: implement "high-level" overview mode option in dump scripts +- gallium/tools: improve pointer type tracking in parse.py +- gallium/tools: add option to use Meld for diffing +- aux/trace: add missing return value to trace output +- gallium/tools: improve handling of pointer arrays + +Mauro Rossi (11): + +- egl/android: include "util/compiler.h" for FALLTHROUGH macro +- android: panfrost/lib: add pan_cs.c to Makefile.sources +- android: gallium/radeonsi: add nir include path +- android: amd/common: add nir include path +- android: pan/bi: add bi_opt_constant_fold.c to Makefile.sources +- android: nir: add nir_lower_fragcolor.c to Makefile.sources +- android: intel/compiler: add brw_compile_ff_gs.c to Makefile.sources +- android: i965: remove brw_ff_gs_emit.c from Makefile.sources +- android: ac: add ac_nir_lower_ngg.c to Makefile.sources +- android: ac: add include src/util path +- android: aco: add aco_optimizer_postRA.cpp to Makefile.sources + +Michael Tang (1): + +- microsoft/compiler: Maintain sorting of resource type in the context + +Michael 
Walle (1): + +- kmsro: Add mali-dp + +Michel Dänzer (18): + +- lima/ppir: Cast pointer to uintptr_t instead of uint64_t +- util: Remove unused Android options_tbl_lock +- Convert most remaining free-form fall-through comments to FALLTHROUGH +- Guard FALLTHROUGH annotations after assert() +- llvmpipe: Drop switch with only default case +- iris: Drop unneeded default switch case +- Use explicit break instead of fall-through to break-only case +- ci: Enable -Werror in clang jobs +- osmesa: Replace default case FALLTHROUGH annotation by following return +- ci: Enable -Werror for the remaining GCC build jobs +- ci: Move -Werror enabling from job definitions to meson build script +- ci: Add test which occasionally times out to lavapipe-vk skips +- Fix up leftover "state_trackers" references to "frontends" +- turnip: Mark local variable ASSERTED +- ci: Add debian/ prefix to job names for Debian based docker images +- ci: Rename Debian based build jobs from meson-* to debian-* +- ci: Add Fedora 34 based x86 build docker image +- ci: Add Fedora release build job + +Michel Zou (14): + +- lavapipe: fix unused variable warning +- vulkan: fix duplicate win32 def +- gallium: fix uninitialized variable warning +- meson: link vulkan_util with link_whole on mingw +- docs: list more vulkan extensions +- vulkan/wsi: avoid wsi_x11_check_for_dri3 for sw device +- zink: fix win32 build +- swr: fix uninitialized variable warnings +- llvmpipe: restrict optim bug workaround to gcc 10.x +- glapi: fix Warray-parameter +- zink: Drop useless zink_dispatch_table +- zink: Fix win32 build +- zink: Fix unused-variable warning +- meson: dont use missing dumpbin path + +Miguel Gomez (1): + +- i965: Prevent invalid framebuffer usage + +Mike Blumenkrantz (548): + +- gallium: add PIPE_BIND_SAMPLER_REDUCTION_MINMAX +- gallium: split PIPE_CAP_SAMPLER_REDUCTION_MINMAX into modes +- mesa/st: plumb GL_TEXTURE_REDUCTION_MODE_ARB through QueryInternalFormat +- zink: hook up VK_EXT_sampler_filter_minmax +- 
zink: support format queries for VK_EXT_sampler_filter_minmax +- zink: handle minmax sampler creation for VK_EXT_sampler_filter_minmax +- zink: export PIPE_CAP_SAMPLER_REDUCTION_MINMAX_ARB +- docs: update GL_ARB_texture_filter_minmax for zink +- zink: compare against screen batch id when determining which semaphore to use +- zink: always copy the nir shader before compiling +- zink: fix tcs slot map eval for user vars +- zink: fix tcs input reservation for user vars +- st/pbo: use cso_set_vertex_buffers_and_elements() for st_pbo_draw +- zink: merge copy-to-scanout path into non-deferred flush path +- zink: force scanout sync when mapping scanout resource +- util/format: add util_format_is_rgbx_or_bgrx +- zink: use undefined layout for first scanout obj transition +- Revert "zink: force scanout sync when mapping scanout resource" +- zink: move scanout sync to end of batch +- zink: add a flag indicating whether scanout object needs updating +- zink: move wsi flush info conditional to queue submission +- zink: directly set batch->state->flush_res from flush_resource hook +- zink: add clear-on-flush mechanic deeper into flush codepath +- gallium: when tracing is enabled for threaded drivers, trace the driver thread +- nir/lower_fragcolor: set outputs_written for fragdata members +- softpipe: fix render condition checking +- softpipe: fix streamout queries +- softpipe: ci updates +- zink: track persistent resource objects, not resources +- zink: restore previous semaphore (prev_sem) handling +- zink: use cached memory for staging resources +- zink: init timeline semaphore on screen creation, not first batch creation +- zink: only reset query on suspend if the query has previously been stopped +- zink: when performing an implicit reset, sync qbos +- lavapipe: implement VK_EXT_provoking_vertex +- zink: hook up VK_EXT_provoking_vertex +- zink: implement VK_EXT_provoking_vertex +- zink: ci updates +- zink: update docs +- nir/gl_lower_buffers: set access for ssbo load/store 
instrs +- zink: use non-atomic load/store ops if intrinsic is not actually coherent +- zink: remove leftover references to flatshading in shader keys +- zink: hook up VK_KHR_shader_clock +- zink: add conversion util for nir_scope -> SpvScope +- zink: add spirv builder for unops with a const operand +- zink: support nir_intrinsic_shader_clock +- zink: export PIPE_CAP_TGSI_CLOCK +- zink: generate spirv 1.5 from ntv when using vk >= 1.2 +- zink: create entrypoints for descriptor variables with spirv 1.5 +- zink: add fastpath for getting default shader variants +- zink: use first-created shader variant as the default +- zink: hook up VK_EXT_sample_locations +- zink: hook up VK_EXT_conservative_rasterization +- zink: hook up VK_EXT_shader_subgroup_ballot +- zink: hook up EXT_image_drm_format_modifier +- docs: mark off GL_ARB_shader_clock for zink +- gallium: rename pipe_draw_start_count -> pipe_draw_start_count_bias +- gallium: move pipe_draw_info::index_bias to pipe_draw_start_count_bias +- mesa/st: rename DrawGalliumComplex -> DrawGalliumMultiMode +- gallium: split drawid out of pipe_draw_info and as a separate draw_vbo param +- gallium: remove padding members from pipe_draw_info +- util/tc: split out drawid-using draws into a separate call +- iris: fix indirect drawid +- zink: grab GetPhysicalDeviceMemoryProperties2 from instance +- zink: hook up VK_EXT_memory_budget +- zink: support PIPE_CAP_QUERY_MEMORY_INFO +- zink: minor refactoring of buffer map for read case +- zink: add a screen util function for handling VkResults +- zink: use zink_screen_handle_vkresult() for fence and timeline waiting +- zink: add a ctx function for handling device lost resets +- zink: use new ctx device lost checker function +- zink: add a pipe_context::resource_commit hook +- zink: implement sparse buffer creation/mapping +- zink: export PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE +- aux/cso_cache: add handling for save/restore of compute states +- zink: clamp zs samplers to XXXX swizzle for all 
non-zero/one swizzles +- gallium/inlines: remove atomic set from pipe_reference_init() +- nir: add nir_isub_imm +- lavapipe: handle buffer sizes better in CmdBindTransformFeedbackBuffersEXT +- lavapipe: do not read sampler descriptor info during update if layout has immutables +- lavapipe: set events to the unsignalled state on creation +- lavapipe: flag renderpasses as having color/zs attachments +- lavapipe: update more states on null multisample pipeline info +- lavapipe: zero out the dsa state info and flag for updating on null dsa state +- lavapipe: zero out the blend state info and flag for updating on null blend state +- lavapipe: don't unnecessarily flag dsa states for updating +- lavapipe: ignore tess pipeline info if no tess shaders in pipeline +- lavapipe: don't access pipeline viewport state when it should be ignored +- lavapipe: don't access pipeline dsa state when it should be ignored +- lavapipe: don't access pipeline blend state when it should be ignored +- zink: split off descriptor layout from descriptor pools +- zink: unify pipeline layout creation functions +- zink: abstract descriptor init +- zink: abstract descriptor usage for programs +- zink: abstract descriptor pool usage for programs +- zink: use explicit types during descriptor updates +- zink: check descriptor layout support before creating it +- zink: move more vertex state stuff into the hw state +- zink: split vertex state pipeline hashing into its own value +- zink: flag pipeline for change more often when vbos change without dynamic state +- zink: return current pipeline object if state hasn't changed +- zink: hook up dynamic dsa states +- zink: start using dynamic front face state +- util/hash_table: _mesa_hash_table_create_u32_keys() +- zink: add a pipe_context::clear_buffer hook +- zink: never use LINEAR for VK_EXT_4444_formats +- zink: make ZINK_INLINE_UNIFORMS more standardized in function +- zink: clamp 3D surface viewtype to 2D only in the create_surface hook +- zink: add a 
target param to create_ivci() +- zink: simplify samplerview surface creation +- zink: only set layer info for samplerviews if there are multiple layers +- zink: handle in-renderpass clears in fb_clears_apply_internal() +- zink: break zs clear loop once both bits are set when beginning renderpass +- zink: add debug assert to verify that zink_clear_framebuffer() is accurate +- zink: remove compute cruft from resource mapping +- zink: break out draw dispatch into separate functions +- zink: fix texture barriers for real this time +- zink: rework memory_barrier hook again (third time's the charm) +- ci: skip glsl-uniform-interstage-limits tests for softpipe jobs +- zink: use DONTCARE renderpass when a new scanout fb attachment is set +- iris: refcount separate screen objects for resource tracking +- zink: stop invalidating descriptor sets on pool destroy +- zink: add context-based descriptor info tracking infrastructure +- zink: unify resource rebinding +- zink: track bind counts for descriptors +- zink: update samplerview descriptor layouts when image binds are set +- zink: don't track sampler states onto buffer sampler sets +- zink: track max slot idx for descriptor types +- zink: track number of tbos in shader data +- zink: add slot params to zink_context_invalidate_descriptor_state +- zink: use better iterating for buffer rebinds +- zink: call invalidate on invalid descriptor sets during recycle +- zink: make zink_context_update_descriptor_states() static +- zink: remove screen param from zink_descriptors_update() +- zink: pop descriptor refs when invalidating sets +- zink: flush every 100k draws/computes +- zink: check for a work_count-based stall in zink_maybe_flush_or_stall() +- zink: always do maybe_flush after draw/compute +- zink: stop overwriting buffer map pointers for stream uploader +- zink: fix DrawParameters shader cap usage +- lavapipe: fix fencing when submitting multiple cmdbufs +- zink: immediately return false when getting query result if it's not 
gonna happen +- util/queue: don't require a fence when adding a job +- zink: split out base renderpass begin into separate function +- zink: add a flag for tracking/validating renderpass clears +- zink: add flags for determining whether to update framebuffer and renderpass +- zink: emit some barriers out of renderpass where possible +- nir/builder: add nir_pad_vector and nir_pad_vec4 util functions +- zink: don't multiply cube array image layers +- zink: populate images with u_blitter if transfer_dst isn't available +- zink: add even more validation for linear images before creation +- util/primconvert: add C++ guards to header +- aux/trace: support pipe_screen::query_memory_info +- aux/trace: pipe_screen::query_dmabuf_modifiers +- aux/trace: pipe_context::is_dmabuf_modifier_supported +- aux/trace: propagate pipe_screen::transfer_helper pointer +- aux/trace: pipe_screen::get_dmabuf_modifier_planes +- aux/trace: trace pipe_screen::resource_create_with_modifiers +- util/prim_restart: fix util_translate_prim_restart_ib +- ci: more freedreno flakes +- aux/vbuf: prevent uint underflow and assert if no vbs are dirty +- aux/trace: add pipe_context::set_debug_callback hook +- aux/trace: more effectively unwrap pipe_context params from screen functions +- aux/trace: trace transfer ops +- aux/trace: stop dumping transfer data for threaded contexts +- aux/trace: hook tc methods +- aux/trace: fix set_inlinable_constants hook +- aux/trace: fix query handling with tc +- aux/trace: add a pipe_context::clear_buffer hook +- aux/trace: dump 'wait' param for get_query_result +- radeonsi: explicitly return support for all index buffer formats +- zink: rename ptr_add_usage -> batch_ptr_add_usage +- zink: make descriptor_layout_get a public util function +- zink: make a public util function for allocating descriptor sets +- zink: unify pipeline layout creation and move to descriptor_program_init +- zink: pass descriptor type to set layout create() +- zink: replace has_descriptors 
program member with a util function +- zink: abstract descriptor functionality and make descriptor structs private +- zink: improve samplerview update flagging +- zink: emit descriptor barriers and references during bind +- zink: add vertex buffer barriers during bind +- zink: make timeline_wait use only a screen param +- zink: move timeline_wait() to screen function +- zink: implement tc idalloc resource id stuff +- zink: force streamout rebind when mapping a streamout buffer for writing +- zink: implement a tc is_resource_busy hook +- zink: call tc_driver_internal_flush_notify() on flush +- zink: mark some buffer barrier functions inline/static +- zink: switch to memory barriers instead of actual buffer barriers +- zink: hook up push descriptor and descriptor template extensions +- zink: disable push descriptors on amd +- nir/builder: add nir_mask +- radv: make radv_pipeline::attrib_ends 32bit +- radv: set maxVertexInputAttributeOffset to UINT32_MAX +- zink: remove weird lod hack for texturing +- zink: ci updates +- llvmpipe: remove clamping to [0,1] for tri offset +- lavapipe: moar @optimize +- llvmpipe: split out scene surface info into separate struct +- llvmpipe: split out scene surface init into separate function +- llvmpipe: only dump tgsi shaders if they're actually tgsi shaders +- llvmpipe: store a screen pointer in resource struct +- llvmpipe: stop accessing pipe_resource::screen internally +- lavapipe: skip "pipeline barriers" if they're first or last in a cmdbuf +- lavapipe: also ignore multiple pipeline barriers in succession +- gallium/aux: add helper for pre-clamping clear_buffer value to dword +- zink: clamp clear_buffer values +- radeonsi: clamp clear_buffer values using new util helper +- zink: improve unsupported feature warning message +- aux/trace: avoid deadlock in screen::flush_frontbuffer hook +- gallivm: fix oob imageLoad with formats that have <4 components +- llvmpipe: ci updates +- aux/indices: break out primitive type conversion to 
separate function +- aux/indices: break out index size conversion to separate function +- aux/indices: break out index count conversion into separate function +- aux/indices: employ Delete The Code methodology +- lavapipe: add more format mappings for vertex buffer formats +- zink: reapply resource/surface refs after app flushes +- zink: reapply program refs automatically +- zink: remove barriers/refs from descriptor cache +- zink: mark some draw functions inline +- zink: only rebind pipelines when necessary +- zink: handle rebinds for vertex buffers +- zink: only rebind vertex buffers when necessary +- zink: only update viewport state when necessary +- zink: update scissor only when necessary +- zink: ref vertex buffers during set_vertex_buffers +- zink: stop using util_set_vertex_buffers_mask() +- Revert "zink: call tc_driver_internal_flush_notify() on flush" +- compiler/spirv: expand_to_vec4 -> nir_pad_vec4 +- anv: fix availability for copying timestamp query results +- zink: add a second descriptor manager +- zink: unify code for updating res->bind_count values +- zink: unify more resource bind count tracking code +- zink: optimize buffer rebinds +- zink: ci updates +- aux/trace: dump resource for samplerview and surface +- aux/draw: if pipe_draw_info::index_bias_varies is not set, ignore index_bias for N>1 +- aux/draw: fix aalines and aapoints for shaders with explicit FragData outputs +- radv: declare index_va in a single call for indexed draw packet emit +- radv: explicitly load a desc set layout struct member during set allocate +- zink: add a util function to create a null surface +- zink: replace context-based null framebuffer surfaces with internal ones +- zink: create dummy surface/bufferview for null descriptor use +- zink: handle null bufferview/imageview descriptors when robustness2 is missing +- zink: ci updates +- zink: no-op read access buffer barriers if existing access exists for earlier stage +- zink: emit fb attachment barriers inline during 
renderpass start +- zink: track number of fb attachment binds on resources +- zink: use VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL when possible +- aux/tc: fix ubo unbinding +- Revert "Revert "zink: call tc_driver_internal_flush_notify() on flush"" +- nouveau: explicitly advertise index buffer format support +- r300: explicitly advertise index buffer format support +- d3d12: explicitly advertise index buffer format support +- zink: explicitly advertise index buffer format support +- zink: more accurately handle shader layer/viewport caps +- util/prim_restart: assert the index size at the start of the function +- util/prim_restart: pre-trim degenerate primitives during draw rewrite +- util/prim_restart: store index bounds while rewriting draws +- util/prim_restart: store the total index count when rewriting draws +- util/prim_restart: update index bounds before draws in util_draw_vbo_without_prim_restart +- util/prim_restart: simplify util_draw_vbo_without_prim_restart a bit +- zink: populate maxSampleLocationGridSize for all available sample sizes on init +- zink: set VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT on zs rts +- zink: add a pipe_screen::get_sample_pixel_grid hook +- zink: add a pipe_context::set_sample_locations hook +- zink: also flag sample_locations_changed if framebuffer samples changes +- zink: add a util function for populating VkSampleLocationsInfoEXT +- zink: update vk sample location info during framebuffer setup +- zink: add a pipe_context::evaluate_depth_buffer hook +- zink: use dynamic state to apply sample locations during draw +- zink: export PIPE_CAP_PROGRAMMABLE_SAMPLE_LOCATIONS +- util/vbuf: fix buffer overrun in attribute conversions +- zink: fix caching of shader variants with inlined uniforms +- util/blitter: remove duplicated set_sample_mask calls +- util/disk_cache: add nocopy variant of disk cache store function +- zink: use scissor region for discarding clears during blit +- zink: clamp PIPE_CAP_MAX_VIEWPORTS 
to PIPE_MAX_VIEWPORTS +- aux/cso: add flag to disable vbuf +- aux/cso: split cso_destroy_context into unbind and a destroy functions +- lavapipe: use cso caching +- zink: fix typo that's definitely not at all embarrassing or anything like that +- aux/cso: store flatshade_first state from rasterizer +- util/primconvert: add function for setting flatshade_first +- util/vbuf: add flatshade_first to vbuf context and api +- aux/cso: set flatshade_first onto vbuf when binding rasterizer +- aux/tc: pass rebind count and rebind bitmask with replace_buffer_storage func +- util/prim_restart: use more direct conversion for restart index +- zink: add a function for creating descriptor layouts for push sets +- zink: split lazy sets based on descriptor type +- zink: match lazy descriptor set layout in cache mode +- zink: modernize cached push ubo descriptor updating +- zink: modernize cached ubo descriptor updating +- zink: modernize cached ssbo descriptor updating +- zink: modernize cached image descriptor updating +- zink: remove sorting for dynamic ubo offset updating +- zink: move ubo range assert to update_descriptor_state() +- zink: unify cached descriptor update code +- zink: run lazy batch descriptor functions in cache mode +- zink: add is_buffer flag to union zink_descriptor_surface +- zink: update null sampler/image descriptor surface with is_buffer during hashing +- zink: move shader image descriptor set refs to underlying type +- zink: add funcs for descriptor_surface refs +- zink: move samplerview descset refs to base objects +- zink: enable templated descriptor updates in cache mode +- zink: add oob asserts for descriptor set ref setting +- zink: skip hash updates for descriptor types which aren't used +- zink: unblock last_set cached descriptor reuse when safe to do so +- zink: add ZINK_DESCRIPTORS env var to explicitly set a mode +- zink: remove zink_batch_state::descs_used +- zink: split batch state work_count into separate vars +- zink: reorder has_barriers 
flag in batch state struct +- zink: optimize zink_tc_fence struct packing +- zink: move batch usage functions to static inlines +- zink: remove atomic from batch usage setting +- zink: make batch_usage_unset take a batch state param +- zink: unset program batch usage on state reset +- zink: remove unnecessary conditionals in resource batch tracking +- zink: make batch_usage_set take a batch state param +- zink: make batch_usage_matches take a batch state param +- zink: cache descriptor update templates along with layout +- zink: track active use counts for descriptor layouts +- zink: destroy lazy descriptor pools during batch reset when unused +- zink: slightly refactor program updating during draw +- zink: remove return types from program update functions during draw +- zink: simplify zink_program_has_descriptors() +- zink: mark bind_stage() as inline +- zink: unify gfx shader create callbacks +- zink: use u_live_shader_cache +- zink: remove unnecessary draw checks +- zink: move batch decl to top of draw_vbo +- zink: stop sanitizing primitive_restart flag in draw info +- zink: handle nir_op_pack_64_2x32 +- zink: add update flag for rasterizer state change +- zink: add update flag for dsa state change +- zink: split stencil ref changes to separate dirty flag +- anv: fix dynamic primitive topology for tess +- zink: update pipe_screen::num_contexts +- zink: set subdata hook as PIPE_MAP_ONCE +- zink: move queue init to screen creation +- util/queue: add a global data pointer for the queue object +- zink: add a more direct check for rgbx formats in create_sampler_view hook +- zink: smash dstAlphaBlendFactor to ZERO for RGBX attachments +- zink: also nope out of any dst alpha blends for rgbx formats +- zink: support more RGBX formats +- zink: ci updates +- zink: mark some functions inline +- zink: collapse host_visible and non-coherent alignment alloc cases +- zink: change a bunch of sparse buffer resource checks to host-visible checks +- zink: avoid caching visible 
vram allocations +- zink: key alloc cache on heap index, not heap flags +- zink: check actual mem props to determine if resource object is coherent +- zink: use fake buffer barriers for descriptors +- zink: always defer image descriptor barriers +- zink: remove duplicated bitflag filtering for inline uniforms +- zink: remove inlinable_uniforms_dirty_mask +- radv: move pipe_misaligned and l2_coherent image checks to flags set on init +- nine: only enable tgsi disk cache if the driver supports it +- nine: add zink to the build target +- zink: handle custom border color without matching wrap mode case +- zink: add a flag for disabling conditional render during blit +- zink: add more clear hooks +- zink: clear the fb clears array instead of freeing it on reset +- zink: support multidraw +- zink: use multidraw +- vk/util: add macros for multidraw +- zink: clear textures directly when possible +- zink: only update last_finished during batch reset if the batch was used +- zink: improve tc fence disambiguation +- zink: add and use fencing functions which take batch usage structs +- zink: use batch usage api for resource helper function +- zink: remove no-longer-used resource helper functions +- zink: queue v3.0 +- zink: apply zink_resource_object::offset for memory flush/invalidates +- zink: break out offset alignment calculation into helper +- zink: make init_mem_range() a public function +- zink: enforce multi-context waiting for unflushed resources on foreign batches +- zink: move queue submit thread to screen +- zink: move sparse buffer commit to screen queue +- zink: move fence reset to submit thread +- zink: flag scanout updates to batch state, not resource +- zink: move some end-of-batch stuff to submit thread +- zink: don't clear batch resources on fence finish +- ci: disable panfrost t760 jobs +- aux/draw: add a util function for reading back indirect draw params +- util/prim_restart: break out draw rewriting into separate function +- util/primconvert: handle 
indirect draws +- util/primconvert: map index buffer before getting index translator function +- util/primconvert: handle rewriting of prim-restart draws with unsupported primtype +- util/primconvert: handle multidraws in primconvert +- gallium: add a pipe cap to rewrite index buffers for draws using a non-fixed restart index +- gallium: handle automatic 8bit -> 16bit index buffer rewrites +- gallium: add a pipe cap for performing automatic prim type conversion +- gallium: add a pipe cap for determining driver support for prim type in restarts +- zink: export PIPE_CAP_EMULATE_NONFIXED_PRIMITIVE_RESTART +- zink: export 8bit index buffer support based on extension presence +- zink: export supported prim types +- zink: export supported primitive restart types +- zink: remove primconvert +- zink: ci updates +- zink: use depth/stencil-only layouts for depth/stencil-only formats +- lavapipe: implement multidraw ext +- zink: break out image descriptor layout into util function +- zink: split deferring of barriers to image and buffer functions +- zink: only do deferred image barriers if layout changes +- zink: use bind counts to more accurately determine image descriptor's exact layout +- zink: improve automatic layout transitions for sampler+image descriptors +- zink: only queue deferred descriptor layout change on first bind or change +- zink: flush pending clears if a resource is bound as a descriptor +- zink: repack zink_context struct a bit +- anv: unify some draw state vertex constant emission +- anv: VK_EXT_multi_draw implementation +- util/vbuf: always claim support for PATCHES in restart modes +- util/vbuf: flag fallback_always if any prim types are missing from restart modes +- zink: add direct conversion from pipe_shader_type->VkPipelineStageFlags +- zink: split dummy buffer creation and populate +- zink: try for better buffer allocation heaps +- zink: don't align device-local buffer memory +- zink: make mem cache limits dynamically scalable +- zink: uncap mem 
caching +- zink: cache visible vram +- zink: attempt to handle some resource unmap cases in 32bit envs +- radv: pre-calc vertex buffer descriptor size on pipeline object +- lavapipe: hook up some bits for handling dynamic line stipple state +- lavapipe: implement EXT_vertex_input_dynamic_state +- zink: avoid unnecessarily rewriting gl_DrawID +- zink: unify/consolidate some barrier queuing +- zink: break up ctx descriptor state updating to per-type functions +- zink: add a ref for flush resource +- zink: unify fb surface unbinding +- zink: move line width and depth bias updating into conditional during draw +- zink: merge some streamout state emission into the same draw conditional +- zink: rework pipeline cache implementation +- zink: make prim type a bitfield in pipeline info +- zink: rename 'template' struct member +- zink: remove unnecessary return from zink_desc_type_from_vktype() +- zink: add c++ header guards +- zink: add more explicit casts to draw code +- zink: don't add batch tracking during buffer rebinds if refs are dirty +- zink: remove stencil resource batch tracking +- zink: split out resource tracking into more incremental functions +- zink: split batch usage setting from refcounting +- zink: split samplerview/imageview usage/refcounting calls +- zink: add resource refs after last descriptor unbind +- zink: use vkresult helper for map return +- zink: only flag persistent resource maps for invalidation if they aren't coherent +- zink: don't add mem allocation offset when copying buf2image +- zink: use pipe_resource::width0 for clamping ssbo sizes +- zink: use 0 as the offset when mapping qbos +- zink: stop screwing up buffer offsets during for maps +- zink: add a screen function for waiting on a batch id +- zink: check last_finished before timeline waiting +- lavapipe: store whether the geometry shader outputs GL_LINES +- lavapipe: store the geometry shader prim type to render state +- lavapipe: implement VK_EXT_line_rasterization +- lavapipe: 
wideLines support +- zink: ci updates for wideline fails +- relnotes: add some line feature updates for lavapipe +- features: mark off line rasterization for lavapipe +- features: mark off some zink features +- features: fix ARB_shader_group_vote -> GL_ARB_shader_group_vote +- features: add VK_EXT_multi_draw +- features: mark off EXT_vertex_input_dynamic_state for lavapipe +- radv: use multidraw iteration for direct draws +- radv: emit NOT_EOP for multi indexed draws +- radv: emit drawid for multidraws +- radv: determine if hardware can emit NOT_EOP before emitting +- radv: split indexed draw cases based on whether drawid is used +- radv: add a gfx10 bug workaround for NOT_EOP +- radv: implement VK_EXT_multi_draw +- lavapipe: handle null vertex buffers more gracefully +- util/vbuf: check 3-component 16bit int formats for translation +- zink: make shader cache local to a single program +- zink: split up shader cache per-stage +- zink: set gfx program shaders and generate internal tcs during program creation +- zink: remove gfx program slot mapping +- zink: remove shader_id +- zinK: tweak shader module update -> pipeline combined_dirty conditional +- lavapipe: implement EXT_separate_stencil_usage +- lavapipe: implement KHR_separate_depth_stencil_layouts +- features: more lavapipe extensions +- relnotes: more lavapipe features +- zink: add util function for transferring resource refs to batch +- zink: move resource object ref to batch in invalidate hook +- zink: move resource object ref to batch in init_storage_object +- zink: remove refs from buffer rebinds +- zink: remove refs from vertex buffers +- zink: remove refs from ubos +- zink: remove refs from shader buffers +- zink: remove refs from shader images +- zink: remove resource refs from samplerviews +- zink: remove refs from desc ref updating +- zink: add surface ref during rebind if unflushed usage +- zink: set new batch usage during surface rebinds +- zink: remove imageview refs from shader images +- zink: 
remove samplerview refs +- zink: remove fb surface refs +- zink: remove fb surface resource refs +- zink: remove some descriptor_refs_dirty checks from resource binding +- zink: add a per-stage mask for ubo binds +- zink: add a per-stage bind mask for ssbos +- zink: make samplerview bind mask apply to buffer resources too +- zink: make image_bind_count work for buffers +- zink: remove barriers from buffer rebinds +- zink: optimize buffer rebinds +- zink: redo streamout and texture components of memory_barrier hook +- zink: remove unnecessary stall during device-local map case +- lavapipe: only apply pipeline state for depth bias if it's enabled +- lavapipe: implement EXT_extended_dynamic_state2 +- features: EXT_extended_dynamic_state2 for lavapipe +- relnotes: EXT_extended_dynamic_state2 for lavapipe +- zink: store the last vertex stage to the context during bind +- zink: use last_vertex_stage pointer to optimize streamout emission during draw +- zink: update streamout buffer strides inline +- zink: move descriptor update closer to start of draw +- zink: consolidate and optimize index buffer handling during draw +- features: mark off VK_EXT_multi_draw for radv +- zink: remove zink_shader_module refcounting +- zink: flag all shaders for create during gfx program init +- zink: keep a mask of stages present in a gfx program +- zink: flag shader modules as default +- zink: store the default variant hash for a program +- nir/format_convert: nir_shift -> nir_shift_imm +- nir/format_convert: add ssa version of uint packing +- lavapipe: disable line rasterization ext +- zink: ensure sparse allocations aren't marked host-visible +- zink: fix mem info query to be more permissive +- zink: zero out sampler/image descriptor surface info for null descriptor updates +- zink: ci updates +- zink: populate modifier props onto screen object during init +- zink: start storing modifiers to the base resource struct +- zink: store modifier aspect to resource +- zink: add a 
pipe_screen::resource_get_param hook +- zink: use VkImageDrmFormatModifierListCreateInfoEXT for creating from modifier array +- zink: explicitly disallow using the modifier image create for non-linear images +- zink: don't pass modifier count to first image create +- zink: add fallback for linear modifier use +- zink: add a pipe_screen::resource_create_with_modifiers hook +- features: mark off line rasterization for lavapipe +- relnotes: add some missing zink/lavapipe updates +- ci: add vulkan files to lavapipe rules +- ci: only trigger gallium_core_file_list jobs from dri and glx frontend changes +- zink: simplify modifier ifdefs +- zink: improve detection for broken drawids +- lavapipe: increment drawid for multidraws +- util/foz: stop crashing on destroy if prepare hasn't been called +- zink: use array size in spirv bo length calculations + +Nanley Chery (8): + +- anv: Add clear_supported to anv_layout_to_aux_state +- anv: Avoid sampling some MCS surfaces with clear +- iris: Avoid sampling some MCS surfaces with clear +- isl: Add isl_aux_usage_has_compression +- iris: Prefer more GPU-based uploads for compression +- intel: Limit the D16 workarounds to Gfx12.0 +- anv,iris: Port the D16 workaround stalls to BLORP +- intel/isl: Fix HiZ+CCS comment about ambiguates + +Neha Bhende (4): + +- svga: Add target and sampler_return_type info into shader key +- svga: Use shader_key info to declare resources if TGSI shader is missing it +- svga: Initialize pipe_shader_state for transform shaders +- aux/indices: include provoking vertex check in prim type conversion + +Neil Roberts (1): + +- kmsro: Fix confusing comma expression + +Niklas Haas (3): + +- vulkan/wsi/x11: return VK_SUBOPTIMAL_KHR on mismatched swapchain +- vulkan/wsi/x11: lower resize events to VK_SUBOPTIMAL_KHR +- vulkan/wsi/wayland: implement the full format table + +Olivier Fourdan (1): + +- radeonsi: Check aux_context on si_destroy_screen() + +Paul Gofman (1): + +- util: add force_gl_names_reuse for SWKOTOR. 
+ +Paul Kocialkowski (1): + +- lima: Take offset in account when checking BO size + +Paulo Zanoni (2): + +- iris: finish converting from drmIoctl to intel_ioctl +- iris: don't munmap NULL pointers + +Petr Vaněk (1): + +- docs/install: remove one extra when + +Philipp Zabel (1): + +- etnaviv: fix gbm_bo_get_handle_for_plane for multiplanar images + +Philippe Normand (1): + +- i915: Prevent invalid framebuffer usage + +Pierre Moreau (2): + +- clover/spirv: Properly size 3-component vector args +- clover/nir: Set constant buffer pointer size to host + +Pierre-Eric Pelloux-Prayer (57): + +- driconf: add workaround for Golf With Friends +- glx: init __GLXvendorInfo to NULL +- radeonsi/nir: enable nir_opt_move_discards_to_top pass +- radeonsi: enable glsl_correct_derivatives_after_discard by default +- st/mesa: fix clearing of 1D array textures +- frontend/dri: set PIPE_BIND_PROTECTED later +- frontend/dri: fix bool/int comparison +- radeonsi: allow write-only mapping of encrypted textures +- radeonsi: fix encryption check for buffers +- radeonsi: dirty msaa_config on rs->multisample_enable change +- winsys/amdgpu: don't read bo->u.slab.entry after pb_slab_free +- amdgpu/winsys: remove amdgpu_cs_has_chaining +- winsys/amdgpu: reduce amdgpu_cs size +- winsys/amdgpu: use int16 for buffer_indices_hashlist +- radeonsi: add _once suffix to depth_cleared_level_mask +- radeonsi: add si_install_draw_wrapper +- radeonsi: use si_install_draw_wrapper for tmz handling +- radeonsi/nir: add si_nir_is_output_const_if_tex_is_const +- radeonsi: use si_nir_is_output_const_if_tex_is_const +- vbo: delay vbo_exec_vtx_map call +- radeonsi: delay sample_pos_buffer creation until first use +- util/u_queue: move function definition up +- util/u_queue: add UTIL_QUEUE_INIT_SCALE_THREADS flag +- disk_cache: use UTIL_QUEUE_INIT_SCALE_THREADS +- radeonsi: skip instance_count==0 draws on <= GFX9 +- radeonsi: disable ngg culling on llvm < 12 +- mesa/shaderapi: change construct_name signature +- 
mesa/shaderapi: add an optional shader override mechanism +- ac/llvm: call the callback in all return paths of ac_cull_triangle +- radeonsi: fix fb_too_small condition +- radeonsi/gfx7: always sync pfp/me +- ac/surface: don't print stencil info if tex has no stencil +- radeonsi/driconf: add workaround for SpaceEngine +- glthread: add a last parameter to unmarshal functions +- glthread: return consumed bytes +- glthread: use custom marshal/unmarshal for CallList +- glthread: merge sucessive glCallList +- dlist: add locked param to _mesa_lookup_list +- dlist: prelock ctx->Shared->DisplayList before execute_list +- dlist: remove OPCODE_EXT_0 +- dlist: remove InstSize +- dlist: unindent code +- dlist: use an union instead of allocating a 1-sized array +- dlist: always use merged primitive for drawing +- dlist: split hot/cold data from vertex_list +- dlist: use a separate opcode for vbo replay using loopback +- dlist: use a new OPCODE to avoid loading cold data +- dlist: increment/check list nesting when handling OPCODE_CALL_LIST(S) +- dlist: store all dlist in a continuous memory block +- dlist: remove _mesa_dlist_alloc_aligned +- dlist: remove unused _mesa_dlist_alloc +- dlist: skip NOP command at the head of a list +- mesa: clear shader_info::is_lowered in prog_to_nir +- mesa: fix bindless uniform samplers update +- dlist: don't handle unmerged draws as merged +- gallium/va: don't use key=NULL in hash tables +- amd/registers: fix fields conflict detection + +Qiang Yu (1): + +- st/mesa: fix size miss match for some check + +Rafael Antognolli (5): + +- intel/fs: Lower dword integer multiplies on XeHP. +- iris/bufmgr: Query memory region info. +- iris/bufmgr: Add new set of buckets for local memory. +- iris/bufmgr: Add flag to allocate from local memory. +- iris: Map with WC on non-LLC platforms. 
+ +Rhys Perry (92): + +- aco/ra: use original names when renaming loop carried phi operands +- aco/ra: remove live-in temporary from live_out_per_block when moving it +- radv: fix barrier in radv_decompress_dcc_compute shader +- radv: fix clearing DCC-compressed e5b9g9r9 images +- aco: set TRUNC_COORD=0 for nir_texop_tg4 +- ac/nir: set TRUNC_COORD=0 for nir_texop_tg4 +- aco: remove image parameter from get_sampler_desc() +- Revert "radeonsi: set TRUNC_COORD=0 for Total War: WARHAMMER to fix it" +- aco: don't update register demand during RA validation +- aco: allow SDWA sels smaller than the operand size +- aco: add and use Program::progress +- nir/load_store_vectorize: assume CAN_REORDER ops don't alias with stores +- nir/opt_load_store_vectorize: improve handling of swizzles +- nir/opt_load_store_vectorize: ignore load_vulkan_descriptor +- nir/opt_load_store_vectorize: loop internally +- radv: improve vectorization callback for small bit sizes +- radv: only set robust_modes if robustBufferAccess2 is enabled +- radv: disable VK_FORMAT_R64_SFLOAT +- radv: cleanup LLVM implementation of vulkan_descriptor_index +- radv: implement vulkan_resource_reindex +- nir/lower_non_uniform: allow lowering with vec2 handles +- radv,aco: use nir_address_format_vec2_index_32bit_offset +- vulkan: fix use-after-free in vk_common_DestroyDebugReportCallbackEXT +- radv: fix use-after-free upon GS copy shader cache hits +- radv: fix possible use-after-free when inserting GS copy shader from cache +- radv,ac/llvm: use a dword alignment for descriptor loads +- aco: group loads from the same vertex binding into the same clause +- radv,aco: use per-attribute vertex descriptors for robustness +- Revert "radv,aco: don't use MUBUF for multi-channel loads on GFX8 with robustness2" +- radv,aco: compact vertex buffer descriptors +- ci: remove expected robustness2 fails for Renoir +- aco/ra: initialize temp_in_scc earlier +- aco/ra: fix get_reg_for_operand() with no free registers +- aco/ra: fix 
get_reg_for_operand() when the blocking var is a vector +- aco/ra: fix get_reg_for_operand() with vector operands +- aco/ra: use flags instead of booleans for update_renames() +- aco: disallow SGPRs on DPP instructions +- radv: don't allocate DCC predicate if the image doesn't use DCC +- radv: add radv_absolute_depth_bias +- radv: workaround incorrect depthBiasConstantFactor by Path of Exile +- radv: fix formatting of radv_dri_options +- radv: make attrib_end variable in radv_flush_vertex_descriptors 32-bit +- aco: do not clause NSA instructions +- aco/tests: add tests for form_hard_clauses() +- aco/tests: improve reporting of failed code checks +- aco: don't create 4 and 5 dword NSA instructions on GFX10 +- aco: don't use nir_block_is_unreachable() +- nir/unsigned_upper_bound: don't require dominance metadata +- nir/algebraic: optimize extract of extract +- nir, nir/algebraic: add byte/word insertion instructions +- aco: disallow SDWA for instructions with 64-bit definitions/operands +- aco: add p_extract/p_insert +- aco: implement nir_op_extract/nir_op_insert +- aco: use byte/word extract pseudo-instructions +- ac/llvm: implement byte/word extract/insert instructions +- radv: use byte/word extract/insert instructions +- aco: optimize 32-bit extracts and inserts using SDWA +- aco: make validate_ir() output usable in tests +- aco: disallow literals with some instruction formats +- aco/tests: add tests for p_extract/p_insert lowering +- aco/tests: add SDWA tests +- aco: use v1b/v2b for ds_read_u8/ds_read_u16 +- radv: improve LDS alignment check for load/store vectorization +- aco: don't ever widen 8/16-bit sgpr load_shared +- aco: use ds_read_{u8,u16}_d16 +- aco: fix emit_mbcnt() with a VGPR mask +- radv: increase maxComputeSharedMemorySize +- nir/load_store_vectorizer: fix check_for_robustness() with indirect loads +- nir/opt_load_store_vectorize: check for restrict at the variable +- nir/opt_load_store_vectorize: only require one variable to be restrict +- nir: 
document that ACCESS_RESTRICT is not set at intrinsics +- radv,aco: use all attributes in a binding to obtain an alignment for fetch +- aco: adjust the condition for expanding vertex fetch data format +- aco/ra: use adjust_max_used_regs() in compact_relocate_vars() +- aco: don't move descriptor loads below buffer loads +- aco: move VMEM instructions below descriptor loads +- aco/lower_phis: fix undef_operands initialization with >32 predecessors +- aco/lower_phis: don't allocate unused temporary ids +- nir: use a single set during CSE +- nir/cse: resize the instruction set +- nir/propagate_invariant: add invariant_prim option +- radv: allow VK_FORMAT_R8G8_SRGB sampling +- nir/opt_load_store_vectorize: fix check_for_robustness() with deref access +- aco/tests: fix 32-bit build +- docs/envvars: fix RADV_TEX_ANISO +- aco: remove resource flags +- aco: handle NIR loops without breaks +- radv: enable VK_KHR_shader_subgroup_uniform_control_flow +- radv: don't ever convert num_records to bytes if it's zero +- radv: adjust num_records when offset>stride +- radv: use null vertex descriptor if num_records=0 +- aco: don't create v_madmk_f32/v_madak_f32 from v_fma_legacy_f16 + +Rob Clark (157): + +- freedreno: Add .clang-format +- freedreno: Some manual reformatting +- freedreno: Re-indent +- freedreno: Manual fixups +- freedreno: Add missing foreach macros and update indentation +- freedreno/drm: Re-indent +- freedreno/afuc: Re-indent +- freedreno/common: Re-indent +- freedreno/computerator: Re-indent +- freedreno/decode: Re-indent +- freedreno/drm-shim: Re-indent +- freedreno/ir2: Re-indent +- freedreno/perfcntrs: Re-indent +- freedreno/fdl: Re-indent +- ir3: handle 16b op_i2b1 +- ci: Update kernel with a few freedreno related fixes +- ci: Add timeout for traces jobs +- freedreno: Small indent fix +- freedreno: Avoid staging blits with stencil on older gens +- freedreno: Make sure we actually flush if we need a fence +- freedreno: Add a couple debug traces +- freedreno: 
Allow resource shadowing for TC +- freedreno/drm: Move submit->primary to base class +- freedreno/drm: Cleanup bo allocation flags +- freedreno/drm: Cleanup bo cpu_prep flags +- freedreno/drm: Add FD_BO_PREP_FLUSH +- freedreno/drm: Move the growable array helper +- freedreno/drm: Add locked version fd_{bo,pipe}_del() +- freedreno/drm: Userspace fences +- freedreno/drm: Inline the fence-table +- freedreno/batch: Don't create fences for every batch +- freedreno: last_fence optimization for TC async flushes +- freedreno: Move fence struct to header +- freedreno: Drop unused create_fence() arg +- freedreno/drm: Reference count submits +- freedreno: Re-work fd_submit fence interface +- freedreno/drm: Add pipe tracking for deferred submits +- freedreno/drm/sp: Split submit prep and finish +- freedreno/drm/sp: Implement deferred submit merging +- freedreno: Avoid flushing deferred submits for u_trace +- freedreno/drm: fd_submit should hold ref to fd_pipe +- freedreno/drm: pipe should hold reference to device +- freedreno/drm: Async submit support +- freedreno/drm: Assume explicit fences if in_fence_fd +- freedreno/ci: Disable counterstrike trace on a306 for now +- freedreno/ci: Skip texsubmimage cube_map_array +- ci: Add DEQP_CASELIST_INV_FILTER +- freedreno/ci: Isolate dEQP-EGL reset_context tests +- freedreno: Remove samples-per-tex tracking +- freedreno/drm: Allow FD_BO_PREP_FLUSH without _NOSYNC +- freedreno: Flush resources harder +- freedreno/ci: Mark client_wait_sync_finish as flake +- freedreno/ci: Update piglit skips/fails +- freedreno/drm: Initialize control->fence +- freedreno: Fix TC last_fence optimization +- freedreno: Consolidate needs_flush and clearing last_fence +- freedreno/query/acc: Set needs_flush +- freedreno/tools: Fix async flush vs fdperf/computerator +- pps: Lower min sampling interval +- util/perfetto: Add one-time init +- freedreno: Add freedreno pps driver +- gallium/aux: Add perfetto support to u_trace +- freedreno/drm: Add support to query 
device suspend count +- freedreno/pps: Detect GPU suspend on newer kernels +- freedreno: Moar header C++-proofing +- freedreno: Add perfetto renderpass support +- pps: Add a more interesting cfg example +- docs/perfetto: Updates for freedreno and render-stages +- gallium/u_threaded: Add to_call() helper +- gallium/u_threaded: Add call logging +- freedreno/ir3: Don't force RTNE if rounding mode is undefined +- freedreno/a6xx: Add a few registers +- freedreno: Rename internal resource_busy +- freedreno: Implement TC resource_busy +- freedreno/tu+drm: Extract out pm4 pkt header helpers +- freedreno: Move pkt parsing helpers to common +- freedreno/afuc: Split out instruction decode helper +- freedreno/afuc: Split out utils +- freedreno/afuc: Clean up special regs +- freedreno/afuc: Add pipe reg name decoding +- freedreno/afuc: Add emulator mode to afuc-disasm +- freedreno/registers: Add a few a6xx regs and notes +- freedreno/afuc: Extract full gpu-id +- freedreno/afuc: Split out helpers to parse labels and packet-table +- freedreno/afuc: Add emulator support to run bootstrap +- freedreno/ci: Add real packet-table loading for afuc test +- freedreno/afuc: Use emulator to extract jmptbl +- freedreno/headergen2: Fix compile warnings with CP_DRAW_INDIRECT_MULTI +- freedreno/a6xx: Fix mh31 intermittent faults +- freedreno: Fix typo +- freedreno: Don't return a flushed batch +- egl: zero is a valid fd +- egl+libsync: Add check for valid fence-fd +- frontend/dri: Fix fence-fd logic +- freedreno/ir3: Fix use after free +- Revert "st/mesa: execute glFlush asynchronously if no image has been imported/exported" +- freedreno: Fix batch flush race condition +- freedreno: Fix fdperf flush +- gallium/u_threaded: Missing driver-thread marking +- freedreno: Add string-marker debug trace +- freedreno: Move assert +- freedreno: Add tid to DBG() msgs +- freedreno: Remove assert +- freedreno/registers: add A5XX_RBBM_STATUS3 bit +- freedreno: Add missing valid range tracking for SSBOs/images 
+- docs: Update freedreno features +- freedreno/ci: Sort a630 piglit xfails +- freedreno/a6xx: Fix r16_snorm blits +- freedreno/a6xx: Handle non-UBWC surface views +- freedreno/a6xx: Improve UBWC demotion logic +- freedreno: Drop obsolete comment +- freedreno: Don't try staging blit for non-renderable formats +- freedreno: Add debugging for blitter fallback recursion +- freedreno: Avoid recursive re-entry of u_blitter +- freedreno/a6xx: Handle R8G8 sharp edges in validate_format() +- freedreno/a6xx: Also validate format in blitter path +- freedreno: Flush batches on shadow/uncompress +- freedreno: Fallback to sw for copy_image with compressed +- freedreno: Fix flushes with NULL batch +- freedreno/blitter: Flush before self-blits +- freedreno/a6xx: Use UNORM for SNORM copy blits +- freedreno/a6xx: Handle u/snorm vs u/sint validation +- freedreno: Fix for multi-draw blits +- freedreno/a6xx: Flip on copy_image +- freedreno/a6xx: Skip nv_copy_image tests +- freedreno: Defer freeing batch->key +- freedreno/ci: Start longest traces first +- freedreno/ci: Increase # of jobs for CI runners +- freedreno/ci: Garbage collect some a630 flakes +- freedreno/a6xx: Handle fb_read in sysmem path +- freedreno: Flush if at risk of overflowing bos table +- turnip: Use drmIoctl() +- turnip: Fix AcquireImageANDROID() handle type +- turnip: Add CrOS Gralloc support +- nir: Add pass to lower phi precision +- freedreno+ir3: Enable INT16 +- freedreno/a6xx: Fix framebuffer_barrier crash +- turnip: avoid some UB +- turnip: Split tu6_emit_xs() +- freedreno/computerator: Add script to probe FLUT values +- freedreno/ir3: Add float immed "FLUT" support +- freedreno: Rename \*_dev_info +- freedreno: Generate device-info tables at build time +- freedreno: Convert fd_dev_info to const pointer +- turnip: Convert fd_dev_info to const pointer +- freedreno/ir3: Get tess_use_shared from fd_dev_info +- freedreno/ir3: Get reg_size_vec4 from fd_dev_info +- turnip: Drop unused vshs_workgroup param +- turnip: 
Get storage_16bit from fd_dev_info +- turnip: Get indirect_draw_wfm_quirk from fd_dev_info +- turnip: Get has_tex_filter_cubic from fd_dev_info +- turnip: Get has_sample_locations from fd_dev_info +- freedreno+turnip: Add has_cp_reg_write +- freedreno+turnip: Add has_8bpp_ubwc +- freedreno+turnip: Get device name from device-info table +- freedreno+turnip: Add a6xx gen4 support +- freedreno/a6xx: Add missing PC_CCU_INVALIDATE_x + +Robert Foss (1): + +- freedreno/regs: add 5nm DSI PHY/PLL regs + +Robert Tarasov (1): + +- iris: Check data alignment for copy_mem_mem + +Rohan Garg (8): + +- i965: plumb device/driver UUID generators +- i965: Initial implementation for EXT_memory_object_* +- i965: Implement semaphore support for EXT_external_objects +- i965: Implement BufferDataMem +- i965: fix in fences backend for ext_external_objects edge case +- i965: Enable EXT_memory_object_* for Gen 7 and above +- docs: mark external memory and semaphore extensions done for i965 +- ci: Don't artifact rendered images when job succeeds + +Roland Scheidegger (1): + +- llvmpipe: fix nir dot products (fsum op) + +Roman Stratiienko (7): + +- anv_android: Add missing type +- meson: egl: Do not build platform_drm for Android +- android: Add scripts to build using meson +- nouveau: Don't require RTTI and use it only when enabled +- egl: android: prepare code for adding more buffer_info getters +- egl: android: add IMapper@4 metadata API buffer_info getter +- AOSP: Do not add '-Wl,--gc-sections' to the linker arguments + +Ryan Houdek (3): + +- Default enable SSE2 on mesa builds. 
+- Switch u_format_test to passed on i386 +- Update release notes with mention that x87 is no longer used on x86 + +Sagar Ghuge (16): + +- anv: Set correct fast clear value for depth during blorp operation +- anv: Avoid corrupting indirect depth clear values +- anv: Query memory region info +- anv: Wrapper around I915_GEM_CREATE_EXT_MEMORY_REGIONS +- anv: Allocate BO in appropriate region +- anv: Allocate scratch and workaround BO in local memory +- intel/compiler: Define new LSC data port encodings +- intel/compiler: Add support for LSC fence operations +- intel/compiler: Add helpers for LSC message descriptors +- intel/disasm: Disassmeble LSC messages +- intel/disasm: Disassemble LSC message extended descriptors +- intel/fs: Lower untyped float atomic messages to LSC when available +- intel/fs: Lower Byte scattered r/w messages to LSC when available +- intel/fs: Lower A64 byte scattered r/w messages to LSC dataport +- intel/fs: Lower A64 atomic messages to LSC dataport +- intel/fs: Lower varying pull constant load message to LSC dataport + +Samuel Iglesias Gonsálvez (13): + +- turnip: move pipeline gras_su and rb{stencil,depth}_cntl_mask initialization +- turnip: initialize pipeline->rb_{stencil,depth}_cntl always +- turnip: refactor how LRZ state is calculated +- turnip/lrz: add support for VK_EXT_extended_dynamic_state +- turnip: document GRAS_LRZ_CNTL's UNK5 bitfield +- turnip/lrz: added support for depth bounds test enable +- turnip: fix typo in tu_CmdBeginRenderPass2() +- turnip: implement LRZ direction +- turnip: update LRZ state based on stencil test state +- turnip: group all geometry constant draw states in one +- turnip: fix setting dynamic state mask for VK_DYNAMIC_STATE_STENCIL_OP_EXT case +- turnip: add LRZ early-z support +- anv: do not dereference VkPipelineMultisampleStateCreateInfo always + +Samuel Pitoiset (130): + +- amd: drop support for LLVM 8 +- radv: keep DCC compressed for clears on compute with image stores +- aco: fix opquantize2f16 on 
GFX6-7 +- radv: fix fast clearing depth-only or stencil-only aspects with HTILE +- radv: fix emitting depth bias when beginning a command buffer +- radv: remove radv_image_iview::bo +- radv: remove radv_image_iview::multiplane_planes +- radv: allow concurrent MSAA images to be FMASK compressed +- radv: fix emitting default depth bounds state on GFX6 +- radv/winsys: remove set but never used use_llvm +- radv: remove old comment about LLVM <= 8 +- ac: move ac_lower_indirect_derefs() outside of the LLVM dir +- radv: cleanup LLVM related includes +- radv: remove RADV_DEBUG=nothreadllvm +- radv/winsys: fix allocating the number of CS in the sysmem path +- radv/winsys: fix resetting the number of padded IB words +- radv: make sure CP DMA is idle before executing secondary command buffers +- radv: remove warnings about RADV_PERFTEST=aco,llvm +- radv/llvm: implement the image load DCC bug +- radv: enable DCC stores with the LLVM backend +- radv: re-introduce missing skip list for Polaris10 +- radv: fix various CMASK regressions on GFX9 +- radv: add the provoking vertex mode to the pipeline/shader keys +- radv/llvm: adjust NGG if provoking vertex mode is last +- aco: adjust NGG if provoking vertex mode is last +- radv: implement VK_EXT_provoking_vertex +- radv: enable TC-compat CMASK on GFX8-9 +- radv: fix computation of the number of user SGPRS for NGG GS state +- radv: check if DCC is enabled when resolving different levels +- radv: only keep concurrent MSAA images compressed if TC-compat CMASK +- radv/winsys: add GFX6_MAX_CS_SIZE instead of using a magic value +- radv/winsys: fix executing huge secondary command buffers on GFX6 +- radv: implement RADV_FORCE_VRS for the LLVM backend +- util/math: change ROUND_DOWN_TO to return a uint64_t +- radv: adjust the computation of the total usage of memory used +- radv: expose 2/3rd of total memory as VRAM and 1/3rd as GTT on APUs +- radv: fix missing ITERATE_256 for D/S MSAA images that are TC-compat HTILE +- radv: declare 
VK_EXT_extended_dynamic_state2 but leave it disabled +- radv: declare new dynamic states for VK_EXT_extended_dynamic_state2 +- radv: implement dynamic depth bias enable +- radv: implement dynamic primitive restart enable +- radv: implement dynamic rasterizer discard enable +- radv: advertise VK_EXT_extended_dynamic_state2 +- radv: fix extending the dirty bits to 64-bit +- radv: dump the trap handler shader with RADV_DEBUG=metashaders +- nir/opt_access: fix getting variables in presence of similar bindings/desc +- radv: add missing entrypoints for VK_EXT_extended_dynamic_state2 +- radv: enable DCC stores on RDNA2 +- aco: fix derivatives/intrinsics with SGPR sources +- Revert "radv: Do not access set layout during vkCmdBindDescriptorSets." +- radv: fix heap indices when computing the budget +- ac: ac_gpu_info::has_vgt_flush_ngg_legacy_bug +- radv: fix fast clearing DCC if one level can't be compressed on GFX10+ +- radv: simplify radv_pipeline_has_gs_copy_shader() +- radv: remove small overhead of radv_pipeline_has_ngg() +- radv: ignore dynamic blend constants if blend isn't enabled +- radv: remove an useless TODO for dynamic line width +- radv: pass an image range to radv_layout_dcc_compressed() +- radv: remove redundant call to radv_dcc_enabled() +- radv: only mark DCC as compressed when drawing if layout allows it +- radv: only init DCC if compressed in the HW resolve path +- radv: do not decompress DCC for partial resolves if stores are supported +- radv: use radv_dcc_enabled() for the FB mip flush workaround +- aco: fix emitting discard when the program just ends +- radv: stop reporting ACO from the device name +- radv: remove DFSM +- util/drirc: make engine_versions an optional field +- radv: add few new drirc options +- util/drirc: use application_name_match for the SotTR RADV workaround +- radv: move all game workarounds to drirc +- radv: fix missing default state for DB_DFSM_CONTROL +- radv: fix generating hang reports if mutable descriptors are used +- radv: 
enable RADV_DEBUG=invariantgeom for Monster Hunter World +- ac/rgp: mark SQTT_FILE_CHUNK_TYPE_ISA_DATABASE as deprecated +- ac/rgp: bump the SQTT file minor version to 5 +- radv: enable RADV_DEBUG=invariantgeom for SotTR DX11/DX12 versions +- ac: import performance counters from RadeonSI +- ac: rename ac_dump_thread_trace() to ac_dump_rgp_capture() +- ac/rgp: fix ac_fill_sqtt_asic_info() name +- ac: add ac_thread_trace::data +- radv/winsys: allow to reserve a VMID +- radv: emit PA_SC_CONSERVATIVE_RASTERIZATION_CNTL only on GFX9+ +- ac/debug: fix color printing PKT3 when count in header is too low +- aco: fix range checking for SSBO loads/stores with SGPR offset on GFX6-7 +- radv: dump SPIR-V instead of using spirv-dis when generating a hang report +- aco: fix emitting literal offsets with SMEM on GFX7 +- ci: update list of expected failures for Pitcairn/Oland (RADV) +- radv: do not launch an IB2 for secondary cmdbuf with INDIRECT_MULTI on GFX7 +- radv/winsys: add a small comment explaining the CHAIN bit +- ci: add expected list of failures for Bonaire (RADV) +- radv: fix aligning the image offset by using align64() +- radv/winsys: adjust some error messages +- radv/winsys: remove useless errno.h includes +- radv: fix dynamic rasterizer discard enable state +- radv: reject binding buffer/image when the device memory is too small +- radv: always decompress both aspects of a depth/stencil image +- radv: create only one pipeline for decompressing depth/stencil images +- radv: fix dynamic culling and depth/stencil related dynamic states +- ac/perfcounters: remove ac_pc_block_base::num_prelude +- ac/perfcounters,radeonsi: rework performance counters layout +- ac/perfcounters: rename num_multi to num_spm_counters +- ac/perfcounters: add more SPM configuration fields +- ac/perfcounters: add a GPU block ID to every block definitions +- radv: implement dynamic logic op +- radv: advertise extendedDynamicState2LogicOp +- radv: fix RADV_FORCE_VRS for 2x1 and 1x2 +- radv: fix fd 
leak in vkAcquireImageANDROID() +- radv: disable DCC for DOOM 2016 and Wolfenstein II +- radv: implement VK_EXT_color_write_enable +- radv: advertise VK_EXT_color_write_enable +- radv: add support for more HTILE clear codes +- radv: prevent fast clearing HTILE depth for unrestricted ranges +- radv: allow more fast clears for depth surfaces without TC-compat HTILE +- ci: update list of expected failures against CTS 1.2.6.2 for RADV +- ci: remove few CTS that are now skipped with RADV +- aco: fix emitting d16 for MIMG instructions on GFX9+ +- aco: fix emitting a16 for MIMG instructions on GFX10+ +- aco: fix shared_atomic_comp_swap if the second source isn't a VGPR +- radv: fix applying radv_disable_dcc for DOOM and Wolfenstein II +- aco: use nir_ssa_def_is_unused() to determine if atomic dest is used +- ac,radv: implement the cs_regalloc_hang HW bug workaround +- radv: fix applying radv_disable_dcc for DOOM 2016 again +- radv: lower primitive shading rate in NIR +- radv: only init the TC-compat ZRANGE metadata for the depth aspect +- radv: fix bounds checking for zero vertex stride on GFX6-7 +- radv: report APUs as discrete GPUs for Red Dead Redemption 2 +- radv: fix specifying the stencil layout for separate depth/stencil layouts +- radv: allow unused VkSpecializationMapEntries +- radv: fix selecting the first active CU when profiling with SQTT +- radv: fix missing cache flushes when clearing HTILE levels on GFX10+ + +Sergii Melikhov (1): + +- util/format: Change the pointer offset. 
+ +Simon Ser (27): + +- radeon/vcn: handle tiled buffers when decoding +- util/format: document block depth field +- ac/surface: use blocksizebits instead of blocksize +- radeonsi: stop special-casing YUV formats in si_query_dmabuf_modifiers +- ac/surface: allow non-DCC modifiers for YUV on GFX9+ +- frontends/va: improve surface attribs processing +- gallium, va: add support for VASurfaceAttribDRMFormatModifiers +- radeonsi: implement pipe_context.create_video_buffer_with_modifiers +- radv: stop special-casing multi-planar formats in radv_get_modifier_flags +- dri: add createImageWithModifiers2 interface +- gallium/dri: implement createImageWithModifiers2 +- i965: implement createImageWithModifiers2 +- vulkan/wsi/wayland: simplify wl_surface version check +- docs/envvars: document MESA_VK_WSI_PRESENT_MODE +- radv: implement VK_EXT_physical_device_drm +- amd/addrlib: remove Meson debug message() +- vulkan/wsi: unify format logic in dmabuf_handle_modifier +- vulkan/wsi: prefer the Wayland linux-dmabuf protocol +- vulkan/wsi/wayland: remove swapchain wl_drm wrapper +- vulkan/wsi/wayland: remove unnecessary wl_proxy_set_queue call +- vulkan/wsi/wayland: fix wsi_wl_image_init error code +- vulkan/wsi/wayland: handle dmabuf params allocation failure +- etnaviv: fix renderonly check in etna_resource_alloc +- etnaviv: fail in get_handle(TYPE_KMS) without a scanout resource +- freedreno: fail in get_handle(TYPE_KMS) without a scanout resource +- panfrost: fail in get_handle(TYPE_KMS) without a scanout resource +- lima: fail in get_handle(TYPE_KMS) without a scanout resource + +Simon Zeni (4): + +- vulkan/wsi: add drm_fd param to wsi_display_get_connector +- vulkan/wsi: Implement VK_EXT_acquire_drm_display +- radv: Implement VK_EXT_acquire_drm_display +- anv: Implement VK_EXT_acquire_drm_display + +Steve Pronovost (1): + +- d3d12: Add mechanism for D3D12 Adapter Selection + +Stéphane Marchesin (1): + +- virgl: resources without any binding can be cached + +SureshGuttula (3): 
+ +- frontends/va/picture:Fix wrong reallocation even surface is protected +- frontends/va: Derive image from interlaced buffers for h26[4/5]encode +- radeon/vcn: calc_dpb_size should be based on dpb_type + +Tapani Pälli (21): + +- anv: do not support image export with stencil aspect set +- glx: fix compilation error when function name not found +- glsl: ignore interface precision qualifier on desktop GL +- glx: revert "Downgrade sRGB-ful fbconfigs" +- i965: support only color formats with memory objects +- nir: skip assert check with empty structs +- isl: require hiz for depth surface in isl_surf_get_ccs_surf +- anv: require rendering support for blit destination feature +- mesa: fix error set for glCompressedTexSubImage calls +- gitlab-ci: enable building of Vulkan tests in Piglit +- anv: introduce new dynamic states +- anv: support rasterizer discard dynamic state +- anv: support depth bias enable dynamic state +- anv: support primitive restart enable dynamic state +- anv: centralize vk_to_intel_logic_op array +- anv: support blending logic op dynamic state +- anv: toggle on VK_EXT_extended_dynamic_state2 +- docs: add VK_EXT_extended_dynamic_state2 features.txt entry +- anv: provide dummy vkCmdSetPatchControlPointsEXT +- iris: take a reference to memobj bo in iris_resource_from_memobj +- anv: fix emitting dynamic primitive topology + +Thomas H.P. Andersen (9): + +- nir: return progress from nir_lower_packing +- nir/lower_packing: use shader_instructions_pass +- anv: remove dead code +- nir/ifind_msb_rev: fix input check +- zink: remove initialization override +- lavapipe: remove initialization override +- broadcom/compiler: use correct flag enum +- broadcom/compiler: fix add vs. 
mul +- nine: Fix assert in tx_src_param + +Thong Thai (1): + +- radeon/vcn/enc: Add missing line to HEVC SPS header code + +Timothy Arceri (23): + +- mesa: fix incomplete GL_NV_half_float implementation +- mesa: make _mesa_find_temp_intervals() a static function +- mesa: fix _mesa_add{_typed}_unnamed_constant() declarations +- mesa: fix _mesa_add_state_reference() declaration mismatch +- mesa: fix glShaderSource() error handling +- util: disable glthread in CSGO +- glsl: create validate_component_layout_for_type() helper +- glsl: add missing support for explicit components in interface blocks +- nir/lower_io_to_vector: fix per vertex io handling for arrays +- Revert "util: disable glthread in CSGO" +- util: add work around for the game We Happy Few +- util/tests: initialise key in cache_test +- mesa: don't crash on incorrect texture use +- i965: don't crash on incorrect texture use +- glsl: force_glsl_version to shaders with no defined version +- util/driconf: add new ignore_write_to_readonly_var workaround +- util: add some workarounds for the game Luna Sky +- util/disk_cache: delete more cache items in one go when full +- util/radeonsi: add radeonsi workaround for Nuclear Throne +- glsl: replace some C++ code with C +- util: add workaround for Full Bore +- glsl: relax rule on varying matching for shaders older than 4.20 +- intel/compiler: make sure swizzle is applied to if condition + +Timur Kristóf (74): + +- aco: Mark VCC clobbered for iadd8 and iadd16 reductions on GFX6-7. +- radv: Ignore GS copy shader when determining NGG GS wave size. +- radv: Properly enable Wave32 mode for NGG GS. +- nir: Support upper bound of subgroup_id/num_subgroups for non-compute. +- nir: Support upper bound of unsigned bit size conversions. +- nir: Allow load_primitive_id in VS in nir_divergence_analysis. +- nir: Add AMD specific intrinsics for merged shaders and NGG. +- aco: Allow workgroup barrier and shared scope for NGG shaders. 
+- aco: Fixup the NIR metadata after sanitize_cf_list. +- aco: Split ngg_emit_sendmsg_gs_alloc_req from the wave0 check. +- radv: Fill shader info earlier. +- radv: Gather NGG info sooner. +- aco: Implement new NGG specific NIR intrinsics. +- ac: Add new NIR pass to lower NGG VS/TES. +- radv: Use new NGG NIR lowering for VS/TES when ACO is used. +- ac: Add NIR lowering for NGG GS. +- radv: Use new NIR lowering of NGG GS when ACO is used. +- aco: Determine whether a few more instructions need exec. +- aco: Use Operand instead of Temp for the exec mask stack. +- aco: Remember when exec mask is const, and restore the const then. +- aco: Don't use s_and_saveexec with branches when exec is constant. +- aco: Refactor SSA elimination phi info to use vector instead of map. +- aco: Eliminate useless exec writes in jump threading. +- aco/insert_exec_mask: Fixed unused variable warning in release build. +- aco/util: Initialize IDSet::bits_set to zero. +- gallium/tessellator: Fix uninitialized variable warnings. +- anv: Fix unused function warnings for memory range checks. +- gallivm: Fix a few uninitialized variable warnings. +- nine: Fix uninitialized warning in texture9.c +- radv/cmd_buffer: Fix warning by initializing instance count. +- aco: Don't eliminate exec write when it's used by a copy later. +- aco: Don't DCE instructions that write non-temps, eg. exec. +- aco: Add Operand(Temp, PhysReg) constructor. +- aco: New writeout overloads for the test framework. +- aco: Introduce a new, post-RA optimizer. +- aco: Use s_cbranch_vccz/nz in post-RA optimization. +- aco: Eliminate SALU comparison when SCC can be used instead. +- radv: Remove duplicate code for getting GS info. +- radv: Don't generate GS copy shader when the pipeline has NGG. +- radv: Assert that there is no GS copy shader when the pipeline has NGG. +- aco: Add note about v_alignbyte in the ISA README. +- nir: Add nir_op_sad_u8x4 which corresponds to AMD's v_sad_u8. +- aco: Implement nir_op_sad_u8x4. 
+- aco: Add validation for v_permlane instructions. +- nir: Add AMD-specific byte and lane permute intrinsics. +- aco: Implement byte and lane permute intrinsics. +- aco: Keep VGPR destinations for uniform shared loads when beneficial. +- ac/nir: Refactor and optimize the repacking sequence. +- amd: Add extra source to the mbcnt_amd NIR intrinsic. +- aco: Use as_vgpr for the second source of mbcnt_amd. +- ac/nir: Update TCS output barriers with nir_var_mem_shared. +- aco: Fix checking if load_shared is used by cross lane instructions. +- radv/llvm: Emit s_barrier at the beginning of NGG non-GS shaders. +- aco/gfx10: NGG zero output workaround for conservative rasterization. +- aco/gfx10: Emit barrier at the start of NGG VS and TES. +- radv: Add last_vgt_api_stage and use it to simplify some code. +- radv: Move radv_optimize_nir_algebraic to a separate function. +- radv: Allow enabling vertex grouping, fix NGG info with it disabled. +- radv: Set parameter cache oversubscription according to the PC lines. +- nir: Add AMD specific intrinsics for NGG shader based culling. +- ac/nir: Add a NIR port of ac_llvm_cull. +- ac/nir: Use a ballot that matches the wave size during NGG lowering. +- ac/nir: Implement NGG deferred attribute culling in NIR. +- radv: Expose radv_get_viewport_xform in radv_private.h +- radv: New shader args for NGG culling settings and viewport. +- aco: Implement NGG culling related intrinsics. +- radv: Support NGG culling with new perftest environment variable. +- radv: Run algebraic optimizations before NGG lowering. +- ac/nir: Reuse the repacked output positions of culling shaders. +- ac/nir: Analyze culling shaders to remember which inputs are used when. +- ac/nir: Reuse uniforms from top part of culling shaders. +- radv, aco, ac/nir: Tweak position export scheduling for NGG culling. +- radv: Don't compile NGG culling into shaders that write viewport index. +- radv: Remove num_viewports from radv_skip_ngg_culling. 
+ +Tomeu Vizoso (51): + +- ci: Reenable radeonsi jobs, and extend coverage +- ci/lava: Build all piglit profiles in LAVA images +- ci/lava: Update kernel for LAVA to 5.11 +- ci/lava: Start Xorg on request, for Piglit +- ci: Test RadeonSI with piglit's quick_gl +- ci: Use a single kernel+rootfs for both baremetal and LAVA jobs +- ci: Drop hack to disable all modules from defconfig +- ci/radeonsi: Add expected failures due to #4674 having slipped in +- panfrost/ci: Enable some dEQP 3.1 tests on Mali T860 +- Revert "CI: Disable Panfrost and radeonsi" +- panfrost: Don't access members of NULL pointers +- pan/midgard: Don't emit zero padding +- ci: Remove the need for an empty Piglit results file +- Revert "CI: Disable all Panfrost/AMD/Iris automatic jobs" +- ci: Update kernel to v5.13-rc2 +- panfrost/ci: Test Panfrost on the Mali G72 GPU +- panfrost/ci: Add one more flake test for G72 +- radv/ci: Test on Stoney on CI +- ci/lava: Add caching proxies for trace downloads +- ci/piglit: Use wget instead of ci-fairy to check a file exists +- ci: Configure DUTs for max performance +- ci: Uprev piglit to eee7d89611cf "tests: Replay profile frame times" +- ci: Uprev apitrace to 170424754bb4 "retrace: Get --loop to work without rewinding" +- radeonsi/ci: Add new Piglit failures +- ci/freedreno: Add depth32f_stencil8 flakes +- ci/zink: Add nearest_linear_mirror_l8_pot flake +- ci/freedreno: Fix name of flake +- ci/freedreno: Add new flake after "ci: Configure DUTs for max performance" +- ci/freedreno: Add spec@arb_copy_buffer@dlist flake on a530 +- Partial revert of "ci: Add a manual job for tracking the performance of Freedreno" +- ci/freedreno: Skip Portal 2 trace on a630, due to flakiness +- Revert "ci/freedreno: Skip Portal 2 trace on a630, due to flakiness" +- ci/lava: Disable CPU frequency scaling +- ci/lava: Switch LAVA jobs to x86 runners +- ci: Disable windows builds due to runner not being available +- ci: Build Crosvm in our container +- ci: Move Kernel build tasks 
into its own file +- ci: Store the credentials in /tmp +- ci: Run tests inside Crosvm +- iris/ci: Update the checksums for the pixmark-piano trace +- panfrost/ci: Add some failures that crept in +- ci/lava: Improve error reporting in lava_job_submitter.py +- ci/lava: Don't overwrite PIGLIT_REPLAY_EXTRA_ARGS +- Revert "ci: Disable the iris APL jobs" +- ci/bare-metal: Add parens around shell command +- panfrost: Fork pan_pool for Gallium and Vulkan +- panvk: Add VkCommandPool support +- panvk: Support calls to CreateDescriptorSetLayout without bindings +- panvk: Make panvk_queue_transfer_sync more generic +- panfrost: Specify alignment for the Job Header descriptor +- panvk: Add vkEvents support + +Tony Wasserka (18): + +- radv: Remove assert about pDepthStencilState +- aco/spill: Fix improper handling of exec phis +- aco/scheduler: Fix register demand computation for downwards moves +- aco/scheduler: Fix register demand computation for upwards moves +- aco/scheduler: Verify register demand invariants in debug mode +- util: add support for defining bitwise operators on strongly typed enums +- util: tune signatures of generated enum operators +- aco/scheduler: Clean up register demand tracking +- aco/scheduler: Move cursor handling state to dedicated interfaces +- aco/ra: Fix off-by-one-error in print_regs +- aco/ra: Clean up print_regs output and support byte-allocated variables +- aco/ra: Split print_regs by lines of 64 registers +- aco: Replace Operand literal constructors with factory member functions +- aco: Remove use of deprecated Operand constructors in test_to_hw_instr.cpp +- aco: Remove use of deprecated Operand constructors in aco_builder.h +- aco: Remove use of deprecated Operand constructors +- aco: Clean up unneeded literal casts +- aco: Remove deprecated Operand constructors + +Vasily Khoruzhick (3): + +- lima: switch resource to linear layout if there's to many full updates +- lima: implement alpha test +- lima: handle fp16 vertex formats + +Ville 
Syrjälä (2): + +- i915: Implement __DRI_IMAGE_ATTRIB_OFFSET query +- i915: Implement __DRI2_FLUSH version 4 + +Vinson Lee (17): + +- clover: Add constructor for constant_argument. +- glx: Fix macOS build. +- nv50/ir: Initialize Graph::Node member tag. +- nvc0: Remove unnecessary bsp_bo NULL check. +- nv50/ir: Initialize BuildUtil member tail. +- nv50/ir: Initialize CodeEmitterNV50 member progType. +- nv50/ir: Initialize GCRA::RIG_Node members. +- nvc0/ir: Initialize CodeEmitterGK110 member progType in constructor. +- nv50/ir: Add ConstantFolding constructor. +- travis: Download XQuartz from GitHub. +- v3dv: Fix assert. +- nvc0/ir: Initialize CodeEmitterNVC0 member progType in constructor. +- intel/vec4: Add missing break statement. +- nvc0/ir: Initialize Limits members in constructor. +- asahi: Fix macOS macro. +- st/xa: Mark default xa_get_pipe_format case unreachable. +- asahi: Move assignment after null check. + +Yevhenii Kolesnikov (3): + +- intel: fix leaking memory on shader creation +- glsl: Add operator for .length() method on implicitly-sized arrays +- glsl: Properly handle .length() of an unsized array + +Yiwei Zhang (79): + +- venus: update venus-protocol headers +- venus: implement dma_buf fd import and properties query +- venus: cap api version to 1.1 for Android +- venus: fix virtgpu_bo_init_dmabuf for classic resource +- venus: close the import memory fd on success +- venus: force a roundtrip after vn_renderer_bo_create_dmabuf +- venus: set bo->size to 0 for classic resource +- venus: update venus-protocol headers +- venus: implement VK_ANDROID_native_buffer v7 +- venus: use VK_EXT_image_drm_format_modifier +- venus: update venus-protocol headers +- venus: enable VK_EXT_queue_family_foreign +- venus: handle VK_IMAGE_LAYOUT_PRESENT_SRC_KHR transfer +- venus: handle wsi image queue ownership transfer for Android +- venus: query extended resource info from gralloc +- venus: populate VK_ERROR_OUT_OF_HOST_MEMORY if applied +- virgl: do not use winsys info 
for guest storage of classic resource +- venus: fix vkEnumeratePhysicalDeviceGroups +- venus: stop advertising KHR_driver_properties for Android +- venus: clean up vn_android api names +- venus: add AHB format and VkFormat conversion helper functions +- venus: add vn_android_get_ahb_usage helper function +- venus: add ahb image and buffer properties query support +- venus: vn_GetAndroidHardwareBufferPropertiesANDROID (part 1/2) +- venus: vn_GetAndroidHardwareBufferPropertiesANDROID (part 2/2) +- anv: fix AHB leak upon exportable allocation +- radv: fix AHB leak upon exportable allocation +- gallium/st: add a back buffer fallback for front rendering +- gallium/dri: implement EGL_KHR_mutable_render_buffer +- egl/android: check front rendering support for cros gralloc +- venus: tiny refactor of vn_android_get_gralloc_buffer_info +- venus: complete the format conversion between AHB and Vulkan +- venus: fix vn_GetAndroidHardwareBufferPropertiesANDROID +- venus: fix AHB image format properties query +- venus: prepare image creation helpers for AHB +- venus: implement image creation for ahb handle type +- venus: refactor device memory fd import +- venus: implement AHB allocation and import (part 1/2) +- venus: implement AHB allocation and import (part 2/2) +- venus: implement vn_GetMemoryAndroidHardwareBufferANDROID +- venus: support AHB external format for sampler YCbCr conversion +- venus: advertise VK_ANDROID_external_memory_android_hardware_buffer +- venus: rename dmabuf to dma_buf when it represents a type +- venus: fix misaligned bo_flags between import and query +- venus: refactor for property query of dma_buf fd +- venus: fix mismatched bo mmap_size for export and multiple imports +- venus: initial AHB support for multi-planar format +- venus: update to the latest venus protocol +- venus: support AHB prop query with host dma_buf size +- venus: refactor gralloc buffer and drm modifier properties query +- venus: unify VkNativeBufferANDROID and AHardwareBuffer image 
create info +- venus: forward the host renderer hardware info +- egl/android: fix cached buffer slots for EGL Android winsys +- egl/android: refactor to use the legit vndk/window.h header +- vulkan: fix back compat with Android Oreo and below +- egl/android: add aosp_nougat system/window.h back for back compat +- virgl: forward the host renderer hardware info +- anv: fix Android WSI VkFence +- venus: silence a build warning +- venus: refactor vn_AcquireImageANDROID with globalFencing +- venus: moves GPU rendering off CPU timeline for Android WSI +- venus: add debug info for experimental features during init +- radv: fix build errors after commit 8b7ff784 +- anv: fix build errors after commit 8b7ff78 +- venus: remove workarounds for multi-planar format interop +- anv: fix some log formats +- anv: support multi-planar format in add_all_surfaces_explicit_layout +- anv: enable multi-planar support for drm format modifier +- venus: properly support GPU_DATA_BUFFER for AHB +- venus: use the mesa "drm-uapi/drm_fourcc.h" header +- venus: remove unsupported AHB formats +- venus: resolve AHB external format with DRM format +- venus: add more logs for Android WSI debugging +- venus: prepare vn_CreateBuffer for AHB +- venus: handle ahb backed VkBuffer creation properly +- venus: fix AHB VkBuffer memory requirement +- egl/android: only apply front rendering usage in shared buffer mode +- egl/android: restore image creation fallback path used by virgl +- venus: cache ahb backed buffer memory type bits requirement + +Yogesh Mohanmarimuthu (4): + +- radv: set RADEON_FLAG_GTT_WC flag for prime memory +- glx: Keep display fd open for prime +- glx: create DRI screen for display GPU incase of prime +- loader: allocate VRAM in display GPU in case of prime + +Yurii Kolesnykov (1): + +- c_std=c11 in meson default_options + +Zhaofeng Li (1): + +- Add default driver selections for RISC-V + +Zhu Yuliang (1): + +- gallium/vl: don't leak fd in vl_dri3_screen_create + +Zoltán Böszörményi (2): 
+ +- crocus: Add pipe loader driver +- crocus: Make the driver loader use PCI IDs for crocus + +cheyang (1): + +- virgl: Fix the leak of hw_res used as fence + +luc (1): + +- panfrost: Only clear existing color buffers