KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Jason Ekstrand	278d12f991	intel/fs,vec4: Drop prog_data binding tables Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14056>	2021-12-10 21:20:47 +00:00
Jason Ekstrand	4fa58d27a5	intel/fs,vec4: Drop support for shader time Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14056>	2021-12-10 21:20:47 +00:00
Sagar Ghuge	f78e33aa1a	intel/compiler: Set correct return format for brw_SAMPLE on GFX8 onwards, we have only single bit to determine correct return format. v2: - Define macro and use it instead of hardcoded value. (Lionel) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>	2021-11-22 21:27:30 -08:00
Lionel Landwerlin	361b3fee3c	intel: move away from booleans to identify platforms v2: Drop changes around GFX_VERx10 == 75 (Luis) v3: Replace (GFX_VERx10 < 75 && devinfo->platform != INTEL_PLATFORM_BYT) by (devinfo->platform == INTEL_PLATFORM_IVB) Replace (devinfo->ver >= 5 \|\| devinfo->platform == INTEL_PLATFORM_G4X) by (devinfo->verx10 >= 45) Replace (devinfo->platform != INTEL_PLATFORM_G4X) by (devinfo->verx10 != 45) v4: Fix crocus typo v5: Rebase v6: Add GFX3, ILK & I965 platforms (Jordan) Move ifdef to code expressions (Jordan) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12981>	2021-11-08 16:48:06 +00:00
Lionel Landwerlin	4e4560ab6f	intel/compiler: add missing line returns to logs In the upcoming intel_clc tool, we're allowing to print these messages out and some of them just don't look right. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13165>	2021-10-05 07:31:52 +00:00
Ian Romanick	0f809dbf40	intel/compiler: Basic support for DP4A instruction v2: Very significant rebase on changes to previous commits. Specifically, brw_fs_nir.cpp changes were pretty much rewritten from scratch after changing the NIR opcode names and types. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>	2021-08-24 19:58:57 +00:00
Ian Romanick	043c5bf966	intel/compiler: Add id parameter to shader_debug_log callback There are two problems with the current architecture. In OpenGL, the id is supposed to be a unique identifier for a particular log source. This is done so that applications can (theoretically) filter particular log messages. The debug callback infrastructure in Mesa assigns a uniqe value when a value of 0 is passed in. This causes the id to get set once to a unique value for each message. By passing a stack variable that is initialized to 0 on every call, every time the same message is logged, it will have a different id. This isn't great, but it's also not catastrophic. When threaded shader compiles are used, the id pointer is saved and dereferenced at a possibly much later time on a possibly different thread. This causes one thread to access the stack from a different thread... and that stack frame might not be valid any more. :( This fixes shader-db crashes of various kinds on Iris with threaded shader compiles enabled. Fixes: `42c34e1ac8` ("iris: Enable threaded shader compilation") Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12136>	2021-08-01 23:58:08 +00:00
Sagar Ghuge	705285b9f4	intel/compiler: Add support for ternary add instruction on XeHP v2: - Re-arragne opcode in correct order (Matt Turner) - Move ADD3 case closer to LRP (Jason) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>	2021-07-16 15:59:56 +00:00
Francisco Jerez	4dc4284342	intel/fs: Implement Wa_14013745556 on TGL+. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>	2021-06-23 07:34:22 +00:00
Francisco Jerez	c19cfa9dc2	intel/fs: Fix synchronization of accumulator-clearing W/A move on TGL+. Right now the accumulator-clearing move emitted by the generator for Wa_14010017096 inherits the SWSB field from the previous instruction. This can lead to redundant synchronization, or possibly more serious issues if the previous instruction had a TGL_SBID_SET SWSB synchronization mode. Take the SWSB synchronization information from the IR. Fixes: `a27542c5dd` ("intel/compiler: Clear accumulator register before EOT") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>	2021-06-23 07:34:22 +00:00
Jason Ekstrand	705395344d	intel/fs: Add support for compiling bindless shaders with resume shaders Instead of depending on the driver to compile each resume shader separately, we compile them all in one go in the back-end and build an SBT as part of the shader program. Shader relocs are used to make the entries in the SBT point point to the correct resume shader. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Marcin Ślusarz	3340d5ee02	intel: simplify is_haswell checks, part 1 Generated with: files=`git grep is_haswell \| cut -d: -f1 \| sort \| uniq` for file in $files; do cat $file \| \ sed "s/devinfo->ver <= 7 && !devinfo->is_haswell/devinfo->verx10 <= 70/g" \| \ sed "s/devinfo->ver >= 8 \|\| devinfo->is_haswell/devinfo->verx10 >= 75/g" \| \ sed "s/devinfo->is_haswell \|\| devinfo->ver >= 8/devinfo->verx10 >= 75/g" \| \ sed "s/devinfo.is_haswell \|\| devinfo.ver >= 8/devinfo.verx10 >= 75/g" \| \ sed "s/devinfo->ver > 7 \|\| devinfo->is_haswell/devinfo->verx10 >= 75/g" \| \ sed "s/devinfo->ver == 7 && !devinfo->is_haswell/devinfo->verx10 == 70/g" \| \ sed "s/devinfo.ver == 7 && !devinfo.is_haswell/devinfo.verx10 == 70/g" \| \ sed "s/devinfo->ver < 8 && !devinfo->is_haswell/devinfo->verx10 <= 70/g" \| \ sed "s/device->info.ver == 7 && !device->info.is_haswell/device->info.verx10 == 70/g" \ > tmpXXX mv tmpXXX $file done Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10810>	2021-05-17 09:46:45 +00:00
Jason Ekstrand	34c560ae95	intel/fs: Stop using brw_dp_read/write_desc in Gen7+ only code Those helpers exist primarily to sort out some of the weirdness around Gen4-6 dataport access. On Gen5 and earlier, everything was called "dataport" and, instead of the SFID we have today there was a "target cache" parameter in the descriptor. There are also some bits that moved around on various gens depending on read vs. write. Starting with Gen6, most things which target one of the data cache SFIDs should use brw_dp_desc() instead. v2: Drop backward comment (Ken) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>	2021-05-02 20:20:06 +00:00
Lionel Landwerlin	6d4070f3dd	intel/compiler: add support for fragment coordinate with coarse pixels v2: Drop new internal opcodes (Jason) Simplify code (Jason) v3: Add Z computation for coarse pixels v4: Document things a little Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>	2021-05-02 20:20:06 +00:00
Lionel Landwerlin	b6332fc4a8	intel/compiler: handle coarse pixel in render target writes descriptors v2: Use the new inst->ex_desc field (Jason) v3: Drop CPS LoD compensation from sampler messages (Lionel) v4: Drop useless uses_rate_shading (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>	2021-05-02 20:20:06 +00:00
Anuj Phogat	61e8636557	intel: Rename gen_device prefix to intel_device export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "gen_device" -rIl $SEARCH_PATH \| xargs sed -ie "s/gen_device/intel_device/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>	2021-04-20 20:06:33 +00:00
Jordan Justen	515ee73b4e	intel/fs: End computer shader with message gateway on XeHP. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>	2021-04-16 08:27:35 +00:00
Francisco Jerez	262b647b25	intel/compiler: Lower integer division on XeHP. It has been removed from the hardware. [jordan.l.justen@intel.com: Move to brw_postprocess_nir] v2: Switch to nir_lower_idiv_precise (Rhys). v3: Fix for interface changes of nir_lower_idiv. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>	2021-04-16 08:27:35 +00:00
Francisco Jerez	05cce1f97d	intel/fs: Use CHV/BXT implementation of 64-bit MOV_INDIRECT on XeHP+. According to the hardware spec "Vx1 and VxH indirect addressing for Float, Half-Float, Double-Float and Quad-Word data must not be used." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>	2021-04-16 08:27:35 +00:00
Michel Dänzer	2928c21eb7	Convert most remaining free-form fall-through comments to FALLTHROUGH One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is explicitly disabled. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>	2021-04-15 16:01:22 +00:00
Anuj Phogat	f96c3b8b63	intel: Rename GEN:BUG:### to Wa_### Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "GEN:BUG:" -rIl $SEARCH_PATH \| xargs sed -ie "s/GEN$:BUG:$/Wa_/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	e7e55af4d6	intel: Rename GENx keyword to GFXx Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "GEN[[:digit:]]+" -rIl $SEARCH_PATH \| xargs sed -ie "s/GEN$[[:digit:]]\+$/GFX\1/g" Exclude the changes to modifiers: grep -E "I915_.GFX" -rIl $SEARCH_PATH \| xargs sed -ie "s/$I915_.$GFX/\1GEN/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	1d296484b4	intel: Rename Genx keyword to Gfxx Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "Gen[[:digit:]]+" -rIl $SEARCH_PATH \| xargs sed -ie "s/Gen$[[:digit:]]\+$/Gfx\1/g" Exclude changes in src/intel/perf/oa-.xml: find src/intel/perf -type f $ -name ".xml" $ \| xargs sed -ie "s/Gfx/Gen/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	b75f095bc7	intel: Rename genx keyword to gfxx in source files Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "gen[[:digit:]]+" -rIl $SEARCH_PATH \| xargs sed -ie "s/gen$[[:digit:]]\+$/gfx\1/g" Exclude pack.h and xml changes in this patch: grep -E "gfx[[:digit:]]+_pack\.h" -rIl $SEARCH_PATH \| xargs sed -ie "s/gfx$[[:digit:]]\+_pack\.h$/gen\1/g" grep -E "gfx[[:digit:]]+\.xml" -rIl $SEARCH_PATH \| xargs sed -ie "s/gfx$[[:digit:]]\+\.xml$/gen\1/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	c1f3a778de	intel: Rename GENx prefix in macros to GFXx in source files Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "GEN" -rIl src/intel/genxml \| grep -E ".py" \| xargs sed -ie "s/GEN$[%{]$/GFX\1/g" grep -E "[^_]GEN[[:digit:]]+" -rIl $SEARCH_PATH \| grep -E ".(\.c\|\.h\|\.y\|\.l)" \| xargs sed -ie "s/$[^_]$GEN$[[:digit:]]\+$/\1GFX\2/g" Leave out renaming GFX12_CCS_E macros. They fall under renaming pattern like "_GEN[[:digit:]]+": grep -E "GFX12_CCS_E" -rIl $SEARCH_PATH \| xargs sed -ie "s/GFX12_CCS_E/GEN12_CCS_E/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	abe9a71a09	intel: Rename gen field in gen_device_info struct to ver Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "info\)(.\|->)gen" -rIl $SEARCH_PATH \| xargs sed -ie "s/info$)$$\.\\|->$gen/info\1\2ver/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Ian Romanick	6c8e2e9317	intel/compiler: Enable the ability to emit CMPN instructions v2: Move checks to the EU validator. Suggested by Jason. Fixes: `2f2c00c727` ("i965: Lower min/max after optimization on Gen4/5.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9027>	2021-02-17 19:52:24 +00:00
Ian Romanick	b0d7434c71	intel/eu/validate: Add some checks for CMP and CMPN These checks were originally assertions elsewhere either in the existing code or later in this MR. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9027>	2021-02-17 19:52:24 +00:00
Jason Ekstrand	3ce6ca7214	intel/fs: Shuffle can't handle source modifiers On Gen7, we have to split shuffles into two MOVs for 64-bit types so we can't handle source modifiers. On Gen12.5, we have to use integer types all the time so we can't use them there either. Fixing that will be a different commit but it interacts with this one. Fixes: `90c9f29518` "i965/fs: Add support for nir_intrinsic_shuffle" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9068>	2021-02-17 03:59:25 +00:00
Jason Ekstrand	f3a43e36e0	intel/fs: Add an ex_desc field to fs_inst for SHADER_OPCODE_SEND I meant to do this years ago when I first added SHADER_OPCODE_SEND. At the time, the only use for the extended descriptor was bindless handles which were always one thing and never non-constant. However, it doesn't actually require any extra instructions because we have to OR in ex_mlen anyway. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8748>	2021-01-28 17:57:48 +00:00
Jason Ekstrand	c80db6611a	intel/fs: Support 64-bit CLUSTER_BROADCAST on Gen11+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>	2021-01-22 18:38:38 +00:00
Jason Ekstrand	b90921ec0c	intel/fs: Support 64-bit SHUFFLE on Gen11+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>	2021-01-22 18:38:38 +00:00
Jason Ekstrand	cdedc82329	intel/fs: Support 64-bit SEL_EXEC on Gen11+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>	2021-01-22 18:38:37 +00:00
Jason Ekstrand	f9d549b2bf	intel/fs: Use BRW_OPCODE_HALT for discards We're about to start using it to implement nir_jump_halt which has nothing inherently to do with fragment shaders or discards. May as well name it for the HW instruction it generates. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5071>	2020-12-01 16:19:08 -06:00
Jason Ekstrand	e76e359007	intel/fs: Rename PLACEHOLDER_HALT to HALT_TARGET It's a bit more explicit and will play more nicely with what we're about to do. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5071>	2020-12-01 16:18:50 -06:00
Jason Ekstrand	7280b0911d	intel/compiler: Add support for bindless shaders The Intel bindless thread dispatch model is very simple. When a compute shader is to be used for bindless dispatch, it can request a set of stack IDs. These are allocated per-dual-subslice by the hardware and recycled automatically when the stack ID is returned. Passed to the bindless dispatch are a global argument address, a stack ID, and an address of the BINDLESS_SHADER_RECORD to invoke. When the bindless shader is dispatched, it is passed its stack ID as well as the global and local argument pointers. The local argument pointer is the address of the BINDLESS_SHADER_RECORD plus some offset which is specified as part of the BINDLESS_SHADER_RECORD. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7356>	2020-11-25 05:37:09 +00:00
Ian Romanick	262ca98b3a	intel/compiler: Remove Gen10-specific code Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6899>	2020-10-15 09:29:53 -07:00
Jason Ekstrand	06ebf23283	intel/fs: Add a SCRATCH_HEADER opcode This opcode is responsible for setting up the buffer base address and per-thread scratch space fields of a scratch message header. For the most part, it's a copy of g0 but some messages need us to zero out g0.2 and the bottom bits of g0.5. This may actually fix a bug when nir_load/store_scratch is used. The docs say that the DWORD scattered messages respect the per-thread scratch size specified in gN.3[3:0] in the message header but we've been leaving it zero. This may mean that we've been ignoring any scratch reads/writes from a load/store_scratch intrinsic above the 1KB mark. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Jason Ekstrand	8427e56067	intel/fs: Don't use NoDDClk/NoDDClr for split SHUFFLEs When I copied and pasted the code from MOV_INDIRECT for handling the dependency controls, I missed a subtle difference between MOV_INDIRECT and SHUFFLE. Specifically, MOV_INDIRECT gets lowered to a narrow instruction on Gen7 by the SIMD width lowering whereas SHUFFLE has to split it in the generator. Therefore, the check safety check for whether or not we can use dependency control has to be based on the lowered width rather than the width of the original instruction. Fixes: `a8ac61b0ee` "intel/fs: NoMask initialize the address..." Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3593 Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6989>	2020-10-02 19:53:56 +00:00
Jason Ekstrand	a8ac61b0ee	intel/fs: NoMask initialize the address register for shuffles Cc: mesa-stable@lists.freedesktop.org Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2979 Tested-by: Iván Briano <ivan.briano@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6825>	2020-10-02 00:42:56 +00:00
Marcin Ślusarz	5ea0b6a9c6	intel/compiler: initialize remaining fields of various classes These variables seem to be initialized before being used, so this patch is not fixing any bug, but leaving them unitialized may become a bug after some refactoring. These classes were affected: fs_reg_alloc, fs_visitor, fs_generator, instruction_scheduler. Found by Coverity. Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6667>	2020-09-10 12:16:58 +00:00
Marcin Ślusarz	663c4d5377	intel/fs: add hint how to get more info when shader validation fails Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6559>	2020-09-04 12:09:22 +00:00
Jason Ekstrand	91becd84ae	intel/fs: Add support for a new load_reloc_const intrinsic Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>	2020-09-02 19:48:44 +00:00
Jason Ekstrand	90b6745bc8	intel/fs,vec4: Stuff the constant data from NIR in the end of the program Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>	2020-09-02 19:48:44 +00:00
Jason Ekstrand	91348d125d	intel/eu: Add some new helpers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>	2020-09-02 19:48:44 +00:00
Danylo Piliaiev	bc4a127d6e	intel/disasm: Label support in shader disassembly for UIP/JIP Shader instructions which use UIP/JIP now get formatted with a label in addition with immediate value, labels have "LABEL%d" format. v2: - Consider brw_jump_scale when calculating label's offset From: "Lonnberg, Toni" <toni.lonnberg@intel.com> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4245>	2020-09-02 10:33:29 +00:00
Jason Ekstrand	cccb497d3c	intel/fs: Fix MOV_INDIRECT and BROADCAST of Q types on Gen11+ The immediate case is pretty uncommon to see but it can happen, in theory. BROADCAST is typically used to uniformize values and those are usually 32-bit. However, it does come up in some subgroup ops. Fixes: `49c21802cb` "intel/compiler: Split has_64bit_types into float/int" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6211>	2020-09-01 13:25:20 -05:00
Jason Ekstrand	c48f42e178	intel/fs: Emit HALT for discard on Gen4-5 Using HALT to immediately jump to the end of the shader is required to implement GL_EXT_gpu_shader4 and OpenGL 3.0. However, vanilla OpenGL 1.2 doesn't forbid it and it likely makes something somewhere faster. We should be consistent and implement the same discard behavior on all hardware if we can. The rules for HALT on Gen4-5 are a bit different from Gen6+. On the older hardware, there is no stack for HALT; instead it's up to software to save and restore mask registers. However, there's no real saving needed since we only use HALT to jump to the end of the program where we're about about to do our FB writes. All we need to do is reset AMask to DMask, the value it was initialized to at the start of the thread. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5244>	2020-05-30 06:21:15 +00:00
Caio Marcelo de Oliveira Filho	f858fa26b4	intel/fs,vec4: Pull stall logic for memory fences up into the IR Instead of emitting the stall MOV "inside" the SHADER_OPCODE_MEMORY_FENCE generation, use the scheduling fences when creating the IR. For IvyBridge, every (data cache) fence is accompained by a render cache fence, that now is explicit in the IR, two SHADER_OPCODE_MEMORY_FENCEs are emitted (with different SFIDs). Because Begin and End interlock intrinsics are effectively memory barriers, move its handling alongside the other memory barrier intrinsics. The SHADER_OPCODE_INTERLOCK is still used to distinguish if we are going to use a SENDC (for Begin) or regular SEND (for End). This change is a preparation to allow emitting both SENDs in Gen11+ before we can stall on them. Shader-db results for IVB (i965): total instructions in shared programs: 11971190 -> 11971200 (<.01%) instructions in affected programs: 11482 -> 11492 (0.09%) helped: 0 HURT: 8 HURT stats (abs) min: 1 max: 3 x̄: 1.25 x̃: 1 HURT stats (rel) min: 0.03% max: 0.50% x̄: 0.14% x̃: 0.10% 95% mean confidence interval for instructions value: 0.66 1.84 95% mean confidence interval for instructions %-change: 0.01% 0.27% Instructions are HURT. Unlike the previous code, that used the `mov g1 g2` trick to force both `g1` and `g2` to stall, the scheduling fence will generate `mov null g1` and `mov null g2`. During review it was decided it was not worth keeping the special codepath for the small effect will have. Shader-db results for HSW (i965), BDW and SKL don't have a change on instruction count, but do report changes in cycles count, showing SKL results below total cycles in shared programs: 341738444 -> 341710570 (<.01%) cycles in affected programs: 7240002 -> 7212128 (-0.38%) helped: 46 HURT: 5 helped stats (abs) min: 14 max: 1940 x̄: 676.22 x̃: 154 helped stats (rel) min: <.01% max: 2.62% x̄: 1.28% x̃: 0.95% HURT stats (abs) min: 2 max: 1768 x̄: 646.40 x̃: 362 HURT stats (rel) min: <.01% max: 0.83% x̄: 0.28% x̃: 0.08% 95% mean confidence interval for cycles value: -777.71 -315.38 95% mean confidence interval for cycles %-change: -1.42% -0.83% Cycles are helped. This seems to be the effect of allocating two registers separatedly instead of a single one with size 2, which causes different register allocation, affecting the cycle estimates. while ICL also has not change on instruction count but report changes negative changes in cycles total cycles in shared programs: 352665369 -> 352707484 (0.01%) cycles in affected programs: 9608288 -> 9650403 (0.44%) helped: 4 HURT: 104 helped stats (abs) min: 24 max: 128 x̄: 88.50 x̃: 101 helped stats (rel) min: <.01% max: 0.85% x̄: 0.46% x̃: 0.49% HURT stats (abs) min: 2 max: 2016 x̄: 408.36 x̃: 48 HURT stats (rel) min: <.01% max: 3.31% x̄: 0.88% x̃: 0.45% 95% mean confidence interval for cycles value: 256.67 523.24 95% mean confidence interval for cycles %-change: 0.63% 1.03% Cycles are HURT. AFAICT this is the result of the case above. Shader-db results for TGL have similar cycles result as ICL, but also affect instructions total instructions in shared programs: 17690586 -> 17690597 (<.01%) instructions in affected programs: 64617 -> 64628 (0.02%) helped: 55 HURT: 32 helped stats (abs) min: 1 max: 16 x̄: 4.13 x̃: 3 helped stats (rel) min: 0.05% max: 2.78% x̄: 0.86% x̃: 0.74% HURT stats (abs) min: 1 max: 65 x̄: 7.44 x̃: 2 HURT stats (rel) min: 0.05% max: 4.58% x̄: 1.13% x̃: 0.69% 95% mean confidence interval for instructions value: -2.03 2.28 95% mean confidence interval for instructions %-change: -0.41% 0.15% Inconclusive result (value mean confidence interval includes 0). Now that more is done in the IR, more dependencies are visible and more SWSB annotations are emitted. Mixed with different register allocation decisions like above, some shaders will see more `sync nops` while others able to avoid them. Most of the new `sync nops` are also redundant and could be dropped, which will be fixed in a separate change. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3278>	2020-04-29 07:17:27 +00:00
Caio Marcelo de Oliveira Filho	0e96b0d6dd	intel/fs: Allow FS_OPCODE_SCHEDULING_FENCE stall on registers It will generate the MOVs (or SYNC_NOP in Gen12+) needed for stall. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3278>	2020-04-29 07:17:27 +00:00

1 2 3 4

179 Commits