KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Roland Scheidegger	740a1618c3	gallium: add new LOD opcode The operation performed is all the same as LODQ, but with the usual differences between dx10 and GL texture opcodes, that is separate resource and sampler indices (plus result swizzling, and setting z/w channels to zero). Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-30 02:58:09 +02:00
Nicolai Hähnle	cad959d901	gallium: add LDEXP TGSI instruction and corresponding cap Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 12:08:01 +02:00
Nicolai Hähnle	3c78215a1c	tgsi: clarify the semantics of DFRACEXP The status quo is quite the mess: 1. tgsi_exec will do a per-channel computation, and store the dst[0] result (significand) correctly for each channel. The dst[1] result (exponent) will be written to the first bit set in the writemask. So per-component calculation only works partially. 2. r600 will only do a single computation. It will replicate the exponent but not the significand. 3. The docs pretend that there's per-component calculation, but even get dst[0] and dst[1] confused. 4. Luckily, st_glsl_to_tgsi only ever emits single-component instructions, and kind-of assumes that everything is replicated, generating this for the dvec4 case: DFRACEXP TEMP[0].xy, TEMP[1].x, CONST[0][0].xyxy DFRACEXP TEMP[0].zw, TEMP[1].y, CONST[0][0].zwzw DFRACEXP TEMP[2].xy, TEMP[1].z, CONST[0][1].xyxy DFRACEXP TEMP[2].zw, TEMP[1].w, CONST[0][1].zwzw Settle on the simplest behavior, which is single-component calculation with replication, document it, and adjust tgsi_exec and r600. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 12:07:50 +02:00
Nicolai Hähnle	dbe7fc00d5	tgsi: fix the documentation of DLDEXP Sourcing the exponent for the zw destination pair from Z is consistent with both tgsi_exec and gallivm. In practice, st_glsl_to_tgsi always generates per-channel instructions anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 12:07:46 +02:00
Gwan-gyeong Mun	c261bc11e6	gallium/docs: Fix an inequality sign of TGSI_SEMANTIC_SUBGROUP_LT_MASK A previous expression presents same as TGSI_SEMANTIC_SUBGROUP_GT_MASK. It fixes a direction of an inequality for TGSI_SEMANTIC_SUBGROUP_LT_MASK. before: bit index > TGSI_SEMANTIC_SUBGROUP_INVOCATION after: bit index < TGSI_SEMANTIC_SUBGROUP_INVOCATION Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-28 12:05:44 +02:00
Gwan-gyeong Mun	9649c6acce	gallium/docs: Fix the math formula of U2I64 before: dst.xy = (uint64_t) src0.x dst.zw = (uint64_t) src0.y after: dst.xy = (int64_t) src0.x dst.zw = (int64_t) src0.y Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-23 14:09:49 +02:00
Gwan-gyeong Mun	9aabf80ef3	gallium/docs: Add missing word "Not" Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-23 14:09:22 +02:00
Marek Olšák	497506ad93	gallium: remove TGSI opcode SCS use COS+SIN instead. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2017-08-22 16:42:17 +02:00
Marek Olšák	cdaaf66566	gallium: remove TGSI opcode BREAKC Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-22 13:33:48 +02:00
Marek Olšák	985e6b5ef9	gallium: remove TGSI opcode XPD use MUL+MAD+MOV instead. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-08-22 13:29:47 +02:00
Marek Olšák	3e2ff8fade	gallium: remove TGSI opcode DPH use DP4 or DP3 + ADD. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-08-22 13:29:47 +02:00
Marek Olšák	86e6f7a73b	gallium: remove TGSI opcode DP2A use DP3 instead. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-08-22 13:29:47 +02:00
Marek Olšák	0bb367830a	gallium: remove TGSI_OPCODE_CALLNZ Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-08-22 13:29:47 +02:00
Marek Olšák	068c3ad2cb	gallium: remove TGSI FENCE opcodes use MEMBAR instead Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-08-22 13:29:47 +02:00
Marek Olšák	44716655e6	gallium: remove TGSI opcodes PUSHA, POPA, SAD, TXQ_LZ Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-08-22 13:29:47 +02:00
Brian Paul	426673e271	gallium/docs: add more info about TXF and MSAA textures If the texture is multisampled, the coord.w component indicates which sample to fetch. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-08-03 14:13:57 -06:00
Brian Paul	722ba1ad19	gallium/docs: document automatic per-sample FS execution Both the GLSL 4.00 specs and DX10.1 specs specify that if a fragment shader uses the sample ID or sample position inputs, the shader is automatically run at per sample frequency. Document that expectation for gallium fragment shaders. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-08-03 14:13:57 -06:00
Karol Herbst	c5cbb9a543	gallium/docs: add precise instruction modifier v4: add comment about intermediate rounding step to MAD Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-07-21 23:45:18 -04:00
Brian Paul	e54fe78e0e	gallium/docs: document that TXF is used with PIPE_BUFFER resources Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-06-30 13:37:10 -06:00
Brian Paul	1c33dc77f7	gallium/docs: improve docs for SAMPLE_POS, SAMPLE_INFO, TXQS, MSAA semantics For the SAMPLE_POS and SAMPLE_INFO opcodes, clarify resource vs. render target queries, range of position values, swizzling, etc. We basically follow the DX10.1 conventions. For the TXQS opcode and TGSI_SEMANTIC_SAMPLEID, clarify return value and type. For the TGSI_SEMANTIC_SAMPLEPOS system value, clarify the range of positions returned. v2: use 'undef' for unused vector components. Use (0.5, 0.5, undef, undef) for sample pos when MSAA not applicable. v3: Add note that OPCODE_SAMPLE_INFO, OPCODE_SAMPLE_POS are not used yet and the information is subject to change. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-06-16 14:07:31 -06:00
Brian Paul	def8d1d23f	gallium/docs: clarify TGSI_SEMANTIC_SAMPLEMASK, again I've since discovered the fragment shader sample mask system value (which corresponds to gl_SampleMaskIn). v2: It's a system value, not a shader input. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-13 08:02:43 -06:00
Brian Paul	81e15a5dea	tgsi: clarify TGSI_SEMANTIC_SAMPLEMASK documentation Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-09 08:51:56 -06:00
Lyude	af788a82d5	gallium: Add TGSI shader token for ARB_post_depth_coverage Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-06-02 23:19:22 -04:00
Nicolai Hähnle	f3d2cf6c1f	tgsi: clarify TGSI_SEMANTIC_{LAYER,VIEWPORT_INDEX} Depending on pipe caps they can be writable in all vertex processing stages, but only the output of the last stage counts. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-14 22:50:06 +02:00
Rob Clark	16d493f1e7	gallium/docs: small correction about register files for atomics These can operate on MEMORY[], in addition to BUFFER[] and IMAGE[] Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-14 12:46:12 -04:00
Ilia Mirkin	5dd490f134	gallium: fix some math formulas to display better Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-07 20:20:17 -04:00
Ilia Mirkin	08bd0aa507	tgsi: add SUBGROUP_* semantics v2: add documentation (Nicolai) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:29:41 +02:00
Ilia Mirkin	3650d7455f	tgsi: add BALLOT/READ_* opcodes v2 (Nicolai): - BALLOT isn't per-channel - expand the documentation (also for VOTE_) v3: - only BALLOT returns a 64-bit lanemask (Boyan) - relax the requirement on READ_INVOC: the invocation number to read from must be uniform within a sub-group. This matches the GL_ARB_shader_ballot spec (and the v_readlane instruction of AMD GCN) v4: - hopefully really fix the doc of VOTE_ returns (Ilia) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)	2017-04-05 15:29:34 +02:00
Ilia Mirkin	94ec847cb0	tgsi: add CLOCK opcode Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-31 07:56:26 +02:00
Francisco Jerez	e6469ec43b	gallium/tgsi: Treat UCMP sources as floats to match the GLSL-to-TGSI pass expectations. Currently the GLSL-to-TGSI translation pass assumes it can use floating point source modifiers on the UCMP instruction. See the bug report linked below for an example where an unrelated change in the GLSL built-in lowering code for atan2 (`e9ffd12827`) caused the generation of floating-point ir_unop_neg instructions followed by ir_triop_csel, which is translated into UCMP with a negate modifier on back-ends with native integer support. Allowing floating-point source modifiers on an integer instruction seems like rather dubious design for a transport IR, since the same semantics could be represented as a sequence of MOV+UCMP instructions instead, but supposedly this matches the expectations of TGSI back-ends other than tgsi_exec, and the expectations of the DX10 API. I take no responsibility for future headaches caused by this inconsistency. Fixes a regression of piglit glsl-fs-tan-1 on softpipe introduced by the above-mentioned glsl front-end commit. Even though the commit that triggered the regression doesn't seem to have made it to any stable branches yet, this might be worth back-porting since I don't see any reason why the bug couldn't have been reproduced before that point. Suggested-by: Roland Scheidegger <sroland@vmware.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99817 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-03-15 15:47:14 -07:00
Marek Olšák	cca0389c72	gallium: add TGSI opcodes TEX_LZ and TXF_LZ for better code generation in radeonsi	2017-03-15 18:17:41 +01:00
Eric Engestrom	d88a0dffe3	gallium/docs: fix section title formatting src/gallium/docs/source/tgsi.rst:3488: WARNING: Title underline too short. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-22 00:01:01 +00:00
Eric Engestrom	5aa7fa2bbf	gallium/docs: add missing newlines Without these, mathjax considers these as the continuation of the previous line. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-22 00:00:57 +00:00
Eric Engestrom	3ae77c912e	gallium/docs: add missing math formatting Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-22 00:00:51 +00:00
Marek Olšák	ad019bf5c6	gallium: remove TGSI_OPCODE_CLAMP Not used and not widely supported. Use MIN+MAX instead. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 02:58:43 +01:00
Marek Olšák	b5b0936677	gallium/docs: remove documentation of non-existent instructions trivial	2017-02-18 01:22:08 +01:00
Ilia Mirkin	a2b2cd81d1	gallium: add TGSI_PROPERTY_MUL_ZERO_WINS This will be useful for proper D3D9 emulation, where this behavior is expected by some shaders. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2017-01-23 20:35:55 -05:00
Ilia Mirkin	1393999541	gallium: add FBFETCH opcode to retrieve the current sample value Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 21:13:08 -05:00
Nicolai Hähnle	6be4a40430	tgsi: add DDIV instruction Double-precision division, to allow more precision than a DRCP + DMUL sequence. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-16 20:17:22 +01:00
Nicolai Hähnle	6526977306	tgsi: align the definition of BFI & [UI]BFE with GLSL As previously written, these opcodes use the SM5 semantics which is incompatible with GLSL when bits == 0, offset == 32. At some point we may want to add BFI_SM5 etc. opcodes, but all users currently either want (and expect!) the GLSL semantics or don't care. Bitfield inserts are generated by the GLSL lower_instructions and lower_packing_builtins passes with constant bits and offset arguments, so any workaround code that drivers may have to emit to follow GLSL semantics should be optimized away easily for those uses. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-02 12:30:07 +01:00
Dave Airlie	6e1a34d545	gallium: add opcode and types for 64-bit integers. (v3) This just adds the basic support for 64-bit opcodes, and the new types. v2: add conversion opcodes. add documentation. v3: - make docs more consistent - change TGSI_OPCODE_I2U64 to TGSI_OPCODE_U2I64 Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2) Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-21 10:23:05 +02:00
Samuel Pitoiset	3f3640c86c	tgsi: document semantics for compute shaders Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-12 22:15:10 +02:00
Hans de Goede	d386cef246	tgsi: Add WORK_DIM System Value Add a new WORK_DIM SV type, this will return the grid dimensions (1-4) for compute (opencl) kernels. This is necessary to implement the opencl get_work_dim() function. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-02 12:21:28 +02:00
Ilia Mirkin	30684b50d7	gallium: add VOTE_* opcodes to implement GL_ARB_shader_group_vote Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-06-06 20:49:28 -04:00
Dave Airlie	e6d9389366	tgsi: remove culldist semantic. This isn't used anymore in the tree, culldist's are part of the clipdist semantic, we could in theory rename it, but I'm not sure there is much point, and I'd have to be careful with virgl. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 11:03:44 +10:00
Hans de Goede	b5e7907f30	nouveau: codegen: LOAD: Take src swizzle into account The llvm TGSI backend uses pointers in registers and does things like: LOAD TEMP[0].y, MEMORY[0], TEMP[0] Expecting the data at address TEMP[0].x to get loaded to TEMP[0].y. But this will cause the data at TEMP[0].x + 4 to be loaded instead. This commit adds support for a swizzle suffix for the 1st source operand, which allows using: LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0] And actually getting the desired behavior Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-27 16:11:48 +02:00
Oded Gabbay	d97f5d60f5	tgsi/doc: fix spelling error Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-11 11:43:43 +03:00
Bas Nieuwenhuizen	01f993a21f	gallium: add threads per block TGSI property The value 0 for unknown has been chosen so that drivers using tgsi_scan_shader do not need to detect missing properties if they zero-initialize the struct. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-02 01:50:59 +02:00
Brian Paul	6775268b61	gallium/docs: s/gven/given/	2016-03-29 18:13:46 -06:00
Marek Olšák	fbe6e92899	gallium: add TGSI property NEXT_SHADER Radeonsi needs to know which shader stage will execute after a shader in order to make the best decision about which shader variant to compile first. This is only set for VS and TES, because we don't need it elsewhere. VS has 3 variants: - next shader is FS - next shader is GS - next shader is TCS TES has 2 variants: - next shader is FS - next shader is GS Currently, radeonsi always assumes the next shader is FS, which is suboptimal, since st/mesa always knows which shader is next if the GLSL program is not a "separate shader". By default, ureg always sets "next shader is FS". Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-19 23:20:01 +01:00

192 Commits