cull distance is analogous to clip distance. If a register is
given this semantic, then the values in it are assumed to be a
float32 distance to a plane. Primitives will be completely
discarded if the plane distances for all of the vertices in
the primitive are < 0.
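
As a rough illustration of the rule above (not the draw module's actual
code), the test for one cull plane can be sketched as follows. The
function name is_culled is made up for this example; it takes one
float distance per vertex of the primitive.

```python
def is_culled(cull_distances):
    """Return True if the primitive should be discarded.

    cull_distances holds one float per vertex: the value written to
    the register that carries the cull distance semantic.
    (Hypothetical helper, for illustration only.)
    """
    # Discard only when every vertex lies on the negative side.
    return all(d < 0.0 for d in cull_distances)


triangle_outside = [-1.0, -0.5, -2.0]   # all vertices behind the plane
triangle_straddling = [-1.0, 0.25, -2.0]  # one vertex in front

print(is_culled(triangle_outside))     # discarded
print(is_culled(triangle_straddling))  # kept whole, unlike clip distance
```

Unlike clip distance, a primitive that straddles the plane is left
untouched rather than being clipped against it.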
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Since pipe_surface already has all the necessary fields no interface
changes are necessary except adding a new shader semantic value
(TGSI_SEMANTIC_LAYER).
(Note that what GL calls the "gl_Layer" variable, d3d10 names
"RENDER_TARGET_ARRAY_INDEX".)
v2: drop cap bit (just tied to geometry shader), add docs.
Adds the remaining integer opcodes, and some opcodes are moved to more
appropriate places, along with getting rid of the (already nearly empty)
ps_2_x section. The CAP bits for some of these are still a bit up in
the air, though, so the documentation isn't quite as watertight as is desirable.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
A lot of them were missing. Others were moved from the Compute ISA
to a new Integer ISA section as that seemed more appropriate.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
It's valid because we reuse certain arithmetic operations
for both signed and unsigned types (e.g. uadd, umad, which
are somewhat unfortunately named).
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Squashed commit of the following:
commit 04c5fa2cbb8e89d6f2fa5a75af1cca03b1f6b852
Author: José Fonseca <jfonseca@vmware.com>
Date: Tue Apr 23 17:37:18 2013 +0100
gallium: s/lower_left_origin/bottom_edge_rule/
commit 4dff4f64fa83b9737def136fffd161d55e4f1722
Author: José Fonseca <jfonseca@vmware.com>
Date: Tue Apr 23 17:35:04 2013 +0100
gallium: Move diagram to docs.
commit 442a63012c8c3c3797f45e03f2ca20ad5f399832
Author: James Benton <jbenton@vmware.com>
Date: Fri May 11 17:50:55 2012 +0100
gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center.
This change is necessary to achieve correct results when using OpenGL
FBOs.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
TGSI_OPCODE_IF condition had two possible interpretations:
- src.x != 0.0f
  - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was false for either
    vertex or fragment shaders
  - gallivm/llvmpipe
  - postprocess
  - vl state tracker
  - vega state tracker
  - most old drivers
  - old internal state trackers
  - many graw examples
- src.x != 0U
  - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was true for both
    vertex and fragment shaders
  - tgsi_exec/softpipe
  - r600
  - radeonsi
  - nv50
Drivers that use the draw module were also a mess (because Mesa would
emit float IFs, but the draw module supports native integers, so it would
interpret the IF argument as an integer...)
This sort of works if the source argument is limited to float +0.0f or
+1.0f, or integer 0, but would fail if the source is float -0.0f, or an
integer in the float NaN range. It could also fail if the source is integer 1,
and the hardware flushes denormalized numbers to zero.
But with this change there are now two opcodes, IF and UIF, with clear
meaning.
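
The two tests can be compared on the CPU by reinterpreting the same
32 bits both ways. This is only a sketch: if_taken, uif_taken and the
flush_to_zero switch are names invented for this example, and the
flush is a software model of hardware that flushes denormals.

```python
import struct


def float_bits(value):
    """Return the IEEE-754 float32 bit pattern of value as an integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_as_float(bits):
    """Reinterpret a 32-bit pattern as a float32."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def flush_denormal(value):
    """Model flush-to-zero: anything below the smallest normal becomes 0."""
    if value != 0.0 and abs(value) < 2.0 ** -126:
        return 0.0
    return value


def if_taken(bits, flush_to_zero=False):
    """IF semantics: src.x != 0.0f"""
    value = bits_as_float(bits)
    if flush_to_zero:
        value = flush_denormal(value)
    return value != 0.0


def uif_taken(bits):
    """UIF semantics: src.x != 0U"""
    return bits != 0


# float -0.0f: false as a float, true as an integer.
print(if_taken(float_bits(-0.0)), uif_taken(float_bits(-0.0)))
# integer 1 is a float denormal: the float test depends on flushing.
print(if_taken(1), if_taken(1, flush_to_zero=True), uif_taken(1))
```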
Drivers that do not support native integers do not need to worry about
UIF. However, for backwards compatibility with old state trackers and
examples, it is advisable that native integer capable drivers also
support the float IF opcode.
I tried to implement this for r600 and radeonsi based on the surrounding
code. I couldn't do this for nouveau, so I just shunted IF/UIF
together, which matches the current behavior.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
v2:
- Incorporate Roland's feedback.
- Fix r600_shader.c merge conflict.
- Fix typo in radeon, spotted by Michel Dänzer.
- Incorporate Christoph Bumiller's patch to handle TGSI_OPCODE_IF(float)
properly in nv50/ir.
This makes it possible to identify gl_TexCoord and gl_PointCoord
for drivers where sprite coordinate replacement is restricted.
The new PIPE_CAP_TGSI_TEXCOORD decides whether these varyings
should be hidden behind the GENERIC semantic or not.
With this patch only nvc0 and nv30 will request that they be used.
v2: introduce a CAP so other drivers don't have to bother with
the new semantic
v3: adapt to introduction of gl_varying_slot enum
Need to take the type into account. Also, if we want to allow
mov's with modifiers we need to pick a type (assume float).
v2: don't allow all modifiers on all types, in particular don't allow
absolute on non-float types and don't allow negate on unsigned.
Also treat UADD as signed (despite the name) since it is used
for handling both signed and unsigned integer arguments and otherwise
modifiers don't work.
Also add tgsi docs clarifying this.
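
The rule can be summarised with a small table. This is a sketch, not
the tgsi code: the type names, OPCODE_TYPE and modifier_allowed are
made up for illustration, and allowing negate on signed integers is an
assumption inferred from the restrictions listed above.

```python
FLOAT, SIGNED, UNSIGNED = "float", "int", "uint"

# Which source modifiers each operand type accepts.
ALLOWED_MODIFIERS = {
    FLOAT: {"abs", "neg"},
    SIGNED: {"neg"},      # assumption: only absolute is ruled out
    UNSIGNED: set(),      # neither absolute nor negate
}

# Operand type assumed for the two opcodes named in this message.
OPCODE_TYPE = {
    "MOV": FLOAT,    # untyped move: pick float
    "UADD": SIGNED,  # treated as signed despite the name
}


def modifier_allowed(opcode, modifier):
    """Return True if modifier may be applied to a source of opcode."""
    return modifier in ALLOWED_MODIFIERS[OPCODE_TYPE[opcode]]


print(modifier_allowed("MOV", "abs"))   # float: allowed
print(modifier_allowed("UADD", "neg"))  # signed: allowed
print(modifier_allowed("UADD", "abs"))  # non-float: rejected
```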
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
It looks like using coord.w as explicit lod value is a mistake, most likely
because some dx10 docs had it specified that way. Seems this was changed though:
http://msdn.microsoft.com/en-us/library/windows/desktop/hh447229%28v=vs.85%29.aspx
- let's just hope it doesn't depend on runtime build version or something.
Not only would this need translation (which goes against the stated goal that
these opcodes should be close to dx10 semantics) but it would prevent usage of
this opcode with cube arrays, which is apparently possible:
http://msdn.microsoft.com/en-us/library/windows/desktop/bb509699%28v=vs.85%29.aspx
(Note not only does this show cube arrays using explicit lod, but also the
confusion with this opcode: it lists an explicit lod parameter value, but then
states last component of location is used as lod).
(For "true" hw drivers, only nv50 had code to handle it, and it appears the
code was already right for the new semantics, though fix up the seemingly
wrong c/d arguments while there.)
v2: fix comment, separate out other changes.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Need to calculate the number of mip levels (if it were worthwhile, it could
be stored in dynamic state).
While here, the query code also used chan 2 for the lod value.
This worked with mesa state tracker but it seems safer to use chan 0.
Still passes piglit textureSize (with some handwaving), though the non-GL
parts are (largely) untested.
v2: clarify and expect the sviewinfo opcode to return ints, not floats,
just like the OpenGL textureSize (dx10 supports dst modifiers with resinfo).
Also simplify some code.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
The glsl-to-tgsi translator will emit SQRT to implement GLSL's sqrt()
and distance() functions if the PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED
query says it's supported by the driver.
Otherwise, sqrt(x) is implemented with x*rsq(x). The problem with
this is sqrt(0) must be handled specially because rsq(0) might be
Inf/NaN/undefined (and then 0*rsq(0) is Inf/NaN/undefined). In the
glsl-to-tgsi code we use an extra CMP to check if x is zero and then
replace the result of x*rsq(x) with zero.
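
The zero case is easy to reproduce with ordinary floats. The sketch
below models an rsq() that returns +Inf for zero, which is only one of
the possible results mentioned above, and uses a plain conditional in
place of the CMP instruction. All function names are illustrative.

```python
import math


def rsq(x):
    """Reciprocal square root, modelled as returning +Inf for zero."""
    if x == 0.0:
        return math.inf
    return 1.0 / math.sqrt(x)


def sqrt_naive(x):
    """sqrt(x) as x * rsq(x), without special handling of zero."""
    return x * rsq(x)


def sqrt_guarded(x):
    """Same, with the extra select that replaces the result for x == 0."""
    result = x * rsq(x)
    return 0.0 if x == 0.0 else result


print(sqrt_naive(0.0))    # 0 * Inf is NaN
print(sqrt_guarded(0.0))  # the select restores 0
print(sqrt_guarded(9.0))
```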
In the end, this makes sqrt() generate much more reasonable code for
drivers that can do square roots.
Note that many of piglit's generated shader tests use the GLSL
distance() function.
This change will be useful to implement function parameter passing on
top of TGSI. As we don't have a proper stack, a register-based
calling convention will be used instead, which isn't necessarily a bad
thing given that GPUs often have plenty of registers to spare.
Using the same register space for local temporaries and
inter-procedural communication caused some inefficiencies, because in
some cases the register allocator would lose the freedom to merge
temporary values together into the same physical register, leading to
suboptimal register (and sometimes, as a side effect, instruction)
usage.
The LOCAL declaration modifier specifies that the value isn't intended
for parameter passing and as a result the compiler doesn't have to
give any guarantees of it being preserved across function boundaries.
Ignoring the LOCAL flag doesn't change the semantics of a valid
program in any way, because local variables are just supposed to get a
more relaxed treatment. IOW, this should be a backwards-compatible
change.
Normal resource access (e.g. the LOAD TGSI opcode) is supposed to
perform a series of conversions to turn the texture data as it's found
in memory into the target data type.
In compute programs it's often the case that we only want to access
the raw bits as they're stored in some buffer object, and any kind of
channel conversion and scaling is harmful or inefficient, especially
in implementations that lack proper hardware support to take care of
it -- in those cases the conversion has to be implemented in software
and it's likely to result in a performance hit even if the pipe_buffer
and declaration data types are set up in a way that would just pass
the data through.
Add a declaration flag that marks a resource as typeless. No channel
conversion will be performed in that case, and the X coordinate of the
address vector will be interpreted in byte units instead of elements
for obvious reasons.
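
A small sketch of the difference, using a Python bytes object as the
backing store. The RGBA8 UNORM format and both function names are
assumptions picked for the example; they are not part of this change.

```python
import struct


def load_typed_rgba8_unorm(memory, x):
    """Typed access: x counts elements, channels are converted to float."""
    texel = memory[4 * x:4 * x + 4]
    return tuple(channel / 255.0 for channel in texel)


def load_raw(memory, x):
    """Raw access: x is a byte offset, the 32 bits come back unconverted."""
    return struct.unpack_from("<I", memory, x)[0]


memory = bytes([0, 255, 0, 255, 0x78, 0x56, 0x34, 0x12])

print(load_typed_rgba8_unorm(memory, 0))  # element 0, scaled to [0, 1]
print(hex(load_raw(memory, 4)))           # byte offset 4, raw dword
```

Note that the second texel is element 1 for the typed load but byte
offset 4 for the raw load.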
This is similar to D3D11's ByteAddressBuffer, and will be used to
implement OpenCL's constant arguments. The remaining four compute
memory spaces can also be understood as raw resources.
Move Interpolate, Centroid and CylindricalWrap from tgsi_declaration
to a separate token -- they only make sense for FS inputs and we need
room for other flags in the top-level declaration token.
This commit splits the current concept of resource into "sampler
views" and "shader resources":
"Sampler views" are textures or buffers that are bound to a given
shader stage and can be read from in conjunction with a sampler
object. They are analogous to OpenGL texture objects or Direct3D
SRVs.
"Shader resources" are textures or buffers that can be read and
written from a shader. There's no support for floating point
coordinates, address wrap modes or filtering, and, unlike sampler
views, shader resources are global for the whole graphics pipeline.
They are analogous to OpenGL image objects (as in
ARB_shader_image_load_store) or Direct3D UAVs.
Most hardware is likely to implement shader resources and sampler
views as separate objects, so, having the distinction at the API level
simplifies things slightly for the driver.
This patch introduces the SVIEW register file with a declaration token
and syntax analogous to the already existing RES register file. After
this change, the SAMPLE_* opcodes no longer accept a resource as
input, but rather a SVIEW object. To preserve the functionality of
reading from a sampler view with integer coordinates, the
SAMPLE_I(_MS) opcodes are introduced which are similar to LOAD(_MS)
but take a SVIEW register instead of a RES register as argument.
Conflicts:
src/gallium/auxiliary/tgsi/tgsi_strings.c
src/mesa/state_tracker/st_atom_clip.c
commit d919791f2742e913173d6b335128e7d4c63c0840
Author: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Date: Fri Jan 6 17:59:22 2012 +0100
d3d1x: adapt to new clip state
commit cfec82bca3fefcdefafca3f4555285ec1d1ae421
Author: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Date: Fri Jan 6 14:16:51 2012 +0100
gallium/docs: update for clip state changes
commit c02bfeb81ad9f62041a2285ea6373bbbd602912a
Author: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Date: Fri Jan 6 14:21:43 2012 +0100
tgsi: add TGSI_PROPERTY_PROHIBIT_UCPS
commit d4e0a785a6a23ad2f6819fd72e236acb9750028d
Author: Brian Paul <brianp@vmware.com>
Date: Thu Jan 5 08:30:00 2012 -0700
tgsi: consolidate TGSI string arrays in new tgsi_strings.h
There was some duplication between the tgsi_dump.c and tgsi_text.c
files. Also use some static assertions to help catch errors when
adding new TGSI values.
v2: put strings in tgsi_strings.c file instead of the .h file.
Reviewed-by: Dave Airlie <airlied@redhat.com>
commit c28584ce0d8c62bd92c8f140729d344f88a0b3cd
Author: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Date: Fri Jan 6 12:48:09 2012 +0100
gallium: extend user_clip_plane_enable to apply to clip distances
commit f1d5016c07f786229ed057effbe55fbfd160b019
Author: Marek Olšák <maraeo@gmail.com>
Date: Fri Jan 6 02:39:09 2012 +0100
nvfx: adapt to new clip state
commit 6f6fa1c26bd19f797c1996731708e3569c9bfe24
Author: Marek Olšák <maraeo@gmail.com>
Date: Fri Jan 6 01:41:39 2012 +0100
st/mesa: fix DrawPixels with GL_DEPTH_CLAMP
commit c86ad730aa1c017788ae88a55f54071bf222be12
Author: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Date: Tue Jan 3 23:51:30 2012 +0100
nv50: adapt to new clip state
commit 3a8ae6ac243bae5970729dc4057fe02d992543dc
Author: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Date: Tue Jan 3 23:32:36 2012 +0100
nvc0: adapt to new clip state
commit 6243a8246997f8d2fcc69ab741a2c2dea080ff11
Author: Marek Olšák <maraeo@gmail.com>
Date: Thu Dec 29 01:32:51 2011 +0100
draw: initalize pt.user.planes in draw_init
This fixes a crash in glean/fpexceptions.
commit e3056524b19b56d473f4faff84ffa0eb41497408
Author: Marek Olšák <maraeo@gmail.com>
Date: Mon Dec 26 06:26:55 2011 +0100
svga: adapt to new clip state
commit c5bfa8b37d6d489271df457229081d6bbb51b4b7
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Dec 25 14:11:51 2011 +0100
r600g: adapt to new clip state
commit f11890905362f62627c4a28a8255b76eb7de7df2
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Dec 25 14:10:26 2011 +0100
r300g: adapt to new clip state
commit e37465327c79a01112f15f6278d9accc5bf3103f
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Dec 25 12:39:16 2011 +0100
draw: adapt to new clip state
This adds a regression in the LLVM clipping path. Can anybody see anything
wrong with the code? It works for every other case, just glean/fpexceptions
crashes when doing the "Infinite clip plane test".
commit b474d2b18c72d965eefae4e427c269cba5ce6ba2
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Dec 25 13:14:59 2011 +0100
u_blitter: don't save/set/restore clip state
commit 9dd240ea91f523a677af45e8d0adb9e661e28602
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Dec 25 13:11:56 2011 +0100
gallium: don't cso_save/set/restore clip state
The enable bits are in the rasterizer state.
commit a4f7031179f5f4ad524b34b394214b984ac950f6
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Dec 25 12:58:55 2011 +0100
gallium: default depth_clip to 1
depth_clip = !depth_clamp
commit fe21147a00ab90e549d63fe12ee4625c9c2ffcc3
Author: Marek Olšák <maraeo@gmail.com>
Date: Mon Dec 26 06:14:19 2011 +0100
trace,util: update state logging to new clip state
Also dump the other missing flags.
commit 2a3b96e84ac872dcc5bc1de049fe76bb58d64b23
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Dec 25 10:43:43 2011 +0100
st/mesa: adapt to new clip state
commit b7b656a42fca19d7c85267f42649a206a85a2c72
Author: Marek Olšák <maraeo@gmail.com>
Date: Sat Dec 17 15:45:19 2011 +0100
gallium: move state enable bits from clip_state to rasterizer_state
this mentions which channels are used for slice and depth comparison values.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
This adds tokens for texture offsets, to store 4 * swizzled vec3
for use in TXF and other opcodes.
It also contains TGSI exec changes for softpipe to use this code,
along with GLSL->TGSI support for TXF.
v2: add some more comments, add back padding I removed.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
largely a merge of the previously discussed origin/gallium-resource-sampling
but updated.
the idea is to allow arbitrary binding of resources, the way opencl, new gl
versions and dx10+ require, i.e.
DCL RES[0], 2D, FLOAT
LOAD DST[0], SRC[0], RES[0]
SAMPLE DST[0], SRC[0], RES[0], SAMP[0]
For GL fragColor semantics we need to tell the pipe drivers that the fragment
shader color result is to be replicated to all bound color buffers. This
adds the basic TGSI + documentation.
v2: fix missing comma pointed out by Tilman on mesa-dev.
Signed-off-by: Dave Airlie <airlied@redhat.com>