For the 0^0 case, the result of "LOG_CLAMPED ...,0" is -MAX_FLOAT, and the
result of "MUL_LIT ...,0,-MAX_FLOAT,..." is then -MAX_FLOAT instead of 0
because of the special src1 checks for -MAX_FLOAT. So swap src0/src1:
"MUL_LIT ...,-MAX_FLOAT,0,..." gives the expected 0, and the result of
"EXP_IEEE ...,0" is then 1, as expected for LIT.
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
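The operand swap in the LIT 0^0 fix above can be sketched in C. This is a
deliberately simplified, hypothetical model: model_mul_lit() takes only two
operands and models only the src1 == -MAX_FLOAT check named in the message;
the real instruction has further operands and checks that are left out.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical two-operand model of MUL_LIT. Only the special case from
 * the commit message is modelled: src1 == -MAX_FLOAT forces -MAX_FLOAT. */
static float model_mul_lit(float src0, float src1)
{
    if (src1 == -FLT_MAX)
        return -FLT_MAX;   /* special src1 check wins over the multiply */
    return src0 * src1;    /* ordinary multiply otherwise */
}

/* Illustrates why the operand order matters for the 0^0 case. */
static void check_operand_swap(void)
{
    /* Original order: 0 in src0, -MAX_FLOAT in src1. */
    assert(model_mul_lit(0.0f, -FLT_MAX) == -FLT_MAX);
    /* Swapped order: the special check no longer triggers. */
    assert(model_mul_lit(-FLT_MAX, 0.0f) == 0.0f);
}
```

With the swapped order the product is zero, which is the value the message
expects to reach EXP_IEEE.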
It's not supposed to do conversion, but st sometimes asks us to.
Sometimes conversion is even wrong (e.g. between UNORM and SRGB).
This should now include all formats the 2D engine supports.
If a user-buffer was referenced twice by a draw command, the affected ranges
were uploaded separately, with only the last one being referenced by the
hardware. Make sure we upload only a single range.
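A minimal sketch of folding several references to one user-buffer into a
single upload range. The struct and function names are hypothetical and not
taken from the driver; ranges are assumed to be half-open byte ranges.

```c
#include <assert.h>

/* Hypothetical byte range [start, end) inside one user-buffer. */
struct upload_range {
    unsigned start;
    unsigned end;
};

/* Grow 'acc' so that it also covers 'r'. */
static void range_union(struct upload_range *acc, const struct upload_range *r)
{
    if (r->start < acc->start)
        acc->start = r->start;
    if (r->end > acc->end)
        acc->end = r->end;
}

/* Fold every reference to the same buffer into one range, so the buffer
 * is uploaded once and all references can point into that upload. */
static struct upload_range merge_ranges(const struct upload_range *refs,
                                        unsigned count)
{
    struct upload_range acc;
    unsigned i;

    assert(count > 0);
    acc = refs[0];
    for (i = 1; i < count; i++)
        range_union(&acc, &refs[i]);
    return acc;
}
```

For two references covering bytes 16..64 and 0..32, the merged range is
0..64.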
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
We currently always treat contents of user-buffers as volatile so
we don't need to take any particular action when the state tracker
announces that the contents have changed.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Viewperf uses some unusual vertex arrays where the stride is less
than the element size. In this case, the stride was 4 while the
element size was 12. The difference of 8 bytes causes us to miss
uploading the tail bit of the array data.
Typically the stride is >= the element size so there was no problem
with other apps.
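The size calculation behind this can be sketched as follows.
array_upload_size() is a hypothetical helper, not the driver's function: the
last element always occupies a full element, so the upload must cover
stride * (count - 1) + element_size bytes.

```c
#include <assert.h>

/* Bytes to upload for 'count' vertices (hypothetical helper).
 * stride * count undercounts by (element_size - stride) bytes whenever
 * the stride is smaller than the element size. */
static unsigned array_upload_size(unsigned stride, unsigned element_size,
                                  unsigned count)
{
    if (count == 0)
        return 0;
    return stride * (count - 1) + element_size;
}
```

With stride 4, element size 12 and 10 vertices this gives 48 bytes, whereas
stride * count gives 40, which is the missing 8-byte tail. When the stride
is >= the element size, stride * count already covers the data.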
Stream user buffer contents rather than trying to maintain persistent
host / hardware copies.
Resulting negative array offsets are not allowed by the hardware
(well, at least not according to the header files), so adjust the index
bias to make all array offsets positive.
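One way to picture the adjustment, for a single array only: assuming the
element address is offset + (index + index_bias) * stride, shifting the
offset up by k strides and the bias down by k leaves every address
unchanged. The struct and function below are hypothetical, and the driver
may well handle several arrays at once.

```c
#include <assert.h>

/* Hypothetical description of one vertex array binding. */
struct array_binding {
    int offset;   /* byte offset, may be negative on entry */
    int stride;   /* bytes between elements, must be > 0 */
};

/* Move a negative offset into the non-negative range and compensate in
 * the index bias, so offset + (index + bias) * stride stays the same. */
static void make_offset_non_negative(struct array_binding *a, int *index_bias)
{
    int k;

    assert(a->stride > 0);
    if (a->offset >= 0)
        return;
    k = (-a->offset + a->stride - 1) / a->stride;   /* ceil(-offset / stride) */
    a->offset += k * a->stride;
    *index_bias -= k;
}
```

For offset -20 and stride 8, k is 3, giving offset 4 and an index bias
lowered by 3.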
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
- Copy i915c's support for phases, which should allow us to run a couple more shaders.
- Fix the error messages.
- Still try to proceed when we get a shader that's too long.
Move the definition of M_PI (for the benefit of <math.h> implementations
which do not define it) to before its first use.
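The usual shape of such a fallback, shown as a small sketch. deg_to_rad()
is a hypothetical function, included only so the macro has a use that comes
after the definition.

```c
#include <assert.h>
#include <math.h>

/* Some <math.h> implementations do not provide M_PI, so supply a
 * fallback here, before anything below uses it. */
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* First use of M_PI comes after the fallback definition. */
static double deg_to_rad(double degrees)
{
    return degrees * (M_PI / 180.0);
}
```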
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Brian Paul <brianp@vmware.com>
Blending and maybe even alpha-test don't work with those formats.
Only supporting RGBA, BGRA, RGBX, BGRX.
NOTE: This is a candidate for the 7.10 and 7.11 branches.
Evergreen+ don't support multi-writes so we need to emulate
it in the shader. Fixes the following piglit tests:
fbo-drawbuffers-fragcolor
ati_draw_buffers-arbfp-no-option
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Should unify this too, but will delay that until the planned
libdrm_nouveau/winsys changes which are likely to cause major
changes to this bo validation code too.
Fixes broken glTexImage2D with format=GL_RGBA since
1a339b6c71
The origin of this behaviour is that r600_is_format_supported
checks only against the r600_state_inline.h tables, not evergreen's.
evergreen+ stores depth and stencil separately so when we
allocate a depth/stencil fbo, make sure we allocate enough
memory for both depth and stencil buffers.
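A rough sketch of what "enough memory for both" means when the two planes
are stored one after the other. Everything here is an assumption made for
illustration: the names are hypothetical, and tiling and alignment
requirements are ignored.

```c
#include <assert.h>

/* Hypothetical layout of a depth plane followed by a stencil plane. */
struct ds_layout {
    unsigned long depth_size;       /* bytes used by the depth plane */
    unsigned long stencil_offset;   /* where the stencil plane starts */
    unsigned long total_size;       /* bytes to allocate for both */
};

/* Size a buffer for separate depth and stencil planes (no alignment). */
static struct ds_layout ds_layout_separate(unsigned width, unsigned height,
                                           unsigned depth_bytes_per_pixel,
                                           unsigned stencil_bytes_per_pixel)
{
    struct ds_layout l;
    unsigned long pixels = (unsigned long)width * height;

    l.depth_size = pixels * depth_bytes_per_pixel;
    l.stencil_offset = l.depth_size;
    l.total_size = l.depth_size + pixels * stencil_bytes_per_pixel;
    return l;
}
```

For a 64x64 surface with 4 bytes of depth and 1 byte of stencil per pixel,
this asks for 20480 bytes rather than the 16384 that depth alone needs.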
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
This drops a bunch of unnecessary checks (i.e. ones that should be trapped
at the gallium level), and also removes the switch statement in favour
of some calculated values for the vgt values.
Signed-off-by: Dave Airlie <airlied@redhat.com>
The attached patch should be an improvement over Vadim Girlin's patch
fixing the LIT instruction for r600g (commit
2fe39b46e7).
Instructions used in tgsi_lit have been reordered to always write to a
dst channel after the same channel in src has been read (so if src ==
dst, input values are not overwritten before being used).
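The ordering idea can be shown on plain float[4] vectors, as a sketch
rather than the r600g code. Assumptions: no swizzles, the clamp of w is
omitted, and pow_small() is a hypothetical stand-in for pow() that handles
only non-negative integer exponents.

```c
#include <assert.h>

/* Stand-in for pow(): non-negative integer exponents only (assumption). */
static float pow_small(float base, int exponent)
{
    float r = 1.0f;

    while (exponent-- > 0)
        r *= base;
    return r;
}

/* LIT where dst may alias src. Each dst channel is written only after
 * the last read of the same src channel. */
static void lit_alias_safe(float dst[4], const float src[4])
{
    float y = src[1] > 0.0f ? src[1] : 0.0f;

    /* z reads x, y and w, and LIT never reads src.z, so write it first. */
    dst[2] = src[0] > 0.0f ? pow_small(y, (int)src[3]) : 0.0f;
    /* y reads x; src.y was already consumed above. */
    dst[1] = src[0] > 0.0f ? src[0] : 0.0f;
    /* x and w are constants and no longer needed as inputs. */
    dst[0] = 1.0f;
    dst[3] = 1.0f;
}
```

Writing dst.x first would instead make the later reads of src.x see 1.0
whenever dst and src are the same register.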
Signed-off-by: Dave Airlie <airlied@redhat.com>
Current LIT implementation uses dst components for storing temp
results, possibly overwriting still needed values (depends on the
swizzles).
This patch uses temp reg for one of such cases (found in etqw) and
fixes "LIT R.z, R.xyzz".
Tested on evergreen. Fixes some etqw-demo rendering glitches when
"Lighting" is set to "High" in the settings.
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Source box needs to be adjusted for blitting from compressed formats.
fixes https://bugs.freedesktop.org/show_bug.cgi?id=35434
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
[airlied: final chunk of Mike's patch from bug 37476
this uses a loop to emit the GRADIENTS and does a check to
see if we need to fetch to a temporary register. It also
increases the context src gpr to 4 which is needed here.]
Signed-off-by: Dave Airlie <airlied@redhat.com>
Mike had actually done a lot of the TXD support in a patch in bug
37476, which I see now. I'll add the bits of his work that I didn't think
to add to my work.
Signed-off-by: Dave Airlie <airlied@redhat.com>
This at least passes the piglit arb_shader_texture_lod-texgrad test.
The AMD shader analyzer seems to multiply the V component by an unspecified
constant value; no idea why.
Signed-off-by: Dave Airlie <airlied@redhat.com>
This sets the base level as the zero level, which fixes
piglit/texturing/tex-miplevel-selection*.
The r600 hardware ignores the BASE_LEVEL field in some cases, so we can't
use it.
Evergreen might need this too.
Copy-and-paste from the bgra cases. The C paths attempt to avoid
copying the 'x' channel, but it's harmless, you might as well. Good for
about 5% in glxgears (740 to 780 fps).
Signed-off-by: Adam Jackson <ajax@redhat.com>
If the wrap R (3rd) mode is set to CLAMP or CLAMP_TO_BORDER and the texture
isn't 3D, r300 always samples the border color regardless of texture
coordinates.
I HATE THIS HARDWARE.
NOTE: This is a candidate for the 7.10 branch.
Ideally we'd have a compiler and register spilling and all that
but this is good enough for now to avoid the gpu hang in piglit,
glsl-vs-vec4-indexing-temp-dst-in-nested-loop-combined
on r600/r700 cards.
based on r600c patch
Andre Maasikas <amaasikas@gmail.com>
r600c: bump sq gpr resources if a shader needs more than default
Signed-off-by: Dave Airlie <airlied@redhat.com>
So only with kernel version 2.7 can this work, thanks to Alex
for pointing that out. Also add a workaround for a hw bug.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Evergreen can do this as well as cayman, so we should enable it.
This fixes a gpu lockup with
glsl-vs-vec4-indexing-temp-dst-in-nested-loop-combined.shader_test
I need to add a better workaround for r600/r700.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Since resources don't generally vary in size, this splits
the emit path. It also takes into account that texture and vertex resources
have different numbers of relocs, and avoids emitting the extra
reloc for vertex resources.
Signed-off-by: Dave Airlie <airlied@redhat.com>
With complex shaders there are often "holes" in the fs inputs, and we only
have 8 tex coords to map those to. To fix this, we remap fs inputs to [0..8].
This lets us run many more GLSL programs.
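A sketch of such a compaction, assuming eight slots numbered 0 to 7 and a
hypothetical limit of 32 shader inputs. The names are illustrative and not
taken from the driver.

```c
#include <assert.h>

#define MAX_FS_INPUTS  32   /* hypothetical upper bound on input indices */
#define NUM_TEX_COORDS 8    /* "we only have 8 tex coords" */

/* Assign each used input the next free slot, closing the holes.
 * remap[i] is the slot for input i, or -1 if input i is unused.
 * Returns the number of slots used, or -1 if there are too many inputs. */
static int remap_fs_inputs(const unsigned char used[MAX_FS_INPUTS],
                           int remap[MAX_FS_INPUTS])
{
    int i, next = 0;

    for (i = 0; i < MAX_FS_INPUTS; i++) {
        if (!used[i]) {
            remap[i] = -1;
            continue;
        }
        if (next == NUM_TEX_COORDS)
            return -1;   /* more inputs than available slots */
        remap[i] = next++;
    }
    return next;
}
```

A shader using inputs 0, 3 and 9 ends up with slots 0, 1 and 2.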
At the end of flushing we were scanning over 450 blocks
with generally about 50 enabled. This reduces the scanning
to just the list of enabled blocks.
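The general pattern, sketched with hypothetical names: keep a compact list
of enabled block indices next to the per-block flag, so the end-of-flush
pass touches only the enabled entries. What the driver actually does to
each block at that point is not modelled; here the flag is simply cleared.

```c
#include <assert.h>

#define NUM_BLOCKS 450   /* roughly the count mentioned above */

/* Hypothetical tracker: a flag per block plus a list of enabled indices. */
struct block_tracker {
    unsigned char enabled[NUM_BLOCKS];
    unsigned short list[NUM_BLOCKS];
    unsigned count;
};

/* Mark a block enabled and remember its index, once. */
static void enable_block(struct block_tracker *t, unsigned idx)
{
    assert(idx < NUM_BLOCKS);
    if (t->enabled[idx])
        return;
    t->enabled[idx] = 1;
    t->list[t->count++] = (unsigned short)idx;
}

/* Walk only the enabled blocks instead of all NUM_BLOCKS entries.
 * Returns how many blocks were visited. */
static unsigned flush_blocks(struct block_tracker *t)
{
    unsigned i, visited = t->count;

    for (i = 0; i < t->count; i++)
        t->enabled[t->list[i]] = 0;
    t->count = 0;
    return visited;
}
```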
Signed-off-by: Dave Airlie <airlied@redhat.com>
There isn't much point taking the overhead of range/block lookups on resources;
we aren't going to be getting resource registers at weird offsets.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Resource setting could be a fair bit more lightweight.
This patch just separates the resource structs from the standard
reg tracking structs in the driver; later patches will improve
the winsys.
Signed-off-by: Dave Airlie <airlied@redhat.com>