Commit Graph

9 Commits

Author SHA1 Message Date
Vinson Lee ff8daa0136 gallivm: Add missing header for powf.
Fix build error after llvm-11 commit 3a29393b4709 ("Remove
math.h/cmath include from DataTypes.h").

src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c: In function ‘lp_build_linear_to_srgb’:
src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c:194:44: error: implicit declaration of function ‘powf’ [-Werror=implicit-function-declaration]
  194 |                                  exp2f_c * powf(coeff_f, 1.0f / exp_f));
      |                                            ^~~~
src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c:194:44: warning: incompatible implicit declaration of built-in function ‘powf’
src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c:78:1: note: include ‘<math.h>’ or provide a declaration of ‘powf’
   77 | #include "lp_bld_format.h"
  +++ |+#include <math.h>
   78 |

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4473>
2020-04-07 21:48:31 +00:00
Jose Fonseca 320d1191c6 gallivm: Use llvm.fmuladd.*.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-10 13:47:35 +01:00
Marek Olšák fb523cb6ad gallium: merge PIPE_SWIZZLE_* and UTIL_FORMAT_SWIZZLE_*
Use PIPE_SWIZZLE_* everywhere.
Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE.
The new enum is called pipe_swizzle.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-22 01:30:39 +02:00
Roland Scheidegger 9477d8c862 llvmpipe: add support for b5g6r5_srgb
The conversion code for srgb was tuned for n x 4x8bit AoS -> 4 x nxfloat SoA
(and vice versa), fix this to handle also 16bit 565-style srgb formats.
Still not really all that generic, things like r10g10b10a2_srgb or
r4g4b4a4_srgb wouldn't work (the latter trivial to fix, the former would not
require more work to not crash but near certainly need some higher precision
calculation) but not needed right now.
The code is not fully optimized for this (could use more direct calculation
instead of expanding to 8-bit range first) but should be good enough.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-03-21 17:23:38 +01:00
Roland Scheidegger 754319490f gallivm,llvmpipe: fix float->srgb conversion to handle NaNs
d3d10 requires us to convert NaNs to zero for any float->int conversion.
We don't really do that but mostly seems to work. In particular I suspect the
very common float->unorm8 path only really passes because it relies on sse2
pack intrinsics which just happen to work by luck for NaNs (float->int
conversion in hw gives integer indeterminate value, which just happens to be
-0x80000000 hence gets converted to zero in the end after pack intrinsics).
However, float->srgb didn't get so lucky, because we need to clamp before
blending and clamping resulted in NaN behavior being undefined (and actually
got converted to 1.0 by clamping with sse2). Fix this by using a zero/one clamp
with defined nan behavior as we can handle the NaN for free this way.
I suspect there's more bugs lurking in this area (e.g. converting floats to
snorm) as we don't really use defined NaN behavior everywhere but this seems
to be good enough.
While here respecify nan behavior modes a bit, in particular the return_second
mode didn't really do what we wanted. From the caller's perspective, we really
wanted to say we need the non-nan result, but we already know the second arg
isn't a NaN. So we use this now instead, which means that cpu architectures
which actually implement min/max by always returning non-nan (that is adhering
to ieee754-2008 rules) don't need to bend over backwards for nothing.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-11-14 12:24:55 +00:00
Roland Scheidegger 2d9fea95e8 gallivm: fix comment wrt srgb accuracy.
I think it's actually not good enough now...
2013-08-08 18:55:57 +02:00
Roland Scheidegger dc1cc928ed llvmpipe: support sRGB framebuffers
Just use the new conversion functions to do the work. The way it's plugged
in into the blend code is quite hacktastic but follows all the same hacks
as used by packed float format already.
Only support 4x8bit srgb formats (rgba/rgbx plus swizzle), 24bit formats never
worked anyway in the blend code and are thus disabled, and I don't think anyone
is interested in L8/L8A8. Would need even more hacks otherwise.
Unless I'm missing something, this is the last feature except MSAA needed for
OpenGL 3.0, and for OpenGL 3.1 as well I believe.

v2: prettify a bit, use separate function for packing.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-16 01:54:51 +02:00
Roland Scheidegger 796b73d1fe gallivm: (trivial) use constant instead of exp2f() function
Some lame compilers can't do exp2f() and as far as I can tell they can't do
exp2() (with doubles) neither so instead of providing some workaround for
that (wouldn't actually be too bad just replace with pow) and since it is
used with a constant only just use the precalculated constant.
2013-07-14 02:39:33 +02:00
Roland Scheidegger 6bcbb0dc82 gallivm: handle srgb-to-linear and linear-to-srgb conversions
srgb-to-linear is using 3rd degree polynomial for now which should be _just_
good enough. Reverse is using some rational polynomials and is quite accurate,
though not hooked into llvmpipe's blend code yet and hence unused (untested).
Using a table might also be an option (for srgb-to-linear especially).
This does not enable any new features yet because EXT_texture_srgb was already
supported via util_format fallbacks, but performance was lacking probably due
to the external function call (the table used by the util_format_srgb code may
not be all that much slower on its own).
Some performance figures (taken from modified gloss, replaced both base and
sphere texture to use GL_SRGB instead of GL_RGB, measured on 1Ghz Sandy Bridge,
the numbers aren't terribly accurate):

normal gloss, aos, 8-wide: 47 fps
normal gloss, aos, 4-wide: 48 fps

normal gloss, forced to soa, 8-wide: 48 fps
normal gloss, forced to soa, 4-wide: 47 fps

patched gloss, old code, soa, 8-wide: 21 fps
patched gloss, old code, soa, 4-wide: 24 fps

patched gloss, new code, soa, 8-wide: 41 fps
patched gloss, new code, soa, 4-wide: 38 fps

So there's a performance hit but it seems acceptable, certainly better
than using the fallback.
Note the new code only works for 4x8bit srgb formats, others (L8/L8A8) will
continue to use the old util_format fallback, because I can't be bothered
to write code for formats noone uses anyway (as decoding is done as part of
lp_build_unpack_rgba_soa which can only handle block type width of 32).
Compressed srgb formats should get their own path though eventually (it is
going to be expensive in any case, first decompress, then convert).
No piglit regressions.

v2: use lp_build_polynomial instead of ad-hoc polynomial construction, also
since keeping both linear to srgb functions for now make sure both are
compiled (since they share quite some code just integrate into the same
function).

v3: formatting fixes and bugfix in the complicated (disabled) linear-to-srgb
path.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-13 18:42:17 +02:00