This is the first user of optimized monolithic shader variants.
Cull distances can't be disabled by states.
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
key->part.*: prolog and epilog flags only
key->as_{ls,es}: special flags
key->mono.*: flags for monolithic compilation only
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Company Of Heroes 2 needs only 24.
This saves 512 bytes of CE RAM per shader stage.
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
FORCE_TILING should disable it. It has no effect now, but that may change
soon.
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
When the uploading of shader fails on si_shader_binary_upload(),
it returns -ENOMEM. We should handle si_shader_binary_upload() failure path
on si_create_compute_state().
CID 1394027
v2: Fixes from Edward O'Callaghan's review
a) Update explicitly return value check with "si_shader_binary_upload() < 0"
b) Update commit message.
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
v4: Add windows-specific gen_knobs.{cpp|h} changes
v5: remove aggresive squashing of gen_knobs.py to this commit; added
SConscript to EXTRA_DIST in Makefile.am
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Modify gen_knobs.py so that each invocation creates a single generated
file. This is more similar to how the other generators behave.
v5: remove Scoscript edits from this commit; moved to commit that first
adds SConscript
Acked-by: Emil Velikov <emil.velikov@collabora.com>
- Handle dynamic library loading for windows
- Implement swap for gdi
- fix prototypes
- update include paths on configure-based build for swr_loader.cpp
v2: split to multiple patches
v3: split and reshuffle some more; renamed title
v4: move Makefile.am changes to other commit. Modify header files
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
There are 2 swr_create_screen() functions. One in swr_loader.cpp, which
is used during driver init, and the other is hiding in swr_screen.cpp,
which ends up in the arch-specific .dll/.so.
Rename the second one to swr_create_screen_internal(), to avoid confusion
in header files.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reorder header files so that we have a chance to defined NOMINMAX before
mesa include files include windows.h
v3: split from bigger patch
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
For compute shaders, we free the selector after the shader has been
compiled, so we need to save this bit somewhere else. Also, make sure that
this type of bug cannot re-appear, by NULL-ing the selector pointer after
we're done with it.
This bug has been there since the feature was added, but was only exposed
in piglit arb_compute_variable_group_size-local-size by commit
9bfee7047b (which is totally unrelated).
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Since the state tracker now enables MSAA in the hardware for the case
nr_samples == 1 as well, we need to set sample locations correctly for
this case.
The Polaris override is still needed for the non-MSAA case (when
nr_samples == 0).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
This fixes GL45-CTS.sample_variables.mask.*.samples_1.*.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Multithreaded fragment shaders let us hide texturing latency by a
hyperthreading-style switch to another fragment shader. This gets us up
to 20% framerate improvements on glmark2 tests.
I'm also sending out a piglit test, gl-2.0/vertexattribpointer-size-3,
which exposes this corner case.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Fixes parts of GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
There is no functionality in swr to clamp either vertex or frag colors.
This could be added in swr_shader, at which point these could be
re-enabled.
Fixes arb_color_buffer_float-render
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Rendering could still be ongoing (or have yet to start) when the shader
is deleted. There's no refcounting on the shader text, so insert a
pipeline stall unconditionally when this happens.
[Note, we should instead introduce a way to attach work to
fences, so that the freeing can be done in the current fence.]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
The EXT_texture_integer test says that blending and alphatest should
all be disabled. st/mesa takes care of alphatest already.
Fixes the ext_texture_integer-fbo-blending piglit test.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
The support in swr requires shaders to output the components as UINTs.
This is not how GL or Gallium work, and since this is not a
required-renderable format, just leave it out.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Fixes the texsubimage piglit and lets the copyteximage one get further.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
In a BGR10X2 or BGR5X1 situation, there's no need to try to quantize the
X channel - the default will have the proper quantization required.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
This is the format used for the primary surface of a
PIPE_FORMAT_Z32_FLOAT_S8X24_UINT resource.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>