The texunit variable we create and assign in nir_emit_texture gets passed
through two more layers of function calls before it gets to its sole use in
rescale_texcoord. The best part is that we already pass the sampler into
rescale_texcoord so we can just look it up there.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
As of now, uniform setup is more-or-less unified between vec4 and fs and no
longer requires the fs_visitor. This makes uniform setup more of a
language/API thing than a backend compiler thing. This commit moves
setting up the stage_prog_data.params arrays to the same place as we set up
the rest of stage_prog_data.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Setting up binding tables really has little to do with the actual process
of turning shaders into instructions; it's more part of setting up
prog_data. This commit moves it out of the visitors and with the rest of
the prog_data setup stuff.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This really has nothing to do with the backend compiler and we'd like to
eventually be able to set this up earlier in the compile process.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The way we deal with GLSL uniforms and builtins is basically the same in
both the vec4 and the fs backend. This commit takes the best parts of both
implementations and pulls the common code into a shared helper function.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
The way we deal with ARB program uniforms is basically the same in both the
vec4 and the fs backend. This commit takes the best parts of both
implementations and pulls the common code into a shared helper function.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Previously, we were counting up uniforms as we set them up. However, this
count should be exactly identical to shader->num_uniforms provided by
nir_assign_var_locations. (If it's not, we're in trouble anyway because
that means that locations don't match up.) This matches what the fs
backend is already doing.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
I tried to do this once before but Curro pointed out that having it in
backend_shader meant it could use the setup_vec4_uniform_values helper
which did different things in vec4 and fs. Now the setup_uniform_values
function differs only by an assert in the two backends so there's no real
good reason to be using it anymore.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
The uniform_vector_size array was only ever used by pack_uniform_registers
which no longer needs it.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Previously, pack_uniform_registers worked based on the size of the uniform
as given to us when we initially set up the uniforms. However, we have to
walk through the uniforms and figure out liveness anyway, so we migh as
well record the number of channels used as we go. This may also allow us
to pack things tighter in a few cases.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
That way, if we do the usual thing of multiplying vector_elements by
matrix_columns we get the actual number of components in the type as per
component_slots().
While we're at it, we also switch to using the actual C++ field
initializers for vector_elements and matrix_columns.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Previously, we had a bunch of code in each stage to figure out how many
slots we needed in stage_prog_data.param. This code was mostly identical
across the stages and had been copied and pasted around. Unfortunately,
this meant that any time you did something special, you had to add code for
it to each of these places. In particular, none of the stages took
subroutines into account; they were working entirely by accident. By
taking this data from the NIR shader, we know the exact number of entries
we need and everything goes a bit smoother.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
The next commit will add code to codegen_vs_prog that requires the NIR
shader to be there in all cases. It doesn't hurt anything to just move it
from brw_vs_emit to its only caller.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
GLSL IR vs. NIR shader-db results for vec4 programs on i965:
total instructions in shared programs: 1499328 -> 1388354 (-7.40%)
instructions in affected programs: 1245199 -> 1134225 (-8.91%)
helped: 7469
HURT: 2440
GLSL IR vs. NIR shader-db results for vec4 programs on G4x:
total instructions in shared programs: 1436799 -> 1325825 (-7.72%)
instructions in affected programs: 1205599 -> 1094625 (-9.20%)
helped: 7469
HURT: 2440
GLSL IR vs. NIR shader-db results for vec4 programs on Iron Lake:
total instructions in shared programs: 1436654 -> 1325682 (-7.72%)
instructions in affected programs: 1205503 -> 1094531 (-9.21%)
helped: 7468
HURT: 2440
GLSL IR vs. NIR shader-db results for vec4 programs on Sandy Bridge:
total instructions in shared programs: 2016249 -> 1787033 (-11.37%)
instructions in affected programs: 1850547 -> 1621331 (-12.39%)
helped: 14856
HURT: 1481
GLSL IR vs. NIR shader-db results for vec4 programs on Ivy Bridge:
total instructions in shared programs: 1848027 -> 1648216 (-10.81%)
instructions in affected programs: 1660279 -> 1460468 (-12.03%)
helped: 14668
HURT: 1369
GLSL IR vs. NIR shader-db results for vec4 programs on Bay Trail:
total instructions in shared programs: 1848027 -> 1648216 (-10.81%)
instructions in affected programs: 1660279 -> 1460468 (-12.03%)
helped: 14668
HURT: 1369
GLSL IR vs. NIR shader-db results for vec4 programs on Haswell:
total instructions in shared programs: 1848027 -> 1648216 (-10.81%)
instructions in affected programs: 1660279 -> 1460468 (-12.03%)
helped: 14668
HURT: 1369
I also ran our full suite of benchmarks on a Haswell and had the following
statistically significant (according to ministat) changes:
Test master-glsl master-nir diff
bench_OglGeomPoint 461.556 463.006 1.450
bench_OglTerrainFlyInst 184.484 187.574 3.090
bench_OglTerrainPanInst 132.412 136.307 3.895
bench_OglTexFilterAniso 19.653 19.645 -0.008
bench_OglTexFilterTri 58.333 58.009 -0.324
bench_OglVSInstancing 65.049 65.327 0.278
bench_trexoff 69.474 69.694 0.220
bench_valley 40.708 41.125 0.417
v2 (Jason Ekstrand):
- Remove more uses of NirOptions as a switch
- New shader-db numbers
- Added benchmark numbers
Reviewed-by: Matt Turner <mattst88@gmail.com>
Commit 0a1adaf11d (nir: Report progress
from nir_lower_system_values().) introduced a bug caught by Valgrind:
==823== Conditional jump or move depends on uninitialised value(s)
==823== at 0xB09020C: convert_block (nir_lower_system_values.c:68)
==823== by 0xB079FB8: foreach_cf_node (nir.c:1310)
==823== by 0xB07A0AF: nir_foreach_block (nir.c:1336)
==823== by 0xB09026B: convert_impl (nir_lower_system_values.c:79)
...
==823== Uninitialised value was created by a stack allocation
==823== at 0xB090249: convert_impl (nir_lower_system_values.c:76)
which is trivially fixed by initializing progress.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Add a macro GL_LIB_NAME to hold the filename that configure comes up with
based on the --with-gl-lib-name and --enable-mangling options.
In driOpenDriver, use the GL_LIB_NAME macro instead of hard-coding
"libGL.so.1".
v2: Add an #ifndef/#define for GL_LIB_NAME so that non-autoconf builds will
work.
v3: Fix the library filename in the Makefile.
Signed-off-by: Kyle Brenneman <kbrenneman@nvidia.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
When USE_MGL_NAMESPACE is defined, _glapi_get_stub will check for the "m"
prefix before trying to skip it, so that "glFoo" and "mglFoo" are
equivalent.
This should let it work with all the places where something calls
_glapi_get_proc_offset with a hard-coded name that starts with the normal
"gl" prefix.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55552
Signed-off-by: Kyle Brenneman <kbrenneman@nvidia.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Rearranged the GLX_ALIAS macro in glextensions.h so that it will pick up
the renames from glx_mangle.h.
Fixed the alias attribute for glXGetProcAddress when USE_MGL_NAMESPACE is
defined.
v2: Add a comment clarifying why GLX_ALIAS needs two macros.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55552
Signed-off-by: Kyle Brenneman <kbrenneman@nvidia.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Fixes following Piglit test:
member-invalid-binding-qualifier.frag
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
When writing to a column of a row-major matrix, each component of the
vector is stored to non-consecutive memory addresses, so we generate
one instruction per component.
This patch skips the disabled components in the writemask, saving some
store instructions plus avoid storing wrong data on each disabled
component.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Fixes a failing subtest in:
ES31-CTS.shader_storage_buffer_object.negative-glsl-compileTime
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
The result of POW for a negative base is undefined. Even when the result
is multiplied by zero (which is the case here whenever the base is
negative), the Inf and NaNs can propagate past that.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342
Signed-off-by: Daniel Scharrer <daniel@constexpr.org>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reading this output was really confusing. reg represents attribute
slots; reg_offset is the x/y/z/w component (0..3) within a vec4 slot.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
The code for input lowering is going to get significantly more
complicated shortly, so I wanted to pull it out. Vertex shader inputs
are handled nearly identically regardless of vec4/scalar mode, so I
opted to not split that.
I thought about having each function actually do the lowering, but one
pass through nir_lower_io that handles all types (which weren't handled
earlier) is probably more efficient.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
We may want to use different type_size functions for (e.g.) inputs
vs. uniforms. Passing in -1 for mode ignores this, handling all
modes as before.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
If the texture object exists, but the Name field is zero, it means
the object was created but never bound to a target. Trying to bind it
in _mesa_BindTextureUnit() should generate GL_INVALID_OPERATION.
Fixes piglit's arb_direct_state_access-bind-texture-unit test.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
This helper was only called from _mesa_BindTextureUnit(). It's simpler
to just inline it.
The error check / code / message in the helper was incorrect. It was
written for glBindTextures(), not glBindTextureUnit(). The correct
error for a bad texture unit number is GL_INVALID_VALUE. The error
message now reports the unit number rather than a GL_TEXTUREi enum.
Fixes a failure in piglit's arb_direct_state_access-bind-texture-unit test.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Before, we were doing the actual _mesa_reference_texobj() call and
ctx->Driver.BindTexture() and misc housekeeping in three different
places. This consolidates the common code in a new bind_texture()
function.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Since we store both in UniformBlocks, we can't just compute the index by
subtracting the array address start, we need to count the number of
buffers of the approriate type.
v2:
- Just fall back to calc_resource_index (Tapani)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>