Fixes many failures in gles3 Khronos CTS test: packed_pixels
V2: Add the check for alpha bits to avoid confusion.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Function is utilized by next patch in the series.
V2: Add missing formats.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
V2: Declare the function in teximage.h
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Previously we had to keep unreachable global symbols in the symbol table
because the symbol table is used during linking. Having the symbol
table retain pointers to freed memory... what could possibly go wrong?
At the same time, this meant that we kept live references to tons of
memory that was no longer needed.
New strategy: destroy the old symbol table, and make a new one from the
reachable symbols.
Valgrind massif results for a trimmed apitrace of dota2:
n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B)
Before (32-bit): 59 40,642,425,451 76,337,968 69,720,886 6,617,082 0
After (32-bit): 46 40,661,487,174 75,116,800 68,854,065 6,262,735 0
Before (64-bit): 79 37,179,441,771 106,986,512 98,112,095 8,874,417 0
After (64-bit): 64 37,200,329,700 104,872,672 96,514,546 8,358,126 0
A real savings of 846KiB on 32-bit and 1.5MiB on 64-bit.
v2: (by Kenneth Graunke) Just add the ir_function from the IR stream,
rather than looking it up in the symbol table; they're now
identical.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Piglit's spec/glsl-1.10/linker/override-builtin-{const,uniform}-05 tests
do the following:
1. Call abs(float) - a built-in function.
2. Create a user-defined replacement for abs(float).
3. Call abs(float) again - now the user function.
At step 1, we created an ir_function which included the built-in
signature, added it to the symbol table, and emitted it into the IR
stream.
Then, when processing the function definition at step 2, we'd see that
there was already an ir_function. But, since there were no user-defined
functions, we skipped over a bunch of code, and ended up creating a
second one. This new ir_function shadowed the original in the symbol
table, but both ended up in the IR stream.
This results in an awkward situation where searching for an ir_function
via the symbol table, a forward linked list walk, and a reverse linked
list walk may return different ir_functions. This seems undesirable.
This patch instead re-uses the existing ir_function, putting both
built-in and user-defined signatures in the same one. The previous
patch's additional filtering ensures everything continues working.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Historically, we've implemented the rules for overriding built-in
functions by creating multiple ir_functions and relying on the symbol
table to hide the one containing built-in functions. That works, but
has a few drawbacks, so the next patch will change it.
Instead, we'll have a single ir_function for a particular name, which
will contain both built-in and user-defined signatures. Passing an
extra parameter to matching_signature makes it easy to ignore built-ins
when they're supposed to be hidden.
I didn't add the parameter to exact_matching_signature since it wasn't
necessary.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
On Haswell, this cuts 1-3 instructions from 183 vertex shaders in
"Shadowrun Returns", "Shatter", and "Trine 2." It adds 2 instructions
to a single fragment shader in "Closure."
total instructions in shared programs: 278803 -> 278546 (-0.09%)
instructions in affected programs: 41930 -> 41673 (-0.61%)
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
This code was attemping to align the base of the structure to the required
alignment of the structure. However, it had two problems:
1. It was aligning the target structure member, not the base of the
structure.
2. It was calculating the alignment based on the members previous to the
target member instead of all the members of the structure.
Fixes gles3conform failures in:
ES3-CTS.shaders.uniform_block.random.nested_structs.6
ES3-CTS.shaders.uniform_block.random.nested_structs_arrays_instance_arrays.2
ES3-CTS.shaders.uniform_block.random.nested_structs_arrays_instance_arrays.6
ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.5
ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.19
ES3-CTS.shaders.uniform_block.random.all_shared_buffer.0
ES3-CTS.shaders.uniform_block.random.all_shared_buffer.2
ES3-CTS.shaders.uniform_block.random.all_shared_buffer.6
ES3-CTS.shaders.uniform_block.random.all_shared_buffer.12
v2: Fix rebase failure noticed by Matt.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Use the data that is stored in the ir_variable and the glsl_type to
determine whether or not a UBO member is row-major.
Fixes gles3conform failures in:
ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat2
ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat3
ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat4
ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat2x3
ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat2x4
ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat3x2
ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat3x4
ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat4x2
ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat4x3
ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat2
ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat3
ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat4
ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat2x3
ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat2x4
ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat3x2
ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat3x4
ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat4x2
ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat4x3
ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat2
ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat3
ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat4
ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat2x3
ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat2x4
ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat3x2
ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat3x4
ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat4x2
ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat4x3
ES3-CTS.shaders.uniform_block.random.nested_structs_arrays.2
ES3-CTS.shaders.uniform_block.random.nested_structs_instance_arrays.5
ES3-CTS.shaders.uniform_block.random.nested_structs_instance_arrays.9
Causes gles3conform failures in:
ES3-CTS.shaders.uniform_block.random.basic_types.8
ES3-CTS.shaders.uniform_block.random.basic_arrays.3
ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.0
ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.2
ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.13
ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.18
ES3-CTS.shaders.uniform_block.random.all_shared_buffer.4
These failures will be fixed shortly.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes gles3conform failures in:
ES3-CTS.shaders.uniform_block.random.nested_structs_arrays_instance_arrays.3
ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.13
Causes gles3conform failures in:
ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.9
This failure will be fixed shortly.
v2: Use without_array() instead of older predicates.
v3: s/GLSL_MATRIX_LAYOUT_DEFAULT/GLSL_MATRIX_LAYOUT_INHERITED/g
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
v2: Rename GLSL_MATRIX_LAYOUT_DEFAULT to GLSL_MATRIX_LAYOUT_INHERITED.
Add comments in glsl_types.h explaining the layouts. Suggested by Matt.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This causes the thing following the structure to be vec4-aligned.
Fixes gles3conform failures in:
ES3-CTS.shaders.uniform_block.random.nested_structs.2
ES3-CTS.shaders.uniform_block.random.all_shared_buffer.5
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
I also considered renaming visit_field(const glsl_struct_field *) to
entry_record and adding an exit_record method. This would be more
similar to the hierarchical visitor.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Commit 32f32292 (glsl: Allow elimination of uniform block members)
enabled elimination of unused uniform block members to fix a gles3
conformance test failure. This went too far the other way.
Section 2.11.6 (Uniform Variables) of the OpenGL ES 3.0.3 spec says:
"All members of a named uniform block declared with a shared or
std140 layout qualifier are considered active, even if they are not
referenced in any shader in the program. The uniform block itself is
also considered active, even if no member of the block is
referenced."
Fixes gles3conform failures in:
ES3-CTS.shaders.uniform_block.single_nested_struct.per_block_buffer_shared
ES3-CTS.shaders.uniform_block.single_nested_struct.per_block_buffer_std140
ES3-CTS.shaders.uniform_block.single_nested_struct_array.per_block_buffer_shared
ES3-CTS.shaders.uniform_block.single_nested_struct_array.per_block_buffer_std140
ES3-CTS.shaders.uniform_block.random.scalar_types.2
ES3-CTS.shaders.uniform_block.random.scalar_types.9
ES3-CTS.shaders.uniform_block.random.vector_types.1
ES3-CTS.shaders.uniform_block.random.vector_types.3
ES3-CTS.shaders.uniform_block.random.vector_types.7
ES3-CTS.shaders.uniform_block.random.vector_types.9
ES3-CTS.shaders.uniform_block.random.basic_types.5
ES3-CTS.shaders.uniform_block.random.basic_types.6
ES3-CTS.shaders.uniform_block.random.basic_arrays.0
ES3-CTS.shaders.uniform_block.random.basic_arrays.2
ES3-CTS.shaders.uniform_block.random.basic_arrays.5
ES3-CTS.shaders.uniform_block.random.basic_arrays.8
ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.0
ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.4
ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.5
ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.6
ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.9
ES3-CTS.shaders.uniform_block.random.nested_structs.0
ES3-CTS.shaders.uniform_block.random.nested_structs.1
ES3-CTS.shaders.uniform_block.random.nested_structs_arrays.4
ES3-CTS.shaders.uniform_block.random.nested_structs_instance_arrays.8
ES3-CTS.shaders.uniform_block.random.nested_structs_arrays_instance_arrays.7
ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.3
ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.6
ES3-CTS.shaders.uniform_block.random.all_shared_buffer.18
v2: Whitespace and other minor fixes suggested by Matt.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Returns the type without any arrays.
This will be used in later patches in this series.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Just a few lines earlier we may have wrapped the index expression with
ir_unop_i2u expression. Whenever that happens, as_constant will return
NULL, and that almost always happens.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Since the ralloc test in util/tests needs gtest, we need to make sure that
the gtest subdir is loaded first. This fixes bug #82148.
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
With this patch, the SVGA_3D_CMD_BIND_GB_SHADER functionality will reserve
two relocations, one for the shader ID and the second for the MOB ID.
Verified with the WDDM winsys path that the number of relocations and patch
locations required is two.
Fixes Bug 1277406
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
This gathers macros that have been included across components into util so
that the include chain can be more vertical. In particular, this makes
util stand on its own without any dependence whatsoever on the rest of
mesa.
Signed-off-by: "Jason Ekstrand" <jason.ekstrand@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This hash table is used in core Mesa, the GLSL compiler, and the i965
driver, which makes it a good candidate for the new src/util module.
It's much faster than program/hash_table.[ch] (see commit 6991c2922f
for data), and José's u_hash_table.c has a comment saying Gallium should
probably consider switching to a linear probing hash table at some point.
So this seems like the best candidate for a shared data structure.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
v2 (Jason Ekstrand): Pick up another hash_table use and patch up scons
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
For a long time, we've wanted a place to put utility code which isn't
directly tied to Mesa or Gallium internals. This patch creates a new
src/util directory for exactly that purpose, and builds the contents as
libmesautil.la.
ralloc seemed like a good first candidate. These days, it's directly
used by mesa/main, i965, i915, and r300g, so keeping it in src/glsl
didn't make much sense.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
v2 (Jason Ekstrand): More realloc uses and some scons fixes
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
With earlier commit we've conditionally enabled/added the kms_dri target
for automake builds. Unfortunately the we forgot to add the appropriate
define in the scons build, resulting in a broken library due to the
undefined symbol 'kms_swrast_create_screen'.
Reported-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Roland Scheidegger <sroland@vmware.com>
warning: type qualifiers ignored on function return type
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>