With the following structures :
struct StructA
{
uint64_t value0;
uint8_t value1;
};
struct TopStruct
{
struct StructA a;
uint8_t value3;
};
Currently offsetof(struct TopStruct, value3) = 9. While the same code
on the CPU gives offsetof(struct TopStruct, value3) = 16.
This is impacting OpenCL kernels we're trying to use to build
acceleration structures.
v2: Add comment/link to some description of the alignment/size
computation
Cc: mesa-stable
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16940>
util_cpu_detect is an anti-pattern: it relies on callers high up in the call
chain initializing a local implementation detail. As a real example, I added:
...a Mali compiler unit test
...that called bi_imm_f16() to construct an FP16 immediate
...that calls _mesa_float_to_half internally
...that calls util_get_cpu_caps internally, but only on x86_64!
...that relies on util_cpu_detect having been called before.
As a consequence, this unit test:
...crashes on x86_64 with USE_X86_64_ASM set
...passes on every other architecture
...works on my local arm64 workstation and on my test board
...failed CI which runs on x86_64
...needed to have a random util_cpu_detect() call sprinkled in.
This is a bad design decision. It pollutes the tree with magic, it causes
mysterious CI failures especially for non-x86_64 developers, and it is not
justified by a micro-optimization.
Instead, let's call util_cpu_detect directly from util_get_cpu_caps, avoiding
the footgun where it fails to be called. This cleans up Mesa's design,
simplifies the tree, and avoids a class of a (possibly platform-specific)
failures. To mitigate the added overhead, wrap it all in a (fast) atomic
load check and declare the whole thing as ATTRIBUTE_CONST so the
compiler will CSE calls to util_cpu_detect.
Co-authored-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15580>
Unused variable vector_elements is now used in return from
function decode_type_from_blob instead of encoded.basic.vector_elements.
In the code we can see how those variables were equated
and then the operations were made exactly to vector_elements.
But variable didn't pass into any other variables or functions.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5536
Signed-off-by: Viktoriia Palianytsia <v.palianytsia@globallogic.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13725>
This is separate from images and samplers. It's a texture (not a
storage image) without a sampler. We also add C-visible helpers to
convert between sampler and image types.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13389>
From the ARB_enhanced_layouts spec:
"As with input layout qualifiers, all shaders except compute shaders
allow *location* layout qualifiers on output variable declarations,
output block declarations, and output block member declarations. Of
these, variables and block members (but not blocks) additionally
allow the *component* layout qualifier."
We previously had compile tests in piglit to make sure this was not a
compile error but no execution tests.
Fixes: d99a040bbf ("i965: enable ARB_enhanced_layouts for gen8+")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10763>
One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is
explicitly disabled.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>
When packing varyings when there is only 32bit of space
left in slot 64bit type is attempted to be divided between
current and next slot. However there is neither code for
splitting the 64bit type nor for assembling it back.
Instead we add 32bit padding.
The above happens only in structs because their
members aren't sorted by packing order.
Example of the issue:
struct S {
vec3 a;
double d;
};
out flat S s;
Before, the packing went as:
slot 32b 32b 32b 32b
0 a.x a.y a.z d
1 d 0 0 0
After:
slot 32b 32b 32b 32b
0 a.x a.y a.z 0
1 d d 0 0
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2333>
Fix defect reported by Coverity Scan.
Uninitialized pointer field (UNINIT_CTOR)
uninit_member: Non-static class member name is not initialized in
this constructor nor in any functions that it calls.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7766>
LLVM loves take advantage of the fact that vec3s in OpenCL are 16B
aligned and so it can just read/write them as vec4s. This results in a
LOT of vec4->vec3 casts on loads and stores. One solution to this
problem is to get rid of all vec3 variables.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6871>
When creating explicit type, the alignment information is lost, thus
forcing explicit type users to recalculate the alignment using the same
size_align() function. Let's add a new field to cache this information.
Only structs, matrices, and vectors have and explicit alignment. Arrays
alignment is implicitly set to its element alignment and matrices are
required to have an alignment that matches that of its vector columns.
the concept of alignment simply doesn't apply to other types.
We make the strategic choice to not allow explicit alignments on
scalars. This is for a couple of reasons:
1. There are no cases today where we use explicit types where we want
any other alignment for scalars than natural alignment.
2. Vectors don't have a component alignment that's separate from the
explicit_alignment so it's impossible to get an explicitly aligned
scalar type which is the component of the explicitly aligned vector
type.
This choice may cause problems if we ever want to use explicitly laid
out types for things like varyings where we sometimes want vec4
alignment of scalars. We can deal with that when the time comes.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
OpenCL doesn't mandate a size and this is consistent with the rest of
the glsl_type system. While we're here, we also clean ::cl_size() up a
bit and use a new explicit_type_scalar_byte_size() helper.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
Right now, when calling get_explicit_type_for_size_align() on a packed
struct, the packed attribute is lost and field offsets are wrong.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
There's a few quirks: kernel images are untyped, whether they're
sampled is unknown, and they're passed as inputs to the kernel even though
SPIR-V declares their address space as UniformConstant.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5242>
This should fix such valgrind warnings:
==37417== Uninitialised byte(s) found during client check request
==37417== at 0x6183471: blob_write_bytes (blob.c:163)
==37417== by 0x629785B: encode_type_to_blob (glsl_types.cpp:2760)
==37417== by 0x61E68D8: write_variable (nir_serialize.c:293)
==37417== by 0x61E6F6A: write_var_list (nir_serialize.c:421)
==37417== by 0x61EBA7A: nir_serialize (nir_serialize.c:2018)
==37417== by 0x5B5E007: serialize_nir_part (brw_program_binary.c:135)
==37417== by 0x5B5E7F3: brw_serialize_program_binary (brw_program_binary.c:299)
==37417== by 0x5FEF5FF: write_program_payload (program_binary.c:177)
==37417== by 0x5FEF7BB: _mesa_get_program_binary_length (program_binary.c:225)
==37417== by 0x5E3D31D: get_programiv (shaderapi.c:912)
==37417== by 0x5E3F730: _mesa_GetProgramiv (shaderapi.c:1827)
==37417== by 0x111DA0: program_binary_save_restore (shader_runner.c:686)
==37417== Address 0x8f59481 is 81 bytes inside a block of size 480 alloc'd
==37417== at 0x483B7F3: malloc (vg_replace_malloc.c:309)
==37417== by 0x618CE67: ralloc_size (ralloc.c:123)
==37417== by 0x618CF35: rzalloc_size (ralloc.c:155)
==37417== by 0x618D245: rzalloc_array_size (ralloc.c:234)
==37417== by 0x629041D: glsl_type::glsl_type(glsl_struct_field const*, unsigned int, glsl_interface_packing, bool, char const*) (glsl_types.cpp:148)
==37417== by 0x6293EC3: glsl_type::get_interface_instance(glsl_struct_field const*, unsigned int, glsl_interface_packing, bool, char const*) (glsl_types.cpp:1271)
==37417== by 0x604C878: (anonymous namespace)::per_vertex_accumulator::construct_interface_instance() const (builtin_variables.cpp:365)
==37417== by 0x6050722: (anonymous namespace)::builtin_variable_generator::generate_varyings() (builtin_variables.cpp:1568)
==37417== by 0x60509CA: _mesa_glsl_initialize_variables(exec_list*, _mesa_glsl_parse_state*) (builtin_variables.cpp:1600)
==37417== by 0x6149AE9: _mesa_ast_to_hir(exec_list*, _mesa_glsl_parse_state*) (ast_to_hir.cpp:131)
==37417== by 0x60706D6: _mesa_glsl_compile_shader (glsl_parser_extras.cpp:2222)
==37417== by 0x5E3DC16: _mesa_compile_shader (shaderapi.c:1211)
==37417== Use of uninitialised value of size 8
==37417== at 0x529AE13: ??? (in /usr/lib/x86_64-linux-gnu/libz.so.1.2.11)
==37417== by 0x6184075: util_hash_crc32 (crc32.c:127)
==37417== by 0x5FEF401: write_program_binary (program_binary.c:95)
==37417== by 0x5FEF8BC: _mesa_get_program_binary (program_binary.c:252)
==37417== by 0x5E40E22: _mesa_GetProgramBinary (shaderapi.c:2411)
==37417== by 0x4914057: stub_glGetProgramBinary (piglit-dispatch-gen.c:24737)
==37417== by 0x111E4A: program_binary_save_restore (shader_runner.c:704)
==37417== by 0x11F765: piglit_display (shader_runner.c:5112)
==37417== by 0x499082F: run_test (piglit_fbo_framework.c:52)
==37417== by 0x4980E89: piglit_gl_test_run (piglit-framework-gl.c:229)
==37417== by 0x110DA9: main (shader_runner.c:72)
v2: - decode_glsl_struct_field_from_blob and
encode_glsl_struct_field should be `static`
( Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> )
v3: - we can get rid of `struct packed_struct_field_flags`
( Tapani Pälli <tapani.palli@intel.com> )
- we can get rid of `unsigned __pad: 15` bitfield
( Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> )
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Andrii Simiklit <asimiklit.work@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5054>
There is no uint to int implicit conversion in glsl, this is just
a typo in the name of this function. The correct one would be:
has_implicit_int_to_uint_conversion.
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4884>
Insertions can modify entry->data. Seems to fix random Fossilize crashes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
CC: <mesa-stable@lists.freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4335>
A few hash_table users roll their own integer hash functions which
call _mesa_hash_data to perform the hashing which ultimately calls
into XXH32 with a dynamic key length. When using small keys with a
constant size the hash rate can be greatly improved by inlining
XXH32 and providing it a constant key length, see:
https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html
Additionally, this patch removes calls to _mesa_key_hash_string and
makes them instead call _mesa_has_string directly, matching the new
integer hash functions.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
To implement NIR-to-TGSI, we need to be able to get the size of the
uniform variable for the TGSI declaration, not just the
.driver_location. With its location in mesa/st, drivers couldn't link
to it from nir-to-tgsi.
This feels like a common enough function to want, so let's share it in
the core compiler.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>
The only bit that gallium varied on was handling of bindless. We can
retain previous behavior for count_attribute_slots() by passing in
"true" (though I suspect this is just giving a silly answer to a silly
question), and delete our recursive function from mesa/st.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>
This introduces new vec8 and vec16 instructions (which are the only
instructions taking more than 4 sources), in order to construct 8 and 16
component vectors.
In order to avoid fixing up the non-autogenerated nir_build_alu() sites
and making them pass 16 src args for the benefit of the two instructions
that take more than 4 srcs (ie vec8 and vec16), nir_build_alu() is has
nir_build_alu_tail() split out and re-used by nir_build_alu2() (which is
used for the > 4 src args case).
v2 (Karol Herbst):
use nir_build_alu2 for vec8 and vec16
use python's array multiplication syntax
add nir_op_vec helper
simplify nir_vec
nir_build_alu_tail -> nir_builder_alu_instr_finish_and_insert
use nir_build_alu for opcodes with <= 4 sources
v3 (Karol Herbst):
fix nir_serialize
v4 (Dave Airlie):
fix serialization of glsl_type
handle vec8/16 in lowering of bools
v5 (Karol Herbst):
fix load store vectorizer
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
glsl 4.4 spec section '5.9 expressions':
"The operator is multiply (*), where both operands are matrices or one operand is a vector and the
other a matrix. A right vector operand is treated as a column vector and a left vector operand as a
row vector. In all these cases, it is required that the number of columns of the left operand is equal
to the number of rows of the right operand. Then, the multiply (*) operation does a linear
algebraic multiply, yielding an object that has the same number of rows as the left operand and the
same number of columns as the right operand. Section 5.10 “Vector and Matrix Operations”
explains in more detail how vectors and matrices are operated on."
This fix disallows a multiplication of incompatible matrices like:
mat4x3(..) * mat4x3(..)
mat4x2(..) * mat4x2(..)
mat3x2(..) * mat3x2(..)
....
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111664
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
We want to detect invalid refcounting so assert we have at least one
use before creating types.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
v2: use glsl_type_size_align_func
v2: move get_explicit_type() to glsl_types.cpp/nir_types.cpp
v2: use align() instead of util_align_npot()
v2: pack arrays a bit tighter
v2: rename mem_* to field_*
v2: don't attempt to handle when struct offsets are already set
v2: use column_type() instead of recreating it
v2: use a branch instead of |= in nir_lower_to_explicit_impl()
v2: assign locations to variables and update shared_size and num_shared
v2: allow the pass to be used with nir_var_{shader_temp,function_temp}
v4: rebase
v5: add TODO
v5: small formatting changes
v5: remove incorrect assert in get_explicit_type()
v5: rename to nir_lower_vars_to_explicit_types
v5: correctly update progress when only variables are updated
v5: rename get_explicit_type() to get_explicit_shared_type()
v5: add comment explaining how get_explicit_shared_type() is different
v5: update cast strides
v6: update progress when lowering nir_var_function_temp variables
v6: formatting changes
v6: add more detailed documentation comment for get_explicit_shared_type
v6: rename get_explicit_shared_type to get_explicit_type_for_size_align
v7: fix comment in nir_lower_vars_to_explicit_types_impl()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (v5)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
While using SPIR-V shaders (ARB_gl_spirv), layout data is not implicit
to a specific value (std140, std430, etc) but explicitly included on
the type (explicit values for offset, stride and row_major).
So this method is equivalent to the existing std140_size and
std430_size, but using such explicit values.
Note that the value returned by this method is only valid if such data
is set, so when dealing with SPIR-V shaders.
v2: (all changes suggested by Jason Ekstrand)
* Iterate through all struct members, instead of assume that fields
are ordered by offset
* Use else if
* Take into account the case that explicit_stride > elem_size, to
fine graine the final size on arrays and matrices
* Handle different bit-sizes in general, not just 32 and 64.
v3: (change suggested by Caio Marcelo de Oliveira Filho)
* fix up explicit_size() to consider interface types
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Neil Roberts <nroberts@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
It only accepts 32-bit integers so it should have a more descriptive
name. This patch should not be a functional change.
Reviewed-by: Karol Herbst <kherbst@redhat.com>