Register strides higher than 4 are uncommon but they can happen. For
instance, if you have a 64-bit extract_u8 operation, we turn that into
UB -> UQ MOV with a source stride of 8. Our previous calculation would
try to generate a stride of <32;8,8>:ub which is invalid because the
maximum horizontal stride is 4. To solve this problem, we instead use a
stride of <8;1,0>. As noted in the comment, this does not work as a
destination but that's ok as very few things actually generate that
stride.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
The lod clamping is what limits you between base and last level, and the
base level field is just there to help decide where the min/mag change
happens.
Fixes tex-miplevel-selection GL2:texture()
The ordering of the values was even less obvious than I thought, with both
the mip filter and the min filter being in different bits depending on
whether the mip filter is none.
Fixes piglit fs-textureLod-miplevels.shader_test
The HW doesn't pad the slice's height to make a full 4x4 group of UIF
blocks. We just need to pad to columns, and the start of the next column
appears in the bottom of the previous column's last block.
Fixes piglit fs-textureOffset-2D.
This is so much more pleasant to write than the manual
V3D33_whatever_pack() calls, and will be useful for when we start doing
actual per-V3D compiles.
As with blending, we'll have the bit flagged again when it gets reenabled
in CONFIGURATION_BITS, so there's no need to emit test state if we're not
testing.
Groups containing fields smaller than a byte probably not being decoded
correctly. For example:
<group count="32" start="32" size="4">
<field name="Vertex Element Enables" start="0" end="3" type="uint"/>
</group>
gen_field_iterator_next would properly walk over each element of the
array, incrementing group_iter. However, the code to print the actual
values only considered iter->field->start/end, which are 0 and 3 in the
above example. So it would always fetch bits 3:0 of the current byte,
printing the same value over and over.
Cc: Eric Anholt <eric@anholt.net>
The HW puts the pad bits at the top for DEPTH_COMPONENT24, but we need it
at the bottom for texturing. Using the format with stencil probably means
we won't be able to do Z24 and separate S8, but I wasn't planning on
supporting that anyway.
Fixes hiz-depth-read-fbo-d24-s0
I saw VK_IMAGE_ASPECT_ANY_COLOR_BIT while hacking anv_formats.c and got
confused. "Huh? What extension added that?". No extension defines it;
anv_private.h defines it.
To remove confusion, rename the anv-private VK tokens as if they were
extension tokens with the ANV vendor suffix.
I found only two such tokens:
VK_IMAGE_ASPECT_ANY_COLOR_BIT
VK_IMAGE_ASPECT_PLANES_BITS
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
In anv_physical_device_get_format_properties().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
The DummyShader is used by GenFragmentShadersATI() as a placeholder to
mark IDs as allocated. Context cleanup wants to delete everything in
ctx->Shared->ATIShaders, and crashes on these placeholders with this
backtrace:
==15060== Invalid free() / delete / delete[] / realloc()
==15060== at 0x482F478: free (vg_replace_malloc.c:530)
==15060== by 0x57694F4: _mesa_delete_ati_fragment_shader (atifragshader.c:68)
==15060== by 0x58B33AB: delete_fragshader_cb (shared.c:208)
==15060== by 0x5838836: _mesa_HashDeleteAll (hash.c:295)
==15060== by 0x58B365F: free_shared_state (shared.c:377)
==15060== by 0x58B3BC2: _mesa_reference_shared_state (shared.c:469)
==15060== by 0x578687F: _mesa_free_context_data (context.c:1366)
==15060== by 0x595E9EC: st_destroy_context (st_context.c:642)
==15060== by 0x5987057: st_context_destroy (st_manager.c:772)
==15060== by 0x5B018B6: dri_destroy_context (dri_context.c:217)
==15060== by 0x5B006D3: driDestroyContext (dri_util.c:511)
==15060== by 0x4A1CBE6: dri3_destroy_context (dri3_glx.c:170)
==15060== Address 0x7b5dae0 is 0 bytes inside data symbol "DummyShader"
Also, DeleteFragmentShadersATI() should not assert on DummyShader, just
remove the hash entry.
Normally one would define a shader after GenFragmentShadersATI(), and
BindFragmentShaderATI() replaces the placeholder with a real object.
However, the specification doesn't say that one has to define a shader
for each allocated ID.
Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
As discussed in this thread:
https://lists.freedesktop.org/archives/mesa-dev/2017-November/175104.html
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Chad Versace <chadversary@chromium.org>
As discussed in this thread:
https://lists.freedesktop.org/archives/mesa-dev/2017-November/175104.html
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Christian Schmidbauer <ch.schmidbauer@gmail.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Ernst Sjöstrand <ernstp@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Andres Rodriguez <andresx7@gmail.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
autotools generates libGLESv1_CM.so.1.0.0, so let's make sure meson
does the same.
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
This `version` field defines the filename for the .so.
The plan .so as well as .so.$major are always symlinks to this.
Unless I'm mistaken, only the major is ever used, so this shouldn't
matter, but for consistency with autotools (and in case it does matter),
let's always have all 3 major.minor.patch components.
(The soname isn't affected, and is always .so.$major)
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
I was hacking something stupid in doom, and hit an assert for the bitcast
following this, it definitely looks like this should be the number of 32-bit
components, not the instr level ones.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Commit 259fc50545 added linker error for
mismatching uniform precision, as required by GLES 3.0 specification and
conformance test-suite.
Several Android applications, including Forge of Empires, have shaders
which violate this rule, on a dead varying that will be eliminated.
The problem affects a big number of applications using Cocos2D engine
and other GLES implementations accept this, this poses a serious
application compatibility issue.
Starting from GLSL ES 3.0, declarations with conflicting precision
qualifiers are explicitly prohibited. However GLSL ES 1.00 does not
clearly specify the behavior, except that
"Uniforms are defined to behave as if they are using the same storage in
the vertex and fragment processors and may be implemented this way.
If uniforms are used in both the vertex and fragment shaders, developers
should be warned if the precisions are different. Conversion of
precision should never be implicit."
The word "used" is not clear in this context and might refer to
1) declared (same as GLES 3.x)
2) referred after post-processing, or
3) linked after all optimizations are done.
Looking at existing applications, 2) or 3) seems to be widely adopted.
To avoid compatibility issues, turn the error into a warning if GLSL ES
version is lower than 3.0 and the data is dead in at least one of the
shaders.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97532
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We can avoid adding the buffer in the non-local case, this will
avoid all the overhead of the indirect call.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The next patch will try and avoid calling the indirect function.
v2: add a missing conversion.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The function that calls us has just added the buffer to the
list already, no need to try and add it again.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
There's no point recalculating these the whole time on descriptor
emission, just store them at pipeline creation.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Neil Roberts <neil@linux.intel.com>
Hilariously this is a fairly big win. Neil's multi-context-test
improves from ~24 to ~36 fps with llvmpipe on a Core i5-3317U. softpipe
also improves, from about 2.25 to 3.09 fps (when it's that slow, you're
allowed to be that precise).
I'd have added it to swrast classic, but the testcase wants GL 3.0 and
shaders, and that's not a thing classic has, so I figured making it work
on softpipe was crime enough.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>