The buffer_offset is used in aligned_vertex_buffer_offset.
But now that most of these decisions are done in compile_vertex_list
we can work on local variables instead of struct members in the
display list code. Clean that up and remove buffer_offset.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Replace last use on replay with _vbo_save_get_{min,max}_index. Appart from
that it is not used anymore.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Is not used anymore on replay, move the last use in display list
compilation to the original array in the display list compiler.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Is not used anymore on replay, move the last use in display list
compilation to the original array in the display list compiler.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Since we now store a set of VAOs in the display list, use these object
to get the reference to the VBO in several places.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Use the information already present in the VAO to update the current values
after display list replay. Set GL_OUT_OF_MEMORY on allocation failure
for the current value update storage.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Use the information already present in the VAO to replay a display list
node using immediate mode draw commands. Use a hand full of helper methods
that will be useful for the next patches also.
v2: Insert asserts, constify local variables.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
The master value is now stored inside the VAO already present in
struct vbo_save_vertex_list. Remove the unneeded copy from dlist storage.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
We don't zalloc the physical device so we need to unconditionally set
everything. Crucible helpfully initializes all allocations to 139 so it
was getting true regardless of whether or not the kernel actually
supports context priorities.
Fixes: 6d8ab53303 "anv: implement VK_EXT_global_priority extension"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This reverts commit a2c1e48f15.
On BDWGT3e and KBLGT3e systems, this commit regressed the following
tests:
piglit.spec.ext_framebuffer_multisample.accuracy 2 stencil_resolve small depthstencil
piglit.spec.ext_framebuffer_multisample.accuracy 4 stencil_resolve small depthstencil
piglit.spec.ext_framebuffer_multisample.accuracy 6 stencil_resolve small depthstencil
piglit.spec.ext_framebuffer_multisample.accuracy 8 stencil_resolve small depthstencil
piglit.spec.ext_framebuffer_multisample.accuracy all_samples stencil_resolve small depthstencil
This stops a crash when running (still fails):
tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Nothing to do except using a busy wait loop. At least for old kernels.
A better implementation for newer kernels to come later.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105255
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
The nir->llvm conversion was using the wrong srcs.
Fixes:
tests/spec/arb_compute_shader/execution/shared-atomics.shader_test
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Gen4 point clipping calls brw_clip_tri_alloc_regs with nr_verts == 0,
which means that c->reg.vertex[] isn't initialized. It then emits MOVs
to stomp components of those uninitialized registers to 0.
This started causing assertions after Matt's recent series, when those
uninitialized registers started getting BRW_REGISTER_TYPE_NF, which
definitely doesn't exist on Gen4-5.
Reviewed-by: Matt Turner <mattst88@gmail.com>
We need to align the size of the slice, not the offset of the next slice.
Fixes KHR-GLES3.texture_repeat_mode.rgba32ui_11x131_2_clamp_to_edge.
Fixes: b4b4ada761 ("broadcom/vc5: Fix layout of 3D textures.")
Before, we were trusting in the hardware to take the intersection
of the viewport clip with the drawing rectangle. Unfortunately,
3DSTATE_DRAWING_RECTANGLE is fairly expensive because it implicitly
does a full pipeline stall. If we're a bit more careful with our
viewport clipping, we can just re-emit it once at context creation
time.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The delayed loading code was fail if we had control flow.
This fixes:
tests/spec/arb_shader_image_load_store/execution/image_checkerboard.shader_test
v2: don't use temp_reg before setting temp_reg up.
Tested-by: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
With the Align16 tests now disabled, we can run the rest of the tests in
ICL mode (and see them pass!)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Align16 is no more. We previously generated an align16 ADD instruction
to calculate DDY:
add(16) g25<1>F -g23<4>.xyxyF g23<4>.zwzwF { align16 1H };
Without align16, we now implement it as:
add(4) g25<1>F -g23<0,2,1>F g23.2<0,2,1>F { align1 1N };
add(4) g25.4<1>F -g23.4<0,2,1>F g23.6<0,2,1>F { align1 1N };
add(4) g26<1>F -g24<0,2,1>F g24.2<0,2,1>F { align1 1N };
add(4) g26.4<1>F -g24.4<0,2,1>F g24.6<0,2,1>F { align1 1N };
where only the first two instructions are needed in SIMD8 mode.
Note: an earlier version of the patch implemented this in two
instructions in SIMD16:
add(8) g25<2>F -g23<4,2,0>F g23.2<4,2,0>F { align1 1N };
add(8) g25.1<2>F -g23.1<4,2,0>F g23.3<4,2,0>F { align1 1N };
but I realized that the channel enable bits will not be correct. If we
knew we were under uniform control flow, we could emit only those two
instructions however.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
In a future patch, generate_ddy will want to inspect inst->exec_size.
Change generate_ddx as well for consistency.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The PLN instruction is no more. Its functionality is now implemented
using two MAD instructions with the new native-float type. Instead of
pln(16) r20.0<1>:F r10.4<0;1,0>:F r4.0<8;8,1>:F
we now have
mad(8) acc0<1>:NF r10.7<0;1,0>:F r4.0<8;8,1>:F r10.4<0;1,0>:F
mad(8) r20.0<1>:F acc0<8;8,1>:NF r5.0<8;8,1>:F r10.5<0;1,0>:F
mad(8) acc0<1>:NF r10.7<0;1,0>:F r6.0<8;8,1>:F r10.4<0;1,0>:F
mad(8) r21.0<1>:F acc0<8;8,1>:NF r7.0<8;8,1>:F r10.5<0;1,0>:F
... and in the case of SIMD8 only the first pair of MAD instructions is
used.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
If multiple instructions are emitted, special handling of things like
conditional mod and NoDDClr/NoDDChk need to be performed.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This isn't technically broken, but the next patch will make this
function report whether it generated multiple instructions, and that
information will be used to disable the application of conditional mod
by the generic code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This new type exposes the additional precision offered by the
accumulator register and will be used in the next patch to implement the
functionality of the PLN instruction using a pair of MAD instructions.
One weird thing to note: align1 ternary instructions may only have an
accumulator in the dst or src1 normally, but when src0's type is :NF
the accumulator is read.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The hardware register types' encodings have changed on Gen11. Good thing
we have that superfluous looking brw_reg_type abstraction lying around!
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Gen11 does not support DF, Q, UQ types in hardware. As a result, we have
to disable some GL extensions until they can be reimplemented.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
anv_gem_set_context_param is to be used directly instead!
Fixes: 6d8ab53303 "anv: implement VK_EXT_global_priority extension"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fix clipper validMask setting. We don't need to run frustum rejected
primitives through the clipper. Perform frustum culling with only
frustum clip codes. Guardband clip codes cannot be used because they
overlap frustum codes.
Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>