Before this patch, the following code would not be optimized even though
the first two instructions were common to the then and else blocks:
(+f0) IF
MOV dst0 ...
MOV dst1 ...
MOV dst2 ...
ELSE
MOV dst0 ...
MOV dst1 ...
MOV dst3 ...
ENDIF
This commit extends the peephole to handle this case.
No shader-db changes.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
fs_visitor::try_replace_with_sel optimizes only if statements whose
"then" and "else" bodies contain a single MOV instruction. It also
could not handle constant arguments, since they cause an extra MOV
immediate to be generated (since we haven't run constant propagation,
there are more than the single MOV).
This peephole fixes both of these and operates as a normal optimization
pass.
fs_visitor::try_replace_with_sel is still arguably necessary, since it
runs before pull constant loads are lowered.
total instructions in shared programs: 1559129 -> 1545833 (-0.85%)
instructions in affected programs: 167120 -> 153824 (-7.96%)
GAINED: 13
LOST: 6
Reviewed-by: Paul Berry <stereotype441@gmail.com>
This is the last thing that register_coalesce() still handled.
total instructions in shared programs: 1561060 -> 1560908 (-0.01%)
instructions in affected programs: 15758 -> 15606 (-0.96%)
Reviewed-by: Eric Anholt <eric@anholt.net>
parent_mem_ctx was unused since db47074a, so remove the two wrappers
around create() and make create() the constructor.
Reviewed-by: Eric Anholt <eric@anholt.net>
make_list is just a one-line wrapper and was confusingly called by
NULL objects. E.g., cur_if == NULL; cur_if->make_list(mem_ctx).
Reviewed-by: Eric Anholt <eric@anholt.net>
Previously we made the basic block following an ENDIF instruction a
successor of the basic blocks ending with IF and ELSE. The PRM says that
IF and ELSE instructions jump *to* the ENDIF, rather than over it.
This should be immaterial to dataflow analysis, except for if, break,
endif sequences:
START B1 <-B0 <-B9
0x00000100: cmp.g.f0(8) null g15<8,8,1>F g4<0,1,0>F
0x00000110: (+f0) if(8) 0 0 null 0x00000000UD
END B1 ->B2 ->B4
START B2 <-B1
break
0x00000120: break(8) 0 0 null 0D
END B2 ->B10
START B3
0x00000130: endif(8) 2 null 0x00000002UD
END B3 ->B4
The ENDIF block would have no parents, so dataflow analysis would
generate incorrect results, preventing copy propagation from eliminating
some instructions.
This patch changes the CFG to make ENDIF start rather than end basic
blocks, so that it can be the jump target of the IF and ELSE
instructions.
It helps three programs (including two fs8/fs16 pairs).
total instructions in shared programs: 1561126 -> 1561060 (-0.00%)
instructions in affected programs: 837 -> 771 (-7.89%)
More importantly, it allows copy propagation to handle more cases.
Disabling the register_coalesce() pass before this patch hurts 58
programs, while afterward it only hurts 11 programs.
Reviewed-by: Eric Anholt <eric@anholt.net>
glext.h has had all the necessary bits for years.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
This extension enabled the use of texture array with fixed-function and
assembly fragment shaders. No applications are known to use this
extension.
NOTE: This patch regresses GL_TEXTURE_1D_ARRAY and GL_TEXTURE_2D_ARRAY
cases of the copyteximage piglit test. The test is incorrectly using
texture arrays with fixed function while only requiring the
GL_EXT_texture_array extension. A fix for the test has been posted to
the piglit mailing list.
http://lists.freedesktop.org/archives/piglit/2013-November/008639.html
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Every driver that enables one also enables the other. The difference
between the two is MESA adds support for fixed-function and assembly
fragment shaders, but EXT only adds support for GLSL. The MESA
extension was created back when Mesa did not support GLSL.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Constify the gl_context parameter, and remove suffixes from enums that
have non-suffix versions.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
main/texobj.c: In function 'count_tex_size':
main/texobj.c:886:23: warning: unused parameter 'key' [-Wunused-parameter]
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
main/texobj.c: In function '_mesa_test_texobj_completeness':
main/texobj.c:553:34: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
main/texobj.c:553:193: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
main/texobj.c:553:254: warning: signed and unsigned type in conditional expression [-Wsign-compare]
main/texobj.c:553:148: warning: signed and unsigned type in conditional expression [-Wsign-compare]
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
There are no 3D textures in OpenGL ES 1.x.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
That enum requires GL_ARB_texture_cube_map_array, and it is only
available on desktop GL. It looks like this has been an un-noticed
issue since GL_ARB_texture_cube_map_array support was added in commit
e0e7e295.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
This adds an extension called EGL_WL_create_wayland_buffer_from_image
which adds the following single function:
struct wl_buffer *
eglCreateWaylandBufferFromImageWL(EGLDisplay dpy, EGLImageKHR image);
The function creates a wl_buffer which shares its contents with the given
EGLImage. The expected use case for this is in a nested Wayland compositor
which is using subsurfaces to present buffers from its clients. Using this
extension it can attach the client buffers directly to the subsurface without
having to blit the contents into an intermediate buffer. The compositing can
then be done in the parent compositor.
The extension is only implemented in the Wayland EGL platform because of
course it wouldn't make sense anywhere else.
If we're not using EGL_EXT_swap_buffers_with_damage, we have to
damage the full extent. EGL operates on buffer coordinates, but
wl_surface.damage takes surface coordinates. EGL doesn't know the
buffer transformation (rotated or scaled) and can't post accurate
damage in surface coordinates. The damage event however is clipped to
the surface extents so we can just damage the maximum rectangle.
In case of EGL_EXT_swap_buffers_with_damage, the application knows
the buffer transform and is expected to pass in rectangles in
surface space.
https://bugs.freedesktop.org/show_bug.cgi?id=70250
Cc: "10.0" mesa-stable@lists.freedesktop.org
flush_with_flags, when available, allows the driver to throttle.
Using this suppress input lag issues that can be observed in heavy
rendering situations on non-intel cards.
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.0" mesa-stable@lists.freedesktop.org
This typically won't make a difference, since we only send the requests at
wl_display_flush() time. There might be a small race
with another thread calling wl_display_flush() after our commit request,
but before we flush the DRI driver. Moving the commit below the DRI
driver flush call looks more natural and eliminates the small race.
Cc: "10.0" mesa-stable@lists.freedesktop.org
We would like the compositor to receive the commited buffer
as soon as possible, so it has the time to treat it, and
release old ones. We shouldn't rely on the client
to flush the queue for us.
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.0" mesa-stable@lists.freedesktop.org
Display lists allocate memory in chunks of 256 tokens (1KB) at a time.
If an app creates many short display lists or uses glXUseXFont() this
can waste quite a bit of memory.
This patch uses realloc() to trim short lists and reduce the memory
used.
Also, null/zero-out some list construction fields in _mesa_EndList().
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Now, sizeof(gl_dlist_node)==4 even on 64-bit systems. This can
halve the memory used by some display lists on 64-bit systems.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This is a first step in reducing memory used by display lists on
64-bit systems. On 64-bit systems, the gl_dlist_node union type
is 8 bytes because of the 'data' and 'next' fields. This causes
every display list node/token to occupy 8 bytes instead of 4 as
originally designed. This basically doubles the memory used by
some display lists on 64-bit systems.
The fix is to remove the 64-bit 'data' and 'next' pointer fields
from the union and instead store them as a pair of 32-bit values.
Easily done with a few helper functions.
The next patch will take care of the 'next' field.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>