Given the usecase we have of trying to measure timestamps across individual
draw calls, flushing will totally mess up what people are trying to measure.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The theory I had when I wrote the code was that you wanted to minimize latency
on your queries because the app was going to ask soon. Only, it turns out
that everybody batches up their queries and asks for the results later (often
after the next SwapBuffers!), so this was a pessimization.
Until now, I had no workload where it mattered enough to benchmark. Recently
I started playing some Minecraft, which uses tons of queries to decide whether
to render chunks of the terrain. For that app, avoiding the flush in the
query-generation loop improves performance 22.7% +/- 4.7% (n=3) on an apitrace
capture of it (confirmed in game by watching the fps meter found by pressing
F3, 15/16 -> 20/21 fps).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
otherwise some compilers will throw error
"error: format not a string literal and no format arguments"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Fixes WebGL texture mips conformance test, no piglit regressions.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44912
NOTE: This is a candidate for the stable branches.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Simplify is_zero() somewhat, and as a side effect work around a gcc compiler
bug that causes build failure.
https://bugs.freedesktop.org/show_bug.cgi?id=56140
Reported-by: Dmitry Cherkassov <dcherkassov@gmail.com>
If GL_BASE_LEVEL==0 and GL_MAX_LEVEL==0 that's a pretty good hint that
there'll be a single mipmap level in the texture.
Google Earth sets the texture's state this way before the first glTexImage
call. This saves a bit of texture memory.
Fixes piglit tests "unpack-teximage2d --pbo=* --format=GL_BGRA" on
Sandybridge+.
The fastpath was checking an incomplete set of pixel unpack state. This
patch adds checks for all the fields of gl_pixelstore_attrib that affect
2D texture uploads. Also, it begins permitting the case where
GL_UNPACK_ROW_LENGTH is 0.
Ideally, we would just ask a unicorn to JIT this fastpath for us in
a way that safely handles the unpacking state. Until then, it's safer if
only a small set of situations activate the fastpath.
v2: Use _mesa_is_bufferobj(), per Anholt.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
It doesn't provide the cross-process buffer sharing that a window system
pixmap could otherwise support and we don't have anything left that uses
this type of surface.
The 0.99.0 Wayland release changes the event API to provide a thread-safe
mechanism for receiving events specific to a subsystem (such as EGL) and
we need to use it in the EGL platform.
The Wayland protocol now also requires a commit request to make changes
take effect, issue that from eglSwapBuffers.
Now that we've replaced all the variable settings other than reg_width, it's
easy to hang on to this (the expensive part of setting up the allocator).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Improves performance of the Lightsmark penumbra shadows scene by 15.7% +/-
1.0% (n=15), by eliminating register spilling. (tested by smashing the list of
scenes to have all other scenes have 0 duration -- includes additional
rendering of scene description text that normally doesn't appear in that
scene)
v2: Allow allocation of all but g0/g1 of the payload.
v3: Pull count_to_loop_end() out to a helper function.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2, recommended v3)
Based on split_virtual_grfs(), we choose the same set every time, so set it in
stone. This will help us avoid regenerating the somewhat expensive
class/register set setup every compile.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is derived from the FS visitor code for the same, but tracks each channel
separately (otherwise, some typical fill-a-channel-at-a-time patterns would
produce excessive live intervals across loops and cause spilling).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48375
(crash -> failure, can turn into pass by forcing unrolling still)
These messages always have m0 = g0 and m1 = offset, and write has m2 = data.
Avoids regression in opt_compute_to_mrf() with a change to scratch writes to
set up the data as an MRF write in the IR.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Note that BRW_PREDICATE_NONE is 0 and BRW_PREDICATE_NORMAL is 1, so that's a
lot like the true/false we had in the FS before.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
fs_bblock_link -> bblock_link
fs_bblock -> bblock_t (to avoid conflicting with all the fs_bblock *bblock)
fs_cfg -> cfg_t (to avoid conflicting with all the fs_cfg *cfg)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This fixes confusion by the upcoming live variable analysis which saw e.g. use
of temp.w when only temp.xyz were initialized in the basic block, and
concluded that temp.w must have come from outside of the block (even though it
was never initialized anywhere).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Due to a string mismatch, INTEL_swap_event wasn't listed among GLX
extensions for the connection, even when present on both client and
server. That is, glXQueryServerString and glXGetClientString reported the
extension, but glXQueryExtensionsString did not.
Note: This is a candidate for the stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56057
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Per commentary and direction in the LLVM community, support for ppc64 is
going into MCJIT rather than the old JIT. There is no existing support
in prior llvm versions, so no need to specify LLVM version numbers.
Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Signed-off-by: José Fonseca <jfonseca@vmware.com>