I was going to send this as review for dce1e1a8, but I missed that
window. This saves 64 bytes of unshared data and prelaces it with 96
bytes shared text. My guess is that some of the calls to memcpy get
optimized to something else.
text data bss dec hex filename
7847613 220208 27432 8095253 7b8615 i965_dri.so before
7847709 220144 27432 8095285 7b8635 i965_dri.so after
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Brian Paul <brianp@vmware.com>
The ISO C99 standard (7.18.4) specifies that C++
implementations should define UINT64_C only when
__STDC_CONSTANT_MACROS is defined.
Because we now use UINT64_C in our cpp files (since commit
208bfc493d), we need to add this define.
This also solves compilation errors with GCC 4.8.x on ppc64le machines.
v2: add this define to SCons build system
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Gen9+ requires us to emit 3DSTATE_BINDING_TABLE_POINTERS_HS for the
hull shader push constants to take effect. The passthrough TCS uses
push constants for the default tessellation levels. So, when those
change, we need to re-upload the binding table as well.
Fixes five Piglit tests on Skylake:
- spec/arb_tessellation_shader/vs-tes-vertex
- spec/arb_tessellation_shader/vs-tes-tessinner-tessouter-inputs-quads
- spec/arb_tessellation_shader/vs-tes-tessinner-tessouter-inputs-tris
- spec/arb_tessellation_shader/tes-read-texture
- spec/arb_tessellation_shader/tess_with_geometry
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
In testing KBL, I found:
- urb size was not set for slices gt1.5, gt2, and gt3. The value I
used for these slices (384) was taken from an earlier patch authored
by Ben Widawsky.
- slice count was missing. This field was added by
a403ad4f5a
With this commit, KBL passes piglit at parity with SKL.
Note: As requested by Kristian, Sarah modified this patch to drop
setting urb size for gt1.5, gt2, and gt3, since the correct default is
set in the GEN9 macro by commit c1e38ad370
"i965/skl: Use larger URB size where available."
Signed-off-by: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
Cc: "11.1" <mesa-stable@lists.freedesktop.org>
For the case where we convert a double to an int, we should
round the same as we do for floats.
This fixes GL41-CTS.gpu_shader_fp64.state_query
v2: add IROUNDD (Ilia)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The lower_named_interface_blocks() pass is called before we try
assign locations to varyings so this shouldn't be reachable.
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
The lower_named_interface_blocks() pass is called before we try
assign locations to varyings so this shouldn't be reachable.
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
OpenGL 2.0 function StencilOp() is in part internally implemented via
StencilOpSeparate(). This change happened some time ago, however the
accompanying doxygen todo comment was not accordingly updated.
Replace the outdated portion of this doxygen todo comment, leaving the
remainder unchanged.
Also better respect the 80 character suggested line length in this file.
v2: Fully remove comment, following code review by t_arceri@yahoo.com.au
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Currently, opt_vectorize() tries to combine:
result.x = bitfieldInsert(src0.x, src1.x, src2.x, src3.x);
result.y = bitfieldInsert(src0.y, src1.y, src2.y, src3.y);
result.z = bitfieldInsert(src0.z, src1.z, src2.z, src3.z);
result.w = bitfieldInsert(src0.w, src1.w, src2.w, src3.w);
into a single ir_quadop_bitfield_insert opcode, which operates on
ivec4s. However, GLSL IR's opcodes currently require the bits and
offset parameters to be scalar integers. So, this breaks.
We want to be able to vectorize this eventually, but for now, just
chicken out and make opt_vectorize() bail by marking all the bitfield
insert/extract related opcodes as horizontal. This is a relatively
uncommon case today, so we'll do the simple fix for stable branches,
and fix it properly on master.
Fixes assertion failures when compiling Shadow of Mordor vertex shaders
on i965 in vec4 mode (where OptimizeForAOS enables opt_vectorize()).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
SCons doesn't understand nir yet and doesn't want to compile the glsl to
nir pass. Move the files to their own variable so we can add it only for
automake.
Tested-by: Brian Paul <brianp@vmware.com>
These are used by code that doesn't necessarily link to libglsl.la. Move
them to shader_enums.[ch] where we keep similar helpers.
Reviewed-by: Matt Turner <mattst88@gmail.com>
The scope of libi965_compiler.la is to be able to take nir shaders and
generate i965 EU code. As such, we don't want the GLSL IR lowering
passes in the library. With this change, libi965_compiler.la no longer
needs to link to libglsl.la.
Reviewed-by: Matt Turner <mattst88@gmail.com>
libglsl_la_SOURCES includes both NIR_FILES and LIBGLSL_FILES, so for
libglsl.la consumers, this is a no-op. libnir.la however no longer uses
any GLSL IR infrastructure and can be used without also linking to
libglsl.la.
Acked-by: Matt Turner <mattst88@gmail.com>
Previously we were treating the binding index for Uniform Buffer
Objects and Shader Storage Buffer Objects as being part of the
combined BufferInterfaceBlocks array.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93322
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
The freedesktop.org blog feeds aren't mentioned on either mesa3d.org or
any of the graphics project wikis (including the DRI wiki) on
freedeskop.org. Fix that by linking to it from the sidebar.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Specify that the operation only applies to the x component, not
per-component as previously specified. This is unnecessary for GL and
creates additional complications for images which need to support these
operations as well.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Each load/store on most hardware can specify what caching to do. Since
SSBO allows individual variables to also have separate caching modes,
allow loads/stores to have the qualifiers instead of attempting to
encode them in declarations.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
I believe that `1u << x`, where x >= 32 yields undefined results
according to the C standard.
Particularly MSVC says `warning C4334: '<<' : result of 32-bit shift
implicitly converted to 64 bits (was 64-bit shift intended?)`.
Reviewed-by: Brian Paul <brianp@vmware.com>
For profiling mesa's code, especially llvmpipe, PROFILE should be
defined. Currently, this define can only be generated if mesa is
built using scons.
This patch makes it possible to generate this define also when building
mesa through automake tools.
v2:
- Change --enable-llvmpipe-profile to --enable-profile
- Add -fno-omit-frame-pointer to CFLAGS and CXXFLAGS when enabling profile
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
It can be trivially derived from the number of already declared system
values. This allows ureg users not to worry about which index to choose.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>