70890 Commits

Author SHA1 Message Date
Erik Faye-Lund c61bc6ed84 util: port _mesa_strto[df] to C
_mesa_strtod and _mesa_strtof are only used from the GLSL compiler and
the ARB_[vertex|fragment]_program code, meaning that the locale doesn't
need to be initialized before the first OpenGL context gets initialized.

So let's use explicit initialization from the one-time init code instead
of depending on a C++ compiler to initialize at image-load time.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-29 09:06:40 -07:00
Erik Faye-Lund de3e323be1 glsl: No need to lock in _mesa_glsl_release_types
This function only gets called while mesa is unloading, so there's
no potential of racing or multiple calls at the same time. So let's
just get rid of the locking.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-29 09:06:40 -07:00
Erik Faye-Lund 195ab79dde mesa/main: only call _mesa_destroy_shader_compiler once on exit
There's no point in calling _mesa_destroy_shader_compiler multiple
times on exit; the resources will only be released once anyway.

So let's move the atexit-call into the part that is only called
once.
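
A minimal sketch of the idea, not the actual Mesa code: the exit handler is
registered from the one-time path, so repeated context creation cannot
register it again. All names here (context_init, one_time_init,
destroy_shader_compiler, get_atexit_registrations) are hypothetical
stand-ins, and the C11 atomic_flag guard is an assumption about how the
"only called once" part could be protected.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names are taken from Mesa. */
static atomic_flag init_started = ATOMIC_FLAG_INIT;
static int atexit_registrations = 0;

static void destroy_shader_compiler(void)
{
   /* Process-wide compiler resources would be released here. */
}

static void one_time_init(void)
{
   /* Reached by exactly one caller, so the handler is registered once. */
   if (atexit(destroy_shader_compiler) == 0)
      atexit_registrations++;
}

/* Called for every new context; only the first call does the one-time work. */
void context_init(void)
{
   /* test_and_set returns the previous value: false only for the first caller. */
   if (!atomic_flag_test_and_set(&init_started))
      one_time_init();
}

int get_atexit_registrations(void)
{
   return atexit_registrations;
}
```

Calling context_init() any number of times leaves a single registered
handler, which is all that is needed since the resources are released only
once anyway.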

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-29 09:06:40 -07:00
Erik Faye-Lund ba5e1612c8 dri: don't touch the shader compiler
This function is for deleting per-screen resources, and the shader
compiler resources are not of such nature. Besides, dri shouldn't
need to even know about the presence of a shader compiler.

These resources will already be released when mesa gets unloaded,
and that should be sufficient.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-29 09:06:40 -07:00
Erik Faye-Lund 73d2b5af52 mesa/main: Get rid of outdated GDB-hack
All of these enums are now in use throughout the code, so there's no need
to explicitly use them here any more.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-29 09:06:40 -07:00
Grigori Goronzy d15b32ebde clover: implement CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE
Work-group size should always be aligned to subgroup size; this is a
basic requirement, otherwise some work-items will be no-operation.

It might make sense to refine the value according to a kernel's
resource usage, but that's a possible optimization for the future.
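
To illustrate why the multiple matters, here is a small arithmetic sketch.
The helper names align_to_subgroup and idle_lanes are made up for this
example and are not part of clover or gallium; the subgroup size of 64 used
in the note below is likewise only an assumed value.

```c
#include <stddef.h>

/* Round a requested work-group size up to the next multiple of the
 * subgroup size.  subgroup_size must be non-zero. */
size_t align_to_subgroup(size_t requested, size_t subgroup_size)
{
   return (requested + subgroup_size - 1) / subgroup_size * subgroup_size;
}

/* Number of work-items in the last subgroup that would do nothing if
 * work_group_size were used as-is.  subgroup_size must be non-zero. */
size_t idle_lanes(size_t work_group_size, size_t subgroup_size)
{
   size_t rem = work_group_size % subgroup_size;
   return rem == 0 ? 0 : subgroup_size - rem;
}
```

With an assumed subgroup size of 64, a work-group of 100 items leaves 28
lanes idle in its second subgroup, while rounding up to 128 leaves none.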

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-06-29 13:24:37 +02:00
Grigori Goronzy 249a9df7fc gallium: add PIPE_COMPUTE_CAP_SUBGROUP_SIZE
We need this to implement OpenCL's
CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-06-29 13:24:22 +02:00
Neil Roberts c0ca6c30ea i965: Don't try to print the GLSL IR if it has been freed
Since commit 104c8fc2c2 the GLSL IR will be freed if NIR is
being used. This was causing it to segfault if INTEL_DEBUG=wm is set.
This patch just makes it avoid dumping the GLSL IR in that case.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-06-29 11:33:34 +01:00
Emil Velikov dd9ceb0219 docs: add news item and link release notes for mesa 10.6.1
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-29 09:03:19 +01:00
Emil Velikov 24df6cd0f7 docs: Add sha256 checksums for the 10.6.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 6ff3ae8deb1d99037f2f8e5890b09bd984059cf0)
2015-06-29 09:01:04 +01:00
Emil Velikov 07158c508a Add release notes for the 10.6.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit a871e80fc6237fa029d6970f7e9b414fd097bd98)
2015-06-29 09:01:00 +01:00
Kenneth Graunke 6218c68bec Revert "glsl: clone inputs and outputs during linking"
This reverts commit c2ff3485b3.

Ilia and I noticed a memory leak caused by this patch: at least with
fixed-function programs, we clone things using ProgramResourceList as
the context before reralloc makes it non-NULL.

I believe Tapani found other bugs with these patches, so I'm just going
to revert them for now and let him pursue them further.
2015-06-28 22:20:27 -07:00
Kenneth Graunke cae701fc8e Revert "i965: Delete linked GLSL IR when using NIR."
This reverts commit 104c8fc2c2.
2015-06-28 22:17:09 -07:00
Ilia Mirkin 61912036d1 nv30: avoid leaking blit fp/vp
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-06-29 00:46:53 -04:00
Ilia Mirkin b5622313ea nv40: enable base vertex
Still appears to have issues with negative indices less than -1M, but
that's a corner case of a corner case.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-06-29 00:46:45 -04:00
Kenneth Graunke 19a0ba130f i965/vs: Move compute_clip_distance() out of emit_urb_writes().
Legacy user clipping (using gl_Position or gl_ClipVertex) is handled by
turning those into the modern gl_ClipDistance equivalents.

This is unnecessary in Core Profile: if user clipping is enabled, but
the shader doesn't write the corresponding gl_ClipDistance entry,
results are undefined.  Hence, it is also unnecessary for geometry
shaders.

This patch moves the call up to run_vs().  This is equivalent for VS,
but removes the need to pass clip distances into emit_urb_writes().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-28 19:44:34 -07:00
Kenneth Graunke 17e8fca626 i965: Write at least some data in SIMD8 URB write messages.
According to the "URB SIMD8 Write > Write Data Payload" documentation,
"The write data payload can be between 1 and 8 message phases long."

Apparently, the simulator considers it an error if you issue an URB
SIMD8 message with only a header and no actual data to write.

v2: Try to put in a better PRM citation, now that the Broadwell docs
    actually exist (requested by Jordan).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-06-28 19:44:33 -07:00
Samuel Pitoiset b4b4406e1e gallium/hud: prevent NULL pointer dereference with pipe_query functions
The HUD doesn't check if query_create() fails and it calls other
pipe_query functions with NULL pointer instead of a valid query object.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-06-28 09:49:03 +02:00
Mario Kleiner a98600b0eb nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads.
The dup'ed fd owned by the nouveau_screen for a device node
must also be used as key for the winsys hash table, instead
of using the original fd passed in for a screen, to make
multi-x-screen ZaphodHeads configurations work on nouveau.

The original fd's lifetime differs from that of the nouveau_screen stored
in the hash. The hash key is the fd, and in order to compare hash entries
we fstat them, so the fd must be around for as long as the screen is.
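
A rough sketch of the fstat-based comparison described above, to show why
the key fd has to stay open. The function name fds_refer_to_same_node is
hypothetical, and the choice of stat fields is an assumption for
illustration rather than a copy of the winsys code.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/* Two descriptors are treated as the same key if fstat() reports the same
 * file identity for both.  Hypothetical helper, not the nouveau winsys code. */
bool fds_refer_to_same_node(int fd1, int fd2)
{
   struct stat a, b;

   /* If either fd has been closed, fstat() fails (EBADF) and the lookup
    * can no longer match, which is the lifetime problem in question. */
   if (fstat(fd1, &a) != 0 || fstat(fd2, &b) != 0)
      return false;

   return a.st_dev == b.st_dev &&
          a.st_ino == b.st_ino &&
          a.st_rdev == b.st_rdev;
}
```

A dup()'ed descriptor compares equal to its original while both are open.
Once the original is closed, only the dup can still be compared, which is
why the dup owned by the screen is the safer hash key.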

This is an extension of the fix in commit a59f2bb1 (nouveau: dup fd
before passing it to device).

Cc: "10.3 10.4 10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-06-28 01:11:38 -04:00
Mike Stroyan 2a210b797e meta: Only change and restore viewport 0 in mesa meta mode
The meta code was setting a default depth range for all viewports
and 'restoring' all viewports to depth range values saved from viewport 0.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-27 11:29:56 -07:00
Dave Airlie 556dd4af76 radeonsi: add support for geometry shader invocations.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-06-27 00:24:30 +01:00
Dave Airlie 7e5064360c radeonsi: add support for viewport array (v3)
This isn't pretty, and I'd suggest that the pm4 interface builder
could be tweaked to do this more efficiently, but I'd need
guidance on how that would look.

This seems to pass the few piglit tests I threw at it.

v2: handle passing layer/viewport index to fragment shader.
fix crash in blit changes,
add support to io_get_unique_index for layer/viewport index
update docs.
v3: avoid looking up viewport index and layer in es (Marek).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-06-27 00:24:07 +01:00
Kenneth Graunke 35d8379304 i965/fs: Fix ir_txs in emit_texture_gen4_simd16().
We were not emitting the LOD, which led to message lengths of 1 instead
of 3.  Setting has_lod makes us emit the LOD, but I had to make changes
to avoid emitting the non-existent coordinate as well.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91022
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-06-26 15:57:03 -07:00
Ilia Mirkin ad62ec8316 nv50/ir: propagate modifier to right arg when const-folding mad
An immediate has to be the second arg of an ADD operation. However we
were mistakenly propagating the modifier of the non-folded value to the
folded immediate argument.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91117
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-06-26 18:42:29 -04:00
Boyan Ding 052b3d4e2f egl_dri2: Remove trailing whitespaces
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-06-26 17:05:21 +00:00
Neil Roberts 3cf90bb183 i965/skl: Fix aligning mt->total_width to the block size
brw_miptree_layout_2d tries to ensure that mt->total_width is a
multiple of the compressed block size, presumably because it wouldn't
be possible to make an image that has a fraction of a block. However
it was doing this by aligning mt->total_width to align_w. Previously
align_w has been used as a shortcut for getting the block width
because before Gen9 the block width was always equal to the alignment.
Commit 4ab8d59a2 tried to fix these cases to use the block width
instead of the alignment but it missed this case.

I think in practice this probably won't make any difference because
the buffer for the texture will be allocated to be large enough to
contain the entire pitch and libdrm aligns the pitch to the tile width
anyway. However I think the patch is worth having to make the
intention clearer.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-26 17:02:22 +01:00
Matt Turner 404a90b827 mesa: Enable subdir-objects globally.
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-26 12:55:25 +01:00
Emil Velikov 229450520a mesa: fold duplicated GL/GL_CORE/GLES3 entry in get_hash_params.py
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-26 12:55:25 +01:00
Chia-I Wu 7de85694fa ilo: define ILO_IMAGE_MAX_LEVEL_COUNT
Define ILO_IMAGE_MAX_LEVEL_COUNT for ilo_image and remove unnecessary header
includes.
2015-06-26 13:45:28 +08:00
Chia-I Wu cbdc26aa3f ilo: replace pipe_format by gen_surface_format
Replace pipe_format by gen_surface_format in ilo_image.  Change how depth
format is specified in ilo_state_zs.
2015-06-26 13:45:28 +08:00
Chia-I Wu 2ee95f6d64 ilo: always use the specified image format
Move silent promotion of PIPE_FORMAT_ETC1_RGB8 or combined depth/stencil out
of core.
2015-06-26 13:45:28 +08:00
Chia-I Wu dc2e92b2d3 ilo: replace pipe_texture_target by gen_surface_type
Replace pipe_texture_target by gen_surface_type in ilo_image.  Change how
GEN6_SURFTYPE_CUBE is specified in ilo_state_surface and ilo_state_zs.
2015-06-26 13:45:28 +08:00
Chia-I Wu 934e4a469f ilo: initialize ilo_image from ilo_image_info
Convert pipe_resource to ilo_image_info for image initialization.
2015-06-26 13:45:28 +08:00
Chia-I Wu f825fe8e13 ilo: remove ilo_image_disable_aux()
Fail resource creation when aux bo allocation fails.
2015-06-26 13:45:28 +08:00
Chia-I Wu 07acf9cb16 ilo: improve SURFTYPE_BUFFER validations
Reorganize the validations to make them more systematic.
2015-06-26 13:45:27 +08:00
Chia-I Wu 9871646c13 ilo: remove ilo_buffer
Since the addition of ilo_vma, it was used only to pad a bo for sampling
engine surfaces.  Replace it entirely with these functions

  ilo_state_surface_buffer_size()
  ilo_state_vertex_buffer_size()
  ilo_state_index_buffer_size()
  ilo_state_sol_buffer_size()
2015-06-26 13:45:27 +08:00
Chia-I Wu 36d107e92c ilo: introduce ilo_vma
This cleans up the code a bit and makes ilo_state_vector_resource_renamed()
simpler and more robust.  It also allows a single bo to back multiple VMAs.
2015-06-26 13:45:27 +08:00
Iago Toral Quiroga fbba25bba0 mesa: remove unnecessary checks in _mesa_readpixels_needs_slow_path
readpixels_can_use_memcpy will later call _mesa_format_matches_format_and_type
which does much tighter checks than these to decide if we can use
memcpy for readpixels.

Also, the checks do not seem to be extensive enough anyway, since we are
checking for signed/unsigned conversion only when the framebuffer has integers,
but the same checks could be done for other types anyway, since as long as
there is a signed/unsigned conversion we can't memcpy.

No regressions observed on i965/llvmpipe.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-06-26 07:42:47 +02:00
Jason Ekstrand 316206ee9e i965/vec4_live_variables: Do liveness analysis bottom-to-top
From Muchnick's Advanced Compiler Design and Implementation:

"To determine which variables are live at each point in a flowgraph, we
perform a backward data-flow analysis"

Previously, we were walking the blocks forwards and updating the livein and
then the liveout.  However, the livein calculation depends on the liveout
and the liveout depends on the successor blocks.  The net result is that it
takes one full iteration to go from liveout to livein and then another
full iteration to propagate to the predecessors.  This works out to an
O(n^2) computation where n is the number of blocks.  If we run things in
the other order, it's O(nl) where l is the maximum loop depth which is
practically bounded by 3.
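
The effect of the walk order can be sketched with a toy fixed-point solver.
This is not the i965 implementation: struct block, compute_liveness and the
32-bit variable sets are assumptions made for the example, and "backward"
here simply means visiting the block array in reverse, on the assumption
that blocks are stored in program order.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_SUCCS 2

/* Hypothetical basic block; variable sets are bitmasks of up to 32 variables. */
struct block {
   uint32_t use;          /* variables read before being written in the block */
   uint32_t def;          /* variables written in the block */
   uint32_t livein;
   uint32_t liveout;
   int succs[MAX_SUCCS];  /* successor block indices, -1 for none */
};

/* Iterate to a fixed point and return the number of passes over the blocks.
 * backward == true visits the last block first. */
int compute_liveness(struct block *blocks, int num_blocks, bool backward)
{
   int passes = 0;
   bool progress = true;

   while (progress) {
      progress = false;
      passes++;

      for (int i = 0; i < num_blocks; i++) {
         struct block *b = &blocks[backward ? num_blocks - 1 - i : i];

         /* liveout is the union of the successors' livein sets. */
         uint32_t liveout = b->liveout;
         for (int s = 0; s < MAX_SUCCS; s++) {
            if (b->succs[s] >= 0)
               liveout |= blocks[b->succs[s]].livein;
         }

         /* livein = use | (liveout & ~def) */
         uint32_t livein = b->use | (liveout & ~b->def);

         if (livein != b->livein || liveout != b->liveout) {
            b->livein = livein;
            b->liveout = liveout;
            progress = true;
         }
      }
   }
   return passes;
}
```

On a straight chain of four blocks where the first block defines a variable
and the last one uses it, the reverse walk settles in two passes, while the
forward walk needs five because each pass only moves the information one
block closer to the definition.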

In b2c6ba0c4b, we made this same change in
the FS backend to great effect.  Might as well keep it consistent and make
the same change for vec4.  Also, this reduced the time to run the test:

ES31-CTS.arrays_of_arrays.InteractionFunctionCalls1

from 6:49.62 to 3:31.40 on Timothy Arceri's machine.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-25 16:42:20 -07:00
Ben Widawsky c1151b18f2 i965/skl: Use more compact hiz dimensions
gen8 had some special restrictions which don't seem to carry over to gen9.
Quoting the spec for SKL:
"The Z_Height and Z_Width values must equal those present in
3DSTATE_DEPTH_BUFFER incremented by one."

This fixes nothing in piglit (and regresses nothing).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-06-25 14:17:02 -07:00
Marek Olšák 101a73846b radeonsi: don't fail in si_shader_io_get_unique_index
Trivial. Picked from my tessellation branch.
2015-06-25 15:05:56 +02:00
Kenneth Graunke c97105ee12 i965: Drop brw->depthstencil.stencil_offset from gen8_depth_state.c.
This is always 0 - only brw_workaround_depthstencil_alignment ever sets
it, and that doesn't run on Gen6+.  My initial Broadwell depth state
commit had this mistake.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-06-25 02:18:51 -07:00
Kenneth Graunke 6026f7e8fb nir: Recognize max(min(a, 1.0), 0.0) as fsat(a).
We already recognize min(max(a, 0.0), 1.0) as a saturate, but neglected
this variant (which is also handled by the GLSL IR pass).
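
As a quick sanity check of the equivalence, here is a plain C sketch (not
NIR code). The helpers min_f, max_f, sat_max_first and sat_min_first are
hypothetical names for this example, and only ordinary finite inputs are
considered; NaN handling is deliberately left out of scope.

```c
/* Hypothetical helpers for illustration; finite inputs only. */
static float min_f(float a, float b) { return a < b ? a : b; }
static float max_f(float a, float b) { return a > b ? a : b; }

/* The form that was already recognized: min(max(a, 0.0), 1.0). */
float sat_max_first(float a)
{
   return min_f(max_f(a, 0.0f), 1.0f);
}

/* The variant added by this patch: max(min(a, 1.0), 0.0). */
float sat_min_first(float a)
{
   return max_f(min_f(a, 1.0f), 0.0f);
}
```

Both orderings clamp a finite value to the range [0, 1], so either pattern
can be replaced by a single saturate.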

shader-db results on Broadwell:
total instructions in shared programs: 7363046 -> 7362788 (-0.00%)
instructions in affected programs:     11928 -> 11670 (-2.16%)
helped:                                64
HURT:                                  0

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-06-25 02:12:32 -07:00
Marek Olšák 77a78c65f8 softpipe,llvmpipe: fix PIPE_SHADER_CAP_MAX_INPUTS value
PIPE_MAX_SHADER_INPUTS was recently bumped to 80 because of tessellation.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91099
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91101

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-06-25 09:00:23 +02:00
Ben Widawsky d1663ccb4c i965/bxt: Add basic Broxton infrastructure
The thread counts and URB information are all speculative numbers that were
based on some CHV numbers at the time.

v2:
Originally this patch had PCI IDs. I've moved that to a new patch at the end of
the series.
Remove is_cherryview hack.
Add PCI ids. These match the ones defined in the kernel. The only one tested by
us is 0x0a84.
Capitalize the hex string (Mark)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: "Lecluse, Philippe" <Philippe.Lecluse@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-06-24 16:37:12 -07:00
Ian Romanick 9f261dc18d radeon: Advertise correct GL_QUERY_COUNTER_BITS/GL_SAMPLES_PASSED value
Commit b765119c changed the default value of all the counter bits to
64.  However, older hardware only has 32 counter bits.

This has only been build-tested.  We don't have any tests that verify
the advertised value against implementation behavior, so I don't know
what additional testing could be done.

NOTE: It appears that many Gallium drivers (at least r300 and i915g)
have the same problem, but I don't see a way for the state-tracker to
determine the counter size.  Marek says, "For Gallium, a new PIPE_CAP or
new get_xxx_param function will be needed."

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
2015-06-24 16:33:32 -07:00
Jason Ekstrand b2c6ba0c4b i965/fs_live_variables: Do liveness analysis bottom-to-top
From Muchnick's Advanced Compiler Design and Implementation:

"To determine which variables are live at each point in a flowgraph, we
perform a backward data-flow analysis"

Previously, we were walking the blocks forwards and updating the livein and
then the liveout.  However, the livein calculation depends on the liveout
and the liveout depends on the successor blocks.  The net result is that it
takes one full iteration to go from liveout to livein and then another
full iteration to propagate to the predecessors.  This works out to an
O(n^2) computation where n is the number of blocks.  If we run things in
the other order, it's O(nl) where l is the maximum loop depth which is
practically bounded by 3.

On my HSW desktop, one particular shadertoy test gets a 20% improvement in
compile times:

N           Min           Max        Median           Avg        Stddev
x  10        15.965        16.884        16.026       16.1822    0.34736846
+  10        12.813        13.052        12.876       12.8891    0.06913666
Difference at 95.0% confidence
        -3.2931 +/- 0.235316
        -20.3501% +/- 1.45417%
        (Student's t, pooled s = 0.250444)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-24 13:11:30 -07:00
Tapani Pälli 104c8fc2c2 i965: Delete linked GLSL IR when using NIR.
This is based on Kenneth's patch to delete 'most of the IR'. Due to
linker changes to clone variables, we can now free all of IR.

Saves 58MB of memory when replaying a Dota 2 trace on Broadwell.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2015-06-24 12:03:41 -07:00
Tapani Pälli c2ff3485b3 glsl: clone inputs and outputs during linking
This increases memory pressure during linking but makes it easier
for backend to free IR after it is not needed anymore.

v2: use resource list as ralloc context in case of relink (Kenneth)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2015-06-24 12:01:21 -07:00
Chris Wilson 4b35ab9bdb i965: Rename intel_emit* to reflect their new location in brw_pipe_control
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-24 10:35:04 -07:00