Implement assembly language API acceleration for PPC64LE,
analogous to long-standing implementations for X86 and X86-64.
See also the similar implementation in libglvnd.
Tested with Piglit.
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bill Schmidt <wschmidt@linux.vnet.ibm.com>
This enables support for importing RGBX8888 EGLImage textures on
Skylake.
Chrome OS needs support for RGBX8888 EGLImage textures because
the Android framework produces HAL_PIXEL_FORMAT_RGBX8888 winsys
surfaces, which the Chrome OS compositor consumes as dma_bufs. On
hardware for which RGBX is unsupported or disabled, normally core Mesa
provides the RGBX->RGBA fallback during glTexStorage. But the DRIimage
code bypasses core Mesa, so we must do the fallback in i965.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The new function takes a mesa_format and, if the format is an RGBX
format with an RGBA variant, returns the RGBA format.
Otherwise, it returns the original format.
Example:
input -> output
// Fallback exists
MESA_FORMAT_R8G8B8X8_UNORM -> MESA_FORMAT_R8G8B8A8_UNORM
MESA_FORMAT_RGBX_UNORM16 -> MESA_FORMAT_RGBA_UNORM16
// No fallback
MESA_FORMAT_R8G8B8A8_UNORM -> MESA_FORMAT_R8G8B8A8_UNORM
MESA_FORMAT_Z_FLOAT32 -> MESA_FORMAT_Z_FLOAT32
i965 will use this for EGLImages and DRIimages.
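
The mapping in the example table above could be sketched as follows.
This is only an illustration: the sketch_* enum and function names are
hypothetical stand-ins, not the real mesa_format enum or the generated
function, and only the four formats from the table are covered.

```c
#include <assert.h>

/* Hypothetical subset of format names, for illustration only. The real
 * mesa_format enum is much larger and lives in Mesa's own headers. */
enum sketch_format {
   SKETCH_FORMAT_R8G8B8X8_UNORM,
   SKETCH_FORMAT_R8G8B8A8_UNORM,
   SKETCH_FORMAT_RGBX_UNORM16,
   SKETCH_FORMAT_RGBA_UNORM16,
   SKETCH_FORMAT_Z_FLOAT32,
   SKETCH_FORMAT_COUNT
};

/* Return the RGBA counterpart of an RGBX format, or the input format
 * unchanged when no such fallback exists. */
static enum sketch_format
sketch_fallback_rgbx_to_rgba(enum sketch_format format)
{
   assert((int)format < SKETCH_FORMAT_COUNT);

   switch (format) {
   case SKETCH_FORMAT_R8G8B8X8_UNORM:
      return SKETCH_FORMAT_R8G8B8A8_UNORM;
   case SKETCH_FORMAT_RGBX_UNORM16:
      return SKETCH_FORMAT_RGBA_UNORM16;
   default:
      /* Already has alpha, or has no RGBA variant at all. */
      return format;
   }
}
```

A driver that cannot sample RGBX directly would call such a helper once
at import time and use the returned format from then on.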
v2 (Jason Ekstrand):
- Use mako
- Rework to be easier to read
- Write directly to the output file
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
sizeof(struct si_shader_key):
Before reverting the 2 commits: 120 bytes
After reverting the 2 commits: 128 bytes
With #pragma pack: 107 bytes
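
To illustrate why packing shrinks the key, here is a minimal sketch
using a made-up three-field struct. It is not the real
struct si_shader_key, and the padded size depends on the ABI; only the
packed size is fixed by the field widths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key with default alignment: the compiler may insert
 * padding after 'mode' and after 'flags'. */
struct sketch_key_padded {
   uint8_t mode;
   uint64_t mask;
   uint8_t flags;
};

/* Same fields with padding disabled: 1 + 8 + 1 = 10 bytes. */
#pragma pack(push, 1)
struct sketch_key_packed {
   uint8_t mode;
   uint64_t mask;
   uint8_t flags;
};
#pragma pack(pop)

/* Number of bytes saved by packing on the current ABI. */
static size_t
sketch_bytes_saved(void)
{
   assert(sizeof(struct sketch_key_packed) <= sizeof(struct sketch_key_padded));
   return sizeof(struct sketch_key_padded) - sizeof(struct sketch_key_packed);
}
```

The trade-off is that packed members may be unaligned, which can make
individual field accesses slower on some targets.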
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Broken by:
commit 00173d91b7
Author: Marek Olšák <marek.olsak@amd.com>
Date: Sat Jun 10 12:09:43 2017 +0200
mesa: don't flag _NEW_TRANSFORM for st/mesa if possible
It also optimizes the case slightly for GL core.
It doesn't try to fix that glEnable might be a bad place to do the
clip plane transformation.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Per Jose's suggestion, this patch cleans up format_cap_table to remove
the unnecessary default cap value for vgpu10 formats since those devcap values
can be retrieved from the device.
Tested with MTT conform, glretrace, piglit in HWv13 and HWv8.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
The default devcap for format SVGA3D_Z_D24S8_INT in HWv8 when its devcap is
not explicitly advertised should be set to zero to match the default value
in the device.
Tested with MTT piglit in HW version 8.
Reviewed-by: Neha Bhende <bhenden@vmware.com>
In cases where certain bind flags cannot be enabled together
(for example, CONSTANT_BUFFER cannot be combined with any other flag),
a separate host surface will be created.
For example, if a stream output buffer is reused as a constant buffer,
two host surfaces will be created, one for stream output,
and another one for constant buffer. Data will be copied from the
stream output surface to the constant buffer surface.
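
A minimal sketch of the kind of compatibility check described above,
assuming the only restriction is the CONSTANT_BUFFER rule stated here.
The SKETCH_BIND_* values and the function name are hypothetical and do
not match the driver's real flag definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bind flag bits, for illustration only. */
#define SKETCH_BIND_CONSTANT_BUFFER (1u << 0)
#define SKETCH_BIND_STREAM_OUTPUT   (1u << 1)
#define SKETCH_BIND_VERTEX_BUFFER   (1u << 2)

/* Return true if a surface created with 'existing_flags' can also be
 * bound with 'new_flag'. When this returns false, the caller would
 * create a separate host surface and copy the data over. */
static bool
sketch_compatible_bind_flags(unsigned existing_flags, unsigned new_flag)
{
   assert(new_flag != 0);

   /* Nothing bound yet, or the flag is already enabled. */
   if (existing_flags == 0 || (existing_flags & new_flag) == new_flag)
      return true;

   /* CONSTANT_BUFFER cannot be combined with any other flag. */
   if ((existing_flags | new_flag) & SKETCH_BIND_CONSTANT_BUFFER)
      return false;

   return true;
}
```

With this rule, reusing a stream output buffer as a constant buffer
fails the check, which is the case that leads to two host surfaces.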
Fixes piglit test ext_transform_feedback-immediate-reuse-index-buffer,
ext_transform_feedback-immediate-reuse-uniform-buffer
Tested with MTT piglit, MTT glretrace, Nature, NobelClinician Viewer, Tropics.
v2: Fix bind flags compatibility check as suggested by Brian.
v3: Use the list utility to maintain the buffer surface list.
v4: Use the SAFE variant of LIST_FOR_EACH_ENTRY
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Currently we unconditionally enable the streamout bind flag at
buffer resource creation time. This is not necessary if the buffer
is never used as a streamout buffer. With this patch, we enable the
streamout bind flag as indicated by the state tracker. If the buffer
is later bound to streamout and does not already have the streamout
bind flag enabled, we will recreate the buffer with
the new set of bind flags. Buffer content will be copied
from the old buffer to the new one.
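
The recreate-and-copy step can be sketched as below. This models the
buffer as plain heap memory; struct sketch_buffer and
sketch_buffer_add_bind_flag() are hypothetical names, and the real
driver works with host surfaces rather than malloc'd storage.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer: a set of bind flags plus backing storage. */
struct sketch_buffer {
   unsigned bind_flags;
   size_t size;
   unsigned char *data;
};

/* Make sure 'buf' has 'flag' enabled. If it does not, allocate new
 * storage, copy the old content, and release the old storage.
 * Returns 0 on success, -1 if the allocation fails. */
static int
sketch_buffer_add_bind_flag(struct sketch_buffer *buf, unsigned flag)
{
   assert(buf != NULL);

   /* Already enabled: keep the existing storage. */
   if ((buf->bind_flags & flag) == flag)
      return 0;

   unsigned char *new_data = malloc(buf->size ? buf->size : 1);
   if (new_data == NULL)
      return -1;

   /* Preserve the content across the recreation. */
   if (buf->size > 0)
      memcpy(new_data, buf->data, buf->size);

   free(buf->data);
   buf->data = new_data;
   buf->bind_flags |= flag;
   return 0;
}
```

The early return matters: a buffer that already has the flag keeps its
storage, so the copy cost is paid at most once per new flag.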
Tested with MTT piglit, Nature, Tropics, Lightsmark.
v2: Fix bind flags check as suggested by Brian.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This is to prepare for more bind_flags optimization
in subsequent patches.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This is to prepare for other bind_flags optimization
in subsequent patches.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
src/mesa/drivers/x11/xm_dd.c:688:7: warning: implicit declaration of function ‘_mesa_update_draw_buffer_bounds’; did you mean ‘_mesa_has_ARB_draw_buffers_blend’? [-Wimplicit-function-declaration]
_mesa_update_draw_buffer_bounds(ctx, ctx->DrawBuffer);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Cc: Marek Olšák <marek.olsak@amd.com>
Fixes: 585c5cf8a5 ("mesa: don't update draw buffer bounds in
_mesa_update_state")
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
From experimentation in IGT, we found that the OA unit might label
some reports as "idle" (using an invalid context ID), right after a
report for a given context. Deltas generated by those reports actually
belong to the previous context, even though they're not labelled as
such.
This change ensures that while reading OA reports, we only
consider the GPU actually idle after 2 reports with an invalid context
ID.
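
A simplified model of this rule, assuming each report carries a context
ID and a counter delta since the previous report. The struct layout,
SKETCH_INVALID_CTX_ID value and function name are hypothetical; the
real OA report parsing in i965 is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker for "idle" reports, for illustration only. */
#define SKETCH_INVALID_CTX_ID 0xffffffffu

struct sketch_report {
   uint32_t ctx_id;
   uint64_t delta; /* counter increase since the previous report */
};

/* Sum the deltas attributable to 'ctx_id'. The first invalid-ID report
 * right after one of our reports is still counted for our context; the
 * GPU is only treated as idle from the second invalid report on. */
static uint64_t
sketch_accumulate(const struct sketch_report *reports, size_t count,
                  uint32_t ctx_id)
{
   uint64_t total = 0;
   bool in_ctx = false;
   unsigned invalid_run = 0;

   assert(reports != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (reports[i].ctx_id == ctx_id) {
         in_ctx = true;
         invalid_run = 0;
         total += reports[i].delta;
      } else if (reports[i].ctx_id == SKETCH_INVALID_CTX_ID) {
         invalid_run++;
         if (in_ctx && invalid_run == 1)
            total += reports[i].delta; /* mislabelled, still ours */
         else
            in_ctx = false;            /* second invalid: really idle */
      } else {
         /* Another context is running. */
         in_ctx = false;
         invalid_run = 0;
      }
   }
   return total;
}
```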
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Due to an underlying hardware race condition, we have no guarantee
that all the reports coming from the OA buffer related to the workload
we're trying to measure have landed in memory by the time all the work
submitted has completed. That means we need to keep on reading the OA
stream until we read a report with a timestamp more recent than the
timestamp recorded by the MI_REPORT_PERF_COUNT at the end of the
performance query.
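
The stop condition can be sketched as a check over the timestamps read
so far. This is a sketch under simplifying assumptions: timestamps are
modelled as 64-bit values that never wrap, and
sketch_query_reports_complete() is a hypothetical name, not a function
in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true once a report with a timestamp more recent than
 * 'end_timestamp' has been read, meaning every report belonging to the
 * query has landed. Return false if more of the stream must be read. */
static bool
sketch_query_reports_complete(const uint64_t *report_timestamps,
                              size_t count, uint64_t end_timestamp)
{
   assert(report_timestamps != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (report_timestamps[i] > end_timestamp)
         return true;
   }
   return false;
}
```

A caller would keep reading and re-checking until this returns true,
rather than stopping as soon as the submitted work has completed.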
v2: fix uninitialized offset variable to 0 (Lionel)
v3: rework the reading to avoid blocking the user of the API unless
requested (Rob)
v4: fix a bug that made the i965 driver read the perf stream when
not necessary, leading to very long counter accumulation times
(Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Enables access to OA unit metrics on Gen8+ via INTEL_performance_query.
v2: make use of new parameters coming from gen_device_info (Lionel)
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>