Commit Graph

74130 Commits

Author SHA1 Message Date
Ilia Mirkin 3ea3727998 docs: note that ARB_copy_image was added to nv50, nvc0 in this release
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-09 07:14:07 -05:00
Brian Paul 28f6faca51 st/wgl: add null pointer check for HUD texture
Fixes crash when using HUD with Nobel Clinician Viewer.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-11-09 11:25:59 +00:00
Brian Paul 75d1e363ff st/wgl: fix double-present on swapbuffers bug
The stw_st_framebuffer_present_locked() function was getting called
twice per SwapBuffers.  First, when st_context_iface::flush() was
called from DrvSwapBuffers() because the ST_FLUSH_FRONT flag was
given.  Second, by stw_st_swap_framebuffer_locked() which does the
actual SwapBuffers.

Two code changes:
1. Pass ST_FLUSH_END_OF_FRAME, instead of ST_FLUSH_FRONT.
2. Move the implementation of stw_flush_current_locked() into
DrvSwapBuffers() since it's not called anywhere else.

Not much change in perf for benchmarks like Lightsmark, but some simple
Mesa demos are measurably faster.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-11-09 11:25:59 +00:00
Brian Paul 8083943e2e st/wgl: reorder pixel formats to put MSAA formats last
And put 8-bit/channel formats before 5/6/5 formats.

The ChoosePixelFormat() function seems to be finicky about format
selection.  Putting the MSAA formats after the non-MSAA formats
means most apps get a low-numbered format.  Now we generally get
the same pixel format regardless of whether using vgpu9 or 10.

VMware bug 1455030

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-11-09 11:25:59 +00:00
José Fonseca e524df5ef3 st/wgl: Don't rely on GDI to bookkeep pixelformat for us.
This allows to use apitrace's retracediff script on Windows to retrace and
compare two builds of a Mesa based opengl32.dll/ICD side-by-side.

See also e4a4f15f5b
2015-11-09 11:08:27 +00:00
Michel Dänzer 24abbaff9a winsys/radeon: Use CPU page size instead of hardcoding 4096 bytes v3
Fixes GPUVM conflicts with non-4K page size.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92738

v2: Replace sanitization of VM base address alignment with comment why
    that's not necessary.
v3: Use unsigned instead of long as the type for the size_align member.
    (Marek)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-09 17:24:32 +09:00
Christian König df4f9b0236 radeon/uvd: add H.265/HEVC to legal notes
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-11-08 18:16:01 -05:00
Leo Liu 519502d08f st/omx: add headless support
This will allow dec/enc/transcode without X

v2:  use env override even with X,
     use loader_open_device instead of open
v3:  clean up

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-11-08 18:15:57 -05:00
Leo Liu 25526d77b1 st/va: use vl screen drm support from vl_wys_drm
v2: move the dup to vl_wys_drm for pipe loader

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-11-08 18:15:57 -05:00
Leo Liu 7da86e0ec0 vl: add drm support for vl_screen
This will allow the state trackers to use render nodes
with screen creation

v2: dup fd for pipe loader

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-11-08 18:15:57 -05:00
Leo Liu d115e47099 st/va: fix build fails with pipe loader
There is no dev in drv, and dev should be from vl_screen here

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-11-08 18:15:57 -05:00
Samuel Pitoiset ffb60e7788 nvc0: enable compute support on Fermi
Altough the compute support is still not complete because textures and
surfaces need to be implemented, it allows to launch very simple compute
kernel like one which reads reading MP performance counters.

This turns on PIPE_CAP_COMPUTE and PIPE_SHADER_COMPUTE.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-08 16:47:59 +01:00
Ilia Mirkin e06238cb9e nv50/ir: fix emission of s[] args in certain situations
There might only be a single arg (e.g. cvt), so use mode rather than
looking at the source directly. Also we don't want to rely on the type
of the value, which can be unreliable, but instead use the
instruction's. This works out well since mkSplit doesn't adjust the
type.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-07 18:58:58 -05:00
Ilia Mirkin af218217d7 nv50/ir: only take abs value when computing high result
Not reachable from TGSI since it only has UMUL, no IMUL. However it's
surprising that setting argument types to s32 will cause sign to get
lost.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-07 18:58:58 -05:00
Ilia Mirkin 53cbb11707 nouveau: avoid queueing too much work onto a single fence
Force the fence to get kicked off, which won't actually wait for its
completion, but any additional work will be put onto a fresh list.

This fixes crashes in teximage-colors --benchmark with too many active
maps.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-07 18:58:58 -05:00
Dave Airlie 0f5b1409fd llvmpipe: disable front updates for now
As pointed out by Emil, this sometimes hangs, appears to be due to threading

need to rethink how this stuff works for llvmpipe.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-11-08 07:55:17 +10:00
Dave Airlie 87711183ac virgl: wrap ret assignment with braces to do correct thing
Coverity reported that ret could only be 0 or 1, since it
was setting ret = fn() > 0, instead of doing (ret = fn()) > 0.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-11-08 06:27:02 +10:00
Jason Ekstrand 6c731d8566 nir: Add a nir_deref_tail helper
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-11-07 12:09:44 -08:00
Jason Ekstrand 7d90e570f3 nir/types: Add an is_vector_or_scalar helper
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-11-07 12:09:38 -08:00
Jason Ekstrand d43e16b163 i965/fs: Use regs_read/written for post-RA scheduling in calculate_deps
Previously, we were assuming that everything read/wrote exactly 1 logical
GRF (1 in SIMD8 and 2 in SIMD16).  This isn't actually true.  In
particular, the PLN instruction reads 2 logical registers in one of the
components.  This commit changes post-RA scheduling to use regs_read and
regs_written instead so that we add enough dependencies.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92770
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-11-07 08:41:48 -08:00
Jason Ekstrand c839174d55 nir/validate: Add better validation of load/store types
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-11-07 08:41:35 -08:00
Marek Olšák d57ede92b7 radeonsi: add register definitions for Stoney
There are a few non-stoney changes too.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-11-07 10:22:13 +01:00
Marek Olšák 2658777f46 radeonsi: add workarounds for CP DMA to stay on the fast path
v2: set emit_scratch_reloc, add a NULL check

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-11-07 10:22:13 +01:00
Marek Olšák fc0416ef5d radeonsi: unify CP DMA preparation logic
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-11-07 10:22:13 +01:00
Marek Olšák 89da3b4458 radeonsi: unify CP DMA code determining various flags
v2: don't call get_flush_flags twice per function

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-11-07 10:22:12 +01:00
Marek Olšák c3e527f93d radeonsi: only enable write confirmation on the last CP DMA packet
This should improve performance for big copies that need to be split.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-11-07 10:22:12 +01:00
Ilia Mirkin 8e9ade7eb3 nv50/ir: allow emission of immediates in imul/imad ops
Nothing actually uses this yet (due to complications), but the emission
logic is right.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-07 00:42:15 -05:00
Ilia Mirkin 393d0c336b nv50/ir: properly set the type of the constant folding result
This removes the hack used for merge, which only covers a fraction of
the cases.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-06 19:39:32 -05:00
Ilia Mirkin 2f9aaed749 nv50/ir: add support for const-folding OP_CVT with F64 source/dest
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-06 19:39:32 -05:00
Ilia Mirkin 76957389fc nv50/ir: add fp64 opcode emission support for G200 (NVA0)
Need to emulate rcp/rsq before providing full fp64 support

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-06 18:36:25 -05:00
Hans de Goede f979d3cfec nv50/ir: Add support for 64bit immediates to checkSwapSrc01
Now that we support 64 bit immediates in insnCanLoad, we need to swap
64 bit immediate sources too for optimal effect.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-06 18:13:31 -05:00
Hans de Goede 9f2f8bda6e nvc0/ir: Teach insnCanLoad about double immediates
Teach insnCanLoad about double immediates, together with the
"Add support for merge-s to the ConstantFolding pass"

This turns the following (nvc0) code:
  1: mov u32 $r2 0x00000000 (8)
  2: mov u32 $r3 0x3fe00000 (8)
  3: add f64 $r0d $r0d $r2d (8)

Into:
  1: add f64 $r0d $r0d 0.500000 (8)

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-06 18:13:31 -05:00
Hans de Goede 428506ece2 nv50/ir: Add support for merge-s to the ConstantFolding pass
This allows later passes like LoadPropagation to properly deal with 64
bit immediates.

If the new 64 bit load this introduces does not get optimized away then
split64BitOpPostRA() will split this into 2 instructions again.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-06 18:13:31 -05:00
Ilia Mirkin 2437f00853 nv50/ir: disallow 64-bit immediates on nv50 targets
No instructions are able to load short immediates like nvc0 can.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-06 18:13:31 -05:00
Ilia Mirkin 11e3dac36e nv50/ir: allow movs with TYPE_F64 destinations to be split
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-06 18:13:31 -05:00
Hans de Goede b487b55f7d gm107/ir: Add support for double immediates
Add support for encoding double immediates (up to 20 bits of precision)
into the generated gm107 machine-code.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-06 17:22:40 -05:00
Hans de Goede 12c850d01c nvc0/ir: Add support for double immediates
Add support for encoding double immediates (up to 20 bits of precision)
into the generated nvc0 machine-code.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-06 17:22:40 -05:00
Francisco Jerez 5169407221 i965/nir/fs: Add comment for no-op memory barrier functions
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-11-06 13:19:56 -08:00
Jordan Justen faa1193070 i965/nir/fs: Implement new barrier functions for compute shaders
For these nir intrinsics, we emit the same code as
nir_intrinsic_memory_barrier:

 * nir_intrinsic_memory_barrier_atomic_counter
 * nir_intrinsic_memory_barrier_buffer
 * nir_intrinsic_memory_barrier_image

We treat these nir intrinsics as no-ops:
 * nir_intrinsic_group_memory_barrier
 * nir_intrinsic_memory_barrier_shared

v3:
 * Add comment for no-op cases (curro)

v4:
 * Moving comment to a separate patch authored by curro

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-11-06 13:16:11 -08:00
Jordan Justen 9d65f3208b nir: Add new barrier functions for compute shaders
When these functions are called in glsl-ir, we create a corresponding
nir intrinsic function call.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-11-06 13:15:16 -08:00
Jordan Justen 91f188710a glsl: Add new barrier functions for compute shaders
When these functions are called in GLSL code, we create an intrinsic
function call:

 * groupMemoryBarrier => __intrinsic_group_memory_barrier
 * memoryBarrierAtomicCounter => __intrinsic_memory_barrier_atomic_counter
 * memoryBarrierBuffer => __intrinsic_memory_barrier_buffer
 * memoryBarrierImage => __intrinsic_memory_barrier_image
 * memoryBarrierShared => __intrinsic_memory_barrier_shared

v2:
 * Consolidate with memoryBarrier function/intrinsic creation (curro)

v3:
 * Instead of add_memory_barrier_function, add an intrinsic_name
   parameter to _memory_barrier (curro)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-11-06 13:14:44 -08:00
Boyuan Zhang 6bad554d98 radeon/uvd: fix VC-1 simple/main profile decode v2
We just needed to set the extra width/height fields to get this working.

v2 (chk): rebased, CC stable added, commit message added, fixed coding style

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
2015-11-06 20:07:23 +01:00
Boyuan Zhang ed55def44f st/vaapi: fix vaapi VC-1 simple/main corruption v2
Apply the start code fix only to advanced profile.

v2 (chk): add commit message

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
2015-11-06 20:07:23 +01:00
Julien Isorce cc1e5c972e st/va: add support for RGBX and BGRX in VPP
Before it was only possible to convert a NV12 surface to
RGBA or BGRA. This patch uses the same post processing
function, "handleVAProcPipelineParameterBufferType", but
add definitions for RGBX and BGRX.

This patch also makes vlVaQuerySurfaceAttributes more generic
to avoid copy and pasting the same lines.

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian K<C3><B6>nig <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-06 17:33:45 +00:00
Julien Isorce 42a5e143a8 vl/buffers: add RGBX and BGRX to the supported formats
Useful is one wants to create RGBX or BGRX surfaces.
The infrastructure is such that it required just a
few definitions to support these formats.

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian K<C3><B6>nig <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-06 17:33:38 +00:00
Julien Isorce bf6acbb2db st/va: properly use brackets in vlVaAcquireBufferHandle's switch
In "switch (mem_type)" the brackets were surrounding "case+default"
instead of "case" only.

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian K<C3><B6>nig <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-06 17:33:16 +00:00
Julien Isorce bfc245e9ac st/va: properly indent buffer.c, config.c, image.c and picture.c
Some lines were using 4 indentation spaces instead of 3.

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian K<C3><B6>nig <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-06 17:33:01 +00:00
Rob Clark 6459e780ae freedreno/a4xx: fix blend color
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-11-06 11:19:04 -05:00
Rob Clark 7465e16124 freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-11-06 11:18:47 -05:00
Guillaume Charifi 6f5e0c08a4 freedreno: add a305 support
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-11-06 11:17:58 -05:00