Commit Graph

6560 Commits

Author SHA1 Message Date
Marek Olšák d7b6f90684 gallivm: set LLVMNoUnwindAttribute on all intrinsics
RadeonSI stats: Mostly 0% difference, but Valley shows a small improvement:

 Application            Files    SGPRs     VGPRs   SpillSGPR SpillVGPR Code Size    LDS    Max Waves   Waits
 unigine_valley           278    0.00 %   -0.29 %    0.00 %    0.00 %    0.01 %    0.00 %    0.17 %    0.00 %

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-07-11 19:06:05 +02:00
Nicolai Hähnle 374aa2bb27 gallium/u_queue: assert that users must wait on fences before destroying them
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-11 11:04:44 +02:00
Nicolai Hähnle a0a616720a gallium/u_queue: guard fence->signalled checks with fence->mutex
I have seen a hang during application shutdown that could be explained by the
following race condition which this patch fixes:

1. Worker thread enters util_queue_fence_signal, sets fence->signalled = true.
2. Main thread calls util_queue_job_wait, which returns immediately.
3. Main thread deletes the job and fence structures, leaving garbage behind.
4. Worker thread calls pipe_condvar_broadcast, which gets stuck forever because
   it is accessing garbage.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-11 11:03:59 +02:00
Nayan Deshmukh af18a04755 vl: add half pixel to v_tex before adding offsets
Since pixel center lies at 0.5, add half_pixel to vtex
before adding offsets to it.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-07-08 20:51:12 +02:00
Rob Clark def044376a gallium/util: make util_copy_framebuffer_state(src=NULL) work
Be more consistent with the other u_inlines util_copy_xyz_state()
helpers and support NULL src.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-07-06 10:17:30 -04:00
Hans de Goede d386cef246 tgsi: Add WORK_DIM System Value
Add a new WORK_DIM SV type, this is will return the grid dimensions
(1-4) for compute (opencl) kernels.

This is necessary to implement the opencl get_work_dim() function.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-07-02 12:21:28 +02:00
Nayan Deshmukh 872dd9ad15 vl: add a bicubic interpolation filter(v5)
This is a shader based bicubic interpolater which uses cubic
Hermite spline algorithm.

v2: set dst_area and dst_clip during scaling (Christian)
v3: clear the render target before rendering
v4: intialize offsets while initializing shaders
    use a constant buffer to send dst_size to frag shader
    small changes to reduce calculation in shader
v5: send half pixel offset instead of sending dst_size

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-07-01 12:54:33 +02:00
Brian Paul c823ff8dfb gallium/util: check for window cliprects in util_can_blit_via_copy_region()
We can't blit with resource_copy_region() if there are window clip rects.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-30 18:19:09 -06:00
Brian Paul 5f1335878e gallium/util: add tight_format_check param to util_can_blit_via_copy_region()
The VMware driver will use this for implementing GL_ARB_copy_image.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-30 14:32:06 -06:00
Brian Paul a029d9f074 gallium/util: simplify a few things in util_can_blit_via_copy_region()
Since only the src box can have negative dims for flipping, just
comparing the src/dst box sizes is enough to detect flips.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-30 14:32:06 -06:00
Brian Paul 5d31ea4b8f gallium/util: new util_try_blit_via_copy_region() function
Pulled out of the util_try_blit_via_copy_region() function.  Subsequent
changes build on this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-30 14:32:06 -06:00
Hans de Goede 459cc94507 pipe_loader_sw: Fix fd leak when instantiated via pipe_loader_sw_probe_kms
Make pipe_loader_sw_probe_kms take ownership of the passed in fd,
like pipe_loader_drm_probe_fd does.

The only caller is dri_kms_init_screen which passes in a dupped fd,
just like dri2_init_screen passes in a dupped fd to
pipe_loader_drm_probe_fd.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-28 12:29:54 +02:00
Marek Olšák cbb5adb908 gallium/u_queue: allow the execute function to differ per job
so that independent types of jobs can use the same queue.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Marek Olšák 4a06786efd gallium/u_queue: reduce the number of mutexes by 2
by converting semaphores to condvars and using the main mutex

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Marek Olšák 2fba0aaa70 gallium/u_queue: add an option to name threads
for debugging

v2: correct the snprintf use

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Marek Olšák 404d0d50d8 gallium/u_queue: add an option to have multiple worker threads
independent jobs don't have to be stuck on only one thread

v2: use CALLOC & FREE

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Marek Olšák 4358f6dd13 gallium/u_queue: rewrite util_queue_fence to allow multiple waiters
Checking "signalled" is first done without a mutex, then with a mutex.
Also, checking without waiting doesn't lock the mutex. This is racy, but
should be safe.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Marek Olšák d8367e91f2 gallium/u_queue: use a ring instead of a stack
and allow specifying its size in util_queue_init.

v2: use CALLOC & FREE

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Brian Paul e0dc3c5f19 gallium/util: fix some 4-space indentation in blitter code
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-23 07:31:20 -06:00
Ilia Mirkin 5b0d64886d translate: fix start_instance parameter in sse version
The generic version gets this right already, but this was using an
incorrect formula in SSE.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-06-21 21:50:16 -04:00
Marek Olšák 5fed1122e8 gallium/u_blitter: implement mipmap generation
for pipe_context::generate_mipmap

first move some of the blit code from util_blitter_blit_generic
to a separate function, then use it from util_blitter_generate_mipmap

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-21 13:52:05 +02:00
Roland Scheidegger b0cf99165a gallivm: don't use integer min/max sse intrinsics with llvm >= 3.9
Apparently, these are deprecated. There's some AutoUpgrade feature which
is supposed to promote these to cmp/select, which apparently doesn't work
with jit code. It is possible it's not actually even meant to work (see
the bug filed against llvm which couldn't provide an answer neither)
but in any case this is meant to be only temporary unless the intrinsics
are really illegal. So, just use the fallback code (which should be cmp/select,
we're actually doing cmp/sext/trunc/select, but in any case llvm 3.9 manages
to optimize this back to pmin/pmax in the end).

This addresses https://llvm.org/bugs/show_bug.cgi?id=28176

CC: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Aaron Watry <awatry@gmail.com>
2016-06-20 17:19:03 +02:00
Christian König bf89e672cf vl: support luma keying for interlaced surfaces as well
We had the CSC code twice in there, factor it out into a separate function.

Signed-off-by: Christian König <christian.koenig@amd.com>
2016-06-16 09:41:12 +02:00
Brian Paul bb1292e226 auxilary/os: allow appending to GALLIUM_LOG_FILE
If the log file specified by the GALLIUM_LOG_FILE begins with '+', open
the file in append mode.  This is useful to log all gallium output for
an entire piglit run, for example.

v2: put GALLIUM_LOG_FILE support inside an #ifdef DEBUG block.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-06-15 17:16:42 -06:00
Marek Olšák 562cb03d76 gallium/util: import the multithreaded job queue from amdgpu winsys (v2)
v2: rename the event to util_queue_fence

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-15 21:07:34 +02:00
Roland Scheidegger afbf5888f5 gallium/util: don't use blocksize for minify for assertions
The previous assertions required for texture sizes smaller than block_size
that src_box.x + src_box.width still be block size.
(e.g. for a texture with width 3, and src_box.x = 0, src_box.width would
have to be 4 to not assert.)
This caused some assertions with some other state tracker.
It looks though like callers aren't expected to round up widths to block sizes
(for sizes larger than block size the assertion would still have verified it
wouldn't have been rounded up) so we simply shouldn't use a minify which
rounds up to block size.
(No piglit change with llvmpipe.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-14 17:03:34 +02:00
Julien Isorce 1cdb4da1d6 st/va: ensure linear memory for dmabuf
In order to do zero-copy between two different devices
the memory should not be tiled.

Tested with GStreamer on a laptop that has 2 GPUs:
1- gstvaapidecode:
   HW decoding and dmabuf export with nouveau driver on Nvidia GPU.
2- glimagesink:
   EGLImage imports dmabuf on Intel GPU.

TEST: DRI_PRIME=1 gst-launch vaapidecodebin ! glimagesink

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-06-14 08:40:33 +01:00
Mathias Fröhlich c3b6656676 mesa/gallium: Move u_bit_scan{,64} from gallium to util.
The functions are also useful for mesa.
Introduce src/util/bitscan.{h,c}. Move ffs function
implementations from src/mesa/main/imports.{h,c}.
Move bit scan related functions from
src/gallium/auxiliary/util/u_math.h. Merge platform
handling with what is available from within mesa.

v2: Try to fix MSVC compile.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2016-06-14 05:19:10 +02:00
Brian Paul cf9bb9acac util: update some assertions in util_resource_copy_region()
To cope with copies of compressed images which are not multiples of
the block size.  Suggested by Jose.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@sroland@vmware.com>
2016-06-13 13:30:19 -06:00
Jan Vesely 1fb4179f92 vl: Fix trivial sign compare warnings
v2: add whitepace fixes

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
[Emil Velikov: squash a few more whitespace issues]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:31:29 +01:00
Rob Herring 112e988329 Android: move libdrm settings to top-level Android.common.mk
Fix warnings like these due to HAVE_LIBDRM being inconsistently defined:

external/libdrm/include/drm/drm.h:839:30: warning: redefinition of typedef 'drm_clip_rect_t' is a C11 feature [-Wtypedef-redefinition]
typedef struct drm_clip_rect drm_clip_rect_t;

HAVE_LIBDRM needs to be set project wide to fix this. This change also
harmlessly links libdrm with everything, but simplifies the makefiles a
bit.

Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:31:29 +01:00
Jan Vesely ace70aedcf gallivm: Fix trivial sign warnings
v2: include whitespace fixes

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-06-13 09:23:09 -04:00
Brian Paul dd4be2e19a util: update util_resource_copy_region() for GL_ARB_copy_image
This primarily means added support for copying between compressed
and uncompressed formats.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-10 15:50:04 -06:00
Anuj Phogat 466b320163 gallium: Fix region overlap conditions for rectangles with a shared edge
>From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels":
"The pixels corresponding to these buffers are copied from the source
rectangle bounded by the locations (srcX0, srcY 0) and (srcX1, srcY 1)
to the destination rectangle bounded by the locations (dstX0, dstY 0)
and (dstX1, dstY 1). The lower bounds of the rectangle are inclusive,
while the upper bounds are exclusive."

So, the rectangles sharing just an edge shouldn't overlap.
 -----------
|           |
 ------- ---
|       |   |
|       |   |
 ------- ---

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-10 14:35:21 -07:00
Dave Airlie 1584918996 gallivm: more 64-bit integer prep work.
This converts one other place to using the new helper.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-11 06:44:30 +10:00
Dave Airlie e5c57824ec gallivm: make non-float return code bitcast consistent.
This just uses the same form across the fetches.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-11 06:44:17 +10:00
Dave Airlie 3b97e50b9a gallium/gallivm: use 64-bit test instead of doubles.
This just makes some generic code that currently emits double
suitable for emitting 64-bit values.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-11 06:44:13 +10:00
Dave Airlie 213ab8db87 gallium/tgsi: add 64-bitness type check function.
Currently this just doubles, but we'll convert users to this
so making adding 64-bit integers easier.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-11 06:43:45 +10:00
Leo Liu 2ad443e4cc vl/dri3: support receiving new pixmap for front buffer
With glx of gstreamer-vaapi, the temporary pixmap for front buffer gets
renewed in each frame, so when we receive a new pixmap, should get a new
front buffer for it.

This also fixes Totem player playback corruption.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 11:24:24 -04:00
Leo Liu 0ef8500aab vl/dri3: get Makefile properly
From original commit, the macro "if HAVE_DRI3" was in Makefile.sources,
this file is shared with SCons, SCons is not able to parse this marco,
the SCons build failed. Jose quickly gave two approaches and quick fix
with his second approach, thanks Jose for the solutions and fixes.

This patch is Jose's first approach, and it's more proper, because the
dri3 c file should not be included to build when DRI3 is not enabled.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 11:24:19 -04:00
Jose Fonseca 2b4cee0571 gallivm: Never emit llvm.fmuladd on LLVM 3.3.
Besides the old JIT bug, it seems the X86 backend on LLVM 3.3 doesn't
handle llvm.fmuladd and instead it fall backs to a C function.  Which in
turn causes a segfault on Windows.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-10 16:17:04 +01:00
Jose Fonseca 320d1191c6 gallivm: Use llvm.fmuladd.*.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-10 13:47:35 +01:00
Jose Fonseca 9e8edfa190 util,gallivm: Explicitly enable/disable fma attribute.
As suggested by Roland Scheidegger.

Use the same logic as f16c, since fma requires VEX encoding.

But disable FMA on LLVM 3.3 without MCJIT.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-10 13:47:35 +01:00
Nayan Deshmukh f24eb5a178 vl: Apply luma key filter before CSC conversion
Apply the luma key filter to the YCbCr values during the CSC conversion
    in video buffer shader. The initial values of max and min luma are set
    to opposite values to disable the filter initially and will be set when
    enabling it.

    Add extra parmeters min and max luma for the luma key filter in
    vl_compositor_set_csc_matrix in va, xvmc. Setting them
    to opposite value 1.f and 0.f respectively won't effect the CSC
    conversion

    v2: -Squash 1,2 and 3 into one patch to avoid breaking build of
        other components. (Christian)
        -use ureg_swizzle. (Christian)
        -change name of the variables. (Christian)

    v3: -Squash all patches in one to avoid breaking of build. (Emil)
        -wrap functions properly. (Emil)
        -use 0.0f and 1.0f instead of 0.f and 1.f respectively. (Emil)

    v4: -Divide it in two patches one which introduces the functionality
	 and assigs dummy values to the changed functions and second which
	 implements the lumakey filter. (Christian)
	-use ureg_scalar instead ureg_swizzle. (Christian)

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-06-09 14:23:07 +02:00
Nicolai Hähnle d3a584defe tgsi/scan: add uses_derivatives (v2)
v2:
- TG4 does not calculate derivatives (Ilia)
- also handle SAMPLE* instructions (Roland)

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-07 23:45:17 +02:00
Ilia Mirkin 30684b50d7 gallium: add VOTE_* opcodes to implement GL_ARB_shader_group_vote
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-06-06 20:49:28 -04:00
Charmaine Lee 627e975896 tgsi: fix mixed data type comparison in tgsi_point_sprite.c
Cast the unsigned semantic index to integer datatype before comparing
to max_generic, otherwise, max_generic which is initialized to -1
will be converted to unsigned int before the comparison, causing a wrong
semantic index to be assigned to a shader output.

Fixes the assert running TurboCAD_gl.trace. (VMware bug 1667265)

Also tested with glretrace, mesa demos pointblast, spriteblast and pointcoord.

v2: use the original max_generic variable but add the (int) cast
    to the semantic index, as suggested by Brian.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-06 10:20:45 -06:00
Lars Hamre 4163c71010 tgsi: use truncf in micro_trunc
Switches to using truncf in micro_trunc.

Fixes the following piglit tests (for softpipe):

/spec/glsl-1.30/execution/built-in-functions/...
fs-trunc-float
fs-trunc-vec2
fs-trunc-vec3
fs-trunc-vec4
vs-trunc-float
vs-trunc-vec2
vs-trunc-vec3
vs-trunc-vec4

/spec/glsl-1.50/execution/built-in-functions/...
gs-trunc-float
gs-trunc-vec2
gs-trunc-vec3
gs-trunc-vec4

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-06 15:56:28 +02:00
Marek Olšák ada3d8f31e gallium/u_suballoc: allow different alignment for each allocation
Just move the alignment parameter from u_suballocator_create
to u_suballocator_alloc.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Rob Clark 228b2b36f4 gallium/util: remove u_staging
Unused, and fixes a couple of coverity warnings: CID 1362171, 1362170

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2016-06-02 15:44:07 -04:00
Nicolai Hähnle d9893feb2c gallium/cso: allow saving the first fragment shader image slot
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:15 +02:00
Nicolai Hähnle fc0352ff9c gallium/u_inlines: allow NULL src in util_copy_image_view
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:12 +02:00
Marek Olšák 9d881cc0ac gallium/util: add util_texrange_covers_whole_level from radeon
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-01 17:35:30 +02:00
Marek Olšák 921ab0028e gallium/u_blitter: do GL-compliant integer resolves
The GL spec has been clarified and the new rule says we should just
copy 1 sample.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-31 16:48:53 +02:00
Frederic Devernay cee459d84d gallivm: initialize init_native_targets_once_flag correctly
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-05-30 16:13:52 +02:00
Brian Paul 747754f027 gallium/util: another s/unsigned/enum pipe_prim_type/ for clang
Trivial.
2016-05-27 18:42:21 -06:00
Brian Paul 8beb6f3c9c gallium/util: another unsigned -> enum pipe_prim_type change
gcc didn't warn about the unsigned / enum pipe_prim_type mismatch
between the .c and .h file.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-27 17:55:05 -06:00
Roland Scheidegger 9247570d42 gallivm: eliminate a unnecessary AND with unorm lerps
Instead of doing a add and then mask out the upper bits, we can
simply do a add with a half wide type (this, of course, assumes
the hw can actually do it...), so we'll get the required zero
in the upper bits automatically.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-27 19:11:28 +02:00
Roland Scheidegger 17d685c426 gallium/util: use enum pipe_prim_type instead of unsigned some more
There were complaints from a mingw build:
u_draw.h:134:14: error: invalid conversion from ‘uint {aka unsigned int}’
to ‘pipe_prim_type’ [-fpermissive]

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-27 19:11:28 +02:00
Rob Clark 4f98c94be7 gallium/util: fix build break
Missing #include caused build breaks after 21a3fb9cd.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-26 20:59:08 -04:00
Brian Paul 1ec45a1948 gallium/util: use enum pipe_prim_type in u_prim.h functions
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:18 -06:00
Brian Paul 7a49b41436 util/indices: move duplicated assignments out of switch cases
Spotted by Roland.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:18 -06:00
Brian Paul a25ae485a6 util/indices,svga: s/unsigned/enum pipe_prim_type/
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:18 -06:00
Brian Paul 21a3fb9cd8 util: s/unsigned/enum pipe_resource_usage/ for buffer usage variables
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:18 -06:00
Brian Paul 0f983e1793 util/indices: implement unfilled (tri->line) conversion for adjacency prims
Tested with new piglit gl-3.2-adj-prims test.

v2: re-order trisadj and tristripadj code, per Roland.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:17 -06:00
Brian Paul d6c2c7d710 util/indices: implement provoking vertex conversion for adjacency primitives
Tested with new piglit gl-3.2-adj-prims test.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:17 -06:00
Brian Paul 479d364c39 util/indices: assert that the incoming primitive is a triangle type
The unfilled index translator/generator functions should only be
called when the primitive mode is one of the triangle types.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:17 -06:00
Brian Paul 26de558072 util/indices: formatting, whitespace fixes in u_unfilled_indices.c
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:17 -06:00
Brian Paul 24eadb4810 util/indices: improve comments in u_indices.h
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:17 -06:00
Rob Clark 6e51fe75a4 tgsi: fix coverity out-of-bounds warning
CID 1271532 (#1 of 1): Out-of-bounds read (OVERRUN)34. overrun-local:
Overrunning array of 2 16-byte elements at element index 2 (byte offset
32) by dereferencing pointer &inst.Dst[i].

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-26 15:17:49 -04:00
Rob Clark 3d66ba971e tgsi: fix out of bounds access
Not sure why coverity calls this an out-of-bounds read vs out-of-bounds
write.

CID 1358920 (#1 of 1): Out-of-bounds read (OVERRUN)9. overrun-local:
Overrunning array r of 3 16-byte elements at element index 3 (byte
offset 48) using index chan (which evaluates to 3).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-26 15:17:49 -04:00
Lars Hamre c626a86586 gallium/tgsi: use _mesa_roundevenf in micro_rnd
Fixes the following piglit tests (for softpipe):

/spec/glsl-1.30/execution/built-in-functions/...
fs-roundeven-float
fs-roundeven-vec2
fs-roundeven-vec3
fs-roundeven-vec4
vs-roundeven-float
vs-roundeven-vec2
vs-roundeven-vec3
vs-roundeven-vec4

/spec/glsl-1.50/execution/built-in-functions/...
gs-roundeven-float
gs-roundeven-vec2
gs-roundeven-vec3
gs-roundeven-vec4

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-26 07:59:15 -06:00
Giuseppe Bilotta 8c00fe3970 scons: whitespace cleanup
This text transformation was done automatically via the following shell
command:

$ find -name SCons\* -exec sed -i s/\\s\\+$// '{}' \;

Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-25 12:23:12 -06:00
Brian Paul 9690ab0cdf tgsi: print TGSI_PROPERTY_NEXT_SHADER value as string, not an integer
Print "GEOM" instead of "2", for example.

v2: also update the text parsing code, per Ilia.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-25 07:21:23 -06:00
Brian Paul 2b773fcf00 tgsi: s/6/PIPE_SHADER_TYPES/ for tgsi_processor_type_names array size
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-25 07:21:23 -06:00
Emil Velikov a155cdaace vl/drm: don't call close(-1) in vl_drm_screen_create error path
Analogous to previous commits.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-05-23 12:07:47 +01:00
Dave Airlie e6d9389366 tgsi: remove culldist semantic.
This isn't used anymore in the tree, culldist's
are part of the clipdist semantic, we could in theory
rename it, but I'm not sure there is much point, and
I'd have to be careful with virgl.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 11:03:44 +10:00
Dave Airlie d17062a40e draw: stop using CULLDIST semantic.
The way the HW works doesn't really fit with having
two semantics for this.

The GLSL compiler emits 2 vec4s and two properties,
this makes draw use those instead of CULLDIST semantics.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 11:03:40 +10:00
Axel Davy 52cb8e33c3 gallium/util: Implement util_format_translate_3d
This is the equivalent of util_format_translate, but for volumes.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Brian Paul 5888c47cc9 cso: remove / add some comments
Signed-off-by: Brian Paul <brianp@vmware.com>
2016-05-17 19:20:36 -06:00
Jan Vesely 47b390fe45 Treewide: Remove Elements() macro
Signed-off-by: Jan Vesely <jano.vesely@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-17 15:28:04 -04:00
Jose Fonseca cf010de6ee vl/dri: Move the DRI3 check out of sources include into C.
Fixes SCons build.

Trivial.  Built locally with SCons and autotools.
2016-05-16 21:50:43 +01:00
Leo Liu c122c74dca vl/dri3: implement functions for get and set timestamp
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu 9f50a79b8f vl/dri3: handle PresentCompleteNotify event
and get timestamp calculated based on the event's reply

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu 8d7ac0a4e4 vl/dri3: implement DRI3 BufferFromPixmap
We also need render to the front buffer of temporary X pixmap,
this is the case of when we using opengl as video out for vaapi.
the basic implementation is to pass pixmap ID to X server, and
then X will return dma-buf fd, we will get the buffer object
through this dma-buf fd.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu 858b329c2c vl/dri3: add support for resizing
When drawable size changed, PresentConfigureNotify event will be
emitted, by handling the event to re-allocate resized buffer.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu 96580ad593 vl/dri3: implement funciton for get dirty area
This will clear presentation area not covered by video content

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu b0bd908284 vl/dri3: implement function for flush frontbuffer
Request drawable content in pixmap by calling DRI3 PresentPixmap,
and handle PresentIdleNotify event.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu e1223282db vl/dri3: add back buffers support
This implements DRI3 PixmapFromBuffer. Create buffer objects, and
associate it to a dma-buf fd, and then pass this fd with a pixmap
ID to X server for creating pixmap object; also add a function
for wait events.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu 69ba9be4d2 vl/dri3: implement flushing for queued events
also place holder for present events handling

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu 758b1bbaa7 vl/dri3: register present events
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu 672e8d5e7e vl/dri3: set drawable geometry
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu 12e5220e34 vl/dri3: add DRI3 support and implement create and destroy
Required functions into place for implementation, create screen
with device fd returned from X server, also bail out to DRI2
with certain conditions.

v2: -organize the error out path (Axel)
    -squash previous patch 1 and 2 into one (Emil)

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu bd9ae72459 vl/dri: fix close fd error out
fd should be set to -1 only if it got closed by pipe_loader_release.

Signed-off-by: Leo Liu <leo.liu@amd.com>
2016-05-12 18:26:48 -04:00
Tim Rowley 2785f2f2d7 swr: properly expose compressed format support
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-12 14:12:18 -05:00
Rob Clark 425dc4c4b3 gallium: refactor pipe_shader_state to support multiple IR's
The goal is to allow the pipe driver to request something other than
TGSI, but detect whether what is getting is TGSI vs what it requested.
The pipe drivers will always have to support TGSI (and convert that into
whatever it is that they prefer), but in some cases we should be able to
skip the TGSI intermediate step (such as glsl->nir vs glsl->tgsi->nir).

I think pipe_compute_state should get similar treatment.  Currently,
afaict, it has one user and one consumer, which has allowed it to be
sloppy wrt. supporting alternative IR's.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-11 12:20:11 -04:00
Roland Scheidegger 430797843a gallivm: improve dumping of bitcode
Use GALLIVM_DEBUG=dumpbc for dumping of modules as bitcode.
Instead of a fixed llvmpipe.bc name, use ir_<modulename>.bc so multiple
modules can be dumped (albeit it might still overwrite previous modules,
particularly the modules from draw tend to always have the same name).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-11 04:43:35 +02:00
Roland Scheidegger e4cf8717de gallivm: print declarations of intrinsics with GALLIVM_DEBUG=ir
Those aren't really interesting, however outputting them is helpful when
trying to feed the IR to llvm llc (or opt) for debugging.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-10 17:08:16 +02:00
Roland Scheidegger 5c200894c8 gallivm: use InternalLinkage instead of PrivateLinkage for texture functions
At least with MCJIT the disassembler will crash otherwise when trying to
disassemble such functions.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-10 17:08:16 +02:00
Roland Scheidegger 8b66e2647d gallivm: disable avx512 features
We don't target this yet, and some llvm versions incorrectly enable it based
on cpu string, causing crashes.
(Albeit this is a losing battle, it is pretty much guaranteed when the next
new feature comes along llvm will mistakenly enable it on some future cpu,
thus we would have to proactively disable all new features as llvm adds them.)

This should fix https://bugs.freedesktop.org/show_bug.cgi?id=94291 (untested)

Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com

CC: <mesa-stable@lists.freedesktop.org>
2016-05-10 17:08:16 +02:00
Tim Rowley b65f7ec450 gallium: enable intel jitevents profiling
LLVM when configured with "intel jitevents" enabled can inform
VTune about dynamic code, so individual shaders are attributed
profiling data and the resulting assembly can be examined.

Acked-by: Roland Scheidegger <sroland@vmware.com>
2016-05-09 11:25:02 -05:00
Nicolai Hähnle 62b7958cd0 gallium: fix various undefined left shifts into sign bit
Funnily enough, some of these were turned into a compile-time error by gcc
with -fsanitize=undefined ("initializer is not a constant").

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-07 16:46:59 -05:00
Brian Paul ef5a31fc06 gallium/util: change assertion to conditional in util_bitmask_destroy()
If we fail to create a context in the VMware driver we call this function
unconditionally to free a bunch of bit vectors.  Instead of asserting on
a null pointer, just no-op.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-03 15:40:49 -06:00
Brian Paul 68116dcd5a cso: null-out previously bound sampler states
If, for example, we previously had 2 sampler states bound and now we
are binding one, we'd leave the second sampler state unchanged.
This change nulls-out the second sampler state in this situation.
We're already doing the same thing for sampler views.

This silences an occasional warning issued by the VMware driver when
the number of sampler views and sampler states disagreed.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-03 15:40:49 -06:00
Jan Vesely ebbe31d57c gallium,utils: Fix trivial sign compare warnings
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-03 12:00:09 -04:00
WuZhen 798f7a8596 tgsi: initialize stack allocated struct
Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-01 12:31:29 +01:00
Emil Velikov 0700cdd5aa gallium/target-helpers: remove inline_wrapper_sw_helper.h
Unused as of commit dddedbec0e "{st,targets}/nine: use static/dynamic
pipe-loader"

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:37:25 +01:00
Nicolai Hähnle 59af21c3e9 tgsi/text: fix parsing of memory instructions
Properly handle Target and Format parameters when present.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:56 -05:00
Nicolai Hähnle 4055babc75 tgsi/text: add str_match_name_from_array
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:53 -05:00
Nicolai Hähnle a56edbdd8f tgsi/text: add str_match_format helper function
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:51 -05:00
Nicolai Hähnle acb65a23a3 tgsi/build: pass Memory.Texture and .Format through tgsi_build_full_instruction
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:32 -05:00
Nicolai Hähnle 318d305f6d tgsi/dump: signal nospace when the last print exceeded the size
Previously, there was a bug where nospace wasn't signalled if it just so
happened that the very last print exceeded the available space.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:28 -05:00
Nicolai Hähnle e08eaa5b72 tgsi/dump: shared dump_ctx initialization
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:21 -05:00
Brian Paul a609da60c0 gallium/util: s/Elements/ARRAY_SIZE/
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-28 09:04:24 -06:00
Brian Paul 23c55e5c23 tgsi: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul 419e386571 os: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul d902504a67 hud: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul e522a76226 gallivm: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul 489df4a71a draw: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Nicolai Hähnle 91fb4bb2e9 gallium/util: add u_bit_consecutive for generating a consecutive range of bits
There are some undefined behavior subtleties, so having a function to match
the u_bit_scan_consecutive_range makes sense.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Dave Airlie f78bcb7638 tgsi/exec: initialise SysSemanticToIndex array to -1
We want to use the SysSemanticToIndex to tell if we've seen
the semantics at all.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 09:00:46 +10:00
Dave Airlie fbea4e177f tgsi/exec: implement restartable machine.
This lets us restart the machine at a PC value, and exits
the machine when we hit a barrier.

Compute shaders will then execute all the threads up to the
barrier, then restart the machines after the barrier once
all are done.

v2: comment the code a bit, change return types.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 09:00:44 +10:00
Dave Airlie 8ffa3c58d4 tgsi/exec: make inputs/outputs optional for compute shaders.
compute shaders don't need input/outputs so don't bother
allocating memory for these.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 09:00:41 +10:00
Dave Airlie 16a9dc1e49 tgsi/exec: implement load/store/atomic on MEMORY.
This implements basic load/store/atomic ops on MEMORY types
for compute shaders.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 09:00:35 +10:00
Dave Airlie 354c5f2d0f tgsi/exec: split out setting up masks to separate function
This is just a cleanup that will make later changes easier
to make.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 08:56:22 +10:00
Dave Airlie 6cf36a7231 tgsi: accept a starting PC value for exec machine.
This will be used later to restart barriered execution
threads in compute, for now we just want to change the API.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 08:56:17 +10:00
Dave Airlie 912ed84f83 tgsi: move to using vector for system values.
For compute support some of the system values are .xyz types,
so move to using a vector instead of a single channel.

[airlied: squash swizzle fix from compute series].

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 08:26:53 +10:00
Dave Airlie 9013d9267c tgsi/exec: fix system value handling.
a) SysSemanticToIndex needs to be indexed with the semantic name
not the decl->Declaration.Semantic.

b) doing this in run is too late, as the mappings are all setup
prior to run in the execs.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 08:25:38 +10:00
Jose Fonseca dcc3baf733 gallium: Include intrin.h instead of defining ourselves.
More portable, particularly when building with Clang, which implements
all MSVC intrisincs in its own intrin.h, but doesn't actually support
`#pragma instrinsic`.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-26 17:17:00 +01:00
Dave Airlie e3e6859381 tgsi: pass a shader type to the machine create and clean up.
There was definitely bugs here mixing up the PIPE_ and TGSI_ defines,
hopefully they didn't cause any problems, since mostly it was special
cases for GEOMETRY.

This clarifies at shader machine create what type of shader this
machine will execute. This is needed also for compute shaders where
we don't want to allocate inputs/outputs.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-26 13:05:32 +10:00
Dave Airlie a6aae0c24d gallium/tgsi: move tgsi_exec.h header out of draw_context.h
It gets annoying that changing the tgsi exec rebuilds the state
tracker unnecessarily. Putting this include into draw_gs.h which
uses it causes a lot less rebuilds.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-26 13:00:57 +10:00
Roland Scheidegger bd07e20d20 gallivm: make sampling more robust against bogus coordinates
Some cases (especially these using fract for coord wrapping) did not handle
NaNs (or Infs) correctly - the following code assumed the fract result
could not be outside [0,1], but if the input is a NaN (or +-Inf) the fract
result was NaN - which then could produce out-of-bound offsets.

(Note that the explicit NaN behavior changes for min/max on x86 sse don't
result in actual changes in the generated jit code, but may on other
architectures. Found by looking through all the wrap functions.)

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94955

No piglit changes.

(v2: fix min/max typo in coord_mirror, add comment)

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>

Tested-by: Bruce Cherniak <bruce.cherniak@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-26 04:55:37 +02:00
Brian Paul 63df017fda util/blitter: use ARRAY_SIZE macro
And remove local definition of Elements() macro.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-25 12:59:29 -06:00
Brian Paul 1db8313168 gallium/util: initialize pipe_framebuffer_state to zeros
To silence a valgrind uninitialized memory warning.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94955
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-25 12:59:29 -06:00
Brian Paul 1e990978ee util/cache: add comments, fix formatting 2016-04-25 12:59:29 -06:00
Grazvydas Ignotas dc732a8ef2 gallium: use unreachable instead of asserts
Avoids warnings in release builds.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-25 12:23:34 +02:00
Grazvydas Ignotas cbb0d4ad75 gallium: fix warnings in release build
Mark variables MAYBE_UNUSED to avoid unused-but-set-variable warnings
in release build.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-25 12:23:21 +02:00
Marek Olšák c9e5a7df61 gallium: remove helpers converting to/from TGSI_PROCESSOR_*
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-22 01:30:39 +02:00
Marek Olšák af249a7da9 gallium: use PIPE_SHADER_* everywhere, remove TGSI_PROCESSOR_*
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-22 01:30:39 +02:00
Marek Olšák fb523cb6ad gallium: merge PIPE_SWIZZLE_* and UTIL_FORMAT_SWIZZLE_*
Use PIPE_SWIZZLE_* everywhere.
Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE.
The new enum is called pipe_swizzle.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-22 01:30:39 +02:00
Roland Scheidegger 4ff8cbb0d8 gallivm: fix bogus argument order to lp_build_sample_mipmap function
Screwed up since 0753b135f6.

(Only an issue with different min/mag filters, and then only in some cases,
which is probably why it went unnoticed for quite a while.
The effect should have simply been nearest mip filter instead of linear, iff
min was nearest, mag was linear, and all pixels hit the mignifying path.)

Fixes a bunch of dEQP failures.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-21 23:57:24 +02:00
Russell King fadfaa82c6 tgsi/lowering: improved lowering for LRP
Provide an improved lowering for LRP, which can be implemented in two
MAD instructions with a bit of rearranging of the equation, rather
than the literal implementation of two multiplies, an add and a
subtract.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-19 16:04:44 -04:00
Russell King 67da7dd98a tgsi/lowering: improved lowering for XPD
Improve XPD lowering to consume less instructions by using the
MAD instruction to perform the multiply and subtraction together.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-19 16:04:44 -04:00
Russell King 65460cf4c8 tgsi/lowering: add support for lowering TRUNC
Add support for lowering TRUNC using the following sequence:

	FRC tmpA, |src|
	SUB tmpA, |src|, tmpA
	CMP dst, -tmpA, tmpA

Note that this is incompatible with FRC lowering.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-19 16:04:44 -04:00
Russell King 23e870a888 tgsi/lowering: add support for lowering FLR and CEIL
Add support for lowering FLR and CEIL to FRC/SUB and FRC/ADD
instructions for GPUs that support FRC but not FLR or CEIL.  Since
these uses FRC, it is invalid to ask for FLR or CEIL to be lowered
along with FRC, so add an assert to catch this invalid configuration.

We also need to deal with FLR instructions emitted by the lowering
code.  Fix these up with the FRC+SUB equivalent when FLR lowering is
enabled.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-19 16:04:44 -04:00
Bas Nieuwenhuizen 0b6c463dac gallium/util: Add u_bit_scan_consecutive_range64.
For use by radeonsi.

v2: Make sure that it works for all 64 bits set.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-19 18:10:30 +02:00
Jose Fonseca 932b71f17d gallivm: Avoid llvm::sys::getProcessTriple().
Just use LLVM_HOST_TRIPLE, which is available at least from LLVM 3.3
onwards, and is pretty much what llvm::sys::getProcessTriple() does anyway,

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:37 +01:00
Jose Fonseca b5ca689cee gallivm: Remove lp_get_module_id.
Just keep a copy of the module_name in gallivm.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:26 +01:00
Jose Fonseca 969ba8bfa7 gallivm: Fix MCJIT with LLVM 3.3.
One needs to call setJITMemoryManager for LLVM 3.3, instead of
setMCJITMemoryManager.

This regressed in commits 065256df/75ad4fe7 when trying to make the
code to build with LLVM 3.6.

Tested MCJIT with LLVM 3.3 to 3.6.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:17 +01:00
Jose Fonseca cf4105740f gallivm: Make MCJIT a runtime option.
On the LLVM versions that support it, so we can easily switch between
MCJIT/old-jit for testing.

The new option is GALLIVM_MCJIT.

Unfortunately setting GALLIVM_MCJIT=1 for LLVM 3.3 or 3.4 causes
segfault, both on Linux and Windows.  I'm almost certain this used to
work, so there probably is a regression somewhere.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:14 +01:00
Jose Fonseca 2211f8d559 gallivm: Use LLVMSetTarget.
Instead of LLVM C++ interfaces.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:00 +01:00
Jose Fonseca 9aa23b11e4 gallivm: Use LLVMPrintValueToString where available.
And llvm::raw_string_ostream where not (LLVM 3.3).

Thereby eliminating yet another dependency on unstable LLVM interfaces.

As a bonus this also gets LLVM IR on OutputDebugMessageA on MSVC (which
was disabled, probably due to C++ issues.)

Tested `lp_test_arit -v -v` on LLVM 3.3, 3.4 and 3.8.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:28:37 +01:00
Jose Fonseca f6621cd3be gallium/tests: Update UTIL_FORMAT_MAX_* defines.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:28:16 +01:00
Dave Airlie 3a26ef23e7 gallivm: convert size query to using a set of parameters.
This isn't currently that easy to expand, so fix it up
before expanding it later to include dynamic samplers.

[airlied: use some local variables (Roland)]

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-19 07:33:39 +10:00
Marek Olšák 6dc21b1962 gallium/util: fix undefined shift to the last bit in u_bit_scan
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:24 +02:00
Marek Olšák 9434aa8103 gallium/util: fix u_bit_scan_consecutive_range for mask == 0xffffffff
The second ffs returns 0, yielding count == -1.

v2: change 1 to 1u

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-18 19:51:24 +02:00
Roland Scheidegger d11111a551 gallivm: don't use vector selects with llvm 3.7
llvm 3.7 sometimes simply miscompiles vector selects.
See https://bugs.freedesktop.org/show_bug.cgi?id=94972

This was fixed in llvm r249669
(https://llvm.org/bugs/show_bug.cgi?id=24532).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-18 00:23:34 +02:00
Tim Rowley ee72fec9cf gallium/swr: allow swr use as a swrast dri driver
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-15 14:21:50 -05:00
Jose Fonseca 9586468c03 gallivm: Workaround LLVM PR 27332.
The credit for finding and isolating this bug goes to Vinson and Roland.

The buggy LLVM versions were found by doing

  opt -instcombine llvm-pr27332.ll > /dev/null

where llvm-pr27332.ll is the IR from
https://llvm.org/bugs/show_bug.cgi?id=27332#c3

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-13 16:42:55 +01:00
Roland Scheidegger cb438d8b3e gallivm: use llvm.nearbyint instead of llvm.round.
We used to use sse roundps intrinsic directly, but switched to use the llvm
intrinsics for rounding with e4f01da15d.
However, llvm semantics follows standard math lib round function which is
specced to do roundNearestAwayFromZero but we really want roundNearestEven
(moreoever, using round generates atrocious code since the cpu can't do it
directly and it results in scalar calls to libm __roundf).
So, use llvm.nearbyint instead, which does exactly the right thing, and even
has the advantage of being available with llvm 3.3 too. (I've verified it
actually generates a roundps instruction with llvm 3.3.)

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94909

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-13 11:13:03 +01:00
Thomas Hindoe Paaboel Andersen b89708f95f tgsi: fix buffer overflow
Increase r to four channels as rgba is written to it
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-13 11:51:34 +10:00
Jose Fonseca b5105e67a8 gallium: Use STATIC_ASSERT whenever possible.
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-12 16:56:15 +01:00
Marek Olšák 0ba0933f48 winsys/amdgpu: add support for 64-bit buffer sizes
v2: fail in radeon_winsys_bo_create if size > 32 bits

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:47 +02:00
Marek Olšák 7e78b5ed38 pb_buffer: switch pb_buffer::size to 64 bits
being able to allocate more than 4 GB may be useful

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:47 +02:00
Marek Olšák e599b8f384 gallium: pause queries for all meta ops
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:46 +02:00
Dave Airlie afa8707ba9 softpipe: add SSBO/shader atomics support.
This adds support for the features requires for ARB_shader_storage_buffer_object
and ARB_shader_atomic_counters, ARB_shader_atomic_counter_ops.

[airlied: some cleanups applied]
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-12 14:16:13 +10:00
Dave Airlie c2aeeca455 draw: add support for passing buffers to vs/gs shaders.
Like the image code, but for shader buffers this time.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-12 14:15:36 +10:00
Dave Airlie 081a958bcd tgsi: add support for buffer/atomic operations to tgsi_exec.
This adds support for doing load/store/atomic operations on
buffer objects.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-12 14:15:33 +10:00
Dave Airlie 9c7a0d188a tgsi: set nonhelpermask for vertex shaders
For atomic operations we really need to avoid executing unnecessary shaders, so for some
tests that just draw a single point we only want one vertex to get processed not 4,

this fixes a number of the atomic counters tests.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-12 14:15:16 +10:00
Emil Velikov 594e868555 Part revert "gallium/auxiliary: don't build NIR sources with MSVC2008 flags"
This reverts commit 41c7912d04 but leaves
out the pragma [that inspired the original commit].

Building mesa requires MSVC2013 or later, thus we no longer need this.

v2: Use correct include path (src/glsl/nir -> src/compiler/nir)

Conflicts:
	src/gallium/auxiliary/Makefile.am

Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
2016-04-11 19:08:23 +01:00
Samuel Iglesias Gonsálvez 3663a2397e nir: add bit_size info to nir_load_const_instr_create()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:04 +02:00
Nicolai Hähnle 4bfcc86bf9 tgsi/scan: add an assert for the size of the samplers_declared bitfield
The literal 1 is a (signed) int, and shifting into the sign bit is undefined
in C, so change occurences of 1 to 1u.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-07 13:15:05 -05:00
Nicolai Hähnle cc39879989 draw/aaline: stronger guard against no free samplers (v2)
Line anti-aliasing will fail when there is no free sampler available. Make
the corresponding guard more robust in preparation of raising
PIPE_MAX_SAMPLERS to 32.

The literal 1 is a (signed) int, and shifting into the sign bit is undefined
in C, so change occurences of 1 to 1u.

v2: add an assert for bitfield size and use 1u << idx

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2016-04-07 13:15:05 -05:00
Nicolai Hähnle 040f5cb09e util/pstipple: stronger guard against no free samplers (v2)
When hasFixedUnit is false, polygon stippling will fail when there is no free
sampler available. Make the corresponding guard more robust in preparation
of raising PIPE_MAX_SAMPLERS to 32.

The literal 1 is a (signed) int, and shifting into the sign bit is undefined
in C, so change occurences of 1 to 1u.

v2: add an assert for bitfield size and use 1u << idx

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2016-04-07 13:15:02 -05:00
Edward O'Callaghan 0b7075fed7 gallium: Put no.of {samples,layers} into pipe_framebuffer_state
Here we store the number of samples and layers directly in the
pipe_framebuffer_state so that in the case of
ARB_framebuffer_no_attachment we may make use of them directly.

Further, we adjust various gallium/auxiliary helper functions
accordingly.

V2:
  Convert branches in util_framebuffer_get_num_layers() and
  util_framebuffer_get_num_samples() to their canonical form.

V3:
  'git stash pop' the typo fix of 'cbufs' which should be
  'nr_cbufs' that was missing in V2, woops! Thanks Marek for
  pointing this out yet again.

V4:
  Squash in the following patch:

  'gallium/util: Ensure util_framebuffer_get_num_samples() is valid'

   Upon context creation, internal driver structures are malloc()'ed
   and memset() to zero them. This results in a invalid number of
   samples 'by default'. Handle this in the simplest way to avoid
   elaborate and probably equally sub-optimial solutions.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-07 12:03:58 +10:00
Jose Fonseca 7ad49daca6 gallivm: Introduce lp_format_intrinsic.
For adding .v4f32 like suffixes to intrinsics, taking special care for
scalar case, which was being often neglected.

This fixes invalid IR when doing mipmap filtering on SSE2 (the only
case where we'd use intrinsics with scalars.)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-04 00:06:09 +01:00
Jose Fonseca a293f57e13 gallivm: Use llvm.fabs.
Exactly the same code.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 22:09:09 +01:00
Jose Fonseca e4f01da15d gallivm: Prefer backend agnostic intrinsic for rounding.
We could unconditionally use these instrinsics, but performance with SSE2
would suck, as LLVM falls back to calling libm.

lp_test_arit.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 22:09:07 +01:00
Jose Fonseca 324451e73f gallivm: Add debug option to force SSE2.
For simulating less capable machines.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 22:08:57 +01:00
Jose Fonseca b284f1f7f9 gallivm: Fix performance regressions due to vector selects.
LLVM often can't determine the mask elements are all ones/zeros, and
there doesn't seem to be a good way to hint that.

Thanks to Roland Scheidegger for spotting and analyzing the issue.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 09:51:27 +01:00
Jose Fonseca 11c4e5b45c gallivm: Remove lp_build_load_volatile.
No longer needed.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 09:51:27 +01:00
Jose Fonseca bcfb86b09d gallivm: Use standard LLVMSetAlignment from LLVM 3.4 onwards.
Only provide a fallback for LLVM 3.3.

One less dependency on LLVM C++ interface.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 09:51:27 +01:00
Brian Paul ef10b5427a tgsi: add simple tgsi_is_msaa_target() helper
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-02 08:05:20 -06:00
Bas Nieuwenhuizen 01f993a21f gallium: add threads per block TGSI property
The value 0 for unknown has been chosen to so that
drivers using tgsi_scan_shader do not need to detect
missing properties if they zero-initialize the struct.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-02 01:50:59 +02:00
Jose Fonseca f72de6f386 gallivm: Prevent disassembly debug output from being truncated.
By using os_log_message directly, as _debug_vprintf truncates messages
to 4K.

Also cleanup the disassemble interface.

Spotted by Roland.

Trivial.
2016-04-01 21:22:42 +01:00
Jose Fonseca cdf7c6b83d gallivm: Use vector selects on LLVM 3.3+.
This is an old patch I had around.

Vector selects seem to work well from LLVM 3.3.  Using them should
improve code quality, as it might make constant propagation pass more
effective.

Tested lp_test_*

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-01 09:05:19 +01:00
Samuel Pitoiset d22eca5f90 tgsi: silence compiler warning in fetch_sampler_unit()
The unit variable can be used uninitialized.

Fixes: 24e77cb09 ("tgsi: handle indirect sampler arrays. (v2)")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-01 07:16:24 +10:00
Samuel Pitoiset 05902a6686 tgsi: fix out of bounds access in exec_atomop()
The number of channels must be 4 for all RGBA components.

Fixes: 22d129601 ("tgsi: add support for image operations to tgsi_exec. (v2.1)")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-01 07:15:16 +10:00
Brian Paul 9076e04934 tgsi: split tgsi_util_get_texture_coord_dim() function into two
It was kind of overloaded, returning two different things.  Now get
the index of the shadow reference src register with a new
tgsi_util_get_shadow_ref_src_index() function.

To verify the new code, I added some temp/debug code which looped
over all TGSI_TEXTURE_x values, calling the old function and new and
checking that the returned indexes matched.

Also tested piglit "shadow" tests with softpipe/llvmpipe.
No testing of ilo and radeonsi changes.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:48:00 -06:00
Brian Paul 9d7cd43988 tgsi: skip texture query opcodes when examining texture targets
Should fix the assertion in piglit
spec@arb_gpu_shader5@texturegather@fs-r-none-shadow-2d when the
TXQ instruction specifies a 2D target but the sampler view was
declared as SHADOW2D.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-31 09:47:40 -06:00
Dave Airlie eb9ad9faa3 softpipe: add image support to softpipe (v3)
This adds support for ARB_shader_image_load_store to softpipe.

v2: add RESQ support (Ilia)
v3: constify, cleanup internals, add some comments (Brian).

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:14:16 +10:00
Dave Airlie 0d1f679ded draw: add support for passing images to vs/gs shaders.
This just adds support for passing through images to the
tgsi execution stage.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:14:11 +10:00
Dave Airlie 22d1296013 tgsi: add support for image operations to tgsi_exec. (v2.1)
This adds support for load/store/atomic operations on images
along with image tracking support.

v2: add RESQ support. (Ilia)
v2.1: constify interface (Brian)
split get_image_coord_dim (Brian)

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:14:05 +10:00
Dave Airlie 827393b76f tgsi: introduce NonHelperMask
This is a mask of which of the current 2x2 grid are non-helper
invocations. This allows us to mask off the helper invocations
later for the image operations.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:13:50 +10:00
Dave Airlie ca180c09bb tgsi_exec: handle execmask when doing indirect lookups
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:13:46 +10:00
Dave Airlie 1ff4cc0535 tgsi_exec: add support for up to 3 address registers (v2)
v2: be consistent with other definitions.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:13:08 +10:00
Roland Scheidegger 2d3b8aefda tgsi: (trivial) only verify target for is_tex instructions
d3d10 state tracker does not encode (valid) target (only offsets are
really used from the texture bits), since that information always comes
from the sview dcl, and not the instruction (note the meaning of target
is actually slightly different between gl and d3d10 in any case, because
d3d10 target does never include shadow bit).
Also move the msaa sampler identification as well - would need to set that
on the sview not sampler, so while this does not fix it make it at least
obvious it won't work with sample instructions.
2016-03-30 04:26:54 +02:00
Brian Paul 5c85c3be26 tgsi: simplify tgsi_shader_info::is_msaa_sampler checking
We assert that fullinst->Instruction.Texture != 0 above so no need to
check it in the conditional.  We also have the fullinst->Texture.Texture
value in a local variable, so use it.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-03-29 18:13:46 -06:00
Brian Paul 86e1768c13 tgsi: collect texture sampler target info in tgsi_scan_shader()
Texture sample instructions specify a sampler unit and texture target
such as "1D", "2D", "CUBE", etc.  Sampler view declarations also specify
the sampler unit and texture target.

This patch checks that the texture instructions agree with the declarations
and collects the texture target type for each sampler unit.

v2: only compare instruction's texture target to the sampler view declaration
target if the instruction is a TEX instruction, not a SAMPLE instruction.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-03-29 18:13:46 -06:00
Rovanion Luckey 7087e0ab27 gallium: Format code in pb_buffer_fenced.c according to style guide.
This is a tiny housekeeping patch which does the following:

  * Replaced tabs with three spaces.
  * Formatted oneline and multiline code comments. Some doxygen
    comments weren't marked as such and some code comments were marked
    as doxygen comments.
  * Spaces between if- and while-statements and their parenthesis.

According to the mesa coding style guidelines.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-29 13:44:11 -06:00
Marek Olšák 6262d6125a gallium/util: fix up inaccurate behavior of util_framebuffer_state_equal (v2)
v2: move the nr_cbufs check above the loop

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
2016-03-28 00:46:23 +02:00
Rob Clark 61c7d20e4f ttn: remove stray global from header
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-24 16:04:54 -04:00
Dave Airlie 53afbc980a tgsi: drop unused set_exec/kill_mask interfaces.
These don't get used and haven't been in git history from what I can
see, so drop them.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-22 13:07:05 +10:00
Nicolai Hähnle 79762e877c tgsi/scan: add writes_memory to flag presence of stores or atomics
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:25 -05:00
Nicolai Hähnle 457f9c6b25 tgsi/scan: track which shader images are really buffers
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:22 -05:00
Nicolai Hähnle fa096a14af tgsi/scan: add images_writemask
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:22 -05:00
Brian Paul 38e831ca3d gallium/util: declare sampler view in util_make_fs_blit_msaa_depthstencil()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-21 11:59:25 -06:00
Brian Paul e7b5a844e3 postprocess: declare sampler views in shaders
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-21 11:59:25 -06:00
Brian Paul 5a9f2a2d89 hud: add sampler view declaration in text fragment shader
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-21 11:59:25 -06:00
Brian Paul 63e020d734 gallium/tgsi: pass TGSI tex target to tgsi_transform_tex_inst()
Instead of hard-coded 2D tex target in tgsi_transform_tex_2d_inst()

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-21 11:59:25 -06:00
Hans de Goede 3788e1bf74 tgsi: Add support for global / private / input MEMORY
Extend the MEMORY file support to differentiate between global, private
and shared memory, as well as "input" memory.

"MEMORY[x], INPUT" is intended to access OpenCL kernel parameters, a
special memory type is added for this, since the actual storage of these
(e.g. UBO-s) may differ per implementation. The uploading of kernel
parameters is handled by launch_grid, "MEMORY[x], INPUT" allows drivers
to use an access mechanism for parameter reads which matches with the
upload method.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)
2016-03-21 12:20:24 +01:00
Hans de Goede 43ddec2f43 tgsi: Fix decl.Atomic and .Shared not propagating when parsing tgsi text
When support for decl.Atomic and .Shared was added, tgsi_build_declaration
was not updated to propagate these properly.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)
2016-03-21 12:20:19 +01:00
Hans de Goede b72156c8e0 tgsi: Fix return of uninitialized memory in tgsi_*_instruction_memory
tgsi_default_instruction_memory / tgsi_build_instruction_memory were
returning uninitialized memory for tgsi_instruction_memory.Texture and
tgsi_instruction_memory.Format. Note 0 means not set, and thus is a
correct default initializer for these.

Fixes: 3243b6fc97 ("tgsi: add Texture and Format to tgsi_instruction_memory")
Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-20 18:01:53 -04:00
Marek Olšák fbe6e92899 gallium: add TGSI property NEXT_SHADER
Radeonsi needs to know which shader stage will execute after a shader
in order to make the best decision about which shader variant to compile
first.

This is only set for VS and TES, because we don't need it elsewhere.

VS has 3 variants:
- next shader is FS
- next shader is GS
- next shader is TCS

TES has 2 variants:
- next shader is FS
- next shader is GS

Currently, radeonsi always assumes the next shader is FS, which is suboptimal,
since st/mesa always knows which shader is next if the GLSL program is not
a "separate shader".

By default, ureg always sets "next shader is FS".

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-19 23:20:01 +01:00
Brian Paul e9d5e68d1b tgsi: add tgsi_transform_op3_inst() function
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-18 12:06:30 -06:00
Connor Abbott 3124ce699b nir: add a bit_size parameter to nir_ssa_dest_init
v2: Squash multiple commits addressing the new parameter in different
    files so we don't break the build (Iago)

v3: Fix tgsi (Samuel)

v4: Fix nir_clone.c (Samuel)

v5: Fix vc4 and freedreno (Iago)

v6 (Sam)
- Fix build errors in nir_lower_indirect_derefs
- Use helper to get type size from nir_alu_type.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-17 11:54:45 +01:00
Iago Toral Quiroga 084b24f558 nir: rename nir_const_value fields to include bitsize information
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-17 11:16:33 +01:00
Roland Scheidegger 12a4f0bed6 draw: fix line stippling
The logic was comparing actual ints, not true/false values.
This meant that it was emitting always multiple line segments instead of just
one even if the stipple test had the same result, which looks inefficient, and
the segments also overlapped thus breaking line aa as well.
(In practice, with the no-op default line stipple pattern, for a 10-pixel
long line from 0-9 it was emitting 10 segments, with the individual segments
ranging from 0-1, 0-2, 0-3 and so on.)

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94193

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

CC: <mesa-stable@lists.freedesktop.org>
2016-03-15 19:41:34 +01:00
Nicolai Hähnle 0ffcc318e6 tgsi: add tgsi_full_src_register_from_dst helper function
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:49 -05:00
Nicolai Hähnle c02d73af0b gallium/u_inlines: add util_copy_image_view
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:46 -05:00
Nicolai Hähnle e526f930aa tgsi: add TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:33 -05:00
Nicolai Hähnle dfcf420412 st/glsl_to_tgsi: provide Texture and Format information for image ops
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:26 -05:00
Nicolai Hähnle 3243b6fc97 tgsi: add Texture and Format to tgsi_instruction_memory
Frontends should have this information readily available, and it simplifies
image LOAD/STORE/ATOM* handling especially with indirect image access.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:02 -05:00
Emil Velikov 373f118c6c gallium: do not wrap header inclusion in
Add one missing extern C guard within include/pipe/p_video_enums.h, and
remove the wrapping throughout gallium.

On Haiku one could even use the gallium debug_printf() although
that's another topic.

v2: Leave dbghelp.h as is (Jose)

Cc: Jose Fonseca <jfonseca@vmware.com>
Cc: Brian Paul <brianp@vmware.com>
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-09 17:21:39 +00:00
Nicolai Hähnle 4eb416bd9d gallivm: special case TGSI_OPCODE_STORE
This instruction has the resource (buffer or image) as a destination to
represent the writemask for SSBO writes. However, this is obviously not
a "real" destination for the purpose of emitting LLVM IR.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-09 11:39:55 -05:00
Nicolai Hähnle 10b2b584ee tgsi: set correct output mode for RESQ
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-09 11:39:43 -05:00
Marek Olšák 82db518f15 gallium: add external usage flags to resource_from(get)_handle (v2)
This will allow drivers to make better decisions about texture sharing
for DRI2, DRI3, Wayland, and OpenCL.

v2: add read/write flags, take advantage of __DRI_IMAGE_USE_BACKBUFFER

Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-03-09 15:02:25 +01:00
Samuel Pitoiset 7f8565f0b2 tgsi: fix parsing of shared memory declarations
The SHARED TGSI keyword is only allowed with TGSI_FILE_MEMORY and not
with TGSI_FILE_BUFFER. I have found this by using the nouveau_compiler
from command line.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
2016-03-07 22:13:08 +01:00
Brian Paul 9e6a6bd575 gallium/util: add new comments, assertions in u_debug_refcnt.c
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-05 09:20:34 -07:00
Brian Paul b6a607b221 gallium/util: update comments and URL in u_debug_refcnt.c
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-05 09:20:28 -07:00
Brian Paul cbca6964e2 gallium/util: make stream variable static in u_debug_refcnt.c
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-05 09:20:23 -07:00
Brian Paul fb0abedce7 gallium/util: re-indent u_debug_refcnt.[ch]
Wrap comments to 78 columns, etc.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-05 09:20:14 -07:00
Tim Rowley da4f95d168 gallium/target-helpers: add OpenSWR driver
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-02 18:38:41 -06:00
Tim Rowley ea37602273 gallium/auxilary: more __cplusplus exports
swr driver which is written in C++ needs access to some more
gallium utility functions than are currently exposed.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-02 18:38:41 -06:00
Thomas Hindoe Paaboel Andersen 535002f4da gallium/cso: fix indentation
Only one of these were recently introduced. However, since
we keep copy/pasting the same wrong indentation we should
probably just fix it.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-02 08:55:20 -07:00
Marek Olšák 09bfbd43a0 tgsi/scan: count memory instructions
for radeonsi

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-01 00:11:32 +01:00
Rob Herring 574a92b048 Android: fix build break from nir/glsl move to compiler/
Commits a39a8fbbaa ("nir: move to compiler/") and eb63640c1d
("glsl: move to compiler/") broke Android builds. Fix them.

There is also a missing dependency between generated NIR headers and
several libraries. This isn't a new issue, but seems to have been
exposed by the NIR move.

Built with i915, i965, freedreno, r300g, r600g, vc4, and virgl enabled.

Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Cc: Mauro Rossi <issor.oruam@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-02-29 10:51:44 +00:00
Marek Olšák 190a291b03 tgsi/scan: handle holes between VS inputs, assert-fail in other cases
"st/mesa: overhaul vertex setup for clearing, glDrawPixels, glBitmap"
added a vertex shader declaring IN[0] and IN[2], but not IN[1].

Drivers relying on tgsi_shader_info can't handle holes in declarations,
because tgsi_shader_info doesn't track that.

This is just a quick workaround meant for stable that will work for vertex
shaders.

This fixes radeonsi DrawPixels and CopyPixels crashes.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-02-23 16:42:16 +01:00
Oded Gabbay a3e3c3e621 gallivm: Check whether to stop disassemble only for x86
Because the if statement that checks whether we have a return
statement is valid only on x86, surround it with X86 or X86-64
arch defines

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-19 00:18:11 +02:00
Oded Gabbay b3d42934a1 gallivm: use sstream for dissasembling
Currently, disassemble() directly prints to stdout. This has broke the
profiling support for llvmpipe JIT code.

This patch redirects the output to an sstream object, which is then
either gets printed to stdout (for assembly debugging) or gets written
to a file in /tmp/ (for profiling support).

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-19 00:18:11 +02:00
Rob Clark 5051d85b03 gallium/auxiliary: fix new gcc6 warnings
src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c: In function ‘mm_bufmgr_create_from_buffer’:
src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c:288:4:
warning: statement is indented as if it were guarded by... [-Wmisleading-indentation]
    if(mm->map)
    ^~
src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c:286:1: note:
...this ‘if’ clause, but it is not
 if(mm->heap)
 ^~

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-02-18 17:10:55 -05:00
Rob Clark bba836ea6a gallium/hud: fix new gcc6 warnings
src/gallium/auxiliary/hud/font.c:234:22: warning: ‘Fixed8x13_Character_159’ defined but not used [-Wunused-const-variable]
 static const GLubyte Fixed8x13_Character_159[] = {  9,  0,  0,  0,  0, 0,  0,170,  0,  0,  0,130,  0,  0,  0,130,  0,  0,  0,130,  0,  0, 0,170,  0,  0,  0,  0,  0};
                      ^~~~~~~~~~~~~~~~~~~~~~~
.... many more..

These are simply unused, just #if 0 them out for now, in case someone
wants to use them in the future.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-02-18 17:10:55 -05:00
Samuel Pitoiset dfc95ad6d1 gallium/cso: only enable compute shaders when TGSI is supported
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94186
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-18 20:41:25 +01:00
Roland Scheidegger d335b6abc0 gallivm, tgsi: provide fake sample_i_ms implementations
Just like the rest of the msaa "implementation" it's just fake for now...

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-02-18 05:00:03 +01:00
Tom Stellard 77f4e1c7ff gallivm: Add helpers for creating and destroying TargetLibraryInfo
This functionality is not exposed via the LLVM C API.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-17 19:06:41 +00:00
Brian Paul 1bf8fa8277 cso: add CSO_BITS_ALL_SHADERS
For saving/restoring all shader stages.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-16 10:22:32 -07:00
Brian Paul ffa1a1dd21 cso: make most of the cso_save/restore_x() functions static
Users of the CSO save/restore facility all use the new
cso_save/restore_state() functions instead.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-02-16 10:22:32 -07:00
Brian Paul 223ffd8a08 postprocess: use new cso_save/restore_state() functions
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-02-16 10:22:32 -07:00
Brian Paul 70e8a4f734 gallium/hud: use new cso_save/restore_state() functions
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-02-16 10:22:32 -07:00
Brian Paul 66889d8f84 gallium/util: use new cso_save/restore_state() functions
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-02-16 10:22:32 -07:00
Brian Paul 9403571755 cso: add new cso_save/restore_state() functions
cso_save_state() takes a bitmask of state items to save.  Calling
cso_restore_state() restores those states.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-02-16 10:22:32 -07:00
Brian Paul 017a003f1c cso: remove comment
There's a similar comment just a few lines before.
2016-02-16 10:22:32 -07:00
Brian Paul f7af12ae85 cso: add new cso_set_viewport_dims() helper
To simplify some viewport setting code in the state tracker.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-02-16 10:22:32 -07:00
Matthew Dawson 0bba5ca468 Handle removal of LLVMAddTargetData in SVN revision 260919
LLVM removed LLVMAddTargetData for the 3.9 release in r260919.  For the two
places in mesa where this is called, only enable the lines when compiling
for less then 3.9.

For the radeon driver, I'm not sure how to check if any other LLVM calls need
to be adjusted.  I think since the target data used is extracted from the
LLVMModule, it isn't necessary to pass it back to LLVM again.

The code does compile, and at least for radeonsi does run OpenGL games.

[ Michel Dänzer: Move #if closer to LLVMAddTargetData in lp_bld_init.c,
  and add HAVE_LLVM < 0x0309 guards around now unused occurrences of TD
  and data_layout ]

Signed-off-by: Matthew Dawson <matthew@mjdsystems.ca>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2016-02-16 16:18:35 +09:00
Ilia Mirkin e2a1ec5f0f tgsi: show textual format representation
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-02-15 22:22:33 -05:00
Ilia Mirkin 9fbfa1abb2 gallium: add PIPE_SHADER_CAP_MAX_SHADER_IMAGES
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-02-15 22:22:33 -05:00
Ilia Mirkin bceff68114 gallium: make image views non-persistent objects
Make them akin to shader buffers, with no refcounting/etc. Just used to
pass data about the bound image in ->set_shader_images.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-15 22:22:33 -05:00
Samuel Pitoiset a8328e3a50 tgsi/ureg: add shared variables support for compute shaders
This introduces TGSI_FILE_MEMORY for shared, global and local memory.
Only shared memory is currently supported.

Changes from v2:
 - introduce TGSI_FILE_MEMORY

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-13 15:51:17 +01:00
Samuel Pitoiset 5e09ac78e5 gallium: add PIPE_SHADER_CAP_SUPPORTED_IRS
This cap indicates the supported representations of programs. It should
be a mask of pipe_shader_ir bits. It will allow to enable
ARB_compute_shader if the underlying driver supports TGSI.

Changes from v2:
 - improve description of PIPE_SHADER_CAP_SUPPORTED_IRS

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-13 15:51:17 +01:00
Samuel Pitoiset 61ed09c7ea gallium/cso: add support for compute shaders
Changes from v2:
 - removed cso_{save,restore}_compute_shader() functions and the
   compute_shader_saved variable because disabling compute shaders for
   meta ops is not currently needed

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-13 15:51:17 +01:00
Jose Fonseca 0d4898ae80 include,gallium: Remove pre-MSVC 2013 compatibility.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-11 21:36:00 +00:00
Jose Fonseca a97a955b92 scons: Eliminate MSVC2008 compatibility.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-11 21:36:00 +00:00
Jose Fonseca 1cadfe08c4 configure: Eliminate MSVC2008 compatibility.
We no longer need to build any part of Mesa with Windows SDK 7.0.7600 or
MSVC 2008.  MSVC 2013 will be the oldest we support.

In practice this means people are now free to declare variables in the
middle of blocks, on the whole Mesa tree.

Care should still be taken with variable length arrays and void pointer
arithmetic.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Hella-acked-by: Ian Romanick <ian.d.romanick@intel.com>
2016-02-11 21:36:00 +00:00
Jason Ekstrand 5ec456375e nir: Separate texture from sampler in nir_tex_instr
This commit adds the capability to NIR to support separate textures and
samplers.  As it currently stands, glsl_to_nir only sets the texture deref
and leaves the sampler deref alone as it did before and nir_lower_samplers
assumes this.  Backends can still assume that they are combined and only
look at only at the texture index.  Or, if they wish, they can assume that
they are separate because nir_lower_samplers, tgsi_to_nir, and prog_to_nir
all set both texture and sampler index whenever a sampler is required (the
two indices are the same in this case).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-02-09 15:00:17 -08:00
Jason Ekstrand ee85014b90 nir/tex_instr: Rename sampler to texture
We're about to separate the two concepts.  When we do, the sampler will
become optional.  Doing a rename first makes the separation a bit more
safe because drivers that depend on GLSL or TGSI behaviour will be fine to
just use the texture index all the time.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-02-09 15:00:17 -08:00
Rob Clark ead05e8670 ttn: use const_index helpers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-02-09 17:30:33 -05:00
Rob Clark b1770235ed ttn: small logic cleanup
The only case where dim!=NULL is where op==load_ubo.  But using
op==load_ubo is less confusing.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-02-09 17:30:33 -05:00
Nicolai Hähnle c260175677 draw: use util_pstipple_* function for stipple pattern textures and samplers
This reduces code duplication.

Suggested-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-02-09 10:01:57 -05:00
Nicolai Hähnle 452e51bf1e draw: use util_pstipple_create_fragment_shader
This reduces code duplication. It also adds support for drivers where the
fragment position is a system value.

Suggested-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-02-09 10:01:32 -05:00
Brian Paul 59251610ed tgsi: minor whitespace fixes in tgsi_scan.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-02-08 09:29:38 -07:00
Brian Paul 42246ab1f5 tgsi: s/true/TRUE/ in tgsi_scan.c
Just to be consistent.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-02-08 09:29:38 -07:00
Brian Paul da6e879a6c tgsi: use switches instead of big if/else ifs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-02-08 09:29:38 -07:00
Brian Paul 37eb3f0400 tgsi: break gigantic tgsi_scan_shader() function into pieces
New functions for examining instructions, declarations, etc.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-02-08 09:29:38 -07:00
Brian Paul 6691ba1fe8 gallium/util: whitespace, formatting fixes in u_debug_stack.c 2016-02-08 09:29:38 -07:00
Brian Paul 5d2539cb49 gallium/util: whitespace, formatting fixes in u_staging.[ch] files
Still some nonsensical comments.
2016-02-08 09:29:38 -07:00
Brian Paul c84a8911fc gallium/util: switch over to new u_debug_image.[ch] code
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-08 09:29:38 -07:00
Brian Paul 3917c8f3f9 gallium/util: put image dumping functions into separate file
To try to reduce the clutter in u_debug.[ch]

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-08 09:29:38 -07:00
Brian Paul 6c7d4a7173 gallium/util: whitespace, formatting fixes in u_debug.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-08 09:29:38 -07:00
Samuel Pitoiset 04c2ca5038 tgsi: use TGSI_WRITEMASK_XYZW instead of hardcoding the mask
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Serge Martin <edb+mesa@sigluy.net>
2016-02-06 20:24:41 +01:00
Patrick Rudolph 2a4d1509c8 st/nine: Fix use of uninitialized memory
The values of box.z and box.depth weren't set and lead to a crash.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Leo Liu 4f598f2173 vl: add zig zag scan for list 4x4
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-02-02 20:29:43 -05:00
Marek Olšák 84a6d2d7d6 tgsi/scan: add tgsi_shader_info::reads_samplemask 2016-02-02 21:04:52 +01:00
Marek Olšák f96f94966d tgsi: set correct src type for UP2H
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-02 21:02:26 +01:00
Roland Scheidegger 7221b8aec6 gallivm: add PK2H/UP2H support
Add support for these opcodes, the conversion functions were already
there albeit need some new packing stuff.
Just like the tgsi version, piglit won't like it for all the same
reasons, so it's disabled (UP2H passes piglit arb_shader_language_packing
tests, albeit since PK2H won't due to those rounding differences I don't
know if that one works or not as the piglit test is rather difficult to
deal with).

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-02-02 05:58:20 +01:00
Roland Scheidegger 5171ec9ca9 gallivm: add PK2H/UP2H support
Add support for these opcodes, the conversion functions were already
there albeit need some new packing stuff.
Just like the tgsi version, piglit won't like it for all the same
reasons, so it's disabled (UP2H passes piglit arb_shader_language_packing
tests, albeit since PK2H won't due those rounding differences I don't
know if that one works or not as the piglit test is rather difficult to
deal with).
2016-02-02 05:58:19 +01:00
Roland Scheidegger dc16086e3b tgsi: add PK2H/UP2H support
The util functions handle the half-float conversion.
Note that piglit won't like it much due to:
a) The util functions use magic float mul conversion but when run inside
softpipe/llvmpipe, denorms are flushed to zero, therefore when the conversion
is from/to f16 denorm the result will be zero. This is a bug which should be
fixed in these functions (should not rely on denorms being available), but
will happen elsewhere just the same (e.g. conversion to f16 render targets).
b) The util functions use trunc round mode rather than round-to-nearest. This
is NOT a bug (as it is a d3d10 requirement). This will result of rounding not
representable finite values to MAX_F16 rather than INFINITY. My belief is the
piglit tests are wrong here but it's difficult to tell (generally glsl
rounding mode is undefined, however I'm not sure if rounding mode might need
to be consistent for different operations). Nevertheless, for gl it would be
better to use round-to-nearest, but using different rounding for GL and d3d10
is an unsolved problem (as it affects things like conversion to f16 render
targets, clear colors, this shader opcode).

Hence for now don't enable the cap bit (so the code is unused).
(Code is from imirkin, comment from sroland)

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmvware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-02-02 05:58:19 +01:00
Roland Scheidegger 116e4dc995 mesa: fix typo in python scripts
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-02-02 05:58:19 +01:00
Rob Herring f87330dbce virgl: reuse screen when fd is already open
It is necessary to share the screen between mesa and gralloc to
properly ref count resources. This implements a hash lookup on
the file description to re-use an already created screen. This is
a similar implementation as freedreno and radeon.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-02-02 09:58:29 +10:00
François Tigeot a48afb92ff gallium: Add DragonFly support
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-01-31 11:56:09 +00:00
Ilia Mirkin 2ccc42fd2c tgsi: add MEMBAR opcode to handle memoryBarrier* GLSL intrinsics
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
v1 -> v2: add defines for the various bits
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-01-29 21:04:36 -05:00
Emil Velikov eb63640c1d glsl: move to compiler/
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-01-26 16:08:33 +00:00
Emil Velikov a39a8fbbaa nir: move to compiler/
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-01-26 16:08:30 +00:00
Emil Velikov 1a882fd2ee nir: move shader_enums.[ch] to compiler
This way one can reuse it in glsl, nir or other infrastructure without
pulling nir as dependency.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-01-26 16:08:20 +00:00
Roland Scheidegger e925ec8811 llvmpipe,i915: add back NEW_RASTERIZER dependency when computing vertex info
I removed this mistakenly in 2dbc20e456. I
actually thought it should not be necessary and a piglit run didn't show
any differences, but this shouldn't have been in there.
draw_prepare_shader_outputs() is in fact dependent on NEW_RASTERIZER.
The new polygon-mode-facing test indeed shows why this is necessary, there's
lots of invalid reads and writes with valgrind (also crashes without
valgrind), because the pre-pipeline vertex size doesn't match the
post-pipeline vertex size (note this won't help much with stages which don't
have the prepare hook which can grow the vertex size, in particular the wide
point stage, but this isn't used by llvmpipe). The test still won't pass, of
course, but it is only usage of uninitialized values now, which is much
less dangerous...
(Albeit I'm pretty sure for i915 it really is not needed anymore as it
doesn't care about the extra outputs and doesn't call
draw_prepare_shader_outputs().)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-01-21 00:09:55 +01:00
Nicolai Hähnle e6281a2850 util/u_pstipple.c: copy immediates during transformation
Apparently, nobody has combined stippling with a fragment shader
containing immediates in almost five years...

Fixes a bug in Kodi with radeonsi reported by Christian König.

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Tested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-19 10:52:35 -05:00
Emil Velikov c03f3dd0a5 gallium: bundle the compat header u_pwr8.h in the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-01-18 13:37:58 +02:00
Oded Gabbay 679a654a77 llvmpipe: use vpkswss when dst is signed
This patch fixes a bug when building a pack instruction.

For POWER (altivec), in case the destination is signed and the
src width is 32, we need to use vpkswss. The original code used vpkuwus,
which emits an unsigned result.

This fixes the following piglit tests on ppc64le:
- spec@arb_color_buffer_float@gl_rgba8-drawpixels
- shaders@glsl-fs-fogscale

I've also corrected some coding style issues in the function.

v2: Returned else statements to vmware style

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-01-18 09:45:25 +02:00
Ilia Mirkin 3281ae96c8 tgsi: initialize Atomic field in tgsi_default_declaration
Spotted by Coverity.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-17 16:37:04 -05:00
Oded Gabbay 529aa8249a llvmpipe: fix arguments order given to vec_andc
This patch fixes a classic "confuse the enemy" bug.

_mm_andnot_si128 (SSE) and vec_andc (VMX) do the same operation, but the
arguments are opposite.

_mm_andnot_si128 performs "r = (~a) & b" while
vec_andc performs "r = a & (~b)"

To make sure this error won't return in another place, I added a wrapper
function, vec_andnot_si128, in u_pwr8.h, which makes the swap inside.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-01-17 21:07:27 +02:00
Rob Clark ebd3a1fc17 ttn: use writemask for store_var
Only user is freedreno, and after array-rework it can cope.  Avoids
generating loads for a store.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-01-16 14:21:52 -05:00
Rob Clark 6f0377d651 ttn: add missing writemask on store_output
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-01-16 13:35:44 -05:00
Jeff Muizelaar e5fefe49f2 gallivm: avoid crashing in mod by 0 with llvmpipe
This adds code that is basically the same as the code in umod, udiv and idiv.
However, unlike idiv we return -1.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-01-16 03:36:29 +01:00
Roland Scheidegger 9422999e40 draw: fix key comparison with uninitialized value
Discovered by accident, valgrind was complaining (could have possibly caused
us to create redundant geometry shader variants).

v2: convinced by Brian and Jose, just use memset for both gs and vs keys,
just as easy and less error prone.
2016-01-13 02:43:04 +01:00
Christian König 6f898f740c vl: use preferred format for deinterlacing
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-01-12 13:28:42 +01:00
Christian König 5fdd4a5aef vl: improve motion adaptive deinterlacer
Handle other formats than YV12 as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-01-12 13:28:39 +01:00
Christian König 52ca9a9b8b vl/buffers: extract vl_video_buffer_adjust_size helper
Useful for the state trackers as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-01-12 13:28:16 +01:00
Erik Faye-Lund 2a15dc0dd5 gallium/util: removed unused header-file
This hasn't been in use since c476305 ("gallium/util: pregenerate
half float tables"), where the last bit of run-time init using this
was killed. So let's just get rid of the pointless header.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-01-12 11:02:08 +11:00
Ilia Mirkin 90ba06618e gallium: add a RESQ opcode to query info about a resource
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-08 15:10:33 -05:00
Ilia Mirkin 266d001261 gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERS
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-08 15:10:33 -05:00
Ilia Mirkin bdef02ff26 tgsi: add a is_store property
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-08 15:10:33 -05:00
Ilia Mirkin 50b8488926 tgsi: provide a way to encode memory qualifiers for SSBO
Each load/store on most hardware can specify what caching to do. Since
SSBO allows individual variables to also have separate caching modes,
allow loads/stores to have the qualifiers instead of attempting to
encode them in declarations.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-08 15:10:32 -05:00
Ilia Mirkin 888ddd632d ureg: add buffer support to ureg
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-08 15:10:32 -05:00
Ilia Mirkin 8cc9a8aa2a tgsi: add ureg support for image decls
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-08 15:10:32 -05:00
Marek Olšák d0cf66d835 vl: allow fragment shader POSITION to be a system value
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-01-08 20:07:16 +01:00
Marek Olšák 69f43c2cc9 util/pstipple: allow fragment shader POSITION to be a system value
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-01-08 20:07:16 +01:00
Marek Olšák c00e534283 tgsi/scan: update for POSITION and FACE sytem values
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-01-08 20:07:15 +01:00
Marek Olšák c07cf5f5a9 tgsi/ureg: handle redundant declarations in ureg_DECL_system_value
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-01-08 20:06:22 +01:00
Marek Olšák c886422656 tgsi/ureg: remove index parameter from ureg_DECL_system_value
It can be trivially derived from the number of already declared system
values. This allows ureg users not to worry about which index to choose.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-01-08 20:06:22 +01:00
Edward O'Callaghan b42254eff3 gallium/aux: Use TGSI chan name defines inplace of literals
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-08 12:18:24 -05:00
Roland Scheidegger 2923c7a0ed llvmpipe: do 64bit plane calculations in the sse path
The sse path was pretty much disabled for practical purposes because the
largest allowed fb size was 128x128. So, adapt it for 64bit plane calculations.
This is actually not that difficult, though a problem is that we can't do
a signed 32x32->64bit mul, only unsigned, so need to fix that up. Overall,
the code still looks reasonable, though it's not like changes there in
setup really make much of a difference in the end...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-01-08 00:34:14 +01:00
Roland Scheidegger 9db7309595 draw: initialize prim header flags when clipping lines
Otherwise, clipped lines would have undefined stippling reset bit if line
stippling is enabled.
(Untested, and I just assume copying over the bits from the original line
is actually the right thing to do.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-01-08 00:34:13 +01:00
Roland Scheidegger 64da11f052 draw: fix line stippling with unfilled prims
The unfilled stage was not filling in the prim header, and the line stage
then decided to reset the stipple counter or not based on the uninitialized
data. This causes some failures in conform linestipple test (albeit quite
randomly happening depending on environment).
So fill in the prim header in the unfilled stage - I am not entirely sure
if anybody really needs determinant after that stage, but there's at least
later stages (wide line for instance) which copy over the determinant as well.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-01-08 00:34:13 +01:00
Oded Gabbay f41b6cfb07 llvmpipe: use sse2 conv code for altivec
In lp_build_conv() and lp_build_conv_auto(), there is a special case of
conversion when sse2 is present. That code path is suitable without any
changes to altivec, because all the functions that are called in that
code path already support altivec.

This patch increase the FPS in POWER arch across the board
between 10%-25%

I checked ipers, glxgears, glxspheres64, openarena, xonotic and glmark2.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-01-07 22:07:02 +02:00
Marek Olšák ff7e77724e tgsi/scan: set which color components are read by a fragment shader
This will be used by radeonsi.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:05 +01:00
Marek Olšák 18ec76730a tgsi/scan: fix tgsi_shader_info::reads_z
This has no users in Mesa.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:05 +01:00
Marek Olšák f3658be108 tgsi/scan: set if a fragment shader writes sample mask
This will be used by radeonsi.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-07 18:26:05 +01:00
Roland Scheidegger 2dbc20e456 draw: nuke the interp parameter from vertex_info
draw emit couldn't care less what the interpolation mode is...
This somehow looked like it would matter, all drivers more or less
dutifully filled that in correctly. But this is only used for emit,
if draw needs to know about interpolation mode (for clipping for instance)
it will get that information from the vs anyway.
softpipe actually used to depend on that interpolation parameter, as it
abused that structure quite a bit but no longer.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-01-07 01:58:05 +01:00
Roland Scheidegger afa035031f draw: rework handling of non-existing outputs in emit code
Previously the code would just redirect requests for attributes which
don't exist to use output 0. Rework this to output all zeros instead which
seems more useful - in particular some extensions like
ARB_fragment_layer_viewport require 0 in the fs even if it wasn't output by
previous stages. That way, drivers don't have to special case this depending
if the vs/gs outputs some attribute or not.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-01-07 01:52:39 +01:00
Edward O'Callaghan 5071c192cc gallium: Use unsigned for loop index
Found-by: Coccinelle
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-01-06 08:04:03 -07:00
Edward O'Callaghan 67d4b4b28c gallium: Remove unnecessary semicolons
Fix silly issue with MSVC case fall-though support to need
a extra 'break;'

Found-by: Coccinelle
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-01-06 08:04:03 -07:00
Oded Gabbay e99555ef0b llvmpipe: add POWER8 portability file - u_pwr8.h
This file provides a portability layer that will make it easier to convert
SSE-based functions to VMX/VSX-based functions.

All the functions implemented in this file are prefixed using "vec_".
Therefore, when converting from SSE-based function, one needs to simply
replace the "_mm_" prefix of the SSE function being called to "vec_".

Having said that, not all functions could be converted as such, due to the
differences between the architectures. So, when doing such
conversion hurt the performance, I preferred to implement a more ad-hoc
solution. For example, converting the _mm_shuffle_epi32 needed to be done
using ad-hoc masks instead of a generic function.

All the functions in this file support both little-endian and big-endian
but currently the file is build only on POWER8 LE machine.

All of the functions are implemented using the Altivec/VMX intrinsics,
except one where I needed to use inline assembly (due to missing
intrinsic).

v2:
- Use vec_vgbbd instead of __builtin_vec_vgbbd
- Add an aligned load function
- Don't use typeof()
- Make file build only on POWER8 LE machine

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-01-06 14:54:16 +02:00
Brian Paul f4caa7d2fc draw: minor indentation fix 2016-01-05 13:03:05 -07:00
Brian Paul 95d412181d util: add debug_dump_ubyte_rgba_bmp()
Like debug_dump_float_rgba_bmp() but takes ubyte values.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-01-05 13:03:04 -07:00
Ilia Mirkin 459e4532af tgsi: update PK2H/UP2H channel behavior info
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-01-03 16:20:27 -05:00
Marek Olšák ecb2da1559 u_upload_mgr: allow specifying PIPE_USAGE_* for the upload buffer
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-02 15:15:45 +01:00
Marek Olšák 37d0aea772 u_upload_mgr: remove alignment parameter from u_upload_create
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-02 15:15:45 +01:00
Marek Olšák 1bb79c3a7b u_upload_mgr: pass alignment to u_upload_buffer manually
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-02 15:15:44 +01:00
Marek Olšák e0f932846c u_upload_mgr: pass alignment to u_upload_data manually
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-02 15:15:44 +01:00
Marek Olšák 020009f7cc u_upload_mgr: pass alignment to u_upload_alloc manually
The fixed alignment of u_upload_mgr will go away.
This is the first step.

The motivation is that one u_upload_mgr can have multiple users,
each allocating from the same buffer, but requiring a different alignment.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-02 15:15:44 +01:00
Marek Olšák ffc4716e97 u_upload_mgr: rework the application of alignment
The function only aligned the size, but not the offset.
The offset was aligned only when the previous suballocation was aligned.
That yielded the correct offset alignment if the alignment was constant
for all suballocations.

Instead, directly align the offset, but allow an unaligned size.
There is no change in behavior, because the alignment is constant
at the moment.

This a prerequisite for allowing a variable alignment for suballocations.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-02 15:15:44 +01:00
Ilia Mirkin bb52ea45cc gallium: add baseinstance/drawid semantics
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-12-30 16:55:56 -05:00
Jason Ekstrand 0119773ffc nir/builder: Add an init function that creates a simple shader for you
A hugely common case when using nir_builder is to have a shader with a
single function called main.  This adds a helper that gives you just that.
This commit also makes us use it in the NIR control-flow unit tests as well
as tgsi_to_nir and prog_to_nir.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-12-29 13:44:05 -08:00
Nicolai Hähnle 4711170239 gallium/util: add DEBUG_GET_ONCE_OPTION
This is analogous to the alreading existing macros for BOOL, NUM, and FLAGS.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-12-29 09:06:57 -05:00
Jason Ekstrand 237f2f2d8b nir: Get rid of function overloads
When Connor originally drafted NIR, he copied the same function+overload
system that GLSL IR had with a few names changed.  However, this
double-indirection is not really needed and has only served to confuse
people.  Instead, let's just have functions which may not have unique names
and may or may not have an implementation.  If someone wants to do overload
resolving, they can hav a hash table based function+overload system in the
overload resolving pass.  There's no good reason to keep it in core NIR.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>

ir3 bits are

Reviewed-by: Rob Clark <robclark@gmail.com>
2015-12-28 09:59:53 -08:00
Connor Abbott 41c7912d04 gallium/auxiliary: don't build NIR sources with MSVC2008 flags
NIR has never been built with MSVC2008, so we shouldn't add
MSVC2008_COMPAT_CFLAGS to anything that uses it. This allows us to get
rid of the pragma in tgsi_to_nir.c.

Build tested with freedreno.

v2: Use MSVC2013_COMPAT_CLFAGS instead.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
2015-12-23 20:46:48 -05:00
Kenneth Graunke 7d539080c1 nir: Add a writemask to store intrinsics.
Tessellation control shaders need to be careful when writing outputs.
Because multiple threads can concurrently write the same output
variables, we need to only write the exact components we were told.

Traditionally, for sub-vector writes, we've read the whole vector,
updated the temporary, and written the whole vector back.  This breaks
down with concurrent access.

This patch prepares the way for a solution by adding a writemask field
to store_var intrinsics, as well as the other store intrinsics.  It then
updates all produces to emit a writemask of "all channels enabled".  It
updates nir_lower_io to copy the writemask to output store intrinsics.

Finally, it updates nir_lower_vars_to_ssa to handle partial writemasks
by doing a read-modify-write cycle (which is safe, because local
variables are specific to a single thread).

This should have no functional change, since no one actually emits
partial writemasks yet.

v2: Make nir_validate momentarily assert that writemasks cover the
    complete value - we shouldn't have partial writemasks yet
    (requested by Jason Ekstrand).

v3: Fix accidental SSBO change that arose from merge conflicts.

v4: Don't try to handle writemasks in ir3_compiler_nir - my code
    for indirects was likely wrong, and TTN doesn't generate partial
    writemasks today anyway.  Change them to asserts as requested by
    Rob Clark.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v3]
2015-12-22 15:57:59 -08:00
Matt Turner c8a74e3a4e nir: Delete bany, ball, fany, fall.
As in the previous patches, these can be implemented as

   any(v) -> any_nequal(v, false)
   all(v) -> all_equal(v, true)

and their removal simplifies the code in the next patch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-12-18 13:20:13 -05:00
Roland Scheidegger 61e5f8d073 gallium/util: (trivial) include p_shader_tokens.h in u_simple_shaders.h
as it uses definition from it (enum tgsi_return_type).
2015-12-18 01:02:16 +01:00
Roland Scheidegger 6743c68a11 draw: fix clip test with NaNs
NaNs mean it should be clipped, otherwise the NaNs might get passed to the
next stages (if clipping didn't happen for another reason already), which
might cause all kind of problems.
The llvm path got this right already (possibly by luck), but this isn't used
when there's a gs active.
Found by code inspection, verified with some hacked piglit test and some more
hacked debug output.
(Note the clipper can still itself incorrectly generate NaN and INF position
values in its output prims (at least after w divide / viewport transform) even
if the inputs weren't NaNs, if the position data of the vertices is
"sufficiently bad".)

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-12-18 00:57:07 +01:00
Roland Scheidegger 44e87b7b7b draw: fix pstipple and aaline stages wrt sampler_views/samplers
Those stages only really work for OGL-style texturing (so number of samplers
and views mostly the same, certainly for the max values).
These get often set up all at once, thus there might be max number of both
even if all of them are just NULL. We must not set the max number of samplers
and views to the same value since that will lead to terrible things if a driver
supports more views than samplers (and the state tracker set up all the views).
(This will not make these stages magically work if a shader uses dx10-style
texturing, they might still replace an actually used sview in that case.)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-12-18 00:55:35 +01:00
Roland Scheidegger 8e195a6251 draw: handle edge flags in llvm path
We just ignored them altogether. While this feature is rather old-fashioned
supporting it is actually rather trivial.
This fixes the associated piglit tests (2 gl-1.0-edgeflag, 2 gl-2.0-edgeflag
and all (7) of point-vertex-id).

v2: comment fixes, and make the use of the edgeflag in clipmask consistent
with when it's actually there (should be impossible to hit a case where the
difference would actually matter but still...)

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-12-16 03:55:25 +01:00
Roland Scheidegger 13c0b1c780 draw: don't set start_instance and instance id for pt emit
This just adds confusion, these parameters are used when fetching vertices
by translate, but certainly not when emitting hw vertices for drivers, they
make no sense there (setting them has no consequences otherwise since there
won't be any elements with instance_divisor set). So just set them to 0 (the
draw_pipe_vbuf code for emitting vertices when the draw pipeline is run
already does exactly that).
Also while here do some whitespace cleanup.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-12-16 03:55:14 +01:00
Roland Scheidegger 8e264765a4 draw: remove clip_vertex from vertex header
vertex header had both clip_pos and clip_vertex.
We only really need one (clip_pos) because the draw llvm shader would
overwrite the position output from the vs with the viewport transformed.
However, we don't really need the second one, which was only really used
for gl_ClipVertex - if the shader didn't have that the values were just
duplicated to both clip_pos and clip_vertex. So, just use this from the vs
output instead when we actually need it.
Also change clip debug to output both the data from clip_pos and the
clipVertex output (if available).
Makes some things more complex, some things less complex, but seems more
easy to understand what clipping actually does (and what values it uses
to do its magic).

Reviewed-by: Brian Paul <brianp@vmware.com
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-12-15 02:03:40 +01:00
Roland Scheidegger 1775400a20 draw: use clip_pos, not clip_vertex for the fake guardband xy point clipping
Seems obvious now this should use the data from position and not clip_vertex
(albeit might not really make a difference).

Reviewed-by: Brian Paul <brianp@vmware.com
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-12-15 02:03:40 +01:00
Roland Scheidegger 8575ddb644 draw: rename vertex header members
clip -> clip_vertex and pre_clip_pos -> clip_pos.
Looks more obvious to me what these values actually represent (so use
something resembling the vs output names).

Reviewed-by: Brian Paul <brianp@vmware.com
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-12-15 02:03:40 +01:00
Roland Scheidegger 1b22815af6 draw: don't pretend have_clipdist is per-vertex
This is just for code cleanup, conceptually the have_clipdist really
isn't per-vertex state, so don't put it there (just dependent on the
shader). Even though there wasn't really any overhead associated with
this, we shouldn't store random shader information in the vertex header.

Reviewed-by: Brian Paul <brianp@vmware.com
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-12-15 02:03:40 +01:00
Roland Scheidegger 9e3f2af3c3 draw: use position not clipVertex output for xyz view volume clipping
I'm pretty sure this should use position (i.e. pre_clip_pos) and not
the output from clipVertex. Albeit piglit doesn't care. It is what we
use in the clip test, and it is what every other driver does (as they
don't even have clipVertex output and lower the additional planes to
clip distances).

Reviewed-by: Brian Paul <brianp@vmware.com
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-12-15 02:03:40 +01:00
Brian Paul c877f1aeef util/blitter: minor formatting fixes 2015-12-11 16:53:20 -07:00
Roland Scheidegger 6c2c1e0ffe draw: don't assume fixed offset for data in struct vertex_info
Otherwise, if struct vertex_info is changed, you're in for some surprises...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-12-11 20:09:21 +01:00
Marek Olšák 3fbf250dfa gallium/pb_bufmgr_cache: use the new pb_cache module
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2015-12-11 15:25:12 +01:00
Marek Olšák 2b396eeed9 gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager
This simplified (basically duplicated) version of pb_cache_manager will
allow removing some ugly hacks from radeon and amdgpu winsyses and
flatten simplify their design.

The difference is that winsyses must manually add buffers to the cache
in "destroy" functions and the cache doesn't know about the buffers before
that. The integration is therefore trivial and the impact on the winsys
design is negligible.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2015-12-11 15:25:12 +01:00
Marek Olšák eb4813a952 tgsi/scan: add flag colors_written
This is a prerequisite for the following r600g fix.

Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-11 15:25:11 +01:00
Roland Scheidegger 64c59b0624 draw: fix clipping with linear interpolated values and gl_ClipVertex
Discovered this when working on other clip code, apparently didn't work
correctly - the combination of linear interpolated values and using
gl_ClipVertex produced wrong values (failing all such combinations
in piglits glsl-1.30 interpolation tests, named
interpolation-noperspective-XXX-vertex).
Use the pre-clip-pos values when determining the interpolation factor to
fix this.
Noone really understands this code well, but everybody agrees this looks
sane... This fixes all those failing tests (10 in total) both with
the llvm and non-llvm draw paths, with no piglit regressions.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-12-11 02:21:39 +01:00
Jason Ekstrand 78b81be627 nir: Get rid of *_indirect variants of input/output load/store intrinsics
There is some special-casing needed in a competent back-end.  However, they
can do their special-casing easily enough based on whether or not the
offset is a constant.  In the mean time, having the *_indirect variants
adds special cases a number of places where they don't need to be and, in
general, only complicates things.  To complicate matters, NIR had no way to
convdert an indirect load/store to a direct one in the case that the
indirect was a constant so we would still not really get what the back-ends
wanted.  The best solution seems to be to get rid of the *_indirect
variants entirely.

This commit is a bunch of different changes squashed together:

 - nir: Get rid of *_indirect variants of input/output load/store intrinsics
 - nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect
 - nir/lower_io: Get rid of load/store_foo_indirect
 - i965/fs: Get rid of load/store_foo_indirect
 - i965/vec4: Get rid of load/store_foo_indirect
 - tgsi_to_nir: Get rid of load/store_foo_indirect
 - ir3/nir: Use the new unified io intrinsics
 - vc4: Do all uniform loads with byte offsets
 - vc4/nir: Use the new unified io intrinsics
 - vc4: Fix load_user_clip_plane crash
 - vc4: add missing src for store outputs
 - vc4: Fix state uniforms
 - nir/lower_clip: Update to the new load/store intrinsics
 - nir/lower_two_sided_color: Update to the new load intrinsic

NIR and i965 changes are

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

NIR indirect declarations and vc4 changes are

Reviewed-by: Eric Anholt <eric@anholt.net>

ir3 changes are

Reviewed-by: Rob Clark <robdclark@gmail.com>

NIR changes are

Acked-by: Rob Clark <robdclark@gmail.com>
2015-12-10 12:25:16 -08:00
Patrick Rudolph 79bff488bc gallium/util: return correct number of bound vertex buffers
In case a state tracker unbinds every slot by a seperate
pipe->set_vertex_buffers() call, starting from slot zero, the number
of bound buffers would not reach zero at all.
The current algorithm does not account for pre-existing holes in the
buffer list.

Unbinding all buffers at once or starting at the top-most slot results
in correct behaviour.

Calculating the correct number of bound buffers fixes a NULL pointer
dereference in nvc0_validate_vertex_buffers_shared().

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93004
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
2015-12-10 13:55:53 -05:00
Edward O'Callaghan f32f80e19d gallium/util: Make u_prims_for_vertices() safe
Let us avoid trapping in hardware from a SIGFPE and instead
assert on a zero divisor.

Hint: This can occur if a PIPE_PRIM_? is not handled in
      u_prim_vertex_count() that results in ' info ' not
      being initialized in the expected manner.

Further, we also fix a possibly NULL pointer dereference
from ' info ' being NULL from a u_prim_vertex_count() call.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-12-09 22:51:56 +01:00
Dave Airlie a13b14930d llvmpipe: fix fp64 inputs to geom shader.
This fixes the fetching of fp64 inputs to the geometry shader,

this fixes the recently posted piglit's
arb_gpu_shader_fp64/execution/gs-fs-vs-double-array.shader_test
arb_vertex_attrib_64bit/execution/gs-fs-vs-attrib-double-array.shader_test

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-12-09 13:56:39 +10:00
Brian Paul 5effc3ae74 gallium/util: check callback pointers for non-null in pipe_debug_message()
So the callers don't have to do it.

v2: also check cb!=NULL in the macro

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-12-07 08:56:51 -07:00
Edward O'Callaghan d108b69d2c gallium: Remove redundant NULL ptr checks
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-12-06 17:10:23 +01:00
Edward O'Callaghan 150c289f60 gallium/auxiliary: Sanitize NULL checks into canonical form
Use NULL tests of the form `if (ptr)' or `if (!ptr)'.
They do not depend on the definition of the symbol NULL.
Further, they provide the opportunity for the accidental
assignment, are clear and succinct.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-12-06 17:10:23 +01:00
Edward O'Callaghan 147fd00bb3 gallium/auxiliary: Trivial code style cleanup
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-12-06 17:10:22 +01:00
Edward O'Callaghan 34782eec31 gallium/auxiliary: Fix zero integer literal to pointer comparison
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-12-06 17:10:02 +01:00
Ilia Mirkin a3f90ef0a6 gallium/util: fix pipe_debug_message macro to allow 0 args
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2015-12-04 15:24:17 -05:00
Roland Scheidegger 51140f452a draw: fix clipping of layer/vp index outputs
This was just plain broken. It used always the value from v0 (for vp_index)
but would pass the value from the provoking vertex to later stages - but only
if there was a corresponding fs input, otherwise the layer/vp index would get
lost completely (as it would try to interpolate the (unsigned) values as
floats).
So, make it obey provoking vertex rules (drivers relying on draw will need to
do the same). And make sure that the default interpolation mode (when no
corresponding fs input is found) for them is constant.
Also, change the code a bit so constant inputs aren't interpolated then
copied over later.

Fixes the new piglit test gl-layer-render-clipped.

v2: more consistent whitespaces fixes for function defs, and more tab killing
(overall still not quite right however).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-12-04 03:42:19 +01:00
Edward O'Callaghan a5055e2f86 gallium/aux/util: Trivial, we already have format use it
No need to dereference again, fixup for clarity.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-12-03 23:41:23 +01:00
Jose Fonseca 5294debfa4 automake: Fix typo in MSVC2008 compat flags.
It should be MSVC2008_COMPAT_CFLAGS and not MSVC2008_COMPAT_CXXFLAGS.

This is why the recent util_blitter breakage went unnoticed on autotools
builds.

Trivial.
2015-12-03 22:00:49 +00:00
Jose Fonseca 071af9a511 ttn: Whitelist from -Werror=declaration-after-statement.
nir is the exception among gallium/auxiliary -- we don't need to compile
it with MSVC2008 yet.  And this enables us to use
-Werror=declaration-after-statement in the next commit as we should,
without complicated fixes to tgsi_to_nir module.

Trvial.  Tested with GCC and Clang.
2015-12-03 22:00:49 +00:00
Jose Fonseca 4a3d388834 util/blitter: Fix "SO C90 forbids mixed declarations and code".
Trivial.
2015-12-02 17:49:20 +00:00
Edward O'Callaghan 772f429f0a gallium/util: Fix util_blitter_clear_depth_stencil() for num_layers>1
Previously util_blitter_clear_depth_stencil() could not clear more
than the first layer. We need to generalise this as we did for
util_blitter_clear_render_target().

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-12-02 18:23:43 +01:00
Edward O'Callaghan 8f2c5e281d gallium/util: Fix util_blitter_clear_render_target() for num_layers>1
Previously util_blitter_clear_render_target() could not clear more
than the first layer. We need to generalise this so that
ARB_clear_texture can pass the 3d piglit test.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-12-02 18:23:43 +01:00
Jose Fonseca 56aff6bb4e Remove Sun CC specific code.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2015-12-02 07:51:04 +00:00
Julien Isorce 10c14919c8 vl/buffers: fixes vl_video_buffer_formats for RGBX
Fixes: 42a5e143a8 "vl/buffers: add RGBX and BGRX to the supported formats"
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-30 09:02:29 +00:00
Emil Velikov 5d294d9fa3 auxiliary/vl/dri: fd management cleanups
Analogous to previous commit, minus the extra dup. We are the one
opening the device thus we can directly use the fd.

Spotted by Coverity (CID 1339867, 1339877)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-29 14:41:00 +00:00
Emil Velikov 151290c154 auxiliary/vl/drm: fd management cleanups
Analogous to previous commit.

Spotted by Coverity (CID 1339868)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-29 14:40:26 +00:00
Emil Velikov 5f92906b87 pipe-loader: check if winsys.name is non-null prior to strcmp
In theory this wouldn't be an issue, as we'll find the correct name and
break out of the loop before we hit the sentinel.

Let's fix this and avoid issues in the future.

Spotted by Coverity (CID 1339869, 1339870, 1339871)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-29 14:38:22 +00:00
Nicolai Hähnle f36d9857cd gallium: add PIPE_DRIVER_QUERY_FLAG_DONT_LIST
This allows the driver to give a hint to the HUD so that GALLIUM_HUD=help is
less spammy.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-26 10:57:43 +01:00
Emil Velikov 59cfb21d46 targets: use the non-inline sw helpers
Previously (with the inline ones) things were embedded into the
pipe-loader, which means that we cannot control/select what we want in
each target.

That also meant that at runtime we ended up with the empty
sw_screen_create() as the GALLIUM_SOFTPIPE/LLVMPIPE were not set.

v2: Cover all the targets, not just dri.

Cc: "11.1" <mesa-stable@lists.freedesktop.org>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Edward O'Callaghan <edward.ocallaghan@koparo.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Oded Gabbay <oded.gabbay@gmail.com>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
2015-11-25 20:25:29 +00:00
Emil Velikov fbc6447c3d target-hepers: add non inline sw helpers
Feeling rather dirty copying the inline ones, yet we need the inline
ones for swrast only targets like libgl-xlib, osmesa.

Cc: "11.1" <mesa-stable@lists.freedesktop.org>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Edward O'Callaghan <edward.ocallaghan@koparo.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Oded Gabbay <oded.gabbay@gmail.com>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
2015-11-25 20:25:14 +00:00
Emil Velikov f623517188 pipe-loader: fix off-by one error
With earlier commit we've dropped the manual iteration over the fixed
size array and prepemtively set the variable storing the size, that is
to be returned. Yet we forgot to adjust the comparison, as before we
were comparing the index, now we're comparing the size.

Fixes: ff9cd8a67c "pipe-loader: directly use
pipe_loader_sw_probe_null() at probe time"
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93091
Reported-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2015-11-25 20:22:35 +00:00
Ilia Mirkin cca8dd4e93 ttn: fix UMSB conversion
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-23 11:17:16 -05:00
Ilia Mirkin f0e670bdd7 ttn: add LODQ support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-23 11:17:15 -05:00
Ilia Mirkin 3333977556 gallium: add ASTC formats
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-11-23 11:17:15 -05:00
Ilia Mirkin 1c7d0a6aa4 gallium/util: remove the fake format helpers for bptc and etc2
This was a silly hack that kept growing and growing. Instead, just write
NULLs for those functions. No need to have helpers that just assert(0)
when you call them.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-11-23 11:17:14 -05:00
Emil Velikov 8a6d476588 pipe-loader: link against libloader regardless of libdrm presence
Whether or not the loader has libdrm support is up-to it. Anyone using
the loader should just include it whenever they depend on it.

Cc: mesa-stable@lists.freedesktop.org
Fixes: 0f39f9cb7a "pipe-loader: add a dummy 'static' pipe-loader"
Reported-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Tested-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-23 12:07:09 +00:00
Jason Ekstrand f58813842b nir: s/nir_type_unsigned/nir_type_uint
v2: do the same in tgsi_to_nir (Samuel)

v3: added missing cases after rebase (Iago)

v4: Add a blank space after '#' in one of the comments (Matt)

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-11-23 08:36:12 +01:00
Eric Anholt 1b62a4e885 vc4: Take precedence over ilo when in simulator mode.
They're exclusive at build time, but the ilo entry is always present, so
we'd try to use it and fail out.

v2: Add comment in the code, from Emil.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-22 13:15:43 -08:00
Igor Gnatenko 05eed0eca7 virgl: pipe_virgl_create_screen is not static
Cc: mesa-stable@lists.freedesktop.org
Fixes: 17d3a5f857 "target-helpers: add a non-inline drm_helper.h"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93063
Signed-off-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-22 11:17:17 +00:00
Jose Fonseca 4befd82a64 pipe-loader: Fix PATH_MAX define on MSVC. 2015-11-21 23:03:20 +00:00
Jose Fonseca 02afbd2476 scons: Conditionally use DRM module on pipe-loader.
Fixes non Linux builds.

Trivial.
2015-11-21 21:20:12 +00:00
Emil Velikov 623f64efc1 util: use RTLD_LOCAL with util_dl_open()
Otherwise we risk things blowing up due to conflicting symbols.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Rob Clark <robclark@freedesktop.org>
2015-11-21 12:52:21 +00:00