Commit Graph

6560 Commits

Author SHA1 Message Date
Axel Davy 197cdd1bbd gallium/os: Use unsigned integers for size computation
Use uint64_t instead of int64_t in the calculation,
as the result is uint64_t.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-13 21:16:35 +02:00
Nicolai Hähnle 2b460c750a tgsi/ureg: add ureg_DECL_output_layout
For specifying an exact location/component.

v2: change the order of parameters (Dave)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
2016-10-12 18:50:10 +02:00
Nicolai Hähnle 047a7c7a0b tgsi/ureg: add layout/component input declarations
v2: change the order of parameters (Dave)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
2016-10-12 18:50:10 +02:00
Nicolai Hähnle f9a01f3872 tgsi/scan: fix num_inputs/num_outputs for shaders with overlapping arrays
v2: remove a tautological left-over assert (Marek)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
2016-10-12 18:50:10 +02:00
Roland Scheidegger 7e86b2ddae draw: initialize shader inputs
This should make the code more robust if a shader tries to use inputs which
aren't defined by the vertex element layout (which usually shouldn't happen).

No piglit change.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-10-12 15:05:44 +02:00
Axel Davy 2290eac84e gallium/util: Really allow aliasing of dst for u_box_union_*
Gallium nine relies on aliasing to work with this function.
Without this patch, dirty region tracking was incorrect, which
could lead to incorrect textures or vertex buffers.
Fixes several game bugs with nine.
Fixes https://github.com/iXit/Mesa-3D/issues/234

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-10-10 23:43:48 +02:00
Axel Davy 218459771a gallium/os: Fix overflow on 32 bits
On systems with more than 4GB of ram,
os_get_total_physical_memory was triggering an integer
overflow for the linux and haiku path, when on
32 bits.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94561

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-10 23:43:48 +02:00
Steven Toth e00fdd643b gallium/hud: Remove superfluous debug
No longer required.

Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-10-06 16:37:06 +01:00
Marek Olšák faee2d6dda tgsi/scan: don't set interp flags for inputs only used by INTERP (v2)
(v1 pushed, then reverted)

This fixes 9 randomly failing tests on radeonsi:
  GL45-CTS.shader_multisample_interpolation.render.interpolate_at_centroid.*

v2: use input_interpolate[input] (correct) instead of
    input_interpolate[index] (incorrect)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-05 21:03:23 +02:00
Jose Fonseca 437d7e1baf gallivm: Use AVX2 gather instrinsics.
v2: Use AVX2 gather for non aligned loads too.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-10-04 23:36:20 +01:00
Roland Scheidegger bc80741d7a gallivm: Use 8 wide AoS sampling on AVX2.
v2: Make sure that with num_lods > 1 and min_filter != mag_filter we
still enter the splitting path. So this case would still use 4-wide aos
path (as a side note, the 4-wide aos sampling path could actually be
improved quite a bit if we have avx2, by just doing the filtering with
256bit vectors).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-10-04 23:36:20 +01:00
José Fonseca e088390c7d gallivm: Basic AVX2 support.
v2: pblendb -> pblendvb

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-10-04 23:36:20 +01:00
Matt Whitlock 5d0069eca2 gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-04 11:08:55 +02:00
Nayan Deshmukh b7a0f2e1f7 vl/dri3: fix warning about incompatible pointer type
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-10-03 12:51:30 -04:00
Steven Toth e99b9395be gallium/hud: Add support for CPU frequency monitoring
Detect all of the CPUs in the system. Expose metrics
for min, max and current frequency in Hz.

Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-09-30 15:18:46 -06:00
Marek Olšák 7b87190d2b Revert "gallium/hud: automatically print % if max_value == 100"
This reverts commit dbfeb0ec12.

With max_value being rounded to 100, it's often wrong.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-09-30 22:07:12 +02:00
Steven Toth 1d466b9b04 gallium/hud: Add power sensor support
Implement support for power based sensors, reporting units in
milli-watts and watts.

Also, minor cleanup - change the related if block to a switch.

Tested with two different power sensors, including the nouveau
'power1' sensors on a GTX950 card.

Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-09-29 17:51:15 -06:00
Steven Toth 8c60bcb4c3 gallium/hud: Add support for block I/O, network I/O and lmsensor stats
V8: Feedback based on peer review
    convert if block into a switch
    Constify some func args

V7: Increase precision when measuring lmsensors volts
    Flatten patch series.

V6: Feedback based on peer review
    Simplify sensor initialization (arg passing).
    Constify some func args

V5: Feedback based on peer review
    Convert sprintf to snprintf
    Convert char * to const char *
    int arg converted to bool
    Func changes to take a filename vs a larger struct.
    Omit the space between '*' and the param name.

V4: Merged with master as of 2016/9/27 6pm

V3: Flatten the entire patchset ready for the ML

V2: Additional seperate patches based on feedback
a) configure.ac: Add a comment related to libsensors

b) HUD: Disable Block/NIC I/O stats by default.
Implement configuration option --enable-gallium-extra-hud=yes
and enable both statistics when this option is enabled.

c) Configure.ac: Minor cleanup to user visible configuration settings

d) Configure.ac: HUD stats - build system improvements
Move the -lsensors out of a deeper Makefile, bring it into the configure.ac.
Also, rename a compiler directive to more closely follow the standard.

V1: Initial release to the ML
Three new features:
1. Disk/block I/O device read/write stats MB/ps.
2. Network Interface RX/TX transfer statistics as a percentage
   of the overall NIC speed.
3. lmsensor power, voltage and temperature sensors.

The lmsensor changes makes a dependency on libsensors so support
for the change is opt out by default.

Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-09-28 16:18:05 -06:00
Nicolai Hähnle 4421c0fb0d gallium/radeon/winsyses: reduce the number of pb_cache buckets
Small buffers are now handled via the slabs code, so separate buckets in
pb_cache have become redundant.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-09-27 16:45:41 +02:00
Nicolai Hähnle 84f156c0cb gallium/pipebuffer: add pb_slab utility
This is a simple framework for slab allocation from buffers that fits into
the buffer management scheme of the radeon and amdgpu winsyses where bufmgrs
aren't used.

The utility knows about different sized allocations and explicitly manages
reclaim of allocations that have pending fences. It manages all the free lists
but does not actually touch buffer objects directly, relying on callbacks for
that.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-09-27 16:44:42 +02:00
Nicolai Hähnle b3ebc229dc gallium/u_math: add util_logbase2_ceil
For finding the exponent of the next power of two.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-09-27 16:44:38 +02:00
Rob Clark ecd6fce261 mesa/st: support lowering multi-planar YUV
Support multi-planar YUV for external EGLImage's (currently just in the
dma-buf import path) by lowering to multiple texture fetch's for each
plane and CSC in shader.

There was some discussion of alternative approaches for tracking the
additional UV or U/V planes:

  https://lists.freedesktop.org/archives/mesa-dev/2016-September/127832.html

They all seemed worse than pipe_resource::next

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-09-26 15:29:17 -04:00
Samuel Pitoiset be0535b8c7 gallium/util: make use of strtol() in debug_get_num_option()
This allows to use hexadecimal numbers which are automatically
detected by strtol() when the base is 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Brian Paul <brianp@vmware.com>
2016-09-26 19:39:04 +02:00
Brian Paul b35684543e gallium/util: add comment on util_is_format_compatible()
From reading the code, it's not obvious what is src/dest compatible.
The list of a->b copy-compatible formats comes from Jose's original
check-in message, with some format name updates.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-09-21 12:26:17 -06:00
Nicolai Hähnle 1f291369e4 gallivm: support negation on 64-bit integers
This should be analogous to 32-bit integers.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-09-21 10:24:50 +02:00
Dave Airlie 5561a37710 gallivm/llvmpipe: prepare support for ARB_gpu_shader_int64.
This enables 64-bit integer support in gallivm and
llvmpipe.

v2: add conversion opcodes.
v3:
- PIPE_CAP_INT64 is not there yet
- restrict DIV/MOD defaults to the CPU, as for 32 bits
- TGSI_OPCODE_I2U64 becomes TGSI_OPCODE_U2I64

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-09-21 10:24:30 +02:00
Dave Airlie 6b26039da3 tgsi/softpipe: prepare ARB_gpu_shader_int64 support. (v3)
This adds all the opcodes to tgsi_exec for softpipe to use.

v2: add conversion opcodes.
v3:
- no PIPE_CAP_INT64 yet
- change TGSI_OPCODE_I2U64 to TGSI_OPCODE_U2I64

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-09-21 10:24:11 +02:00
Dave Airlie 3985e6c044 gallium/tgsi: add support for 64-bit integer immediates.
This adds support to TGSI for 64-bit integer immediates.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-09-21 10:23:55 +02:00
Dave Airlie 6e1a34d545 gallium: add opcode and types for 64-bit integers. (v3)
This just adds the basic support for 64-bit opcodes,
and the new types.

v2: add conversion opcodes.
add documentation.
v3:
- make docs more consistent
- change TGSI_OPCODE_I2U64 to TGSI_OPCODE_U2I64

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-09-21 10:23:05 +02:00
Nayan Deshmukh 853e80f5a0 vl/dri3: handle the case of different GPU(v4.2)
In case of prime when rendering is done on GPU other then the
server GPU, use a seprate linear buffer for each back buffer
which will be displayed using present extension.

v2: Use a seprate linear buffer for each back buffer (Michel)
v3: Change variable names and fix coding style (Leo and Emil)
v4: Use PIPE_BIND_SAMPLER_VIEW for back buffer in case when
    a seprate linear buffer is used (Michel)
v4.1: remove empty line
v4.2: destroy the context and handle the case when
      create_context fails (Emil)

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2016-09-20 11:17:02 +02:00
Lars Hamre ddd6116e32 tgsi: Enable returns from within loops
Fixes the following piglit test (for softpipe):
/spec/glsl-1.10/execution/fs-loop-return

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-09-17 10:24:13 -06:00
Rob Clark ba8a50955d ttn: fix warning after 7bf76563e
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-09-16 11:55:26 -04:00
Marek Olšák f019255acf Revert "tgsi/scan: don't set interp flags for inputs only used by INTERP instructions"
This reverts commit 524fd55d2d.

Reason: https://bugs.freedesktop.org/show_bug.cgi?id=97808
2016-09-15 00:47:24 +02:00
Marek Olšák 524fd55d2d tgsi/scan: don't set interp flags for inputs only used by INTERP instructions
radeonsi depends on the interp flags a little bit too much.

This fixes 9 randomly failing tests:
  GL45-CTS.shader_multisample_interpolation.render.interpolate_at_centroid.*

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-09-13 20:38:25 +02:00
Marek Olšák c723acc03d ddebug: dump shader buffers and images
this was unimplemented

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-09-13 20:38:25 +02:00
Andy Furniss 304f70536a vl/util: Fix YV12/I420 convert to NV12 U/V reversal
Fix VAAPI YV12/I420 convert to NV12 U/V reversal.
Input order is YVU when this is called.

Signed-off-by: Andy Furniss <adf.lists@gmail.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2016-09-13 13:58:40 -04:00
Leo Liu 6a7f79af9b vl/rbsp: match initial escaped bits with valid in the buffer
Otherwise the check for the three byte will not make sense.

Signed-off-by: Leo Liu <leo.liu@amd.com>
2016-09-12 10:09:27 -04:00
Marek Olšák 5981ab5445 gallium: remove PIPE_BIND_TRANSFER_READ/WRITE
not used in any useful way

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-09-08 22:51:33 +02:00
Marek Olšák e7a73b75a0 gallium: switch drivers to the slab allocator in src/util 2016-09-06 14:24:04 +02:00
Thomas Hellstrom fc6be40011 gallium/postprocess: Fix resource freeing
The code was triggering asserts in DEBUG builds of the SVGA driver since
the reference count of the resource was never decremented before destroy.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-09-01 07:59:49 +02:00
Kai Wasserbäch 4c53267b8f gallium: Use enum pipe_shader_type in set_shader_images()
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-29 09:07:37 -06:00
Kai Wasserbäch 532db3b788 gallium: Use enum pipe_shader_type in set_sampler_views()
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-29 09:07:25 -06:00
Kai Wasserbäch 7413625ad3 gallium: Use enum pipe_shader_type in bind_sampler_states() (v2)
v1 → v2:
 - Fixed indentation (noted by Brian Paul)
 - Removed second assert from nouveau's switch statements (suggested by
   Brian Paul)

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-29 08:45:48 -06:00
Marek Olšák d301efb400 tgsi/scan: remember sampler view types
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-29 14:16:57 +02:00
Brian Paul d221a6545c gallium/hud: move signo declaration inside PIPE_OS_UNIX block
To silence unused var warning with MSVC, MinGW.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-26 06:19:51 -06:00
Marek Olšák 9daaa6f5a6 gallium: add a pipe_context parameter to resource_get_handle
radeonsi needs to do some operations (DCC decompression) for OpenGL-OpenCL
interop and this is the only way to make it coherent with the current
context. It can optionally be set to NULL.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-25 14:09:48 +02:00
Rhys Kidd c9c989763a gallium/ttn: Remove duplicated TGSI_OPCODE_DP2A initialization
Duplicate line is currently on 1535.

Identified by Clang, when run through Eric Anholt's Travis harness.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-08-24 11:54:50 -07:00
Leo Liu 5277f25480 vl/rbsp: fix another three byte not detected
This happens when three byte "00 00 03" is partly loaded to
vlc->buffer, thus at the bottom of buffer with valid bits is
"00" or "00 00" and left  like "00 03" or "03" in the data,
so that it will not be detected by three byte emulation check.
The reason for that is the escaped bit was set to 0 from the
rbsp init.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2016-08-24 11:17:16 -04:00
Eric Engestrom 9411eb67ec gallium/cso: avoid unnecessary null dereference
The label `out:` calls `destroy()` which dereferences `ctx`.
This is unnecessary as there is nothing to destroy.
Immediately return instead.

CovID: 1258255
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-24 11:35:05 +01:00
Marek Olšák 0328b20050 gallium/hud: round max_value to print nicely rounded numbers next to graphs
This improves readability a lot.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-22 16:01:35 +02:00
Marek Olšák 0f1befe926 gallium/hud: generalize code for drawing numbers next to graphs
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-22 16:01:35 +02:00
Marek Olšák a33eb48d61 gallium/hud: draw numbers with 3 decimal places if those aren't 0
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-22 16:01:35 +02:00
Marek Olšák b9c9551c09 gallium/hud: use sRGB for nicer AA lines
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-22 16:01:35 +02:00
Marek Olšák 6ffde82083 gallium/hud: use AA lines for graphs
this looks a lot better (with the next patch)

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-22 16:01:35 +02:00
Marek Olšák 6902f9e82a gallium/hud: don't enable blending for all objects
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-22 16:01:35 +02:00
Eric Anholt c078c41520 ttn: Use nir_load_front_face instead of the TGSI-style input.
This reduces the diff between GLSL-to-NIR and TGSI-to-NIR, and gives NIR
more optimization to work on.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-19 13:11:36 -07:00
Eric Anholt ed92241d78 ttn: Make FRAG_RESULT_DEPTH be a float variable to match gtn and ptn.
This lets TTN-using drivers handle FRAG_RESULT_DEPTH the same between all
their source paths.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-08-19 13:11:36 -07:00
Marek Olšák 325379096f gallium: change pipe_image_view::first_element/last_element -> offset/size
This is required by OpenGL. Our hardware supports this.

Example: Bind RGBA32F with offset = 4 bytes.

Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-17 14:15:33 +02:00
Marek Olšák 7cd256ce7e gallium: change pipe_sampler_view::first_element/last_element -> offset/size
This is required by OpenGL. Our hardware supports this.

Example: Bind RGBA32F with offset = 4 bytes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97305

Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-17 14:15:33 +02:00
Nicolai Hähnle 41001ca4bd gallivm: add lp_build_alloca_undef
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-17 12:11:24 +02:00
Nicolai Hähnle 17e88e276c gallivm: add create_builder_at_entry helper function
Reduces code duplication.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-17 12:11:24 +02:00
Nicolai Hähnle 67c0f077a2 tgsi/scan: add tgsi_scan_arrays
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-17 12:11:21 +02:00
Brian Paul 038b1b11fe gallium: remove unused u_clear.h file
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-16 08:28:33 -06:00
Brian Paul 66debeae9d gallium/util: minor reformatting in u_box.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-16 08:28:32 -06:00
Rob Clark 142dd7b9c0 gallium/u_blitter: split out a helper for common clear state
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-16 09:21:13 -04:00
Rob Clark 2b2f436c69 gallium/u_blitter: add helper to save FS const buffer state
Not (currently) state that is overwridden by u_blitter itself, but
drivers with custom blit/clear which are reusing part of the u_blitter
infrastructure will use it.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-16 09:21:13 -04:00
Rob Clark 433e12fea8 gallium/u_blitter: export some functions
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-16 09:21:13 -04:00
Ilia Mirkin c85b7f0e87 gallium/util: add helper to compute zmin/zmax for a viewport state
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
2016-08-14 17:41:33 -04:00
Leo Liu 6575ebdc45 vl/rbsp: add a check for emulation prevention three byte
This is the case when the "00 00 03" is very close to the beginning of
nal unit header

v2: move the check to rbsp init

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-08-10 09:52:44 -04:00
Marek Olšák a909210131 gallium: add render_condition_enable param to clear_render_target/depth_stencil
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-10 01:10:21 +02:00
Mathias Fröhlich aa920736fe gallium: Add c99_compat.h to u_bitcast.h
We need this for 'inline'.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-09 21:20:56 +02:00
Mathias Fröhlich 027cbf00f2 util: Move _mesa_fsl/util_last_bit into util/bitscan.h
As requested with the initial creation of util/bitscan.h
now move other bitscan related functions into util.

v2: Split into two patches.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-09 21:20:46 +02:00
Jason Ekstrand f29fd7897a util: Move format_r11g11b10f.h to src/util
It's used from both mesa main and gallium.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-08-05 09:06:57 -07:00
Jason Ekstrand 6c665cdfc5 util: Move format_rgb9e5.h to src/util
It's used from both mesa main and gallium.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-08-05 09:06:31 -07:00
Michel Dänzer 67c5e843b9 vl/dri3: Destroy Present event context when destroying drawable v2
Without this, the X server may accumulate stale Present event contexts
if a client performs several video decoding sessions using the same
window.

v2: Based on Chris Wilson's review:
* Use xcb_discard_reply() instead of free(xcb_request_check())

Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>
2016-08-04 15:45:43 +09:00
Marek Olšák 6db93cd167 gallium/util: fix align64
it cut off the upper 32 bits

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-08-01 23:28:14 +02:00
Matt Turner be35c6ba92 draw: Avoid aliasing violations.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-01 12:09:17 -07:00
Matt Turner 16ff8f9ae8 gallium/auxiliary: Add u_bitcast.h header.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-01 12:09:17 -07:00
Brian Paul 13fa051356 auxiliary/os: add new os_get_command_line() function
This can be used by the driver to get the command line which started
the process.  Will be used by the VMware driver for extra logging.

For now, this is only implemented for Linux via /proc/self/cmdline
and Windows via GetCommandLine().

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-01 12:20:19 -06:00
Rob Clark 53b2b8bf6f u_vbuf: fix potentially bogus assert
There are cases where we hit u_vbuf path due to alignment or pitch-
alignment restrictions, but for an output-format that u_vbuf does not
support translating (yet the driver does support natively).  In which
case we hit the memcpy() path and don't care that u_vbuf doesn't
understand it.

Fixes crash with debug build of mesa in:
dEQP-GLES3.functional.vertex_arrays.single_attribute.strides.fixed.user_ptr_stride17_components2_quads1

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95000
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-01 13:42:11 -04:00
Roland Scheidegger 99a47391e4 Revert "gallium/util: fix resource leak"
This reverts commit d1fe26a628.

Replacing a resource leak with a segfault isn't the solution.
2016-07-30 18:18:09 +02:00
Eric Engestrom d1fe26a628 gallium/util: fix resource leak
CovID: 401540
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-30 17:27:42 +02:00
Rob Clark 010e4b2d52 os: add pipe_mutex_assert_locked()
Would be nice if we could also have lockdep, like in the linux kernel.
But this is better than nothing.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-30 09:23:42 -04:00
Eric Anholt 4d0b2c7aaa ttn: Update shader->info as we generate code.
We could use the nir_shader_gather_info() pass to update it after the
fact, but this is what glsl_to_nir and prog_to_nir do.

Reviewed-by: Rob Clark <robclark@freedesktop.org>
2016-07-26 13:47:50 -07:00
Boyuan Zhang 23b4ab1738 vl/util: add copy func for yv12image to nv12surface v2
Add function to copy from yv12 image to nv12 surface for VAAPI putimage call.
We need this function in VaPutImage call where copying from yv12 image to nv12
surface for encoding. Existing function can't be used because it only work for
copying from yv12 surface to nv12 image in Vaapi.

v2: cleanup variable types and commit message

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
2016-07-25 13:39:18 +02:00
Marek Olšák 8e3e9d2839 gallium/util: don't modify usage in pipe_buffer_write
All drivers were already doing it except virgl.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-23 13:33:42 +02:00
Marek Olšák 1ffe77e7bb gallium: split transfer_inline_write into buffer and texture callbacks
to reduce the call indirections with u_resource_vtbl.

The worst call tree you could get was:
  - u_transfer_inline_write_vtbl
    - u_default_transfer_inline_write
      - u_transfer_map_vtbl
        - driver_transfer_map
      - u_transfer_unmap_vtbl
        - driver_transfer_unmap

That's 6 indirect calls. Some drivers only had 5. The goal is to have
1 indirect call for drivers that care. The resource type can be determined
statically at most call sites.

The new interface is:
  pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data)
  pipe_context::texture_subdata(ctx, resource, level, usage, box, data,
                                stride, layer_stride)

v2: fix whitespace, correct ilo's behavior

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2016-07-23 13:33:42 +02:00
Marek Olšák 4cdc482283 gallium/os: use CLOCK_MONOTONIC for sleeps (v2)
v2: handle EINTR, remove backslashes

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2016-07-22 22:34:49 +02:00
Marek Olšák 8d5944199d gallium/pb_cache: reduce the number of pointer dereferences
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-19 23:45:06 +02:00
Marek Olšák 3cdc0e133f gallium/pb_cache: divide the cache into buckets for reducing cache misses
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-19 23:45:06 +02:00
Marek Olšák fec7f74129 gallium/pb_cache: check parameters that are more likely to fail first
This makes Bioshock Infinite with deferred flushing 2% faster.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-19 23:45:06 +02:00
Eric Engestrom 8ba46fbd9e vl: fix memory leak
CovID: 1363008
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-07-19 12:41:00 +02:00
Leo Liu 134d6e4e4f vl/dri3: fix a memory leak from front buffer
Inspired by fix for mem leak of vdpau interop, resource_from_handle
set texture reference count, that need to be decreased and released,
recall there is a similar case for DRI3, that is with VA-API glx
extension, there is temporary TFP(texture from pixmap), we target it
through dma-buf. leak happens when without count down the reference.

Checked and found with mpv vo=opengl case, there only one static TFP,
the leak happens once, but for totem player using gstreamer VA-API glx,
the dynamic TFP for each frame, so leak quite a bit.

This fixes mem leak for mpv and totem.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-07-18 09:20:40 -04:00
Kenneth Graunke ac1181ffbe compiler: Rename INTERP_QUALIFIER_* to INTERP_MODE_*.
Likewise, rename the enum type to glsl_interp_mode.

Beyond the GLSL front-end, talking about "interpolation modes" seems
more natural than "interpolation qualifiers" - in the IR, we're removed
from how exactly the source language specifies how to interpolate an
input.  Also, SPIR-V calls these "decorations" rather than "qualifiers".

Generated by:
$ find . -regextype egrep -regex '.*\.(c|cpp|h)' -type f -exec sed -i \
  -e 's/INTERP_QUALIFIER_/INTERP_MODE_/g' \
  -e 's/glsl_interp_qualifier/glsl_interp_mode/g' {} \;

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Dave Airlie <airlied@redhat.com>
2016-07-17 19:26:48 -07:00
Rob Clark 44bbfedbd9 gallium/u_queue: add optional cleanup callback
Adds a second optional cleanup callback, called after the fence is
signaled.  This is needed if, for example, the queue has the last
reference to the object that embeds the util_queue_fence.  In this
case we cannot drop the ref in the main callback, since that would
result in the fence being destroyed before it is signaled.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-16 10:00:04 -04:00
Yaakov Selkowitz 5d303867f5 Use correct names for dlopen()ed files on Cygwin
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
2016-07-15 19:46:54 +01:00
Marek Olšák 6596ecf8c5 gallivm: add helper lp_add_attr_dereferenceable
Not sure if this is the right way to do it, but it seems to work.

v2: make it a no-op on LLVM <= 3.5

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-13 19:46:16 +02:00
Leo Liu 82f875f4d8 vl/compositor: set layer of y or uv to render
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Julien Isorce <j.isorce@samsung.com>
2016-07-12 09:27:53 -04:00
Leo Liu 14761da9f9 vl/compositor: add weave to yuv shader
This shader will make interlaced yuv to progressive yuv.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Julien Isorce <j.isorce@samsung.com>
2016-07-12 09:27:53 -04:00
Leo Liu 2e18c2c6f8 vl/compositor: move weave shader out from rgb weaving
We'll use weave shader in the later patch.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Julien Isorce <j.isorce@samsung.com>
2016-07-12 09:27:53 -04:00
Marek Olšák d7b6f90684 gallivm: set LLVMNoUnwindAttribute on all intrinsics
RadeonSI stats: Mostly 0% difference, but Valley shows a small improvement:

 Application            Files    SGPRs     VGPRs   SpillSGPR SpillVGPR Code Size    LDS    Max Waves   Waits
 unigine_valley           278    0.00 %   -0.29 %    0.00 %    0.00 %    0.01 %    0.00 %    0.17 %    0.00 %

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-07-11 19:06:05 +02:00
Nicolai Hähnle 374aa2bb27 gallium/u_queue: assert that users must wait on fences before destroying them
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-11 11:04:44 +02:00
Nicolai Hähnle a0a616720a gallium/u_queue: guard fence->signalled checks with fence->mutex
I have seen a hang during application shutdown that could be explained by the
following race condition which this patch fixes:

1. Worker thread enters util_queue_fence_signal, sets fence->signalled = true.
2. Main thread calls util_queue_job_wait, which returns immediately.
3. Main thread deletes the job and fence structures, leaving garbage behind.
4. Worker thread calls pipe_condvar_broadcast, which gets stuck forever because
   it is accessing garbage.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-11 11:03:59 +02:00
Nayan Deshmukh af18a04755 vl: add half pixel to v_tex before adding offsets
Since pixel center lies at 0.5, add half_pixel to vtex
before adding offsets to it.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-07-08 20:51:12 +02:00
Rob Clark def044376a gallium/util: make util_copy_framebuffer_state(src=NULL) work
Be more consistent with the other u_inlines util_copy_xyz_state()
helpers and support NULL src.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-07-06 10:17:30 -04:00
Hans de Goede d386cef246 tgsi: Add WORK_DIM System Value
Add a new WORK_DIM SV type, this is will return the grid dimensions
(1-4) for compute (opencl) kernels.

This is necessary to implement the opencl get_work_dim() function.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-07-02 12:21:28 +02:00
Nayan Deshmukh 872dd9ad15 vl: add a bicubic interpolation filter(v5)
This is a shader based bicubic interpolater which uses cubic
Hermite spline algorithm.

v2: set dst_area and dst_clip during scaling (Christian)
v3: clear the render target before rendering
v4: intialize offsets while initializing shaders
    use a constant buffer to send dst_size to frag shader
    small changes to reduce calculation in shader
v5: send half pixel offset instead of sending dst_size

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-07-01 12:54:33 +02:00
Brian Paul c823ff8dfb gallium/util: check for window cliprects in util_can_blit_via_copy_region()
We can't blit with resource_copy_region() if there are window clip rects.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-30 18:19:09 -06:00
Brian Paul 5f1335878e gallium/util: add tight_format_check param to util_can_blit_via_copy_region()
The VMware driver will use this for implementing GL_ARB_copy_image.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-30 14:32:06 -06:00
Brian Paul a029d9f074 gallium/util: simplify a few things in util_can_blit_via_copy_region()
Since only the src box can have negative dims for flipping, just
comparing the src/dst box sizes is enough to detect flips.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-30 14:32:06 -06:00
Brian Paul 5d31ea4b8f gallium/util: new util_try_blit_via_copy_region() function
Pulled out of the util_try_blit_via_copy_region() function.  Subsequent
changes build on this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-30 14:32:06 -06:00
Hans de Goede 459cc94507 pipe_loader_sw: Fix fd leak when instantiated via pipe_loader_sw_probe_kms
Make pipe_loader_sw_probe_kms take ownership of the passed in fd,
like pipe_loader_drm_probe_fd does.

The only caller is dri_kms_init_screen which passes in a dupped fd,
just like dri2_init_screen passes in a dupped fd to
pipe_loader_drm_probe_fd.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-28 12:29:54 +02:00
Marek Olšák cbb5adb908 gallium/u_queue: allow the execute function to differ per job
so that independent types of jobs can use the same queue.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Marek Olšák 4a06786efd gallium/u_queue: reduce the number of mutexes by 2
by converting semaphores to condvars and using the main mutex

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Marek Olšák 2fba0aaa70 gallium/u_queue: add an option to name threads
for debugging

v2: correct the snprintf use

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Marek Olšák 404d0d50d8 gallium/u_queue: add an option to have multiple worker threads
independent jobs don't have to be stuck on only one thread

v2: use CALLOC & FREE

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Marek Olšák 4358f6dd13 gallium/u_queue: rewrite util_queue_fence to allow multiple waiters
Checking "signalled" is first done without a mutex, then with a mutex.
Also, checking without waiting doesn't lock the mutex. This is racy, but
should be safe.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Marek Olšák d8367e91f2 gallium/u_queue: use a ring instead of a stack
and allow specifying its size in util_queue_init.

v2: use CALLOC & FREE

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Brian Paul e0dc3c5f19 gallium/util: fix some 4-space indentation in blitter code
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-23 07:31:20 -06:00
Ilia Mirkin 5b0d64886d translate: fix start_instance parameter in sse version
The generic version gets this right already, but this was using an
incorrect formula in SSE.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-06-21 21:50:16 -04:00
Marek Olšák 5fed1122e8 gallium/u_blitter: implement mipmap generation
for pipe_context::generate_mipmap

first move some of the blit code from util_blitter_blit_generic
to a separate function, then use it from util_blitter_generate_mipmap

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-21 13:52:05 +02:00
Roland Scheidegger b0cf99165a gallivm: don't use integer min/max sse intrinsics with llvm >= 3.9
Apparently, these are deprecated. There's some AutoUpgrade feature which
is supposed to promote these to cmp/select, which apparently doesn't work
with jit code. It is possible it's not actually even meant to work (see
the bug filed against llvm which couldn't provide an answer neither)
but in any case this is meant to be only temporary unless the intrinsics
are really illegal. So, just use the fallback code (which should be cmp/select,
we're actually doing cmp/sext/trunc/select, but in any case llvm 3.9 manages
to optimize this back to pmin/pmax in the end).

This addresses https://llvm.org/bugs/show_bug.cgi?id=28176

CC: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Aaron Watry <awatry@gmail.com>
2016-06-20 17:19:03 +02:00
Christian König bf89e672cf vl: support luma keying for interlaced surfaces as well
We had the CSC code twice in there, factor it out into a separate function.

Signed-off-by: Christian König <christian.koenig@amd.com>
2016-06-16 09:41:12 +02:00
Brian Paul bb1292e226 auxilary/os: allow appending to GALLIUM_LOG_FILE
If the log file specified by the GALLIUM_LOG_FILE begins with '+', open
the file in append mode.  This is useful to log all gallium output for
an entire piglit run, for example.

v2: put GALLIUM_LOG_FILE support inside an #ifdef DEBUG block.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-06-15 17:16:42 -06:00
Marek Olšák 562cb03d76 gallium/util: import the multithreaded job queue from amdgpu winsys (v2)
v2: rename the event to util_queue_fence

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-15 21:07:34 +02:00
Roland Scheidegger afbf5888f5 gallium/util: don't use blocksize for minify for assertions
The previous assertions required for texture sizes smaller than block_size
that src_box.x + src_box.width still be block size.
(e.g. for a texture with width 3, and src_box.x = 0, src_box.width would
have to be 4 to not assert.)
This caused some assertions with some other state tracker.
It looks though like callers aren't expected to round up widths to block sizes
(for sizes larger than block size the assertion would still have verified it
wouldn't have been rounded up) so we simply shouldn't use a minify which
rounds up to block size.
(No piglit change with llvmpipe.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-14 17:03:34 +02:00
Julien Isorce 1cdb4da1d6 st/va: ensure linear memory for dmabuf
In order to do zero-copy between two different devices
the memory should not be tiled.

Tested with GStreamer on a laptop that has 2 GPUs:
1- gstvaapidecode:
   HW decoding and dmabuf export with nouveau driver on Nvidia GPU.
2- glimagesink:
   EGLImage imports dmabuf on Intel GPU.

TEST: DRI_PRIME=1 gst-launch vaapidecodebin ! glimagesink

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-06-14 08:40:33 +01:00
Mathias Fröhlich c3b6656676 mesa/gallium: Move u_bit_scan{,64} from gallium to util.
The functions are also useful for mesa.
Introduce src/util/bitscan.{h,c}. Move ffs function
implementations from src/mesa/main/imports.{h,c}.
Move bit scan related functions from
src/gallium/auxiliary/util/u_math.h. Merge platform
handling with what is available from within mesa.

v2: Try to fix MSVC compile.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2016-06-14 05:19:10 +02:00
Brian Paul cf9bb9acac util: update some assertions in util_resource_copy_region()
To cope with copies of compressed images which are not multiples of
the block size.  Suggested by Jose.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@sroland@vmware.com>
2016-06-13 13:30:19 -06:00
Jan Vesely 1fb4179f92 vl: Fix trivial sign compare warnings
v2: add whitepace fixes

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
[Emil Velikov: squash a few more whitespace issues]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:31:29 +01:00
Rob Herring 112e988329 Android: move libdrm settings to top-level Android.common.mk
Fix warnings like these due to HAVE_LIBDRM being inconsistently defined:

external/libdrm/include/drm/drm.h:839:30: warning: redefinition of typedef 'drm_clip_rect_t' is a C11 feature [-Wtypedef-redefinition]
typedef struct drm_clip_rect drm_clip_rect_t;

HAVE_LIBDRM needs to be set project wide to fix this. This change also
harmlessly links libdrm with everything, but simplifies the makefiles a
bit.

Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:31:29 +01:00
Jan Vesely ace70aedcf gallivm: Fix trivial sign warnings
v2: include whitespace fixes

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-06-13 09:23:09 -04:00
Brian Paul dd4be2e19a util: update util_resource_copy_region() for GL_ARB_copy_image
This primarily means added support for copying between compressed
and uncompressed formats.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-10 15:50:04 -06:00
Anuj Phogat 466b320163 gallium: Fix region overlap conditions for rectangles with a shared edge
>From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels":
"The pixels corresponding to these buffers are copied from the source
rectangle bounded by the locations (srcX0, srcY 0) and (srcX1, srcY 1)
to the destination rectangle bounded by the locations (dstX0, dstY 0)
and (dstX1, dstY 1). The lower bounds of the rectangle are inclusive,
while the upper bounds are exclusive."

So, the rectangles sharing just an edge shouldn't overlap.
 -----------
|           |
 ------- ---
|       |   |
|       |   |
 ------- ---

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-10 14:35:21 -07:00
Dave Airlie 1584918996 gallivm: more 64-bit integer prep work.
This converts one other place to using the new helper.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-11 06:44:30 +10:00
Dave Airlie e5c57824ec gallivm: make non-float return code bitcast consistent.
This just uses the same form across the fetches.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-11 06:44:17 +10:00
Dave Airlie 3b97e50b9a gallium/gallivm: use 64-bit test instead of doubles.
This just makes some generic code that currently emits double
suitable for emitting 64-bit values.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-11 06:44:13 +10:00
Dave Airlie 213ab8db87 gallium/tgsi: add 64-bitness type check function.
Currently this just doubles, but we'll convert users to this
so making adding 64-bit integers easier.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-11 06:43:45 +10:00
Leo Liu 2ad443e4cc vl/dri3: support receiving new pixmap for front buffer
With glx of gstreamer-vaapi, the temporary pixmap for front buffer gets
renewed in each frame, so when we receive a new pixmap, should get a new
front buffer for it.

This also fixes Totem player playback corruption.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 11:24:24 -04:00
Leo Liu 0ef8500aab vl/dri3: get Makefile properly
From original commit, the macro "if HAVE_DRI3" was in Makefile.sources,
this file is shared with SCons, SCons is not able to parse this marco,
the SCons build failed. Jose quickly gave two approaches and quick fix
with his second approach, thanks Jose for the solutions and fixes.

This patch is Jose's first approach, and it's more proper, because the
dri3 c file should not be included to build when DRI3 is not enabled.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 11:24:19 -04:00
Jose Fonseca 2b4cee0571 gallivm: Never emit llvm.fmuladd on LLVM 3.3.
Besides the old JIT bug, it seems the X86 backend on LLVM 3.3 doesn't
handle llvm.fmuladd and instead it fall backs to a C function.  Which in
turn causes a segfault on Windows.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-10 16:17:04 +01:00
Jose Fonseca 320d1191c6 gallivm: Use llvm.fmuladd.*.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-10 13:47:35 +01:00
Jose Fonseca 9e8edfa190 util,gallivm: Explicitly enable/disable fma attribute.
As suggested by Roland Scheidegger.

Use the same logic as f16c, since fma requires VEX encoding.

But disable FMA on LLVM 3.3 without MCJIT.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-10 13:47:35 +01:00
Nayan Deshmukh f24eb5a178 vl: Apply luma key filter before CSC conversion
Apply the luma key filter to the YCbCr values during the CSC conversion
    in video buffer shader. The initial values of max and min luma are set
    to opposite values to disable the filter initially and will be set when
    enabling it.

    Add extra parmeters min and max luma for the luma key filter in
    vl_compositor_set_csc_matrix in va, xvmc. Setting them
    to opposite value 1.f and 0.f respectively won't effect the CSC
    conversion

    v2: -Squash 1,2 and 3 into one patch to avoid breaking build of
        other components. (Christian)
        -use ureg_swizzle. (Christian)
        -change name of the variables. (Christian)

    v3: -Squash all patches in one to avoid breaking of build. (Emil)
        -wrap functions properly. (Emil)
        -use 0.0f and 1.0f instead of 0.f and 1.f respectively. (Emil)

    v4: -Divide it in two patches one which introduces the functionality
	 and assigs dummy values to the changed functions and second which
	 implements the lumakey filter. (Christian)
	-use ureg_scalar instead ureg_swizzle. (Christian)

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-06-09 14:23:07 +02:00
Nicolai Hähnle d3a584defe tgsi/scan: add uses_derivatives (v2)
v2:
- TG4 does not calculate derivatives (Ilia)
- also handle SAMPLE* instructions (Roland)

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-07 23:45:17 +02:00
Ilia Mirkin 30684b50d7 gallium: add VOTE_* opcodes to implement GL_ARB_shader_group_vote
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-06-06 20:49:28 -04:00
Charmaine Lee 627e975896 tgsi: fix mixed data type comparison in tgsi_point_sprite.c
Cast the unsigned semantic index to integer datatype before comparing
to max_generic, otherwise, max_generic which is initialized to -1
will be converted to unsigned int before the comparison, causing a wrong
semantic index to be assigned to a shader output.

Fixes the assert running TurboCAD_gl.trace. (VMware bug 1667265)

Also tested with glretrace, mesa demos pointblast, spriteblast and pointcoord.

v2: use the original max_generic variable but add the (int) cast
    to the semantic index, as suggested by Brian.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-06 10:20:45 -06:00
Lars Hamre 4163c71010 tgsi: use truncf in micro_trunc
Switches to using truncf in micro_trunc.

Fixes the following piglit tests (for softpipe):

/spec/glsl-1.30/execution/built-in-functions/...
fs-trunc-float
fs-trunc-vec2
fs-trunc-vec3
fs-trunc-vec4
vs-trunc-float
vs-trunc-vec2
vs-trunc-vec3
vs-trunc-vec4

/spec/glsl-1.50/execution/built-in-functions/...
gs-trunc-float
gs-trunc-vec2
gs-trunc-vec3
gs-trunc-vec4

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-06 15:56:28 +02:00
Marek Olšák ada3d8f31e gallium/u_suballoc: allow different alignment for each allocation
Just move the alignment parameter from u_suballocator_create
to u_suballocator_alloc.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Rob Clark 228b2b36f4 gallium/util: remove u_staging
Unused, and fixes a couple of coverity warnings: CID 1362171, 1362170

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2016-06-02 15:44:07 -04:00
Nicolai Hähnle d9893feb2c gallium/cso: allow saving the first fragment shader image slot
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:15 +02:00
Nicolai Hähnle fc0352ff9c gallium/u_inlines: allow NULL src in util_copy_image_view
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:12 +02:00
Marek Olšák 9d881cc0ac gallium/util: add util_texrange_covers_whole_level from radeon
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-01 17:35:30 +02:00
Marek Olšák 921ab0028e gallium/u_blitter: do GL-compliant integer resolves
The GL spec has been clarified and the new rule says we should just
copy 1 sample.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-31 16:48:53 +02:00
Frederic Devernay cee459d84d gallivm: initialize init_native_targets_once_flag correctly
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-05-30 16:13:52 +02:00
Brian Paul 747754f027 gallium/util: another s/unsigned/enum pipe_prim_type/ for clang
Trivial.
2016-05-27 18:42:21 -06:00
Brian Paul 8beb6f3c9c gallium/util: another unsigned -> enum pipe_prim_type change
gcc didn't warn about the unsigned / enum pipe_prim_type mismatch
between the .c and .h file.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-27 17:55:05 -06:00
Roland Scheidegger 9247570d42 gallivm: eliminate a unnecessary AND with unorm lerps
Instead of doing a add and then mask out the upper bits, we can
simply do a add with a half wide type (this, of course, assumes
the hw can actually do it...), so we'll get the required zero
in the upper bits automatically.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-27 19:11:28 +02:00
Roland Scheidegger 17d685c426 gallium/util: use enum pipe_prim_type instead of unsigned some more
There were complaints from a mingw build:
u_draw.h:134:14: error: invalid conversion from ‘uint {aka unsigned int}’
to ‘pipe_prim_type’ [-fpermissive]

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-27 19:11:28 +02:00
Rob Clark 4f98c94be7 gallium/util: fix build break
Missing #include caused build breaks after 21a3fb9cd.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-26 20:59:08 -04:00
Brian Paul 1ec45a1948 gallium/util: use enum pipe_prim_type in u_prim.h functions
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:18 -06:00
Brian Paul 7a49b41436 util/indices: move duplicated assignments out of switch cases
Spotted by Roland.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:18 -06:00
Brian Paul a25ae485a6 util/indices,svga: s/unsigned/enum pipe_prim_type/
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:18 -06:00
Brian Paul 21a3fb9cd8 util: s/unsigned/enum pipe_resource_usage/ for buffer usage variables
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:18 -06:00
Brian Paul 0f983e1793 util/indices: implement unfilled (tri->line) conversion for adjacency prims
Tested with new piglit gl-3.2-adj-prims test.

v2: re-order trisadj and tristripadj code, per Roland.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:17 -06:00
Brian Paul d6c2c7d710 util/indices: implement provoking vertex conversion for adjacency primitives
Tested with new piglit gl-3.2-adj-prims test.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:17 -06:00
Brian Paul 479d364c39 util/indices: assert that the incoming primitive is a triangle type
The unfilled index translator/generator functions should only be
called when the primitive mode is one of the triangle types.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:17 -06:00
Brian Paul 26de558072 util/indices: formatting, whitespace fixes in u_unfilled_indices.c
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:17 -06:00
Brian Paul 24eadb4810 util/indices: improve comments in u_indices.h
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:17 -06:00
Rob Clark 6e51fe75a4 tgsi: fix coverity out-of-bounds warning
CID 1271532 (#1 of 1): Out-of-bounds read (OVERRUN)34. overrun-local:
Overrunning array of 2 16-byte elements at element index 2 (byte offset
32) by dereferencing pointer &inst.Dst[i].

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-26 15:17:49 -04:00
Rob Clark 3d66ba971e tgsi: fix out of bounds access
Not sure why coverity calls this an out-of-bounds read vs out-of-bounds
write.

CID 1358920 (#1 of 1): Out-of-bounds read (OVERRUN)9. overrun-local:
Overrunning array r of 3 16-byte elements at element index 3 (byte
offset 48) using index chan (which evaluates to 3).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-26 15:17:49 -04:00
Lars Hamre c626a86586 gallium/tgsi: use _mesa_roundevenf in micro_rnd
Fixes the following piglit tests (for softpipe):

/spec/glsl-1.30/execution/built-in-functions/...
fs-roundeven-float
fs-roundeven-vec2
fs-roundeven-vec3
fs-roundeven-vec4
vs-roundeven-float
vs-roundeven-vec2
vs-roundeven-vec3
vs-roundeven-vec4

/spec/glsl-1.50/execution/built-in-functions/...
gs-roundeven-float
gs-roundeven-vec2
gs-roundeven-vec3
gs-roundeven-vec4

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-26 07:59:15 -06:00
Giuseppe Bilotta 8c00fe3970 scons: whitespace cleanup
This text transformation was done automatically via the following shell
command:

$ find -name SCons\* -exec sed -i s/\\s\\+$// '{}' \;

Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-25 12:23:12 -06:00
Brian Paul 9690ab0cdf tgsi: print TGSI_PROPERTY_NEXT_SHADER value as string, not an integer
Print "GEOM" instead of "2", for example.

v2: also update the text parsing code, per Ilia.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-25 07:21:23 -06:00
Brian Paul 2b773fcf00 tgsi: s/6/PIPE_SHADER_TYPES/ for tgsi_processor_type_names array size
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-25 07:21:23 -06:00
Emil Velikov a155cdaace vl/drm: don't call close(-1) in vl_drm_screen_create error path
Analogous to previous commits.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-05-23 12:07:47 +01:00
Dave Airlie e6d9389366 tgsi: remove culldist semantic.
This isn't used anymore in the tree, culldist's
are part of the clipdist semantic, we could in theory
rename it, but I'm not sure there is much point, and
I'd have to be careful with virgl.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 11:03:44 +10:00
Dave Airlie d17062a40e draw: stop using CULLDIST semantic.
The way the HW works doesn't really fit with having
two semantics for this.

The GLSL compiler emits 2 vec4s and two properties,
this makes draw use those instead of CULLDIST semantics.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 11:03:40 +10:00
Axel Davy 52cb8e33c3 gallium/util: Implement util_format_translate_3d
This is the equivalent of util_format_translate, but for volumes.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Brian Paul 5888c47cc9 cso: remove / add some comments
Signed-off-by: Brian Paul <brianp@vmware.com>
2016-05-17 19:20:36 -06:00
Jan Vesely 47b390fe45 Treewide: Remove Elements() macro
Signed-off-by: Jan Vesely <jano.vesely@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-17 15:28:04 -04:00
Jose Fonseca cf010de6ee vl/dri: Move the DRI3 check out of sources include into C.
Fixes SCons build.

Trivial.  Built locally with SCons and autotools.
2016-05-16 21:50:43 +01:00
Leo Liu c122c74dca vl/dri3: implement functions for get and set timestamp
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu 9f50a79b8f vl/dri3: handle PresentCompleteNotify event
and get timestamp calculated based on the event's reply

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu 8d7ac0a4e4 vl/dri3: implement DRI3 BufferFromPixmap
We also need render to the front buffer of temporary X pixmap,
this is the case of when we using opengl as video out for vaapi.
the basic implementation is to pass pixmap ID to X server, and
then X will return dma-buf fd, we will get the buffer object
through this dma-buf fd.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu 858b329c2c vl/dri3: add support for resizing
When drawable size changed, PresentConfigureNotify event will be
emitted, by handling the event to re-allocate resized buffer.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu 96580ad593 vl/dri3: implement funciton for get dirty area
This will clear presentation area not covered by video content

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu b0bd908284 vl/dri3: implement function for flush frontbuffer
Request drawable content in pixmap by calling DRI3 PresentPixmap,
and handle PresentIdleNotify event.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu e1223282db vl/dri3: add back buffers support
This implements DRI3 PixmapFromBuffer. Create buffer objects, and
associate it to a dma-buf fd, and then pass this fd with a pixmap
ID to X server for creating pixmap object; also add a function
for wait events.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu 69ba9be4d2 vl/dri3: implement flushing for queued events
also place holder for present events handling

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu 758b1bbaa7 vl/dri3: register present events
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu 672e8d5e7e vl/dri3: set drawable geometry
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu 12e5220e34 vl/dri3: add DRI3 support and implement create and destroy
Required functions into place for implementation, create screen
with device fd returned from X server, also bail out to DRI2
with certain conditions.

v2: -organize the error out path (Axel)
    -squash previous patch 1 and 2 into one (Emil)

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu bd9ae72459 vl/dri: fix close fd error out
fd should be set to -1 only if it got closed by pipe_loader_release.

Signed-off-by: Leo Liu <leo.liu@amd.com>
2016-05-12 18:26:48 -04:00
Tim Rowley 2785f2f2d7 swr: properly expose compressed format support
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-12 14:12:18 -05:00
Rob Clark 425dc4c4b3 gallium: refactor pipe_shader_state to support multiple IR's
The goal is to allow the pipe driver to request something other than
TGSI, but detect whether what is getting is TGSI vs what it requested.
The pipe drivers will always have to support TGSI (and convert that into
whatever it is that they prefer), but in some cases we should be able to
skip the TGSI intermediate step (such as glsl->nir vs glsl->tgsi->nir).

I think pipe_compute_state should get similar treatment.  Currently,
afaict, it has one user and one consumer, which has allowed it to be
sloppy wrt. supporting alternative IR's.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-11 12:20:11 -04:00
Roland Scheidegger 430797843a gallivm: improve dumping of bitcode
Use GALLIVM_DEBUG=dumpbc for dumping of modules as bitcode.
Instead of a fixed llvmpipe.bc name, use ir_<modulename>.bc so multiple
modules can be dumped (albeit it might still overwrite previous modules,
particularly the modules from draw tend to always have the same name).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-11 04:43:35 +02:00
Roland Scheidegger e4cf8717de gallivm: print declarations of intrinsics with GALLIVM_DEBUG=ir
Those aren't really interesting, however outputting them is helpful when
trying to feed the IR to llvm llc (or opt) for debugging.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-10 17:08:16 +02:00
Roland Scheidegger 5c200894c8 gallivm: use InternalLinkage instead of PrivateLinkage for texture functions
At least with MCJIT the disassembler will crash otherwise when trying to
disassemble such functions.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-10 17:08:16 +02:00
Roland Scheidegger 8b66e2647d gallivm: disable avx512 features
We don't target this yet, and some llvm versions incorrectly enable it based
on cpu string, causing crashes.
(Albeit this is a losing battle, it is pretty much guaranteed when the next
new feature comes along llvm will mistakenly enable it on some future cpu,
thus we would have to proactively disable all new features as llvm adds them.)

This should fix https://bugs.freedesktop.org/show_bug.cgi?id=94291 (untested)

Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com

CC: <mesa-stable@lists.freedesktop.org>
2016-05-10 17:08:16 +02:00
Tim Rowley b65f7ec450 gallium: enable intel jitevents profiling
LLVM when configured with "intel jitevents" enabled can inform
VTune about dynamic code, so individual shaders are attributed
profiling data and the resulting assembly can be examined.

Acked-by: Roland Scheidegger <sroland@vmware.com>
2016-05-09 11:25:02 -05:00
Nicolai Hähnle 62b7958cd0 gallium: fix various undefined left shifts into sign bit
Funnily enough, some of these were turned into a compile-time error by gcc
with -fsanitize=undefined ("initializer is not a constant").

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-07 16:46:59 -05:00
Brian Paul ef5a31fc06 gallium/util: change assertion to conditional in util_bitmask_destroy()
If we fail to create a context in the VMware driver we call this function
unconditionally to free a bunch of bit vectors.  Instead of asserting on
a null pointer, just no-op.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-03 15:40:49 -06:00
Brian Paul 68116dcd5a cso: null-out previously bound sampler states
If, for example, we previously had 2 sampler states bound and now we
are binding one, we'd leave the second sampler state unchanged.
This change nulls-out the second sampler state in this situation.
We're already doing the same thing for sampler views.

This silences an occasional warning issued by the VMware driver when
the number of sampler views and sampler states disagreed.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-03 15:40:49 -06:00
Jan Vesely ebbe31d57c gallium,utils: Fix trivial sign compare warnings
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-03 12:00:09 -04:00
WuZhen 798f7a8596 tgsi: initialize stack allocated struct
Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-01 12:31:29 +01:00
Emil Velikov 0700cdd5aa gallium/target-helpers: remove inline_wrapper_sw_helper.h
Unused as of commit dddedbec0e "{st,targets}/nine: use static/dynamic
pipe-loader"

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:37:25 +01:00
Nicolai Hähnle 59af21c3e9 tgsi/text: fix parsing of memory instructions
Properly handle Target and Format parameters when present.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:56 -05:00
Nicolai Hähnle 4055babc75 tgsi/text: add str_match_name_from_array
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:53 -05:00
Nicolai Hähnle a56edbdd8f tgsi/text: add str_match_format helper function
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:51 -05:00
Nicolai Hähnle acb65a23a3 tgsi/build: pass Memory.Texture and .Format through tgsi_build_full_instruction
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:32 -05:00
Nicolai Hähnle 318d305f6d tgsi/dump: signal nospace when the last print exceeded the size
Previously, there was a bug where nospace wasn't signalled if it just so
happened that the very last print exceeded the available space.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:28 -05:00
Nicolai Hähnle e08eaa5b72 tgsi/dump: shared dump_ctx initialization
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:21 -05:00
Brian Paul a609da60c0 gallium/util: s/Elements/ARRAY_SIZE/
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-28 09:04:24 -06:00
Brian Paul 23c55e5c23 tgsi: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul 419e386571 os: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul d902504a67 hud: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul e522a76226 gallivm: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul 489df4a71a draw: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Nicolai Hähnle 91fb4bb2e9 gallium/util: add u_bit_consecutive for generating a consecutive range of bits
There are some undefined behavior subtleties, so having a function to match
the u_bit_scan_consecutive_range makes sense.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Dave Airlie f78bcb7638 tgsi/exec: initialise SysSemanticToIndex array to -1
We want to use the SysSemanticToIndex to tell if we've seen
the semantics at all.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 09:00:46 +10:00
Dave Airlie fbea4e177f tgsi/exec: implement restartable machine.
This lets us restart the machine at a PC value, and exits
the machine when we hit a barrier.

Compute shaders will then execute all the threads up to the
barrier, then restart the machines after the barrier once
all are done.

v2: comment the code a bit, change return types.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 09:00:44 +10:00
Dave Airlie 8ffa3c58d4 tgsi/exec: make inputs/outputs optional for compute shaders.
compute shaders don't need input/outputs so don't bother
allocating memory for these.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 09:00:41 +10:00
Dave Airlie 16a9dc1e49 tgsi/exec: implement load/store/atomic on MEMORY.
This implements basic load/store/atomic ops on MEMORY types
for compute shaders.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 09:00:35 +10:00
Dave Airlie 354c5f2d0f tgsi/exec: split out setting up masks to separate function
This is just a cleanup that will make later changes easier
to make.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 08:56:22 +10:00
Dave Airlie 6cf36a7231 tgsi: accept a starting PC value for exec machine.
This will be used later to restart barriered execution
threads in compute, for now we just want to change the API.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 08:56:17 +10:00
Dave Airlie 912ed84f83 tgsi: move to using vector for system values.
For compute support some of the system values are .xyz types,
so move to using a vector instead of a single channel.

[airlied: squash swizzle fix from compute series].

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 08:26:53 +10:00
Dave Airlie 9013d9267c tgsi/exec: fix system value handling.
a) SysSemanticToIndex needs to be indexed with the semantic name
not the decl->Declaration.Semantic.

b) doing this in run is too late, as the mappings are all setup
prior to run in the execs.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 08:25:38 +10:00
Jose Fonseca dcc3baf733 gallium: Include intrin.h instead of defining ourselves.
More portable, particularly when building with Clang, which implements
all MSVC intrisincs in its own intrin.h, but doesn't actually support
`#pragma instrinsic`.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-26 17:17:00 +01:00
Dave Airlie e3e6859381 tgsi: pass a shader type to the machine create and clean up.
There was definitely bugs here mixing up the PIPE_ and TGSI_ defines,
hopefully they didn't cause any problems, since mostly it was special
cases for GEOMETRY.

This clarifies at shader machine create what type of shader this
machine will execute. This is needed also for compute shaders where
we don't want to allocate inputs/outputs.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-26 13:05:32 +10:00
Dave Airlie a6aae0c24d gallium/tgsi: move tgsi_exec.h header out of draw_context.h
It gets annoying that changing the tgsi exec rebuilds the state
tracker unnecessarily. Putting this include into draw_gs.h which
uses it causes a lot less rebuilds.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-26 13:00:57 +10:00
Roland Scheidegger bd07e20d20 gallivm: make sampling more robust against bogus coordinates
Some cases (especially these using fract for coord wrapping) did not handle
NaNs (or Infs) correctly - the following code assumed the fract result
could not be outside [0,1], but if the input is a NaN (or +-Inf) the fract
result was NaN - which then could produce out-of-bound offsets.

(Note that the explicit NaN behavior changes for min/max on x86 sse don't
result in actual changes in the generated jit code, but may on other
architectures. Found by looking through all the wrap functions.)

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94955

No piglit changes.

(v2: fix min/max typo in coord_mirror, add comment)

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>

Tested-by: Bruce Cherniak <bruce.cherniak@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-26 04:55:37 +02:00
Brian Paul 63df017fda util/blitter: use ARRAY_SIZE macro
And remove local definition of Elements() macro.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-25 12:59:29 -06:00
Brian Paul 1db8313168 gallium/util: initialize pipe_framebuffer_state to zeros
To silence a valgrind uninitialized memory warning.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94955
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-25 12:59:29 -06:00
Brian Paul 1e990978ee util/cache: add comments, fix formatting 2016-04-25 12:59:29 -06:00
Grazvydas Ignotas dc732a8ef2 gallium: use unreachable instead of asserts
Avoids warnings in release builds.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-25 12:23:34 +02:00
Grazvydas Ignotas cbb0d4ad75 gallium: fix warnings in release build
Mark variables MAYBE_UNUSED to avoid unused-but-set-variable warnings
in release build.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-25 12:23:21 +02:00
Marek Olšák c9e5a7df61 gallium: remove helpers converting to/from TGSI_PROCESSOR_*
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-22 01:30:39 +02:00
Marek Olšák af249a7da9 gallium: use PIPE_SHADER_* everywhere, remove TGSI_PROCESSOR_*
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-22 01:30:39 +02:00
Marek Olšák fb523cb6ad gallium: merge PIPE_SWIZZLE_* and UTIL_FORMAT_SWIZZLE_*
Use PIPE_SWIZZLE_* everywhere.
Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE.
The new enum is called pipe_swizzle.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-22 01:30:39 +02:00
Roland Scheidegger 4ff8cbb0d8 gallivm: fix bogus argument order to lp_build_sample_mipmap function
Screwed up since 0753b135f6.

(Only an issue with different min/mag filters, and then only in some cases,
which is probably why it went unnoticed for quite a while.
The effect should have simply been nearest mip filter instead of linear, iff
min was nearest, mag was linear, and all pixels hit the mignifying path.)

Fixes a bunch of dEQP failures.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-21 23:57:24 +02:00
Russell King fadfaa82c6 tgsi/lowering: improved lowering for LRP
Provide an improved lowering for LRP, which can be implemented in two
MAD instructions with a bit of rearranging of the equation, rather
than the literal implementation of two multiplies, an add and a
subtract.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-19 16:04:44 -04:00
Russell King 67da7dd98a tgsi/lowering: improved lowering for XPD
Improve XPD lowering to consume less instructions by using the
MAD instruction to perform the multiply and subtraction together.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-19 16:04:44 -04:00
Russell King 65460cf4c8 tgsi/lowering: add support for lowering TRUNC
Add support for lowering TRUNC using the following sequence:

	FRC tmpA, |src|
	SUB tmpA, |src|, tmpA
	CMP dst, -tmpA, tmpA

Note that this is incompatible with FRC lowering.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-19 16:04:44 -04:00
Russell King 23e870a888 tgsi/lowering: add support for lowering FLR and CEIL
Add support for lowering FLR and CEIL to FRC/SUB and FRC/ADD
instructions for GPUs that support FRC but not FLR or CEIL.  Since
these uses FRC, it is invalid to ask for FLR or CEIL to be lowered
along with FRC, so add an assert to catch this invalid configuration.

We also need to deal with FLR instructions emitted by the lowering
code.  Fix these up with the FRC+SUB equivalent when FLR lowering is
enabled.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-19 16:04:44 -04:00
Bas Nieuwenhuizen 0b6c463dac gallium/util: Add u_bit_scan_consecutive_range64.
For use by radeonsi.

v2: Make sure that it works for all 64 bits set.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-19 18:10:30 +02:00
Jose Fonseca 932b71f17d gallivm: Avoid llvm::sys::getProcessTriple().
Just use LLVM_HOST_TRIPLE, which is available at least from LLVM 3.3
onwards, and is pretty much what llvm::sys::getProcessTriple() does anyway,

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:37 +01:00
Jose Fonseca b5ca689cee gallivm: Remove lp_get_module_id.
Just keep a copy of the module_name in gallivm.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:26 +01:00
Jose Fonseca 969ba8bfa7 gallivm: Fix MCJIT with LLVM 3.3.
One needs to call setJITMemoryManager for LLVM 3.3, instead of
setMCJITMemoryManager.

This regressed in commits 065256df/75ad4fe7 when trying to make the
code to build with LLVM 3.6.

Tested MCJIT with LLVM 3.3 to 3.6.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:17 +01:00
Jose Fonseca cf4105740f gallivm: Make MCJIT a runtime option.
On the LLVM versions that support it, so we can easily switch between
MCJIT/old-jit for testing.

The new option is GALLIVM_MCJIT.

Unfortunately setting GALLIVM_MCJIT=1 for LLVM 3.3 or 3.4 causes
segfault, both on Linux and Windows.  I'm almost certain this used to
work, so there probably is a regression somewhere.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:14 +01:00
Jose Fonseca 2211f8d559 gallivm: Use LLVMSetTarget.
Instead of LLVM C++ interfaces.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:00 +01:00
Jose Fonseca 9aa23b11e4 gallivm: Use LLVMPrintValueToString where available.
And llvm::raw_string_ostream where not (LLVM 3.3).

Thereby eliminating yet another dependency on unstable LLVM interfaces.

As a bonus this also gets LLVM IR on OutputDebugMessageA on MSVC (which
was disabled, probably due to C++ issues.)

Tested `lp_test_arit -v -v` on LLVM 3.3, 3.4 and 3.8.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:28:37 +01:00
Jose Fonseca f6621cd3be gallium/tests: Update UTIL_FORMAT_MAX_* defines.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:28:16 +01:00
Dave Airlie 3a26ef23e7 gallivm: convert size query to using a set of parameters.
This isn't currently that easy to expand, so fix it up
before expanding it later to include dynamic samplers.

[airlied: use some local variables (Roland)]

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-19 07:33:39 +10:00
Marek Olšák 6dc21b1962 gallium/util: fix undefined shift to the last bit in u_bit_scan
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:24 +02:00
Marek Olšák 9434aa8103 gallium/util: fix u_bit_scan_consecutive_range for mask == 0xffffffff
The second ffs returns 0, yielding count == -1.

v2: change 1 to 1u

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-18 19:51:24 +02:00
Roland Scheidegger d11111a551 gallivm: don't use vector selects with llvm 3.7
llvm 3.7 sometimes simply miscompiles vector selects.
See https://bugs.freedesktop.org/show_bug.cgi?id=94972

This was fixed in llvm r249669
(https://llvm.org/bugs/show_bug.cgi?id=24532).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-18 00:23:34 +02:00
Tim Rowley ee72fec9cf gallium/swr: allow swr use as a swrast dri driver
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-15 14:21:50 -05:00
Jose Fonseca 9586468c03 gallivm: Workaround LLVM PR 27332.
The credit for finding and isolating this bug goes to Vinson and Roland.

The buggy LLVM versions were found by doing

  opt -instcombine llvm-pr27332.ll > /dev/null

where llvm-pr27332.ll is the IR from
https://llvm.org/bugs/show_bug.cgi?id=27332#c3

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-13 16:42:55 +01:00
Roland Scheidegger cb438d8b3e gallivm: use llvm.nearbyint instead of llvm.round.
We used to use sse roundps intrinsic directly, but switched to use the llvm
intrinsics for rounding with e4f01da15d.
However, llvm semantics follows standard math lib round function which is
specced to do roundNearestAwayFromZero but we really want roundNearestEven
(moreoever, using round generates atrocious code since the cpu can't do it
directly and it results in scalar calls to libm __roundf).
So, use llvm.nearbyint instead, which does exactly the right thing, and even
has the advantage of being available with llvm 3.3 too. (I've verified it
actually generates a roundps instruction with llvm 3.3.)

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94909

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-13 11:13:03 +01:00
Thomas Hindoe Paaboel Andersen b89708f95f tgsi: fix buffer overflow
Increase r to four channels as rgba is written to it
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-13 11:51:34 +10:00
Jose Fonseca b5105e67a8 gallium: Use STATIC_ASSERT whenever possible.
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-12 16:56:15 +01:00
Marek Olšák 0ba0933f48 winsys/amdgpu: add support for 64-bit buffer sizes
v2: fail in radeon_winsys_bo_create if size > 32 bits

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:47 +02:00
Marek Olšák 7e78b5ed38 pb_buffer: switch pb_buffer::size to 64 bits
being able to allocate more than 4 GB may be useful

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:47 +02:00
Marek Olšák e599b8f384 gallium: pause queries for all meta ops
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:46 +02:00
Dave Airlie afa8707ba9 softpipe: add SSBO/shader atomics support.
This adds support for the features requires for ARB_shader_storage_buffer_object
and ARB_shader_atomic_counters, ARB_shader_atomic_counter_ops.

[airlied: some cleanups applied]
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-12 14:16:13 +10:00
Dave Airlie c2aeeca455 draw: add support for passing buffers to vs/gs shaders.
Like the image code, but for shader buffers this time.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-12 14:15:36 +10:00
Dave Airlie 081a958bcd tgsi: add support for buffer/atomic operations to tgsi_exec.
This adds support for doing load/store/atomic operations on
buffer objects.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-12 14:15:33 +10:00
Dave Airlie 9c7a0d188a tgsi: set nonhelpermask for vertex shaders
For atomic operations we really need to avoid executing unnecessary shaders, so for some
tests that just draw a single point we only want one vertex to get processed not 4,

this fixes a number of the atomic counters tests.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-12 14:15:16 +10:00
Emil Velikov 594e868555 Part revert "gallium/auxiliary: don't build NIR sources with MSVC2008 flags"
This reverts commit 41c7912d04 but leaves
out the pragma [that inspired the original commit].

Building mesa requires MSVC2013 or later, thus we no longer need this.

v2: Use correct include path (src/glsl/nir -> src/compiler/nir)

Conflicts:
	src/gallium/auxiliary/Makefile.am

Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
2016-04-11 19:08:23 +01:00
Samuel Iglesias Gonsálvez 3663a2397e nir: add bit_size info to nir_load_const_instr_create()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:04 +02:00
Nicolai Hähnle 4bfcc86bf9 tgsi/scan: add an assert for the size of the samplers_declared bitfield
The literal 1 is a (signed) int, and shifting into the sign bit is undefined
in C, so change occurences of 1 to 1u.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-07 13:15:05 -05:00
Nicolai Hähnle cc39879989 draw/aaline: stronger guard against no free samplers (v2)
Line anti-aliasing will fail when there is no free sampler available. Make
the corresponding guard more robust in preparation of raising
PIPE_MAX_SAMPLERS to 32.

The literal 1 is a (signed) int, and shifting into the sign bit is undefined
in C, so change occurences of 1 to 1u.

v2: add an assert for bitfield size and use 1u << idx

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2016-04-07 13:15:05 -05:00
Nicolai Hähnle 040f5cb09e util/pstipple: stronger guard against no free samplers (v2)
When hasFixedUnit is false, polygon stippling will fail when there is no free
sampler available. Make the corresponding guard more robust in preparation
of raising PIPE_MAX_SAMPLERS to 32.

The literal 1 is a (signed) int, and shifting into the sign bit is undefined
in C, so change occurences of 1 to 1u.

v2: add an assert for bitfield size and use 1u << idx

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2016-04-07 13:15:02 -05:00
Edward O'Callaghan 0b7075fed7 gallium: Put no.of {samples,layers} into pipe_framebuffer_state
Here we store the number of samples and layers directly in the
pipe_framebuffer_state so that in the case of
ARB_framebuffer_no_attachment we may make use of them directly.

Further, we adjust various gallium/auxiliary helper functions
accordingly.

V2:
  Convert branches in util_framebuffer_get_num_layers() and
  util_framebuffer_get_num_samples() to their canonical form.

V3:
  'git stash pop' the typo fix of 'cbufs' which should be
  'nr_cbufs' that was missing in V2, woops! Thanks Marek for
  pointing this out yet again.

V4:
  Squash in the following patch:

  'gallium/util: Ensure util_framebuffer_get_num_samples() is valid'

   Upon context creation, internal driver structures are malloc()'ed
   and memset() to zero them. This results in a invalid number of
   samples 'by default'. Handle this in the simplest way to avoid
   elaborate and probably equally sub-optimial solutions.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-07 12:03:58 +10:00
Jose Fonseca 7ad49daca6 gallivm: Introduce lp_format_intrinsic.
For adding .v4f32 like suffixes to intrinsics, taking special care for
scalar case, which was being often neglected.

This fixes invalid IR when doing mipmap filtering on SSE2 (the only
case where we'd use intrinsics with scalars.)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-04 00:06:09 +01:00
Jose Fonseca a293f57e13 gallivm: Use llvm.fabs.
Exactly the same code.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 22:09:09 +01:00
Jose Fonseca e4f01da15d gallivm: Prefer backend agnostic intrinsic for rounding.
We could unconditionally use these instrinsics, but performance with SSE2
would suck, as LLVM falls back to calling libm.

lp_test_arit.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 22:09:07 +01:00
Jose Fonseca 324451e73f gallivm: Add debug option to force SSE2.
For simulating less capable machines.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 22:08:57 +01:00
Jose Fonseca b284f1f7f9 gallivm: Fix performance regressions due to vector selects.
LLVM often can't determine the mask elements are all ones/zeros, and
there doesn't seem to be a good way to hint that.

Thanks to Roland Scheidegger for spotting and analyzing the issue.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 09:51:27 +01:00
Jose Fonseca 11c4e5b45c gallivm: Remove lp_build_load_volatile.
No longer needed.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 09:51:27 +01:00
Jose Fonseca bcfb86b09d gallivm: Use standard LLVMSetAlignment from LLVM 3.4 onwards.
Only provide a fallback for LLVM 3.3.

One less dependency on LLVM C++ interface.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 09:51:27 +01:00
Brian Paul ef10b5427a tgsi: add simple tgsi_is_msaa_target() helper
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-02 08:05:20 -06:00
Bas Nieuwenhuizen 01f993a21f gallium: add threads per block TGSI property
The value 0 for unknown has been chosen to so that
drivers using tgsi_scan_shader do not need to detect
missing properties if they zero-initialize the struct.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-02 01:50:59 +02:00
Jose Fonseca f72de6f386 gallivm: Prevent disassembly debug output from being truncated.
By using os_log_message directly, as _debug_vprintf truncates messages
to 4K.

Also cleanup the disassemble interface.

Spotted by Roland.

Trivial.
2016-04-01 21:22:42 +01:00
Jose Fonseca cdf7c6b83d gallivm: Use vector selects on LLVM 3.3+.
This is an old patch I had around.

Vector selects seem to work well from LLVM 3.3.  Using them should
improve code quality, as it might make constant propagation pass more
effective.

Tested lp_test_*

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-01 09:05:19 +01:00
Samuel Pitoiset d22eca5f90 tgsi: silence compiler warning in fetch_sampler_unit()
The unit variable can be used uninitialized.

Fixes: 24e77cb09 ("tgsi: handle indirect sampler arrays. (v2)")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-01 07:16:24 +10:00
Samuel Pitoiset 05902a6686 tgsi: fix out of bounds access in exec_atomop()
The number of channels must be 4 for all RGBA components.

Fixes: 22d129601 ("tgsi: add support for image operations to tgsi_exec. (v2.1)")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-01 07:15:16 +10:00
Brian Paul 9076e04934 tgsi: split tgsi_util_get_texture_coord_dim() function into two
It was kind of overloaded, returning two different things.  Now get
the index of the shadow reference src register with a new
tgsi_util_get_shadow_ref_src_index() function.

To verify the new code, I added some temp/debug code which looped
over all TGSI_TEXTURE_x values, calling the old function and new and
checking that the returned indexes matched.

Also tested piglit "shadow" tests with softpipe/llvmpipe.
No testing of ilo and radeonsi changes.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:48:00 -06:00
Brian Paul 9d7cd43988 tgsi: skip texture query opcodes when examining texture targets
Should fix the assertion in piglit
spec@arb_gpu_shader5@texturegather@fs-r-none-shadow-2d when the
TXQ instruction specifies a 2D target but the sampler view was
declared as SHADOW2D.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-31 09:47:40 -06:00
Dave Airlie eb9ad9faa3 softpipe: add image support to softpipe (v3)
This adds support for ARB_shader_image_load_store to softpipe.

v2: add RESQ support (Ilia)
v3: constify, cleanup internals, add some comments (Brian).

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:14:16 +10:00
Dave Airlie 0d1f679ded draw: add support for passing images to vs/gs shaders.
This just adds support for passing through images to the
tgsi execution stage.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:14:11 +10:00
Dave Airlie 22d1296013 tgsi: add support for image operations to tgsi_exec. (v2.1)
This adds support for load/store/atomic operations on images
along with image tracking support.

v2: add RESQ support. (Ilia)
v2.1: constify interface (Brian)
split get_image_coord_dim (Brian)

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:14:05 +10:00
Dave Airlie 827393b76f tgsi: introduce NonHelperMask
This is a mask of which of the current 2x2 grid are non-helper
invocations. This allows us to mask off the helper invocations
later for the image operations.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:13:50 +10:00
Dave Airlie ca180c09bb tgsi_exec: handle execmask when doing indirect lookups
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:13:46 +10:00
Dave Airlie 1ff4cc0535 tgsi_exec: add support for up to 3 address registers (v2)
v2: be consistent with other definitions.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:13:08 +10:00
Roland Scheidegger 2d3b8aefda tgsi: (trivial) only verify target for is_tex instructions
d3d10 state tracker does not encode (valid) target (only offsets are
really used from the texture bits), since that information always comes
from the sview dcl, and not the instruction (note the meaning of target
is actually slightly different between gl and d3d10 in any case, because
d3d10 target does never include shadow bit).
Also move the msaa sampler identification as well - would need to set that
on the sview not sampler, so while this does not fix it make it at least
obvious it won't work with sample instructions.
2016-03-30 04:26:54 +02:00
Brian Paul 5c85c3be26 tgsi: simplify tgsi_shader_info::is_msaa_sampler checking
We assert that fullinst->Instruction.Texture != 0 above so no need to
check it in the conditional.  We also have the fullinst->Texture.Texture
value in a local variable, so use it.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-03-29 18:13:46 -06:00
Brian Paul 86e1768c13 tgsi: collect texture sampler target info in tgsi_scan_shader()
Texture sample instructions specify a sampler unit and texture target
such as "1D", "2D", "CUBE", etc.  Sampler view declarations also specify
the sampler unit and texture target.

This patch checks that the texture instructions agree with the declarations
and collects the texture target type for each sampler unit.

v2: only compare instruction's texture target to the sampler view declaration
target if the instruction is a TEX instruction, not a SAMPLE instruction.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-03-29 18:13:46 -06:00
Rovanion Luckey 7087e0ab27 gallium: Format code in pb_buffer_fenced.c according to style guide.
This is a tiny housekeeping patch which does the following:

  * Replaced tabs with three spaces.
  * Formatted oneline and multiline code comments. Some doxygen
    comments weren't marked as such and some code comments were marked
    as doxygen comments.
  * Spaces between if- and while-statements and their parenthesis.

According to the mesa coding style guidelines.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-29 13:44:11 -06:00