Commit Graph

32726 Commits

Author SHA1 Message Date
Roland Scheidegger 20c77ae639 gallium/util: remove some block alignment assertions
These assertions were revisited a couple of times in the past, and they
still weren't quite right.
The problem I was seeing (with some other state tracker) was a copy between
two 512x512 s3tc textures, but from mip level 0 to mip level 8. Therefore,
the destination has only size 2x2 (not a full block), so the box width/height
was only 2, causing the assertion to trigger for src alignment.
As far as I can tell, such a copy is completely legal, and because a correct
assertion would get ridiculously complicated just get rid of it for good.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-10-25 19:52:24 +02:00
Rob Clark 4aa69cc425 meson: build freedreno
Mostly copy/pasta from Dylan Baker's conversion of nouveau and i965.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-24 15:33:40 -04:00
Rob Clark 0ca8d53215 freedreno/ir3: use a flag instead of setting PYTHONPATH
Similar to 848da66222, pass an arg to
ir3_nir_trig.py to add to python path, rather than using $PYTHONPATH,
to prep for meson build support.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-24 15:33:40 -04:00
Rob Clark eed9685dd6 freedreno: per-context fd_pipe
To enable per-context priorities, we need to have per-context pipe's.
Unfortunately we still need to keep the global screen pipe, mostly just
for screen->get_timestamp().

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-24 12:56:51 -04:00
Rob Clark 9c32333a58 freedreno: rename pipe -> vsc_pipe
To add context priority support we need to have an fd_pipe per context,
rather than per-screen.  Which conflicts with existing ctx->pipe (which
is actually a visibility stream pipe (hw resource).  So just rename it.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-24 12:56:51 -04:00
Rob Clark 7e7096307a freedreno: pass context flags through to fd_context_init()
Prep work for later patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-24 12:56:51 -04:00
Brian Paul 7a6c6e73a8 gallium/util: use util_snprintf() in u_socket_connect()
Instead of plain snprintf().  To fix the MSVC build.

snprintf() is used in various places in Mesa/gallium, but apparently,
not in code built with MSVC.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-24 08:17:15 -06:00
Marek Olšák 2a414c3961 radeonsi: postponed KILL isn't postponed anymore, but maintains WQM
This restores performance for the drirc workaround, i.e.
KILL_IF does:
   visible = src0 >= 0;
   kill_flag &= visible; // accumulate kills
   amdgcn_kill(wqm_vote(visible)); // kill fully dead quads only

And all helper pixels are killed at the end of the shader:
   amdgcn_kill(kill_flag);

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
Marek Olšák da0083f123 radeonsi: use postponed KILL only when derivatives are used
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
Marek Olšák 1ff9e27cbd ac: replace ac_build_kill with ac_build_kill_if_false
This will be a new LLVM intrinsic and will also work nicely with
llvm.amdgcn.wqm.vote.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
Brian Paul 069211f205 gallium/util: don't call close() on Windows in u_tests.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-23 15:10:44 -06:00
Brian Paul 5134c0dedf mesa: use util_strdup() macro in u_debug_symbol.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-23 15:10:38 -06:00
Brian Paul 6230773936 gallium/util: replace gethostbyname() with getaddrinfo()
Compiling with MSVC options /we4995 /we4996 (a subset of /sdl) generates
a warning that the gethostbyname() function is deprecated in favor of
getaddrinfo() or GetAddrInfoW().  Replace the call with getaddrinfo().

Untested.  There are no callers to u_socket_connect() in Gallium.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-23 15:10:01 -06:00
Dylan Baker d4567efa5c meson: build imx driver
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-10-23 11:45:55 -07:00
Dylan Baker 51558a1d6c meson: build etnaviv driver + winsys
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-10-23 11:45:38 -07:00
Bas Nieuwenhuizen a548b727a1 ac/nir: Only clamp shadow reference on radeonsi.
Vulkan CTS does not expect the value to be clamped (at least for D32),
and it makes a differences even though depth is in [0,1], due
to strict inequalities.

I couldn't find anything in the Vulkan spec about this, but the test
seemed to be copied from GL tests and the GL spec only specifies
clamping for fixed point formats. Hence I expect radeonsi to run into
this at some point as well, but given that they still have a usecase
with the Z16->Z32 promotion, I'll leave that for someone else to clean
up.

This at least fixes radv dEQP-VK.texture.shadow.* on VI.

Fixes: 0f9e32519b 'ac/nir: clamp shadow texture comparison value on VI'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-23 09:13:38 +02:00
Stefan Schake e5fea0d621 broadcom/vc4: Fix aliasing issue
This was causing Android clang version 3.8.256229 to miscompile,
presumably due to strict aliasing.

Fixes: 14dc281c13 ("vc4: Enforce one-uniform-per-instruction after optimization.")
2017-10-20 17:09:35 -07:00
Andres Rodriguez 557de3b9ae radeonsi: hardcode shader WAVE_LIMIT to the maximum value
This is part of a cooperative scheduling approach used by radv. All
drivers in the stack must opt-in to resource arbitration, otherwise GL
based apps will be able to ignore system priorities.

We always hardcode the field to its maximum value, instead of attempting
to calculate an approximate usage. In testing, there were no benefits to
using anything other than the maximum.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:01:44 +02:00
Eric Anholt 0e6fee7328 broadcom/vc5: Fix pasteo that broke vertex texturing.
We weren't ever filling in the texture state record, so we'd dereference
NULL from the shader.
2017-10-20 15:59:41 -07:00
Eric Anholt 34690536a7 broadcom/vc5: Move default attribute value setup to the CSO and fix them.
I was generating some stub values to bring the driver up, but fill them in
properly now.  We now set 1.0 or 1u as appropriate, and thanks to being in
their own BO it fixes piglit failures on the 7268 (where our 4-byte
alignment was insufficient).

Fixes const-packHalf2x16.shader_test
2017-10-20 15:59:41 -07:00
Eric Anholt fb15168919 broadcom/vc5: Move most of the shader state attribute record to the CSO.
This should reduce our draw-time overhead, and puts the code where it
should go long term.
2017-10-20 15:53:55 -07:00
Eric Anholt f4ff8f74ee broadcom/vc5: Fix build failure frm nir_shader::stage removal.
Fixes: 59fb59ad54 ("nir: Get rid of nir_shader::stage")
2017-10-20 15:53:55 -07:00
Jason Ekstrand 59fb59ad54 nir: Get rid of nir_shader::stage
It's redundant with nir_shader::info::stage.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-20 12:49:17 -07:00
Christian Gmeiner 65ccee2dc2 etnaviv: fix implicit conversion warning
Galliums query_type used in APIs is unsigned.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-20 12:42:55 +02:00
Christian Gmeiner 57a586828f etnaviv: enable occlusion query if GPU supports it
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-20 12:42:48 +02:00
Christian Gmeiner 246243d447 etnaviv: add support for occlusion queries
Passes most occlusion query piglits. The following piglits are broken:
- spec@arb_occlusion_query@occlusion_query_meta_fragments
- spec@arb_occlusion_query@occlusion_query_meta_save
- spec@arb_occlusion_query2@render

v1 -> v2:
 - use one sample provider for all occlusion queries tyes
 - add comment about 'magic' value 0x1DF5E76

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-20 12:42:44 +02:00
Christian Gmeiner 282d8698ec etnaviv: add basic infrastructure for hw queries
No hardware query is supported yet.

v1 -> v2
 - removed query_type from strcut etna_hw_sample_provider

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-20 12:42:40 +02:00
Christian Gmeiner b8c335c91b etnaviv: update headers from rnndb
Update to etna_viv commit 6c9c706.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-20 12:42:35 +02:00
Chris Wilson 5c5618338a egl,dri: Propagate context priority hint to driver->CreateContext
Jump through the layers of abstraction between egl and dri in order to
feed the context priority attribute through to the backend. This
requires us to read the value from the base _egl_context, convert it to
a DRI attribute, parse it again in the generic context creator before
passing it to the driver as a function parameter.

In order to not require us to pass back the actual value of the context
priority after creation, we impose that drivers should report the
available set of priorities during screen setup (and then they may chose
to fail if given an invalid value as that should have been checked at
the user boundary.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net> # i915/i965
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-20 11:28:17 +01:00
George Kyriazis f9d239e11f swr: Rework scratch space allocation
Remove allocation of > 2kbyte buffers into context memory in
swr_copy_to_scatch_space() (which is used to copy small vertex/index buffers
and shader constants to a scratch space to be used by the upcoming draw.)

Large shader constant allocations need to be done in the circular scratch
buffer instead of context memory, because their values persist across
render calls.

Also lower SCRATCH_SINGLE_ALLOCATION_LIMIT to 8k, since allocations of larger
buffers will get too large for the circular scratch space.

Fixes render issues with CEI Ensight.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 20:18:09 -05:00
Dylan Baker a447f9fe7b meson: don't build gallium dri target if gallium is disabled
Otherwise -Dgallium-drivers= will cause libmesa_gallium to be built and
the megadriver install script to attempt to install drivers without any
actual drivers being built.

fixes: 66f97f6640 ("meson: build radeonsi")
Reported-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2017-10-19 15:17:34 -07:00
Tim Rowley bfda35c8dd swr: knob overrides for Intel Xeon Phi
Architecture benefits from having more threads/work outstanding.

Patch by Jan Zielinski.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley 028ffa5e18 swr/rast: Add api to override draws in flight
Allow draws in flight to be overridden via SWR_CREATECONTEXT_INFO.

Patch by Jan Zielinski.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley 2559f2b93e swr/rast: Widen fetch shader to SIMD16 (disabled for now)
Refactored the gather operation to process 16 elements at a time via
paired SIMD8 operations.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley 49090ccf54 swr/rast: Change DS memory allocation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley 04ea03d99d swr/rast: Fix indentation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley 62e2d657c8 swr/rast: Miscellaneous viewport array code changes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley ed1db803fa swr/rast: Minor changes for os-x
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley 33bdbc1db4 gallium: add more exceptions to tgsi_util_get_inst_usage_mask
A number of double/int64 operations don't have matching
read and write usage masks, which the fallthrough case of
tgsi_util_get_inst_usage_mask assumes for componentwise
tagged instructions.

No regressions in llvmpipe piglit; fixes a large number of
swr regressions.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-19 12:49:32 -05:00
Roland Scheidegger 77b8392858 tgsi: fix tgsi_util_get_inst_usage_mask
The logic for handling shadow coords was completely broken.
Fixes be3ab867bd.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103265

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-19 16:33:39 +02:00
Michel Dänzer 8c9e7c9638 st/osmesa: include u_inlines.h for pipe_resource_reference
Fixes build failure due to unresolved symbol.

Fixes: 7561da367b "st/mesa: Initialize textures array in
                     st_framebuffer_validate"

Trivial.
2017-10-18 18:44:58 +02:00
Michel Dänzer 7561da367b st/mesa: Initialize textures array in st_framebuffer_validate
And just reference pipe_resources to it in the validate callbacks.

Avoids pipe_resource leaks when st_framebuffer_validate ends up calling
the validate callback multiple times, e.g. when a window is resized.

v2:
* Use generic stable tag instead of Fixes: tag, since the problem could
  already happen before the commit referenced in v1 (Thomas Hellstrom)
* Use memset to initialize the array on the stack instead of allocating
  the array with os_calloc.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
2017-10-18 18:28:00 +02:00
Roland Scheidegger 3d0deed12a llvmpipe: handle shader sample mask output
This probably isn't all that useful for GL, but there are apis where
sample_mask is a valid output even without msaa.
Just discard the pixel if the sample_mask doesn't include the bit for
sample 0.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-10-18 18:16:44 +02:00
Eric Anholt 4f3e380fa0 meson: Add support for the vc5 driver.
v2: Default vc5 to off, since it requires the simulator currently.  Add
    missing dep on the XML generation from libbroadcom_vc5.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v1)
2017-10-17 13:41:59 -07:00
Eric Anholt 1918c9b162 meson: Add support for the pl111 driver.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-17 13:41:59 -07:00
Eric Anholt 1ae8018a6a meson: Add support for the vc4 driver.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-17 13:41:59 -07:00
Marek Olšák 2f4705afde radeonsi: if there's just const buffer 0, set it in place of CONST/SSBO pointer
SI_SGPR_CONST_AND_SHADER_BUFFERS now contains the pointer to const buffer 0
if there is no other buffer there.

Benefits:
- there is no constbuf descriptor upload and shader load

It's assumed that all constant addresses are within bounds. Non-constant
addresses are clamped against the last declared CONST variable.
This only works if the state tracker ensures the bound constant buffer
matches what the shader needs.

Once we get 32-bit pointers, we can only do this for user constant buffers
where the driver is in charge of the upload so that it can guarantee a 32-bit
address.

The real performance benefit might not be measurable.

These apps get 100% theoretical benefit in all shaders (except where noted):
- antichamber
- barman arkham origins
- borderlands 2
- borderlands pre-sequel
- brutal legend
- civilization BE
- CS:GO
- deadcore
- dota 2 -- most shaders
- europa universalis
- grid autosport -- most shaders
- left 4 dead 2
- legend of grimrock
- life is strange
- payday 2
- portal
- rocket league
- serious sam 3 bfe
- talos principle
- team fortress 2
- thea
- unigine heaven
- unigine valley -- also sanctuary and tropics
- wasteland 2
- xcom: enemy unknown & enemy within
- tesseract
- unity (engine)

Changed stats only:
    SGPRS: 2059998 -> 2086238 (1.27 %)
    VGPRS: 1626888 -> 1626904 (0.00 %)
    Spilled SGPRs: 7902 -> 7865 (-0.47 %)
    Code Size: 60924520 -> 60982660 (0.10 %) bytes
    Max Waves: 374539 -> 374526 (-0.00 %)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 854593b8eb ac: clean up ac_build_indexed_load function interfaces
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák cdb21dfffa radeonsi: handle 64-bit loads earlier in fetch_constant
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák ee0e1a47ce radeonsi: add si_descriptors::gpu_address and remove buffer_offset
This allows us to change the pointer arbitrarily.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 6d2664880c radeonsi: unify code for extracting a buffer address from a descriptor
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 8d2685d129 radeonsi: remove atom parameter from si_upload_descriptors
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 4ddce1b1a4 radeonsi: pack si_descriptors better again
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 859eeffb3d radeonsi: emit dirty consecutive pointers in one SET_SH_REG packet
IB size: -1.6%

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 36626ffe46 radeonsi: split si_emit_shader_pointer
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 69325fa88d radeonsi: generalize the SI_VS_SHADER_POINTER_MASK macro
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 79c2e7388c radeonsi/gfx9: use SPI_SHADER_USER_DATA_COMMON
IB size: -0.4%

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák b762a08896 radeonsi/gfx9: move RW_BUFFERS from s[0:1] to s[8:9] for HS and GS
Let's use the same user data SGPRs in all stages.
(for SPI_SHADER_USER_DATA_COMMON_0)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 0aafedbbb2 radeonsi: add GFX-IB-size query to the HUD
It shows the sum of all IBs per frame.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 4d944c72b1 winsys/amdgpu: disable CPU caching for GFX & SDMA IBs
This should decrease IB fetch latency.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák 49f5ce39c1 winsys/amdgpu: don't do read-modify-write on command buffers
i.e. don't use |=

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Eric Anholt cde209960c broadcom/vc4: Fix false-positive for the tiling ioctls on simulator mode.
If there happened to be an ENOENT laying around, we would try using the
ioctls later and fail out resource allocation.
2017-10-17 12:35:16 -07:00
Eric Anholt b202f90f65 broadcom/vc4: Skip BO labeling when in simulator mode.
It was calling down into i915 trying to label the BO, which is definitely
not the right thing.
2017-10-17 12:35:16 -07:00
Eric Anholt d623a34ab2 broadcom/vc5: Don't forget to set the RT format for 1555 textures.
Fixes dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb5_a1
2017-10-17 12:35:16 -07:00
Mark Thompson 31fb7bbe0b st/va: Return correct width and height for encode/decode support
Previously this would return the largest possible buffer size, which is
much larger than the codecs themselves support.  This caused confusion
when client applications attempted to decode 8K video thinking it was
supported when it isn't.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-10-17 08:23:55 -04:00
Mark Thompson ba28c1c9f7 st/va: Fix config entrypoint handling
Consistently use it as a PIPE_VIDEO_ENTRYPOINT.

v2: Return an error if the entrypoint is not set (Christian).

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-10-17 08:23:55 -04:00
Mark Thompson b6f41e393e st/va: Disable vaExportSurfaceHandle()
This is not in libva 2.0, so it shouldn't be enabled yet.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Acked-by: Christian König <christian.koenig@amd.com>
2017-10-17 08:23:55 -04:00
Dylan Baker 6a9ad20b7c meson: build llvmpipe
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker de24d61765 meson: build softpipe
This doesn't include llvmpipe.

v2: - Fix inconsistent use of with_gallium_swrast and
      with_gallium_softpipe.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker 813b4b09f9 meson: build nouveau (gallium) driver
Tested with a GK107.

v2: - Add target for nouveau standalone compiler. This target is not
      built by default.
v3: - Add nouveau to list of drivers built by default

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric at anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker b154b44ae3 meson: build radeonsi gallium driver
This hooks up the bits necessary to build gallium dri drivers, with
radeonSI as the first example driver. This isn't tested yet.

v4: - drop radeonsi generated header from sources.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric at anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker 66c94b9313 meson: build gallium winsys for dri, null, and wrapper
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric at anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker 66f97f6640 meson: build radeonsi
This builds the radeonsi (and radeon) window system bits and gallium
driver bits.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric at anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker f3d03a2cf7 meson: Build gallium dri state tracker
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric at anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker 4d701ee969 meson: build gallium helper drivers
This builds ddebug, noop, rbug, and trace drivers.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric at anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker d451a11b21 meson: Build gallium pipe-loader
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-10-16 16:32:43 -07:00
Dylan Baker b1b65397d0 meson: Build gallium auxiliary
v2: - guard gallivm files with "with_llvm" instead of "dep_llvm.found()"

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
2017-10-16 16:32:43 -07:00
Eric Engestrom b05820621d svga: format the version string like the rest of mesa
All 4 other version strings do it like this.
((Also, double parentheses just look confusing))

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-10-16 18:52:41 +01:00
Brian Paul 4542a63254 svga: fix format_conversion_table breakage
The new A1B5G5R5_UNORM, X1B5G5R5_UNORM formats were added in the
wrong place in commit ef874ee450.

Fixes: ef874ee450 "gallium: Add support for 5551 with the 1-bit field in the low bit."

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-16 10:58:02 -06:00
Ilia Mirkin 790b5c4a38 a2xx: add support for a few 16-bit color rendering formats
The rest should be possible too, just needs some additional
investigation. Passes fbo-*-formats piglit tests.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-15 12:09:21 -04:00
Wladimir J. van der Laan d3af7f5153 freedreno/a20x: Enable rendering to RGBA/RGBX
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-15 12:09:14 -04:00
Wladimir J. van der Laan c10eeb454d freedreno/a20x: Fix rendering to BGRX
Make sure that BGRX rendering is swapped the correct way around.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-15 12:09:03 -04:00
Lucas Stach 4daee6733f etnaviv: rework TS enable to be a derived state
Draw operations should not use the TS if the TS buffer content is invalid,
as this leads to wrong rendering or even GPU hangs. As the TS valid status
can change between draws (clear operations changing it to valid, blits using
the RS to the color or ZS buffer changing it to invalid), the TS_MEM_CONFIG
must be updated before each draw if the status has changed.

This fixes the remaining TS related piglit failures (regressions of a
standard run against a piglit run with TS completely disabled).

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-10-14 16:40:08 +02:00
Lucas Stach 34360ac6ed etnaviv: skip unused vertex attributes when assigning VS inputs
When not all of the vertex attributes are actually used in the shader,
we end up with some inputs without an assigned reg. Those are marked
as invalid and must be skipped when assigning the inputs, as those would
overwrite other valid inputs otherwise.

Fixes piglit drawpixels and a bunch of other tests using the st_draw path.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-10-14 16:39:46 +02:00
Mark Thompson e7f24859ca st/dri: Add definitions to allow importing 16-bit surfaces
Necessary to support P010/P016 surfaces for video.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Acked-by: Leo Liu <leo.liu@amd.com>
2017-10-13 08:11:47 -04:00
Eric Anholt dbf9e4fbf8 broadcom/vc5: Remove the u_resource_vtbl usage.
Like for vc4, this was just a wasted indirection.
2017-10-12 12:44:27 -07:00
Marek Olšák f536f45250 radeonsi: implement sync_file import/export
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-12 21:07:48 +02:00
Marek Olšák 162502370c winsys/amdgpu: implement sync_file import/export
syncobj is used internally for interactions with command submission.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-12 21:07:41 +02:00
Marek Olšák 11adea4b24 ac: add radeon_info::has_sync_file
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-12 21:04:56 +02:00
Marek Olšák 255573996c st/dri: implement __DRIimageExtension::validateUsage properly
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-12 19:03:33 +02:00
Marek Olšák 07fdc0a09c gallium: add pipe_screen::check_resource_capability
This is optional (and no CAP).

Implemented by radeonsi, ddebug, rbug, trace.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-12 19:03:33 +02:00
Wladimir J. van der Laan 78ade65956 etnaviv: Do GC3000 resolve-in-place when possible
If an RS blit is done with source exactly the same as destination, and
the hardware supports this, do an in-place resolve. This only fills in
tiles that have not been rendered to using information from the TS.

This is the same as the blob does and potentially saves significant
bandwidth when doing i.MX6qp scanout using PRE, and when rendering to
textures (though here using sampler TS would be even better).

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-10-12 16:03:26 +02:00
Nicolai Hähnle bc2d874101 radeonsi: add support for PIPE_FORMAT_{X1,A1}R5G5B5_UNORM
Fixes dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-12 08:42:55 +02:00
Nicolai Hähnle 9f55da130e gallium: add tests for PIPE_FORMAT_{X1,A1}B5G5R5_UNORM formats
This is a left-over from my version of adding the new format
after rebasing on Eric's version.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-12 08:42:55 +02:00
Tim Rowley e484805352 swr: simd16 shaders work in progress
Start building vertex shaders as simd16.

Disabled by default, set USE_SIMD16_SHADERS in knobs.h to experiment.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-11 14:35:23 -05:00
Tim Rowley 9cad9cbaf8 gallium: allow 512-bit vectors
Increase the max allowed vector size from 256 to 512.

No piglit llvmpipe regressions running on avx2.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-10-11 14:34:31 -05:00
Ilia Mirkin b20bccbcac nv50,nvc0: fix push hint logic in presence of a start offset
Previously buffer offsets were passed in explicitly as an offset, which
had to be added to the resource address. Now they are passed in via an
increased 'start' parameter. As a result, we were double-adding the
start offset in this kind of situation.

This condition was triggered by piglit's draw-elements test which has a
requisite glMultiDrawElements in combination with a small enough number
of vertices to go through the immediate push path.

Fixes: 330d0607ed ("gallium: remove pipe_index_buffer and set_index_buffer")
Reported-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-11 08:18:16 -04:00
Rob Herring e5e93c727f Android: fix build break from r600/radeon split
Commit 06bfb2d28f ("r600: fork and import gallium/radeon") broke the
Android build:

external/mesa3d/src/gallium/drivers/radeon/r600_pipe_common.c:43:10: fatal error: 'llvm-c/TargetMachine.h' file not found
         ^~~~~~~~~~~~~~~~~~~~~~~~

Update the Android makefiles so that drivers/radeon is only built when
radeonsi (and therefore LLVM) is enabled.

Fixes: 06bfb2d28f (r600: fork and import gallium/radeon)
Acked-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-10-10 21:37:19 -05:00
Dave Airlie 96e85709df r600: cleanup llvm ir target selection.
Only r600 target used now for compute IR.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 07:40:04 +10:00
Dave Airlie ce0ee31890 r600: drop tc_L2_dirty bit, this was SI only.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 07:39:32 +10:00
Dave Airlie 80bbdb1483 radeonsi: lower ffma in nir to mad.
This lowers ffma to a * b + c.

This seems like it should keep Marek happiest, so
we'd never get to the fma instruction emission code.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 07:33:32 +10:00
Eric Anholt 2687183a34 broadcom/vc5: Fix handling of 5551 textures using the new gallium format.
Like vc4, we have the alpha in the low bit.  Fixes a bunch of piglit
texwrap failures.
2017-10-10 11:42:06 -07:00
Eric Anholt f4b5158874 broadcom/vc5: Set the RCL's MSAA mode to match the BCL's MSAA state. 2017-10-10 11:42:06 -07:00
Eric Anholt ae9a56db6a braodcom/vc5: Set up clear color for higher-bpp formats.
Fixes arb_color_buffer_float-clear
2017-10-10 11:42:06 -07:00
Eric Anholt c0561808c0 broadcom/vc5: Set up per-MRT clear colors.
Fixes fbo-mrt-alphatest.
2017-10-10 11:42:06 -07:00
Eric Anholt 5208d2889e broadcom/vc5: Fix blendfactor zero handling.
I cut the line out to move it up to the top, when putting "0" in the
switch made the compiler complain that that wasn't a valid enum.
2017-10-10 11:42:06 -07:00
Eric Anholt 4b7de2a360 broadcom/vc5: Add support for f32 render targets.
The TLB write code is getting ugly and needs a refactoring (that will
hopefully handle TLBU uniform coalescing as well).
2017-10-10 11:42:06 -07:00
Eric Anholt f2e6e1bbc3 broadcom/vc5: Fix color masks for non-independent blending.
This gets fbo-mrt-alphatest working except for the second RT's clear color.
2017-10-10 11:42:06 -07:00
Eric Anholt 476db7e66b broadcom/vc5: Make the BCL's number of render targets setup match the RCL. 2017-10-10 11:42:06 -07:00
Eric Anholt 8b4c00a7b2 braodcom/vc5: Fix tile size setup for MRTs.
We need to divide the TLB in two for the 2nd color buffer, and again if
the 3rd or 4th are present.
2017-10-10 11:42:06 -07:00
Eric Anholt dc25a83a7a broadcom/vc5: Start hooking up multiple render targets support.
We now emit as many TLB color writes as there are color buffers.
2017-10-10 11:42:05 -07:00
Eric Anholt f0ee7d6ba8 broadcom/vc5: Add support for GL_EXT_provoking_vertex.
The bit was missing from the spec, but it's there in the simulator.  Fixes
the piglit clipflat test.
2017-10-10 11:42:05 -07:00
Eric Anholt f4133865d1 braodcom/vc5: Find the actual first TF output for our TF spec.
This doesn't yet support PSIZ, but gets us at least some of TF working.
2017-10-10 11:42:05 -07:00
Eric Anholt bd94f6821e broadcom/vc5: Fix translation of transform feedback's output_register field.
It's a NIR driver_location, not a slot offset.
2017-10-10 11:42:05 -07:00
Eric Anholt d8bc9c71df broadcom/vc5: Mark our primitives as needing TF processing.
The TF enable state appears to stick around until the next TF enable
packet is sent, so we only want to request TF when the shader is using it.
2017-10-10 11:42:05 -07:00
Eric Anholt 28105560f7 broadcom/vc5: Fix setup of TF dword output count.
I missed the "- 1" when reading the spec.
2017-10-10 11:42:05 -07:00
Eric Anholt 3ac8a2a4ba broadcom/vc5: Fix up a comment from vc4 about the predraw texture setup. 2017-10-10 11:42:05 -07:00
Eric Anholt ec5af12b5d broadcom/vc5: Flush the job when mapping a transform feedback buffer.
We will want something fancier for reusing a TF output within the same
frame, but we at least need this in order for piglit tests to work.
2017-10-10 11:42:05 -07:00
Eric Anholt 361c5f28bd broadcom/vc5: Fix handling of interp qualifiers on builtin color inputs.
The interpolation qualifier, if specified, is supposed to take precedence
over glShadeModel().
2017-10-10 11:42:05 -07:00
Eric Anholt d0dfc4bd5f broadcom/vc5: Fix CLIF dumping of lists that aren't capped by a HALT.
The HW will halt when you hit a HALT packet, or when you hit the end
address.  Tell CLIF if there's an end address is so that it can stop
correctly.  (There was usually a 0 byte after the CL, so it would stop
anyway).
2017-10-10 11:42:05 -07:00
Eric Anholt 7f3b890697 broadcom/vc5: Fix depth and stencil clear values.
I had misread the packet description: We always have a 32f depth, and a
separate u8 stencil.
2017-10-10 11:42:05 -07:00
Eric Anholt be11251e3c broadcom/vc5: Add missing Z16 format.
We can render to and sample from it just fine.
2017-10-10 11:42:05 -07:00
Eric Anholt e20c82c550 braodcom/vc5: Fix incorrect early Z writes in discard shaders.
Fixes glsl-fs-discard-02.
2017-10-10 11:42:05 -07:00
Eric Anholt c25de31824 broadcom/vc5: Add proper support for base_vertex and base_instance.
I had base_vertex hacked into the shader state setup like in vc4, but it's
not correct for big offsets.  Using the proper packet is easier and
hopefully means we can re-emit shader state setup less frequently.
2017-10-10 11:42:05 -07:00
Eric Anholt 24c8bbbb75 broadcom/vc5: Use supertiles and generic tile lists.
This massively reduces the size of our RCL setup.  It also gets us closer
to supporting multicore platforms.
2017-10-10 11:42:05 -07:00
Eric Anholt 45bb8f2957 broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.
V3D 3.3 is a continuation of the 3D implementation in VC4 (v2.1 and v2.6).
V3D 3.3 introduces an MMU (no more CMA allocations) and support for
GLES3.1.  This driver is not currently conformant, though that will be a
target as soon as possible.

V3D 3.x parts use a new texture tiling layout common across many Broadcom
graphics parts including and the HVS scanout engine.  It also massively
changes the QPU instructions, introducing a common physical register file
(no more A/B split) and half-float instructions, while removing the 4x8
unorm instructions in favor of half-float for talking to fixed function
interfaces.  Because so much has changed, vc5 is implemented in a separate
gallium driver, using only the XML code-generation support from vc4.

v2: Fix tile layout for 64bpp textures.  Fix texture swizzling for 32-bit
    returns.  Fix up a bit of MRT setup.  Sync the simulator to kernel
    behavior a bit more.  Improve uniform debugging code.  Rebase on
    QIR->VIR rename.  Move texture state mostly to the CSOs.  Improve
    cache flushing on the simulator.  Fix program deletion
    use-after-frees.

Acked-by: Dave Airlie <airlied@gmail.com> (uabi plan)
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> (uabi plan)
2017-10-10 11:42:04 -07:00
Eric Anholt c34295b1a3 nir: Move vc4's alpha test lowering to core NIR.
I've been doing this inside of vc4, but vc5 wants it as well and it may be
useful for other drivers (Intel has a related path for pre-gen6 with MRT,
and freedreno had a TGSI path for it at one point).

This required defining a common enum for the standard comparison
functions, but other lowering passes are likely to also want that enum.

v2: Add to meson.build as well.

Acked-by: Rob Clark <robdclark@gmail.com>
2017-10-10 11:42:04 -07:00
Eric Anholt 087b39a346 broadcom/vc4: Expose PIPE_CAP_TILE_RASTER_ORDER
Because vc4 can control the order that tiles are rasterized in, we can use
it to implement overlapping blits using normal drawing and
GL_ARB_texture_barrier, as long as we can tell the kernel what order to
render the tiles in.

v2: Fix on the simulator.
v3: Add the cap (disabled) to other drivers, add rst docs for the cap.
v4: Rebase on PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS
v5: Split from the core gallium commit, drop some unnecessary code related
    to glBlitFramebuffer(), fix a crash with clears before state has been
    bound.
2017-10-10 10:45:22 -07:00
Eric Anholt ac0051a507 gallium: Create a new PIPE_CAP_TILE_RASTER_ORDER for vc4.
Because vc4 can control the order that tiles are rasterized in, we can use
it to implement overlapping blits using normal drawing and
GL_ARB_texture_barrier, as long as we can tell the kernel what order to
render the tiles in.

This commit introduces the core gallium support, vc4 changes will follow.

v2: Fix on the simulator.
v3: Add the cap (disabled) to other drivers, add rst docs for the cap.
v4: Rebase on PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS
v5: Drop vc4 changes from this commit, for clarity.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3)
2017-10-10 10:45:22 -07:00
Eric Anholt 4aa700e0e0 broadcom/vc4: Implement GL_ARB_texture_barrier.
Improves x11perf -copywinwin100 from ~2000/sec to ~4700/sec.  More
importantly, this is a prerequisite for the new GL_MESA_tile_raster_order
extension.
2017-10-10 10:45:22 -07:00
Eric Anholt cbb532429b vc4: Add support for 5551 textures.
This keeps us from promoting them up to 8888, at the cost of not being
color-renderable.
2017-10-10 09:31:29 -07:00
Eric Anholt ef874ee450 gallium: Add support for 5551 with the 1-bit field in the low bit.
This is how VC4 stores 5551 textures, which we need to support for
GL_OES_required_internalformat.

v2: Extend commit message, fix svga driver build, add BE ordering from
    Roland.
v3: Rebase on PIPE_FORMAT_R10G10B10X2_UNORM addition.

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v2)
2017-10-10 09:31:29 -07:00
Nicolai Hähnle fbcae1897b st_api: remove unused get_resource_for_egl_image
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-10 13:58:48 +02:00
Nicolai Hähnle e14fe41e0b st/dri: implement createImageFromRenderbuffer(2)
Tested with dEQP-EGL.functional.image.*renderbuffer* tests.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-10 13:58:48 +02:00
Nicolai Hähnle 1b592d30c5 u_threaded_context: fix a memory leak
The uploaders can own transfers which need to be unmapped. Destroy them
before the final sync (they're not used from the driver thread anyway)
so that the transfer_unmap call is processed by the driver.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-10 13:58:46 +02:00
Lucas Stach ca949e00d8 etnaviv: update HW headers and fix provoking vertex
Now that the real meaning of the 2 bits in PA_SYSTEM_MODE is known,
we can set them according to the rasterizer state, which fixes uses
that are setting provoking vertex first.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-10-10 12:32:24 +02:00
Lucas Stach 0ab59f120b etnaviv: remove flat shading workaround
It turned out not to be a hardware bug, but the shader compiler
emitting wrong varying component use information. With that fixed
we can turn flat shading back on.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10 12:30:34 +02:00
Lucas Stach cedab87e76 etnaviv: fix varying interpolation
It seems that newer cores don't use the PA_ATTRIBUTES to decide if the
varying should bypass the flat shading, but derive this from the component
use. This fixes flat shading on GC880+.

VARYING_COMPONENT_USE_POINTCOORD is a bit of a misnomer now, as it isn't
only used for pointcoords, but missing a better name I left it as-is.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10 12:29:35 +02:00
Lucas Stach 03b1f8ba20 etnaviv: fix bogus flush requests in transfer handling
The logic to decide if we need to flush the GPU command stream was broken
and hard to reason about. Fix and clarify this.

Fixes the data sync subtests from piglit arb_vertex_buffer_object.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10 12:29:03 +02:00
Ilia Mirkin ce6da2a026 nv50/ir: fix 64-bit integer shifts
TGSI was adjusted to always pass in 64-bit integers but nouveau was left
with the old semantics. Update to the new thing.

Fixes: d10fbe5159 (st/glsl_to_tgsi: fix 64-bit integer bit shifts)
Reported-by: Karol Herbst <karolherbst@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-09 20:42:59 -04:00
Christian Gmeiner 148604fe75 etnaviv: call util_query_clear_result(..) in the generic layer
Saves us from calling util_query_clear_result(..) in every query
type implementation.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-09 22:19:47 +02:00
Christian Gmeiner b22bacc6cf etnaviv: push query active handling into generic layer
We want the same active handling for every query type. So lets
handle it in the generic layer.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-09 22:19:31 +02:00
Dave Airlie bee61d16c8 r600: drop a bunch of post-cayman code. (v2)
Now that Marek has split the two drivers apart, drop a bunch
of unnecessary code from the r600 half. There is probably a bunch
more hiding in the video code.

No piglit regressions on caicos.

v2: fix HAVE_LLVM protected code
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-10 06:08:42 +10:00
Marek Olšák 7b697c8b78 amd: move r600d_common.h into r600g
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:27:06 +02:00
Marek Olšák 76997e9133 radeonsi: shrink r600d_common.h and stop using it
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:27:05 +02:00
Marek Olšák 0ecf9b90ef radeonsi: import cayman_msaa.c from drivers/radeon
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:27:04 +02:00
Marek Olšák 345f04ed92 radeonsi: remove r600_emit_reloc
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:27:02 +02:00
Marek Olšák da61946cb1 radeonsi: merge si_set_streamout_targets with si_common_set_streamout_targets
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:27:00 +02:00
Marek Olšák a86c9328ce radeonsi: add si_so_target_reference
The src type is different on purpose.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:26:58 +02:00
Marek Olšák 65f2e33500 radeonsi: import r600_streamout from drivers/radeon
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:26:55 +02:00
Marek Olšák ed7f27ded8 radeonsi: add performance thresholds for CP DMA, decrease it for clears
The first one isn't used yet.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:24:21 +02:00
Marek Olšák 8e969cce38 radeonsi: disable primitive binning on Vega10 (v2)
Our driver implementation is known to decrease performance for some tests,
but we don't know if any apps and benchmarks (e.g. those tested by Phoronix)
are affected. This disables the feature just to be safe.

Set this to enable partial primitive binning:
    R600_DEBUG=dpbb
Set this to enable full primitive binning:
    R600_DEBUG=dpbb,dfsm

v2: add new debug options

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:20:18 +02:00
Marek Olšák 3784ce9782 radeonsi: enumerize DBG flags
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:20:16 +02:00
Marek Olšák 5a47abb63e radeonsi: don't change viewport for blits, use window-space positions
The viewport state was an identity anyway.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák 76ef08f6ee radeonsi: set correct PA_SC_VPORT_ZMIN/ZMAX when viewport is disabled
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák 13b6c1c031 radeonsi: minor cleanup of si_update_vs_writes_viewport_index
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák 5f566faa46 radeonsi: don't save and restore vertex buffers and elements for u_blitter
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák 69ccb9dae7 radeonsi: use new VS blit shaders (VS inputs in SGPRs)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák 6a8401a94e radeonsi: add VS blit shader creation
no users yet

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák f3fe6afba8 radeonsi: split declare_default_desc_pointers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák 0a3b5a0232 gallium/u_blitter: let drivers decide which VS to use for draw_rectangle
This approach allows drivers to set their own vertex shader and skip
compilation of u_blitter vertex shaders.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák a46bcf0a77 gallium/u_blitter: let drivers set the vertex elements state
radeonsi won't set it.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák 7f8af4624d gallium/u_blitter: remove blitter_context_priv::viewport
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák f84a63bc00 radeonsi: don't use util_draw_arrays_instanced in si_draw_rectangle
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák 387590accb radeonsi: move si_draw_rectangle into si_state_draw.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák de810f8b84 radeonsi: remove wrappers si_decompress_xx_textures
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák efd72b31cb gallium/radeon: remove r600_atom::num_dw
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák f1eb9a9c27 gallium/radeon: remove old r600g code checking chip_class and family
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Mark Thompson c4ed39f85b st/va: Implement vaExportSurfaceHandle()
This is a new interface in libva2 to support wider use-cases of passing
surfaces to external APIs.  In particular, this allows export of NV12 and
P010 surfaces.

v2: Convert surfaces to progressive before exporting them (Christian).

v3: Set destination rectangle to match source when converting (Leo).
    Add guards to allow building with libva1.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-and-Tested-by: Leo Liu <leo.liu@amd.com>
2017-10-07 10:15:14 -04:00
Roland Scheidegger 52b73caaf4 gallivm: don't use pabs intrinsic with llvm version >= 6
The intrinsic is gone, causing shader compilation to crash.
While here, also change the fallback code to match what llvm's auto-updater
of these intrinsics would do (except that there will still be zext/trunc
instructions in there), which should ensure that the sequence gets recognized
and fused back into a pabs in the end (I didn't test this, and it's possible
even the old sequence would get recognized, but I don't see a reason why we
shouldn't use the same sequence in any case).

Tested-by: Vinson Lee <vlee@freedesktop.org>
2017-10-07 00:54:09 +02:00
Tim Rowley 9716c69e22 swr/rast: use proper alignment for debug transposedPrims
Causing a crash in ParaView waveletcontour.py test when
_DEBUG defined due to vector aligned copy with unaligned
address.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-06 13:27:39 -05:00
Marek Olšák c4d1a199f8 radeonsi: add a drirc workaround for HTILE corruption in ARK: Survival Evolved
v2: use DB_META | PS_PARTIAL_FLUSH

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102955
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
2017-10-06 02:56:11 +02:00
Marek Olšák 15d918e46f radeonsi: inline struct si_sampler_views
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák 23cdde5138 radeonsi: rename si_textures_info -> si_samplers, si_images_info -> si_images
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák 3dfb375446 radeonsi: fold needs_*_decompress_mask update into si_set_sampler_view
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák bd5509d0a8 radeonsi: simplify a loop in si_update_fb_dirtiness_after_rendering
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák cceb916456 radeonsi: use f32_0 and f32_1
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák 1516059ab1 radeonsi: fold *gallivm
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák e1b83c67da radeonsi: lp_type::length is always 1
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák 906ee3a3ba radeonsi: don't use bld.elem_type
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák 723a23905f radeonsi: don't use lp_build_const_*
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák b4600b4740 radeonsi: use ctx->ac.context and ctx->types
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák d0751f6c1f radeonsi: use ctx->ac.builder
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák 82dc72c8bd radeonsi: use ctx->i/f32 types more
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák dcbd3d470c radeonsi: use i32_0 and i32_1 more
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák bacdf5a928 radeonsi: use bitcast in a few places
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák ad7305aa96 radeonsi: use ac helpers for bitcasts
Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák dbe16d7537 radeonsi: implement PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák 44993bd26f radeonsi: use si_get_indirect_index for TEMP indexing
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák e986a16c16 radeonsi: use si_get_indirect_index for CONST indexing
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák 212c612a63 tgsi/ureg: allow any register file in address operands
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák 41b85158ab gallium: add PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák cb686a340f tgsi/scan: scan address operands (v2)
v2: set swizzled usage mask

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák 37714c6df2 tgsi/scan: set correct usage mask for tex offsets in scan_src_operand
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák 5cc779197c tgsi/scan: take advantage of already swizzled usage mask in scan_src_operand
It has always been a usage mask *after* swizzling.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák ea85b76519 tgsi/scan: set non-valid src_index for tex offsets in scan_src_operand
tex offsets are not "Src" operands.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák be3ab867bd tgsi: implement tgsi_util_get_inst_usage_mask properly
All opcodes are handled.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák bb8abc10bf tgsi: add docs for some existing pack opcodes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Daniel Stone bbe2082e7d broadcom: Fix out-of-tree build include path
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes: 5b102160ae ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
2017-10-05 15:03:11 -07:00
Derek Foreman 17d78ece36 broadcom/vc4: Don't advertise tiled dmabuf modifiers if we can't use them
If the DRM_VC4_GET_TILING ioctl isn't present then we can't tell
if a dmabuf bo is tiled or linear, so will always assume it's
linear.

By not advertising tiled formats in this situation we ensure the
assumption is correct.

This fixes a bug where most attempts to render a gl wayland client
under weston will result in a client side abort.

Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Daniel Stone <daniels@collabora.com> (on irc)
2017-10-05 11:26:14 -07:00