82932 Commits

Author SHA1 Message Date
Marek Olšák 28a03be06b radeonsi: enable string markers and record apitrace call numbers
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:13 +02:00
Marek Olšák 642cf400aa ddebug: add an option to dump info about a specific apitrace call
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Marek Olšák 1daec2b795 ddebug: implement pipe_context::generate_mipmap
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Marek Olšák 50b2235478 ddebug: record and dump apitrace call numbers
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Marek Olšák 861ecf1ca9 ddebug: implement emit_string_marker
and remove some obsolete comments

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Marek Olšák a446c40e0a gallium/radeon: remove unused code - radeon_llvm_util.*
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Marek Olšák eaccc4e8c8 radeonsi: keep using v_rcp_f32 for division in future LLVM (v2)
This will be needed after some LLVM changes that haven't landed yet.

v2: - use LLVMIsConstant to fix an LLVM assertion failure.
      LLVMSetMetadata doesn't work with constants.
    - don't set float metadata as string

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Marek Olšák 1c00086746 radeonsi: remove an obsolete comment
It's not true.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Marek Olšák 4d1f32376d radeonsi: don't interpolate colors if flatshading is enabled
use v_interp_mov for those

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Marek Olšák 4accb02d7a radeonsi: enable the barycentric optimization in all cases
Handle the bc_optimize SGPR bit if both CENTER and CENTROID are enabled.
This should increase the PS launch rate for big primitives with MSAA.
Based on discussion with SPI guys.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Marek Olšák 476e9cee1d radeonsi: compute only one set of interpolation (i,j) when MSAA is disabled
This should increase the PS launch rate for shaders using at least 2 pairs
of perspective (i,j) and same for linear.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Marek Olšák a675c6a000 radeonsi: split ps.prolog.force_persample_interp into persp and linear bits
This reduces the number of v_mov's in the prolog.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Marek Olšák 61010cfac0 radeonsi: don't dump the shader key for non-monolithic shaders early
It's always zero.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-05 00:47:12 +02:00
Jan Vesely 015e2e0fce r600g: Add double precision FMA ops
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96782
Fixes: 54c4d525da ("r600g: Enable FMA on chips that support it")

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Tested-by: James Harvey <lothmordor@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-07-05 00:47:12 +02:00
Francesco Ansanelli 9827fc3f03 r600: fix duplicate 'const' declaration
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-04 21:26:31 +02:00
Topi Pohjolainen 2a60654f56 i965/urb: Allow blorp to record current settings
This makes it possible to skip urb re-configuration if the
subsequent renders agree with the settings.

Also allows blorp to allocate the maximum amount of vs entries
available. Core upload logic already knows how to calculate this.
Helps one synthetic benchmark.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen 39fdee6b2d i965/blorp/gen7+: Do not trigger push constant space reconfig
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen cc2d0e64c0 i965/blorp/gen7+: Stop trashing push constant allocation
Packet 3DSTATE_CONSTANT_PS is still emitted explicitly as ps stage
itself is enabled and hardware may try to prefetch constants from
the buffer. From the BSpec: 3D Pipeline - Windower -
3DSTATE_PUSH_CONSTANT_ALLOC_PS

  "Specifies the size of the PS constant buffer. This value will
   determine the amount of data the command stream can pre-fetch
   before the buffer is full."

This is not possible on gen6. From the BSpec about 3DSTATE_CONSTANT_PS:

"This packet must be followed by WM_STATE."

Binding table emissions for stages other than PS can be now dropped,
they were only needed for the 3DSTATE_CONSTANT_XS to be effective:

From the BSpec:

  "The 3DSTATE_CONSTANT_* command is not committed to the shader unit
   until the corresponding (same shader) 3DSTATE_BINDING_TABLE_POINTER_*
   command is parsed."

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen 175e095744 i965/blorp: Remove support for push constants
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen 46e1132b80 i965/blorp: Use flat inputs instead of uniforms
v2 (Jason): Use LOAD_INPUT() macro

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen 07db95c24d i965/blorp: Fix the size requirement for vertex elements
v2: Rebased as this is needed before flat inputs are enabled

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen 741a245ae4 i965/blorp: Load transformation coordinates as vec4
In preparation for loading as flat vertex input.

v2: Use LOAD_INPUT() macro

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen 01f2f364d4 i965/blorp: Rename LOAD_UNIFORM to LOAD_INPUT
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen 641868103c i965/blorp: Organize pixel kill and blend/scaled inputs into vec4s
In addition, as these are never used in parallel, add a few
assertions.

v2 (Jason): Skip some complexity by putting them into a union but
            pad rectangle grid into a vec4 instead. Also keep the
            LOAD_UNIFORM macro.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Lionel Landwerlin dbbc4fb4cc anv/wsi: create swapchain images using specified image usage
The image usage specified by the caller of vkCreateSwapchainKHR should be
passed onto the internal image creation. Otherwise the driver might later
crash when the user tries to use the image as a combined sampler even though
the swapchain was explicitly created with VK_IMAGE_USAGE_TRANSFER_SRC_BIT.

Leaving the previous VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT as this might be
expected even if the swapchain is created without any flag.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96791
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-07-04 10:15:48 -07:00
Indrajit Das 51227b41c6 radeon/uvd: fix overflow error while calculating bit stream buffer size
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-07-04 11:38:05 +02:00
Topi Pohjolainen 9e3774a460 i965/blorp: Prepare for more than two vertex attributes
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 09:05:02 +03:00
Topi Pohjolainen e762354309 i965/blorp: Tell vertex fetcher about flat inputs
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 09:04:38 +03:00
Topi Pohjolainen 89e6b4ef5d i965/blorp: Add support for flat input buffer
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 09:04:00 +03:00
Topi Pohjolainen 9b2fa17e97 i965/blorp: Store input read mask
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 09:03:41 +03:00
Topi Pohjolainen 73f78ab44b i965/blorp: Rename push constants to inputs
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 08:37:51 +03:00
Topi Pohjolainen f2c472fcb3 i965/blorp: Use core vertex buffer state setup
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 08:37:44 +03:00
Topi Pohjolainen 4f7e68799f i965/blorp: Split vertex data and element setup
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 08:33:41 +03:00
Topi Pohjolainen 575c8cbb54 i965: Unify vertex buffer setup
On gen >= 8 one doesn't provide ending address but number of bytes
available. This is relative to the given offset.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 08:33:41 +03:00
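
The difference described in the message above can be sketched roughly as follows. This is a minimal illustration and not the driver's code: `vb_limit_field`, its parameters, and the form of the pre-gen8 end value are assumptions made for the example, and the real hardware encoding is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: given a buffer range [start_offset, end_offset),
 * return the value that bounds the vertex buffer.
 *   gen >= 8: number of bytes available, relative to start_offset
 *   gen <  8: the ending offset itself
 */
static uint32_t
vb_limit_field(int gen, uint32_t start_offset, uint32_t end_offset)
{
   assert(end_offset >= start_offset);

   if (gen >= 8)
      return end_offset - start_offset;  /* size, relative to the start */

   return end_offset;                    /* absolute end of the range */
}
```

Taking start and end offsets as the common interface lets one caller serve both cases; only the final field computation differs.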
Topi Pohjolainen bdab945edd i965/draw: Expose vertex buffer state setup
Also change the interface to use start and end offsets.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 08:33:41 +03:00
Rob Clark 7295428e41 freedreno: fix crash on smaller gpus and higher resolutions
Devices with smaller GMEM size need more tiles.  On db410c at 2048x1152,
glmark2 shadow needed ~330 tiles for fullscreen.  Let's bump it up to
512.  (Maybe with MRT you could end up needing more, but at that point
things are probably going to be painfully slow.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-03 11:16:28 -04:00
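
The relationship between tile size and tile count in the message above can be sketched as below. This is only an illustration under stated assumptions: the helper names and the tile dimensions used in any call are hypothetical, and the sketch does not try to reproduce the ~330 figure, which depends on how the driver sizes tiles for a given GMEM.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TILES 512   /* the new limit named in the commit message */

/* Round-up integer division. */
static uint32_t
div_round_up(uint32_t n, uint32_t d)
{
   assert(d != 0);
   return (n + d - 1) / d;
}

/* Hypothetical helper: number of tiles needed to cover a width x height
 * render target with tiles of tile_w x tile_h pixels. */
static uint32_t
tiles_needed(uint32_t width, uint32_t height, uint32_t tile_w, uint32_t tile_h)
{
   return div_round_up(width, tile_w) * div_round_up(height, tile_h);
}

/* Returns 1 if the tile grid fits in a fixed table of MAX_TILES entries. */
static int
tiles_fit(uint32_t width, uint32_t height, uint32_t tile_w, uint32_t tile_h)
{
   return tiles_needed(width, height, tile_w, tile_h) <= MAX_TILES;
}
```

Halving either tile dimension doubles the tile count, which is why a device with less GMEM can exceed a fixed-size tile table at high resolutions.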
Rob Clark 01ccb0d91e i965: don't drop const initializers in vector splitting
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-02 09:00:19 -04:00
Rob Clark f78a6b1ce3 glsl: add driconf to zero-init uninitialized vars
Some games are sloppy, perhaps because it is defined behavior for DX or
perhaps because the nv blob driver defaults things to zero.

So add driconf param to force uninitialized variables to default to zero.

This issue was observed with Rust, from the Steam store, but has surfaced
elsewhere in the past.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-02 09:00:19 -04:00
Rob Clark 202710d110 freedreno/ir3: support glsl linking for cmdline compiler
For .vert/.frag, now multiple can be specified on the cmdline for
purposes of linking, and the last one specified is the one that is
fed into the ir3 backend (and dumped along the way if --verbose is
specified)

Without this, varyings in frag shaders would appear as undefined.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-07-02 09:00:19 -04:00
Rob Clark 07cfe4e6aa glsl/standalone: initialize MaxUserAssignableUniformLocations
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-07-02 09:00:19 -04:00
Rob Clark 1759eb1d19 freedreno: update valid_buffer_range for SO buffers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-02 08:58:50 -04:00
Rob Clark da39ac9c51 freedreno/ir3: support non-user_buffer consts
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-02 08:58:50 -04:00
Rob Clark 2081c1ecc0 freedreno/a2xx: move setup/restore cmds into binning pass
Rather than doing a separate submit at context create, move these cmds
to before first tile, as is done on a3xx/a4xx.  Otherwise state can
be overwritten by other contexts.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-02 08:58:50 -04:00
Rob Clark 2c3b54c278 freedreno: pass index buffer as a pipe_resource
This will be useful in a following patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-02 08:58:50 -04:00
Rob Clark 88cc11e971 freedreno: switch emit_const_bo() to take prsc's
We can push the unwrap of pipe_resource down.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-02 08:58:50 -04:00
Hans de Goede d7dfd4cb51 nv30: Fix "array subscript is below array bounds" compiler warning
gcc6 does not like the trick where we point to one entry before the
array start and then start a while with a pre-increment.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-02 12:21:28 +02:00
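
The kind of loop this message describes can be sketched as below. The table, its terminator convention, and `sum_entries` are hypothetical and are not taken from the nv30 code; the sketch only shows the general shape of the idiom and one way to avoid it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table terminated by a negative entry. */
static const int table[] = { 3, 5, 8, -1 };

/* The pattern the warning is about, shown only as a comment because
 * forming a pointer before the start of an array is undefined in C:
 *
 *    const int *p = table - 1;
 *    while (*++p >= 0)
 *       sum += *p;
 */

/* Equivalent loop that starts at the first element instead. */
static int
sum_entries(const int *entries)
{
   int sum = 0;

   assert(entries != NULL);
   for (const int *p = entries; *p >= 0; p++)
      sum += *p;

   return sum;
}
```

Starting at the first element and advancing after each iteration visits the same entries without ever computing an out-of-bounds pointer.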
Hans de Goede 110ef733dc nouveau: Fix a couple of "foo may be used uninitialized" compiler warnings
These are all new false positives with gcc6.

In nouveau_compiler.c: gcc6 no longer assumes that passing a pointer
to a variable into a function initialises that variable.

In nv50_ir_from_tgsi.cpp op and mode are not set if there are 0
enabled dst channels, this never happens, but gcc cannot know this.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-02 12:21:28 +02:00
Hans de Goede 1f3c8f3664 nouveau: Fix gcc6 / c++11 auto_ptr deprecation compiler warnings
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-07-02 12:21:28 +02:00
Hans de Goede 2aa1197eee nouveau: Add support for SV_WORK_DIM
Add support for SV_WORK_DIM for nvc0 and nve4.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-07-02 12:21:28 +02:00
Hans de Goede 3345f70f63 nvc0: Make NVC0_CB_AUX_GRID_INFO take an index argument
This brings it in line with the other macros like NVC0_CB_AUX_UBO_INFO
and NVC0_CB_AUX_TEX_INFO.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-07-02 12:21:28 +02:00