85285 Commits

Author SHA1 Message Date
Jason Ekstrand 0176c6a692 intel/isl: Allow non-2D HiZ surfaces
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 14:53:01 -07:00
Jason Ekstrand 4e397c6c75 intel/isl: Add a detailed comment about multisampling with HiZ
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 14:53:01 -07:00
Jason Ekstrand c3bd711411 intel/isl: Remove tiling checks from choose_msaa_layout
We already do those checks in filter_tiling.  There's no good reason to
repeat them in choose_msaa_layout.  If anything they should have been
asserts and not "return false" checks.  Also, this check was causing us to
outright reject multisampled HiZ surfaces, which wasn't intended.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 14:53:01 -07:00
Jason Ekstrand 69d3bb9915 intel/isl: Handle HiZ and CCS tiling more directly
The HiZ and CCS tiling formats are always used for HiZ and CCS surfaces
respectively.  There's no reason why we should go through filter_tiling and
it's much easier to always get HiZ and CCS right if we just handle them
directly.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 14:53:01 -07:00
Jason Ekstrand b1311a48e0 intel/isl: Allow multisampling with ISL_FORMAT_HiZ
HiZ buffers can be multisampled and, on Broadwell and earlier, simply using
interleaved multisampling with a compression block size of 8x4 samples
yields the correct HiZ surface size calculations.  Unfortunately,
choose_msaa_layout was rejecting multisampled HiZ buffers because of format
checks.  Now that we have a simple helper for determining if a format
supports multisampling, that's an easy enough issue to fix.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 14:53:01 -07:00
Jason Ekstrand baade41a5c intel/isl: Allow creation of 1-D compressed textures
Compressed 1-D textures are not a well-defined thing in either GL or Vulkan.
However, auxiliary surfaces are treated as compressed textures in ISL and
we can do HiZ and CCS with 1-D so we need to be able to create them.  In
order to prevent actually using them (the docs say no), we assert in the
state setup code.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 14:53:01 -07:00
Jason Ekstrand f82166578f intel/isl: Fix up asserts in calc_phys_level0_extent_sa
The assertion that a format is uncompressed in the multisample layouts
isn't quite right.  What we really want to assert is that the format
supports multisampling, which is a slightly more complicated query.  We also want
to assert that it has a block size of 1x1 since we do nothing with the
block size in the phys_level0_sa assignment.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 14:53:01 -07:00
Jason Ekstrand 5637f3f120 intel/isl: Add a format_supports_multisampling helper
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 14:53:01 -07:00
Nayan Deshmukh b7a0f2e1f7 vl/dri3: fix warning about incompatible pointer type
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-10-03 12:51:30 -04:00
Bruce Cherniak 903d00cd32 swr: Removed stalling SwrWaitForIdle from queries.
A previous fundamental change in stats gathering added a temporary
SwrWaitForIdle to begin_query and end_query.  The code has been reworked to
remove the stall.

Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2016-10-03 09:57:45 -05:00
Tim Rowley cdac042733 swr: [rasterizer core] refactor thread creation
Create worker pool now computes number of worker threads based on
things like topologies, etc. and creates the pool but doesn't actually
launch the threads. Instead there is a separate start thread pool
function. This allows thread resources to be constructed first before
threads start.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-03 09:57:38 -05:00
Tim Rowley 114f7a92c6 swr: [rasterizer jitter] canonicalize blend compile state
Canonicalize to prevent unnecessary JIT compiles.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-03 09:57:31 -05:00
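The idea behind canonicalizing a compile key can be sketched as follows. This is only an illustration: `struct blend_key` and both helpers are hypothetical and do not reflect the actual swr blend compile state. The assumption is that fields which cannot affect the generated code (here, the blend factors when blending is disabled) are reset to fixed values, so that equivalent states compare equal and hit the same cached JIT result.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical compile key; the real swr state has different fields. */
struct blend_key {
    uint8_t blend_enable;
    uint8_t src_factor;
    uint8_t dst_factor;
    uint8_t blend_op;
};

/* Reset fields that cannot influence the generated code, so that two
 * keys describing the same behaviour become byte-for-byte identical. */
static void canonicalize_blend_key(struct blend_key *key)
{
    if (!key->blend_enable) {
        key->src_factor = 0;
        key->dst_factor = 0;
        key->blend_op = 0;
    }
}

/* Byte-wise comparison, as a cache lookup keyed on the struct might do.
 * The struct holds four uint8_t members, so it contains no padding. */
static int blend_keys_equal(const struct blend_key *a,
                            const struct blend_key *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

Two keys with blending disabled but different leftover factors compare unequal before canonicalization and equal afterwards, which is what avoids the redundant compile.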
Tim Rowley 4198520a82 swr: [rasterizer core] archrast fixes
- Immediately sleep threads until thread data is initialized
- Fix some compile bugs with AR enabled

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-03 09:57:25 -05:00
Tim Rowley aaeb07989e swr: [rasterizer jitter] fixes for icc in vs2015 compat mode
- Move most jitter functionality into SwrJit namespace
- Avoid global "using namespace llvm" in headers

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-03 09:57:19 -05:00
Tim Rowley b8a6f06c85 swr: [rasterizer core] generalize compute dispatch mechanism
Generalize compute dispatch mechanism to support other types of dispatches.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-03 09:57:13 -05:00
Tim Rowley 33a1a09eb0 swr: [rasterizer common] os.h portability header changes
- Fix conflict between windows MemoryFence and llvm::sys::MemoryFence
- Declare gettid()

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-03 09:56:47 -05:00
Ville Syrjälä 2fef0d108a anv/formats: Fix build on gcc-4 and earlier
gcc-4 and earlier don't allow compound literals where a constant
is required in -std=c99/gnu99 mode, so we can't use ISL_SWIZZLE()
when populating the anv_formats[] array. There are a few ways around
it. The first would be -std=c89/gnu89, but the rest of the code
depends on c99 so it's not really an option. The second option
would be to upgrade to gcc-5+, where the compiler behaviour was relaxed
a bit [1]. The third option is just to avoid using compound
literals. I chose the last option since it keeps gcc-4 and earlier
working.

[1] https://gcc.gnu.org/gcc-5/porting_to.html

Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Topi Pohjolainen <topi.pohjolainen@intel.com>
Fixes: 7ddb21708c ("intel/isl: Add an isl_swizzle structure and use it for isl_view swizzles")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-03 15:45:28 +03:00
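A minimal sketch of the third option, assuming a simplified swizzle type. The names `struct swizzle`, `struct format_entry`, `SWIZZLE_LITERAL` and `SWIZZLE_INIT` are hypothetical stand-ins, not the real ISL or anv definitions. It contrasts a compound-literal macro with a plain brace initializer, which is a constant initializer list and so can be used in a static array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real ISL and anv types differ. */
struct swizzle {
    uint8_t r, g, b, a;
};

struct format_entry {
    int id;
    struct swizzle swizzle;
};

/* Compound-literal form. Per the commit message, gcc-4 and earlier reject
 * this where a constant is required in -std=c99/gnu99 mode. It is defined
 * here for comparison only and is not used below. */
#define SWIZZLE_LITERAL(R, G, B, A) ((struct swizzle){ R, G, B, A })

/* Plain brace initializer: no compound literal involved. */
#define SWIZZLE_INIT(R, G, B, A) { R, G, B, A }

static const struct format_entry formats[] = {
    { 1, SWIZZLE_INIT(0, 1, 2, 3) },
    { 2, SWIZZLE_INIT(2, 1, 0, 3) },
};

/* Number of entries in the static table. */
static size_t format_count(void)
{
    return sizeof(formats) / sizeof(formats[0]);
}
```

The trade-off is that the brace form only works as an initializer, while a compound literal can also appear in ordinary expressions.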
Tapani Pälli 4d6d55deef egl: stop claiming support for pbuffer + msaa
This fixes a crash in the egl-create-msaa-pbuffer-surface Piglit test
and the same crash in many dEQP EGL tests.

I also found that a Qt example had to work around this
crash: https://bugreports.qt.io/browse/QTBUG-47509

v2: Ian pointed out that v1 removed support for all multisample
    configs, including window ones. This one removes the pbuffer bit
    when adding configs, so now only pbuffer+msaa gets rejected and
    window+msaa continues to work. Also fixed a comment (Emil)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-10-03 07:56:44 +03:00
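The v2 approach described above can be sketched as a small filter on the surface-type mask. This is an assumption-laden illustration, not the actual Mesa code: `filter_surface_type` is a hypothetical helper, the two constants mirror the values of EGL_PBUFFER_BIT and EGL_WINDOW_BIT, and treating any non-zero sample count as multisampled is an assumption.

```c
#include <stdint.h>

/* Values mirror EGL_PBUFFER_BIT and EGL_WINDOW_BIT from the EGL headers;
 * they are redefined here only to keep the sketch self-contained. */
#define SKETCH_PBUFFER_BIT 0x0001
#define SKETCH_WINDOW_BIT  0x0004

/* Hypothetical helper: drop the pbuffer bit for multisampled configs and
 * leave every other surface type untouched. */
static uint32_t filter_surface_type(uint32_t surface_type, unsigned samples)
{
    if (samples > 0)
        surface_type &= ~(uint32_t)SKETCH_PBUFFER_BIT;
    return surface_type;
}
```

With this shape, a config advertising window and pbuffer support keeps both bits when it is single-sampled and keeps only the window bit when it is multisampled.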
Timothy Arceri eaf147cb46 i965: rename max_ds_* variable to max_tes_*
Using consistent naming allows us to create macros more easily.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-10-03 15:29:58 +11:00
Timothy Arceri b67633ce5e i965: rename max_hs_* variables to max_tcs_*
Using consistent naming allows us to create macros more easily.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-10-03 15:29:51 +11:00
Kenneth Graunke da274ba5f8 i965: Drop pointless stage == MESA_SHADER_FRAGMENT checks.
There's an assert right above this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-02 14:49:20 -07:00
Timothy Arceri 024c207319 glsl: add missing headers to blob.h
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-10-02 13:48:06 +11:00
Jason Ekstrand ef3c5ac7fb nir/spirv/cfg: Detect switch_break after loop_break/continue
While the current CFG code is valid in the case where a switch break also
happens to be a loop continue, it's a bit suboptimal.  Since hardware is
capable of handling the continue as a direct jump, it's better to use a
continue instruction when we can than to bother with all of the nasty
switch break lowering.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-10-01 15:40:34 -07:00
Jason Ekstrand 4d02faede5 nir/spirv/cfg: Handle switches whose break block is a loop continue
It is possible that the break block of a switch is actually the continue of
the loop containing the switch.  In this case, we need to identify the
break block as a continue and break out of the current level of CFG handling.
If we don't, the continue portion of the loop will get handled twice, once
by following after the break and a second time by the loop handling code
handling it explicitly.

This fixes 6 of the new Vulkan CTS tests:
 - dEQP-VK.spirv_assembly.instruction.graphics.opphi.out_of_order*
 - dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order*

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-10-01 15:40:14 -07:00
Eric Engestrom fc03ecfeaf nir/spirv: add spirv2nir binary to .gitignore
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-01 15:27:48 -07:00
Eric Engestrom c867938044 nir/spirv: improve mmap() error handling
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-01 15:27:46 -07:00
Eric Engestrom 65c8cbe89d nir/spirv: improve lseek() error handling
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-01 15:27:44 -07:00
Eric Engestrom 23519a9de2 nir/spirv: add some error checking to open()
CovID: 1373369
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-01 15:27:31 -07:00
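The open(), lseek() and mmap() checks named in this and the two entries above follow a common POSIX pattern, sketched below. `map_file` is a hypothetical helper and not the code in the spirv2nir tool. Each call is checked against its documented failure value: a negative descriptor for open(), `(off_t)-1` for lseek(), and `MAP_FAILED` for mmap().

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map a file read-only and report its length.
 * Returns NULL on any failure; *len_out is only written on success. */
static const void *map_file(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    /* Seeking to the end yields the file size, or (off_t)-1 on error.
     * An empty file is rejected too, since a zero-length mmap() fails. */
    off_t len = lseek(fd, 0, SEEK_END);
    if (len == (off_t)-1 || len == 0) {
        close(fd);
        return NULL;
    }

    void *map = mmap(NULL, (size_t)len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return NULL;

    *len_out = (size_t)len;
    return map;
}
```

A caller would release the mapping with munmap() once it is done with the data.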
Timothy Arceri 913e0296f2 mesa: use uint32_t rather than unsigned for xfb struct members
These structs will be written to disk as part of the shader cache
so use uint32_t just to be safe.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-10-01 11:26:25 +10:00
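The reasoning can be illustrated with a small sketch. `struct xfb_record` and its helpers are hypothetical and are not the real Mesa transform feedback structs. The width of `unsigned` is implementation-defined, whereas `uint32_t` is exactly 32 bits, so the size of a record copied to a cache file does not depend on the compiler's choice of `int` width.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk record; the real xfb structs have other members. */
struct xfb_record {
    uint32_t buffer;
    uint32_t offset;
    uint32_t stride;
};

/* Three 32-bit members with no padding between them: always 12 bytes. */
_Static_assert(sizeof(struct xfb_record) == 12,
               "xfb_record must have a fixed on-disk size");

/* Copy a record into a byte buffer, as a cache writer might.
 * Byte order is still that of the host; this sketch does not convert it. */
static void xfb_record_write(uint8_t *dst, const struct xfb_record *rec)
{
    memcpy(dst, rec, sizeof(*rec));
}

/* Read a record back from a byte buffer. */
static void xfb_record_read(struct xfb_record *rec, const uint8_t *src)
{
    memcpy(rec, src, sizeof(*rec));
}
```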
Timothy Arceri 7064f8674a i915/i965: remove commented out warning
The warning was also in the wrong location; it should have been
in the else branch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-10-01 09:24:33 +10:00
Brian Paul 951bf44a56 mesa: move _mesa_valid_to_render() to api_validate.c
Almost all of the other drawing validation code is in api_validate.c
so put this function there as well.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-09-30 16:28:00 -06:00
Steven Toth e99b9395be gallium/hud: Add support for CPU frequency monitoring
Detect all of the CPUs in the system. Expose metrics
for min, max and current frequency in Hz.

Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-09-30 15:18:46 -06:00
Marek Olšák 7b87190d2b Revert "gallium/hud: automatically print % if max_value == 100"
This reverts commit dbfeb0ec12.

With max_value being rounded to 100, it's often wrong.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-09-30 22:07:12 +02:00
Brian Paul 1d07552ba5 docs: update the list of Mesa major versions and API support
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-30 09:17:33 -06:00
Nicolai Hähnle 7bac5bf032 gallium/radeon: fix crash/regression in performance counters
Regression introduced by "gallium/radeon: zero all query buffers".

Cc: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-09-30 12:41:45 +02:00
Nicolai Hähnle cfd870de70 gallium/radeon: update documentation of buffer_get_virtual_address
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-09-30 12:41:41 +02:00
Nicolai Hähnle fd9f54223d gallium/radeon: emit relocations for query fences
This is only needed for r600 which doesn't have ARB_query_buffer_object and
therefore wouldn't really need the fences, but let's be optimistic about
filling in this feature gap eventually.

Cc: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-09-30 12:38:57 +02:00
Nicolai Hähnle 3e7cced4b9 radeon/uvd: adjust the buffer offset when relocation is used
We don't plan to use sub-allocated buffers with UVD, but just in case one
slips through, this increases the chances of things working out anyway.

Reviewed-by: Christian König <christian.koenig@amd.com>
2016-09-30 12:38:52 +02:00
Nicolai Hähnle a48bf02d05 radeon/vce: adjust the buffer offset when relocation is used
We don't plan to use sub-allocated buffers with VCE, but just in case one
slips through, this increases the chances of things working out anyway.

Reviewed-by: Christian König <christian.koenig@amd.com>
2016-09-30 12:38:48 +02:00
Nicolai Hähnle 13cb41f666 radeon/video: don't use sub-allocated buffers
Cc: Christian König <christian.koenig@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97976
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97969
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-09-30 12:38:29 +02:00
Steven Toth 1d466b9b04 gallium/hud: Add power sensor support
Implement support for power-based sensors, reporting units in
milliwatts and watts.

Also, minor cleanup - change the related if block to a switch.

Tested with two different power sensors, including the nouveau
'power1' sensors on a GTX950 card.

Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-09-29 17:51:15 -06:00
Samuel Pitoiset 3abe68b828 nv50/ir: teach insnCanLoad() about SHLADD
Commutativity is not allowed with SHLADD, but src2 can accept
loads. To allow the load propagation pass to do its job, add a
special case like for SUCLAMP because src1 is always an immediate.

This IMAD to SHLADD optimization helps a bunch of shaders from Tomb
Raider, Victor Vran, UE4 demos (+15% perf with Elemental) and Shadow
Warrior.

GF100/GK104:

total instructions in shared programs :2838045 -> 2834712 (-0.12%)
total gprs used in shared programs    :396684 -> 396386 (-0.08%)
total local used in shared programs   :34416 -> 34416 (0.00%)

                local        gpr       inst      bytes
    helped           0         326        1105        1105
      hurt           0          55           3           3

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-09-29 21:20:50 +02:00
Samuel Pitoiset 115c79be10 nv50/ir: optimize SHLADD(a, b, c) to MOV((a << b) + c)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-09-29 21:20:47 +02:00
Samuel Pitoiset 2e008be9a9 nv50/ir: optimize SHLADD(a, b, 0x0) to SHL(a, b)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-09-29 21:20:44 +02:00
Samuel Pitoiset e4eb0fca02 nv50/ir: optimize IMAD to SHLADD in presence of power of 2
IMAD can be replaced by SHLADD if and only if src1 is a power of 2.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-09-29 21:20:41 +02:00
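The arithmetic behind this rewrite is that a * 2^k + c equals (a << k) + c for 32-bit unsigned values. The sketch below models it on plain integers; `imad`, `shladd` and `imad_to_shladd_shift` are hypothetical helpers and not the nv50 IR pass itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reference semantics of the two operations on 32-bit unsigned values. */
static uint32_t imad(uint32_t a, uint32_t b, uint32_t c)
{
    return a * b + c;
}

static uint32_t shladd(uint32_t a, uint32_t shift, uint32_t c)
{
    return (a << shift) + c;
}

/* Hypothetical legality check: the rewrite applies only when the
 * multiplier is a power of 2, in which case *shift receives log2 of it. */
static bool imad_to_shladd_shift(uint32_t multiplier, uint32_t *shift)
{
    if (multiplier == 0 || (multiplier & (multiplier - 1)) != 0)
        return false;

    uint32_t s = 0;
    while ((multiplier >> s) != 1)
        s++;
    *shift = s;
    return true;
}
```

For example, a multiplier of 8 yields a shift of 3, so `imad(7, 8, 3)` and `shladd(7, 3, 3)` both evaluate to 59, while a multiplier of 6 is rejected.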
Samuel Pitoiset 31545b64b8 nvc0/ir: add emission for SHLADD
Unfortunately, we can't use the emit helpers for GF100/GK110
because src1 and src2 are swapped.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-09-29 21:20:36 +02:00
Samuel Pitoiset 85132c7453 nv50/ir: add preliminary support for SHLADD
This instruction has been available since SM20 (Fermi) and allows
computing (a << b) + c in one shot. In some situations, IMAD should be
replaced by SHLADD when its multiplier is a power of 2, and ADD+SHL
should be replaced by SHLADD as well.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-09-29 21:20:30 +02:00
Samuel Pitoiset 652874754a nvc0: update GM107 sched control codes format
envyas now uses a much better representation for those control
codes and it displays the different flags instead of an
unreadable hex number.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-09-29 20:13:05 +02:00
Nicolai Hähnle e4b585f009 gallium/radeon: use smaller buffers for query results
Most of the time, even the 512 bytes that we now get is more than sufficient
(pipeline stats queries are the largest at 184 bytes per shot).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-09-29 11:24:56 +02:00
Nicolai Hähnle de84e99e45 gallium/radeon/winsyses: add radeon_winsys::min_alloc_size
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-09-29 11:24:52 +02:00