Commit Graph

100425 Commits

Author SHA1 Message Date
Kenneth Graunke a63c74be85 i965: Stop restoring the default L3 configuration on Kernel 4.16+.
Kernel 4.16 has proper context isolation, which means we can change
the L3 configuration without worrying about that leaking to other
newly created contexts, breaking the assumptions of other userspace.

So, disable our workaround to reprogram it back to the default.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-02-17 11:26:18 -08:00
Mikko Perttunen 5a1606c51f nvc0: Use GP100_COMPUTE_CLASS on GP10B
GP10B requires the use of GP100_COMPUTE_CLASS instead of
GP104_COMPUTE_CLASS as is used for other non-GP100 chips.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-17 14:16:10 -05:00
Daniel Stone 9d21dbeb88 i965: Fix aux-surface size check
The previous commit reworked the checks intel_from_planar() to check the
right individual cases for regular/planar/aux buffers, and do size
checks in all cases.

Unfortunately, the aux size check was broken, and required the aux
surface to be allocated with the correct aux stride, but full image
height (!).

As the ISL aux surface is not recorded in the DRIimage, we cannot easily
access it to check. Instead, store the aux size from when we do have the
ISL surface to hand, and check against that later when we go to access
the aux surface.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: c2c4e5bae3 ("i965: Fix bugs in intel_from_planar")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-17 10:22:35 +00:00
Marek Olšák 931ec80eeb radeonsi: implement 32-bit pointers in user data SGPRs (v2)
User SGPRs changes:
    VS:     14 ->  9
    TCS:    14 -> 10
    TES:    10 ->  6
    GS:      8 ->  4
    GSCOPY:  2 ->  1
    PS:      9 ->  5
    Merged VS-TCS: 24 -> 16
    Merged VS-GS:  18 -> 11
    Merged TES-GS: 18 -> 11

SGPRS: 2170102 -> 2158430 (-0.54 %)
VGPRS: 1645656 -> 1641516 (-0.25 %)
Spilled SGPRs: 9078 -> 8810 (-2.95 %)
Spilled VGPRs: 130 -> 114 (-12.31 %)
Scratch size: 1508 -> 1492 (-1.06 %) dwords per thread
Code Size: 52094872 -> 52692540 (1.15 %) bytes
Max Waves: 371848 -> 372723 (0.24 %)

v2: - the shader cache needs to take address32_hi into account
    - set amdgpu-32bit-address-high-bits

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)
2018-02-17 04:52:17 +01:00
Marek Olšák 5722cd4084 radeonsi: disallow constant buffers with a 64-bit address in slot 0
State trackers must use a user buffer or const_uploader,
or set pipe_resource::flags same as const_uploader->flags.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák d790b6cece radeonsi: move const_uploader allocations to 32-bit address space
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák 50581549b7 winsys/radeon: implement and enable 32-bit VM allocations
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák 1104d1e9d3 winsys/radeon: add struct radeon_vm_heap
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák 48ecacfefa winsys/amdgpu: enable 32-bit VM allocations
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák c2da45be86 gallium/radeon: add 32-bit address space heaps
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák 0977b7f7b3 ac: query high bits of 32-bit address space 2018-02-17 04:51:58 +01:00
Marek Olšák 16be55da94 gallium: use PIPE_CAP_CONSTBUF0_FLAGS 2018-02-17 04:20:55 +01:00
Marek Olšák 8e7222f4e5 gallium: allow drivers to impose BO flags restrictions on constant buffer 0
Required by radeonsi for optimal behavior.
2018-02-17 04:20:55 +01:00
Alexander von Gluck IV 834d221512 meson: Add Haiku platform support v4
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-16 16:56:34 -06:00
Anuj Phogat 7b283544dc anv/icl: Add render target flush after uploading binding table
The PIPE_CONTROL command description says:

"Whenever a Binding Table Index (BTI) used by a Render Taget Message
points to a different RENDER_SURFACE_STATE, SW must issue a Render
Target Cache Flush by enabling this bit. When render target flush
is set due to new association of BTI, PS Scoreboard Stall bit must
be set in this packet."

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat 136f583a24 anv/icl: Enable float blend optimization
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat cd7102972f anv/icl: Use gen11 functions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat 9673c21d4f anv/icl: Build anv libs for gen11
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat 1f108b436b anv/icl: Generate gen11 entry point functions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat a86c0a08df anv/icl: Don't use DISPATCH_MODE_SIMD4X2
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat cd5fc634a8 anv/icl: Don't use SingleVertexDispatch
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat 6e3940b3cf anv/icl: Don't set ResetGatewayTimer
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat 41a4c2c8e8 anv/icl: Add #define genX
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:31 -08:00
Anuj Phogat 413d475b44 anv/icl: Add gen11 mocs defines
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:31 -08:00
Kenneth Graunke 1d6cf433d2 i965: Implement GenerateMipmap directly, rather than using Meta.
Meta is awful and we'd like to stop using it.  Implementing this using
BLORP allows us to stop trashing a bunch of GL state every time.

This follows the structure of st_generate_mipmap().
compute_num_levels is lifted directly from there.

Improves performance in Gl41HdrBloom by about 11.794% +/- 1.01919% (n=3)
on Kabylake GT2 at 1280x720 (the difference seems much smaller at higher
resolutions).

v2 (idr): Don't try depth or depth-stencil blorp blits on Gen4 or Gen5
because it's not implemented yet.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-02-16 10:48:10 -08:00
Kenneth Graunke 9bcd31ea90 mesa: Move compute_num_levels from st_gen_mipmap.c to mipmap.c.
I want to use compute_num_levels inside i965.  Rather than duplicating
it, move it from mesa/st to core Mesa, and make it non-static.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-16 10:48:10 -08:00
Dylan Baker 03ab40b1f7 meson: freedreno depends on nir
This fixes a race condition in building targets that link in freedreno.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105120
Fixes: 0bbecc5a85 ("meson: define driver dependencies")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Mark Janes <mark.a.janes@intel.com>
2018-02-16 10:10:18 -08:00
George Kyriazis f1fbeb1a53 swr/rast: blend_epi32() should return Integer, not Float
fix gcc8 compiler error for KNL.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105029
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:02 -06:00
George Kyriazis 7dd793d10c swr/rast: Normalize path for debug metadata
in template gen_llvm.hpp

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:02 -06:00
George Kyriazis f979d0bc2f swr/rast: Consolidate archrast Draw events
Consolidate archrst draw events into single draw event with an attribute
that represents the type of draw

- Add handlers for new private proto versions of DrawInstancedEvent,
  DrawIndexedInstancedEvent, DrawInstancedSplitEvent, and
  DrawIndexedInstancedSplitEvent
- Convert the draw events to generic DrawInfoEvents
- parse_proto_event_fields() replaces 'AR_DRAW_TYPE' as a field type with
  'uint32_t'. This draw type is actually an enum, but can be represented
  as an unsigned integer.
- is_draw_or_dispatch() recognizes DrawInfoEvent as a draw event

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:02 -06:00
George Kyriazis 45df1a6520 swr/rast: Add semantics for translating address
Added support for another full translation path in fetch jitter.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:02 -06:00
George Kyriazis c09483cf0a swr/rast: Convert C Sampler intrinsics
Convert portions of the C sampler to the rasty SIMD lib.

Also fix SRL call with a non-immediate.  Don't count on the compiler
automagically converting an srli call to srl if the shift count isn't
an immediate.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis 37ebf86add swr/rast: Make SIMDLib templated types easier to use
"typename SIMD_T::TypeName" --> "TypeName<SIMD_T>"

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis 74e8bb4a22 swr/rast: Be more explicit when fetching next component
Use a new function to denote that we want to get offset to next component
and hide the fact that GEP is used underneath.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis da77eb55d5 swr/rast: Fix bug related to passing AR handle
We were passing a garbage handle. Let's not do that.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis 48d62409f8 swr/rast: Fix primitive replication issue in tesselation PA.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis e12db47a7d swr/rast: Use llvm intrinsic masked gather
Use llvm intrinsic masked.gather instead of manual unroll for the cases
where we have vector of pointers.  Improves llvm IR debug experience by
reducing a ton of IR to a single intrinsic call. Also seems to reduce
overall stack use considerably.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis 9cc9688e49 swr/rast: Misc cleanup
Together with correct detection of clipDistance NaNs when no cullDistance is set

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis 036c8b6247 swr/rast: Renamed variable in vertexbufferstate
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis b25efa36e6 swr/rast: Fix GATHERPS to avoid assertions.
With the pBase type change, LLVM was asserting because of wrong types.
Cast appropriately.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis 8a64593bde swr/rast: More precise user clip distance interpolation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis 3e560b7c85 swr/rast: Cull prims when all verts have negative clip distances
Performance optimization, and fixes some clipping issues.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis cb4b604ebd swr/rast: whitespace and comment cleanup
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis 5df4d98780 swr/rast: Fix invalid number of attributes
Fix invalid number of attributes passed into tesselation PA.
Needs to take into account any offsets from the shader.
Innocuous issue, but removes an assert firing in debug.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis 2053472723 swr/rast: Add clipper stats.
Clipper event is now:

event ClipperEvent
{
    uint32_t drawId;
    uint32_t trivialRejectCount;
    uint32_t trivialAcceptCount;
    uint32_t mustClipCount;
};

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis 0420b2be89 swr/rast: Separate event types to public and private
Split into two proto files and modify appropriate build rules for
configure / scons / meson builds.

There are private internal events (proxy) that communicate information
from rasterizer to ArchRast. ArchRast can use these events to calculate
a final answer and then emit other public events which will be saved to
file. Users will use the public proto file and not the private one.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis e48dd2489c swr/rast: Clean up event types and remove BE events
Begin/End events not needed anymore.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis 7070027d7b swr/rast: Removed unused variable
Gets rid of zillions of unused variable warnings, made worse by templates.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis e3f92bb7af swr/rast: Separate RDTSC code from archrast
Renamed rdstc defines more appropriately

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis 8bce71622e swr/rast: Cleanup of mpPrivateContext in Builder
Provide access functions for mpPrivateContext in Builder.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:58 -06:00