90118 Commits

Author SHA1 Message Date
Tim Rowley cf8fa67364 swr: [rasterizer codegen] Remove BOM from knob_defs.py
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:54 -05:00
Tim Rowley 8a5069e81f swr: [rasterizer codegen] Rewrite gen_llvm_types.py to use mako
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:54 -05:00
Tim Rowley 5d0b3b05a2 swr: [rasterizer codegen] Fix generation of knobs
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:54 -05:00
Tim Rowley 4ed72758db swr: [rasterizer codegen] Change backend template comment style
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:54 -05:00
Tim Rowley 2776d94545 swr: [rasterizer codegen] Rewrite gen_llvm_ir_macros.py to use mako
Don't create/use cpp files, header only now.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:54 -05:00
Tim Rowley 9538ba9bd1 swr: [rasterizer codegen] Quiet gen_backends.py execution
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:54 -05:00
Tim Rowley 97cbabc8fb swr: [rasterizer scripts] Put codegen scripts into a separate directory
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:54 -05:00
Tim Rowley 7046695a0e swr: [rasterizer core] Fix trifan regression from 9d3442575f
Fixes piglit triangle-rasterization-overdraw.

SIMD16 path not working.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:22 -05:00
Tim Rowley 4cb69e817c swr: [rasterizer core] SIMD16 Frontend WIP - fix tessellation crashes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley ab3f4449c3 swr: [rasterizer jitter] Fix LogicOp blend jit after assert changes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley 8cd8240cfc swr: [rasterizer] Convert more SWR_ASSERT(false, ...) to SWR_INVALID(...)
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley ab032fb436 swr: [rasterizer core] Fix typo in SIMD16 code path
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley d011ba74ee swr: [rasterizer core/common] Fix the native AVX512 build under ICC
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley 2f513d8d83 swr: [rasterizer core] Allow no arguments to SWR_INVALID macro
Turns out this is somewhat tricky with gcc/g++.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
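
The zero-argument problem mentioned in 2f513d8d83 can be illustrated with a small sketch. This is not the SWR macro; it shows one common workaround, assuming the format argument is always a string literal. The names `EXAMPLE_INVALID`, `example_report_invalid`, `example_last_message`, and `example_invalid_hits` are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical state so the effect of the macro can be observed. */
static char example_last_message[128];
static int example_invalid_hits = 0;

/* Hypothetical reporting function: formats the message and counts the hit. */
static void example_report_invalid(const char *file, int line,
                                   const char *fmt, ...)
{
    va_list args;

    (void)file; /* A real handler would print or log the location. */
    (void)line;

    va_start(args, fmt);
    vsnprintf(example_last_message, sizeof(example_last_message), fmt, args);
    va_end(args);
    example_invalid_hits++;
}

/* Prepending an empty string literal means the expansion is valid both
 * with no arguments ("" alone) and with a literal format ("" "fmt", ...),
 * because adjacent string literals are concatenated.
 * Assumption: the first argument, when present, is a string literal. */
#define EXAMPLE_INVALID(...) \
    example_report_invalid(__FILE__, __LINE__, "" __VA_ARGS__)
```

Both `EXAMPLE_INVALID();` and `EXAMPLE_INVALID("bad topology %d", 7);` expand to valid calls under this scheme.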
Tim Rowley 0b066b2bf3 swr: [rasterizer] Slight assert refactoring
Make asserts more robust.

Add SWR_INVALID(...) as a replacement for SWR_ASSERT(0, ...)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley f445b6de9c swr: [rasterizer] Backend code adjustments
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley e4d1294afb swr: [rasterizer archrast] Fix the early and late depthstencil events
The coverage and stencil mask arguments were reversed.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley a508c2c2ac swr: [rasterizer core] Implement double pumped SIMD16 TESS
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley 2cbac00221 swr: [rasterizer archrast/core/scripts] Fix archrast multithreading issue
Per pixel stats are cached but were not always being flushed as threads
moved from one draw context to the next.  Added an explicit flush to allow
all archrast objects to flush any cached events.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley 0a36a7cf04 swr: [rasterizer archrast] Remove redundant data from archrast files
If a count can be derived from other counts, then this can be done in
post-processing scripts.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley 1cc885d1d1 swr: [rasterizer archrast/scripts] Further archrast cleanups
Removed redundant data being written out to file

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley 1399fbd6fd swr: [rasterizer core] Fix RECT_LIST primitive assembly
The bug made the 3rd component of attributes on the second
triangle of a RECT invalid.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley ade5351900 swr: [rasterizer common] Add InterpolateComponentFlat utility
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley ab04221bf1 swr: [rasterizer archrast] Fix performance issue with archrast stats
Performance with archrast is now 50x faster, since we're properly
filtering out all of the rdtsc begin/end.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley b228d2db18 swr: [rasterizer core] Implement SIMD16 GS and STREAMOUT
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley 5830a0a6f8 swr: [rasterizer archrast] Add additional API events
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley d2759c1eb3 swr: [rasterizer core/scripts] Autogen backend initialization function(s)
Autogen functions that instantiate different BackendPixelRate templates.
Functions get split into separate files after reaching a user defined
threshold (currently 512 per file) to speed up compilation.

This change will enable the addition of more template flags in the pixel
back end.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley 2c820d22cf swr: [rasterizer core] backend.h declares gBackendPixelRateTable
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley 50d491e22d swr: [rasterizer core] Finish SIMD16 PA OPT including tessellation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley 9d3442575f swr: [rasterizer core] Finish SIMD16 PA OPT except tessellation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley 7b94e5e1fa swr: [rasterizer core] Support sparse numa id values on all OSes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Kenneth Graunke 5e29af5f77 i965: Skip register write detection when possible.
Detecting register write support by trial and error introduces a
stall at screen creation time, which it would be nice to avoid.
Certain command parser versions guarantee this will work (see the
giant comment in intelInitScreen2 below, or a few commits ago):

- Ivybridge: version >= 1 (kernel v3.16)
- Baytrail:  version >= 2 (kernel v3.19)
- Haswell:   version >= 7 (kernel v4.8)

For simplicity, we don't bother with version 1 in this patch.

This assumes that the user hasn't disabled aliasing PPGTT via a kernel
command line parameter.  Don't do that - you're only breaking things.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-20 15:58:05 -07:00
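
The per-platform guarantees listed in 5e29af5f77 can be read as a simple lookup. The sketch below only encodes the thresholds quoted in that message; the enum and function names are hypothetical, and it is not the driver's actual code (the message notes the patch itself does not bother with version 1).

```c
#include <stdbool.h>

/* Hypothetical platform identifiers for this sketch only. */
enum example_platform {
    EXAMPLE_IVYBRIDGE,
    EXAMPLE_BAYTRAIL,
    EXAMPLE_HASWELL
};

/* Returns true when the reported command parser version is high enough
 * that register writes are guaranteed to work, so the trial-and-error
 * probe (and its stall at screen creation) could be skipped.
 * Thresholds are taken from the list in the commit message. */
static bool example_can_skip_register_write_probe(enum example_platform platform,
                                                  int cmd_parser_version)
{
    switch (platform) {
    case EXAMPLE_IVYBRIDGE:
        return cmd_parser_version >= 1; /* kernel v3.16 */
    case EXAMPLE_BAYTRAIL:
        return cmd_parser_version >= 2; /* kernel v3.19 */
    case EXAMPLE_HASWELL:
        return cmd_parser_version >= 7; /* kernel v4.8 */
    }
    return false; /* Unknown platform: fall back to probing. */
}
```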
Kenneth Graunke 31693a13f8 i965: Set screen->cmd_parser_version to 0 if we can't write registers.
If we can't write registers, then the effective command parser version
is 0 - it may exist, but it's not usefully enabling anything.

See kernel commit 1ca3712ca3429a617ed6c5f87718e4f6fe4ae0c6 (in v4.8)
where the kernel starts doing this for us.  This makes us do more or
less the same thing on older kernels.

This should preserve a bit of sanity by allowing us to perform a
screen->cmd_parser_version > N check to determine that we really can
use the features promised by command parser version N.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-20 15:58:05 -07:00
Kenneth Graunke 4a2ad6b145 i965: Document the sad story of the kernel command parser.
This should help us figure out the complexities of which kernel
versions we need to get various features on various platforms.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-20 15:58:05 -07:00
Kenneth Graunke 9b324e4dca i965: Fall back to GL 4.2/4.3 on Haswell if the kernel isn't new enough.
In commit d2590eb65f I enabled GL 4.5
on Haswell...but failed to check if we could do indirect compute
shader dispatch...and query buffer objects.

Indirect compute shader dispatch requires command parser version 5
(kernel commit 7b9748cb513a6bef4af87b79f0da3ff7e8b56cd8, which is in
Linux v4.4).  On earlier kernels we would have disabled
ARB_compute_shader, which is a mandatory part of OpenGL 4.3+.

Query buffer objects currently require MI_MATH and MI_LOAD_REGISTER_REG,
which mean command parser version 7 (Linux v4.8).  On earlier kernels
we would have disabled ARB_query_buffer_object, which is a mandatory
part of OpenGL 4.4+.

The new version support looks like:

- Kernel 4.1 and older => OpenGL 3.3
- Kernel 4.2-4.3       => OpenGL 4.2
- Kernel 4.4-4.7       => OpenGL 4.3
- Kernel 4.8+          => OpenGL 4.5

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-20 15:58:05 -07:00
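
The version table in 9b324e4dca can be sketched as a small function. This is an illustration only: the message ties the cutoffs to command parser versions 5 and 7, while the sketch keys on the kernel version because that is how the table is printed. The function name is hypothetical.

```c
/* Hypothetical helper: returns the highest OpenGL version for Haswell,
 * encoded as major * 10 + minor (for example 45 for OpenGL 4.5), given
 * the running kernel version. Cutoffs follow the table in the commit
 * message above. */
static int example_hsw_max_gl_version(int kernel_major, int kernel_minor)
{
    /* Fold the version into one comparable number, e.g. 4.8 -> 4008. */
    const int kernel = kernel_major * 1000 + kernel_minor;

    if (kernel >= 4008)
        return 45; /* query buffer objects available (parser version 7) */
    if (kernel >= 4004)
        return 43; /* indirect compute dispatch available (parser version 5) */
    if (kernel >= 4002)
        return 42;
    return 33;
}
```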
Constantine Kharlamov 99d400b78f r600g/sb: Fix memory leak by reworking uses list (rebased)
The author is Heiko Przybyl (CC'ing); the patch is rebased on top of Bartosz Tomczyk's patch, per Dieter Nützel's comment.
Tested-by: Constantine Charlamov <Hi-Angel@yandex.ru>

v2: Resend the patch through git-email. The previous rebase was sent
through Thunderbird, which mangled the tab characters, so the patch
did not apply.

--------------
When fixing the stalls on evergreen I introduced leaking of the useinfo
structure(s). Sorry. Instead of allocating a new object to hold 3 values
where only one is actually used, rework the list to just store the node
pointer. Thus no allocation or deallocation is needed. Since use_info
and use_kind aren't used anywhere, drop them and reduce code complexity.
This might also save some small amount of cycles.

Thanks to Bartosz Tomczyk for finding the bug.

Reported-by: Bartosz Tomczyk <bartosz.tomczyk86 at gmail.com>
Signed-off-by: Heiko Przybyl <lil_tux at web.de>
Supersedes: https://patchwork.freedesktop.org/patch/135852
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-03-20 23:23:50 +01:00
Marek Olšák 827ae79b2c radeonsi: check the IR type before waiting for a compute compilation fence
This should fix OpenCL getting stuck.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100288
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-03-20 23:17:14 +01:00
Kenneth Graunke 4084083124 aubinator: Move the guts of decode_group() to decoder.c.
This lets us use it outside of the aubinator binary itself.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-20 11:20:51 -07:00
Kenneth Graunke aa1ef0b984 aubinator: Drop spec parameter to decode_group().
No longer necessary - the iterator gets it from the group.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-20 11:20:51 -07:00
Kenneth Graunke b2c0c1d9a5 aubinator: Make the iterator store a pointer to structure descriptions.
When the iterator encounters a structure field, it now looks up the
gen_group for that structure definition and saves a pointer to it.

This lets us drop a lot of ridiculous code in the caller, which looked
at item->value (<struct NAME dword>), strtok'd the structure name back
out, and looked it up itself.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-20 11:20:51 -07:00
Kenneth Graunke a1aa78cb45 aubinator: Track the current field's starting dword offset.
The iterator code already computed this value, then we stored it in
the structure name, strtok'd it back out, and also manually computed
it when printing dword headers.

Just put the value in the struct and use it.  Way simpler.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-20 11:20:51 -07:00
Kenneth Graunke e6f7357cab aubinator: Drop decode_structure() helper.
It made more sense when decode_group() took a bunch of extra options,
but now that there's only one...we may as well pass 0 and call it a day.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-20 11:20:51 -07:00
Kenneth Graunke a8d4184b00 aubinator: Drop unused print_dword_headers flag.
I added this flag in 65a9d5eabb but
it was completely unused.  Both callers appear to have printed dword
headers, so we can just drop the flag and continue doing it
unconditionally.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-20 11:20:51 -07:00
Kenneth Graunke 7f21cb56b8 aubinator: Store a pointer from gen_group back to gen_spec.
When decoding a structure field within a group, we may want to look up
that structure type.  Having a gen_spec pointer makes it easy to do so.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-20 11:20:51 -07:00
Kenneth Graunke 2c6c760a4b aubinator: Store enum textual name in iter->value.
gen_field_iterator_next() produces a string representing the value of
the field.  For enum values, it also produced a separate "description"
string containing the textual name of the enum.

The only caller of this function combines the two, printing enums as
"<numeric value> (<textual enum name>)".  We may as well just store
that in item->value directly, eliminating the description field, and
a layer of wrapping.

v2: Use non-overlapping source and destination strings in snprintf.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-20 11:20:51 -07:00
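
The combined enum format described in 2c6c760a4b, together with the v2 note about non-overlapping buffers, might look roughly like the sketch below. The function name and parameters are hypothetical and this is not aubinator's actual code.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "<numeric value> (<enum name>)" into dst,
 * or just the numeric value when the field has no enum name.
 * dst must not overlap enum_name: passing overlapping source and
 * destination buffers to snprintf is undefined behavior, which is the
 * kind of issue the v2 note refers to. */
static void example_format_enum_value(char *dst, size_t dst_size,
                                      unsigned long long raw_value,
                                      const char *enum_name)
{
    if (enum_name != NULL)
        snprintf(dst, dst_size, "%llu (%s)", raw_value, enum_name);
    else
        snprintf(dst, dst_size, "%llu", raw_value);
}
```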
Julien Isorce a6e2124402 si_descriptor: move velems nullity check before dereference
CID 1399479: Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking velems suggests that it may be null,
but it has already been dereferenced on all paths leading to the check.

Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-20 18:01:51 +00:00
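
The REVERSE_INULL pattern fixed in a6e2124402 is generic: a pointer is dereferenced first and null-checked afterwards. A minimal sketch of the corrected ordering follows; the struct and function names are hypothetical and do not come from si_descriptor.

```c
#include <stddef.h>

/* Hypothetical stand-in for a vertex elements state object. */
struct example_vertex_elements {
    unsigned count;
};

/* The null check comes before any dereference, so a checker no longer
 * sees a use of velems on a path that precedes the check. */
static unsigned example_element_count(const struct example_vertex_elements *velems)
{
    if (velems == NULL)
        return 0;

    return velems->count;
}
```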
Julien Isorce 521860b2a9 radeon_drm_bo: explicitly check return value of drmCommandWriteRead
CID 1313492

Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-20 18:01:51 +00:00
Julien Isorce dac124466a si_pipe: remove nullity check after dereference
sscreen cannot be NULL

CID 1354483

Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-20 18:01:41 +00:00
Julien Isorce ce27b27c38 radeon: initialize hole variable before calling container_of
Like in a few other places in that radeon_drm_bo.c file.

CID 715739.

Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-20 16:47:31 +00:00
Nanley Chery 7c50f9903f intel: Correct the BDW surface state size
The PRMs state that this packet is 16 DWORDS long. Ensure that the last
three DWORDS are zeroed as required by the hardware when allocating a
null surface state.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-03-20 09:43:44 -07:00