KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Marek Olšák	7e76f9a7a8	radeonsi: record information about all written and read varyings It's just tgsi_shader_info with DEFAULT_VAL varyings removed. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	c7f3e5c647	radeonsi: make si_shader_io_get_unique_index stricter Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	ed3190b3f3	radeonsi: don't export ClipVertex and ClipDistance[] if clipping is disabled This is the first user of optimized monolithic shader variants. Cull distances can't be disabled by states. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	d984a324bf	radeonsi: add infrastr. for compiling optimized shader variants asynchronously Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	d2a56985d7	radeonsi: don't set vs.epilog.export_prim_id if TES is bound there is no VS epilog in this case Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	fee71fec25	radeonsi: simplify checking for monolithic compilation Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	e6aee45db4	radeonsi: print all flags in si_dump_shader_key Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	6d5c2a8b5c	radeonsi: split the shader key into 3 logical parts key->part.: prolog and epilog flags only key->as_{ls,es}: special flags key->mono.: flags for monolithic compilation only Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	d4e9f409e9	radeonsi: fix culling if clip & cull distances are used at the same time Fixed piglits: - arb_cull_distance/clip-cull-3 - arb_cull_distance/clip-cull-4 Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	9d8db805ef	radeonsi: clean up si_emit_clip_regs Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	e59389d738	radeonsi: assume that a VS without POSITION is LS Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	7dbf83af54	tgsi/scan: record if a shader writes the position output Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	8a2251911e	tgsi/scan: use a big switch for scanning outputs Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	bdd860e307	radeonsi: decrease the number of texture slots to 24 Company Of Heroes 2 needs only 24. This saves 512 bytes of CE RAM per shader stage. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	fa476e0566	radeonsi: fast exit si_emit_derived_tess_state early Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	79a8e674ae	winsys/amdgpu: set addrlib flag opt4Space Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	72d1669ed2	radeonsi: check for !is_linear in do_hardware_msaa_resolve We don't want opt4Space here. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	49fa4a4e60	gallium/radeon: add RADEON_SURF_OPTIMIZE_FOR_SPACE FORCE_TILING should disable it. It has no effect now, but that may change soon. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Mun Gwan-gyeong	44a3f2ee09	radeonsi: Add missing error-checking to si_create_compute_state (v2) When the uploading of shader fails on si_shader_binary_upload(), it returns -ENOMEM. We should handle si_shader_binary_upload() failure path on si_create_compute_state(). CID 1394027 v2: Fixes from Edward O'Callaghan's review a) Update explicitly return value check with "si_shader_binary_upload() < 0" b) Update commit message. Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-11-21 21:09:06 +01:00
Roland Scheidegger	e442db8e98	draw: drop some overflow computations It turns out that noone actually cares if the address computations overflow, be it the stride mul or the offset adds. Wrap around seems to be explicitly permitted even by some other API (which is a _very_ surprising result, as these overflow computations were added just for that and made some tests pass at that time - I suspect some later fixes fixed the actual root cause...). So the requirements in that other api were actually sane there all along after all... Still need to make sure the computed buffer size needed is valid, of course. This ditches the shiny new widening mul from these codepaths, ah well... And now that I really understand this, change the fishy min limiting indices to what it really should have done. Which is simply to prevent fetching more values than valid for the last loop iteration. (This makes the code path in the loop minimally more complex for the non-indexed case as we have to skip the optimization combining two adds. I think it should be safe to skip this actually there, but I don't care much about this especially since skipping that optimization actually makes the code easier to read elsewhere.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
Roland Scheidegger	2471aaa02f	draw: simplify fetch some more Don't keep the ofbit. This is just a minor simplification, just adjust the buffer size so that there will always be an overflow if buffers aren't valid to fetch from. Also, get rid of control flow from the instanced path too. Not worried about performance, but it's simpler and keeps the code more similar to ordinary fetch. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
Roland Scheidegger	4e1be31f01	draw: unify linear and elts draw jit functions The code for elts and linear paths was nearly 100% identical by now - with the elts path simply having some additional gather for the elements in the main loop (with some additional small differences before the main loop). Hence nuke the separate functions and decide this at jit shader execution time (simply based on the presence of the elts pointer). Some analysis shows that the generated vs jit functions seem to be just very minimally more complex than the former elts functions, and almost none of the additional complexity is in the main loop (basically just the branch logic for the branch fetching the actual indices). Compared to linear, the codesize of the function is of course a bit larger, however the actual executed code in the main loop appears to be near 100% identical (the additional code looking up indices is skipped as expected). So, I would not expect a (meaningful) performance difference with the generated code, neither with elts nor linear, this does however roughly half the compilation time (the compiled shaders should also use only half the memory of course). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
Roland Scheidegger	8cf7edff7d	draw: use same argument order for jit draw linear / elts functions This is a bit simpler. Mostly to make it easier to unify the paths later... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
Roland Scheidegger	78a997f728	draw: drop unnecessary index overflow handling from vsplit code This was kind of strange, since it replaced indices which were only overflowing due to bias with MAX_UINT. This would cause an overflow later in the shader, except if stride was 0, however the vertex id would be essentially random then (-1 + eltBias). No test cared about it, though. So, drop this and just use ordinary int arithmetic wraparound as usual. This is much simpler to understand and the results are "more correct" or at least more consistent (vertex id as well as actual fetch results just correspond to wrapped around arithmetic). There's only one catch, it is now possible to hit the cache initialization value also with ushort and ubyte elts path (this wouldn't be an issue if we'd simply handle the eltBias itself later in the shader). Hence, we need to make sure the cache logic doesn't think this element has already been emitted when it has not (I believe some seriously bad things could happen otherwise). So, borrow the logic which handled this from the uint case, but not before fixing it up... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
Roland Scheidegger	7a55c436c6	draw: simplify vsplit elts code a bit vsplit_get_base_idx explicitly returned idx 0 and set the ofbit in case of overflow. We'd then check the ofbit and use idx 0 instead of looking it up. This was necessary because DRAW_GET_IDX used to return DRAW_MAX_FETCH_IDX and not 0 in case of overflows. However, this is all unnecessary, we can just let DRAW_GET_IDX return 0 in case of overflow. In fact before `bbd1e60198` the code already did that, not sure why this particular bit was changed (might have been one half of an attempt to get these indices to actual draw shader execution - in fact I think this would make things less awkward, it would require moving the eltBias handling to the shader as well). Note there's other callers of DRAW_GET_IDX - those code paths however explicitly do not handle index buffer overflows, therefore the overflow value doesn't matter for them. Also do some trivial simplification - for (unsigned) a + b, checking res < a is sufficient for overflow detection, we don't need to check for res < b too (similar for signed). And an index buffer overflow check looked bogus - eltMax is the number of elements in the index buffer, not the maximum element which can be fetched. (Drop the start check against the idx buffer though, this is already covered by end check and end < start). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
George Kyriazis	9aae167e94	gallium: Add support for SWR compilation Include swr library and include -DHAVE_SWR in the compile line. v3: split to a separate commit Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:47 -06:00
George Kyriazis	5b4d1500dd	gallium: swr: Added swr build for windows v4: Add windows-specific gen_knobs.{cpp|h} changes v5: remove aggresive squashing of gen_knobs.py to this commit; added SConscript to EXTRA_DIST in Makefile.am Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:47 -06:00
George Kyriazis	9e4e1f5190	swr: Modify gen_knobs.{cpp|h} creation script Modify gen_knobs.py so that each invocation creates a single generated file. This is more similar to how the other generators behave. v5: remove Scoscript edits from this commit; moved to commit that first adds SConscript Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:47 -06:00
George Kyriazis	9085f1a9cc	scons: Add swr compile option To buils The SWR driver (currently optional, not compiled by default) v3: add option as opposed to target Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:47 -06:00
George Kyriazis	bc26e8d4a7	swr: Windows-related changes - Handle dynamic library loading for windows - Implement swap for gdi - fix prototypes - update include paths on configure-based build for swr_loader.cpp v2: split to multiple patches v3: split and reshuffle some more; renamed title v4: move Makefile.am changes to other commit. Modify header files Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:46 -06:00
George Kyriazis	87bd28210f	swr: renamed duplicate swr_create_screen() There are 2 swr_create_screen() functions. One in swr_loader.cpp, which is used during driver init, and the other is hiding in swr_screen.cpp, which ends up in the arch-specific .dll/.so. Rename the second one to swr_create_screen_internal(), to avoid confusion in header files. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:46 -06:00
George Kyriazis	974d280e81	swr: Handle windows.h and NOMINMAX Reorder header files so that we have a chance to defined NOMINMAX before mesa include files include windows.h v3: split from bigger patch Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:46 -06:00
George Kyriazis	915b4b0d49	gallium: Added SWR support for gdi Added hooks for screen creation and swap. Still keep llvmpipe the default software renderer. v2: split from bigger patch v3: reword commit message Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:46 -06:00
George Kyriazis	30ae2cbf82	scons: add llvm 3.9 support. v2: reworded commit message Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:46 -06:00
George Kyriazis	2da28dbd11	scons: ignore .hpp files in parse_source_list() Drivers that contain C++ .hpp files need to ignore them too, along with .h files, when building source file lists. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:46 -06:00
George Kyriazis	c323180733	mesa: removed redundant #else Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:46 -06:00
Jordan Justen	44c5ed02d1	i965/hsw: Set integer mode in sampling state for stencil texturing Fixes: ES31-CTS.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_pot ES31-CTS.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_npot ES31-CTS.functional.texture.border_clamp.formats.depth32f_stencil8_sample_stencil.nearest_size_pot ES31-CTS.functional.texture.border_clamp.formats.depth32f_stencil8_sample_stencil.nearest_size_npot ES31-CTS.functional.texture.border_clamp.unused_channels.depth24_stencil8_sample_stencil ES31-CTS.functional.texture.border_clamp.unused_channels.depth32f_stencil8_sample_stencil Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-21 10:10:53 -08:00
Emil Velikov	8e0e2478ba	reviewers: add Rob H for the Android EGL+build parts Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 16:01:06 +00:00
Emil Velikov	7a39a0091d	docs: recommend using --enable-mangling over the manual -DUSE... Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:27 +00:00
Emil Velikov	0fa854aea5	docs: rework/update install.html Still far from perfect, but a few small steps in the right direction. - Split build systems, compilers, third party tools - Mention building mesa for Android (part of AOSP) - Drop explicit "other" dependencies. Reference to disto methods to get them. - HTML 4.01 Traditional compliance fixes - mixed ul and br tags. - nuke dead links README.{CYGWIN,VMS} v2: Squash typos, add note about buggy flex 2.6.2 (Eric), add Suse zipper command (Tobias). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:23 +00:00
Emil Velikov	438086efb1	docs: sourcetree.html misc updates A mixed bag of updates/fixes - mostly aiming at removing no longer applicable directories. Add a few more state-trackers, drivers, etc. alongside "XXX more" where applicable. Attribute for the GLSL/NIR movement and nukage of src/egl/docs. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:20 +00:00
Emil Velikov	2edc29ab1e	docs: flesh out releasing.html Properly document the whole process: - Brief on what, when, where - Picking, testing, branchpoints, pre-release announcement - Releasing, announcement, website and bugzilla updates Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:18 +00:00
Emil Velikov	b571c075e9	docs/submittingpatches: fix tags mis/abuse Fix the odd tag so that we're HTML 4.01 Traditional compliant Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:14 +00:00
Emil Velikov	07384468af	docs/submittingpatches: flesh out "how to nominate" methods Currently they are buried within the text, making it hard to find. Move them to the top and be clear what is _not_ a good idea. v2: Minor commit polish, use only "resending" as suggested by Matt. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:12 +00:00
Emil Velikov	019f055f32	docs/autoconf: update glx driver / enable-debug text With earlier commit we folded all the xlib handling in --enable-glx, but we forgot to update the documentation. Elaborate on --enable-debug and drop mentions about depenencies. v2: Grammar - s|haven't|hasn't| (Eric) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:09 +00:00
Emil Velikov	49ac732651	docs/repository: refer to Submitting patches v2: Improve grammar - add missing "to" (Eric). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:07 +00:00
Emil Velikov	259e65c03e	docs: split Submitting Patches into separate document Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:05 +00:00
Emil Velikov	e561737c52	docs: split Codying style into separate document Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:04 +00:00
Emil Velikov	edbf3ebe1f	docs: mention/suggest testing your patch against dEQP Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:02 +00:00
Emil Velikov	f2d9c7b60c	docs: mention that coding style can differ between drivers ... and point people to use/honour the EditorConfig/Emacs files, where applicable. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:07:59 +00:00

86765 Commits