KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Roland Scheidegger	e442db8e98	draw: drop some overflow computations It turns out that no one actually cares if the address computations overflow, be it the stride mul or the offset adds. Wrap around seems to be explicitly permitted even by some other API (which is a _very_ surprising result, as these overflow computations were added just for that and made some tests pass at that time - I suspect some later fixes fixed the actual root cause...). So the requirements in that other api were actually sane there all along after all... Still need to make sure the computed buffer size needed is valid, of course. This ditches the shiny new widening mul from these codepaths, ah well... And now that I really understand this, change the fishy min limiting indices to what it really should have done. Which is simply to prevent fetching more values than valid for the last loop iteration. (This makes the code path in the loop minimally more complex for the non-indexed case as we have to skip the optimization combining two adds. I think it should be safe to skip this actually there, but I don't care much about this especially since skipping that optimization actually makes the code easier to read elsewhere.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
Roland Scheidegger	2471aaa02f	draw: simplify fetch some more Don't keep the ofbit. This is just a minor simplification, just adjust the buffer size so that there will always be an overflow if buffers aren't valid to fetch from. Also, get rid of control flow from the instanced path too. Not worried about performance, but it's simpler and keeps the code more similar to ordinary fetch. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
Roland Scheidegger	4e1be31f01	draw: unify linear and elts draw jit functions The code for elts and linear paths was nearly 100% identical by now - with the elts path simply having some additional gather for the elements in the main loop (with some additional small differences before the main loop). Hence nuke the separate functions and decide this at jit shader execution time (simply based on the presence of the elts pointer). Some analysis shows that the generated vs jit functions seem to be just very minimally more complex than the former elts functions, and almost none of the additional complexity is in the main loop (basically just the branch logic for the branch fetching the actual indices). Compared to linear, the codesize of the function is of course a bit larger, however the actual executed code in the main loop appears to be near 100% identical (the additional code looking up indices is skipped as expected). So, I would not expect a (meaningful) performance difference with the generated code, neither with elts nor linear, this does however roughly halve the compilation time (the compiled shaders should also use only half the memory of course). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
Roland Scheidegger	8cf7edff7d	draw: use same argument order for jit draw linear / elts functions This is a bit simpler. Mostly to make it easier to unify the paths later... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
Roland Scheidegger	78a997f728	draw: drop unnecessary index overflow handling from vsplit code This was kind of strange, since it replaced indices which were only overflowing due to bias with MAX_UINT. This would cause an overflow later in the shader, except if stride was 0, however the vertex id would be essentially random then (-1 + eltBias). No test cared about it, though. So, drop this and just use ordinary int arithmetic wraparound as usual. This is much simpler to understand and the results are "more correct" or at least more consistent (vertex id as well as actual fetch results just correspond to wrapped around arithmetic). There's only one catch, it is now possible to hit the cache initialization value also with ushort and ubyte elts path (this wouldn't be an issue if we'd simply handle the eltBias itself later in the shader). Hence, we need to make sure the cache logic doesn't think this element has already been emitted when it has not (I believe some seriously bad things could happen otherwise). So, borrow the logic which handled this from the uint case, but not before fixing it up... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
Roland Scheidegger	7a55c436c6	draw: simplify vsplit elts code a bit vsplit_get_base_idx explicitly returned idx 0 and set the ofbit in case of overflow. We'd then check the ofbit and use idx 0 instead of looking it up. This was necessary because DRAW_GET_IDX used to return DRAW_MAX_FETCH_IDX and not 0 in case of overflows. However, this is all unnecessary, we can just let DRAW_GET_IDX return 0 in case of overflow. In fact before `bbd1e60198` the code already did that, not sure why this particular bit was changed (might have been one half of an attempt to get these indices to actual draw shader execution - in fact I think this would make things less awkward, it would require moving the eltBias handling to the shader as well). Note there's other callers of DRAW_GET_IDX - those code paths however explicitly do not handle index buffer overflows, therefore the overflow value doesn't matter for them. Also do some trivial simplification - for (unsigned) a + b, checking res < a is sufficient for overflow detection, we don't need to check for res < b too (similar for signed). And an index buffer overflow check looked bogus - eltMax is the number of elements in the index buffer, not the maximum element which can be fetched. (Drop the start check against the idx buffer though, this is already covered by end check and end < start). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
Roland Scheidegger	5ec3a7333f	draw: finally optimize bool clip mask generation lp_build_any_true_range is just what we need, though it will only produce optimal code with sse41 (ptest + set) - but even without it on 64bit x86 the code is still better (1 unpack, 2 movq + or + set), on 32bit x86 it's going to be roughly the same as before. While here also make it a "real" 8bit boolean - cuts one instruction but more importantly similar to ordinary booleans. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-18 01:25:21 +01:00
Roland Scheidegger	b16f06fd05	draw: use vectorized calculations for fetch (v2) Instead of doing all the math with scalars, use vectors. This means the overflow math needs to be done manually, albeit that's only really problematic for the stride/index mul, the rest has been pretty much moved outside the shader loop (albeit the mul could actually be optimized away too), where things are still scalar. To eliminate control flow in the main shader loop fetch, provide fake buffers (so index 0 is always valid to fetch). Still uses aos fetch though in the end - mostly because some more code would be needed to handle unaligned fetches in that path, and because for most formats it won't make a difference anyway (we generate some truly horrendous code for things like R16G16_something for instance). Instanced fetch however stays roughly the same as before, except that no longer the same element is fetched multiple times (I've seen a reduction of ~3 times in main shader loop size due to llvm not recognizing it's all the same fetch, since it would have been possible some of the fetches getting replaced with zeros in case vector size exceeds remaining fetch count - the values of such fetches don't matter at all though). Also, for elts gathering, use vectorized code as well. The generated shaders are smaller and faster to compile (not entirely sure about execution speed, but generally unless there's just single vertices to handle I would expect it to be faster - there's more opportunities for future improvements by using soa fetch). v3: skip the fake index buffer, not needed due to the jit code never seeing the real index buffer in the first place. Fix a bug with mask expansion (needs SExt, not ZExt). Also, be really really careful to keep the behavior the same, even in cases where it looks wrong, and add comments why the code is doing the seemingly wrong stuff... Fortunately it's not actually more complex in the end... Also change function order slightly just to make the diff more readable. No piglit change. Passes some internal testing with another api too... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-18 01:25:21 +01:00
Nicolai Hähnle	fb17b7f99d	u_simple_shaders: try to un-break the Windows build Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-16 13:25:35 +01:00
Nicolai Hähnle	3817a7a1d7	util/blitter: add clamping during SINT <-> UINT blits Even though glBlitFramebuffer cannot be used for SINT <-> UINT blits, we still need to handle this type of blit here because it can happen as part of texture uploads / downloads, e.g. uploading a GL_RGBA8I texture from GL_UNSIGNED_INT data. Fixes parts of GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-16 10:31:21 +01:00
Nicolai Hähnle	ab5fd10eaa	util/blitter: index texfetch_col shaders by type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-16 10:31:07 +01:00
Marek Olšák	72217d4335	gallium: add PIPE_SHADER_CAP_LOWER_IF_THRESHOLD Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-15 20:23:40 +01:00
Marek Olšák	5b8876609e	gallivm: limit use of setFastMathFlags to LLVM 3.8 and later Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-15 20:22:28 +01:00
Marek Olšák	41d20d4920	gallivm: add lp_create_builder with an unsafe_fpmath option Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-15 19:17:56 +01:00
Tim Rowley	b9578b683d	gallium: detect avx512 cpu features v3: fix check for xmm/ymm test v2: style code, add avx512 to cpu dump Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-11-10 15:03:21 -06:00
Nicolai Hähnle	b46a9c570f	gallivm: fix [IU]MUL_HI regression harder The fix in commit `88f791db75` was insufficient for radeonsi because the vector case was not handled properly. It seems piglit only covers the scalar case, unfortunately. Fixes GL45-CTS.shader_bitfield_operation.[iu]mulExtended.* Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-11-10 13:17:10 +01:00
Tom Stellard	8bdd52c8f3	gallivm: Fix build after removal of deprecated attribute API v3 v2: Fix adding parameter attributes with LLVM < 4.0. v3: Fix typo. Fix parameter index. Add a gallivm enum for function attributes. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-09 20:13:27 +00:00
Roland Scheidegger	4d5346aaac	Revert "draw: use vectorized calculations for fetch" Trivial. There's some regressions internally, related to overflow behavior. I'll have to look at it at another time, some interactions with vsplit/vcache are actually mind-blowing. This reverts commit `3fa10ffb49`.	2016-11-09 05:53:16 +01:00
Marek Olšák	bdd48e47c0	tgsi/scan: turn a huge if-else-if.. chain into a switch statement Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-08 17:56:42 +01:00
Marek Olšák	f864547fa9	tgsi/scan: fix images_buffers regression The first IF statement disabled the second one. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98599 Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-08 17:56:42 +01:00
Nicolai Hähnle	88f791db75	gallivm: fix [IU]MUL_HI regression This patch does two things: 1. It separates the host-CPU code generation from the generic code generation. This guards against accidentally breaking things for radeonsi in the future. 2. It makes sure we actually use both arguments and don't just compute a square :-p Fixes a regression introduced by commit `29279f44b3` Cc: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-11-08 16:25:54 +01:00
Roland Scheidegger	3fa10ffb49	draw: use vectorized calculations for fetch Instead of doing all the math with scalars, use vectors. This means the overflow math needs to be done manually, albeit that's only really problematic for the stride/index mul, the rest has been pretty much moved outside the shader loop (albeit the mul could actually be optimized away too), where things are still scalar. Because llvm is complete fail with the zero-extend widening mul, roll our own even... To eliminate control flow in the main shader loop fetch, provide fake buffers (so index 0 is always valid to fetch). Still uses aos fetch though in the end - mostly because some more code would be needed to handle unaligned fetches in that path, and because for most formats it won't make a difference anyway (we generate some truly horrendous code for things like R16G16_something for instance). Instanced fetch however stays roughly the same as before, except that no longer the same element is fetched multiple times (I've seen a reduction of ~3 times in main shader loop size due to apparently llvm not being able to deduce it's really all the same with a couple instanced elements). Also, for elts gathering, use vectorized code as well - provide a fake elt buffer if there's no valid one bound. The generated shaders are smaller and faster to compile (not entirely sure about execution speed, but generally unless there's just single vertices to handle I would expect it to be faster - there's more opportunities for future improvements by using soa fetch). No piglit change. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-08 03:41:26 +01:00
Roland Scheidegger	29279f44b3	gallivm: introduce 32x32->64bit lp_build_mul_32_lohi function This is used by shader umul_hi/imul_hi functions (and soon by draw). It's actually useful separating this out on its own, however the real reason for doing it is because we're using an optimized sse2 version, since the code llvm generates is atrocious (since there's no widening mul in llvm, and it does not recognize the widening mul pattern, so it generates code for real 64x64->64bit mul, which the cpu can't do natively, in contrast to 32x32->64bit mul which it could do). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-08 03:41:26 +01:00
Steven Toth	381edca826	gallium/hud: protect against an initialization race In the event that multiple threads attempt to install a graph concurrently, protect the shared list. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-07 18:31:52 +01:00
Steven Toth	5a58323064	gallium/hud: close a previously opened handle We're missing the closedir() to the matching opendir(). Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-07 18:31:52 +01:00
Steven Toth	6ffed08679	gallium/hud: fix a problem where objects are free'd while in use. Instead of trying to maintain a reference counted list of valid HUD objects, and freeing them accordingly, creating race conditions between unanticipated multiple threads, simply accept they're allocated once and never released until the process terminates. They're a shared resource between multiple threads, so accept they're always available for use. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-07 18:31:52 +01:00
Roland Scheidegger	572a952126	draw: fix undefined input handling some more... Previous fixes were incomplete - some code still iterated through the number of elements provided by velem layout instead of the number stored in the key (which is the same as the number defined by the vs). And also actually accessed the elements from the layout directly instead of those in the key. This mismatch could still cause crashes. (Besides, it is a very good idea to only use data stored in the key anyway.) v2: move null format check, remove now unnecessary function parameter, some minor prettify Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-04 01:48:22 +01:00
Brian Paul	f4dd3bde37	gallium/hud: call fflush() after printing error messages For Windows. Otherwise, we don't see the message until the program exits. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-11-03 14:29:23 -06:00
Timothy Arceri	e1af20f18a	nir/i965/anv/radv/gallium: make shader info a pointer When restoring something from shader cache we won't have and don't want to create a nir_shader this change detaches the two. There are other advantages such as being able to reuse the shader info populated by GLSL IR. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Brian Paul	88a618ce86	tgsi: trivial build fix for MSVC Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-24 14:16:07 -07:00
Axel Davy	54010cf8b6	gallium/util: Add align_calloc Add implementation for align_calloc, which is align_malloc + memset. v2: add if (ptr) before memset. Fix indentation. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-24 21:56:44 +02:00
Marek Olšák	f35b1d156b	tgsi/scan: scan texture offset operands This seems important considering how much we depend on some of the flags. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-24 21:41:38 +02:00
Marek Olšák	a2f98dff14	tgsi/scan: move src operand processing into a separate function the next commit will need this Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-24 21:41:36 +02:00
Marek Olšák	72267a25db	tgsi/scan: get information about shader buffer usage Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-24 21:41:35 +02:00
Marek Olšák	d89890d000	tgsi/scan: handle indirect image indexing correctly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-24 21:41:33 +02:00
Marek Olšák	ac37720f51	tgsi/scan: don't treat RESQ etc. as memory instructions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-24 21:41:30 +02:00
Marek Olšák	f095a4eb17	tgsi/scan: get information about indirect 2D file access Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-24 21:41:28 +02:00
Marek Olšák	965a5f1810	tgsi/scan: get information about indirect CONST access Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-24 21:41:26 +02:00
Marek Olšák	c2a602d21a	gallivm: try to fix build with LLVM <= 3.4 due to missing CallSite.h Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2016-10-20 17:45:23 +02:00
Marek Olšák	2db56434d4	gallivm: add wrappers for missing functions in LLVM <= 3.8 radeonsi needs these. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-20 11:07:50 +02:00
Roland Scheidegger	aeceec54a8	draw: improve vertex fetch (v2) The per-element fetch has quite some calculations which are constant, these can be moved outside both the per-element as well as the main shader loop (llvm can figure out it's constant mostly on its own, however this can have a significant compile time cost). Similarly, it looks easier swapping the fetch loops (outer loop per attrib, inner loop filling up the per vertex elements - this way the aos->soa conversion also can be done per attrib and not just at the end though again this doesn't really make much of a difference in the generated code). (This would also make it possible to vectorize the calculations leading to the fetches.) There's also some minimal change simplifying the overflow math slightly. All in all, the generated code seems to look slightly simpler (depending on the actual vs), but more importantly I've seen a significant reduction in compile times for some vs (albeit with old (3.3) llvm version, and the time reduction is only really for the optimizations run on the IR). v2: adapt to other draw change. No changes with piglit. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-10-19 01:44:59 +02:00
Roland Scheidegger	0942fe548e	draw: improved handling of undefined inputs Previous attempts to zero initialize all inputs were not really optimal (though no performance impact was measurable). In fact this is not really necessary, since we know the max number of inputs used. Instead, just generate fetch for up to max inputs used by the shader, directly replacing inputs for which there was no vertex element by zero. This also cleans up key generation, which previously would have stored some garbage for these elements. And also drop the assertion which indicates such bogus usage by a debug_printf (the whole point of initializing the undefined inputs was to make this case safe to handle). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-10-19 01:44:59 +02:00
Roland Scheidegger	d1b4a3451e	gallivm: print out time for jitting functions with GALLIVM_DEBUG=perf Compilation to actual machine code can easily take as much time as the optimization passes on the IR if not more, so print this out too. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-10-19 01:44:59 +02:00
Roland Scheidegger	6f2f0daeb4	gallivm: Use native packs and unpacks for the lerps For the texturing packs, things looked pretty terrible. For every lerp, we were repacking the values, and while those look sort of cheap with 128bit, with 256bit we end up with 2 of them instead of just 1 but worse, plus 2 extracts too (the unpack, however, works fine with a single instruction, albeit only with llvm 3.8 - the vpmovzxbw). Ideally we'd use more clever pack for llvmpipe backend conversion too since we actually use the "wrong" shuffle (which is more work) when doing the fs twiddle just so we end up with the wrong order for being able to do native pack when converting from 2x8f -> 1x16b. But this requires some refactoring, since the untwiddle is separate from conversion. This is only used for avx2 256bit pack/unpack for now. Improves openarena scores by 8% or so, though overall it's still pretty disappointing how much faster 256bit vectors are even with avx2 (or rather, aren't...). And, of course, eliminating the needless packs/unpacks in the first place would eliminate most of that advantage (not quite all) from this patch. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-10-19 01:44:59 +02:00
Emil Velikov	af7abc512c	loader: remove loader_get_driver_for_fd() driver_type Reminiscent from the pre-loader days, where we had multiple instances of the loader logic in separate places and one could build a "GALLIUM_ONLY" version. Since that is no longer the case and the loaders (glx/egl/gbm) do not (and should not) require to know any classic/gallium specific we can drop the argument and the related code. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:29 +01:00
Marek Olšák	34099894c3	gallium/tgsi: add missing #include Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 11:20:57 +02:00
Jose Fonseca	c6d17701c8	pipe_loader_sw: Don't invoke Unix close() on Windows. Trivial.	2016-10-14 16:29:04 +01:00
Emil Velikov	c079a206ad	gallium: rename drm_driver_descriptor::{, driver_}name Historically we use "device name" for the name of the kernel module and "driver name" for the dri/other driver. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-14 11:09:00 +01:00
Emil Velikov	9837cf13b1	gallium: remove unused drm_driver_descriptor::driver_name Likely unused since day 1, although I've only checked back until the st/dri unification with commit `29ca7d2c94` ("st/dri: merge dri/drm and dri/sw backends") Based on the comment, referencing drmOpenByName it's not something we want to bring back. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-14 11:09:00 +01:00
Brian Paul	b81546d43c	tgsi: fix comment typo in tgsi_ureg.c Trivial.	2016-10-13 17:38:49 -06:00
Axel Davy	197cdd1bbd	gallium/os: Use unsigned integers for size computation Use uint64_t instead of int64_t in the calculation, as the result is uint64_t. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-13 21:16:35 +02:00
Nicolai Hähnle	2b460c750a	tgsi/ureg: add ureg_DECL_output_layout For specifying an exact location/component. v2: change the order of parameters (Dave) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1) Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	047a7c7a0b	tgsi/ureg: add layout/component input declarations v2: change the order of parameters (Dave) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1) Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	f9a01f3872	tgsi/scan: fix num_inputs/num_outputs for shaders with overlapping arrays v2: remove a tautological left-over assert (Marek) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1) Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)	2016-10-12 18:50:10 +02:00
Roland Scheidegger	7e86b2ddae	draw: initialize shader inputs This should make the code more robust if a shader tries to use inputs which aren't defined by the vertex element layout (which usually shouldn't happen). No piglit change. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-10-12 15:05:44 +02:00
Axel Davy	2290eac84e	gallium/util: Really allow aliasing of dst for u_box_union_* Gallium nine relies on aliasing to work with this function. Without this patch, dirty region tracking was incorrect, which could lead to incorrect textures or vertex buffers. Fixes several game bugs with nine. Fixes https://github.com/iXit/Mesa-3D/issues/234 Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-10-10 23:43:48 +02:00
Axel Davy	218459771a	gallium/os: Fix overflow on 32 bits On systems with more than 4GB of ram, os_get_total_physical_memory was triggering an integer overflow for the linux and haiku path, when on 32 bits. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94561 Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-10 23:43:48 +02:00
Steven Toth	e00fdd643b	gallium/hud: Remove superfluous debug No longer required. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-06 16:37:06 +01:00
Marek Olšák	faee2d6dda	tgsi/scan: don't set interp flags for inputs only used by INTERP (v2) (v1 pushed, then reverted) This fixes 9 randomly failing tests on radeonsi: GL45-CTS.shader_multisample_interpolation.render.interpolate_at_centroid.* v2: use input_interpolate[input] (correct) instead of input_interpolate[index] (incorrect) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-05 21:03:23 +02:00
Jose Fonseca	437d7e1baf	gallivm: Use AVX2 gather intrinsics. v2: Use AVX2 gather for non aligned loads too. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-10-04 23:36:20 +01:00
Roland Scheidegger	bc80741d7a	gallivm: Use 8 wide AoS sampling on AVX2. v2: Make sure that with num_lods > 1 and min_filter != mag_filter we still enter the splitting path. So this case would still use 4-wide aos path (as a side note, the 4-wide aos sampling path could actually be improved quite a bit if we have avx2, by just doing the filtering with 256bit vectors). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-10-04 23:36:20 +01:00
José Fonseca	e088390c7d	gallivm: Basic AVX2 support. v2: pblendb -> pblendvb Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-10-04 23:36:20 +01:00
Matt Whitlock	5d0069eca2	gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC) Without this fix, duplicated file descriptors leak into child processes. See commit `aaac913e90` for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-04 11:08:55 +02:00
Nayan Deshmukh	b7a0f2e1f7	vl/dri3: fix warning about incompatible pointer type Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-10-03 12:51:30 -04:00
Steven Toth	e99b9395be	gallium/hud: Add support for CPU frequency monitoring Detect all of the CPUs in the system. Expose metrics for min, max and current frequency in Hz. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-30 15:18:46 -06:00
Marek Olšák	7b87190d2b	Revert "gallium/hud: automatically print % if max_value == 100" This reverts commit `dbfeb0ec12`. With max_value being rounded to 100, it's often wrong. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-30 22:07:12 +02:00
Steven Toth	1d466b9b04	gallium/hud: Add power sensor support Implement support for power based sensors, reporting units in milli-watts and watts. Also, minor cleanup - change the related if block to a switch. Tested with two different power sensors, including the nouveau 'power1' sensors on a GTX950 card. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-29 17:51:15 -06:00
Steven Toth	8c60bcb4c3	gallium/hud: Add support for block I/O, network I/O and lmsensor stats V8: Feedback based on peer review convert if block into a switch Constify some func args V7: Increase precision when measuring lmsensors volts Flatten patch series. V6: Feedback based on peer review Simplify sensor initialization (arg passing). Constify some func args V5: Feedback based on peer review Convert sprintf to snprintf Convert char * to const char * int arg converted to bool Func changes to take a filename vs a larger struct. Omit the space between '*' and the param name. V4: Merged with master as of 2016/9/27 6pm V3: Flatten the entire patchset ready for the ML V2: Additional separate patches based on feedback a) configure.ac: Add a comment related to libsensors b) HUD: Disable Block/NIC I/O stats by default. Implement configuration option --enable-gallium-extra-hud=yes and enable both statistics when this option is enabled. c) Configure.ac: Minor cleanup to user visible configuration settings d) Configure.ac: HUD stats - build system improvements Move the -lsensors out of a deeper Makefile, bring it into the configure.ac. Also, rename a compiler directive to more closely follow the standard. V1: Initial release to the ML Three new features: 1. Disk/block I/O device read/write stats MB/ps. 2. Network Interface RX/TX transfer statistics as a percentage of the overall NIC speed. 3. lmsensor power, voltage and temperature sensors. The lmsensor changes makes a dependency on libsensors so support for the change is opt out by default. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-28 16:18:05 -06:00
Nicolai Hähnle	4421c0fb0d	gallium/radeon/winsyses: reduce the number of pb_cache buckets Small buffers are now handled via the slabs code, so separate buckets in pb_cache have become redundant. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:41 +02:00
Nicolai Hähnle	84f156c0cb	gallium/pipebuffer: add pb_slab utility This is a simple framework for slab allocation from buffers that fits into the buffer management scheme of the radeon and amdgpu winsyses where bufmgrs aren't used. The utility knows about different sized allocations and explicitly manages reclaim of allocations that have pending fences. It manages all the free lists but does not actually touch buffer objects directly, relying on callbacks for that. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:44:42 +02:00
Nicolai Hähnle	b3ebc229dc	gallium/u_math: add util_logbase2_ceil For finding the exponent of the next power of two. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:44:38 +02:00
Rob Clark	ecd6fce261	mesa/st: support lowering multi-planar YUV Support multi-planar YUV for external EGLImages (currently just in the dma-buf import path) by lowering to multiple texture fetches for each plane and CSC in shader. There was some discussion of alternative approaches for tracking the additional UV or U/V planes: https://lists.freedesktop.org/archives/mesa-dev/2016-September/127832.html They all seemed worse than pipe_resource::next Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-09-26 15:29:17 -04:00
Samuel Pitoiset	be0535b8c7	gallium/util: make use of strtol() in debug_get_num_option() This allows using hexadecimal numbers, which are automatically detected by strtol() when the base is 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Brian Paul <brianp@vmware.com>	2016-09-26 19:39:04 +02:00
Brian Paul	b35684543e	gallium/util: add comment on util_is_format_compatible() From reading the code, it's not obvious what is src/dest compatible. The list of a->b copy-compatible formats comes from Jose's original check-in message, with some format name updates. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-09-21 12:26:17 -06:00
Nicolai Hähnle	1f291369e4	gallivm: support negation on 64-bit integers This should be analogous to 32-bit integers. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-21 10:24:50 +02:00
Dave Airlie	5561a37710	gallivm/llvmpipe: prepare support for ARB_gpu_shader_int64. This enables 64-bit integer support in gallivm and llvmpipe. v2: add conversion opcodes. v3: - PIPE_CAP_INT64 is not there yet - restrict DIV/MOD defaults to the CPU, as for 32 bits - TGSI_OPCODE_I2U64 becomes TGSI_OPCODE_U2I64 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-21 10:24:30 +02:00
Dave Airlie	6b26039da3	tgsi/softpipe: prepare ARB_gpu_shader_int64 support. (v3) This adds all the opcodes to tgsi_exec for softpipe to use. v2: add conversion opcodes. v3: - no PIPE_CAP_INT64 yet - change TGSI_OPCODE_I2U64 to TGSI_OPCODE_U2I64 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-21 10:24:11 +02:00
Dave Airlie	3985e6c044	gallium/tgsi: add support for 64-bit integer immediates. This adds support to TGSI for 64-bit integer immediates. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-09-21 10:23:55 +02:00
Dave Airlie	6e1a34d545	gallium: add opcode and types for 64-bit integers. (v3) This just adds the basic support for 64-bit opcodes, and the new types. v2: add conversion opcodes. add documentation. v3: - make docs more consistent - change TGSI_OPCODE_I2U64 to TGSI_OPCODE_U2I64 Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2) Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-21 10:23:05 +02:00
Nayan Deshmukh	853e80f5a0	vl/dri3: handle the case of different GPU (v4.2) In case of prime when rendering is done on GPU other than the server GPU, use a separate linear buffer for each back buffer which will be displayed using present extension. v2: Use a separate linear buffer for each back buffer (Michel) v3: Change variable names and fix coding style (Leo and Emil) v4: Use PIPE_BIND_SAMPLER_VIEW for back buffer in case when a separate linear buffer is used (Michel) v4.1: remove empty line v4.2: destroy the context and handle the case when create_context fails (Emil) Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-20 11:17:02 +02:00
Lars Hamre	ddd6116e32	tgsi: Enable returns from within loops Fixes the following piglit test (for softpipe): /spec/glsl-1.10/execution/fs-loop-return Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:24:13 -06:00
Rob Clark	ba8a50955d	ttn: fix warning after `7bf76563e` Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-09-16 11:55:26 -04:00
Marek Olšák	f019255acf	Revert "tgsi/scan: don't set interp flags for inputs only used by INTERP instructions" This reverts commit `524fd55d2d`. Reason: https://bugs.freedesktop.org/show_bug.cgi?id=97808	2016-09-15 00:47:24 +02:00
Marek Olšák	524fd55d2d	tgsi/scan: don't set interp flags for inputs only used by INTERP instructions radeonsi depends on the interp flags a little bit too much. This fixes 9 randomly failing tests: GL45-CTS.shader_multisample_interpolation.render.interpolate_at_centroid.* Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	c723acc03d	ddebug: dump shader buffers and images this was unimplemented Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Andy Furniss	304f70536a	vl/util: Fix YV12/I420 convert to NV12 U/V reversal Fix VAAPI YV12/I420 convert to NV12 U/V reversal. Input order is YVU when this is called. Signed-off-by: Andy Furniss <adf.lists@gmail.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-09-13 13:58:40 -04:00
Leo Liu	6a7f79af9b	vl/rbsp: match initial escaped bits with valid in the buffer Otherwise the check for the three byte will not make sense. Signed-off-by: Leo Liu <leo.liu@amd.com>	2016-09-12 10:09:27 -04:00
Marek Olšák	5981ab5445	gallium: remove PIPE_BIND_TRANSFER_READ/WRITE not used in any useful way Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-09-08 22:51:33 +02:00
Marek Olšák	e7a73b75a0	gallium: switch drivers to the slab allocator in src/util	2016-09-06 14:24:04 +02:00
Thomas Hellstrom	fc6be40011	gallium/postprocess: Fix resource freeing The code was triggering asserts in DEBUG builds of the SVGA driver since the reference count of the resource was never decremented before destroy. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-09-01 07:59:49 +02:00
Kai Wasserbäch	4c53267b8f	gallium: Use enum pipe_shader_type in set_shader_images() Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-29 09:07:37 -06:00
Kai Wasserbäch	532db3b788	gallium: Use enum pipe_shader_type in set_sampler_views() Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-29 09:07:25 -06:00
Kai Wasserbäch	7413625ad3	gallium: Use enum pipe_shader_type in bind_sampler_states() (v2) v1 → v2: - Fixed indentation (noted by Brian Paul) - Removed second assert from nouveau's switch statements (suggested by Brian Paul) Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-29 08:45:48 -06:00
Marek Olšák	d301efb400	tgsi/scan: remember sampler view types Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-29 14:16:57 +02:00
Brian Paul	d221a6545c	gallium/hud: move signo declaration inside PIPE_OS_UNIX block To silence unused var warning with MSVC, MinGW. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-26 06:19:51 -06:00
Marek Olšák	9daaa6f5a6	gallium: add a pipe_context parameter to resource_get_handle radeonsi needs to do some operations (DCC decompression) for OpenGL-OpenCL interop and this is the only way to make it coherent with the current context. It can optionally be set to NULL. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-25 14:09:48 +02:00
Rhys Kidd	c9c989763a	gallium/ttn: Remove duplicated TGSI_OPCODE_DP2A initialization Duplicate line is currently on 1535. Identified by Clang, when run through Eric Anholt's Travis harness. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-24 11:54:50 -07:00
Leo Liu	5277f25480	vl/rbsp: fix another three byte not detected This happens when three byte "00 00 03" is partly loaded to vlc->buffer, thus at the bottom of buffer with valid bits is "00" or "00 00" and left like "00 03" or "03" in the data, so that it will not be detected by three byte emulation check. The reason for that is the escaped bit was set to 0 from the rbsp init. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-08-24 11:17:16 -04:00
Eric Engestrom	9411eb67ec	gallium/cso: avoid unnecessary null dereference The label `out:` calls `destroy()` which dereferences `ctx`. This is unnecessary as there is nothing to destroy. Immediately return instead. CovID: 1258255 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-24 11:35:05 +01:00
Marek Olšák	0328b20050	gallium/hud: round max_value to print nicely rounded numbers next to graphs This improves readability a lot. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Marek Olšák	0f1befe926	gallium/hud: generalize code for drawing numbers next to graphs Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Marek Olšák	a33eb48d61	gallium/hud: draw numbers with 3 decimal places if those aren't 0 Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Marek Olšák	b9c9551c09	gallium/hud: use sRGB for nicer AA lines Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Marek Olšák	6ffde82083	gallium/hud: use AA lines for graphs this looks a lot better (with the next patch) Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Marek Olšák	6902f9e82a	gallium/hud: don't enable blending for all objects Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Eric Anholt	c078c41520	ttn: Use nir_load_front_face instead of the TGSI-style input. This reduces the diff between GLSL-to-NIR and TGSI-to-NIR, and gives NIR more optimization to work on. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 13:11:36 -07:00
Eric Anholt	ed92241d78	ttn: Make FRAG_RESULT_DEPTH be a float variable to match gtn and ptn. This lets TTN-using drivers handle FRAG_RESULT_DEPTH the same between all their source paths. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-08-19 13:11:36 -07:00
Marek Olšák	325379096f	gallium: change pipe_image_view::first_element/last_element -> offset/size This is required by OpenGL. Our hardware supports this. Example: Bind RGBA32F with offset = 4 bytes. Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 14:15:33 +02:00
Marek Olšák	7cd256ce7e	gallium: change pipe_sampler_view::first_element/last_element -> offset/size This is required by OpenGL. Our hardware supports this. Example: Bind RGBA32F with offset = 4 bytes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97305 Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 14:15:33 +02:00
Nicolai Hähnle	41001ca4bd	gallivm: add lp_build_alloca_undef Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	17e88e276c	gallivm: add create_builder_at_entry helper function Reduces code duplication. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	67c0f077a2	tgsi/scan: add tgsi_scan_arrays Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:21 +02:00
Brian Paul	038b1b11fe	gallium: remove unused u_clear.h file Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 08:28:33 -06:00
Brian Paul	66debeae9d	gallium/util: minor reformatting in u_box.h Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 08:28:32 -06:00
Rob Clark	142dd7b9c0	gallium/u_blitter: split out a helper for common clear state Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 09:21:13 -04:00
Rob Clark	2b2f436c69	gallium/u_blitter: add helper to save FS const buffer state Not (currently) state that is overridden by u_blitter itself, but drivers with custom blit/clear which are reusing part of the u_blitter infrastructure will use it. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 09:21:13 -04:00
Rob Clark	433e12fea8	gallium/u_blitter: export some functions Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 09:21:13 -04:00
Ilia Mirkin	c85b7f0e87	gallium/util: add helper to compute zmin/zmax for a viewport state Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-08-14 17:41:33 -04:00
Leo Liu	6575ebdc45	vl/rbsp: add a check for emulation prevention three byte This is the case when the "00 00 03" is very close to the beginning of nal unit header v2: move the check to rbsp init Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-08-10 09:52:44 -04:00
Marek Olšák	a909210131	gallium: add render_condition_enable param to clear_render_target/depth_stencil Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:10:21 +02:00
Mathias Fröhlich	aa920736fe	gallium: Add c99_compat.h to u_bitcast.h We need this for 'inline'. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-09 21:20:56 +02:00
Mathias Fröhlich	027cbf00f2	util: Move _mesa_fsl/util_last_bit into util/bitscan.h As requested with the initial creation of util/bitscan.h now move other bitscan related functions into util. v2: Split into two patches. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-09 21:20:46 +02:00
Jason Ekstrand	f29fd7897a	util: Move format_r11g11b10f.h to src/util It's used from both mesa main and gallium. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-08-05 09:06:57 -07:00
Jason Ekstrand	6c665cdfc5	util: Move format_rgb9e5.h to src/util It's used from both mesa main and gallium. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-08-05 09:06:31 -07:00
Michel Dänzer	67c5e843b9	vl/dri3: Destroy Present event context when destroying drawable v2 Without this, the X server may accumulate stale Present event contexts if a client performs several video decoding sessions using the same window. v2: Based on Chris Wilson's review: * Use xcb_discard_reply() instead of free(xcb_request_check()) Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>	2016-08-04 15:45:43 +09:00
Marek Olšák	6db93cd167	gallium/util: fix align64 it cut off the upper 32 bits Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-08-01 23:28:14 +02:00
Matt Turner	be35c6ba92	draw: Avoid aliasing violations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:09:17 -07:00
Matt Turner	16ff8f9ae8	gallium/auxiliary: Add u_bitcast.h header. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:09:17 -07:00
Brian Paul	13fa051356	auxiliary/os: add new os_get_command_line() function This can be used by the driver to get the command line which started the process. Will be used by the VMware driver for extra logging. For now, this is only implemented for Linux via /proc/self/cmdline and Windows via GetCommandLine(). Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:20:19 -06:00
Rob Clark	53b2b8bf6f	u_vbuf: fix potentially bogus assert There are cases where we hit u_vbuf path due to alignment or pitch-alignment restrictions, but for an output-format that u_vbuf does not support translating (yet the driver does support natively). In which case we hit the memcpy() path and don't care that u_vbuf doesn't understand it. Fixes crash with debug build of mesa in: dEQP-GLES3.functional.vertex_arrays.single_attribute.strides.fixed.user_ptr_stride17_components2_quads1 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95000 Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 13:42:11 -04:00
Roland Scheidegger	99a47391e4	Revert "gallium/util: fix resource leak" This reverts commit `d1fe26a628`. Replacing a resource leak with a segfault isn't the solution.	2016-07-30 18:18:09 +02:00
Eric Engestrom	d1fe26a628	gallium/util: fix resource leak CovID: 401540 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-30 17:27:42 +02:00
Rob Clark	010e4b2d52	os: add pipe_mutex_assert_locked() Would be nice if we could also have lockdep, like in the linux kernel. But this is better than nothing. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Eric Anholt	4d0b2c7aaa	ttn: Update shader->info as we generate code. We could use the nir_shader_gather_info() pass to update it after the fact, but this is what glsl_to_nir and prog_to_nir do. Reviewed-by: Rob Clark <robclark@freedesktop.org>	2016-07-26 13:47:50 -07:00
Boyuan Zhang	23b4ab1738	vl/util: add copy func for yv12image to nv12surface v2 Add function to copy from yv12 image to nv12 surface for VAAPI putimage call. We need this function in the VaPutImage call when copying from a yv12 image to an nv12 surface for encoding. The existing function can't be used because it only works for copying from yv12 surface to nv12 image in Vaapi. v2: cleanup variable types and commit message Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2016-07-25 13:39:18 +02:00
Marek Olšák	8e3e9d2839	gallium/util: don't modify usage in pipe_buffer_write All drivers were already doing it except virgl. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-23 13:33:42 +02:00
Marek Olšák	1ffe77e7bb	gallium: split transfer_inline_write into buffer and texture callbacks to reduce the call indirections with u_resource_vtbl. The worst call tree you could get was: - u_transfer_inline_write_vtbl - u_default_transfer_inline_write - u_transfer_map_vtbl - driver_transfer_map - u_transfer_unmap_vtbl - driver_transfer_unmap That's 6 indirect calls. Some drivers only had 5. The goal is to have 1 indirect call for drivers that care. The resource type can be determined statically at most call sites. The new interface is: pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data) pipe_context::texture_subdata(ctx, resource, level, usage, box, data, stride, layer_stride) v2: fix whitespace, correct ilo's behavior Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-07-23 13:33:42 +02:00
Marek Olšák	4cdc482283	gallium/os: use CLOCK_MONOTONIC for sleeps (v2) v2: handle EINTR, remove backslashes Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-07-22 22:34:49 +02:00
Marek Olšák	8d5944199d	gallium/pb_cache: reduce the number of pointer dereferences Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	3cdc0e133f	gallium/pb_cache: divide the cache into buckets for reducing cache misses Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	fec7f74129	gallium/pb_cache: check parameters that are more likely to fail first This makes Bioshock Infinite with deferred flushing 2% faster. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Eric Engestrom	8ba46fbd9e	vl: fix memory leak CovID: 1363008 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-19 12:41:00 +02:00
Leo Liu	134d6e4e4f	vl/dri3: fix a memory leak from front buffer Inspired by the fix for the memory leak in vdpau interop: resource_from_handle sets the texture reference count, which needs to be decremented and released. There is a similar case for DRI3: with the VA-API glx extension there is a temporary TFP (texture from pixmap), which we target through dma-buf. The leak happens when the reference is not counted down. Checked with the mpv vo=opengl case: there is only one static TFP, so the leak happens once; but the totem player using gstreamer VA-API glx has a dynamic TFP for each frame, so it leaks quite a bit. This fixes the memory leak for mpv and totem. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-18 09:20:40 -04:00
Kenneth Graunke	ac1181ffbe	compiler: Rename INTERP_QUALIFIER_* to INTERP_MODE_*. Likewise, rename the enum type to glsl_interp_mode. Beyond the GLSL front-end, talking about "interpolation modes" seems more natural than "interpolation qualifiers" - in the IR, we're removed from how exactly the source language specifies how to interpolate an input. Also, SPIR-V calls these "decorations" rather than "qualifiers". Generated by: $ find . -regextype egrep -regex '.*\.(c|cpp|h)' -type f -exec sed -i \ -e 's/INTERP_QUALIFIER_/INTERP_MODE_/g' \ -e 's/glsl_interp_qualifier/glsl_interp_mode/g' {} \; Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Dave Airlie <airlied@redhat.com>	2016-07-17 19:26:48 -07:00
Rob Clark	44bbfedbd9	gallium/u_queue: add optional cleanup callback Adds a second optional cleanup callback, called after the fence is signaled. This is needed if, for example, the queue has the last reference to the object that embeds the util_queue_fence. In this case we cannot drop the ref in the main callback, since that would result in the fence being destroyed before it is signaled. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-16 10:00:04 -04:00
Yaakov Selkowitz	5d303867f5	Use correct names for dlopen()ed files on Cygwin Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>	2016-07-15 19:46:54 +01:00
Marek Olšák	6596ecf8c5	gallivm: add helper lp_add_attr_dereferenceable Not sure if this is the right way to do it, but it seems to work. v2: make it a no-op on LLVM <= 3.5 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Leo Liu	82f875f4d8	vl/compositor: set layer of y or uv to render Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-07-12 09:27:53 -04:00
Leo Liu	14761da9f9	vl/compositor: add weave to yuv shader This shader will make interlaced yuv to progressive yuv. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-07-12 09:27:53 -04:00
Leo Liu	2e18c2c6f8	vl/compositor: move weave shader out from rgb weaving We'll use weave shader in the later patch. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-07-12 09:27:53 -04:00
Marek Olšák	d7b6f90684	gallivm: set LLVMNoUnwindAttribute on all intrinsics RadeonSI stats: Mostly 0% difference, but Valley shows a small improvement: Application Files SGPRs VGPRs SpillSGPR SpillVGPR Code Size LDS Max Waves Waits unigine_valley 278 0.00 % -0.29 % 0.00 % 0.00 % 0.01 % 0.00 % 0.17 % 0.00 % Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-07-11 19:06:05 +02:00
Nicolai Hähnle	374aa2bb27	gallium/u_queue: assert that users must wait on fences before destroying them Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-11 11:04:44 +02:00
Nicolai Hähnle	a0a616720a	gallium/u_queue: guard fence->signalled checks with fence->mutex I have seen a hang during application shutdown that could be explained by the following race condition which this patch fixes: 1. Worker thread enters util_queue_fence_signal, sets fence->signalled = true. 2. Main thread calls util_queue_job_wait, which returns immediately. 3. Main thread deletes the job and fence structures, leaving garbage behind. 4. Worker thread calls pipe_condvar_broadcast, which gets stuck forever because it is accessing garbage. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-11 11:03:59 +02:00
Nayan Deshmukh	af18a04755	vl: add half pixel to v_tex before adding offsets Since pixel center lies at 0.5, add half_pixel to vtex before adding offsets to it. Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-08 20:51:12 +02:00
Rob Clark	def044376a	gallium/util: make util_copy_framebuffer_state(src=NULL) work Be more consistent with the other u_inlines util_copy_xyz_state() helpers and support NULL src. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:17:30 -04:00
Hans de Goede	d386cef246	tgsi: Add WORK_DIM System Value Add a new WORK_DIM SV type, this will return the grid dimensions (1-4) for compute (opencl) kernels. This is necessary to implement the opencl get_work_dim() function. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-02 12:21:28 +02:00
Nayan Deshmukh	872dd9ad15	vl: add a bicubic interpolation filter (v5) This is a shader based bicubic interpolator which uses cubic Hermite spline algorithm. v2: set dst_area and dst_clip during scaling (Christian) v3: clear the render target before rendering v4: initialize offsets while initializing shaders use a constant buffer to send dst_size to frag shader small changes to reduce calculation in shader v5: send half pixel offset instead of sending dst_size Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-01 12:54:33 +02:00
Brian Paul	c823ff8dfb	gallium/util: check for window cliprects in util_can_blit_via_copy_region() We can't blit with resource_copy_region() if there are window clip rects. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-30 18:19:09 -06:00
Brian Paul	5f1335878e	gallium/util: add tight_format_check param to util_can_blit_via_copy_region() The VMware driver will use this for implementing GL_ARB_copy_image. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:06 -06:00
Brian Paul	a029d9f074	gallium/util: simplify a few things in util_can_blit_via_copy_region() Since only the src box can have negative dims for flipping, just comparing the src/dst box sizes is enough to detect flips. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:06 -06:00
Brian Paul	5d31ea4b8f	gallium/util: new util_can_blit_via_copy_region() function Pulled out of the util_try_blit_via_copy_region() function. Subsequent changes build on this. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:06 -06:00
Hans de Goede	459cc94507	pipe_loader_sw: Fix fd leak when instantiated via pipe_loader_sw_probe_kms Make pipe_loader_sw_probe_kms take ownership of the passed in fd, like pipe_loader_drm_probe_fd does. The only caller is dri_kms_init_screen which passes in a dupped fd, just like dri2_init_screen passes in a dupped fd to pipe_loader_drm_probe_fd. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-28 12:29:54 +02:00
Marek Olšák	cbb5adb908	gallium/u_queue: allow the execute function to differ per job so that independent types of jobs can use the same queue. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Marek Olšák	4a06786efd	gallium/u_queue: reduce the number of mutexes by 2 by converting semaphores to condvars and using the main mutex Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Marek Olšák	2fba0aaa70	gallium/u_queue: add an option to name threads for debugging v2: correct the snprintf use Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Marek Olšák	404d0d50d8	gallium/u_queue: add an option to have multiple worker threads independent jobs don't have to be stuck on only one thread v2: use CALLOC & FREE Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Marek Olšák	4358f6dd13	gallium/u_queue: rewrite util_queue_fence to allow multiple waiters Checking "signalled" is first done without a mutex, then with a mutex. Also, checking without waiting doesn't lock the mutex. This is racy, but should be safe. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Marek Olšák	d8367e91f2	gallium/u_queue: use a ring instead of a stack and allow specifying its size in util_queue_init. v2: use CALLOC & FREE Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Brian Paul	e0dc3c5f19	gallium/util: fix some 4-space indentation in blitter code Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-23 07:31:20 -06:00
Ilia Mirkin	5b0d64886d	translate: fix start_instance parameter in sse version The generic version gets this right already, but this was using an incorrect formula in SSE. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-06-21 21:50:16 -04:00
Marek Olšák	5fed1122e8	gallium/u_blitter: implement mipmap generation for pipe_context::generate_mipmap first move some of the blit code from util_blitter_blit_generic to a separate function, then use it from util_blitter_generate_mipmap Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-21 13:52:05 +02:00
Roland Scheidegger	b0cf99165a	gallivm: don't use integer min/max sse intrinsics with llvm >= 3.9 Apparently, these are deprecated. There's some AutoUpgrade feature which is supposed to promote these to cmp/select, which apparently doesn't work with jit code. It is possible it's not actually even meant to work (see the bug filed against llvm which couldn't provide an answer either) but in any case this is meant to be only temporary unless the intrinsics are really illegal. So, just use the fallback code (which should be cmp/select, we're actually doing cmp/sext/trunc/select, but in any case llvm 3.9 manages to optimize this back to pmin/pmax in the end). This addresses https://llvm.org/bugs/show_bug.cgi?id=28176 CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Tested-by: Aaron Watry <awatry@gmail.com>	2016-06-20 17:19:03 +02:00
Christian König	bf89e672cf	vl: support luma keying for interlaced surfaces as well We had the CSC code twice in there, factor it out into a separate function. Signed-off-by: Christian König <christian.koenig@amd.com>	2016-06-16 09:41:12 +02:00
Brian Paul	bb1292e226	auxilary/os: allow appending to GALLIUM_LOG_FILE If the log file specified by the GALLIUM_LOG_FILE begins with '+', open the file in append mode. This is useful to log all gallium output for an entire piglit run, for example. v2: put GALLIUM_LOG_FILE support inside an #ifdef DEBUG block. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-06-15 17:16:42 -06:00
Marek Olšák	562cb03d76	gallium/util: import the multithreaded job queue from amdgpu winsys (v2) v2: rename the event to util_queue_fence Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-15 21:07:34 +02:00
Roland Scheidegger	afbf5888f5	gallium/util: don't use blocksize for minify for assertions The previous assertions required for texture sizes smaller than block_size that src_box.x + src_box.width still be block size. (e.g. for a texture with width 3, and src_box.x = 0, src_box.width would have to be 4 to not assert.) This caused some assertions with some other state tracker. It looks though like callers aren't expected to round up widths to block sizes (for sizes larger than block size the assertion would still have verified it wouldn't have been rounded up) so we simply shouldn't use a minify which rounds up to block size. (No piglit change with llvmpipe.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-14 17:03:34 +02:00
Julien Isorce	1cdb4da1d6	st/va: ensure linear memory for dmabuf In order to do zero-copy between two different devices the memory should not be tiled. Tested with GStreamer on a laptop that has 2 GPUs: 1- gstvaapidecode: HW decoding and dmabuf export with nouveau driver on Nvidia GPU. 2- glimagesink: EGLImage imports dmabuf on Intel GPU. TEST: DRI_PRIME=1 gst-launch vaapidecodebin ! glimagesink Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-14 08:40:33 +01:00
Mathias Fröhlich	c3b6656676	mesa/gallium: Move u_bit_scan{,64} from gallium to util. The functions are also useful for mesa. Introduce src/util/bitscan.{h,c}. Move ffs function implementations from src/mesa/main/imports.{h,c}. Move bit scan related functions from src/gallium/auxiliary/util/u_math.h. Merge platform handling with what is available from within mesa. v2: Try to fix MSVC compile. Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-14 05:19:10 +02:00
Brian Paul	cf9bb9acac	util: update some assertions in util_resource_copy_region() To cope with copies of compressed images which are not multiples of the block size. Suggested by Jose. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@sroland@vmware.com>	2016-06-13 13:30:19 -06:00
Jan Vesely	1fb4179f92	vl: Fix trivial sign compare warnings v2: add whitespace fixes Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-by: Jose Fonseca <jfonseca@vmware.com> [Emil Velikov: squash a few more whitespace issues] Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:31:29 +01:00
Rob Herring	112e988329	Android: move libdrm settings to top-level Android.common.mk Fix warnings like these due to HAVE_LIBDRM being inconsistently defined: external/libdrm/include/drm/drm.h:839:30: warning: redefinition of typedef 'drm_clip_rect_t' is a C11 feature [-Wtypedef-redefinition] typedef struct drm_clip_rect drm_clip_rect_t; HAVE_LIBDRM needs to be set project wide to fix this. This change also harmlessly links libdrm with everything, but simplifies the makefiles a bit. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:31:29 +01:00
Jan Vesely	ace70aedcf	gallivm: Fix trivial sign warnings v2: include whitespace fixes Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-06-13 09:23:09 -04:00
Brian Paul	dd4be2e19a	util: update util_resource_copy_region() for GL_ARB_copy_image This primarily means added support for copying between compressed and uncompressed formats. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-10 15:50:04 -06:00
Anuj Phogat	466b320163	gallium: Fix region overlap conditions for rectangles with a shared edge From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels": "The pixels corresponding to these buffers are copied from the source rectangle bounded by the locations (srcX0, srcY0) and (srcX1, srcY1) to the destination rectangle bounded by the locations (dstX0, dstY0) and (dstX1, dstY1). The lower bounds of the rectangle are inclusive, while the upper bounds are exclusive." So, the rectangles sharing just an edge shouldn't overlap. [ASCII diagram: two rectangles sharing an edge] Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-10 14:35:21 -07:00
Dave Airlie	1584918996	gallivm: more 64-bit integer prep work. This converts one other place to using the new helper. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-11 06:44:30 +10:00
Dave Airlie	e5c57824ec	gallivm: make non-float return code bitcast consistent. This just uses the same form across the fetches. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-11 06:44:17 +10:00
Dave Airlie	3b97e50b9a	gallium/gallivm: use 64-bit test instead of doubles. This just makes some generic code that currently emits double suitable for emitting 64-bit values. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-11 06:44:13 +10:00
Dave Airlie	213ab8db87	gallium/tgsi: add 64-bitness type check function. Currently this just doubles, but we'll convert users to this so making adding 64-bit integers easier. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-11 06:43:45 +10:00
Leo Liu	2ad443e4cc	vl/dri3: support receiving new pixmap for front buffer With glx of gstreamer-vaapi, the temporary pixmap for front buffer gets renewed in each frame, so when we receive a new pixmap, should get a new front buffer for it. This also fixes Totem player playback corruption. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 11:24:24 -04:00
Leo Liu	0ef8500aab	vl/dri3: get Makefile properly In the original commit, the macro "if HAVE_DRI3" was in Makefile.sources. This file is shared with SCons, and SCons is not able to parse this macro, so the SCons build failed. Jose quickly gave two approaches and a quick fix with his second approach; thanks Jose for the solutions and fixes. This patch is Jose's first approach, and it's more proper, because the dri3 c file should not be included in the build when DRI3 is not enabled. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 11:24:19 -04:00
Jose Fonseca	2b4cee0571	gallivm: Never emit llvm.fmuladd on LLVM 3.3. Besides the old JIT bug, it seems the X86 backend on LLVM 3.3 doesn't handle llvm.fmuladd and instead it falls back to a C function. Which in turn causes a segfault on Windows. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-10 16:17:04 +01:00
Jose Fonseca	320d1191c6	gallivm: Use llvm.fmuladd.*. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-10 13:47:35 +01:00
Jose Fonseca	9e8edfa190	util,gallivm: Explicitly enable/disable fma attribute. As suggested by Roland Scheidegger. Use the same logic as f16c, since fma requires VEX encoding. But disable FMA on LLVM 3.3 without MCJIT. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-10 13:47:35 +01:00
Nayan Deshmukh	f24eb5a178	vl: Apply luma key filter before CSC conversion Apply the luma key filter to the YCbCr values during the CSC conversion in video buffer shader. The initial values of max and min luma are set to opposite values to disable the filter initially and will be set when enabling it. Add extra parameters min and max luma for the luma key filter in vl_compositor_set_csc_matrix in va, xvmc. Setting them to opposite value 1.f and 0.f respectively won't affect the CSC conversion v2: -Squash 1,2 and 3 into one patch to avoid breaking build of other components. (Christian) -use ureg_swizzle. (Christian) -change name of the variables. (Christian) v3: -Squash all patches in one to avoid breaking of build. (Emil) -wrap functions properly. (Emil) -use 0.0f and 1.0f instead of 0.f and 1.f respectively. (Emil) v4: -Divide it in two patches one which introduces the functionality and assigns dummy values to the changed functions and second which implements the lumakey filter. (Christian) -use ureg_scalar instead ureg_swizzle. (Christian) Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-09 14:23:07 +02:00
Nicolai Hähnle	d3a584defe	tgsi/scan: add uses_derivatives (v2) v2: - TG4 does not calculate derivatives (Ilia) - also handle SAMPLE* instructions (Roland) Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Brian Paul <brianp@vmware.com> (v1) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-07 23:45:17 +02:00
Ilia Mirkin	30684b50d7	gallium: add VOTE_* opcodes to implement GL_ARB_shader_group_vote Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-06-06 20:49:28 -04:00
Charmaine Lee	627e975896	tgsi: fix mixed data type comparison in tgsi_point_sprite.c Cast the unsigned semantic index to integer datatype before comparing to max_generic, otherwise, max_generic which is initialized to -1 will be converted to unsigned int before the comparison, causing a wrong semantic index to be assigned to a shader output. Fixes the assert running TurboCAD_gl.trace. (VMware bug 1667265) Also tested with glretrace, mesa demos pointblast, spriteblast and pointcoord. v2: use the original max_generic variable but add the (int) cast to the semantic index, as suggested by Brian. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-06 10:20:45 -06:00
Lars Hamre	4163c71010	tgsi: use truncf in micro_trunc Switches to using truncf in micro_trunc. Fixes the following piglit tests (for softpipe): /spec/glsl-1.30/execution/built-in-functions/... fs-trunc-float fs-trunc-vec2 fs-trunc-vec3 fs-trunc-vec4 vs-trunc-float vs-trunc-vec2 vs-trunc-vec3 vs-trunc-vec4 /spec/glsl-1.50/execution/built-in-functions/... gs-trunc-float gs-trunc-vec2 gs-trunc-vec3 gs-trunc-vec4 Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-06 15:56:28 +02:00
Marek Olšák	ada3d8f31e	gallium/u_suballoc: allow different alignment for each allocation Just move the alignment parameter from u_suballocator_create to u_suballocator_alloc. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Rob Clark	228b2b36f4	gallium/util: remove u_staging Unused, and fixes a couple of coverity warnings: CID 1362171, `1362170` Signed-off-by: Rob Clark <robclark@freedesktop.org> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-06-02 15:44:07 -04:00
Nicolai Hähnle	d9893feb2c	gallium/cso: allow saving the first fragment shader image slot Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:15 +02:00
Nicolai Hähnle	fc0352ff9c	gallium/u_inlines: allow NULL src in util_copy_image_view Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:12 +02:00
Marek Olšák	9d881cc0ac	gallium/util: add util_texrange_covers_whole_level from radeon Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	921ab0028e	gallium/u_blitter: do GL-compliant integer resolves The GL spec has been clarified and the new rule says we should just copy 1 sample. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-31 16:48:53 +02:00
Frederic Devernay	cee459d84d	gallivm: initialize init_native_targets_once_flag correctly Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-05-30 16:13:52 +02:00
Brian Paul	747754f027	gallium/util: another s/unsigned/enum pipe_prim_type/ for clang Trivial.	2016-05-27 18:42:21 -06:00
Brian Paul	8beb6f3c9c	gallium/util: another unsigned -> enum pipe_prim_type change gcc didn't warn about the unsigned / enum pipe_prim_type mismatch between the .c and .h file. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-27 17:55:05 -06:00
Roland Scheidegger	9247570d42	gallivm: eliminate an unnecessary AND with unorm lerps Instead of doing an add and then masking out the upper bits, we can simply do an add with a half wide type (this, of course, assumes the hw can actually do it...), so we'll get the required zero in the upper bits automatically. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-27 19:11:28 +02:00
Roland Scheidegger	17d685c426	gallium/util: use enum pipe_prim_type instead of unsigned some more There were complaints from a mingw build: u_draw.h:134:14: error: invalid conversion from ‘uint {aka unsigned int}’ to ‘pipe_prim_type’ [-fpermissive] Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-27 19:11:28 +02:00
Rob Clark	4f98c94be7	gallium/util: fix build break Missing #include caused build breaks after `21a3fb9cd`. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-26 20:59:08 -04:00
Brian Paul	1ec45a1948	gallium/util: use enum pipe_prim_type in u_prim.h functions Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:18 -06:00
Brian Paul	7a49b41436	util/indices: move duplicated assignments out of switch cases Spotted by Roland. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:18 -06:00
Brian Paul	a25ae485a6	util/indices,svga: s/unsigned/enum pipe_prim_type/ Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:18 -06:00
Brian Paul	21a3fb9cd8	util: s/unsigned/enum pipe_resource_usage/ for buffer usage variables Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:18 -06:00
Brian Paul	0f983e1793	util/indices: implement unfilled (tri->line) conversion for adjacency prims Tested with new piglit gl-3.2-adj-prims test. v2: re-order trisadj and tristripadj code, per Roland. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	d6c2c7d710	util/indices: implement provoking vertex conversion for adjacency primitives Tested with new piglit gl-3.2-adj-prims test. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	479d364c39	util/indices: assert that the incoming primitive is a triangle type The unfilled index translator/generator functions should only be called when the primitive mode is one of the triangle types. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	26de558072	util/indices: formatting, whitespace fixes in u_unfilled_indices.c Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	24eadb4810	util/indices: improve comments in u_indices.h Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Rob Clark	6e51fe75a4	tgsi: fix coverity out-of-bounds warning CID 1271532 (#1 of 1): Out-of-bounds read (OVERRUN)34. overrun-local: Overrunning array of 2 16-byte elements at element index 2 (byte offset 32) by dereferencing pointer &inst.Dst[i]. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-26 15:17:49 -04:00
Rob Clark	3d66ba971e	tgsi: fix out of bounds access Not sure why coverity calls this an out-of-bounds read vs out-of-bounds write. CID 1358920 (#1 of 1): Out-of-bounds read (OVERRUN)9. overrun-local: Overrunning array r of 3 16-byte elements at element index 3 (byte offset 48) using index chan (which evaluates to 3). Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-26 15:17:49 -04:00
Lars Hamre	c626a86586	gallium/tgsi: use _mesa_roundevenf in micro_rnd Fixes the following piglit tests (for softpipe): /spec/glsl-1.30/execution/built-in-functions/... fs-roundeven-float fs-roundeven-vec2 fs-roundeven-vec3 fs-roundeven-vec4 vs-roundeven-float vs-roundeven-vec2 vs-roundeven-vec3 vs-roundeven-vec4 /spec/glsl-1.50/execution/built-in-functions/... gs-roundeven-float gs-roundeven-vec2 gs-roundeven-vec3 gs-roundeven-vec4 Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-26 07:59:15 -06:00
Giuseppe Bilotta	8c00fe3970	scons: whitespace cleanup This text transformation was done automatically via the following shell command: $ find -name SCons\* -exec sed -i s/\\s\\+$// '{}' \; Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-25 12:23:12 -06:00
Brian Paul	9690ab0cdf	tgsi: print TGSI_PROPERTY_NEXT_SHADER value as string, not an integer Print "GEOM" instead of "2", for example. v2: also update the text parsing code, per Ilia. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-25 07:21:23 -06:00
Brian Paul	2b773fcf00	tgsi: s/6/PIPE_SHADER_TYPES/ for tgsi_processor_type_names array size Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-25 07:21:23 -06:00
Emil Velikov	a155cdaace	vl/drm: don't call close(-1) in vl_drm_screen_create error path Analogous to previous commits. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-05-23 12:07:47 +01:00
Dave Airlie	e6d9389366	tgsi: remove culldist semantic. This isn't used anymore in the tree, culldist's are part of the clipdist semantic, we could in theory rename it, but I'm not sure there is much point, and I'd have to be careful with virgl. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 11:03:44 +10:00
Dave Airlie	d17062a40e	draw: stop using CULLDIST semantic. The way the HW works doesn't really fit with having two semantics for this. The GLSL compiler emits 2 vec4s and two properties, this makes draw use those instead of CULLDIST semantics. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 11:03:40 +10:00
Axel Davy	52cb8e33c3	gallium/util: Implement util_format_translate_3d This is the equivalent of util_format_translate, but for volumes. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Brian Paul	5888c47cc9	cso: remove / add some comments Signed-off-by: Brian Paul <brianp@vmware.com>	2016-05-17 19:20:36 -06:00
Jan Vesely	47b390fe45	Treewide: Remove Elements() macro Signed-off-by: Jan Vesely <jano.vesely@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-17 15:28:04 -04:00
Jose Fonseca	cf010de6ee	vl/dri: Move the DRI3 check out of sources include into C. Fixes SCons build. Trivial. Built locally with SCons and autotools.	2016-05-16 21:50:43 +01:00
Leo Liu	c122c74dca	vl/dri3: implement functions for get and set timestamp Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	9f50a79b8f	vl/dri3: handle PresentCompleteNotify event and get timestamp calculated based on the event's reply Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	8d7ac0a4e4	vl/dri3: implement DRI3 BufferFromPixmap We also need to render to the front buffer of a temporary X pixmap; this is the case when we use opengl as video out for vaapi. The basic implementation is to pass the pixmap ID to the X server, and then X will return a dma-buf fd; we will get the buffer object through this dma-buf fd. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	858b329c2c	vl/dri3: add support for resizing When drawable size changed, PresentConfigureNotify event will be emitted, by handling the event to re-allocate resized buffer. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	96580ad593	vl/dri3: implement function for get dirty area This will clear presentation area not covered by video content Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	b0bd908284	vl/dri3: implement function for flush frontbuffer Request drawable content in pixmap by calling DRI3 PresentPixmap, and handle PresentIdleNotify event. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	e1223282db	vl/dri3: add back buffers support This implements DRI3 PixmapFromBuffer. Create buffer objects, and associate it to a dma-buf fd, and then pass this fd with a pixmap ID to X server for creating pixmap object; also add a function for wait events. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	69ba9be4d2	vl/dri3: implement flushing for queued events also place holder for present events handling Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	758b1bbaa7	vl/dri3: register present events Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	672e8d5e7e	vl/dri3: set drawable geometry Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	12e5220e34	vl/dri3: add DRI3 support and implement create and destroy Required functions into place for implementation, create screen with device fd returned from X server, also bail out to DRI2 with certain conditions. v2: -organize the error out path (Axel) -squash previous patch 1 and 2 into one (Emil) Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	bd9ae72459	vl/dri: fix close fd error out fd should be set to -1 only if it got closed by pipe_loader_release. Signed-off-by: Leo Liu <leo.liu@amd.com>	2016-05-12 18:26:48 -04:00
Tim Rowley	2785f2f2d7	swr: properly expose compressed format support Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-12 14:12:18 -05:00
Rob Clark	425dc4c4b3	gallium: refactor pipe_shader_state to support multiple IR's The goal is to allow the pipe driver to request something other than TGSI, but detect whether what is getting is TGSI vs what it requested. The pipe drivers will always have to support TGSI (and convert that into whatever it is that they prefer), but in some cases we should be able to skip the TGSI intermediate step (such as glsl->nir vs glsl->tgsi->nir). I think pipe_compute_state should get similar treatment. Currently, afaict, it has one user and one consumer, which has allowed it to be sloppy wrt. supporting alternative IR's. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-11 12:20:11 -04:00
Roland Scheidegger	430797843a	gallivm: improve dumping of bitcode Use GALLIVM_DEBUG=dumpbc for dumping of modules as bitcode. Instead of a fixed llvmpipe.bc name, use ir_<modulename>.bc so multiple modules can be dumped (albeit it might still overwrite previous modules, particularly the modules from draw tend to always have the same name). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-11 04:43:35 +02:00
Roland Scheidegger	e4cf8717de	gallivm: print declarations of intrinsics with GALLIVM_DEBUG=ir Those aren't really interesting, however outputting them is helpful when trying to feed the IR to llvm llc (or opt) for debugging. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-10 17:08:16 +02:00
Roland Scheidegger	5c200894c8	gallivm: use InternalLinkage instead of PrivateLinkage for texture functions At least with MCJIT the disassembler will crash otherwise when trying to disassemble such functions. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-10 17:08:16 +02:00
Roland Scheidegger	8b66e2647d	gallivm: disable avx512 features We don't target this yet, and some llvm versions incorrectly enable it based on cpu string, causing crashes. (Albeit this is a losing battle, it is pretty much guaranteed when the next new feature comes along llvm will mistakenly enable it on some future cpu, thus we would have to proactively disable all new features as llvm adds them.) This should fix https://bugs.freedesktop.org/show_bug.cgi?id=94291 (untested) Tested-by: Timo Aaltonen <tjaalton@ubuntu.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> CC: <mesa-stable@lists.freedesktop.org>	2016-05-10 17:08:16 +02:00


6560 Commits