KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Christian Gmeiner	f39a7fd627	util/macros: rework DIV_ROUND_UP macro Simplify used math. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-07-04 10:21:32 +02:00
Eric Engestrom	1b259f1ae7	util: add os_file_create_unique() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 23:37:49 +01:00
Eric Engestrom	53f17c4efd	meson: set up a proper internal dependency for xmlconfig Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-06-27 17:42:25 +00:00
Eric Engestrom	ad0ee5bfa5	xmlconfig: add missing #include Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-06-27 17:42:25 +00:00
Eric Engestrom	069e6d587e	xmlpool: fix typo in comment s/otions/options/, and while here let's give the full path to xmlpool.h since `../` won't be true in the generated file. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-06-27 17:42:25 +00:00
Eric Engestrom	2d2e824fae	util: support "y" and "n" in env_var_as_boolean() Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-24 12:49:13 +00:00
Marek Olšák	8ab9f3a857	include: update GL headers from the registry Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-06-21 19:00:52 -04:00
Eric Engestrom	955c63d364	util/os_file: resize buffer to what was actually needed Fixes: `316964709e` "util: add os_read_file() helper" Reported-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-20 21:49:30 +00:00
Alejandro Piñeiro	6a159bca9d	util: add empty line before virgl options Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-06-20 15:21:39 +02:00
Alejandro Piñeiro	790c3dbac8	util: add missing DRI_CONF_OPT_END When DRI_CONF_GLES_EMULATE_BGRA was added for the virgl driver, it missed a DRI_CONF_OPT_END. This makes some drivers, like vc4/v3d, crash with the following error: Fatal error in __driConfigOptions line 99, column 2: mismatched tag. Not sure why it doesn't fail with virgl. Fixes: `b793663449` Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-20 14:11:30 +02:00
Gert Wollny	5dbecf7863	virgl: Add a tweak to set the value for emulated queries of GL_SAMPLES_PASSED On GLES hosts GL_SAMPLES_PASSED is emulated by GL_ANY_SAMPLES_PASSED which returns a boolean. With this tweak the value that is returned if any sample passed can be set. This may be of interest when an application decides whether some geometry is rendered based on an amount of visibility and not just a binary decision. virglrenderer sets a default of 1024 on the host. v2: Remove reference from virgl and correct description (Emil) v3: Send the tweak binary encoded instead of using strings (Gurchetan) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-06-20 08:50:38 +02:00
Gert Wollny	59757dbad6	virgl: Add tweak to apply a swizzle when drawing/blitting to an emulated BGRA texture With Qemu this final swizzle is not needed, but with vtest it is, i.e. it depends on how a program using virglrenderer uses the surface that is rendered to, hence a tweak is added. v2: Update description and fix spelling (Emil) v3: Send tweak as binary value instead of using strings (Gurchetan) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-06-20 08:50:38 +02:00
Gert Wollny	b793663449	virgl: Add driconf tweak for emulating BGRA surfaces on GLES These tweaks are used to fix rendering issues with Valve games and at least also "The Raven Remastered" when run on a GLES host. v2: Fix type in define and remove virgl from driconf option (Emil) v3: Encode tweak binary instead of using strings (Gurchetan) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-06-20 08:50:38 +02:00
Jory Pratt	fd7b7f14d8	util: Heap-allocate 256K zlib buffer The disk cache code tries to allocate a 256 Kbyte buffer on the stack. Since musl only gives 80 Kbyte of stack space per thread, this causes a trap. See https://wiki.musl-libc.org/functional-differences-from-glibc.html#Thread-stack-size (In musl-1.1.21 the default stack size has increased to 128K) [mattst88]: Original author unknown, but I think this is small enough that it is not copyrightable. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-19 12:16:18 -07:00
Caio Marcelo de Oliveira Filho	608257cf82	i965: Fix INTEL_DEBUG=bat Use hash_table_u64 instead of hash_table directly, since the former will also handle the special keys (deleted and freed) and allow use of the whole u64 space. Fixes crash in INTEL_DEBUG=bat when using a key with value 0 -- the current value for a freed key. Fixes: `b38dab101c` "util/hash_table: Assert that keys are not reserved pointers" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-12 15:57:16 -07:00
Caio Marcelo de Oliveira Filho	eb41ce1b01	util/hash_table: Properly handle the NULL key in hash_table_u64 The hash_table_u64 should support any uint64_t as input. It does special handling for the "deleted" key, storing the data in the table itself; do the same for the "freed" key. Fixes: `b38dab101c` "util/hash_table: Assert that keys are not reserved pointers" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-12 15:57:16 -07:00
Nicolai Hähnle	de8a919702	u_dynarray: turn util_dynarray_{grow, resize} into element-oriented macros The main motivation for this change is API ergonomics: most operations on dynarrays are really on elements, not on bytes, so it's weird to have grow and resize as the odd operations out. The secondary motivation is memory safety. Users of the old byte-oriented functions would often multiply a number of elements with the element size, which could overflow, and checking for overflow is tedious. With this change, we only need to implement the overflow checks once. The checks are cheap: since eltsize is a compile-time constant and the functions should be inlined, they only add a single comparison and an unlikely branch. v2: - ensure operations are no-op when allocation fails - in util_dynarray_clone, call resize_bytes with a compile-time constant element size v3: - fix iris, lima, panfrost Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 18:30:25 -04:00
Nicolai Hähnle	71b45bae14	u_dynarray: return 0 on realloc failure and ensure no-op We're not very good at handling out-of-memory conditions in general, but this change at least gives the caller the option of handling it gracefully and without memory leaks. This happens to fix an error in out-of-memory handling in i965, which has the following code in brw_bufmgr.c: node = util_dynarray_grow(vma_list, sizeof(struct vma_bucket_node)); if (unlikely(!node)) return 0ull; Previously, allocation failure for util_dynarray_grow wouldn't actually return NULL when the dynarray was previously non-empty. v2: - make util_dynarray_ensure_cap a no-op on failure, add MUST_CHECK attribute - simplify the new capacity calculation: aside from avoiding a useless loop when newcap is very large, this also avoids an infinite loop when newcap is larger than 1 << 31 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 18:30:25 -04:00
Eric Engestrom	9996ddbb27	util/futex: fix dangling pointer use Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110901 Fixes: `7dc2f47882` "util: emulate futex on FreeBSD using umtx" Cc: Greg V <greg@unrelenting.technology> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-12 17:27:44 +01:00
Eric Engestrom	93349d7118	util/os_file: suppress sign comparison warning Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-09 13:14:13 +00:00
Eric Engestrom	fd5c18de88	util/os_file: fix error being sign-cast back and forth Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-09 13:14:13 +00:00
Eric Engestrom	341ba406fd	util/os_file: avoid shadowing read() with a local variable Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-09 13:14:13 +00:00
Eric Engestrom	7e35f20d44	util/os_file: actually return the error read() gave us Fixes: `316964709e` "util: add os_read_file() helper" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-09 13:14:13 +00:00
Jason Ekstrand	b38dab101c	util/hash_table: Assert that keys are not reserved pointers If we insert a NULL key, it will appear to succeed but will mess up entry counting. Similar errors can occur if someone accidentally inserts the deleted key. The latter is highly unlikely but technically possible so we should guard against it too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-06 00:27:53 +00:00
Jason Ekstrand	8306dabc03	util/set: Assert that keys are not reserved pointers If we insert a NULL key, it will appear to succeed but will mess up entry counting. Similar errors can occur if someone accidentally inserts the deleted key. The latter is highly unlikely but technically possible so we should guard against it too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-06 00:27:53 +00:00
Connor Abbott	8c74772edc	util/hash_table: Use fast modulo computation While we're here, copy the size table from set.c to get rid of hard tabs in the hash_table.c version. Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:14:35 +02:00
Connor Abbott	83667f7a61	util/set: Use fast modulo computation Compilation times with my shader-db database: Difference at 95.0% confidence -1.22312 +/- 0.726033 -0.283979% +/- 0.168254% (Student's t, pooled s = 1.02177) Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:14:30 +02:00
Connor Abbott	b87817871b	util: Add a helper for faster remainders This should be at least as fast as using fast_idiv_by_const, and has the advantage that the precomputation is simple enough to be evaluated at Mesa-compile time for hash tables and sets which have a fixed table of possible divisors. Acked-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:14:27 +02:00
Connor Abbott	983b001c77	util/hash_table: Add specialized resizing add function To keep it in sync with the set implementation. Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:14:22 +02:00
Connor Abbott	6f9beb28bb	util/set: Add specialized resizing add function A significant portion of the time spent in nir_opt_cse for the Dolphin ubershaders was in resizing the set. When resizing a hash table, we know in advance that each new element to be inserted will be different from every other element, so we don't have to compare them, and there will be no tombstone elements, so we don't have to worry about caching the first-seen tombstone. We add a specialized add function which skips these steps entirely, speeding up resizing. Compile-time results from my shader-db database: Difference at 95.0% confidence -2.29143 +/- 0.845534 -0.529475% +/- 0.194767% (Student's t, pooled s = 1.08807) Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:14:16 +02:00
Connor Abbott	451211741c	util/hash_table: Pull out loop-invariant computations To keep the set and hash table in sync. Note that some of this had already been done for hash tables, in particular pulling out the hash % ht->size computation. Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:14:09 +02:00
Connor Abbott	f7ff685649	util/set: Pull out loop-invariant computations Unfortunately GCC can't do this for us, probably because we call the key comparison function which GCC can't prove won't modify arbitrary memory. This is a pretty hot function, so do the optimization manually to be sure the compiler will get it right. While we're here, make the computation of the new probe address use a single conditional subtract instead of a modulo, since we know that it won't ever get as big as 2 * ht->size before the modulo. Modulos tend to be pretty expensive operations. shader-db compile time results for my database: Difference at 95.0% confidence -2.24934 +/- 0.69897 -0.516296% +/- 0.159993% (Student's t, pooled s = 0.983684) Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:14:04 +02:00
Connor Abbott	8a838e172f	util/set: Add a _mesa_set_search_or_add() function Unlike _mesa_set_search_and_add(), it doesn't replace an entry if it's found, returning it instead. This is useful for nir_instr_set, where we have to know both the original instruction and its equivalent. Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:13:45 +02:00
Rob Clark	372e83b95f	list: add some iterator debug Debugging use of unsafe iterators when you should have used the _safe version sucks. Add some DEBUG build support to catch and assert if someone does that. I didn't update the UPPERCASE versions of the iterators. They should probably be deprecated/removed. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-05-30 22:11:26 +00:00
Marek Olšák	b5697c311b	Change a few frequented uses of DEBUG to !NDEBUG debugoptimized builds don't define NDEBUG, but they also don't define DEBUG. We want to enable cheap debug code for these builds. I only chose those occurrences that I care about. Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-29 21:13:35 -04:00
Timothy Arceri	d2b0246741	radeonsi: add drirc workaround for American Truck Simulator Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110711	2019-05-28 08:47:44 +10:00
Timothy Arceri	ac779ff2b7	util: add missing include to build_id.h Required to use uint8_t Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-05-20 10:24:23 +10:00
Marek Olšák	9f505ce21d	radeonsi: disable primitive restart for triangles for DiRT Rally It may decrease performance and it prevents compute-based primitive culling. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:13:36 -04:00
Eric Engestrom	22c1657d05	util/os_file: always use the 'grow' mechanism Use fstat() only to pre-allocate a big enough buffer. This fixes a race where if the file grows between fstat() and read() we would be missing the end of the file, and if the file slims down read() would just fail. Fixes: `316964709e` "util: add os_read_file() helper" Reported-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-16 12:56:25 +01:00
Jason Ekstrand	5911abd76f	util/ra: Assert nodes are in-bounds in add_node_interference Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	e291cd8a7e	util/ra: Don't destroy the graph in ra_allocate() We want to be able to call ra_allocate() and, when it fails, mutate the graph and try again rather than re-building the graph from scratch. This commit moves all the scratch bits except the final register allocation (which is really an out value not scratch) into sub-structs named "tmp" to make it clear which things are scratch. It also adds bits to the ra_select() initialization loop to initialize things (since we can't trust rzalloc anymore) and copy q_test and forced_reg over. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	9040215f5d	util/ra: Add a helper for resetting a node's interference Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	698bb9b984	util/ra: Add helpers for adding nodes to an interference graph Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	6c0f75c953	util/ralloc: Add helpers for growing zero-initialized memory Unfortunately, we can't quite follow the standard C conventions for these because ralloc doesn't know the sizes of pointers. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	41b310e219	util/ra: Improve the performance of ra_simplify The most expensive part of register allocation is the ra_simplify step which is a fixed-point algorithm with a worst-case complexity of O(n^2) which adds the registers to a stack which we then use later to do the actual allocation. This commit uses bit sets and changes the core loop of ra_simplify to first walk 32-node chunks and then walk each chunk. This lets us skip whole 32-node chunks in one go based on bit operations and compute the minimum q value potentially 32x as fast. Of course, the algorithm still has the same fundamental O(n^2) run-time but the constant is now much lower. In the nasty Aztec Ruins compute shader, this shaves a full four seconds off the 30s compile time for a release build of mesa. In a debug build (needed for accurate stack traces), perf says that ra_select takes 20% of runtime before this patch and only 5-6% of runtime after this patch. It also makes shader-db runs faster. Shader-db results on Kaby Lake: total instructions in shared programs: 15311100 -> 15311100 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 355468050 -> 355468050 (0.00%) cycles in affected programs: 0 -> 0 helped: 0 HURT: 0 Total CPU time (seconds): 2602.37 -> 2524.31 (-3.00%) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	e1511f1d4c	util/ra: Only update q_total if the reg is not assigned We only use q_total if the reg is not assigned so there's no point in updating it if the reg is assigned. This has no known perf benefit but it will reduce churn in a future commit. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	9d6d1f47e7	util/ra: Only update best_optimistic_node if !progress This shaves about half a second off the 30 second compile time of one of the compute shaders in Aztec ruins. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	de56d3a2d1	util/ra: Make in_stack a bitset in the graph Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	7720ad65ae	util/ra: Get rid of tabs Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Eric Anholt	60a64f028d	v3d: Use driconf to expose non-MSAA texture limits for Xorg. The V3D 4.2 HW has a limit to MSAA texture sizes of 4096. With non-MSAA, we can go up to 7680 (actually probably 8138, but that hasn't been validated by the HW team). Exposing 7680 in X11 will allow dual 4k displays.	2019-05-13 12:03:11 -07:00
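
The conditional-subtract wrap described in f7ff685649 (util/set: Pull out loop-invariant computations) can be sketched as follows. This is a minimal illustration and not Mesa's code: `probe_next` and its parameter names are hypothetical, and it assumes that both the current address and the step are already smaller than the table size, and that the size is at most 2^31 so the 32-bit sum cannot overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not taken from Mesa): advance an open-addressing
 * probe by `step` slots and wrap around a table holding `size` slots. */
static inline uint32_t
probe_next(uint32_t addr, uint32_t step, uint32_t size)
{
   /* Assumed preconditions: size fits in 31 bits and both inputs are
    * already reduced below size, so addr + step < 2 * size. */
   assert(size > 0 && size <= (UINT32_C(1) << 31));
   assert(addr < size && step < size);

   uint32_t next = addr + step;
   if (next >= size)
      next -= size;   /* one compare and subtract replaces (addr + step) % size */
   return next;
}
```

Because the sum stays below twice the table size, one subtraction is enough to bring it back into range, which is why the modulo can be dropped from the probe loop.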

698 Commits