mirrors/mesa - Frog Git

Commit Graph

Author	SHA1	Message	Date
Eric Anholt	a221f9709e	v3d: Fix incorrect handling of two fences created back-to-back. Recreating our context's syncobj with ALREADY_SIGNALED meant that if you created two fences in a row, then waiting on the second would succeed immediately. Instead, export a sync file in the gallium fence (since we don't have a syncobj clone ioctl), and just create a new syncobj to wait on whenever we need to. Noticed while debugging dEQP-GLES3.functional.fence_sync.client_wait_sync_finish	2018-07-20 11:11:29 -07:00
Eric Anholt	fc28692a5a	v3d: Fix the timeout value passed to drmSyncobjWait(). The API wants an absolute time, so we need to go add gallium's argument to CLOCK_MONOTONIC.	2018-07-20 11:11:29 -07:00
Eric Anholt	4f04bd68cf	v3d: Fix drmSyncobjWait() return value checking even more. It tends to return >0 in the success case (I think the value is something like "how much of the timeout remained"). Fixes dEQP-GLES3.functional.fence_sync.client_wait_sync_finish	2018-07-20 11:11:29 -07:00
Eric Anholt	2f90879a34	v3d: Use the list_first_entry/list_last_entry macros.	2018-07-20 11:11:29 -07:00
Eric Anholt	d0e53373e5	v3d: Move BO cache counting to dump time instead of cache management. This is one less way to get the dump stats wrong.	2018-07-20 11:11:29 -07:00
Eric Anholt	7d6aef6fa5	v3d: Reduce the stale BO reclamation spam with dump_stats set. This was obviously meant to be when we were actually freeing a BO, not just when there was at least one BO in the list.	2018-07-20 11:11:29 -07:00
Eric Anholt	5d11094db1	v3d: Respect a sampler view's first_layer field. Fixes texturing from EGL images created from cubemap faces, as in dEQP-EGL.functional.image.create.gles2_cubemap_negative_x_rgba_texture	2018-07-20 11:11:29 -07:00
Sonny Jiang	c6737756ad	radeonsi: emit_spi_map packets optimization v2: marek: remove an empty line before break; rename reg_val_seq -> spi_ps_input_cntl "type * x" -> "type *x" Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-20 13:50:26 -04:00
Gert Wollny	4d094993c3	virgl: Expose GL_ARB_copy_image if host supports it Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-07-20 19:15:12 +02:00
Gert Wollny	0bde9739c0	virgl: Allow RGB32* textures only as buffer objects When requesting a texture of the internal format GL_RGB32F Gallium will try to allocate a renderable texture and returns RGBA32F or RGBX32F, but when one requests GL_RGB32I or GL_RGB32UI the according 3-component texture will be returned. This leads to problems later, when one wants to use glCopyImageSubData to copy data between these textures that should be compatible, but given the way virgl and Gallium handle this the latter fails with an assertion, because the per-texel bit size is different. By allowing the GL_RGB32* only for texture buffers these problems are avoided without losing the ARB_tbo_rgb32 extension (thanks Ilia Mirkin). v2: Correct spelling (Gurchetan Singh) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-07-20 19:12:49 +02:00
Lionel Landwerlin	feb43ef674	intel: tools: dump: protect against multiple calls on destructor When running gdb, make sure to pass the LD_PRELOAD variable only to the executed program, not the debugger. Otherwise the debugger will run the preloaded constructor/destructor too and bad things will happen. Suggested-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-20 17:36:56 +01:00
Lionel Landwerlin	2a9069eb97	intel: tools: dump: make dump tool reliable under gdb The problem with passing the configuration of the dump lib through a file descriptor is that it can be read only once. But under gdb you might want to rerun your program multiple times. This change hands the configuration through a temporary file that is deleted once the command line passed to intel_dump_gpu has exited. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-20 17:36:37 +01:00
Samuel Pitoiset	1efc9094e0	radv: don't flush DB before subpass FS resolves That shouldn't be needed because the DB state is invalid. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 17:30:13 +02:00
Gert Wollny	016807161b	r600: Correct evaluation of cube array index and face The array index needs to be corrected and it must be ensured that it is rounded and its value is non-negative before it is combined with the face id. v5: Use RNDNE instead of ADD 0.5 and FLOOR (Ilia Mirkin) v6: Fix type (Roland Scheidegger) Fixes 182 from android/cts/master/gles31-master.txt: dEQP-GLES31.functional.texture.filtering.cube_array.formats.* dEQP-GLES31.functional.texture.filtering.cube_array.sizes.* dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_* dEQP-GLES31.functional.texture.filtering.cube_array.combinations.linear_mipmap_* dEQP-GLES31.functional.texture.filtering.cube_array.no_edges_visible.* Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-20 14:55:12 +02:00
Gert Wollny	01766c1db6	r600: correct texture offset for array index lookup Correct the array index for TEXTURE_1D_ARRAY and TEXTURE_2D_ARRAY. The standard says the array index is evaluated according to floor(z + 0.5) but RNDNE is sufficient also for the test cases where z is close to 1.5 and it is likely to hit 1.5, the corner case where RNDNE gives a result different from above formula. v5: - Use RNDNE instead of ADD 0.5 and FLOOR (Ilia Mirkin) - update commit message Fixes 325 tests from android/cts/master/gles3-master.txt: dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darray dEQP-GLES3.functional.shaders.texture_functions.textureoffset.sampler2darray dEQP-GLES3.functional.shaders.texture_functions.texturelod.sampler2darray* dEQP-GLES3.functional.shaders.texture_functions.texturelodoffset.sampler2darray dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler2darray dEQP-GLES3.functional.shaders.texture_functions.texturegradoffset.sampler2darray dEQP-GLES3.functional.texture.filtering.2d_array.formats.* dEQP-GLES3.functional.texture.filtering.2d_array.sizes.* dEQP-GLES3.functional.texture.filtering.2d_array.combinations.* dEQP-GLES3.functional.texture.shadow.2d_array.* dEQP-GLES3.functional.texture.vertex.2d_array.* Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-20 14:55:12 +02:00
Gert Wollny	626bd455d4	r600: Delay emission of texture gradients and lookup offsets Gradients used in texture lookups and the offsets must reside in the same fetch clause (the first is imposed by the hardware and the second is expected by sb). In order to ensure that no ALU clause is inserted between emission and use of these, delay the emission of these instructions until the texture instruction using them is also emitted. This is needed in preparation for the correction of the texture array indices. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-20 14:55:12 +02:00
Bas Nieuwenhuizen	cc10b34e9e	util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache. radv always needs it, so just check the header instead. Also do not declare the function if the variable is not set, so we get a nice compile error instead of failing to open a device at runtime. Fixes: `b87ef9e606` "util: fix MSVC build issue in disk_cache.h" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-20 12:09:19 +02:00
Bas Nieuwenhuizen	8cacf38f52	nir: Do not use continue block after removing it. Reinserting code directly before a jump means the block gets split and merged, removing the original block and replacing it in the process. Hence keeping a pointer to the continue block over a reinsert causes issues. This code changes nir_opt_if to simply look for the new continue block. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107275 CC: 18.1 <mesa-stable@lists.freedesktop.org>	2018-07-20 12:09:19 +02:00
Samuel Pitoiset	ce454d02cc	radv: simplify a condition in radv_src_access_flush() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:17 +02:00
Samuel Pitoiset	1ff25c4e6b	radv: save current state just before resolving with FS Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:15 +02:00
Samuel Pitoiset	c3d5f124c6	radv: don't check if a subpass has resolve attachments twice We already check that in radv_cmd_buffer_resolve_subpass(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:13 +02:00
Samuel Pitoiset	0a8127bbfb	radv: make use of radv_subpass_barrier() when resolving subpasses The goal is to use radv_barrier()/radv_subpass_barrier() as much as possible for further optimizations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:11 +02:00
Rhys Perry	409a60df3b	nv50/ir: move LateAlgebraicOpt back to right after ConstantFolding total instructions in shared programs : 5480808 -> 5472107 (-0.16%) total gprs used in shared programs : 647530 -> 647532 (0.00%) total shared used in shared programs : 389120 -> 389120 (0.00%) total local used in shared programs : 21064 -> 21064 (0.00%) total bytes used in shared programs : 58551648 -> 58459352 (-0.16%) local shared gpr inst bytes helped 0 0 73 2609 2609 hurt 0 0 71 34 34	2018-07-19 23:34:58 +02:00
Rhys Perry	2afef231db	nv50/ir: handle SHLADD in IndirectPropagation An alternative solution to the problem fixed in `0bd83d0` ("nv50/ir: move LateAlgebraicOpt to the very end"). total instructions in shared programs : 5481195 -> 5480808 (-0.01%) total gprs used in shared programs : 647535 -> 647530 (-0.00%) total shared used in shared programs : 389120 -> 389120 (0.00%) total local used in shared programs : 21064 -> 21064 (0.00%) total bytes used in shared programs : 58555784 -> 58551648 (-0.01%) local shared gpr inst bytes helped 0 0 2 34 34 hurt 0 0 0 0 0	2018-07-19 23:34:58 +02:00
Rhys Perry	3b6edd0b59	gm107/ir: use CS2R for SV_CLOCK This instruction seems to be faster than S2R and requires no barrier, though the range of special registers it can read from is limited. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-07-19 23:34:58 +02:00
Lionel Landwerlin	94cf964586	intel: tools: dump: remove mentions of intel_aubdump Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-19 20:12:53 +01:00
Lionel Landwerlin	0f9d8b754f	intel: tools: aubwrite: fix invalid frees on finish Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-07-19 20:11:56 +01:00
Samuel Pitoiset	3d41757788	ac/nir: add a workaround for bitfield_extract when count is 0 LLVM 7 returns incorrect results when count is 0, something has been broken since LLVM 6. Of course, the best solution is to fix LLVM but this workaround works as expected for now. Original workaround by Philippe Rebohle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107276 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-19 20:41:10 +02:00
Nanley Chery	e2e32b6afd	intel/isl/gen4: Make depth/stencil buffers Y-Tiled Rendering to a linear depth buffer on gen4 is causing a GPU hang in the CI system. Until a better explanation is found, assume that errata is applicable to all gen4 platforms. Fixes `fbe01625f6` ("i965/miptree: Share tiling_flags in miptree_create"). Reported-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107248 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-19 11:05:07 -07:00
Nanley Chery	44ab26d0c9	i965/misc: Use depth/stencil surf's tiling on gen4-5 Make the 3D engine aware of the depth/stencil surface's tiling before doing any render operations. Fixes `fbe01625f6` ("i965/miptree: Share tiling_flags in miptree_create"). Reported-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107248 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-19 11:05:07 -07:00
Caio Marcelo de Oliveira Filho	507a8037a7	glsl: don't let an 'if' then-branch kill copy propagation (elements) for else-branch When handling 'if' in copy propagation elements, if a certain variable was killed when processing the first branch of the 'if', then the second would not get any propagation from previous nodes. x = y; if (...) { z = x; // This would turn into z = y. x = 22; // x gets killed. } else { w = x; // This would NOT turn into w = y. } With the change, we let copy propagation happen independently in the two branches and only then apply the killed values for the subsequent code. One example in shader-db part of shaders/unity/8.shader_test: (assign (xyz) (var_ref col_1) (var_ref tmpvar_8) ) (if (expression bool < (swiz y (var_ref xlv_TEXCOORD0) )(constant float (0.000000)) ) ( (assign (xyz) (var_ref col_1) (expression vec3 + (var_ref tmpvar_8) ... ) ... ) ) ( (assign (xyz) (var_ref col_1) (expression vec3 lrp (var_ref col_1) ... ) ... ) )) The variable col_1 was replaced by tmpvar_8 in the then-part but not in the else-part. NIR deals well with copy propagation, so it already covered for the missing ones that this patch fixes. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-19 10:00:59 -07:00
Caio Marcelo de Oliveira Filho	e4f32dec23	glsl: change opt_copy_propagation_elements data structures Instead of keeping multiple acp_entries in lists, have a single acp_entry per variable. With this, the implementation of clone is more convenient and now fully implemented. In the previous code, clone was only partial. Before this patch, each acp_entry struct represented a write to a variable including LHS, RHS and a mask of what channels were written to. There were two main hash tables, the first (lhs_ht) stored a list of acp_entries per LHS variable, with the values available to copy for that variable; the second (rhs_ht) was a "reverse index" for the first hash table, so stored acp_entries per RHS variable. After the patch, there's a single acp_entry struct per LHS variable, it contains an array with references to the RHS variables per channel. There now is a single hash table, from LHS variable to the corresponding entry. The "reverse index" is stored in the ACP entry, in the form of a set of variables that copy from the LHS. To make the clone operation cheaper, the ACP entries are created on demand. This should not change the result of copy propagation, a later patch will take advantage of the clone operation. v2: Add note clarifying how the hashtable is destroyed. v3: (all from Eric Anholt) Add remove_unused_var_from_dsts() function for reuse. Remove from dsts as we go instead of clearing at the end. Add clarifying comment to erase(). Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-19 10:00:30 -07:00
Caio Marcelo de Oliveira Filho	7b0d395250	glsl: separate copy propagation state Separate higher level logic of visiting instructions and choosing when to store and use new copy data from the data structure holding the copy propagation information. This will also make later patches that change the structure easier. v2: Remove empty destructor and clarify how hash tables are destroyed. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-19 10:00:30 -07:00
Lionel Landwerlin	49e86f09fe	intel: tools: dump: trace memory writes Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-19 16:48:42 +01:00
Lionel Landwerlin	5ba3e5c358	intel: tools: dump: remove command execution feature In commit `86cb05a6d3` ("intel: aubinator: remove standard input processing option") we removed the ability to process aub as an input stream because we now rely on mmapping the aub file to back the buffers aubinator is parsing. intel_aubdump was the provider of the standard input data and since we've copied/reworked intel_aubdump into intel_dump_gpu within Mesa, we don't need that code anymore. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-19 10:11:54 +01:00
Danylo Piliaiev	494a206229	radv: Fix incorrect assumption about ternary operator precedence Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 10:04:27 +02:00
Marek Olšák	dcbcc83003	mesa: fix make check for AMD_performance_monitor	2018-07-19 01:17:01 -04:00
Marek Olšák	f097f0c55c	mesa: remove dead code from api_loopback This should only contain functions not set in vtxfmt.c. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-19 01:10:32 -04:00
Marek Olšák	987c2ece03	mesa: expose ARB_indirect_parameters in the compatibility profile Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1) v2: fix dispatch_sanity	2018-07-19 01:10:18 -04:00
Marek Olšák	d40188800e	vbo: fix ARB_multi_draw_indirect for the compatibility profile Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-19 00:58:51 -04:00
Marek Olšák	6c4652ea8a	mesa: expose ARB_shader_viewport_layer_array in the compatibility profile no changes needed for GL compat Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-19 00:58:51 -04:00
Marek Olšák	da528898bc	mesa: expose ARB_ES3_1_compatibility in the compatibility profile no changes needed for GL compat Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-19 00:58:51 -04:00
Marek Olšák	565dacc3d6	winsys/amdgpu: remove RADEON_SURF_FMASK leftover RADEON_SURF_FMASK is never set.	2018-07-19 00:58:51 -04:00
Marek Olšák	9b82d128c9	ac: run LLVM optimization passes only on the final function after inlining	2018-07-19 00:58:49 -04:00
Bas Nieuwenhuizen	17b5a59b4e	radv: Enable binning and dfsm by default on Raven. Seems like it increases performance by 2-3% for some demos and games. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:38:21 +02:00
Bas Nieuwenhuizen	978570769d	radv: Always set disable zpass increment bit when possible. When no occlusion queries are active even if out of order is enabled. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:38:10 +02:00
Bas Nieuwenhuizen	82664af6cf	radv: Select correct entries for binning. Overshot it by one every time. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:38:01 +02:00
Bas Nieuwenhuizen	760211b77c	radv: Fix number of samples used for binning. Used the wrong register ... CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:37:54 +02:00
Bas Nieuwenhuizen	c0144e915a	radv: Disable disabled color buffers in rbplus opts. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:37:47 +02:00
Marek Olšák	fb049742d6	r600: silence the signed overflow warning like radeonsi r600_gpu_load.c: In function ‘r600_gpu_load_thread’: ../../../../src/util/os_time.h:82:7: warning: assuming signed overflow does not occur when assuming that (X + c) >= X is always true [-Wstrict-overflow] if (start <= end)	2018-07-18 17:48:48 -04:00
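
Commit fc28692a5a in the list above says that drmSyncobjWait() wants an absolute time, so the relative timeout has to be added to CLOCK_MONOTONIC. A minimal sketch of that conversion follows. It is not the v3d code: the helper names are hypothetical, and it assumes the relative timeout is an unsigned nanosecond count in which a very large value means "wait forever", so the sum is clamped instead of being allowed to overflow.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: current CLOCK_MONOTONIC time in nanoseconds. */
static int64_t monotonic_now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}

/* Hypothetical helper: turn a relative timeout into an absolute deadline.
 * Assumes now_ns is non-negative. If the sum would not fit in int64_t
 * (for example an "infinite" relative timeout), clamp to INT64_MAX. */
static int64_t abs_timeout_ns(int64_t now_ns, uint64_t rel_ns)
{
    if (rel_ns > (uint64_t)(INT64_MAX - now_ns))
        return INT64_MAX;
    return now_ns + (int64_t)rel_ns;
}
```

A caller would compute `abs_timeout_ns(monotonic_now_ns(), timeout)` once and pass the result as the wait deadline. Clamping keeps an all-ones timeout from wrapping into a deadline in the past.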

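
Commits 016807161b and 01766c1db6 in the list above replace "ADD 0.5 and FLOOR" with RNDNE for the array index. The sketch below compares the two roundings in plain C. It assumes RNDNE means round to nearest with ties going to the even integer, and it only handles non-negative inputs. The function names are hypothetical and none of this is the r600 shader code.

```c
/* floor(z + 0.5) for non-negative z: the cast truncates toward zero,
 * which equals floor for non-negative values. */
static int layer_floor_half(float z)
{
    return (int)(z + 0.5f);
}

/* Round to nearest, ties to even, for non-negative z
 * (the assumed behaviour of RNDNE). */
static int layer_rndne(float z)
{
    int base = (int)z;             /* integer part */
    float frac = z - (float)base;  /* fractional part in [0, 1) */

    if (frac > 0.5f)
        return base + 1;
    if (frac < 0.5f)
        return base;
    return base + (base & 1);      /* exact tie: pick the even neighbour */
}
```

Under these assumptions the two functions disagree only on exact halves whose integer part is even, such as 0.5 and 2.5. Every other non-negative input produces the same layer index.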

95525 Commits