KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Samuel Pitoiset	66e38654c9	radv: fix dumping compute shader on the graphics queue The graphics pipeline can be NULL. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-25 11:58:07 +02:00
Samuel Pitoiset	de06dfa9ea	radv: add radv_dump_pipeline_state() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-25 11:58:05 +02:00
Samuel Pitoiset	6f0530ecfe	radv: rework how shaders are dumped when generating a hang report Use a flag for the active stages instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-25 11:58:03 +02:00
Samuel Pitoiset	8c406f0b4d	radv: remove unused parameter in radv_dump_annotated_shader() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-25 11:57:59 +02:00
Jose Dapena Paz	6c61c31dc2	mesa: do not leak ctx->Shader.ReferencedProgram references When glUseProgram is used, references to the included shaders are added in ctx->Shader.ReferencedProgram. But those references are not decreased when the shader data is deallocated. Thus, those shaders are leaked. Explicitely remove the pending references to these shaders. Fixes: `e6506b3cd2` ("mesa: retain gl_shader_programs after glDeleteProgram if they are in use") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-25 10:38:09 +10:00
Marek Olšák	508b423dd6	radeonsi: set DB_EQAA.MAX_ANCHOR_SAMPLES correctly Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:57 -04:00
Marek Olšák	07e02c8617	radeonsi: round ps_iter_samples in set_min_samples Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:57 -04:00
Marek Olšák	510c88f9d1	radeonsi: remove redundant ps_iter_samples clamp si_get_ps_iter_samples already does this. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:56 -04:00
Marek Olšák	25cdf754e4	radeonsi: remove some old gfx 9.x registers Leftover from bring up. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:56 -04:00
Marek Olšák	b936f9aa32	radeonsi: disable primitive binning for all blitter ops same as amdvlk. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:56 -04:00
Marek Olšák	8c1c451a90	ac/surface/gfx6: don't overallocate mipmapped HTILE Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:56 -04:00
Eric Engestrom	473af0b541	egl/x11: deduplicate depth-to-format logic Suggested-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-05-24 18:01:45 +01:00
Tapani Pälli	7b54404c9d	i965: enable OES_texture_view for gen8+ Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-24 12:53:07 +03:00
Tapani Pälli	3ddcdcf94d	mesa: changes to expose OES_texture_view extension Functionality already covered by ARB_texture_view, patch also adds missing 'gles guard' for enums (added in `f1563e6392`). Tested via arb_texture_view.*_gles3 tests and individual app utilizing texture view with ETC2. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-24 12:53:07 +03:00
Juan A. Suarez Romero	046b2b651e	docs: update release calendar for 18.1 series v2: extend 18.1 series (Andres) v3: fix copy/paste typo (Engestrom) CC: Andres Gomez <agomez@igalia.com> CC: Emil Velikov <emil.l.velikov@gmail.com> CC: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-05-24 11:47:47 +02:00
Samuel Pitoiset	38a8c5903b	radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS Do not lower FS inputs because this moves all load_var instructions at beginning of shaders and because interp_var_at_sample (and friends) seem broken. That might be eventually enabled later on if we really want to preload all FS inputs at beginning. Polaris10: Totals from affected shaders: SGPRS: 54072 -> 54264 (0.36 %) VGPRS: 38580 -> 38124 (-1.18 %) Spilled SGPRs: 652 -> 652 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 2128116 -> 2127380 (-0.03 %) bytes Max Waves: 8048 -> 8086 (0.47 %) Vega10: Totals from affected shaders: SGPRS: 52616 -> 52656 (0.08 %) VGPRS: 37536 -> 37116 (-1.12 %) Spilled SGPRs: 828 -> 828 (0.00 %) Code Size: 2043756 -> 2042672 (-0.05 %) bytes Max Waves: 9176 -> 9254 (0.85 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-24 09:18:57 +02:00
Samuel Pitoiset	ded1509587	radv: call nir_split_var_copies() before nir_lower_var_copies() This doesn't nothing special currently because we don't create any copy_var instructions, but this is needed for the next patch. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-24 09:18:54 +02:00
Francisco Jerez	936cd3c87a	i965: Use intel_bufferobj_buffer() wrapper in image surface state setup. Instead of directly using intel_obj->buffer. Among other things intel_bufferobj_buffer() will update intel_buffer_object:: gpu_active_start/end, which are used by glBufferSubData() to decide which path to take. Fixes a failure in the Piglit ARB_shader_image_load_store-host-mem-barrier Buffer Update/WaW tests, which could be reproduced with a non-standard glGetTexSubImage implementation (see bug report). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105351 Reported-by: Nanley Chery <nanleychery@gmail.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-05-23 16:21:34 -07:00
Francisco Jerez	e989acb03b	i965: Handle non-zero texture buffer offsets in buffer object range calculation. Otherwise the specified surface state will allow the GPU to access memory up to BufferOffset bytes past the end of the buffer. Found by inspection. v2: Protect against out-of-range BufferOffset (Nanley). Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-05-23 16:21:28 -07:00
Francisco Jerez	156d2c6e62	i965: Move buffer texture size calculation into a common helper function. The buffer texture size calculations (should be easy enough, right?) are repeated in three different places, each of them subtly broken in a different way. E.g. the image load/store path was never fixed to clamp to MaxTextureBufferSize, and none of them are taking into account the buffer offset correctly. It's easier to fix it all in one place. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106481 Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-05-23 16:21:09 -07:00
Francisco Jerez	5a68147803	Revert "mesa: simplify _mesa_is_image_unit_valid for buffers" This reverts commit `c0ed52f614`. It was preventing the image format validation from being done on buffer textures, which is required to ensure that the application doesn't attempt to bind a buffer texture with an internal format incompatible with the image unit format (e.g. of different texel size), which is not allowed by the spec (it's not allowed for any texture target, whether or not there is spec wording restricting this behavior specifically for buffer textures) and will cause the driver to calculate texel bounds incorrectly and potentially crash instead of the expected behavior. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106465 Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-05-23 16:21:09 -07:00
Bas Nieuwenhuizen	699e1f5aac	ac: Use DPP for build_ddxy where possible. WQM is pretty reliable now on LLVM 7, so let us just use DPP + WQM. This gives approximately a 1.5% performance increase on the vrcompositor built-in benchmark. v2: Use ac_build_quad_swizzle. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-23 21:02:45 +02:00
Miguel Casas	b73b340c37	i965: add {X,A}BGR2101010 to 'intel_image_formats' This patch adds {X,A}BGR2101010 entries to the list of supported 'intel_image_formats'. Bug: https://crbug.com/776093 Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-05-23 10:19:04 -07:00
Miguel Casas	432df741e0	dri_util: Add R10G10B10{A,X}2 translation between DRI and mesa_format. Add R10G10B10{A,X}2 translation between mesa_format and DRI format to driGLFormatToImageFormat() and driImageFormatToGLFormat(). Bug: https://crbug.com/776093 Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-05-23 10:17:45 -07:00
Dylan Baker	c8acfd5ab2	bin/get-pick-listh.sh: force git --pretty=medium Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-05-23 09:54:17 -07:00
Dylan Baker	5a639bdb81	bin/bugzilla_mesa.sh: explicitly set the --pretty argument Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-05-23 09:54:00 -07:00
Eric Engestrom	ec986241f3	docs: drop unnecessary out-of-frame target I'm guessing an earlier version of the website used to have the page contents in <frames>, but this isn't the case anymore so just drop the unnecessary `target="_main"` :) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-23 16:52:23 +01:00
Eric Engestrom	09a6cb7be6	docs: fix various html tags mistakes Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-23 16:52:23 +01:00
Eric Engestrom	8034f5f623	docs: fix `<` & `>` used in html code Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-23 16:52:23 +01:00
Juan A. Suarez Romero	6db0660d08	docs: add news notes to 18.1.0 CC: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-05-23 13:06:55 +02:00
Dave Airlie	f2f464de57	tgsi/scan: add hw atomic to the list of memory accessing files This fixes 4 out of 5 cases in: arb_framebuffer_no_attachments-atomic on cayman. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>	2018-05-23 03:51:40 +01:00
Roland Scheidegger	7b89fcec41	llvmpipe: improve rasterization discard logic This unifies the explicit rasterization discard as well as the implicit rasterization disabled logic (which we need for another state tracker), which really should do the exact same thing. We'll now toss out the prims early on in setup with (implicit or explicit) discard, rather than do setup and binning with them, which was entirely pointless. (We should eventually get rid of implicit discard, which should also enable us to discard stuff already in draw, hence draw would be able to skip the pointless clip and fallback stages in this case.) We still need separate logic for only null ps - this is not the same as rasterization discard. But simplify the logic there and don't count primitives simply when there's an empty fs, regardless of depth/stencil tests, which seems perfectly acceptable by d3d10. While here, also fix statistics for primitives if face culling is enabled. No piglit changes. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-05-23 04:23:32 +02:00
Bas Nieuwenhuizen	047438287c	ac/surface/gfx6: Don't force a tile index for fmask. The bpe of the fmask often differs from the bpe of the main surface. On SI that means it has to get a different tile index. addrlib is capable of figuring this out itself, so just pass -1 instead to let it know that it is not preset. Fixes: `9bf3570fed` "ac/surface/gfx6: compute FMASK together with the color surface" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106511 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106499 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-23 02:23:03 +02:00
Jason Ekstrand	a347a5a12c	i965: Remove ring switching entirely Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:39 -07:00
Jason Ekstrand	b499b85b0f	i965/miptree: Move the access_raw call to the individual map functions The only function that doesn't need to call access_raw is map_blit. If it takes the blitter path, it will happen as part of intel_miptree_copy. If map_blit takes the blorp path, brw_blorp_copy_miptrees will handle doing whatever resolves are needed. This should save us resolves in quite a few cases and will probably help performance a bit. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:37 -07:00
Jason Ekstrand	f566a1264c	i965: Remove support for the BLT ring We still support the blitter on gen4-5 but it's on the same ring as 3D. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:35 -07:00
Jason Ekstrand	33affda8bf	i965/miptree: Use blorp for blit maps on gen6+ Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:34 -07:00
Jason Ekstrand	0eedb0fca9	i965/miptree: Use blorp for validation tex copies on gen6+ It's faster than the blitter and can handle things like stencil properly so it doesn't require software fallbacks. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:32 -07:00
Jason Ekstrand	80fc3896f3	i965: Delete the blitter path for CopyTexSubImage The blorp path (called first) can do anything the blitter path can do so it's just dead code. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:31 -07:00
Jason Ekstrand	8162256b01	i965: Don't fall back to the blitter in BlitFramebuffer On gen4-5, we try the blitter before we even try blorp. On newer platforms, blorp can do everything the blitter can so there's no point in even having the blitter fall-back path. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:29 -07:00
Jason Ekstrand	e596563b08	i965: Remove some unused includes of intel_blit.h Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:27 -07:00
Jason Ekstrand	a9499374a9	i965/blit: Delete intel_emit_linear_blit This function is no longer used. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:25 -07:00
Jason Ekstrand	7fd962093f	i965: Use meta for pixel ops on gen6+ Using meta for anything is fairly aweful and definitely has more CPU overhead. However, it also uses the 3D pipe and is therefore likely faster in terms of GPU time than the blitter. Also, the blitter code has so many early returns that it's probably not buying us that much. We may as well just use meta all the time instead of working over-time to find the tiny case where we can use the blitter. We keep gen4-5 using the old blit paths to avoid perturbing old hardware too much. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:20 -07:00
Kenneth Graunke	92f01fc5f9	i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin. We'd like to start using soft-pin to assign BO addresses up front, and never move them again. Our previous plan for dealing with 48-bit VF cache bugs was to relocate vertex buffers to the low 4GB, so we'd never have addresses that alias in the low 32 bits. But that requires moving buffers dynamically. This patch tracks the last seen BO address for each vertex/index buffer, and emits a VF cache invalidate if the high bits change. (Ideally, we won't hit this case very often.) This should work for the soft-pin case, but unfortunately won't work in the relocation case, as we don't actually know the addresses. So, we have to use both methods. v2: Mention that the cache uses a <VertexBufferIndex, Address> tuple more explicitly (suggested by Scott). Mention "single batch" too (suggested by Chris). Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-22 10:02:28 -07:00
Kenneth Graunke	c7259259d4	i965: Introduce a "memory zone" concept on BO allocation. We're planning to start managing the PPGTT in userspace in the near future, rather than relying on the kernel to assign addresses. While most buffers can go anywhere, some need to be restricted to within 4GB of a base address. This commit adds a "memory zone" parameter to the BO allocation functions, which lets the caller specify which base address the BO will be associated with, or BRW_MEMZONE_OTHER for the full 48-bit VMA. Eventually, I hope to create a 4GB memory zone corresponding to each state base address. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-05-22 10:01:09 -07:00
Jason Ekstrand	417b9e5770	intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0 Fixes: `d6cd14f213` "i965/fs: Define new shader opcode to..." Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-05-22 09:53:23 -07:00
Michel Dänzer	fe2edb25dd	dri3: Stricter SBC wraparound handling Prevents corrupting the upper 32 bits of draw->recv_sbc when draw->send_sbc resets to 0 (which currently happens when the window is unbound from a context and bound to one again), which in turn caused loader_dri3_swap_buffers_msc to calculate target_msc with corrupted upper 32 bits. This resulted in hangs with the Xorg modesetting driver as of xserver 1.20 (older versions and other drivers ignored the upper 32 bits of the target MSC, which is why this wasn't noticed earlier). Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/106351 Tested-by: Mike Lothian <mike@fireburn.co.uk>	2018-05-22 17:59:53 +02:00
Samuel Pitoiset	75e919c045	radv: fix computation of user sgprs for 32-bit pointers With 32-bit pointers we only need one user SGPR per desc set. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:29 +02:00
Samuel Pitoiset	c5536fc813	radv: drop user_sgpr_info::sgpr_count It's only used inside allocate_user_sgprs(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:26 +02:00
Samuel Pitoiset	36a4d6d081	radv: add support for 32-bit pointers in user data SGPRs We still use 64-bit GPU pointers for all ring buffers because llvm.amdgcn.implicit.buffer.ptr doesn't seem to support 32-bit GPU pointers for now. This can be improved later anyways. Vega10: Totals from affected shaders: SGPRS: 1008722 -> 1026710 (1.78 %) VGPRS: 706580 -> 707136 (0.08 %) Spilled SGPRs: 22555 -> 22209 (-1.53 %) Spilled VGPRs: 75 -> 75 (0.00 %) Code Size: 34819208 -> 35202140 (1.10 %) bytes Max Waves: 175423 -> 175086 (-0.19 %) Polaris10: Totals from affected shaders: SGPRS: 1029849 -> 1036517 (0.65 %) VGPRS: 709984 -> 708872 (-0.16 %) Spilled SGPRs: 22672 -> 22309 (-1.60 %) Spilled VGPRs: 82 -> 66 (-19.51 %) Scratch size: 76 -> 60 (-21.05 %) dwords per thread Code Size: 34915336 -> 35309752 (1.13 %) bytes Max Waves: 151221 -> 151677 (0.30 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:22 +02:00

1 2 3 4 5 ...

102445 Commits All Branches Search

102445 Commits

All Branches