KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Samuel Pitoiset	6b976024a8	radv: add support for FMASK expand Original patch by Dave Airlie. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 18:01:17 +01:00
Samuel Pitoiset	fa16da53d8	radv: initialize FMASK for images in fully expanded mode The value depends on the number of samples. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 18:01:15 +01:00
Samuel Pitoiset	65d82c84d2	ac/nir: restrict fmask lookup to image load intrinsics We don't ever want to do the fmask lookup on an atomic or store, the fmask should have been decompressed if the surface has been moved to IMAGE_LAYOUT. Original patch by Dave Airlie. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 18:01:11 +01:00
Samuel Pitoiset	f45e43e156	spirv: add support for SpvCapabilityStorageImageMultisample Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 18:01:09 +01:00
Samuel Pitoiset	5b1ec10e4c	radv: compute optimal VM alignment for imported buffers This fixes GPU hangs on GFX9 with dEQP-VK.memory.external_memory_host.bind_image_memory_and_render.with_zero_offset.* Copied from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 17:34:04 +01:00
Bas Nieuwenhuizen	9f0bfbed11	radv: Work around non-renderable 128bpp compressed 3d textures on GFX9. Exactly what title says, the new addrlib does not allow the above with certain dimensions that the CTS seems to hit. Work around it by not allowing the app to render to it via compat with other 128bpp formats and do not render to it ourselves during copies. Fixes: `776b911365` "amd/addrlib: update Mesa's copy of addrlib" Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-20 15:07:20 +01:00
Samuel Pitoiset	5c7935f8fc	radv: fix subpass image transitions with multiviews The driver needs to decompress all image layers if a fast depth/color clear has been performed. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 13:36:37 +01:00
Samuel Pitoiset	0a7e767e58	radv: drop the amdgpu-skip-threshold=1 workaround for LLVM 8 This workaround has been introduced by `135e4d434f` for fixing DXVK GPU hangs with many games. It is no longer needed since LLVM r345718. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 12:09:57 +01:00
Samuel Pitoiset	576040f2e5	ac/nir: remove the bitfield_extract workaround for LLVM 8 This workaround has been introduced by `3d41757788` and it is no longer needed since LLVM r346422. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-20 09:40:16 +01:00
Iago Toral Quiroga	d6110d4d54	intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs The former expects to see SSA-only things, but the latter injects registers. The assertions in the lowering were not seeing this because they asserted on the bit_size values only, not on the is_ssa field, so add that assertion too. Fixes: `11dc130779` "nir: Add a bool to int32 lowering pass" CC: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-20 08:02:44 +01:00
Ilia Mirkin	1250383e36	st/mesa: remove sampler associated with buffer texture in pbo logic A long time ago, when this was first implemented, not having a sampler bound would cause problems on Fermi. I didn't work out the reasons, but the solution was simple -- just put the samplers back in. Since then, regular texturing paths appear to have lost their associated samplers which required a fuller investigation and fix in nouveau. Now that this is done, this code should no longer need a sampler state for fetching texels from a buffer texture. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-20 00:27:16 -05:00
Roland Scheidegger	6f4083143b	gallivm: use llvm jit code for decoding s3tc This is (much) faster than using the util fallback. (Note that there's two methods here, one would use a cache, similar to the existing code (although the cache was disabled), except the block decode is done with jit code, the other directly decodes the required pixels. For now don't use the cache (being direct-mapped is suboptimal, but it's difficult to come up with something better which doesn't have too much overhead.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-12-20 06:03:20 +01:00
Jason Ekstrand	ec1d5841fa	radv/query: Use 1-bit booleans in query shaders Fixes: `44227453ec` "nir: Switch to using 1-bit Booleans for almost..." Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-19 16:36:40 -06:00
Jason Ekstrand	6896c91c10	radv/query: Add a nir_test_flag helper This is little more than an iadd_imm right now but it will help in the next commit where we refactor things further. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-19 16:36:26 -06:00
Eduardo Lima Mitev	c2ebc38052	freedreno/ir3: Handle GL_NONE in get_num_components_for_glformat() An earlier patch that introduced the function failed to handle the case where an image format layout qualifier is not specified, which is allowed on desktop GL profiles. In these cases, nir_variable's image format is GL_NONE, and we don't need to print a debug message for those. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-12-19 22:49:05 +01:00
Eric Anholt	90818558f0	docs: Add an encouraging note about providing reviews and acks. Across several projects I've seen new contributors say "I wasn't sure if I should provide a review tag since I'm not really an expert in this area." Everyone I know already applies some implicit weighting to reviews from different people, so encourage participation. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-19 12:49:17 -08:00
Eric Anholt	463df0ffe2	docs: Add a note that MRs should still include any r-b or a-b tags. v2: Mention "Tested-by" too Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v1) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-19 12:48:13 -08:00
Eric Anholt	fcfb7f573c	v3d: Load and store aligned utiles all at once. This calls the expensive uif offset function once per utile, but it still gets us a 212.218% +/- 2.41216% (n=10) win on 1024x1024 glTexImage over calling it on each pixel.	2018-12-19 10:27:26 -08:00
Eric Anholt	7c56b7a6ea	v3d: Add a fallthrough path for utile load/store of 32 byte lines. Now that V3D has 8 byte per pixel formats exposed, we've got stride==32 utiles to load and store. Just handle them through the non-NEON paths for now.	2018-12-19 10:27:26 -08:00
Eric Anholt	f6a0f4f41e	vc4: Move the utile load/store functions to a header for reuse by v3d. These implementations of whole-utile load/stores would be the same for v3d, though the layouts of blocks of utiles has changed.	2018-12-19 10:27:26 -08:00
Eric Anholt	8ee752194c	v3d: Implement texture_subdata to reduce teximage upload copies. This lets us store the non-PBO glTexImage data directly into the tiled image without making an extra untiled memcpy for the gallium transfer. Improves 1024x1024 TexImage perf by ~19%, mostly from not thrashing around in the kernel mapping and unmapping the transfer's temporary area.	2018-12-19 10:27:26 -08:00
Eric Anholt	e09d8aecb4	v3d: Remove dead prototypes for load/store utile functions.	2018-12-19 10:27:26 -08:00
Eric Anholt	fcf881adda	v3d: Don't try to create shadow tiled temporaries for 1D textures. They're raster order anyway, so we'd assertion fail along with wasting bandwidth. Fixes: `6ad9e8690d` ("v3d: Add support for texturing from linear.")	2018-12-19 10:27:21 -08:00
Eric Anholt	b5adc744ba	v3d: Fix check for TFU job completion in the simulator. We're waiting for the jobs-completed count to increment (with wrapping), not to reach its starting state. This mostly ended up working out because the next v3d_hw_tick() for a submit CL would end up doing the TFU operation first, but it did fail when a blit was used for glReadPixels() at the end of a test. Fixes: `ee0549ff9a` ("v3d: Add the V3D TFU submit interface to the simulator.")	2018-12-19 10:26:04 -08:00
Eric Anholt	365728dc5d	v3d: Put the dst bo first in the list of BOs for TFU calls. In the UAPI, the first BO is the destination, and the one the kernel should do an exclusive reservation on. Currently we only do exclusive reservations, anyway. However, in the simulator path I was only copying back the "destination" BO (actually src in this case), and this caused regressions once I fixed the simulator to actually complete TFU before returning (since otherwise, the TFU op would happen at the start of the next CL submit and the draw would get the right contents). Fixes: `976ea90bdc` ("v3d: Add support for using the TFU to do some blits.")	2018-12-19 10:26:04 -08:00
Caio Marcelo de Oliveira Filho	947f7b452a	nir: properly find the entry to keep in copy_prop_vars When copy propagation handles a store/copy, it iterates the current copy entries to remove aliases, but keeps the "equal" entry (if exists) to be updated. The removal step may swap the entries around (to ensure there are no holes), invalidating previous iteration pointers. The bug was saving such pointer to use later. Change the code to first perform the removals and then find the remaining right entry. This was causing updates to be lost since they were being made to an entry that was not part of the current copies. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108624 Fixes: `b3c6146925` "nir: Copy propagation between blocks" Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-19 09:33:36 -08:00
Michel Dänzer	9d8395bf0e	winsys/amdgpu: Pull in LLVM CFLAGS Fixes build failure if the LLVM headers aren't in a standard include directory. Fixes: `ec22dd34c8` "radeonsi: move SI_FORCE_FAMILY functionality to winsys" Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-12-19 17:54:18 +01:00
Caio Marcelo de Oliveira Filho	0ddc911f4d	nir: properly clear the entry sources in copy_prop_vars When updating a copy entry source value from a "non-SSA" (the data come from a copy instruction) to a "SSA" (the data or parts of it come from SSA values), it was possible to hold invalid data in ssa[0] depending on the writemask. Because of the union, ssa[0] could contain a pointer to a nir_deref_instr left-over from previous non-SSA usage. Change code to clean up the array before use to avoid invalid data around. Fixes: `62332d139c` "nir: Add a local variable-based copy propagation pass" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-19 08:35:48 -08:00
Eric Engestrom	0e4c7c3d5b	docs: format code blocks a bit nicely Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-12-19 16:32:30 +00:00
Eric Engestrom	b0319d0768	docs: add meson cross compilation instructions Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-19 16:31:51 +00:00
Gurchetan Singh	b45aa6290b	virgl: move resource creation / import / destruction to common code We can remove some duplicated code. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	1d3d311133	virgl: move resource metadata into base resource A resource is just a buffer with some metadata. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	db77573d7b	virgl: modify how we handle GL_MAP_FLUSH_EXPLICIT_BIT Previously, we ignored the glUnmap(..) operation and flushed before we flush the cbuf. Now, let's just flush the data when we unmap. Neither method is optimal, for example: glMapBufferRange(.., 0, 100, GL_MAP_FLUSH_EXPLICIT_BIT) glFlushMappedBufferRange(.., 25, 30) glFlushMappedBufferRange(.., 65, 70) We'll end up flushing 25 --> 70. Maybe we can fix this later. v2: Add fixme comment in the code (Elie) Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	11939f6fa2	virgl: make virgl_buffers use resource helpers We can reuse the helpers we created. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	4e2c77cd51	virgl: make transfer code with PIPE_BUFFER targets util_format_get_blocksize returns 1 for R8 formats (all PIPE_BUFFERs are R8). Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	174f530008	virgl: consolidate transfer code We could allocate and destroy transfers in one place. v2: Keep l_stride around. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	13626b46f1	virgl: store layer_stride in metadata Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	2a44acc83b	virgl: move vrend_get_tex_image_offset to common code Will be reused. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	f749229a8e	virgl: move virgl_resource_layout to common code Will be reused. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	a63da9c062	virgl: move texture metadata to common code Will be reused. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	6e7d396ad3	virgl: remove unnessecary code With commit 89b479, we moved to tracking buffer cleanliness when binding. TEST=dEQP-GLES31.functional.image_load_store.buffer.load_store.r32ui Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	6d13d1aadb	virgl: texture_transfer_pool --> transfer_pool It's used for all types of resources. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Nicolai Hähnle	d73a25f2c0	radeonsi: const-ify the si_query_ops Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:02:07 +01:00
Nicolai Hähnle	c85b0dea0a	radeonsi: split perfcounter queries from si_query_hw Remove a level of indirection to make the code more explicit -- should make it easier to follow what's going on. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:02:04 +01:00
Nicolai Hähnle	e0f0d3675d	radeonsi: factor si_query_buffer logic out of si_query_hw This is a move towards using composition instead of inheritance for different query types. This change weakens out-of-memory error reporting somewhat, though this should be acceptable since we didn't consistently report such errors in the first place. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:02:01 +01:00
Nicolai Hähnle	0fc6e573dd	radeonsi: move query suspend logic into the top-level si_query struct Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:59 +01:00
Nicolai Hähnle	e2b9329f17	radeonsi: move remaining perfcounter code into si_perfcounter.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:57 +01:00
Nicolai Hähnle	7dd289d9e4	radeonsi: track constant buffer bind history in si_pipe_set_constant_buffer Other callers of si_set_constant_buffer don't need it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:54 +01:00
Nicolai Hähnle	829d417914	radeonsi: use si_set_rw_shader_buffer for setting streamout buffers Reduce the number of places that encode buffer descriptors. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:52 +01:00
Nicolai Hähnle	ce785f5ffd	radeonsi: add an si_set_rw_shader_buffer convenience function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:50 +01:00
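
The virgl `GL_MAP_FLUSH_EXPLICIT_BIT` entry (db77573d7b) says that two separate explicit flushes end up as one flush of 25 --> 70. Below is a minimal sketch of that behavior, assuming the driver keeps a single bounding range per mapping. It reads each pair of numbers as start and end offsets, as the commit message's arithmetic does, whereas the real `glFlushMappedBufferRange` takes an offset and a length. The class and method names are hypothetical and are not taken from the virgl source.

```python
from typing import Optional, Tuple


class BoundingFlushRange:
    """Tracks one bounding [start, end) range for all explicit flushes.

    Hypothetical helper for illustration only; not the virgl implementation.
    """

    def __init__(self) -> None:
        self.range: Optional[Tuple[int, int]] = None

    def flush(self, start: int, end: int) -> None:
        # Ignore empty requests.
        if end <= start:
            return
        # Grow the single tracked range so it also covers the new request.
        if self.range is None:
            self.range = (start, end)
        else:
            self.range = (min(self.range[0], start), max(self.range[1], end))

    def unmap(self) -> Optional[Tuple[int, int]]:
        # Hand back the bounding range to flush and reset the tracker.
        pending, self.range = self.range, None
        return pending


tracker = BoundingFlushRange()
tracker.flush(25, 30)
tracker.flush(65, 70)
print(tracker.unmap())  # (25, 70): the gap between 30 and 65 is covered too
```

Tracking a list of disjoint ranges instead would avoid flushing the gap, at the cost of more bookkeeping per mapping. The commit leaves that as a possible later fix.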

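The `copy_prop_vars` fix (947f7b452a) describes a saved pointer that goes stale because removal swaps entries around to avoid holes. The sketch below models that with list indices instead of pointers. The function names, the string entries, and the reverse iteration order are assumptions made for illustration and are not the NIR code.

```python
from typing import List, Optional


def swap_remove(entries: List[str], index: int) -> None:
    """Remove entries[index] by moving the last entry into its slot (no holes)."""
    entries[index] = entries[-1]
    entries.pop()


def buggy_lookup(entries: List[str]) -> Optional[int]:
    """Saves the position of the "equal" entry while removals are still happening."""
    kept = None
    for i in range(len(entries) - 1, -1, -1):
        if entries[i] == "equal":
            kept = i  # saved too early: a later swap_remove may move this entry
        elif entries[i] == "alias":
            swap_remove(entries, i)
    return kept


def fixed_lookup(entries: List[str]) -> Optional[int]:
    """Performs all removals first, then looks up the surviving "equal" entry."""
    for i in range(len(entries) - 1, -1, -1):
        if entries[i] == "alias":
            swap_remove(entries, i)
    for i, entry in enumerate(entries):
        if entry == "equal":
            return i
    return None


copies = ["alias", "other", "equal"]
stale = buggy_lookup(copies)
print(stale, copies)  # 2 ['equal', 'other']: index 2 is past the end of the list

copies = ["alias", "other", "equal"]
kept = fixed_lookup(copies)
print(kept, copies[kept])  # 0 equal
```

In the buggy variant the saved position still refers to the slot the entry used to occupy, so an update written through it would be lost, which matches the symptom the commit message reports.
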
106558 Commits