KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Dave Airlie	c0521ecffb	llvmpipe: enable compute shaders if LLVM has coroutines Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	6453a22612	llvmpipe: add local memory allocation path Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	4e70970507	llvmpipe: add compute shader parameter fetching support Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	0b51e73de2	llvmpipe: add compute shader images support Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	45a8cf95f2	llvmpipe: add ssbo support to compute shaders Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	6ea8e9b415	llvmpipe: add compute sampler + sampler view support. This is ported from the fragment shader code. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	4ca40cc3dc	llvmpipe: add support for compute constant buffers. This is mostly ported from the fragment shader code. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	775fa81d7b	llvmpipe: add compute pipeline statistics support. This just adds the CS invocations counter. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	50fde5b208	llvmpipe: add grid launch This adds the dispatch code. It creates a job for the number of blocks in the grid, and dispatches them to the threadpool implementation. The threadpool then calls the JIT code to execute the coroutines. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	b320830bbd	llvmpipe: add compute shader generation. This creates the coroutine execution environment and the main compute shaders that get executed inside it. Each compute shader block is executed in its own coroutine execution shader, with each "thread" being a coroutine executed inside it in sequence. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	6ea41df94c	llvmpipe: introduce variant building infrastructure. This doesn't actually build any of the shaders yet, but just builds up the framework necessary to start building the shaders and variants. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	fc01fafdbc	llvmpipe: introduce new state dirty tracking for compute. Compute doesn't share dirty state with the fragment pipeline so create a separate path for it. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	a6f6ca37c8	llvmpipe: add initial shader create/bind/destroy variants framework. This is mostly a port of the fragment shader framework Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	a792c5ae3e	llvmpipe: add compute debug option Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	25f46ae9aa	gallivm: add compute jit interface. This adds the jit interface for compute shaders, it's based on the fragment shader one. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	3879f69b50	llvmpipe: add initial compute state structs These mirror the fragment shader structs, this is just a framework. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	add0b151f5	llvmpipe: introduce compute shader context The compute shader will need its own context like the frag shader has, this just introduces the framework struct and allocates/frees for it in the right places. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	83597ad3f2	gallivm: add barrier support for compute shaders. When the code is executing and hits a barrier, it will suspend the coroutine and return control to the coroutine dispatcher. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	1b24e3ba75	llvmpipe: add compute threadpool + mutex Reviewed-by: Roland Scheidegger <sroland@vmware.com> In order to efficiently run a number of compute blocks, use a threadpool that just allows for jobs with unique sequential ids to be dispatched.	2019-09-04 15:22:20 +10:00
Dave Airlie	e5bf6b7013	gallivm: add support for compute shared memory Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	db6c78f9c8	gallivm: add new compute related intrinsics Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	3312bed7b0	llvmpipe: reorganise jit pointer ordering In order to share the texture/image/sampler code with compute shaders we need to reorg them to be at the front of context same as draw does for vs/gs sharing. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	d32690b43c	gallivm: add coroutine pass manager support coroutines require a proper pass manager, so add the passes to the correct places Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	9cf1340e4f	gallivm: add coroutine support files to gallivm. These wrap the coroutine intrinsics and also add some higher level wrappers around coroutine begin, end and suspend procedures Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	f3f0cbf4f4	gallivm/flow: add counter reset for loops This allows the counter value to be forced to a certain value Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	6b3c6b91a8	llvmpipe: enable fb no attach Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Kenneth Graunke	f8887909c6	iris: Report correct number of planes for planar images We were only handling the modifiers case and not counting the number of planes in actual planar images. Fixes Piglit's ext_image_dma_buf_import-export. Fixes: `fc12fd05f5` ("iris: Implement pipe_screen::resource_get_param") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111509 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-09-03 21:55:23 -07:00
Ilia Mirkin	32d458fdff	teximage: ensure that TexSubImage checks format We were previously not doing at least some of the checks. This uses the same logic that is used in glTexImage*. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-04 00:35:45 -04:00
Jan Beich	8e92ce9ba5	gallium/hud: add CPU usage support for DragonFly/NetBSD/OpenBSD Each BSD has slightly different sysctl for retrieving per-CPU times. FreeBSD returns long while NetBSD returns uint64_t. On OpenBSD return type differs between summation and per-CPU times. DragonFly is compatible with FreeBSD. Signed-off-by: Jan Beich <jbeich@FreeBSD.org>	2019-09-03 22:53:15 -04:00
Roman Stratiienko	ef621a73f7	lima: Return fence unconditionally Based on the vc4 implementation. Fixes Android RenderEngine::flush() routine: android.googlesource.com/platform/frameworks/native/+/refs/tags/android-o-mr1-iot-release-smart-clock-fcs/services/surfaceflinger/RenderEngine/RenderEngine.cpp#225 Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-09-04 00:32:04 +00:00
Vasily Khoruzhick	1c1890fa70	lima/ppir: clone uniforms and load_coords into each successor Try more aggressive approach with cloning uniform and coord loads. Uniform load can be inserted into any instruction, so let's do that. ARM site claims that penalty for cache miss is one clock, so we don't lose anything if we merge it into instruction that uses the result. As side effect we can also pipeline it and thus decrease reg pressure. Do the same for varyings that hold texture coords, but for different reason: looks like there's a special path for coords that increases precision if varying that holds it is pipelined. If we don't pipeline it and load coords from a register its precision is fp16 and thus only 10 bits which is not enough to accurately sample textures of size 1024 or larger. Since instruction can hold only one uniform load and one varying load, node_to_instr now creates a move using helper introduced in previous commit if slot is already taken. As side effect of this change we can also try to pipeline texture loads and create a move if attempt fails. Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-04 00:02:13 +00:00
Vasily Khoruzhick	e23fd2c375	lima/ppir: don't assume that load coords gets value from register It can load value from varying directly as well. Also load_regs is the only op that has a source, so add src_num field to load node and set it accordingly. Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-04 00:02:13 +00:00
Vasily Khoruzhick	bd77d19300	lima/ppir: add common helper for creating movs Introduce common helper for creating movs to avoid code duplication Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-04 00:02:13 +00:00
Eric Engestrom	7659c6197f	nir: fix memleak in error path Fixes: `2cf59861a8` ("nir: Add partial redundancy elimination for compares") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-09-04 00:31:53 +01:00
Eric Engestrom	c4969b0a25	freedreno/drm-shim: fix mem leak Fixes: `494ecef6b4` ("freedreno: Add support for drm-shim.") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-04 00:18:37 +01:00
Eric Engestrom	7abf65aedc	anv: fix format string in error message Fixes: `9775894f10` ("anv: Move size check from anv_bo_cache_import() to caller (v2)") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-04 00:13:20 +01:00
Eric Engestrom	1667360f7d	util/os_file: fix double-close() Fixes: `955c63d364` ("util/os_file: resize buffer to what was actually needed") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-09-04 00:11:51 +01:00
Eric Engestrom	43d470404c	egl: fix deadlock in malloc error path Fixes: `cb0980e69a` ("egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-09-04 00:10:18 +01:00
Eric Engestrom	3afe9d798a	ttn: fix 64-bit shift on 32-bit `1` Fixes: `4d0b2c7aaa` ("ttn: Update shader->info as we generate code.") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-09-04 00:01:08 +01:00
Rob Clark	1ef459297c	freedreno/ir3: use uniform base When lowering from ubo, use the constant base field in the load_uniform instruction for the constant part of the offset. Doesn't change much for constant indexing, but this will help for indirect indexing because constant-folding can't completely clean up the result. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-03 14:10:57 -07:00
Rob Clark	305bcdf992	freedreno/drm: fix 64b iova shifts Should shift before splitting 64b iova into dwords Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-03 14:10:57 -07:00
Rob Clark	5ccd5871ed	nir: remove unused constant_fold_state Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-03 14:10:57 -07:00
Eric Anholt	79a5ebe045	freedreno: Fix the type of single-component scaled vertex attrs. This looks like clear copy-and-pasteos, and fixes: dEQP-GLES2.functional.draw.random.40 (on A307 and A630, both tested in the new CI farm) Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-09-03 19:34:09 +00:00
Connor Abbott	f3e978db4d	radeonsi/nir: Remove uniform variable scanning We can get all the information we need from NIR. It's slightly less accurate, but radeonsi doesn't use the extra information. The old code also overcounted atomic counters, which led to problems when everything was used at once. Fixes KHR-GL45.compute_shader.resources-max. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-03 15:55:02 +02:00
Connor Abbott	96c2a2832f	ttn: Fill out more info fields We'll use these in radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-03 15:54:57 +02:00
Connor Abbott	dcc64fcfed	nir: Fix num_ssbos when lowering atomic counters Otherwise it's impossible to know the maximum SSBO index for both internal TGSI shaders from TTN (which don't have any notion of atomic counters and no offset) as well as shaders from GLSL. I fixed everything I could find while grepping for num_ssbos and num_abos, which hopefully is everything (iris was the only user I could find that uses it in a meaningful way). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-03 15:54:54 +02:00
Connor Abbott	2abf62d348	ac/nir: Fix gather4 integer wa with unnormalized coordinates This adds a bit of unnecessary code on radeonsi, since whether unnormalized coordinates are used is known at compile time with GL, but I wasn't sure if it was worth the few instructions to plumb everything through, especially for something so rare -- my shader-db doesn't have any instances where this changes anything. Fixes CTS tests I created at https://github.com/cwabbott0/VK-GL-CTS/tree/unnorm-gather-tests Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-03 13:50:54 +00:00
Connor Abbott	c63ccf90df	ac/nir: Rewrite gather4 integer workaround based on radeonsi The workaround was originally written based on amdgpu-pro traces, but since then radeonsi has got its own slightly different version. Use the radeonsi version instead, to be consistent and because it'll be slightly more convenient for handling unnormalized coordinates. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-03 13:50:54 +00:00
Eric Engestrom	5f7d90f2ff	egl: warn user if they set an invalid EGL_PLATFORM Technically, the user might have set EGL_DISPLAY instead of EGL_PLATFORM, but since the former is deprecated let's just mention the latter in the warning message. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-03 14:41:43 +01:00
Alyssa Rosenzweig	5cdfccf8a6	panfrost: Remove panfrost_upload This routine was made obsolete over a series of reworks of memory allocation; Tomeu's changes to shader memory allocation finally made this unused as cppcheck noted. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-03 13:55:29 +02:00
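
Two of the commits listed above (3afe9d798a "ttn: fix 64-bit shift on 32-bit `1`" and 305bcdf992 "freedreno/drm: fix 64b iova shifts") concern the width at which a shift is performed. The sketch below illustrates the general C pitfalls those messages describe. It is not the code from either commit: the helper names `bit64` and `split_iova` are hypothetical, and the sketch assumes a right shift by fewer than 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return a 64-bit mask with only bit n set.
 * A plain `1 << n` has type int, so shifting it by 32 or more is
 * undefined behaviour; widening the constant first avoids that. */
static inline uint64_t bit64(unsigned n)
{
    assert(n < 64);
    return UINT64_C(1) << n;
}

/* Hypothetical helper: shift a 64-bit address, then split it into
 * two 32-bit dwords. Shifting each half separately after the split
 * would drop the bits that cross the 32-bit boundary. */
static inline void split_iova(uint64_t iova, unsigned shift,
                              uint32_t *lo, uint32_t *hi)
{
    assert(shift < 64);
    iova >>= shift;               /* shift the full 64-bit value first */
    *lo = (uint32_t)iova;         /* low dword */
    *hi = (uint32_t)(iova >> 32); /* high dword */
}
```

For example, with an address of `0x1F00000000` and a shift of 4, shifting first yields the dwords `0xF0000000` (low) and `0x1` (high), whereas splitting first and shifting each half would yield a low dword of 0.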

115071 Commits

All Branches