KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Francisco Jerez	9ec3362e0b	intel/l3: Don't allocate SLM partition on ICL+. SLM has a chunk of special-purpose memory separate from L3 on ICL+, we shouldn't allocate a partition for it on L3 anymore. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-02 11:28:56 -08:00
Charmaine Lee	af8877af3b	svga: add SVGA_NEW_PRESCALE to the tracked dirty mask for gs Since geometry shader also consumes prescale constants, the geometry shader constant buffer will need to be updated when prescale factor is changed. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-02 12:23:50 -07:00
Brian Paul	dc79b88402	svga: fix blending regression The earlier Mesa commit `3d06c8afb5` ("st/mesa: don't translate blend state when it's disabled for a colorbuffer") subtly changed the details of gallium's per-RT blend state. In particular, when pipe_rt_blend_state[i].blend_enabled is true, we have to get the src/dst blend terms from pipe_rt_blend_state[i], not [0] as before. We now have to scan the blend targets to find the first one that's enabled (if any). We have to use the index of that target for getting the src/dst blend terms. And note that we have to set identical blend terms for all targets. This fixes the Piglit fbo-drawbuffers2-blend test. VMware bug 2063493. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-03-02 12:23:50 -07:00
Brian Paul	b871a77316	svga: check svga_have_vgpu10() in svga_delete_blend_state() We were calling SVGA3D_vgpu10_DestroyBlendState() when vgpu10 was not enabled (bs->id==0 by default), resulting in lots of device errors. Reviewed-by: Neha Bhende<bhenden@vmware.com>	2018-03-02 12:23:50 -07:00
Brian Paul	72df3a7a39	svga: if svga_update_state() fails, skip the draw call If svga_update_state() fails, we flush the command buffer and retry. If it fails again, it likely means we were unable to translate a shader for some reason (uses too many resources, for example). In that case, let's just skip the draw call. The alternative, just disabling the shader stage in question, would certainly lead to bad rendering anyway, and probably device errors. Fixes failed assertion running Piglit glsl-1.50/execution/ variable-indexing/gs-output-array-vec4-index-wr.shader_test since it uses too many GS output registers (though the test still fails). VMware bug 2063492. v2: also call pipe_debug_message() so apps or apitrace can be notified when this issue occurs. v3: use svga_update_state_retry(). Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-02 12:23:50 -07:00
Brian Paul	0a7deaa0d6	svga: let svga_update_state_retry() return a bool This will allow minor simplifications elsewhere. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-02 12:23:50 -07:00
Brian Paul	35c5cf8959	svga: s/unsigned/boolean/ for a few local vars Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-03-02 12:23:50 -07:00
Dylan Baker	e23192022a	meson: install vulkan_intel.h header Fixes: `d1992255bb` ("meson: Add build Intel "anv" vulkan driver") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-02 11:11:20 -08:00
Boyuan Zhang	1ad89fa138	st/omx_bellagio: add picture profile and entry point Profile and entry point were missing in the picture structure. Therefore, add them back. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2018-03-02 12:04:36 -05:00
Boyuan Zhang	6a62e455f2	radeonsi: fix radeon create encoder return Previous patch missed a "return" when trying to modify the create encoder function, which made the whole logic fail. Therefore, add the return back. Fixes: `b38b208ff8` "radeonsi:create uvd hevc enc entry" Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-02 12:04:36 -05:00
Thierry Reding	f9bc48d41d	loader: Add support for platform and host1x busses ARM SoCs usually have their DRM/KMS devices on the platform bus, so add support for this bus in order to allow use of the DRI_PRIME environment variable with those devices. While at it, also support the host1x bus, which is effectively the same but uses an additional layer in the bus hierarchy. Note that it isn't enough to support the bus that has the rendering GPU because the loader code will also try to construct an ID path tag for a scanout-only device if it is the default that is being opened. The ID path tag for a device can be obtained by running udevadm info on the device node, as shown in this example on NVIDIA Tegra: $ udevadm info /dev/dri/card0 \| grep ID_PATH_TAG E: ID_PATH_TAG=platform-50000000_host1x The corresponding OF_FULLNAME property, from which the ID_PATH_TAG is constructed, can be found in the sysfs "uevent" attribute for the card0 device's parent: $ grep OF_FULLNAME /sys/devices/platform/50000000.host1x/drm/uevent OF_FULLNAME=/host1x@50000000 Similarily, /dev/dri/card1 corresponds to the GPU: $ udevadm info /dev/dri/card1 \| grep ID_PATH_TAG E: ID_PATH_TAG=platform-57000000_gpu and: $ grep OF_FULLNAME /sys/devices/platform/57000000.gpu/uevent OF_FULLNAME=/gpu@57000000 Changes in v2: - avoid confusing pre-increment in strdup() - add examples of tags to commit message Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-02 14:40:29 +01:00
Thierry Reding	498faea103	disk cache: Link with -latomic if necessary The disk cache implementation uses 64-bit atomic operations. For some architectures, such as 32-bit ARM, GCC will not be able to translate these operations into atomic, lock-free instructions and will instead rely on the external atomics library to provide these operations. Check at configuration time whether or not linking against libatomic is necessary and if so, create a dependency that can be used while linking the mesautil library. This is the meson equivalent of `2ef7f23820` ("configure: check if -latomic is needed for __atomic_*"). For some background information on this, see: https://gcc.gnu.org/wiki/Atomic/GCCMM Changes in v2: - clarify meaning of lock-free in commit message - fix build if -latomic is not necessary Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-02 11:31:59 +01:00
Samuel Pitoiset	c133a3411b	radv: do not set pending_reset_query in BeginCommandBuffer() This is just useless for two reasons: 1) flush_bits is not set accordingly, so nothing will be flushed in BeginQuery(). 2) we always flush caches in EndCommandBuffer(), so if a reset is done in a previous command buffer we are safe. Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-02 09:44:12 +01:00
Dave Airlie	bf2af063c3	r600/cayman: fix fragcood loading recip generation. This fixes some hangs seen where the recip_ieee opcodes would end up split across the wrong slots. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-02 00:33:18 +00:00
Kenneth Graunke	cee9f38903	i965: Allow 48-bit addressing on Gen8+. This allows most GPU objects to use the full 48-bit address space offered by Gen8+ platforms, rather than being stuck with 32-bit. This expands the available GPU memory from 4G to 256TB or so. A few objects - instruction, scratch, and vertex buffers - need to remain pinned in the low 4GB of the address space for various reasons. We default everything to 48-bit but disable it in those cases. Thanks to Jason Ekstrand for blazing this trail in anv first and finding the nasty undocumented hardware issues. This patch simply rips off all of his findings. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-01 15:46:11 -08:00
Kenneth Graunke	6712611735	i965: Shorten the name of the workaround BO. This makes the name shorter in debug printouts. If "workaround_bo" is good enough for the code, it's probably good enough for debugging.	2018-03-01 15:46:11 -08:00
Kenneth Graunke	b04c5cece7	i965: Add debugging code to dump the validation list. When anything goes wrong with this code, dumping the validation list is a useful way to figure out what's happening.	2018-03-01 15:46:11 -08:00
Jason Ekstrand	ff4726077d	intel/fs: Set up sampler message headers in the visitor on gen7+ This gives the scheduler visibility into the headers which should improve scheduling. More importantly, however, it lets the scheduler know that the header gets written. As-is, the scheduler thinks that a texture instruction only reads it's payload and is unaware that it may write to the first register so it may reorder it with respect to a read from that register. This is causing issues in a couple of Dota 2 vertex shaders. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104923 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-03-01 15:11:01 -08:00
Timothy Arceri	f5305c1b44	ac: fix nir_intrinsic_shared_atomic_comp_swap handling Following on from `49879f3778` this makes sure we use the correct src index. Fixes cts test: KHR-GL46.compute_shader.atomic-case3 Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-02 09:11:20 +11:00
Timothy Arceri	13cdf4e590	st/glsl_to_nir: simplify st_nir_assign_var_locations() and fix for fs outputs We only need to check for previously processed location on user defined varyings as they are the only ones that support component packing. Therefore a single instance of processed_locs can be shared by regular varyings and patches. For simplicity we make processed_locs an array in order to handle dual source bleanding. Fixes the follow piglit test on radeonsi: tests/spec/arb_enhanced_layouts/execution/component-layout/fs-output.shader_test Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-02 09:11:20 +11:00
Jason Ekstrand	89f78cf333	anv: Enable MSAA fast-clears This speeds up the Sascha Willems multisampling demo by around 25% when using 8x or 16x MSAA. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	00da139477	anv/cmd_buffer: Add support for MCS fast-clears and resolves Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	1805c483b1	anv/cmd_buffer: Add helpers for computing resolve predicates We'll want to re-use the complex resolve predicate computations for MCS resolves so it's nice to have them as helper functions. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	a0a319f16e	anv/cmd_buffer: Handle MCS identical to CCS_E in compute_aux_usage This doesn't actually do anything because att_state->fast_clear is determined based on the return value of anv_layout_to_fast_clear_type which currently returns NONE for multisampled images. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	d0f701d2f1	anv/blorp: Pass the clear address to blorp for subpass MSAA resolves Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	f4f95496cb	anv/blorp: Allow indirect clear colors on blorp sources on gen7 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	d85f05bd6f	anv/blorp: Add partial clear support to anv_image_mcs_op Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	c34feaea52	intel/blorp: Add indirect clear color support to mcs_partial_resolve This is a bit complicated because we have to get the indirect clear color in there somehow. In order to not do any more work in the shader than needed, we set it up as it's own vertex binding which points directly at the clear color address specified by the client. Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	ca7ab1a6a5	intel/blorp: Add a helper for filling out VERTEX_BUFFER_STATE There are enough #ifs in there that it's kind-of pointless to duplicate it for each buffer. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Andriy Khulap	7859701920	i965: Fix RELOC_WRITE typo in brw_store_data_imm64() Fixes: `6c530ad116` ("i965: Reduce passing 2x32b of reloc_domains to 2 bits") Signed-off-by: Andriy Khulap <andriy.khulap@globallogic.com> Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-01 11:20:04 -08:00
Jonathan Gray	034bbaa6c0	gallium/util: use sockets on PIPE_OS_UNIX in u_network Instead of listing all the UNIX PIPE_OS platforms just use PIPE_OS_UNIX. Makes BSD sockets available on PIPE_OS_BSD. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-01 18:44:39 +00:00
Jonathan Gray	7bea40e566	util: use clock_gettime() on PIPE_OS_BSD OpenBSD, FreeBSD, NetBSD and DragonFlyBSD all have clock_gettime() so use it when PIPE_OS_BSD is defined. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-01 18:44:38 +00:00
Jose Maria Casanova Crespo	4420d8866c	nir/search: Include 8 and 16-bit support in construct_value Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-01 09:16:03 -08:00
Jason Ekstrand	99ee40fb54	nir/search: Support 8 and 16-bit constants in match_value Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-03-01 09:15:01 -08:00
Andres Gomez	b5b912dfee	travis: make Meson find the proper llvm-config Travis CI has moved to LLVM 5.0, and meson is detecting automatically the available version in /usr/local/bin based on the PATH env variable order preference. As for 0.44.x, Meson cannot receive the path to the llvm-config binary as a configuration parameter. See https://github.com/mesonbuild/meson/issues/2887 and `7c8b6ee3fa` We want to use the custom (APT) installed version. Therefore, let's make Meson find our wanted version sooner than the one at /usr/local/bin Once this is corrected, we would still need a patch similar to: https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html v2: Create the link only to the specificly wanted LLVM version (Gert). Cc: Eric Engestrom <eric.engestrom@imgtec.com> Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: Gert Wollny <gw.fossdev@gmail.com> Cc: Jon Turney <jon.turney@dronecode.org.uk> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-By: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-01 12:21:30 +02:00
Andres Gomez	98f7650add	meson: fix LLVM version detection when <= 3.4 3 digits versions in LLVM only started from 3.4.1 on. Hence, even if you can perfectly build with an old LLVM (< 3.4.1) in the system while not needing LLVM at all (auto), when passing through the LLVM version detection code, meson will fail when accessing "_llvm_version[2]" due to: "Index 2 out of bounds of array of size 2." v2: Properly compare LLVM version and set patch version to 0 if < 3.4.1 (Eric). v3: Improve the commit log explanation (Eric). Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-01 12:16:23 +02:00
Iago Toral Quiroga	bc73016703	i965/sbe: fix number of inputs for active components In `16631ca30e` we fixed gen9 active components to account for padded inputs in the URB, which we can have with SSO programs. To do that, instead of going through the bitfield of inputs (which doesn't include padding information), we compute the number of inputs from the size of the URB entry. Unfortunately, there are some special inputs that are not stored in the URB and that we also need to account for. These special inputs are identified and handled during calculate_attr_overrides(). Instead of keeping track of the exact number of inputs, we just program active components for all possible inputs like we do in anvil. This fixes a regression in a WebGL program that uses Point Sprite functionality (specifically, VARYING_SLOT_PNTC). v2: - Add 'Fixes' tag (Mark Janes) - make no_vue_inputs int instead of uint32_t, and add const qualifier to num_inputs variable (Ian) v3: - Do not try to count inputs correctly, just program all input slots like we do in anvil (Ken) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105224 Fixes: `16631ca30e` (i965/sbe: fix active components for SSO programs with over 16 inputs) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-01 10:55:12 +01:00
Samuel Pitoiset	c27f5419f6	radv: only emit cache flushes when the pool size is large enough This is an optimization which reduces the number of flushes for small pool buffers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:53:40 +01:00
Samuel Pitoiset	2fe07933bd	radv: keep track of the query pool size Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:53:39 +01:00
Samuel Pitoiset	c956d0f406	radv: make sure to emit cache flushes before starting a query If the query pool has been previously resetted using the compute shader path. Fixes: `a41e2e9cf5` ("radv: allow to use a compute shader for resetting the query pool") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105292 Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:14:49 +01:00
Alejandro Piñeiro	e72fb4e611	nir/serialize: handle var->name being NULL var->name could be NULL under ARB_gl_spirv for example. And in any case, the code is already handing var name being NULL when reading a variable, so it is consistent to do it writing a variable too. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-01 08:23:33 +01:00
Jose Maria Casanova Crespo	ba642ee3ee	anv: Enable VK_KHR_16bit_storage for PushConstant Enables storagePushConstant16 features of VK_KHR_16bit_storage for Gen8+. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	02266f9ba1	spirv/i965/anv: Relax push constant offset assertions being 32-bit aligned The introduction of 16-bit types with VK_KHR_16bit_storages implies that push constant offsets could be multiple of 2-bytes. Some assertions are updated so offsets should be just multiple of size of the base type but in some cases we can not assume it as doubles aren't aligned to 8 bytes in some cases. For 16-bit types, the push constant offset takes into account the internal offset in the 32-bit uniform bucket adding 2-bytes when we access not 32-bit aligned elements. In all 32-bit aligned cases it just becomes 0. v2: Assert offsets to be aligned to the dest type size. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	23ffb7c2d1	spirv: Calculate properly 16-bit vector sizes Range in 16-bit push constants load was being calculated wrongly using 4-bytes per element instead of 2-bytes as it should be. v2: Use glsl_get_bit_size instead of if statement (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	994d210429	anv: Enable VK_KHR_16bit_storage for SSBO and UBO Enables storageBuffer16BitAccess and uniformAndStorageBuffer16BitAccesss features of VK_KHR_16bit_storage for Gen8+. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	69be3a82ca	i965/fs: Support 16-bit store_ssbo with VK_KHR_relaxed_block_layout Restrict the use of untyped_surface_write with 16-bit pairs in ssbo to the cases where we can guarantee that offset is multiple of 4. Taking into account that VK_KHR_relaxed_block_layout is available in ANV we can only guarantee that when we have a constant offset that is multiple of 4. For non constant offsets we will always use byte_scattered_write. v2: (Jason Ekstrand) - Assert offset_reg to be multiple of 4 if it is immediate. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	8dd8be0323	i965/fs: Support 16-bit do_read_vector with VK_KHR_relaxed_block_layout 16-bit load_ubo/ssbo operations that call do_untyped_read_vector don't guarantee that offsets are multiple of 4-bytes as required by untyped_read message. This happens for example in the case of f16mat3x3 when then VK_KHR_relaxed_block_layout is enabled. Vectors reads when we have non-constant offsets are implemented with multiple byte_scattered_read messages that not require 32-bit aligned offsets. Now for all constant offsets we can use the untyped_read_surface message. In the case of constant offsets not aligned to 32-bits, we calculate a start offset 32-bit aligned and use the shuffle_32bit_load_result_to_16bit_data function and the first_component parameter to skip the copy of the unneeded component. v2: (Jason Ekstrand) Use untyped_read_surface messages always we have constant offsets. v3: (Jason Ekstrand) Simplify loop for reads with non constant offsets. Use end - start to calculate the number of 32-bit components to read with constant offsets. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	2dd94f462b	i965/fs: shuffle_32bit_load_result_to_16bit_data now skips components This helper used to load 16bit components from 32-bits read now allows skipping components with the new parameter first_component. The semantics now skip components until we reach the first_component, and then reads the number of components passed to the function. All previous uses of the helper are updated to use 0 as first_component. This will allow read 16-bit components when the first one is not aligned 32-bit. Enabling more usages of untyped_reads with 16-bit types. v2: (Jason Ektrand) Change parameters order to first_component, num_components Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	67d7dd594e	isl/i965/fs: SSBO/UBO buffers need size padding if not multiple of 32-bit The surfaces that backup the GPU buffers have a boundary check that considers that access to partial dwords are considered out-of-bounds. For example, buffers with 1,3 16-bit elements has size 2 or 6 and the last two bytes would always be read as 0 or its writting ignored. The introduction of 16-bit types implies that we need to align the size to 4-bytew multiples so that partial dwords could be read/written. Adding an inconditional +2 size to buffers not being multiple of 2 solves this issue for the general cases of UBO or SSBO. But, when unsized arrays of 16-bit elements are used it is not possible to know if the size was padded or not. To solve this issue the implementation calculates the needed size of the buffer surfaces, as suggested by Jason: surface_size = isl_align(buffer_size, 4) + (isl_align(buffer_size, 4) - buffer_size) So when we calculate backwards the buffer_size in the backend we update the resinfo return value with: buffer_size = (surface_size & ~3) - (surface_size & 3) It is also exposed this buffer requirements when robust buffer access is enabled so these buffer sizes recommend being multiple of 4. v2: (Jason Ekstrand) Move padding logic fron anv to isl_surface_state. Move calculus of original size from spirv to driver backend. v3: (Jason Ekstrand) Rename some variables and use a similar expresion when calculating. padding than when obtaining the original buffer size. Avoid use of unnecesary component call at brw_fs_nir. v4: (Jason Ekstrand) Complete comment with buffer size calculus explanation in brw_fs_nir. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Mathias Fröhlich	4c232dc721	vbo: Remove vbo_save_vertex_list::vertex_size. Like before use local variables from compile_vertex_list instead. Remove vertex_size from struct vbo_save_vertex_list. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00

... 2 3 4 5 6 ...

100728 Commits All Branches Search

100728 Commits

All Branches