KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Jason Ekstrand	8915621882	intel/blorp: Take a range of layers in blorp_ccs_resolve Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:22:13 -08:00
Jason Ekstrand	67b676f0c5	intel/blorp: Add initial support for indirect clear colors Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-27 16:22:12 -08:00
Jason Ekstrand	85aa4074a2	i965/blorp: Use a designated initializer for blorp_surf This way uninitialized fields get automatically zeroed and it's safe to add more fields to blorp_surf. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:22:12 -08:00
Jason Ekstrand	86becfd2de	intel/blorp: Add fast-clear to the special case in MSAA resolves This doesn't go all the way of avoiding the txf_ms if it's fast-cleared, however it does at least make us only do it once. This should improve performance of MSAA resolves in the presence of lots of clear color. Without the patch, enabling fast-clears in the multisampling Sascha demo drops the framerate by about 10%. With this patch, enabling fast-clears increases the demo's framerate by 25%. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:22:11 -08:00
Jason Ekstrand	dc21c3937c	intel/blorp/blit: Rename blorp_nir_txf_ms_mcs That name is already taken by one of the helpers in blorp_nir_builder.h and, while we haven't moved the guts of blorp_blit.c there yet, we'd like to start using some things from that header. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:19:38 -08:00
Rob Herring	46148be8e4	Android: disable warnings causing errors AOSP master has changed the build default to -Werror making all the warnings errors. Override that with -Wno-error. Signed-off-by: Rob Herring <robh@kernel.org>	2017-11-27 17:26:45 -06:00
Timothy Arceri	3e789026ca	st/glsl_to_tgsi: make use of driver_cache_blob with the disk cache driver_cache_blob was introduced with the i965 disk cache, it allows us to simplify the cache a little and possibly offers some minor speed improvements since we load the GLSL metadata and TGSI from disk in one pass. Using driver_cache_blob should also make it straight forward to implement binary support for ARB_get_program_binary in gallium. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:01:44 +11:00
Gwan-gyeong Mun	4cb27047c8	glsl: Fix typo nagivation -> navigation Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-28 08:48:55 +11:00
Emil Velikov	c7616ac069	gl_table.py: add extern C guard for the generated glapitable.h The header can be included from C++, hence contents should have appropriate notation. Cc: mesa-stable@lists.freedesktop.org Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-27 19:23:05 +00:00
Marek Olšák	6b8909f2d1	ac: pack legacy_surf_level better r600_texture: 1488 -> 1248 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:46:16 +01:00
Marek Olšák	ec15ff78c3	ac: change legacy_surf_level::slice_size to dword units The next commit will reduce the size even more. v2: typecast to uint64_t manually v3: add more typecasts, add asserts Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:44:04 +01:00
Marek Olšák	474b4a9191	ac: pack ac_surface better r600_texture: 1736 -> 1488 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:12:38 +01:00
Marek Olšák	b5444877c0	radeonsi: always initialize max_forced_staging_uploads r600_resource is malloc'd. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103808 Fixes: `4b0dc098b2` ("gallium/u_threaded: don't map big VRAM buffers for the first upload directly") Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:12:38 +01:00
Marek Olšák	95cd74abd4	radeonsi: remove an old hack for evergreen Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:12:38 +01:00
Marek Olšák	1cb731012c	radeonsi: set COMPUTE_RESOURCE_LIMITS.FORCE_SIMD_DIST when profitable ported from Vulkan Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:12:38 +01:00
Dave Airlie	043d14db30	ac/nir: don't write tcs outputs to LDS that aren't read back. If the TCS doesn't read back the outputs, no need to store them to LDS in the first place. (except for tess factors). This seems to give about 50fps (3290->3330) with tessellation demo. I haven't tested if it impacts DoW3 at all. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-27 13:50:24 +10:00
Dave Airlie	33dca36f4f	nir: fill outputs_read field and add patch outputs read (v2) This is to be used for TCS optimisations on radv. v2: don't set written on reads (nha) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-27 13:50:03 +10:00
Dave Airlie	fd301472bd	r600/eg: dump event type in dumps This just makes it easier to debug some things. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-27 12:53:18 +10:00
Tobias Klausmann	068a72fbcb	nouveau/compiler: Allow to omit line numbers when printing instructions This comes in handy when checking "NV50_PROG_DEBUG=1" outputs with diff! V2: - Use environmental variable (Karol Herbst) V3: - Use the already populated nv50_ir_prog_info to forward information to the print pass (Pierre Moreau) V4: - get rid of default value in PrintPass constructor Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-11-26 12:51:30 -05:00
Nicolai Hähnle	0fed7f83ba	radeonsi: try flushing unflushed fences in si_fence_finish even when timeout == 0 Under certain conditions, waiting on a GL sync object should act like a flush, regardless of the timeout. Portal 2, CS:GO, and presumably other Source engine games rely on this behavior and hang during loading without this fix. Fixes: `bc65dcab3b` ("radeonsi: avoid syncing the driver thread in si_fence_finish") Signed-off-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103902 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103904	2017-11-26 16:53:00 +01:00
Ilia Mirkin	0bd83d0461	nv50/ir: move LateAlgebraicOpt to the very end Memory loads can take offsets, but the SHLADD will often attempt to consume the offsets too. As there may be multiple memory loads with the same base but different offsets, those would end up in a SHLADD instead of the offset of the memory operation. This moves the pass after we've had a chance to attempt to propagate immediate adds into the indirect offset. total instructions in shared programs : 6580681 -> 6567716 (-0.20%) total gprs used in shared programs : 944261 -> 943375 (-0.09%) total shared used in shared programs : 0 -> 0 (0.00%) total local used in shared programs : 15328 -> 15328 (0.00%) total bytes used in shared programs : 60339896 -> 60221504 (-0.20%) local shared gpr inst bytes helped 0 0 555 2698 2698 hurt 0 0 138 336 336 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-11-26 01:10:19 -05:00
Ilia Mirkin	3072bbef63	nv50/ir: when merging immediates/consts, load directly When a MERGE operation gets its constraint moves added, it substantially extends live ranges to be reusing an immediate from earlier in the program (not to mention the silliness of loading an immediate into a register, and then moving into another register). We detect these scenarios and insert moves that take the immediate or constbuf load directly into the register. If it's the last use, then we can just move that operation to the closer location. With SM35 (255 regs) we get these results: total instructions in shared programs : 6583670 -> 6580681 (-0.05%) total gprs used in shared programs : 950818 -> 944261 (-0.69%) total shared used in shared programs : 0 -> 0 (0.00%) total local used in shared programs : 15328 -> 15328 (0.00%) total bytes used in shared programs : 60367456 -> 60339896 (-0.05%) local shared gpr inst bytes helped 0 0 4584 3186 3186 hurt 0 0 55 968 968 I suspect they will be better for SM20 and SM30. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-11-26 01:10:19 -05:00
Ilia Mirkin	50e913b9c5	nv50/ir: add optimization for modulo by a non-power-of-2 value We can still use the optimized division methods which make use of multiplication with overflow. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2017-11-26 01:10:03 -05:00
Ilia Mirkin	3079993727	nv50/ir: optimize signed integer modulo by pow-of-2 It's common to use signed int modulo in GLSL. As it happens, the GLSL specs allow the result to be undefined, but that seems fairly surprising. It's not that much more effort to get it right, at least for positive modulo operators. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-11-25 22:48:09 -05:00
Matt Turner	676761252b	util: Just give up and define PIPE_ARCH_LITTLE_ENDIAN on MSVC MSVC doesn't support #warning?! Getting really tired of this.	2017-11-25 16:46:00 -08:00
Andres Gomez	5fa589148a	docs: remove bug 103626 from fix list as per 17.2.6 Bug https://bugs.freedesktop.org/show_bug.cgi?id=103626 was incorrectly listed as fixed. Signed-off-by: Andres Gomez <agomez@igalia.com> (cherry picked from commit b9b60dbf55a1307a60a333c70c3add3643243c36)	2017-11-26 02:18:08 +02:00
Matt Turner	b8cbad624b	util: Use preprocessor correctly Fixes: `6a353479a7` ("util: Assume little endian in the absence of platform-specific handling")	2017-11-25 15:57:37 -08:00
Andres Gomez	63d488d10c	docs: update calendar, add news item and link release notes for 17.2.6 Signed-off-by: Andres Gomez <agomez@igalia.com>	2017-11-26 01:46:25 +02:00
Andres Gomez	b0049428b5	docs: add sha256 checksums for 17.2.6 Signed-off-by: Andres Gomez <agomez@igalia.com> (cherry picked from commit 93c2beafc0a7fa2f210b006d22aba61caa71f773)	2017-11-26 01:42:16 +02:00
Andres Gomez	e6acc4d528	docs: add release notes for 17.2.6 Signed-off-by: Andres Gomez <agomez@igalia.com> (cherry picked from commit 00b52f8e99653316a090826914509a138a1c78f7)	2017-11-26 01:42:15 +02:00
Ilia Mirkin	f39a91c152	freedreno/a4xx: add ARB_framebuffer_no_attachments support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 17:20:17 -05:00
Ilia Mirkin	4f748d12e8	freedreno/a4xx: add indirect draw support This is a copy of the a5xx logic. Fails a few tests, but basic functionality is there. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 17:20:17 -05:00
Ilia Mirkin	c3c8d48725	freedreno: regenerate pm4 header, adjust code for new names Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 17:20:17 -05:00
Ilia Mirkin	ffdcd51e66	freedreno/a4xx: add stencil texturing support Copied from a5xx, should be identical. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 17:20:17 -05:00
Ilia Mirkin	86f12e9377	freedreno/ir3: add a pass to lower tg4 to txl, enable gather on a4xx Unfortunately Adreno A4xx hardware returns incorrect results with the GATHER4 opcodes. As a result, we have to lower to 4 individual texture calls (txl since we have to force lod to 0). We achieve this using offsets, including on cube maps which normally never have offsets. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 16:56:59 -05:00
Ilia Mirkin	ab336e8b46	nir: allow texture offsets with cube maps GL doesn't have this, but some hardware supports it. This is convenient for lowering tg4 to plain texture calls, which is necessary on Adreno A4xx hardware. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 16:56:30 -05:00
Matt Turner	c690a7a8cd	util: Fix disk_cache index calculation on big endian The cache-test test program attempts to create a collision (using key_a and key_a_collide) by making the first two bytes identical. The idea is fine -- the shader cache wants to use the first four characters of a SHA1 hex digest as the index. The following program unsigned char array[4] = {1, 2, 3, 4}; int *ptr = (int *)array; for (int i = 0; i < 4; i++) { printf("%02x", array[i]); } printf("\n"); printf("%08x\n", *ptr); prints 01020304 04030201 on little endian, and 01020304 01020304 on big endian. On big endian platforms reading the character array back as an int (as is done in disk_cache.c) does not yield the same results as reading the byte array. To get the first four characters of the SHA1 hex digest when we mask with CACHE_INDEX_KEY_MASK, we need to byte swap the int on big endian platforms. Bugzilla: https://bugs.freedesktop.org/103668 Bugzilla: https://bugs.gentoo.org/637060 Bugzilla: https://bugs.gentoo.org/636326 Fixes: `87ab26b2ab` ("glsl: Add initial functions to implement an on-disk cache") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-25 12:30:46 -08:00
Matt Turner	513d7ffa23	util: Add a SHA1 unit test program Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-25 12:30:46 -08:00
Matt Turner	532674303a	util: Fix SHA1 implementation on big endian The code defines a macro blk0(i) based on the preprocessor condition BYTE_ORDER == LITTLE_ENDIAN. If true, blk0(i) is defined as a byte swap operation. Unfortunately, if the preprocessor macros used in the test are not defined, then the comparison becomes 0 == 0 and it evaluates as true. Fixes: `d1efa09d34` ("util: import sha1 implementation from OpenBSD") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-25 12:30:46 -08:00
Matt Turner	6a353479a7	util: Assume little endian in the absence of platform-specific handling	2017-11-25 12:30:46 -08:00
Marek Olšák	78942e7dbf	mesa: shrink VERT_ATTRIB bitfields to 32 bits There are only 32 vertex attribs now. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-11-25 17:18:22 +01:00
Marek Olšák	43abaf2ad0	mesa: remove unused vertex attrib WEIGHT We don't support ARB_vertex_blend. Note that the attribute aliasing check for ARB_vertex_program had to be rewritten. vbo_context: 20344 -> 20008 bytes gl_context: 74672 -> 74616 bytes Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-11-25 17:17:52 +01:00
Marek Olšák	2116b97418	mesa: don't assign numbers to vertex attrib enums manually I plan to remove one of them. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-11-25 17:17:52 +01:00
Marek Olšák	bd57f45168	gallium/hud: add HUD sharing within a context share group This is needed for profiling multi-context applications like Chrome. One context can record queries and another context can draw the HUD. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	11e25eb7f4	gallium/hud: update the HUD interface for multiple contexts This is the boring subset of the following commit. All new parameters are optional. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	9c5b4eb6b4	gallium/hud: prevent a crash if the recording context is inactive Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	37ded08321	gallium/hud: separate code for record context init/release Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	fc07acc21e	gallium/hud: separate code for draw context init/release Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	8caf7d51a9	gallium/hud: don't use hud->pipe in hud_parse_env_var Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	65433c3fd0	gallium/hud: use cso_get_pipe_context Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
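
The byte-order problem described in commit c690a7a8cd can be illustrated without the pointer cast. The following is a minimal sketch, not the code in disk_cache.c: the names `read_native_u32`, `read_le_u32`, `example_cache_index`, and `EXAMPLE_INDEX_MASK` are hypothetical, and the 16-bit mask is an assumption chosen only so that the index depends on the first two key bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mask for this sketch: keep the low 16 bits of the value. */
#define EXAMPLE_INDEX_MASK 0xffffu

/* Read the first four key bytes as a native-endian 32-bit integer.
 * memcpy sidesteps the alignment and aliasing issues of an int* cast.
 * The result differs between little- and big-endian hosts. */
static uint32_t read_native_u32(const unsigned char *key)
{
    uint32_t v;
    assert(key != NULL);
    memcpy(&v, key, sizeof v);
    return v;
}

/* Assemble the same four bytes explicitly as a little-endian value.
 * The result is identical on every host, so no byte swap is needed. */
static uint32_t read_le_u32(const unsigned char *key)
{
    assert(key != NULL);
    return ((uint32_t)key[0])
         | ((uint32_t)key[1] << 8)
         | ((uint32_t)key[2] << 16)
         | ((uint32_t)key[3] << 24);
}

/* Index derived from the first two key bytes, independent of host order. */
static uint32_t example_cache_index(const unsigned char *key)
{
    return read_le_u32(key) & EXAMPLE_INDEX_MASK;
}
```

With the key bytes {1, 2, 3, 4}, `read_native_u32` yields 0x04030201 on a little-endian host and 0x01020304 on a big-endian host, while `read_le_u32` yields 0x04030201 on both, so two keys that share their first two bytes always map to the same `example_cache_index`.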

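The preprocessor pitfall described in commit 532674303a follows from a C rule: in an `#if` expression, identifiers that are not defined as macros are replaced by 0. The sketch below is a hedged illustration rather than the Mesa code. `EXAMPLE_BYTE_ORDER` and `EXAMPLE_LITTLE_ENDIAN` are hypothetical names that are deliberately left undefined, and the helper functions exist only for this example.

```c
#include <assert.h>

/* Naive check: both names are undefined, so this becomes 0 == 0 (true). */
#if EXAMPLE_BYTE_ORDER == EXAMPLE_LITTLE_ENDIAN
#define NAIVE_CHECK_SAYS_LITTLE 1
#else
#define NAIVE_CHECK_SAYS_LITTLE 0
#endif

/* Guarded check: require both macros to exist before comparing them. */
#if defined(EXAMPLE_BYTE_ORDER) && defined(EXAMPLE_LITTLE_ENDIAN) && \
    EXAMPLE_BYTE_ORDER == EXAMPLE_LITTLE_ENDIAN
#define GUARDED_CHECK_SAYS_LITTLE 1
#else
#define GUARDED_CHECK_SAYS_LITTLE 0
#endif

/* Expose the preprocessor results as ordinary values. */
static int naive_check(void)   { return NAIVE_CHECK_SAYS_LITTLE; }
static int guarded_check(void) { return GUARDED_CHECK_SAYS_LITTLE; }

/* The naive test silently succeeds; the guarded test does not. */
static void check_pitfall(void)
{
    assert(naive_check() == 1);
    assert(guarded_check() == 0);
}
```

Building with a warning such as GCC's `-Wundef` reports the undefined identifiers in the naive test, which is one way to catch this class of mistake early.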
98046 Commits