KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Marek Olšák	dea6fdadca	winsys/radeon: use pb_cache buckets for fewer pb_cache misses This makes Bioshock Infinite with deferred flushing 2.2% faster. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	8d5944199d	gallium/pb_cache: reduce the number of pointer dereferences Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	3cdc0e133f	gallium/pb_cache: divide the cache into buckets for reducing cache misses Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	fec7f74129	gallium/pb_cache: check parameters that are more likely to fail first This makes Bioshock Infinite with deferred flushing 2% faster. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	2596ae2b6e	radeonsi: emit PS exports last This effectively removes s_waitcnt instructions after FP16 exports. Before: v_cvt_pkrtz_f16_f32_e32 v0, v0, v1 ; 5E000300 v_cvt_pkrtz_f16_f32_e32 v1, v2, v3 ; 5E020702 exp 15, 0, 1, 0, 0, v0, v1, v0, v0 ; F800040F 00000100 s_waitcnt expcnt(0) ; BF8C0F0F v_cvt_pkrtz_f16_f32_e32 v0, v4, v5 ; 5E000B04 v_cvt_pkrtz_f16_f32_e32 v1, v6, v7 ; 5E020F06 exp 15, 1, 1, 0, 0, v0, v1, v0, v0 ; F800041F 00000100 s_waitcnt expcnt(0) ; BF8C0F0F v_cvt_pkrtz_f16_f32_e32 v0, v8, v9 ; 5E001308 v_cvt_pkrtz_f16_f32_e32 v1, v10, v11 ; 5E02170A exp 15, 2, 1, 0, 0, v0, v1, v0, v0 ; F800042F 00000100 s_waitcnt expcnt(0) ; BF8C0F0F v_cvt_pkrtz_f16_f32_e32 v0, v12, v13 ; 5E001B0C v_cvt_pkrtz_f16_f32_e32 v1, v14, v15 ; 5E021F0E exp 15, 3, 1, 1, 1, v0, v1, v0, v0 ; F8001C3F 00000100 s_endpgm ; BF810000 After: v_cvt_pkrtz_f16_f32_e32 v0, v0, v1 ; 5E000300 v_cvt_pkrtz_f16_f32_e32 v1, v2, v3 ; 5E020702 v_cvt_pkrtz_f16_f32_e32 v2, v4, v5 ; 5E040B04 v_cvt_pkrtz_f16_f32_e32 v3, v6, v7 ; 5E060F06 exp 15, 0, 1, 0, 0, v0, v1, v0, v0 ; F800040F 00000100 v_cvt_pkrtz_f16_f32_e32 v4, v8, v9 ; 5E081308 v_cvt_pkrtz_f16_f32_e32 v5, v10, v11 ; 5E0A170A exp 15, 1, 1, 0, 0, v2, v3, v0, v0 ; F800041F 00000302 v_cvt_pkrtz_f16_f32_e32 v6, v12, v13 ; 5E0C1B0C v_cvt_pkrtz_f16_f32_e32 v7, v14, v15 ; 5E0E1F0E exp 15, 2, 1, 0, 0, v4, v5, v0, v0 ; F800042F 00000504 exp 15, 3, 1, 1, 1, v6, v7, v0, v0 ; F8001C3F 00000706 s_endpgm ; BF810000 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	b2b45cecef	radeonsi: set optimal settings in COMPUTE_RESOURCE_LIMITS ported from Vulkan Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	ad70c3954b	radeonsi: really wait for the second EOP event and not the first one Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	1a1cc67edd	gallium/radeon: remove RADEON_FLUSH_KEEP_TILING_FLAGS flag always set Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Ian Romanick	0b626d7524	nir/algebraic: Optimize fabs(u2f(x)) I noticed this when I tried to do frexp(float(some_unsigned)) in the ir_unop_find_lsb lowering pass. The code generated for frexp() uses fabs, and this resulted in an extra instruction. Ultimately I ended up not using frexp. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:30 -07:00
Ian Romanick	94296be276	st/mesa: Enable MESA_shader_integer_functions on all GLSL 1.30 platforms Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:30 -07:00
Ian Romanick	7cb49b1bd7	i965: Enable MESA_shader_integer_functions on all GLSL 1.30 platforms Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	5726e57f13	i965: Don't lower uaddCarry and usubBorrow in both GLSL IR and NIR Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	d7a47a76e0	i965: Update assertion to account for Gen < 7 Previously SHADER_OPCODE_MULH could only exist on Gen7+, so the assertion assumed the Gen7+ accumulator rules. A future patch will allow this instruction on at least Gen6, so update the assertion. v2: Use get_lowered_simd_width instead of open coding it. Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]	2016-07-19 12:19:29 -07:00
Ian Romanick	3e7cebc8da	i965: Use LZD to implement nir_op_find_lsb on Gen < 7 v2: Rebase on changes to previous two patches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	c2019c6c26	i965: Use LZD to implement nir_op_ifind_msb on Gen < 7 v2: Retype LZD source as UD to avoid potential problems with 0x80000000. Suggested by Matt. Also update comment about problem values with LZD(abs(x)). Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	de20086eed	i965: Use LZD to implement nir_op_ufind_msb This uses one less instruction. v2: Move emit_find_msb_using_lzd out of the visitor classes. Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	26c7f04d4a	i965: Always enable GL_ARB_shading_language_packing With the existing lowering passes, the functions from this extension become a bunch of bit twiddling operations that have always been supported. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	4b2b6d4d4d	i965: Move enable of EXT_shader_integer_mix This extension does not depend on the Gen. It only depends on the availability of GLSL 1.30. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	a2379e44aa	glsl: Add lowering pass for ir_bin_imul_high This isn't the lowering pass you want. Most GPUs that can support GLSL 1.30 have a multiply unit that can do something more interesting than 32x32->32. Many have 32x16->48. Any GPU that does, should do the lowering in the backend. This is just the thing that will always work. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	1b5477668a	glsl: Add lowering pass for ir_unop_find_msb Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	2a381a3c73	glsl: Add lowering pass for ir_unop_find_lsb Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	ad9acb19c3	glsl: Add lowering pass for ir_unop_bitfield_reverse Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	3079dcb00c	glsl: Add lowering pass for ir_quadop_bitfield_insert Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	4d6d219b58	glsl: Add lowering pass for ir_triop_bitfield_extract Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	7340be8a01	glsl: Add lowering pass for ir_unop_bit_count Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	806add360f	MESA_shader_integer_functions: Allow new function overload matching rules Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	90537e1a0e	MESA_shader_integer_functions: Allow implicit int->uint conversions Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	65b0346fdb	MESA_shader_integer_functions: Expose new built-in functions Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	15c4ae461d	MESA_shader_integer_functions: Boiler plate extension tracking Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	91482ef226	MESA_shader_integer_functions: Add extension specification v2: Fix typo in #extension line noticed by Ken. v3: Update spec status. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:15 -07:00
Samuel Pitoiset	9c63224540	gm107/ir: make use of ADD32I for all immediates ADD only allows to emit 19-bits immediates. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-07-19 18:07:15 +02:00
Samuel Pitoiset	0904a2ba97	gm107/ir: add missing NEG modifier for IADD32I Like FADD32I, the NEG modifier of src0 is at position 56. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-07-19 18:07:10 +02:00
Andreas Boll	c482decd4d	ddebug: Fix trivial typo in stderr message Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>	2016-07-19 16:04:40 +02:00
Andreas Boll	d66cb7c84f	configure.ac: Use ${datarootdir} for --with-vulkan-icddir help string too The help string wasn't updated in `cbc37f7`. Fixes: `cbc37f7` ("anv: install the intel_icd.json to ${datarootdir} by default") Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-07-19 16:04:01 +02:00
Eric Engestrom	8ba46fbd9e	vl: fix memory leak CovID: 1363008 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-19 12:41:00 +02:00
Boyuan Zhang	60c7450f16	vl: add entry point Add entrypoint to distinguish H.264 decode and encode. For example, in patch 5/11 when calling "VaCreateContext", "pps" and "sps" shouldn't be allocated for H.264 encoding. So we need to use the entry_point to determine whether this is H.264 decode or H.264 encode. We can use config to determine the entrypoint since config_id is passed to us for the VaCreateContext call. However, for the VaDestroyContext call, only context_id is passed to us. So we need to know the entrypoint in order to not free the pps/sps for the encoding case. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-19 12:36:46 +02:00
Ilia Mirkin	ed9dd3bcd9	nv50,nvc0: srgb rendering is only available for rgba/bgra Mark both L8_SRGB and L8A8_SRGB as non-renderable (the latter already didn't have the bind flags). This makes the state tracker pick a different format when rendering is required, or mark the fb as incomplete. This fixes: bin/getteximage-formats init-by-clear-and-render -auto -fbo bin/getteximage-formats init-by-rendering -auto -fbo which previously ran into srgb-encoding differences. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-07-18 20:04:17 -04:00
Ilia Mirkin	8e7893eb53	nvc0: add support for BGRA8 images This is useful for pbo downloads, which are now accelerated with images. BGRA8 is a moderately common format to do that in. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-18 20:04:17 -04:00
Jason Ekstrand	905d7dc4d1	i965: Skip update_texture_surface when the plane doesn't exist Thanks to rebase fail, recent surface state changes (commits `7e951cd56`, `8521ce1a7`, and `69c0dc5c53`) effectively reverted `727a9b2493` and `367cf3a2e3` which was unintentional. This should bring it back. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-07-18 16:44:29 -07:00
Timothy Arceri	cd5cbf0f6b	glsl: use linked shaders rather than compiled shaders At this point there is no reason not to be using the linked shaders, using the linked shaders should be faster and will make things simpler for upcoming shader cache work. The previous variable name suggests the linked shaders were intended to be used here anyway. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-07-19 09:42:00 +10:00
Lars Hamre	198074a41c	The extension is already exposed, this simply marks it as done. Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-07-19 01:20:27 +02:00
Anuj Phogat	22935a3040	docs: Fix typo in extension name Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-18 15:53:24 -07:00
Anuj Phogat	7832e18879	docs: Add support for GL_KHR_texture_compression_astc_sliced_3d Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-18 15:44:18 -07:00
Anuj Phogat	c7b787ef90	Revert "docs: Mark KHR_texture_compression_astc_sliced_3d done on i965" This reverts commit `82f8c23950`. KHR_texture_compression_astc_sliced_3d is not a requirement for GLES 3.2. Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-18 15:43:58 -07:00
Anuj Phogat	82f8c23950	docs: Mark KHR_texture_compression_astc_sliced_3d done on i965 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-07-18 14:39:54 -07:00
Anuj Phogat	ac0eb36d8e	i965/gen9: Enable KHR_texture_compression_astc_sliced_3d Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-07-18 14:39:54 -07:00
Anuj Phogat	15dea5ca82	mesa: Add the infrastructure for KHR_texture_compression_astc_sliced_3d V2: Drop the changes to gl.xml. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-07-18 14:39:54 -07:00
Christian König	3e1ad846f9	radeon/uvd: add session context buffer for polaris 10/11 v2 This way we have unlimited UVD sessions. v2: only enable it when kernel supports it as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-07-18 17:13:17 +02:00
Leo Liu	134d6e4e4f	vl/dri3: fix a memory leak from front buffer Inspired by fix for mem leak of vdpau interop, resource_from_handle set texture reference count, that need to be decreased and released, recall there is a similar case for DRI3, that is with VA-API glx extension, there is temporary TFP(texture from pixmap), we target it through dma-buf. leak happens when without count down the reference. Checked and found with mpv vo=opengl case, there only one static TFP, the leak happens once, but for totem player using gstreamer VA-API glx, the dynamic TFP for each frame, so leak quite a bit. This fixes mem leak for mpv and totem. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-18 09:20:40 -04:00
Iago Toral Quiroga	0f2516d88f	i965/tes/scalar: fix 64-bit indirect input loads We totally ignored this before because there were no piglit tests for indirect loads in tessellation stages with doubles. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-18 09:53:51 +02:00
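
Several of the i965 commits above (`de20086eed`, `c2019c6c26`, `3e7cebc8da`) describe implementing the find-MSB operations with a leading-zero-detect (LZD) instruction. The following is a minimal sketch of that idea in plain C, not the driver code itself. The helper names `lzd32`, `ufind_msb`, and `ifind_msb` are hypothetical, `lzd32` is a software stand-in assumed to return 32 for an input of 0, and the signed variant uses a bitwise complement for negative inputs rather than the `LZD(abs(x))` form mentioned in the commit message.

```c
#include <stdint.h>

/* Hypothetical software stand-in for a leading-zero-detect instruction.
 * Assumption: returns the count of leading zero bits, and 32 when x == 0. */
static uint32_t lzd32(uint32_t x)
{
    uint32_t n = 0;

    if (x == 0)
        return 32;

    /* Shift left until the top bit is set, counting the shifts. */
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Unsigned find-MSB: index of the highest set bit, or -1 when x == 0.
 * With lzd32(0) == 32 the zero case falls out of the subtraction. */
static int32_t ufind_msb(uint32_t x)
{
    return 31 - (int32_t)lzd32(x);
}

/* Signed find-MSB with GLSL findMSB semantics: for negative inputs the
 * index of the highest zero bit, and -1 for both 0 and -1.
 * Complementing a negative value turns "highest zero bit" into
 * "highest set bit", so the unsigned helper can be reused. */
static int32_t ifind_msb(int32_t x)
{
    uint32_t u = (uint32_t)x;

    if (x < 0)
        u = ~u;
    return ufind_msb(u);
}
```

The point of the sketch is that one subtraction from 31 covers every input, including zero, provided the leading-zero count is 32 for a zero input.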


83359 Commits All Branches Search
