KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Nicolai Hähnle	81d7577d48	ddebug: add driver log to record dumps Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-22 09:50:44 +02:00
Nicolai Hähnle	1966d9ff41	gallium: add pipe_context::set_log_context Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-22 09:50:42 +02:00
Nicolai Hähnle	177144cefc	util/log: add auto logger facility Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-22 09:50:40 +02:00
Nicolai Hähnle	1cc2fd57d1	util: add chunk logging module Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-22 09:50:31 +02:00
Ian Romanick	b3a481779b	glsl/linker: Make several functions not static copy_constant_to_storage, set_uniform_initializer, populate_consumer_input_sets, and get_matching_input are all used by tests in src/compiler/glsl/tests: glsl/tests/varyings_test.o: In function `link_varyings_single_simple_input_Test::TestBody()': src/compiler/glsl/tests/varyings_test.cpp:131: undefined reference to `linker::populate_consumer_input_sets(void, exec_list, hash_table, hash_table, ir_variable*)' glsl/tests/varyings_test.o: In function `link_varyings_gl_ClipDistance_Test::TestBody()': src/compiler/glsl/tests/varyings_test.cpp:159: undefined reference to `linker::populate_consumer_input_sets(void, exec_list, hash_table, hash_table, ir_variable)' glsl/tests/varyings_test.o: In function `link_varyings_gl_CullDistance_Test::TestBody()': src/compiler/glsl/tests/varyings_test.cpp:186: undefined reference to `linker::populate_consumer_input_sets(void, exec_list, hash_table, hash_table, ir_variable)' glsl/tests/varyings_test.o: In function `link_varyings_single_interface_input_Test::TestBody()': src/compiler/glsl/tests/varyings_test.cpp:208: undefined reference to `linker::populate_consumer_input_sets(void, exec_list, hash_table, hash_table, ir_variable)' glsl/tests/varyings_test.o: In function `link_varyings_one_interface_and_one_simple_input_Test::TestBody()': src/compiler/glsl/tests/varyings_test.cpp:241: undefined reference to `linker::populate_consumer_input_sets(void, exec_list, hash_table, hash_table, ir_variable)' glsl/tests/varyings_test.o:src/compiler/glsl/tests/varyings_test.cpp:272: more undefined references to `linker::populate_consumer_input_sets(void, exec_list, hash_table, hash_table, ir_variable)' follow glsl/tests/varyings_test.o: In function `link_varyings_interface_field_doesnt_match_noninterface_Test::TestBody()': src/compiler/glsl/tests/varyings_test.cpp:289: undefined reference to `linker::get_matching_input(void, ir_variable const, hash_table, hash_table, ir_variable)' glsl/tests/varyings_test.o: In function `link_varyings_interface_field_doesnt_match_noninterface_vice_versa_Test::TestBody()': src/compiler/glsl/tests/varyings_test.cpp:314: undefined reference to `linker::populate_consumer_input_sets(void, exec_list, hash_table, hash_table, ir_variable)' src/compiler/glsl/tests/varyings_test.cpp:328: undefined reference to `linker::get_matching_input(void, ir_variable const, hash_table, hash_table, ir_variable*)' Fixes: `ca73c3358c` ("glsl: Mark functions static") Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-08-22 17:43:40 +10:00
Jason Ekstrand	0ae9ce0f29	i965/clear: Quantize the depth clear value based on the format In `f9fd976e8a` we changed the clear value to be stored as an isl_color_value. This had the side effect that the same-clear-value check now happens directly between the f32[0] field of the isl_color_value and ctx->Depth.Clear. This isn't what we want for two reasons. One is that the comparison happens in floating point even for Z16 and Z24 formats. Worse than that, ctx->Depth.Clear is a double so, even for 32-bit float formats, we were comparing as doubles and not floats. This means that the test basically always fails for anything other than 0.0f and 1.0f. This caused a slight performance regression in Lightsmark 2008 because it was using a depth clear value of 0.999, which can't be represented exactly in a 32-bit float, so we were doing unneeded resolves. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/101678 Cc: "17.2" <mesa-stable@lists.freedesktop.org>	2017-08-21 22:18:53 -07:00
Timothy Arceri	3c9ed70d92	mesa/st: simplify some UBO index logic Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-22 13:32:21 +10:00
Timothy Arceri	36431cf979	i965: enable STD430 packing by default on IVB+ Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-08-22 11:29:27 +10:00
Timothy Arceri	4c2422067b	glsl: pass UseSTD430AsDefaultPacking to where it will be used Here we also make use of the UseSTD430AsDefaultPacking constant and call the new get_internal_ifc_packing() helper. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-22 11:29:27 +10:00
Timothy Arceri	12e1f0c696	glsl: add get_internal_ifc_packing() type helper This is used to avoid code duplication when selecting the packing type for shared and packed layouts. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-22 11:29:27 +10:00
Timothy Arceri	334a27afa7	mesa: add UseSTD430AsDefaultPacking constant This will be used to enable the STD430 layout as the default for UBOs and SSBOs with layouts of shared/packed rather than STD140. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-22 11:29:27 +10:00
Aaron Watry	5e253fe338	clover/device: Calculate CL_DEVICE_MEM_BASE_ADDR_ALIGN in device The CL CTS queries CL_DEVICE_MEM_BASE_ADDR_ALIGN for a device and then allocates user pointers aligned to that value for its tests. The minimum value is defined as: the size (in bits) of the largest OpenCL built-in data type supported by the device (long16 in FULL profile, long16 or int16 in EMBEDDED profile) for devices that are not of type CL_DEVICE_TYPE_CUSTOM. At the moment, all known devices that support user pointers require CPU page alignment for buffers created from user pointers, so just query that from sysconf. v3: Use std::max instead of MAX2 (Francisco) Add missing unistd include v2: Use system page size instead of a new pipe cap Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by (v2): Jan Vesely <jan.vesely@rutgers.edu>	2017-08-21 20:21:52 -05:00
Brian Paul	19e9bd4c11	mesa: optimize _mesa_attr_zero_aliases_vertex() After the context is initialized, the API and context flags won't change. So, we can compute whether vertex attribute 0 aliases vertex position just once. This should make the glVertexAttrib*() functions a little quicker. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-21 19:04:51 -06:00
Brian Paul	0ef5aa4128	vbo: use new _is_vertex_position() helper in vbo_attrib_tmp.h Makes the code a bit more understandable. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-21 19:04:51 -06:00
Brian Paul	1850256172	vbo: make vbo_bind_arrays() static Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-21 19:04:51 -06:00
Brian Paul	4d2b21a326	svga: replace gotos with conditionals in array drawing code No Piglit regressions. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-08-21 19:04:51 -06:00
Brian Paul	d50b8b91d7	llvmpipe: add some whitespace between functions in lp_texture.c Trivial.	2017-08-21 19:04:51 -06:00
Brian Paul	84509779a9	mesa: formatting clean-up in syncobj.c Line wrap to 78 columns, etc. Trivial.	2017-08-21 19:04:51 -06:00
Brian Paul	196a0b28a0	svga: whitespace clean-up in svga_draw_private.h Trivial.	2017-08-21 19:04:51 -06:00
Timothy Arceri	6fceace7bf	gallium/docs: remove old llvmpipe TODO Features are already covered by features.txt like all the other drivers. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-08-22 11:03:08 +10:00
Timothy Arceri	a4635c84dc	mesa: fix ES only draw if we have vertex positions This code was separated from the validation code so it could be used with KHR_no_error paths. The return values were inverted to reflect the name of the helper, but here the condition was mistakenly inverted rather than the return value. Fixes: `4df2931a87` (mesa/vbo: move some Draw checks out of validation) Reported-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-22 10:42:18 +10:00
Matt Turner	91b8d874da	glsl: Add prototype for udivmod64() Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-08-21 14:45:44 -07:00
Matt Turner	ca73c3358c	glsl: Mark functions static Cuts 3224 bytes of .text Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-08-21 14:45:44 -07:00
Matt Turner	d37d9f84ac	i965: Mark functions static Cuts 300 bytes of .text Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-08-21 14:45:44 -07:00
Matt Turner	f30902629c	i965/vec4: Use 'class' src_reg, rather than 'struct' src_reg Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-08-21 14:45:44 -07:00
Matt Turner	a77d5b28ac	i965/vec4: Return float from spill_cost_for_type() Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-08-21 14:45:44 -07:00
Matt Turner	76f36607b0	anv: Move clamp_int64() inside the IVB check It's only used in the gen7_cmd_buffer_emit_scissor() function. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-08-21 14:45:44 -07:00
Matt Turner	ee2f7aa03b	glsl: Remove unused private fields Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-08-21 14:45:44 -07:00
Matt Turner	384e27174d	mesa: Don't compare unsigned for < 0 The INTEL_performance_query spec says "Performance counter id 0 is reserved as an invalid counter." GLuint counterid_to_index(GLuint counterid) just returns counterid - 1, so with unsigned overflow rules, it will generate 0xFFFFFFFF given an input of 0. 0xFFFFFFFF will trigger the counterIndex >= queryNumCounters check, so the code worked as is. It just contained a useless comparison. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-08-21 14:45:44 -07:00
Matt Turner	4e97084591	egl: Fix inclusion of egl.h+mesa_glinterop.h Previously clang would warn about redefinition of typedef EGLDisplay. Avoid this by adding preprocessor guards to mesa_glinterop.h and including it after EGL.h is indirectly included. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-08-21 14:45:44 -07:00
Marek Olšák	db039d67aa	radeonsi: don't prefetch VBO descriptors if vertex elements == NULL Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-21 23:06:42 +02:00
Marek Olšák	ea1b97714d	r600g: don't set up and don't call the fetch shader if there are no VS inputs	2017-08-21 23:06:42 +02:00
Matt Turner	a98b1a8922	i965: Optimize reading the destination type brw_hw_type_to_reg_type() needs to know only whether the file is BRW_IMMEDIATE_VALUE or not, which is not a valid file for the destination. gcc and clang will evaluate __builtin_strcmp() at compile time, so we can use it to pass a constant file for the destination. text data bss dec hex filename 7816214 346248 420496 8582958 82f72e i965_dri.so before 7816070 346248 420496 8582814 82f69e i965_dri.so after Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	91ef949054	i965: Mark brw_hw_type_to_reg_type() as a pure function text data bss dec hex filename 7816886 346248 420496 8583630 82f9ce i965_dri.so before 7816214 346248 420496 8582958 82f72e i965_dri.so after Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	e07fe89035	i965: Hide the register type hardware encodings So we stop mixing them with the logical enum. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	4fab67a441	i965: Stop using hardware register types directly Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	c746f1c888	i965: Add brw_hw_reg_type_to_letters() and use it in brw_disasm.c Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	6a2471b501	i965: Move brw_reg_type_letters() as well And add "to_" to the name for consistency with the other functions in this file. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	1cb0a7941b	i965: Switch to using the logical register types Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	cb2cd462b1	i965: Add functions to abstract access to register types Previously the brw_inst{,_set}_{dst,src0,src1}_reg_type() functions provided access to the hardware encodings for the register types. We often mixed these with the logical BRW_REGISTER_TYPE_* enums (which themselves used to be the hardware format!) with bad results. With that functionality now available with the hw_ versions (see previous commit), we now add functions that take the logical BRW_REGISTER_TYPE_* enums and convert into the hardware format and vice versa. To do the conversion we also have to provide the file. Note the asymmetry between the two functions: the new getter reads the file from the instruction word, and to ensure that is always set the setter writes both the file and the type. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	9fb8323328	i965: Rename brw_inst's functions that access the register type Put hw_ in the name so that it's clear these are the hardware encodings. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	3e379af492	i965: Index brw_hw_reg_type_to_size()'s table by logical type I'll be transitioning everything to use the logical types. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	c1ac1a3d25	i965: Add a brw_hw_type_to_reg_type() function Will be used in later commits. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	dbe7dd13dd	i965: Use a common table to translate logical to hardware types Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	bfcc9aa829	i965: Extract functions dealing with register types to separate file I'm going to encapsulate all of the logic dealing with register types in this file. Rename the parameters for the hardware encodings from type -> hw_type at the same time. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	890f863da0	i965: Reverse file/type arguments to register type functions I think of the initial arguments as "state" and the last as the actual subject. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	92f787ff86	i965: Add support for disassembling 64-bit integer immediates After the last patch converted things into enums, I helpfully got a compiler warning about these missing from the switch statement. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	deae25ce37	i965: Use separate enums for register vs immediate types The hardware encodings often mean different things depending on whether the source is an immediate. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	8815b9677f	i965: Reorder brw_reg_type enum values These vaguely corresponded to the hardware encodings, but that is purely historical at this point. Reorder them so we stop making things "almost work" when mixing enums. The ordering has been chosen so that no enum value is the same as a compatible hardware encoding. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	ce6b8627d8	i965: Validate destination restrictions with vector immediates Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	1d79c828d8	i965: Don't let raw-move check be tricked by immediate vector types UB and B type encodings are the same as UV and VF. Noticed when writing the following patch. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	48aa6ecb87	i965: Only change type of 0.0f to VF if destination stride == 1 The destination stride must be equivalent to a dword if VF is used. Also, since the only compaction table entries with "i:vf" have the destination as "r:f", specifically check that the destination is of type float. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	56a676eed2	i965: Remove CONT/BREAK from instruction compaction test These cannot be compacted. A similar mistake was fixed in commit `90eaf01616` Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	3d661e6062	i965: Test instruction compaction on all supported Gens Note that there's no point in testing on G45, since its compaction is the same as Gen5. Same logic applies to Gen7 variants and low-power parts. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	9ff7d9b853	i965: Silence signed/unsigned comparison warning Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	eac89911e5	i965: Move compaction "prepass" into brw_eu_compact.c Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Matt Turner	17641f6388	i965: Mark src inst pointer const in compaction code Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-08-21 14:05:23 -07:00
Dave Airlie	b3f87b87f6	vulkan: import 1.0.59 headers and xml. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-08-22 07:00:50 +10:00
Rob Herring	4734bfc02a	Android: Fix LLVM duplicated symbols linking for N and M Both statically linking libLLVMCore and dynamically linking libLLVM causes duplicated symbols in gallium_dri.so and it fails to dlopen. We don't really need to link libLLVMCore, but just need generated headers to be built first. Dynamically linking to libLLVM instead is enough to do that. Thanks to Qiang Yu for finding the root cause. With this change, we can align all versions and just have libLLVM as a shared lib dependency. This also requires changes in the M and N versions of LLVM to export the include paths for libLLVM. AOSP master is okay. Fixes: `26aee6f4d5` ("Android: rework LLVM build support") Reported-by: Mauro Rossi <issor.oruam@gmail.com> Cc: 17.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Qiang Yu <Qiang.Yu@amd.com> Signed-off-by: Rob Herring <robh@kernel.org>	2017-08-21 10:46:21 -05:00
Leo Liu	03b89547b7	st/va: add MJPEG for config To enable MJPEG HW decode Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	5608f44271	st/va: reallocate surface with YUYV stream Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	2ebc530ca3	st/va: detect MJPEG format from bitstream Determine from the sampling factors embedded in the bitstream whether the format is the supported YUYV, so that this information can be used to reallocate the buffer in the correct format. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	7319ff8787	radeon/uvd: add YUYV format support for target buffer Make chroma plane optional for YUYV support Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	c4061bb5fa	st/va: reallocate surface when interlaced Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	fceb52a230	radeon/video: MJPEG not support stacked video buffers So we have to detect it for reallocation of de-interlaced buffers Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	e50ee6d4d5	st/va: make surface allocate functions more usefully Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	130d1f456b	radeon/uvd: reconstruct MJPEG bitstream The current tier 1 mjpeg firmware only supports at the bitstream level, the later tier 2 support will be at the buffers level with newer hardware. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	ef099e6799	st/va: add slice parameter handling for MJPEG Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	8e9175744e	st/va: add huffman table handling for MJPEG Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	93577e6081	st/va: add iq matrix handling for MJPEG Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	535b3c2363	st/va: add picture parameter handling for MJPEG Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	41f17eb5f0	st/va: add handles for MJPEG Buffers Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	38b9686df0	st/va: create decoder for MJPEG format Mjpeg doesn't need reference Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	0a59477372	st/va: add MJPEG picture to context Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	15f3335577	radeon/video: add MJPEG support v2: add ASIC and Kernel version check Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	3fe713ce3d	radeon/uvd: add MJPEG support There is no need of dpb buffer for mjpeg codec v2: check dpb_size instead of format Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	b26cfdaebd	radeon/uvd: add MJPEG stream type Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	4ac38ac3de	vl: add MJPEG picture description Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	11ccb56e9f	vl: add MJPEG profile and format v2: move util video change to here Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Leo Liu	2b1eacabfa	radeon/uvd: get the target buffer pitch correct for different format Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-08-21 10:09:09 -04:00
Samuel Pitoiset	2843c5d15c	radeonsi: update non-resident bindless descriptors if needed Only resident bindless descriptors are currently updated and re-uploaded, this makes sure that the non-resident ones are also updated. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "17.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-21 15:23:56 +02:00
Louis-Francis Ratté-Boulianne	498814a3ca	dri3: Move up fourcc utility function It will be needed in next patches. Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2017-08-21 12:55:54 +01:00
Daniel Stone	85ef0215dd	egl: Add dma_buf_import_modifiers for glvnd Make sure we advertise the new entrypoints to libglvnd's EGL dispatch. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reported-by: Emmanuel Gil Peyrot <emmanuel.peyrot@collabora.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101982 Fixes: `4c412293d0` ("egl: advertise EGL_EXT_image_dma_buf_import_modifiers")	2017-08-21 12:13:50 +01:00
Topi Pohjolainen	393ec1a507	intel/blorp: Adjust intra-tile x when faking rgb with red-only v2 (Jason): Adjust directly in surf_fake_rgb_with_red() Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101910 CC: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-08-21 09:55:08 +03:00
Dave Airlie	b040f51b61	ac/nir: fixup layer/viewport export for GFX9. GFX9 moved where the viewport index export goes. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-21 04:26:37 +01:00
Jason Ekstrand	c366943ebf	i965/bufmgr: s/BO_ALLOC_FOR_RENDER/BO_ALLOC_BUSY/ "Alloc for render" is a terrible name for a flag because it means basically nothing. What the flag really does is allocate a busy BO which someone theorized at one point in time would be more efficient if you're planning to immediately render to it. If the flag really means "alloc a busy BO" we should just call it that. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-08-20 20:14:49 -07:00
Jason Ekstrand	cadcd89278	i965/tex: Change the flags type on create_for_teximage This matches the actual function declaration. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-08-20 20:14:49 -07:00
Christoph Haag	87556a650a	mesa: only copy requested compressed teximage cubemap faces This is analogous to commit `2259b11` which only fixed the regular case Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102308 Signed-off-by: Christoph Haag <haagch+mesadev@frickel.club> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2017-08-20 17:01:48 -04:00
Jason Ekstrand	f24cf82d6d	i965/tex: Don't pass samples to miptree_create_for_teximage In `76e2f390f9`, when Topi switched num_samples from 0 to 1 for single-sampled, he accidentally switched the last parameter in the call to miptree_create_for_teximage from 0 to 1 thinking it was num_samples when it was actually layout_flags. Switching from 0 to 1 added the MIPTREE_LAYOUT_ACCELERATED_UPLOAD flag which causes us to allocate a busy BO instead of an idle one. This caused the subsequent CPU upload to consistently stall. The end result was a 15% performance drop in the SynMark v7 DrvRes microbenchmark. This restores the old behavior and fixes the performance regression. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Fixes: `76e2f390f9` Bugzilla: https://bugs.freedesktop.org/102260 Cc: mesa-stable@lists.freedesktop.org	2017-08-19 15:39:12 -07:00
Kenneth Graunke	6f8a577ed2	anv: Use ISL for emitting null surface states. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-08-19 00:46:48 -07:00
Kenneth Graunke	5ae983c85b	i965: Use ISL for emitting null surface states. We handle the Sandybridge multisampled 2D surface hack here, rather than in ISL, because it requires allocating a BO, and is kind of messy. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-08-19 00:46:46 -07:00
Kenneth Graunke	5db9757bd7	isl: Add a null surface fill function. ISL already offers functions to fill out most kinds of SURFACE_STATE, so why not handle null surfaces too? Null surfaces are simple, so we can just take the dimensions, rather than an entire fill structure. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-08-19 00:46:36 -07:00
Kenneth Graunke	288621b1b7	i965: Remove tabs in intel_batchbuffer.c. Our coding style is to use spaces. Some of this was also messed up during my bufmgr import series. (Trivial, just whitespace changes.)	2017-08-18 23:51:56 -07:00
Jason Ekstrand	61d2f3f1c2	i965/miptree: Return NONE from texture_aux_usage when fully resolved This little optimization improves the performance of SynMark v7 TexFilterTri by almost 10% on Sky Lake GT4 among other improvements. We've been doing it for some time but somehow it got dropped during the miptree refactoring. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/102258 Cc: "17.2" <mesa-stable@lists.freedesktop.org>	2017-08-18 17:31:02 -07:00
Jason Ekstrand	d5e217dbfd	i965: Stop looking at NewDriverState when emitting 3DSTATE_URB Looking at NewDriverState is not safe in general. The state atom system is set up to ensure that new bits that get added to NewDriverState get accumulated into the set of bits used when emitting atoms but it doesn't go the other way. If we read NewDriverState, we may not get the full picture because the per-pipeline state (3D or compute) does not get added to NewDriverState before state emit is done. It's especially dangerous to do this from BLORP (either explicitly or implicitly when BLORP calls gen7_upload_urb) because that does not happen during one of the normal state upload paths. This commit solves the problem by whacking all of the per-shader-stage URB sizes to zero whenever we change the total URB size. We still have to flag BRW_NEW_URB_SIZE to ensure that the gen7_urb atom triggers but the actual decision in gen7_upload_urb can now be based entirely on URB sizes rather than on state atoms. This also makes BLORP correct because it just asks for a new URB config whenever the vsize is too small and so any change to the total URB size will trigger blorp to re-emit as well because 0 < vs_entry_size. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Bugzilla: https://bugs.freedesktop.org/102289 Cc: mesa-stable@lists.freedesktop.org	2017-08-18 17:30:55 -07:00
Kenneth Graunke	bc56dfbf3f	i965: Mark all EGLimages as non-coherent. EGLimages are shared with external users, and we don't know what they're going to do with them. They might scan them out. They might access them in a way that doesn't work with our explicit clflushing. It's safest to simply mark them non-coherent. Chris Wilson caught this problem and wrote a similar (though less aggressive) patch to solve it; the miptree code has since undergone a lot of refactoring so I had to rewrite it. Cc: "17.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-08-18 16:28:13 -07:00
Eric Anholt	a727e03360	broadcom/genxml: Add V3D 3.3 packet definitions. This will be used by the new vc5 gallium driver, and a future Vulkan driver.	2017-08-18 12:54:13 -07:00
Eric Anholt	7c576d6091	broadcom/genxml: Check the sub-id field when decoding instructions. VC5 introduces packet variants where the same opcode has behavior that is decided by a sub-id field in the early bits of the packet. Keep iterating over packets until we find the one with the matching sub-id.	2017-08-18 11:56:58 -07:00
Eric Anholt	14fe9fd3f7	broadcom/genxml: Emit code for default headers for structs as well. In the vc5 NIR backend, I want to use the XML code-generation to set up pack/unpack of structs for the texture uniforms, and setting up the unpacked copy needs a default header.	2017-08-18 11:56:58 -07:00
Eric Anholt	9caba0f16f	anv: Move a comment that got left behind in the u_vector refactor.	2017-08-18 11:56:58 -07:00
Marek Olšák	57fb1bb585	gallium/radeon: remove old_fence parameter from r600_gfx_write_event_eop just use the new scratch buffer. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-18 16:06:21 +02:00
Marek Olšák	41e053954d	radeonsi/gfx9: prevent a GPU hang after a timestamp event Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-18 16:06:18 +02:00
Marek Olšák	13aa8d3da9	radeonsi: don't use CLEAR_STATE on SI This fixes random hangs with Unigine Valley. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102201 Fixes: `064550238e` ("radeonsi: use CLEAR_STATE to initialize some registers") Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-18 15:59:22 +02:00
Jon Turney	5ee159e4b3	Fix build when HAVE_LIBDRM isn't defined make[4]: Entering directory '/wip/mesa/build/src/gallium/targets/dri' CXXLD gallium_dri.la ../../../../src/gallium/auxiliary/pipe-loader/.libs/libpipe_loader_static.a(libpipe_loader_static_la-pipe_loader.o): In function `pipe_loader_get_driinfo_xml': /mesa/build/src/gallium/auxiliary/pipe-loader/../../../../../src/gallium/auxiliary/pipe-loader/pipe_loader.c:117: undefined reference to `pipe_loader_drm_get_driinfo_xml' `b4ff5e90` uses pipe_loader_get_driinfo_xml() unconditionally in pipe_loader.c, but its definition in pipe_loader_get_driinfo_xml() is only built if HAVE_LIBDRM. Arrange to always use the default XML if HAVE_LIBDRM isn't defined. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-18 15:08:00 +02:00
Kenneth Graunke	5af7f1ccec	i965: Fix missing newlines in perf_debug messages. perf_debug() doesn't append a newline for you.	2017-08-17 23:42:49 -07:00
Ilia Mirkin	9c8f017f77	glsl: add a few missing int64 constant propagation cases Fixes KHR-GL45.shader_ballot_tests.ShaderBallotAvailability, which causes some silly swizzles to appear, triggering this optimization to get hit. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: mesa-stable@lists.freedesktop.org	2017-08-18 02:26:16 -04:00
Timothy Arceri	c03eefdf84	glsl: set old ldexp operand to NULL when lowering This fixes an assert during IR validation in LLVMpipe. Fixes: `e2e2c5abd2` (glsl: calculate number of operands in an expression once) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102274 Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2017-08-18 12:07:34 +10:00
Jason Ekstrand	1af8342b0c	intel/isl: Replace switch statements of doom with a macro Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-08-17 18:09:05 -07:00
Jason Ekstrand	2d68d27071	intel/isl: Reduce header file duplication Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-08-17 18:09:05 -07:00
Dave Airlie	611076a41a	radv: disable support for VEGA for now. I'm working on this, but I'm not sure I'll make 17.2 at this stage, maybe 17.2.1. Cc: "17.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-18 00:49:48 +01:00
Jeremy Huddleston Sequoia	c1c4c18a80	glxcmds: Fix a typo in the __APPLE__ codepath s/DummyContext/dummyContext/ Regressed-in: `5d9b50e596` Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2017-08-17 15:13:33 -07:00
Roland Scheidegger	3e96231457	llvmpipe: enable PIPE_CAP_QUERY_SO_OVERFLOW The driver supported this since way before the GL spec for it existed. Just need to support both the per-stream and for all streams variants (which are identical due to only supporting 1 stream). Passes piglit arb_transform_feedback_overflow_query-basic. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-08-17 18:46:44 +02:00
Roland Scheidegger	26d46b94b4	softpipe: enable PIPE_CAP_QUERY_SO_OVERFLOW The driver was supposed to support this since way before the GL spec for it existed, albeit it was apparently broken, so fix and enable it. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-08-17 18:46:44 +02:00
Thomas Hellstrom	0cc4c7e33e	loader_dri3: Make sure we have an updated back v3 With GLX_SWAP_COPY_OML and GLX_SWAP_EXCHANGE_OML it may happen in situations when glXSwapBuffers() is immediately followed by for example another glXSwapBuffers() or glXCopyBuffers() or back buffer age querying, that we haven't yet allocated and initialized a new back buffer because there was no GL rendering in between. Make sure that we have a back buffer in those situations. v2: Eliminate the drawable have_back_format member. v3: Make sure we re-initialize the back even if it exists. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2017-08-17 07:39:42 +02:00
Thomas Hellstrom	7c3e3c0faf	loader_dri3: Support GLX_SWAP_EXCHANGE_OML Add support for the exchange swap method. Since we're now forcing a fake front buffer and we exchange the back and fake front on swaps, we don't need to add much code. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2017-08-17 07:39:42 +02:00
Thomas Hellstrom	c898e02a33	loader_dri3: Eliminate the back-to-fake-front copy Eliminate the back-to-fake-front copy by exchanging the previous back buffer and the fake front buffer. This is a gain except when we need to preserve the back buffer content but in that case we still typically gain by replacing a server-side blit by a client side non-flushing blit. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2017-08-17 07:39:42 +02:00
Thomas Hellstrom	74b4cdd80a	loader_dri3: Remove buffer_type from buffer metadata It's not used anywhere and now that we're about to exchange back- and fake fronts it doesn't serve a purpose. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2017-08-17 07:39:42 +02:00
Thomas Hellstrom	16d1a0bcdb	loader_dri3: Support GLX_SWAP_COPY_OML Support the GLX_SWAP_COPY_OML method. When this method is requested, we use the same swapbuffer code path as EGL_BUFFER_PRESERVED. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2017-08-17 07:39:42 +02:00
Thomas Hellstrom	1e83baeb4b	loader_dri3: Honor the request to preserve back buffer content EGL uses the force_copy parameter to loader_dri3_swap_buffers_msc() to indicate that it wants to preserve back buffer contents across a buffer swap. While the loader then turns off server-side page-flipping there's nothing to guarantee that a new backbuffer isn't chosen when EGL starts to render again, and that buffer's content is of course undefined. So rework the functionality: If the client supports local blits, allow server-side page flipping and when a new back is grabbed, if needed, blit the old back's content to the new back. If the client doesn't support local blits, disallow server-side page-flipping to avoid a client deadlock and then, when grabbing a new back buffer, sleep until the old back is idle, which may take a substantial time depending on swap interval. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2017-08-17 07:39:42 +02:00
Thomas Hellstrom	f71e174bb8	loader_dri3: Increase the likelihood of reusing the current swap buffer Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2017-08-17 07:39:42 +02:00
Thomas Hellstrom	2db9548296	loader_dri3/glx/egl: Optionally use a blit context for blitting operations The code was relying on us always having a current context for client local image blit operations. Otherwise the blit would be skipped. However, glxSwapBuffers, for example, doesn't require a current context and that was a common problem in the dri1 era. It seems the problem has resurfaced with dri3. If we don't have a current context when we want to blit, try creating a private dri context and maintain a context cache of a single context. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2017-08-17 07:39:42 +02:00
Thomas Hellstrom	5198e48a0d	loader_dri3/glx/egl: Remove the loader_dri3_vtable get_dri_screen callback It's not very usable since in the rare, but definitely existing case that we don't have a current context, it will return NULL. Presumably it will always be safe to use the dri screen the drawable was created with for operations on that drawable. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2017-08-17 07:39:42 +02:00
Ilia Mirkin	934511d1f3	nv50/ir: fix TXQ srcMask src0.x is always read for the LOD, irrespective of which outputs are read. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2017-08-16 22:39:22 -04:00
Ilia Mirkin	054c54d1be	nv50/ir: fix srcMask computation for TG4 and TXF This affects which inputs are marked as used. In a situation where only the texture instruction uses an input, it might have been ignored as unused due to input masks. Affects subtests of KHR-GL45.texture_cube_map_array.sampling Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2017-08-16 22:39:21 -04:00
Jason Ekstrand	bf1d2e84f3	anv/gem: Add a stub for sync_file_merge This fixes make check Fixes: `5c4e4932e0`	2017-08-16 18:44:26 -07:00
Dave Airlie	4c02e2bd95	radv: disable texture gather workaround on gfx9. Not required anymore. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-17 02:24:36 +01:00
Brian Paul	3ab0c25939	st/mesa: remove Windows hack for glFinish I see no evidence that opengl32.dll's wglSwapBuffers calls glFinish. It looks like Jose removed that dependency years ago, but this hack remained. Removing this code also fixes the Piglit sync_api test since commit `eceb671002`. No piglit regressions. No glretrace regressions, per Charmaine. Fixes VMware bug 1937990. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-08-16 19:03:10 -06:00
Frank Richter	7fb7287ce7	gallium/os: fix os_time_get_nano() to roll over less Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102241 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-08-16 18:32:47 -06:00
Frank Richter	d90e05ad48	st/wgl: check for negative delta in wait_swap_interval() This can happen because of rollover. See bug report for details. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102241 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-08-16 18:32:46 -06:00
Frank Richter	496a691e35	st/mesa: fix a null pointer access Fixes crash with llvmpipe on Windows. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102148 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2017-08-16 18:32:41 -06:00
Kenneth Graunke	27fb0899f7	i965: Alphabetize TCS image dirty bits Trivial.	2017-08-16 16:09:29 -07:00
Chris Wilson	49eda75df6	i965: Always allow CPU readback of the scanout on LLC platforms LLC platforms are magic in that reads from the CPU are always cache coherent, or rather GPU writes that bypass LLC do still invalidate the appropriate cache line. Cc: "17.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-08-16 12:25:02 -07:00
Tim Rowley	b333bc753e	swr/rast: Fix invalid casting for calls to Interlocked* functions CID: 1416243, 1416244, 1416255 CC: mesa-stable@lists.freedesktop.org Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-08-16 14:20:22 -05:00
Boyuan Zhang	a44b334e48	radeon/vce: support all firmwares with major ver 53 The vce firmware interface should now be stable, so all firmwares with major version equal to 53 are supported. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig at amd.com>	2017-08-16 14:42:41 -04:00
Tapani Pälli	733422e53c	i965: make sure check_and_emit_atom gets inlined Improves performance of 3DMark "Ice Storm Unlimited" benchmark by 1-2% on Apollolake (on Android-IA using clang 3.8.256229). Change is based on the performance profiling work and results by Aravindan Muthukumar and Yogesh Marathe. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-08-16 12:32:32 +03:00
Ilia Mirkin	f96f210239	a2xx: only update rasterizer settings when they're there The rasterizer being empty can happen e.g. during clears Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-08-15 22:54:40 -04:00
Ilia Mirkin	08f72a8944	a2xx: add logicop support This passes both gl-1.0-logicop and gl-1.1-xor piglits. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-08-15 22:54:40 -04:00
Ilia Mirkin	978c4c597a	glsl/ast: update rhs in addition to the var's constant_value We continue in the code to do some more things with the rhs, including setting a constant initializer. If the type is wrong, this causes some confusion down the line, leading to assertions. This makes sure that the rhs processing continues to flow as-if the type was correct to start with (even though the state has been marked as an error state). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101766 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2017-08-15 22:14:05 -04:00
Jason Ekstrand	98983503cb	anv: Advertise VK_KHR_external_semaphore Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-08-15 19:08:26 -07:00
Jason Ekstrand	55bce22d8d	anv: Use DRM sync objects for external semaphores when available Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-08-15 19:08:26 -07:00
Jason Ekstrand	f41a0e4b0d	anv/gem: Add drm syncobj support Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-08-15 19:08:26 -07:00
Jason Ekstrand	5c4e4932e0	anv: Implement support for exporting semaphores as FENCE_FD Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-08-15 19:08:26 -07:00
Jason Ekstrand	e4054ab77b	anv/gem: Use EXECBUFFER2_WR when the FENCE_OUT flag is set Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-08-15 19:08:26 -07:00
Jason Ekstrand	017cdb10cf	anv: Submit a dummy batch when only semaphores are provided. Vulkan allows you to do a submit whose only job is to wait on and trigger semaphores. The easiest way for us to support that right now is to insert a dummy execbuf. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-08-15 19:08:26 -07:00
Jason Ekstrand	031f57eba3	anv: Add a basic implementation of VK_KHX_external_semaphore This patch adds an implementation based on DRM BOs. We don't actually advertise the extension yet because we want to add a couple more paths first. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-08-15 19:08:26 -07:00
Aaron Watry	a8296dbd5a	clover/event: Include additional event statuses for clSetEventCallback From CL 2.0 Section 5.11 (Event Objects): clSetEventCallback returns CL_SUCCESS if the function is executed successfully. Otherwise, it returns one of the following errors: ... CL_INVALID_VALUE if pfn_event_notify is NULL or if command_exec_callback_type is not CL_SUBMITTED , CL_RUNNING or CL_COMPLETE . Fixes: OpenCL CTS test_conformance/events/test_events callbacks Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-08-15 19:55:15 -05:00
Jonas Pfeil	494f86bbe5	broadcom/vc4: Port NEON-code to ARM64 Changed all register and instruction names, works the same. v2: Rebase on build system changes (by anholt) v3: Fix build on clang (by anholt, reported by Rob) Signed-off-by: Jonas Pfeil <pfeiljonas@gmx.de> Tested-by: Rob Herring <robh@kernel.org>	2017-08-15 13:23:54 -07:00
Eric Anholt	bd5efbd70b	broadcom/vc4: Build the vc4_tiling_lt_neon.c with -mfpu=neon on ARM. If you don't pass this, the compiler refuses to compile the assembly for pre-v7 CPUs. This also keeps us from building identical, non-NEON code on aarch64 and x86. Fixes: `a373f77662` ("vc4: Use a wrapper file to set VC4_BUILD_NEON instead of CFLAGS.") v2: Fix Android build by just appending NEON_C_SOURCES when ARCH_ARM_HAVE_NEON. Tested-by: Rob Herring <robh@kernel.org>	2017-08-15 13:23:54 -07:00
Eric Anholt	b94ddc181b	util: Fix build on old glibc. We need to link librt for u_thread.h's clock_gettime() call. Fixes: `b822d9dd67` ("gallium/util: move u_queue.{c,h} to src/util") Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-08-15 13:23:54 -07:00
Eric Anholt	f785db3d31	broadcom: Add v3d_xml.h to gitignore.	2017-08-15 13:23:54 -07:00
Eric Anholt	463de32b95	broadcom: Add missing libexpat cflags for the decoder. The Raspbian ARMv6 cross compiler wasn't picking up my (amd64) system copy of the header the way that the system gcc and armhf cross-compile did.	2017-08-15 13:23:54 -07:00
Dave Airlie	694d59fbaf	radv/gfx9: for fast clear use is_linear flag. The legacy test won't work on gfx9. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-16 06:27:30 +10:00
David Airlie	31bb8517a1	radv/gfx9: fix tile swizzle handling for gfx9 This sets the tile swizzle up properly for gfx9. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-16 05:54:19 +10:00
David Airlie	e43cc3e3af	radv/gfx9: handle GFX9 opaque metadata port the opaque metadata changes from radeonsi for gfx9. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-16 05:54:15 +10:00
David Airlie	674ecbfef2	radv: emit db_htile_surface reg on gfx9 as well This is also a GFX9 register. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-16 05:54:09 +10:00
Dave Airlie	fc600eb98d	radv/gfx9: remove some leftover gfx6 descriptor setup. We set this later in the non-gfx9 path, just remove these bits from here. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-16 05:54:03 +10:00
Dave Airlie	5247b311e9	radv/gfx9: fix set predication packet. The predication packet changed format on GFX9, update the driver. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-16 05:52:50 +10:00
Scott D Phillips	d6539608a4	intel/genxml: Fix gen10 BLEND_STATE variable length packing BLEND_STATE packing was modified to be variable-length in: `9670124e31` genxml: Make BLEND_STATE command support variable length array. The initial gen10.xml still had the old, fixed-length style definition for BLEND_STATE. So gen10_upload_blend_state would overwrite the packed BLEND_STATE_ENTRYs with its own fixed array of all-zero entries when packing BLEND_STATE. This caused BLEND_STATE upload to not work at all. Fixes: `aa416f515a` ("i965/genxml: Add gen10.xml") Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-08-15 09:06:29 -07:00
Timothy Arceri	fe74c8ffbf	mesa: count uniform against storage when it's bindless Gallium drivers use this code path so we need to account for bindless after all. Fixes: `365d34540f` ("mesa: correctly calculate the storage offset for i915") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-15 23:51:35 +10:00
Marek Olšák	1ab7fed707	radeonsi: disable CE by default It makes performance worse by a very small (hard to measure) amount. We've done extensive profiling of this feature internally. Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Christian König <christian.koenig@amd.com>	2017-08-15 15:03:43 +02:00
Dave Airlie	e0edfadec8	radeonsi: initialise imported surface to 0. For memobj imports we weren't setting the surface to 0, which meant sometimes we'd end up with tile_swizzle garbage, which would corrupt rendering. This seems to fix the image corruption on the imported memory objects in vrdashboard for me. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-15 01:35:58 +01:00
Timothy Arceri	de0e62e106	st/mesa: correctly calculate the storage offset When generating the storage offset for struct members we need to skip opaque types as they no longer have backing storage. Fixes: `fcbb93e860` ("mesa: stop assigning unused storage for non-bindless opaque types") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101983 Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-08-15 08:20:57 +10:00
Timothy Arceri	365d34540f	mesa: correctly calculate the storage offset for i915 When generating the storage offset for struct members we need to skip opaque types as they no longer have backing storage. Fixes: `fcbb93e860` ("mesa: stop assigning unused storage for non-bindless opaque types") V2: simplify since bindless will never be supported in this code Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101983 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-08-15 08:20:57 +10:00
Ben Widawsky	1efd73df39	i965: Advertise the CCS modifier v2: Rename modifier to be more smart (Jason) FINISHME: Use the kernel's final choice for the fb modifier bwidawsk@norris2:~/intel-gfx/kmscube (modifiers $) ~/scripts/measure_bandwidth.sh ./kmscube none Read bandwidth: 603.91 MiB/s Write bandwidth: 615.28 MiB/s bwidawsk@norris2:~/intel-gfx/kmscube (modifiers $) ~/scripts/measure_bandwidth.sh ./kmscube ytile Read bandwidth: 571.13 MiB/s Write bandwidth: 555.51 MiB/s bwidawsk@norris2:~/intel-gfx/kmscube (modifiers $) ~/scripts/measure_bandwidth.sh ./kmscube ccs Read bandwidth: 259.34 MiB/s Write bandwidth: 337.83 MiB/s v2: Move all references to the new fourcc code(s) to this patch. v3: Rebase, remove Yf_CCS (Daniel) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-08-14 10:43:30 -07:00
Jason Ekstrand	51600b8489	i965/miptree: More conservatively resolve external images Instead of always doing a full resolve, only resolve the bits that are needed. This means that we only do a partial resolve when the miptree modifier is I915_FORMAT_MOD_Y_TILED_CCS. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-08-14 10:43:30 -07:00
Ben Widawsky	8f6e54c929	i965: Pretend that CCS modified images are two planes v2: move is_aux into if block. (Jason) Use else block instead of goto (Jason) v3: Fix up logic for is_aux (Ben) Fix up size calculations and add FIXME (Ben) v4 (Jason Ekstrand): Use the aux_pitch in the image instead of calculating it Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-08-14 10:43:30 -07:00
Jason Ekstrand	a1e5db9888	i965/screen: Support import and export of surfaces with CCS Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-08-14 10:43:30 -07:00
Ben Widawsky	a068fdc861	i965/miptree: Allocate mcs_buf for an image's CCS This code will disable actually creating these buffers for the scanout, but it puts the allocation in place. Primarily this patch is split out for review, it can be squashed in later if preferred. v2: assert(mt->offset == 0) in ccs creation (as requested by Topi) Remove bogus is_scanout check in miptree_release v3: Remove is_scanout assert in intel_miptree_create. It doesn't work with latest codebase - not sure it ever should have worked. v4: assert(mt->last_level == 0) and assert(mt->first_level == 0) in ccs setup (Topi) v5 (Jason Ekstrand): - Base the decision to allocate a CCS on the image modifier Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-08-14 10:43:30 -07:00
Ben Widawsky	f6fbeaf1c4	i965: Support images with aux buffers Previously images did not support any auxiliary compression surfaces (CCS, MCS, or HiZ). That's about to change. This patch just adds the fields to __DRIimageRec to make auxiliary surfaces possible. v2 (Jason Ekstrand): - Add an aux_pitch parameter as well as aux_offset Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-08-14 10:43:30 -07:00
Jason Ekstrand	cf2e92262b	intel/isl: Add support for I915_FORMAT_MOD_Y_TILED_CCS Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-08-14 10:43:30 -07:00
Jason Ekstrand	51eb40d414	i965/screen: Stop redefining DRM_FORMAT_MOD_(INVALID\|LINEAR) Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2017-08-14 10:43:30 -07:00
Scott D Phillips	f7dfc44c61	i965/blorp: Correct type of src_format in call to intel_miptree_texture_aux_usage intel_miptree_texture_aux_usage() takes an isl_format, but we are passing a mesa_format. clang warns: brw_blorp.c:305:52: warning: implicit conversion from enumeration type 'mesa_format' to different enumeration type 'enum isl_format' [-Wenum-conversion] intel_miptree_texture_aux_usage(brw, src_mt, src_format); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~ Fixes: `fc1639e46d` ("i965/blorp: Use texture/render_aux_usage for blits") Cc: "17.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-08-14 10:41:54 -07:00
Julien Isorce	91d93aa621	st/va: change frame_idx from array to hash table The picture_id was assumed to be a frame number, and so in the range 0-31. But the vaapi client gstreamer-vaapi uses the surface handles, which are unsigned int, as identifiers. This bug can happen when using a lot of vaapi surfaces within the same process. Indeed Mesa/st/va increments a counter for the surface ID: mesa/util/u_handle_table.c::handle_table_add, which starts from 0 and is incremented by 1 at each call. So creating more than 32 surfaces was a problem. The following bug contains a test that reproduces the problem by running a couple of vaapih264enc in the same process. The above also explains why there was no problem when running them in separate processes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102006 Signed-off-by: Julien Isorce <jisorce@oblong.com> Tested-by: Tomas Rataj <rataj28@gmail.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-and-tested-by: Boyuan Zhang <Boyuan.Zhang@amd.com>	2017-08-14 13:40:19 +01:00
Ilia Mirkin	165e18dd21	nv50/ir: clean up saturated values immediately Since we don't iterate to a fixed point, we can end up in situations where we have a SAT instruction + a long immediate. This is not legal. However since it's immediately computable, just run unary straight away to handle the situation. Fixes: `24a799ad35` ("nv50/ir: fix ConstantFolding with saturation") Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2017-08-12 14:49:08 -04:00
Ilia Mirkin	ea22ac23e0	nvc0/ir: unlink values pre- and post-call to division function While technically correct, this can lead to e.g. getImmediate assuming that it can walk up the value chain. It could be fixed to not do this, but it seems easier and less error-prone to just not link the two values to save on one LValue object. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-08-12 14:49:08 -04:00
Kenneth Graunke	22e1d8832c	i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41 This should hopefully fix build issues on 32-bit Android-x86. v2: s/USE_SSE4_1/USE_SSE41/, caught by Gražvydas Ignotas. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102050 Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-08-12 01:42:32 -07:00
Kenneth Graunke	da0840246f	i965: Clean up intel_batchbuffer_init(). Passing screen lets us get the kernel features, devinfo, and bufmgr, without needing container_of. This use of container_of could cause crashes due to issues with the "sample" macro parameter. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102062 Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-08-12 01:41:24 -07:00
Marek Olšák	b420680ede	gallium/radeon: only pass shader-specific debug flags to the disk shader cache Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-08-11 20:38:29 +02:00
Marek Olšák	d1285a7103	radeonsi/gfx9: fix the scissor bug workaround otherwise there is corruption in most apps. Fixes: `0fe0320` radeonsi: use optimal packet order when doing a pipeline sync Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-11 20:38:29 +02:00
Marek Olšák	27fef5d52d	radeonsi/gfx9: use the VI codepath for clamping Z This fixes corrupted shadows in Unigine Valley. The corruption disappeared when I stopped setting IMG_DATA_FORMAT_24_8 for depth. Cc: 17.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-11 20:38:29 +02:00
Daniel Stone	2eee03b7a1	egl: Update headers from Khronos Taken from egl-registry 7d68647c4dab. Signed-off-by: Daniel Stone <daniels@collabora.com>	2017-08-11 11:16:00 +01:00
Daniel Stone	7d26a52a7a	egl/dri2: Allow modifiers to add FDs to imports When using dmabuf import, make sure that the modifier is actually allowed to add planes to the base format, as implied by the comment. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>	2017-08-11 10:25:53 +01:00
Iago Toral Quiroga	81615ad444	intel/compiler: properly size attribute wa_flags array for Vulkan Mesa will map user defined vertex input attributes to slots starting at VERT_ATTRIB_GENERIC0 which gives us room for only 16 slots (up to GL_VERT_ATTRIB_MAX). This is sufficient for GL, where we expose exactly 16 vertex attributes for user defined inputs, but in Vulkan we can expose up to 28 (which are also mapped from VERT_ATTRIB_GENERIC0 onwards) so we need to account for this when we scope the size of the array of attribute workaround flags that is used during the brw_vertex_workarounds NIR pass. This prevents out-of-bounds accesses in that array for NIR shaders that use more than 16 vertex input attributes. Fixes: dEQP-VK.pipeline.vertex_input.max_attributes.* Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-08-11 10:41:44 +02:00
Timothy Arceri	9d41ec2182	glsl: stop cloning builtin functions in _mesa_glsl_find_builtin_function() The cloning was introduced in `f81ede4699` to fix a problem with shaders including IR that was owned by builtins. However the approach of cloning the whole function each time we reference a builtin led to a significant reduction in the GLSL IR compiler's performance. The previous patch fixes the ownership problem in a more precise way. So we can now remove this cloning. Testing on a Ryzen 7 1800X shows a ~15% decrease in the time spent compiling the Deus Ex: Mankind Divided shaders on radeonsi (which take 5min+ on some machines). Looking just at the GLSL IR compiler the speed up is ~40%. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-11 15:44:15 +10:00
Timothy Arceri	77f5221233	glsl: pass mem_ctx to constant_expression_value(...) and friends The main motivation for this is that threaded compilation can fall over if we were to allocate IR inside constant_expression_value() when calling it on a builtin. This is because builtins are shared across the whole OpenGL context. `f81ede4699` worked around the problem by cloning the entire builtin before constant_expression_value() could be called on it. However cloning the whole function each time we referenced it lead to a significant reduction in the GLSL IR compiler performance. This change along with the following patch helps fix that performance regression. Other advantages are that we reduce the number of calls to ralloc_parent(), and for loop unrolling we free constants after they are used rather than leaving them hanging around. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-08-11 15:44:08 +10:00
Timothy Arceri	d4f79e995f	glsl: use ralloc_str_append() rather than ralloc_asprintf_rewrite_tail() The Deus Ex: Mankind Divided shaders go from spending ~20 seconds in the GLSL IR compilers front-end down to ~18.5 seconds on a Ryzen 1800X. Tested by compiling once with shader-db then deleting the index file from the shader cache and compiling again. v2: - fix rebasing issue in v1 Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-08-11 10:43:34 +10:00
Timothy Arceri	26f4657c3f	util/ralloc: add ralloc_str_append() helper This function differs from ralloc_strcat() and ralloc_strncat() in that it does not do any strlen() calls which can become costly on large strings. Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-08-11 10:43:31 +10:00
Timothy Arceri	53320e25b4	glsl: remove unused field from ir_call Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-08-11 10:43:27 +10:00
Timothy Arceri	49d9286a3f	glsl: stop copying struct and interface member names We are currently copying the name for each member dereference but we can just share a single instance of the string provided by the type. This change also stops us recalculating the field index repeatedly. Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-08-11 10:43:21 +10:00
Timothy Arceri	43cbcbfee9	glsl: tidy up get_num_operands() Also add a comment that this should only be used by the ir_reader interface for testing purposes. v2: - fix grammar in comment - use unreachable rather than assert Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-08-11 10:43:16 +10:00
Timothy Arceri	e2e2c5abd2	glsl: calculate number of operands in an expression once Extra validation is added to ir_validate to make sure this is always updated to the correct number of operands, as passes like lower_instructions modify the instructions directly rather than generating a new one. The reduction in time is so small that it is not really measurable. However callgrind was reporting this function as being called just under 34 million times while compiling the Deus Ex shaders (just pre-linking was profiled) with 0.20% spent in this function. v2: - make num_operands a uint8_t - fix unsigned/signed mismatches Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-08-11 10:43:12 +10:00
Kenneth Graunke	5563872dbf	isl: Validate row pitch of stencil surfaces. Also, silence an obnoxious finishme that started occurring for all GL applications which use stencil after the i965 ISL conversion. v2: Check against 3DSTATE_STENCIL_BUFFER's pitch bits when using separate stencil, and 3DSTATE_DEPTH_BUFFER's bits when using combined depth-stencil. Cc: "17.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-08-10 15:18:58 -07:00
Emil Velikov	26fbb9eacd	egl: avoid eglCreatePlatformSurface{EXT,} crash with invalid dpy If we have an invalid display fed into the functions, the display lookup will return NULL. Thus as we attempt to get the platform type, we'll dereference it, leading to a crash. Keep in mind that this will not happen if Mesa is built without X11 or when the legacy eglCreateSurface codepaths are used. A similar check was added with earlier commit `5e97b8f5ce` ("egl: Fix crashes in eglCreate*Surface"), although it was only applicable when the surfaceless platform is built. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-08-10 19:41:51 +01:00
Emil Velikov	a51be4f9a6	egl/drm: rename dri2_drm_create_surface() The function can handle only window surfaces, so let's rename it accordingly, killing the wrapper around it. v2: Use native_window in the function args. list. Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-08-10 19:34:04 +01:00
Emil Velikov	430a80a7b6	egl/drm: remove unreachable code in dri2_drm_create_surface() The function can be called only when the type is EGL_WINDOW_BIT. Remove the unneeded switch statement. v2: Rename the local variable window to surface (Eric) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)	2017-08-10 19:32:14 +01:00
Emil Velikov	794df9acad	egl/x11: pass NULL instead of XCB_WINDOW_NONE as native_surface Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-08-10 19:30:17 +01:00
Matt Turner	9c0dad0a2b	egl: Clean up native_type vs drawable mess The next patch is going to stop passing XCB_WINDOW_NONE (of type xcb_window_enum_t) as an argument where these functions expect a void *, which clang does not appreciate. This patch cleans things up to better convince me and reviewers that it's safe to do that. v2: Emil Velikov: rebase/integrate with series Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-08-10 19:29:37 +01:00
Emil Velikov	df8efd5b74	egl: handle BAD_NATIVE_PIXMAP further up the stack The basic (null) check is identical across all backends. Just move it to the top. v2: - Split the WINDOW vs PIXMAP into separate patches - Move check after the dpy and config - dEQP expects so Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-08-10 19:28:04 +01:00
Emil Velikov	92b23683eb	egl: drop unreachable BAD_NATIVE_WINDOW conditions The code in _eglCreateWindowSurfaceCommon() already has a NULL check which handles the condition. There's no point in checking again further down the stack. v2: Split the WINDOW vs PIXMAP into separate patches v3: Resolve typos, s/EGL_PIXMAP_BIT_BIT/EGL_PIXMAP_BIT/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-08-10 19:27:03 +01:00
Emil Velikov	47b06f5821	egl: add dri2_setup_swap_interval helper The current two implementations - X11 and Wayland - were identical, barring the upper limit. Instead of having the same code twice - introduce a helper and pass the limit as an argument. Thus as Android/DRM/others get support - they only need to call the function ;-) v2: Rebase on top of keeping ::swap_available Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)	2017-08-10 19:23:31 +01:00
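
The message for `26f4657c3f` says ralloc_str_append() avoids the strlen() calls that ralloc_strcat() and ralloc_strncat() make. The idea can be sketched in plain C as below. This is only an illustration using realloc(): the helper name `str_append_known_len` and its signature are hypothetical and are not Mesa's ralloc API. The caller tracks the current length, so an append never rescans the existing string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Mesa): append src, whose length src_len
 * is already known, to the heap string *dest, whose current length *len is
 * tracked by the caller. No strlen() is performed on either string.
 * Returns 1 on success and 0 if the allocation fails, in which case *dest
 * and *len are left unchanged. */
static int str_append_known_len(char **dest, size_t *len,
                                const char *src, size_t src_len)
{
    assert(dest != NULL && len != NULL && src != NULL);

    /* realloc(NULL, n) behaves like malloc(n), so *dest may start as NULL. */
    char *grown = realloc(*dest, *len + src_len + 1);
    if (grown == NULL)
        return 0;

    /* Copy straight to the known end of the existing contents. */
    memcpy(grown + *len, src, src_len);
    grown[*len + src_len] = '\0';

    *dest = grown;
    *len += src_len;
    return 1;
}
```

Starting from a NULL string with length 0, appending "foo" and then "bar" (each with length 3) leaves the caller with "foobar" and a tracked length of 6. The cost of each append depends on the appended length plus whatever realloc() has to move, not on a scan of the accumulated string, which is the property the commit message describes for large strings.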


87651 Commits