KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Matt Turner	e2344e11ce	i965/fs: Trim unneeded channels in SampleID setup. The AND and SHR produce a scalar value that we had been replicating across $dispatch_width channels. The immediate MOV produces only four useful channels of data. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-10-22 12:26:54 -07:00
Matt Turner	e10fc055e7	i965/fs: Use type-W for immediate in SampleID setup. Not a functional difference, but register is loaded with a signed immediate (V) and added to a signed type (D) producing a signed result (D). Also change the type of g0 to allow for compaction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-10-22 12:26:49 -07:00
Matt Turner	cfb67c3d06	i965/vec4: Initialize LOD to 0.0f for textureQueryLevels() and texture(). We implement textureQueryLevels (which takes no arguments, save the sampler) using the resinfo message (which takes an argument of LOD). Without initializing it, we'd generate a MOV from the null register to load the LOD argument. Essentially the same logic applies to texture. A vertex shader cannot compute derivatives and so cannot produce an LOD, so TXL with an LOD of 0.0 is used. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-22 10:16:52 -07:00
Matt Turner	65ffaf2740	i965: Note that the UV immediate type is Gen6+.	2015-10-22 10:16:52 -07:00
Jose Fonseca	718249843b	gallivm: Translate all util_cpu_caps bits to LLVM attributes. This should prevent disparity between features Mesa and LLVM believe are supported by the CPU. http://lists.freedesktop.org/archives/mesa-dev/2015-October/thread.html#96990 Tested on a i7-3720QM w/ LLVM 3.3 and 3.6. v2: Increase SmallVector initial size as suggested by Gustaw Smolarczyk. Reviewed-by: Roland Scheidegger <sroland@vmware.com> CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-10-22 11:11:40 +01:00
Jordan Justen	627c15cde4	i965/fs: Disable CSE optimization for untyped & typed surface reads. An untyped surface read is volatile because it might be affected by a write. In the ES31-CTS.compute_shader.resources-max test, two back to back read/modify/writes of an SSBO variable looked something like this: r1 = untyped_surface_read(ssbo_float); r2 = r1 + 1; untyped_surface_write(ssbo_float, r2); r3 = untyped_surface_read(ssbo_float); r4 = r3 + 1; untyped_surface_write(ssbo_float, r4). And after CSE, we had: r1 = untyped_surface_read(ssbo_float); r2 = r1 + 1; untyped_surface_write(ssbo_float, r2); r4 = r1 + 1; untyped_surface_write(ssbo_float, r4). Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-22 00:36:37 -07:00
Chia-I Wu	13a5805b64	ilo: make sure there is HiZ before resolving We do not want to perform a depth resolve on an MCS enabled surface.	2015-10-22 14:06:21 +08:00
Chia-I Wu	0b6f6ee50f	ilo: fix max thread count for HS on Gen8 It is in DW2 on Gen8.	2015-10-22 14:06:21 +08:00
Ben Widawsky	8eefdacb38	i965: Advertise ARB_shader_stencil_export (gen9+) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 21:14:44 -07:00
Ben Widawsky	1db44252d0	i965: Implement ARB_shader_stencil_export (gen9+) v2: remove useless source_stencil_to_render_target (Ken) Squash in the actual packing function, which also got to v2: Move the definition of the OPCODE outside of FB_WRITE opcodes (Matt) Reorder the regioning to be in VWH order (Matt) Don't retype src in the backend, just assert instead (Matt) Rename the debug prints to something better (Matt) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 21:14:44 -07:00
Ben Widawsky	5fa7114652	i965/fs: Enumerate logical fb writes arguments Gen9 adds the ability to write out a stencil value, so we need to expand the virtual payload by one. Abstracting this now makes that change easier to read. I was admittedly confused early on about some of the hardcoding. If people believe the resulting code is inferior, I am not super attached to the patch. v2: Remove explicit numbering from the enumeration (Matt). Use a real naming scheme, and reference it in the opcode definition (Curro) Add a missed hardcoded logical position in get_lowered_simd_width (Ben) Add an assertion to make sure the component numbering is correct (Ben) Cc: Matt Turner <mattst88@gmail.com> Cc: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 21:14:44 -07:00
Brian Paul	18a631eb90	svga: fix clip plane regression after recent tgsi_scan change Before the change "tgsi/scan: use properties for clip/cull distance writemasks", the tgsi_shader_info::num_written_clipdistance field was a multiple of four, now it's an accurate count. In the svga driver, we need a minor change to the loop test. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-10-21 17:12:19 -06:00
Kenneth Graunke	48c76eae8e	i965: Implement gl_InvocationID. It's stored in bits 31:27 of g1 (along with the URB handles). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:58 -07:00
Kenneth Graunke	c5ae34f38f	i965: Implement nir_intrinsic_load_primitive. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:56 -07:00
Kenneth Graunke	b3ebf03b84	i965: Add a fs_visitor constructor that takes a brw_gs_compile. Unlike the vs/wm structs, brw_gs_compile is actually useful: it contains the input VUE map and information about the control data headers. Passing this in allows us to share that code in brw_gs.c, and calculate them before deciding on vec4 vs. scalar mode, as it's independent of that choice. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:54 -07:00
Kenneth Graunke	55dfd39b5f	i965: Add a brw->scalar_gs flag controlled by INTEL_SCALAR_GS=1. This patch introduces a brw->scalar_gs flag, similar to brw->scalar_vs, which controls whether or not to use SIMD8 geometry shaders. For now, we control it via a new environment variable, INTEL_SCALAR_GS. This provides a convenient way to try it out. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:53 -07:00
Kenneth Graunke	ac0a33666b	i965: Make emit_urb_writes() reserve space for GS header information. Geometry shaders have additional header data at the beginning of their output URB entries. Shaders that use EndPrimitive() or multiple streams have a control data header; shaders with a dynamic vertex count have an additional vec4 slot to hold the 32-bit vertex count (and 96 bits of padding). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:52 -07:00
Kenneth Graunke	cb755996d9	i965: Make emit_urb_writes() only set EOT for the VS. The GS will emit a bunch of vertices, and we don't want to do an EOT prematurely. We'll emit GS_OPCODE_THREAD_END when we want to terminate the thread. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:50 -07:00
Kenneth Graunke	6ae419b94d	i965: Make fs_visitor::emit_urb_writes reusable for scalar GS. GS doesn't have ClampVertexColor, and we don't want to go through VS structures. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:49 -07:00
Kenneth Graunke	72d84ae7ce	i965: Introduce a brw_vue_prog_data::include_vue_handles flag. Tessellation shaders and SIMD8 geometry shaders may need to resort to the pull model for inputs at times. When set, the state upload code will tell the hardware to provide URB handles for input data. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:48 -07:00
Kenneth Graunke	ac98888afd	i965: Introduce a new SHADER_OPCODE_URB_READ_SIMD8 opcode. In scalar mode, geometry shader inputs can easily take up hundreds of registers. This makes pushing VUE entries impractical; we'll need to resort to the pull model in some cases. To support this, we introduce a new opcode corresponding to the "URB Read SIMD8" message. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:46 -07:00
Kenneth Graunke	bea7522782	i965: Introduce new SHADER_OPCODE_URB_WRITE_SIMD8_MASKED/PER_SLOT opcodes. In the vec4 backend, we have a vec4_instruction::urb_write_flags field. There are many kinds of flags for SIMD4x2 messages. However, there are really only two (per-slot offset, use channel masks) for SIMD8 messages. Rather than adding a boolean flag for per-slot offsets (polluting all instructions), I decided to just make three new opcodes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:41 -07:00
Jason Ekstrand	0e57694745	i965/gs: Do prog_data setup and other calculations in brw_compile_gs This commit moves the large pile of setup calculations we have to do for geometry shaders out of brw_gs_emit and into brw_compile_gs. This has a couple of nice implications. First, it's less work that the caller of brw_compile_gs has to do. Second, it's consistent with the vertex and fragment stages. Finally, it allows us to put brw_gs_compile back behind the API boundary where it belongs. v2 (Jason Ekstrand): - Pull the changes to use nir info into a separate patch - Put brw_gs_compile into brw_shader.h rather than brw_vec4_gs_visitor.h so that we can use it for scalar GS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 14:20:32 -07:00
Jason Ekstrand	f3bc73073a	i965/gs: Use NIR info for setting up prog_data Previously, we were pulling bits from GL data structures in order to set up the prog_data. However, in this brave new world of NIR, we want to be pulling it out of the NIR shader whenever possible. This way, we can move all this setup code into brw_compile_gs without depending on the old GL stuff. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 14:20:32 -07:00
Jason Ekstrand	fac9b21e03	i965/gs: Pull prog_data out of brw_gs_compile Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 14:20:32 -07:00
Jason Ekstrand	6ac2bbec16	i965/gs: Use NIR instead of the brw_geometry_program for GS metadata With this, we can remove the geometry program from brw_gs_compile. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 14:20:32 -07:00
Jason Ekstrand	72148de217	i965/gs: Move the mem_ctx argument to brw_compile_gs This makes it better match the other brw_compile_* functions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 14:20:32 -07:00
Jason Ekstrand	8e8b527b27	i965/gs: Set static_vertex_count unconditionally on GEN8+ We always have NIR, so there's no reason for the check. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 14:20:32 -07:00
Jason Ekstrand	2686477d37	nir: Constify nir_gs_count_vertices Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 14:20:32 -07:00
Jason Ekstrand	4eb84a03be	nir/info: Add more information about geometry shaders Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 14:20:32 -07:00
Ben Widawsky	3c5d24363a	i965: (trivial) rename computes stencil to gen9 All the documentation I can find says that this bit (and functionality) only exists on SKL+. Since the bit isn't yet used, there is no real impact here. The original code was added by Ken here (a surprisingly long time ago): commit `f3c6d6f1e1` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Thu Nov 29 21:00:27 2012 -0800 i965: Update 3DSTATE_PS, 3DSTATE_WM, and add 3DSTATE_PS_EXTRA. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 11:00:03 -07:00
Ben Widawsky	c643518452	i965: Correct the comment about fb write payload Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-10-21 11:00:00 -07:00
Nanley Chery	f1147a238a	mesa/glformats: Undo code changes from _mesa_base_tex_format() move The refactoring commit, `c6bf1cd`, accidentally reverted `cd49b97` and `99b1f47`. These changes caused more code to be added to the function and removed the existing support for ASTC. This patch reverts those modifications. v2. Actually include ASTC support again. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92221 Cc: "11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-10-21 10:36:31 -07:00
Matt Turner	2ce659b5e4	i965: Mark compacted 3-src instructions as Gen8+. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-21 10:17:38 -07:00
Matt Turner	05cc56cca3	i965: Add const to brw_compact_inst_bits. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-21 10:17:38 -07:00
Matt Turner	b29f92daec	i965: Add mask_control_ex field and handle it in compaction. Documentation is sparse, but it appears to have existed on G45 and ILK as a second bit extension of the mask_control field. Setting the pair of bits to 0b11 enables "NoCMask". Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-21 10:17:38 -07:00
Matt Turner	3ec9d96d43	i965: Add devinfo->gen assertions for acc_wr_control. ... and for flag_subreg_nr since it's right near by. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-21 10:17:38 -07:00
Matt Turner	d14907b946	i965: Prepare for next commit by adding more whitespace. We're going to add a field with a longer name that wouldn't align with the rest. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-21 10:17:38 -07:00
Matt Turner	35f3f06c8a	i965: Compact acc_wr_control only on Gen6+. It only exists on Gen6+, and the next patches will add compaction support for the (unused) field in the same location on earlier platforms. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-21 10:17:37 -07:00
Matt Turner	ee868c46e8	i965: Add devinfo parameter to brw_compact_inst_* funcs. The next commit will add assertions dependent on devinfo->gen. Use compact()/uncompact() macros where possible, like the 3-src code does. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-21 10:17:37 -07:00
Matt Turner	4a132349c3	i965/vec4: Don't emit MOVs for unused URB slots. Otherwise we'd emit a MOV from the null register (which isn't allowed). Helps 24 programs in shader-db (the geometry shaders in GSCloth): instructions in affected programs: 302 -> 262 (-13.25%) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-21 10:17:37 -07:00
Nigel Stewart	04703762e5	osmesa: Expose GL entry points for Windows build via DEF file. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92437 CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jose Fonseca <jfonseca@vmware.com>	2015-10-21 14:06:58 +01:00
Jonathan Gray	99c4079c37	configure.ac: ensure RM is set GNU make predefines RM to rm -f but this is not required by POSIX so ensure that RM is set. This fixes "make clean" on OpenBSD. v2: use AC_CHECK_PROG Signed-off-by: Jonathan Gray <jsg@jsg.id.au> CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-10-21 14:09:38 +01:00
Neil Roberts	ee77796a5c	i965/fs: Disable opt_sampler_eot for more message types In `bfdae9149e` I disabled the opt_sampler_eot optimisation for TG4 message types because I found by experimentation that it doesn't work. I wrote in the comment that I couldn't find any documentation for this problem. However I've now found the documentation and it has additional restrictions on further message types so this patch updates the comment and adds the others. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-21 11:08:37 +02:00
Neil Roberts	801f151917	i965: Remove block arg from foreach_inst_in_block_*_starting_from Since `49374fab5d` these macros no longer actually use the block argument. I think this is worth doing to make the macros easier to use because they already have really long names and a confusing set of arguments. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-21 11:07:04 +02:00
Timothy Arceri	38ceeeadaa	glsl: check for arrays of arrays when assigning explicit locations This fixes assigning explicit locations in the CTS test: ES31-CTS.explicit_uniform_location.uniform-loc-arrays-of-arrays Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-21 15:49:32 +11:00
Timothy Arceri	9a04057ef1	glsl: add is_array_of_arrays() helper As suggested by Ian Romanick Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-21 15:49:17 +11:00
Kenneth Graunke	156b7d3113	glsl: Fix bad indentation in bit_logic_result_type(). The first level of indentation was using 4 spaces. Mesa uses 3. Trivial. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-20 21:25:11 -07:00
Timothy Arceri	fd01840c0b	glsl: add AoA support to subroutines process_parameters() will now be called earlier because we need actual_parameters processed earlier so we can use it with match_subroutine_by_name() to get the subroutine variable, we need to do this inside the recursive function generate_array_index() because we can't create the ir_dereference_array() until we have gotten to the outermost array. For the remainder of the array dimensions the type doesn't matter so we can just use the existing _mesa_ast_array_index_to_hir() function to process the ast. Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-21 14:56:57 +11:00
Tapani Pälli	a59c1adcc6	glsl: fix record type detection in explicit location assign Check current_var directly instead of using the passed in record_type. This fixes following failing CTS test: ES31-CTS.explicit_uniform_location.uniform-loc-types-structs No Piglit regressions. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-21 06:12:15 +03:00
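
Note on 627c15cde4 (CSE and surface reads): the hazard described in that commit message can be reproduced with a toy model. The sketch below is not Mesa code. `Inst`, `VOLATILE_OPS`, `local_cse`, and the `skip_volatile` switch are hypothetical names invented for illustration, and the pass is a deliberately simple local common-subexpression elimination over a straight-line instruction list. It only shows why a read that may observe an intervening write must not be treated as a reusable expression.

```python
from typing import NamedTuple, Optional, Tuple

class Inst(NamedTuple):
    dst: Optional[str]      # destination register, or None for stores
    op: str                 # opcode name (illustrative only)
    srcs: Tuple[str, ...]   # source operands

# Assumption for this sketch: these opcodes may observe earlier writes.
VOLATILE_OPS = {"untyped_surface_read", "typed_surface_read"}

def local_cse(insts, skip_volatile):
    """Toy local CSE: reuse an earlier result for an identical (op, srcs)."""
    available = {}  # (op, srcs) -> register already holding that value
    renames = {}    # eliminated register -> surviving register
    out = []
    for inst in insts:
        # Rewrite sources that refer to registers removed earlier.
        srcs = tuple(renames.get(s, s) for s in inst.srcs)
        key = (inst.op, srcs)
        eligible = inst.dst is not None and not (
            skip_volatile and inst.op in VOLATILE_OPS
        )
        if eligible and key in available:
            renames[inst.dst] = available[key]  # drop the duplicate
            continue
        if eligible:
            available[key] = inst.dst
        out.append(Inst(inst.dst, inst.op, srcs))
    return out

# The two back to back read/modify/write sequences from the commit message.
PROGRAM = [
    Inst("r1", "untyped_surface_read", ("ssbo_float",)),
    Inst("r2", "add", ("r1", "1")),
    Inst(None, "untyped_surface_write", ("ssbo_float", "r2")),
    Inst("r3", "untyped_surface_read", ("ssbo_float",)),
    Inst("r4", "add", ("r3", "1")),
    Inst(None, "untyped_surface_write", ("ssbo_float", "r4")),
]

def count_reads(insts):
    return sum(i.op == "untyped_surface_read" for i in insts)

if __name__ == "__main__":
    for flag in (False, True):
        result = local_cse(PROGRAM, skip_volatile=flag)
        print(f"skip_volatile={flag}: {count_reads(result)} read(s)")
        for i in result:
            print("   ", i)
```

With `skip_volatile=False` the second read is merged into the first, so the second write stores a value derived from stale data. This toy pass is more aggressive than the listing in the commit message: it also folds the second add, whereas the message shows `r4 = r1 + 1` surviving. The underlying problem is the same in both cases. With `skip_volatile=True` the program is left unchanged.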

73804 Commits

All Branches