KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Francisco Jerez	cc3bae5cd7	i965/fs: Introduce helper to extract a field from each channel of a register. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-05-10 11:25:05 +02:00
Connor Abbott	d17cdacba3	i965/fs: always pass the bitsize to brw_type_for_nir_type() v2 (Sam): - Add bitsize to brw_type_for_nir_type() in optimize_extract_to_float() v3 (Sam): - Fix line width (Topi). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:05 +02:00
Connor Abbott	a308bae58f	i965/fs: add support for printing double immediates Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:05 +02:00
Connor Abbott	0f2e227d5c	i965/fs: don't propagate 64-bit immediates They can only be used with 1-src instructions, which practically (since we should've constant-propagated away all 1-src instructions with 64-bit immediates in NIR) means that they must be kept in separate MOV's and can't be propagated. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:05 +02:00
Connor Abbott	0f1690fd95	i965/fs: use the NIR bit size when creating registers v2 (Iago): - Squashed bits from 'support double precission constant operands for the implementation of 64-bit emit_load_const'. - Do not use BRW_REGISTER_TYPE_D for all 32-bit registers since that breaks asserts and functionality for some piglit tests. Just keep 32-bit types untouched and add 64-bit support. - Use DF instead of Q for 64-bit registers. Otherwise the code we generate will use Q sometimes and DF others and we hit unwanted DF/Q conversions, so always use DF. v3 (Sam): - Mark 'reg_type' occurrences as const (Topi). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Tapani Palli <tapani.palli@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:04 +02:00
Connor Abbott	76de7af8e2	i965: fixup uniform setup for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:04 +02:00
Iago Toral Quiroga	3210870b34	i965: two-argument instructions can only use 32-bit immediates Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:04 +02:00
Iago Toral Quiroga	3d10adf603	i965: fix brw_abs_immediate() for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:04 +02:00
Iago Toral Quiroga	830d87840c	i965: fix brw_saturate_immediate() for doubles v2 (Sam): - Mark 'size' as const (Topi). - Add comment to explain that we do copies 64-bits regardless of the type (Topi) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:03 +02:00
Connor Abbott	7bcc4cccad	i965: fix is_zero(), is_one() and is_negative_one() for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:03 +02:00
Connor Abbott	2ae409286c	i965: fix brw_negate_immediate() for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:03 +02:00
Connor Abbott	cbf7c7f099	i965/eu: add support for DF immediates v2 (Sam): - Remove 'however' from the comment (Topi) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:03 +02:00
Connor Abbott	c0a1cd24a8	i965: add support for disassembling DF immediates Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:03 +02:00
Connor Abbott	bb175db16b	i965: add support for getting/setting DF immediates Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:03 +02:00
Connor Abbott	5310bca024	i965: add brw_imm_df v2 (Iago) - Fixup accessibility in backend_reg Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Topi Pohjolainen	9add73f641	i965/eu: Allow 3-src float ops with doubles v2: - set 3src_src_type for BRW_REGISTER_TYPE_DF (Connor) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Connor Abbott	367e762a71	i965/disasm: fix disasm of 3-src doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Topi Pohjolainen	45066a6a59	i965: Tell backend register about double precision type Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Topi Pohjolainen	520b3b2fd1	i965: Determine size of double precision float register This is used to determine how many registers an instruction reads and writes as well as for offseting register region into a desired component. v2 (Connor): rebase on master Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Topi Pohjolainen	e88cf0f2d2	i965: Lower DFRACEXP/DLDEXP v2 (Connor): rebase on master which moved this to brw_link.cpp v3 (Sam): - Only enable DFREXP_DLDEXP_TO_ARITH in process_glsl_ir(). This is used for doubles. Single floating point op is lowered by NIR. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Connor Abbott	30424fd25a	i965: use pack/unpackDouble lowering Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:01 +02:00
Connor Abbott	bea2f8beb5	i965: use double lowering pass v2: also lower trunc, ceil, floor, fract and roundEven (Iago) v3: also lower mod for doubles (Sam) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:01 +02:00
Samuel Iglesias Gonsálvez	d00a239b28	freedreno/ir3: lower lrp when operating with double operands Lower lrp when operating with double operands because float version of lrp is also lowered. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:01 +02:00
Samuel Iglesias Gonsálvez	93e690830a	i965: enable lrp lowering for doubles Broadwell and previous generations does not support lrp instruction operating with doubles. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:01 +02:00
Dave Airlie	008feb3687	st/glsl_to_tgsi: brown paper bag for the input offsets fix. Oops, thanks compiler. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 14:41:21 +10:00
Dave Airlie	4d8a71f7f1	glsl: check geometry output vertices limits. This fixes: GL45-CTS.geometry_shader.limits.max_output_vertices Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 14:26:03 +10:00
Dave Airlie	13c68e1447	mesa/vbo: fix check for zero aliases with 2/10/10/10 This fixes: GL33-CTS.gtf33.GL3Tests.vertex_type_2_10_10_10_rev.vertex_type_2_10_10_10_rev_attrib Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 14:24:49 +10:00
Eduardo Lima Mitev	60a5d02416	nir/print: Print memory qualifiers in a variable declaration Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-10 06:22:05 +02:00
Eduardo Lima Mitev	7f7f58f17f	glsl: Apply memory qualifiers to vars inside named block interfaces This is missing and memory qualifiers are currently being ignored for SSBOs. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-10 06:21:55 +02:00
Dave Airlie	f75a26d1ba	st/glsl_to_tgsi: handle offsets from inputs This fixes: GL45-CTS.gpu_shader5.texture_gather_offset_color_repeat Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 13:14:29 +10:00
Rob Clark	aa730aca20	scripts: bump git_reviewer.pl --git-min-percent default Bump up default percentage of commits required to be auto-picked for CC. Seems from a bit of trial-and-error to come up with a more reasonable list of CC's this way. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-09 19:30:28 -04:00
Kenneth Graunke	e034d80fe1	Revert "Revert "i965: Switch to scalar TCS by default."" This reverts commit `bd326c229c`. Now that we've fixed the GPU hangs, let's turn it back on. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-09 16:20:27 -07:00
Kenneth Graunke	5ce405ba0f	i965: Actually assign binding table offsets for the TCS. As far as I can tell, this was just entirely missing...honestly, I'm not sure how anything worked at all. Caught by noticing GPU hangs in image load store tests with scalar TCS, but probably has broader implications. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-09 16:20:18 -07:00
Kenneth Graunke	e0e7280db0	i965: Clamp "Maximum VP Index" to 1 when gl_ViewportIndex isn't written. fs_visitor::emit_urb_writes skips writing the VUE header for shaders that don't write gl_PointSize, gl_Layer, or gl_ViewportIndex. This leaves their values uninitialized. Kristian's nearby comment says: "But often none of the special varyings that live there are written and in that case we can skip writing to the vue header, provided the corresponding state properly clamps the values further down the pipeline." However, we were clamping gl_ViewportIndex to [0, 15], so we would end up using a random viewport. To fix this, detect when the shader doesn't write gl_ViewportIndex, and clamp it to [0, 0]. The vec4 backend always writes zeros to the VUE header, so it doesn't suffer from this problem. With vec4-style HWord writes, we can write the header and position together in a single message. In the FS world, we would need 4 extra MOVs of 0 and a longer message, or a separate OWord write. It's likely cheaper to just clamp the value. Fixes DiRT Showdown and Bioshock Infinite, which only rendered half of the screen - the lower left of two triangles. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93054 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-09 15:31:27 -07:00
Jordan Justen	e74812dbfe	i965/hsw: Fix brw_store_data_imm* For Gen6 through Haswell dword 1 is MBZ. In gen 8 it becomes part of the 64-bit address. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-09 15:05:08 -07:00
Kenneth Graunke	96d43f2d08	i965: Reimplement ARB_transform_feedback2 on Haswell and later. My old implementation accumulated <start, end> pairs in a buffer, and eventually processed that data on the CPU. This meant flushing the batchbuffer and waiting for it to completely execute before we could map it, resulting in really long stalls. We could also run out of space in the buffer, and have to do this early. Instead, we can use Haswell's MI_MATH command to do the (end - start) subtraction, as well as the multiplication by 2 or 3 to convert from the number of primitives written to the number of vertices written. We still need to CS stall to read the counters, but otherwise everything is completely pipelined - there's no CPU<->GPU synchronization required. It also uses only 80 bytes in the buffer, no matter what. Improves performance in Manhattan on Skylake GT3e at 800x600 by 6.1086% +/- 0.954166% (n=9). At 1920x1080, improves performance by 2.82103% +/- 0.148596% (n=84). v2: Fix number of primitives -> number of vertices calculation for GL_TRIANGLES (I was multiplying by 4 instead of 3.) Caught by Jordan Justen. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-09 15:00:01 -07:00
Kenneth Graunke	fdb6c1887f	i965: Add a brw_load_register_reg64 helper. It appears that we can't do this in a single command (like we do for MI_LOAD_REGISTER_IMM) - the Skylake simulator gets rather grumpy about the command length if I try to combine them. No matter. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-09 15:00:01 -07:00
Kenneth Graunke	4c71c8a74a	i965: Only enable ARB_query_buffer_object for newer kernels on Haswell. On Haswell, we need version 6 of the kernel command parser in order to write the math registers. Our implementation of ARB_query_buffer_object heavily relies on MI_MATH, so we should only advertise it when MI_MATH is available. We also need MI_LOAD_REGISTER_REG, which requires version 7 of the command parser. To make these checks easier, introduce a screen->has_mi_math_and_lrr flag that will be set when both commands are supported. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-09 14:59:58 -07:00
Dave Airlie	2d41eb313f	mesa/objectlabel: don't return info on genned but never bound textures. This fixes some cases in the CTS KHR debug tests where it uses glIsTexture to find an invalid ID and then call GetObjectLabel. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 06:06:09 +10:00
Dave Airlie	bbc6a27590	mesa: don't use genned but unnamed xfb objects. If we try to draw or query an XFB object that hasn't been bound, we shouldn't return any information. This fixes a couple if cases in: GL33-CTS.transform_feedback.api_errors_test The ObjectLabel test is inspired by another test. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 06:06:09 +10:00
Samuel Pitoiset	eafe3905d9	nv50/ir: silence unsupported TGSI_PROPERTY_CS_FIXED_BLOCK_* We don't need them for compute shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-09 21:58:56 +02:00
Jordan Justen	2e2aa992ff	mesa/compute: Fix indirect dispatch buffer size check on 32-bit systems `2655265fcb`, but for compute. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-09 11:16:39 -07:00
Rob Clark	57763ee735	freedreno/ir3: fix fallout from new block iterators Since this is potentially modifying the block structure of the shader, it needs the _safe() version of the iterator. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-09 13:52:29 -04:00
Nicolai Hähnle	fe102f7677	radeonsi: workaround for tesselation on SI We request more than 32KB of LDS here, which SI doesn't have. Since LLVM recently started checking the size of declared LDS allocations, all shaders involved in tesselation fail to compile on SI. Note that the entire calculation here seems wrong, given how we calculate indices for generic attributes, so the number ends up wrong on CI+ as well. A proper solution is clearly needed, but this patch should serve as a band-aid for SI in the meantime. Also note that the real size of the LDS allocation in hardware is independent from what we tell LLVM, so this is really more of a "cosmetic" change. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95198 Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-09 11:52:46 -05:00
Nicolai Hähnle	d8f3e8e626	radeonsi: always allocate export memory for pixel shaders Experiments with framebuffer-no-attachments type draw calls have shown that NULL exports stall terribly unless we ensure that export memory is allocated by the SPI. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-09 11:52:46 -05:00
Nicolai Hähnle	ad1782cfb5	radeonsi: expose performance counters as 64 bit This is useful for shader-related counters, since they tend to quickly exceed 32 bits. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-09 11:52:46 -05:00
Rob Clark	f096096b77	nir/search: fix typo Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-09 12:46:24 -04:00
Tim Rowley	b65f7ec450	gallium: enable intel jitevents profiling LLVM when configured with "intel jitevents" enabled can inform VTune about dynamic code, so individual shaders are attributed profiling data and the resulting assembly can be examined. Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-05-09 11:25:02 -05:00
Bruce Cherniak	0062c5f09b	swr: Add missing break in query switch statement. Missed a switch break in query stat collection when refactoring queries. Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2016-05-09 11:21:47 -05:00
Rob Clark	f33083a216	freedreno/ir3: allow for additional VS sysval inputs There are a total of four possible currently, rather than 2. So we need to be prepared for the input array to grow by 16 components. We could get away with less if we could pack sysval inputs.. and the way this is handled currently isn't really the nicest thing. But it's a tactical fix for an issue hit in: GL31-CTS.gtf30.GL3Tests.transform_feedback.transform_feedback_vertex_id Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-09 11:51:59 -04:00

81381 Commits
