KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Francisco Jerez	d3c10ad427	i965/fs: Migrate shader time to the IR builder. v2: Change null register destination type to UD so it can be compacted. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:33 +03:00
Francisco Jerez	35e64f2a76	i965/fs: Migrate untyped surface read and atomic to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:33 +03:00
Francisco Jerez	db83d9d2d0	i965/fs: Migrate texturing implementation to the IR builder. v2: Remove tabs from modified lines. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:33 +03:00
Francisco Jerez	546839ef63	i965/fs: Migrate pull constant loads to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:32 +03:00
Francisco Jerez	8f626c1498	i965/fs: Migrate Gen4 send dependency workarounds to the IR builder. v2: Change brw_null_reg() to bld.null_reg_f(). Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:32 +03:00
Francisco Jerez	4af4cfba9e	i965/fs: Migrate lower_integer_multiplication to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:32 +03:00
Francisco Jerez	efa60e49f2	i965/fs: Migrate lower_load_payload to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:32 +03:00
Francisco Jerez	8f8c6b7bda	i965/fs: Migrate register spills and fills to the IR builder. Yes, it's incorrect to use the 0-th channel enable group unconditionally without considering the execution and regioning controls of the instruction that uses the spilled value, but it matches the previous behaviour exactly, the builder just makes the preexisting problem more obvious because emitting an instruction of non-native SIMD width without having called .group() or .exec_all() explicitly would have led to an assertion failure. I'll fix the problem in a follow-up series, as the solution is going to be non-trivial. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:32 +03:00
Francisco Jerez	3e6ac0bced	i965/fs: Migrate try_replace_with_sel to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:32 +03:00
Francisco Jerez	6114ba4dcc	i965/fs: Migrate opt_sampler_eot to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:32 +03:00
Francisco Jerez	a800ec04ad	i965/fs: Migrate opt_peephole_sel to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:32 +03:00
Francisco Jerez	78f7c9edeb	i965/fs: Create and emit instructions in one step in opt_peephole_sel. This simplifies opt_peephole_sel() slightly by emitting the SEL instructions immediately after they are created, which makes the sel_inst and mov_imm_inst arrays unnecessary and will make it possible to get rid of the explicit inserts when the pass is migrated to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:32 +03:00
Francisco Jerez	74c2458ecf	i965/fs: Migrate opt_cse to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:32 +03:00
Francisco Jerez	e7069fbc70	i965/fs: Don't drop force_writemask_all and _sechalf when copying a CSE temporary. LOAD_PAYLOAD instructions need the same treatment as any other generator instructions, at least FB writes and typed surface messages will need a payload built with non-zero execution controls. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:31 +03:00
Francisco Jerez	497d238ae7	i965/vec4: Take into account all instruction fields in CSE instructions_match(). Most of these fields affect the behaviour of the instruction, but apparently we currently don't CSE the kind of instructions for which these fields could make a difference in the VEC4 back-end. That's likely to change soon though when we start using send-from-GRF for texture sampling and surface access messages. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:31 +03:00
Francisco Jerez	8013b8147a	i965/fs: Take into account all instruction fields in CSE instructions_match(). Most of these fields affect the behaviour of the instruction so it could actually break the program if we CSE a pair of otherwise matching instructions with different values of these fields. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:31 +03:00
Francisco Jerez	d86c2e6e53	i965/fs: Migrate opt_peephole_predicated_break to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:31 +03:00
Francisco Jerez	35e5f118a5	i965/fs: Migrate opt_combine_constants to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:31 +03:00
Francisco Jerez	e04b4156a7	i965/fs: Allocate a common IR builder object in fs_visitor. v2: Call fs_builder::at_end() to point the builder at the end of the program explicitly. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:18:31 +03:00
Francisco Jerez	8ea8f83c8f	i965/fs: Introduce FS IR builder. The purpose of this change is threefold: First, it improves the modularity of the compiler back-end by separating the functionality required to construct an i965 IR program from the rest of the visitor god-object, which in turn will reduce the coupling between other components and the visitor allowing a more modular design. This patch doesn't yet remove the equivalent functionality from the visitor classes, as it involves major back-end surgery. Second, it improves consistency between the scalar and vector back-ends. The FS and VEC4 builders can both be used to generate scalar code with a compatible interface or they can be used to generate natural vector width code -- 1 or 4 components respectively. Third, the approach to IR construction is somewhat different to what the visitor classes currently do. All parameters affecting code generation (execution size, half control, point in the program where new instructions are inserted, etc.) are encapsulated in a stand-alone object rather than being quasi-global state (yes, anything defined in one of the visitor classes is effectively global due to the tight coupling with virtually everything else in the compiler back-end). This object is lightweight and can be copied, mutated and passed around, making helper IR-building functions more flexible because they can now simply take a builder object as argument and will inherit its IR generation properties in exactly the same way that a discrete instruction would from the same builder object. The emit_typed_write() function from my image-load-store branch is an example that illustrates the usefulness of the latter point: Due to hardware limitations the function may have to split the untyped surface message in 8-wide chunks. That means that the several functions called to help with the construction of the message payload are themselves required to set the execution width and half control correctly on the instructions they emit, and to allocate all registers with half the default width. With the previous approach this would require the used helper functions to be aware of the parameters that might differ from the default state and explicitly set the instruction bits accordingly. With the new approach they would get a modified builder object as argument that would influence all instructions emitted by the helper function as if it were the default state. Another example is the fs_visitor::VARYING_PULL_CONSTANT_LOAD() method. It doesn't actually emit any instructions, they are simply created and inserted into an exec_list which is returned for the caller to emit at some location of the program. This sort of two-step emission becomes unnecessary with the builder interface because the insertion point is one more of the code generation parameters which are part of the builder object. The caller can simply pass VARYING_PULL_CONSTANT_LOAD() a modified builder object pointing at the location of the program where the effect of the constant load is desired. This two-step emission (which pervades the compiler back-end and is in most cases redundant) goes away: E.g. ADD() now actually adds two registers rather than just creating an ADD instruction in memory, emit(ADD()) is no longer necessary. v2: Drop scalarizing VEC4 builder. v3: Take a backend_shader as constructor argument. Improve handling of debug annotations and execution control flags. v4: Drop Gen6 IF with inline comparison. Rename "instr" variable. Initialize cursor to NULL by default and add method to explicitly point the builder at the end of the program. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-09 15:07:18 +03:00
Francisco Jerez	6e04065729	i965: Define consistent interface to enable instruction result saturation. v2: Use set_ prefix. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-06-09 13:56:06 +03:00
Francisco Jerez	7624f8410f	i965: Define consistent interface to enable instruction conditional modifiers. v2: Use set_ prefix. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-06-09 13:56:06 +03:00
Francisco Jerez	239dfc5410	i965: Define consistent interface to predicate an instruction. v2: Use set_ prefix. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-06-09 13:56:06 +03:00
Francisco Jerez	f9367191b3	mesa: Drop include of simple_list.h from mtypes.h. simple_list.h defines a number of macros with short non-namespaced names that can easily collide with other declarations (first_elem, last_elem, next_elem, prev_elem, at_end), and according to the comment it was only being included because of struct simple_node, which is no longer used in this file. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-06-09 13:56:06 +03:00
Francisco Jerez	277b94f172	dri/nouveau: Include simple_list.h explicitly in nv*_state_tnl.c. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-06-09 13:56:06 +03:00
Francisco Jerez	7065c8153b	tnl: Include simple_list.h explicitly in t_context.c. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-06-09 13:56:06 +03:00
Francisco Jerez	08a1046f67	mesa: Include simple_list.h explicitly in errors.c. This seems to be the only user of simple_list in core mesa not including the header explicitly. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-06-09 13:56:05 +03:00
Dave Airlie	f7aad9da20	mesa/teximage: use correct extension for accept stencil texture. This was using the wrong extension, ARB_stencil_texturing doesn't mention any changes in this area. Fixes "dEQP-GLES3.functional.fbo.completeness.renderable.texture. stencil.stencil_index8." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90751 Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-06-08 15:47:09 -07:00
Anuj Phogat	556b2fbd24	i965: Make a helper function intel_miptree_set_total_width_height() and some more code refactoring. No functional changes in this patch. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-06-08 13:57:11 -07:00
Anuj Phogat	9111377978	i965/gen9: Set vertical alignment for the miptree v3: Use ffs() and a switch loop in tr_mode_horizontal_texture_alignment() (Ben) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-06-08 13:57:11 -07:00
Anuj Phogat	447410b664	i965/gen9: Set horizontal alignment for the miptree v3: Use ffs() and a switch loop in tr_mode_vertical_texture_alignment() (Ben) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-06-08 13:57:11 -07:00
Anuj Phogat	126078faca	i965/gen9: Set tiled resource mode for the miptree Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-06-08 13:57:11 -07:00
Anuj Phogat	ef6b9985ea	i965: Pass miptree pointer as function parameter in intel_vertical_texture_alignment_unit Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-06-08 13:57:11 -07:00
Anuj Phogat	9edac38f2a	i965: Move intel_miptree_choose_tiling() to brw_tex_layout.c and change the name to brw_miptree_choose_tiling(). V3: Remove redundant function parameters. (Topi) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-06-08 13:57:11 -07:00
Anuj Phogat	2cbe730ac5	i965: Choose tiling in brw_miptree_layout() function This refactoring is required by later patches in this series. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-06-08 13:57:11 -07:00
Ben Widawsky	4f2f5c8d81	i965: Disallow saturation for MACH operations. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2015-06-08 12:43:28 -07:00
Chris Wilson	922c0c9fd5	i965: Export format comparison for blitting between miptrees Since the introduction of commit 536003c11e4cb1172c540932ce3cce06f03bf44e Author: Boyan Ding <boyan.j.ding@gmail.com> Date: Wed Mar 25 19:36:54 2015 +0800 i965: Add XRGB8888 format to intel_screen_make_configs winsys buffers no longer have an alpha channel. This causes _mesa_format_matches_format_and_type() to reject previously working BGRA uploads from using the BLT fast path. Instead of using the generic routine for matching formats exactly, export the slightly more relaxed check from intel_miptree_blit() which importantly allows the blitter routine to apply a small number of format conversions. References: https://bugs.freedesktop.org/show_bug.cgi?id=90839 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Alexander Monakov <amonakov@gmail.com> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2015-06-08 17:56:14 +01:00
Chris Wilson	c2d0606827	i915: Blit RGBX<->RGBA drawpixels The blitter already has code to accommodate filling in the alpha channel for BGRX destination formats, so expand this to also allow filling the alpha channel in RGBX formats. More importantly for the next patch is moving the test into its own function for the purpose of exporting the check to the callers. v2: Fix alpha expansion as spotted by Alexander with the fix suggested by Kenneth Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Alexander Monakov <amonakov@gmail.com> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2015-06-08 17:56:10 +01:00
Chris Wilson	8da79b8378	i965: Fix HW blitter pitch limits The BLT pitch is specified in bytes for linear surfaces and in dwords for tiled surfaces. In both cases the programmable limit is 32,767, so adjust the check to compensate for the effect of tiling. v2: Tweak whitespace for functions (Kenneth) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2015-06-08 17:55:56 +01:00
Martin Peres	8614b9e489	softpipe/query: force parenthesis around a logical not This makes GCC5 happy. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-06-08 12:38:08 +03:00
Martin Peres	184e4de3a1	main/version: make sure all the output variables get set in get_gl_override This fixes 2 warnings in gcc 5.1. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-06-08 12:37:42 +03:00
Michel Dänzer	56e38edc96	radeonsi: Add CIK SDMA support Based on the corresponding SI support. Same as that, this is currently only enabled for one-dimensional buffer copies due to issues with multi-dimensional SDMA copies. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-06-08 18:13:22 +09:00
Michel Dänzer	79f2acb8f8	r600g,radeonsi: Assert that there's enough space after flushing Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-06-08 18:10:35 +09:00
Emil Velikov	9538902c4f	docs: add news item and link release notes for mesa 10.5.7 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-06-07 13:44:37 +01:00
Emil Velikov	f7db7fe6ea	docs: Add sha256sums for the 10.5.7 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit eb3a704bb0008c1d046abae31dcb0b2b980c66b1)	2015-06-07 13:42:48 +01:00
Emil Velikov	56efe81ab1	Add release notes for the 10.5.7 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit 495bcbc48cf4e7cee0f2de11c1166a1fd6eb3969)	2015-06-07 13:42:46 +01:00
Kenneth Graunke	7b8f20ec55	prog_to_nir: Fix fragment depth writes. In the ARB_fragment_program specification, the result.depth output variable is treated as a vec4, where the fragment depth is stored in the .z component, and the other three components are undefined. This is different than GLSL, which uses a scalar value (gl_FragDepth). To make this consistent for driver backends, this patch makes prog_to_nir use a scalar output variable for FRAG_RESULT_DEPTH, moving result.depth.z into the first component. Fixes Glean's fragProg1 "Z-write test" subtest. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90000 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-06-06 13:26:10 -07:00
Chris Forbes	52e5ad7bf8	i965: Set max texture buffer size to hardware limit Previously we were leaving this at the default of 64K, which meets the spec but is too small for some real uses. The hardware can handle up to 128M. User was complaining about this on freenode ##OpenGL today. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-06-06 18:40:33 +12:00
Ben Widawsky	b639ed2f1b	i965: Add gen8 fast clear perf debug In an ideal world I would just implement this instead of adding the perf debug. There are some errata involved which lead me to believe it won't be so simple as flipping a few bits. There is room to add a thing for Gen9s flexibility, but since I am actively working on that I have opted to ignore it. Example: Multi-LOD fast clear - giving up (256x128x8). v2: Use braces for if statements because they are multiple lines (Ken) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-06-05 14:25:47 -07:00
Ben Widawsky	77a44512d9	i965: Add buffer sizes to perf debug of fast clears When we cannot do the optimized fast clear it's important to know the buffer size since a small buffer will have much less performance impact. A follow-on patch could restrict printing the message to only certain sizes. Example: Failed to fast clear 1400x1056 depth because of scissors. Possible 5% performance win if avoided. Recommended-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-06-05 14:25:47 -07:00
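
The builder design described in commit 8ea8f83c8f above (code generation parameters held in a small copyable object, with instructions emitted at the moment they are created) can be illustrated with a minimal Python sketch. This is not the Mesa C++ implementation: the `Instruction` and `Builder` classes, their fields, and the `before()` and `emit_payload()` helpers are hypothetical names chosen for illustration. Only the ideas behind `group()`, `exec_all()` and `at_end()` come from the commit messages, and their signatures here are assumptions. Unlike the real builder, this sketch treats a missing cursor as "append at the end".

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(eq=False)  # compare instructions by identity, not by value
class Instruction:
    opcode: str
    exec_size: int
    exec_all: bool


@dataclass(frozen=True)
class Builder:
    """Hypothetical stand-in for an IR builder: all emission state lives here."""
    program: List[Instruction]             # shared instruction list
    exec_size: int                         # execution width for emitted code
    cursor: Optional[Instruction] = None   # insert before this; None = append
    force_all: bool = False                # ignore channel enables if True

    # Each method returns a modified copy; the original builder is untouched.
    def before(self, inst: Instruction) -> "Builder":
        return replace(self, cursor=inst)

    def at_end(self) -> "Builder":
        return replace(self, cursor=None)

    def group(self, width: int) -> "Builder":
        return replace(self, exec_size=width)

    def exec_all(self) -> "Builder":
        return replace(self, force_all=True)

    def emit(self, opcode: str) -> Instruction:
        # Create the instruction and insert it in one step.
        inst = Instruction(opcode, self.exec_size, self.force_all)
        if self.cursor is None:
            self.program.append(inst)
        else:
            self.program.insert(self.program.index(self.cursor), inst)
        return inst


def emit_payload(bld: Builder) -> None:
    # A helper needs no knowledge of width or position; it inherits both.
    bld.emit("MOV")
    bld.emit("MOV")


prog: List[Instruction] = []
bld = Builder(prog, exec_size=16)
add = bld.emit("ADD")
# Hand the helper a copy that is 8 wide and points just before the ADD.
emit_payload(bld.group(8).before(add))
```

The helper receives a modified copy of the builder, so the instructions it emits pick up the narrower width and the insertion point without any explicit per-instruction setup, while `bld` itself keeps its original settings.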

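The pitch rule stated in commit 8da79b8378 above (bytes for linear surfaces, dwords for tiled surfaces, a programmable limit of 32,767 in both cases) can be sketched as a small check. The function and constant names below are hypothetical and this is not the driver's code; it only assumes that a dword is 4 bytes.

```python
BLT_PITCH_LIMIT = 32767  # programmable limit quoted in the commit message
BYTES_PER_DWORD = 4


def blt_pitch_ok(pitch_bytes: int, tiled: bool) -> bool:
    """Return True if a surface pitch fits in the blitter's pitch field.

    Hypothetical helper: tiled surfaces program the pitch in dwords,
    linear surfaces program it in bytes.
    """
    programmed = pitch_bytes // BYTES_PER_DWORD if tiled else pitch_bytes
    return programmed <= BLT_PITCH_LIMIT
```

Under this rule a tiled surface can have a byte pitch up to four times larger than a linear one before the check rejects it.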
70487 Commits All Branches Search