mirrors/mesa - Frog Git

Commit Graph

Author	SHA1	Message	Date
Kenneth Graunke	277dbf08b0	glsl: Remove exec_list iterators now that nothing uses them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:49:47 -08:00
Kenneth Graunke	826d9fb8c0	glsl: Replace iterators in ir_reader.cpp with ad-hoc list walking. These can't use foreach_list since they want to skip over the first few list elements. Just doing the ad-hoc list walking isn't too bad. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:49:45 -08:00
Kenneth Graunke	48d0faaa43	glsl: Use a new foreach_two_lists macro for walking two lists at once. When handling function calls, we often want to walk through the list of formal parameters and list of actual parameters at the same time. (Both are guaranteed to be the same length.) Previously, we used a pattern of: exec_list_iterator 1st_iter = <1st list>.iterator(); foreach_iter(exec_list_iterator, 2nd_iter, <2nd list>) { ... 1st_iter.next(); } This was awkward, since you had to manually iterate through one of the two lists. This patch introduces a foreach_two_lists macro which safely walks through two lists at the same time, so you can simply do: foreach_two_lists(1st_node, <1st list>, 2nd_node, <2nd list>) { ... } v2: Rename macro from foreach_list2 to foreach_two_lists, as suggested by Ian Romanick. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:49:42 -08:00
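The lockstep walk described in the commit above can be sketched in plain C++. This is a minimal illustration using `std::list`, not Mesa's `exec_list` or the actual `foreach_two_lists` macro; `pair_parameters` and its parameter types are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not Mesa code): pairs each formal parameter name
// with the matching actual argument by walking both lists in lockstep.
std::vector<std::pair<std::string, int>>
pair_parameters(const std::list<std::string> &formals,
                const std::list<int> &actuals)
{
   std::vector<std::pair<std::string, int>> pairs;
   auto formal = formals.begin();
   auto actual = actuals.begin();

   // Both iterators advance in the loop header, so the body cannot
   // forget to step one of them. The walk stops when either list ends.
   for (; formal != formals.end() && actual != actuals.end();
        ++formal, ++actual) {
      pairs.emplace_back(*formal, *actual);
   }

   // The commit message states that both lists have the same length.
   assert(formal == formals.end() && actual == actuals.end());
   return pairs;
}
```

Keeping both increments in the loop header is the point of the pattern: the older style advanced one iterator by hand inside the body, which was easy to get wrong.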
Kenneth Graunke	02ff2a2758	glsl: Statically cast parameter exec_node to ir_variable. Formal function parameters are always ir_variable objects, not an arbitrary ir_instruction. So there's no need to dynamically cast here. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:38:19 -08:00
Kenneth Graunke	8050584096	glsl: Cast ir_call parameters to ir_rvalue, not ir_instruction. A function call's parameters are always rvalues. ir_rvalue may not always be a subclass of ir_instruction in the future, so we should use the right one. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:38:19 -08:00
Kenneth Graunke	838a6871bb	glsl: Convert piles of foreach_iter to foreach_list_safe. In these cases, we edit the list (or at least might be), so we use the foreach_list_safe variant. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:38:19 -08:00
Kenneth Graunke	5f7e778fa1	glsl: Convert piles of foreach_iter to the newer foreach_list macro. foreach_iter and exec_list_iterators have been deprecated for some time now; we just hadn't ever bothered to convert code to the newer foreach_list and foreach_list_safe macros. In these cases, we aren't editing the list, so we can use foreach_list rather than foreach_list_safe. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:38:19 -08:00
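The distinction between a plain walk and a "safe" walk, as drawn in the two commits above, comes down to whether the loop remembers the next node before the body runs. The sketch below shows that idea with `std::list`; it is an assumption-laden illustration, not the definition of Mesa's `foreach_list_safe`, and `remove_negatives` is a hypothetical function.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>

// Hypothetical example: drop negative values while walking the list.
void remove_negatives(std::list<int> &values)
{
   for (auto it = values.begin(); it != values.end(); ) {
      // Remember the successor before the body runs. Erasing *it
      // invalidates `it`, but `next` stays valid for std::list.
      auto next = std::next(it);
      if (*it < 0)
         values.erase(it);
      it = next;
   }

   // Postcondition: nothing negative is left.
   assert(std::none_of(values.begin(), values.end(),
                       [](int v) { return v < 0; }));
}
```

When the body never edits the list, the saved successor is unnecessary, which is why the read-only call sites can use the simpler variant.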
Paul Berry	bce8bc0b25	glsl: Index into ctx->Const.Program[] rather than using ad-hoc code. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-09 09:31:19 -08:00
Paul Berry	84732a982c	mesa: replace ctx->Const.{Vertex,Fragment,Geometry}Program with an array. These are replaced with ctx->Const.Program[MESA_SHADER_{VERTEX,FRAGMENT,GEOMETRY}]. In patches to follow, this will allow us to replace a lot of ad-hoc logic with a variable index into the array. With the exception of the changes to mtypes.h, this patch was generated entirely by the command: find src -type f '(' -iname '*.c' -o -iname '*.cpp' -o -iname '*.py' \ -o -iname '*.y' ')' -print0 | xargs -0 sed -i \ -e 's/Const\.VertexProgram/Const.Program[MESA_SHADER_VERTEX]/g' \ -e 's/Const\.GeometryProgram/Const.Program[MESA_SHADER_GEOMETRY]/g' \ -e 's/Const\.FragmentProgram/Const.Program[MESA_SHADER_FRAGMENT]/g' Suggested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-09 09:31:01 -08:00
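A small sketch of why an array indexed by stage is convenient: per-stage logic becomes a loop or a single indexed lookup instead of one branch per named field. All type and field names below (`shader_stage`, `program_limits`, `constants`, `max_uniform_components`) are hypothetical stand-ins, not the real Mesa structures.

```cpp
#include <cassert>

// Hypothetical stage enum; the last entry counts the stages.
enum shader_stage { STAGE_VERTEX, STAGE_GEOMETRY, STAGE_FRAGMENT, STAGE_COUNT };

struct program_limits {
   unsigned max_uniform_components;
};

// One array slot per stage replaces three separately named fields.
struct constants {
   program_limits program[STAGE_COUNT];
};

// A single indexed lookup replaces an if/else chain over the stage.
const program_limits &limits_for(const constants &c, shader_stage stage)
{
   assert(stage < STAGE_COUNT);
   return c.program[stage];
}

// Code that treats every stage alike becomes a plain loop.
unsigned total_uniform_components(const constants &c)
{
   unsigned total = 0;
   for (int s = 0; s < STAGE_COUNT; s++)
      total += c.program[s].max_uniform_components;
   return total;
}
```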
Thomas Sondergaard	e8ff08edd8	mesa: Namespace qualify fma to override ambiguity with fma from math.h MSVC 2013 version of math.h includes an fma() function. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 17:33:07 -07:00
Thomas Sondergaard	067ad6e53e	mesa: Fix compile error with MSVC 2013 This fixes the following compile error: src\glsl\ir_constant_expression.cpp(1405) : error C2666: 'copysign' : 3 overloads have similar conversions Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 17:33:06 -07:00
Paul Berry	31ec2f8338	mesa: Remove _mesa_progshader_enum_to_string(), which is no longer used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:32:14 -08:00
Paul Berry	acfc58a7e5	glsl: Make more use of gl_shader_stage enum in ir_set_program_inouts.cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:32:01 -08:00
Paul Berry	2adb9fea77	glsl: Make more use of gl_shader_stage enum in lower_clip_distance.cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:31:58 -08:00
Paul Berry	80ee24823f	glsl: Make more use of gl_shader_stage enum in link_varyings.cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Also rename "shaderType" param of is_varying_var() to "stage". Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:31:55 -08:00
Paul Berry	9110078209	glsl: Change _mesa_glsl_parse_state ctor to use gl_shader_stage enum. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Also rename "target" param to "stage". Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:31:49 -08:00
Paul Berry	e3b86f07da	mesa: Use gl_shader::Stage instead of gl_shader::Type where possible. This reduces confusion since gl_shader::Type is sometimes GL_SHADER_PROGRAM_MESA but is more frequently GL_SHADER_{VERTEX,GEOMETRY,FRAGMENT}. It also has the advantage that when switching on gl_shader::Stage, the compiler will alert if one of the possible enum types is unhandled. Finally, many functions in src/glsl (especially those dealing with linking) already use gl_shader_stage to represent pipeline stages; using gl_shader::Stage in those functions avoids the need for a conversion. Note: in the process I changed _mesa_write_shader_to_file() so that if it encounters an unexpected shader stage, it will use a file suffix of "????" rather than "geom". Reviewed-by: Brian Paul <brianp@vmware.com> v2: Split from patch "mesa: Store gl_shader_stage enum in gl_shader objects." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-08 07:31:45 -08:00
Paul Berry	65511e5f22	mesa: Store gl_shader_stage enum in gl_shader objects. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-08 07:31:28 -08:00
Paul Berry	72a995d307	glsl: make _mesa_shader_stage_to_string() available to non-C++ code. Reviewed-by: Brian Paul <brianp@vmware.com> v2: Split from patch "mesa: Store gl_shader_stage enum in gl_shader objects." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-08 07:30:48 -08:00
Paul Berry	665b8d7b6d	mesa: Clean up nomenclature for pipeline stages. Previously, we had an enum called gl_shader_type which represented pipeline stages in the order they occur in the pipeline (i.e. MESA_SHADER_VERTEX=0, MESA_SHADER_GEOMETRY=1, etc), and several inconsistently named functions for converting between it and other representations: - _mesa_shader_type_to_string: gl_shader_type -> string - _mesa_shader_type_to_index: GLenum (GL_*_SHADER) -> gl_shader_type - _mesa_program_target_to_index: GLenum (GL_*_PROGRAM) -> gl_shader_type - _mesa_shader_enum_to_string: GLenum (GL_*_{SHADER,PROGRAM}) -> string This patch tries to clean things up so that we use more consistent terminology: the enum is now called gl_shader_stage (to emphasize that it is in the order of pipeline stages), and the conversion functions are: - _mesa_shader_stage_to_string: gl_shader_stage -> string - _mesa_shader_enum_to_shader_stage: GLenum (GL_*_SHADER) -> gl_shader_stage - _mesa_program_enum_to_shader_stage: GLenum (GL_*_PROGRAM) -> gl_shader_stage - _mesa_progshader_enum_to_string: GLenum (GL_*_{SHADER,PROGRAM}) -> string In addition, MESA_SHADER_TYPES has been renamed to MESA_SHADER_STAGES, for consistency with the new name for the enum. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Also rename the "target" field of _mesa_glsl_parse_state and the "target" parameter of _mesa_shader_stage_to_string to "stage". Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:30:30 -08:00
Kenneth Graunke	847bc36a38	glsl: Optimize pow(2, x) --> exp2(x). On Haswell, POW takes 24 cycles, while EXP2 only takes 14. Plus, using POW requires putting 2.0 in a register, while EXP2 doesn't. I believe that EXP2 will be faster than POW on basically all GPUs, so it makes sense to optimize it. Looking at the savage2 subset of shader-db: total instructions in shared programs: 113225 -> 113179 (-0.04%) instructions in affected programs: 2139 -> 2093 (-2.15%) instances of 'math pow': 795 -> 749 (-6.14%) instances of 'math exp': 389 -> 435 (11.8%) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-07 12:54:57 -08:00
Kenneth Graunke	5e3fd6a9db	glsl: Refactor is_zero/one/negative_one into an is_value() method. This patch creates a new generic is_value() method, which checks if an ir_constant has a particular value. (For vectors, it must have the single value repeated across all components.) It then rewrites the is_zero/is_one/is_negative_one methods to use this generic helper. All three were basically identical except for the value they checked for. The other difference is that is_negative_one rejects boolean types. The new is_value function maintains this behavior, only allowing boolean types when checking for 0 or 1. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-07 12:54:57 -08:00
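The refactoring described above can be sketched as one generic check plus three thin wrappers. The `constant` struct here is a simplified, hypothetical stand-in for `ir_constant` (all components stored as `double`, with a flag for boolean type); the real class handles several base types.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified constant: a scalar or vector of components.
struct constant {
   bool is_boolean;
   std::vector<double> components;
};

// True when every component equals `expected`. Boolean constants are
// only accepted when checking for 0 or 1, mirroring the behaviour the
// commit message describes for is_negative_one.
bool is_value(const constant &c, double expected)
{
   assert(!c.components.empty());
   if (c.is_boolean && expected != 0.0 && expected != 1.0)
      return false;
   for (double component : c.components) {
      if (component != expected)
         return false;
   }
   return true;
}

// The three original predicates become one-line wrappers.
bool is_zero(const constant &c)         { return is_value(c, 0.0); }
bool is_one(const constant &c)          { return is_value(c, 1.0); }
bool is_negative_one(const constant &c) { return is_value(c, -1.0); }
```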
Kenneth Graunke	d6c1d66d3a	glsl: Optimize pow(1.0, X) --> 1.0. Surprisingly, this helps one vertex shader in 3DMMES. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-07 12:54:57 -08:00
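The two `pow` simplifications in this series, pow(2, x) --> exp2(x) and pow(1.0, X) --> 1.0, can be sketched as rewrites on a toy expression tree. The `expr` type and the `make_*` and `simplify_pow` helpers are hypothetical and scalar-only; Mesa's algebraic optimization pass works on its own IR and is not reproduced here.

```cpp
#include <cassert>
#include <memory>

// Hypothetical toy expression node.
struct expr {
   enum kind_t { CONSTANT, VARIABLE, POW, EXP2 } kind;
   double value;                 // meaningful only for CONSTANT
   std::shared_ptr<expr> a, b;   // operands; EXP2 uses only `a`
};
typedef std::shared_ptr<expr> expr_ptr;

expr_ptr make_constant(double v)
{
   return std::make_shared<expr>(expr{expr::CONSTANT, v, nullptr, nullptr});
}

expr_ptr make_variable()
{
   return std::make_shared<expr>(expr{expr::VARIABLE, 0.0, nullptr, nullptr});
}

expr_ptr make_pow(const expr_ptr &base, const expr_ptr &exponent)
{
   return std::make_shared<expr>(expr{expr::POW, 0.0, base, exponent});
}

// Applies the two rewrites; any other expression is returned unchanged.
expr_ptr simplify_pow(const expr_ptr &e)
{
   assert(e);
   if (e->kind != expr::POW)
      return e;
   assert(e->a && e->b);

   if (e->a->kind == expr::CONSTANT) {
      if (e->a->value == 2.0)   // pow(2, x) --> exp2(x)
         return std::make_shared<expr>(expr{expr::EXP2, 0.0, e->b, nullptr});
      if (e->a->value == 1.0)   // pow(1, x) --> 1
         return make_constant(1.0);
   }
   return e;
}
```

Only the base is inspected, so pow(x, 2) is left alone by this sketch.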
Brian Paul	8d1400fe12	glsl: rename min(), max() functions to fix MSVC build Evidently, there's some other definition of "min" and "max" that causes MSVC to choke on these function names. Renaming to min2() and max2() fixes things. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 16:57:49 -07:00
Maxence Le Doré	1a9e8c23eb	mesa: enable AMD_shader_trinary_minmax Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:10 -08:00
Maxence Le Doré	eb5dc75601	glsl: implement mid3 built-in function Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:09 -08:00
Maxence Le Doré	73c7451587	glsl: implement max3 built-in function Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:08 -08:00
Maxence Le Doré	ce46e14729	glsl: Implement min3 built-in function Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:08 -08:00
Maxence Le Doré	61c450fc81	glsl: add min() and max() functions to builder.cpp Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:07 -08:00
Maxence Le Doré	cf70d2a7c0	glsl: add a shader_trinary_minmax predicate Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:06 -08:00
Maxence Le Doré	ff50493bb3	glsl: Add extension tracking for AMD_shader_trinary_minmax Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:02 -08:00
Erik Faye-Lund	eb212c5a30	glcpp: error on multiple #else/#elif directives The preprocessor currently accepts multiple else/elif-groups per if-section. The GLSL-preprocessor is defined by the C++ specification, which defines the following parse-rule: if-section: if-group elif-groups(opt) else-group(opt) endif-line This clearly only allows a single else-group, that has to come after any elif-groups. So let's modify the code to follow the specification. Add test to prevent regressions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Carl Worth <cworth@cworth.org> Cc: 10.0 <mesa-stable@lists.freedesktop.org>	2014-01-02 14:22:58 -08:00
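The grammar rule quoted above (at most one else-group, and only after any elif-groups) can be enforced with one flag per open if-section. The following is a minimal sketch over a list of directive names, not glcpp's parser; `check_directives` is a hypothetical function, and anything other than the four directive names is ignored.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true when the directive sequence obeys:
//   if-section: if-group elif-groups(opt) else-group(opt) endif-line
bool check_directives(const std::vector<std::string> &directives)
{
   // One entry per open if-section: non-zero once its #else was seen.
   std::vector<char> seen_else;

   for (const std::string &d : directives) {
      if (d == "if") {
         seen_else.push_back(0);
      } else if (d == "elif") {
         // #elif outside an if-section, or after #else, is an error.
         if (seen_else.empty() || seen_else.back())
            return false;
      } else if (d == "else") {
         // A second #else in the same if-section is an error.
         if (seen_else.empty() || seen_else.back())
            return false;
         seen_else.back() = 1;
      } else if (d == "endif") {
         if (seen_else.empty())
            return false;
         seen_else.pop_back();
      }
   }
   return seen_else.empty();   // every #if needs its #endif
}
```

Using a stack keeps nested if-sections independent: an inner `#else` does not affect what the outer section may still contain.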
Carl Worth	6005e9cb28	glcpp: Replace multi-line comment with a space (even as part of macro definition) The preprocessor has always replaced multi-line comments with a single space character, (as required by the specification), but as of commit `bd55ba568b` the lexer also emitted a NEWLINE token for each newline within the comment, (in order to preserve line numbers). The emitting of NEWLINE tokens within the comment broke the rule of "replace a multi-line comment with a single space" as could be exposed by code like the following: #define FOO a/* */b FOO Prior to commit `bd55ba568b`, this code defined the macro FOO as "a b" as desired. Since that commit, this code instead defines FOO as "a" and leaves a stray "b" in the output. In this commit, we fix this by not emitting the NEWLINE tokens while lexing the comment, but instead merely counting them in the commented_newlines variable. Then, when the lexer next encounters a non-commented newline it switches to a NEWLINE_CATCHUP state to emit as many NEWLINE tokens as necessary (so that subsequent parsing stages still generate correct line numbers). Of course, it would have been more clear if we could have written a loop to emit all the newlines, but flex conventions prevent that, (we must use "return" for each token we emit). It similarly would have been clear to have a new rule restricted to the <NEWLINE_CATCHUP> state with an action much like the body of this if condition. The problem with that is that this rule must not consume any characters. It might be possible to write a rule that matches a single lookahead of any character, but then we would also need an additional rule to ensure for the <EOF> case where there are no additional characters available for the lookahead to match. 
Given those considerations, and given that the SKIP-state manipulation already involves a code block at the top of the lexer function, before any rules, it seems best to me to go with the implementation here which adds a similar pre-rule code block for the NEWLINE_CATCHUP. Finally, this commit also changes the expected output of a few, existing glcpp tests. The change here is that the space character resulting from the multi-line comment is now emitted before the newlines corresponding to that comment. (Previously, the newlines were emitted first, and the space character afterward.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72686 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-02 14:15:51 -08:00
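The counting-and-catch-up idea from this commit can be sketched on plain strings: each multi-line comment becomes a single space, the newlines inside it are only counted, and the count is flushed at the next real newline so later line numbers stay correct. This is a hypothetical string-based illustration (`strip_block_comments` is not a glcpp function); the real lexer emits tokens from flex rules and also handles cases this sketch ignores, such as `//` comments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

std::string strip_block_comments(const std::string &src)
{
   std::string out;
   std::size_t commented_newlines = 0;   // newlines swallowed by comments
   std::size_t i = 0;

   while (i < src.size()) {
      if (src.compare(i, 2, "/*") == 0) {
         // Find the end of the comment; an unterminated one runs to EOF.
         std::size_t close = src.find("*/", i + 2);
         std::size_t stop = (close == std::string::npos) ? src.size() : close;
         for (std::size_t j = i + 2; j < stop; j++) {
            if (src[j] == '\n')
               commented_newlines++;
         }
         out += ' ';   // the whole comment becomes one space
         i = (close == std::string::npos) ? src.size() : close + 2;
      } else if (src[i] == '\n') {
         // A real newline: emit it, then catch up on the deferred ones.
         out += '\n';
         out.append(commented_newlines, '\n');
         commented_newlines = 0;
         i++;
      } else {
         out += src[i++];
      }
   }
   return out;
}
```

With the commit's own example as input, the macro body stays on one line as "a b", and the output still contains the same number of newlines as the input.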
Carl Worth	61cea49014	glcpp: Add a more descriptive comment for the SKIP state manipulation Two things make this code confusing: 1. The uncharacteristic manipulation of lexer start state outside of flex rules. 2. The confusing semantics of the skip_stack (including the "lexing_if" override and the SKIP_NO_SKIP state). This new comment is intended to bring a bit more clarity for any readers. There is no intended behavioral change to the code here. The actual code changes include better indentation to avoid an excessively-long line, and using the more descriptive INITIAL rather than 0. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-02 14:15:24 -08:00
Paul Berry	77c74c647b	glsl: Fix gl_type of usamplerCube built-in type. I'm not aware of any piglit tests that this fixes, but the old code was obviously wrong. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-30 11:21:39 -08:00
Paul Berry	99e822fa18	mesa: Improve static error checking of arrays sized by MESA_SHADER_TYPES. This patch replaces the following pattern: foo bar[MESA_SHADER_TYPES] = { ... }; With: foo bar[] = { ... }; STATIC_ASSERT(Elements(bar) == MESA_SHADER_TYPES); This way, when a new shader type is added in a future version of Mesa, we will get a compile error to remind us that the array needs to be updated. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-30 11:21:27 -08:00
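The pattern from this commit can be sketched with standard C++11 `static_assert` in place of Mesa's `STATIC_ASSERT` and `Elements` macros. The enum, the `elements` helper, and the `stage_names` table are hypothetical names for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stage enum; the last entry counts the stages.
enum shader_stage { STAGE_VERTEX, STAGE_GEOMETRY, STAGE_FRAGMENT, STAGE_COUNT };

// Number of elements in a built-in array, usable at compile time.
template <typename T, std::size_t N>
constexpr std::size_t elements(const T (&)[N])
{
   return N;
}

// The array size is deduced from the initializer rather than written
// as [STAGE_COUNT], so a missing entry changes the deduced size...
static const char *const stage_names[] = { "vertex", "geometry", "fragment" };

// ...and this check then fails to compile until the table is updated.
static_assert(elements(stage_names) == static_cast<std::size_t>(STAGE_COUNT),
              "stage_names needs exactly one entry per shader stage");

std::string stage_name(shader_stage stage)
{
   assert(stage < STAGE_COUNT);
   return stage_names[stage];
}
```

With an explicit `[STAGE_COUNT]` size, a forgotten entry would silently be zero-initialized; deducing the size is what lets the assertion catch it.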
Paul Berry	b30e25f297	glsl: Remove extraneous shader_type argument from analyze_clip_usage(). This argument was carrying the name of the shader target (as a string). We can get this just as easily by calling _mesa_shader_enum_to_string(). Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-30 11:21:24 -08:00
Paul Berry	d343e3d98c	glsl: Get rid of hardcoded arrays of shader target names. We already have a function for converting a shader type index to a string: _mesa_shader_type_to_string(). Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-30 11:21:21 -08:00
Paul Berry	26707abe56	Rename overloads of _mesa_glsl_shader_target_name(). Previously, _mesa_glsl_shader_target_name() had an overload for GLenum and an overload for the gl_shader_type enum, each of which behaved differently. However, since GLenum is a synonym for unsigned int, and unsigned ints are often used in place of gl_shader_type (e.g. in loop indices), there was a big risk of calling the wrong overload by mistake. This patch gives the two overloads different names so that it's always clear which one we mean to call. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-30 11:21:08 -08:00
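The overload hazard described in this commit is easy to reproduce in a few lines. In the sketch below, `api_enum` stands in for `GLenum`, and every function name is hypothetical; the strings returned by the overloads exist only to show which one the compiler selected.

```cpp
#include <cassert>
#include <string>

typedef unsigned int api_enum;   // stand-in for GLenum
enum shader_stage { STAGE_VERTEX, STAGE_GEOMETRY, STAGE_FRAGMENT, STAGE_COUNT };

// The risky design: two overloads that mean different things.
std::string target_name(api_enum)     { return "api_enum overload"; }
std::string target_name(shader_stage) { return "shader_stage overload"; }

// A stage index held in an unsigned loop variable is an exact match
// for the api_enum overload, so that one is called, with no warning.
std::string overload_picked_for_loop_index()
{
   unsigned int i = STAGE_GEOMETRY;
   return target_name(i);
}

// The fix sketched here: distinct names, so the call site always says
// which conversion is meant.
std::string stage_to_string(shader_stage stage)
{
   static const char *const names[] = { "vertex", "geometry", "fragment" };
   assert(stage < STAGE_COUNT);
   return names[stage];
}
```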
Kevin Rogovin	3b1195f8a6	Report that no function found if signature lookup is empty If no function signature is found for a function name, report that the function is not found instead of printing an empty list of candidates. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-20 09:03:54 -08:00
Kevin Rogovin	23d294bb60	Use line number information from entire function expression This patch changes the error reporting behavior for incorrect function invocation (triggered by match_function_by_name() unable to find a matching function call) from using the line number information associated to the function name term to using the line number information of the entire function expression. Fixes bug #72264. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72264 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-20 09:03:54 -08:00
Paul Berry	7963fde37b	glsl: Replace _mesa_glsl_parser_targets enum with gl_shader_type. These enums were redundant. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-17 12:31:36 -08:00
Paul Berry	d9b55244fd	glsl: Don't return bad values from _mesa_shader_type_to_index. This will avoid compiler warnings in the patch that follows. There should be no user-visible effect because the change only affects the behaviour when an invalid enum is passed to _mesa_shader_type_to_index(), and that can only happen if there is a bug elsewhere in Mesa. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-17 12:31:35 -08:00
Chris Forbes	1d71f38924	glsl: add gl_SampleMaskIn[] builtin Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-14 16:24:22 +13:00
Tapani Pälli	a6345f1559	glsl: modify ir_clone to use memcpy Patch copies the whole data structure at once instead of assigning individual variables. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-12 17:28:13 +02:00
Tapani Pälli	447bb9029f	glsl: move variables in to ir_variable::data, part II This patch moves following bitfields and variables to the data structure: explicit_location, explicit_index, explicit_binding, has_initializer, is_unmatched_generic_inout, location_frac, from_named_ifc_block_nonarray, from_named_ifc_block_array, depth_layout, location, index, binding, max_array_access, atomic Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-12 17:28:11 +02:00
Tapani Pälli	33ee2c67c0	glsl: move variables in to ir_variable::data, part I This patch moves following bitfields in to the data structure: used, assigned, how_declared, mode, interpolation, origin_upper_left, pixel_center_integer Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-12 17:28:08 +02:00
Tapani Pälli	c1d3080ee8	glsl: introduce data section to ir_variable Data section helps serialization and cloning of a ir_variable. This patch includes the helper bits used for read only ir_variables. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-12 17:28:06 +02:00
Paul Berry	088494aa03	glsl/loops: Get rid of lower_bounded_loops and ir_loop::normative_bound. Now that loop_controls no longer creates normatively bound loops, there is no need for ir_loop::normative_bound or the lower_bounded_loops pass. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:55:09 -08:00
Paul Berry	7ea3baa64d	glsl/loops: Stop creating normatively bound loops in loop_controls. Previously, when loop_controls analyzed a loop and found that it had a fixed bound (known at compile time), it would remove all of the loop terminators and instead set the loop's normative_bound field to force the loop to execute the correct number of times. This made loop unrolling easy, but it had a serious disadvantage. Since most GPU's don't have a native mechanism for executing a loop a fixed number of times, in order to implement the normative bound, the back-ends would have to synthesize a new loop induction variable. As a result, many loops wound up having two induction variables instead of one. This caused extra register pressure and unnecessary instructions. This patch modifies loop_controls so that it doesn't set the loop's normative_bound anymore. Instead it leaves one of the terminators in the loop (the limiting terminator), so the back-end doesn't have to go to any extra work to ensure the loop terminates at the right time. This complicates loop unrolling slightly: when deciding whether a loop can be unrolled, we have to account for the presence of the limiting terminator. And when we do unroll the loop, we have to remove the limiting terminator first. For an example of how this results in more efficient back end code, consider the loop: for (int i = 0; i < 100; i++) { total += i; } Previous to this patch, on i965, this loop would compile down to this (vec4) native code: mov(8) g4<1>.xD 0D mov(8) g8<1>.xD 0D loop: cmp.ge.f0(8) null g8<4;4,1>.xD 100D (+f0) if(8) break(8) endif(8) add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD add(8) g8<1>.xD g8<4;4,1>.xD 1D add(8) g4<1>.xD g4<4;4,1>.xD 1D while(8) loop (notice that both g8 and g4 are loop induction variables; one is used to terminate the loop, and the other is used to accumulate the total). 
After this patch, the same loop compiles to: mov(8) g4<1>.xD 0D loop: cmp.ge.f0(8) null g4<4;4,1>.xD 100D (+f0) if(8) break(8) endif(8) add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD add(8) g4<1>.xD g4<4;4,1>.xD 1D while(8) loop Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:55:06 -08:00


2496 Commits