Commit Graph

80742 Commits

Author SHA1 Message Date
Iago Toral Quiroga 29541ec531 nir/lower_double_ops: lower floor()
At least i965 hardware does not have native support for floor on doubles.

v2 (Sam):
  - Improve the lowering pass to remove one bcsel (Jason)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 11:58:35 +02:00
Iago Toral Quiroga 5fab3d178b nir/lower_double_ops: lower trunc()
At least i965 hardware does not have native support for truncating doubles.

v2:
  - Simplified the implementation significantly.
  - Fixed the else branch, that was not doing what we wanted.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 11:58:13 +02:00
Connor Abbott 2ea3649c63 nir: add a pass to lower some double operations
v2: Move to compiler/nir (Iago)
v3: Use nir_imm_int() to load the constants (Sam)
v4 (Sam):
  - Undo line-wrap (Jason).
  - Fix comment (Jason).
  - Improve generated code for get_signed_inf() function (Connor).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 11:58:13 +02:00
Connor Abbott 2cf3b28884 nir/builder: add nir_imm_double()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 11:58:13 +02:00
Samuel Iglesias Gonsálvez 3a150683ce nir/builder: Add bit_size info to nir_build_imm()
v2:
- Group num_components and bit_size together (Jason)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 11:58:13 +02:00
Jakob Sinclair 76b8c5cc60 radeonsi: check if value is negative
Fixes a Coverity defect by adding checks to see if a value is negative
before using it to index an array. By checking the value first it makes
the code a bit safer but overall should not have a big impact.

CID: 1355598

Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-04-28 11:33:38 +02:00
Michel Dänzer 860210ccfc clover: Fix build against clang SVN >= r267772
(Re-pushing previous fix for clang SVN r265359, which was reverted in
the meantime)

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2016-04-28 12:57:03 +09:00
Lars Hamre 32cb7d61a9 glsl: fix lowering outputs for early/nested returns
Return statements in conditional blocks were not having their
output varyings lowered correctly.

This patch fixes the following piglit tests:
/spec/glsl-1.10/execution/vs-float-main-return
/spec/glsl-1.10/execution/vs-vec2-main-return
/spec/glsl-1.10/execution/vs-vec3-main-return

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-04-28 11:01:51 +10:00
Connor Abbott 122d27e998 nir: rewrite nir_foreach_block and friends
Previously, these were functions which took a callback. This meant that
the per-block code had to be in a separate function, and all the data
that you wanted to pass in had to be a single void *. They walked the
control flow tree recursively, doing a depth-first search, and called
the callback in a preorder, matching the order of the original source
code. But since each node in the control flow tree has a pointer to its
parent, we can implement a "get-next" and "get-previous" method that
does the same thing that the recursive function did with no state at
all. This lets us rewrite nir_foreach_block() as a simple for loop,
which lets us greatly simplify its users in some cases. This does
require us to rewrite every user, although the transformation from the
old nir_foreach_block() to the new nir_foreach_block() is mostly
trivial.

One subtlety, though, is that the new nir_foreach_block() won't handle
the case where the current block is deleted, which the old one could.
There's a new nir_foreach_block_safe() which implements the standard
trick for solving this. Most users don't modify control flow, though, so
they won't need it. Right now, only opt_select_peephole needs it.

The old functions are reimplemented in terms of the new macros, although
they'll go away after everything is converted.

v2: keep an implementation of the old functions around
v3 (Jason Ekstrand): A small cosmetic change and a bugfix in the loop
   handling of nir_cf_node_cf_tree_last().
v4 (Jason Ekstrand): Use the _safe macro in foreach_block_reverse_call

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-27 15:05:40 -07:00
Connor Abbott 958300137f nir/opt_cp: use nir_block_get_following_if()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-27 15:05:34 -07:00
Jordan Justen aaaa22c775 vbo: Return INVALID_OPERATION during draw with a mapped buffer
Fixes the OpenGLES 3.1 CTS:
 * ESEXT-CTS.draw_elements_base_vertex_tests.invalid_mapped_bos

Because this is triggering the error message after the normal API
validation phase, we don't have the API function name available, and
therefore we generate an error message without the draw call name:

Mesa: User error: GL_INVALID_OPERATION in draw call (vertex buffers are mapped)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95142
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 14:30:06 -07:00
Nanley Chery 28d0bc72fb anv/formats: Return proper error code for unsupported formats
Fixes some failures in dEQP-VK.api.info.image_format_properties.* and
enables the test group to execute without assert failing.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94896
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-27 11:28:30 -07:00
Nanley Chery 5f7e8eac42 anv/device: Set the compressed texture feature flags correctly
Sampling from an ETC2 texture is supported on Bay Trail and
from Gen8 onwards. While ASTC_LDR is supported on Gen9, the
logic to handle such formats has not yet been implemented in
the driver.

Fixes dEQP-VK.api.info.format_properties.compressed_formats.

v2: Enable ETC2 for Bay Trail (Kenneth Graunke)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94896
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-27 11:28:30 -07:00
Jason Ekstrand e0806930ad nir/algebraic: Add a bit-size validator
This commit adds a validator that ensures that all expressions passed
through nir_algebraic are 100% non-ambiguous as far as bit-sizes are
concerned.  This way it's a compile-time error rather than a hard-to-trace
C exception some time later.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-04-27 11:21:06 -07:00
Jason Ekstrand 8a3e344180 nir/opt_algebraic: Fix some expressions with ambiguous bit sizes
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-27 11:21:06 -07:00
Jason Ekstrand 7e0ee3a38b nir/search: Respect the bit_size parameter on nir_search_value
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-27 11:21:06 -07:00
Jason Ekstrand fcc1c8a437 nir/algebraic: Add a mechanism for specifying the bit size of a value
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-27 11:21:06 -07:00
Jason Ekstrand cafb885e45 nir/algebraic: Use "uint" instead of "unsigned" for uint types
This is consistent with the rename done for the rest of NIR.  Currently,
"bool" is the only type specifier used in nir_opt_algebraic.py so this is
really a no-op.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-27 11:21:06 -07:00
Jason Ekstrand 736ee0bef7 nir/algebraic: Do better error reporting of bad expressions
Previously, if an exception was encountered anywhere, nir_algebraic would
just die in a fire with no indication whatsoever as to where the actual bug
is.  This commit makes it print out the particular search-and-replace
expression that is causing problems along with the exception.  Also, it
will now report all of the errors it finds and then exit at the end like a
standard C compiler would do.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-27 11:21:06 -07:00
Alejandro Piñeiro b1dcedf393 isl: move -lm at the end of tests_ldadd
The test was failing to build with "undefined reference to `roundf'" errors,
so Make check on mesa was failing.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-27 20:14:56 +02:00
Topi Pohjolainen aef6a6c382 i965/blorp/gen8: Fix blitting of interleaved msaa surfaces
Fixes ES31-CTS.gtf.GL31Tests.texture_stencil8.texture_stencil8_multisample.

Current logic divides given layer of one by number of samples (four)
trashing the layer to zero. Layer adjustment is only to be used with
non-interleaved msaa surfaces where samples for particular layer are
in multiple slices.

I copy-pasted a bit of documentation from
brw_blorp.c::brw_blorp_compute_tile_offsets().

Also took the opportunity to fix the comment regarding sampling
as 2D, cube textures are the only exception.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-04-27 19:57:40 +03:00
Brian Paul 1d242b6882 llvmpipe: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul 23c55e5c23 tgsi: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul 419e386571 os: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul d902504a67 hud: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul e522a76226 gallivm: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul 489df4a71a draw: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul f93802c465 softpipe: s/Elements/ARRAY_SIZE/
Try to standardize on the later, which is defined in the common util/
directory.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Nicolai Hähnle 562c4a17b7 winsys/radeon: remove use_reusable_pool parameter from buffer_create
All callers set this parameter to true.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:41 -05:00
Nicolai Hähnle 13acf2b243 gallium/radeon: remove use_reusable_pool parameter from r600_init_resource
All callers set it to true.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:41 -05:00
Nicolai Hähnle c868974396 radeon/video: always use the reusable buffer pool
A semantic error was introduced in a past refactoring that caused the bind
parameter to be passed into the use_reusable_pool parameter of buffer_create.
Since this clearly makes no sense, and there is no clear reason why the
cache _shouldn't_ be used, just use the cache always.

Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:41 -05:00
Nicolai Hähnle 8c43c06e04 radeonsi: work around an MSAA fast stencil clear problem
A piglit test (arb_texture_multisample-stencil-clear) has been sent.
This problem was discovered analyzing

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93767
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle 7a215a3e27 radeonsi: expclear must be disabled on first Z/S clear
The documentation and the HW team say so.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle 01a3bb5d8b radeonsi: move blend choice out of loop in si_blit_decompress_color
It does not depend on the level or layer.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle 450ff0f0d5 radeonsi: use level mask for early out in si_blit_decompress_color
Mostly for consistency with the other decompress functions, but note that
in the non-DCC decompress case, the function can now early-out in slightly
more (albeit probably rare) cases.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle 0ff05b55c6 radeonsi: si_blit_decompress_depth is only used for staging
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle 0b70fc2db4 radeonsi: only decompress the required ZS planes from si_blit
This happens to "fix" a rendering bug in KotOR2, because it avoids a still
not quite understood bug with MSAA fast stencil clear decompress. For the
stencil clear bug, I have sent a piglit test (arb_texture_multisample-stencil-clear).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93767
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle def53a0b3d radeonsi: decompress Z & S planes in one pass
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle dc6fc2f390 radeonsi: early out of si_blit_decompress_depth_in_place based on dirty mask
Avoid dirtying the db_render_state atom when possible.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle d14d6c3f58 radeonsi: use MIN2 instead of expanded ?: operator
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle 159f182a57 radeonsi: fix brace style
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle 91fb4bb2e9 gallium/util: add u_bit_consecutive for generating a consecutive range of bits
There are some undefined behavior subtleties, so having a function to match
the u_bit_scan_consecutive_range makes sense.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Tim Rowley 504df3a1d7 swr: s/Elements/ARRAY_SIZE/
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 11:07:34 -05:00
Nicolai Hähnle 836cab51c8 radeonsi: emit s_waitcnt for shader memory barriers and volatile
Turns out that this is needed after all to satisfy some strengthened
coherency tests. Depends on support in LLVM, added in r267729.

v2: updated to reflect changes to the LLVM intrinsic

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2016-04-27 10:54:05 -05:00
Tim Rowley e7201bd31b swr: [rasterizer] warning cleanup
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:41:54 -05:00
Tim Rowley 24f23817d2 swr: [rasterizer core] implement legacy depth bias enable
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:41:45 -05:00
Tim Rowley fa36f8ec9c swr: [rasterizer jitter] support for dumping x86 asm
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:41:32 -05:00
Tim Rowley a646ffdacf swr: [rasterizer core] more backend refactoring
BackendPixelRate should be easier to read/maintain now hopefully.

Small perf bump by moving some of the pfn's to inline functions
without template params.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:41:21 -05:00
Tim Rowley 8e815ff72c swr: [rasterizer jitter] add mSimdInt1Ty
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:41:12 -05:00
Tim Rowley 4e1e0b3a32 swr: [rasterizer core] backend refactor
Lump all template args into a bundle of traits, and add some
functionality to the MSAA traits.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:40:44 -05:00