Commit Graph

72184 Commits

Author SHA1 Message Date
Axel Davy e4f69bc394 st/nine: SetAutoGenFilterType should regenerate the sublevels
It should regenerate the sublevels according to the spec

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-08-21 22:21:46 +02:00
Axel Davy b75f830166 st/nine: Simplify NineVolume9_CopyVolume
We had only one usage for this function.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-08-21 22:21:46 +02:00
Axel Davy bc42c29013 st/nine: Split NineSurface9_CopySurface
NineSurface9_CopySurface was supporting more cases than what
we needed, and doing checks that were innapropriate for
some NineSurface9_CopySurface use cases.

This patch splits it into two for the two use cases, and moves
the checks to the caller.

This patch also adds a few checks to NineDevice9_UpdateSurface

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-08-21 22:21:45 +02:00
Axel Davy 3f36ad732c st/nine: Simplify Volume9 dirty region tracking
Similar to what was done for Surface9, track the dirty region
only in VolumeTexture9.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-08-21 22:21:45 +02:00
Marek Olšák ab0643225e util/u_blitter: implement alpha blending for pipe->blit 2015-08-21 22:21:45 +02:00
Christoph Bumiller 23da32a923 gallium: Add blending to pipe blit
This type of blending is used for gallium nine software cursor

Signed-off-by: David Heidelberg <david@ixit.cz>
2015-08-21 22:21:45 +02:00
Axel Davy a30684712e st/nine: Revert to sw cursor in case of failure to set hw cursor
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
2015-08-21 22:21:45 +02:00
Axel Davy df6f1f77cc st/nine: Do not call ID3DPresent_GetCursorPos for sw cursor
For sw cursor we do not tell wine the cursor position (the app
tells us directly). We shouldn't use ID3DPresent_GetCursorPos.

device->cursor.pos already contains the coordinates the app
gave us.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
2015-08-21 22:21:45 +02:00
Axel Davy 78b304e2f9 st/nine: Force hw cursor for Windowed mode
According to the spec, Windowed mode must
have hw cursor

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
2015-08-21 22:21:45 +02:00
Axel Davy 1b20eaff67 st/nine: Hide hardware cursor when we don't use it
We have either hardware cursor or software cursor.
When we use software cursor, we should hide the hardware
cursor.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
2015-08-21 22:21:45 +02:00
Axel Davy 3470878383 st/nine: fix D3DRS_DITHERENABLE wrong state group
D3DRS_DITHERENABLE was assigned to the rasterizer state
group, but it was used for the blend group.

Assign it to the blend group.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-08-21 22:21:45 +02:00
Patrick Rudolph 1b645df2f3 st/nine: Account POINTSIZE_MIN and POINTSIZE_MAX for point size
When using D3DRS_POINTSIZE make sure the value is at least
D3DRS_POINTSIZE_MIN but not greater than D3DRS_POINTSIZE_MAX.

Fixes some Wine tests.

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
2015-08-21 22:21:45 +02:00
Patrick Rudolph 886227d363 st/nine: Align texture memory
Align texture memory on 32 byte boundry to allow
SSE/AVX memcpy to work on locked rects.

This fixes some crashes with games using SSE.

Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
2015-08-21 22:21:45 +02:00
Axel Davy 3c4864fa55 st/nine: Always set point_quad_rasterization to 1
Both Points and Point Sprites are rasterized like quads,
according to d3d9 doc and gallium rasterizer doc.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-08-21 22:21:45 +02:00
Axel Davy 74de849bd4 st/nine: Fix Swizzle for ATI2 format
We had red and green in the wrong channels
for the ATI2 format (RGTC2).

Found thanks to wine tests.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
2015-08-21 22:21:45 +02:00
Patrick Rudolph cb2d680232 target/d3dadapter9: Return Windows like card names
Add support for multiple cards and fill in Win
like card name, driver name and version info.
Use fallback for unknown vendors and unknown card names.

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
2015-08-21 22:21:45 +02:00
David Heidelberg 56717c0b06 st/nine: Require gcc >= 4.6
Nine code uses some C11 features, and this
leads to compile error on gcc <= 4.5

Another way would have been to use the
-fms-extensions CFLAG

Signed-off-by: David Heidelberg <david@ixit.cz>
Cc: "10.4 10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-08-21 22:21:45 +02:00
Ilia Mirkin 365d631eb2 glsl: fix error message when validating tcs output decls
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-08-21 15:20:23 -04:00
Rob Clark 3b4d03d440 relnote updates
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-08-21 11:59:07 -04:00
Ilia Mirkin 3525aa1dc9 st/mesa: pass through 4th opcode argument in bitmap/pixel visitors
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-08-21 11:44:13 -04:00
Ilia Mirkin 681efdf7a1 st/mesa: fix assignments with 4-operand arguments (i.e. BFI)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-08-21 11:43:58 -04:00
Martin Peres f142e64b29 i965: allow image_size on float images
This got missed because the piglit test only tested int images to avoid a
combinatiorial explosion of format, targets, stages and sizes which
takes more than 5 minutes to test on nvidia's driver.

This patch also drops the IMAGE_FUNCTION_AVAIL_ATOMIC which is not applicable
to the image_size codepath but was not hurting in any way.

Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-08-21 17:48:14 +03:00
Zoltan Gilian df5cdec132 clover: fix llvm 3.5 build error
There is no MDOperand in llvm 3.5.

v2: Check if kernel metadata is present to avoid crash (EdB).
v3: Second attempt to avoid crash: switch off metadata query for llvm < 3.6.

Reviewed-by: Serge Martin (EdB) <edb+mesa@sigluy.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-08-21 14:18:10 +03:00
Tapani Pälli 7eda897bf0 mesa: update fbo state in glTexStorage
We have to re-validate FBOs rendering to the texture like is done
with TexImage and CopyTexImage.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91673
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-08-21 11:22:28 +03:00
Eric Anholt 8cae9f2fda vc4: Add algebraic opt for rcp(1.0).
We're generating rcps as part of backend lowering of the packed coordinate
in the CS, and we don't want to lower them in NIR because of the extra
newton-raphson steps in the common case.  However, GLB2.7 is moving a
vertex attribute with a 1.0 W component to the position, and that makes us
produce some silly RCPs.

total instructions in shared programs: 97590 -> 97580 (-0.01%)
instructions in affected programs:     74 -> 64 (-13.51%)
2015-08-20 23:43:04 -07:00
Eric Anholt c800fef2e2 vc4: Allow unpack_8[abcd]_f's src to stay in r4.
I had QPU emit code to do it, but forgot to flag the register class.

total instructions in shared programs: 97974 -> 97590 (-0.39%)
instructions in affected programs:     25291 -> 24907 (-1.52%)
2015-08-20 23:43:04 -07:00
Eric Anholt 8b36d107fd vc4: Pack the unorm-packing bits into a src MUL instruction when possible.
Now that we do non-SSA QIR instructions, we can take a NIR SSA src that's
only used by the unorm packing and just stuff the pack bits into it.

total instructions in shared programs: 98136 -> 97974 (-0.17%)
instructions in affected programs:     4149 -> 3987 (-3.90%)
2015-08-20 23:43:04 -07:00
Eric Anholt 572a48366d vc4: Add a QIR helper for whether the op is a MUL type. 2015-08-20 23:42:59 -07:00
Eric Anholt fd74da11c4 vc4: Drop an unused algebraic op.
NIR now handles this optimization for us.
2015-08-20 23:42:53 -07:00
Eric Anholt 98728ce071 vc4: Switch QPU_PACK_SCALED to be two non-SSA instructions.
total instructions in shared programs: 98159 -> 98136 (-0.02%)
instructions in affected programs:     12279 -> 12256 (-0.19%)
2015-08-20 23:42:45 -07:00
Eric Anholt 69ef08d303 vc4: Make the pack-to-unorm instructions be non-SSA.
This helps ensure that the register allocator doesn't force the later pack
operations to insert extra MOVs.

total instructions in shared programs: 98170 -> 98159 (-0.01%)
instructions in affected programs:     2134 -> 2123 (-0.52%)
2015-08-20 23:42:17 -07:00
Eric Anholt 0bba4fa070 vc4: Allow QIR registers to be non-SSA.
Now that we have NIR, most of the optimization we still need to do is
peepholes on instruction selection rather than general dataflow
operations.  This means we want to be able to have QIR be a lot closer to
the actual QPU instructions, just with virtual registers.  Allowing
multiple instructions writing the same register opens up a lot of
possibilities.
2015-08-20 23:40:22 -07:00
Eric Anholt ceb1a31842 vc4: We can now move TEX_RESULT accesses across other r4 ops.
No difference on shader-db.
2015-08-20 23:40:16 -07:00
Timothy Arceri ad89748541 glsl: fix binding validation for interface blocks
V2: rebase on SSBO changes

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-08-21 15:58:24 +10:00
Timothy Arceri dd6a6dbaf7 glsl: interleave constant propagation and folding
The constant folding pass can take a long time to complete
so rather than running through the entire pass each time
a new constant is propagated (and vice versa) interleave them.

This change helps ES31-CTS.arrays_of_arrays.InteractionFunctionCalls1
go from around 2 min -> 23 sec.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-08-21 15:03:22 +10:00
Ilia Mirkin 8483577f6b nv50/ir: pre-compute BFE arg when both bits and offset are imm
Due to a quirk in how the nv50 opt passes run, the algebraic
optimization that looks for these BFE's happens before the constant
folding pass. Rearranging these passes isn't a great idea, but this is
easy enough to fix. Allows a following cvt to eliminate the bfe in
certain situations.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-20 22:16:46 -04:00
Ilia Mirkin ecebd3dbfc glsl: expose textureQueryLod in GLSL 4.00+ fragment shaders
See issue from the ARB_texture_query_lod spec for LOD vs Lod confusion:

    (3) The core specification uses the "Lod" spelling, not "LOD".  Should
        this extension be modified to use "Lod"?

      RESOLVED: The "Lod" spelling is the correct spelling for the core
      specification and the preferred spelling for use. However, use of
      "LOD" also exists, as the extension predated the core specification,
      so this extension won't remove use of "LOD".

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2015-08-20 21:05:19 -04:00
Nanley Chery 29e953b07b Revert "mesa/formats: refactor by collapsing cases in switch statement by type"
This reverts commit ffe6c6ad5f.

_mesa_format_num_components() does not include the padding bits in mesa formats
containing 'X' channels. This could cause mipmap generation for certain
uncompressed formats to underestimate the number of channels in the source
image by 1.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-20 18:06:47 -07:00
Glenn Kennard 4237dfb978 r600g: Fix handling of TGSI_OPCODE_ARR with SB
FLT_TO_INT goes in the vector pipes on evergreen/NI,
not the trans unit as on earlier chips.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-21 09:46:13 +10:00
Edward O'Callaghan 7a32652231 r600: Turn 'r600_shader_key' struct into union
This struct was getting a bit crowded, following the lead of
radeonsi, mirror the idea of having sub-structures for each
shader type. Turning 'r600_shader_key' into an union saves
some trivial memory and CPU cycles for the shader keys.

[airlied: drop as_ls, and reorder so larger fields at start.]
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-21 09:46:13 +10:00
Edward O'Callaghan e2145de74d r600: Rewrite r600_shader_selector_key() to use a switch stmt
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-21 09:46:13 +10:00
Jason Ekstrand bbf8291bf8 i965: Use NIR by default for vertex shaders
Shader-db results for vec4 on i965:

   total instructions in shared programs: 1499894 -> 1502261 (0.16%)
   instructions in affected programs:     1414224 -> 1416591 (0.17%)
   helped:                                2434
   HURT:                                  10543
   GAINED:                                1
   LOST:                                  0

Shader-db results for vec4 on g4x:

   total instructions in shared programs: 1437411 -> 1439779 (0.16%)
   instructions in affected programs:     1362402 -> 1364770 (0.17%)
   helped:                                2434
   HURT:                                  10544
   GAINED:                                0
   LOST:                                  0

Shader-db results for vec4 on Iron Lake:

   total instructions in shared programs: 1437214 -> 1439593 (0.17%)
   instructions in affected programs:     1362205 -> 1364584 (0.17%)
   helped:                                2433
   HURT:                                  10544
   GAINED:                                1
   LOST:                                  0

Shader-db results for vec4 on Sandy Bridge:

   total instructions in shared programs: 2022092 -> 1941570 (-3.98%)
   instructions in affected programs:     1886838 -> 1806316 (-4.27%)
   helped:                                7510
   HURT:                                  10737
   GAINED:                                0
   LOST:                                  0

Shader-db results for vec4 on Ivy Bridge:

   total instructions in shared programs: 1853749 -> 1804960 (-2.63%)
   instructions in affected programs:     1686736 -> 1637947 (-2.89%)
   helped:                                6735
   HURT:                                  11101
   GAINED:                                0
   LOST:                                  0

Shader-db results for vec4 on Haswell:

   total instructions in shared programs: 1853749 -> 1804960 (-2.63%)
   instructions in affected programs:     1686736 -> 1637947 (-2.89%)
   helped:                                6735
   HURT:                                  11101
   GAINED:                                0
   LOST:                                  0

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-08-20 16:34:31 -07:00
Kai Wasserbäch 6921f170b6 glsl: check if return_deref in lower_subroutine_visitor::visit_leave isn't NULL
Fixes a crash in Piglit's
spec@arb_shader_subroutine@linker@no-mutual-recursion.vert for me.

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-21 09:23:52 +10:00
Tobias Klausmann 3e6adbd761 nv50/ir: Handle OP_CVT when folding constant expressions
[imirkin: handle more type combinations, use macro]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-20 17:58:30 -04:00
Ilia Mirkin f5b926183d nvc0/ir: undo more shifts still by allowing a pre-SHL to occur
This happens with unpackSnorm lowering. There's yet another
bitfield-extract behind it, but there's too much variation to be worth
cutting through.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-20 17:58:30 -04:00
Ilia Mirkin 9ebe7dc094 nvc0/ir: don't require AND when the high byte is being addressed
unpackUnorm* lowering doesn't AND the high byte/word as it's
unnecessary. Detect that situation as well.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-20 17:58:30 -04:00
Ilia Mirkin 63cb85e567 nvc0/ir: detect i2f/i2i which operate on specific bytes/words
Some Unigine shaders have been observed to unpack bytes out of 32-bit
integers and convert them to floats. I2F/I2I can handle this sort of
thing directly. Detect the handleable situations.

This misses 16-bit word capabilities in nv50, but I haven't seen shaders
that would actually make use of that.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-20 17:58:30 -04:00
Ilia Mirkin 51499bb5ff nvc0/ir: detect AND/SHR pairs and convert into EXTBF
Some shaders appear to extract bits using shift/and combos. Detect
(some) of those and convert to EXTBF instead.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-20 17:58:30 -04:00
Chih-Wei Huang 2a4af36517 nv50/ir: support different unordered_set implementations
If build with C++11 standard, use std::unordered_set.

Otherwise if build on old Android version with stlport,
use std::tr1::unordered_set with a wrapper class.

Otherwise use std::tr1::unordered_set.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-20 17:58:30 -04:00
Martin Peres 56ebd3314b i965: Fix "handle nir_intrinsic_image_size"
I pushed a half-baked version of "i965: handle nir_intrinsic_image_size" by
accident. Not having the Reviewed-by: tags on the last two commits should
have been a red flag but I somehow missed it after the QA check.

This patch should fix image-size for non-int images. I will add support to
the piglit test for all the other image types.

Sorry for the noise.

Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-08-20 15:28:21 +03:00