Commit Graph

64020 Commits

Author SHA1 Message Date
Neil Roberts c6398a38af docs: Update GL3.txt and relnotes for GL_ARB_clear_texture 2014-07-23 12:10:37 +01:00
Neil Roberts 0779f37e15 meta: Add a meta implementation of GL_ARB_clear_texture
Adds an implementation of the ClearTexSubImage driver entry point that tries
to set up an FBO to render to the texture and then calls glClearBuffer with a
scissor to perform the actual clear. If an FBO can't be created for the
texture then it will fall back to using _mesa_store_ClearTexSubImage.

When used in combination with _mesa_store_ClearTexSubImage this should provide
an implementation that works for all DRI-based drivers. However as this has
only been tested with the i965 driver it is currently only enabled there.

v2: Only enable the extension for the i965 driver instead of all DRI drivers.
    Remove an unnecessary goto. Don't require GL_ARB_framebuffer_object. Add
    some more comments.

v3: Use glClearBuffer* to avoid having to modify glClearColor and friends.
    Handle sRGB textures. Explicitly disable dithering.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-07-23 11:50:38 +01:00
Neil Roberts 05b52efbc9 meta: Add a state flag for the GL_DITHER
The Meta implementation of glClearTexSubImage is going to want to ensure that
dithering is disabled so that it can get a consistent color across the whole
texture when clearing. This adds a state flag to easily save it and set it to
the default value when performing meta operations.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-07-23 11:50:38 +01:00
Neil Roberts df9945ca26 texstore: Add a generic implementation of GL_ARB_clear_texture
Adds an implementation of the ClearTexSubImage driver entry point that just
maps the texture and writes the values in. The extension is not yet enabled by
default because it doesn't work with multisample textures as they don't have a
simple linear layout.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-07-23 11:50:38 +01:00
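
The "map the texture and write the values in" approach described for df9945ca26 can be sketched roughly as follows. This is an illustration only, not Mesa's code: clear_rect_linear and all of its parameters are hypothetical names, and the sketch assumes the simple linear layout that the commit message says multisample textures lack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a Mesa function): fill a width x height
 * rectangle whose top-left texel is (x, y) with a single texel value.
 * Assumes a linear layout: rows are row_stride bytes apart and texels
 * within a row are tightly packed, texel_size bytes each. */
void
clear_rect_linear(uint8_t *map, size_t row_stride, size_t texel_size,
                  size_t x, size_t y, size_t width, size_t height,
                  const void *clear_value)
{
   assert(map != NULL && clear_value != NULL);

   for (size_t row = 0; row < height; row++) {
      /* Start of the rectangle on this row of the mapped image. */
      uint8_t *dst = map + (y + row) * row_stride + x * texel_size;

      for (size_t col = 0; col < width; col++) {
         memcpy(dst, clear_value, texel_size);
         dst += texel_size;
      }
   }
}
```

The caller is assumed to have validated the rectangle against the image bounds before mapping; the sketch does no bounds checking of its own.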
Neil Roberts fbbbf7529c mesa/main: Add generic bits of ARB_clear_texture implementation
This adds the driver entry point for glClearTexSubImage and fills in the
_mesa_ClearTexImage and _mesa_ClearTexSubImage functions that call it.

v2: Don't clear any of the images if even one of them generates an error

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-07-23 11:50:38 +01:00
Neil Roberts 2e63f91e60 teximage: Add utility func for format/internalFormat compatibility check
In texture_error_check() there was a snippet of code to check whether the
given format and internal format are basically compatible. This has been split
out into its own static helper function so that it can be used by an
implementation of glClearTexImage too.
2014-07-23 11:50:38 +01:00
Ilia Mirkin c4067acd90 mesa/main: add ARB_clear_texture entrypoints
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2014-07-23 11:50:37 +01:00
Michel Dänzer 07c65b85ea r600g/radeonsi: Use write-combined CPU mappings of some BOs in GTT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-07-23 18:55:50 +09:00
Michel Dänzer 37d43ebb28 winsys/radeon: Use separate caching buffer managers for VRAM and GTT
Should reduce overhead because the caching buffer manager doesn't need to
consider buffers of the wrong type.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-07-23 15:43:04 +09:00
Dave Airlie 2c947760ed docs/GL3.txt: update status for ARB_compute_shader
since some bits are done in tree, but nobody is working on it anymore.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-07-23 11:06:15 +10:00
Anuj Phogat 9548ba6e7b mesa: Don't use memcpy() in _mesa_texstore() for float depth texture data
because float depth texture data needs clamping to [0.0, 1.0]. Let
_mesa_texstore() fall back to the slower path.

Fixes Khronos GLES3 CTS tests:
shadow_execution_vert
shadow_execution_frag

V2: Move the check to _mesa_texstore_can_use_memcpy() function.
    Add check for floating point data types.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-07-21 18:33:29 -07:00
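
The reasoning in 9548ba6e7b (a raw copy would skip the clamp to [0.0, 1.0]) can be illustrated with a small sketch. The names store_depth_floats, clamp01 and needs_clamp are hypothetical and do not reflect how _mesa_texstore() or _mesa_texstore_can_use_memcpy() are actually written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Clamp one value to the [0.0, 1.0] range. */
float
clamp01(float v)
{
   return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* Hypothetical store routine (an assumption, not Mesa's code):
 * take the fast memcpy path only when no clamping is required,
 * otherwise fall back to a slower per-element loop. */
void
store_depth_floats(float *dst, const float *src, size_t count,
                   bool needs_clamp)
{
   assert(dst != NULL && src != NULL);

   if (!needs_clamp) {
      /* Fast path: the bytes are copied unmodified. */
      memcpy(dst, src, count * sizeof *dst);
      return;
   }

   /* Slow path: every element is clamped individually. */
   for (size_t i = 0; i < count; i++)
      dst[i] = clamp01(src[i]);
}
```

With needs_clamp set, out-of-range inputs such as -0.5 and 2.0 are stored as 0.0 and 1.0; with the memcpy path they would be stored unchanged, which is the behaviour the commit avoids for float depth data.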
Kenneth Graunke 29af97f280 i965/fs: Fix gl_SampleMask handling for SIMD16 on Gen8+.
We actually want to use mov(16), not mov(8).

Fixes 7 Piglit tests: ARB_sample_shading/builtin-gl-sample-mask [2468]
and ARB_sample_shading/builtin-gl-sample-mask-simple [468].

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80991
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-07-21 14:59:13 -07:00
Kenneth Graunke 38ffef7840 i965/fs: Fix gl_SampleID for 2x MSAA and SIMD16 mode.
We might be able to do this without an extra program key field, but this
is non-invasive and fixes the bug, for now.

This fixes the following Piglit tests on Broadwell:
- ARB_sample_shading/builtin-gl-sample-id 2
- ARB_sample_shading/builtin-gl-sample-position 2
- EXT_framebuffer_multisample/multisample-blit 2 color
- EXT_framebuffer_multisample/multisample-blit 2 color linear
- EXT_framebuffer_multisample/multisample-blit 2 depth
- EXT_framebuffer_multisample/no-color 2 depth combined
- EXT_framebuffer_multisample/no-color 2 depth separate
- EXT_framebuffer_multisample/no-color 2 depth single
- EXT_framebuffer_multisample/no-color 2 depth-computed combined
- EXT_framebuffer_multisample/no-color 2 depth-computed separate
- EXT_framebuffer_multisample/no-color 2 depth-computed single
- EXT_framebuffer_multisample/unaligned-blit 2 color msaa
- EXT_framebuffer_multisample/unaligned-blit 2 depth msaa

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80991
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-07-21 14:59:12 -07:00
Kenneth Graunke 4cf47c80fc i965: Add missing persample_shading field to brw_wm_debug_recompile.
Otherwise, the performance warning for shader recompiles will just say
"something else".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-07-21 11:19:44 -07:00
Kenneth Graunke caf8c07dd4 i965/disasm: Don't disassemble the URB complete field on Broadwell.
It doesn't exist, so attempting to read it will trigger generation
assertions in the brw_inst API.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-07-21 11:19:17 -07:00
Kenneth Graunke 662f1ccc24 i965: Disable hex offset printing in disassembly.
Printing the hex offsets makes it basically impossible to diff assembly:
if you add even a single instruction, the entire shader shows up as a
difference.  So, every time I want to compare assembly, I have to strip
this out.

The hex offsets might be useful when debugging compaction, or when
inspecting the program cache buffer.  Since it's occasionally useful,
but uncommon, this patch disables it by default, but makes it easy to
re-enable it temporarily when the need arises.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-07-21 11:19:08 -07:00
Matt Turner 3e9105f7ee i965/vec4: Use foreach_inst_in_block a couple more places.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-07-21 10:35:41 -07:00
Matt Turner 1761671b06 i965: Replace cfg instances with calls to calculate_cfg().
Avoids regenerating it unnecessarily.

Every program in shader-db improved, none by an amount less than a 1/3
reduction. One Dota2 shader decreased from 62 -> 24.

cfg calculations:     429492 -> 193197 (-55.02%)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-07-21 10:35:39 -07:00
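
The idea behind 1761671b06 (build the CFG once and reuse it rather than regenerating it at every use) resembles the lazy-initialisation pattern sketched below. Everything here is an assumption made for illustration: struct visitor, ensure_cfg, invalidate_cfg and the cfg_builds counter are hypothetical and are not the i965 backend's types or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a control flow graph; the real structure is far richer. */
struct cfg {
   int num_blocks;
};

/* Hypothetical visitor that owns a cached CFG. */
struct visitor {
   struct cfg *cfg;      /* NULL while no valid CFG is cached */
   struct cfg storage;   /* backing storage for the cached CFG */
   int cfg_builds;       /* counts how often the CFG was rebuilt */
};

/* Build the CFG only if there is no cached one. */
void
ensure_cfg(struct visitor *v)
{
   assert(v != NULL);

   if (v->cfg != NULL)
      return;            /* still valid, nothing to recompute */

   v->storage.num_blocks = 1;   /* placeholder for the real analysis */
   v->cfg = &v->storage;
   v->cfg_builds++;
}

/* Passes that change the instruction stream would drop the cache. */
void
invalidate_cfg(struct visitor *v)
{
   assert(v != NULL);
   v->cfg = NULL;
}
```

In this sketch, two consecutive ensure_cfg() calls perform a single build; a rebuild happens only after invalidate_cfg().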
Matt Turner dd65a6d9ad i965/cfg: Add a foreach_block_and_inst macro.
Will let us abstract how the instructions are stored.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-07-21 10:35:38 -07:00
Matt Turner 680fe0acb3 i965: Add cfg to backend_visitor.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-07-21 10:35:34 -07:00
Tom Stellard b0f780345e radeonsi/compute: Add support scratch buffer support v2
The scratch buffer will be used for private memory and also register
spilling.

v2:
  - Code cleanups
2014-07-21 10:00:09 -04:00
Tom Stellard 6cc5334e42 radeonsi/compute: Bump number of user sgprs for LLVM 3.5
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-07-21 10:00:09 -04:00
Tom Stellard 81385f7596 winsys/radeon: Query the kernel for the number of SEs and SHs per SE
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-07-21 10:00:09 -04:00
Tom Stellard 245e86168a radeonsi/compute: Share COMPUTE_DBG macro with r600g
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-07-21 10:00:09 -04:00
Tom Stellard 9ba3105e0a radeonsi: Read rodata from ELF and append it to the end of shaders
This is used for programs that have arrays of constants that
are accessed using dynamic indices.  The shader will compute
the base address of the constants and then access them using
SMRD instructions.
2014-07-21 10:00:09 -04:00
Ian Romanick 01c21c459f glsl: Fix bad indentation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-07-19 15:04:04 -07:00
Ian Romanick 47e2a74a5a i965: Silence unused parameter warning
brw_fs_visitor.cpp:2400:1: warning: unused parameter 'ir' [-Wunused-parameter]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-07-19 15:04:01 -07:00
Ian Romanick 22b9641edf i965: Silence 'comparison is always true' warning
The parameter is an int16_t, and we're checking that its value will fit in
16-bits.  Yes, the value that is stored in 16-bits will surely fit in
16-bits.

brw_inst.h: In function 'brw_inst_set_gen6_jump_count':
brw_inst.h:321:66: warning: comparison is always true due to limited range of data type [-Wtype-limits]
brw_inst.h:321:66: warning: comparison is always true due to limited range of data type [-Wtype-limits]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-07-19 15:03:57 -07:00
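
The warning silenced in 22b9641edf arises because a range check applied to a value that is already an int16_t can never fail. The sketch below shows a check that is meaningful because it runs on a wider type before narrowing. fits_in_int16 is a hypothetical helper, and this is not necessarily how the commit resolved the warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Meaningful check: the argument is a wider type, so it can actually
 * lie outside the int16_t range.
 *
 * If the parameter were declared int16_t instead, the same comparison
 * would always be true, and GCC may report "comparison is always true
 * due to limited range of data type" under -Wtype-limits. */
bool
fits_in_int16(int32_t value)
{
   return value >= INT16_MIN && value <= INT16_MAX;
}

/* Narrow only after the wide value has been validated. */
int16_t
narrow_to_int16(int32_t value)
{
   assert(fits_in_int16(value));
   return (int16_t)value;
}
```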
Ian Romanick 1946612b7d i965: Silence many unused parameter warnings
brw_inst.h: In function 'brw_inst_set_src1_vstride':
brw_inst.h:118:76: warning: unused parameter 'brw' [-Wunused-parameter]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-07-19 15:03:49 -07:00
Vinson Lee f6fc807345 configure.ac: Add LLVM patch version to error message.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-07-18 21:33:38 -07:00
Jason Ekstrand ecd3e89b32 main/format_pack: Fix a wrong datatype in pack_ubyte_R8G8_UNORM
Before it was only storing one of the color components due to truncation.
With this patch it now properly stores all of them.

Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-07-18 18:34:36 -07:00
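
The truncation described in ecd3e89b32 can be reproduced in miniature: a two-component texel packed into 16 bits loses a component when it is stored through an 8-bit type. The function names below are hypothetical, and the bit layout (R in the low byte, G in the high byte) is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pack two 8-bit components into 16 bits.
 * Assumed layout: R in bits 0-7, G in bits 8-15. */
uint16_t
pack_r8g8(uint8_t r, uint8_t g)
{
   return (uint16_t)(r | (g << 8));
}

/* Buggy variant: the destination type is only 8 bits wide, so the
 * high byte (G in this layout) is silently dropped. */
void
store_r8g8_truncating(void *dst, uint8_t r, uint8_t g)
{
   uint8_t texel = (uint8_t)pack_r8g8(r, g);
   memcpy(dst, &texel, sizeof texel);
}

/* Fixed variant: a 16-bit destination keeps both components. */
void
store_r8g8(void *dst, uint8_t r, uint8_t g)
{
   uint16_t texel = pack_r8g8(r, g);
   assert(dst != NULL);
   memcpy(dst, &texel, sizeof texel);
}
```

Only store_r8g8() writes both bytes of the texel; the truncating variant writes a single byte and leaves the other untouched.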
Carl Worth 8ed24543f8 docs: Import 10.2.4 release notes
And add a news item.
2014-07-18 16:50:05 -07:00
Jason Ekstrand f14d217f5c Add support for RGBA8 and RGBX8 textures in intel_texsubimage_tiled_memcpy
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2014-07-17 18:20:09 -07:00
Jason Ekstrand 765f4b8c04 i965: Improve debug output in intelTexImage and intelTexSubimage
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2014-07-17 18:20:09 -07:00
Marek Olšák d808de31bd radeonsi: only update vertex buffers when they need updating
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:59 +02:00
Marek Olšák 6210d6fdc2 radeonsi: remove nr_vertex_buffers
Unused.

Also inline util_set_vertex_buffers_count and simplify it.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:59 +02:00
Marek Olšák 0ed0bf0696 radeonsi: move vertex buffer descriptors from IB to memory
This removes the intermediate storage (pm4 state) and generates descriptors
directly in a staging buffer.

It also reduces the number of flushes, because the descriptors no longer
take CS space.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:59 +02:00
Marek Olšák 1635ded828 radeonsi: add support for fine-grained sampler view updates
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:59 +02:00
Marek Olšák bea8f2f46d radeonsi: move si_set_sampler_views to si_descriptors.c
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:59 +02:00
Marek Olšák dd46841bc9 radeonsi: move sampler descriptors from IB to memory
Sampler descriptors are now represented by si_descriptors.
This also adds support for fine-grained sampler state updates and
the border color update is now isolated in a separate function.

Border colors have been broken if texturing from multiple shader stages is
used. This patch doesn't change that.

BTW, blitting already makes use of fine-grained state updates.
u_blitter uses 2 textures at most, so we only have to save 2.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:59 +02:00
Marek Olšák 2a7b57ad42 radeonsi: implement ARB_draw_indirect
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:58 +02:00
Marek Olšák 887b69a233 radeonsi: don't add info->start to the index buffer offset
info->start will be invalid once info->indirect isn't NULL, so it shouldn't
be added to ib.offset.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:58 +02:00
Marek Olšák 09056b352d radeonsi: use an SGPR instead of VGT_INDX_OFFSET
The draw indirect packets cannot set VGT_INDX_OFFSET, they can only set user
data SGPRs. This is the only way to support start/index_bias with indirect
drawing.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:58 +02:00
Marek Olšák a66d934139 radeonsi: assume LLVM 3.4.2 is always present
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:58 +02:00
Marek Olšák 4ad682461e configure.ac: require LLVM 3.4.2 for radeon
Needed by ARB_draw_indirect.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:58 +02:00
Marek Olšák 3a86ca54df st/mesa,gallium: add a workaround for Unigine Heaven 4.0 and Valley 1.0
Most (all?) Unigine shaders fail to compile without this if sample shading
is advertised. This is, of course, Unigine developers' fault.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-07-18 01:58:58 +02:00
Marek Olšák b0ff18bd34 glsl: add a mechanism to allow #extension directives in the middle of shaders
This is needed to make Unigine Heaven 4.0 and Unigine Valley 1.0 work
with sample shading.

Also, if this is disabled, the error message at least makes sense now.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-07-18 01:58:58 +02:00
Glenn Kennard 392c9f8dfe r600g: Implement GL_ARB_texture_gather
Only supported on evergreen and later. Currently limited
to single component textures as the hardware GATHER4
instruction ignores texture swizzles.

Piglit quick run passes on radeon 6670 with all
applicable textureGather tests, no regressions.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2014-07-18 01:58:58 +02:00
Anuj Phogat 984a02ba55 i965: Fix z_offset computation in intel_miptree_unmap_depthstencil()
The bug is triggered by using glTexSubImage2D() with GL_DEPTH_STENCIL
as base internal format and non-zero x, y offsets. Currently x, y
offsets are ignored while updating the texture image.

Fixes Khronos GLES3 CTS tests:
npot_tex_sub_image_2d
npot_tex_sub_image_3d
npot_pbo_tex_sub_image_2d
npot_pbo_tex_sub_image_2d

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-07-17 15:52:27 -07:00
Anuj Phogat 5d9f5cd35b Revert "i965: Extend compute-to-mrf pass to understand blocks of MOVs"
This reverts commit bbefb15e01.
Fixes the 11 regressions it caused in the framebuffer_blit tests of the
Khronos GLES3 CTS.

The original patch reduced the instruction count but had no performance
benefit, so it is safe to revert it without causing any performance
regressions.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-07-17 15:49:46 -07:00