Commit Graph

66407 Commits

Author SHA1 Message Date
Iago Toral Quiroga d2719b6e4f glsl: lower SSBO atomic intrinsics
The first argument to SSBO atomics is a reference to a SSBO buffer variable
so we want to compute its block index and offset and provide these values
to an internal version of the intrinsic that takes them instead of the
buffer variable reference.

v2:
- Support single components of integer vectors to be passed in as arguments.
- Get interface packing information from interface's type.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez da659087b9 glsl: use ir_rvalue instead of ir_dereference in auxiliary functions
In a later commit we will need to handle ir_swizzle nodes too, which are
not an ir_dereference. That can happen, for example, when we pass a
component of an integer vector as argument to any of the SSBO atomic
functions.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Iago Toral Quiroga ea0a1f5beb glsl: Add atomic functions from ARB_shader_storage_buffer_object
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Iago Toral Quiroga 2cacebaad3 glsl: Rename atomic counter functions
Shader Storage Buffer Object will add new atomic functions that are not
associated with counters, so better have atomic counter-specific functions
explicitly include the word "counter" in their names.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez 586142658e glsl: atomic counters can be declared as buffer-qualified variables
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Iago Toral Quiroga 475d9c32d1 nir/glsl_to_nir: ignore an instruction's dest if it hasn't any
Reviewed-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Iago Toral Quiroga e3f9c7829c i965/nir/vec4: Implement nir_intrinsic_load_ssbo
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Iago Toral Quiroga 5b186aafe7 i965/nir/fs: Implement nir_intrinsic_load_ssbo
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Iago Toral Quiroga e59ae238b6 nir: Implement __intrinsic_load_ssbo
v2:
- Fix ssbo loads with boolean variables.

v3:
- Simplify the changes (Kristian)

Reviewed-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez 3e70c968de nir: modify the instruction insertion in nir_visitor::visit(ir_call *ir)
This patch moves nir_instr_insert_after_cf_list call into each case
in the intrinsics switch at nir_visitor::visit(ir_call *ir) and
define a nir_dest variable which will be used when handling
ir->return_deref after the switch.

This patch simplifies the code for nir_intrinsic_load_ssbo
implementation changes we are going to do next.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Iago Toral Quiroga 922b3d1bb1 i965/nir/vec4: Implement nir_intrinsic_store_ssbo
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Iago Toral Quiroga 337dad8cee i965/nir/fs: Implement nir_intrinsic_store_ssbo
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Iago Toral Quiroga 9bb7d9ecf8 nir: Implement __intrinsic_store_ssbo
v2 (Connor):
 - Make the STORE() macro take arguments for the extra sources (and their
   size) and any extra indices required.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Francisco Jerez f17c6b9066 i965/vec4: Import surface message builder functions.
Implement helper functions that can be used to construct and send
untyped and typed surface read, write and atomic messages to the
shared dataport unit.

v2: Split from the FS implementation.
v3: Rewrite to avoid evil array_reg, emit_collect and emit_zip.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Francisco Jerez d5503ce39f i965/vec4: Import helpers to convert vectors into arrays and back.
These functions handle the conversion of a vec4 into the form expected
by the dataport unit in message and message return payloads.  The
conversion is not always trivial because some messages don't support
SIMD4x2 for some generations, in which case a strided copy may be
necessary.

v2: Split from the FS implementation.
v3: Rewrite to avoid evil array_reg, emit_collect and emit_zip.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Francisco Jerez 402cb7ce13 i965/vec4: Introduce VEC4 IR builder.
See "i965/fs: Introduce FS IR builder." for the rationale.

v2: Drop scalarizing VEC4 builder.
v3: Take a backend_shader as constructor argument.  Improve handling
    of debug annotations and execution control flags.  Rename "instr"
    variable.  Initialize cursor to NULL by default and add method to
    explicitly point the builder at the end of the program.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez 203cd1bf28 glsl: shader storage blocks use different max block size values than uniforms
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez eb9a9b62b1 glsl: ignore buffer variables when counting uniform components
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez 138e4ae8ae glsl: number of active shader storage blocks must be within allowed limits
Notice that we should differentiate between shader storage blocks and
uniform blocks, since they have different limits.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez a7b4ab45d0 glsl: a shader storage buffer must be smaller than the maximum size allowed
Otherwise, generate a link time error as per the
ARB_shader_storage_buffer_object spec.

v2:
- Fix error message (Jordan)

v3:
- Move std140_size() changes to its own patch (Kristian)

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez e854a98001 glsl: add std430 interface packing support to ssbo related operations
v2:
- Get interface packing information from interface's type, not the
  variable type.
- Simplify is_std430 condition in emit_access() for readability (Jordan)
- Add a commment explaing why array of three-component vector case is
  different in std430 than the rest of cases.
- Add calls to std430_array_stride().

v3:
- Simplify size_mul change for std430's case (Jordan)
- Fix commit log lines length (Jordan)
- Pass 'packing' instead of 'is_std430' to emit_access() (Kristian)

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez 1be180b941 glsl: Add std430 support to program_resource_visitor's member functions
They are used to calculate the offset, array stride of uniform/shader
storage buffer variables. Take into account this info to get the right
value for std430.

v2:
- Fix commit log line length and indention. (Jordan)

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez 8f0167c65b glsl: Add parser/compiler support for std430 interface packing qualifier
v2:
- Fix a missing check in has_layout()

v3:
- Mention shader storage block in error message for layout qualifiers
  (Kristian).

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez 35476c2bae glsl: Add std430 related member functions to glsl_type class
They are used to calculate size, base alignment and array stride values
for a glsl_type following std430 rules.

v2:
- Paste OpenGL 4.3 spec wording as it mentions stride of array. (Jordan)

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez a40f917c4b glsl: allow default qualifiers for shader storage block definitions
This kind of definitions:

    layout(xxx) buffer;

was not supported by commit 84fc5fece0.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez 3763a0e0a7 glsl: Move interface block processing to glsl_parser_extras.cpp
No functional changes.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez 9c1f10b1bc glsl: ignore default qualifier declarations when checking for duplicate layout qualifiers
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez 130031168d glsl: layout qualifier can appear more than once since OpenGL 4.20
Also if GL_ARB_shading_language_420pack extension is enabled.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez 5bb5eeea00 i965/wm: surfaces should have the API buffer size, not the drm buffer size
The returned drm buffer object has a size multiple of 4096 but that should not
be exposed to the API user, which is working with a different size.

As far as I can see this problem is only visible in the calculation of the
length of unsized arrays used in SSBOs, as the implementation of this needs
to query the underlying buffer size via a message.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez eaa6f01c8d i965/wm: emit null buffer surfaces when null buffers are attached
Otherwise we can expect odd things to happen if, for example, we ask
for the size of the attached buffer from shader code, since that
might query this value from the surface we uploaded and get random
results.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez f5dd2c1822 i965/fs/nir: implement nir_intrinsic_get_buffer_size
v2:
- Remove inst->regs_written assignment as the instruction only
  writes to one register.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez b23eb643eb i965/fs: Implement FS_OPCODE_GET_BUFFER_SIZE
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez 65d7f5fe9f i965/vec4/nir: implement nir_intrinsic_get_buffer_size
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez 6485880232 i965/vec4: Implement VS_OPCODE_GET_BUFFER_SIZE
Notice that Skylake needs to include a header in the sampler message
so it will need some tweaks to work there.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez 003ce30e36 nir: Implement ir_unop_get_buffer_size
This is how backends provide the buffer size required to compute
the size of unsized arrays in the previous patch

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez 750c694474 glsl: implement unsized array length
v2:
- Reduce the number of lines over 80 character line width
  limit. (Thomas Hellan)

v3:
- Inject the formula to compute the array length in the IR, backends
  only need to provide the buffer size (Curro)
- Create an auxiliary function to simplify code (Jordan Justen)
- Rename variables (Jordan Justen)

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez 273f61a005 glsl: Add parser/compiler support for unsized array's length()
The unsized array length is computed with the following formula:

array.length() =
   max((buffer_object_size - offset_of_array) / stride_of_array, 0)

Of these, only the buffer size needs to be provided by the backends, the
frontend already knows the values of the two other variables.

This patch identifies the cases where we need to get the length of an
unsized array, injecting ir_unop_ssbo_unsized_array_length expressions
that will be lowered (in a later patch) to inject the formula mentioned
above.

It also adds the ir_unop_get_buffer_size expression that drivers will
implement to provide the buffer length.

v2:
- Do not define a triop that will force backends to implement the
  entire formula, they should only need to provide the buffer size
  since the other values are known by the frontend (Curro).

v3:
- Call state->has_shader_storage_buffer_objects() in ast_function.cpp instead
  of using state->ARB_shader_storage_buffer_object_enable (Tapani).

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez 1440d2a683 glsl: Add unsized array support to glsl_type::std140_size()
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez 68f5a4e6d2 glsl: fix indention in glsl_types.cpp
No functional changes.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez f3f64cd0c4 glsl: add support for unsized arrays in shader storage blocks
They only can be defined in the last position of the shader
storage blocks.

When an unsized array is used in different shaders, it might be
converted in different sized arrays, avoid get a linker error
in that case.

v2:
- Rework error condition and error messages (Timothy Arceri)

v3:
- Move OpenGL ES check to its own patch.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez f45d39f6af glsl: return error if unsized arrays are found in OpenGL ES
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:20 +02:00
Iago Toral Quiroga 6335c79236 i965/fs: Do not split buffer variables
Buffer variables are the same as uniforms, only that read/write, so we want
the same treatment.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:20 +02:00
Iago Toral Quiroga 2773a7cf1d i965: handle visiting of ir_var_shader_storage variables
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:20 +02:00
Iago Toral Quiroga 37da6a2acd i965: Upload Shader Storage Buffer Object surfaces
Since these are a special kind of UBOs we emit them together reusing the
same infrastructure, however, we use a RAW surface so we can reuse
existing untyped read/write/atomic messages which include a pixel mask
header that we need to set to obtain correct behavior with helper
invocations of the fragment shader.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:20 +02:00
Iago Toral Quiroga bdbabc57e3 i965: Set MaxShaderStorageBuffers for compute shaders
v2:
- Set it after the driver's MaxShaderStorageBuffers value assignment.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:20 +02:00
Samuel Iglesias Gonsalvez 36f392c4ef i965: set ARB_shader_storage_buffer_object related constant values
v2:
- Add tessellation shader constants assignment

v3:
- Set MaxShaderStorageBufferBindings to 36.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:20 +02:00
Iago Toral Quiroga dfdeb94a5a i965: Implement DriverFlags.NewShaderStorageBuffer
We use the same dirty state for SSBOs and UBOs because they share the
same infrastructure.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:20 +02:00
Iago Toral Quiroga 332ff009ff i965: Use 64-byte offset alignment for shader storage buffers
This should be a cacheline (64 bytes) so that we can safely have the
CPU and GPU writing the same SSBO on non-cachecoherent systems (our
Atom CPUs). With UBOs, the GPU never writes, so there's no
problem. For an SSBO, the GPU and the CPU can be updating disjoint
regions of the buffer simultaneously and that will break if the
regions overlap the same cacheline.

v2:
- Use cacheline size (64 bytes) instead of 16 bytes (Kristian).
- Update commit log and add a comment in the code explaining
  why we use cacheline size (Ben).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:20 +02:00
Samuel Iglesias Gonsalvez 4cf908f9cb mesa: set MAX_SHADER_STORAGE_BUFFERS to 16.
v2:
- Set the value to 16 and drop the comment. (Kristian)

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-25 08:39:20 +02:00
Tapani Pälli 4639cea292 glsl: add packed varyings to program resource list
This makes sure that user is still able to query properties about
variables that have gotten packed by lower_packed_varyings pass.

Fixes following OpenGL ES 3.1 test:
   ES31-CTS.program_interface_query.separate-programs-vertex

v2: fix 'name included in packed list' check (Ilia Mirkin)
v3: iterate over instances of name using strtok_r (Ilia Mirkin)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
2015-09-25 08:14:41 +03:00
Tapani Pälli a6b55beb78 mesa: add packed_varyings list to gl_shader
This is required to store information about packed varyings, currently
these variables get lost and cannot be retrieved later in sensible way
for program interface queries. List will be utilized by next patch.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
2015-09-25 08:05:59 +03:00
Jordan Justen ebbe6cdad7 i965/cs: Implement DispatchComputeIndirect support
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-24 19:15:13 -07:00
Jordan Justen d11d018ce3 mesa/cs: Implement glDispatchComputeIndirect
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-24 19:15:13 -07:00
Jordan Justen 12cf91db02 mesa/cs: Support GL_DISPATCH_INDIRECT_BUFFER
v2:
 * Use _mesa_has_compute_shaders (Ilia)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-24 19:15:13 -07:00
Jordan Justen 4a1ba7e6bd mesa/cs: Add _mesa_validate_DispatchCompute
Move API validation to _mesa_validate_DispatchCompute in
api_validate.c.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-09-24 19:15:13 -07:00
Roland Scheidegger 19604d30e1 mesa: fix mipmap generation for immutable, compressed textures
If the immutable compressed texture didn't have the full mip pyramid,
this didn't work, because it tried to generate mip levels for non-existing
levels. _mesa_prepare_mipmap_level() would correctly handle this by returning
FALSE if the mip level didn't exist, however we actually created the
non-existing mip level right before that because we used _mesa_get_tex_image()
before calling _mesa_prepare_mipmap_level(). It would then proceed to crash
(we allocated the mip level, which is a bad idea on an immutable texture,
but didn't initialize the values, leading to assertion failures or segfaults).
Fix this by using _mesa_select_tex_image() instead and call it after
_mesa_prepare_mipmap_level(), as that function will allocate missing mip levels
for non-immutable textures already.
This fixes a (2 year old) crash with astromenace which was hack-fixed in ubuntu
packages instead: http://bugs.debian.org/718680 (I guess most apps do full mip
chains - I believe this app not doing it is actually unintentional, always one
level less than full mip chain...).

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-09-25 00:06:10 +02:00
Matt Turner d6bb46bbe8 glsl: Expose gl_MaxTess{Control,Evaluation}AtomicCounters.
... with only ARB_shader_atomic_counters.

I expected to see interactions with ARB_tessellation_shader in the
ARB_shader_atomic_counters spec, but they do not exist. It seems that we
should unconditionally expose these variables in the presence of
ARB_shader_atomic_counters:

   gl_MaxTessControlAtomicCounters
   gl_MaxTessEvaluationAtomicCounters

This partially reverts commit da7adb99e8. The commit also affected
gl_MaxTessControlImageUniforms and gl_MaxTessEvaluationImageUniforms
similarly but the ARB_shader_image_load_store spec does list an
interaction with ARB_tessellation_shader.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92095
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-09-24 12:15:47 -07:00
Alejandro Piñeiro 7fee23569b i965/vec4: check swizzle before discarding a uniform on a 3src operand
Without this commit, copy propagation is discarded if it involves
a uniform with an instruction that has 3 sources. But 3 sourced
instructions can access scalar values.

For example, this is what vec4_visitor::fix_3src_operand() is already
doing:

   if (src.file == UNIFORM && brw_is_single_value_swizzle(src.swizzle))
      return src;

Shader-db results (unfiltered) on NIR:
total instructions in shared programs: 6259650 -> 6241985 (-0.28%)
instructions in affected programs:     812755 -> 795090 (-2.17%)
helped:                                7930
HURT:                                  0

Shader-db results (unfiltered) on IR:
total instructions in shared programs: 6445822 -> 6441788 (-0.06%)
instructions in affected programs:     296630 -> 292596 (-1.36%)
helped:                                2533
HURT:                                  0

v2:
- Updated commit message, using Matt Turner suggestions
- Move the check after we've created the final value, as Jason
  Ekstrand suggested
- Clean up the condition

v3:
- Move the check back to the original place, to keep things
  tidy, as suggested by Jason Ekstrand

v4:
- Fixed missing is_single_value_swizzle() as pointed by Jason Ekstrand

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-09-24 21:12:53 +02:00
Mauro Rossi 1d040160f8 android: radeonsi: fix sid_tables.h missing LOCAL_MODULE_CLASS
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-09-24 20:05:41 +02:00
Benjamin Bellec ebcc886d87 gallium/radeon: remove the percentage symbol from HUD temperature
The HUD adds '%' if max == 100.

Signed-off-by: Benjamin Bellec <b.bellec@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-09-24 19:54:50 +02:00
Marek Olšák 7bbce21e45 gallium/u_blitter: handle allocation failures
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:43 +02:00
Marek Olšák ae418a7b56 radeonsi: handle dummy constant buffer allocation failure
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:43 +02:00
Marek Olšák b737d9c1dc radeonsi: don't forget to update scratch relocations for LS, HS, ES shaders
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:43 +02:00
Marek Olšák d556346b35 radeonsi: skip drawing if updating the scratch buffer fails
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:43 +02:00
Marek Olšák 1f99b0be7e radeonsi: skip drawing if PS fails to compile or upload
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:43 +02:00
Marek Olšák 237d7cccce radeonsi: skip drawing if VS, TCS, TES, GS fail to compile or upload
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:43 +02:00
Marek Olšák 9b6d9dd7d8 radeonsi: handle fixed-func TCS shader create failure
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:43 +02:00
Marek Olšák 5dbadb0257 radeonsi: handle shader precompile failures
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:43 +02:00
Marek Olšák 263f5a2cf9 radeonsi: skip drawing if GS ring allocations fail
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:43 +02:00
Marek Olšák 22d3ccf5a8 radeonsi: skip drawing if the tess factor ring allocation fails
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:42 +02:00
Marek Olšák 5c219ab552 radeonsi: add malloc fail paths to si_create_shader_state
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:42 +02:00
Marek Olšák 394d67a58f radeonsi: report alloc failure from si_shader_binary_read
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:42 +02:00
Marek Olšák dea834e639 gallium/radeon: add a fail path for depth MSAA texture readback
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:42 +02:00
Marek Olšák f95e695059 gallium/radeon: handle buffer alloc failures in r600_draw_rectangle
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:42 +02:00
Marek Olšák 282b378012 gallium/radeon: handle buffer_map staging buffer failures better
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:42 +02:00
Marek Olšák cd27ff6a0f radeonsi: handle constant buffer alloc failures
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:42 +02:00
Marek Olšák 29dff6f676 radeonsi: handle index buffer alloc failures
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-09-24 19:51:42 +02:00
Marek Olšák f3a0819533 st/mesa: fix front buffer regression after dropping st_validate_state in Blit
Broken by: d082c53249
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92072

Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-09-24 19:51:42 +02:00
Kristian Høgsberg Kristensen 21c1c7ff81 wayland: Add copyright notice for wayland-egl.c
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
2015-09-24 10:51:10 -07:00
Kristian Høgsberg Kristensen 2ea16966ae i965: Respect stride and subreg_offset for ATTR registers
When we assign hw regs to attributes, we don't incorporate the stride
and subreg_offset from the fs_reg. It's rarely used, but the integer
multiplication lowering uses unusual stride and subreg_offset
combination breaks when one source is an attribute.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91970
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-09-24 10:17:27 -07:00
Brian Paul 200aee4247 mesa: rework Driver.CopyImageSubData() and related code
Previously, core Mesa's _mesa_CopyImageSubData() created temporary textures
to wrap renderbuffer sources/destinations.  This caused a bit of a mess in
the Mesa/gallium state tracker because we had to basically undo that
wrapping.

Instead, change ctx->Driver.CopyImageSubData() to take both gl_renderbuffer
and gl_texture_image src/dst pointers (one being null, the other non-null)
so the driver can handle renderbuffer vs. texture as needed.

For the i965 driver, we basically moved the code that wrapped textures
around renderbuffers from copyimage.c down into the met and driver code.

The old code in copyimage.c also made some questionable calls to
_mesa_BindTexture(), etc. which weren't undone at the end.

v2 (Jason Ekstrand): Rework the intel bits
v3 (Brian Paul): Update the temporary st_CopyImageSubData() function.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
2015-09-24 07:52:42 -06:00
Thomas Hellstrom c8cb5ed93c st/xa: Fixups for PIPE_FORMAT_R8_UNORM A8 usage v2.
Check for PIPE_FORMAT_R8_UNORM when setting up the copy shader.
Also re-enable the dest alpha blending with A8 destination that
actually turned out to be correct.

Verified using rendercheck that the composite operators
overreverse, in, out, atop, atopreverse and xor seem to work fine
with a8 destiation.

v2: Fix a copy-paste error.

Reported-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-09-24 04:47:48 -07:00
Ilia Mirkin 1614c39a8f st/mesa: keep track of saturated writes when eliminating dead code
It doesn't matter whether a write is saturated or not, in another
implementation it might even have been a separate opcode. This code was
most likely copied from the copy-propagation pass (where one does have
to distinguish saturation).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-09-24 00:19:55 -04:00
Timothy Arceri 827d794834 glsl: correctly detect inactive UBO arrays
Previously the code was trying to get the packing type from the array not the
interface.

Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Antia Puentes <apuentes@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-09-24 10:07:42 +10:00
Ilia Mirkin 71e187430c i965: add ARB_texture_barrier support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-09-23 15:49:54 -04:00
Kenneth Graunke 31a36ffbc8 i965/gs: Fix extra level of indentation left by the previous commit.
I left a bunch of code indented a level in the previous patch to make
the diff easier to read.  But now we should fix that.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-09-23 11:00:00 -07:00
Kenneth Graunke df31c1850d i965/gs: Use new NIR intrinsics.
By performing the vertex counting in NIR, we're able to elide a ton of
useless safety checks around every EmitVertex() call:

total instructions in shared programs: 3952 -> 3720 (-5.87%)
instructions in affected programs:     3491 -> 3259 (-6.65%)
helped:                                11
HURT:                                  0

Improves performance in Gl32GSCloth by 0.671742% +/- 0.142202% (n=621)
on Haswell GT3e at 1024x768.

This should also make it easier to implement Broadwell's "Static Vertex
Count" feature someday.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-09-23 11:00:00 -07:00
Kenneth Graunke 542d40d698 nir: Add new GS intrinsics that maintain a count of emitted vertices.
This patch also introduces a lowering pass to convert the simple GS
intrinsics to the new ones.  See the comments above that for the
rationale behind the new intrinsics.

This should be useful for i965; it's a generic enough mechanism that I
could see other drivers potentially using it as well, so I don't feel
too bad about putting it in the generic code.

v2:
- Use nir_after_block_before_jump for the cursor (caught by Jason
  Ekstrand - I'd mistakenly used nir_after_block when rebasing this
  code onto the new NIR control flow API).
- Remove the old emit_vertex intrinsic at the end, rather than in
  the middle (requested by Jason).
- Use state->... directly rather than locals (requested by Jason).
- Report progress from nir_lower_gs_intrinsics() (requested by me).
- Remove "Authors:" section from file comment (requested by
  Michael Schellenberger Costa).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-09-23 11:00:00 -07:00
Kenneth Graunke 0a040975ec nir: Add unit tests for control flow graphs.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Connor Abbott <cwabbott0@gmail.com>
2015-09-23 11:00:00 -07:00
Kenneth Graunke fbaa1b19d7 nir/cf: Fix dominance metadata in the dead control flow pass.
The NIR control flow modification API churns the block structure,
splitting blocks, stitching them back together, and so on.  Preserving
information about block dominance is hard (and probably not worthwhile).

This patch makes nir_cf_extract() throw away all metadata, like we do
when adding/removing jumps.

We then make the dead control flow pass compute dominance information
right before it uses it.  This is necessary because earlier work by the
pass may have invalidated it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-09-23 11:00:00 -07:00
Kenneth Graunke 6560838703 nir/cf: Fix unlink_block_successors to actually unlink the second one.
Calling unlink_blocks(block, block->successors[0]) will successfully
unlink the first successor, but then will shift block->successors[1]
down to block->successor[0].  So the successors[1] != NULL check will
always fail.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-09-23 11:00:00 -07:00
Kenneth Graunke 024e5ec977 nir/cf: Alter block successors before adding a fake link.
Consider the case of "while (...) { break }".  Or in NIR:

        block block_0 (0x7ab640):
        ...
        /* succs: block_1 */
        loop {
                block block_1:
                /* preds: block_0 */
                break
                /* succs: block_2 */
        }
        block block_2:

Calling nir_handle_remove_jump(block_1, nir_jump_break) will remove the break.
Unfortunately, it would mangle the predecessors and successors.

Here, block_2->predecessors->entries == 1, so we would create a fake
link, setting block_1->successors[1] = block_2, and adding block_1 to
block_2's predecessor set.  This is illegal: a block cannot specify the
same successor twice.  In particular, adding the predecessor would have
no effect, as it was already present in the set.

We'd then call unlink_block_successors(), which would delete the fake
link and remove block_1 from block_2's predecessor set.  It would then
delete successors[0], and attempt to remove block_1 from block_2's
predecessor set a second time...except that it wouldn't be present,
triggering an assertion failure.

The fix appears to be simple: simply unlink the block's successors and
recreate them to point at the correct blocks first.  Then, add the fake
link.  In the above example, removing the break would cause block_1 to
have itself as a successor (as it becomes an infinite loop), so adding
the fake link won't cause a duplicate successor.

v2: Add comments (requested by Connor Abbott) and fix commit message.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-09-23 10:59:59 -07:00
Kenneth Graunke 0991b2eb35 nir/cf: Conditionally do block_add_normal_succs() in unlink_jump();
There is a bug where we mess up predecessors/successors due to the
ordering of unlinking/recreating edges/adding fake edges.  In order to
fix that, I need everything in one routine.

However, calling block_add_normal_succs() isn't safe from
cleanup_cf_node() - it would crash trying to insert phi undefs.
So unfortunately I need to add a parameter.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-09-23 10:59:59 -07:00
Kenneth Graunke 9674c76c0e nir/cf: Don't break outer-block successors in split_block_beginning().
Consider the following NIR:

   block block_0;
   /* succs: block_1 block_2 */
   if (...) {
      block block_1;
      ...
   } else {
      block block_2;
   }

Calling split_block_beginning() on block_1 would break block_0's
successors:  link_block() sets both successors of a block, so calling
link_block(block_0, new_block, NULL) would throw away the second
successor, leaving only /* succ: new_block */.  This is invalid: the
block before an if statement must have two successors.

Changing the call to link_block(pred, new_block, pred->successors[0])
would correctly leave both successors in place, but because unlink_block
may shift successor[1] to successor[0], it may not preserve the original
order.  NIR maintains a convention that successor[0] must point to the
"then" block, while successor[1] points to the "else" block, so we need
to take care to preserve this ordering.

This patch creates a new function that swaps out one successor for
another, preserving the ordering.  It then uses this to fix the issue.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-09-23 10:59:59 -07:00
Kenneth Graunke e2637db618 nir/cf: Make a helper function for removing a predecessor.
I need to do this in a second place, and I'd rather make a helper
function than cut and paste the code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-09-23 10:59:59 -07:00
Kenneth Graunke 6a67ede6b3 nir: Validate that a block doesn't have two identical successors.
This is invalid, and causes disasters if we try to unlink successors:
removing the first will work, but removing the second copy will fail
because the block isn't in the successor's predecessor set any longer.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-09-23 10:59:59 -07:00
Jason Ekstrand 8dcbca5957 nir/lower_vec_to_movs: Don't emit unneeded movs
It's possible that, if a vecN operation is involved in a phi node, that we
could end up moving from a register to itself.  If swizzling is involved,
we need to emit the move but.  However, if there is no swizzling, then the
mov is a no-op and we might as well not bother emitting it.

Shader-db results on Haswell:

   total instructions in shared programs: 6262536 -> 6259558 (-0.05%)
   instructions in affected programs:     184780 -> 181802 (-1.61%)
   helped:                                838
   HURT:                                  0

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-09-23 10:12:39 -07:00
Jason Ekstrand 65e80ce5b5 nir/lower_vec_to_movs: Properly handle source modifiers on vecN ops
I don't know of any piglit tests that are currently broken.  However, there
is nothing stopping a vecN instruction from getting source modifiers and
lower_vec_to_movs is run after we lower to source modifiers.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-09-23 10:12:39 -07:00
Ville Syrjälä aae0c88797 i915: Make hw_prim[] const
The table used to map the GL primitive to the hw primitive never
changes so make it const.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-09-23 09:57:46 -07:00
Ville Syrjälä 84fec757de t_dd_dmatmp: Make the render_tab[]s const
These tables hold function pointers and they never change so
make them const.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-09-23 09:57:46 -07:00