Shader Storage Buffer Object will add new atomic functions that are not
associated with counters, so better have atomic counter-specific functions
explicitly include the word "counter" in their names.
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
This patch moves nir_instr_insert_after_cf_list call into each case
in the intrinsics switch at nir_visitor::visit(ir_call *ir) and
define a nir_dest variable which will be used when handling
ir->return_deref after the switch.
This patch simplifies the code for nir_intrinsic_load_ssbo
implementation changes we are going to do next.
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
v2 (Connor):
- Make the STORE() macro take arguments for the extra sources (and their
size) and any extra indices required.
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Implement helper functions that can be used to construct and send
untyped and typed surface read, write and atomic messages to the
shared dataport unit.
v2: Split from the FS implementation.
v3: Rewrite to avoid evil array_reg, emit_collect and emit_zip.
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
These functions handle the conversion of a vec4 into the form expected
by the dataport unit in message and message return payloads. The
conversion is not always trivial because some messages don't support
SIMD4x2 for some generations, in which case a strided copy may be
necessary.
v2: Split from the FS implementation.
v3: Rewrite to avoid evil array_reg, emit_collect and emit_zip.
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
See "i965/fs: Introduce FS IR builder." for the rationale.
v2: Drop scalarizing VEC4 builder.
v3: Take a backend_shader as constructor argument. Improve handling
of debug annotations and execution control flags. Rename "instr"
variable. Initialize cursor to NULL by default and add method to
explicitly point the builder at the end of the program.
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Notice that we should differentiate between shader storage blocks and
uniform blocks, since they have different limits.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Otherwise, generate a link time error as per the
ARB_shader_storage_buffer_object spec.
v2:
- Fix error message (Jordan)
v3:
- Move std140_size() changes to its own patch (Kristian)
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
v2:
- Get interface packing information from interface's type, not the
variable type.
- Simplify is_std430 condition in emit_access() for readability (Jordan)
- Add a commment explaing why array of three-component vector case is
different in std430 than the rest of cases.
- Add calls to std430_array_stride().
v3:
- Simplify size_mul change for std430's case (Jordan)
- Fix commit log lines length (Jordan)
- Pass 'packing' instead of 'is_std430' to emit_access() (Kristian)
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
They are used to calculate the offset, array stride of uniform/shader
storage buffer variables. Take into account this info to get the right
value for std430.
v2:
- Fix commit log line length and indention. (Jordan)
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
v2:
- Fix a missing check in has_layout()
v3:
- Mention shader storage block in error message for layout qualifiers
(Kristian).
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
They are used to calculate size, base alignment and array stride values
for a glsl_type following std430 rules.
v2:
- Paste OpenGL 4.3 spec wording as it mentions stride of array. (Jordan)
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
This kind of definitions:
layout(xxx) buffer;
was not supported by commit 84fc5fece0.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Also if GL_ARB_shading_language_420pack extension is enabled.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
The returned drm buffer object has a size multiple of 4096 but that should not
be exposed to the API user, which is working with a different size.
As far as I can see this problem is only visible in the calculation of the
length of unsized arrays used in SSBOs, as the implementation of this needs
to query the underlying buffer size via a message.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Otherwise we can expect odd things to happen if, for example, we ask
for the size of the attached buffer from shader code, since that
might query this value from the surface we uploaded and get random
results.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
v2:
- Remove inst->regs_written assignment as the instruction only
writes to one register.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Notice that Skylake needs to include a header in the sampler message
so it will need some tweaks to work there.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
This is how backends provide the buffer size required to compute
the size of unsized arrays in the previous patch
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
v2:
- Reduce the number of lines over 80 character line width
limit. (Thomas Hellan)
v3:
- Inject the formula to compute the array length in the IR, backends
only need to provide the buffer size (Curro)
- Create an auxiliary function to simplify code (Jordan Justen)
- Rename variables (Jordan Justen)
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
The unsized array length is computed with the following formula:
array.length() =
max((buffer_object_size - offset_of_array) / stride_of_array, 0)
Of these, only the buffer size needs to be provided by the backends, the
frontend already knows the values of the two other variables.
This patch identifies the cases where we need to get the length of an
unsized array, injecting ir_unop_ssbo_unsized_array_length expressions
that will be lowered (in a later patch) to inject the formula mentioned
above.
It also adds the ir_unop_get_buffer_size expression that drivers will
implement to provide the buffer length.
v2:
- Do not define a triop that will force backends to implement the
entire formula, they should only need to provide the buffer size
since the other values are known by the frontend (Curro).
v3:
- Call state->has_shader_storage_buffer_objects() in ast_function.cpp instead
of using state->ARB_shader_storage_buffer_object_enable (Tapani).
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
They only can be defined in the last position of the shader
storage blocks.
When an unsized array is used in different shaders, it might be
converted in different sized arrays, avoid get a linker error
in that case.
v2:
- Rework error condition and error messages (Timothy Arceri)
v3:
- Move OpenGL ES check to its own patch.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Buffer variables are the same as uniforms, only that read/write, so we want
the same treatment.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Since these are a special kind of UBOs we emit them together reusing the
same infrastructure, however, we use a RAW surface so we can reuse
existing untyped read/write/atomic messages which include a pixel mask
header that we need to set to obtain correct behavior with helper
invocations of the fragment shader.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
v2:
- Set it after the driver's MaxShaderStorageBuffers value assignment.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
We use the same dirty state for SSBOs and UBOs because they share the
same infrastructure.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
This should be a cacheline (64 bytes) so that we can safely have the
CPU and GPU writing the same SSBO on non-cachecoherent systems (our
Atom CPUs). With UBOs, the GPU never writes, so there's no
problem. For an SSBO, the GPU and the CPU can be updating disjoint
regions of the buffer simultaneously and that will break if the
regions overlap the same cacheline.
v2:
- Use cacheline size (64 bytes) instead of 16 bytes (Kristian).
- Update commit log and add a comment in the code explaining
why we use cacheline size (Ben).
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
v2:
- Set the value to 16 and drop the comment. (Kristian)
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
This makes sure that user is still able to query properties about
variables that have gotten packed by lower_packed_varyings pass.
Fixes following OpenGL ES 3.1 test:
ES31-CTS.program_interface_query.separate-programs-vertex
v2: fix 'name included in packed list' check (Ilia Mirkin)
v3: iterate over instances of name using strtok_r (Ilia Mirkin)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
This is required to store information about packed varyings, currently
these variables get lost and cannot be retrieved later in sensible way
for program interface queries. List will be utilized by next patch.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>