v2 (Jason Ekstrand):
- Use nir_instr_rewrite_dest
- Pass the impl directly into lower_vec_to_movs_block
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Previously, we were passing the shader around, we were just calling it
"mem_ctx". However, the nir_shader is (and must be for the purposes of
mark-and-sweep) the mem_ctx so we might as well pass it around explicitly.
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Currently the validation pass only validates that regs_read and
regs_written are consistent with the sizes of VGRF's. We can add more as
we find it to be useful.
Reviewed-by: Matt Turner <mattst88@gmail.com>
In certain conditions, we have to do bounds-checking in the shader for
image_load_store. The way this works for image loads is that we do a
predicated load and then emit a series of selects, one per component,
that gives us 0 or the loaded value depending on whether or not you're
in bounds. However, we were hard-coding 4 components which may not be
correct. Instead, we should be using size which is the number of
components read.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
According to the extensions table and our glext headers,
OES_compressed_ETC1_RGB8_texture is only supported in
GLES1 and GLES2. Since we may give users a GLES3 context
when a GLES2 context is requested, we also allow this
extension for GLES3 as well.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Driver vendors do this as well. The extension specification
lists GLES 1.1 or 2.0 as requirements.
Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
It's extensively used by XA for a8- and planar yuv component surfaces.
This fixes broken XA yuv blits using vgpu10 contexts.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Currently the check was incorrect as it did not consider the (unlikely)
case of fd == 0. In order to fix this we should first correctly
initialize it to -1, as the swrast implementations leave it set to zero
(props to calloc()).
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Boyan Ding <boyan.j.ding@gmail.com>
Move the fcntl(dupfd_cloexec) to the else branch where it belongs.
Otherwise it's not immediately obvious that the code is hit, only when
an existing device is used.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Boyan Ding <boyan.j.ding@gmail.com>
v2: [Emil Velikov]
Rework the error path to a common goto, close only if we own the fd.
v3; [Emil Velikov]
Always close the fd (we either opened the device or dup'd) (Boyan, Ian)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Boyan Ding <boyan.j.ding@gmail.com>
At the moment if a gbm buffer is imported and the gbm buffer
has an old-style GBM_BO_FORMAT format, the import will crash,
since it's passed directly to DRI functions that expect
a fourcc format (as provided by the newer GBM_FORMAT
definitions)
This commit addresses the problem in two ways:
1) it prevents invalid formats from leading to a crash by
returning EINVAL if the image couldn't be created
2) it translates GBM_BO_FORMAT formats into the comparable
GBM_FORMAT formats.
Reference: https://bugzilla.gnome.org/show_bug.cgi?id=753531
CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
We're trying to avoid a libdrm dependency in the core compiler, so let's
move the perf_debug code one level up from the brw_*_emit() helpers to
the brw_codegen_*_prog() helpers.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
All other precompile functions live in the brw_<stage>.c files, make fs
follow the convention.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
This moves the compute shader code around in order to make the way the
code is split up more consistent. There should be no functional changes.
Typically we have a few files per stage:
brw_vs.c, brw_wm.c brw_gs.c:
code to drive code generation and implement precompiling and
cache search.
genX_<stage>_state.c
gen specific implementation of the state emission for the shader
stage.
The brw_*_emit() functions are all in the same files as the visitor
classes they use (with the exception of VS, which may use either vec4 or
fs).
To make compute follow this convention, we move the brw_cs_emit()
function into brw_fs.cpp. We can then rename brw_cs.cpp to brw_cs.c and
do this in C like the other similar files. Finally, move state setup
and atoms to gen7_cs_state.c.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
See similar fix for Readpixels in mesa commit 0d20790. Jason suggested
we need that for TexSubImage as well.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Curiously this has no actual effect. I think it's because the first 8
textures are bound in multiple slots for some reason. However seems
prudent to use these the same way as regular texturing, esp in the case
where there are more than 8 textures bound.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Loads constants using integer as their register type, like it is
done in FS backend.
No shader-db changes in HSW.
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91716
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
If the register types do not match and the instruction
that contains the final destination is saturated, register
coalescing generated non-equivalent code.
This did not happen when using IR because types usually
matched, but it is visible in nir-vec4.
For example,
mov vgrf7:D vgrf2:D
mov.sat m4:F vgrf7:F
is coalesced to:
mov.sat m4:D vgrf2:D
The patch prevents coalescing in such scenario, unless the
instruction we want to coalesce into is a MOV (without type
conversion implied). In that case, the patch sets the register
types to the type of the final destination.
Shader-db results in HSW (only vec4 instructions shown):
total instructions in shared programs: 1754415 -> 1754416 (0.00%)
instructions in affected programs: 74 -> 75 (1.35%)
helped: 0
HURT: 1
GAINED: 0
LOST: 0
Only one extra instruction in one of the shaders, that comes from
eliminating a saturation error by preventing register coalesce.
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
The image component of the ext is a no-op since there is no image support
in gallium (yet).
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
We lower gl_LocalInvocationIndex based on the extension spec formula:
gl_LocalInvocationIndex =
gl_LocalInvocationID.z * gl_WorkGroupSize.x * gl_WorkGroupSize.y +
gl_LocalInvocationID.y * gl_WorkGroupSize.x +
gl_LocalInvocationID.x;
https://www.opengl.org/registry/specs/ARB/compute_shader.txt
We need to set this variable in main(), even if gl_LocalInvocationIndex
is not referenced by the shader. (It may be used by a linked shader.)
Therefore, we can't eliminate it as a dead variable.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
We initialize gl_GlobalInvocationID based on the extension spec
formula:
gl_GlobalInvocationID =
gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID
https://www.opengl.org/registry/specs/ARB/compute_shader.txt
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Also rename to _mesa_get_main_function_signature.
We will call it near the end of compilation to insert some code into
main for initializing some compute shader global variables.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
We lower gl_GlobalInvocationID based on the extension spec formula:
gl_GlobalInvocationID =
gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID
https://www.opengl.org/registry/specs/ARB/compute_shader.txt
We need to set this variable in main(), even if gl_GlobalInvocationID
is not referenced by the shader. (It may be used by a linked shader.)
Therefore, we can't eliminate these as dead variables.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
This is to avoid needless float<->int conversions, since all
face-related computations are made on integers. Spotted by Emil
Velikov.
Reviewed-by: Brian Paul <brianp@vmware.com>
New enum to add to switch so compiler doesn't complain.
commit 1807a08e4f
Author: Ilia Mirkin <imirkin@alum.mit.edu>
AuthorDate: Thu Aug 27 23:05:03 2015 -0400
Commit: Ilia Mirkin <imirkin@alum.mit.edu>
CommitDate: Thu Sep 10 17:38:33 2015 -0400
nir: add nir_texop_texture_samples and convert from glsl
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rather than make yet another copy of channel(), let's move it into nir.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Sometimes a useful thing for compilers (or, for example, tgsi_to_nir) to
know. And pretty trivial for scan to figure this out for us.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>