Implement HEVC encode functions based on VAAPI HEVC encode interface.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Logics that related to dual instances encode should only be done for
H264, not other codecs.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Add entrypoint check for HEVC to differentiate decode and encode jobs.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Add HEVC picture desc, and add codec check when creating and destroying
context.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Move all H264 encode related functions into separate file. Similar to
VAAPI decode side, there will be separate file for each codec on encode
side as well.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Implement encoding of sps, pps, vps, aud, and slice headers for HEVC
based on HEVC specs.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Pass pipe_picture_desc instead of pipe_h264_enc_picture_desc so that
it can be used for different codecs. Add functions to handle picture
parameters that will be used for HEVC encode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Add vcn encode interface for HEVC, and rename radeon_enc_h264_enc_pic
to radeon_enc_pic since radeon_enc_pic is used by both H264 and HEVC.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Fixes: df1d5174fc ("ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Obviously it would be good to have an ADD and a MUL and a signal together,
but we can even potentially have multiple signals merged, as well.
total instructions in shared programs: 100423 -> 97874 (-2.54%)
instructions in affected programs: 78812 -> 76263 (-3.23%)
We emit some MOVs to track lifetimes of payload registers, but we don't
need there to be actual MOV instructions for them.
total instructions in shared programs: 101045 -> 100423 (-0.62%)
instructions in affected programs: 37083 -> 36461 (-1.68%)
If this is an image buffer, we need to calculate the correct resource
id.
Fixes:
KHR-GL45.shader_image_size.*
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This fixes a crash in:
KHR-GL45.texture_cube_map_array.texture_size_compute_sh.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
When the disk shader cache CI testing was enabled, we started noticing
occasional failures on deqp test runs. (Mainly SNB, rarely HSW)
Before this change, when we cleared the (in memory) program cache we
reused the same bo. Since the disk shader cache quickly restores
programs, it appears that this would lead to overwrites of the older
program binaries in the in memory program cache that apparently were
still executing in some cases. If these programs were still executing,
this could cause a GPU hang.
This issue is probably not disk shader cache specific, but may have
been hidden due to the compiler taking time to recompile programs
after the cache was cleared.
v2:
* Don't add `copy` param to brw_cache_new_bo (Ken)
* Call from brw_program_cache_check_size (Ken)
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This should increase performance by reducing SDRAM bank conflicts when
crossing between UIF columns (particularly on power-of-two height
textures).
The uif_xor_disable setup is dropped, since we need to allow XOR on lower
miplevels even when level 0 is XOR. The level 0 force UIF and level 0 XOR
flags should handle setting XOR properly on imported buffers.
The alignment here means that we can't get back the padded height from the
size/stride any more, so it's now a field in the slice as well.
Fixes piglit fbo-generatemipmap-formats RGBA16 NPOT.
The VC5 HW puts A in the low bits and R in the high bits. We can't just
swizzle in the shaders because the blending HW can't pick what channel A
is in, so make a new format to match it.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
swapBytes operates on bytes, not 4-bit channels, so you can't just take
non-swapBytes cases and flip the REV flag.
Avoids piglit texture-packed-formats regressions when enabling the
ABGR4444 format.
Fixes: c5a5c9a7db ("mesa/formats: add new mesa formats and their pack/unpack functions.")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Move generated files from codegen/meson.build to other directories, in order
to satisfy generated include file dependencies
Add correct file lists for architecture-specific libraries.
cc: mesa-stable@lists.freedesktop.org
cc: dylan@pnwbakers.com
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Currently we always check for 3.9.0, which is pretty safe since
everything except radv work with >= 3.9 and 3.9 is pretty old at this
point. However, radv actually requires 4.0, and there is a patch for
radeonsi to do the same.
Fixes: 673dda8330 ("meson: build "radv" vulkan driver for radeon hardware")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Currently there is not a separate option for setting the search path of
DRI drivers in meson, like there is in scons and autotools. This is an
oversight and needs to be fixed. This adds an extra option
`dri-search-path`, which will default to the value of
`dri-drivers-path`, like autotools does.
v2: - Split input list before joining.
v3: - use : instead of ; as the delimiter. The autotools help string
incorrectly says ; but the code uses :
v4: - Take list in pre : delimited form (Ilia)
- Ensure that the dri-search-path is absolute when using
dri_drivers_path
Fixes: db9788420d ("meson: Add support for configuring dri drivers directory.")
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v2)
Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v3)
On travis, for OSX, python2 from homebrew is pre-installed. per [1]:
python points to the macOS system Python (with no manual PATH modification)
python2 points to Homebrew’s Python 2.7.x (if installed)
python3 points to Homebrew’s Python 3.x (if installed)
pip doesn't exist
pip2 points to Homebrew’s Python 2.7.x’s pip (if installed)
pip3 points to Homebrew’s Python 3.x’s pip (if installed)
We will end up using 'python2' for building mesa.
Just use 'pip2' instead of 'pip', as that seems to work for all platforms on
travis.
[1] https://docs.brew.sh/Homebrew-and-Python.html
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Use a '|' YAML literal block to avoid the convoluted syntax needed to put
the entire conditional on a single line.
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
An additional stub for applegl_create_context() is needed
Cannot test indirect API as it's not built on osx, currently
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
downsize_format_if_needed takes an integer as number of uploads
parameter. Hence, let's do an integer comparation instead of a boolean
check, since that is confusing.
Since we are at it, fix a couple of wrongly tabbed indents.
Cc: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
This only affects drivers that set DriverFlags.NewBlend.
v2: - fix typo advanded -> advanced
- return "enum gl_advanced_blend_mode" from
_mesa_get_advanced_blend_sh_constant
- don't call FLUSH_VERTICES twice
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The old one generates useless instructions in there, found while
comparing geometry shaders between RadeonSI and RADV.
This improves all Vulkan demos that use geometry shaders, +4%
for deferredshadows, +9% for viewportarray, +7% for
geometryshader on Polaris10.
This seems to also improve DOW3 a little bit (+1%).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
I think the cp packets can be made work, but I think it might
need a kernel change, so for now just do the worst thing.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
By the looks of it it seems hemlock is treated separately to cypress, but
certainly it won't need the stack workarounds cedar/redwood (and
seemingly every other eg chip except cypress/juniper) need.
(Discovered by accident.)
Acked-by: Alex Deucher <alexander.deucher@amd.com>
This passes the CTS and piglit tests.
This also disable sb for helper invocations until it doesn't
mess up the VPM flags.
Thanks to Ilia and Glenn for advice, and Roland for working
out the working evergreen path.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
deqp does not allow any KHX extensions, and since deqp is included
in android-cts, android does not allow any khx extensions.
So disable VK_KHX_multiview on android.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
CC: 18.0 <mesa-stable@lists.freedesktop.org>
Using the newly introduced VAO array maps, we can
simplify vbo_bind_vertex_list.
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Using the newly introduced VAO array maps, we can
simplify vbo_exec_bind_arrays.
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Using the newly introduced VAO state variable, we can
simplify recalculate_input_bindings.
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Instead of each context having its own map instance for
this purpose, use a global static const map.
v2: s,unsigned char,GLubyte,g
s,_VP_MODE_MAX,VP_MODE_MAX,g
Change comment style.
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>