83572 Commits

Author SHA1 Message Date
Jason Ekstrand 561be50a1a spirv/nir: Move opcode selection higher up in handle_texture
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:54 -07:00
Jason Ekstrand c8da91aa24 anv/image: Assert that the image format is actually supported
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:54 -07:00
Jason Ekstrand 34a39e91ba spirv/nir: Don't increment coord_components for array lod queries
For lod query instructions, we really don't care whether or not the sampler
is an array type because that doesn't factor into the LOD.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:54 -07:00
Jason Ekstrand 67b7d876e4 i965: Get rid of the do_lower_unnormalized_offsets pass
We can do this in NIR now.  No need to keep a GLSL pass lying around for
it.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:54 -07:00
Jason Ekstrand 9f32721f86 i965/nir: Enable NIR lowering of txf and rect offsets
This fixes the following piglit tests on gen6+:

tex-miplevel-selection textureProjGradOffset 2DRect
tex-miplevel-selection textureGradOffset 2DRect
tex-miplevel-selection textureGradOffset 2DRectShadow
tex-miplevel-selection textureProjGradOffset 2DRect_ProjVec4
tex-miplevel-selection textureProjGradOffset 2DRectShadow

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:54 -07:00
Jason Ekstrand d9156efc52 nir/lower_tex: Add support for lowering coordinate offsets
On i965, we can't support coordinate offsets for texelFetch or rectangle
textures.  Previously, we were doing this with a GLSL pass but we need to
do it in NIR if we want those workarounds for SPIR-V.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:53 -07:00
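
The offset lowering described in the nir/lower_tex commit above can be pictured as folding the constant offset into the coordinate before the texture instruction is emitted. The following is only a minimal sketch of that idea in plain C, not the NIR pass itself: the struct and function names are hypothetical, and it assumes that texelFetch takes integer texel coordinates while rectangle textures take unnormalized float coordinates.

```c
/* Hypothetical stand-ins for shader vector values. */
struct ivec2 { int x, y; };
struct vec2 { float x, y; };

/* texelFetch: coordinates are integer texel indices, so the offset
 * is folded in with an integer add. */
static struct ivec2
lower_txf_offset(struct ivec2 coord, struct ivec2 offset)
{
   struct ivec2 out = { coord.x + offset.x, coord.y + offset.y };
   return out;
}

/* Rectangle textures: coordinates are unnormalized floats, so the
 * integer offset is converted to float and then added. */
static struct vec2
lower_rect_offset(struct vec2 coord, struct ivec2 offset)
{
   struct vec2 out = { coord.x + (float)offset.x,
                       coord.y + (float)offset.y };
   return out;
}
```

After this rewrite the instruction no longer carries an offset source, which is the property the hardware restriction requires.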
Jason Ekstrand 843fc8f3e7 nir/lower_tex: Add some helpers for working with tex sources
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:53 -07:00
Jason Ekstrand 09135cd55a nir: Add a helper for determining the type of a texture source
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:27:35 -07:00
Jason Ekstrand 3c0077a6ec anv/pipeline: Set binding_table.gather_texture_start
This should get texture gather working on gen8+ and mostly working on gen7.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:27:35 -07:00
Jason Ekstrand 95e9d58bdb spirv/nir: Properly handle gather components
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:27:35 -07:00
Jason Ekstrand 7c7acf53b2 spirv/nir: Add support for shadow samplers that return vec4
While SPIR-V technically doesn't support "old style" shadow, the
shadow-compare gather instruction does return a vec4 so we need to be able
to set the old_style_shadow bit in NIR.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:27:35 -07:00
Jason Ekstrand 2ddefd03b7 spirv/nir: Fix some texture opcode asserts
We can't get an lod with txf_ms and SPIR-V considers textureGrad to be an
explicit-LOD texturing instruction.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:27:35 -07:00
Samuel Pitoiset 3f5cf8c488 nv50/ir: allow to swap sources for OP_SUB
This allows the load-propagation pass to swap the sources in the
presence of immediate values.

Maxwell (GM107):

total instructions in shared programs :1928187 -> 1927634 (-0.03%)
total gprs used in shared programs    :330741 -> 330154 (-0.18%)
total local used in shared programs   :28032 -> 28032 (0.00%)

                local        gpr       inst      bytes
    helped           0         271         425         425
      hurt           0           0         194         194

Fermi (GF114):

total instructions in shared programs :2334474 -> 2333829 (-0.03%)
total gprs used in shared programs    :380934 -> 380215 (-0.19%)
total local used in shared programs   :33304 -> 33264 (-0.12%)

                local        gpr       inst      bytes
    helped           5         314         521         521
      hurt           0           4         195         195

No regressions on GM107 and GF114 with full piglit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-22 22:51:37 +02:00
Marek Olšák 2e890b5350 gallium/radeon: make deferred flushes asynchronous
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-22 22:34:49 +02:00
Marek Olšák d17b35e671 gallium: add PIPE_FLUSH_DEFERRED
There are 2 uses:
- Asynchronous flushing for multithreaded drivers.
- Return a fence without flushing (mid-command-buffer fence). The driver
  can defer flushing until fence_finish is called.

This is required to make Bioshock Infinite faster, which creates
1000 fences (flushes) per frame.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-07-22 22:34:49 +02:00
Marek Olšák 4cdc482283 gallium/os: use CLOCK_MONOTONIC for sleeps (v2)
v2: handle EINTR, remove backslashes

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2016-07-22 22:34:49 +02:00
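
As an illustration of the gallium/os commit above, a sleep on CLOCK_MONOTONIC that handles EINTR can be sketched with the POSIX `clock_nanosleep` call. The function name `sleep_monotonic_usec` is hypothetical, and the sketch assumes a platform that provides `clock_nanosleep` (for example Linux); it is not necessarily identical to the code in the commit.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Sleep for the given number of microseconds on the monotonic clock.
 * Returns 0 on success or an error number from clock_nanosleep. */
static int
sleep_monotonic_usec(int64_t usecs)
{
   struct timespec ts;
   int ret;

   assert(usecs >= 0);
   ts.tv_sec = usecs / 1000000;
   ts.tv_nsec = (usecs % 1000000) * 1000;

   /* Relative sleep (flags == 0). If a signal interrupts the call,
    * the remaining time is written back into ts and we retry. */
   do {
      ret = clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, &ts);
   } while (ret == EINTR);

   return ret;
}
```

Note that `clock_nanosleep` returns the error number directly instead of setting `errno`, which is why the loop compares the return value against `EINTR`.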
Eric Engestrom 4da9f7e7ce mapi: fix typo in macro name
Fixes: 5ec140c17b ("mapi: Massage code to allow clang to compile.")
Reported-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2016-07-22 10:14:00 -07:00
Kenneth Graunke 44ef2ce6ec docs: Put swr back on the GL_ARB_texture_buffer_object_rgb32 list.
Looks like this was lost when resolving merge conflicts in
commit d1fbd4cdb1.
2016-07-22 09:57:54 -07:00
Andres Gomez d068b38e46 glsl: subroutine types cannot be compared
subroutine variables are to be used just in the way functions are
called. Although the spec doesn't say it explicitly, this means that
these variables are not to be used in any other way than those left
for function calls. Therefore, a comparison between 2 subroutine
variables should also cause a compilation error.

From The OpenGL® Shading Language 4.40, page 117:

  "  To use subroutines, a subroutine type is declared, one or more
     functions are associated with that subroutine type, and a
     subroutine variable of that type is declared. The function
     currently assigned to the variable function is then called by
     using function calling syntax replacing a function name with the
     name of the subroutine variable. Subroutine variables are
     uniforms, and are assigned to specific functions only through
     commands (UniformSubroutinesuiv) in the OpenGL API."

From The OpenGL® Shading Language 4.40, page 118:

  "  Subroutine uniform variables are called the same way functions
     are called. When a subroutine variable (or an element of a
     subroutine variable array) is associated with a particular
     function, all function calls through that variable will call that
     particular function."

Fixes GL44-CTS.shader_subroutine.subroutines_cannot_be_assigned_float_int_values_or_be_compared

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-07-22 17:30:25 +03:00
Timothy Arceri a2b3c146d2 i965: fix varying output setup
Since 7f53fead5c we treat every location as using all
four components so we only need special handling for
doubles when they cross multiple locations.

This fixes a crash in GL45-CTS.enhanced_layouts.varying_locations
where the outputs array would overflow when a dmat2 was stored at
the max varying location, i.e. 30.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-07-23 00:04:10 +10:00
Samuel Pitoiset c2801f9272 nvc0/mme: fix offsets used for indirect draws
This fixes a regression introduced in
1da704a94c because the offset has moved
from 0x180 to 0x1a0, and the macros have to be re-compiled.

Fixes: 1da704a ("nvc0: increase the tex handles area size in the driver")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-22 11:32:09 +02:00
Samuel Pitoiset dbcff7fdbb nvc0: fix offsets of MP perf counters input parameters
This fixes a regression introduced in
1da704a94c because the offset has moved
from 0x600 to 0x620, and the kernels used for reading MP perf counters
have to be re-assembled.

This also fixes amd_performance_monitor_measure piglit.

Fixes: 1da704a ("nvc0: increase the tex handles area size in the driver")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-22 11:32:04 +02:00
Kenneth Graunke cb70773129 mesa: Add GL_BGRA_EXT to the list of GenerateMipmap internal formats.
The GL_EXT_texture_format_BGRA8888 extension specification defines a
GL_BGRA_EXT unsized internal format (which is a little odd - usually
BGRA is a pixel transfer format).  The extension is written against
the ES 1.0 specification, so it's a little hard to map, but I believe
it's effectively adding it to the table used here, so we should allow
it here as well.

Note that GL_EXT_texture_format_BGRA8888 is always enabled (dummy_true),
so we don't need to check if it's enabled here.

This fixes mipmap generation in Skia and ChromeOS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
References: https://bugs.chromium.org/p/chromium/issues/detail?id=630371
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reported-by: Stéphane Marchesin <marcheu@chromium.org>
Cc: mesa-stable@lists.freedesktop.org
2016-07-21 21:31:57 -07:00
Kenneth Graunke be1c53d2cf i965: Fix "operation operation" in comment.
From the redundant redundant department.

Reported-by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com>
2016-07-21 21:31:57 -07:00
Kenneth Graunke 76e161056a i965: Fix shared atomic intrinsics to pay attention to base.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-07-21 21:31:55 -07:00
Kenneth Graunke cf6f2d3ce7 nir: Add a base const_index to shared atomic intrinsics.
Commit 52e75dcb8c made nir_lower_io
start using nir_intrinsic_set_base instead of writing const_index[0]
directly.  However, those intrinsics apparently don't /have/ a base,
so this caused assert failures.

However, the old code was happily setting non-existent const_index
fields, so it was pretty bogus too.

Jason pointed out that load_shared and store_shared have a base,
and that the i965 driver uses that field.  So presumably atomics
should have one as well, so that loads/stores/atomics all refer
to variables with consistent addressing.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-07-21 21:31:41 -07:00
Timothy Arceri 91dde3ddca glsl: re-enable varying packing in GL4.4+
We can still do packing; we just need to get the packing type from the
consumer rather than the producer.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97033
2016-07-22 10:21:08 +10:00
Kenneth Graunke 2db357e4c3 i965: Include VUE handles for GS with invocations > 1.
We always resort to the pull model for instanced GS inputs.  So, we'd
better include the VUE handles, or else we can't actually pull anything.

Ian reports that on his branch with OES_geometry_shader enabled,
this fixes a bunch of dEQP-GLES31.functional.geometry_shading tests:

- instanced.draw_2_instances_geometry_2_invocations
- instanced.draw_2_instances_geometry_8_invocations
- instanced.draw_4_instances_geometry_2_invocations
- instanced.draw_4_instances_geometry_8_invocations
- instanced.draw_8_instances_geometry_2_invocations
- instanced.draw_8_instances_geometry_8_invocations
- instanced.geometry_2_invocations
- instanced.geometry_32_invocations
- instanced.geometry_8_invocations
- instanced.geometry_max_invocations
- instanced.geometry_output_different_2_invocations
- instanced.geometry_output_different_32_invocations
- instanced.geometry_output_different_8_invocations
- instanced.geometry_output_different_max_invocations
- instanced.invocation_output_vary_by_attribute
- instanced.invocation_output_vary_by_texture
- instanced.invocation_output_vary_by_uniform
- query.primitives_generated_instanced

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2016-07-21 11:15:12 -07:00
Matt Turner 8c8c3f859e mesa: Add -fno-math-errno -fno-trapping-math to CXXFLAGS.
Not sure why I forgot to add them to CXXFLAGS in commit f55c408067 or
commit 875458b778. Cuts about 1k of .text.

   text     data      bss      dec      hex  filename
5806354   287816    29384  6123554   5d7022  i965_dri.so before
5805497   287744    29384  6122625   5d6c81  i965_dri.so after

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-07-21 10:45:28 -07:00
Matt Turner 5353855e9d mesa: Drop -fno-builtin-memcmp.
According to the referenced bug report, gcc-4.5 and newer do not inline
memcmp(). I see no difference in performance of ipers with llvmpipe on a
Sandybridge (which does not have "Enhanced REP MOVSB/STOSB") by removing
this flag.

I attempted to confirm the problem with gcc-4.4, but it fails to compile
for quite a few different reasons.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-07-21 10:45:28 -07:00
Matt Turner 5ec140c17b mapi: Massage code to allow clang to compile.
According to https://llvm.org/bugs/show_bug.cgi?id=19778#c3 this code
was violating the spec, resulting in it failing to compile.

Cc: mesa-stable@lists.freedesktop.org
Co-authored-by: Tomasz Paweł Gajc <tpgxyz@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89599
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-21 10:45:28 -07:00
Ian Romanick 6bc5491193 docs: Add extensions not part of any GL or GL ES version
Based loosely on patches submitted ages ago by Thomas Helland.

v2: Add lots of missing data provided by Ilia.  Fix sort order of
GL_ARB_sparse_texture extensions suggested by Ilia.

v3: Note that Dave Airlie has started work on GL_ARB_bindless_texture.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2016-07-21 10:31:04 -07:00
Ian Romanick d1fbd4cdb1 docs: Update GL3.txt for OpenGL 4.0 on i965-ish hardware
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2016-07-21 10:30:20 -07:00
Ian Romanick 7dc99da81a docs: Update GL3.txt for OpenGL ES on i965-ish hardware
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2016-07-21 10:26:55 -07:00
Timothy Arceri 4f89cf4941 i965: print error messages if gs fails to compile
We do this for all other stages.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-21 15:05:05 +10:00
Timothy Arceri b463b1d7cc i965: enable GL4.4 for Gen8+
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-21 12:06:11 +10:00
Timothy Arceri 4ba9bd138a i965: enable ARB_enhanced_layouts for gen6+
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-21 12:06:11 +10:00
Timothy Arceri f3805c5f09 i965/vec4: add packing support for tcs load outputs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-21 12:06:11 +10:00
Timothy Arceri 255388a965 i965/vec4: add support for packing tes inputs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-07-21 12:06:11 +10:00
Timothy Arceri d07cfb31c4 i965/vec4: add support for packing tcs outputs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-21 12:06:11 +10:00
Timothy Arceri b25e49a3c7 i965/vec4: support packing tcs inputs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-21 12:06:11 +10:00
Timothy Arceri d1192bef7e i965/vec4: add component packing for gs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-21 12:06:11 +10:00
Timothy Arceri d1b1fca0b7 i965/vec4: add support for packing vs/gs/tes outputs
Here we create a new output_generic_reg array with the ability to
store the dst_reg for each component of user defined varyings.
This is needed as the previous code only stored the dst_reg based
on the varying location which meant packed varyings would overwrite
each other.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-07-21 12:06:11 +10:00
Timothy Arceri b427abba0c i965/vec4: add support for packing inputs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-21 12:06:11 +10:00
Timothy Arceri 138aad06b3 i965: add helper for creating packing writemask
For example where n=3 first_component=1 this will give us
0xE (WRITEMASK_YZW).

V2:
Add assert to check first component is <= 4 (Suggested by Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-21 12:06:11 +10:00
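
The example in the packing writemask commit above (n=3, first_component=1 giving 0xE) is consistent with building a run of n set bits and shifting it up by the first component. A minimal sketch under that assumption follows; the name `packing_writemask` and the exact form of the assert are hypothetical, not taken from the commit.

```c
#include <assert.h>

/* Build a writemask covering n components starting at first_component,
 * where bit 0 is X, bit 1 is Y, bit 2 is Z and bit 3 is W. */
static unsigned
packing_writemask(unsigned n, unsigned first_component)
{
   /* The mask must stay within the four components of a vec4. */
   assert(n >= 1 && first_component + n <= 4);

   return ((1u << n) - 1) << first_component;
}
```

For n=3 and first_component=1, `(1u << 3) - 1` is 0x7, and shifting left by one gives 0xE, which selects Y, Z and W.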
Timothy Arceri 4b57b53f85 i965: add helpers for creating component layout swizzle
This will be used to swizzle components to the beginning or end
of the vector based on the component layout qualifier and whether
we are doing a load or store.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-21 12:06:11 +10:00
Eric Anholt d2b4b16589 vc4: Return V3D version details in the GL renderer info.
This is as close as we get to a name for the 3D blocks.
2016-07-20 16:15:15 -07:00
Eric Anholt d81934cded vc4: Check the V3D version reported by the kernel.
We don't want to bring up an old userspace driver on a kernel for
newer hardware.  We'll also want to look at the other ident fields in
the future.
2016-07-20 16:15:15 -07:00
Eric Anholt 83b8ca58e1 vc4: Detect and report kernel support for branching.
2016-07-20 16:15:15 -07:00
Eric Anholt 16985eb308 vc4: Switch to using the libdrm-provided vc4_drm.h.
The required version is set to .69 for the getparam ioctl that will be
used in the next commit.
2016-07-20 16:15:15 -07:00