Commit Graph

83680 Commits

Author SHA1 Message Date
Marek Olšák 1ffe77e7bb gallium: split transfer_inline_write into buffer and texture callbacks
to reduce the call indirections with u_resource_vtbl.

The worst call tree you could get was:
  - u_transfer_inline_write_vtbl
    - u_default_transfer_inline_write
      - u_transfer_map_vtbl
        - driver_transfer_map
      - u_transfer_unmap_vtbl
        - driver_transfer_unmap

That's 6 indirect calls. Some drivers only had 5. The goal is to have
1 indirect call for drivers that care. The resource type can be determined
statically at most call sites.

The new interface is:
  pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data)
  pipe_context::texture_subdata(ctx, resource, level, usage, box, data,
                                stride, layer_stride)

v2: fix whitespace, correct ilo's behavior

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2016-07-23 13:33:42 +02:00
Kenneth Graunke 0ba7288376 nir: Lower interp_var_at_* like a normal load_var for flat inputs.
"flat centroid" and "flat sample" both just mean "flat", so we should
ignore interpolateAtCentroid/Sample and just return the flat value.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97032
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-07-22 20:31:20 -07:00
Kenneth Graunke f80bea2d80 mesa: Don't call GenerateMipmap if Width or Height == 0.
One of the WebGL 2.0 conformance tests is trying to call
glGenerateMipmaps with a width and height of 0.  With the meta
implementation, this generates a "framebuffer attachment incomplete"
status, and falls back to the CPU path, calling MapTextureImage.

Except that there's no actual texture to map, and we assert fail.

There's no work to do in this case.  The test expects it to succeed,
so just return early with no error and avoid hassling the driver.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96911
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-07-22 20:31:20 -07:00
Jason Ekstrand b33bccb519 anv/pipeline: Set up point coord enables
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-07-22 16:48:54 -07:00
Jason Ekstrand 9e05e51cff spirv/nir: Add support for ImageQuerySamples
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:54 -07:00
Jason Ekstrand 71202352c8 spirv/nir: Handle texture projectors
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:54 -07:00
Jason Ekstrand 36c31b8fa2 nir/spirv: Refactor coordinate handling in handle_texture
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:54 -07:00
Jason Ekstrand b820c8b78c spirv/nir: Refactor type handling in handle_texture
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:54 -07:00
Jason Ekstrand 561be50a1a spirv/nir: Move opcode selection higher up in handle_texture
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:54 -07:00
Jason Ekstrand c8da91aa24 anv/image: Assert that the image format is actually supported
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:54 -07:00
Jason Ekstrand 34a39e91ba spirv/nir: Don't increment coord_components for array lod queries
For lod query instructions, we really don't care whether or not the sampler
is an array type because that doesn't factor into the LOD.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:54 -07:00
Jason Ekstrand 67b7d876e4 i965: Get rid of the do_lower_unnormalized_offsets pass
We can do this in NIR now.  No need to keep a GLSL pass lying around for
it.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:54 -07:00
Jason Ekstrand 9f32721f86 i965/nir: Enable NIR lowering of txf and rect offsets
This fixes the following piglit tests on gen6+:

tex-miplevel-selection textureProjGradOffset 2DRect
tex-miplevel-selection textureGradOffset 2DRect
tex-miplevel-selection textureGradOffset 2DRectShadow
tex-miplevel-selection textureProjGradOffset 2DRect_ProjVec4
tex-miplevel-selection textureProjGradOffset 2DRectShadow

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:54 -07:00
Jason Ekstrand d9156efc52 nir/lower_tex: Add support for lowering coordinate offsets
On i965, we can't support coordinate offsets for texelFetch or rectangle
textures.  Previously, we were doing this with a GLSL pass but we need to
do it in NIR if we want those workarounds for SPIR-V.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:53 -07:00
Jason Ekstrand 843fc8f3e7 nir/lower_tex: Add some helpers for working with tex sources
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:53 -07:00
Jason Ekstrand 09135cd55a nir: Add a helper for determining the type of a texture source
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:27:35 -07:00
Jason Ekstrand 3c0077a6ec anv/pipeline: Set binding_table.gather_texture_start
This should get texture gather working on gen8+ and mostly working on gen7.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:27:35 -07:00
Jason Ekstrand 95e9d58bdb spirv/nir: Properly handle gather components
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:27:35 -07:00
Jason Ekstrand 7c7acf53b2 spirv/nir: Add support for shadow samplers that return vec4
While SPIR-V technically doesn't support "old style" shadow, the
shadow-compare gather instruction does return a vec4 so we need to be able
to set the old_style_shadow bit in NIR.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:27:35 -07:00
Jason Ekstrand 2ddefd03b7 spirv/nir: Fix some texture opcode asserts
We can't get an lod with txf_ms and SPIR-V considers textureGrad to be an
explicit-LOD texturing instruction.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:27:35 -07:00
Samuel Pitoiset 3f5cf8c488 nv50/ir: allow to swap sources for OP_SUB
This allows the load-propagation pass to swap the sources in presence
of immediate values.

Maxwell (GM107):

total instructions in shared programs :1928187 -> 1927634 (-0.03%)
total gprs used in shared programs    :330741 -> 330154 (-0.18%)
total local used in shared programs   :28032 -> 28032 (0.00%)

                local        gpr       inst      bytes
    helped           0         271         425         425
      hurt           0           0         194         194

Fermi (GF114):

total instructions in shared programs :2334474 -> 2333829 (-0.03%)
total gprs used in shared programs    :380934 -> 380215 (-0.19%)
total local used in shared programs   :33304 -> 33264 (-0.12%)

                local        gpr       inst      bytes
    helped           5         314         521         521
      hurt           0           4         195         195

No regressions on GM107 and GF114 with full piglit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-22 22:51:37 +02:00
Marek Olšák 2e890b5350 gallium/radeon: make deferred flushes asynchronous
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-22 22:34:49 +02:00
Marek Olšák d17b35e671 gallium: add PIPE_FLUSH_DEFERRED
There are 2 uses:
- Asynchronous flushing for multithreaded drivers.
- Return a fence without flushing (mid-command-buffer fence). The driver
  can defer flushing until fence_finish is called.

This is required to make Bioshock Infinite faster, which creates
1000 fences (flushes) per frame.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-07-22 22:34:49 +02:00
Marek Olšák 4cdc482283 gallium/os: use CLOCK_MONOTONIC for sleeps (v2)
v2: handle EINTR, remove backslashes

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2016-07-22 22:34:49 +02:00
Eric Engestrom 4da9f7e7ce mapi: fix typo in macro name
Fixes: 5ec140c17b ("mapi: Massage code to allow clang to compile.")
Reported-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2016-07-22 10:14:00 -07:00
Kenneth Graunke 44ef2ce6ec docs: Put swr back on the GL_ARB_texture_buffer_object_rgb32 list.
Looks like this was lost when resolving merge conflicts in
commit d1fbd4cdb1.
2016-07-22 09:57:54 -07:00
Andres Gomez d068b38e46 glsl: subroutine types cannot be compared
subroutine variables are to be used just in the way functions are
called. Although the spec doesn't say it explicitely, this means that
these variables are not to be used in any other way than those left
for function calls. Therefore, a comparison between 2 subroutine
variables should also cause a compilation error.

From The OpenGL® Shading Language 4.40, page 117:

  "  To use subroutines, a subroutine type is declared, one or more
     functions are associated with that subroutine type, and a
     subroutine variable of that type is declared. The function
     currently assigned to the variable function is then called by
     using function calling syntax replacing a function name with the
     name of the subroutine variable. Subroutine variables are
     uniforms, and are assigned to specific functions only through
     commands (UniformSubroutinesuiv) in the OpenGL API."

From The OpenGL® Shading Language 4.40, page 118:

  "  Subroutine uniform variables are called the same way functions
     are called. When a subroutine variable (or an element of a
     subroutine variable array) is associated with a particular
     function, all function calls through that variable will call that
     particular function."

Fixes GL44-CTS.shader_subroutine.subroutines_cannot_be_assigned_float_int_values_or_be_compared

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-07-22 17:30:25 +03:00
Timothy Arceri a2b3c146d2 i965: fix varying output setup
Since 7f53fead5c we treat every location as using all
four components so we only need special handling for
doubles when they cross multiple locations.

This fixes a crash in GL45-CTS.enhanced_layouts.varying_locations
where the outputs array would overflow when a dmat2 was stored at
the max varying location i.e 30.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-07-23 00:04:10 +10:00
Samuel Pitoiset c2801f9272 nvc0/mme: fix offsets used for indirect draws
This fixes a regression introduced in
1da704a94c because the offset has moved
from 0x180 to 0x1a0, and the macros have to be re-compiled.

Fixes: 1da704a ("nvc0: increase the tex handles area size in the driver")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-22 11:32:09 +02:00
Samuel Pitoiset dbcff7fdbb nvc0: fix offsets of MP perf counters input parameters
This fixes a regression introduced in
1da704a94c because the offset has moved
from 0x600 to 0x620, and the kernels used for reading MP perf counters
have to be re-assembled.

This also fixes amd_performance_monitor_measure piglit.

Fixes: 1da704a ("nvc0: increase the tex handles area size in the driver")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-22 11:32:04 +02:00
Kenneth Graunke cb70773129 mesa: Add GL_BGRA_EXT to the list of GenerateMipmap internal formats.
The GL_EXT_texture_format_BGRA8888 extension specification defines a
GL_BGRA_EXT unsized internal format (which is a little odd - usually
BGRA is a pixel transfer format).  The extension is written against
the ES 1.0 specification, so it's a little hard to map, but I believe
it's effectively adding it to the table used here, so we should allow
it here as well.

Note that GL_EXT_texture_format_BGRA8888 is always enabled (dummy_true),
so we don't need to check if it's enabled here.

This fixes mipmap generation in Skia and ChromeOS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
References: https://bugs.chromium.org/p/chromium/issues/detail?id=630371
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reported-by: Stéphane Marchesin <marcheu@chromium.org>
Cc: mesa-stable@lists.freedesktop.org
2016-07-21 21:31:57 -07:00
Kenneth Graunke be1c53d2cf i965: Fix "operation operation" in comment.
From the redundant redundant department.

Reported-by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com>
2016-07-21 21:31:57 -07:00
Kenneth Graunke 76e161056a i965: Fix shared atomic intrinsics to pay attention to base.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-07-21 21:31:55 -07:00
Kenneth Graunke cf6f2d3ce7 nir: Add a base const_index to shared atomic intrinsics.
Commit 52e75dcb8c made nir_lower_io
start using nir_intrinsic_set_base instead of writing const_index[0]
directly.  However, those intrinsics apparently don't /have/ a base,
so this caused assert failures.

However, the old code was happily setting non-existent const_index
fields, so it was pretty bogus too.

Jason pointed out that load_shared and store_shared have a base,
and that the i965 driver uses that field.  So presumably atomics
should have one as well, so that loads/stores/atomics all refer
to variables with consistent addressing.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-07-21 21:31:41 -07:00
Timothy Arceri 91dde3ddca glsl: re-enable varying packing in GL4.4+
We can still do packing we just need to get the packing type from the consumer
rather than the producer.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97033
2016-07-22 10:21:08 +10:00
Kenneth Graunke 2db357e4c3 i965: Include VUE handles for GS with invocations > 1.
We always resort to the pull model for instanced GS inputs.  So, we'd
better include the VUE handles, or else we can't actually pull anything.

Ian reports that on his branch with OES_geometry_shader enabled,
this fixes a bunch of dEQP-GLES31.functional.geometry_shading tests::

- instanced.draw_2_instances_geometry_2_invocations
- instanced.draw_2_instances_geometry_8_invocations
- instanced.draw_4_instances_geometry_2_invocations
- instanced.draw_4_instances_geometry_8_invocations
- instanced.draw_8_instances_geometry_2_invocations
- instanced.draw_8_instances_geometry_8_invocations
- instanced.geometry_2_invocations
- instanced.geometry_32_invocations
- instanced.geometry_8_invocations
- instanced.geometry_max_invocations
- instanced.geometry_output_different_2_invocations
- instanced.geometry_output_different_32_invocations
- instanced.geometry_output_different_8_invocations
- instanced.geometry_output_different_max_invocations
- instanced.invocation_output_vary_by_attribute
- instanced.invocation_output_vary_by_texture
- instanced.invocation_output_vary_by_uniform
- query.primitives_generated_instanced

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2016-07-21 11:15:12 -07:00
Matt Turner 8c8c3f859e mesa: Add -fno-math-errno -fno-trapping-math to CXXFLAGS.
Not sure why I forgot to add them to CXXFLAGS in commit f55c408067 or
commit 875458b778. Cuts about 1k of .text.

   text     data      bss      dec      hex  filename
5806354   287816    29384  6123554   5d7022  i965_dri.so before
5805497   287744    29384  6122625   5d6c81  i965_dri.so after

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-07-21 10:45:28 -07:00
Matt Turner 5353855e9d mesa: Drop -fno-builtin-memcmp.
According to the referenced bug report, gcc-4.5 and newer do not inline
memcmp(). I see no difference in performance of ipers with llvmpipe on a
Sandybridge (which does not have "Enhanced REP MOVSB/STOSB") by removing
this flag.

I attempted to confirm the problem with gcc-4.4, but it fails to compile
for quite a few different reasons.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-07-21 10:45:28 -07:00
Matt Turner 5ec140c17b mapi: Massage code to allow clang to compile.
According to https://llvm.org/bugs/show_bug.cgi?id=19778#c3 this code
was violating the spec, resulting in it failing to compile.

Cc: mesa-stable@lists.freedesktop.org
Co-authored-by: Tomasz Paweł Gajc <tpgxyz@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89599
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-21 10:45:28 -07:00
Ian Romanick 6bc5491193 docs: Add extensions not part of any GL or GL ES version
Based loosely on patches submitted ages ago by Thomas Helland.

v2: Add lots of missing data provided by Ilia.  Fix sort order of
GL_ARB_sparse_texture extensions suggested by Ilia.

v3: Note that Dave Airlie has started work on GL_ARB_bindless_texture.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2016-07-21 10:31:04 -07:00
Ian Romanick d1fbd4cdb1 docs: Update GL3.txt for OpenGL 4.0 on i965-ish hardware
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2016-07-21 10:30:20 -07:00
Ian Romanick 7dc99da81a docs: Update GL3.txt for OpenGL ES on i965-ish hardware
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2016-07-21 10:26:55 -07:00
Timothy Arceri 4f89cf4941 i965: print error messages if gs fails to compile
We do this for all other stages.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-21 15:05:05 +10:00
Timothy Arceri b463b1d7cc i965: enable GL4.4 for Gen8+
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-21 12:06:11 +10:00
Timothy Arceri 4ba9bd138a i965: enable ARB_enhanced_layouts for gen6+
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-21 12:06:11 +10:00
Timothy Arceri f3805c5f09 i965/vec4: add packing support for tcs load outputs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-21 12:06:11 +10:00
Timothy Arceri 255388a965 i965/vec4: add support for packing tes inputs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-07-21 12:06:11 +10:00
Timothy Arceri d07cfb31c4 i965/vec4: add support for packing tcs outputs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-21 12:06:11 +10:00
Timothy Arceri b25e49a3c7 i965/vec4: support packing tcs inputs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-21 12:06:11 +10:00
Timothy Arceri d1192bef7e i965/vec4: add component packing for gs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-07-21 12:06:11 +10:00