required by glClientWaitSync (GL 4.5 Core spec) that can optionally flush
the context
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
to reduce the call indirections with u_resource_vtbl.
The worst call tree you could get was:
- u_transfer_inline_write_vtbl
- u_default_transfer_inline_write
- u_transfer_map_vtbl
- driver_transfer_map
- u_transfer_unmap_vtbl
- driver_transfer_unmap
That's 6 indirect calls. Some drivers only had 5. The goal is to have
1 indirect call for drivers that care. The resource type can be determined
statically at most call sites.
The new interface is:
pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data)
pipe_context::texture_subdata(ctx, resource, level, usage, box, data,
stride, layer_stride)
v2: fix whitespace, correct ilo's behavior
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
There are 2 uses:
- Asynchronous flushing for multithreaded drivers.
- Return a fence without flushing (mid-command-buffer fence). The driver
can defer flushing until fence_finish is called.
This is required to make Bioshock Infinite faster, which creates
1000 fences (flushes) per frame.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Clean up misrepetitions ('if if', 'the the' etc) found throughout the
comments. This has been done manually, after grepping
case-insensitively for duplicate if, is, the, then, do, for, an,
plus a few other typos corrected in fly-by
v2:
* proper commit message and non-joke title;
* replace two 'as is' followed by 'is' to 'as-is'.
v3:
* 'a integer' => 'an integer' and similar (originally spotted by
Jason Ekstrand, I fixed a few other similar ones while at it)
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Window rectangles apply to all framebuffer operations, either in
inclusive or exclusive mode. They may also be specified as part of a
blit operation.
In exclusive mode, any fragment inside any of the specified rectangles
will be discarded.
In inclusive mode, any fragment outside every rectangle will be
discarded.
The no-op state is to have 0 rectangles in exclusive mode.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
This patch adds a new interface to support hardware mipmap generation.
PIPE_CAP_GENERATE_MIPMAP is added to allow a driver to specify
if this new interface is supported; if not supported, the state tracker will
fallback to mipmap generation by rendering/texturing.
v2: add PIPE_CAP_GENERATE_MIPMAP to the disabled section for all drivers
v3: add format to the generate_mipmap interface to allow mipmap generation
using a format other than the resource format
v4: fix return type of trace_context_generate_mipmap()
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This will allow gallium drivers to send messages to KHR_debug endpoints
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
It wasn't completely clear from the docs, so I had to figure out by
looking at piglit results. Hopefully this saves the next driver writer
implementing queries some time.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
D3D10 allows setting of the internal offset of a buffer, which is
in general only incremented via actual stream output writes. By
allowing setting of the internal offset draw_auto is capable
of rendering from buffers which have not been actually streamed
out to. Our interface didn't allow. This change functionally
shouldn't make any difference to OpenGL where instead of an
append_bitmask you just get a real array where -1 means append
(like in D3D) and 0 means do not append.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The new function replaces four old functions: set_fragment/vertex/
geometry/compute_sampler_views().
Note: at this time, it's expected that the 'start' parameter will
always be zero.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
r600g needs explicit flushing before DRI2 buffers are presented on the screen.
v2: add (stub) implementations for all drivers, fix frontbuffer flushing
v3: fix galahad
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
In particular noone is interested in the vertex count, so drop that,
and also drop the duplicated num_primitives_generated /
so.primitives_storage_needed variables in drivers. I am unable for now to figure
out if primitives_storage_needed in SO stats (used for d3d10) should
increase if SO is disabled, though the equivalent num_primitives_generated
used for OpenGL definitely should increase. In any case we were only counting
when SO is active both in softpipe and llvmpipe anyway so don't pretend there's
an independent num_primitives_generated counter which would count always.
(This means the PIPE_QUERY_PRIMITIVES_GENERATED count will still be wrong just
as before, should eventually fix this by doing either separate counting for this
query or adjust the code so it always counts this even if SO is inactive depending
on what's correct for d3d10.)
Reviewed-by: Brian Paul <brianp@vmware.com>
The semantics didn't really make sense, not really matching neither d3d9
(though the docs are all broken there) nor d3d10. So make it match d3d10
semantics, which actually gives meaning to the "disjoint" part.
Drivers are fixed up in a very primitive way, I have no idea what could
actually cause the counter to become unreliable so just always return
FALSE for the disjoint part.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
For conditional rendering this makes it possible to skip rendering
if either the predicate is true or false, as supported by d3d10
(in fact previously it was sort of implied skip rendering if predicate
is false for occlusion predicate, and true for so_overflow predicate).
There's no cap bit for this as presumably all drivers could do it trivially
(but this patch does not implement it for the drivers using true
hw predicates, nvxx, r600, radeonsi, no change is expected for OpenGL
functionality).
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Transfers always use z/depth for layers no matter if it's a 1d or 2d array
texture, we don't follow OpenGL's crazyness there. Luckily this appears to
only be a doc bug, everyone doing the right thing already.
While here also document z/depth parameter for cube map arrays.
v2: fix typo spotted by Eric Anholt
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Gallium supported only a single viewport/scissor combination. This
commit changes the interface to allow us to add support for multiple
viewports/scissors.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: José Fonseca<jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
"get_transfer + transfer_map" becomes "transfer_map".
"transfer_unmap + transfer_destroy" becomes "transfer_unmap".
transfer_map must create and return the transfer object and transfer_unmap
must destroy it.
transfer_map is successful if the returned buffer pointer is not NULL.
If transfer_map fails, the pointer to the transfer object remains unchanged
(i.e. doesn't have to be NULL).
Acked-by: Brian Paul <brianp@vmware.com>
Define an interface that exposes the minimal functionality required to
implement some of the popular compute APIs. This commit adds entry
points to set the grid layout and other state required to keep track
of the usual address spaces employed in compute APIs, to bind a
compute program, and execute it on the device.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Namely:
- EXT_transform_feedback
- ARB_transform_feedback2
- ARB_transform_feedback_instanced
The old interface was not useful for OpenGL and had to be reworked.
This interface was originally designed for OpenGL, but additional
changes have been made in order to make st/d3d1x support easier.
The most notable change is the stream-out info must be linked
with a vertex or geometry shader and cannot be set independently.
This is due to limitations of existing hardware (special shader
instructions must be used to write into stream-out buffers),
and it's also how OpenGL works (stream outputs must be specified
prior to linking shaders).
Other than that, each stream output buffer has a "view" into it that
internally maintains the number of bytes which have been written
into it. (one buffer can be bound in several different transform
feedback objects in OpenGL, so we must be able to have several views
around) The set_stream_output_targets function contains a parameter
saying whether new data should be appended or not.
Also, the view can optionally be used to provide the vertex
count for draw_vbo. Note that the count is supposed to be stored
in device memory and the CPU never gets to know its value.
OpenGL way | Gallium way
------------------------------------
BeginTF = set_so_targets(append_bitmask = 0)
PauseTF = set_so_targets(num_targets = 0)
ResumeTF = set_so_targets(append_bitmask = ~0)
EndTF = set_so_targets(num_targets = 0)
DrawTF = use pipe_draw_info::count_from_stream_output
v2: * removed the reset_stream_output_targets function
* added a parameter append_bitmask to set_stream_output_targets,
each bit specifies whether new data should be appended to each
buffer or not.
v3: * added PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME for ARB_tfb2,
note that the draw-auto subset is always required (for d3d10),
only the pause/resume functionality is limited if the CAP is not
advertised
v4: * update gallium/docs
v5: * compactified struct pipe_stream_output_info, updated dump/trace
Resolve via glBlitFramebuffer allows resolving a sub-region of a
renderbuffer to a different location in any mipmap level of some
other texture, and, with a new extension, even scaling. Therefore,
location and size parameters are needed.
The mask parameter was added because resolving only depth or only
stencil of a combined buffer is possible as well.
Full information about the blit operation allows the drivers to
take the most efficient path they possibly can.