And not all existing queries. The only reason we have that list is to be able
to suspend and resume the active ones.
This reduces looping over queries when suspending and resuming.
The queries no longer have to track some of their states.
Use all zpass data for predication instead of the last block only.
Use query buffer as a ring instead of reusing the same area
for each new BeginQuery. All query buffer offsets are in bytes
to simplify offsets math.
The drivers have been changed so that they behave as if all of the flags
were set. This is already implicit in most hardware drivers and required
for multiple contexts.
Some state trackers were also abusing the PIPE_FLUSH_RENDER_CACHE flag
to decide whether flush_frontbuffer should be called.
New flag ST_FLUSH_FRONT has been added to st_api.h as a replacement.
This is reliant on a drm patch that I posted on the list + a version bump.
These will appear in drm-next today.
Signed-off-by: Dave Airlie <airlied@redhat.com>
This allow to share code path btw old & new, also
remove check on reference this might make things
a little slower but new design doesn't use reference
stuff.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Make state statically allocated, this kills a bunch of code
and avoid intensive use of malloc/free. There is still a lot
of useless duplicate function wrapping that can be kill. This
doesn't improve yet performance, needs to avoid memcpy states
in radeon_ctx_set_draw and to avoid rebuilding vs_resources,
dsa, scissor, cb_cntl, ... states at each draw command.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
The current states code had an unhealthy relationship between
that had to somehow magically align themselves, editing either
place meant renumbering all states after the one you were on,
and it was pretty unapproachable code.
This replaces the huge types structures with a simple type + sub
type struct, which is keyed on an stype enum in radeon.h. Each
stype can have a per-shader type subclassing (4 types supported,
PS/VS/GS/FS), and also has a number of states per-subtype. So you
have 256 constants per 4 shaders per one CONSTANT stype.
The interface from the driver is changed to pass in the tuple,
(stype, id, shader_type), and we look for this. If
radeon_state_shader ever shows up on profile, it could use a
hashtable based on stype/shader_type to speed things up.
Signed-off-by: Dave Airlie <airlied@redhat.com>
This reverts commit bd25e23bf3.
Apart from introducing a lot of hex magic numbers and being highly impenetable code,
it causes lots of lockups on an average piglit run that always runs without lockups.
Always run piglit before/after doing big things like this.
Directly build PM4 packet, avoid using malloc (no states are
bigger than 128 dwords), remove unecessary informations,
remove pm4 building in favor of prebuild pm4 packet.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>