This doesn't fix any known issue (I haven't run piglit with this yet),
but the code was obviously completely wrong. It looks like it was copy-pasted from CMP.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
If the argument to emit_bool_to_cond_code() is an ir_expression, we
loop over the operands, calling accept() on each of them, which
generates assembly code to compute that subexpression. We then emit
one or two final instructions that perform the top-level operation on
those operands.
If it's not an expression (say, a boolean-valued variable), we simply
call accept() on the whole value.
In commit 80ecb8f1 (i965/fs: Avoid generating extra AND instructions on
bool logic ops), Eric made logic operations jump out of the expression
path to the non-expression path.
Unfortunately, this meant that we would first accept() the two operands,
skip generating any code that used them, then accept() the whole
expression, generating code for the operands a second time.
Dead code elimination would always remove the first set of redundant
operand assembly, since nothing actually used them. But we shouldn't
generate it in the first place.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ironlake's counters are always enabled; userspace can simply send a
MI_REPORT_PERF_COUNT packet to take a snapshot of them. This makes it
easy to implement.
The counters are documented in the source code for the intel-gpu-tools
intel_perf_counters utility.
v2: Adjust for core data structure changes. Add a table mapping buffer
object offsets to exposed counters (which changes each generation).
Finally, add report ID assertions to sanity check the BO layout
(thanks to Carl Worth).
v3: Update for core BeginPerfMonitor hook changes (requested by Brian).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This provides an interface for applications (and OpenGL-based tools) to
access GPU performance counters. Since the exact performance counters
available vary between vendors and hardware generations, the extension
provides an API the application can use to get the names, types, and
minimum/maximum values of all available counters. Counters are also
organized into groups.
Applications create "performance monitor" objects, select the counters
they want to track, and Begin/End monitoring, much like OpenGL's query
API. Multiple monitors can be in flight simultaneously.
v2: Pass ctx to all driver hooks (suggested by Christoph), and attempt
to fix overallocation of bitsets (caught by Christoph). Incomplete.
v3: Significantly rework core data structures. Store counters in groups
rather than in a global list. Use their array index in the group's
counter list as the ID rather than trying to store a globally unique
counter ID. Use bitsets for active counters within a group, and
also track which groups are active so that's easy to query.
v4: Remove _mesa_ prefix on static functions; detect out of memory
conditions in new_performance_monitor(); make BeginPerfMonitor hook
return a boolean rather than setting m->Active or raising an error.
Switch to GLuint/unsigned for NumGroups, NumCounters, and
MaxActiveCounters (which also means switching a bunch of temporary
variable types). All suggested by Brian Paul. Also, remove
commented out code at the bottom of the block. Finally, fix the
dispatch sanity test (noticed by Ian Romanick).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com> [v3]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This is better than overriding the extension enable based on the
language version; it's robust against shaders that do:
#version 140
#extension GL_ARB_uniform_buffer_object : disable
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Explicit attribute locations are supported with GLSL 3.30, GLSL ES 3.00,
or "#extension GL_ARB_explicit_attrib_location: enable". Using a helper
function makes it easy to check for this.
This enables support in GLSL 3.30, which was previously missing.
Previously, we overrode the extension enable flag for ES 3.00. This is
not robust against a shader such as:
#version 330
#extension GL_ARB_explicit_attrib_location : disable
Disabling extensions should not remove core language functionality.
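A minimal sketch of such a helper, assuming a simplified parser state. The
struct, its field names, and the function name below are hypothetical
stand-ins for illustration, not necessarily the ones used in the tree:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the GLSL parser state. */
struct parse_state {
   unsigned language_version;  /* e.g. 330 for "#version 330" */
   bool es_shader;             /* true for GLSL ES shaders */
   bool ARB_explicit_attrib_location_enable;
};

/* True when explicit attribute locations are usable, either because the
 * language version includes them as core functionality or because the
 * extension is enabled.
 */
static bool
has_explicit_attrib_location(const struct parse_state *state)
{
   unsigned required_version;

   assert(state != NULL);
   required_version = state->es_shader ? 300 : 330;

   return state->ARB_explicit_attrib_location_enable ||
          state->language_version >= required_version;
}
```

Because the version check and the extension flag are ORed together, a
"#version 330" shader that disables the extension still gets the core
feature.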
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Hardware requires the magnitude of the largest component to not exceed
1; brw_cubemap_normalize ensures that this is the case.
Unfortunately, we would previously multiply the array index for cube
arrays by the normalization factor. The incorrect array index would then
cause the sampler to attempt to access either the wrong cube, or memory
outside the cube surface entirely, resulting in garbage rendering or in
the worst case, hangs.
Alter the normalization pass to only multiply the .xyz components.
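As a rough CPU-side illustration of the intended behavior (not the actual
IR pass), the sketch below scales only the direction components and leaves
the array index alone. The function name and the (x, y, z, layer) layout
are assumptions made for this example:

```c
#include <assert.h>

static float
abs_f(float v)
{
   return v < 0.0f ? -v : v;
}

/* Scale a cube-array coordinate (x, y, z, layer) so that the largest
 * direction component has magnitude 1. The layer in coord[3] is the
 * array index and must not be scaled.
 */
static void
normalize_cube_coord(float coord[4])
{
   float max = abs_f(coord[0]);
   float scale;
   int i;

   for (i = 1; i < 3; i++) {
      if (abs_f(coord[i]) > max)
         max = abs_f(coord[i]);
   }

   assert(max > 0.0f);
   scale = 1.0f / max;

   for (i = 0; i < 3; i++)   /* .xyz only */
      coord[i] *= scale;
}
```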
Fixes broken rendering in the arb_texture_cube_map_array-cubemap piglit,
which was recently adjusted to provoke this behavior.
V2: Fix indent.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Compress empty triangles (don't emit more than one in a row) and
never emit empty triangles if we already generated a triangle
covering a non-null area. We can't skip all null triangles
because c_primitives expects the ones generated from vertices
exactly at the clipping plane to be emitted.
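The emission rule described above can be sketched as a small filter. The
state struct and function name are hypothetical, and the sketch assumes the
state is reset for each clipped input primitive:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-primitive state; zero-initialize before each clipped
 * input primitive.
 */
struct tri_emit_state {
   bool emitted_nonempty;  /* a triangle with non-null area was emitted */
   bool last_was_empty;    /* the previously emitted triangle was empty */
};

/* Decide whether the next generated triangle should be emitted. */
static bool
should_emit_triangle(struct tri_emit_state *s, bool is_empty)
{
   assert(s != NULL);

   if (!is_empty) {
      s->emitted_nonempty = true;
      s->last_was_empty = false;
      return true;
   }

   /* Empty triangle: drop it if real area was already covered, or if the
    * previous emitted triangle was also empty.
    */
   if (s->emitted_nonempty || s->last_was_empty)
      return false;

   s->last_was_empty = true;
   return true;
}
```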
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
We need to count the clipper primitives before the rasterizer
discards the ones it considers to be null.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
We need to subdivide triangles if either of the dimensions is
larger than the max edge length, not when both of them are larger.
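In code, the difference comes down to using a logical OR instead of an AND
in the size check. The limit and the function name below are hypothetical
and only illustrate the condition:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit; the real value depends on the rasterizer's
 * fixed-point format.
 */
#define MAX_EDGE_LENGTH 2048

/* A triangle's bounding box needs subdividing as soon as one of its
 * dimensions exceeds the limit.
 */
static bool
needs_subdivision(int32_t width, int32_t height)
{
   assert(width >= 0 && height >= 0);

   /* '||', not '&&': a long, thin triangle is still too large. */
   return width > MAX_EDGE_LENGTH || height > MAX_EDGE_LENGTH;
}
```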
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
The fix is at the end (TGSI_TEXTURE_SHADOWCUBE handling), but I also
restructured the code to make it more readable.
Fixes spec/!OpenGL 3.0/sampler-cube-shadow.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
This fixes compressedteximage piglit tests.
+10 piglits
Evergreen and Cayman have the same issue. R600 and R700 don't.
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
This fixes some piglits, e.g.:
spec/!OpenGL 3.0/required-renderbuffer-attachment-formats.
This can be ported to r600g.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Only create one screen for each winsys instance.
This helps with buffer sharing and interop handling.
v2: rebased and some minor cleanup
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Allows us to share more code between different targets.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Allows us to share more code between different targets.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Allows us to share more code between different targets.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
This reverts commit 755c11dc5e.
We agreed that this is a band-aid that's not very useful and
the proper solution is to rewrite the rasterization algorithm
so that it operates on 64-bit values.
Signed-off-by: Zack Rusin <zackr@vmware.com>
No such argument exists since this commit:
commit 92f3fca0ea
Author: Ian Romanick <ian.d.romanick@intel.com>
AuthorDate: Sun Aug 21 17:23:58 2011 -0700
Commit: Ian Romanick <ian.d.romanick@intel.com>
CommitDate: Tue Aug 23 14:52:09 2011 -0700
mesa: Remove target parameter from dd_function_table::BufferSubData
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
When subdividing a triangle we use a temporary array to store
the new coordinates for the subdivided triangles. Unfortunately,
the array used for that was not aligned properly, causing
random crashes in the LLVM JIT code, which was trying to load
vectors from it.
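A sketch of the idea using standard C11 _Alignas rather than any
project-specific alignment macro. The 16-byte requirement, the struct, and
the helper names are assumptions made for this example:

```c
#include <assert.h>
#include <stdint.h>

#define VEC_ALIGN 16  /* assumed: 16-byte aligned vector loads */

/* Hypothetical temporary storage for one subdivided triangle:
 * three vertices of four floats each.
 */
struct subdivided_tri {
   _Alignas(VEC_ALIGN) float pos[3][4];
};

static int
is_vector_aligned(const void *ptr)
{
   return ((uintptr_t)ptr % VEC_ALIGN) == 0;
}

/* Return vertex i, checking that an aligned vector load from it is safe. */
static const float *
tri_vertex(const struct subdivided_tri *tri, int i)
{
   assert(i >= 0 && i < 3);
   assert(is_vector_aligned(tri->pos[i]));
   return tri->pos[i];
}
```

Each vertex is four floats (16 bytes), so aligning the start of the array
keeps every vertex row aligned as well.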
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
This patch fixes the MSVC build error introduced by commit
673129e0b9.
enums.c
mesa\main\enums.c(3776) : error C2143: syntax error : missing ';' before 'type'
mesa\main\enums.c(3781) : error C2065: 'elt' : undeclared identifier
mesa\main\enums.c(3781) : warning C4047: '!=' : 'int' differs in levels of indirection from 'void *'
mesa\main\enums.c(3782) : error C2065: 'elt' : undeclared identifier
mesa\main\enums.c(3782) : error C2223: left of '->offset' must point to struct/union
mesa\main\enums.c(3782) : warning C4033: '_mesa_lookup_enum_by_nr' must return a value
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Normally, LD_PRELOAD will take precedence over your own symbols, which you
want for things like malloc() in libc. But we don't have any local
symbols we would want overridden (like hash_table_insert(), for example!),
so tell the linker to resolve them internally. This also avoids calls
through the PLT.
Saves almost 100k on libdricore's size, and gets us a bunch of the
performance back that we had with non-dricore.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This gives the compiler the chance to inline and not export class symbols
even in the absence of LTO. Saves about 60kb on disk.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Noticed while grepping through the code for something else.
v2: Don't convert really-runtime asserts to static asserts.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Since it's only used for debug information, we can misalign the struct and
save the disk space. Another 19k on a 64-bit build.
v2: Make a compiler.h macro to only use the attribute if we know we can.
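A sketch of what such a macro could look like, assuming GCC-compatible
compilers. The macro name, the struct layout, and the lookup helper are
hypothetical and may differ from what is actually in compiler.h and
enums.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only use the attribute on compilers known to support it. */
#if defined(__GNUC__) || defined(__clang__)
#define PACKED __attribute__((__packed__))
#else
#define PACKED  /* unknown compiler: keep natural alignment */
#endif

/* Hypothetical table entry: without packing, padding after 'offset'
 * grows this from 6 to 8 bytes.
 */
typedef struct PACKED {
   uint16_t offset;
   int32_t value;
} enum_elt;

/* Read the value of entry i from a table of n entries. */
static int32_t
enum_elt_value(const enum_elt *table, size_t n, size_t i)
{
   assert(i < n);
   return table[i].value;
}
```

Packed members may be misaligned, which can make accesses slower on some
architectures. That trade-off is acceptable here because the table is only
used for debug information.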
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Now that there's no name -> enum direction, we can drop the extra strings,
and merge the offsets table and the reduced_enums table.
Between the previous commit and this one, Mesa core drops by 30k.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>