KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Chia-I Wu	4b5c0a8341	ilo: replace ilo_sampler_cso with ilo_state_sampler	2015-06-15 01:06:45 +08:00
Chia-I Wu	745ef2c07b	ilo: replace ilo_view_surface with ilo_state_surface	2015-06-15 01:06:45 +08:00
Chia-I Wu	c10c1ac0cf	ilo: replace ilo_zs_surface with ilo_state_zs	2015-06-15 01:06:44 +08:00
Chia-I Wu	6dad848d1a	ilo: add ilo_state_ps We want to make ilo_shader_cso a union of ilo_state_{vs,hs,ds,gs,ps}.	2015-06-15 01:06:44 +08:00
Chia-I Wu	df9f846ac6	ilo: add ilo_state_{vs,hs,ds,gs} We want to make ilo_shader_cso a union of ilo_state_{vs,hs,ds,gs} and ps payload.	2015-06-15 01:06:44 +08:00
Chia-I Wu	a0bb1c2d17	ilo: add ilo_state_sbe We want to replace ilo_kernel_routing with ilo_state_sbe.	2015-06-15 01:06:44 +08:00
Chia-I Wu	1ccab943b6	ilo: add ilo_state_vf We want to replace ilo_ve_state with ilo_state_vf.	2015-06-15 01:06:44 +08:00
Chia-I Wu	9c77ebef24	ilo: add ilo_state_urb	2015-06-15 01:06:44 +08:00
Chia-I Wu	3ff40be0ee	ilo: add ilo_state_sol	2015-06-15 01:06:44 +08:00
Chia-I Wu	62bb643718	ilo: add ilo_state_cc We want to replace ilo_dsa_state and ilo_blend_state with ilo_state_cc.	2015-06-15 01:06:44 +08:00
Chia-I Wu	6be8b6053d	ilo: add ilo_state_raster We want to replace ilo_rasterizer_state with ilo_state_raster.	2015-06-15 01:06:44 +08:00
Chia-I Wu	4fa7ed99a1	ilo: add ilo_state_viewport We want to replace ilo_viewport_cso and ilo_scissor_state with ilo_state_viewport.	2015-06-14 23:00:04 +08:00
Chia-I Wu	61fea171af	ilo: add ilo_state_sampler We want to replace ilo_sampler_cso with ilo_state_sampler.	2015-06-14 23:00:04 +08:00
Chia-I Wu	f5f2007322	ilo: add ilo_state_surface We want to replace ilo_view_surface with ilo_state_surface.	2015-06-14 23:00:04 +08:00
Chia-I Wu	b91250a56b	ilo: add ilo_state_zs We want to replace ilo_zs_surface with ilo_state_zs. One noteworthy difference is that ilo_state_zs always aligns level 0 to 8x4 when HiZ is enabled. HiZ will not be enabled for 1D surfaces as a result.	2015-06-14 23:00:03 +08:00
Chia-I Wu	9af1fc590d	ilo: update genhw headers Generate these new enums enum gen_reorder_mode; enum gen_clip_mode; enum gen_front_winding; enum gen_fill_mode; enum gen_cull_mode; enum gen_pixel_location; enum gen_sample_count; enum gen_inputattr_select; enum gen_msrast_mode; enum gen_prefilter_op; Correct the type of GEN6_SAMPLER_DW0_BASE_LOD. Rename gen_logicop_function, gen_sampler_mip_filter, gen_sampler_map_filter, gen_sampler_aniso_ratio, and others.	2015-06-14 15:43:20 +08:00
Chia-I Wu	9cb0df4b50	ilo: add ilo_image_disable_aux() When aux bo allocation fails, ilo_image_disable_aux() should be called to disable aux buffer.	2015-06-14 15:43:20 +08:00
Chia-I Wu	f0de65cbc2	ilo: add array_size and level_count to ilo_image We will use them for bound checking.	2015-06-14 15:43:20 +08:00
Chia-I Wu	f9d2bbe967	ilo: add pipe_texture_target to ilo_image Save the target in ilo_image instead of passing it around.	2015-06-14 15:43:20 +08:00
Chia-I Wu	9da9cf729f	ilo: fix "Render Cache Read Write Mode" It needs to be set to R/W only when using certain messages via DP render cache. Since we only use RT writes with the render cache, we never need to set it.	2015-06-14 15:43:20 +08:00
Chia-I Wu	1885ac4908	ilo: avoid resource owning in core It is up to the users whether to reference count the BOs or not.	2015-06-14 15:43:20 +08:00
Chia-I Wu	ab7229b9b6	ilo: assert core objects are zero-initialized Core objects are usually embedded inside calloc()'ed objects and we expect them to be zero-initialized.	2015-06-14 15:43:20 +08:00
Tom Stellard	4d35eef326	radeon/llvm: Handle LLVM backend rename from R600 to AMDGPU Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-06-12 21:02:00 -07:00
Emil Velikov	d15c06b514	vc4: automake: enable subdir-objects Silence the warnings about the future incompatibility with automake 2.0 Cc: Eric Anholt <eric@anholt.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-06-12 15:42:22 +01:00
Emil Velikov	1df5a6c71e	mesa: add a dummy _mesa_error_no_memory() symbol to libglsl_util Rather than forcing everyone to provide their own definition of the symbol, provide a common (dummy) one. This helps us resolve the build of the standalone pipe-drivers (amongst others), which are missing the symbol. Cc: Rob Clark <robclark@freedesktop.org> Cc: "10.6" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-06-12 15:32:18 +01:00
Emil Velikov	4722743f4b	gallium: use $(top_builddir) when referencing static archives Just like every other place in gallium. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-06-12 15:32:17 +01:00
Emil Velikov	3f5dc9b94f	freedreno: use CXX linker rather than explicit link against libstdc++ Cc: Rob Clark <robclark@freedesktop.org> Cc: "10.6" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-06-12 15:32:17 +01:00
Jose Fonseca	0dde821bcc	trace: Add missing p_compiler.h include. For boolean. Trivial.	2015-06-12 12:14:11 +01:00
Brian Paul	7217faf39f	llvmpipe: simplify lp_resource_copy() Just implement it in terms of util_resource_copy_region(). Both the original code and util_resource_copy_region() boil down to mapping, calling util_copy_box() and unmapping. No piglit regressions. This will also help to implement GL_ARB_copy_image. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-06-10 08:20:58 -06:00
Dave Airlie	c6877c9e59	nouveau: set imported buffers to what the kernel gives us When we import a dma-buf fd from another driver the kernel gives us the right info, and this trashes it. Convert the kernel bo flags into the domain flags. This helps getting reverse prime and glamor working. Cc: mesa-stable@lists.freedesktop.org Acked-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-06-10 14:10:01 +10:00
Eric Anholt	9dca3beb62	vc4: Drop qir include from vc4_screen.h We didn't need any of it except for the list header, and qir.h pulls in nir.h, which is not really interesting to winsys.	2015-06-09 12:25:50 -07:00
Eric Anholt	8d10b2a046	vc4: Drop subdirectory in vc4 build. Just because we put the source in a subdir, doesn't mean we need helper libraries in the build. This will also simplify the Android build setup.	2015-06-09 12:25:50 -07:00
Eric Anholt	e67b12eaf8	vc4: Update to current kernel validation code. After profiling on real hardware, I found a few ways to cut down the kernel overhead.	2015-06-09 12:25:50 -07:00
Chih-Wei Huang	c5e11e5f7f	android: build with libcxx on android lollipop On Lollipop, apparently stlport is gone and libcxx must be used instead. We still support stlport when building on earlier android releases. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-06-09 12:25:50 -07:00
Chih-Wei Huang	1842832660	android: enable the radeonsi driver Based on the nice work of Paulo Sergio Travaglia <pstglia@gmail.com>. The main modifications are: - Include paths for LLVM header files and shared/static libraries - Set C++ flag "c++11" to avoid compiling errors on LLVM header files - Set defines for LLVM - Add GALLIVM source files - Changes path of libelf library for lollipop Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Acked-by: Eric Anholt <eric@anholt.net>	2015-06-09 12:25:50 -07:00
Martin Peres	8614b9e489	softpipe/query: force parenthesis around a logical not This makes GCC5 happy. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-06-08 12:38:08 +03:00
Michel Dänzer	56e38edc96	radeonsi: Add CIK SDMA support Based on the corresponding SI support. Same as that, this is currently only enabled for one-dimensional buffer copies due to issues with multi-dimensional SDMA copies. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-06-08 18:13:22 +09:00
Michel Dänzer	79f2acb8f8	r600g,radeonsi: Assert that there's enough space after flushing Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-06-08 18:10:35 +09:00
Marek Olšák	6aff87bb01	r600g: fix a coverity defect in streamout code Reported by Ilia Mirkin.	2015-06-05 19:44:32 +02:00
Marek Olšák	b6ebe7eabf	tgsi/ureg: don't emit in/out arrays if drivers don't support ranged declarations Softpipe, llvmpipe, r300g, and radeonsi pass tests. Other drivers need testing. Freedreno and nv30 are definitely broken. Other drivers seem to be alright.	2015-06-05 19:44:32 +02:00
Roland Scheidegger	4fd42a7c27	llvmpipe: Implement stencil export Pretty trivial, fixes the issue that we're expected to be able to blit stencil surfaces (as the blit just relies on util blitter code which needs stencil export to do it). 2 piglits skip->pass, 11 fail->pass v2: prettify, keep different stencil ref value handling out of depth/stencil test itself. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-06-04 03:56:19 +02:00
Eric Anholt	ec1c72d38e	vc4: Don't bother with safe list traversal in CSE. We don't remove or move instructions.	2015-05-29 22:09:53 -07:00
Eric Anholt	78c773bb36	vc4: Convert from simple_list.h to list.h list.h is a nicer and more familiar set of list functions/macros.	2015-05-29 22:09:53 -07:00
Eric Anholt	21a22a61c0	vc4: Make sure we allocate idle BOs from the cache. We were returning the most recently freed BO, without checking if it was idle yet. This meant that we generally stalled immediately on the previous frame when generating a new one. Instead, allocate new BOs when the oldest BO is still busy, so that the cache scales with how much is needed to keep some frames outstanding, as originally intended. Note that if you don't have some throttling happening, this means that you can accidentally run the system out of memory. The kernel is now applying some throttling on all execs, to hopefully avoid this.	2015-05-29 18:15:00 -07:00
Eric Anholt	c821ccf0e3	vc4: Fix return value handling for BO waits. If the wait ever returned -ETIME, we'd abort because the errno was stored in errno and not drmIoctl()'s return value.	2015-05-29 18:15:00 -07:00
Marek Olšák	7116250b7a	radeon/llvm: reset temps_count on deallocation Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-05-29 11:52:44 +02:00
Marek Olšák	7afc992c20	radeon/llvm: don't use a static array size for radeon_llvm_context::arrays (v2) v2: - don't use realloc (tgsi_shader_info provides the size) Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-05-29 11:52:44 +02:00
Dave Airlie	065978d36b	softpipe: fix offset wrapping calculations (v2) Roland pointed out my previous attempt was lacking, so I enhanced the texwrap piglit test, and tested them. This fixes the offset calculations in a number of areas by adding the offset first, it also fixes the fastpaths, which I forgot to address in the previous commit. v2: try and avoid divides in most paths, the repeat mirror path really was ugly no matter which way I went, so I left it having the divide. Also fix the gather lod calculation bug. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-05-29 13:15:47 +10:00
Eric Anholt	10aacf5ae8	vc4: Just stream out fallback IB contents. The idea I had when I wrote the original shadow code was that you'd see a set_index_buffer to the IB, then a bunch of draws out of it. What's actually happening in openarena is that set_index_buffer occurs at every draw, so we end up making a new shadow BO every time, and converting more of the BO than is actually used in the draw. While I could maybe come up with a better caching scheme, for now just do the simple thing that doesn't result in a new shadow IB allocation per draw. Improves performance of isosurf in drawelements mode by 58.7967% +/- 3.86152% (n=8).	2015-05-27 17:29:11 -07:00
Eric Anholt	f8de6277bf	vc4: Don't try to put our dmabuf-exported BOs into the BO cache. We'd sometimes try to reallocate something that X was using as a new pipe_resource, and potentially conflict in our rendering. But even worse, if we reallocated the BO as a shader, the kernel would reject rendering using the shader.	2015-05-27 17:29:11 -07:00
Eric Anholt	b0edc19a52	vc4: Don't forget to make our raster shadow textures non-raster. Not sure what happened in my testing that made the previous shadow code fix glxgears swapbuffering, but this also fixes lots of CopyArea in X (like dragging xlogo around in metacity).	2015-05-27 17:29:11 -07:00
Samuel Pitoiset	41630c0653	vc4: make vc4_begin_query() return a boolean I forgot to make the change in `96f164f6f0`. This fixes a warning with GCC and probably an error with Clang. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-05-27 17:29:03 -07:00
Marek Olšák	224a77cc60	radeonsi: use a switch statement in si_delete_shader_selector Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-05-26 12:42:37 +02:00
Marek Olšák	0c5a309cee	radeonsi: use a switch statement in si_shader_selector_key Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-05-26 12:42:37 +02:00
Marek Olšák	fa7f606e89	radeonsi: fix scratch buffer setup for geometry shaders Cc: 10.6 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-05-26 12:42:37 +02:00
Marek Olšák	f41517242a	radeonsi: remove unused cases from si_shader_io_get_unique_index These can't occur between VS and GS, because GS is only supported in the core profile. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-05-26 12:42:37 +02:00
Marek Olšák	af4b9c7c2e	radeonsi: don't count special outputs for the VS export count Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-05-26 12:42:36 +02:00
Marek Olšák	e4339bc988	radeonsi: add support for PIPE_CAP_TGSI_TEXCOORD Without it, texcoords are mapped to GENERIC[0..7], PointCoord is mapped to GENERIC[8], and user-defined varyings start from GENERIC[9]. Since texcoords can only be used between VS and PS, and PointCoord is PS-only, it's silly to always start from GENERIC[9] in all other shaders (such as LS, HS, ES, GS). This adds support for TEXCOORD and PCOORD semantics. As a result, st/mesa will use GENERIC[0] as a base for user-defined varyings, which should make linking ES and GS as well as tessellation shaders at runtime easier. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-05-26 12:42:31 +02:00
Marek Olšák	92c31bb0dd	gallium: use const in set_tess_state Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-26 11:46:28 +02:00
Ilia Mirkin	3ec1815285	nv30: falling back to draw path for edgeflag does no good The problem is that the EDGEFLAG has to be toggled at vertex submission time. This can be done from either the draw or the regular paths. Avoid falling back to draw just because there's an edgeflag. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>	2015-05-25 21:45:31 -04:00
Ilia Mirkin	25be70462d	nv30/draw: switch varying hookup logic to know about texcoords Commit `8acaf862df` switched things over to use TEXCOORD instead of GENERIC, but did not update the nv30 swtnl draw paths. This teaches the draw logic about TEXCOORD. Among other things, this fixes a crash in demos/arbocclude when using swtnl. Curiously enough, the point-sprite piglit works without this. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>	2015-05-25 21:45:31 -04:00
Ilia Mirkin	c3d36a2e1a	nv30/draw: allocate vertex buffers in gart These are only used once per draw, so it makes sense to keep them in GART. Also take this opportunity to modernize the buffer mapping API usage. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>	2015-05-25 21:45:22 -04:00
Ilia Mirkin	fdad7dfbda	nv30/draw: only use the DMA1 object (GART) if the bo is not in VRAM Instead of always having it in the data, let the bo placement decide it. This fixes glxgears with swtnl forced on. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>	2015-05-25 21:45:08 -04:00
Ilia Mirkin	3600439897	nv30/draw: fix indexed draws with swtnl path and a resource index buffer The map = assignment was missing. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>	2015-05-25 20:16:51 -04:00
Roland Scheidegger	6a111e54d7	llvmpipe: (trivial) add parentheses in (!x == y) expression Apparently some compilers think we probably wanted to do !(x == y) instead and issue a warning, so just shut it up... No functional change, obviously. Cc: <mesa-stable@lists.freedesktop.org>	2015-05-25 22:24:42 +02:00
Ilia Mirkin	147816375d	nv30/draw: draw expects constbuf size in bytes, not vec4 units This fixes glxgears with NV30_SWTNL=1 forced on. Probably fixes a bunch of other situations where we fall back to the swtnl path. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>	2015-05-25 14:11:16 -04:00
Ilia Mirkin	89585edf3c	nv30/draw: avoid leaving stale pointers in draw state Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>	2015-05-25 14:11:16 -04:00
Ilia Mirkin	7518fc3c66	nv30: fix clip plane uploads and enable changes nv30_validate_clip depends on the rasterizer state. Also we should upload all the new clip planes on change since next time the plane data won't have changed, but the enables might. This fixes fixed-clip-enables and vs-clip-vertex-enables shader tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>	2015-05-24 12:00:03 -04:00
Ilia Mirkin	aba3392541	nv30: avoid doing extra work on clear and hitting unexpected states Clearing can happen at a time when various state objects are incoherent and not ready for a draw. Some of the validation functions don't handle this well, so only flush the framebuffer state. This has the advantage of also not doing extra work. This works around some crashes that can happen when clearing. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2015-05-24 12:00:03 -04:00
Ilia Mirkin	9870ed05dd	nv30: avoid leaking render state and draw shaders Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>	2015-05-24 02:26:29 -04:00
Ilia Mirkin	605ce36d7f	nv30: don't leak fragprog consts Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>	2015-05-24 01:33:06 -04:00
Ilia Mirkin	fa7f9f123b	nv50/ir: avoid messing up arg1 of PFETCH There can be scenarios where the "indirect" arg of a PFETCH becomes known, and so the code will attempt to propagate it. Use this opportunity to just fold it into the first argument, and prevent the load propagation pass from touching PFETCH further. This fixes gs-input-array-vec4-index-rd.shader_test and vs-output-array-vec4-index-wr-before-gs.shader_test on nvc0 at least. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>	2015-05-23 22:15:15 -04:00
Ilia Mirkin	c922758685	nv30: check nouveau_bo_map output of notify bo Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-23 19:10:07 -04:00
Ilia Mirkin	921917c8d8	nvc0: a geometry shader can have up to 1024 vertices output The 1024 is already reported everywhere, not sure where this 0x1ff came from. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>	2015-05-23 17:55:21 -04:00
Samuel Pitoiset	c783fd476c	nv50: fix PIPE_QUERY_TIMESTAMP_DISJOINT, based on nvc0 PIPE_QUERY_TIMESTAMP_DISJOINT could not work because q->ready was always set to FALSE. To fix this issue, add more different states for queries according to nvc0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-23 19:00:55 +02:00
Ilia Mirkin	217301843a	nvc0/ir: LOAD's can't be used for shader inputs We forgot to convert to VFETCH in case of indirect access. Fix that. This avoids crashes on the new gs-input-array-vec4-index-rd and vs-output-array-vec4-index-wr-before-gs but they still fail. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>	2015-05-22 19:08:24 -04:00
Ilia Mirkin	0bab3962f5	nv50/ir: guess that the constant offset is the starting slot of array When we get something like IN[ADDR[0].x+5], we will now guess that we should look at IN[5] for the "base" information. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>	2015-05-22 19:08:14 -04:00
Ilia Mirkin	d1eea18a59	nvc0/ir: set ftz when sources are floats, not just destinations In the case of a compare, the destination might be a predicate, but we still want to flush denorms. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>	2015-05-22 16:51:05 -04:00
Ilia Mirkin	a85aba190d	nv50/ir: allow OP_SET to merge with OP_SET_AND/etc as well as a neg This covers the pattern where a KILL_IF is used, which triggers a comparison of -x to 0. This can usually be folded into the comparison whose result is being compared to 0, however it may, itself, have already been combined with another comparison. That shouldn't impact the logic of this pass however. With this and the & 1.0 change, code like 00000020: 001c0001 80081df4 set b32 $r0 lt f32 $r0 0x3e800000 00000028: 001c0000 201fc000 and b32 $r0 $r0 0x3f800000 00000030: 7f9c001e dd885c00 set $p0 0x1 lt f32 neg $r0 0x0 00000038: 0000003c 19800000 $p0 discard becomes 00000020: 001c001d b5881df4 set $p0 0x1 lt f32 $r0 0x3e800000 00000028: 0000003c 19800000 $p0 discard Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-22 16:51:05 -04:00
Ilia Mirkin	d2a474e8d4	nvc0/ir: optimize set & 1.0 to produce boolean-float sets This has started to happen more now that the backend is producing KILL_IF more often. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2015-05-22 16:51:05 -04:00
Ilia Mirkin	e5ad19a46e	nvc0/ir: allow iset to produce a boolean float Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-22 16:51:05 -04:00
Ilia Mirkin	0ec6b8ea8c	nvc0/ir: avoid jumping to a sched instruction Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-22 16:51:05 -04:00
Samuel Pitoiset	a21d23e191	nv50: fix PIPELINE_STATISTICS with HUD, based on nvc0 Tested on NVA8. No regression for ARB_pipeline_statistics piglit tests. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-22 11:39:23 +02:00
Samuel Pitoiset	867fd2b5f5	nv50: fix 64-bit queries with HUD, based on nvc0 A sequence number is written for 32-bits queries to make sure they are ready, but not for 64-bits queries. Instead, we have to use a fence in order to fix the HUD because it doesn't wait until the result is ready. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-22 11:39:23 +02:00
Christian König	6921ea42a1	radeon/vce: adapt new firmware interface changes v2: make this also compatible with original released firmware v3 (chk): switch to original idea of separate files for fw versions Signed-off-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v2)	2015-05-22 10:17:24 +02:00
Christian König	2b40c306d2	radeon/vce: move CPB handling function into common code They are not firmware version dependent. Signed-off-by: Christian König <christian.koenig@amd.com>	2015-05-22 10:17:24 +02:00
Ilia Mirkin	6cdb29d52f	freedreno/a3xx: set .zw of sprite coords to .01 Fixes non-determinism in bin/point-sprite rendering, and the stars on the intro screen to neverball. Cc: "10.6" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-20 21:54:00 -04:00
Ilia Mirkin	3e7bc67285	freedreno/ir3: fix immediate usage in tgsi tex fe get_immediate will return a const reference, the requested immediate isn't necessarily in the x slot. Make sure to use the swizzle. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-05-20 21:53:59 -04:00
Jason Ekstrand	2126c68e5c	nir: Get rid of the array elements parameter on load/store intrinsics Previously, we used intrinsic->const_index[1] to represent "the number of array elements to load" for load/store intrinsics. However, this is set to 1 by every pass that ever creates a load/store intrinsic. Also, while it might make some sense for registers, it makes no sense whatsoever in SSA. On top of that, the i965 backend was the only backend to ever support it; freedreno and vc4 just assert that it's always 1. Let's just delete it. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2015-05-20 09:28:06 -07:00
Marek Olšák	e1c4e8aaaa	gallium: remove TGSI_SAT_MINUS_PLUS_ONE It's a remnant of some old NV extension. Unused. I also have a patch that removes predicates if anyone is interested. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-05-20 15:40:46 +02:00
Dave Airlie	55a7b5165d	softpipe: start adding gather support (v2) This adds both ARB_texture_gather and the enhanced gather for ARB_gpu_shader5. This passes all the piglit tests, it relies on the GLSL lowering pass to make textureGatherOffsets work. v2: use inline to get gather component (Brian) fix function name, add asserts (Brian) Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-05-20 12:32:59 +10:00
Dave Airlie	0108eae291	softpipe: use arrays to make gather easier This is a prep change for gather, and it makes more sense to use an array in these cases. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-05-20 12:32:55 +10:00
Dave Airlie	3f5c67d651	softpipe: add textureOffset support. This was an oversight when GLSL1.30 was enabled, I think my misunderstanding. This fixes a bunch of tex-miplevel-selection tests under softpipe, and is required for textureGather support. I'm not sure this won't make sampling slower, but it's softpipe, correctness first and all that. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-05-20 12:32:47 +10:00
Dave Airlie	8bec83a307	softpipe: move control into a filter args struct more stuff for offsets and gather will go in here later. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-05-20 12:32:44 +10:00
Dave Airlie	99e583120c	softpipe: move some image filter parameters into a struct This moves some of the image filter args into a struct, and passes that instead, this is prep work for adding texture gather support which needs new arguments. review: make filter args const. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-05-20 12:32:27 +10:00
Rob Clark	e6f912f07e	freedreno: fence fix A fence can outlive the ctx, so we shouldn't deref the ctx to get at the screen. We need some updates in libdrm_freedreno API to completely handle fences properly, but this is at least an improvement. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-05-18 17:47:54 -04:00
Ilia Mirkin	ae405d429f	gk110/ir: switch to gk104-style sched codes rather than all-in-one Matches change to envydis/envyas tools. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-18 12:59:52 -04:00
Marek Olšák	369aca1b4a	trace: implement new tessellation functions Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-05-16 14:51:22 +02:00
Alexander von Gluck IV	624b38add9	gallium/drivers: Add extern "C" wrappers to public entry Reviewed-by: Brian Paul <brianp@vmware.com>	2015-05-15 13:55:59 -04:00
Rob Clark	4925c35660	freedreno: fix bug in tile/slot calculation This was causing corruption with hw binning on a306. Unlikely that it is a306 specific, but rather the smaller gmem size resulted in different tile configuration which was triggering the bug at certain resolutions. Signed-off-by: Rob Clark <robclark@freedesktop.org> Cc: "10.4" and "10.5" and "10.6" <mesa-stable@lists.freedesktop.org>	2015-05-14 14:46:14 -04:00
Rob Clark	fcc7d6323b	freedreno: enable a306 Whitelist adreno 306 (as found in msm8916/apq8016). Works pretty much out of the box, although the smaller GMEM size requires more tiles to fit 1920x1080, so bump up the max # of tiles as well. Since it is just whitelist + trivial change, it makes sense to land on all the active release branches. Note that a305c ends up with gpu-id "306", hence a306 ends up with gpu-id of "307". Apparently that is what happens when you let the marketing dept name things. Cc: "10.4" and "10.5" and "10.6" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-05-14 14:46:14 -04:00
Samuel Pitoiset	175cbb447a	nvc0: remove unused nv50_tsc_wrap_mode() function Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-14 13:27:44 -04:00
Samuel Pitoiset	ac1ac94b38	nv50/ir: silence compiler warnings about mismatched tags These warnings have been detected by Clang 3.6. codegen/nv50_ir_from_tgsi.cpp:1319:10: warning: struct 'Source' was previously declared as a class [-Wmismatched-tags] const struct tgsi::Source *code; Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-14 13:27:44 -04:00
Samuel Pitoiset	70651b7041	nv50/ir: remove unused private field cycle to SchedDataCalculator Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-14 13:27:43 -04:00
Samuel Pitoiset	7469f2fd23	nv30: remove unused nvfx_fp_memcpy() function and comment nv40_fp_bra() The nv40_fp_bra() function in the same file is also unused but this is the only place where the nv30/nv40 isa is documented. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-14 13:27:43 -04:00
Samuel Pitoiset	48c84a36dd	nvc0: do not expose MP counters for nvf0 (GK110+) This fixes a crash when trying to monitor MP counters because compute support is not implemented for nvf0. Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-14 13:27:43 -04:00
Roland Scheidegger	adcf8f8a13	softpipe: enable ARB_texture_view Some bits were already there for texture views but some were missing. In particular for cube map views things needed to change a bit. For simplicity I ended up removing the separate face addr bit (just use the z bit) - cube arrays didn't use it already, so just follow the same logic there. (In theory using separate bits could allow for better hash function but I don't think anyone ever did some measurements of that so probably not worth the trouble, if we'd reintroduce it we'd certainly wanted to use the same logic for cube arrays and cube maps.) Also extend the seamless cube sampling to cube arrays - as there were no piglit failures before this is apparently untested, but things now generally work quite the same for cube textures and cube array textures so there hopefully shouldn't be any trouble... 49 new piglits, 47 pass, 2 fail (both due to fake multisampling). v2: incorporate Brian's feedback, add sampler view validation, function rename, formatting fixes. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-05-13 22:57:50 +02:00
Roland Scheidegger	e6c66f4fb0	llvmpipe: enable ARB_texture_view All the functionality was pretty much there, just not tested. Trivially fix up the missing pieces (take target info from view not resource), and add some missing bits for cubes. Also add some minimal debug validation to detect uninitialized target values in the view... 49 new piglits, 47 pass, 2 fail (both related to fake multisampling, not texture_view itself). No other piglit changes. v2: move sampler view validation to sampler view creation, update docs. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-05-13 22:57:50 +02:00
Ilia Mirkin	c696a318ef	nouveau: document nouveau_heap Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-12 18:58:49 -04:00
Ilia Mirkin	d06ce2f1df	nvc0: switch mechanism for shader eviction to be a while loop This aligns it to work similarly to nv50. However there's no library code there, so the whole thing can be freed. Here we end up with an allocated node that's not attached to a specific program. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86792 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-05-12 18:47:17 -04:00
Marek Olšák	79ffc08ae8	gallium: add PIPE_CAP_DEVICE_RESET_STATUS_QUERY Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-05-12 19:38:31 +02:00
Dave Airlie	9ab90c058f	r600: use pipe->hw prim convert from radeonsi This avoids future addition to PIPE_PRIM_ from causing regressions on r600g. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-05-11 06:43:18 +10:00
Rob Clark	1cbdafc47a	freedreno/ir3/nir: fix build break after `f752effa` Our lower if/else pass was missed when converting NIR to use linked lists rather than hashsets to track use/def sets. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-05-10 06:03:53 -04:00
Ilia Mirkin	da136dc07d	nv50/ir: only enable mul saturate on G200+ Commit `44673512a8` enabled support for saturating fmul. However experimentally this does not seem to work on the older chips. Restrict the feature to G200 (NVA0) and later. Reported-by: Pierre Moreau <pierre.morrow@free.fr> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90350 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Cc: mesa-stable@lists.freedesktop.org	2015-05-09 13:41:51 -04:00
Ilia Mirkin	7892210400	nvc0: reset the instanced elements state when doing blit using 3d engine Since we update num_vtxelts here, we could otherwise end up with stale instancing information in the upper bits which wouldn't otherwise get reset. (Also we run the risk of the previous draw having set the first element as instanced.) This appears as one of the causes for the test pointed out in fdo#90363 to fail on nvc0. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90363 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-05-09 13:36:23 -04:00
Ilia Mirkin	e9b1ea29bf	nvc0: keep track of PGRAPH state in nvc0_screen See identical commit for nv50. Destroying the current context and then creating a new one or switching to another existing context would cause the "current" state to not be properly initialized, so we save it off in the screen. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-05-09 13:36:23 -04:00
Ilia Mirkin	f617029db3	nv50: keep track of PGRAPH state in nv50_screen Normally this is kept in nv50_context, and on switching the active context, the state is copied from the previous context. However when the last context is destroyed, this is lost, and a new context might later be created. When the currently-active context is destroyed, save its state in the screen, and restore it when setting the current context. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90363 Reported-by: Matteo Bruni <matteo.mystral@gmail.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Matteo Bruni <matteo.mystral@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2015-05-09 13:36:23 -04:00
Jason Ekstrand	7a30668ad6	util: Move gallium's linked list to util The linked list in gallium is pretty much the kernel list and we would like to have a C-based linked list for all of mesa. Let's not duplicate and just steal the gallium one. Acked-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2015-05-08 17:16:13 -07:00
Ilia Mirkin	c4ac09e30e	nv50/ir: only propagate saturate up if some actual folding took place The former logic would copy the saturate up to any mul with an immediate if there was a subsequent mul with a saturate. However we only want to do that if we collapsed 2 muls by multiplying their immediates (or were able to put the immediate in as a post-multiplier). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-05-08 18:56:56 -04:00
Ilia Mirkin	55b66dc4de	nv50/ir: add SHL to the list of U32 opcodes Having the wrong inferred type prevents a number of optimizations, including constant propagation (since float immediates work differently than integer immediates). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-05-06 20:50:03 -04:00
Vinson Lee	382b1a36e3	r600g: Fix Clang return-type build error. Fix Clang return-type error introduced with commit `96f164f6f0` "gallium: make pipe_context::begin_query return a boolean". CC r600_query.lo r600_query.c:443:3: error: non-void function 'r600_begin_query' should return a value [-Wreturn-type] return; ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-05-06 12:21:34 -07:00
Chia-I Wu	ef5d4bcc3a	ilo: silence a compiler warning Silence ilo_query.c:120:7: warning: 'return' with no value, in function returning non-void since commit `96f164f6`.	2015-05-06 16:35:30 +08:00
Samuel Pitoiset	cea910bc28	nvc0: all queries use an unsigned 64-bits integer by default Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Martin Peres <martin.peres@free.fr>	2015-05-06 00:03:36 +03:00
Samuel Pitoiset	35a9286be6	nvc0: make begin_query return false when all MP counters are used Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Martin Peres <martin.peres@free.fr>	2015-05-06 00:03:36 +03:00
Samuel Pitoiset	ed7d3886cc	nvc0: define driver-specific query groups This patch defines "Driver statistics" and "MP counters" groups, but only the latter will be exposed through GL_AMD_performance_monitor. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Martin Peres <martin.peres@free.fr>	2015-05-06 00:03:36 +03:00
Samuel Pitoiset	96f164f6f0	gallium: make pipe_context::begin_query return a boolean GL_AMD_performance_monitor must return an error when a monitoring session cannot be started. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Martin Peres <martin.peres@free.fr>	2015-05-06 00:03:36 +03:00
Samuel Pitoiset	546ec980f8	gallium: replace pipe_driver_query_info::max_value by a union This allows queries to return different numeric types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Martin Peres <martin.peres@free.fr>	2015-05-06 00:03:35 +03:00
Samuel Pitoiset	b620829b5e	gallium: add new fields to pipe_driver_query_info According to the spec of GL_AMD_performance_monitor, valid type values returned are UNSIGNED_INT, UNSIGNED_INT64_AMD, PERCENTAGE_AMD, FLOAT. This also introduces the new field group_id in order to categorize queries into groups. v2: add PIPE_DRIVER_QUERY_TYPE_BYTES v3: fix incorrect query type for radeon and svga drivers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Martin Peres <martin.peres@free.fr>	2015-05-06 00:03:35 +03:00
Chia-I Wu	4348046a2f	ilo: use ilo_image exclusively in core Initialize ilo_view_surface and ilo_zs_surface from ilo_image instead of ilo_texture.	2015-05-02 22:28:31 +08:00
Chia-I Wu	9b705ec32d	ilo: add ilo_image_can_enable_aux() It replaces ilo_texture_can_enable_hiz().	2015-05-02 22:14:07 +08:00
Chia-I Wu	430594c34f	ilo: make ilo_image more self-contained Add depth0, sample_count, and scanout to ilo_image.	2015-05-02 22:14:06 +08:00
Chia-I Wu	f6ca4084c7	ilo: add ilo_image_init_for_imported() It replaces ilo_image_update_for_imported_bo() and enables more error checking for imported textures.	2015-05-02 22:14:06 +08:00
Chia-I Wu	938c9b8cea	ilo: prepare for image init for imported bo Refactoring in preparation for ilo_image_init_for_imported().	2015-05-02 22:14:06 +08:00
Chia-I Wu	3f9415077b	ilo: constify ilo_image_params Make ilo_image_params const in functions that do not modify it.	2015-05-02 22:14:06 +08:00
Chia-I Wu	c209aa7a8f	ilo: improve readability of ilo_image Improve docs, rename struct fields, and reorder walk types. No real changes.	2015-05-02 22:14:06 +08:00
Chia-I Wu	9b72bf5bd2	ilo: move command builder to core	2015-05-02 22:14:06 +08:00
Chia-I Wu	9e24c49e64	ilo: move ilo_state_3d* to core ilo state structs (struct ilo_xxx_state) are moved as well.	2015-05-02 22:14:06 +08:00
Chia-I Wu	8ab18262c5	ilo: add ilo_buffer.h to core Rename the original ilo_buffer to ilo_buffer_resource to avoid name conflict.	2015-05-02 22:14:06 +08:00
Chia-I Wu	3afbeb115a	ilo: move BOs from ilo_texture to ilo_image We want to work with ilo_image instead of ilo_texture in core.	2015-05-02 22:14:06 +08:00
Chia-I Wu	ac47563cb4	ilo: move ilo_layout.[ch] to core as ilo_image.[ch] Move files and s/layout/image/.	2015-05-02 22:14:06 +08:00
Chia-I Wu	8252765532	ilo: add ilo_format.[ch] to core The original ilo_format.[ch] are removed.	2015-05-02 22:14:06 +08:00
Chia-I Wu	9b7080c8b3	ilo: add ilo_fence.h to core Implement pipe_fence_handle on top of ilo_fence.	2015-05-02 22:14:06 +08:00
Chia-I Wu	2182beb431	ilo: add ilo_dev_init() to core Move init_dev() from ilo_screen.c to core.	2015-05-02 22:14:06 +08:00
Chia-I Wu	7562f9e907	ilo: rename ilo_dev_info to ilo_dev With intel_winsys being embedded in it, drop the "_info" suffix.	2015-05-02 22:14:06 +08:00
Chia-I Wu	19351af53d	ilo: move intel_winsys to ilo_dev_info We want to use ilo_dev_info instead of ilo_screen in core.	2015-05-02 22:14:06 +08:00
Chia-I Wu	b3197fe5f4	ilo: add ilo_dev.h to core Move what remains in ilo_common.h (that is, ilo_dev_*) to ilo_dev.h.	2015-05-02 22:14:06 +08:00
Chia-I Wu	7bb4fa72c0	ilo: add ilo_debug.[ch] to core They consist of the debug helpers that used to live in ilo_common.h and ilo_screen.c.	2015-05-02 22:14:06 +08:00
Chia-I Wu	a5797873d0	ilo: add ilo_core.h to core ilo_core.h includes the common gallium headers that were included in ilo_common.h.	2015-05-02 22:14:05 +08:00
Chia-I Wu	bbe91576b7	ilo: move intel_winsys.h to core Add a new subdirectory and start moving files that do not depend on ilo_screen/ilo_context to it.	2015-05-02 22:14:05 +08:00
Ilia Mirkin	33f0d1138d	nvc0/ir: fix predicated PFETCH for real Commit `a9d08a250` accidentally didn't make use of the new src1 variable. Use it. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-04-30 02:02:47 -04:00
Ilia Mirkin	db269ae495	nv50/ir: fix asFlow() const helper for OP_JOIN Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-04-29 23:34:30 -04:00
Ilia Mirkin	a9d08a250a	nvc0/ir: fix predicated PFETCH emission src1 would contain the predicate, which would get emitted as a register source by an undiscerning srcId helper. Work around this in the same way as in emitTEX. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-04-29 23:34:22 -04:00
Ilia Mirkin	515ac907e6	gk110/ir: fix set with a register dest to not auto-set the abs flag This was causing src0 to always have the absolute value flag set. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-04-29 18:03:19 -04:00
Marek Olšák	a582b22c63	winsys/radeon: add a private interface for radeon_surface	2015-04-29 21:51:40 +02:00
Marek Olšák	dcfbc006b6	winsys/radeon: move radeon_winsys.h to drivers/radeon	2015-04-29 21:51:40 +02:00
Emil Velikov	b124dc2b70	r300: do not link against libdrm_intel Accidentally added since the introduction of the file. Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-04-29 15:15:19 +01:00
Axel Davy	559342d01d	gallium/svga: Remove useless ARRAY_SIZE declaration This is already declared in util/macros.h Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-04-29 08:28:10 +02:00
Axel Davy	64880d073a	util/macros: Move DIV_ROUND_UP to util/macros.h Move DIV_ROUND_UP to a shared location accessible everywhere Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-04-29 08:28:10 +02:00
Ilia Mirkin	6fe0d4f035	nvc0/ir: flush denorms to zero in non-compute shaders This will set the FTZ flag (flush denorms to zero) on all opcodes that can take it. This resolves issues in Unigine Heaven 4.0 where there were solid-filled boxes popping up. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89455 Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-28 20:17:03 -04:00
Ilia Mirkin	e312a69958	nvc0: expose GLSL version 410 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-28 12:48:22 -04:00
Marek Olšák	6d05396b00	r600g,radeonsi: add a driver query returning GPU load Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-04-28 16:05:45 +02:00
Marek Olšák	0b8e73a6ae	r600g,radeonsi: add driver queries for GPU temperature and shader+memory clocks Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-04-28 16:05:45 +02:00
Ilia Mirkin	9143940da2	gm107/ir: add lane/vertex count sysvals Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-27 21:25:29 -04:00
Ilia Mirkin	89e0b08794	gk110/ir: add support for writing per-patch and shader outputs Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-27 21:25:28 -04:00
Ilia Mirkin	52614f59b7	freedreno/a3xx: color masking works like a blend for some formats When there is a colormask active that does not cover all the channels, enable reading in the destination like with a combining blend operation. This fixes fbo-blending-formats on a3xx. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-27 20:17:07 -04:00
Ilia Mirkin	9fc3f47278	freedreno/a3xx: add support for S8 and Z32F_S8 Enables ARB_depth_buffer_float. There is no sampling support for interleaved Z32F_S8, so we store the two textures separately, one as Z32F, the other as S8. As a result, we need a lot of additional logic for restores and transfers. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-27 20:17:07 -04:00
Ilia Mirkin	1571da6ac3	freedreno/a3xx: add Z32F support 32-bit depth buffers are stored as unorm, and thus need special handling when moving to and from gmem. They are copied into gmem by writing depth, and resolved from gmem using a special resolve bit which apparently float-ifies the data. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-27 20:17:07 -04:00
Ilia Mirkin	0a4cb00c77	freedreno: add fd_transfer to wrap around pipe_transfer Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-27 20:17:07 -04:00
Ilia Mirkin	f5c1101996	freedreno/a3xx: add support for disabling depth clipping Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-27 20:17:07 -04:00
Zoë Blade	05e7f7f438	Fix a few typos Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-04-27 17:28:29 +03:00
Marek Olšák	db2415189a	radeonsi: set an optimal value for DB_Z_INFO.ZRANGE_PRECISION Required because of a VI hw bug. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-04-27 15:57:07 +02:00
Marek Olšák	bed98eef9a	radeonsi: remove deprecated and useless registers Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-04-27 15:56:27 +02:00
Marek Olšák	393b0e0531	radeonsi: remove useless includes Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-04-27 15:56:27 +02:00
Marek Olšák	d8269be1ce	gallium/radeon: print winsys info with R600_DEBUG=info Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-04-27 15:56:27 +02:00
Marek Olšák	ecc7f2ed91	gallium/radeon: don't crash when getting out-of-bounds TEMP references Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-04-23 16:14:39 +02:00
Dave Airlie	8a41cd2407	softpipe: fix stencil write to use an integer value This fixes a number of regressions since `61393bdcdc` u_tile: fix stencil texturing tests under softpipe Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89960 Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-04-23 08:32:30 +10:00
Rob Clark	cb24d3b7ad	freedreno: misc minor cleanups Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-22 13:20:28 -04:00
Rob Clark	1b58d8c2bf	freedreno/a4xx: (partial) gl_FragCoord.zw The bit to enable .z is still commented out, as it is triggering gpu hangs in 0ad. But at least gl_FragCoord.w works now, and we know what bits we are supposed to set for .z (with that uncommented all piglit fragcoord tests are passing). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-22 13:20:28 -04:00
Rob Clark	a869183123	freedreno/a4xx: primitive-restart This was the missing bit to get dolphin-emu working on a4xx. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-22 13:20:28 -04:00
Rob Clark	632ea2a113	freedreno/nir: sysval fixes Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-22 13:20:28 -04:00
Rob Clark	13527df143	freedreno/a4xx: wire up integer texture sampling Similar to a3xx, the compiler needs to know the return type of the sam, etc, instructions. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-22 13:20:28 -04:00
Rob Clark	48a651e98c	freedreno/a4xx: formats updates/fixes Update formats table with new formats that Ilia has figured out, and fix sampling from srgb texture and integer vbo's. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-22 13:20:28 -04:00
Rob Clark	21ceedfd8b	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-22 13:20:27 -04:00
Emil Velikov	86919352e3	android: use LOCAL_SHARED_LIBRARIES over TARGET_OUT_HEADERS ... to manage the LIBDRM*_CFLAGS. The former is the recommended approach by the Android build system developers while the latter has been deprecated for quite some time. Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-04-22 14:23:28 +01:00
Emil Velikov	413bc0a618	ilo: remove unused include from Android.mk Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>	2015-04-22 14:18:47 +01:00
Ilia Mirkin	0904774af1	freedreno/a3xx: enable polymode setting with non-fill modes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-18 17:35:23 -04:00
Ilia Mirkin	6357601628	freedreno/a3xx: fix integer and 32-bit float border colors Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-18 17:35:23 -04:00
Ilia Mirkin	6895c3554e	freedreno/a3xx: add support for float R/RG render targets Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-18 17:35:23 -04:00
Rob Clark	95e68adcd9	freedreno/ir3/nir: few little fixes isaml needs to scale up coords based on LoD. Also fix bogus bary.f varying # when there are non-bary frag shader inputs. And use sub.s of a positive immediate rather than add.s of negative (since CP is better about figuring out that those can be collapsed into the cat2 instr). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-17 11:40:14 -04:00
Rob Clark	efbf14e893	freedreno/ir3/nir: lower if/else For now, completely flatten if/else blocks. That will almost certainly change once we have flow control. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-17 11:40:14 -04:00
Rob Clark	e5e11b5baf	freedreno/a4xx: support for large shaders Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-17 10:40:50 -04:00
Rob Clark	20ea698c49	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-17 10:40:44 -04:00
Rob Clark	57f0d3b3c6	freedreno/ir3/nir: UBO support Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-17 10:40:36 -04:00
Rob Clark	87807e5cc5	freedreno/ir3: move out helper We'll also want it in NIR f/e for implementing UBO support. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-17 10:40:28 -04:00
Rob Clark	70b2f872ea	freedreno/a4xx: sysvals and UBOs Basically just sync up the cmdstream emit parts to match the changes already done on a3xx. Also, fix scheduling for mem instructions. This is needed on a4xx, and I am a bit surprised it isn't needed for a3xx. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-17 10:40:18 -04:00
Marek Olšák	b79c620663	radeonsi: add a debug option to compile shaders when they're created Tested-by: Tom Stellard <thomas.stellard@amd.com>	2015-04-16 18:36:29 +02:00
Emil Velikov	a7d018accf	radeonsi: remove bogus r600-- triple As mentioned by Michel Dänzer for LLVM >= 3.6 we create the LLVMTargetMachine (with triple amdgcn--), as we setup the radeonsi context. For older LLVM or hardware (r600) the triple is always r600-- and is created at a later stage - radeon_llvm_compile() Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-04-16 14:15:19 +01:00
Glenn Kennard	17d69862a9	r600g/sb: Skip empty ALU clause while scheduling Fixes assert triggered by ext_transform_feedback-intervening-read output use_gs piglit test. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-04-16 12:43:20 +10:00
Eric Anholt	b229e6c7de	vc4: Don't try to use color load/stores to blit across format changes. We could potentially support the right combination of 8888 to 565, but the important thing for now is to not mix up our orderings of 8888. Fixes fbo-copyteximage regressions.	2015-04-15 16:50:23 -07:00
Eric Anholt	cff2e08c4c	vc4: Don't try to use color load/stores to do depth/stencil blits. Fixes regressions in fbo-generatemipmap-formats on depth/stencil (which does blits to work around baselevel/lastlevel).	2015-04-15 16:50:23 -07:00
Eric Anholt	3a728d4dfb	vc4: Update the shadow texture for public textures on every draw. We don't know who else has written to it, so we'd better update it every time. This makes the gears spin in X again.	2015-04-15 16:50:23 -07:00
Eric Anholt	bd957b1b79	vc4: Hook up VC4_DEBUG=perf to some useful printfs.	2015-04-15 16:50:22 -07:00
Tom Stellard	e0994e0f97	radeon/llvm: Improve codegen for KILL_IF Rather than emitting one kill instruction per component of KILL_IF's src reg, we now or the components of the src register together and use the result as a condition for just one kill instruction. shader-db stats (bonaire): 979 shaders Totals: SGPRS: 34872 -> 34848 (-0.07 %) VGPRS: 20696 -> 20676 (-0.10 %) Code Size: 749032 -> 748452 (-0.08 %) bytes LDS: 11 -> 11 (0.00 %) blocks Scratch: 12288 -> 12288 (0.00 %) bytes per wave Totals from affected shaders: SGPRS: 1184 -> 1160 (-2.03 %) VGPRS: 600 -> 580 (-3.33 %) Code Size: 13200 -> 12620 (-4.39 %) bytes LDS: 0 -> 0 (0.00 %) blocks Scratch: 0 -> 0 (0.00 %) bytes per wave Increases: SGPRS: 2 (0.00 %) VGPRS: 0 (0.00 %) Code Size: 0 (0.00 %) LDS: 0 (0.00 %) Scratch: 0 (0.00 %) Decreases: SGPRS: 5 (0.01 %) VGPRS: 5 (0.01 %) Code Size: 25 (0.03 %) LDS: 0 (0.00 %) Scratch: 0 (0.00 %) * BY PERCENTAGE * Max Increase: SGPRS: 32 -> 40 (25.00 %) VGPRS: 0 -> 0 (0.00 %) Code Size: 0 -> 0 (0.00 %) bytes LDS: 0 -> 0 (0.00 %) blocks Scratch: 0 -> 0 (0.00 %) bytes per wave Max Decrease: SGPRS: 32 -> 24 (-25.00 %) VGPRS: 16 -> 12 (-25.00 %) Code Size: 116 -> 96 (-17.24 %) bytes LDS: 0 -> 0 (0.00 %) blocks Scratch: 0 -> 0 (0.00 %) bytes per wave * BY UNIT * Max Increase: SGPRS: 64 -> 72 (12.50 %) VGPRS: 0 -> 0 (0.00 %) Code Size: 0 -> 0 (0.00 %) bytes LDS: 0 -> 0 (0.00 %) blocks Scratch: 0 -> 0 (0.00 %) bytes per wave Max Decrease: SGPRS: 32 -> 24 (-25.00 %) VGPRS: 16 -> 12 (-25.00 %) Code Size: 424 -> 356 (-16.04 %) bytes LDS: 0 -> 0 (0.00 %) blocks Scratch: 0 -> 0 (0.00 %) bytes per wave Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-04-14 13:37:12 +00:00
Tom Stellard	c6d79ed289	radeon/llvm: Run LLVM's instruction combining pass This should improve code quality in general and will help with some future changes to how we emit kill instructions. shader-db shows a few regressions, but these don't seem to be the result of deficiencies in instcombine. They're mostly caused by the scheduler making different decisions than before. shader-db stats (bonaire): 979 shaders Totals: SGPRS: 35056 -> 34872 (-0.52 %) VGPRS: 20624 -> 20696 (0.35 %) Code Size: 764372 -> 749032 (-2.01 %) bytes LDS: 11 -> 11 (0.00 %) blocks Scratch: 12288 -> 12288 (0.00 %) bytes per wave Totals from affected shaders: SGPRS: 13264 -> 13072 (-1.45 %) VGPRS: 8248 -> 8316 (0.82 %) Code Size: 486320 -> 470992 (-3.15 %) bytes LDS: 11 -> 11 (0.00 %) blocks Scratch: 11264 -> 11264 (0.00 %) bytes per wave Increases: SGPRS: 6 (0.01 %) VGPRS: 20 (0.02 %) Code Size: 14 (0.01 %) LDS: 0 (0.00 %) Scratch: 0 (0.00 %) Decreases: SGPRS: 32 (0.03 %) VGPRS: 8 (0.01 %) Code Size: 244 (0.25 %) LDS: 0 (0.00 %) Scratch: 0 (0.00 %) * BY PERCENTAGE * Max Increase: SGPRS: 32 -> 48 (50.00 %) VGPRS: 12 -> 20 (66.67 %) Code Size: 216 -> 224 (3.70 %) bytes LDS: 0 -> 0 (0.00 %) blocks Scratch: 0 -> 0 (0.00 %) bytes per wave Max Decrease: SGPRS: 40 -> 32 (-20.00 %) VGPRS: 16 -> 12 (-25.00 %) Code Size: 368 -> 280 (-23.91 %) bytes LDS: 0 -> 0 (0.00 %) blocks Scratch: 0 -> 0 (0.00 %) bytes per wave * BY UNIT * Max Increase: SGPRS: 32 -> 48 (50.00 %) VGPRS: 28 -> 36 (28.57 %) Code Size: 39320 -> 40132 (2.07 %) bytes LDS: 0 -> 0 (0.00 %) blocks Scratch: 0 -> 0 (0.00 %) bytes per wave Max Decrease: SGPRS: 72 -> 64 (-11.11 %) VGPRS: 48 -> 40 (-16.67 %) Code Size: 6272 -> 5852 (-6.70 %) bytes LDS: 0 -> 0 (0.00 %) blocks Scratch: 0 -> 0 (0.00 %) bytes per wave Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-04-14 13:37:05 +00:00
Tom Stellard	2569c7109d	radeonsi: Add header and footer to shader stat dump This makes it easier to parse. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-04-14 13:36:59 +00:00
Eric Anholt	1be329e64c	vc4: Add a blitter path using just the render thread. This accelerates the path for generating the shadow tiled texture when asked to sample from a raster texture (typical in glamor).	2015-04-13 23:20:46 -07:00
Eric Anholt	76d56752cc	vc4: Allow submitting jobs with no bin CL in validation. For blitting, we want to fire off an RCL-only job. This takes a bit of tweaking in our validation and the simulator support (and corresponding new code in the kernel).	2015-04-13 23:20:45 -07:00
Eric Anholt	43b20795b7	vc4: Move the blit code to a separate file. There will be other blit code showing up, and it seems like the place you'd look.	2015-04-13 23:20:45 -07:00
Eric Anholt	e214a59635	vc4: Separate out a bit of code for submitting jobs to the kernel. I want to be able to have multiple jobs being set up at the same time (for example, a render job to do a little fixup blit in the course of doing a render to the main FBO).	2015-04-13 23:20:45 -07:00
Eric Anholt	44b63cf5c0	vc4: When asked to sample from a raster texture, make a shadow tiled copy. So, it turns out my simulator doesn't quite match the hardware. And the errata about raster textures tells you most of what's wrong, but there's still stuff wrong after that. Instead, if we're asked to sample from raster, we'll just blit it to a tiled temporary. Raster textures should only be screen scanout, and word is that it's faster to copy to tiled using the tiling engine first than to texture from an entire raster texture, anyway.	2015-04-13 22:34:06 -07:00
Eric Anholt	d04b07f8e2	vc4: Fix off-by-one in branch target validation.	2015-04-13 22:34:06 -07:00
Eric Anholt	7fa2f2e366	vc4: Use NIR-level lowering for idiv. This fixes the idiv tests in piglit.	2015-04-13 21:36:40 -07:00
Eric Anholt	84ebaff1b7	vc4: Add a bunch of type conversions. These are required to get piglit's idiv tests working. The unsigned<->float conversions are wrong, but are good enough to get piglit's small ranges of values working.	2015-04-13 21:36:40 -07:00
Eric Anholt	adae027260	vc4: Use the blit interface for updating shadow textures. This lets us plug in a better blit implementation and have it impact the shadow update, too.	2015-04-13 10:39:24 -07:00
Eric Anholt	39b6f7e76c	vc4: Remove dead fields from vc4_surface.	2015-04-13 10:39:24 -07:00
Eric Anholt	5100221ff7	vc4: Skip sending down the clear colors if not clearing.	2015-04-13 10:39:24 -07:00
Eric Anholt	725620f21d	vc4: Sync with kernel changes to relax BCL versus RCL validation. There was no reason to tie the two packets' values together.	2015-04-13 10:39:23 -07:00
Eric Anholt	cb88d2cfcb	vc4: Fix another space allocation mistake. We're over-allocating our BCL in vc4_draw.c, so this never mattered. However, new RCL-only blit support might end up here without having set up any BCL contents.	2015-04-13 10:39:02 -07:00
Eric Anholt	8eb9304ee7	vc4: Add missed accounting for the size of the semaphore. This wouldn't have mattered except in the worst case scenario RCL setup.	2015-04-13 10:33:30 -07:00
Rob Clark	b98c0262d1	freedreno/ir3/nir: couple little fixes Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-11 11:41:03 -04:00
Rob Clark	1b936bb9f8	freedreno/ir3/nir: handle system values Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-11 11:40:57 -04:00
Rob Clark	715b2e0dbb	freedreno/ir3/nir: handle txs and query_levels tex ops These correspond to the tgsi TXQ opcode (plus sneak in a fix for two-sided color) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-11 11:40:43 -04:00
Rob Clark	97e8fc3fdd	freedreno/ir3/nir: split out tex helpers We'll need these in one or two other spots. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-11 11:40:36 -04:00
Rob Clark	6e8160d6e3	freedreno/ir3/nir: simplify emit_tex() Just build up arrays for src0/src1, and use create_collect().. Also add back missing .3d flag for 3d/cube textures. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-11 11:40:28 -04:00
Rob Clark	d5357c16cc	freedreno/ir3/cp: handle indirect properly I noticed some cases where we were trying to copy-propagate indirect src's into places they cannot go, like 2nd src for cat3 (mad, etc). Expand out valid_flags() to be aware of relativ flag, and fix up a few related spots. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-11 11:40:21 -04:00
Rob Clark	49be76166b	freedreno/ir3/sched: avoid getting stuck on addr conflicts When we get in a scenario where we cannot schedule any more instructions due to address register conflict, clone the instruction that writes the address register, and switch the remaining unscheduled users for the current address register over to the new clone. This is simpler and more robust than the previous attempt (which tried and sometimes failed to ensure all other dependencies of users of the address register were scheduled first).. hint it would try to schedule instructions that were not actually needed for any output value. We probably need to do the same with predicate register, although so far it isn't so heavily used so we aren't running into problems with it (yet). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-11 11:40:15 -04:00
Rob Clark	4cf4006674	freedreno/ir3/nir: add variable-indexing support A bit fugly.. try and make this cleaner.. note if we hoist all the get_addr() out of the loop we can drop the hashtable and just use create_addr().. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-11 11:40:09 -04:00
Rob Clark	972ce757d7	freedreno/ir3/asm: change assert to warning It probably should be an assert, but for now TGSI f/e isn't very good about dealing w/ CONST vs ABS/NEG. So for debug builds, print a warning instead of crashing with an assert for now. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-11 11:40:03 -04:00
Rob Clark	09cbd97a47	freedreno/ir3/nir: set first_driver_param Without this, a3xx breaks.. a4xx would too if it had already implemented support for passing driver params. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-11 11:39:56 -04:00
Rob Clark	f0e9a632a1	freedreno/ir3/cp: support to swap mad src's For a normal MAD (ie. not MADSH), if first source is gpr and second source is const, we can swap the first two sources to avoid needing a mov instruction. This gives back the biggest advantage TGSI f/e had over NIR f/e for common shaders, since TGSI f/e had this logic in the f/e. Note that doing this in copy-prop step has the advantage that it will also work for cases like: MOV TEMP[b], CONST[x] MAD TEMP[d], TEMP[a], TEMP[b], TEMP[c] Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-11 11:39:46 -04:00
Roland Scheidegger	586536a4e1	gallivm: don't use control flow when doing indirect constant buffer lookups llvm goes crazy when doing that, using way more memory and time, though there's probably more to it - this points to a very much similar issue as fixed in `8a9f5ecdb1`. In any case I've seen a quite plain looking vertex shader with just ~50 simple tgsi instructions (but with a dozen or so such indirect constant buffer lookups) go from a terribly high ~440ms compile time (consuming 25MB of memory in the process) down to a still awful ~230ms and 13MB with this fix (with llvm 3.3), so there's still obvious improvements possible (but I have no clue why it's so slow...). The resulting shader is most likely also faster (certainly seemed so though I don't have any hard numbers as it may have been influenced by compile times) since generally fetching constants outside the buffer range is most likely an app error (that is we expect all indices to be valid). It is possible this fixes some mysterious vertex shader slowdowns we've seen ever since we are conforming to newer apis at least partially (the main draw loop also has similar looking conditionals which we probably could do without - if not for the fetch at least for the additional elts condition.) v2: use static vars for the fake bufs, minor code cleanups Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-04-09 01:32:30 +02:00
Glenn Kennard	f2947807c8	r600g/sb: Enable SB for geometry shaders Add SV_GEOMETRY_EMIT special variable type to track the implicit dependencies between CUT/EMIT_VERTEX/MEM_RING instructions so GCM/scheduler doesn't reorder them. Mark emit instructions as unkillable so DCE doesn't eat them. Enable only for evergreen/cayman as there are a few unexplained GS piglit regressions on R6xx/R7xx with SB enabled otherwise. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-04-08 08:18:35 +10:00
Glenn Kennard	06bb68da4a	r600g/sb: Update last_cf for loops CF_END could end up emitted in the middle of a shader on cayman when there was a loop at the very end. Fixes glsl-1.50-geometry-end-primitive and ext_transform_feedback-geometry-shaders-basic piglit tests. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-04-08 08:18:17 +10:00
Ilia Mirkin	ae720c66cb	nv50,nvc0: limit the y-tiling of 3d textures to the first level's tiling We limit y-tiling to 0x20 when depth is involved. However the function is run for each miplevel, and the hardware expects miplevel 0 to have the highest tiling settings. Perform the y-tiling limit on all levels of a 3d texture, not just the ones that have depth. Fixes: texelFetch fs sampler3D 98x129x1-98x129x9 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Nick Tenney <nick.tenney@gmail.com> # GT216 Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>	2015-04-06 23:06:55 -04:00
Dave Airlie	ad84689f73	r600g: fix op3 abs issue This code to handle absolute values on op3 srcs was a bit too simple, it really needs a temp reg per src, not one per channel, make it easier and let sb clean up the mess. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89831 Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-04-07 11:40:16 +10:00
Rob Clark	8b0b81339b	freedreno/ir3: add NIR compiler The NIR compiler frontend is an alternative to the TGSI f/e, producing the same ir3 IR and using the same backend passes for scheduling, etc. It is not enabled by default yet, as there are still some regressions. To enable, use 'FD_MESA_DEBUG=nir'. It is enough to use with, for example, xonotic or supertuxkart. With the NIR f/e, scalarizing and a number of other lowering steps happen in NIR, so we don't have to do them in ir3. Which simplifies the f/e and allows the lowered instructions to pass through other optimization stages. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-05 16:36:40 -04:00
Ilia Mirkin	700d949ea1	freedreno/a3xx: don't decode srgb on mem2gmem Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:35 -04:00
Ilia Mirkin	b060b56772	freedreno/a3xx: pass sprite coord mode through to program emit Use the correct sprite replacement depending on the flip of the coord mode, using either T or 1-T depending on whether we have an upper-left or lower-left coordinate origin. This fixes all the point sprite piglits. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:35 -04:00
Ilia Mirkin	1de72dfc8a	freedreno/a3xx: add UBO support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:35 -04:00
Ilia Mirkin	c7811f56c2	freedreno/ir3: insert nop between sfu/mem operations Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:35 -04:00
Ilia Mirkin	14dfd8cc43	freedreno: dirty context when reallocating a bound bo Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:35 -04:00
Ilia Mirkin	bde2045fa2	freedreno: keep track of buffer valid ranges Copies nouveau_buffer and radeon_buffer. This allows a write to proceed to an uninitialized part of a buffer even when the GPU is using the previously-initialized portions. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:35 -04:00
Ilia Mirkin	dacf22e0a3	freedreno: mark resources as being read so that writes flush the queue Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:34 -04:00
Ilia Mirkin	2e1445c8f3	freedreno: don't bother setting resource timestamps Waiting on a bo being ready is handled in fd_bo_cpu_prep. No need to keep separate timestamps around. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:34 -04:00
Ilia Mirkin	1fee3061d5	freedreno: add a reading flag to indicate gpu is reading rsc Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:34 -04:00
Ilia Mirkin	ea0952a9db	freedreno: fix resource flushing confusion A resource flush is an upload of a hypothetically-staging texture to the GPU. For a UMA system, this will largely be a no-op or cache-maintenance. Move the render flush logic into transfer_map where it belongs, and clear out the transfer_flush function. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:34 -04:00
Ilia Mirkin	bfb0a8eb69	freedreno: remove tex_resource pipe_sampler_view already contains a texture, remove the redundant tex_resource member which pointed at the same thing. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:34 -04:00
Rob Clark	6cd9c94ce4	freedreno/ir3: handle FRAG IN's without interpolation specified Fallback to picking based on semantic name. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-05 16:36:34 -04:00
Rob Clark	f513f006ce	freedreno/ir3/cmdline: add @const headers for immediates Since NIR f/e currently encodes immediates in instructions (rather than passing via const), we need to ensure that when const's are used they get initialized to the proper values. Otherwise comparing NIR to TGSI compiler, it will use proper immediate values in one case, and randomly initialize values in the other. Which confuses ir3test. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-05 16:36:34 -04:00
Rob Clark	6bc12bb5fd	freedreno/ir3/cmdline: remove hack for old compiler Since we dropped the old compiler, we don't need this hack anymore. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-05 16:36:34 -04:00

13958 Commits