mirrors/mesa - Frog Git

Commit Graph

Author	SHA1	Message	Date
Kenneth Graunke	5b682143da	nir: Make nir_lower_clip_vs optionally work with variables. The way nir_lower_clip_vs() works with store_output intrinsics makes a ton of assumptions about the driver_location field. In i965 and iris, I'd rather do this lowering early and work with variables. v3d may want to switch to that as well, and ir3 could too, but I'm not sure exactly what would need updating. For now, handle both methods. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-19 14:33:16 -08:00
Eric Anholt	538bca78e2	v3d: Don't try to set PF flags on a LDTMU operation We need an ALU op in order to set PF. Fixes a recent assertion failure in dEQP-GLES3.functional.ubo.single_basic_type.shared.bool_vertex	2018-11-15 11:12:54 -08:00
Eric Anholt	4e1b163eed	v3d: Update the TLB config for depth writes on V3D 4.2. Fixes 311 piglit cases on the simulator.	2018-11-01 13:56:30 -07:00
Emil Velikov	986033a275	configure: allow building with python3 Pretty much all of the scripts are python2+3 compatible. Check and allow using python3, while adjusting the PYTHON2 refs. Note: - python3.4 is used as it's the earliest supported version - python2 chosen prior to python3 v2: use python2 by default Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-31 19:15:50 +00:00
Eric Anholt	cc54e1acf9	v3d: Use nir_remove_unused_io_vars to handle binner shader output DCE We were doing this late after nir_lower_io, but we can just reuse the core code. By doing it at this stage, we won't even set up the VS attributes as inputs, reducing our VPM size.	2018-10-30 10:46:52 -07:00
Eric Anholt	c152c79d5e	v3d: Only add output slot tracking for the current varying slot. We always emit 4 slots per slot because things like color output and position processing in the epilogue will potentially look up more values than the variable declaration had. However, when we get a .location_frac != 0, we don't want to overwrite components of the following .driver_location.	2018-10-30 10:46:52 -07:00
Eric Anholt	17c8198952	v3d: Use nir_lower_io_to_scalar_early to DCE unused VS input components. This lets us trim unused trailing components in the vertex attributes, reducing the size of our VPM allocations.	2018-10-30 10:46:52 -07:00
Eric Anholt	fc85f7cfdc	v3d: Don't rely on sorting input vars for VPM read setup. For supporting scalar VPM i/o at the NIR level, we need to do a pass over the vars to figure out how big each attribute is after DCE. Once we've done that, we can just walk over c->vattr_sizes[] instead of bothering with vars.	2018-10-30 10:46:52 -07:00
Eric Anholt	cc78676030	v3d: Split out NIR input setup between FS and VPM. They don't share much code, and I'm about to rewrite the remaining shared code for the VPM case.	2018-10-30 10:46:52 -07:00
Eric Engestrom	bb84fa146f	util: use C99 declaration in the for-loop hash_table_foreach() macro Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-10-25 12:43:18 +01:00
Eric Anholt	8ec83dc51e	v3d: Add support for hardware pack/unpack of half floats. Cuts the formerly 7-minute simulation time of fs-packHalf2x16.shader_test in half.	2018-10-15 17:16:44 -07:00
Mauro Rossi	cc3b99bb48	android: broadcom/cle: export the broadcom top level path headers Fixes the following building error in vc4 build: In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_render_cl.c:34: In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_drv.h:27: In file included from external/mesa/src/gallium/drivers/vc4/vc4_simulator_validate.h:34: In file included from external/mesa/src/gallium/drivers/vc4/vc4_context.h:39: In file included from external/mesa/src/gallium/drivers/vc4/vc4_cl.h:56: gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h:12:10: fatal error: 'cle/v3d_packet_helpers.h' file not found ^~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `5b102160ae` ("broadcom/genxml: Introduce a V3D packet/struct decoder.") Cc: "18.2" <mesa-stable@lists.freedesktop.org> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2018-09-15 09:14:46 +02:00
Mauro Rossi	9158e0bd82	android: broadcom/cle: add gallium include path Fixes the following building error: In file included from external/mesa/src/broadcom/cle/v3d_decoder.c:38: In file included from external/mesa/src/broadcom/cle/v3d_packet_helpers.h:29: external/mesa/src/gallium/auxiliary/util/u_math.h:42:10: fatal error: 'pipe/p_compiler.h' file not found ^~~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `5b102160ae` ("broadcom/genxml: Introduce a V3D packet/struct decoder.") Cc: "18.2" <mesa-stable@lists.freedesktop.org> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2018-09-15 09:14:42 +02:00
Mauro Rossi	3341429d74	android: broadcom/genxml: fix collision with intel/genxml header-gen macro Fixes the following building error, happening when building both intel and broadcom: Gen Header: libmesa_broadcom_genxml_32 <= v3d_packet_v21_pack.h FAILED: gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h /bin/bash -c "python external/mesa/src/broadcom/cle/gen_pack_header.py \ external/mesa/src/broadcom/cle/v3d_packet_v21.xml \ > gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h" Traceback (most recent call last): File "external/mesa/src/broadcom/cle/gen_pack_header.py", line 626, in <module> p = Parser(sys.argv[2]) IndexError: list index out of range The header-gen macro is already defined by the Intel genxml building rules, and the existing header-gen does not have the $(PRIVATE_VER) argument; in fact, the bash command line logged in the building error is missing exactly the $(PRIVATE_VER) argument. Renaming the macro to pack-header-gen in src/broadcom/Android.genxml.mk solves the building error; another possible way is to keep the gen rules commands expanded and not use the macros. Fixes: `7f80a9ff13` ("vc4: Introduce XML-based packet header generation like Intel's.") Cc: "18.2" <mesa-stable@lists.freedesktop.org> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2018-09-15 09:14:33 +02:00
Dylan Baker	80825abb5d	move u_math to src/util Currently we have two sets of functions for bit counts, one in gallium and one in core mesa. The ones in core mesa are header only in many cases, since they reduce to "#define _mesa_bitcount popcount", but they provide a fallback implementation. This is important because 32bit msvc doesn't have popcountll, just popcount; so when nir (for example) includes the core mesa header it doesn't (and shouldn't) link with core mesa. To fix this we'll promote the version out of gallium util, then replace the core mesa uses with the util version, since nir (and other non-core mesa users) can and do link with mesautils. Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-09-07 10:21:26 -07:00
Eric Anholt	a91b158bd9	v3d: Fix setup of the VCM cache size. There were two bugs working together to make things mostly work: I wasn't dividing the VPM output size available by the size of a batch (vertex), but I also had the size of the VPM reduced by a factor of 8. Fixes dEQP-GLES3.functional.vertex_array_objects.all_attributes and it seems also my intermittent varying failures. Fixes: `1561e4984e` ("v3d: Emit the VCM_CACHE_SIZE packet.")	2018-09-07 08:11:38 -07:00
Emil Velikov	cff80b6c15	Revert "configure: allow building with python3" This reverts commit `ae7898dfdb`. Turns out the python scripts are _not_ fully python 3 compatible. As Ilia reported using get_xmlpool.py with LANG=C produces some weird output - see the link for details. Even though the issue was spotted with the autoconf build, it exposes a genuine problem with the script (and lack of lang handling of the meson build.) https://lists.freedesktop.org/archives/mesa-dev/2018-August/203508.html	2018-08-24 11:14:15 +01:00
Emil Velikov	ae7898dfdb	configure: allow building with python3 Pretty much all of the scripts are python2+3 compatible. Check and allow using python3, while adjusting the PYTHON2 refs. Note: - python3.4 is used as it's the earliest supported version - python3 chosen prior to python2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-23 17:00:13 +01:00
Mathieu Bridon	2ee1c86d71	meson: Build with Python 3 Now that all the build scripts are compatible with both Python 2 and 3, we can flip the switch and tell Meson to use the latter. Since Meson already depends on Python 3 anyway, this means we don't need two different Python stacks to build Mesa. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-10 15:15:09 -07:00
Eric Anholt	1561e4984e	v3d: Emit the VCM_CACHE_SIZE packet. This is needed to ensure that we don't get blocked waiting for VPM space with bin/render overlapping. Cc: "18.2" <mesa-stable@lists.freedesktop.org>	2018-08-06 13:03:23 -07:00
Eric Anholt	50a8713d4f	v3d: Avoid spilling that breaks the r5 usage after a ldvary. Fixes bad rendering when forcing 2 spills in glxgears. Cc: "18.2" <mesa-stable@lists.freedesktop.org>	2018-08-06 13:03:23 -07:00
Eric Anholt	f2c0d310d6	v3d: Make sure that QPU instruction-has-a-dest matches VIR. Found when debugging register spilling -- we would try to spill the dest of a STVPMV, inserting spill code after entering the last segment. In fact, we were likely to choose to do this, given that the STVPMV "dest" temp was never read from, making it cheap to spill. Cc: "18.2" <mesa-stable@lists.freedesktop.org>	2018-08-06 13:03:23 -07:00
Eric Anholt	3f9cb2eb05	v3d: Wait for TMU writes to complete before continuing after a spill. The simulator complained that we had write responses outstanding at shader end. It seems that a TMU read does not guarantee that previous TMU writes by the thread have completed, which surprised me. Cc: "18.2" <mesa-stable@lists.freedesktop.org>	2018-08-06 13:03:23 -07:00
Eric Anholt	ccbe33af5b	v3d: Make sure we don't emit a thrsw before the last one finished. Found while forcing some spilling, which creates a lot of short tmua->thrsw->ldtmu sequences. Cc: "18.2" <mesa-stable@lists.freedesktop.org>	2018-08-06 13:03:23 -07:00
Eric Anholt	f9d54dc3cf	v3d: Add some debug code for forcing register spilling. This is useful for periodically testing out register spilling to see how it goes on simple shaders, rather than only failing on insanely complicated ones.	2018-08-06 13:03:23 -07:00
Eric Anholt	c2eab33b08	v3d: Actually put the "%s" in the snprintf. I missed an important part when porting the change over, fixing my compiler warning but breaking -Werror=format-security. Fixes: `e6ff5ac446` ("v3d: use snprintf(..., "%s", ...) instead of strncpy") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107443	2018-08-01 11:39:19 -07:00
Eric Anholt	e6ff5ac446	v3d: use snprintf(..., "%s", ...) instead of strncpy Fixes a compiler warning about terminator NUL, based on `f836d799f9` ("intel/decoder: use snprintf(..., "%s", ...) instead of strncpy")	2018-07-31 16:42:11 -07:00
Eric Anholt	3471ce9985	v3d: Add support for the TMUWT instruction. This instruction is used to ensure that TMU stores have been processed before moving on. In particular, you need any TMU ops to be done by the time the shader ends.	2018-07-31 16:05:04 -07:00
Eric Anholt	d934492ff9	v3d: Dump the contents of all the buffers in CLIF mode. A V3D_DEBUG=clif file from a non-texturing .shader_test can now be successfully run through the CLIF runner in the simulator. Now I need to build an open source CLIF runner against the v3d DRM module.	2018-07-30 14:29:01 -07:00
Eric Anholt	99a5ac250b	v3d: Split walking the CLs to generate relocs from walking CLs to dump. We need to dump each buffer's contents in order for a CLIF file, so we need to collect all of the relocs into a buffer (such as the indirect CL full of both uniforms and GL shader states) before we start dumping.	2018-07-30 14:29:01 -07:00
Eric Anholt	2df6f1a3df	v3d: Include commands to run the BCL and RCL in CLIF dumps.	2018-07-30 14:29:01 -07:00
Eric Anholt	c6449e33e3	v3d: Use a short, underscored name for packets in CLIF/CL dumping. These will match the names that the CLIF parser expects to see. I may in the future decide to change more of the other names so that I match the names the HW/closed SW team uses for their packets, rather than the names in the spec (which only they and I can read anyway).	2018-07-30 14:29:01 -07:00
Eric Anholt	b56f8c475e	v3d: Rename "configuration" and "config" in the XML to "cfg" This matches what CLIF parsing expects, and makes TILE_BINNING_MODE_CONFIGURATION_COMMON_CONFIGURATION into a much more legible TILE_BINNING_MODE_CFG_COMMON.	2018-07-30 14:29:01 -07:00
Eric Anholt	300e609feb	v3d: s/colour/color in the XML. The CLIF format expects American English spelling, and so does the rest of Mesa. I was previously adhering to the spec's spelling, which is counterproductive.	2018-07-30 14:29:01 -07:00
Eric Anholt	3a8550ad06	v3d: Rename primitives to prims in the XML to match CLIF names. This makes us match up with the V3D HW team's names a bit more.	2018-07-30 14:29:01 -07:00
Eric Anholt	6237c64049	v3d: Print CLIF fixed-point values as just their decimal value. The parser doesn't handle float input, so we have to dump the raw value.	2018-07-30 14:29:01 -07:00
Eric Anholt	8da47b7648	v3d: When not doing terminal pretty-printing, comment struct field names. The struct field names aren't part of the CLIF ABI, just the order of fields within the struct. The comments are there for human readability.	2018-07-30 14:29:01 -07:00
Eric Anholt	103f21b13d	v3d: Add a separate flag for CLIF ABI output versus human-readable CLs. A few of the upcoming changes would make the V3D_DEBUG=cl output less readable, so let's make proper CLIF file production be under a separate V3D_DEBUG=clif flag.	2018-07-30 14:29:01 -07:00
Eric Anholt	89ac6fa403	v3d: Add pack header support for f187 values. V3D only has one of these (the top 16 bits of a float32) left in its CLs, but VC4 had many more. This gets us proper pretty-printing of the values instead of a large uint.	2018-07-30 14:29:01 -07:00
Eric Anholt	27f1bfe471	vc4: Fix meson build when enabled without v3d. Reported-by: Rob Clark <robdclark@gmail.com> Fixes: `e92959c4e0` ("v3d: Pass the whole clif_dump structure to v3d_print_group().")	2018-07-29 19:13:29 -07:00
Eric Anholt	942456f646	v3d: Skip printing sub-id or pad fields in CLIF dumping. The parser doesn't expect them, so our fields would end up mismatched. They're not really useful in console output, either.	2018-07-27 18:00:48 -07:00
Eric Anholt	3ee0ab599e	v3d: Emit commands to switch CLIF parser to CL/shader/attr input mode. By default after saying you are emitting a buffer, it'll expect a buffer size. Once you set a format, it'll keep parsing that format until you announce something else.	2018-07-27 18:00:46 -07:00
Eric Anholt	a57770aa37	v3d: Dump fields in CLIF output in increasing offset order. Previously, we emitted in XML order, which I happen to type in the decreasing offset order of the specifications. However, the CLIF parser wants increasing offsets.	2018-07-27 17:56:55 -07:00
Eric Anholt	95bafeeabf	v3d: Print addresses in CLIFs as references to buffers. With CLIFs, the parser will choose an address for the buffer being created, so we need to use effectively relocations to buffers instead of the addresses that the driver uses. This is also a whole lot more intelligible for console output than raw addresses!	2018-07-27 17:56:36 -07:00
Eric Anholt	3c02838d29	v3d: Stop doing pretty-printed colorful booleans in CLIF output. The parser wants to see a 1 or 0. We can put "true" and "false" in a comment to clarify that it's a boolean and the parser will skip it.	2018-07-27 17:55:57 -07:00
Eric Anholt	422910d2e7	v3d: Move clif dumping to a separate step from noting where the CLs are. Now all the printing happens from the same worklist processing.	2018-07-27 17:08:35 -07:00
Eric Anholt	01b4952773	v3d: Move clif dump BO lookup into the clif dumper. The clif dumper is going to need information about all of our BOs if we're going to dump them for replay purposes.	2018-07-27 17:08:35 -07:00
Eric Anholt	e92959c4e0	v3d: Pass the whole clif_dump structure to v3d_print_group(). To generate CLIF files that the v3dv3 simulator can parse, we're going to need to decode addresses, and for that we'll need the vaddr lookup function from the clif structure from within v3d_decoder.	2018-07-27 17:08:35 -07:00
Eric Anholt	9bf9a6d6a1	v3d: Drop the VG support from the XML. This reflects a change on the HW/closed SW side to drop this unused HW. With it dropped on their side, the CLIF parser no longer expects to find VG fields.	2018-07-27 12:56:36 -07:00
Eric Anholt	5a1cc3861c	v3d: Use /* */ instead of () for enum names in CLIF output. This lets the comments be ignored by the CLIF parser.	2018-07-27 12:56:36 -07:00
Eric Anholt	95a0f99825	v3d: CLIF-dump the "Vec size" field as 0 == maximum value. That's what a user should want to see, and what the CLIF parser wants. This should maybe be generalized.	2018-07-27 12:56:36 -07:00
Eric Anholt	d934d3206e	nir: Add flipping of gl_PointCoord.y in nir_lower_wpos_ytransform. This is controlled by a new nir_shader_compiler_options flag, and fixes dEQP-GLES3.functional.shaders.builtin_variable.pointcoord on V3D. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-26 11:00:34 -07:00
Mathieu Bridon	9ebd8372b9	python: Use range() instead of xrange() Python 2 has a range() function which returns a list, and an xrange() one which returns an iterator. Python 3 lost the function returning a list, and renamed the function returning an iterator as range(). As a result, using range() makes the scripts compatible with both Python versions 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-24 11:07:04 -07:00
Eric Anholt	6b73a97f84	v3d: Implement a small immediates optimization, based on VC4's. We can do one per instruction, and we have to be careful not to overwrite raddr_b, but this greatly reduces the pressure on uniform loads (particularly around ldvpm/stvpm instructions). total instructions in shared programs: 90768 -> 88220 (-2.81%) instructions in affected programs: 82711 -> 80163 (-3.08%)	2018-07-23 10:21:43 -07:00
Eric Anholt	79e0f042bc	v3d: Return an invalid src number if asked for a missing implicit uniform. Sometimes when iterating over sources, we might want to check if it's the implicit one. We wouldn't want to match on a non-implicit src using this function.	2018-07-23 10:21:43 -07:00
Eric Anholt	f2ea936f48	v3d: Skip emitting texture config parameter 2 if it's just the defaults. shader-db: total instructions in shared programs: 91275 -> 90768 (-0.56%) instructions in affected programs: 20702 -> 20195 (-2.45%)	2018-07-23 10:21:43 -07:00
Eric Anholt	421e99d777	v3d: Update an XXX comment for a path we handled in HW on V3D 4.x.	2018-07-23 10:21:43 -07:00
Eric Anholt	e7ae900341	v3d: Switch to using the new SFU instructions on V3D 4.x. These instructions let us write directly to the phys regfile, instead of just R4. That lets us avoid moving out of R4 to avoid conflicting with other SFU results, and to avoid conflicting with thread switches. There is still an extra instruction of latency, which is not represented in the scheduler at the moment. If you use the result before it's ready, the QPU will just stall, unlike the magic R4 mode where you'd read the previous value. That means that the following shader-db results aren't quite representative (since we now cause some stalls instead of emitting nops), but they're impressive enough that I'm happy with the change. total instructions in shared programs: 95669 -> 91275 (-4.59%) instructions in affected programs: 82590 -> 78196 (-5.32%)	2018-07-23 10:21:43 -07:00
Eric Anholt	58c1d3860f	v3d: Add QPU pack/unpack for the new SFU instructions. These instructions allow writing the result to any register, instead of a special writeback to r4.	2018-07-23 10:21:43 -07:00
Eric Anholt	cdfa99657d	v3d: Fix the name of the "flpop" operation. Noticed while trying to sort a new op into the appropriate place to match the documentation.	2018-07-23 10:21:43 -07:00
Eric Anholt	91e24e5718	v3d: Print the instruction we're testing in the QPU disasm/pack round-trip. If we fail initial disassembly, it's good to know what instruction it was that failed.	2018-07-23 10:21:42 -07:00
Eric Anholt	a1beb333d8	v3d: Drop unused vir_SAT() operation. We lower saturates in NIR.	2018-07-23 10:21:42 -07:00
Eric Anholt	8dfc6ee317	v3d: Rotate through registers to improve post-RA scheduling options. Similarly to VC4's implementation, by not picking r0 immediately upon freeing it, we give the scheduler more of a chance to fit later writes in earlier. I'm not clear on whether there's any real cost to picking phys over accumulators, so keep that behavior for now. shader-db: total instructions in shared programs: 96831 -> 95669 (-1.20%) instructions in affected programs: 77254 -> 76092 (-1.50%)	2018-07-23 10:21:42 -07:00
Eric Anholt	1fb31819ae	v3d: Allow reading from physical regs written in the previous instruction. This restriction existed in V3D 2.x, but lifting it was a major change in 3.x. shader-db results: total instructions in shared programs: 98117 -> 96831 (-1.31%) instructions in affected programs: 48520 -> 47234 (-2.65%)	2018-07-23 10:21:23 -07:00
Eric Anholt	229836fb37	v3d: Disable shader-db cycle estimates until we sort out TMU estimates. I keep having to ignore these shader-db changes since I don't trust them, so just disable the reports entirely.	2018-07-16 14:39:59 -07:00
Eric Anholt	2baab6bf2a	v3d: Emit the lowered uniform just before its first use in a block. total instructions in shared programs: 98578 -> 98119 (-0.47%) instructions in affected programs: 27571 -> 27112 (-1.66%) and it also eliminates most spills/fills on the CTS's randomized uniform usage testcases.	2018-07-16 14:39:59 -07:00
Eric Anholt	26f830d9fc	v3d: Add an assert that we don't provide invalid texture return words. The docs had an update noting this restriction, so reflect it in the code.	2018-07-16 14:39:59 -07:00
Eric Anholt	d661d78464	v3d: Apply GFXH-1625 restriction on TMUWT in the end of the shader. This doesn't affect us yet since we're not doing TMUWTs, but I think we will for GLES 3.1.	2018-07-16 14:39:59 -07:00
Eric Anholt	beeb94402f	v3d: Implement noperspective varyings on V3D 4.x. Fixes a bunch of piglit interpolation tests, and reduces my concern about some MSAA blit shaders with noperspective varyings.	2018-07-09 11:48:32 -07:00
Eric Anholt	93f437d128	v3d: Fix typo in dither mode offset. We weren't using the field yet, so it didn't affect anything. Fixes: `c0476d964a` ("v3d: Express dithering mode in the same way that the CLIF parser does.")	2018-07-09 11:48:32 -07:00
Eric Anholt	5601ab3981	v3d: Add support for GL_SAMPLE_ALPHA_TO_ONE. Fixes piglit ext_framebuffer_multisample-draw-buffers-alpha-to-one	2018-07-05 12:39:36 -07:00
Eric Anholt	7b63371420	v3d: Respect swap_color_rb for the f32_color_rb case. We don't actually set the two flags together, but I want to use the r/g/b/a reordered fields in the next commit.	2018-07-05 12:39:36 -07:00
Eric Anholt	49f7631c9f	v3d: Emit a TF flush after each draw using TF. This fixes GPU hangs on 7278 in transform feedback tests such as GTF-GLES3.gtf.GL3Tests.transform_feedback2.transform_feedback2_basic	2018-07-02 10:05:14 -07:00
Eric Anholt	a77cb724da	v3d: Move GL shader state dumping out of per-version compilation. It doesn't depend on V3D_VER, since it's just calling v3d_print_group.	2018-06-29 13:36:28 -07:00
Eric Anholt	c2901ff80f	v3d: Add missing Stream field to transform feedback specs on V3D 4.1. Noticed when trying to CLIF parse a transform feedback job that hangs on HW.	2018-06-29 13:36:28 -07:00
Eric Anholt	69efc1e025	v3d: Add missing "tri trip or fan" flag in Primitive List Format.	2018-06-29 13:36:28 -07:00
Eric Anholt	b341b39db3	v3d: Fix the shader code address field widths on V3D 4.1+ We were overlapping it with the threadable/nan flags, resulting in incorrect relocations (threadable/nan included in the offset) and wrong ordering in the CLIF files.	2018-06-29 13:36:28 -07:00
Eric Anholt	6c3c11ba19	v3d: Add missing "no prim pack" field to the V3D4.1+ GL shader state. It looks like we don't need this flag for anything (not that I'm clear on what it does), but it makes our struct dumping line up with CLIF parsing.	2018-06-29 13:36:28 -07:00
Eric Anholt	c0476d964a	v3d: Express dithering mode in the same way that the CLIF parser does.	2018-06-29 13:36:28 -07:00
Eric Anholt	24d2f1347d	v3d: Add missing "number of bin tile lists" field. Noticed when trying to feed our dumps through the CLIF parser. Since this is a "minus one" field, we were already filling in the value we wanted (0).	2018-06-29 13:36:28 -07:00
Eric Anholt	b65b61cefe	v3d: Rewrite the color write masks to match CLIF format. The render_target_* fields gave us pretty(ish) printing, but meant we were incompatible with CLIF, and had much more verbose code generating them.	2018-06-29 13:36:28 -07:00
Eric Anholt	38172dcba9	v3d: Merge the V3D 4.1 and 4.2 XML into V3D 3.3's XML. The XML ends up noisier if you're only looking at one version, but from the diffstat there are obvious wins in terms of deduplication. This will get even more significant if we ever support 3.2 or 4.0.	2018-06-29 13:36:28 -07:00
Eric Anholt	725561c0b6	v3d: Switch v3d_decoder.c to the XML's top min_ver/max_ver fields. The XML zipper wants one XML per version for filling out its tables, but we want to do more than one GPU version per XML now. Assume that the "gen" field will be the same as min_ver and look up our XML text assuming that they're listed in increasing min_ver.	2018-06-29 13:36:28 -07:00
Eric Anholt	f8af5c58c3	v3d: Create XML fields for min_ver and max_ver of a packet/struct/enum. This will be used to merge together the V3D 3.3-4.1 XML with the variants disabled based on the version.	2018-06-29 13:36:28 -07:00
Eric Anholt	6f7ad7ed11	v3d: Pass the version being generated to the pack generator script. It turns out that most V3D versions change very few packets, so keeping separate copies of the XML per version makes changing the XML a pain as you have to replicate your changes to each one. This is the start of changing it so that one XML can generate headers for multiple versions.	2018-06-29 13:36:28 -07:00
Eric Anholt	9f80bcc2bc	v3d: Convert a bunch of our "minus one" fields over to the new XML attr. This fixes up their formatting for CLIF files and makes the code more legible.	2018-06-27 09:13:48 -07:00
Eric Anholt	18b1bb0b63	v3d: Add pack/unpack/decode support for fields with a "- 1" modifier. Right now, we name these fields as "field name minus one" so that your C code obviously states what the value should be. However, it's easy enough to handle at the codegen level with another little XML attribute, meaning less C code and easier-to-read values in CLIF dumping and gdb as well. (The actual CLIF format for simulator and FPGA replay takes in pre-minus-one values, so we need it there too).	2018-06-27 09:13:48 -07:00
Eric Anholt	ee9a6a13fb	v3d, vc4: Disable valgrind checking of CLE inputs when NDEBUG is set. For a meson -Db_ndebug=true release build on x86_64, reduces text size of libv3d.a from 53.0k to 51.6k. Inspired by `0d5329d626` ("anv: Disable __gen_validate_value if NDEBUG is set.")	2018-06-21 15:46:40 -07:00
Eric Anholt	f49d112a01	v3d: Implement ALPHA_TO_COVERAGE. There's a convenient "FTOC" instruction for generating the coverage now, unlike vc4. This fixes dEQP-GLES3.functional.multisample.fbo_4_samples.proportionality_alpha_to_coverage	2018-06-20 09:30:46 -07:00
Eric Anholt	07b243674f	v3d: Add missing always_flush debug flag. The #define existed and was checked in the driver.	2018-06-19 09:42:20 -07:00
Eric Anholt	778594ae12	v3d: Limit shader threading according to our maximum TMU fifo usage. Fixes simulator assertion failures in dEQP-GLES3.functional.shaders.texture_functions.texture.samplercubeshadow_bias_fragment and similar complicated cases.	2018-06-15 16:09:39 -07:00
Eric Anholt	e130ada243	v3d: Fix shaders using pixel center W but no varyings. The docs called this field "uses both center W and centroid W", but actually it's "do you need center W even if varyings don't obviously call for it?" Fixes dEQP-GLES3.functional.shaders.builtin_variable.fragcoord_w	2018-06-15 16:09:39 -07:00
Eric Anholt	d91e06a065	v3d: Fix configuration setup of mixed f32 and f16 render targets. Fixes dEQP-GLES3.functional.fragment_out.random.26 and 6 others.	2018-06-14 16:52:25 -07:00
Eric Anholt	48011c42aa	v3d: Remove unused QUNIFORM_STENCIL left over from vc4.	2018-06-14 16:52:25 -07:00
Eric Anholt	a40bc33b11	v3d: Fix undefined results for a swap_color_rb RT from a float shader output. Fixes segfaults and undefined behavior in dEQP-GLES3.functional.fragment_out.basic.fixed.srgb8_alpha8_lowp_float	2018-06-14 16:52:25 -07:00
Eric Anholt	9d5860310d	v3d: Enable the new NIR bitfield operation lowering paths. These together get the GLSL 3.00 unorm/snorm pack functions and MESA_shader_integer operations working. v2: Fix commit message typo. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
Eric Anholt	2b1b2cbf61	v3d: Be more explicit about include directory from our generated code. You'd need src/broadcom/cle/ in the -I previously, for srcdir != builddir. nir was fine at that, but automake didn't have it. Bugzilla: https://github.com/anholt/mesa/issues/104	2018-06-05 12:44:49 -07:00
Eric Anholt	97894b1267	v3d: Add support for glSampleMask / glSampleCoverage.	2018-05-17 15:09:46 +01:00
Eric Anholt	9bbc3f8cf1	v3d: Enable NaN propagation in the VS and CS as well. Fixes piglit vs-isnan-*.shader_test at the expense of gl-1.0-spot-light.	2018-05-17 15:09:12 +01:00
Eric Anholt	8c47ebbd23	v3d: Rename the driver files from "vc5" to "v3d".	2018-05-16 21:19:07 +01:00
Eric Anholt	c4c488a2ae	v3d: Rename the vc5_dri.so driver to v3d_dri.so. This allows the driver to load against the merged kernel DRM driver. In the process, rename most of the build system variables and gallium plumbing functions.	2018-05-16 21:19:07 +01:00
jenny.q.cao	ff7521c9ba	android: change include "cutils/log.h" to "log/log.h" on Android API >=26 Since Android 8 (API version 26), including "cutils/log.h" produces a compile warning: "Deprecated: don't include cutils/log.h, use either android/log.h or log/log.h" -W#warnings. Change to include "log/log.h" on Android 8 or later major versions to avoid this warning. Signed-off-by: jenny.q.cao <jenny.q.cao@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-05-14 08:08:31 +03:00
Eric Anholt	76ee9edcb4	broadcom/vc5: Add support for centroid varyings. It would be nice to share the flags packet emit logic with flat shade flags, but I couldn't come up with a good way while still using our pack macros. We need to refactor this to shader record setup at compile time, anyway. Fixes ext_framebuffer_multisample-interpolation * centroid-*	2018-04-26 11:30:22 -07:00
Eric Anholt	77b4f30bae	broadcom/vc5: Add validation that we don't violate GFXH-1633 requirements. We don't use ldunifa yet, but we will eventually for UBOs.	2018-04-26 11:30:22 -07:00
Eric Anholt	089c32eefd	broadcom/vc5: Add validation that we don't violate GFXH-1625 requirements. We don't use TMUWT yet, but we will once we do SSBOs.	2018-04-26 11:30:22 -07:00
Eric Anholt	dc4cb04ee5	broadcom/vc5: Add QPU validation for register writes after thrend. The next shader gets to start writing the register file during these slots, so make sure we don't stomp over them. The only case of hitting this that I could imagine would be dead writes.	2018-04-26 11:30:22 -07:00
Eric Anholt	503716fa86	broadcom/vc5: Remove leftover vc4 MSAA lowering setup in the FS key.	2018-04-25 09:21:54 -07:00
Eric Anholt	5710532e9e	broadcom/vc5: Fix tile load/store of MSAA surfaces on 4.x. For single-sample we have to always program SAMPLE_0, but for multisample we want to store all the samples.	2018-04-25 09:21:54 -07:00
Ian Romanick	d76c204d05	util: Move util_is_power_of_two to bitscan.h and rename to util_is_power_of_two_or_zero The new name make the zero-input behavior more obvious. The next patch adds a new function with different zero-input behavior. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-03-29 14:09:23 -07:00
Aaron Watry	1dae92f150	broadcom/vc4: Fix out-of-tree build with automake. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-28 17:48:41 -07:00
Eric Anholt	81f82ecc56	broadcom/vc5: Start using nir_opt_move_load_ubo(). In the absence of a general NIR or VIR-level scheduler, this at least avoids spilling in GTF-GLES3.gtf.GL3Tests.uniform_buffer_object.uniform_buffer_object_storage_layouts	2018-03-28 17:48:41 -07:00
Eric Anholt	c2b13627d9	broadcom/vc5: Fix extraneous register index in QIR dumping of TLBU writes. Just like TLB without a config uniform, we don't have a register index.	2018-03-26 17:46:23 -07:00
Eric Anholt	d7a015cbc6	broadcom/vc5: Account for InstanceID/VertexID in VPM segment size. Fixes failure in GTF-GLES3.gtf.GL3Tests.draw_instanced.draw_instanced_attrib_size	2018-03-22 15:12:21 -07:00
Eric Anholt	ba29b89dc7	broadcom/vc5: Set up a vertex position if the shader doesn't. Our backend needs some sort of vertex position value to emit the scaled viewport values and such. Fixes potential segfaults in KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx	2018-03-22 15:12:21 -07:00
Eric Anholt	baeb6a4b4a	broadcom/vc5: Fix up the NIR types of FS outputs generated by NIR-to-TGSI. Unfortunately TGSI doesn't record the type of the FS output like GLSL does, but VC5's TLB writes depend on the output's base type. Just record the type in the key at variant compile time when we've got a TGSI input and then fix it up. Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba32i/ui and apparently a GPU hang that breaks most tests that come after it.	2018-03-21 14:02:34 -07:00
Eric Anholt	00910e3057	broadcom/vc5: Don't annotate dumps with stale live intervals. As you're debugging register allocation, you may have changed the intervals and not recomputed yet. Just skip the dump in that case.	2018-03-19 16:44:20 -07:00
Eric Anholt	facc3c6f58	broadcom/vc5: Add support for register spilling. Our register spilling support is nice to have since vc4 couldn't at all, but we're still very restricted due to needing to not spill during a TMU operation, or during the last segment of the program (which would be nice to spill a value of, when there's a long-lived value being passed through with little modification from the start to the end). We could do better by emitting unspills for the last-segment values just before the last thrsw, since the last segment is probably not the maximum interference area. Fixes GTF uniform_buffer_object_arrays_of_all_valid_basic_types and 3 others.	2018-03-19 16:44:06 -07:00
Eric Anholt	271fc58ba1	broadcom/vc5: Remove redundant last_inst lookup. The point was to get the MOV, which the MOV_dest already returned.	2018-03-19 16:42:59 -07:00
Eric Anholt	34dc64f627	broadcom/vc5: On QPU pack error, dump the instruction and return cleanly. This is nice for debugging when you've made a bad instruction.	2018-03-19 16:42:59 -07:00
Eric Anholt	d721348dcd	broadcom/vc5: Add cursors to the compiler infrastructure, like NIR's. This will let me do lowering late in compilation using the same instruction builder as we use in nir_to_vir.	2018-03-19 16:42:59 -07:00
Eric Anholt	c81d681742	broadcom/vc5: Move the umul macro to a header. Anywhere we want to multiply, we probably want this.	2018-03-19 16:42:59 -07:00
Eric Anholt	9e28c18cd1	broadcom/vc5: Correct the arg count of TIDX/EIDX.	2018-03-19 16:42:59 -07:00
Eric Anholt	55bf298333	broadcom/vc5: Re-do live variables after removing thrsws. Otherwise our start/ends ips won't line up with the actual instructions.	2018-03-19 16:42:59 -07:00
Eric Anholt	c3a504f470	broadcom/vc5: Add a QPU helper for instructions using the TLB. This will be used for detecting last thread segment in register spilling.	2018-03-19 16:42:59 -07:00
Eric Anholt	09c4dd1971	broadcom/vc5: Introduce v3d_qpu_reads_vpm()/v3d_qpu_writes_vpm(). These helpers will be used in register spilling to determine where to add a last thrsw if needed, and might help refactor QPU scheduling.	2018-03-19 16:42:59 -07:00
Eric Anholt	407f21ef1b	broadcom/vc5: The ldvpm signal also a case of using the VPM. The QPU scheduling code calling this function already separately checked this signal.	2018-03-19 16:42:59 -07:00
Eric Anholt	4760040c09	broadcom/vc5: Extract v3d_qpu_writes_tmu() helper. This will be reused in register spilling.	2018-03-19 16:42:59 -07:00
Timothy Arceri	a050ea60ee	nir: add lower_ldexp to nir compiler options Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Eric Anholt	e29988c908	broadcom/vc5: Fix "hardwrae" typo in a field name in XML.	2018-02-05 13:53:38 +00:00
Eric Anholt	8bb000f460	broadcom/vc5: Try to merge more than 2 QPU instructions together. Obviously it would be good to have an ADD and a MUL and a signal together, but we can even potentially have multiple signals merged, as well. total instructions in shared programs: 100423 -> 97874 (-2.54%) instructions in affected programs: 78812 -> 76263 (-3.23%)	2018-02-05 09:29:37 +00:00
Eric Anholt	dc78643ace	broadcom/vc5: Remove no-op MOVs after register allocation. We emit some MOVs to track lifetimes of payload registers, but we don't need there to be actual MOV instructions for them. total instructions in shared programs: 101045 -> 100423 (-0.62%) instructions in affected programs: 37083 -> 36461 (-1.68%)	2018-02-05 09:29:37 +00:00
Eric Anholt	f3978a7380	broadcom/vc5: Add missing shader-db instruction counting. I must have misplaced it in the instruction packing rework.	2018-02-05 09:29:37 +00:00
Eric Anholt	353b42ccc7	broadcom/vc5: Fix a segfault on mix of booleans. We don't have a src1 to look up if the compare instruction is "i2b".	2018-02-01 11:02:29 -08:00
Timothy Arceri	9a2e085680	nir: add lower_all_io_to_temps flag This will be used for freedreno and vc4 which require all inputs and outputs to be copied to temps. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:08 +11:00
Eric Anholt	71c7e9bea1	broadcom/vc5: Enable CLIF dumping of V3D 4.2.	2018-01-27 19:04:21 +11:00
Eric Anholt	91f899cbc1	broadcom/vc5: Update the compiler for V3D 4.2.	2018-01-27 19:04:21 +11:00
Eric Anholt	f2e41daac5	broadcom/vc5: Update QPU instruction pack/unpack for v4.2. After the 4.1 spec, 4.2 retroactively renamed patchid to barrierid because it's used for other barriers in compute.	2018-01-27 19:03:55 +11:00
Eric Anholt	96d3e8f134	broadcom/vc5: Add XML for V3D 4.2.	2018-01-27 18:57:58 +11:00
Eric Anholt	b026063b16	broadcom/vc5: Fix a race between XML codegen build and CLIF build.	2018-01-27 18:57:58 +11:00
Eric Anholt	de60ea4432	Android: Attempt to fix broadcom build after vc5 changes.	2018-01-27 18:03:58 +11:00
Dylan Baker	436ed65d38	autotools: include meson build files in tarball This adds the meson.build, meson_options.txt, and a few scripts that are used exclusively by the meson build. v2: - Remove accidentally included changes needed to test make dist with LLVM > 3.9 Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-19 16:30:51 -08:00
Emil Velikov	393cf04fa4	broadcom: add missing headers to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-18 11:21:35 +00:00
Eric Anholt	5bc0b63799	broadcom/vc5: Use MSF to ignore discards/non-dispatched channels in loops. Prevents potential infinite loops when a non-dispatched or discarded channel never triggers the loop break condition.	2018-01-12 21:58:24 -08:00
Eric Anholt	762dd52951	broadcom/vc5: Use XOR instead of SUB for execute flags comparisons. I think this should be equivalent other than power, and it's the kind of comparison we use for nir_op_ieq.	2018-01-12 21:58:18 -08:00
Eric Anholt	8e4cba9d92	broadcom/vc5: Also check the update flags for avoiding DCE. I was trying to do a NULL-destination UF, and it got removed.	2018-01-12 21:58:11 -08:00
Eric Anholt	aa77a9cf5a	broadcom/vc5: Rename V3D 3.x Flat Shade Action to match v4.x naming. Now that the actions are reused for centroid and nonperspective, give them a more generic name.	2018-01-12 21:57:45 -08:00
Eric Anholt	368bab43fd	broadcom/vc5: Add support for loading varyings in V3D 4.1. The LDVARY signal now writes an arbitrary register, so I took out the magic src register file and replaced it with an instruction with LDVARY set so we have somewhere to hang a QFILE_TEMP destination for register allocation.	2018-01-12 21:57:21 -08:00
Eric Anholt	5aaea3c4a0	broadcom/vc5: Add compiler support for V3D 4.x texturing.	2018-01-12 21:56:57 -08:00
Eric Anholt	028f6b327c	broadcom/vc5: Add the new TMU write addresses for V3D 4.x (and r5rep). The V3D 3.x series of TMU writes with meaning depending on the texture type is replaced with writes to specific registers for each texture argument semantic.	2018-01-12 21:56:48 -08:00
Eric Anholt	42a35da96d	broadcom/vc5: Move V3D 3.3 texturing to a separate file. V3D 4.x texturing changes enough that #ifdefs would just make a mess of it.	2018-01-12 21:56:37 -08:00
Eric Anholt	acf30e4916	broadcom/vc5: Move V3D 3.3 VPM write setup to a separate file. For V4.1 texturing, I need the V4.1 XML, so the main compiler needs to stop including V3.3 XML.	2018-01-12 21:56:24 -08:00
Eric Anholt	34898c8c45	broadcom/vc5: Add support for V3D 4.1 CLIF dumping.	2018-01-12 21:55:49 -08:00
Eric Anholt	409696b76e	broadcom/vc5: Move the body of CLIF dumping to a per-version file. I want the library's entrypoints to still be unversioned, but the actual packet dumping needs to be per-version.	2018-01-12 21:55:38 -08:00
Eric Anholt	90269ba353	broadcom/vc5: Use THRSW to enable multi-threaded shaders. This is a major performance boost on all of V3D, but is required on V3D 4.x where shaders are always either 2- or 4-threaded.	2018-01-12 21:55:30 -08:00
Eric Anholt	86a12b4d5a	broadcom/vc5: Properly schedule the thread-end THRSW. This fills in the delay slots of thread end as much as we can (other than being cautious about potential TLBZ writes). In the process, I moved the thread end THRSW instruction creation to the scheduler. Once we start emitting THRSWs in the shader, we need to schedule the thread-end one differently from other THRSWs, so having it in there makes that easy.	2018-01-12 21:55:23 -08:00
Eric Anholt	a075bb6726	broadcom/vc5: Implement GFXH-1684 workaround. Apparently the VPM writes need to be flushed out before we end the shader.	2018-01-12 21:55:15 -08:00
Eric Anholt	f50d39ab49	broadcom/vc5: Add a test for .ifb in ADD ops. I had a .ifb being decoded weird in sampid, so this is to check that .ifb is fine.	2018-01-12 21:54:57 -08:00
Eric Anholt	267f13dbee	broadcom/vc5: Add the new tesselation opcodes in V3D 4.1.	2018-01-12 21:54:50 -08:00
Eric Anholt	edbd817c30	broadcom/vc5: Use a physical-reg-only register class for LDVPM. This is needed for LDVPM on V3D 4.x, but will also be needed for keeping values out of the accumulators across THRSW.	2018-01-12 21:54:42 -08:00
Eric Anholt	22a02f3e34	broadcom/vc5: Use the new LDVPM/STVPM opcodes on V3D 4.1. Now, instead of a magic write register for VPM stores we have an instruction to do them (which means no packing of other ALU ops into it), with the ability to reorder the VPM stores due to the offset being baked into the instruction. VPM loads also gain the ability to be reordered by packing the row into the A argument. They also no longer write to the r3 accumulator, and instead must be stored to a physical register.	2018-01-12 21:54:33 -08:00
Eric Anholt	55f8a01aca	broadcom/vc5: Drop dead VC5_QPU_* defines from qpu_instr.c. I had all the packing code in this file at one point, but these defines now live in qpu_pack.c.	2018-01-12 21:54:27 -08:00
Eric Anholt	2bd378647b	broadcom/vc5: Add support for QPU pack/unpack/disasm of small immediates.	2018-01-12 21:54:18 -08:00
Eric Anholt	c81cc767e4	broadcom/vc5: Drop signal bit #defines. Signals are more complicated than that, and tables ended up being better.	2018-01-12 21:53:53 -08:00
Eric Anholt	dfee62eed3	broadcom/vc5: Add support for V3Dv4 signal bits. The WRTMUC replaces the implicit uniform loads in the first two texture instructions. LDVPM disappears in favor of an ALU op. LDVARY, LDTMU, LDTLB, and LDUNIF*RF now write to arbitrary registers, which required passing the devinfo through to a few more functions.	2018-01-12 21:53:45 -08:00
Eric Anholt	81ec2ba229	broadcom/vc5: Fix pack/unpack of vfmul input unpack flags.	2018-01-12 21:53:38 -08:00
Eric Anholt	fb4face86a	broadcom/vc5: Introduce v3dx_macros.h and v3dx_pack.h headers. This will be used by vc5 for prefixing functions and including the pack header in v3d-version-dependent code, following the model of anv.	2018-01-12 21:51:40 -08:00
Eric Anholt	7dedfd9660	broadcom/cle: Fix error path of missing a "type" in the XML. We try to emit a #error and continue so that you can debug the missing type at C compile time, but were missing a couple of definitions in that path (sigh, python).	2018-01-12 21:51:34 -08:00
Eric Anholt	3d8ad50370	broadcom/vc5: Add XML for V3D v4.1 (BCM7278)	2018-01-12 21:48:07 -08:00
Dylan Baker	2083a14179	meson: Use dependencies for nir This creates two new internal dependencies, idep_nir_headers and idep_nir. The former encapsulates the generation of nir_opcodes.h and nir_builder_opcodes.h and adding src/compiler/nir as an include path. This ensures that any target that needs nir headers will have the includes and that the generated headers will be generated before the target is build. The second, idep_nir, includes the first and additionally links to libnir. This is intended to make it easier to avoid race conditions in the build when using nir, since the number of consumers for libnir and it's headers are quite high. Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Dylan Baker	60856a7b49	meson: don't use intermediate variables that are immediately discarded For things like: loop x = func() list += x end just do: loop list += func() end Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Dylan Baker	4ccb981673	meson: Use consistent style for tests Don't use intermediate variables, use consistent whitespace. Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Dylan Baker	fbf192a67e	meson: Use consistent style Currently the meosn build has a mix of two styles: arg : [foo, ... bar], and arg : [ foo, ..., bar, ] For consistency let's pick one. I've picked the later style, which I think is more readable, and is more common in the mesa code base. v2: - fix commit message Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Eric Anholt	e60e3a56a2	broadcom/vc5: Fix discard_if during control flow. I want to do the SETMSF.IFA to discard only if execute == 0 and cond, so our dest of the PUSHZ needs to be nonzero if execute or !cond are nonzero. Fixes dEQP-GLES3.functional.shaders.discard.dynamic_loop_dynamic.	2018-01-03 14:31:36 -08:00
Eric Anholt	635131a238	broadcom/vc5: Don't emit component 3/4 F16 TLB writes for float/vec2. Fixes a simulator assertion failure on dEQP-GLES3.functional.fragment_out.array.fixed.r8_highp_float.	2018-01-03 14:31:28 -08:00
Eric Anholt	39811a2894	broadcom/vc5: Introduce enums for internal depth/type, with V3D prefixes.	2018-01-03 14:25:23 -08:00
Eric Anholt	d3e8a4b96c	broadcom/xml: Fix up safe name confusion with prefixing. For enums we were doubling the underscore if the value had a numeric first character of its name (which safe_name() adds an underscore to). A little helper function cleans up the other instance of prefixing while also fixing this.	2018-01-03 14:25:23 -08:00
Eric Anholt	48cabc1e75	broadcom/vc5: Turn the decimate mode field into an enum in the XML.	2018-01-03 14:25:23 -08:00
Eric Anholt	17cb634b1c	broadcom/vc5: Turn the output image format into an enum.	2018-01-03 14:25:23 -08:00
Eric Anholt	883a9b02c9	broadcom/vc5: Turn the CLE XML's memory format into an enum.	2018-01-03 14:25:23 -08:00
Eric Anholt	8e5a0ed953	broadcom/vc5: Emit flat shade flags for varying components > 24. This means that with no flatshading we'll emit the single-byte ZERO_ALL_FLAT_SHADE_FLAGS, and otherwise emit a set of FLAT_SHADE_FLAGS to get all the bits we need set. There's a _SET enum in the packet we could use to possibly set entire ranges of the bitfield without using another packet, but this at least fixes the conformance failure.	2018-01-03 14:25:23 -08:00
Eric Anholt	2056e4a777	broadcom/vc5: Emit proper flatshading code for glShadeModel(GL_FLAT). In updating the simulator, behavior changed slightly so that our old code wasn't getting glxgears's flatshading interpolated right. Emit flat shading code just like we would for a normal flat-shaded varying, by passing a flag in the shader key for glShadeModel(GL_FLAT) state and customizing the color inputs based on that.	2018-01-03 14:25:23 -08:00
Eric Anholt	4764699552	braodcom/vc5: Rely on OVRTMUOUT always being set. It seems that the HW team has decided that it's the only supported mode, and it's the mode I actually meant to be using but forgot. Our table of return_32_bit should have matched the default non-OVRTMUOUT behavior, so this change should be invisible. However, the change revealed that some my return_size checks for swizzling were a bit confused in the shadow case, so I had to move them to draw time once we have both the sampler and the view together. Fixes assertion failures in the updated simulator, where the non-OVRTMUOUT support has been removed.	2018-01-03 14:25:23 -08:00
Eric Anholt	ba965084b6	broadcom/vc5: Move texture return channel setup into the compiler. The compiler decides how many LDTMUs we're going to emit, and that must match the P1 flags. This brings the return channel counting to a single place (so all that's passed into the compiler is "how many return channels you may request from this texture's format), and was a necessary step for shadow samplers once we stop using OVRTMUOUT=0.	2018-01-03 14:25:23 -08:00
Eric Anholt	22ceb1f99b	broadcom/vc5: Add missing setting of the UIF XOR disable flag in textures. Most piglit textures happened to work out by RGBW not changing in that bit, but it did cause failures in RGBA16F fbo-generatemipmap-formats.	2017-12-19 15:55:14 -08:00
Eric Anholt	49e2586bfc	broadcom/vc5: Fix a typo in memcmp for sig unpack checking. This shockingly ended up working out, because only the first byte of sig is used and (sizeof(sig) != 0) == 1. Fixes a compiler warning. Link: https://bugs.freedesktop.org/show_bug.cgi?id=104183	2017-12-14 14:36:24 -08:00
Eric Anholt	1171f1749d	broadcom/vc5: Enable NIR txd lowering on all txd instructions. Fixes almost all of piglit's arb_shader_texture_lod grad tests, except for the base -texgrad/texgradcube ones which fail on what appear to be precision problems. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-14 14:36:17 -08:00
Eric Anholt	52f024b052	broadcom/vc5: Fix shader input/outputs for gallium's new NIR linking.	2017-12-14 14:36:17 -08:00
Eric Engestrom	4cba39331d	meson: add dep_thread to every lib that includes threads.h Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104141 Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-12-07 17:29:42 +00:00
Eric Anholt	fefff74b0d	broadcom/vc4: Use the new enum functionality of the XML to decode better.	2017-12-01 15:37:28 -08:00
Eric Engestrom	bb46111c01	broadcom: use NDEBUG to guard asserts Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-28 09:50:36 +00:00
Eric Anholt	6a78416dab	broadcom/vc5: Fix BASE_LEVEL handling with txl. The HW doesn't add the base level anywhere (the min/max lod clamping is what does base level), so we need to add it manually in this case. Fixes piglit tex-miplevel-selection *Lod 2D.	2017-11-22 10:56:31 -08:00
Eric Anholt	514db90448	broadcom/vc5: Fix up integer texture handling. The original spec I had didn't expose integer textures and suggested that you use unfiltered floats. Now there are proper formats for them. Fixes 16- and 32-bit texwrap integer tests in piglit, and dEQP-GLES3.functional.fbo.completeness.renderable.renderbuffer.color0.rgb10_a2ui.	2017-11-19 10:12:30 -08:00
Eric Anholt	87391e23cf	broadcom/vc5: Ensure that there is always a TLB write. This should fix some GPU hangs in our (currently always single-threaded) fragment shaders, and definitely fixes assertion failures in simulation.	2017-11-17 16:09:55 -08:00
Andreas Boll	4f29ed38f3	broadcom/vc5: Remove unused v3d_compiler.c Unused since original import of VC5. Fixes: `ade416d023` ("broadcom: Add VC5 NIR compiler.") Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-08 18:30:47 +00:00
Eric Anholt	50906e4583	broadcom/vc5: Do 16-bit unpacking of integer texture returns properly. We were doing f16 unpacks, which trashed "1" values. Fixes many piglit texwrap GL_EXT_texture_integer cases.	2017-11-07 12:58:03 -08:00
Eric Anholt	dfff9ce45e	broadcom/vc5: Fix scheduling for a non-SFU R4 write after a dead R4 write. The v3d_qpu_writes_r*() were only checking for fixed-function accumulator writes, not normal ALU writes to those regs. Fixes fs-discard-exit-2 on simulation (but not HW).	2017-11-07 12:57:49 -08:00
Eric Anholt	4f33344e7a	broadcom/vc5: Add occlusion query support. Fixes all of piglit's OQ tests.	2017-11-07 12:56:40 -08:00
Eric Anholt	a266f78741	broadcom/vc5: Fix mipmap filtering enums. The ordering of the values was even less obvious than I thought, with both the mip filter and the min filter being in different bits depending on whether the mip filter is none. Fixes piglit fs-textureLod-miplevels.shader_test	2017-11-07 09:40:25 -08:00
Eric Anholt	dd429cb2db	broadcom/vc5: Fix missing enum decode for indexed primitives.	2017-11-07 09:19:48 -08:00
Eric Anholt	bb6997e6a3	broadcom/vc5: Drop padding bits from the bottom of the TSDA address. Fixes misaligned-looking addresses in decode.	2017-11-07 09:19:48 -08:00

418 Commits