KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Ben Widawsky	169d7e5cb1	i965: Extract scalar region checking logic There are currently 2 users of this functionality. I have 2 more users coming up, and having a simple function makes the results much cleaner. The existing interface semantics was proposed by Matt. v2 (Ken): Rename to region_matches()/has_scalar_region(). Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-20 15:24:40 -08:00
Ben Widawsky	9394f58383	i965: Add QWORD sizes to type_sz macro GEN8 added the QWORD as a valid type for certain operations on the EU. In order to calculate the number of registers used one must have the type size as part of the equation. Quoting the formula in the code: regs_written = (dst.width * dst.stride * type_sz(dst.type) + 31) / 32; Adding this separately for bisection since there is no simple way to add an assert in the type_sz function. NOTE: As a side note, I was confused for a while because it's impossible to calculate the region, ie. registers needed, without vstride. However, at this point these are all part of the IR, and so no vstride must exist. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-20 15:24:40 -08:00
Eric Anholt	b368c91f26	vc4: Fix build since `8ed5305d28`	2015-01-20 14:19:29 -08:00
Rob Clark	fd6e18d651	freedreno/a4xx: sysmem bypass Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-20 13:27:28 -05:00
Rob Clark	5da3bec44b	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-20 13:27:19 -05:00
Tom Stellard	17a2f11a06	radeonsi: Re-enable LLVM IR dumps This was inadvertently disabled by `761e36b4ca`.	2015-01-20 09:55:44 -05:00
Tom Stellard	73bc0fdb6f	radeonsi/compute: Use relocs for scratch pointer rather than user sgprs v2 Instead of passing a pointer to the scratch buffer via user sgprs, we now patch the shader with the buffer address using reloc information from the LLVM generated ELF. v2: - Make sure not to break older LLVM.	2015-01-20 09:55:44 -05:00
Tom Stellard	dfdaf3eb7e	radeon: Teach radeon_elf_read() how to parse reloc information v3 v2: - Use strdup for copying reloc names. - Free reloc memory. v3: - Add free_relocs parameter to radeon_shader_binary_free_members()	2015-01-20 09:55:43 -05:00
Tom Stellard	5667aa58c4	radeon: Add a helper function for freeing members of radeon_shader_binary	2015-01-20 09:55:43 -05:00
Kenneth Graunke	c4fd0c9052	i965: Work around mysterious Gen4 GPU hangs with minimal state changes. Gen4 hardware appears to GPU hang frequently when using Chromium, and also when running 'glmark2 -b ideas'. Most of the error states contain 3DPRIMITIVE commands in quick succession, with very few state packets between them - usually VERTEX_BUFFERS/ELEMENTS and CONSTANT_BUFFER. I trimmed an apitrace of the glmark2 hang down to two draw calls with a glUniformMatrix4fv call between the two. Either draw by itself works fine, but together, they hang the GPU. Removing the glUniform call makes the hangs disappear. In the hardware state, this translates to removing the CONSTANT_BUFFER packet between the two 3DPRIMITIVE packets. Flushing before emitting CONSTANT_BUFFER packets also appears to make the hangs disappear. I observed a slowdown in glxgears by doing it all the time, so I've chosen to only do it when BRW_NEW_BATCH and BRW_NEW_PSP are unset (i.e. we haven't done a CS_URB_STATE change or already flushed the whole pipeline). I'd much rather understand the problem, but at this point, I don't see how we'd ever be able to track it down further. We have no real tools, and the hardware people moved on years ago. I've analyzed 20+ error states and read every scrap of documentation I could find. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80568 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85367 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Matt Turner <mattst88@gmail.com> Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>	2015-01-19 13:13:51 -08:00
Kenneth Graunke	a5ca86a983	i965/nir: Enable SIMD16 support in the NIR FS backend. With the previous commits in place, it just works. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-19 13:13:50 -08:00
Kenneth Graunke	45123ee818	i965/nir: Use offset() instead of altering reg_offset directly. offset() properly handles reg_width, so it'll work for SIMD16. While we're in the area, simplify a few cases, and use retype() to cut a few more lines of code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-19 13:13:48 -08:00
Kenneth Graunke	3f263ffbb3	i965/nir: Replace fs_reg(GRF, virtual_grf_alloc(...)) with vgrf(...). brw_fs_nir.cpp creates almost all of its registers via: fs_reg reg = fs_reg(GRF, virtual_grf_alloc(num_components)); When we add SIMD16 support, we'll need to set reg->width = 16 and double the VGRF size...on pretty much every VGRF it allocates. This patch replaces that pattern with a new "vgrf" helper method: fs_reg reg = vgrf(num_components); The new function correctly takes reg_width into account. For now, reg_width is always 1, so this should have no functional change. v2: Just make vgrf() account for reg_width right away, rather than changing the behavior in the next patch. v3: Replace one last virtual_grf_alloc I missed. It's used in code that only runs for dispatch_width == 8, so it doesn't matter, but consistency is nice. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-19 13:13:46 -08:00
Kenneth Graunke	d1533d87cc	i965: Replace fs_reg(fs_visitor, type) with fs_visitor::vgrf(type). I dislike how fs_reg has a constructor that knows about fs_visitor. Apart from that, it stands alone, with no need to interact with the rest of the compiler. Which is sensible - a class that represents a register should do just that. Allocating virtual register numbers should be left up to the compiler (fs_visitor). This patch replaces the constructor with a new fs_visitor::vgrf method, eliminating fs_reg's dependency on fs_visitor. It ends up being no more code. v2: Rebase from May 2014 -> January 2015. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-19 13:13:34 -08:00
Marek Olšák	5b01512df3	st/mesa: don't set vs.key.clamp_color if a shader doesn't write any colors And update some comments.	2015-01-19 20:15:27 +01:00
Marek Olšák	ccc5b60b06	winsys/radeon: increase the size of buffer cache This should fix this performance regression: https://bugs.freedesktop.org/show_bug.cgi?id=88227 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-19 20:15:27 +01:00
Carl Worth	3b8ccca8a3	Rename sha1.c and sha1.h to mesa-sha1.c and mesa-sha1.h The filename of sha1.h was conflicting with the system-provided sha1.h, (and in some configurations, our sha1.c was unsuccessfully attempting to include "sha1.h" and <sha1.h> as two different files). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88523	2015-01-19 10:53:07 -08:00
Martin Peres	7a182d2335	mesa: fix a trivial spelling mistake Signed-off-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-19 01:23:07 -08:00
Tapani Pälli	d74a817b86	mesa: support GL_RGB for GL_EXT_texture_type_2_10_10_10_REV Commit `8ec6534` changed texture upload path and the way how texture format is being checked, this commit adds support for GL_RGB with GL_UNSIGNED_INT_2_10_10_10_REV as specified by the extension EXT_texture_type_2_10_10_10_REV specification. This fixes regression in ES3 conformance test ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels v2: add MESA_FORMAT_R10G10B10X2_UNORM format (Iago Toral) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88385 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-01-19 08:11:45 +02:00
Micah Fedke	d36fa60191	mesa: Add ARB_shader_precision infrastructure Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-01-19 16:33:21 +13:00
Kenneth Graunke	461103ef64	i965/fs: Fix the dummy fragment shader. We hit an assertion that the destination of the FB write should not be an immediate. (I don't know what we were thinking.) Use ARF null. Trying to substitute real shaders with the dummy shader would crash when trying to upload non-existent uniforms. Say there are none. It also wouldn't generate any code because we didn't compute the CFG, and code generation now requires it. Compute it. Gen4-5 also require a message header to be present. On Gen6+, there were assertion failures in SF/SBE state because urb_setup was memset to 0 instead of -1, causing it to think there were attributes when nothing was set up right. Set to no attributes. Finally, you have to ensure "Setup URB Entry Read Length" is non-zero or you get GPU hangs, at least on Crestline. It now works on at least Crestline and Haswell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-17 14:20:41 -08:00
Kristian Høgsberg	8c6018e9bc	gbm: Define _DEFAULT_SOURCE to avoid warning glibc 2.19 introduced _DEFAULT_SOURCE as a replacement for _BSD_SOURCE, and deprecates _BSD_SOURCE with an annoying warning. Defining both is how you're supposed to transition so let's do that. It gets rid of the warning and we can figure out when/if we can drop _BSD_SOURCE later. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2015-01-16 21:54:54 -08:00
Vinson Lee	9075823c17	sha1: Fix gcry_md_hd_t typo. Fix build error. CC libmesautil_la-sha1.lo sha1.c: In function '_mesa_sha1_final': sha1.c:210:22: error: 'grcy_md_hd_t' undeclared (first use in this function) gcry_md_hd_t h = (grcy_md_hd_t) ctx; ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88519 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2015-01-16 16:25:39 -08:00
Vinson Lee	10a4f1e77a	nir: s/malloc.h/stdlib.h/ Fix build error on Mac OS X. CC nir_to_ssa.lo nir_to_ssa.c:29:10: fatal error: 'malloc.h' file not found ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88478 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2015-01-16 16:14:51 -08:00
Kristian Høgsberg	a9f657ded1	i965: Fix up too-wide comment Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2015-01-16 14:42:27 -08:00
Kristian Høgsberg	9bf2c7166a	gbm/dri: Fix const confusion The driver name is no longer const, it's always allocated dynamically one way or another. Drop const from dri_screen_create_dri2 driver_name argument to avoid warning. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2015-01-16 14:29:40 -08:00
Carl Worth	59216f53ec	configure: Add machinery for --enable-shader-cache (and --disable-shader-cache) We don't actually have the code for the shader cache just yet, but this configure machinery puts everything in place so that the shader cache can be optionally compiled in. Specifically, if the user passes no option (neither --disable-shader-cache, nor --enable-shader-cache), then this feature will be automatically detected based on the presence of a usable SHA-1 library. If no suitable library can be found, then the shader cache will be automatically disabled, (and reported in the final output from configure). The user can force the shader-cache feature to not be compiled, (even if a SHA-1 library is detected), by passing --disable-shader-cache. This will prevent the compiled Mesa libraries from depending on any library for SHA-1 implementation. Finally, the user can also force the shader cache on with --enable-shader-cache. This will cause configure to trigger a fatal error if no suitable SHA-1 implementation can be found for the shader-cache feature. Bug fix by José Fonseca <jfonseca@vmware.com>: Fix to put conditional assignment in Makefile.am, not Makefile.sources to avoid breaking scons build. Note: As recommended by José, with this commit the scons build will not compile any of the SHA-1-using code. This is waiting for someone to write SConstruct detection of the available SHA-1 libraries, (and set the appropriate HAVE_SHA1_* variables). Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-16 13:47:40 -08:00
Carl Worth	a24bdce46f	mesa: Add mesa SHA-1 functions The upcoming shader cache uses the SHA-1 algorithm for cryptographic naming. These new mesa_sha1 functions are implemented with any one of several different cryptographic libraries. This code was copied from the xserver repository, (where it has apparently been functioning well on a variety of operating systems), and comes licensed with a license identical to that of Mesa. Bug fixes by José Fonseca <jfonseca@vmware.com>: Fix to put conditional assignment in Makefile.am, not Makefile.sources to avoid breaking scons build. Fix include file for CryptoAPI section. Fix missing cast in openssl section. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-16 13:47:40 -08:00
Carl Worth	670826b431	configure: Add copyright and license block to configure.ac Prior to copying in code from the xserver configure.ac file, it makes sense to have the license of this file clearly marked, (to show that it's licensed identically to the configure.ac file from the xserver repository). And since the text of the license refers to "the above copyright notice" it also makes sense to have an actual copyright attribution in place. I generated this list of names by looking at the output of: git shortlog -n --format=%aD -- configure.ac (and arbitrarily stopping for contributors with fewer than 15 commits). Then for each name, I looked for existing Copyright attributions in the mesa source tree with the same name, (and using "Intel Corporation" as the copyright holder where I knew that was appropriate).	2015-01-16 13:47:40 -08:00
Carl Worth	977ddecb69	glsl: Add unit tests for blob.c In addition to exercising all of the functions in blob.h, this includes a stress test that forces some reallocing, and also tests to verify the alignment and overrun-detection code in blob.c.	2015-01-16 13:47:40 -08:00
Tapani Pälli	ffcad3a548	glsl: Add blob_overwrite_bytes and blob_overwrite_uint32 These functions are useful when serializing an unknown number of items to a blob. The caller can first save the current offset, write a placeholder uint32, write out (and count) the items, then use blob_overwrite_uint32 with the saved offset to replace the placeholder value. Then, when deserializing, the reader will first read the count and know how many subsequent items to expect. (I wrote this code after reading a very similar patch written by Tapani when he wrote serialization code for IR. Since I re-used the idea of his code so directly, I've credited him as the author of this code. --Carl) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-16 13:47:40 -08:00
Carl Worth	1c9877327e	glsl: Add blob.c---a simple interface for serializing data This new interface allows for writing a series of objects to a chunk of memory (a "blob"). The allocated memory is maintained within the blob itself, (and re-allocated by doubling when necessary). There are also functions for reading objects from a blob as well. If code attempts to read beyond the available memory, the read functions return 0 values (or its moral equivalent) without reading past the allocated memory. Once the caller is done with the reads, it can check blob->overrun to ensure whether any invalid values were previously returned due to attempts to read too far. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-16 13:47:40 -08:00
Tapani Pälli	165575d0a8	mesa: Add iterate method for string_to_uint_map The upcoming shader cache needs this to be able to cache hash data from the gl_shader_program structure. Edited-by: Carl Worth <cworth@cworth.org>: There is an internal implementation detail that the hash table underlying the struct string_to_uint_map stores each value internally as (value+1). The user needn't be very concerned with this (other than knowing that a value of UINT_MAX cannot be stored) since put() adds 1 and get() subtracts 1. So in this commit, rather than call the user's function directly with hash_table_call_foreach, we call through a wrapper that fixes up the off-by-one values before the caller's callback sees them. And with this wrapper in place, we also give a better signature to the callback function being passed to iterate(), so that this callback function can actually expect a char* and an unsigned argument, (rather than a couple of void* ). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-01-16 13:47:40 -08:00
Carl Worth	62d5b4b03a	util: Make unreachable at least be an assert Previously, if __builtin_unreachable() was unavailable, the unreachable macro was defined to do nothing. We do better here, by at least still making it an assert. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-16 13:47:40 -08:00
Carl Worth	f87ffd5cc3	glsl: Add convenience function get_sampler_instance This is similar to the existing functions get_instance, get_array_instance, etc. for getting a type singleton. The new get_sampler_instance() function will be used by the upcoming shader cache. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-16 13:47:40 -08:00
Kenneth Graunke	127c972492	i965: Fix some oddities in FB_WRITE register width and execution size. Previously, we generated this for FB writes in SIMD16 mode: load_payload(16) vgrf5@8+0.0:F, vgrf1:F, vgrf2:F, vgrf3:F, vgrf4:F fb_write(8) (null):UD, vgrf5@8+0.0:F 1sthalf The LOAD_PAYLOAD's destination had its register width set to 8, and the FB_WRITE had its execution size set to 8. This seems wrong, and while it probably doesn't affect anything, we should fix it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-16 12:39:35 -08:00
Kenneth Graunke	faaca23734	i965/fs: Make lower_load_payload etc. appear in INTEL_DEBUG=optimizer. In order to support calling lower_load_payload() inside a condition, this patch makes OPT() a statement expression: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html We recently did the equivalent change in the vec4 backend (commit `9b8bd67768`). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-16 12:38:26 -08:00
Neil Roberts	a4ab08bf45	format_utils: Use a more precise conversion when decreasing bits When converting to a format that has fewer bits the previous code was just shifting off the bits. This doesn't provide very accurate results. For example when converting from 8 bits to 5 bits it is equivalent to doing this: x * 32 / 256 This works as if it's taking a value from a range where 256 represents 1.0 and scaling it down to a range where 32 represents 1.0. However this is not correct because it is actually 255 and 31 that represent 1.0. We can do better with a formula like this: (x * 31 + 127) / 255 The +127 is to make it round correctly. The new code has a special case to use uint64_t when the result of the multiplication would overflow an unsigned int. This function is inline and only ever called with constant values so hopefully the if statements will be folded. The main incentive to do this is to make the CPU conversion path pick the same values as the hardware would if it did the conversion. This fixes failures with the ‘texsubimage pbo’ test when using the patches from here: http://lists.freedesktop.org/archives/mesa-dev/2015-January/074312.html v2: Use 64-bit arithmetic when src_bits+dst_bits > 32 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-16 13:53:15 +00:00
Iago Toral Quiroga	6367ca8b41	i965/gen6: Fix crash with VS+TF after rendering with GS Rendering with a GS and then using transform feedback with a program that does not have a GS can crash in gen6. The reason for this is that brw_begin_transform_feedback checks brw->geometry_program to decide if there is a GS program, but this is not correct: brw->geometry_program is updated when issuing drawing commands, so after rendering with a GS it will be non-NULL until we draw again with a program that does not have a GS. If the next program uses TF, we will call glBeginTransformFeedback before issuing the drawing command and hence brw->geometry_program will be non-NULL if the previous rendering used a GS. The right thing to do here is to check ctx->_Shader->CurrentProgram[MESA_SHADER_GEOMETRY] instead. This is what the gen7 code path does too. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=87694 Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-01-16 14:16:59 +01:00
Jason Ekstrand	bc6e57e019	nir/live_variables: Use a worklist This is a rework of the liveness algorithm using a worklist as suggested by Connor. Doing so reduces the number of times we walk over the instructions because we don't have to do an entire pointless walk over the instructions just to figure out it's time to stop. Also, the stuff after the last loop in the function will only ever get visited once. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 16:54:21 -08:00
Jason Ekstrand	4839d1aed1	nir: Add a worklist helper structure A worklist is a common concept in optimizations. This adds a structure that we can reuse for many different types of optimizations. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 16:54:21 -08:00
Brian Paul	0aaaa13ec9	nir: fix incorrect argument passed to validate_src() in validate_tex_instr() Silences a compiler warning. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 17:41:42 -07:00
Brian Paul	aa479a69d6	nir: silence compiler warning from visit_src() call v2: use proper argument Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 17:09:02 -07:00
Brian Paul	337eca4ac8	mesa: move GET_CURRENT_CONTEXT() to top of _mesa_init_renderbuffer() To fix MSVC build. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-15 16:15:34 -07:00
Mike Mason	e407fb1af4	mesa: Fix render buffer initial internal format in GLES 3 Changes the initial internal format of a render buffer to GL_RGBA4 in GLES 3. This fixes a failure in the following DrawElements test: dEQP-GLES3.functional.state_query.rbo.renderbuffer_internal_format Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-01-15 13:29:48 -08:00
Jason Ekstrand	153b8b3525	util/hash_set: Rework the API to know about hashing Previously, the set API required the user to do all of the hashing of keys as it passed them in. Since the hashing function is intrinsically tied to the comparison function, it makes sense for the hash set to know about it. Also, it makes for a somewhat clumsy API as the user is constantly calling hashing functions many of which have long names. This is especially bad when the standard call looks something like _mesa_set_add(ht, _mesa_pointer_hash(key), key); In the above case, there is no reason why the hash set shouldn't do the hashing for you. We leave the option for you to do your own hashing if it's more efficient, but it's no longer needed. Also, if you do do your own hashing, the hash set will assert that your hash matches what it expects out of the hashing function. This should make it harder to mess up your hashing. This is analogous to `94303a0750` where we did this for hash_table Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-15 13:21:27 -08:00
Jason Ekstrand	4c99e3ae78	util: Move main/set to util/hash_set Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-15 13:21:27 -08:00
Jason Ekstrand	8ed5305d28	hash_table: Rename insert_with_hash to insert_pre_hashed We already have search_pre_hashed. This makes the APIs match better. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-15 13:21:27 -08:00
Matt Turner	f0aec4ee1e	i965: Don't consider null dst instructions as matching non-null dst. When performing common subexpression elimination on instructions with non-null destinations we emit a MOV to copy the result to a new register that must have no other uses. In the case of: cmp.g.f0.0(8) null:D, vgrf43:F, 0.500000f ... cmp.g.f0.0(8) vgrf113:D, vgrf43:F, 0.500000f we put the first instruction in the AEB and decided that we could reuse its result when we found the second. Unfortunately, that meant that we'd emit a MOV from the first's destination, which is null. Don't do anything if the entry's destination is null and the instruction's destination is non-null. Tested-by: Tapani Pälli <tapani.palli@intel.com>	2015-01-15 10:11:42 -08:00
Matt Turner	41d9f232b6	i965/vec4: Make sure that imm writes are to registers in the same file. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87887	2015-01-15 10:11:42 -08:00
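
Commit a4ab08bf45 in the list above describes replacing a plain bit shift with a rounded rescale when converting a channel to fewer bits. The arithmetic it quotes can be sketched as below. This is only an illustration: the function names narrow_by_shift and narrow_rounded are hypothetical, not the ones used in format_utils, and the sketch always computes in 64 bits, whereas the commit says the real code switches to uint64_t only when the multiplication would overflow an unsigned int.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the old behaviour, which simply drops the low bits.
 * For 8 -> 5 bits this is equivalent to x * 32 / 256. */
static unsigned narrow_by_shift(unsigned x, unsigned src_bits, unsigned dst_bits)
{
    assert(dst_bits <= src_bits && src_bits <= 32);
    return x >> (src_bits - dst_bits);
}

/* Hypothetical helper: rescale so that the source maximum maps exactly to
 * the destination maximum, rounding to nearest.
 * For 8 -> 5 bits this is (x * 31 + 127) / 255. */
static unsigned narrow_rounded(unsigned x, unsigned src_bits, unsigned dst_bits)
{
    assert(dst_bits >= 1 && dst_bits <= src_bits && src_bits <= 32);

    uint64_t src_max = (1ull << src_bits) - 1;  /* 255 for 8 bits */
    uint64_t dst_max = (1ull << dst_bits) - 1;  /* 31 for 5 bits */

    /* Adding src_max / 2 before the division makes it round correctly. */
    return (unsigned)(((uint64_t)x * dst_max + src_max / 2) / src_max);
}
```

For an 8-bit input of 5, the shift gives 0 while the rounded form gives 1, which is the nearest 5-bit value to 5/255 of full scale. Both forms map 255 to 31.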

67605 Commits