github intel/intel-graphics-compiler igc-1.0.6410


Fixed Issues / Improvements

  • Consider a WA table entry before inserting a flush sampler instruction
  • Location expressions improvements
  • Do not split arithmetic instructions in IGC as vISA will handle it
  • Backing out Simple push algorithm Optimization
  • Fix reg number issue in translate math
  • Changes for -O2. Optimizing non-user functions to save compiling time.
  • Fix the SWSB when there is no send in kernel
  • Add support to generate thread IDs in 2x2 blocks.
  • Separate global and local variables to reduce compilation time.
  • Don't replace OpDecorate with OpGroupDecorate.
  • Add InferAddressSpacesPass only if needed.
  • Fix crash in SIMD32 mode caused by pseudo_ret instruction's source operand right bound computation.
  • Update DispatchGPGPUWalkerAlongYFirst lookup
  • Changed ldms and ldmcs conversion to 16-bit. Fixed ldmcs usage in users other than 16-bit ldms.
  • Cleanup unnecessary dynamic allocations.
  • Avoid warning of implicit i64->i32 by forcing explicit conversion.
  • Optimization for signed remainder by a constant power of 2 for int32.
  • Switch TPM to SVM entirely.
  • Do not modify wrregion input in non-overlapping region optimization.
  • Simplify usage of the IGC_BUILD__VC_ENABLED cmake option. Change the IGC_VC_DISABLED macro to the more consistent IGC_VC_ENABLED.
  • Removed external dependency on llvm_patches and improved llvm setup in project
  • Produce truncate instead of __builtin_spirv_OpUConvert for not rounded/saturated converts
  • Fix missing barrier when inline ASM is used in a kernel
  • Split uniform into thread uniform, work group uniform, and global uniform, which gives more detailed information that can be used to enable better optimization.
  • Set InlineAsm usage per function group, to create correct builder for multiple FGs.
  • Support for stackcalls with InlineAsm by parsing multiple functions in single text stream.
  • Broadcast uniform variables if 'rw' constraint was specified (Inline ASM)
  • Optimize generic pointer load for kernels not using local memory.
  • Bug fix for SWSB when comparing the footprint.
  • Extend GAS phi resolution to all loops, not only top level ones.
  • Remove the dependence between dummy csel instructions.
  • Adds custom iterator class for Function Group. Can iterate through the FunctionGroup class, which uses a 2D vector storage.
  • Change OpenCL builtin mad implementation to use fma instruction instead of multiply add.
  • Cast Base and Insert parameters to unsigned to avoid sign extension while shifting
  • Add check for compute shaders that may need XYZ walk of thread IDs.
  • ZEBinary: Fix scratch memory buffer creation.
  • When unmasked regions were nested, the innermost llvm.genx.GenISA.UnmaskedRegionEnd intrinsic switched off unmasked code generation, so the enclosing regions were generated as masked code.
  • Extra flag has been added to WIAnalysis Runner to not mark some uniform instructions as random.
  • Added a field to implicit argument structure for stack calls. Modified layout of local ids based on SIMD size.
  • IGA: add disassembler option "--output-on-fail"
  • Fix discovery of inlined DISubprogram nodes
  • Implement support for both SPV-IR forms for BitFieldInsert builtins
  • Introduction of new entry in IGC constant folder for bfrev.
  • Update TracePointerSource() function to detect cases where two different resource pointer values describe the same resource.
  • Vector backend does not support creation of L0 module with external functions. Insert assert in GenXCisaBuilder, explaining that.
  • Take SpillMemOffset into consideration when reporting spill size.
  • Split send has argument no. 4, which can be an address register. Make sure to check dependence on src3 as well.
  • Add case when propagating non-generic pointer to store.
  • Disable certain transformations when compiling code for debug.
  • Add -vc-promote-array-alloca-limit knob to control array promotion total size (2nd edition). Force array promotion for CMRT binary.
  • Replace strcat by compound assignment operator
  • Now appropriately handling shl instructions with unsupported types.
  • More fixes to get local RA to honor declare even-alignment.
  • Print SLMsize in compiler output file
  • IGA SWSB refactoring: Unify InstType getter function
  • Extract vc input handling into another function
  • Fix an assertion due to unexpected RAUW with a constant
  • Extend supported subtargets in VC
  • Solve the memory leak issue of SWSB
  • Add control to route some resources to LSC/HDC
  • Fix scratch surface allocation for VC
  • Remove addrspacecast only if there are no other uses.
  • Set alwaysinline on invoke kernels. Don't add stack call or indirect call attributes.
  • Add interface target for vc intrinsics headers
  • Move stepping into Options instead of a global variable.
  • Add DoNotSpill attribute for vISA variables.
  • ZEBinary: Support buffer_offset implicit argument
  • If all its operands are region invariant, an inst is region invariant.
  • Commit base data structures for implicit argument handling for bindless offsets. Changes in StatelessToBindless promotion will come later.
  • For optnone builtins, allow -O0 flag to determine if we should call them as subroutines or stackcalls.
  • Allow EnableA64WA env variable in Linux release mode.
  • BinaryEncodingIGA: fix math pipe instruction check
  • Upgraded error messages with source file locations and names of the kernel causing the error.
  • Implement support for both SPV-IR forms for conversion builtins
  • Prevent redundant lowering attempt during SIMD CF Conformance
  • Make sure trivial RA honors even-alignment.
  • ZEBinary: add regkey to enable .bss section for zero-initialized global variables
  • Add -vc-promote-array-alloca-limit knob to control array promotion total size
  • Add simplify CFG pass to pass manager to simplify work of LICM
  • Add an option for GenXPromoteArray threshold
  • Debug location expression improvements
  • Reduce memory footprint in GraphColor
  • Fix binary encoding for simd2 align16 instructions
  • Filter out "endif" and "else" when inserting dummy mov.
  • Avoid localization of large data for oclbin, use relocation instead.
  • Add option for TPM memory placement.
  • Correct localization costs for global vectors.
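
Two of the integer items above (the signed remainder by a constant power of 2, and casting Base and Insert to unsigned before shifting) describe well-known bit-level techniques. The following is a minimal sketch of how such techniques typically look in C++, not the code IGC emits or contains. The helper names srem_pow2, bitfield_insert, and self_check are hypothetical, and the stated ranges for k, offset, and count are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not IGC code): computes x % (1 << k) with C++
// truncating semantics, without a division. Assumes 0 <= k <= 30.
// The final cast back to int32_t is modular on mainstream compilers
// (and guaranteed to be so from C++20 on).
inline int32_t srem_pow2(int32_t x, unsigned k) {
    const uint32_t mask = (uint32_t{1} << k) - 1u;  // low k bits set
    const uint32_t ux   = static_cast<uint32_t>(x);
    const uint32_t sign = 0u - (ux >> 31);          // all ones if x < 0, else 0
    const uint32_t bias = sign & mask;              // 2^k - 1 for negative x, else 0
    // Adding the bias before masking and subtracting it afterwards
    // gives the remainder the sign of x, matching the % operator.
    return static_cast<int32_t>(((ux + bias) & mask) - bias);
}

// Hypothetical helper (not IGC code): replaces `count` bits of `base`,
// starting at bit `offset`, with the low bits of `insert`.
// Assumes offset < 32 and offset + count <= 32.
// Both operands are cast to unsigned first so that no shift operates
// on a negative signed value.
inline int32_t bitfield_insert(int32_t base, int32_t insert,
                               unsigned offset, unsigned count) {
    const uint32_t ubase = static_cast<uint32_t>(base);
    const uint32_t uins  = static_cast<uint32_t>(insert);
    // A shift by 32 is undefined, so the full-width case is handled apart.
    const uint32_t low  = (count >= 32) ? ~uint32_t{0}
                                        : ((uint32_t{1} << count) - 1u);
    const uint32_t mask = low << offset;
    return static_cast<int32_t>((ubase & ~mask) | ((uins << offset) & mask));
}

// Cross-checks srem_pow2 against the built-in % operator on a small range.
inline void self_check() {
    for (int32_t x = -1000; x <= 1000; ++x) {
        for (unsigned k = 0; k <= 10; ++k) {
            assert(srem_pow2(x, k) == x % (int32_t{1} << k));
        }
    }
}
```

Calling self_check() compares the branchless remainder with the % operator for small inputs. In the bitfield helper, the unsigned casts are the point of interest: with signed operands, a negative value could be sign-extended when widened or shifted right, which is the class of problem the release note refers to.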

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
