Fixed Issues / Improvements
- API option to control per-thread memory,
- Add GenXAggregatePseudoLowering to the list of GenX passes,
- Add helper macro for tablegen,
- Add missing dependencies on llvm libs for VC,
- Add missing dl to vc driver link libraries,
- Add missing library for vc codegen,
- Add option 'EnableDivergentBarrierCheck' to check for barriers that may be in divergent control flow,
- Add option to strip debug info from llvm IR,
- Add support for function pointer relocation to global/constant buffer. Save relocation data to module metadata, to be patched with actual function address by runtime,
- Added missing header in preparation of LLVM 11,
- Added more passes to igc_opt,
- After IGCInstructionCombiningPass, the address of an indirectly called function is used inside a 'combined' store instruction. Such constant expressions have to be split so that further passes can process them,
- Allow cmd arg, registered in passInfo, to be used as pass names in IGC keys PrintAfter and PrintBefore,
- Allow float to packed half-float move on select platforms, Nth try,
- Avoid same space AS casts in LowerGPCallArg,
- BCR tuning,
- Bug fixes to enable function pointers directly passed by FE (Commit attempt #3),
- Change default stateless private size,
- Change encoding of return register location in each function's epilogue in FDE,
- Cleanup old emulation functions to preserve compatibility with old ISPC,
- Correction in LLD build,
- Cosmetic fixes for VC emu boilerplate generator,
- Detect local to generic pointer casts, second try,
- Disable Read suppression with single IGC key,
- Do not load zero values from genx.alloca,
- Dump just after each pass execution,
- Enable ZWDelta in Code Patch,
- Enable immediate pool for cmp instruction,
- Enabling preRA_Schedule in default ForceFastestSIMD due to the ACOdyssey regression in IGC-4149,
- Expand the BufferType range: it is now set to 32 instead of 16,
- Fix CSEL before EOT, there may be atomic URB inst,
- Fix Travis environmental error, libc6,
- Fix Ubuntu build instruction,
- Fix bug where class type was encoded as struct type in dwarf,
- Fix calculation of TPM address offset and emit warning if allocated space is not enough,
- Fix debug info reader for static members,
- Fix debug line info in GenericAddressDynamicResolution,
- Fix erroneous unary "~" implementation for cm-cl vector,
- Fix frame destruction boundary condition,
- Fix incorrect CISA offset attached to EOT,
- Fix lowering shader interpreted values (GS),
- Fix packed immediate handling on platforms that don't have byte regioning,
- Fix phi node coalescing. In the case of an indirectbr instruction, several phi nodes may have the same PHICPY segment. The coalescing analysis was not ready for this, which led to excess copies,
- Fix stack mem option parsing,
- Fix the bug in flag register spill/fill clean up,
- Fix the optimization EnableMergeTransposeSLM,
- Fixing DebugLoc's in VectorPreProcess pass,
- Flip DispatchAlongY override setting to reflect new default,
- For padding constant/global buffers, use stringstream width/fill instead of writing characters one by one,
- Handle jmpi-to-goto conversion correctly on platforms that don't support predCtrl width,
- Handle ConstantInt and PtrToInt when evaluating a constant address. Also add a new helper function for retrieving a constant address,
- High-Level Load/Store G4IR support,
- Implement OpTypeBufferSurfaceINTEL in the old SPIRV-LLVM-Translator,
- Implement more efficient emulation for floating point global atomics,
- Improve jump codegen by setting uniform if jump's flag is workgroup/global uniform under EU fusion,
- Included lldELF library to IGC solution,
- Initial implementation of 64-bit integer division routines for VC backend,
- Introducing lowering diagnostics kind,
- Keep pass disabled in this checkin,
- Limit the display of the warning for the ShowFullVectorsInShaderDumps flag to one in the console,
- Link debug info with required llvm libraries,
- Make VC option parser IGC top level component,
- Make emitStateRegID() inputs easier to read,
- Make run()/reset() members private. Add inProgress flag to avoid recomputation when it is already in progress, which caused a stack overrun,
- Merge stores/loads from different SLM buffers,
- Minor improvement to TGL workaround,
- Minor improvements to split aligned scalar pass,
- Move code to strip debug info before CheckInstrType pass,
- Move common LLVM build setup to one place,
- Move dependent instructions for ZWDelta into the payload section,
- Move link libraries from plugin to codegen library,
- Move linux backend plugin code to separate file,
- Need to check Bti Value is constant first,
- Optimize mergeScalar pass to benefit more cases, third try,
- Option for explicit stateless private size,
- Option for globals localization configuration. More modes of localization are supported,
- Packetizer needs to handle the argument with an addrspacecast as its use,
- Pass llvm options using compilation options,
- Pre-allocate all R1Lo aliases,
- Re-enable IGC registry key TotalGRFNum,
- Remember backend config in genx module,
- Remove debug llvm options parsing in VC,
- Report unsupported SPIRV opcode,
- Requested by debugger, copy "sret" argument to return register upon stack function exit, such that debugger can query the return value of the function. Also added the "sret" attribute to implicit vector pointer argument used to represent the return value as specified in the IGC call convention ABI,
- Require mad dst to be aligned in split aligned scalar pass,
- Resolve circular dependency in vc metadata headers,
- Use the same code style for 'const' order,
- Set Payload LiveOut as Output to prevent reallocation,
- Set default globals localization to always localize vectors,
- Simplify legalization of store instructions with constants,
- Simplifying the geometry shader lowering pass,
- Some cleanup passes, which were added after LateInlineUnmaskedFunc=1, created complex or combined instructions. These instructions have to be split with the BreakConstantExpr pass,
- Support build of spirv translator with prebuilt LLVM,
- Support load inst in GenXAggregatePseudoLowering,
- Switch to OCL conformant return value in VC printf,
- Switch to use InstVisitor in GenXAggregatePseudoLowering,
- Transform SPV_INTEL_optimization_hints into SPV_KHR_expect_assume,
- Try to insert VectorUniform intrinsic to the lowest common dominator of all loads and stores,
- Update addSamplerFlushBeforeEOT with additional HW requirements,
- Update the insertInstLabel to fixEndifWhileLabels,
- Use cmake argument parser in build bif function,
- Use link wrapper to filter link command for vc plugin,
- Use llvm source hook for SPIRV translator,
- When LateInlineUnmaskedFunc is on, the following SROA pass creates ShuffleVector instructions, which are not supported in code generation. Add a legalization pass to expand such instructions into supported ones,
- When creating a new call, make sure call inst's calling convention matches its callee, otherwise the call would be deleted (by instcombine),
- When promoting arrays to registers, a wrong assumption regarding support for fp64 and int64 was made,
- Workaround for IMAGE_SUPPORT macro definition in Clang,
- ZEBinary: Add sampler_index to payload arguments,
- ZEBinary: fall back/assert when encountering an inline sampler,
- ZEBinary: support Rela format in ZEELFObjectBuilder,
- Other fixes and improvements.
Dependencies revisions
- intel/opencl-clang@c8cd72e
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@069ced1
- KhronosGroup/SPIRV-LLVM-Translator@9d8d032 (for VectorCompiler)
- llvm/llvm-project@llvmorg-10.0.0
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.