DXVK-GPLAsync-LowLatency 2.6.6-1 (DXVK-GPLALL 2.6.6-1)
Based on DXVK 2.6.2 and commits from DXVK 2.7, DXVK 2.7.1 and later, DXVK GPLAsync 2.6.2 and DXVK GPLAsync 2.7-patch, DXVK Low Latency 2.7.1-commit 9659672.
It consists of:
- DXVK-GPLALL-GCC-WinMacLinux-SSE2-O3-LTO 2.6.6-1
- DXVK-GPLALL-GCC-WinMacLinux-SSE4.2-O3-LTO-GENERIC 2.6.6-1
- DXVK-GPLALL-GCC-WinMacLinux-SSE4.2-O3-LTO-INTEL 2.6.6-1
- DXVK-GPLALL-MSVC-Windows-SSE2-O1_Oi_Ob3-LTCG 2.6.6-1
- DXVK-GPLALL-MSVC-Windows-AVX2-O1_Oi_Ob3-LTCG-AMD64 2.6.6-1
- DXVK-GPLALL-MSVC-Windows-SSE4.2-O1_Oi_Ob3-LTCG-INTEL64 2.6.6-1
- DXVK Native-GPLALL-GCC-SSE2-O3-LTO 2.6.6-1
- DXVK Native-GPLALL-GCC-SSE4.2-O3-LTO-GENERIC 2.6.6-1
- DXVK Native-GPLALL-GCC-SSE4.2-O3-LTO-INTEL 2.6.6-1
- Source Code
- dxvk.conf
Release Notes:
IMPORTANT: The DXVK-GPLALL 2.6.x branch will be maintained for GPUs/drivers that do not have VK_KHR_maintenance5 and/or maxPushConstantsSize = 256. Commits from DXVK, DXVK GPLAsync and DXVK Low Latency will be ported to DXVK-GPLALL 2.6.x where possible. DXVK-GPLALL 2.6.x will also be recompiled from time to time, when new compiler versions with significant improvements are released.
Fourth maintenance release of DXVK-GPLALL 2.6.x branch.
IMPORTANT: Please refer to the DXVK-GPLALL Wiki for information about changes to the recommended configuration of the Low Latency Frame Pacing Options.
- Significant changes:
  - DXVK: [dxvk] Throttle resource eviction
    - Improves performance under extreme memory pressure.
  - DXVK Low Latency: Use calibrated device timestamps for each submit
    - Significant precision and smoothness improvements for GPU drivers that support VK_EXT_calibrated_timestamps.
      - Note: VK_EXT_calibrated_timestamps was used instead of VK_KHR_calibrated_timestamps due to larger coverage of devices and drivers.
      - IMPORTANT: A timestampValidBits check can't be implemented, due to the DXVK 2.6.x codebase. To check whether a GPU has timestampValidBits = 64, use the Vulkan GPUInfo website, section "Queue Families". Example - NVIDIA GeForce GT730M.
      - IMPORTANT: On devices that do not have timestampValidBits = 64 (Adreno/Mali/PowerVR etc. mobile GPUs, Intel iGPUs, Intel ARC A GPU Series, possibly more) DXVK-GPLALL will crash applications randomly if dxvk.framePace is set to dxvk.framePace = "low-latency" (default value) or dxvk.framePace = "low-latency-vrr-x". To use DXVK-GPLALL 2.6.6-1 on such devices, set dxvk.framePace = "max-frame-latency" or dxvk.framePace = "min-latency".
  - DXVK Low Latency: Minor optimizations
    - CPU-bound performance improvements.
  - DXVK Low Latency: Threaded sleep
    - Significant smoothness improvements, minor performance and latency improvement.
- DXVK-GPLALL changes:
  - GCC: Added -ffunction-sections, -fdata-sections, -Wl,--gc-sections flags.
    - Has the same effect as /Gw, /GF, /Gy, /OPT:REF,ICF for MSVC: reduces DLL size and potentially improves performance.
  - GCC-SSE4.2-O3-LTO-INTEL: Added -mbmi, -mbmi2, -mmovbe flags.
    - Allows the compiler to use BMI, BMI2 and MOVBE instructions, which improve CPU-bound performance. These instructions are a non-AVX part of the x86-64-v3 set and do not change the DXVK-GPLALL requirements specified in the Builds Reference Guide.
      - Note: The MSVC-AVX2-O1_Oi_Ob3-LTCG-AMD64 build uses all of these instructions as a part of /arch:AVX2. The MSVC-SSE4.2-O1_Oi_Ob3-LTCG-INTEL64 build does not use these instructions, due to /arch:SSE4.2, and the MSVC compiler offers no way to enable individual instructions.
  - GCC-SSE4.2-O3-LTO-GENERIC: Added -mbmi, -mmovbe flags.
    - Allows the compiler to use BMI and MOVBE instructions, which improve CPU-bound performance. These instructions are a non-AVX part of the x86-64-v3 set and do not change the DXVK-GPLALL requirements specified in the Builds Reference Guide.
      - Note: The -mbmi2 flag is not used, in order to avoid a performance regression on AMD CPUs up to and including Zen2, due to usage of BMI2 PDEP/PEXT instructions.
  - MSVC: Changed optimization level from /O2 to /O1 with /Oi.
    - CPU-bound performance improvements. For DXVK-type workloads (DirectX-to-Vulkan translation layer) on out-of-order CPUs (all x86-64 CPUs), fitting a code segment into the instruction cache is preferable to generating the fastest code. Older CPUs with slower and smaller caches will see the most significant performance improvements.
      - Note: For GCC, changing from -O3 to -Os leads to compilation failure, probably due to alignment issues.
  - GCC: Added -Wno-deprecated flag to silence not very useful compilation warnings about -mtune=x86-64 usage.
  - MSVC: Set /Qpar-report:1 and /Qvec-report:1 flags to decrease compilation log size.
  - MSVC and MSVC-AVX2-O1_Oi_Ob3-LTCG-AMD64: Added default MSVC flags, in order to have an explicit way to change MSVC defaults. No functional changes.
  - Clang: Added -fwhole-program-vtables compiler flag.
    - Requires FullLTO, enables single-implementation devirtualization and virtual constant propagation, improves CPU-bound performance.
  - GCC/MSVC/Clang: Added explanatory comments to compiler optimization flags.
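For readers unfamiliar with the BMI2 PDEP/PEXT instructions mentioned in the -mbmi2 note, the following is a minimal reference model of what they compute, written in Python purely for illustration. The function names pext and pdep are hypothetical and are not taken from the DXVK-GPLALL sources; the loop only mirrors the documented bit-gather and bit-scatter semantics of the instructions and says nothing about how DXVK uses them. The two bit tricks inside the loop (isolate lowest set bit, clear lowest set bit) correspond to the operations that the BMI instructions BLSI and BLSR perform.

```python
MASK64 = (1 << 64) - 1  # model 64-bit registers


def pext(src: int, mask: int) -> int:
    """Model of PEXT: gather the bits of src selected by mask into the low bits."""
    src &= MASK64
    mask &= MASK64
    result = 0
    out_bit = 1
    while mask:
        lowest = mask & -mask      # isolate lowest set bit of the mask
        if src & lowest:
            result |= out_bit
        mask &= mask - 1           # clear lowest set bit of the mask
        out_bit <<= 1
    return result


def pdep(src: int, mask: int) -> int:
    """Model of PDEP: scatter the low bits of src to the positions set in mask."""
    src &= MASK64
    mask &= MASK64
    result = 0
    in_bit = 1
    while mask:
        lowest = mask & -mask      # isolate lowest set bit of the mask
        if src & in_bit:
            result |= lowest
        mask &= mask - 1           # clear lowest set bit of the mask
        in_bit <<= 1
    return result


print(bin(pext(0b10110010, 0b11110000)))  # prints 0b1011
print(bin(pdep(0b1011, 0b11110000)))      # prints 0b10110000
```

In hardware each of these is a single instruction, which is why a compiler allowed to use BMI2 may emit them. On AMD CPUs up to and including Zen2 they are considerably slower than on other CPUs, which is the regression the GENERIC build avoids by leaving -mbmi2 out.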
Upcoming maintenance releases will contain fewer changes, due to more significant incompatibilities between DXVK 2.6.x and newer DXVK versions.
List of ported and adapted upstream DXVK commits:
584041f - [util] Add a frame cap for The Evil Within 2
c774d93 - [include] Fix Windows typedefs in native headers
5c3437c - [d3d11] Fix surface map flags check
a26a504 - [d3d11] Add D3D11Device::RegisterDeviceRemovedEvent() stub
b776f63 - [dxvk] Fix VVL error about vkGetPhysicalDeviceMultisamplePropertiesEXT
c10c75e - [d3d8] Readjust ZBIAS factor
aa7f779 - Support for MW:R HMW-Mod
08925ba - Add comment for AMD AGS crash in HMW-Mod
77d57f2 - [util] Add infinite warfare / iw7-mod support
8710c80 - [dxgi,dxvk] Add support for P010/P016 textures
a5d2ded - [d3d9] Allow cursor clears on hidden HW cursors
9d5dd98 - [d3d9] Add a tiny amount of padding to texture mapping data - [d3d9] Add a tiny amount of padding to texture mapping data
252bdd8 - [util] Remove BlazBlue textureMemory workaround - [d3d9] Add a tiny amount of padding to texture mapping data
c27ef1c - [util] Spoof GPU for Sims 3
fe277ae - Correct a few native API data types
eab5145 - [d3d11] Construct D3D11DXGISurface with only ID3D11Resource. - [d3d11] Implement subresource surface support.
0bf876e - [d3d11] Implement subresource surface support. - [d3d11] Implement subresource surface support.
4bbe487 - [util] Fix crash on resolution change in Pirates
c8d5d8e - [dxgi] Do not update target frame rate when PRESENT_TEST is set
aadc4b4 - meson: Fix glfw dependency request to use correct pkgconfig name
9f97d4c - native: Add define for DECLARE_INTERFACE_IID
31c5433 - [dxvk] Throttle resource eviction
d222d2c - [util] Disable direct image mapping for Total War Pharaoh Dynasties
b02cac8 - [meta] FIFA regex adjustment
4357d63 - [util] Enable zeroMappedMemory for Dawn of War DE
fae3c43 - [meta] Expand the W40K DoW DE regex
6f3a758 - [util] Enable zeroMappedMemory for Nioh 2
1ebe643 - [CI] Run steamrt container with --user 1001
1676dca - [dxvk] Fix build failure with Clang 22
List of ported and adapted DXVK Low Latency commits:
9b15fdc - [dxvk] Switch to implicit VRR frame pacing - Switch to implicit VRR pacing
- Note: No functional changes compared to 691cab5
075d652 - [dxvk] Drop the V-Sync requirement for VRR pacing - Switch to implicit VRR pacing
- Note: No functional changes compared to b7e76e6
1a928cd - [dxvk] Disable V-Sync buffer tracking
cf650d7 - [dxvk] Add VK_KHR_calibrated_timestamps - Use calibrated device timestamps for each submit
- Note: VK_EXT_calibrated_timestamps was used instead of VK_KHR_calibrated_timestamps due to larger coverage of devices and drivers.
ae4db00 - [dxvk] Improve GPU runtime estimation - Use calibrated device timestamps for each submit
177919e - [util] Add ringbuffer allocator - Use calibrated device timestamps for each submit
b10ff87 - [dxvk] Add query pools to the frame pacer - Use calibrated device timestamps for each submit
28a142e - [dxvk] Write device timestamp at the end of each submit - Use calibrated device timestamps for each submit
76592cf - [dxvk] Use device timestamps for frame pacing - Use calibrated device timestamps for each submit
96f0acf - [dxvk] Check VK_KHR_calibrated_timestamps support for frame pacing - Use calibrated device timestamps for each submit
- Note: VK_EXT_calibrated_timestamps was used instead of VK_KHR_calibrated_timestamps due to larger coverage of devices and drivers.
- IMPORTANT: A timestampValidBits check can't be implemented, due to the DXVK 2.6.x codebase. To check whether a GPU has timestampValidBits = 64, use the Vulkan GPUInfo website, section "Queue Families". Example - NVIDIA GeForce GT730M.
- IMPORTANT: On devices that do not have timestampValidBits = 64 (Adreno/Mali/PowerVR etc. mobile GPUs, Intel iGPUs, Intel ARC A GPU Series, possibly more) DXVK-GPLALL will crash applications randomly if dxvk.framePace is set to dxvk.framePace = "low-latency" (default value) or dxvk.framePace = "low-latency-vrr-x". To use DXVK-GPLALL 2.6.6-1 on such devices, set dxvk.framePace = "max-frame-latency" or dxvk.framePace = "min-latency".
55558cb - [dxvk] Fix GPU runtime in GpuProgress
330afbd - [dxvk] Improve robustness to stutters
40e778a - [dxvk] Enable calibrated timestamps only for low-latency modes - Minor optimizations
- IMPORTANT: This commit makes it possible for DXVK-GPLALL to work on devices that do not have timestampValidBits = 64 (Adreno/Mali/PowerVR etc. mobile GPUs, Intel iGPUs, Intel ARC A GPU Series, possibly more) in dxvk.framePace = "max-frame-latency" or dxvk.framePace = "min-latency" modes.
159ff57 - [dxvk] Fix ringbuffer allocator bug - Minor optimizations
eeb9e83 - [dxvk] Reduce memory footprint of latency markers - Minor optimizations
e554b1b - [dxvk] Optimize DxvkSharedAllocationCache::freeAllocation - Minor optimizations
c42092a - [dxvk] Optimize DxvkCsChunkPool - Minor optimizations
b81475c - [dxvk] Optimize sync::Fence - Minor optimizations
96daf3c - [util] Add sync::AtomicSignal based on WaitOnAddress / futex - Threaded sleep
5311d42 - [dxvk] Add ThreadedSleep - Threaded sleep
6ccd7b4 - [dxvk] Add frame duration measurement for app-thread - Jitter display
9b85b1e - [dxvk] Add jitter tracking - Jitter display
07d5b3e - [hud] Change initialization location of latency hud items - Jitter display
9659672 - [hud] Add jitter hud item - Jitter display
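The timestampValidBits guidance in the notes above can be summarised as a small decision rule. The sketch below is a hypothetical helper, not part of DXVK-GPLALL: it assumes you have already looked up timestampValidBits for your GPU (for example in the "Queue Families" section of the Vulkan GPUInfo website) and it returns the dxvk.conf line that the notes recommend. The function name, the fallback parameter and the choice of "max-frame-latency" as the default fallback are assumptions made for illustration.

```python
# Modes that the release notes list as working without timestampValidBits = 64.
SAFE_MODES = ("max-frame-latency", "min-latency")


def frame_pace_line(timestamp_valid_bits: int,
                    fallback: str = "max-frame-latency") -> str:
    """Return a dxvk.conf line for dxvk.framePace (hypothetical helper).

    timestamp_valid_bits is the value reported for the GPU's queue family.
    fallback is used when that value is not 64 and must be one of SAFE_MODES.
    """
    if fallback not in SAFE_MODES:
        raise ValueError(f"fallback must be one of {SAFE_MODES}")
    if timestamp_valid_bits == 64:
        mode = "low-latency"   # default value according to the release notes
    else:
        mode = fallback        # avoids the low-latency modes on such devices
    return f'dxvk.framePace = "{mode}"'


print(frame_pace_line(64))
print(frame_pace_line(36, fallback="min-latency"))
```

The returned line can be pasted into dxvk.conf. Since the release itself cannot perform this check, the lookup has to be done manually before choosing a mode.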