MvTools2 + Depan + DepanEstimate
(Highlights: Masks, Speed, DepanEstimate)
MvTools2 2.7.19.22 (20170525)
New: [MMask] Supports any planar input video format, e.g. greyscale, planar RGB.
The input clip can even differ in bit depth or format from the vectors' original format
For kind=5, where U and V are filled, greyscale is not allowed
Mod: [MMask] Faster: request source frame only for kind=5.
Fix: [MxxxxFPS,MMask]: MakeVectorOcclusionMaskTime garbage in bottom blocks (30 hrs of debugging)
Fix: [MMask] bottom padding garbage for padded frame dimension
Fix: [MMask] proper scene change values for 10+ bits (defaults: 1023, 4095, 16383, 65535 for 10, 12, 14, 16 bits. Was: 65535)
The parameter itself is still given in the 8-bit range
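As a rough illustration of the scene change defaults in the MMask fix above, the sketch below derives the maximum pixel value for each bit depth and scales an 8-bit-range parameter up to it. The function names and the linear scaling rule are assumptions, chosen only because they reproduce the listed values; the plugin's actual scaling code is not shown in this changelog.

```python
# Hypothetical helpers for illustration only, not part of the MvTools2 API.

def max_pixel_value(bits: int) -> int:
    """Largest sample value at the given bit depth, e.g. 1023 for 10 bits."""
    if not 8 <= bits <= 16:
        raise ValueError("bit depth must be between 8 and 16")
    return (1 << bits) - 1


def scale_from_8bit(value_8bit: int, bits: int) -> int:
    """Map a 0..255 parameter to 0..max (assumed: linear scaling with rounding)."""
    if not 0 <= value_8bit <= 255:
        raise ValueError("parameter must be in the 8-bit range 0..255")
    return (value_8bit * max_pixel_value(bits) + 127) // 255


if __name__ == "__main__":
    for bits in (10, 12, 14, 16):
        # An 8-bit default of 255 lands on the full-scale value of each depth.
        print(bits, scale_from_8bit(255, bits))
```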
Fix: [MRecalculate] prevent overflow during thSAD scaling in 16 bits or large block sizes (32, 48...)
Fix: [DepanEstimate] Reported wrong motion instead of detecting a scene change
Fix: [MAnalyze] Possible overflow in MAnalyze 8 bit, block size 48x48 and above.
Overflow-safe predictor recalc for big block sizes
New: [General] Add block size 12x3 for SAD, allow 6x24
List of available block sizes
64x64, 64x48, 64x32, 64x16
48x64, 48x48, 48x24, 48x12
32x64, 32x32, 32x24, 32x16, 32x8
24x48, 24x32, 24x24, 24x12, 24x6
16x64, 16x32, 16x16, 16x12, 16x8, 16x4, 16x2
12x48, 12x24, 12x16, 12x12, 12x6, 12x3
8x32, 8x16, 8x8, 8x4, 8x2, 8x1
6x24, 6x12, 6x6, 6x3
4x8, 4x4, 4x2
3x6, 3x3
2x4, 2x2
Mod: [Internal] Reorganized 10-16 bit SAD SIMD intrinsics, 8-12% faster for BlkSize 12-32
MvTools2 2.7.18.22 (20170512)
Fix: 10-16 bit: DCT buffer possible overflow
Fix: DCT is fast again for non 8x8 blocksizes. Regression since 2.7.5.22.
New: Chroma SAD is now always half of luma SAD, regardless of video format.
Without this, YV24's luma:chroma SAD ratio is 4:8 instead of the 4:2 of YV12
New: MAnalyze, MRecalculate new parameter: "scaleCSAD" integer, default 0
Fine tune chroma SAD weight relative to luma SAD.
ScaleCSAD values for luma:chroma SAD ratio
-2: 4:0.5
-1: 4:1
0: 4:2 (default, same as the native ratio for YV12)
1: 4:4
2: 4:8
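The scaleCSAD table above follows a power-of-two pattern: each step doubles or halves the chroma part relative to a fixed luma part of 4. A minimal sketch of that relationship, assuming only the values listed above are valid (the function name is hypothetical, not plugin code):

```python
LUMA_PART = 4  # luma side of the luma:chroma SAD ratio in the table


def chroma_part(scale_csad: int) -> float:
    """Chroma side of the ratio for a given scaleCSAD (illustrative only)."""
    if scale_csad not in (-2, -1, 0, 1, 2):
        raise ValueError("scaleCSAD is assumed to be in -2..2, as listed")
    # scaleCSAD=0 gives the YV12-native 4:2; each step is a factor of two.
    return 2.0 * 2.0 ** scale_csad


if __name__ == "__main__":
    for s in range(-2, 3):
        print(f"scaleCSAD={s:2d}: {LUMA_PART}:{chroma_part(s):g}")
```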
New: Block sizes 64, 48, 24, 12, 6
MAnalyze/MRecalculate new block sizes (SATD support mod4 sizes)
List of available block sizes
64x64, 64x48, 64x32, 64x16
48x64, 48x48, 48x24, 48x12
32x64, 32x32, 32x24, 32x16, 32x8
24x48, 24x32, 24x24, 24x12, 24x6
16x64, 16x32, 16x16, 16x12, 16x8, 16x4, 16x2
12x48, 12x24, 12x16, 12x12, 12x6
8x32, 8x16, 8x8, 8x4, 8x2, 8x1
6x24, 6x12, 6x6, 6x3
4x8, 4x4, 4x2
3x6, 3x3
2x4, 2x2
Note: some smaller block sizes are available only in 4:4:4 formats, because chroma subsampling divides the block size
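To make the note above concrete, the sketch below computes the chroma block size that results from dividing a luma block by the subsampling factors of YV12, YV16 and YV24. It assumes that a block size is unusable when the division is not exact; the plugin may impose further limits that this changelog does not state. All names are hypothetical.

```python
from typing import Optional, Tuple

# (horizontal divisor, vertical divisor) applied to chroma planes
SUBSAMPLING = {"YV12": (2, 2), "YV16": (2, 1), "YV24": (1, 1)}


def chroma_block(width: int, height: int, fmt: str) -> Optional[Tuple[int, int]]:
    """Return the chroma block size, or None if the luma block does not divide evenly."""
    div_x, div_y = SUBSAMPLING[fmt]
    if width % div_x or height % div_y:
        return None  # assumed to mean "not available in this format"
    return width // div_x, height // div_y


if __name__ == "__main__":
    for size in ((16, 16), (6, 3), (3, 3)):
        for fmt in SUBSAMPLING:
            print(size, fmt, chroma_block(*size, fmt))
```

Under this assumption, 3x3 and 6x3 divide cleanly only in YV24, which matches the note that some small sizes require a 4:4:4 format.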
New: All block sizes are supported in MDegrain1-6, MDegrainN, and MScaleVect
New: Changed to 2017 version of asm files for 8 bit SAD/SATD functions from x265 project.
Added the previously missing asm code for sizes 12, 24, 48
For some block sizes AVX2 and SSE4 are supported (AVX2 if reported under AviSynth+)
e.g. BlkSize 32 is faster now.
New: MMask SAD mask now gives identical weights for formats other than YV12, e.g. YV24
MvTools2 2.7.17.22 (20170426)
Fix: Regression in 2.7.16.22: MDegrain right pixel artifacts on non-modulo 16 widths
Misc: MMask, mode SADMask: output is further normalized by video subsampling (YV16/YV24 have larger SAD values due to a bigger chroma part than classic YV12)
MvTools2 2.7.16.22 (20170423)
Fix: MMask 10-16 bits
Fix: MRecalculate 14-16 bits passed nSCD1=999999 internally, which caused overflow (and scene change problems later). Fixed by clamping SCD1 to 8*8*(255-0) (the maximum sum of SADs on an 8x8 block)
Misc: MDegrainX 8 bits: internal 16 bit buffer to 8 bits: SSE2
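The SCD1 clamp from the MRecalculate fix in this release can be sketched as follows. The limit 8*8*(255-0) is the largest possible sum of absolute differences over an 8x8 block of 8-bit samples; the function name and the lower bound of 0 are assumptions for illustration, not the plugin's code.

```python
# Largest SAD of an 8x8 block with 8-bit samples: every pixel differs by 255.
MAX_SCD1_8BIT = 8 * 8 * (255 - 0)


def clamp_scd1(n_scd1: int) -> int:
    """Clamp a scene change threshold to the 8-bit, 8x8-block maximum (sketch)."""
    return max(0, min(n_scd1, MAX_SCD1_8BIT))


if __name__ == "__main__":
    # The internal value 999999 mentioned above is reduced to the limit,
    # so later scaling to 14-16 bits starts from a bounded number.
    print(MAX_SCD1_8BIT, clamp_scd1(999999), clamp_scd1(400))
```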
MvTools2 2.7.15.22 (20170316)
Fix: 16 bit SAD for non-AVX code path
Misc: MDegrain1-6: add error on lsb_flag=true for non-8 bit sources
MvTools2 2.7.14.22 (20170206)
Fix: MAnalyze divide=2 showed "vector clip is too small" (inherited from 2.6.0.5; the sanity check was done but the length was not filled for divideextra data)
Fix: MFlow access violation in internal mv resizer when resizing factor was big (MCaWarpSharp3 4x supersampling case), bug introduced in upstream 2.5.11.22
MvTools2 2.7.13.22 (20170201)
Fix: MDegrain1-6,N: 10-16 bit thSCD scaling
Fix: MVShow: tolerance scaling for 10-16 bits
MvTools2 2.7.12.22 (20170120)
New: Faster SATD (dct=5..10) 8 bit: updated x264 function selectors, SSE2/4/AVX/AVX2
+10% speed for a whole typical MDegrain3 process on my i7-3770
New: Much Faster SATD (dct=5..10) 10-16 bit: SSE2/SSE4 instead of C
+50% speed for a whole typical MDegrain3 process (which is approx half speed of 8 bit)
MvTools2 2.7.11.22 (20170116)
New: MDegrain6
Mod: MDegrain1-6 SSE4 for 10-16 bit (was: C. 3-5% gain, wasn't bottleneck)
MvTools2 2.7.10.22 (20161228)
Fix: YV12 debug info display: wrong text placement on chroma planes
(also in the Depan filters)
MvTools2 2.7.9.22 (20161220)
Apply 2.5.11.9-svp analysis speedup, mainly when chroma is involved
MvTools2 2.7.8.22 (20161218)
YUY2 input fix, MDegrain YUY2 plane allocation fix (freeze at script exit)
YUY2 input fix also for DepanStabilize (Depan.dll 2.3.11->2.3.11.1)
MvTools2 2.7.7.22 (20161215)
Yet another speedup; I am trying to get nearer to the classic YV12-only builds.
MvTools2 2.7.6.22 (20161204)
This release contains some fixes and a speedup.
The speedup is relative to the previous releases; performance is now around that of 2.7.0.22d.
Note that the 64-bit version usually yields a 10-20% speed gain over the 32-bit version.
MvTools2 2.7.5.22 (20161119)
Milestone release:
General support of 10-16 bit formats with Avisynth Plus (r2294 or newer recommended)
with new MDegrain4 and MDegrain5 filters.
Depan 2.13.1.2 (20161228)
Fix: DepanStabilize: removed overly strict checking for large motion vectors received from Depan
that produced a result resembling a scene change
Fix: for YV12, the chroma part of the debug info text was positioned at the wrong place
Bundled filters:
Depan: 2.13.1.2 (20161228)
DepanEstimate: 2.10.0.1 (20161228)
General support of 10-16 bit formats with Avisynth Plus (r2294 or newer recommended)
x86 and x64 builds
Windows XP is still supported