github pinterf/mvtools 2.7.45
MvTools2 2.7.45 with depans.

Change log
(mvtools2 only, depans are unchanged)

  • 2.7.45 (20210608)
    • Fix: change parameter 'ml' from int to float in MBlockFPS. (Other filters with 'ml' are O.K.: MMask, MFlowInter, MFlowFPS are using float.)
    • Fix the MBlockFPS html doc as well, which mentioned 'thres' instead of 'ml'. Add modes 5-8 to the MBlockFPS docs.
    • Move change log from readme to
    • Code change/speedup: MSuper: rfilter=0 and 1
      8 bit: drop old SSE code, port to SIMD intrinsics. Add SIMD to 16 bit case. Quicker, much quicker.
      (rfilter: Hierarchical levels smoothing and reducing (halving) filter)
    • Code change/speedup: MSuper: sharp=1 for pel=2 or 4
      Bicubic resizer: drop old SSE code, port to SIMD intrinsics, add SIMD intrinsics for the 16 bit case.
      No need for Bilinear.asm and Bilinear-x64.asm any more.
    • SATD: add 8 bit C versions (geee, there wasn't one) (as an alternative to the external asm)
    • SAD: add internal SIMD for 8 bit SAD (SSE4.1) (as an alternative to the external asm)
    • Overlaps: Add internal SIMD for 8 bit. (as an alternative to the external asm)
    • In def.h any existing external assembler file can be disabled.
      The primary reason for this was to quickly test the Linux port; for me this was easier than bothering with asm compilation and linking.
      For non-Windows cases all of these are disabled now.
      • USE_COPYCODE_ASM (CopyCode-a.asm). Has internal alternative. Same speed.
      • USE_OVERLAPS_ASM (Overlap-a.asm). asm implements 8 bit only. Has internal SIMD alternative. About the same speed.
      • USE_SAD_ASM (sad-a.asm). asm implements 8 bit only. Note: the internal 8 bit SIMD SAD is a bit slower than these handcrafted ones.
      • USE_SATD_ASM (Pixel-a.asm). asm implements 8 bit only. Note: SATD 8 bit has no SIMD replacement yet.
      • USE_LUMA_ASM (Variance-a.asm). asm implements 8 bit only. Has internal alternative.
      • USE_FDCT88INT_ASM (fdct_mmx.asm, fdct_mmx_x64.asm).
        Only used for 8x8 block sizes. Quick integer version instead of fftw3.
        No internal alternative, fftw3 is used instead.
      • USE_AVSTP (search for avstp.dll on Windows)
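A minimal sketch of how such a def.h switch can select between an external asm routine and an internal fallback, using 8 bit SAD as the example. Only the USE_SAD_ASM name comes from the list above; sad_8x8_asm, sad_c, SadFn and get_sad_8x8 are hypothetical names made up for illustration, and the real mvtools dispatch code may look quite different.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// In the style of def.h: leave this commented out to drop the external asm.
// #define USE_SAD_ASM

#ifdef USE_SAD_ASM
// Hypothetical symbol that an external assembler object would provide.
extern "C" unsigned int sad_8x8_asm(const uint8_t* src, int src_pitch,
                                    const uint8_t* ref, int ref_pitch);
#endif

// Plain C++ fallback: sum of absolute differences over a W x H block
// of 8 bit pixels. Pitches are given in bytes.
template <int W, int H>
unsigned int sad_c(const uint8_t* src, int src_pitch,
                   const uint8_t* ref, int ref_pitch) {
    assert(src != nullptr && ref != nullptr);
    unsigned int sum = 0;
    for (int y = 0; y < H; ++y) {
        for (int x = 0; x < W; ++x)
            sum += static_cast<unsigned int>(std::abs(src[x] - ref[x]));
        src += src_pitch;
        ref += ref_pitch;
    }
    return sum;
}

using SadFn = unsigned int (*)(const uint8_t*, int, const uint8_t*, int);

// The define decides at compile time which implementation is handed out.
inline SadFn get_sad_8x8() {
#ifdef USE_SAD_ASM
    return sad_8x8_asm;
#else
    return sad_c<8, 8>;
#endif
}
```

With the define disabled, nothing from the external asm has to be assembled or linked, which is the point of the switch for a quick non-Windows build.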
    • Minor and not so minor cosmetics, mainly for GCC.
    • Add Cmake build system.
    • MvTools2: Linux/GCC port (needs sse4.1), Dewindowsification.
      fftw3: MAnalyze dct modes that require fftw3 library will search for
      Install either the libfftw3-single3 (deb) or the fftw-devel (rpm) package,
      e.g. sudo apt-get update
      sudo apt-get install libfftw3-dev
    • Not done (will be done in a second phase):
      Add back some external asms for linux build. For 8 bit SAD and SATD mainly.
      The Linux port is still Intel-only, though every part has a C alternative by now.
    • Separate the 3 projects (mvtools2, depan, depan_estimate).
    • DePan and DePanEstimate: Linux port
    • DePanEstimate: add fft_threads variable (default 1) for fftw3 mode (experimental)
    • DePanEstimate: add MT guard around sensitive fftw3 functions
    • mingw build fixes
    • Add build instructions to
    • experimental avx2 for MDegrain1..6 (was not worth it speedwise on my i7700 - memory transfer is the bottleneck)
    • experimental 32-bit float internal Overlaps result buffer for MDegrain1..6,
      to test whether it is any quicker/more exact than the integer-scaled arithmetic version
      (enabled when out32=true; do not use, it is only for development/test and may be removed in the future)
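To illustrate what "integer-scaled arithmetic" versus a float result buffer means for overlapped blending, here is a rough sketch that mixes two overlapping 8 bit samples both ways. Everything in it is an assumption for illustration: the 11 bit weight scale (kShift), the function names and the rounding are not taken from the mvtools source, and the actual Overlaps code works on whole blocks with window tables.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Assumed fixed-point scale: the two overlapping weights sum to 1 << kShift.
constexpr int kShift = 11;

// Integer-scaled version: wa is an integer weight in [0, 2048].
inline uint8_t blend_int(uint8_t a, uint8_t b, int wa) {
    assert(wa >= 0 && wa <= (1 << kShift));
    const int wb = (1 << kShift) - wa;
    const int acc = a * wa + b * wb;  // at most 255 * 2048, fits in int
    // Add half of the scale before shifting to round to nearest.
    return static_cast<uint8_t>((acc + (1 << (kShift - 1))) >> kShift);
}

// Float version: wa is a weight in [0.0, 1.0], accumulated in 32-bit float.
inline uint8_t blend_float(uint8_t a, uint8_t b, float wa) {
    assert(wa >= 0.0f && wa <= 1.0f);
    const float acc = a * wa + b * (1.0f - wa);
    return static_cast<uint8_t>(std::lround(acc));
}
```

The integer path is limited by how finely the weights can be quantized and by any intermediate shifts, while the float path avoids that at the cost of a wider buffer. Whether that pays off in practice is exactly what the experimental option is meant to find out.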