Bugfixes
- Fixed `extract_block.py`, which was incorrectly using `printf` instead of `print`.
- Support LZ4 compression levels above 9.
Features
- Added `--filter` option to support simple (rsync-like) filter rules. This was driven by a discussion on github #6.
- Added `--input-list` option to support reading a list of input files from a file or stdin. At least partially fixes github #6.
- The compression code has been made more modular. This should make it much easier to add support for more compression algorithms in the future.
- Added support for Brotli compression. This is generally much slower at compression than ZSTD or LZMA, but faster than LZMA at decompression, while offering a compression ratio better than ZSTD. Fixes github #76.
- Added support for choosing the file hashing algorithm using the `--file-hash` option. This allows you to pick a secure hash instead of the default XXH3. Also fixes github #92.
- Improved de-duplication algorithm to only hash files with the same size. File hashing is delayed until at least one more file with the same size is discovered. This happens automatically and should improve scanning speed, especially on slow file systems.
- Added `--max-similarity-size` option to prevent similarity hashing of huge files. This saves scanning time, especially on slow file systems, and shouldn't affect the compression ratio too much.
- Honour user locale when formatting numbers.
- Added `--num-scanner-workers` option.
- Added support for extracting corrupted file systems with `dwarfsextract`. This is enabled using the `--continue-on-error` and, if really needed, `--disable-integrity-check` options. Fixes github #51.
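
The size-first de-duplication described in the list above can be illustrated with a minimal Python sketch. This is not the actual implementation: the class name `SizeFirstDeduplicator`, the `read_file` callback, and the `hash_count` counter are hypothetical, and SHA-256 from the standard library stands in for the default XXH3. The idea shown is that the first file of a given size is only remembered, and it is hashed later, if and when a second file of the same size turns up.

```python
import hashlib
from collections import defaultdict


class SizeFirstDeduplicator:
    """Illustrative only: hash a file once another file of the same size appears."""

    def __init__(self, read_file, hash_name="sha256"):
        self._read_file = read_file         # hypothetical callback: path -> bytes
        self._hash_name = hash_name         # SHA-256 as a stand-in for XXH3
        self._pending = {}                  # size -> first path seen, not hashed yet
        self._by_hash = defaultdict(dict)   # size -> {digest: first path with it}
        self.hash_count = 0                 # number of files actually hashed

    def _digest(self, path):
        self.hash_count += 1
        return hashlib.new(self._hash_name, self._read_file(path)).hexdigest()

    def add(self, path, size):
        """Return the path this file duplicates, or None if it is unique so far."""
        if size not in self._pending and size not in self._by_hash:
            # First file of this size: remember it, but do not hash it yet.
            self._pending[size] = path
            return None

        table = self._by_hash[size]
        first = self._pending.pop(size, None)
        if first is not None:
            # A second file of this size appeared: hash the delayed one now.
            table[self._digest(first)] = first

        original = table.setdefault(self._digest(path), path)
        return None if original == path else original


# In-memory stand-in for a file system (hypothetical data).
files = {
    "a.txt": b"hello",
    "b.txt": b"world",
    "c.txt": b"hello",
    "d.bin": b"unique size",
}
dedup = SizeFirstDeduplicator(files.__getitem__)
results = {path: dedup.add(path, len(data)) for path, data in files.items()}
```

In this example `d.bin` is never read for hashing because no other file shares its size, which is where the saving on slow file systems would come from.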
Other
- Added unit tests for progress class.
- Lots of internal cleanups.