Big "news": work is underway to handle SHA-1 collisions
Disclaimer: is this a real issue? Can my backups become broken?
In short, no. The colliding SHA-1 files were created "in the lab" to prove that collisions exist. In the "real world" I consider this kind of problem extremely unlikely (bordering on impossible), so I believe it is safe to use zpaqfranz to make backups. After all, one of the first functions I implemented was an enhancement to the t command, which has done collision detection in zpaqfranz for years. Short version: if you want to be sure, use the t command to test the archives you have created. The new collision command is faster than t, but it targets only collisions, whereas t also tests file integrity. Finally, there are the paranoid commands and switches for people like me. The real problem is maintaining backward compatibility with zpaq 7.15. And, believe me, it is not easy at all.
Switch -collision in add
zpaqfranz can now recover from a SHA-1 collision within the current version of the archive. In other words: if you are a bit paranoid, you can make sure that files with a SHA-1 collision will be extracted correctly.
Let's see a collision-aware zpaqfranz detecting a problem and fixing it
First of all: an older zpaqfranz, with default settings, says... nothing
release\58_10\zpaqfranz a z:\bydefaultnothing.zpaq message*
zpaqfranz v58.10o-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-10-01)
franz:-hw
Creating z:/bydefaultnothing.zpaq at offset 0 + 0
Add 2023-11-10 12:54:11 2 1.280 ( 1.25 KB) 32T (0 dirs)
2 +added, 0 -removed.
0 + (1.280 -> 640 -> 1.803) = 1.803 @ 40.32 KB/s
0.031 seconds (000:00:00) (all OK)
An older zpaqfranz, with -verify, shows early on that something is wrong
release\58_10\zpaqfranz a z:\doverify.zpaq message* -verify
zpaqfranz v58.10o-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-10-01)
franz:-hw -verify
Creating z:/doverify.zpaq at offset 0 + 0
Add 2023-11-10 12:54:55 2 1.280 ( 1.25 KB) 32T (0 dirs)
29604 SOMETHING WRONG ON messageA
GURU-C: on file messageA
GURU: CRC-32 from fragments 92433266
GURU: CRC-32 from file 072E2B0E
2 +added, 0 -removed.
(...)
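The CRC-32 mismatch reported above ("from fragments" versus "from file") can be illustrated with a toy model. This is only a sketch, not zpaqfranz code: `ToyDedupStore`, `weak_hash`, `verify` and `find_colliding_pair` are hypothetical names, and the hash is deliberately truncated to one byte of SHA-1 so that a collision is easy to produce for the demo.

```python
import hashlib
import zlib


def weak_hash(data: bytes) -> str:
    # Assumption for the demo: keep only 1 byte of SHA-1 so collisions are easy to find.
    return hashlib.sha1(data).hexdigest()[:2]


class ToyDedupStore:
    """Hypothetical deduplicating store: fragments are keyed by their hash."""

    def __init__(self, hash_fn=weak_hash):
        self.hash_fn = hash_fn
        self.fragments = {}  # hash -> fragment bytes
        self.files = {}      # file name -> hash

    def add(self, name: str, data: bytes) -> None:
        key = self.hash_fn(data)
        # Deduplication: a known hash is assumed to mean "same data", so nothing new is stored.
        self.fragments.setdefault(key, data)
        self.files[name] = key

    def restore(self, name: str) -> bytes:
        return self.fragments[self.files[name]]


def verify(store: ToyDedupStore, name: str, original: bytes) -> bool:
    # Compare the CRC-32 of what would be restored with the CRC-32 of the real file.
    return zlib.crc32(store.restore(name)) == zlib.crc32(original)


def find_colliding_pair(hash_fn):
    # Brute-force two different messages with the same (weak) hash.
    seen = {}
    i = 0
    while True:
        data = f"message{i}".encode()
        key = hash_fn(data)
        if key in seen:
            return seen[key], data
        seen[key] = data
        i += 1


a, b = find_colliding_pair(weak_hash)
store = ToyDedupStore()
store.add("messageA", a)
store.add("messageB", b)
print("messageA verified:", verify(store, "messageA", a))
print("messageB verified:", verify(store, "messageB", b))
```

In this toy the second file is silently deduplicated against the first, so only an independent checksum of the restored data reveals the problem.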
OK, let's try the brand-new 58.11
zpaqfranz a z:\collision message* -collision
zpaqfranz v58.11z-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-11-10)
franz:-collision -hw
Creating z:/collision.zpaq at offset 0 + 0
Add 2023-11-10 12:31:55 2 1.280 ( 1.25 KB) 32T (0 dirs)
29604 SOMETHING WRONG ON messageA
GURU-C: on file messageA
GURU: CRC-32 from fragments 92433266
GURU: CRC-32 from file 072E2B0E
2 +added, 0 -removed.
0 + (1.280 -> 640 -> 1.803) = 1.803 @ 78.12 KB/s
##################
87571: Restoring this file will get incorrect data due to suspected SHA-1 collision(s)
<<messageB>>
#################
87538: SHA-1 collision detection time 31 ms
87653: Need a second pass on <<messageB>>
z:/collision.zpaq:
1 versions, 2 files, 1.803 bytes (1.76 KB)
AVAILABLE -stdout 1
Updating z:/collision.zpaq at offset 1.803 + 0
Add 2023-11-10 12:31:55 1 640 ( 640.00 B) 32T (0 dirs)
Warning: adjusting date from 2023-11-10 12:31:55 to 2023-11-10 12:31:56
1 +added, 0 -removed.
1.803 + (640 -> 640 -> 1.764) = 3.567 @ 9.92 KB/s
Now we extract the colliding files, then check them
Please note: we are extracting WITH zpaq 7.15, NOT with zpaqfranz! (Backward compatibility is fully preserved, which is hard to achieve.)
zpaq64 x z:\collision.zpaq -to z:\zpaqfixed
diff z:\zpaqfixed\messageA z:\zpaqfixed\messageB
Files z:\zpaqfixed\messageA and z:\zpaqfixed\messageB differ
The two files (messageA and messageB) are different (aka: restoring is OK even with a SHA-1 collision)
Let's try the same with zpaq 7.15
zpaq64 a z:\undetected.zpaq message* -summary 1
zpaq v7.15 journaling archiver, compiled Aug 17 2016
Creating z:/undetected.zpaq at offset 0 + 0
Adding 0.001280 MB in 2 files -method 14 -threads 32 at 2023-11-10 11:37:02.
2 +added, 0 -removed.
0.000000 + (0.001280 -> 0.000640 -> 0.001739) = 0.001739 MB
0.031 seconds (all OK)
zpaq64 x z:\undetected.zpaq -to z:\zpaq715restored
diff z:\zpaq715restored\messageA z:\zpaq715restored\messageB
The two files (messageA and messageB) are THE SAME (aka: zpaq 7.15 fails to restore correctly with a SHA-1 collision)
In this release the recovery mechanism works within a single version. If there is a collision between two files in two different versions, it will not be possible to restore them (I am working on it).
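The "second pass" seen in the 58.11 log can be pictured with a toy journaling model. This is a sketch under stated assumptions, not the actual zpaqfranz implementation: `ToyJournal` and `add_with_collision_pass` are hypothetical, the hash is truncated to one byte of SHA-1 to force a collision, and how zpaqfranz really bypasses deduplication is not described in these notes. The idea shown is only that a file found to be wrong is appended again as a newer entry (one second later), so an extractor that takes the latest entry gets the right data.

```python
import hashlib
import zlib


def weak_hash(data: bytes) -> str:
    # Assumption for the demo: 1 byte of SHA-1, so collisions are trivial to find.
    return hashlib.sha1(data).hexdigest()[:2]


def find_colliding_pair():
    seen, i = {}, 0
    while True:
        data = f"message{i}".encode()
        key = weak_hash(data)
        if key in seen:
            return seen[key], data
        seen[key] = data
        i += 1


class ToyJournal:
    """Hypothetical append-only archive: fragments plus a list of versions."""

    def __init__(self):
        self.fragments = []  # append-only fragment data
        self.index = {}      # hash -> fragment id (deduplication table)
        self.versions = []   # append-only list of (timestamp, {name: fragment id})

    def _store(self, data: bytes, dedup: bool) -> int:
        key = weak_hash(data)
        if dedup and key in self.index:
            return self.index[key]  # reuse the existing fragment
        self.fragments.append(data)
        frag_id = len(self.fragments) - 1
        self.index.setdefault(key, frag_id)
        return frag_id

    def add_version(self, timestamp: int, files: dict, dedup: bool = True) -> None:
        entry = {name: self._store(data, dedup) for name, data in files.items()}
        self.versions.append((timestamp, entry))

    def extract(self) -> dict:
        # Later entries override earlier ones, as in a journaling archive.
        latest = {}
        for _, entry in self.versions:
            latest.update(entry)
        return {name: self.fragments[fid] for name, fid in latest.items()}


def add_with_collision_pass(journal: ToyJournal, timestamp: int, files: dict) -> list:
    journal.add_version(timestamp, files)
    restored = journal.extract()
    # Files whose restored CRC-32 differs from the source need a second pass.
    broken = {n: d for n, d in files.items()
              if zlib.crc32(restored[n]) != zlib.crc32(d)}
    if broken:
        # Second pass: store the data again without deduplication, one second later.
        journal.add_version(timestamp + 1, broken, dedup=False)
    return sorted(broken)


a, b = find_colliding_pair()
files = {"messageA": a, "messageB": b}

plain = ToyJournal()
plain.add_version(1000, files)
print("without second pass, messageB correct:", plain.extract()["messageB"] == b)

journal = ToyJournal()
fixed = add_with_collision_pass(journal, 1000, files)
print("second pass needed on:", fixed)
print("with second pass, messageB correct:", journal.extract()["messageB"] == b)
```

Because the fix is just another appended entry, the toy extractor needs no knowledge of collisions, which mirrors why a plain zpaq 7.15 can extract the corrected archive.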
Command collision
Quickly check for SHA-1 collisions inside an archive. This is faster than a "full" t (test)
zpaqfranz collision z:\1.zpaq
zpaqfranz collision z:\1.zpaq -all
Command dump
Show the internal structure of non-multipart, non-encrypted archives
zpaqfranz dump z:\kajo.zpaq
zpaqfranz dump z:\kajo.zpaq -verbose
zpaqfranz dump z:\kajo.zpaq -summary
zpaqfranz dump z:\kajo.zpaq -verbose -summary
"Truncate-Touching"
Restore the archive's previous timestamp whenever no update is made
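The general idea of putting a timestamp back after a no-op can be sketched as follows. This is an illustration only: `run_preserving_timestamp` is a hypothetical helper, and treating "file size unchanged" as "no update was made" is an assumption made for the sketch, not a statement about how zpaqfranz decides.

```python
import os


def run_preserving_timestamp(path, operation):
    """Run operation(path); if the file size did not change, restore the old timestamps.

    Returns True when the timestamps were restored, False otherwise.
    """
    before = os.stat(path)
    operation(path)
    after = os.stat(path)
    # Assumption: an unchanged size means nothing was appended to the archive.
    if after.st_size == before.st_size:
        os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
        return True
    return False
```

A backup tool that opens the archive for writing but ends up adding nothing would otherwise leave a fresh modification time behind, which can confuse sync tools that rely on timestamps.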
Arch Linux AUR package
-collision -kill
This is just the first release of the "full scale" SHA-1 recovery function