Maintenance Release

PDF 2.0 Support

PDF 2.0 encryption is now supported and you are free to use the following commands with your PDF 2.0 input files:

encrypt
decrypt
changeopw
changeupw
permissions

Performance

We can report another 🚀 parser improvement by @fancycode, resulting in a significant performance boost and lower memory overhead, especially for large files:

Before:

$ time go run test.go 
2024/03/21 09:03:55.874443 Parsing ...
2024/03/21 09:04:07.947987 Done, uses 4244 MiBytes heap memory, 6755 MiBytes system memory
2024/03/21 09:04:07.948013 Parsed 1133 pages

real	0m12,743s
user	0m21,830s
sys	0m2,589s

After:

$ time go run test.go 
2024/03/21 09:04:30.639673 Parsing ...
2024/03/21 09:04:30.899588 Done, uses 12 MiBytes heap memory, 11 MiBytes system memory
2024/03/21 09:04:30.899609 Parsed 1133 pages

real	0m0,568s
user	0m0,881s
sys	0m0,228s
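
The test.go used for these measurements is not part of these notes. As a rough idea of how a harness could produce a "uses N MiBytes heap memory, M MiBytes system memory" line, here is a minimal sketch based on Go's runtime.MemStats. The choice of the HeapAlloc and Sys fields, the helper names toMiB and measure, and the placeholder workload are all assumptions made for illustration, not the original code.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// toMiB converts a byte count to whole mebibytes (1 MiB = 1024 * 1024 bytes).
func toMiB(b uint64) uint64 {
	return b / (1024 * 1024)
}

// measure runs work, then returns the elapsed wall time and a one-line
// memory summary taken after work has finished.
// Assumption: heap usage is read from HeapAlloc and system memory from Sys;
// the original harness may have used different MemStats fields.
func measure(work func()) (time.Duration, string) {
	start := time.Now()
	work()
	elapsed := time.Since(start)

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	summary := fmt.Sprintf("uses %d MiBytes heap memory, %d MiBytes system memory",
		toMiB(m.HeapAlloc), toMiB(m.Sys))
	return elapsed, summary
}

func main() {
	// Placeholder workload: a real harness would parse a PDF file here.
	var chunks [][]byte
	elapsed, summary := measure(func() {
		for i := 0; i < 64; i++ {
			chunks = append(chunks, make([]byte, 1<<20))
		}
	})
	fmt.Printf("Done in %s, %s\n", elapsed, summary)
	fmt.Println("Allocated chunks:", len(chunks))
}
```

Reading the statistics right after the work completes, while its results are still referenced, is what makes the heap figure reflect the memory held by the parsed data rather than an already collected heap.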

Configuration Changes

We have added options to skip some optimization steps or disable internal optimization altogether:

If you disable the following option, there will be no internal optimization of the cross reference table once it is loaded into memory.
This only affects commands that do not rely on optimization; commands that do, e.g. optimize, are not affected.

# toggle optimization
optimize: true

Disabling the following option skips the parsing of page content streams that is used to detect unused resources like images or fonts.

# optimize page resources via content stream analysis.
optimizeResourceDicts: true

The following option determines whether pdfcpu will scan for and remove duplicate content streams.

# optimize duplicate content streams across pages.
optimizeDuplicateContentStreams: false
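
To make the interplay of the three options easier to picture, here is a minimal sketch in Go. It assumes that optimize acts as a master switch and that the other two options only matter while it is enabled, which is how the descriptions above read. The Config struct and the plannedSteps and describe helpers are hypothetical names for illustration and are not pdfcpu's actual types or functions.

```go
package main

import (
	"fmt"
	"strings"
)

// Config is a hypothetical struct mirroring the three config.yml keys above.
// It is not pdfcpu's actual configuration type.
type Config struct {
	Optimize                        bool // optimize
	OptimizeResourceDicts           bool // optimizeResourceDicts
	OptimizeDuplicateContentStreams bool // optimizeDuplicateContentStreams
}

// plannedSteps lists the optimization steps that would run, assuming the
// master switch gates the two finer-grained options.
func plannedSteps(c Config) []string {
	if !c.Optimize {
		return nil // internal optimization disabled altogether
	}
	steps := []string{"cross reference table"}
	if c.OptimizeResourceDicts {
		steps = append(steps, "page resources")
	}
	if c.OptimizeDuplicateContentStreams {
		steps = append(steps, "duplicate content streams")
	}
	return steps
}

// describe renders the planned steps as a single line.
func describe(c Config) string {
	steps := plannedSteps(c)
	if len(steps) == 0 {
		return "no internal optimization"
	}
	return fmt.Sprintf("%d step(s): %s", len(steps), strings.Join(steps, ", "))
}

func main() {
	// Values as shown in the config snippets above.
	shown := Config{Optimize: true, OptimizeResourceDicts: true, OptimizeDuplicateContentStreams: false}
	fmt.Println(describe(shown))

	// With the master switch off, the other two flags no longer matter.
	off := shown
	off.Optimize = false
	fmt.Println(describe(off))
}
```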

⚡ Caution is advised: you need to know what you are doing when using these options.
Tuning or turning off optimization can make sense in environments where you deal with large PDF files that usually share the same structure, so there are no surprises.

Since the pdfcpu configuration has changed you are encouraged to recreate your config.yml:

Locate your config.yml using pdfcpu conf
Remove or back up your config.yml
Create a new config.yml from scratch by executing any pdfcpu command on the CLI, e.g. run pdfcpu conf one more time
Edit your configuration

Thanks

to all of you test driving pdfcpu and reporting 🐛s along the way.
Special PR thanks 👍🏻 also to @adamgreenhall for improving the booklet command and to @xelan as well.

Changelog

576f15e Bump version
38b2992 Fix #851
41333df Cancel parsing in "buffer" if context is cancelled.
b462c01 Handle case where referenced stream length does not exist.
ca6d15e Avoid pointer receiver and don't call PDFString of lazy objects internally.
91619f0 Write out LazyObjectStreamObject without temporary decoding.
df5d53d Lazily decode data of StreamObject objects.
82f1929 filter: Add API to partially decode data.
fc09c1f Lazily parse ObjectStream objects.
7188e6a Fix #852
5bafded Merge PR #855
2d06bc7 Add missing author info
0988c5e Add PDF 2.0 encryption
f783bf2 Fix #834
87abdcc Fix #849
d568466 Fix #844
deb697d Fix #847
a647579 Fix #843
05d2d1f Fix #841
6d95797 Fix #839
9d76f84 Merge in PR #817
b9c7a89 Fix #838
c5db1d9 Fix #826
26c5fb2 Fix #826
aff022f Fix #835, Add config flags for optimization
3282d8a Fix #823
6158a91 Another fix for #828
57030ec Fix #828
5ccea97 Fix #135
fd34b05 Fix #821

pdfcpu/pdfcpu v0.8.0 on GitHub