github pdfcpu/pdfcpu v0.7.0

Hello!

🧑‍🔬 We packed lots of goodies into this release for you.

Performance

You will like this ✨
Thanks to @fancycode we have improved PDF parsing significantly.
While this is not easily comparable, running the pdfcpu test suite now takes about 8 seconds less user CPU time under macOS 14.2.1 (67.60s before, 59.64s after):

Before:

./coverage.sh  67.60s user 13.35s system 119% cpu 1:07.93 total

After:

./coverage.sh  59.64s user 12.55s system 107% cpu 1:07.01 total

PDF 2.0 Support

We now have basic support for writing back PDF 2.0 files.
This means you may start using all pdfcpu operations that update validated PDF 2.0 files.
Basic support means your mileage may vary, especially when you try to process a file that uses one of the new 2.0 features.

Since it is hard to get hold of PDF 2.0 files that use a specific new 2.0 feature, a disclaimer is printed on the command line asking for your input and contribution. Please open an issue and share your file if pdfcpu has a problem digesting it.
The same applies if you just want to see some specific 2.0 feature supported.

In general, please 🙏🏻 report back any issues - there is no way to fix something that does not get reported!

New Zoom Command

pdfcpu zoom [-p(ages) selectedPages] -- description inFile [outFile]

Zoom in/out of selected pages either by magnification factor or corresponding margin.
When zooming out, the unused page space results in horizontal and vertical margins.
These differ from each other, but both correspond to the same factor.
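
The relation between factor and margins can be sketched with a little arithmetic. This is only an illustration, assuming the scaled content stays centered on a page of unchanged size; it is not taken from the pdfcpu source. zoom_margins and zoom_factor_from_hmargin are hypothetical helper names, and all values are in points.

```shell
# Hypothetical helpers, assuming centered content on an unchanged page size.

# $1 = page width, $2 = page height, $3 = zoom factor
# Prints "<hmargin> <vmargin>" (left/right margin, top/bottom margin).
zoom_margins() {
    awk -v w="$1" -v h="$2" -v f="$3" \
        'BEGIN { printf "%.2f %.2f\n", (w - w * f) / 2, (h - h * f) / 2 }'
}

# $1 = page width, $2 = horizontal margin
# Prints the zoom factor that leaves this margin on the left and right.
zoom_factor_from_hmargin() {
    awk -v w="$1" -v m="$2" 'BEGIN { printf "%.4f\n", 1 - 2 * m / w }'
}

zoom_margins 595 842 0.5   # prints "148.75 210.50"
```

On a page that is not square the two margins come out different for the same factor, which is why a single hmargin or vmargin value is enough to fix the magnification.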

Examples:

Zoom in to a magnification of 200%

pdfcpu zoom -- "factor: 2"  in.pdf out.pdf

Zoom out to magnification of 50%

pdfcpu zoom -- "factor: .5" in.pdf out.pdf

Zoom out to a magnification equivalent to a horizontal margin of 1 cm

pdfcpu zoom -unit cm -- "hmargin: 1" in.pdf out.pdf

Zoom out to a magnification equivalent to a vertical margin of 30 points.
Draw a border around zoomed out page content and fill unused page space light gray

pdfcpu zoom -- "vmargin: 30, border:true, bgcolor:lightgray" in.pdf out.pdf ... 

For more, please consult pdfcpu help zoom and the official documentation.

Enhanced Booklet command

Thanks to @adamgreenhall we have an even more powerful booklet command for producing zines:

We now have booklet styles 2, 4, 6 and 8 and you may choose one of the following booklet types, each representing a certain method for arranging pages into a booklet:

booklet, bookletadvanced, perfectbound

Examples:

Arrange pages of in.pdf 2 per sheet side (4 per sheet, back and front) onto out.pdf

pdfcpu booklet -- "formsize:Letter" out.pdf 2 in.pdf

Arrange pages of in.pdf 4 per sheet side (8 per sheet, back and front) onto out.pdf:

pdfcpu booklet -- "formsize:Ledger" out.pdf 4 in.pdf

Arrange pages of in.pdf 6 per sheet side (12 per sheet, back and front) onto out.pdf

pdfcpu booklet -- "formsize:Ledger" out.pdf 6 in.pdf

Arrange pages of in.pdf 8 per sheet side (16 per sheet, back and front) onto out.pdf

pdfcpu booklet -- "formsize:A3" out.pdf 8 in.pdf

Arrange pages of in.pdf 4 per sheet side, with short-edge binding onto out.pdf

pdfcpu booklet -- "formsize:A3, binding:short" out.pdf 4 in.pdf

Arrange pages of in.pdf 2 per sheet side as a sequence of folios covering 4*foliosize pages each.

pdfcpu booklet -- "formsize:A4, multifolio:on" hardbackbook.pdf 2 in.pdf

Arrange pages of in.pdf 2 per sheet side, arranged for perfect binding, onto out.pdf

pdfcpu booklet -- "formsize:A4, btype:perfectbound" out.pdf 2 in.pdf

Arrange pages of in.pdf 4 per sheet side, arranged for advanced binding, onto out.pdf

pdfcpu booklet -- "formsize:A3, btype:bookletadvanced" out.pdf 4 in.pdf

For more, please consult pdfcpu help booklet and the official documentation.
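
The per-sheet counts in the booklet examples (n per sheet side, 2n per sheet) also tell you roughly how much paper a job needs. Here is a minimal sketch, assuming leftover slots on the last sheet simply stay blank; booklet_sheets is a hypothetical helper, not a pdfcpu command.

```shell
# Hypothetical helper: number of physical sheets for a booklet.
# $1 = page count of in.pdf
# $2 = n, pages per sheet side (2, 4, 6 or 8)
booklet_sheets() {
    pages=$1
    per_sheet=$(( $2 * 2 ))                          # front and back
    echo $(( (pages + per_sheet - 1) / per_sheet ))  # integer division, rounded up
}

booklet_sheets 20 4   # prints 3 (8 pages per sheet)
```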

Configuration Changes

There are two changes to the configuration:

  1. validationNone was eliminated
  2. postProcessValidate is new and enables safeguard validation

Validation mode ValidationNone has been eliminated for two reasons.
First, a lot happens during validation that command processing needs, such as internalizing and caching.
Second, PDF validation has become quite performant.

We are introducing the new config flag postProcessValidate.
This flag, which is turned on by default, enables validation of your processed cross reference table right before writing.
This is a useful safeguard: if a problematic cross reference table gets written back without an error,
only the next read/parse/validation attempt will notice the problem.
If you disable this flag you will get an additional performance boost overall, but with the caveat described above.

As usual, please renew your configuration!

Form filling now expects the user font Roboto-Regular when using Eastern European scripts.
You can renew your configuration manually or just remove your pdfcpu configuration altogether and recreate it like so:

  1. Locate the pdfcpu folder using pdfcpu conf
  2. Remove/backup the pdfcpu folder
  3. Recreate a brand new pdfcpu folder by executing any pdfcpu command on the CLI, e.g. run pdfcpu conf one more time
  4. Edit your configuration
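
Step 2 of the list above can be scripted. The following is a minimal sketch: backup_pdfcpu_conf is a hypothetical helper, and the folder path is passed in by hand (as located in step 1) because the exact output format of pdfcpu conf is not assumed here.

```shell
# Hypothetical helper: move a pdfcpu config folder aside so pdfcpu can
# recreate a fresh one on its next run.
# $1 = path of the pdfcpu folder, as located in step 1
# Prints the path of the backup folder on success.
backup_pdfcpu_conf() {
    conf_dir=$1
    if [ ! -d "$conf_dir" ]; then
        echo "not a directory: $conf_dir" >&2
        return 1
    fi
    # A timestamp suffix keeps earlier backups from being overwritten.
    backup_dir="${conf_dir}.bak.$(date +%Y%m%d%H%M%S)"
    mv "$conf_dir" "$backup_dir" || return 1
    echo "$backup_dir"
}
```

After calling backup_pdfcpu_conf with your folder path, run pdfcpu conf once more to recreate the folder (step 3), then carry over any custom settings from the backup by hand (step 4).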

Samples And Tests

All of this complements the official documentation.

To get a better understanding of pdfcpu's operations please make sure you check out all tests and the corresponding PDF output and all json input where appropriate:

pdfcpu/pkg/samples/* comes loaded with 230 MB worth of PDFs produced by corresponding tests and json input located at:

  • pdfcpu/pkg/api/test
  • pdfcpu/pkg/testdata/json

Thanks

🙏 to all bug reporters and feature requestors.
Special thanks for contributed PRs go to @adamgreenhall, @fancycode, @kalimit, @sivukhin and @afh

Little Commercial Break

pdfcpu is in need of more frequent financial supporters!
Please consider becoming a sponsor especially if you are a (small) business 🙏
If you are a developer within a business please go to your superior or team lead and have them compare the benefits/costs vs. commercial solutions. If you prefer to operate in stealth mode that's fine - you can always become a private sponsor.
What's important is to keep the project funded and on a clear, steady path 🚀

Meet The Maintainer

I will be in the San Francisco Bay Area this fall.
If you are a recurring sponsor, or not a sponsor but a business using pdfcpu, I would like to get to know you and your pdfcpu use case. I'll also be happy to meet one-on-one, possibly over 🍻, for a technical chat/discussion and to get feedback right from the trenches.
Just get in touch with me: hhrutter@gmail.com

Next Steps

Support for PDF 2.0 encryption will be tackled next, after that digital signatures.
A Beta version is within reach 👍🏻

Have fun 💚 with pdfcpu!
