github dichovsky/pdf-to-png-converter v4.1.0

8 hours ago

Removed

  • Removed the bespoke NodeCanvasFactory (src/node.canvas.factory.ts) and its tests. Rendering now uses pdf.js's built-in Node canvas factory (PDFDocumentProxy.canvasFactory, backed by @napi-rs/canvas) directly. The previous code already selected this factory at runtime: the isNodeCanvasFactory() duck-type guard always matched pdf.js's own factory, so the project's class and its new NodeCanvasFactory() fallback were never exercised on the render path. pdf.js's canvasFactory is now validated at runtime (it must expose callable create/destroy) rather than force-cast, the render path asserts that both the returned canvas and context are non-null before use, and destroy() receives the exact CanvasAndContext object pdf.js returned (preserving any internal fields it needs for cleanup). The @napi-rs/canvas dependency is unchanged (kept as a direct dependency so pdf.js's renderer can always load it). Rendered PNG output is unchanged: the visual-comparison suites pass. Resolves backlog item ARCH-015.
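
The runtime validation described in this entry could look roughly like the sketch below. The CanvasFactoryLike interface and the assertCanvasFactory name are hypothetical and chosen only for illustration; the notes state just that the factory must expose callable create/destroy, so the parameter and return types here are assumptions.

```typescript
// Hypothetical structural type: only "callable create/destroy" comes from the
// release notes; the parameter and return types are assumptions.
interface CanvasFactoryLike {
    create(width: number, height: number): unknown;
    destroy(canvasAndContext: unknown): void;
}

// Narrow an unknown value to CanvasFactoryLike, throwing instead of force-casting.
function assertCanvasFactory(value: unknown): CanvasFactoryLike {
    if (typeof value !== 'object' || value === null) {
        throw new TypeError('canvasFactory must be an object');
    }
    const candidate = value as Record<string, unknown>;
    if (typeof candidate.create !== 'function' || typeof candidate.destroy !== 'function') {
        throw new TypeError('canvasFactory must expose callable create() and destroy()');
    }
    return value as CanvasFactoryLike;
}
```

A guard of this shape fails with a clear message as soon as the factory is missing a method, where a force-cast would only fail later inside the render call.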

Fixed

  • outputFileMaskFunc now rejects non-string return values before page processing. Previously truthy non-string values passed the separator check through implicit coercion and could escape into metadata results as a non-string name, violating the PngPageOutput contract.
  • Parallel page processing now propagates a worker rejection whose reason is undefined. Previously processPagesWithSlidingWindow() used undefined as both the "no error" sentinel and a possible rejection payload, so that failure was swallowed and the conversion resolved with an undefined page result.
  • PngPageOutput.width / height are now always integer pixel dimensions that match the rendered PNG. Previously they were reported straight from pdf.js's PageViewport, whose lengths are unrounded floats, while @napi-rs/canvas truncates fractional dimensions when it allocates the bitmap. Any PDF whose viewportScale × pageDimension was fractional therefore reported a non-integer size that disagreed with the actual image — e.g. a 595×842 pt (A4) page at viewportScale: 1.5 reported width: 892.5 for an 892 px-wide PNG. Both the render path (renderPdfPage) and the returnMetadataOnly path (getPageMetadata) now floor viewport lengths to pixels via the shared toPixelDimension helper, so the two paths agree and both match the bitmap. US-Letter assets (612×792) at integer scales are unaffected.
  • A viewportScale small enough to floor a page to 0 px in either dimension now throws an actionable "…cannot produce a valid image. Increase viewportScale." error from both renderPdfPage and getPageMetadata, instead of returning a phantom 0×0 metadata result or surfacing an opaque canvas-factory AssertionError. The page is released before the render path throws.
  • returnMetadataOnly (getPageMetadata) now enforces the MAX_CANVAS_PIXELS limit, matching renderPdfPage. Previously the oversized-page guard lived only on the render path, so a viewportScale whose viewport area exceeded the limit threw "Canvas …×… px exceeds the … pixel limit. Reduce viewportScale." on a real render but silently returned those (unrenderable) dimensions in metadata-only mode — a phantom result for a page that cannot be rendered, the same failure mode the floor-to-zero guard already prevents on both paths. The two paths now reject oversized pages with the identical message via the shared canvasPixelLimitError builder (mirroring nonRenderableDimensionsError).
  • The MAX_CANVAS_PIXELS canvas-area guard now bounds the rendered (floored) canvas, floor(viewportWidth) × floor(viewportHeight), instead of the unrounded fractional viewport area. Because the canvas is allocated with floored dimensions (via the shared toPixelDimension helper), a page whose un-floored viewport area slightly exceeded the limit while its actually-allocated bitmap fit within it was wrongly rejected with "Canvas …×… px exceeds the … pixel limit. Reduce viewportScale.". This affects a narrow viewportScale band: e.g. a 612×792 pt US-Letter page at viewportScale ≈ 14.3636 produces an un-floored area of 100,000,739 px (over the 100,000,000 cap) but a real 8790×11375 = 99,986,250 px bitmap (under it), so the page is renderable yet was refused. Both renderPdfPage and the returnMetadataOnly path (getPageMetadata) now floor viewport lengths before the area check, so the guard matches the bitmap actually allocated and the two paths stay symmetric. Pages that genuinely exceed the limit still throw the identical message on both paths, and peak canvas memory remains bounded at MAX_CANVAS_PIXELS × 4 bytes ≈ 400 MB.
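
The dimension handling described in the last four entries can be sketched as follows. This is an illustration under stated assumptions, not the library's code: toPixelDimension is assumed to be a plain Math.floor, resolveCanvasSize is a hypothetical helper that combines the checks, the error texts are illustrative rather than the library's exact wording, and the 100,000,000 value for MAX_CANVAS_PIXELS is taken from the cap quoted above.

```typescript
// Assumed value: the notes quote a 100,000,000-pixel cap.
const MAX_CANVAS_PIXELS = 100000000;

// Assumed implementation: the notes say viewport lengths are floored to pixels.
function toPixelDimension(viewportLength: number): number {
    return Math.floor(viewportLength);
}

interface PixelSize {
    width: number;
    height: number;
}

// Hypothetical helper: floor first, then run both guards on the floored size.
function resolveCanvasSize(pageWidthPt: number, pageHeightPt: number, viewportScale: number): PixelSize {
    const width = toPixelDimension(pageWidthPt * viewportScale);
    const height = toPixelDimension(pageHeightPt * viewportScale);

    // Floor-to-zero guard (illustrative message text).
    if (width <= 0 || height <= 0) {
        throw new Error(`Page floors to ${width}x${height} px. Increase viewportScale.`);
    }
    // Area guard on the floored bitmap, not the fractional viewport (illustrative message text).
    if (width * height > MAX_CANVAS_PIXELS) {
        throw new Error(`Canvas ${width}x${height} px is over the pixel limit. Reduce viewportScale.`);
    }
    return { width, height };
}

// A4 at 1.5: 595 * 1.5 = 892.5 floors to 892; 842 * 1.5 = 1263.
const a4 = resolveCanvasSize(595, 842, 1.5); // { width: 892, height: 1263 }

// US Letter at 14.3636: the floored bitmap is 8790 x 11375 = 99,986,250 px, under the cap.
const letter = resolveCanvasSize(612, 792, 14.3636); // { width: 8790, height: 11375 }
```

Running the same function from both the render path and the metadata-only path is one way to keep the two symmetric, which is the property the entries above emphasize.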

Security

  • SEC-001: outputFileMaskFunc filenames are now rejected synchronously when they contain a / or \ path separator, closing a residual TOCTOU window where a co-tenant with write access to outputFolder could swap an intermediate directory for a symlink between the realpath(dirname(...)) check and the open(..., 'wx') call in savePNGfile(). The guard fires both in resolvePageName (early) and in savePNGfile (defense in depth). The existing flat-filename contract is unchanged.
  • SEC-002: Added PdfToPngOptions.maxInputBytes (default 256 MiB via MAX_INPUT_BYTES) bounding input PDF size. The path branch of getPdfFileBuffer() now runs fs.stat() before fs.readFile() and rejects (a) non-regular files (/dev/zero, FIFOs, sockets, character devices) and (b) inputs whose size exceeds maxInputBytes. The buffer / Uint8Array branch validates byteLength against the same cap. Together these block unbounded memory consumption from untrusted input paths and oversized buffers.
  • SEC-003: concurrencyLimit now enforces an upper bound of MAX_CONCURRENCY_LIMIT (16) when processPagesInParallel is true. At the cap, peak in-flight canvas memory ≈ 16 × MAX_CANVAS_PIXELS × 4 bytes ≈ 6.4 GB (about 5.96 GiB), a defensible ceiling for typical service containers. Values above 16 (e.g. Number.MAX_SAFE_INTEGER) throw synchronously before any rendering starts. The default 4 and lower values are unaffected.
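
The three guards above amount to simple synchronous checks. The sketch below shows one possible shape. The function names are hypothetical, the error texts are illustrative, and the lower bound on concurrencyLimit (an integer of at least 1) is an assumption the notes do not state. The constants use the values quoted in these notes.

```typescript
// Values quoted in the release notes.
const MAX_CONCURRENCY_LIMIT = 16;
const MAX_CANVAS_PIXELS = 100000000;
const MAX_INPUT_BYTES = 256 * 1024 * 1024; // 256 MiB default for maxInputBytes

// Hypothetical check for an outputFileMaskFunc result: a string with no path separator.
function assertFlatFilename(name: unknown): string {
    if (typeof name !== 'string') {
        throw new TypeError('outputFileMaskFunc must return a string');
    }
    if (name.includes('/') || name.includes('\\')) {
        throw new Error('output filename must not contain a path separator');
    }
    return name;
}

// Hypothetical check for an in-memory input (the buffer / Uint8Array branch).
function assertInputSize(byteLength: number, maxInputBytes: number = MAX_INPUT_BYTES): void {
    if (byteLength > maxInputBytes) {
        throw new RangeError(`input is ${byteLength} bytes, over the ${maxInputBytes} byte limit`);
    }
}

// Hypothetical check for concurrencyLimit; the lower bound is an assumption.
function assertConcurrencyLimit(limit: number): number {
    if (!Number.isInteger(limit) || limit < 1 || limit > MAX_CONCURRENCY_LIMIT) {
        throw new RangeError(`concurrencyLimit must be an integer from 1 to ${MAX_CONCURRENCY_LIMIT}`);
    }
    return limit;
}

// Peak in-flight canvas memory at the cap, at 4 bytes per pixel:
// 16 * 100,000,000 * 4 = 6,400,000,000 bytes.
const peakCanvasBytes = MAX_CONCURRENCY_LIMIT * MAX_CANVAS_PIXELS * 4;
```

Because every check throws before any I/O or rendering starts, a bad option surfaces as an immediate error instead of a partially written output folder.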

Changed

  • Migrated pdfjs-dist from ~5.7.284 to ~6.0.227. pdf.js v6 removed PDFDocumentProxy.destroy(), so document/worker teardown now uses pdfDocument.loadingTask.destroy() (the loadingTask getter exists in both v5 and v6, and the removed destroy() previously delegated to it). The public API, default options, asset paths (cmaps / standard_fonts), the legacy/build/pdf.mjs import path, and rendered PNG output are all unchanged — the visual-comparison suites pass against the existing v5-generated reference images.
  • CI now blocks on npm run build:strict; the strict type-check is no longer advisory. continue-on-error: true is removed from .github/workflows/test.yml and the dedicated CI "Strict type check" step is replaced by pretest gating (avoiding a double run on CI). pretest now runs build:strict alongside build:test — the two type-checks enforce different contracts: build:test (using tsconfig.json, no DOM lib) gates src/ against accidental DOM globals (document, window) that production builds would reject; build:strict (using tsconfig.strict.json, skipLibCheck: false + DOM lib for @napi-rs/canvas type resolution) gates against upstream type regressions in pdfjs-dist / @napi-rs/canvas. Local npm test and prepublishOnly now gate on both.
  • Improved README accuracy and usability for npm consumers, and simplified the package funding metadata so npm fund exposes the Buy Me a Coffee URL.
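
For the pdfjs-dist migration entry above, the teardown change can be pictured with the minimal sketch below. It models only the part of the pdf.js surface that the notes mention (a loadingTask getter whose destroy() releases the document and worker). The interface names and closePdfDocument are hypothetical, and the sketch deliberately avoids importing pdfjs-dist so that it stays self-contained.

```typescript
// Minimal structural view of a pdf.js document, limited to what the notes
// describe: loadingTask exists in both v5 and v6 and exposes destroy().
interface LoadingTaskLike {
    destroy(): Promise<void>;
}

interface PdfDocumentLike {
    loadingTask: LoadingTaskLike;
}

// Hypothetical helper: tear down through the loading task rather than the
// document-level destroy() that the notes say v6 removed.
async function closePdfDocument(pdfDocument: PdfDocumentLike): Promise<void> {
    await pdfDocument.loadingTask.destroy();
}
```

Since the getter is present in both major versions, a helper written this way would not need a version check.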

Refactored

  • Updated the stale version pin in the existing @ts-ignore suppression in src/pageRenderer.ts from pdfjs-dist@~5.6.205 to pdfjs-dist@~6.0.x and clarified why @ts-ignore (not @ts-expect-error) is required for this site — the underlying type error is hidden by build:test's skipLibCheck:true, which would cause @ts-expect-error to report as unused. Added a comment in tsconfig.strict.json explaining the intentional DOM-lib divergence from tsconfig.json. Added a "Strict type-check" section to CONTRIBUTING.md documenting the failure-handling playbook (default @ts-expect-error for self-cleaning; @ts-ignore exception for skipLibCheck-hidden errors).
