Breaking Changes
- Vue SFC tokenization —
.vuefiles are no longer tokenized asmarkup. Each block is now dispatched to its own sub-format:<script>→javascript,<script lang="ts">→typescript,<template>→markup,<style>→css,<style lang="scss">→scss,<style lang="less">→less. Clone reports for.vuefiles now appear under these resolved sub-format names. Any tooling or configuration that relied on.vueclones being reported undermarkupmust be updated. --formatsExtsusers — custom mappings that pointed.vuetomarkup(e.g."formatsExts": { "markup": ["vue"] }) will no longer take effect because.vueis handled by the dedicatedvueformat processor. Remove or update such mappings.
New Features
- Custom tokenizer backend — replaced the
prismjsnpm package with a self-contained reprism-based grammar engine. ~11.5% faster tokenization on real projects (avg 1126 ms → 997 ms on a 548-file, 223-format scan). - Cross-format detection — Vue SFC (
.vue), Svelte (.svelte), Astro (.astro), and Markdown files are now tokenized per-block/per-section. A<script>block in a.vuefile can match a.tsfile; a fenced code block in Markdown can match a.pyfile. - 223 supported formats — Apex, CFML/ColdFusion, GDScript, Svelte, Astro, and 70+ additional languages added (up from 152). See supported_formats.md.
- Shebang detection — extensionless executable scripts (e.g.
/usr/bin/env python3) are auto-detected by their#!shebang line and tokenized in the correct language. --store-path— configure a custom directory for the LevelDB cache, eliminating collisions when multiple jscpd processes run in parallel on the same machine.--skipComments— shorthand flag for--mode weak, which strips comments before detection.--formats-names— map specific filenames (e.g.Makefile,Dockerfile) to a detection format.
Bug Fixes
- Entire-file duplicates silently dropped (
@jscpd/core#728) — RabinKarp flushed the pending clone on a store hit at end-of-file instead of on a miss. Files that are complete copies of each other were undetected. Fixed. - ReDoS hang on Lisp/Elisp files (
@jscpd/tokenizer#737) — the Lisp string regex/"(?:[^"\\]*|\\.)*"/could catastrophically backtrack (O(2ⁿ)) on unterminated strings. Replaced with a linear/"(?:[^"\\]|\\[\s\S])*"/pattern. - Process crash on malformed
package.json(#739) —readJSONSyncthrew an unhandledSyntaxErrorwhenpackage.jsoncontained invalid JSON, killing the process. Now emits a warning and continues with an empty config. - Vue SFC cross-file detection broken — the detector used the file-level format (
vue) as the store namespace for all SFC blocks, preventing a<script>block in one.vuefile from ever matching a<script>block in another. The namespace now reflects each block's resolved sub-format. - Vue SFC incorrect column numbers — tokens on the first line of a block carried block-relative column 1 instead of file-absolute column numbers. Fixed in
@jscpd/tokenizer. - 50 dependency security vulnerabilities remediated across the monorepo (Dependabot batches).
Known Limitations
- Malformed SFC blocks (e.g. unclosed tags, invalid attributes) are silently skipped and do not contribute tokens.