dlt 1.24.0 Release Notes
Breaking Changes
- Custom resource metrics now stored as tables (#3718 @rudolfix) — Incremental metrics in the trace are now represented in table format. This changes the location and structure of incremental metrics in the trace object.
Highlights
- Insert-only merge strategy (#3741 @rudolfix, based on #3372 by @OnAzart) — New `insert-only` merge strategy that performs idempotent, key-based appending: it inserts records whose primary key doesn't exist in the destination and silently skips duplicates. No updates or deletes. Supported across all SQL destinations, Delta Lake, and Iceberg.
- Parallelize all sources in Airflow (#3652 @JustinSobayo) — In `parallel` and `parallel-isolated` decompose modes, all source components now fan out concurrently from a shared start node. Previously the first source had to complete before others could begin, adding unnecessary wall-clock time. This release also adds basic Airflow 3 support with smoke tests.
- ClickHouse ReplacingMergeTree support (#3366 @prevostc) — New `replacing_merge_tree` table engine type for ClickHouse that enables native deduplication and soft deletes via `dedup_sort` and `hard_delete` column hints.
- Custom resource metrics as tables (#3718 @rudolfix) — Resources can now emit custom metrics that are stored as tables in the trace, enabling richer observability for pipelines.
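To make the insert-only semantics described above concrete, here is a minimal pure-Python sketch. It is not dlt's implementation and does not call dlt: the destination is modeled as a plain dict keyed by primary key, and the function name `insert_only_merge` is a hypothetical helper introduced for illustration only.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def insert_only_merge(
    destination: Dict[Any, Row], incoming: Iterable[Row], primary_key: str
) -> List[Row]:
    """Insert rows whose key is absent from the destination and skip the rest.

    Hypothetical helper: existing rows are never updated or deleted.
    Returns the rows that were actually inserted.
    """
    inserted: List[Row] = []
    for row in incoming:
        key = row[primary_key]
        if key in destination:
            # Duplicate key: skip silently, keep the stored row unchanged.
            continue
        destination[key] = row
        inserted.append(row)
    return inserted


# Destination already holds id=1; the batch re-sends id=1 (changed) plus a new id=2.
table: Dict[Any, Row] = {1: {"id": 1, "name": "alice"}}
batch = [{"id": 1, "name": "ALICE (changed)"}, {"id": 2, "name": "bob"}]

first = insert_only_merge(table, batch, "id")   # only id=2 is new
second = insert_only_merge(table, batch, "id")  # replaying the batch is a no-op
```

Replaying the same batch inserts nothing the second time, which is what makes the strategy idempotent. The changed value for `id=1` is ignored rather than applied, in contrast to an upsert.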
Core Library
- Insert-only merge strategy (#3741 @rudolfix, based on #3372 by @OnAzart) — See Highlights.
- ClickHouse ReplacingMergeTree support (#3366 @prevostc) — See Highlights.
- Parallelize all sources in Airflow (#3652 @JustinSobayo) — See Highlights.
- Custom resource metrics as tables (#3718 @rudolfix) — See Highlights.
- Configurable Arrow table concatenation promote_options (#3701 @AyushPatel101) — `arrow_concat_promote_options` can now be set to `"default"` or `"permissive"` instead of the hardcoded `"none"`, enabling automatic type promotion when yielding multiple Arrow tables with slightly different inferred types.
- Fix: CLI info/show fails on custom destinations (#3676 @anuunchin) — `dlt pipeline info/show` no longer crashes with `UnknownDestinationModule` on pipelines using `@dlt.destination`.
- Fix: Primary key assignment for incremental resources (#3679 @shnhdan) — Passing `primary_key=()` to `Incremental` to disable deduplication is no longer silently overwritten by the resource's own primary key.
- Fix: MotherDuck missing catalog validation (#3723 @YuF-9468) — Connection strings that omit the catalog/database name (e.g. bare `md:`) now raise a clear configuration error instead of a confusing connection failure.
- Fix: BigQuery infinite loop on internal error (#3732 @aditypan) — BigQuery jobs that encounter an internal error no longer cause an infinite retry loop.
- Fix: SCD2 column order mismatch in SQLAlchemy destinations (#3733 @anuunchin) — SCD2 validity column insert jobs now match the column order of existing tables in SQLAlchemy destinations.
- Fix: Timezone mapping in SQL timestamp datatype (#3735 @aditypan) — Timezone is now correctly set for timestamp/datetime column datatypes.
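The primary key fix in the list above turns on the difference between "no key given" and "an explicitly empty key". The sketch below illustrates that intended behavior in plain Python. It is an assumption about the semantics described in the note, not a copy of dlt's code, and `resolve_dedup_key` is a hypothetical name.

```python
from typing import Optional, Tuple

Key = Tuple[str, ...]


def resolve_dedup_key(incremental_pk: Optional[Key], resource_pk: Key) -> Key:
    """Pick the key used for deduplication (hypothetical helper).

    None means "not specified", so the resource's key is inherited.
    An empty tuple means "deduplication disabled" and must be kept as-is.
    """
    # Compare against None explicitly: a truthiness check such as
    # `incremental_pk or resource_pk` would treat () like None and
    # silently replace it with the resource's key.
    if incremental_pk is None:
        return resource_pk
    return incremental_pk


inherited = resolve_dedup_key(None, ("id",))                    # falls back to the resource key
disabled = resolve_dedup_key((), ("id",))                       # stays empty: no deduplication
overridden = resolve_dedup_key(("id", "updated_at"), ("id",))   # explicit key wins
```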
Docs
- Realistic closure-based data masking example (#3617 @veeceey) — Replaced the hardcoded example with a reusable `mask_columns()` function supporting all `sql_database` backends.
- Redirects for removed pages (#3688 @djudjuu)
- AI workbench license info (#3729 @lis365b)
- Minor doc fixes (#3734 @anuunchin)
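For readers unfamiliar with the closure-based masking pattern mentioned in the list above, a minimal sketch follows. It is not the example from the dlt docs: the signature of `mask_columns()` here, the `salt` parameter, and the SHA-256 hashing choice are all assumptions, and this version handles only dict rows rather than every `sql_database` backend.

```python
import hashlib
from typing import Any, Callable, Dict, Iterable

Row = Dict[str, Any]


def mask_columns(columns: Iterable[str], salt: str = "") -> Callable[[Row], Row]:
    """Return a row transformer that hashes the given columns (assumed signature)."""
    masked = frozenset(columns)  # captured by the closure below

    def _mask(row: Row) -> Row:
        out = dict(row)  # copy so the caller's row is left untouched
        for name in masked:
            value = out.get(name)
            if value is not None:
                # Replace the value with a salted SHA-256 hex digest.
                payload = (salt + str(value)).encode("utf-8")
                out[name] = hashlib.sha256(payload).hexdigest()
        return out

    return _mask


# Configure once, then reuse the returned function for every row.
mask_pii = mask_columns(["email"], salt="demo-salt")
row = {"id": 1, "email": "a@example.com"}
masked_row = mask_pii(row)
```

Because the column list and salt are captured when `mask_columns()` is called, the returned function takes a single row. That is the shape expected by per-item transforms such as a dlt resource's `add_map`.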
Chores
- Bumps npm docs deps (#3728 @rudolfix)
- Switch lancedb example from Spotify to PodcastIndex (#3736 @Travior)
- Adds CLI docs check to docs CI workflow (#3739 @rudolfix)
- Moves render CLI docs command to a separate tool (#3740 @rudolfix)