github dlt-hub/dlt 1.25.0

dlt 1.25.0 Release Notes

Breaking Changes

  1. Multischema datasets (#3770 @burnash) — Datasets can now hold multiple schemas, and this is the new default behavior. The main benefit is that tables from all sources are visible in multi-source pipelines.
    Users can pass a list of schemas to the dataset() method, and can still go back to a single-schema dataset by providing pipeline.default_schema when creating the dataset.
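
To make the behavior change concrete, here is a minimal, library-free sketch of the idea. It is not dlt's implementation: `ToySchema`, `ToyDataset`, `visible_tables`, and the schema and table names are all hypothetical, and they only model how a multi-schema view differs from a single-schema one.

```python
from dataclasses import dataclass, field


@dataclass
class ToySchema:
    """Hypothetical stand-in for a schema: a name plus its table names."""
    name: str
    tables: list[str] = field(default_factory=list)


@dataclass
class ToyDataset:
    """Hypothetical dataset view over one or more schemas."""
    schemas: list[ToySchema]

    def visible_tables(self) -> list[str]:
        # Collect table names from every schema, keeping first-seen order
        # and skipping duplicates.
        seen: dict[str, None] = {}
        for schema in self.schemas:
            for table in schema.tables:
                seen.setdefault(table, None)
        return list(seen)


# Two sources loaded by one pipeline, each with its own schema (made-up names).
orders = ToySchema("orders_api", ["orders", "customers"])
events = ToySchema("events_api", ["events", "customers"])

# Multi-schema view: tables from all sources are visible.
multi = ToyDataset([orders, events])
# Single-schema view: only the default schema's tables are visible.
single = ToyDataset([orders])

multi_tables = multi.visible_tables()
single_tables = single.visible_tables()
```

In dlt itself, the notes above say the single-schema view is selected by passing pipeline.default_schema when creating the dataset; the exact parameter name should be checked against the dataset() reference.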

Highlights

  • lance destination (#3810 @jorritsandbrink) — New destination for the Lance table format with optional vector embedding generation via lancedb. Supports local storage and s3/az/gs, uses the Lance Directory Namespace V2 spec, and supports branching. Complements the existing lancedb destination (which targets LanceDB Cloud).
  • Multischema datasets (#3770 @burnash) — See Breaking Changes above. Enables sidecar schemas (e.g. data-quality quarantine tables) to live alongside the primary schema in a single dataset.
  • Improved progress and load metrics (#3768 @rudolfix) — Load metrics now persist across restarts, normalizer metrics are updated via update files, and the follow-up job graph is saved into the trace. Closes the long-standing #853.
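
The metrics item above says load metrics now survive a restart. One generic way to get that property is to write counters to disk after every update and reload them on startup. The sketch below shows only that pattern; the `load_metrics.json` file name, the `MetricsStore` class, and the counter name are assumptions for illustration, not dlt's on-disk format or API.

```python
import json
import os
import tempfile
from pathlib import Path


class MetricsStore:
    """Hypothetical counter store that reloads its state from a JSON file."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Resume from the previous run if a metrics file already exists.
        self.counters: dict[str, int] = (
            json.loads(path.read_text()) if path.exists() else {}
        )

    def add(self, name: str, amount: int) -> None:
        self.counters[name] = self.counters.get(name, 0) + amount
        # Write to a temp file first, then swap it in, so a crash mid-write
        # never leaves a half-written metrics file behind.
        tmp_path = self.path.with_suffix(".tmp")
        tmp_path.write_text(json.dumps(self.counters))
        os.replace(tmp_path, self.path)


workdir = Path(tempfile.mkdtemp())
metrics_file = workdir / "load_metrics.json"

first_run = MetricsStore(metrics_file)
first_run.add("rows_loaded", 100)

# Simulate a restart: a new object picks up where the old one stopped.
second_run = MetricsStore(metrics_file)
second_run.add("rows_loaded", 50)
total_rows = second_run.counters["rows_loaded"]
```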

Core Library

  • lance destination (#3810 @jorritsandbrink) — See Highlights.
  • Multischema datasets (#3770 @burnash) — See Highlights.
  • Improved progress and load metrics (#3768 @rudolfix) — See Highlights.
  • ducklake: metadata_schema ATTACH option (#3763 @sangwookWoo) — Adds metadata_schema to DuckLakeCredentials so the DuckLake metadata schema can be configured independently from ducklake_name.
  • Fix: preserve credential chain in AWS credentials (#3798 @rudolfix) — Default credential mixing is now applied correctly, and STS is scoped to Databricks only. Closes #3115.
  • Fix: replay state transitions after crash (#3767 @rudolfix) — Writes a pending state-transition marker right after the DB commit so an interrupted load no longer leaves the load package in an inconsistent state.
  • Fix: create all eligible tables on staging dataset (#3765 @rudolfix) — Closes #2862.
  • Fix: normalize pool workers skip __main__ in orchestrators (#3784 @rudolfix) — Closes #3586.
  • Fix(clickhouse): lightweight DELETE for single-table merge (#3783 @rudolfix) — Removes the _dlt_id requirement when merging arrow tables without nested tables on ClickHouse.
  • Fix(clickhouse): pass aws_session_token to staging s3() table function (#3769 @anuunchin) — Temporary AWS credentials now work for ClickHouse staging.
  • Fix: avoid leaking PUA markers in nested fields (#3760 @serl) — Fixes Pydantic nested-model PUA-marker leak. Closes #3755.
  • Fix: deepcopy paginator in child resource (#3779 @anuunchin) — Prevents paginator state corruption across child-resource invocations. Closes #3772.
  • Fix: honor explicit non-utf8 encoding in filesystem read_csv (#3743 @biefan) — File is opened with the requested encoding so SFTP/paramiko stacks no longer pre-decode as UTF-8.
  • Fix: don't filter out trace steps with exceptions (#3843 @anuunchin) — trace.asdict() now retains pipelines that fail in the sync step before extract.
  • Fix: check duckdb version when installing lance extension (#3773 @zilto) — Handles the lance extension promotion to built-in in duckdb 1.5.
  • Fix: transient Windows file-lock PermissionError in rename_tree (#3853 @burnash) — Resolves intermittent Windows CI failures during normalize→loaded rename.
  • Fix: deprecation warnings across supported package versions (#3831 @anuunchin) — Closes #3785, #3807, #3787, #3794.
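
The crash-recovery fix (#3767) in the list above describes writing a pending state-transition marker right after the database commit. A minimal sketch of that general pattern follows. It assumes a load package is a folder that moves from `normalized/` to `loaded/`; the marker file name, the `complete_load`, `apply_transition`, and `recover` helpers, and the simulated crash are all hypothetical and do not mirror dlt's internals.

```python
import json
import tempfile
from pathlib import Path

MARKER = "pending_transition.json"  # hypothetical marker file name


def apply_transition(root: Path, load_id: str) -> None:
    """Move the package folder to loaded/ and clear the marker."""
    src = root / "normalized" / load_id
    dst = root / "loaded" / load_id
    if src.exists():  # idempotent: skip if a previous attempt already moved it
        dst.parent.mkdir(parents=True, exist_ok=True)
        src.rename(dst)
    (root / MARKER).unlink()


def complete_load(root: Path, load_id: str, crash: bool = False) -> None:
    # 1. The destination commit would happen here (omitted in this sketch).
    # 2. Record the intended transition right after the commit.
    (root / MARKER).write_text(json.dumps({"load_id": load_id}))
    if crash:
        raise RuntimeError("simulated crash before the folder was moved")
    # 3. Perform the transition and remove the marker.
    apply_transition(root, load_id)


def recover(root: Path) -> bool:
    """On startup, replay a transition that was recorded but not finished."""
    marker = root / MARKER
    if not marker.exists():
        return False
    apply_transition(root, json.loads(marker.read_text())["load_id"])
    return True


root = Path(tempfile.mkdtemp())
(root / "normalized" / "1700000000").mkdir(parents=True)

try:
    complete_load(root, "1700000000", crash=True)
except RuntimeError:
    pass  # the process "died" after the commit but before the move

replayed = recover(root)
package_loaded = (root / "loaded" / "1700000000").is_dir()
```

Because the marker is written only after the commit, finding it on startup means the destination already holds the data, so replaying the move is safe and does not load anything twice.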

Docs

  • Cookbook section (#3860 @zilto) — Tested examples moved to a dedicated top-level tab; dlt tab added for navigation back; UI cleanups.
  • Same-domain docs button (#3859 @zilto) — Avoids full page reload when navigating.
  • Explore-and-transform page (#3782 @hibajamal) — New page covering data-exploration and transformations workbench toolkits.
  • Expand handover-to-other-toolkits section (#3737 @njaltran) — Expands data-exploration toolkit coverage in llm-native-workflow.md.
  • Add EAI instructions (#3803 @kaliole)
  • Update name to dlt Connector App (#3857 @kaliole) — Snowflake Native App docs renamed.
  • Update source count to 8,000+ (#3830 @Pawansingh3889) — Closes #3761.
  • Rename dltHub Basic tier to dltHub Pro (#3795 @elviskahoro)
  • Fix outdated hf login command (#3781 @julien-c)

Chores

  • Move mypy configs to pyproject.toml (#3780 @zilto) — Partially resolves #3346.
  • Remove Python 3.9 from CI matrices (#3777 @zilto) — Python 3.9 reached EOL in October 2025. Resolves #3587, #3619.
  • Increase Playwright timeout in e2e dashboard test (#3848 @burnash) — Matches the 15s timeout used elsewhere; reduces Windows CI flakiness.
  • Silence Airflow 3.2 smoke-test log noise (#3835 @burnash) — Fixes #3834.
