github dlt-hub/dlt 1.22.0

7 hours ago

Breaking Changes

  1. Pydantic v1 support removed (#3572 @anuunchin) — All Pydantic v1 compatibility code has been removed. The codebase now requires Pydantic v2 only.
  2. data_type contract semantic change (#3572 @anuunchin @rudolfix) — The data_type contract now applies to the full data type (i.e., including precision and nullability), not only to variant columns (a change of data type). Users with data_type: freeze who relied on changing nullable/precision/scale on existing columns will now be blocked.
  3. merge_columns now removes compound properties (#3431 @anuunchin) — Previously merge_columns was purely additive, which caused compound properties like merge_key to be incorrectly replaced rather than properly merged. The function now correctly removes compound properties that should be removed.
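
The data_type contract change in item 2 can be pictured with a small toy model. This is only a sketch of the semantics described above, not dlt's implementation: the helper `blocked_by_freeze`, the `FULL_TYPE_HINTS` tuple, and the column dictionaries are all hypothetical names made up for illustration.

```python
from typing import Any, Dict

# Hints treated as part of the "full data type" in this sketch.
# Assumption taken from the release note, not dlt's internal list.
FULL_TYPE_HINTS = ("data_type", "nullable", "precision", "scale")


def blocked_by_freeze(
    existing: Dict[str, Any], incoming: Dict[str, Any], full_type: bool = True
) -> bool:
    """Return True if a frozen data_type contract would reject the incoming column.

    full_type=False mimics the old behavior (only the data type itself is compared);
    full_type=True mimics the new behavior (nullability, precision and scale too).
    """
    keys = FULL_TYPE_HINTS if full_type else ("data_type",)
    return any(k in incoming and incoming[k] != existing.get(k) for k in keys)


# An existing decimal column and an incoming version that only widens precision.
existing = {"name": "amount", "data_type": "decimal", "precision": 10, "scale": 2, "nullable": False}
incoming = {"name": "amount", "data_type": "decimal", "precision": 18, "scale": 2, "nullable": False}

old_result = blocked_by_freeze(existing, incoming, full_type=False)  # old semantics
new_result = blocked_by_freeze(existing, incoming, full_type=True)   # new semantics
print(old_result, new_result)
```

Under the old semantics the precision change passes because the data type is still decimal; under the new semantics the same change is rejected, which is the case the release note warns about.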

Highlights

  • Pydantic data validation overhaul (#3572 @anuunchin @rudolfix) — Major rework of Pydantic support: discriminated union RootModel types (for validating event streams that mix several event types), schema contracts that properly separate resource-defined hints from data-derived hints, and Pydantic model columns that bypass contract checks when authoritative. Supports Pydantic models on arrow and model items with full schema contract enforcement. Prepares for Pydantic v3.
  • Snowflake atomic table swap for replace (#3540 @Travior) — Uses ALTER TABLE ... SWAP for staging-optimized replace strategy on Snowflake, eliminating table downtime during data replacement.
  • Custom backends for sql_database (#3595 @rudolfix) — Register custom TableLoader implementations as named backends. ConnectorX backend ported as PoC; ADBC and paginated loader implemented as test cases.
  • SQLAlchemy destination dialect customization (#3600 @rudolfix) — Customize type mapping, adjust SQLAlchemy table schemas before creation, and override destination capabilities per-dialect.
  • llms.txt and Markdown docs generation (#3635 @rudolfix) — Generates llms.txt index and Markdown versions of docs pages with a "View Markdown" navigation option, making the docs LLM-friendly.
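
To make the "discriminated union RootModel" item above concrete, here is a minimal Pydantic v2 sketch of validating a mixed event stream. It uses only Pydantic itself; the event classes, field names, and sample rows are hypothetical, and how dlt consumes such a model is not shown here.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, RootModel, ValidationError


class Click(BaseModel):
    event_type: Literal["click"]
    x: int
    y: int


class PageView(BaseModel):
    event_type: Literal["page_view"]
    url: str


class Event(RootModel[Annotated[Union[Click, PageView], Field(discriminator="event_type")]]):
    """Root model that selects Click or PageView based on the event_type field."""


# Hypothetical rows from an event stream; the last one has an unknown type.
rows = [
    {"event_type": "click", "x": 10, "y": 20},
    {"event_type": "page_view", "url": "/pricing"},
    {"event_type": "scroll", "offset": 300},
]

events = []
rejected = 0
for row in rows:
    try:
        # .root holds the concrete model chosen by the discriminator.
        events.append(Event.model_validate(row).root)
    except ValidationError:
        rejected += 1

print([type(e).__name__ for e in events], rejected)
```

The discriminator lets Pydantic choose the matching model directly from `event_type` instead of trying each union member in turn, so a row with an unknown type fails with a single clear error.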

Core Library

  • rest_api: parallelized dependent resources (#3574 @Shadesfear) — Add parallelized flag to dependent resources (transformers) so child resource fetches run concurrently.
  • dlt.Relation: filter by load_id (#3547 @zilto) — Filter dataset relations by load ID (experimental).
  • dlt.Relation: flatten logic and improve typing (#3578 @zilto) — Remove dynamic methods; explicit return types for .df(), .arrow(), etc.
  • Source preprocessors on SourceFactory (#3636 @rudolfix) — Add preprocessor hooks to dlt.source factory for modifying source instances.
  • engine_kwargs for sql_database/sql_table sources (#3414 @tetelio) — Pass SQLAlchemy engine arguments directly to create_engine() for sources.
  • DECFLOAT support for Snowflake (#3513 @ivasio) — Properly handles DECFLOAT columns via the SQLAlchemy backend.
  • Athena query_result_bucket now optional (#3566 @arel) — Omit or set to None when using Athena's managed results bucket.
  • ClickHouse extra_credentials for S3 (#2888 @warje) — Adds extra_credentials config for role-based S3 authentication.
  • Fix: Snowflake sort column escaping (#3594 @rudolfix)
  • Fix: BigQuery partition clause on ALTER TABLE (#3571 @kien-truong)
  • Fix: Redshift schema existence check (#3570 @timH6502)
  • Fix: _dlt_load_id written as dict on MSSQL + ADBC (#3584 @rudolfix)
  • Fix: ClickHouse CREATE OR REPLACE for merge temp tables (#3589 @rudolfix)
  • Fix: read_csv_duckdb respects filename=True (#3606 @karlanka)
  • Fix: column order mismatch in sql_database (#3638 @rudolfix)
  • Fix: consistent UUID handling as strings (#3599 @rudolfix)
  • Fix: managed SQLAlchemy engine ref counting (#3601 @rudolfix)
  • Fix: suppress psutil warning during dlt init (#3615 @rudolfix)
  • Fix: query lifecycle cleanup (#3627 @rudolfix)
  • Fix: Pydantic model synthesis bugs (#3605 @rudolfix)
  • Detect AI agent execution context (#3628 @rudolfix)
  • Upgrade ibis-framework, remove sqlglot constraint (#3621 @Travior)
  • Vibe sources: use new scaffold API (#3512 @djudjuu)
  • Update GitHub API pipeline template (#3603 @ShreyasGS)
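
For the rest_api item at the top of this list, a configuration sketch may help. The release note only says that a parallelized flag was added to dependent resources; placing it at the resource level, as below, is an assumption, and the base URL, resource names, and paths are hypothetical.

```python
# Declarative rest_api configuration with a parent and a dependent (child) resource.
# Assumption: the new flag is set per resource as "parallelized".
config = {
    "client": {"base_url": "https://api.example.com/"},
    "resources": [
        {"name": "posts", "endpoint": {"path": "posts"}},
        {
            "name": "post_comments",
            # The child path references a field of the parent resource.
            "endpoint": {"path": "posts/{resources.posts.id}/comments"},
            # Ask for child fetches to run concurrently.
            "parallelized": True,
        },
    ],
}

# With dlt installed, a dictionary like this would typically be passed to
# rest_api_source from dlt.sources.rest_api; that call is omitted here so the
# sketch stays free of external dependencies.
child = config["resources"][1]
print(child["name"], child["parallelized"])
```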

Docs

Chores

New Contributors
