This is the first public release of Soda Core version 4. This release introduces Data Contracts as the default way to define data quality rules for tables. The new approach offers a cleaner, more structured, and more maintainable way to define and manage data quality rules, based on community feedback and real-world usage.
Breaking change: Soda Core is moving from the checks language to a Data Contracts–based syntax.
Highlights
- Introduced support for parsing and publishing data contracts, and for verifying them both locally and remotely.
- Introduced extended check diagnostics providing deeper visibility into the count and percentage of tested, passing, and failing rows.
- Introduced the flexibility to run contracts via Soda Core or the Soda Agent, with definitions stored either as external files or in Soda Cloud.
- Introduced "Missing", "Invalid", "Duplicate", "Aggregate", "Failed Rows", and "Metric" checks in data contracts.
- Introduced data contracts support for multiple data sources: Postgres, Snowflake, BigQuery, Databricks, Redshift, SQL Server, Fabric, Synapse, Athena, DuckDB (in memory).
- Introduced new CLI with a noun-verb structure and better integration with the Soda Cloud APIs (e.g. contract fetching based on a dataset identifier).
- Introduced support for variables in contracts, allowing you to parameterize contracts.
- Introduced support for extending functionality using plugins.
- Introduced extensible check types.
- Introduced the concept of contract verification result handlers, with hooks that allow post-processing of the contract verification results.
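To make the extended check diagnostics highlight more concrete, here is a minimal Python sketch of the kind of numbers it describes: the count and percentage of tested, passing, and failing rows for a "Missing"-style check. This is an illustration only, not Soda Core's implementation or API. The `CheckDiagnostics` class and the `missing_check_diagnostics` function are hypothetical names, and treating `None` as the only missing value is an assumption made for brevity.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Optional


@dataclass(frozen=True)
class CheckDiagnostics:
    """Hypothetical container for row-level diagnostics (not a Soda Core type)."""

    rows_tested: int
    rows_passed: int
    rows_failed: int

    @property
    def passed_percent(self) -> float:
        # Guard against division by zero when no rows were tested.
        return 100.0 * self.rows_passed / self.rows_tested if self.rows_tested else 0.0

    @property
    def failed_percent(self) -> float:
        return 100.0 * self.rows_failed / self.rows_tested if self.rows_tested else 0.0


def missing_check_diagnostics(values: Iterable[Optional[Any]]) -> CheckDiagnostics:
    """Count tested, passing, and failing rows, assuming None means "missing"."""
    tested = 0
    failed = 0
    for value in values:
        tested += 1
        if value is None:
            failed += 1
    return CheckDiagnostics(rows_tested=tested, rows_passed=tested - failed, rows_failed=failed)


# Example: one of four rows has a missing value.
diag = missing_check_diagnostics(["a@example.com", None, "b@example.com", "c@example.com"])
print(diag.rows_tested, diag.rows_passed, diag.rows_failed)  # 4 3 1
print(diag.passed_percent, diag.failed_percent)              # 75.0 25.0
```

In Soda Core itself these figures are produced by the contract verification run. The sketch only shows how the three counts and two percentages relate to each other.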
Extensions
- Implemented the contract generation plugin.
- Implemented the diagnostics warehouse plugin.
- Implemented reconciliation checks.
- Implemented group-by checks.
- Implemented support for Oracle as a data source.
- Implemented support for Dremio as a data source.
- Implemented support for contract requests.