Added
- Airflow: add Trino extractor #1288 @sekikn
Adds a Trino extractor to the Airflow integration. - Airflow: add
S3FileTransformOperator
extractor #1450 @sekikn
Adds anS3FileTransformOperator
extractor to the Airflow integration. - Airflow: add standardized run facet #1413 @JDarDagran
Creates one standardized run facet for the Airflow integration. - Airflow: add
NominalTimeRunFacet
andOwnershipJobFacet
#1410 @JDarDagran
AddsnominalEndTime
andOwnershipJobFacet
fields to the Airflow integration. - dbt: add support for postgres datasources #1417 @julienledem
Adds the previously unsupported postgres datasource type. - Proxy: add client-side proxy (skeletal version) #1439 #1420 @fm100
Implements a skeletal version of a client-side proxy. - Proxy: add CI job to publish Docker image #1086 @wslulciuc
Includes a script to build and tag the image plus jobs to verify the build on every CI run and publish to Docker Hub. - SQL: add
ExtractionErrorRunFacet
#1442 @mobuchowski
Adds a facet to the spec to reflect internal processing errors, especially failed or incomplete parsing of SQL jobs. - SQL: add column-level lineage to SQL parser #1432 #1461 @mobuchowski @StarostaGit
Adds support for extracting column-level lineage from SQL statements in the parser, including adjustments to Rust-Python and Rust-Java interfaces and the Airflow integration's SQL extractor to make use of the feature. Also includes more tests, removal of the old parser, and removal of the common-build cache in CI (which was breaking the parser). - Spark: pass config parameters to the OL client #1383 @tnazarew
Adds a mechanism for making new lineage consumers transparent to the integration, easing the process of setting up new types of consumers.
Fixed
- Airflow: fix
collect_ignore
, add flags to Pytest for cleaner output #1437 @JDarDagran
Removes the extractors directory from the ignored list, improving unit testing. - Spark & Java client: fix README typos @versaurabh
Fixes typos in the SPDX license headers.