Added
- Clients: add `DEBUG` logging of events to transports #1633 by @mobuchowski
  Ensures that the `DEBUG` loglevel on properly configured loggers will always log events, regardless of the chosen transport.
- Spark: add `CustomEnvironmentFacetBuilder` class #1545 by @Anirudh181001
  Enables the capture of custom environment variables from Spark.
- Spark: introduce the new output visitors `AlterTableAddPartitionCommandVisitor` and `AlterTableSetLocationCommandVisitor` #1629 by @nataliezeller1
  Adds visitors for extracting table names from the Spark commands `AlterTableAddPartitionCommand` and `AlterTableSetLocationCommand`. The intended use case is a custom transport for the OpenMetadata lineage API.
- Spark: add column lineage for JDBC relations #1636 by @tnazarew
  Adds column lineage information to JDBC events with data extracted from the query by the SQL parser.
- SQL: add Linux-aarch64 native library to Java SQL parser #1664 by @mobuchowski
  Adds a Linux-ARM version of the native library. Previously, the Java SQL parser interface had only Linux-x64 and macOS universal binary variants.
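As a rough illustration of what a "properly configured" logger could look like for the Python client, the sketch below sets a logger to `DEBUG` and attaches a handler. The logger name `openlineage` and the helper `enable_debug_logging` are assumptions made for this example; these notes do not state which logger names the clients use.

```python
import logging

# Assumption: client modules log under a parent logger named "openlineage".
LOGGER_NAME = "openlineage"


def enable_debug_logging(name: str = LOGGER_NAME) -> logging.Logger:
    """Set the named logger to DEBUG and make sure it has a handler."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


logger = enable_debug_logging()
```

Child loggers such as `openlineage.client` inherit the level from the parent, so one call should cover every module that logs under that name.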
Changed
- Airflow: get table database in Athena extractor #1631 by @rinzool
  Changes the extractor to get a table's database from the `table.schema` field or the operator default if the field is `None`.
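The fallback described for the Athena extractor can be sketched as follows. `Table` and `resolve_database` are hypothetical names used only for illustration; they are not the extractor's actual classes or functions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Table:
    """Hypothetical stand-in for a parsed table reference."""
    name: str
    schema: Optional[str] = None


def resolve_database(table: Table, operator_default: str) -> str:
    """Prefer the table's own schema; fall back to the operator default."""
    return table.schema if table.schema is not None else operator_default
```

For example, `resolve_database(Table("events", "analytics"), "default")` yields `"analytics"`, while a table without a schema resolves to the operator default.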
Fixed
- dbt: add dbt `seed` to the list of dbt-ol events #1649 by @pohek321
  Ensures that `dbt-ol test` no longer fails when run against an event seed.
- Spark: make column lineage extraction in Spark support caching #1634 by @pawel-big-lebowski
  Collects column lineage from Spark logical plans that contain cached datasets.
- Spark: add support for a deprecated config #1586 by @tnazarew
  Maps the deprecated `spark.openlineage.url` to `spark.openlineage.transport.url`.
- Spark: add error message in case of null in url #1590 by @tnazarew
  Improves error logging in the case of undefined URLs.
- Spark: collect complete event for really quick Spark jobs #1650 by @pawel-big-lebowski
  Improves the collecting of OpenLineage events on SQL complete in the case of quick operations.
- Spark: fix input/outputs for one node `LogicalRelation` plans #1668 by @pawel-big-lebowski
  For simple queries like `select col1, col2 from my_db.my_table` that do not write output, the Spark plan contained just a single node, which was wrongly treated as both an input and output dataset.
- SQL: fix file existence check in build script for openlineage-sql-java #1613 by @sekikn
  Ensures that the build script works if the library is compiled solely for Linux.
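The deprecated-key mapping from #1586 can be pictured with the Python sketch below. The Spark integration itself is not written in Python, and `normalize_conf` is a hypothetical helper. Letting an explicitly set new key take precedence over the deprecated one is also an assumption of this example, not something these notes state.

```python
from typing import Dict

# Mapping taken from the release note: deprecated key -> current key.
DEPRECATED_KEYS: Dict[str, str] = {
    "spark.openlineage.url": "spark.openlineage.transport.url",
}


def normalize_conf(conf: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of conf with deprecated keys renamed to current ones."""
    normalized = dict(conf)
    for old_key, new_key in DEPRECATED_KEYS.items():
        if old_key in normalized:
            value = normalized.pop(old_key)
            # Assumption: an explicitly set new key wins over the old one.
            normalized.setdefault(new_key, value)
    return normalized
```

The input dictionary is left unmodified, so the caller can still inspect the original configuration.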
Removed
- Airflow: remove `JobIdMapping` and update macros to better support Airflow version 2+ #1645 by @JDarDagran
  Updates macros to use `OpenLineageAdapter`'s method to generate deterministic run UUIDs because using the `JobIdMapping` utility is incompatible with Airflow 2+.
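To show why deterministic run UUIDs remove the need for a stored job-to-run mapping, here is a minimal sketch built on name-based UUIDs. It is not the `OpenLineageAdapter` method. The function `deterministic_run_id`, its inputs, and the choice of `uuid5` with `NAMESPACE_URL` are all assumptions made for this example.

```python
import uuid


def deterministic_run_id(dag_id: str, task_id: str, execution_date: str) -> str:
    """Derive the same UUID every time from the same identifying fields."""
    name = f"{dag_id}.{task_id}.{execution_date}"
    # Name-based UUIDs are a pure function of (namespace, name),
    # so no lookup table is needed to recover the run ID later.
    return str(uuid.uuid5(uuid.NAMESPACE_URL, name))
```

Because the ID can be recomputed from the task's identifying fields, a macro can derive it wherever it is needed instead of looking it up in a stored mapping.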