Added
- Flink: create OpenLineage configuration based on Flink configuration
#2033
@pawel-big-lebowski
Flink configuration entries starting with openlineage.* are passed to the OpenLineage client.
- Java: add Javadocs to the Java client
#2004
@julienledem
The client was missing some Javadocs.
- Spark: append output dataset name to a job name
#2036
@pawel-big-lebowski
Solves the problem of multiple jobs writing to different datasets while sharing the same job name. The feature is enabled by default and results in different job names; it can be disabled by setting spark.openlineage.jobName.appendDatasetName to false.
Unifies job names generated on the Databricks platform (using a dot as the job part separator instead of an underscore). The default behaviour can be altered with spark.openlineage.jobName.replaceDotWithUnderscore.
- Spark: support Spark 3.4.1
#2057
@pawel-big-lebowski
Bumps the latest Spark version to be covered in integration tests.
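The two configuration-related entries above can be illustrated with a minimal Python sketch. It is not the integrations' actual code: the function names, the example keys, and the `.` separator used when appending the dataset name are assumptions made for illustration, and the entries do not say whether the `openlineage.` prefix is stripped before the values reach the client.

```python
from __future__ import annotations


def extract_openlineage_config(
    conf: dict[str, str], prefix: str = "openlineage."
) -> dict[str, str]:
    """Keep only the entries whose key starts with the given prefix.

    Hypothetical helper: keys are returned unchanged (prefix not stripped).
    """
    return {key: value for key, value in conf.items() if key.startswith(prefix)}


def build_job_name(
    base_name: str, output_dataset: str, append_dataset_name: bool = True
) -> str:
    """Append the output dataset name to the job name unless disabled.

    The flag mirrors spark.openlineage.jobName.appendDatasetName; the "."
    separator is an assumption of this sketch.
    """
    if not append_dataset_name:
        return base_name
    return f"{base_name}.{output_dataset}"


# Illustrative input: one OpenLineage entry mixed with an unrelated entry.
example_conf = {
    "openlineage.transport.type": "http",
    "parallelism.default": "2",
}
selected = extract_openlineage_config(example_conf)
default_name = build_job_name("my_app", "db.table")
disabled_name = build_job_name("my_app", "db.table", append_dataset_name=False)
```

With the flag left at its default, two jobs that share a base name but write to different datasets end up with different names, which is the collision the Spark entry describes.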
Fixed
- Airflow: do not use database as fallback when no schema parsed
#2023
@mobuchowski
Sets the schema to None in TablesHierarchy to skip filtering on the schema level in the information schema query.
- Flink: fix a bug when getting schema for KafkaSink
#2042
@pentium3
Fixes the incomplete schema from KafkaSinkVisitor by changing the KafkaSinkWrapper to catch schemas of type AvroSerializationSchema.
- Spark: filter CreateView events
#1968
#1987
@pawel-big-lebowski
Clears events generated by logical plans having CreateView nodes as root.
- Spark: fix MERGE INTO for delta tables identified by physical locations
#2026
@pawel-big-lebowski
Delta tables identified by physical locations were not properly recognized.
- Spark: fix incorrect naming of JDBC datasets
#2035
@mobuchowski
Makes the namespace generated by the JDBC/Spark connector conform to the naming schema in the spec.
- Spark: fix ignored event adaptive_spark_plan in Databricks
#2061
@algorithmy1
Removes adaptive_spark_plan from the excludedNodes in DatabricksEventFilter.
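The effect of the last fix can be pictured with a small Python sketch of an exclusion-list filter. This is only an assumed model of the behaviour, not the DatabricksEventFilter code: `is_dropped` and the `hypothetical_other_node` entry are invented for illustration.

```python
from __future__ import annotations


def is_dropped(node_name: str, excluded_nodes: frozenset[str]) -> bool:
    """Return True when an event for this node would be filtered out."""
    return node_name in excluded_nodes


# Assumed state before the fix: adaptive_spark_plan is on the exclusion list.
excluded_before = frozenset({"adaptive_spark_plan", "hypothetical_other_node"})
# After the fix the entry is removed, so such events are no longer ignored.
excluded_after = excluded_before - {"adaptive_spark_plan"}

dropped_before = is_dropped("adaptive_spark_plan", excluded_before)
dropped_after = is_dropped("adaptive_spark_plan", excluded_after)
```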