redpanda-data/connect v4.48.1

For installation instructions, check out the getting started guide.

Added

  • Enterprise licenses can now be loaded directly from the REDPANDA_LICENSE environment variable. (@rockwotj)
  • Added a lint rule to verify that the private_key field for the snowflake_streaming output is in PEM format (sketch after this list). (@rockwotj)
  • New mongodb_cdc input for change data capture (CDC) over MongoDB collections (sketch after this list). (@rockwotj)
  • Field is_high_watermark added to the redpanda_migrator_offsets output. (@mihaitodor)
  • Metadata field kafka_is_high_watermark added to the redpanda_migrator_offsets input. (@mihaitodor)
  • Input postgres_cdc now emits logical messages to the WAL every hour by default to allow WAL reclaiming for low-frequency tables; this frequency is controlled by the heartbeat_interval field (sketch after this list). (@rockwotj)
  • Output snowflake_streaming now has a commit_timeout field to control how long to wait for a commit in Snowflake. (@rockwotj)
  • Output snowflake_streaming now has a url field to override the hostname for connections to Snowflake, which is required for private link deployments (both new fields are shown after this list). (@rockwotj)
  • All sql_* components now support the clickhouse driver in cloud builds (example after this list). (@mihaitodor)
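
A minimal snowflake_streaming sketch showing a private_key value in the PEM format the new lint rule checks for. The account, user, database, schema, and table values are placeholders, and the key body is truncated:

    output:
      snowflake_streaming:
        account: ORG-ACCOUNT          # placeholder connection details
        user: MY_USER
        database: MY_DB
        schema: PUBLIC
        table: MY_TABLE
        private_key: |                # the new lint rule verifies this is PEM
          -----BEGIN PRIVATE KEY-----
          MIIEvQIBADANBgkqhkiG9w0BAQEF...
          -----END PRIVATE KEY-----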
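
A sketch of the new mongodb_cdc input. The field names here (url, database, collections) are assumptions modelled on the other MongoDB components, not confirmed by these notes, so consult the component docs for the actual schema:

    input:
      mongodb_cdc:
        url: mongodb://localhost:27017   # assumed field names, for illustration only
        database: my_db
        collections:
          - orders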
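
A postgres_cdc sketch using the new heartbeat_interval field. The release notes state an hourly default; the dsn, schema, and tables values are placeholders:

    input:
      postgres_cdc:
        dsn: postgres://user:pass@localhost:5432/mydb   # placeholder connection details
        schema: public
        tables: [ low_traffic_table ]
        heartbeat_interval: 1h   # matches the default; emits WAL heartbeats for low-frequency tables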
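
The two new snowflake_streaming fields shown together. The private link hostname is hypothetical, 60s is an arbitrary example value, and the remaining required connection fields are omitted for brevity:

    output:
      snowflake_streaming:
        # ...required connection fields omitted...
        url: https://myaccount.privatelink.snowflakecomputing.com   # hypothetical private link host
        commit_timeout: 60s   # how long to wait for a commit in Snowflake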
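
A sketch of a sql_insert output using the clickhouse driver. The DSN form is an assumption based on the clickhouse-go driver, and the table and columns are placeholders:

    output:
      sql_insert:
        driver: clickhouse
        dsn: clickhouse://localhost:9000   # DSN format assumed from clickhouse-go
        table: events
        columns: [ id, payload ]
        args_mapping: 'root = [ this.id, content().string() ]'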

Fixed

  • Fixed an issue in the snowflake_streaming output that could lead to elevated error rates in the connector when the user manually evolves the schema in their pipeline. (@rockwotj)
  • Fixed a bug with the redpanda_migrator_offsets input and output so that the timestamp-based consumer group offset migration logic no longer skips ahead in the destination cluster, enforcing at-least-once delivery guarantees. (@mihaitodor)
  • The redpanda_migrator_bundle output no longer drops messages if either the redpanda_migrator or the redpanda_migrator_offsets child output throws an error; Connect keeps retrying the writes and applies backpressure to the input. (@mihaitodor)
  • Transient errors in snowflake_streaming are now automatically retried in cases where it is determined to be safe to do so. (@rockwotj)
  • Fixed a panic in the sftp input when Connect shuts down. (@mihaitodor)
  • Fixed an issue where mysql_cdc would not work with timestamps unless the parseTime=true DSN parameter was set. (@rockwotj)
  • Fixed an issue where timestamps at extreme year bounds (i.e. year 0 or year 9999) would be encoded incorrectly in snowflake_streaming. (@rockwotj)
  • The aws_s3 input now drops SQS notifications and emits a warning log message for files that were deleted before Connect was able to read them. (@mihaitodor)
  • Fixed a bug in snowflake_streaming where string/bytes values longer than 32 characters that were the minimum or maximum value for a column in a batch could be corrupted if the write was retried. (@rockwotj)

Changed

  • Output snowflake_streaming has additional logging and debug information when errors arise. (@rockwotj)
  • Input postgres_cdc no longer adds a prefix to the replication slot name. If upgrading from a previous version, prefix your current replication slot name with rs_ to continue using the same slot (sketch after this list). (@rockwotj)
  • The redpanda_migrator output now uses the source topic config when creating a topic in the destination cluster. It also attempts to transfer topic ACLs to the destination cluster even if the topics already exist. (@mihaitodor)
  • When preserve_logical_types is true in schema_registry_decode, time logical types are now converted into Bloblang timestamps instead of duration strings (example after this list). (@rockwotj)
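
A sketch of the replication slot migration described above, assuming the slot is configured via a slot_name field (an assumption; check the postgres_cdc docs). Older versions silently created the slot with an rs_ prefix, so to keep using it the prefix must now be written explicitly:

    input:
      postgres_cdc:
        dsn: postgres://user:pass@localhost:5432/mydb   # placeholder connection details
        slot_name: rs_my_slot   # a slot previously configured as my_slot was created as rs_my_slot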
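
A schema_registry_decode sketch with preserve_logical_types enabled, after which time logical types arrive as Bloblang timestamps rather than duration strings. The exact placement of the field within the processor config is an assumption, so verify it against the processor docs:

    pipeline:
      processors:
        - schema_registry_decode:
            url: http://localhost:8081       # schema registry endpoint (placeholder)
            preserve_logical_types: true     # time logical types now decode to timestamps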

The full change log can be found here.
