For installation instructions check out the getting started guide.
### Added
- general: PostgreSQL CDC benchmarking suite added with Kafka Connect PostgreSQL benchmark infrastructure and configuration. (@ness-david-dedu, #4216)
- parquet_encode: Added configurable timestamp unit support (nanosecond, microsecond, millisecond) to make Parquet output readable by Apache Spark, Databricks, AWS Athena, and DuckDB. (@ankit481, #4294)
- oracledb_cdc: Added `transaction_id` to message metadata. (@josephwoodward, #4328)
- oracledb_cdc: Added `commit_ts_ms` to message metadata. (@josephwoodward, #4331)
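As an illustration, the two new oracledb_cdc metadata keys can be copied into the message payload with a mapping processor. This is a minimal sketch: the field names `txn_id` and `commit_ts_ms` in the output document are arbitrary, and the surrounding input/output configuration is assumed.

```yaml
# Hypothetical pipeline fragment: enrich each CDC message with the
# oracledb_cdc metadata added in #4328 and #4331.
pipeline:
  processors:
    - mapping: |
        root = this
        root.txn_id = @transaction_id      # transaction identifier metadata
        root.commit_ts_ms = @commit_ts_ms  # commit timestamp (ms) metadata
```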
### Fixed
- confluent: Avro schema reference resolution now handles arbitrary schema shapes and correctly inlines transitive references, fixing misleading errors and missing nested reference resolution. (@twmb, #4247)
- mysql_cdc: IAM token refresh and canal recreation before streaming prevent connection failures when snapshots delay binlog streaming. (@josephwoodward, #4295)
- oracledb_cdc: Oracle numeric values with missing leading zeros (e.g., '.5') are now normalized to valid JSON format for proper CDC streaming. (@josephwoodward, #4322)
## Unreleased

### Added
- parquet_encode: Added `default_timestamp_unit` field (values `NANOSECOND`, `MICROSECOND`, `MILLISECOND`) controlling the precision of TIMESTAMP logical types. Default remains `NANOSECOND` for backwards compatibility. Use `MICROSECOND` when writing files for Apache Spark/Databricks, AWS Athena or DuckDB, which do not support `TIMESTAMP(NANOS)`. (#3570)
- iceberg, parquet_encode, schema_registry_encode: Added support for the new `Decimal` and `BigDecimal` benthos common-schema types in metadata-driven encoding. Iceberg / Parquet / Avro encoders emit native fixed-precision decimal types for `Decimal`; JSON Schema emits a regex-validated string. `BigDecimal` is rejected by the bounded-format encoders with a clear error and accepted by JSON Schema as a permissive string pattern. (@Jeffail)
- schema_registry_decode: When `store_schema_metadata` is set, Avro decimal logical-type values are now normalised to canonical decimal strings to match the schema metadata's value contract. (@Jeffail)
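For example, a parquet_encode processor targeting Spark- or Athena-compatible files might set the new field as below. This is a sketch: the schema column is illustrative, and only `default_timestamp_unit` is the feature described above.

```yaml
pipeline:
  processors:
    - parquet_encode:
        # New field: NANOSECOND (default), MICROSECOND, or MILLISECOND.
        # MICROSECOND keeps output readable by Spark, Athena and DuckDB,
        # which do not support TIMESTAMP(NANOS).
        default_timestamp_unit: MICROSECOND
        schema:
          - name: created_at
            type: TIMESTAMP  # illustrative column; encoded with the configured unit
```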
### Changed
- postgresql: NUMERIC and DECIMAL columns now emit `Decimal(p, s)` schema metadata when precision/scale is declared, or `BigDecimal` for unparameterised `numeric` columns. Values are emitted as canonical decimal strings (right-padded to the declared scale for `Decimal`). Previously these columns surfaced as `String` with the raw Postgres text. (@Jeffail)
- mysql_cdc: DECIMAL and NUMERIC columns now emit `Decimal(p, s)` schema metadata parsed from the column's raw type, and values are normalised to canonical decimal strings. Previously these columns surfaced as `String` with the driver's native form. (@Jeffail)
- microsoft_sql_server_cdc: DECIMAL and NUMERIC columns now emit `Decimal(p, s)` schema metadata sourced from `sql.ColumnType.DecimalSize()`, and values are normalised to canonical decimal strings. MONEY and SMALLMONEY remain typed as `String`, but their wire form is now a quoted canonical decimal string instead of a raw `json.Number`. Previously DECIMAL/NUMERIC were `json.Number` typed as `String`. (@Jeffail)
- oracledb_cdc: NUMBER columns with declared precision and scale > 0 now emit `Decimal(p, s)` schema metadata; NUMBER without `DATA_PRECISION` emits `BigDecimal`. Decimal values flow through as canonical decimal strings; integer-width NUMBER (precision ≤ 18, scale 0) continues to emit `int64`. The previous `json.Number` wrapping for NUMBER-as-String columns is gone. (@Jeffail)
- mongodb_cdc: `bson.Decimal128` and `bsonType: "decimal"` validator fields now emit `BigDecimal` schema metadata. Decimal values in document bodies are emitted as canonical decimal strings instead of the previous `{"$numberDecimal": "..."}` ExtJSON wrapper. (@Jeffail)
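To make the mongodb_cdc change concrete, here is how a `Decimal128` field serialises before and after this change. The field name and value are illustrative.

```yaml
# Illustrative mongodb_cdc output for a Decimal128 field "price":
before: {"price": {"$numberDecimal": "19.99"}}  # previous ExtJSON wrapper
after: {"price": "19.99"}                       # canonical decimal string (BigDecimal metadata)
```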
The full change log can be found here.