github pathwaycom/pathway v0.31.0

7 hours ago

Added

  • pw.io.sqlite.write connector, which writes a Pathway table into a SQLite database file. It supports two modes: stream_of_changes (default) appends each event alongside time/diff metadata columns, while snapshot maintains the current state of the table via INSERT ... ON CONFLICT DO UPDATE on insertions and DELETE on retractions, keyed on the primary_key parameter. Values are encoded using the same storage-class mapping that pw.io.sqlite.read accepts, so a write followed by a read round-trips every supported Pathway type losslessly. init_mode controls whether the destination table is left as-is, auto-created, or replaced on start-up.
  • pw.io.deltalake.read now accepts Delta decimal(p, s) columns. The Pathway type declared in the schema chooses the projection. float converts each value through f64, which is lossy in general (f64 is a binary format, and its mantissa carries only ~15–17 significant decimal digits), and emits a one-time warning at startup naming each affected column. str formats the unscaled integer with the column's scale and passes the resulting decimal text through unchanged, which is lossless for the full Delta precision range (up to 38 digits).
  • pw.io.deltalake.write accepts a Pathway str column when writing into an existing Delta decimal(p, s) column: each row's text is parsed as decimal and stored as the column's fixed-point value. Combined with the lossless decimal → str read path, a Delta decimal column can round-trip through a Pathway pipeline with no precision loss. A string that can't be parsed as a decimal of the column's shape fails the write with an error message naming the offending value, the column's precision and scale, and the specific constraint it violated. Tables that don't contain a decimal column (or that are being created fresh by Pathway) are unaffected.
  • pw.io.deltalake.read now accepts Delta date columns (mapped onto DateTimeNaive / DateTimeUtc at midnight on the calendar day, since Pathway has no native Date type) and timestamp_millis columns (mapped onto the same Pathway types with millisecond precision preserved).
  • The panel widget for table visualization now accepts page_size and table_height parameters.
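
The decimal projections described above can be illustrated in plain Python. The sketch below is not Pathway's implementation: the helper names unscaled_to_text and text_to_unscaled and the error wording are hypothetical, and only the standard decimal module is used. It shows how an unscaled integer plus a scale becomes decimal text, how that text can be parsed back against a decimal(p, s) shape, and why a detour through a 64-bit float loses digits.

```python
from decimal import Decimal, InvalidOperation


def unscaled_to_text(unscaled: int, scale: int) -> str:
    """Format a fixed-point value (unscaled integer + scale) as decimal text."""
    sign = "-" if unscaled < 0 else ""
    digits = str(abs(unscaled))
    if scale == 0:
        return sign + digits
    # Left-pad so there is at least one digit before the decimal point.
    digits = digits.rjust(scale + 1, "0")
    return f"{sign}{digits[:-scale]}.{digits[-scale:]}"


def text_to_unscaled(text: str, precision: int, scale: int) -> int:
    """Parse decimal text into the unscaled integer of decimal(precision, scale).

    Hypothetical validation: raises ValueError naming the value and the shape.
    """
    try:
        value = Decimal(text)  # exact construction, no rounding applied
    except InvalidOperation:
        raise ValueError(f"{text!r} is not a decimal number") from None
    if not value.is_finite():
        raise ValueError(f"{text!r} is not a finite decimal")
    sign, digits, exponent = value.as_tuple()
    coefficient = int("".join(map(str, digits)))
    shift = scale + exponent  # power of ten needed to reach the column's scale
    if shift >= 0:
        coefficient *= 10 ** shift
    else:
        # Extra fractional digits are tolerated only if they are zeros.
        coefficient, remainder = divmod(coefficient, 10 ** -shift)
        if remainder:
            raise ValueError(
                f"{text!r} has more than {scale} fractional digits "
                f"for decimal({precision}, {scale})"
            )
    if len(str(coefficient)) > precision:
        raise ValueError(
            f"{text!r} does not fit in decimal({precision}, {scale})"
        )
    return -coefficient if sign else coefficient


# A 38-digit value with scale 10, the widest precision Delta allows.
unscaled = 12345678901234567890123456789012345678
text = unscaled_to_text(unscaled, 10)

# The text path round-trips exactly; the float path does not.
round_trip_ok = text_to_unscaled(text, 38, 10) == unscaled
float_is_lossy = Decimal(float(text)) != Decimal(text)
```

Integer arithmetic on the coefficient is used deliberately here: arithmetic methods of the decimal module round to the context precision (28 digits by default), which would itself truncate a 38-digit value.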

Changed

  • BREAKING: pw.io.iceberg.write to a Glue catalog no longer accepts DateTimeUtc columns. Glue's metastore has no timezone-aware timestamp type, so previous versions silently dropped the timezone on read-back; writes now fail with an explicit error instead of corrupting the zone. To store UTC timestamps in Glue, convert to DateTimeNaive with UTC-normalized values, or write through the REST catalog, which preserves the timezone.
  • pw.io.sqlite.read now parses every Pathway Value variant. In addition to int, float, str, bytes, pw.Json, and their Optional forms, the reader now accepts bool, pw.DateTimeNaive, pw.DateTimeUtc, pw.Duration, pw.Pointer, pw.PyObjectWrapper, homogeneous tuple / list, and np.ndarray. Composite types are stored as TEXT using the same JSON encoding that pw.io.jsonlines.write emits. Booleans additionally accept PostgreSQL-style textual literals (true/false, yes/no, on/off, t/f, y/n; case-insensitive, whitespace-trimmed), and float columns tolerate values stored with INTEGER storage class.
  • pw.io.mssql.read and pw.io.mssql.write now retry transient SQL Server errors automatically.
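
The Glue workaround mentioned above (storing UTC-normalized naive timestamps) can be sketched with the standard datetime module. This is only an illustration of the normalization step on plain Python values, not Pathway API usage; the helper name to_naive_utc is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(ts: datetime) -> datetime:
    """Shift an aware timestamp to UTC, then drop the timezone marker."""
    if ts.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return ts.astimezone(timezone.utc).replace(tzinfo=None)


# 12:00 at UTC+02:00 is 10:00 UTC; the result carries no tzinfo.
local = datetime(2024, 5, 1, 12, 0, tzinfo=timezone(timedelta(hours=2)))
normalized = to_naive_utc(local)
```

With this convention every stored value means "UTC" by agreement between writer and reader, so nothing is lost when the metastore discards the zone.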

Fixed

  • pw.io.http.rest_connector no longer raises TypeError: Cannot instantiate typing.Any when a request column has the inferred default schema type (Any). The cast step now skips columns typed as Any instead of attempting to call the type as a constructor.
  • pw.io.deltalake.read now accepts Delta tables whose integer columns use any of the standard Parquet integer widths (INT_8, INT_16, INT_32, and their unsigned variants), and whose floating-point columns use FLOAT (32-bit) or FLOAT16. Previously the row-level reader only matched INT_64 and DOUBLE, so tables produced by Spark / DuckDB / pandas with explicit narrower casts read back as zero rows with per-row conversion errors.
  • Partition columns of type pw.Pointer, pw.Duration, and pw.Json written by pw.io.deltalake.write now round-trip correctly through pw.io.deltalake.read. Previously the values were correctly placed in the partition path on write, but the reader had no decoder for those types and produced a conversion error for every row.
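
The rest_connector fix above comes down to not calling typing.Any as if it were a constructor. A minimal sketch of that pattern follows; it is not the connector's actual code, and the cast_row helper and its dict-based schema are hypothetical.

```python
from typing import Any


def cast_row(row: dict, schema: dict) -> dict:
    """Cast each value to its declared type, leaving Any-typed columns as-is."""
    result = {}
    for name, value in row.items():
        target = schema.get(name, Any)
        if target is Any or value is None:
            # Calling Any(value) would raise TypeError, so pass through.
            result[name] = value
        else:
            result[name] = target(value)
    return result


casted = cast_row(
    {"n": "42", "payload": {"a": 1}},
    {"n": int, "payload": Any},
)
```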
