Added
- `pw.io.iceberg.read` method for reading Apache Iceberg tables into Pathway.
- Methods `pw.io.postgres.write` and `pw.io.postgres.write_snapshot` now accept an additional argument `init_mode`, which allows initializing the table before writing.
- `pw.io.deltalake.read` now supports serialization and deserialization for all Pathway data types.
- New parser `pathway.xpacks.llm.parsers.DoclingParser` supporting parsing of PDFs with tables and images.
- Output connectors now include an optional `name` parameter. If provided, this name will appear in logs and monitoring dashboards.
- Automatic naming for input and output connectors has been enhanced.
Changed
- BREAKING: `pw.io.deltalake.read` now requires explicit specification of primary key fields.
- BREAKING: `pw.xpacks.llm.question_answering.BaseRAGQuestionAnswerer` now returns a dictionary from the `pw_ai_answer` endpoint.
- `pw.xpacks.llm.question_answering.BaseRAGQuestionAnswerer` allows optionally returning context documents from the `pw_ai_answer` endpoint.
- BREAKING: When using delay in temporal behavior, current time is updated immediately, not in the next batch.
- BREAKING: The `Pointer` type is now serialized to Delta Tables as raw bytes.
- `pw.io.kafka.write` now allows specifying `key` and `headers` for JSON and CSV data formats.
- The `persistent_id` parameter in connectors has been renamed to `name`. This new `name` parameter allows you to assign names to connectors, which will appear in logs and monitoring dashboards.
- Changed names of parsers to be more consistent: `ParseUnstructured` -> `UnstructuredParser`, `ParseUtf8` -> `Utf8Parser`. `ParseUnstructured` and `ParseUtf8` are now deprecated.
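
The renames in this section can be applied mechanically when upgrading call sites. The following is a minimal sketch in plain Python, not part of Pathway: the helper names `migrate_connector_kwargs` and `migrate_parser_name` are hypothetical, and the old parser name is assumed to be spelled `ParseUnstructured`.

```python
import warnings

# Old -> new parser class names, as listed in this changelog.
# (Assumption: the old class is spelled "ParseUnstructured".)
PARSER_RENAMES = {
    "ParseUnstructured": "UnstructuredParser",
    "ParseUtf8": "Utf8Parser",
}


def migrate_connector_kwargs(kwargs: dict) -> dict:
    """Return a copy of connector keyword arguments with
    `persistent_id` renamed to `name` (hypothetical helper)."""
    migrated = dict(kwargs)
    if "persistent_id" in migrated:
        if "name" in migrated:
            # Refuse to guess which of the two values should win.
            raise ValueError("both 'persistent_id' and 'name' were given")
        warnings.warn(
            "'persistent_id' has been renamed to 'name'",
            DeprecationWarning,
            stacklevel=2,
        )
        migrated["name"] = migrated.pop("persistent_id")
    return migrated


def migrate_parser_name(class_name: str) -> str:
    """Map a deprecated parser class name to its new name;
    names that were not renamed are returned unchanged."""
    return PARSER_RENAMES.get(class_name, class_name)
```

A wrapper like this could sit in front of existing connector calls during the upgrade and be deleted once every call site passes `name` directly.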
Fixed
- `generate_class` method in `Schema` now correctly renders columns of `UnionType` and `None` types.
- A bug in delay in temporal behavior. It was possible to emit a single entry twice in a specific situation.
- `pw.io.postgres.write_snapshot` now correctly handles tables that only have primary key columns.
Removed
- BREAKING: `pw.indexing.build_sorted_index`, `pw.indexing.retrieve_prev_next_values`, `pw.indexing.sort_from_index` and `pw.indexing.SortedIndex` are removed. Sorting is now done with `pw.Table.sort`.
- BREAKING: Removed deprecated methods `pw.Table.unsafe_promise_same_universe_as`, `pw.Table.unsafe_promise_universes_are_pairwise_disjoint`, `pw.Table.unsafe_promise_universe_is_subset_of`, `pw.Table.left_join`, `pw.Table.right_join`, `pw.Table.outer_join`, `pw.stdlib.utils.AsyncTransformer.result`.
- BREAKING: Removed deprecated column `_pw_shard` in the result of `windowby`.
- BREAKING: Removed deprecated functions `pw.debug.parse_to_table`, `pw.udf_async`, `pw.reducers.npsum`, `pw.reducers.int_sum`, `pw.stdlib.utils.col.flatten_column`.
- BREAKING: Removed deprecated module `pw.asynchronous`.
- BREAKING: Removed deprecated access to functions from `pw.io` in `pw`.
- BREAKING: Removed deprecated classes `pw.UDFSync`, `pw.UDFAsync`.
- BREAKING: Removed class `pw.xpacks.llm.parsers.OpenParse`. Its functionality has been replaced with `pw.xpacks.llm.parsers.DoclingParser`.
- BREAKING: Removed deprecated arguments from input connectors: `value_columns`, `primary_key`, `types`, `default_values`. Schema should be used instead.
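
Because most of these removals are breaking, it may help to scan a codebase for the removed names before upgrading. The following is a rough sketch in plain Python, not a Pathway tool: `find_removed_usages` is a hypothetical helper, and it matches only the last component of a subset of the names listed above, so it can report false positives (for example, an unrelated method that happens to be called `left_join`).

```python
import re

# Last path component of some of the names listed as removed above.
REMOVED_ATTRIBUTES = (
    "build_sorted_index",
    "retrieve_prev_next_values",
    "sort_from_index",
    "SortedIndex",
    "unsafe_promise_same_universe_as",
    "unsafe_promise_universes_are_pairwise_disjoint",
    "unsafe_promise_universe_is_subset_of",
    "left_join",
    "right_join",
    "outer_join",
    "parse_to_table",
    "udf_async",
    "npsum",
    "int_sum",
    "flatten_column",
    "UDFSync",
    "UDFAsync",
    "OpenParse",
)

# Match ".<name>" as a whole attribute access, e.g. "t.left_join(".
_PATTERN = re.compile(
    r"\.(" + "|".join(re.escape(name) for name in REMOVED_ATTRIBUTES) + r")\b"
)


def find_removed_usages(source: str) -> list[tuple[int, str]]:
    """Return (line number, removed name) pairs found in `source`."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits
```

Running it over the text of each `.py` file gives a list of lines to review by hand. A purely textual check like this cannot tell whether a match is really a Pathway call.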