Overview
- Multiple operators have been reworked to avoid taking and releasing
  Python's global interpreter lock while iterating over multiple items.
  Windowing operators, stateful operators and operators like `branch`
  will see significant performance improvements.

  Thanks to @damiondoesthings for helping us track this down!
- Breaking change: `FixedPartitionedSource.build_part`,
  `DynamicSource.build`, `FixedPartitionedSink.build_part` and
  `DynamicSink.build` now take an additional `step_id` argument. This
  argument can be used when labeling custom Python metrics.
- Custom Python metrics can now be collected using the
  `prometheus-client` library.
- Breaking change: The schema registry interface has been removed.
  You can still use schema registries, but you need to instantiate
  the (de)serializers on your own. This allows for more flexibility.
  See the `confluent_serde` and `redpanda_serde` examples for how
  to use the new interface.
- Fixes bug where items would be incorrectly marked as late in sliding
  and tumbling windows in cases where the timestamps are very far from
  the `align_to` parameter of the windower.
- Adds `stateful_flat_map` operator.
- Breaking change: Removes `builder` argument from `stateful_map`.
  Instead, the initial state value is always `None` and you can call
  your previous builder by hand in the `mapper`.
- Breaking change: Improves performance by removing the
  `now: datetime` argument from `FixedPartitionedSource.build_part`,
  `DynamicSource.build`, and `UnaryLogic.on_item`. If you need the
  current time, use:

      from datetime import datetime, timezone
      now = datetime.now(timezone.utc)

- Breaking change: Improves performance by removing the
  `sched: datetime` argument from `StatefulSourcePartition.next_batch`,
  `StatelessSourcePartition.next_batch`, and `UnaryLogic.on_notify`.
  You should already have the scheduled next awake time in whatever
  instance variable you returned in
  `{Stateful,Stateless}SourcePartition.next_awake` or
  `UnaryLogic.notify_at`.
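
For the `stateful_map` change noted in the list above, here is a minimal sketch of calling a former builder by hand inside the mapper. It assumes the mapper receives `(state, value)` and returns `(updated_state, output)`. The names `build_state` and `running_mean_mapper` are hypothetical, and the mapper is driven directly here rather than through a dataflow.

```python
from typing import Optional, Tuple


def build_state() -> Tuple[float, int]:
    # Hypothetical former builder: (running total, item count).
    return (0.0, 0)


def running_mean_mapper(
    state: Optional[Tuple[float, int]], value: float
) -> Tuple[Optional[Tuple[float, int]], float]:
    # The initial state is None, so build it on first use.
    if state is None:
        state = build_state()
    total, count = state
    total += value
    count += 1
    return ((total, count), total / count)


# Drive the mapper by hand, threading the state through each call.
state = None
means = []
for value in [2.0, 4.0, 6.0]:
    state, mean = running_mean_mapper(state, value)
    means.append(mean)
```

Passed to `stateful_map`, a function of this shape would replace the old builder-plus-mapper pair, provided the assumed signature matches your installed version.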
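
For the `sched` removal noted in the list above, here is a sketch of keeping the scheduled awake time in an instance variable. `TickerPartition` is a hypothetical stand-in that mirrors the `next_batch` and `next_awake` method names but does not subclass any Bytewax class. The zero-argument `next_batch` signature is an assumption based on the note above.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional


class TickerPartition:
    """Hypothetical stand-in; not a real Bytewax base class."""

    def __init__(self, start: datetime, interval: timedelta):
        self._interval = interval
        # The scheduled awake time lives on the instance.
        self._next_awake = start

    def next_batch(self) -> List[datetime]:
        # Read the scheduled time from the instance variable instead
        # of a `sched` argument, then schedule the next wake-up.
        scheduled = self._next_awake
        self._next_awake = scheduled + self._interval
        return [scheduled]

    def next_awake(self) -> Optional[datetime]:
        return self._next_awake


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
part = TickerPartition(start, timedelta(seconds=10))
first = part.next_batch()
second = part.next_batch()
```

Each call emits the time it was scheduled for, so the value that `sched` used to carry is still available without the extra argument.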
What's Changed
- Add Kafka concept section to metadata.json by @whoahbot in #373
- Fixed split_demo example by @Psykopear in #371
- Prevent dataflow hang if next_awake is far in future by @davidselassie in #374
- Adds a basic stub file generator by @davidselassie in #369
- Update metrics and observability guides by @Psykopear in #372
- Fixes pyright errors by @davidselassie in #378
- Shuffles around Kafka objects and updates docstrings by @davidselassie in #382
- Fixes `collect` and `branch` operator test file names by @davidselassie in #385
- Using MyST + Sphinx for API docs by @davidselassie in #383
- Update README.md by @jonasbest in #388
- All docs to Sphinx and RTD by @davidselassie in #386
- Update logo in README.md by @konradsienkowski in #389
- Removes `stateful_map` `builder` function and adds `stateful_flat_map` by @davidselassie in #387
- Re-enables doctests via Sybil by @davidselassie in #390
- Removes `now` and `sched` arguments in input partitions and unary logic by @davidselassie in #391
- fix backup interval of zero raising an exception by @damiondoesthings in #393
- Remove GLIBC 2.27 builder by @whoahbot in #395
- Fix recovery store garbage collection by @davidselassie in #394
- Customize Sphinx docs theme by @konradsienkowski in #400
- Cleanup docs by @whoahbot in #401
- [Docs]: Fix path to Slack icon by @konradsienkowski in #402
- Updates release instructions with Read the Docs stuff by @davidselassie in #403
- Additional recovery tests by @davidselassie in #407
- Adds warnings about using session windows by @davidselassie in #408
- Update max window documentation by @awmatheson in #410
- Windowing concept doc by @davidselassie in #412
- Don't call `time_for` twice by @whoahbot in #414
- Refactors window boundary calculation to avoid overflow by @davidselassie in #415
- Unified Redpanda and Confluent schema registries by @Psykopear in #399
- Fix inconsistent window boundaries panic due to microseconds in timestamps by @davidselassie in #416
- Makes stubgen deterministic by @davidselassie in #417
- Don't take the GIL when iterating over items by @whoahbot in #418
- Start working on custom metrics for Kafka by @whoahbot in #404
- Take 2 on inconsistent boundaries by @davidselassie in #419
- Add benchmarks for core operators by @whoahbot in #420
- Refactor operators by @whoahbot in #422
- Prepare for v0.19.0 release by @whoahbot in #423
New Contributors
- @jonasbest made their first contribution in #388
- @damiondoesthings made their first contribution in #393
Full Changelog: v0.18.2...v0.19.0