github apache/beam v2.33.0
Beam 2.33.0 release

We are happy to present the new 2.33.0 release of Beam.
This release includes both improvements and new functionality.
See the download page for this release.

For more information on changes in 2.33.0, check out the detailed release notes.

Highlights

  • Go SDK is no longer experimental, and is officially part of the Beam release process.
    • Matching Go SDK containers are published on release.
    • Batch usage is well supported, and tested on Flink, Spark, and the Python Portable Runner.
      • SDK Tests are also run against Google Cloud Dataflow, but this doesn't indicate reciprocal support.
    • The SDK supports Splittable DoFns, Cross Language transforms, and most Beam Model basics.
    • Go Modules are now used for dependency management.
      • This is a breaking change, see Breaking Changes for resolution.
      • Contributing to the Go SDK is easier: there is no need to set up a GOPATH.
      • Minimum Go version is now Go v1.16.
    • See the announcement blogpost for full information once published.

New Features / Improvements

  • Projection pushdown in SchemaIO (BEAM-12609).
  • Upgrade Flink runner to Flink versions 1.13.2, 1.12.5 and 1.11.4 (BEAM-10955).

Breaking Changes

  • Since release 2.30.0, "The AvroCoder changes for BEAM-2303 [changed] the reader/writer from the Avro ReflectDatum* classes to the SpecificDatum* classes" (Java). This default behavior change has been reverted in this release. Use the useReflectApi setting to control it (BEAM-12628).

Deprecations

  • Python GBK will stop supporting unbounded PCollections that have global windowing and a default trigger in Beam 2.34. This can be overridden with --allow_unsafe_triggers. (BEAM-9487).
  • Python GBK will start requiring safe triggers or the --allow_unsafe_triggers flag starting with Beam 2.34. (BEAM-9487).

Bugfixes

  • UnsupportedOperationException when reading from BigQuery tables and converting
    TableRows to Beam Rows (Java)
    (BEAM-12479).
  • SDFBoundedSourceReader is much slower than the original
    BoundedSource behavior (Python)
    (BEAM-12781).
  • ORDER BY column not in SELECT crashes (ZetaSQL)
    (BEAM-12759).

Known Issues

  • Spark 2.x users will need to update Spark's Jackson runtime dependencies (spark.jackson.version) to at least version 2.9.2, due to Beam updating its dependencies.
  • See a full list of open issues that affect this version.
  • Go SDK jobs may produce "Failed to deduce Step from MonitoringInfo" messages following successful job execution. The messages are benign and don't indicate job failure. These are due to not yet handling PCollection metrics.
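
The Jackson requirement for Spark 2.x above is a minimum-version check, and comparing version strings as plain text gives the wrong answer once a component reaches two digits. The following is a minimal sketch of a numeric comparison, not part of Beam or Spark: the helper names `parse_version` and `meets_minimum` are hypothetical, and it assumes purely numeric dotted versions with no qualifiers such as `-SNAPSHOT`.

```python
# Hypothetical helper, not a Beam or Spark API: checks whether a Jackson
# version string satisfies the 2.9.2 minimum named in the Known Issues.

MIN_JACKSON_VERSION = "2.9.2"  # minimum stated in the release notes


def parse_version(version: str) -> tuple:
    """Split a dotted version into a tuple of ints for numeric comparison.

    Assumes every component is numeric (no '-SNAPSHOT' or similar suffix).
    """
    return tuple(int(part) for part in version.split("."))


def meets_minimum(found: str, minimum: str = MIN_JACKSON_VERSION) -> bool:
    """Return True when `found` is at least `minimum`."""
    return parse_version(found) >= parse_version(minimum)


if __name__ == "__main__":
    # Example inputs chosen for illustration only.
    for candidate in ("2.6.7", "2.9.2", "2.10.0"):
        status = "ok" if meets_minimum(candidate) else "too old"
        print(f"jackson {candidate}: {status}")
```

Comparing tuples of integers is what makes 2.10.0 rank above 2.9.2; a plain string comparison would order them the other way because "1" sorts before "9".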

List of Contributors

According to git shortlog, the following people contributed to the 2.33.0 release. Thank you to all contributors!

Ahmet Altay,
Alex Amato,
Alexey Romanenko,
Andreas Bergmeier,
Andres Rodriguez,
Andrew Pilloud,
Andy Xu,
Ankur Goenka,
anthonyqzhu,
Benjamin Gonzalez,
Bhupinder Sindhwani,
Chamikara Jayalath,
Claire McGinty,
Daniel Mateus Pires,
Daniel Oliveira,
David Huntsperger,
Dylan Hercher,
emily,
Emily Ye,
Etienne Chauchot,
Eugene Nikolaiev,
Heejong Lee,
iindyk,
Iñigo San Jose Visiers,
Ismaël Mejía,
Jack McCluskey,
Jan Lukavský,
Jeff Ruane,
Jeremy Lewi,
KevinGG,
Ke Wu,
Kyle Weaver,
lostluck,
Luke Cwik,
Marwan Tammam,
masahitojp,
Mehdi Drissi,
Minbo Bae,
Ning Kang,
Pablo Estrada,
Pascal Gillet,
Pawas Chhokra,
Reuven Lax,
Ritesh Ghorse,
Robert Bradshaw,
Robert Burke,
Rodrigo Benenson,
Ryan Thompson,
Saksham Gupta,
Sam Rohde,
Sam Whittle,
Sayat,
Sayat Satybaldiyev,
Siyuan Chen,
Slava Chernyak,
Steve Niemitz,
Steven Niemitz,
tvalentyn,
Tyson Hamilton,
Udi Meiri,
vachan-shetty,
Venkatramani Rajgopal,
Yichi Zhang,
zhoufek
