github unionai-oss/pandera v0.14.0
v0.14.0: ✍️ Pandera Internals Rewrite [phase 1]

20 months ago

⭐️ Highlights

The main highlight of this release is that phase 1 of the Pandera internals re-write is complete 🎉🚀! This is a backwards-compatible re-write (unit tests FTW 😅) that should just work with your existing pandera code. Please submit bug reports if you encounter any regressions that weren't covered by the current test suite.

PRs #913, #1109, and #1110 address #381 and essentially decouple pandas-specific logic from the pandera schema specification. In summary:

  • The pandera schema specifications are defined in pandera.api, containing:
    • schema base classes in pandera.api.base
    • pandera schema classes in pandera.api.pandas
    • the global check and hypothesis namespace in pandera.api.checks.Check and pandera.api.hypotheses.Hypothesis
    • decorators in pandera.api.extensions for registering built-in and custom checks/hypotheses
  • The pandera backend validation logic is defined in pandera.backends, containing:
    • backend base classes in pandera.backends.base
    • pandas-specific backend validators in pandera.backends.pandas
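
To make the split between specification and backend concrete, here is a minimal, standard-library-only sketch of the pattern. It is not pandera's actual code: `CHECK_REGISTRY`, `register_check`, `ColumnSpec`, and `ListBackend` are hypothetical stand-ins, and the "data container" is just a list of dict rows. The specification only records which named checks apply; the backend is the only place that knows how to run them against a concrete container.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical global namespace of named checks (a stand-in, not pandera's Check class).
CHECK_REGISTRY: Dict[str, Callable[..., bool]] = {}


def register_check(fn: Callable[..., bool]) -> Callable[..., bool]:
    """Decorator that registers a check function under its own name."""
    CHECK_REGISTRY[fn.__name__] = fn
    return fn


@register_check
def greater_than(value: Any, min_value: Any) -> bool:
    return value > min_value


@register_check
def isin(value: Any, allowed: Any) -> bool:
    return value in allowed


class ColumnSpec:
    """Specification only: stores which checks apply, never runs them."""

    def __init__(self, name: str, checks: List[Tuple[str, Dict[str, Any]]]):
        self.name = name
        self.checks = checks  # list of (check_name, keyword_arguments) pairs


class ListBackend:
    """Validation logic for one container type: a list of dict rows."""

    def validate(self, rows: List[Dict[str, Any]], spec: ColumnSpec) -> List[Tuple[int, str]]:
        failures = []
        for index, row in enumerate(rows):
            for check_name, kwargs in spec.checks:
                check = CHECK_REGISTRY[check_name]
                if not check(row[spec.name], **kwargs):
                    failures.append((index, check_name))
        return failures


spec = ColumnSpec("price", [("greater_than", {"min_value": 0})])
failures = ListBackend().validate([{"price": 5}, {"price": -1}], spec)
print(failures)  # [(1, 'greater_than')]
```

Because `ColumnSpec` never touches the rows itself, the same specification could in principle be handed to a different backend without changes.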

All pandas-specific logic is now isolated in dedicated modules, so support for additional non-pandas schema specifications and their associated backends can be implemented either as first-party libraries (see the issues for supporting polars and ibis) or as third-party libraries.
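
One way to picture how a new backend plugs in is a registry keyed by data container type. The sketch below is a toy illustration under that assumption, using only the standard library: `Schema`, `register_backend`, `RowsBackend`, and `ColumnsBackend` are hypothetical names, not pandera's API, and plain lists and dicts stand in for real data frames.

```python
from typing import Any, Dict


class Schema:
    """Container-agnostic specification: required column names and their types."""

    # Maps a container type to the backend object that validates it.
    _backends: Dict[type, Any] = {}

    def __init__(self, columns: Dict[str, type]):
        self.columns = columns

    @classmethod
    def register_backend(cls, container_type: type, backend: Any) -> None:
        cls._backends[container_type] = backend

    def validate(self, data: Any) -> Any:
        # Dispatch on the concrete container type; the schema holds no container logic.
        backend = self._backends.get(type(data))
        if backend is None:
            raise TypeError(f"no backend registered for {type(data).__name__}")
        return backend.validate(data, self)


class RowsBackend:
    """Validates a list of dict rows."""

    def validate(self, data, schema):
        for row in data:
            for name, expected in schema.columns.items():
                if not isinstance(row.get(name), expected):
                    raise ValueError(f"column {name!r} failed its type check")
        return data


class ColumnsBackend:
    """Validates a dict that maps column names to lists of values."""

    def validate(self, data, schema):
        for name, expected in schema.columns.items():
            values = data.get(name)
            if values is None or not all(isinstance(v, expected) for v in values):
                raise ValueError(f"column {name!r} failed its type check")
        return data


# A separate package could add its own backend with one more call like these.
Schema.register_backend(list, RowsBackend())
Schema.register_backend(dict, ColumnsBackend())

schema = Schema({"id": int})
rows_ok = schema.validate([{"id": 1}, {"id": 2}])
cols_ok = schema.validate({"id": [1, 2]})
```

In this sketch, supporting another container means writing one backend class and registering it; the `Schema` class itself stays untouched.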

🛣 Rewrite Roadmap

The bulk of the re-write is complete, but some items are still outstanding:

  • Write validation backends for the existing pandas-like frameworks (dask, pyspark.pandas, modin). This may lead to refactoring some of the abstractions that came out of the rewrite.
  • Write an alpha version of the pandera-ibis package, which will create a schema specification and validation backends for ibis data structures (see issue #1105)
  • Document the process of writing your own 3rd party libraries based on pandera for any arbitrary statistical data container.

What's Changed

New Contributors

Full Changelog: v0.13.4...v0.14.0
