Version 0.8.0 is a significant update to Great Expectations, with many improvements focused on configurability and usability. See the migrating versions guide for more details on specific changes, which include several breaking changes to configs and APIs.
Highlights include:
- Validation Operators and Actions. Validation operators make it easy to integrate GE into a variety of pipeline runners. They offer one-line integration that emphasizes configurability. See the validation operators and actions feature guide for more information.
  - The DataContext `get_batch` method no longer treats `expectation_suite_name` or `batch_kwargs` as optional; they must be explicitly specified.
  - The top-level GE validate method allows more options for specifying which data_asset class to use.
- First-class support for plugins in a DataContext, with several features that make it easier to configure and maintain DataContexts across common deployment patterns.
  - Environments: A DataContext can now manage `environment_and_secrets` more easily thanks to more dynamic and flexible variable substitution.
  - Stores: A new internal abstraction for DataContexts, `stores_reference`, makes extending GE easier by consolidating logic for reading and writing resources from a database, local storage, or cloud storage.
  - Types: Utilities configured in a DataContext are now referenced using `class_name` and `module_name` throughout the DataContext configuration, making it easier to extend or supplement pre-built resources. For now, the "type" parameter is still supported, but expect it to be removed in a future release.
- Partitioners: Batch Kwargs are clarified and enhanced to help easily reference well-known chunks of data using a partition_id. Batch ID and Batch Fingerprint help round out support for enhanced metadata around data assets that GE validates. See `batch_identifiers` for more information. The `GlobReaderGenerator`, `QueryGenerator`, `S3Generator`, `SubdirReaderGenerator`, and `TableGenerator` all support partition_id for easily accessing data assets.
- Other Improvements:
  - We're beginning a long process of under-the-covers refactors designed to make GE more maintainable as we add features.
  - Restructured documentation: our docs have been reorganized into a new structure that makes room for adding and finding reference material more easily. Stay tuned for additional detail.
  - The command build-documentation has been renamed build-docs and now opens the Data Docs in the user's browser by default.
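The Environments and Types items above describe two general techniques: substituting variables into configuration values, and locating a class through `class_name` and `module_name`. The sketch below illustrates both using only the Python standard library. It is not GE's implementation: the helper names `substitute_variables` and `instantiate_from_config`, the `${NAME}` placeholder syntax, the `GE_ROOT` variable, and the use of `types.SimpleNamespace` as a stand-in for a store class are all assumptions made for illustration.

```python
import importlib
import re

# Hypothetical placeholder syntax: ${NAME}. GE's actual syntax may differ.
_PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def substitute_variables(value, variables):
    """Recursively replace ${NAME} placeholders inside a nested config.

    Placeholders with no matching variable are left untouched.
    """
    if isinstance(value, str):
        return _PLACEHOLDER.sub(
            lambda m: str(variables.get(m.group(1), m.group(0))), value
        )
    if isinstance(value, dict):
        return {k: substitute_variables(v, variables) for k, v in value.items()}
    if isinstance(value, list):
        return [substitute_variables(v, variables) for v in value]
    return value


def instantiate_from_config(config):
    """Import config["module_name"], look up config["class_name"],
    and call the class with the remaining keys as keyword arguments."""
    kwargs = {
        k: v for k, v in config.items() if k not in ("class_name", "module_name")
    }
    module = importlib.import_module(config["module_name"])
    cls = getattr(module, config["class_name"])
    return cls(**kwargs)


# Illustrative config: SimpleNamespace stands in for a real store class.
store_config = {
    "module_name": "types",
    "class_name": "SimpleNamespace",
    "base_directory": "${GE_ROOT}/expectations",
}
variables = {"GE_ROOT": "/tmp/great_expectations"}  # hypothetical variable

resolved = substitute_variables(store_config, variables)
store = instantiate_from_config(resolved)
print(store.base_directory)  # /tmp/great_expectations/expectations
```

Resolving classes by name keeps the configuration declarative: swapping in a custom class only requires changing two strings, provided the named module is importable.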