github kedro-org/kedro 0.18.4

21 months ago

Major features and improvements

  • Make Kedro instantiate datasets from kedro_datasets with higher priority than kedro.extras.datasets. kedro_datasets is the namespace for the new kedro-datasets Python package.
  • The config loader objects now implement UserDict and the configuration is accessed through conf_loader['catalog'].
  • You can configure config file patterns through settings.py without creating a custom config loader.
  • Added the following new datasets:
| Type | Description | Location |
| --- | --- | --- |
| svmlight.SVMLightDataSet | Work with svmlight/libsvm files using scikit-learn library | kedro.extras.datasets.svmlight |
| video.VideoDataSet | Read and write video files from a filesystem | kedro.extras.datasets.video |
| video.video_dataset.SequenceVideo | Create a video object from an iterable sequence to use with VideoDataSet | kedro.extras.datasets.video |
| video.video_dataset.GeneratorVideo | Create a video object from a generator to use with VideoDataSet | kedro.extras.datasets.video |
  • Implemented support for a functional definition of schema in dask.ParquetDataSet to work with the dask.to_parquet API.
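
The dictionary-style access and configurable file patterns described above can be illustrated with a small standalone sketch. This is not Kedro's implementation: SketchConfigLoader, FAKE_CONF_FILES, and the pattern values are hypothetical names invented for illustration, and the in-memory dict stands in for files under a conf/ directory. The CONFIG_LOADER_ARGS name and its config_patterns key are assumed to mirror what a project's settings.py would define.

```python
from collections import UserDict
from fnmatch import fnmatch

# Hypothetical in-memory stand-in for parsed files under a conf/ directory.
FAKE_CONF_FILES = {
    "catalog.yml": {"companies": {"type": "pandas.CSVDataSet"}},
    "parameters.yml": {"test_size": 0.2},
    "spark.yml": {"spark.driver.maxResultSize": "3g"},
}


class SketchConfigLoader(UserDict):
    """Toy loader (not Kedro's code): each key maps to filename patterns."""

    def __init__(self, files, config_patterns=None):
        super().__init__()
        self.files = files
        # Default patterns; callers can override or extend them.
        self.config_patterns = {
            "catalog": ["catalog*"],
            "parameters": ["parameters*"],
        }
        self.config_patterns.update(config_patterns or {})

    def __getitem__(self, key):
        # Merge every file whose name matches one of the key's patterns.
        # An unknown key raises KeyError, like a normal dictionary.
        patterns = self.config_patterns[key]
        merged = {}
        for name, content in self.files.items():
            if any(fnmatch(name, pattern) for pattern in patterns):
                merged.update(content)
        return merged


# Assumed shape of the relevant entry in a project's settings.py.
CONFIG_LOADER_ARGS = {"config_patterns": {"spark": ["spark*"]}}

conf_loader = SketchConfigLoader(FAKE_CONF_FILES, **CONFIG_LOADER_ARGS)
catalog = conf_loader["catalog"]  # dictionary-style access
spark = conf_loader["spark"]      # resolved via the extra pattern
```

The point of the sketch is that adding a new configuration key only requires a new pattern entry, not a subclass of the loader.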

Bug fixes and other changes

  • Fixed kedro micropkg pull for packages on PyPI.
  • Fixed format in save_args for SparkHiveDataSet; previously, it didn't allow you to save in delta format.
  • Fixed save errors in TensorFlowModelDataset when used without versioning; previously, it wouldn't overwrite an existing model.
  • Added support for tf.device in TensorFlowModelDataset.
  • Updated error message for VersionNotFoundError to handle insufficient permission issues for cloud storage.
  • Updated Experiment Tracking docs with working examples.
  • Updated MatplotlibWriter Dataset, TextDataset, plotly.PlotlyDataSet and plotly.JSONDataSet docs with working examples.
  • Modified implementation of the Kedro IPython extension to use local_ns rather than a global variable.
  • Refactored ShelveStore to its own module to ensure multiprocessing works with it.
  • kedro.extras.datasets.pandas.SQLQueryDataSet now takes optional argument execution_options.
  • Removed attrs upper bound to support newer versions of Airflow.
  • Bumped the lower bound for the setuptools dependency to <=61.5.1.
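
As a rough illustration of the new execution_options argument on SQLQueryDataSet, a catalog entry might look like the sketch below, written as a Python dict rather than YAML. The entry name, SQL statement, and credentials key are hypothetical; stream_results is used only as an example of an option that SQLAlchemy documents, and which options are appropriate depends on the database driver.

```python
# Hypothetical catalog entry expressed as a Python dict. The entry name,
# table name, and credentials key are illustrative only.
catalog_config = {
    "shuttles_query": {
        "type": "pandas.SQLQueryDataSet",
        "sql": "SELECT * FROM shuttles",
        "credentials": "db_credentials",
        # Assumed to be forwarded to the underlying SQLAlchemy engine;
        # stream_results is one execution option SQLAlchemy documents.
        "execution_options": {"stream_results": True},
    }
}
```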

Minor breaking changes to the API

Upcoming deprecations for Kedro 0.19.0

  • kedro test and kedro lint will be deprecated.

Documentation

  • Revised the Introduction to shorten it.
  • Revised the Get Started section to remove unnecessary information and clarify the learning path.
  • Updated the spaceflights tutorial to simplify the later stages and clarify what the reader needs to do in each phase.
  • Moved some pages that covered advanced materials into more appropriate sections.
  • Moved visualisation into its own section.
  • Fixed a bug that degraded user experience: the table of contents is now sticky when you navigate between pages.
  • Added redirects where needed on ReadTheDocs for legacy links and bookmarks.

Contributions from the Kedroid community

We are grateful to the following for submitting PRs that contributed to this release: jstammers, FlorianGD, yash6318, carlaprv, dinotuku, williamcaicedo, avan-sh, Kastakin, amaralbf, BSGalvan, levimjoseph, daniel-falk, clotildeguinard, avsolatorio, and picklejuicedev for comments and input to documentation changes.
