pypi datasets 1.16.0

2 years ago

Datasets Changes

Datasets Features

  • Push to hub capabilities for Dataset and DatasetDict by @LysandreJik in #3098:
    • Upload your dataset to the Hugging Face Hub with the push_to_hub() method.
    • See documentation here
  • 200+ datasets now support streaming
  • Resolve data_files by split name automatically by @lhoestq in #3221
    • It takes into account the file names to know which file goes into which split
    • See documentation here
  • Filter method for batched=True by @thomasw21 in #3244
  • Adding with_rank arg to pass process rank to map by @TevenLeScao in #3314

Dataset Cards

Metrics Changes

  • New: OpenAI's pass@k code evaluation metric by @lvwerra in #2916
  • Update: BLEURT - options to use updated bleurt checkpoints by @jaehlee in #3235
  • Update: CER - update to support latest release by @mariosasko in #3252
  • Update: WER - update to the documentation by @wooters in #3278
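
For readers unfamiliar with pass@k, here is a minimal sketch of the unbiased estimator described in OpenAI's Codex paper: with n generated samples per problem, of which c pass the unit tests, pass@k = 1 - C(n-c, k) / C(n, k). This is not the code added in #2916; the function name pass_at_k and its parameters are assumptions made for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem (illustrative helper, not the library API).

    n: total number of generated samples
    c: number of samples that passed the tests
    k: number of samples drawn when computing pass@k
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("expected 0 <= c <= n and 1 <= k <= n")
    # C(n - c, k) / C(n, k) is the probability that k samples drawn
    # without replacement are all failures; comb() returns 0 when k > n - c.
    return 1.0 - comb(n - c, k) / comb(n, k)


# 3 of 10 samples pass, so a single draw succeeds about 30% of the time.
print(pass_at_k(n=10, c=3, k=1))
```

In a full evaluation this per-problem value would be averaged over all problems in the benchmark.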

Documentation

Additional improvements and bug fixes

Citation

Deprecations

Full Changelog: 1.15.1...1.16.0
