github huggingface/datasets 4.4.0

15 hours ago

Dataset Features

  • Add nifti support by @CloseChoice in #7815

    • Load medical imaging datasets from Hugging Face:
    from datasets import load_dataset
    ds = load_dataset("username/my_nifti_dataset")
    ds["train"][0]  # {"nifti": <nibabel.nifti1.Nifti1Image>}
    • Load medical imaging datasets from your disk:
    from datasets import Dataset, Nifti
    files = ["/path/to/scan_001.nii.gz", "/path/to/scan_002.nii.gz"]
    ds = Dataset.from_dict({"nifti": files}).cast_column("nifti", Nifti())
    # Dataset.from_dict returns a single Dataset with no splits, so index it directly
    ds[0]  # {"nifti": <nibabel.nifti1.Nifti1Image>}
  • Add num channels to audio by @CloseChoice in #7840

    from datasets import Audio
    # samples have shape (num_channels, num_samples)
    ds = ds.cast_column("audio", Audio())  # default, use all channels
    ds = ds.cast_column("audio", Audio(num_channels=2))  # use stereo
    ds = ds.cast_column("audio", Audio(num_channels=1))  # use mono
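
To make the (num_channels, num_samples) layout concrete, here is a minimal conceptual sketch in plain Python. It is not the library's implementation: the helper names to_mono, to_stereo, and shape are hypothetical, nested lists stand in for decoded audio arrays, and averaging channels to get mono is an assumption, since the release notes do not say how channels are mixed.

```python
# Conceptual sketch only: nested lists stand in for decoded audio arrays.
# to_mono, to_stereo and shape are hypothetical helpers, NOT part of the datasets API.
# Averaging channels for the mono downmix is an assumption made for illustration.


def shape(samples):
    """Return (num_channels, num_samples) for a list of per-channel sample lists."""
    return (len(samples), len(samples[0]))


def to_mono(samples):
    """Average all channels into a single channel, keeping the 2-D layout."""
    num_channels, num_samples = shape(samples)
    mixed = [
        sum(channel[i] for channel in samples) / num_channels
        for i in range(num_samples)
    ]
    return [mixed]


def to_stereo(samples):
    """Duplicate a mono channel into two channels; pass 2-channel audio through."""
    if len(samples) == 1:
        return [list(samples[0]), list(samples[0])]
    if len(samples) == 2:
        return [list(channel) for channel in samples]
    raise ValueError("this sketch only handles 1- or 2-channel input")


stereo = [[0.0, 0.5, 1.0], [1.0, 0.5, 0.0]]  # 2 channels, 3 samples each
mono = to_mono(stereo)

print(shape(stereo))           # layout with both channels kept
print(shape(mono))             # layout after the mono downmix
print(shape(to_stereo(mono)))  # layout after expanding mono back to two channels
```

The point of the sketch is only that the channel axis comes first, so requesting mono yields a leading dimension of 1 rather than a flat 1-D array.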

Full Changelog: 4.3.0...4.4.0
