github common-voice/common-voice release-v1.30.0
Sprint 30: June 10 - June 30

This was a longer sprint than usual to accommodate Mozilla's virtual all-hands, which took place from June 15 to 19. The biggest thing in this release:

  • A new dataset!! Common Voice Corpus 5 is now available to download, as well as the single-word benchmark target segment. See the Discourse post for more information.
  • This includes some back-end changes that accommodate displaying and saving multiple dataset versions, in preparation for allowing people to access older versions of the dataset more easily.

As you might imagine, that took up most of the team's time. Here are the additional features and bugfixes for this release:

  • A migration that backfilled some data for the single-sentence record limit, which applies to languages that are close to depleting their stock of sentences available to record, to improve the user experience. We are also actively investigating exemptions from this feature for smaller languages.
  • Added safety check to ensure all client_ids are RFC-4122 compliant GUIDs
  • We fixed a long-standing issue where clips were occasionally being saved under the wrong locale
  • Regular localization and sentence import updates
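
For readers wondering what a client_id safety check like the one mentioned above could look like, here is a minimal sketch. It is not taken from the Common Voice codebase: the function name `isRfc4122Guid` and the constant `RFC4122_PATTERN` are hypothetical, and it assumes that "RFC-4122 compliant" means the canonical hyphenated form with a version digit of 1 to 5 and the RFC 4122 variant bits. The actual check in this release may be stricter or looser.

```typescript
// Hypothetical helper for illustration; not the project's actual implementation.
// Matches the canonical 8-4-4-4-12 hexadecimal layout where:
//   - the first digit of the third group is the version (1 to 5), and
//   - the first digit of the fourth group is 8, 9, a, or b (RFC 4122 variant).
const RFC4122_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

function isRfc4122Guid(clientId: string): boolean {
  // Anchored at both ends, so surrounding text or whitespace fails the check.
  return RFC4122_PATTERN.test(clientId);
}
```

A caller could reject or regenerate any client_id for which `isRfc4122Guid` returns `false` before the value is stored. Note that this sketch also rejects the all-zero nil UUID, which may or may not match the behavior of the real check.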
