pypi sentencepiece 0.1.98

13 months ago

Major changes

  • Python 3.11 support (wheel packages for Python 3.11 are available).
  • Includes the full sources in the Python source package to reduce pip install problems.
  • Improves the algorithm that initializes the unigram seed vocabulary, which improves coverage.

New features

  • [ALL] Added support for training the model with pre-tokenization boundary constraints (the --pretokenization_delimiter flag).
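
To make the idea of a boundary constraint concrete, here is a minimal pure-Python sketch. It is not the library's implementation: the helper name candidate_pieces, the "|" delimiter, and the max_len cap are all hypothetical, chosen only for illustration. It assumes the delimiter marks places in the training text that no vocabulary piece may span, and that the delimiter itself never becomes part of a piece.

```python
def candidate_pieces(sentence, delimiter, max_len=4):
    """Enumerate substrings that never cross a pre-tokenization boundary.

    Illustration only: a hypothetical helper, not sentencepiece's algorithm.
    """
    pieces = set()
    # Each chunk between delimiters is treated as one pre-tokenized unit.
    for chunk in sentence.split(delimiter):
        for start in range(len(chunk)):
            # Cap piece length at max_len and never run past the chunk end.
            last = min(len(chunk), start + max_len)
            for end in range(start + 1, last + 1):
                pieces.add(chunk[start:end])
    return pieces


# Hypothetical training line in which "|" marks the boundaries.
line = "new|york"
pieces = candidate_pieces(line, delimiter="|")
print(sorted(pieces))
```

In this sketch "new" and "york" are candidates, while a string such as "wy" is not, because it would straddle the boundary. With the real trainer the delimiter is supplied through --pretokenization_delimiter; the Python wrapper generally accepts trainer flags as keyword arguments, but check the documentation for your installed version before relying on that.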

Bug fixes & minor changes

  • [ALL] Makes the error message more descriptive.
  • [ALL] Fixes a crash when std::random_device fails.
  • [ALL] Fixes a build error on Raspberry Pi related to atomic operations.
  • [ALL] Fixes minor bugs in n-best enumeration.
  • [ALL] Fixes the build error when using the external protobuf library.
  • [ALL] Fixes the build error on a big-endian machine.
  • [Windows] Uses the /MD build flag instead of /MT.
