github DrewThomasson/ebook2audiobook 25.2.18
V25.2.18

3 days ago

CHANGELOG

version 25.2.18:

  • version structure is now based on YEAR.MONTH.PATCH_NUMBER

  • admin privileges are no longer needed on Windows to install ebook2audiobook packages (Chocolatey replaced by Scoop)

  • added MPS processor

  • added custom models dropdown list

  • added voices dropdown list and play button to listen to each of them

  • added voice extractor for uploaded voices (separates vocals from background and music)

  • added delete button for the voices, custom models and audiobooks lists

  • added built-in voices to the voices list; they can be used with all TTS models

  • added "--output_dir" for custom output folder in headless mode

  • added directory options for uploading batches of ebook files in gradio/gui mode

  • added new output audio formats: ['m4b', 'm4a', 'mp4', 'webm', 'mov', 'mp3', 'flac', 'wav', 'ogg', 'aac'].
    More can be added on request.

  • added cancellation of a running conversion via the ebook upload gradio component (when the "X" is clicked)

  • new global config settings:
    tmp_expire: number of days an inactive session is kept before cleanup
    max_custom_model: maximum number of custom models in the list (per session id)
    max_custom_voices: maximum number of custom voices in the list (per session id)
    tts_default_settings: fine-tuned XTTS default parameters
    (refer to ./lib/conf.py for all new configuration settings)

  • gradio GUI settings are now saved and restored on refresh and browser exit

  • conversion can be resumed in headless and gradio GUI mode when the client page or connection is lost or reloaded
    (however, the user should restart the process manually with the same session id)

  • math symbols and numbers are now converted to phonemes on all TTS engines
    (languages not covered are pronounced with the default_language_code set in ./lib/conf.py;
    PRs are welcome to fix missing translations)

  • audio filtering, normalization and enhancement of all uploaded voices and of the final audiobook
    for the best sound presence and clarity

  • fixed custom model upload

  • fixed missing pages in conversion

  • fixed modules and libraries missing during installation (regex, mecab, etc.)

  • various gradio design improvements

  • optimized multi language sentence splitting to minimize hallucinations and unnatural pauses

  • numbers and math symbols are now spoken by fairseq and XTTSv2

  • the TTS model is now loaded once per script run and shared by all users of the same model

  • added coqui-tts built-in voices for all TTS engines and as standard in all languages

  • added new modal alerts for info, error, exception and warnings

  • removed docker_utils, which was a Docker image with only ffmpeg and calibre

  • removed fine-tuned parameters, as they made results worse rather than better

  • optimized sentence splitting

  • Many more fixes and new features, too many to remember them all... see for yourself ;)
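
The YEAR.MONTH.PATCH_NUMBER version structure introduced in this release can be parsed and compared mechanically. The sketch below is only an illustration of the scheme: the names ReleaseVersion and parse_version are hypothetical and are not part of ebook2audiobook, and accepting an optional leading "V" (as in the "V25.2.18" tag) is an assumption.

```python
from typing import NamedTuple


class ReleaseVersion(NamedTuple):
    """Hypothetical container for a YEAR.MONTH.PATCH_NUMBER version."""
    year: int
    month: int
    patch: int


def parse_version(tag: str) -> ReleaseVersion:
    """Parse a tag such as '25.2.18' or 'V25.2.18' into its three fields."""
    # Assumption: tags may carry a leading 'v' or 'V', which is ignored.
    parts = tag.strip().lstrip("vV").split(".")
    if len(parts) != 3:
        raise ValueError(f"expected YEAR.MONTH.PATCH_NUMBER, got {tag!r}")
    year, month, patch = (int(part) for part in parts)
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {tag!r}")
    return ReleaseVersion(year, month, patch)


if __name__ == "__main__":
    current = parse_version("V25.2.18")
    print(current.year, current.month, current.patch)
```

Because the fields are stored as integers in a tuple, versions compare numerically field by field, so a release from month 10 sorts after one from month 2, which a plain string comparison would get wrong.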

Currently in development:

  • add terminal output console to gradio/gui
  • implement more TTS engines (list not decided yet)
  • apprise notification
  • implement chapter summarizing to create background music and sounds
  • implement indices in the metadata for each sentence in the final file
    to eventually improve the pronunciation and replace it with the new sentence.
  • add built-in voice list of xttsv2
  • add Czech, Croatian and others with cv/vits
  • add music interlude between chapters
  • use chapter names (if chapters are detected correctly) in place of numbers in the final metadata
  • split the output into multiple files if > 12 hours # chapters as final
  • installation of the right torch and cuda version if GPU available so deepspeed can be used
  • automatic user crash bug report by email via a URL request
  • create a legends.py file for all gradio/gui legends to manage multilanguage
  • mark each sentence number in the metadata with the timecode so
    the user can re-convert a single sentence before exporting the audiobook
    (this requires not deleting the ebook temp folder)
  • use "websocat" in the "cmd.exe" and "bash/zsh" scripts to connect in headless mode via gradio and avoid loading the TTS model on each command
