m-bain/whisperX v3.2.0 on GitHub

Device and Language Support

added Korean wav2vec2 model by @Boulaouaney in #277
Add Czech alignment model by @Thebys in #280
Adding Norwegian Bokmål and Norwegian Nynorsk by @peregilk in #636
Support language names in --language parameter. by @jkukul in #517
Add align model for Catalan language. by @davidmartinrius in #581
add missing Cantonese in supported languages by @MahmoudAshraf97 in #617
Add alignment model for Malayalam by @kurianbenoy in #585
Added Romanian phoneme-based ASR model by @Majdoddin in #791
added alignment for Slovak (sk) and Slovenian (sl) languages by @jan-panoch in #852
Add wav2vec model for Vietnamese in #278
Add Urdu model support for alignment by @abCods in #374
chore(writer): Join words without spaces for ja, zh by @jim60105 in #440

Bug Fixes and Stability Improvements

fix Unequal Stack Size VAD error by @m-bain in #281
fix: Bug in type hinting by @VisionOra in #294
pin faster whisper by @sorgfresser in #474
Fix repeat transcription on different languages and proper suppress_numerals use by @Joemgu7 in #395
fix writer fail on segments 0 by @sorgfresser in #429
fix missing speaker prefix by @invisprints in #438
fix: correct default_asr_options with new options (patch 0.8) by @remic33 in #458
Fixes --model_dir path by @canoalberto in #648
fix: force ctranslate2 to version 4.4.0 by @Barabazs in #946
fix: update faster-whisper dependencies by @cococig in #716
fix: ZeroDivisionError when --print_progress True by @mvoggu in #494
Minor fixes for word options and subtitles by @amolinasalazar in #549
fix UnboundLocalError by @sorgfresser in #554
Fix: Allow vad options to be configurable by passing to FasterWhisperPipeline and merge_chunks. by @abettke in #507
fix minimum input length for torch wav2vec2 models by @MahmoudAshraf97 in #510
fix(diarize): key error on empty track by @characat0 in #518
pip compliance for git+ installs by @spbisc97 in #603

Documentation Updates

adds link to whisperX medium on replicate.com by @CaRniFeXeR in #431
Document --compute_type command line option by @dotgrid in #430
adding link to Replicate demo by @daanelson in #352
fix: typo in error message by @zamoshchin in #493
Fix link in README.md by @jimregan in #668
Update README.md by @valentt in #509
Add a special note about Speaker-Diarization-3.0 in readme by @kaihe-stori in #521
Update README to correct speaker diarization version link by @gillens in #618
Update README.md by @mlopsengr in #630
fix link by @M0HID in #605
Remove torchvision from README by @baer in #378

Miscellaneous Changes

move model to assets by @m-bain in #945
Update alignment.py by @Ayushi-Desynova in #418
Update alignment.py by @awerks in #427
push contributions from main by @m-bain in #290
make diarization faster by @davidas1 in #400
Add device_index option by @sorgfresser in #266
Add transcribe keywords by @sorgfresser in #269
Added download path parameter. by @prameshbajra in #284
Suppress numerals by @m-bain in #303
Add Audacity export by @Ca-ressemble-a-du-fake in #309
Update transcribe.py -> small change in batch_size description by @mabergerx in #382
Suggest using pytorch-cuda 11.8 instead of 11.7 by @tijszwinkels in #255
feat: Add merge chunks chunk_size as arguments. by @jim60105 in #445
A solution to long subtitles and words without timestamps by @awerks in #459
chore(writer): improve text display(ja etc) in json file by @darwintree in #472
add faster whisper threading by @sorgfresser in #473
Pyannote3 by @remic33 in #492
Update alignment.py by @piuy11 in #487
Pass patience and beam_size to faster-whisper. by @jkukul in #527
remove the minimum length for alignment and print the failing segment by @MahmoudAshraf97 in #529
Update setup.py to use pyannote.audio version with working GPU by @wuurrd in #531
Update setup.py to download pyannote depending on platform by @justinwlin in #541
Drop ffmpeg-python dependency and call ffmpeg directly. by @hidenori-endo in #570
no align based on space by @sorgfresser in #556
Update asr.py and make the model parameter be used by @kaka1909 in #580
Move load_model after WhisperModel by @DougTrajano in #584
Update pyannote to 3.1.0 by @remic33 in #586
support for large-v3 by @MahmoudAshraf97 in #599
Added option to load Custom VAD model to load model method by @Swami-Abhinav in #654
Update pyannote to v3.1.1 to fix a diarization problem (and diarize.py) by @santialferez in #646
Get rid of numeral_symbol_tokens variable in printed message by @KossaiSbai in #669
Add Replicate large-v3 demo by @victor-upmeet in #703
local vad model by @m-bain in #944
Feat: add new align models - SHORT by @Equipo45 in #922
Update alignment.py by @peregilk in #687

Full Changelog: v3.1.1...v3.2.0