ggerganov/whisper.cpp v1.7.3 on GitHub

Overview

Massive performance improvements for the Metal backend, especially for beam search (beam size > 1) and for quantized models
Reduce hallucinations during silence by @jkarthic in #2629
Implement no_speech_thold by @jkarthic in #2625
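
The no-speech threshold is easiest to see as a per-segment decision rule. The sketch below is modeled on the rule in the reference OpenAI Whisper implementation, not on the whisper.cpp source, so treat it as an illustration of the idea rather than the code merged in #2625. The `DecodeResult` type and `should_skip_segment` function are hypothetical names, and the defaults (0.6 and -1.0) are assumed to match the usual Whisper defaults.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DecodeResult:
    """Hypothetical per-segment decoder output (names are illustrative)."""
    text: str
    no_speech_prob: float  # probability assigned to the no-speech token
    avg_logprob: float     # mean log-probability of the decoded tokens


def should_skip_segment(
    result: DecodeResult,
    no_speech_thold: Optional[float] = 0.6,   # assumed default
    logprob_thold: Optional[float] = -1.0,    # assumed default
) -> bool:
    """Return True when the segment should be treated as silence."""
    if no_speech_thold is None:
        # Threshold disabled: never drop a segment on this basis.
        return False

    # Candidate for skipping when the model thinks there is no speech.
    skip = result.no_speech_prob > no_speech_thold

    # Keep the segment anyway if the decoder was confident in its tokens.
    if logprob_thold is not None and result.avg_logprob > logprob_thold:
        skip = False

    return skip


if __name__ == "__main__":
    silent = DecodeResult(text="", no_speech_prob=0.9, avg_logprob=-1.5)
    spoken = DecodeResult(text="hello", no_speech_prob=0.9, avg_logprob=-0.2)
    print(should_skip_segment(silent))
    print(should_skip_segment(spoken))
```

Requiring both a high no-speech probability and a low average log-probability keeps confidently decoded text even when the no-speech token scores high, which is why a segment is dropped only when both signals agree.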

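Benchmark columns: Th = threads, FA = flash attention, Enc. = encoder, Dec. = decoder, Bch5 = decoder with batch size 5, PP = prompt processing.

CPU	Config	Model	Th	FA	Enc.	Dec.	Bch5	PP	Commit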
M2 Ultra	Metal	tiny	1	1	7.90	1.26	0.35	0.01	`ed733e8`
M2 Ultra	Metal	tiny-q5_0	1	1	8.44	1.23	0.36	0.01	`ed733e8`
M2 Ultra	Metal	tiny-q5_1	1	1	8.26	1.27	0.37	0.01	`ed733e8`
M2 Ultra	Metal	tiny-q8_0	1	1	8.03	1.21	0.35	0.01	`ed733e8`
M2 Ultra	Metal	base	1	1	13.77	1.80	0.42	0.02	`ed733e8`
M2 Ultra	Metal	base-q5_0	1	1	15.02	1.72	0.42	0.02	`ed733e8`
M2 Ultra	Metal	base-q5_1	1	1	14.93	1.74	0.42	0.02	`ed733e8`
M2 Ultra	Metal	base-q8_0	1	1	14.26	1.68	0.41	0.02	`ed733e8`
M2 Ultra	Metal	small	1	1	39.76	3.54	0.85	0.05	`ed733e8`
M2 Ultra	Metal	small-q5_0	1	1	45.07	3.47	0.87	0.05	`ed733e8`
M2 Ultra	Metal	small-q5_1	1	1	44.82	3.49	0.87	0.05	`ed733e8`
M2 Ultra	Metal	small-q8_0	1	1	41.79	3.30	0.84	0.05	`ed733e8`
M2 Ultra	Metal	medium	1	1	106.73	7.28	1.78	0.11	`ed733e8`
M2 Ultra	Metal	medium-q5_0	1	1	124.43	6.63	1.83	0.12	`ed733e8`
M2 Ultra	Metal	medium-q5_1	1	1	124.19	6.70	1.84	0.12	`ed733e8`
M2 Ultra	Metal	medium-q8_0	1	1	113.88	6.52	1.75	0.11	`ed733e8`
M2 Ultra	Metal	medium-dis	1	1	94.97	0.97	0.22	0.01	`ed733e8`
M2 Ultra	Metal	large-v2	1	1	193.33	10.53	2.65	0.20	`ed733e8`
M2 Ultra	Metal	large-v2-q5_0	1	1	229.22	9.52	2.72	0.23	`ed733e8`
M2 Ultra	Metal	large-v2-q5_1	1	1	229.40	9.62	2.73	0.23	`ed733e8`
M2 Ultra	Metal	large-v2-q8_0	1	1	207.30	9.36	2.59	0.21	`ed733e8`
M2 Ultra	Metal	large-v2-dis	1	1	171.43	1.09	0.25	0.02	`ed733e8`
M2 Ultra	Metal	large-v3-turbo	1	1	173.45	1.73	0.41	0.03	`ed733e8`
M2 Ultra	Metal	large-v3-turbo-q5_0	1	1	205.52	1.52	0.42	0.04	`ed733e8`
M2 Ultra	Metal	large-v3-turbo-q8_0	1	1	185.90	1.48	0.40	0.03	`ed733e8`

sync : ggml by @ggerganov in #2573
ruby : Follow source tree change by @KitaitiMakoto in #2580
Add q8_0 models to download-ggml-model.sh by @mrienstra in #2589
ruby : Add low-level methods to transcribe by @KitaitiMakoto in #2585
sync : ggml by @ggerganov in #2608
ruby : Sync whisper.cpp and model download feature by @KitaitiMakoto in #2617
Fix typo in download-ggml-model.sh by @mrienstra in #2623
Add Missing Include Directory for ggml-cpu in whisper.android CMakeLists by @Thamster in #2624
fix: prevent division by zero in soft_max vulkan shader by @gn64 in #2633
cmake : fix "amd64" processor string by @ggerganov in #2638
Fix typo in Java Binding README by @crummyh in #2637
Fix hallucinations during silence by @jkarthic in #2629
Implement no_speech_thold by @jkarthic in #2625
Improve consistency in stream example README commands by @crummyh in #2642
ruby : Add no_speech_thold by @KitaitiMakoto in #2641
sync : ggml by @ggerganov in #2639
ci : msys enable SDL2 build by @ggerganov in #2635

Full Changelog: v1.7.2...v1.7.3