ggerganov/whisper.cpp v1.7.3-pre on GitHub

Overview

Massive performance improvements for the Metal backend, especially with beam size > 1 and with quantized models.
Marking this as a "pre-release" because there have been major changes to the build system (it now uses CMake) and I want to gather some feedback about how well the project builds on various platforms. Please leave comments in the discussion to help fix any remaining issues before the official release.

CPU	Config	Model	Th	FA	Enc.	Dec.	Bch5	PP	Commit
M2 Ultra	Metal	tiny	1	1	7.90	1.26	0.35	0.01	`ed733e8`
M2 Ultra	Metal	tiny-q5_0	1	1	8.44	1.23	0.36	0.01	`ed733e8`
M2 Ultra	Metal	tiny-q5_1	1	1	8.26	1.27	0.37	0.01	`ed733e8`
M2 Ultra	Metal	tiny-q8_0	1	1	8.03	1.21	0.35	0.01	`ed733e8`
M2 Ultra	Metal	base	1	1	13.77	1.80	0.42	0.02	`ed733e8`
M2 Ultra	Metal	base-q5_0	1	1	15.02	1.72	0.42	0.02	`ed733e8`
M2 Ultra	Metal	base-q5_1	1	1	14.93	1.74	0.42	0.02	`ed733e8`
M2 Ultra	Metal	base-q8_0	1	1	14.26	1.68	0.41	0.02	`ed733e8`
M2 Ultra	Metal	small	1	1	39.76	3.54	0.85	0.05	`ed733e8`
M2 Ultra	Metal	small-q5_0	1	1	45.07	3.47	0.87	0.05	`ed733e8`
M2 Ultra	Metal	small-q5_1	1	1	44.82	3.49	0.87	0.05	`ed733e8`
M2 Ultra	Metal	small-q8_0	1	1	41.79	3.30	0.84	0.05	`ed733e8`
M2 Ultra	Metal	medium	1	1	106.73	7.28	1.78	0.11	`ed733e8`
M2 Ultra	Metal	medium-q5_0	1	1	124.43	6.63	1.83	0.12	`ed733e8`
M2 Ultra	Metal	medium-q5_1	1	1	124.19	6.70	1.84	0.12	`ed733e8`
M2 Ultra	Metal	medium-q8_0	1	1	113.88	6.52	1.75	0.11	`ed733e8`
M2 Ultra	Metal	medium-dis	1	1	94.97	0.97	0.22	0.01	`ed733e8`
M2 Ultra	Metal	large-v2	1	1	193.33	10.53	2.65	0.20	`ed733e8`
M2 Ultra	Metal	large-v2-q5_0	1	1	229.22	9.52	2.72	0.23	`ed733e8`
M2 Ultra	Metal	large-v2-q5_1	1	1	229.40	9.62	2.73	0.23	`ed733e8`
M2 Ultra	Metal	large-v2-q8_0	1	1	207.30	9.36	2.59	0.21	`ed733e8`
M2 Ultra	Metal	large-v2-dis	1	1	171.43	1.09	0.25	0.02	`ed733e8`
M2 Ultra	Metal	large-v3-turbo	1	1	173.45	1.73	0.41	0.03	`ed733e8`
M2 Ultra	Metal	large-v3-turbo-q5_0	1	1	205.52	1.52	0.42	0.04	`ed733e8`
M2 Ultra	Metal	large-v3-turbo-q8_0	1	1	185.90	1.48	0.40	0.03	`ed733e8`
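
To compare a quantized model against its unquantized counterpart, the rows above can be reduced to ratios. The following is a minimal sketch, not part of the release: it copies three rows from the table, and the helper names `load_rows` and `relative_to_base` are hypothetical. The source does not state the units of the timing columns, so the sketch only computes unit-free ratios (a value below 1 means the quantized model scored lower than the base model in that column).

```python
import csv
import io

# Three rows copied verbatim from the table above (tab-separated, same columns).
ROWS = """\
CPU\tConfig\tModel\tTh\tFA\tEnc.\tDec.\tBch5\tPP\tCommit
M2 Ultra\tMetal\tmedium\t1\t1\t106.73\t7.28\t1.78\t0.11\ted733e8
M2 Ultra\tMetal\tmedium-q5_0\t1\t1\t124.43\t6.63\t1.83\t0.12\ted733e8
M2 Ultra\tMetal\tmedium-q8_0\t1\t1\t113.88\t6.52\t1.75\t0.11\ted733e8
"""


def load_rows(text):
    """Parse the tab-separated table into {model name: row dict}."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["Model"]: row for row in reader}


def relative_to_base(rows, base, column):
    """Return {model: value / base value} for every model except the base."""
    base_value = float(rows[base][column])
    return {
        model: float(row[column]) / base_value
        for model, row in rows.items()
        if model != base
    }


rows = load_rows(ROWS)
enc = relative_to_base(rows, "medium", "Enc.")
dec = relative_to_base(rows, "medium", "Dec.")

# Print one line per quantized variant with its ratio to the base model.
for model in enc:
    print(f"{model}: Enc. x{enc[model]:.2f}, Dec. x{dec[model]:.2f}")
```

For these rows, the quantized variants have an Enc. ratio above 1 and a Dec. ratio below 1 relative to `medium`. Pasting more rows into `ROWS` and changing the `base` argument extends the same comparison to the other model sizes.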

What's Changed

sync : ggml by @ggerganov in #2573
ruby : Follow source tree change by @KitaitiMakoto in #2580
Add q8_0 models to download-ggml-model.sh by @mrienstra in #2589
ruby : Add low-level methods to transcribe by @KitaitiMakoto in #2585
sync : ggml by @ggerganov in #2608

Full Changelog: v1.7.2...v1.7.3-pre