Overview
This is a pre-release because there have been some reports of memory leaks that I haven't had time to investigate and confirm. If these are resolved in the next few days, I will include the fixes in the official 1.7.2 release next week.
- Various improvements in the Metal backend
- Fix extra memory usage for large samples
- Remove limit for `ggml_context` (i.e. more beams and processors are supported)
CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
M2 Ultra | METAL | tiny | 1 | 1 | 9.51 | 1.39 | 0.41 | 0.01 | 83ac284 |
M2 Ultra | METAL | tiny-q5_0 | 1 | 1 | 9.57 | 1.41 | 0.42 | 0.01 | 83ac284 |
M2 Ultra | METAL | tiny-q5_1 | 1 | 1 | 8.74 | 1.39 | 0.42 | 0.01 | 83ac284 |
M2 Ultra | METAL | tiny-q8_0 | 1 | 1 | 8.36 | 1.33 | 0.41 | 0.01 | 83ac284 |
M2 Ultra | METAL | base | 1 | 1 | 14.27 | 1.90 | 0.63 | 0.02 | 83ac284 |
M2 Ultra | METAL | base-q5_0 | 1 | 1 | 15.50 | 1.90 | 0.65 | 0.02 | 83ac284 |
M2 Ultra | METAL | base-q5_1 | 1 | 1 | 15.67 | 1.88 | 0.65 | 0.02 | 83ac284 |
M2 Ultra | METAL | base-q8_0 | 1 | 1 | 14.69 | 1.81 | 0.63 | 0.02 | 83ac284 |
M2 Ultra | METAL | small | 1 | 1 | 40.85 | 3.77 | 1.43 | 0.05 | 83ac284 |
M2 Ultra | METAL | small-q5_0 | 1 | 1 | 45.99 | 3.90 | 1.52 | 0.05 | 83ac284 |
M2 Ultra | METAL | small-q5_1 | 1 | 1 | 46.19 | 3.83 | 1.50 | 0.06 | 83ac284 |
M2 Ultra | METAL | small-q8_0 | 1 | 1 | 42.90 | 3.65 | 1.46 | 0.05 | 83ac284 |
M2 Ultra | METAL | medium | 1 | 1 | 109.01 | 7.59 | 3.24 | 0.11 | 83ac284 |
M2 Ultra | METAL | medium-q5_0 | 1 | 1 | 126.78 | 7.55 | 3.45 | 0.13 | 83ac284 |
M2 Ultra | METAL | medium-q5_1 | 1 | 1 | 127.71 | 7.39 | 3.43 | 0.13 | 83ac284 |
M2 Ultra | METAL | medium-q8_0 | 1 | 1 | 115.97 | 7.21 | 3.35 | 0.12 | 83ac284 |
M2 Ultra | METAL | medium-dis | 1 | 1 | 97.74 | 1.06 | 0.36 | 0.01 | 83ac284 |
M2 Ultra | METAL | large-v2 | 1 | 1 | 196.99 | 11.29 | 5.06 | 0.20 | 83ac284 |
M2 Ultra | METAL | large-v2-q5_0 | 1 | 1 | 233.88 | 10.83 | 5.56 | 0.24 | 83ac284 |
M2 Ultra | METAL | large-v2-q5_1 | 1 | 1 | 234.03 | 10.73 | 5.46 | 0.24 | 83ac284 |
M2 Ultra | METAL | large-v2-q8_0 | 1 | 1 | 210.83 | 10.29 | 5.23 | 0.22 | 83ac284 |
M2 Ultra | METAL | large-v2-dis | 1 | 1 | 175.37 | 1.18 | 0.42 | 0.02 | 83ac284 |
M2 Ultra | METAL | large-v3-turbo | 1 | 1 | 177.35 | 1.85 | 0.73 | 0.03 | 83ac284 |
M2 Ultra | METAL | large-v3-turbo-q5_0 | 1 | 1 | 209.31 | 1.69 | 0.80 | 0.04 | 83ac284 |
M2 Ultra | METAL | large-v3-turbo-q8_0 | 1 | 1 | 189.55 | 1.64 | 0.75 | 0.03 | 83ac284 |
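To compare models from the table above programmatically, the rows can be parsed as pipe-delimited text. The following is a minimal sketch, not part of whisper.cpp: the function name `parse_bench_rows`, the lowercase column keys, and the choice of `large-v2` as a baseline are all assumptions made for illustration, and it assumes the numeric columns are timings where lower is better.

```python
# Hypothetical helper for reading the benchmark table; not part of whisper.cpp.
COLUMNS = ["cpu", "config", "model", "th", "fa", "enc", "dec", "bch5", "pp", "commit"]


def parse_bench_rows(text):
    """Parse pipe-delimited benchmark rows into a list of dicts.

    Header and separator lines are skipped because their numeric
    columns cannot be converted to float.
    """
    rows = []
    for line in text.strip().splitlines():
        # Drop the trailing pipe, then split into cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != len(COLUMNS):
            continue
        row = dict(zip(COLUMNS, cells))
        try:
            for key in ("enc", "dec", "bch5", "pp"):
                row[key] = float(row[key])
        except ValueError:
            continue  # header or separator line
        rows.append(row)
    return rows


# Three rows copied verbatim from the table above.
SAMPLE = """
CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
M2 Ultra | METAL | large-v2 | 1 | 1 | 196.99 | 11.29 | 5.06 | 0.20 | 83ac284 |
M2 Ultra | METAL | large-v2-dis | 1 | 1 | 175.37 | 1.18 | 0.42 | 0.02 | 83ac284 |
M2 Ultra | METAL | large-v3-turbo | 1 | 1 | 177.35 | 1.85 | 0.73 | 0.03 | 83ac284 |
"""

rows = {r["model"]: r for r in parse_bench_rows(SAMPLE)}
baseline = rows["large-v2"]  # assumed baseline, chosen for illustration

for name, row in rows.items():
    # Ratio of the baseline's Dec. value to this model's Dec. value.
    ratio = baseline["dec"] / row["dec"]
    print(f"{name}: Dec. {row['dec']:.2f} ({ratio:.1f}x relative to large-v2)")
```

The same approach works for the `Enc.`, `Bch5`, and `PP` columns by changing the key used in the ratio.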
What's Changed
- Added OpenVino init on state by @sandrohanea in #2464
- Updating the Quick start by @stsfaroz in #2475
- max_length from max_target_positions by @CrispStrobe in #2477
- Add dtw preset for large-v3-turbo by @rotemdan in #2481
- make : fix GGML_VULKAN=1 build by @ggerganov in #2485
- Add Vulkan notice in README.md by @toboil-features in #2488
- Fix Ruby binding building by @KitaitiMakoto in #2484
- Update of README.md by @toboil-features in #2489
- whisper: fix index overflow by @Josscii in #2505
- ruby : Add Metal support by @KitaitiMakoto in #2516
- ruby: New segment callback by @KitaitiMakoto in #2506
- ruby : add more APIs by @KitaitiMakoto in #2518
- ruby: fix installation test by @KitaitiMakoto in #2519
- When DTW timestamps are enabled, defer new_segment_callback until after DTW compute step by @jettoblack in #2515
- ci : fix openblas build by @ggerganov in #2511
- whisper : reduce ggml_context usage by @ggerganov in #2525
- sync : ggml by @ggerganov in #2528
- passing samples_padded by ref to the threads. by @vinmisra in #2534
- fix ffmpeg v5 build by @stsydow in #2543
- fix: ggml-vulkan logs by @thewh1teagle in #2547
- Fix the instructions on the Ruby binding by @wilsonsilva in #2548
- whisper.swiftui : add model download list & bench methods by @jhen0409 in #2546
- ruby : Add more API by @KitaitiMakoto in #2551
- Fix building workflow for linux/arm64 container by @rai62 in #2555
- sync : ggml by @ggerganov in #2561
- whisper.swiftui : switch Mac dest to Mac (Designed for iPad) by @jhen0409 in #2562
New Contributors
- @stsfaroz made their first contribution in #2475
- @CrispStrobe made their first contribution in #2477
- @toboil-features made their first contribution in #2488
- @KitaitiMakoto made their first contribution in #2484
- @Josscii made their first contribution in #2505
- @jettoblack made their first contribution in #2515
- @vinmisra made their first contribution in #2534
- @stsydow made their first contribution in #2543
- @wilsonsilva made their first contribution in #2548
- @rai62 made their first contribution in #2555
Full Changelog: v1.7.1...v1.7.2-pre